Peter Zijlstra <peterz@infradead.org> writes:
On Tue, Aug 20, 2019 at 11:54:20AM +0100, Valentin Schneider wrote:
Turns out a cfs_rq->runtime_remaining can become positive in assign_cfs_rq_runtime(), but this codepath has no call to unthrottle_cfs_rq().
This can leave us in a situation where we have a throttled cfs_rq with positive ->runtime_remaining, which breaks the math in distribute_cfs_runtime(): this function expects a negative value so that it may safely negate it into a positive value.
Add the missing unthrottle_cfs_rq(). While at it, add a WARN_ON where we expect negative values, and pull in a comment from the mailing list that didn't make it in [1].
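The arithmetic problem described above can be illustrated in user space. This is only a minimal sketch: runtime_to_hand_out() is a hypothetical stand-in for the negate-and-clamp step in distribute_cfs_runtime(), and the plain C integer types stand in for the kernel's s64/u64. It is not the kernel code itself.

```c
#include <stdint.h>

/*
 * Hypothetical user-space model of the clamp in distribute_cfs_runtime().
 * runtime_remaining plays the role of cfs_rq->runtime_remaining (signed),
 * remaining is the runtime still available in the pool (unsigned).
 */
static uint64_t runtime_to_hand_out(int64_t runtime_remaining,
				    uint64_t remaining)
{
	/*
	 * Same expression as in the patch context: negate the deficit and
	 * add 1 so the cfs_rq ends up with a positive quota. If
	 * runtime_remaining is already positive, the signed result is
	 * negative or zero, and a negative result converts to a huge
	 * unsigned value.
	 */
	uint64_t runtime = (uint64_t)(-runtime_remaining + 1);

	/* Never hand out more than what is left in the pool. */
	if (runtime > remaining)
		runtime = remaining;

	return runtime;
}
```

With remaining = 1000, a deficit of 5 (runtime_remaining = -5) yields 6, the minimum needed to get back above zero. A surplus of 5 (runtime_remaining = 5) wraps and is clamped to 1000, so in this model the already-positive cfs_rq would take everything left in the pool.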
This didn't exist because it's not supposed to be possible to call account_cfs_rq_runtime on a throttled cfs_rq at all, so that's the invariant being violated. Do you know what the code path causing this looks like?
This would allow both list del and add while distribute is doing a foreach, but I think that the racing behavior would just be to restart the distribute loop, which is fine.
Cc: <stable@vger.kernel.org>
Fixes: ec12cb7f31e2 ("sched: Accumulate per-cfs_rq cpu usage and charge against bandwidth")
Reported-by: Liangyan <liangyan.peng@linux.alibaba.com>
Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
Thanks!
 kernel/sched/fair.c | 17 ++++++++++++-----
 1 file changed, 12 insertions(+), 5 deletions(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 1054d2cf6aaa..219ff3f328e5 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -4385,6 +4385,11 @@ static inline u64 cfs_rq_clock_task(struct cfs_rq *cfs_rq)
 	return rq_clock_task(rq_of(cfs_rq)) - cfs_rq->throttled_clock_task_time;
 }
 
+static inline int cfs_rq_throttled(struct cfs_rq *cfs_rq)
+{
+	return cfs_bandwidth_used() && cfs_rq->throttled;
+}
+
 /* returns 0 on failure to allocate runtime */
 static int assign_cfs_rq_runtime(struct cfs_rq *cfs_rq)
 {
@@ -4411,6 +4416,9 @@ static int assign_cfs_rq_runtime(struct cfs_rq *cfs_rq)
 
 	cfs_rq->runtime_remaining += amount;
 
+	if (cfs_rq->runtime_remaining > 0 && cfs_rq_throttled(cfs_rq))
+		unthrottle_cfs_rq(cfs_rq);
+
 	return cfs_rq->runtime_remaining > 0;
 }
 
@@ -4439,11 +4447,6 @@ void account_cfs_rq_runtime(struct cfs_rq *cfs_rq, u64 delta_exec)
 	__account_cfs_rq_runtime(cfs_rq, delta_exec);
 }
 
-static inline int cfs_rq_throttled(struct cfs_rq *cfs_rq)
-{
-	return cfs_bandwidth_used() && cfs_rq->throttled;
-}
-
 /* check whether cfs_rq, or any parent, is throttled */
 static inline int throttled_hierarchy(struct cfs_rq *cfs_rq)
 {
@@ -4628,6 +4631,10 @@ static u64 distribute_cfs_runtime(struct cfs_bandwidth *cfs_b, u64 remaining)
 		if (!cfs_rq_throttled(cfs_rq))
 			goto next;
 
+		/* By the above check, this should never be true */
+		WARN_ON(cfs_rq->runtime_remaining > 0);
+
+		/* Pick the minimum amount to return to a positive quota state */
 		runtime = -cfs_rq->runtime_remaining + 1;
 		if (runtime > remaining)
 			runtime = remaining;
-- 
2.22.0