[ Upstream commit 01cfcde9c26d8555f0e6e9aea9d6049f87683998 ]
task_h_load() can return 0 in some situations like running stress-ng mmapfork, which forks thousands of threads, in a sched group on a 224 cores system. The load balance doesn't handle this correctly because env->imbalance never decreases and it will stop pulling tasks only after reaching loop_max, which can be equal to the number of running tasks of the cfs. Make sure that imbalance will be decreased by at least 1.
We can't simply ensure that task_h_load() returns at least one because it would imply to handle underflow in other places.
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
[removed misfit part which was not implemented yet]
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Reviewed-by: Valentin Schneider <valentin.schneider@arm.com>
Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Tested-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: stable@vger.kernel.org # v4.19 v4.14 v4.9 v4.4
cc: Sasha Levin <sashal@kernel.org>
Link: https://lkml.kernel.org/r/20200710152426.16981-1-vincent.guittot@linaro.org
---
This patch also applies on v4.14.188 v4.9.230 and v4.4.230
 kernel/sched/fair.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 92b1e71f13c8..d8c249e6dcb7 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -7337,7 +7337,15 @@ static int detach_tasks(struct lb_env *env)
 		if (!can_migrate_task(p, env))
 			goto next;
 
-		load = task_h_load(p);
+		/*
+		 * Depending of the number of CPUs and tasks and the
+		 * cgroup hierarchy, task_h_load() can return a null
+		 * value. Make sure that env->imbalance decreases
+		 * otherwise detach_tasks() will stop only after
+		 * detaching up to loop_max tasks.
+		 */
+		load = max_t(unsigned long, task_h_load(p), 1);
+
 
 		if (sched_feat(LB_MIN) && load < 16 && !env->sd->nr_balance_failed)
 			goto next;
On Mon, Jul 20, 2020 at 10:34:01AM +0200, Vincent Guittot wrote:
> [ Upstream commit 01cfcde9c26d8555f0e6e9aea9d6049f87683998 ]
>
> task_h_load() can return 0 in some situations like running stress-ng mmapfork, which forks thousands of threads, in a sched group on a 224 cores system. The load balance doesn't handle this correctly because env->imbalance never decreases and it will stop pulling tasks only after reaching loop_max, which can be equal to the number of running tasks of the cfs. Make sure that imbalance will be decreased by at least 1.
>
> We can't simply ensure that task_h_load() returns at least one because it would imply to handle underflow in other places.
>
> Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> [removed misfit part which was not implemented yet]
> Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
> Reviewed-by: Valentin Schneider <valentin.schneider@arm.com>
> Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
> Tested-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
> Cc: stable@vger.kernel.org # v4.19 v4.14 v4.9 v4.4
> cc: Sasha Levin <sashal@kernel.org>
> Link: https://lkml.kernel.org/r/20200710152426.16981-1-vincent.guittot@linaro.org
>
> This patch also applies on v4.14.188 v4.9.230 and v4.4.230
Thanks for all of the backports, now queued up.
greg k-h
linux-stable-mirror@lists.linaro.org