When a cfs_rq is throttled, it and its children are removed from the
leaf list, but their nr_running is left unchanged and can therefore
stay higher than 1. When a task is enqueued in this throttled branch,
the cfs_rqs must be added back to the list to keep its ordering
correct, but this only happens if nr_running == 1. When CFS bandwidth
is in use, call list_add_leaf_cfs_rq() unconditionally when enqueuing
an entity to make sure that the complete branch is added.
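
For illustration only, the scenario can be modelled in a few lines of
userspace C. This is a toy sketch, not kernel code: struct toy_cfs_rq,
toy_throttle(), toy_enqueue() and toy_scenario() are hypothetical
names, and the on_list flag merely stands in for membership of the
leaf list.

```c
#include <stdbool.h>

/* Toy stand-in for struct cfs_rq: only the two fields the argument needs. */
struct toy_cfs_rq {
	unsigned int nr_running;	/* entities queued on this cfs_rq */
	bool on_list;			/* models membership of the leaf list */
};

/* Throttling drops the cfs_rq from the list but leaves nr_running alone. */
void toy_throttle(struct toy_cfs_rq *rq)
{
	rq->on_list = false;
}

/*
 * Hypothetical model of the enqueue-time decision.
 * patched == false: only re-add when nr_running == 1 (old check).
 * patched == true:  also re-add whenever bandwidth control is in use.
 */
void toy_enqueue(struct toy_cfs_rq *rq, bool bandwidth_used, bool patched)
{
	rq->nr_running++;
	if (rq->nr_running == 1 || (patched && bandwidth_used))
		rq->on_list = true;
}

/* Throttle a cfs_rq that already has two entities, then enqueue a third. */
bool toy_scenario(bool patched)
{
	struct toy_cfs_rq rq = { .nr_running = 2, .on_list = true };

	toy_throttle(&rq);
	toy_enqueue(&rq, true, patched);
	return rq.on_list;
}
```

In this model toy_scenario(false) leaves the cfs_rq off the list,
because nr_running goes from 2 to 3 and never equals 1, while
toy_scenario(true) puts it back.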

Reported-by: Christian Borntraeger <borntraeger@de.ibm.com>
Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: stable@vger.kernel.org #v5.1+
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
---
 kernel/sched/fair.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index fcc968669aea..bdc5bb72ab31 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -4117,6 +4117,7 @@ static inline void check_schedstat_required(void)
 #endif
 }
 
+static inline bool cfs_bandwidth_used(void);
 
 /*
  * MIGRATION
@@ -4195,10 +4196,16 @@ enqueue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags)
 		__enqueue_entity(cfs_rq, se);
 	se->on_rq = 1;
 
-	if (cfs_rq->nr_running == 1) {
+	/*
+	 * When bandwidth control is enabled, cfs might have been removed because of
+	 * a parent being throttled but cfs->nr_running > 1. Try to add it
+	 * unconditionally.
+	 */
+	if (cfs_rq->nr_running == 1 || cfs_bandwidth_used())
 		list_add_leaf_cfs_rq(cfs_rq);
+
+	if (cfs_rq->nr_running == 1)
 		check_enqueue_throttle(cfs_rq);
-	}
 }
 
 static void __clear_buddies_last(struct sched_entity *se)