On 11/04/2018 10.00, Vlastimil Babka wrote:
cache_reap() is initially scheduled in start_cpu_timer() via schedule_delayed_work_on(). But then the next iterations are scheduled via schedule_delayed_work(), i.e. using WORK_CPU_UNBOUND.
Thus since commit ef557180447f ("workqueue: schedule WORK_CPU_UNBOUND work on wq_unbound_cpumask CPUs") there is no guarantee the future iterations will run on the originally intended cpu, although it's still preferred. I was able to demonstrate this with /sys/module/workqueue/parameters/debug_force_rr_cpu. IIUC, it may also happen due to migrating timers in nohz context. As a result, some cpu's would be calling cache_reap() more frequently and others never.
This patch uses schedule_delayed_work_on() with the current cpu when scheduling the next iteration.
Signed-off-by: Vlastimil Babka vbabka@suse.cz Fixes: ef557180447f ("workqueue: schedule WORK_CPU_UNBOUND work on wq_unbound_cpumask CPUs") CC: stable@vger.kernel.org Cc: Joonsoo Kim iamjoonsoo.kim@lge.com Cc: David Rientjes rientjes@google.com Cc: Pekka Enberg penberg@kernel.org Cc: Christoph Lameter cl@linux.com Cc: Tejun Heo tj@kernel.org Cc: Lai Jiangshan jiangshanlai@gmail.com Cc: John Stultz john.stultz@linaro.org Cc: Thomas Gleixner tglx@linutronix.de Cc: Stephen Boyd sboyd@kernel.org
Acked-by: Pekka Enberg penberg@kernel.org
mm/slab.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/mm/slab.c b/mm/slab.c index 9095c3945425..a76006aae857 100644 --- a/mm/slab.c +++ b/mm/slab.c @@ -4074,7 +4074,8 @@ static void cache_reap(struct work_struct *w) next_reap_node(); out: /* Set up the next iteration */
- schedule_delayed_work(work, round_jiffies_relative(REAPTIMEOUT_AC));
- schedule_delayed_work_on(smp_processor_id(), work,
}round_jiffies_relative(REAPTIMEOUT_AC));
void get_slabinfo(struct kmem_cache *cachep, struct slabinfo *sinfo)