On Wed, Apr 11, 2018 at 09:00:07AM +0200, Vlastimil Babka wrote:
> cache_reap() is initially scheduled in start_cpu_timer() via
> schedule_delayed_work_on(). But then the next iterations are scheduled
> via schedule_delayed_work(), i.e. using WORK_CPU_UNBOUND.
>
> Thus since commit ef557180447f ("workqueue: schedule WORK_CPU_UNBOUND
> work on wq_unbound_cpumask CPUs") there is no guarantee the future
> iterations will run on the originally intended CPU, although it's
> still preferred. I was able to demonstrate this with
> /sys/module/workqueue/parameters/debug_force_rr_cpu. IIUC, it may also
> happen due to migrating timers in nohz context. As a result, some CPUs
> would end up calling cache_reap() more frequently while others would
> never call it at all.
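To make the asymmetry concrete, a rough sketch of the two call sites
(paraphrased rather than quoted verbatim from mm/slab.c, so the exact
arguments of the start_cpu_timer() call may differ):

	/* start_cpu_timer(): the first iteration is pinned to @cpu */
	schedule_delayed_work_on(cpu, reap_work,
				 __round_jiffies_relative(HZ, cpu));

	/* cache_reap() before this patch: rescheduled as WORK_CPU_UNBOUND,
	 * so nothing guarantees the next iteration stays on the same CPU */
	schedule_delayed_work(work, round_jiffies_relative(REAPTIMEOUT_AC));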
> This patch uses schedule_delayed_work_on() with the current CPU when
> scheduling the next iteration.
Could you also write down a part answering "so what is the user-visible
effect, and under which conditions does it happen?" It would really
help with picking up the patch.
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
> Fixes: ef557180447f ("workqueue: schedule WORK_CPU_UNBOUND work on wq_unbound_cpumask CPUs")
> CC: stable@vger.kernel.org
> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> Cc: David Rientjes <rientjes@google.com>
> Cc: Pekka Enberg <penberg@kernel.org>
> Cc: Christoph Lameter <cl@linux.com>
> Cc: Tejun Heo <tj@kernel.org>
> Cc: Lai Jiangshan <jiangshanlai@gmail.com>
> Cc: John Stultz <john.stultz@linaro.org>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Stephen Boyd <sboyd@kernel.org>
> ---
>  mm/slab.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> diff --git a/mm/slab.c b/mm/slab.c
> index 9095c3945425..a76006aae857 100644
> --- a/mm/slab.c
> +++ b/mm/slab.c
> @@ -4074,7 +4074,8 @@ static void cache_reap(struct work_struct *w)
>  	next_reap_node();
>  out:
>  	/* Set up the next iteration */
> -	schedule_delayed_work(work, round_jiffies_relative(REAPTIMEOUT_AC));
> +	schedule_delayed_work_on(smp_processor_id(), work,
> +				round_jiffies_relative(REAPTIMEOUT_AC));
>  }
>  
>  void get_slabinfo(struct kmem_cache *cachep, struct slabinfo *sinfo)
> --
> 2.16.3