Jann Horn identified a racy access to p->mm in the global expedited command of the membarrier system call.
The suggested fix is to hold the task_lock() around the accesses to p->mm and to the mm_struct membarrier_state field to guarantee the existence of the mm_struct.
Link: https://lore.kernel.org/lkml/CAG48ez2G8ctF8dHS42TF37pThfr3y0RNOOYTmxvACm4u8Y...
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
CC: Jann Horn <jannh@google.com>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Peter Zijlstra (Intel) <peterz@infradead.org>
CC: Ingo Molnar <mingo@kernel.org>
CC: Andrea Parri <parri.andrea@gmail.com>
CC: Andrew Hunter <ahh@google.com>
CC: Andy Lutomirski <luto@kernel.org>
CC: Avi Kivity <avi@scylladb.com>
CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: Boqun Feng <boqun.feng@gmail.com>
CC: Dave Watson <davejwatson@fb.com>
CC: David Sehr <sehr@google.com>
CC: Greg Hackmann <ghackmann@google.com>
CC: H. Peter Anvin <hpa@zytor.com>
CC: Linus Torvalds <torvalds@linux-foundation.org>
CC: Maged Michael <maged.michael@gmail.com>
CC: Michael Ellerman <mpe@ellerman.id.au>
CC: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
CC: Paul Mackerras <paulus@samba.org>
CC: Russell King <linux@armlinux.org.uk>
CC: Will Deacon <will.deacon@arm.com>
CC: <stable@vger.kernel.org> # v4.16+
CC: linux-api@vger.kernel.org
---
 kernel/sched/membarrier.c | 27 +++++++++++++++++++++------
 1 file changed, 21 insertions(+), 6 deletions(-)
diff --git a/kernel/sched/membarrier.c b/kernel/sched/membarrier.c
index 76e0eaf4654e..305fdcc4c5f7 100644
--- a/kernel/sched/membarrier.c
+++ b/kernel/sched/membarrier.c
@@ -81,12 +81,27 @@ static int membarrier_global_expedited(void)
 		rcu_read_lock();
 		p = task_rcu_dereference(&cpu_rq(cpu)->curr);
-		if (p && p->mm && (atomic_read(&p->mm->membarrier_state) &
-				   MEMBARRIER_STATE_GLOBAL_EXPEDITED)) {
-			if (!fallback)
-				__cpumask_set_cpu(cpu, tmpmask);
-			else
-				smp_call_function_single(cpu, ipi_mb, NULL, 1);
+		/*
+		 * Skip this CPU if the runqueue's current task is NULL or if
+		 * it is a kernel thread.
+		 */
+		if (p && READ_ONCE(p->mm)) {
+			bool mm_match;
+
+			/*
+			 * Read p->mm and access membarrier_state while holding
+			 * the task lock to ensure existence of mm.
+			 */
+			task_lock(p);
+			mm_match = p->mm && (atomic_read(&p->mm->membarrier_state) &
+					     MEMBARRIER_STATE_GLOBAL_EXPEDITED);
+			task_unlock(p);
+			if (mm_match) {
+				if (!fallback)
+					__cpumask_set_cpu(cpu, tmpmask);
+				else
+					smp_call_function_single(cpu, ipi_mb, NULL, 1);
+			}
 		}
 		rcu_read_unlock();
 	}
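[Editor's note: for context, here is a minimal userspace sketch of the command being fixed. It is not part of the patch; it assumes a v4.16+ kernel with CONFIG_MEMBARRIER and invokes the raw syscall, since glibc provides no wrapper.]

#include <linux/membarrier.h>
#include <sys/syscall.h>
#include <stdio.h>
#include <unistd.h>

/* Thin wrapper around the raw membarrier(2) syscall. */
static int membarrier(int cmd, int flags)
{
	return syscall(__NR_membarrier, cmd, flags);
}

int main(void)
{
	/*
	 * Register so this process's running threads are targeted by
	 * subsequent GLOBAL_EXPEDITED barriers.
	 */
	if (membarrier(MEMBARRIER_CMD_REGISTER_GLOBAL_EXPEDITED, 0) < 0)
		return 1;
	/*
	 * Issue a barrier on all running threads of all registered
	 * processes. In the kernel, this walks each CPU's rq->curr and
	 * reads p->mm -- the access the patch above serializes.
	 */
	if (membarrier(MEMBARRIER_CMD_GLOBAL_EXPEDITED, 0) < 0)
		return 1;
	printf("global expedited membarrier completed\n");
	return 0;
}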
On Mon, Jan 28, 2019 at 10:27 AM Mathieu Desnoyers mathieu.desnoyers@efficios.com wrote:
Jann Horn identified a racy access to p->mm in the global expedited command of the membarrier system call.
The suggested fix is to hold the task_lock() around the accesses to p->mm and to the mm_struct membarrier_state field to guarantee the existence of the mm_struct.
Hmm. I think this is right. You shouldn't access another thread's mm pointer without proper locking.
That said, we *could* make the mm_cachep be SLAB_TYPESAFE_BY_RCU, which would allow speculatively reading data off the mm pointer under RCU. It might not be the *right* mm if somebody just did an exit, but for things like this it shouldn't matter.
But if this is the only case that might care, it sounds like just doing the proper locking is the right approach.
Linus
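[Editor's note: to make Linus's suggestion concrete: with SLAB_TYPESAFE_BY_RCU, slab memory for mm_cachep objects is never returned to the page allocator while an RCU read-side critical section is in flight, so an RCU reader can safely dereference a stale p->mm; the worst case is observing a recycled object's state. What follows is a rough, hypothetical sketch, not the patch proposed in this thread.]

/* Hypothetical: add SLAB_TYPESAFE_BY_RCU when creating mm_cachep. */
mm_cachep = kmem_cache_create("mm_struct", sizeof(struct mm_struct),
			      ARCH_MIN_MMSTRUCT_ALIGN,
			      SLAB_HWCACHE_ALIGN | SLAB_PANIC |
			      SLAB_TYPESAFE_BY_RCU, NULL);

/*
 * Hypothetical lockless reader: the memory stays a valid mm_struct
 * for the duration of the RCU critical section, even if the task
 * exits and the object is freed and reused. Reading a recycled mm's
 * state is acceptable here, per Linus: a task that just exited no
 * longer needs the barrier, and a spurious IPI is harmless.
 */
rcu_read_lock();
p = task_rcu_dereference(&cpu_rq(cpu)->curr);
mm = p ? READ_ONCE(p->mm) : NULL;
if (mm && (atomic_read(&mm->membarrier_state) &
	   MEMBARRIER_STATE_GLOBAL_EXPEDITED)) {
	if (!fallback)
		__cpumask_set_cpu(cpu, tmpmask);
	else
		smp_call_function_single(cpu, ipi_mb, NULL, 1);
}
rcu_read_unlock();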
On Mon, Jan 28, 2019 at 12:27:03PM -0800, Linus Torvalds wrote:
On Mon, Jan 28, 2019 at 10:27 AM Mathieu Desnoyers mathieu.desnoyers@efficios.com wrote:
Jann Horn identified a racy access to p->mm in the global expedited command of the membarrier system call.
The suggested fix is to hold the task_lock() around the accesses to p->mm and to the mm_struct membarrier_state field to guarantee the existence of the mm_struct.
Hmm. I think this is right. You shouldn't access another thread's mm pointer without proper locking.
That said, we *could* make the mm_cachep be SLAB_TYPESAFE_BY_RCU, which would allow speculatively reading data off the mm pointer under RCU. It might not be the *right* mm if somebody just did an exit, but for things like this it shouldn't matter.
That sounds much simpler and more effective than the contention-reduction approach that I suggested. ;-)
Thanx, Paul
But if this is the only case that might care, it sounds like just doing the proper locking is the right approach.
Linus
----- On Jan 28, 2019, at 3:46 PM, paulmck paulmck@linux.ibm.com wrote:
On Mon, Jan 28, 2019 at 12:27:03PM -0800, Linus Torvalds wrote:
On Mon, Jan 28, 2019 at 10:27 AM Mathieu Desnoyers mathieu.desnoyers@efficios.com wrote:
Jann Horn identified a racy access to p->mm in the global expedited command of the membarrier system call.
The suggested fix is to hold the task_lock() around the accesses to p->mm and to the mm_struct membarrier_state field to guarantee the existence of the mm_struct.
Hmm. I think this is right. You shouldn't access another thread's mm pointer without proper locking.
That said, we *could* make the mm_cachep be SLAB_TYPESAFE_BY_RCU, which would allow speculatively reading data off the mm pointer under RCU. It might not be the *right* mm if somebody just did an exit, but for things like this it shouldn't matter.
That sounds much simpler and more effective than the contention-reduction approach that I suggested. ;-)
I'd be tempted to stick to the locking approach for a fix, and implement Linus' type-safe mm_cachep idea if anyone complains about the overhead of membarrier GLOBAL_EXPEDITED (and submit for a future merge window).
I tested the KASAN splat reproducer from Jann locally, and confirmed that my patch fixes the issue it reproduces.
Please let me know if the task_lock() approach is OK as a fix for now.
I'm also awaiting a Tested-by from Jann before submitting this for real.
Thanks,
Mathieu
Thanx, Paul
But if this is the only case that might care, it sounds like just doing the proper locking is the right approach.
Linus
On Mon, Jan 28, 2019 at 04:07:26PM -0500, Mathieu Desnoyers wrote:
----- On Jan 28, 2019, at 3:46 PM, paulmck paulmck@linux.ibm.com wrote:
On Mon, Jan 28, 2019 at 12:27:03PM -0800, Linus Torvalds wrote:
On Mon, Jan 28, 2019 at 10:27 AM Mathieu Desnoyers mathieu.desnoyers@efficios.com wrote:
Jann Horn identified a racy access to p->mm in the global expedited command of the membarrier system call.
The suggested fix is to hold the task_lock() around the accesses to p->mm and to the mm_struct membarrier_state field to guarantee the existence of the mm_struct.
Hmm. I think this is right. You shouldn't access another thread's mm pointer without proper locking.
That said, we *could* make the mm_cachep be SLAB_TYPESAFE_BY_RCU, which would allow speculatively reading data off the mm pointer under RCU. It might not be the *right* mm if somebody just did an exit, but for things like this it shouldn't matter.
That sounds much simpler and more effective than the contention-reduction approach that I suggested. ;-)
I'd be tempted to stick to the locking approach for a fix, and implement Linus' type-safe mm_cachep idea if anyone complains about the overhead of membarrier GLOBAL_EXPEDITED (and submit for a future merge window).
I tested the KASAN splat reproducer from Jann locally, and confirmed that my patch fixes the issue it reproduces.
Please let me know if the task_lock() approach is OK as a fix for now.
Agreed, no need for added complexity until there is a clear need.
I'm also awaiting a Tested-by from Jann before submitting this for real.
Makes sense to me!
Thanx, Paul
Thanks,
Mathieu
Thanx, Paul
But if this is the only case that might care, it sounds like just doing the proper locking is the right approach.
Linus
-- Mathieu Desnoyers EfficiOS Inc. http://www.efficios.com
On Mon, Jan 28, 2019 at 7:27 PM Mathieu Desnoyers mathieu.desnoyers@efficios.com wrote:
Jann Horn identified a racy access to p->mm in the global expedited command of the membarrier system call.
The suggested fix is to hold the task_lock() around the accesses to p->mm and to the mm_struct membarrier_state field to guarantee the existence of the mm_struct.
Link: https://lore.kernel.org/lkml/CAG48ez2G8ctF8dHS42TF37pThfr3y0RNOOYTmxvACm4u8Y...
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
The patch looks good to me, and to be sure, I've also given it a spin; I can't trigger the splat anymore. You can add:
Tested-by: Jann Horn <jannh@google.com>