On Mon, Jan 28, 2019 at 04:07:26PM -0500, Mathieu Desnoyers wrote:
----- On Jan 28, 2019, at 3:46 PM, paulmck paulmck@linux.ibm.com wrote:
On Mon, Jan 28, 2019 at 12:27:03PM -0800, Linus Torvalds wrote:
On Mon, Jan 28, 2019 at 10:27 AM Mathieu Desnoyers mathieu.desnoyers@efficios.com wrote:
Jann Horn identified a racy access to p->mm in the global expedited command of the membarrier system call.
The suggested fix is to hold the task_lock() around the accesses to p->mm and to the mm_struct membarrier_state field to guarantee the existence of the mm_struct.
Hmm. I think this is right. You shouldn't access another threads mm pointer without proper locking.
That said, we *could* make the mm_cachep be SLAB_TYPESAFE_BY_RCU, which would allow speculatively reading data off the mm pointer under RCU. It might not be the *right* mm if somebody just did an exit, but for things like this it shouldn't matter.
That sounds much simpler and more effective than the contention-reduction approach that I suggested. ;-)
I'd be tempted to stick to the locking approach for a fix, and implement Linus' type-safe mm_cachep idea if anyone complains about the overhead of membarrier GLOBAL_EXPEDITED (and submit for a future merge window).
I tested the KASAN splat reproducer from Jann locally, and confirmed that my patch fixes the issue it reproduces.
Please let me know if the task_lock() approach is OK as a fix for now.
Agreed, no need for added complexity until there is a clear need.
I'm also awaiting a Tested-by from Jann before submitting this for real.
Makes sense to me!
Thanx, Paul
Thanks,
Mathieu
Thanx, Paul
But if this is the only case that might care, it sounds like just doing the proper locking is the right approach.
Linus
-- Mathieu Desnoyers EfficiOS Inc. http://www.efficios.com