On Tue, 2018-08-28 at 20:46 -0700, Andy Lutomirski wrote:
On Tue, Aug 28, 2018 at 10:56 AM, Rik van Riel <riel@surriel.com> wrote:
On Mon, 27 Aug 2018 16:04:16 -0700 Andy Lutomirski <luto@kernel.org> wrote:
The 0day bot is still chewing on this, but I've tested it a bit locally and it seems to do the right thing.
Hi Andy,
the version of the patch below should fix the bug we talked about in email yesterday. It should automatically cover kernel threads in lazy TLB mode, because current->mm will be NULL, while cpu_tlbstate.loaded_mm should never be NULL.
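For illustration, a minimal sketch of the check described above (the names mirror the upstream x86 code, but the exact definitions here are an assumption on my part, not the posted patch):

	/*
	 * Sketch only: an nmi_uaccess_ok()-style check along the lines
	 * described above.  A kernel thread in lazy TLB mode has
	 * current->mm == NULL, so it can never compare equal to
	 * cpu_tlbstate.loaded_mm, which should never be NULL.
	 */
	static inline bool nmi_uaccess_ok(void)
	{
		struct mm_struct *loaded_mm = this_cpu_read(cpu_tlbstate.loaded_mm);
		struct mm_struct *current_mm = current->mm;

		VM_WARN_ON_ONCE(!loaded_mm);

		return loaded_mm == current_mm;
	}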
That's better than mine. I tweaked it a bit and added some debugging, and I got this:
https://git.kernel.org/pub/scm/linux/kernel/git/luto/linux.git/commit/?h=x86...
I made the loaded_mm handling a little more conservative to make it more obvious that switch_mm_irqs_off() is safe regardless of exactly when it gets called relative to switching current.
I am not convinced that the dance of writing cpu_tlbstate.loaded_mm twice, with a barrier on each end, is useful or necessary.
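For reference, the pattern in question looks roughly like this (a sketch of the double write with barriers, based on my reading of the commit linked above; LOADED_MM_SWITCHING is a sentinel value and the exact placement is an assumption):

	/* Sketch of the "write loaded_mm twice" approach under discussion. */

	/* Tell nmi_uaccess_ok() that loaded_mm is in flux. */
	this_cpu_write(cpu_tlbstate.loaded_mm, LOADED_MM_SWITCHING);
	barrier();

	write_cr3(build_cr3(next->pgd, new_asid));

	/* Make sure CR3 is written before loaded_mm becomes valid again. */
	barrier();
	this_cpu_write(cpu_tlbstate.loaded_mm, next);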
At the time switch_mm_irqs_off returns, nmi_uaccess_ok() will still return false, because we have not switched "current" to the task that owns the next mm_struct yet.
We just have to make sure to:
1) change cpu_tlbstate.loaded_mm before we manipulate CR3, and
2) change "current" only once enough of the mm state has been switched; __switch_to() seems to get that right.
Between the time switch_mm_irqs_off() sets cpu_tlbstate.loaded_mm to the next mm and __switch_to() moves over "current", nmi_uaccess_ok() will return false.
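In other words, the simpler ordering would look something like this (sketch only, with the window where nmi_uaccess_ok() returns false marked):

	/* switch_mm_irqs_off(): */
	this_cpu_write(cpu_tlbstate.loaded_mm, next);	/* 1) before touching CR3 */
	write_cr3(build_cr3(next->pgd, new_asid));

	/*
	 * From here until __switch_to() updates "current", current->mm still
	 * points at the previous task's mm (or is NULL for a kernel thread),
	 * so nmi_uaccess_ok() keeps returning false.
	 */

	/* later, in __switch_to(): */
	this_cpu_write(current_task, next_p);		/* 2) "current" switches last */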