On Mon, Dec 16, 2019 at 01:00:44PM +0100, gregkh@linuxfoundation.org wrote:
The patch below does not apply to the 4.19-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to stable@vger.kernel.org.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 625110b5e9dae9074d8a7e67dd07f821a053eed7 Mon Sep 17 00:00:00 2001
From: Thomas Hellstrom <thellstrom@vmware.com>
Date: Sat, 30 Nov 2019 17:51:32 -0800
Subject: [PATCH] mm/memory.c: fix a huge pud insertion race during faulting
A huge pud page can theoretically be faulted in racing with pmd_alloc() in __handle_mm_fault(). That will lead to pmd_alloc() returning an invalid pmd pointer.
Fix this by adding a pud_trans_unstable() function similar to pmd_trans_unstable() and check whether the pud is really stable before using the pmd pointer.
Race:
Thread 1:             Thread 2:                 Comment
create_huge_pud()                               Fallback - not taken.
                      create_huge_pud()         Taken.
pmd_alloc()                                     Returns an invalid pointer.
This will result in user-visible huge page data corruption.
Note that this was caught during a code audit rather than a real experienced problem. It looks to me like the only implementation that currently creates huge pud pagetable entries is dev_dax_huge_fault() which doesn't appear to care much about private (COW) mappings or write-tracking which is, I believe, a prerequisite for create_huge_pud() falling back on thread 1, but not in thread 2.
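As an illustration of the check described above, here is a minimal user-space sketch of the idea: snapshot the pud entry once after pmd_alloc(), and refuse to use the pmd pointer if the entry is empty or has become a huge mapping. This is a toy model, not kernel code. toy_pud_t, the TOY_PUD_* values and every toy_* function are hypothetical names invented for this example; only pud_trans_unstable(), pmd_trans_unstable(), pmd_alloc() and create_huge_pud() come from the commit message.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for a pud entry. This is NOT the kernel's pud_t. */
typedef _Atomic uint64_t toy_pud_t;

#define TOY_PUD_NONE  0x0ULL /* nothing installed yet */
#define TOY_PUD_TABLE 0x1ULL /* entry points to a pmd table */
#define TOY_PUD_HUGE  0x2ULL /* entry maps a huge page directly */

/*
 * Hypothetical analogue of pud_trans_unstable(): read the entry exactly
 * once, then classify the snapshot. An empty or huge entry means there is
 * no pmd table behind it, so a pmd pointer derived from it is not usable.
 */
static bool toy_pud_trans_unstable(toy_pud_t *pud)
{
    uint64_t val = atomic_load_explicit(pud, memory_order_relaxed);

    return val == TOY_PUD_NONE || (val & TOY_PUD_HUGE) != 0;
}

/*
 * Hypothetical analogue of pmd_alloc(): install a pmd table only if the
 * entry is still empty. If another thread installed a huge pud first,
 * the entry is left untouched.
 */
static void toy_pmd_alloc(toy_pud_t *pud)
{
    uint64_t expected = TOY_PUD_NONE;

    atomic_compare_exchange_strong(pud, &expected, TOY_PUD_TABLE);
}

/* Returns true if the fault path may go on to use the pmd pointer. */
static bool toy_fault_may_use_pmd(toy_pud_t *pud)
{
    assert(pud != NULL);

    toy_pmd_alloc(pud);

    /* Did a huge pud fault race with the allocation above? */
    return !toy_pud_trans_unstable(pud);
}
```

Starting from TOY_PUD_NONE, toy_fault_may_use_pmd() installs a table and returns true. Starting from TOY_PUD_HUGE (thread 2 won the race), it returns false and leaves the entry alone. As far as I can tell, the real fix behaves the same way at that point: __handle_mm_fault() returns early instead of using the pmd pointer, and the fault is retried.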
Link: http://lkml.kernel.org/r/20191115115808.21181-2-thomas_os@shipmail.org
Fixes: a00cc7d9dd93 ("mm, x86: add support for PUD-sized transparent hugepages")
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This one doesn't apply cleanly because 7635d9cbe832 ("mm, thp, proc: report THP eligibility for each vma") has changed what transparent_hugepage_enabled() does.
The "right" backport here would be to simply change from calling __transparent_hugepage_enabled() to calling transparent_hugepage_enabled(), as we don't have 7635d9cbe832 in older kernels. But I worry that if we end up backporting some part of that logic change later, the stable tree will diverge from upstream and cause subtle issues that are difficult to debug.
So unless Michal / Andrew yell at me for this, I'm going to take in 7635d9cbe832 even though it's clearly a new feature just to make 625110b5e9da and future patches apply cleanly, and avoid future issues.