Changes since v1 [1]:

* Introduce _safe versions of the set_pte() family of helpers (Dave)
* Fix the validation check to accept rewriting the same entry (Peter)
* Fix up the email reference links in the changelogs (Peter)
[1]: https://lkml.org/lkml/2018/11/20/300
---
From patch 5:
Commit f77084d96355 ("x86/mm/pat: Disable preemption around __flush_tlb_all()") addressed a case where __flush_tlb_all() is called without preemption being disabled. It also left a warning to catch other cases where preemption is not disabled. That warning triggers for the memory hotplug path, which is also used for persistent memory enabling:
 WARNING: CPU: 35 PID: 911 at ./arch/x86/include/asm/tlbflush.h:460
 RIP: 0010:__flush_tlb_all+0x1b/0x3a
 [..]
 Call Trace:
  phys_pud_init+0x29c/0x2bb
  kernel_physical_mapping_init+0xfc/0x219
  init_memory_mapping+0x1a5/0x3b0
  arch_add_memory+0x2c/0x50
  devm_memremap_pages+0x3aa/0x610
  pmem_attach_disk+0x585/0x700 [nd_pmem]
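For context, the check that fires here is the preemption assertion that f77084d96355 added to __flush_tlb_all(); roughly (paraphrased from memory, the in-tree comment differs):

 static inline void __flush_tlb_all(void)
 {
 	/* catch callers that can be preempted across the flush */
 	VM_WARN_ON_ONCE(preemptible());
 	/* ... do the actual flush ... */
 }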
Andy wondered why a path that can sleep was using __flush_tlb_all() [1], and Dave confirmed that the expectation is to flush the TLB when modifying / invalidating existing pte entries, but not on initial population [2]. Drop the usage of __flush_tlb_all() in phys_{p4d,pud,pmd}_init() on the expectation that this path only ever populates empty entries for the linear map. Note that at linear map teardown time there is a call to the all-cpu flush_tlb_all() to invalidate the removed mappings.
Additionally, Dave wanted some runtime assurance that kernel_physical_mapping_init() is only populating empty page table entries, never changing existing ones. Patches 1 - 4 implement a new set_pte() family of helpers for that purpose.
Patch 5 is tagged for -stable because the false positive warning is now showing up on 4.19-stable kernels. Patches 1 - 4 are not tagged for -stable, but if the sanity checking is needed please mark them as such.
The hang that was observed while developing the sanity checking implementation was resolved by Peter's suggestion to not trigger when the same pte value is being rewritten.
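For reference, the _safe helpers amount to the existing setters wrapped in a sanity check, with Peter's same-value exemption expressed via pte_same(); a minimal sketch of the pte case (the form in patches 1 - 4 may differ in detail):

 #define set_pte_safe(ptep, pte)						\
 ({									\
 	/* warn once if a present entry would change to a new value */	\
 	WARN_ON_ONCE(pte_present(*(ptep)) && !pte_same(*(ptep), pte));	\
 	set_pte(ptep, pte);						\
 })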
---
Dan Williams (5):
      generic/pgtable: Make {pmd,pud}_same() unconditionally available
      generic/pgtable: Introduce {p4d,pgd}_same()
      generic/pgtable: Introduce set_pte_safe()
      x86/mm: Validate kernel_physical_mapping_init() pte population
      x86/mm: Drop usage of __flush_tlb_all() in kernel_physical_mapping_init()
 arch/x86/include/asm/pgalloc.h           | 27 +++++++++++++++
 arch/x86/mm/init_64.c                    | 30 +++++++----------
 include/asm-generic/5level-fixup.h       |  1 +
 include/asm-generic/pgtable-nop4d-hack.h |  1 +
 include/asm-generic/pgtable-nop4d.h      |  1 +
 include/asm-generic/pgtable-nopud.h      |  1 +
 include/asm-generic/pgtable.h            | 53 +++++++++++++++++++++++++-----
 7 files changed, 87 insertions(+), 27 deletions(-)
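As an aside, the {p4d,pgd}_same() helpers introduced for the validation are simple value comparisons, along these lines (a sketch; the in-tree versions follow the existing {pmd,pud}_same() pattern):

 static inline int p4d_same(p4d_t a, p4d_t b)
 {
 	return p4d_val(a) == p4d_val(b);
 }

 static inline int pgd_same(pgd_t a, pgd_t b)
 {
 	return pgd_val(a) == pgd_val(b);
 }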
Commit f77084d96355 ("x86/mm/pat: Disable preemption around __flush_tlb_all()") addressed a case where __flush_tlb_all() is called without preemption being disabled. It also left a warning to catch other cases where preemption is not disabled. That warning triggers for the memory hotplug path, which is also used for persistent memory enabling:
 WARNING: CPU: 35 PID: 911 at ./arch/x86/include/asm/tlbflush.h:460
 RIP: 0010:__flush_tlb_all+0x1b/0x3a
 [..]
 Call Trace:
  phys_pud_init+0x29c/0x2bb
  kernel_physical_mapping_init+0xfc/0x219
  init_memory_mapping+0x1a5/0x3b0
  arch_add_memory+0x2c/0x50
  devm_memremap_pages+0x3aa/0x610
  pmem_attach_disk+0x585/0x700 [nd_pmem]
Andy wondered why a path that can sleep was using __flush_tlb_all() [1], and Dave confirmed that the expectation is to flush the TLB when modifying / invalidating existing pte entries, but not on initial population [2]. Drop the usage of __flush_tlb_all() in phys_{p4d,pud,pmd}_init() on the expectation that this path only ever populates empty entries for the linear map. Note that at linear map teardown time there is a call to the all-cpu flush_tlb_all() to invalidate the removed mappings.
[1]: https://lore.kernel.org/lkml/9DFD717D-857D-493D-A606-B635D72BAC21@amacapital...
[2]: https://lore.kernel.org/lkml/749919a4-cdb1-48a3-adb4-adb81a5fa0b5@intel.com
Fixes: f77084d96355 ("x86/mm/pat: Disable preemption around __flush_tlb_all()")
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: stable@vger.kernel.org
Reported-by: Andy Lutomirski <luto@kernel.org>
Suggested-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 arch/x86/mm/init_64.c | 6 ------
 1 file changed, 6 deletions(-)
diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index 3e25ac2793ef..484c1b92f078 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -584,7 +584,6 @@ phys_pud_init(pud_t *pud_page, unsigned long paddr, unsigned long paddr_end,
 					   paddr_end,
 					   page_size_mask,
 					   prot);
-			__flush_tlb_all();
 			continue;
 		}
 		/*
@@ -627,7 +626,6 @@ phys_pud_init(pud_t *pud_page, unsigned long paddr, unsigned long paddr_end,
 		pud_populate_safe(&init_mm, pud, pmd);
 		spin_unlock(&init_mm.page_table_lock);
 	}
-	__flush_tlb_all();

 	update_page_count(PG_LEVEL_1G, pages);

@@ -668,7 +666,6 @@ phys_p4d_init(p4d_t *p4d_page, unsigned long paddr, unsigned long paddr_end,
 			paddr_last = phys_pud_init(pud, paddr,
 					paddr_end,
 					page_size_mask);
-			__flush_tlb_all();
 			continue;
 		}

@@ -680,7 +677,6 @@ phys_p4d_init(p4d_t *p4d_page, unsigned long paddr, unsigned long paddr_end,
 		p4d_populate_safe(&init_mm, p4d, pud);
 		spin_unlock(&init_mm.page_table_lock);
 	}
-	__flush_tlb_all();

 	return paddr_last;
 }
@@ -733,8 +729,6 @@ kernel_physical_mapping_init(unsigned long paddr_start,

 	if (pgd_changed)
 		sync_global_pgds(vaddr_start, vaddr_end - 1);

-	__flush_tlb_all();
-
 	return paddr_last;
 }
A small nit:
On Fri, Nov 30, 2018 at 04:35:32PM -0800, Dan Williams wrote:
I much prefer the form:
https://lkml.kernel.org/r/${msgid}
over the direct lore links. Luckily both contain the full msgid, which is the important part.
(I still think lore has a horrible UI)
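For example, with a hypothetical message-id of foo@example.com, the same id resolves through both schemes:

  https://lore.kernel.org/lkml/foo@example.com/
  https://lkml.kernel.org/r/foo@example.com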
On Fri, Nov 30, 2018 at 04:35:32PM -0800, Dan Williams wrote:
Commit f77084d96355 ("x86/mm/pat: Disable preemption around __flush_tlb_all()") addressed a case where __flush_tlb_all() is called without preemption being disabled. It also left a warning to catch other cases where preemption is not disabled. That warning triggers for the memory hotplug path, which is also used for persistent memory enabling:
 WARNING: CPU: 35 PID: 911 at ./arch/x86/include/asm/tlbflush.h:460
 RIP: 0010:__flush_tlb_all+0x1b/0x3a
 [..]
 Call Trace:
  phys_pud_init+0x29c/0x2bb
  kernel_physical_mapping_init+0xfc/0x219
  init_memory_mapping+0x1a5/0x3b0
  arch_add_memory+0x2c/0x50
  devm_memremap_pages+0x3aa/0x610
  pmem_attach_disk+0x585/0x700 [nd_pmem]
Andy wondered why a path that can sleep was using __flush_tlb_all() [1], and Dave confirmed that the expectation is to flush the TLB when modifying / invalidating existing pte entries, but not on initial population [2]. Drop the usage of __flush_tlb_all() in phys_{p4d,pud,pmd}_init() on the expectation that this path only ever populates empty entries for the linear map. Note that at linear map teardown time there is a call to the all-cpu flush_tlb_all() to invalidate the removed mappings.
Fixes: f77084d96355 ("x86/mm/pat: Disable preemption around __flush_tlb_all()")
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: stable@vger.kernel.org
Reported-by: Andy Lutomirski <luto@kernel.org>
Suggested-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Hi Dan,
This patch on its own doesn't apply to any of the stable trees; does it maybe depend on some of the previous patches in this series?
-- Thanks, Sasha
On Sat, Dec 1, 2018 at 10:43 PM Sasha Levin sashal@kernel.org wrote:
On Fri, Nov 30, 2018 at 04:35:32PM -0800, Dan Williams wrote:
Commit f77084d96355 ("x86/mm/pat: Disable preemption around __flush_tlb_all()") addressed a case where __flush_tlb_all() is called without preemption being disabled. It also left a warning to catch other cases where preemption is not disabled. That warning triggers for the memory hotplug path, which is also used for persistent memory enabling:
 WARNING: CPU: 35 PID: 911 at ./arch/x86/include/asm/tlbflush.h:460
 RIP: 0010:__flush_tlb_all+0x1b/0x3a
 [..]
 Call Trace:
  phys_pud_init+0x29c/0x2bb
  kernel_physical_mapping_init+0xfc/0x219
  init_memory_mapping+0x1a5/0x3b0
  arch_add_memory+0x2c/0x50
  devm_memremap_pages+0x3aa/0x610
  pmem_attach_disk+0x585/0x700 [nd_pmem]
Andy wondered why a path that can sleep was using __flush_tlb_all() [1], and Dave confirmed that the expectation is to flush the TLB when modifying / invalidating existing pte entries, but not on initial population [2]. Drop the usage of __flush_tlb_all() in phys_{p4d,pud,pmd}_init() on the expectation that this path only ever populates empty entries for the linear map. Note that at linear map teardown time there is a call to the all-cpu flush_tlb_all() to invalidate the removed mappings.
Fixes: f77084d96355 ("x86/mm/pat: Disable preemption around __flush_tlb_all()")
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: stable@vger.kernel.org
Reported-by: Andy Lutomirski <luto@kernel.org>
Suggested-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Hi Dan,
This patch on its own doesn't apply to any of the stable trees; does it maybe depend on some of the previous patches in this series?
It does not strictly depend on them, but it does need to be rebased without them. The minimum patch for -stable is these __flush_tlb_all() removals, without the set_{pte,pmd,pud,p4d,pgd}_safe() conversion. It's also an option to backport those helpers and the conversion, but I'd defer to the x86/mm folks to make that call.
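Concretely, the rebased -stable hunks would be the same __flush_tlb_all() deletions, just against the unconverted populate calls; e.g. (a sketch, not the actual backport):

 		pud_populate(&init_mm, pud, pmd);
 		spin_unlock(&init_mm.page_table_lock);
 	}
 -	__flush_tlb_all();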
On 12/2/18 9:04 AM, Dan Williams wrote:
This patch on its own doesn't apply to any of the stable trees; does it maybe depend on some of the previous patches in this series?

It does not strictly depend on them, but it does need to be rebased without them. The minimum patch for -stable is these __flush_tlb_all() removals, without the set_{pte,pmd,pud,p4d,pgd}_safe() conversion. It's also an option to backport those helpers and the conversion, but I'd defer to the x86/mm folks to make that call.
I think we should skip backporting the helpers for now. Hopefully, mainline will help us catch bugs with those helpers and we can get the fixes for those bugs backported if they show up.
I mostly say this because with the addition of the 5-level paging code, there's been a lot of churn in the area of these helpers. They will be a mild pain to backport, and I'd expect more bugs from the backports than from the lack of the new helpers.
On Fri, Nov 30, 2018 at 04:35:06PM -0800, Dan Williams wrote:
Dan Williams (5):
      generic/pgtable: Make {pmd,pud}_same() unconditionally available
      generic/pgtable: Introduce {p4d,pgd}_same()
      generic/pgtable: Introduce set_pte_safe()
      x86/mm: Validate kernel_physical_mapping_init() pte population
      x86/mm: Drop usage of __flush_tlb_all() in kernel_physical_mapping_init()
Since you failed to send me 1,2, only for 3-5:
Acked-by: Peter Zijlstra (Intel) peterz@infradead.org
although going by the subjects, the earlier two patches should be fine too.
On Sat, Dec 1, 2018 at 2:28 AM Peter Zijlstra peterz@infradead.org wrote:
On Fri, Nov 30, 2018 at 04:35:06PM -0800, Dan Williams wrote:
Dan Williams (5):
      generic/pgtable: Make {pmd,pud}_same() unconditionally available
      generic/pgtable: Introduce {p4d,pgd}_same()
      generic/pgtable: Introduce set_pte_safe()
      x86/mm: Validate kernel_physical_mapping_init() pte population
      x86/mm: Drop usage of __flush_tlb_all() in kernel_physical_mapping_init()
Since you failed to send me 1,2, only for 3-5:
Whoops, sorry about that, I'll add you to my "auto-cc on all if cc'd on one" list.
Acked-by: Peter Zijlstra (Intel) peterz@infradead.org
Thanks.
although going by the subjects, the earlier two patches should be fine too.
On Fri, Nov 30, 2018 at 04:35:06PM -0800, Dan Williams wrote:
Dan Williams (5):
      generic/pgtable: Make {pmd,pud}_same() unconditionally available
      generic/pgtable: Introduce {p4d,pgd}_same()
      generic/pgtable: Introduce set_pte_safe()
      x86/mm: Validate kernel_physical_mapping_init() pte population
      x86/mm: Drop usage of __flush_tlb_all() in kernel_physical_mapping_init()
Looks good to me. For the whole patchset:
Acked-by: Kirill A. Shutemov kirill.shutemov@linux.intel.com