From: Joerg Roedel <jroedel@suse.de>
Backport commits from upstream to fix a data corruption issue that gets exposed when using PTI on x86-32.
Please consider them for inclusion into stable-5.2.
Joerg Roedel (3):
  x86/mm: Check for pfn instead of page in vmalloc_sync_one()
  x86/mm: Sync also unmappings in vmalloc_sync_all()
  mm/vmalloc: Sync unmappings in __purge_vmap_area_lazy()

 arch/x86/mm/fault.c | 15 ++++++---------
 mm/vmalloc.c        |  9 +++++++++
 2 files changed, 15 insertions(+), 9 deletions(-)
From: Joerg Roedel <jroedel@suse.de>
commit 51b75b5b563a2637f9d8dc5bd02a31b2ff9e5ea0 upstream.
Do not require a struct page for the mapped memory location because it might not exist. This can happen when an ioremapped region is mapped with 2MB pages.
Fixes: 5d72b4fba40ef ('x86, mm: support huge I/O mapping capability I/F')
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://lkml.kernel.org/r/20190719184652.11391-2-joro@8bytes.org
---
 arch/x86/mm/fault.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index 46df4c6aae46..499331be9bfe 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -200,7 +200,7 @@ static inline pmd_t *vmalloc_sync_one(pgd_t *pgd, unsigned long address)
 	if (!pmd_present(*pmd))
 		set_pmd(pmd, *pmd_k);
 	else
-		BUG_ON(pmd_page(*pmd) != pmd_page(*pmd_k));
+		BUG_ON(pmd_pfn(*pmd) != pmd_pfn(*pmd_k));

 	return pmd_k;
 }
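The idea behind this change can be illustrated with a small user-space sketch. It is a toy model only: every toy_* name, the entry layout and the 4-frame "RAM" size are assumptions made up for illustration and do not match the kernel's real helpers. It shows that a frame-number comparison is well defined for any present entry, while a check that goes through a per-frame descriptor only makes sense when such a descriptor exists.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model only: names and layout are hypothetical, not the kernel's. */
#define TOY_PRESENT       0x1ULL
#define TOY_PFN_SHIFT     12
#define TOY_NR_RAM_FRAMES 4	/* pretend only frames 0..3 are RAM */

typedef struct { uint64_t val; } toy_pmd_t;
struct toy_page { int unused; };

/* Per-frame descriptors exist for RAM frames only. */
static struct toy_page toy_mem_map[TOY_NR_RAM_FRAMES];

static toy_pmd_t toy_mk_pmd(uint64_t pfn)
{
	return (toy_pmd_t){ (pfn << TOY_PFN_SHIFT) | TOY_PRESENT };
}

/* Extract the frame number; the flag bits are shifted out. */
static uint64_t toy_pmd_pfn(toy_pmd_t pmd)
{
	return pmd.val >> TOY_PFN_SHIFT;
}

/* Returns NULL when the frame has no descriptor (e.g. an MMIO frame). */
static struct toy_page *toy_pfn_to_page(uint64_t pfn)
{
	return pfn < TOY_NR_RAM_FRAMES ? &toy_mem_map[pfn] : NULL;
}

/* pfn-based comparison: works for RAM and MMIO frames alike. */
static bool toy_same_frame(toy_pmd_t a, toy_pmd_t b)
{
	return toy_pmd_pfn(a) == toy_pmd_pfn(b);
}

/* A page-based comparison is only meaningful if both descriptors exist. */
static bool toy_page_check_possible(toy_pmd_t a, toy_pmd_t b)
{
	return toy_pfn_to_page(toy_pmd_pfn(a)) != NULL &&
	       toy_pfn_to_page(toy_pmd_pfn(b)) != NULL;
}

/* Stand-in for the patched BUG_ON(): both entries must map the same frame. */
static void toy_check_synced(toy_pmd_t pmd, toy_pmd_t pmd_k)
{
	assert(toy_same_frame(pmd, pmd_k));
}
```

In this sketch, two entries built with toy_mk_pmd(0x100) compare equal via toy_same_frame() even though frame 0x100 has no descriptor, which is the situation the commit message describes for ioremapped regions.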
From: Joerg Roedel <jroedel@suse.de>
commit 8e998fc24de47c55b47a887f6c95ab91acd4a720 upstream.
With huge-page ioremap areas the unmappings also need to be synced between all page-tables. Otherwise it can cause data corruption when a region is unmapped and later re-used.
Make the vmalloc_sync_one() function ready to sync unmappings and make sure vmalloc_sync_all() iterates over all page-tables even when an unmapped PMD is found.
Fixes: 5d72b4fba40ef ('x86, mm: support huge I/O mapping capability I/F')
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://lkml.kernel.org/r/20190719184652.11391-3-joro@8bytes.org
---
 arch/x86/mm/fault.c | 13 +++++--------
 1 file changed, 5 insertions(+), 8 deletions(-)
diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index 499331be9bfe..26a8b4b1b9ed 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -194,11 +194,12 @@ static inline pmd_t *vmalloc_sync_one(pgd_t *pgd, unsigned long address)

 	pmd = pmd_offset(pud, address);
 	pmd_k = pmd_offset(pud_k, address);
-	if (!pmd_present(*pmd_k))
-		return NULL;

-	if (!pmd_present(*pmd))
+	if (pmd_present(*pmd) != pmd_present(*pmd_k))
 		set_pmd(pmd, *pmd_k);
+
+	if (!pmd_present(*pmd_k))
+		return NULL;
 	else
 		BUG_ON(pmd_pfn(*pmd) != pmd_pfn(*pmd_k));

@@ -220,17 +221,13 @@ void vmalloc_sync_all(void)
 		spin_lock(&pgd_lock);
 		list_for_each_entry(page, &pgd_list, lru) {
 			spinlock_t *pgt_lock;
-			pmd_t *ret;

 			/* the pgt_lock only for Xen */
 			pgt_lock = &pgd_page_get_mm(page)->page_table_lock;

 			spin_lock(pgt_lock);
-			ret = vmalloc_sync_one(page_address(page), address);
+			vmalloc_sync_one(page_address(page), address);
 			spin_unlock(pgt_lock);
-
-			if (!ret)
-				break;
 		}
 		spin_unlock(&pgd_lock);
 	}
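The behavioural difference introduced by the hunks above can be sketched in user space. This is a simplified model under stated assumptions: struct toy_pmd, the toy_* functions and the bool return value are hypothetical stand-ins (the real function returns a pmd_t pointer or NULL), and locking is left out entirely. It contrasts the pre-patch logic, which only ever installs mappings, with the post-patch logic, which also propagates an unmapping.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model: one PMD slot per page table; all names are hypothetical. */
struct toy_pmd {
	bool present;
	unsigned long pfn;
};

/* Pre-patch logic: bails out early if the reference entry is not present. */
static bool toy_sync_one_old(struct toy_pmd *pmd, const struct toy_pmd *pmd_k)
{
	if (!pmd_k->present)
		return false;

	if (!pmd->present)
		*pmd = *pmd_k;
	else
		assert(pmd->pfn == pmd_k->pfn);	/* models BUG_ON() */

	return true;
}

/* Post-patch logic: copy the reference entry whenever presence differs. */
static bool toy_sync_one_new(struct toy_pmd *pmd, const struct toy_pmd *pmd_k)
{
	if (pmd->present != pmd_k->present)
		*pmd = *pmd_k;

	if (!pmd_k->present)
		return false;

	assert(pmd->pfn == pmd_k->pfn);		/* models BUG_ON() */
	return true;
}

/* Visit every table; the per-table result is deliberately ignored. */
static void toy_sync_all(struct toy_pmd *tables, size_t n,
			 const struct toy_pmd *pmd_k)
{
	for (size_t i = 0; i < n; i++)
		toy_sync_one_new(&tables[i], pmd_k);
}
```

With a reference entry that is not present and a per-process entry that still is, toy_sync_one_old() leaves the stale entry in place, whereas toy_sync_one_new() clears it. toy_sync_all() keeps iterating after the first table for the same reason the patch drops the early break.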
From: Joerg Roedel <jroedel@suse.de>
commit 3f8fd02b1bf1d7ba964485a56f2f4b53ae88c167 upstream.
On x86-32 with PTI enabled, parts of the kernel page-tables are not shared between processes. This can cause mappings in the vmalloc/ioremap area to persist in some page-tables after the region is unmapped and released.
When the region is re-used the processes with the old mappings do not fault in the new mappings but still access the old ones.
This causes undefined behavior, in reality often data corruption, kernel oopses and panics and even spontaneous reboots.
Fix this problem by actively syncing unmaps in the vmalloc/ioremap area to all page-tables in the system before the regions can be re-used.
References: https://bugzilla.suse.com/show_bug.cgi?id=1118689
Fixes: 5d72b4fba40ef ('x86, mm: support huge I/O mapping capability I/F')
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://lkml.kernel.org/r/20190719184652.11391-4-joro@8bytes.org
---
 mm/vmalloc.c | 9 +++++++++
 1 file changed, 9 insertions(+)
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 0f76cca32a1c..080d30408ce3 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -1213,6 +1213,12 @@ static bool __purge_vmap_area_lazy(unsigned long start, unsigned long end)
 	if (unlikely(valist == NULL))
 		return false;

+	/*
+	 * First make sure the mappings are removed from all page-tables
+	 * before they are freed.
+	 */
+	vmalloc_sync_all();
+
 	/*
 	 * TODO: to calculate a flush range without looping.
 	 * The list can be up to lazy_max_pages() elements.
@@ -3001,6 +3007,9 @@ EXPORT_SYMBOL(remap_vmalloc_range);
 /*
  * Implement a stub for vmalloc_sync_all() if the architecture chose not to
  * have one.
+ *
+ * The purpose of this function is to make sure the vmalloc area
+ * mappings are identical in all page-tables in the system.
  */
 void __weak vmalloc_sync_all(void)
 {
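The ordering this patch enforces (sync first, then release for re-use) can be modelled with a short user-space sketch. Everything here is an assumption made for illustration: the toy_* names, the fixed table and slot counts, and the use of table 0 as the reference page-table are hypothetical and are not how mm/vmalloc.c tracks lazily freed areas.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NR_TABLES 3
#define TOY_NR_SLOTS  4

/* Table 0 plays the reference page-table; all names are hypothetical. */
static bool toy_tables[TOY_NR_TABLES][TOY_NR_SLOTS];
static bool toy_lazy[TOY_NR_SLOTS];	/* unmapped, waiting for the purge */
static bool toy_free[TOY_NR_SLOTS];	/* purged, available for re-use */

/* Map a slot in the reference table only. */
static void toy_map(int slot)
{
	toy_tables[0][slot] = true;
	toy_free[slot] = false;
}

/* A process faults the mapping into its own table. */
static void toy_fault_in(int table, int slot)
{
	toy_tables[table][slot] = toy_tables[0][slot];
}

/* Unmap in the reference table and queue the slot for lazy purging. */
static void toy_unmap_lazy(int slot)
{
	toy_tables[0][slot] = false;
	toy_lazy[slot] = true;
}

/* True if some non-reference table still maps a slot the reference dropped. */
static bool toy_has_stale(int slot)
{
	for (int t = 1; t < TOY_NR_TABLES; t++)
		if (toy_tables[t][slot] && !toy_tables[0][slot])
			return true;
	return false;
}

/* Make every table identical to the reference table. */
static void toy_sync_all(void)
{
	for (int t = 1; t < TOY_NR_TABLES; t++)
		for (int s = 0; s < TOY_NR_SLOTS; s++)
			toy_tables[t][s] = toy_tables[0][s];
}

/* Sync first, then release the lazily unmapped slots for re-use. */
static bool toy_purge_lazy(void)
{
	bool any = false;

	for (int s = 0; s < TOY_NR_SLOTS; s++)
		any = any || toy_lazy[s];
	if (!any)
		return false;

	toy_sync_all();

	for (int s = 0; s < TOY_NR_SLOTS; s++) {
		if (toy_lazy[s]) {
			toy_lazy[s] = false;
			toy_free[s] = true;
		}
	}
	return true;
}
```

In this model a slot never reaches toy_free[] while any table still holds a stale mapping for it, which mirrors the intent stated in the added comment: remove the mappings from all page-tables before the areas are freed.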
On Tue, Aug 13, 2019 at 05:28:11PM +0200, Joerg Roedel wrote:
> From: Joerg Roedel <jroedel@suse.de>
>
> Backport commits from upstream to fix a data corruption issue that gets exposed when using PTI on x86-32.
>
> Please consider them for inclusion into stable-5.2.
Thanks for these. Based on the Fixes: tags on the commits, I've taken them all the way back to 4.4.y.
greg k-h
Hi Greg,
On Tue, Aug 13, 2019 at 08:36:42PM +0200, Greg Kroah-Hartman wrote:
> On Tue, Aug 13, 2019 at 05:28:11PM +0200, Joerg Roedel wrote:
> > From: Joerg Roedel <jroedel@suse.de>
> >
> > Backport commits from upstream to fix a data corruption issue that gets exposed when using PTI on x86-32.
> >
> > Please consider them for inclusion into stable-5.2.
>
> Thanks for these. Based on the Fixes: tags on the commits, I've taken them all the way back to 4.4.y.
Thank you! The problem shows up almost exclusively when PTI is enabled on x86-32 (which was merged in 4.19), so I only backported down to that kernel.
It is true, though, that it might be possible to trigger the problem on older kernels too, e.g. in some 32-bit Xen PV configurations that use the same PAE page-table structures as the 32-bit PTI code.
So thanks for picking the fixes up for the older kernels too.
Regards,
Joerg