This is a note to let you know that I've just added the patch titled
dm: correctly handle chained bios in dec_pending()
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
dm-correctly-handle-chained-bios-in-dec_pending.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
From 8dd601fa8317243be887458c49f6c29c2f3d719f Mon Sep 17 00:00:00 2001
From: NeilBrown <neilb(a)suse.com>
Date: Thu, 15 Feb 2018 20:00:15 +1100
Subject: dm: correctly handle chained bios in dec_pending()
From: NeilBrown <neilb(a)suse.com>
commit 8dd601fa8317243be887458c49f6c29c2f3d719f upstream.
dec_pending() is given an error status (possibly 0) to be recorded
against a bio. It can be called several times on the one 'struct
dm_io', and it is careful to only assign a non-zero error to
io->status. However, when it then assigns io->status to bio->bi_status,
it is not careful and could overwrite a genuine error status with 0.
This can happen when chained bios are in use. If a bio is chained
beneath the bio that this dm_io is handling, the child bio might
complete and set bio->bi_status before the dm_io completes.
This has been possible since chained bios were introduced in 3.14, and
has become a lot easier to trigger with commit 18a25da84354 ("dm: ensure
bio submission follows a depth-first tree walk") as that commit caused
dm to start using chained bios itself.
A particular failure mode is that if a bio spans an 'error' target and a
working target, the 'error' fragment will complete instantly and set the
->bi_status, and the other fragment will normally complete a little
later, and will clear ->bi_status.
The fix is simply to only assign io_error to bio->bi_status when
io_error is not zero.
Reported-and-tested-by: Milan Broz <gmazyland(a)gmail.com>
Cc: stable(a)vger.kernel.org (v3.14+)
Signed-off-by: NeilBrown <neilb(a)suse.com>
Signed-off-by: Mike Snitzer <snitzer(a)redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/md/dm.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -809,7 +809,8 @@ static void dec_pending(struct dm_io *io
} else {
/* done with normal IO or empty flush */
trace_block_bio_complete(md->queue, bio, io_error);
- bio->bi_error = io_error;
+ if (io_error)
+ bio->bi_error = io_error;
bio_endio(bio);
}
}
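A minimal standalone illustration of the rule this hunk enforces (plain C with made-up names, not the kernel code itself): once a chained child bio has already stored a failure in the bio, the parent dm_io's completion path must only write a non-zero status, otherwise a successful fragment clobbers the error.

struct fake_bio {
	int bi_error;			/* 0 = success, negative errno = failure */
};

static void finish_io(struct fake_bio *bio, int io_error)
{
	/*
	 * Before the fix the assignment was unconditional, so a child bio's
	 * earlier -EIO in bi_error could be overwritten by this dm_io's 0.
	 */
	if (io_error)
		bio->bi_error = io_error;	/* preserve any error already recorded */
}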
Patches currently in stable-queue which might be from neilb(a)suse.com are
queue-4.9/dm-correctly-handle-chained-bios-in-dec_pending.patch
This is a note to let you know that I've just added the patch titled
vfs: don't do RCU lookup of empty pathnames
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
vfs-don-t-do-rcu-lookup-of-empty-pathnames.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
From c0eb027e5aef70b71e5a38ee3e264dc0b497f343 Mon Sep 17 00:00:00 2001
From: Linus Torvalds <torvalds(a)linux-foundation.org>
Date: Sun, 2 Apr 2017 17:10:08 -0700
Subject: vfs: don't do RCU lookup of empty pathnames
From: Linus Torvalds <torvalds(a)linux-foundation.org>
commit c0eb027e5aef70b71e5a38ee3e264dc0b497f343 upstream.
Normal pathname lookup doesn't allow empty pathnames, but using
AT_EMPTY_PATH (with name_to_handle_at() or fstatat(), for example) you
can trigger an empty pathname lookup.
And not only is the RCU lookup in that case entirely unnecessary
(because we'll obviously immediately finalize the end result), it is
actively wrong.
Why? An empty path is a special case that will return the original
'dirfd' dentry - and that dentry may not actually be RCU-free'd,
resulting in a potential use-after-free if we were to initialize the
path lazily under the RCU read lock and depend on complete_walk()
finalizing the dentry.
Found by syzkaller and KASAN.
Reported-by: Dmitry Vyukov <dvyukov(a)google.com>
Reported-by: Vegard Nossum <vegard.nossum(a)gmail.com>
Acked-by: Al Viro <viro(a)zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Eric Biggers <ebiggers3(a)gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
fs/namei.c | 3 +++
1 file changed, 3 insertions(+)
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -2000,6 +2000,9 @@ static const char *path_init(struct name
int retval = 0;
const char *s = nd->name->name;
+ if (!*s)
+ flags &= ~LOOKUP_RCU;
+
nd->last_type = LAST_ROOT; /* if there are only slashes... */
nd->flags = flags | LOOKUP_JUMPED | LOOKUP_PARENT;
nd->depth = 0;
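For illustration, this is roughly how userspace reaches the empty-pathname lookup described above, using the standard fstatat() call with AT_EMPTY_PATH; the program is only a sketch of the code path, not the syzkaller reproducer.

#define _GNU_SOURCE
#include <fcntl.h>
#include <sys/stat.h>
#include <stdio.h>

int main(void)
{
	struct stat st;
	int dirfd = open("/tmp", O_RDONLY | O_DIRECTORY);

	if (dirfd < 0)
		return 1;

	/*
	 * An empty pathname with AT_EMPTY_PATH resolves to dirfd itself;
	 * with the patch above, path_init() drops LOOKUP_RCU for this case
	 * instead of starting an RCU walk it cannot use.
	 */
	if (fstatat(dirfd, "", &st, AT_EMPTY_PATH) == 0)
		printf("st_ino=%llu\n", (unsigned long long)st.st_ino);
	return 0;
}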
Patches currently in stable-queue which might be from torvalds(a)linux-foundation.org are
queue-4.4/x86-cpu-change-type-of-x86_cache_size-variable-to-unsigned-int.patch
queue-4.4/mm-hide-a-warning-for-compile_test.patch
queue-4.4/vfs-don-t-do-rcu-lookup-of-empty-pathnames.patch
queue-4.4/kvm-x86-reduce-retpoline-performance-impact-in-slot_handle_level_range-by-always-inlining-iterator-helper-methods.patch
This is a note to let you know that I've just added the patch titled
dm: correctly handle chained bios in dec_pending()
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
dm-correctly-handle-chained-bios-in-dec_pending.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
From 8dd601fa8317243be887458c49f6c29c2f3d719f Mon Sep 17 00:00:00 2001
From: NeilBrown <neilb(a)suse.com>
Date: Thu, 15 Feb 2018 20:00:15 +1100
Subject: dm: correctly handle chained bios in dec_pending()
From: NeilBrown <neilb(a)suse.com>
commit 8dd601fa8317243be887458c49f6c29c2f3d719f upstream.
dec_pending() is given an error status (possibly 0) to be recorded
against a bio. It can be called several times on the one 'struct
dm_io', and it is careful to only assign a non-zero error to
io->status. However, when it then assigns io->status to bio->bi_status,
it is not careful and could overwrite a genuine error status with 0.
This can happen when chained bios are in use. If a bio is chained
beneath the bio that this dm_io is handling, the child bio might
complete and set bio->bi_status before the dm_io completes.
This has been possible since chained bios were introduced in 3.14, and
has become a lot easier to trigger with commit 18a25da84354 ("dm: ensure
bio submission follows a depth-first tree walk") as that commit caused
dm to start using chained bios itself.
A particular failure mode is that if a bio spans an 'error' target and a
working target, the 'error' fragment will complete instantly and set the
->bi_status, and the other fragment will normally complete a little
later, and will clear ->bi_status.
The fix is simply to only assign io_error to bio->bi_status when
io_error is not zero.
Reported-and-tested-by: Milan Broz <gmazyland(a)gmail.com>
Cc: stable(a)vger.kernel.org (v3.14+)
Signed-off-by: NeilBrown <neilb(a)suse.com>
Signed-off-by: Mike Snitzer <snitzer(a)redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/md/dm.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -974,7 +974,8 @@ static void dec_pending(struct dm_io *io
} else {
/* done with normal IO or empty flush */
trace_block_bio_complete(md->queue, bio, io_error);
- bio->bi_error = io_error;
+ if (io_error)
+ bio->bi_error = io_error;
bio_endio(bio);
}
}
Patches currently in stable-queue which might be from neilb(a)suse.com are
queue-4.4/dm-correctly-handle-chained-bios-in-dec_pending.patch
This is a note to let you know that I've just added the patch titled
x86/mm, mm/hwpoison: Don't unconditionally unmap kernel 1:1 pages
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-mm-mm-hwpoison-don-t-unconditionally-unmap-kernel-1-1-pages.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
From fd0e786d9d09024f67bd71ec094b110237dc3840 Mon Sep 17 00:00:00 2001
From: Tony Luck <tony.luck(a)intel.com>
Date: Thu, 25 Jan 2018 14:23:48 -0800
Subject: x86/mm, mm/hwpoison: Don't unconditionally unmap kernel 1:1 pages
From: Tony Luck <tony.luck(a)intel.com>
commit fd0e786d9d09024f67bd71ec094b110237dc3840 upstream.
In the following commit:
ce0fa3e56ad2 ("x86/mm, mm/hwpoison: Clear PRESENT bit for kernel 1:1 mappings of poison pages")
... we added code to memory_failure() to unmap the page from the
kernel 1:1 virtual address space to avoid speculative access to the
page logging additional errors.
But memory_failure() may not always succeed in taking the page offline,
especially if the page belongs to the kernel. This can happen if
there are too many corrected errors on a page and either mcelog(8)
or drivers/ras/cec.c asks to take a page offline.
Since we remove the 1:1 mapping early in memory_failure(), we can
end up with the page unmapped, but still in use. On the next access
the kernel crashes :-(
There are also various debug paths that call memory_failure() to simulate
occurrence of an error. Since there is no actual error in memory, we
don't need to map out the page for those cases.
Revert most of the previous attempt and keep the solution local to
arch/x86/kernel/cpu/mcheck/mce.c. Unmap the page only when:
1) there is a real error
2) memory_failure() succeeds.
All of this only applies to 64-bit systems. 32-bit kernel doesn't map
all of memory into kernel space. It isn't worth adding the code to unmap
the piece that is mapped because nobody would run a 32-bit kernel on a
machine that has recoverable machine checks.
Signed-off-by: Tony Luck <tony.luck(a)intel.com>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Andy Lutomirski <luto(a)kernel.org>
Cc: Borislav Petkov <bp(a)suse.de>
Cc: Brian Gerst <brgerst(a)gmail.com>
Cc: Dave <dave.hansen(a)intel.com>
Cc: Denys Vlasenko <dvlasenk(a)redhat.com>
Cc: Josh Poimboeuf <jpoimboe(a)redhat.com>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Naoya Horiguchi <n-horiguchi(a)ah.jp.nec.com>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Robert (Persistent Memory) <elliott(a)hpe.com>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: linux-mm(a)kvack.org
Cc: stable(a)vger.kernel.org #v4.14
Fixes: ce0fa3e56ad2 ("x86/mm, mm/hwpoison: Clear PRESENT bit for kernel 1:1 mappings of poison pages")
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/include/asm/page_64.h | 4 ----
arch/x86/kernel/cpu/mcheck/mce-internal.h | 15 +++++++++++++++
arch/x86/kernel/cpu/mcheck/mce.c | 17 +++++++++++------
include/linux/mm_inline.h | 6 ------
mm/memory-failure.c | 2 --
5 files changed, 26 insertions(+), 18 deletions(-)
--- a/arch/x86/include/asm/page_64.h
+++ b/arch/x86/include/asm/page_64.h
@@ -52,10 +52,6 @@ static inline void clear_page(void *page
void copy_page(void *to, void *from);
-#ifdef CONFIG_X86_MCE
-#define arch_unmap_kpfn arch_unmap_kpfn
-#endif
-
#endif /* !__ASSEMBLY__ */
#ifdef CONFIG_X86_VSYSCALL_EMULATION
--- a/arch/x86/kernel/cpu/mcheck/mce-internal.h
+++ b/arch/x86/kernel/cpu/mcheck/mce-internal.h
@@ -115,4 +115,19 @@ static inline void mce_unregister_inject
extern struct mca_config mca_cfg;
+#ifndef CONFIG_X86_64
+/*
+ * On 32-bit systems it would be difficult to safely unmap a poison page
+ * from the kernel 1:1 map because there are no non-canonical addresses that
+ * we can use to refer to the address without risking a speculative access.
+ * However, this isn't much of an issue because:
+ * 1) Few unmappable pages are in the 1:1 map. Most are in HIGHMEM which
+ * are only mapped into the kernel as needed
+ * 2) Few people would run a 32-bit kernel on a machine that supports
+ * recoverable errors because they have too much memory to boot 32-bit.
+ */
+static inline void mce_unmap_kpfn(unsigned long pfn) {}
+#define mce_unmap_kpfn mce_unmap_kpfn
+#endif
+
#endif /* __X86_MCE_INTERNAL_H__ */
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -106,6 +106,10 @@ static struct irq_work mce_irq_work;
static void (*quirk_no_way_out)(int bank, struct mce *m, struct pt_regs *regs);
+#ifndef mce_unmap_kpfn
+static void mce_unmap_kpfn(unsigned long pfn);
+#endif
+
/*
* CPU/chipset specific EDAC code can register a notifier call here to print
* MCE errors in a human-readable form.
@@ -582,7 +586,8 @@ static int srao_decode_notifier(struct n
if (mce_usable_address(mce) && (mce->severity == MCE_AO_SEVERITY)) {
pfn = mce->addr >> PAGE_SHIFT;
- memory_failure(pfn, MCE_VECTOR, 0);
+ if (memory_failure(pfn, MCE_VECTOR, 0))
+ mce_unmap_kpfn(pfn);
}
return NOTIFY_OK;
@@ -1049,12 +1054,13 @@ static int do_memory_failure(struct mce
ret = memory_failure(m->addr >> PAGE_SHIFT, MCE_VECTOR, flags);
if (ret)
pr_err("Memory error not recovered");
+ else
+ mce_unmap_kpfn(m->addr >> PAGE_SHIFT);
return ret;
}
-#if defined(arch_unmap_kpfn) && defined(CONFIG_MEMORY_FAILURE)
-
-void arch_unmap_kpfn(unsigned long pfn)
+#ifndef mce_unmap_kpfn
+static void mce_unmap_kpfn(unsigned long pfn)
{
unsigned long decoy_addr;
@@ -1065,7 +1071,7 @@ void arch_unmap_kpfn(unsigned long pfn)
* We would like to just call:
* set_memory_np((unsigned long)pfn_to_kaddr(pfn), 1);
* but doing that would radically increase the odds of a
- * speculative access to the posion page because we'd have
+ * speculative access to the poison page because we'd have
* the virtual address of the kernel 1:1 mapping sitting
* around in registers.
* Instead we get tricky. We create a non-canonical address
@@ -1090,7 +1096,6 @@ void arch_unmap_kpfn(unsigned long pfn)
if (set_memory_np(decoy_addr, 1))
pr_warn("Could not invalidate pfn=0x%lx from 1:1 map\n", pfn);
-
}
#endif
--- a/include/linux/mm_inline.h
+++ b/include/linux/mm_inline.h
@@ -127,10 +127,4 @@ static __always_inline enum lru_list pag
#define lru_to_page(head) (list_entry((head)->prev, struct page, lru))
-#ifdef arch_unmap_kpfn
-extern void arch_unmap_kpfn(unsigned long pfn);
-#else
-static __always_inline void arch_unmap_kpfn(unsigned long pfn) { }
-#endif
-
#endif
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -1146,8 +1146,6 @@ int memory_failure(unsigned long pfn, in
return 0;
}
- arch_unmap_kpfn(pfn);
-
orig_head = hpage = compound_head(p);
num_poisoned_pages_inc();
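A condensed sketch of the decoy-address idea the hunks above rely on (assumed to mirror the mainline mce_unmap_kpfn(); shown only as an illustration, not the literal backported code):

static void sketch_unmap_kpfn(unsigned long pfn)
{
	unsigned long decoy_addr;

	/*
	 * Flip bit 63 of the 1:1-map virtual address so that a non-canonical
	 * alias, rather than the real kernel address, is what sits in
	 * registers while the page is marked not-present.
	 */
	decoy_addr = (pfn << PAGE_SHIFT) + (PAGE_OFFSET ^ BIT(63));

	if (set_memory_np(decoy_addr, 1))
		pr_warn("Could not invalidate pfn=0x%lx from 1:1 map\n", pfn);
}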
Patches currently in stable-queue which might be from tony.luck(a)intel.com are
queue-4.15/x86-cpu-rename-cpu_data.x86_mask-to-cpu_data.x86_stepping.patch
queue-4.15/x86-mm-mm-hwpoison-don-t-unconditionally-unmap-kernel-1-1-pages.patch
This is a note to let you know that I've just added the patch titled
x86/mm, mm/hwpoison: Don't unconditionally unmap kernel 1:1 pages
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-mm-mm-hwpoison-don-t-unconditionally-unmap-kernel-1-1-pages.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
From fd0e786d9d09024f67bd71ec094b110237dc3840 Mon Sep 17 00:00:00 2001
From: Tony Luck <tony.luck(a)intel.com>
Date: Thu, 25 Jan 2018 14:23:48 -0800
Subject: x86/mm, mm/hwpoison: Don't unconditionally unmap kernel 1:1 pages
From: Tony Luck <tony.luck(a)intel.com>
commit fd0e786d9d09024f67bd71ec094b110237dc3840 upstream.
In the following commit:
ce0fa3e56ad2 ("x86/mm, mm/hwpoison: Clear PRESENT bit for kernel 1:1 mappings of poison pages")
... we added code to memory_failure() to unmap the page from the
kernel 1:1 virtual address space to avoid speculative access to the
page logging additional errors.
But memory_failure() may not always succeed in taking the page offline,
especially if the page belongs to the kernel. This can happen if
there are too many corrected errors on a page and either mcelog(8)
or drivers/ras/cec.c asks to take a page offline.
Since we remove the 1:1 mapping early in memory_failure(), we can
end up with the page unmapped, but still in use. On the next access
the kernel crashes :-(
There are also various debug paths that call memory_failure() to simulate
occurrence of an error. Since there is no actual error in memory, we
don't need to map out the page for those cases.
Revert most of the previous attempt and keep the solution local to
arch/x86/kernel/cpu/mcheck/mce.c. Unmap the page only when:
1) there is a real error
2) memory_failure() succeeds.
All of this only applies to 64-bit systems. 32-bit kernel doesn't map
all of memory into kernel space. It isn't worth adding the code to unmap
the piece that is mapped because nobody would run a 32-bit kernel on a
machine that has recoverable machine checks.
Signed-off-by: Tony Luck <tony.luck(a)intel.com>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Andy Lutomirski <luto(a)kernel.org>
Cc: Borislav Petkov <bp(a)suse.de>
Cc: Brian Gerst <brgerst(a)gmail.com>
Cc: Dave <dave.hansen(a)intel.com>
Cc: Denys Vlasenko <dvlasenk(a)redhat.com>
Cc: Josh Poimboeuf <jpoimboe(a)redhat.com>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Naoya Horiguchi <n-horiguchi(a)ah.jp.nec.com>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Robert (Persistent Memory) <elliott(a)hpe.com>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: linux-mm(a)kvack.org
Cc: stable(a)vger.kernel.org #v4.14
Fixes: ce0fa3e56ad2 ("x86/mm, mm/hwpoison: Clear PRESENT bit for kernel 1:1 mappings of poison pages")
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/include/asm/page_64.h | 4 ----
arch/x86/kernel/cpu/mcheck/mce-internal.h | 15 +++++++++++++++
arch/x86/kernel/cpu/mcheck/mce.c | 17 +++++++++++------
include/linux/mm_inline.h | 6 ------
mm/memory-failure.c | 2 --
5 files changed, 26 insertions(+), 18 deletions(-)
--- a/arch/x86/include/asm/page_64.h
+++ b/arch/x86/include/asm/page_64.h
@@ -52,10 +52,6 @@ static inline void clear_page(void *page
void copy_page(void *to, void *from);
-#ifdef CONFIG_X86_MCE
-#define arch_unmap_kpfn arch_unmap_kpfn
-#endif
-
#endif /* !__ASSEMBLY__ */
#ifdef CONFIG_X86_VSYSCALL_EMULATION
--- a/arch/x86/kernel/cpu/mcheck/mce-internal.h
+++ b/arch/x86/kernel/cpu/mcheck/mce-internal.h
@@ -115,4 +115,19 @@ static inline void mce_unregister_inject
extern struct mca_config mca_cfg;
+#ifndef CONFIG_X86_64
+/*
+ * On 32-bit systems it would be difficult to safely unmap a poison page
+ * from the kernel 1:1 map because there are no non-canonical addresses that
+ * we can use to refer to the address without risking a speculative access.
+ * However, this isn't much of an issue because:
+ * 1) Few unmappable pages are in the 1:1 map. Most are in HIGHMEM which
+ * are only mapped into the kernel as needed
+ * 2) Few people would run a 32-bit kernel on a machine that supports
+ * recoverable errors because they have too much memory to boot 32-bit.
+ */
+static inline void mce_unmap_kpfn(unsigned long pfn) {}
+#define mce_unmap_kpfn mce_unmap_kpfn
+#endif
+
#endif /* __X86_MCE_INTERNAL_H__ */
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -106,6 +106,10 @@ static struct irq_work mce_irq_work;
static void (*quirk_no_way_out)(int bank, struct mce *m, struct pt_regs *regs);
+#ifndef mce_unmap_kpfn
+static void mce_unmap_kpfn(unsigned long pfn);
+#endif
+
/*
* CPU/chipset specific EDAC code can register a notifier call here to print
* MCE errors in a human-readable form.
@@ -582,7 +586,8 @@ static int srao_decode_notifier(struct n
if (mce_usable_address(mce) && (mce->severity == MCE_AO_SEVERITY)) {
pfn = mce->addr >> PAGE_SHIFT;
- memory_failure(pfn, MCE_VECTOR, 0);
+ if (memory_failure(pfn, MCE_VECTOR, 0))
+ mce_unmap_kpfn(pfn);
}
return NOTIFY_OK;
@@ -1049,12 +1054,13 @@ static int do_memory_failure(struct mce
ret = memory_failure(m->addr >> PAGE_SHIFT, MCE_VECTOR, flags);
if (ret)
pr_err("Memory error not recovered");
+ else
+ mce_unmap_kpfn(m->addr >> PAGE_SHIFT);
return ret;
}
-#if defined(arch_unmap_kpfn) && defined(CONFIG_MEMORY_FAILURE)
-
-void arch_unmap_kpfn(unsigned long pfn)
+#ifndef mce_unmap_kpfn
+static void mce_unmap_kpfn(unsigned long pfn)
{
unsigned long decoy_addr;
@@ -1065,7 +1071,7 @@ void arch_unmap_kpfn(unsigned long pfn)
* We would like to just call:
* set_memory_np((unsigned long)pfn_to_kaddr(pfn), 1);
* but doing that would radically increase the odds of a
- * speculative access to the posion page because we'd have
+ * speculative access to the poison page because we'd have
* the virtual address of the kernel 1:1 mapping sitting
* around in registers.
* Instead we get tricky. We create a non-canonical address
@@ -1090,7 +1096,6 @@ void arch_unmap_kpfn(unsigned long pfn)
if (set_memory_np(decoy_addr, 1))
pr_warn("Could not invalidate pfn=0x%lx from 1:1 map\n", pfn);
-
}
#endif
--- a/include/linux/mm_inline.h
+++ b/include/linux/mm_inline.h
@@ -127,10 +127,4 @@ static __always_inline enum lru_list pag
#define lru_to_page(head) (list_entry((head)->prev, struct page, lru))
-#ifdef arch_unmap_kpfn
-extern void arch_unmap_kpfn(unsigned long pfn);
-#else
-static __always_inline void arch_unmap_kpfn(unsigned long pfn) { }
-#endif
-
#endif
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -1146,8 +1146,6 @@ int memory_failure(unsigned long pfn, in
return 0;
}
- arch_unmap_kpfn(pfn);
-
orig_head = hpage = compound_head(p);
num_poisoned_pages_inc();
Patches currently in stable-queue which might be from tony.luck(a)intel.com are
queue-4.14/x86-cpu-rename-cpu_data.x86_mask-to-cpu_data.x86_stepping.patch
queue-4.14/x86-mm-mm-hwpoison-don-t-unconditionally-unmap-kernel-1-1-pages.patch
On Wed, Feb 21, 2018 at 07:06:39AM +1100, NeilBrown wrote:
>
> Commit 8dd601fa8317243be887458c49f6c29c2f3d719f upstream.
Thanks for this, and the 4.9 backport, both now applied.
greg k-h
The patch below does not apply to the 4.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From fd0e786d9d09024f67bd71ec094b110237dc3840 Mon Sep 17 00:00:00 2001
From: Tony Luck <tony.luck(a)intel.com>
Date: Thu, 25 Jan 2018 14:23:48 -0800
Subject: [PATCH] x86/mm, mm/hwpoison: Don't unconditionally unmap kernel 1:1
pages
In the following commit:
ce0fa3e56ad2 ("x86/mm, mm/hwpoison: Clear PRESENT bit for kernel 1:1 mappings of poison pages")
... we added code to memory_failure() to unmap the page from the
kernel 1:1 virtual address space to avoid speculative access to the
page logging additional errors.
But memory_failure() may not always succeed in taking the page offline,
especially if the page belongs to the kernel. This can happen if
there are too many corrected errors on a page and either mcelog(8)
or drivers/ras/cec.c asks to take a page offline.
Since we remove the 1:1 mapping early in memory_failure(), we can
end up with the page unmapped, but still in use. On the next access
the kernel crashes :-(
There are also various debug paths that call memory_failure() to simulate
occurrence of an error. Since there is no actual error in memory, we
don't need to map out the page for those cases.
Revert most of the previous attempt and keep the solution local to
arch/x86/kernel/cpu/mcheck/mce.c. Unmap the page only when:
1) there is a real error
2) memory_failure() succeeds.
All of this only applies to 64-bit systems. 32-bit kernel doesn't map
all of memory into kernel space. It isn't worth adding the code to unmap
the piece that is mapped because nobody would run a 32-bit kernel on a
machine that has recoverable machine checks.
Signed-off-by: Tony Luck <tony.luck(a)intel.com>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Andy Lutomirski <luto(a)kernel.org>
Cc: Borislav Petkov <bp(a)suse.de>
Cc: Brian Gerst <brgerst(a)gmail.com>
Cc: Dave <dave.hansen(a)intel.com>
Cc: Denys Vlasenko <dvlasenk(a)redhat.com>
Cc: Josh Poimboeuf <jpoimboe(a)redhat.com>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Naoya Horiguchi <n-horiguchi(a)ah.jp.nec.com>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Robert (Persistent Memory) <elliott(a)hpe.com>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: linux-mm(a)kvack.org
Cc: stable(a)vger.kernel.org #v4.14
Fixes: ce0fa3e56ad2 ("x86/mm, mm/hwpoison: Clear PRESENT bit for kernel 1:1 mappings of poison pages")
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
diff --git a/arch/x86/include/asm/page_64.h b/arch/x86/include/asm/page_64.h
index 4baa6bceb232..d652a3808065 100644
--- a/arch/x86/include/asm/page_64.h
+++ b/arch/x86/include/asm/page_64.h
@@ -52,10 +52,6 @@ static inline void clear_page(void *page)
void copy_page(void *to, void *from);
-#ifdef CONFIG_X86_MCE
-#define arch_unmap_kpfn arch_unmap_kpfn
-#endif
-
#endif /* !__ASSEMBLY__ */
#ifdef CONFIG_X86_VSYSCALL_EMULATION
diff --git a/arch/x86/kernel/cpu/mcheck/mce-internal.h b/arch/x86/kernel/cpu/mcheck/mce-internal.h
index aa0d5df9dc60..e956eb267061 100644
--- a/arch/x86/kernel/cpu/mcheck/mce-internal.h
+++ b/arch/x86/kernel/cpu/mcheck/mce-internal.h
@@ -115,4 +115,19 @@ static inline void mce_unregister_injector_chain(struct notifier_block *nb) { }
extern struct mca_config mca_cfg;
+#ifndef CONFIG_X86_64
+/*
+ * On 32-bit systems it would be difficult to safely unmap a poison page
+ * from the kernel 1:1 map because there are no non-canonical addresses that
+ * we can use to refer to the address without risking a speculative access.
+ * However, this isn't much of an issue because:
+ * 1) Few unmappable pages are in the 1:1 map. Most are in HIGHMEM which
+ * are only mapped into the kernel as needed
+ * 2) Few people would run a 32-bit kernel on a machine that supports
+ * recoverable errors because they have too much memory to boot 32-bit.
+ */
+static inline void mce_unmap_kpfn(unsigned long pfn) {}
+#define mce_unmap_kpfn mce_unmap_kpfn
+#endif
+
#endif /* __X86_MCE_INTERNAL_H__ */
diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index 75f405ac085c..8ff94d1e2dce 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -105,6 +105,10 @@ static struct irq_work mce_irq_work;
static void (*quirk_no_way_out)(int bank, struct mce *m, struct pt_regs *regs);
+#ifndef mce_unmap_kpfn
+static void mce_unmap_kpfn(unsigned long pfn);
+#endif
+
/*
* CPU/chipset specific EDAC code can register a notifier call here to print
* MCE errors in a human-readable form.
@@ -590,7 +594,8 @@ static int srao_decode_notifier(struct notifier_block *nb, unsigned long val,
if (mce_usable_address(mce) && (mce->severity == MCE_AO_SEVERITY)) {
pfn = mce->addr >> PAGE_SHIFT;
- memory_failure(pfn, 0);
+ if (!memory_failure(pfn, 0))
+ mce_unmap_kpfn(pfn);
}
return NOTIFY_OK;
@@ -1057,12 +1062,13 @@ static int do_memory_failure(struct mce *m)
ret = memory_failure(m->addr >> PAGE_SHIFT, flags);
if (ret)
pr_err("Memory error not recovered");
+ else
+ mce_unmap_kpfn(m->addr >> PAGE_SHIFT);
return ret;
}
-#if defined(arch_unmap_kpfn) && defined(CONFIG_MEMORY_FAILURE)
-
-void arch_unmap_kpfn(unsigned long pfn)
+#ifndef mce_unmap_kpfn
+static void mce_unmap_kpfn(unsigned long pfn)
{
unsigned long decoy_addr;
@@ -1073,7 +1079,7 @@ void arch_unmap_kpfn(unsigned long pfn)
* We would like to just call:
* set_memory_np((unsigned long)pfn_to_kaddr(pfn), 1);
* but doing that would radically increase the odds of a
- * speculative access to the posion page because we'd have
+ * speculative access to the poison page because we'd have
* the virtual address of the kernel 1:1 mapping sitting
* around in registers.
* Instead we get tricky. We create a non-canonical address
@@ -1098,7 +1104,6 @@ void arch_unmap_kpfn(unsigned long pfn)
if (set_memory_np(decoy_addr, 1))
pr_warn("Could not invalidate pfn=0x%lx from 1:1 map\n", pfn);
-
}
#endif
diff --git a/include/linux/mm_inline.h b/include/linux/mm_inline.h
index c30b32e3c862..10191c28fc04 100644
--- a/include/linux/mm_inline.h
+++ b/include/linux/mm_inline.h
@@ -127,10 +127,4 @@ static __always_inline enum lru_list page_lru(struct page *page)
#define lru_to_page(head) (list_entry((head)->prev, struct page, lru))
-#ifdef arch_unmap_kpfn
-extern void arch_unmap_kpfn(unsigned long pfn);
-#else
-static __always_inline void arch_unmap_kpfn(unsigned long pfn) { }
-#endif
-
#endif
diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 4b80ccee4535..8291b75f42c8 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -1139,8 +1139,6 @@ int memory_failure(unsigned long pfn, int flags)
return 0;
}
- arch_unmap_kpfn(pfn);
-
orig_head = hpage = compound_head(p);
num_poisoned_pages_inc();
One I2C bus on my Atom E3845 board has been broken since 4.9.
It has two devices, both declared by ACPI and with built-in drivers.
There are two back-to-back transactions originating from the kernel, one
targeting each device. The first transaction works, the second one locks
up the I2C controller. The controller never recovers.
These kernel logs show up whenever an I2C transaction is attempted after
this failure.
i2c-designware-pci 0000:00:18.3: timeout in disabling adapter
i2c-designware-pci 0000:00:18.3: timeout waiting for bus ready
Waiting for the I2C controller status to indicate that it is enabled
before programming it fixes the issue.
I have tested this patch on 4.14 and 4.15.
Fixes: commit 2702ea7dbec5 ("i2c: designware: wait for disable/enable only if necessary")
Cc: linux-stable <stable(a)vger.kernel.org> #4.13+
Cc: José Roberto de Souza <jose.souza(a)intel.com>
Signed-off-by: Ben Gardner <gardner.ben(a)gmail.com>
---
drivers/i2c/busses/i2c-designware-master.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/i2c/busses/i2c-designware-master.c b/drivers/i2c/busses/i2c-designware-master.c
index ae69188..55926ef 100644
--- a/drivers/i2c/busses/i2c-designware-master.c
+++ b/drivers/i2c/busses/i2c-designware-master.c
@@ -209,7 +209,7 @@ static void i2c_dw_xfer_init(struct dw_i2c_dev *dev)
i2c_dw_disable_int(dev);
/* Enable the adapter */
- __i2c_dw_enable(dev, true);
+ __i2c_dw_enable_and_wait(dev, true);
/* Clear and enable interrupts */
dw_readl(dev, DW_IC_CLR_INTR);
--
2.7.4
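For reference, a rough sketch of what the __i2c_dw_enable_and_wait() helper adds over the bare enable: keep polling the controller's enable-status register until it reflects the requested state instead of assuming the write took effect immediately. Register names and timings below are recalled from the driver and are illustrative only.

static void sketch_enable_and_wait(struct dw_i2c_dev *dev, bool enable)
{
	int timeout = 100;

	do {
		__i2c_dw_enable(dev, enable);
		/* Bit 0 of IC_ENABLE_STATUS tracks whether the adapter is enabled. */
		if ((dw_readl(dev, DW_IC_ENABLE_STATUS) & 1) == enable)
			return;

		usleep_range(25, 250);	/* let the adapter settle before re-checking */
	} while (timeout--);

	dev_warn(dev->dev, "timeout in %sabling adapter\n", enable ? "en" : "dis");
}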
Commit f7f99100d8d95dbcf09e0216a143211e79418b9f ("mm: stop zeroing
memory during allocation in vmemmap") broke Xen pv domains in some
configurations, as the "Pinned" information in struct page of early
page tables could get lost.
Avoid this problem by not deferring struct page initialization when
running as Xen pv guest.
Cc: <stable(a)vger.kernel.org> #4.15
Signed-off-by: Juergen Gross <jgross(a)suse.com>
---
mm/page_alloc.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 81e18ceef579..681d504b9a40 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -347,6 +347,9 @@ static inline bool update_defer_init(pg_data_t *pgdat,
/* Always populate low zones for address-constrained allocations */
if (zone_end < pgdat_end_pfn(pgdat))
return true;
+ /* Xen PV domains need page structures early */
+ if (xen_pv_domain())
+ return true;
(*nr_initialised)++;
if ((*nr_initialised > pgdat->static_init_pgcnt) &&
(pfn & (PAGES_PER_SECTION - 1)) == 0) {
--
2.13.6
From: Eric Biggers <ebiggers(a)google.com>
The X.509 parser mishandles the case where the certificate's signature's
hash algorithm is not available in the crypto API. In this case,
x509_get_sig_params() doesn't allocate the cert->sig->digest buffer;
this part seems to be intentional. However,
public_key_verify_signature() is still called via
x509_check_for_self_signed(), which triggers the 'BUG_ON(!sig->digest)'.
Fix this by making public_key_verify_signature() return -ENOPKG if the
hash buffer has not been allocated.
Reproducer when all the CONFIG_CRYPTO_SHA512* options are disabled:
openssl req -new -sha512 -x509 -batch -nodes -outform der \
| keyctl padd asymmetric desc @s
Fixes: 6c2dc5ae4ab7 ("X.509: Extract signature digest and make self-signed cert checks earlier")
Reported-by: Paolo Valente <paolo.valente(a)linaro.org>
Cc: Paolo Valente <paolo.valente(a)linaro.org>
Cc: <stable(a)vger.kernel.org> # v4.7+
Signed-off-by: Eric Biggers <ebiggers(a)google.com>
---
crypto/asymmetric_keys/public_key.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/crypto/asymmetric_keys/public_key.c b/crypto/asymmetric_keys/public_key.c
index de996586762a..e929fe1e4106 100644
--- a/crypto/asymmetric_keys/public_key.c
+++ b/crypto/asymmetric_keys/public_key.c
@@ -79,9 +79,11 @@ int public_key_verify_signature(const struct public_key *pkey,
BUG_ON(!pkey);
BUG_ON(!sig);
- BUG_ON(!sig->digest);
BUG_ON(!sig->s);
+ if (!sig->digest)
+ return -ENOPKG;
+
alg_name = sig->pkey_algo;
if (strcmp(sig->pkey_algo, "rsa") == 0) {
/* The data wangled by the RSA algorithm is typically padded
--
2.16.0.rc1.238.g530d649a79-goog