This is a note to let you know that I've just added the patch titled
x86/mm/64: Fix reboot interaction with CR4.PCIDE
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-mm-64-fix-reboot-interaction-with-cr4.pcide.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
From 924c6b900cfdf376b07bccfd80e62b21914f8a5a Mon Sep 17 00:00:00 2001
From: Andy Lutomirski <luto(a)kernel.org>
Date: Sun, 8 Oct 2017 21:53:05 -0700
Subject: x86/mm/64: Fix reboot interaction with CR4.PCIDE
From: Andy Lutomirski <luto(a)kernel.org>
commit 924c6b900cfdf376b07bccfd80e62b21914f8a5a upstream.
Trying to reboot via real mode fails with PCID on: long mode cannot
be exited while CR4.PCIDE is set. (No, I have no idea why, but the
SDM and actual CPUs are in agreement here.) The result is a GPF and
a hang instead of a reboot.
I didn't catch this in testing because neither my computer nor my VM
reboots this way. I can trigger it with reboot=bios, though.
Fixes: 660da7c9228f ("x86/mm: Enable CR4.PCIDE on supported systems")
Reported-and-tested-by: Steven Rostedt (VMware) <rostedt(a)goodmis.org>
Signed-off-by: Andy Lutomirski <luto(a)kernel.org>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Borislav Petkov <bp(a)alien8.de>
Link: https://lkml.kernel.org/r/f1e7d965998018450a7a70c2823873686a8b21c0.15075247…
Cc: Hugh Dickins <hughd(a)google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/kernel/reboot.c | 4 ++++
1 file changed, 4 insertions(+)
--- a/arch/x86/kernel/reboot.c
+++ b/arch/x86/kernel/reboot.c
@@ -106,6 +106,10 @@ void __noreturn machine_real_restart(uns
load_cr3(initial_page_table);
#else
write_cr3(real_mode_header->trampoline_pgd);
+
+ /* Exiting long mode will fail if CR4.PCIDE is set. */
+ if (static_cpu_has(X86_FEATURE_PCID))
+ cr4_clear_bits(X86_CR4_PCIDE);
#endif
/* Jump to the identity-mapped low memory code */
Patches currently in stable-queue which might be from luto(a)kernel.org are
queue-4.9/x86-vm86-32-switch-to-flush_tlb_mm_range-in-mark_screen_rdonly.patch
queue-4.9/x86-mm-add-the-nopcid-boot-option-to-turn-off-pcid.patch
queue-4.9/x86-mm-enable-cr4.pcide-on-supported-systems.patch
queue-4.9/x86-mm-remove-the-up-asm-tlbflush.h-code-always-use-the-formerly-smp-code.patch
queue-4.9/x86-mm-reimplement-flush_tlb_page-using-flush_tlb_mm_range.patch
queue-4.9/x86-mm-make-flush_tlb_mm_range-more-predictable.patch
queue-4.9/x86-mm-remove-flush_tlb-and-flush_tlb_current_task.patch
queue-4.9/x86-mm-disable-pcid-on-32-bit-kernels.patch
queue-4.9/x86-mm-64-fix-reboot-interaction-with-cr4.pcide.patch
This is a note to let you know that I've just added the patch titled
x86/mm: Add the 'nopcid' boot option to turn off PCID
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-mm-add-the-nopcid-boot-option-to-turn-off-pcid.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
From 0790c9aad84901ca1bdc14746175549c8b5da215 Mon Sep 17 00:00:00 2001
From: Andy Lutomirski <luto(a)kernel.org>
Date: Thu, 29 Jun 2017 08:53:20 -0700
Subject: x86/mm: Add the 'nopcid' boot option to turn off PCID
From: Andy Lutomirski <luto(a)kernel.org>
commit 0790c9aad84901ca1bdc14746175549c8b5da215 upstream.
The parameter is only present on x86_64 systems to save a few bytes,
as PCID is always disabled on x86_32.
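(A usage sketch added for context; it is not part of the original patch description.) With this applied, PCID can be turned off by appending the option to the kernel command line, for example via /etc/default/grub:

        GRUB_CMDLINE_LINUX_DEFAULT="... nopcid"   # illustrative only; any boot loader that edits the command line works

After boot, dmesg should contain "nopcid: PCID feature disabled" (printed by the handler below) and 'pcid' should no longer appear in the flags line of /proc/cpuinfo.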
Signed-off-by: Andy Lutomirski <luto(a)kernel.org>
Reviewed-by: Nadav Amit <nadav.amit(a)gmail.com>
Reviewed-by: Borislav Petkov <bp(a)suse.de>
Reviewed-by: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Arjan van de Ven <arjan(a)linux.intel.com>
Cc: Borislav Petkov <bp(a)alien8.de>
Cc: Dave Hansen <dave.hansen(a)intel.com>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Mel Gorman <mgorman(a)suse.de>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Rik van Riel <riel(a)redhat.com>
Cc: linux-mm(a)kvack.org
Link: http://lkml.kernel.org/r/8bbb2e65bcd249a5f18bfb8128b4689f08ac2b60.149875120…
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Cc: Hugh Dickins <hughd(a)google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
Documentation/kernel-parameters.txt | 2 ++
arch/x86/kernel/cpu/common.c | 18 ++++++++++++++++++
2 files changed, 20 insertions(+)
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -2795,6 +2795,8 @@ bytes respectively. Such letter suffixes
nopat [X86] Disable PAT (page attribute table extension of
pagetables) support.
+ nopcid [X86-64] Disable the PCID cpu feature.
+
norandmaps Don't use address space randomization. Equivalent to
echo 0 > /proc/sys/kernel/randomize_va_space
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -163,6 +163,24 @@ static int __init x86_mpx_setup(char *s)
}
__setup("nompx", x86_mpx_setup);
+#ifdef CONFIG_X86_64
+static int __init x86_pcid_setup(char *s)
+{
+ /* require an exact match without trailing characters */
+ if (strlen(s))
+ return 0;
+
+ /* do not emit a message if the feature is not present */
+ if (!boot_cpu_has(X86_FEATURE_PCID))
+ return 1;
+
+ setup_clear_cpu_cap(X86_FEATURE_PCID);
+ pr_info("nopcid: PCID feature disabled\n");
+ return 1;
+}
+__setup("nopcid", x86_pcid_setup);
+#endif
+
static int __init x86_noinvpcid_setup(char *s)
{
/* noinvpcid doesn't accept parameters */
Patches currently in stable-queue which might be from luto(a)kernel.org are
queue-4.9/x86-vm86-32-switch-to-flush_tlb_mm_range-in-mark_screen_rdonly.patch
queue-4.9/x86-mm-add-the-nopcid-boot-option-to-turn-off-pcid.patch
queue-4.9/x86-mm-enable-cr4.pcide-on-supported-systems.patch
queue-4.9/x86-mm-remove-the-up-asm-tlbflush.h-code-always-use-the-formerly-smp-code.patch
queue-4.9/x86-mm-reimplement-flush_tlb_page-using-flush_tlb_mm_range.patch
queue-4.9/x86-mm-make-flush_tlb_mm_range-more-predictable.patch
queue-4.9/x86-mm-remove-flush_tlb-and-flush_tlb_current_task.patch
queue-4.9/x86-mm-disable-pcid-on-32-bit-kernels.patch
queue-4.9/x86-mm-64-fix-reboot-interaction-with-cr4.pcide.patch
This is a note to let you know that I've just added the patch titled
x86/mm: Enable CR4.PCIDE on supported systems
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-mm-enable-cr4.pcide-on-supported-systems.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
From 660da7c9228f685b2ebe664f9fd69aaddcc420b5 Mon Sep 17 00:00:00 2001
From: Andy Lutomirski <luto(a)kernel.org>
Date: Thu, 29 Jun 2017 08:53:21 -0700
Subject: x86/mm: Enable CR4.PCIDE on supported systems
From: Andy Lutomirski <luto(a)kernel.org>
commit 660da7c9228f685b2ebe664f9fd69aaddcc420b5 upstream.
We can use PCID if the CPU has PCID and PGE and we're not on Xen.
By itself, this has no effect. A followup patch will start using PCID.
Signed-off-by: Andy Lutomirski <luto(a)kernel.org>
Reviewed-by: Nadav Amit <nadav.amit(a)gmail.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky(a)oracle.com>
Reviewed-by: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Arjan van de Ven <arjan(a)linux.intel.com>
Cc: Borislav Petkov <bp(a)alien8.de>
Cc: Dave Hansen <dave.hansen(a)intel.com>
Cc: Juergen Gross <jgross(a)suse.com>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Mel Gorman <mgorman(a)suse.de>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Rik van Riel <riel(a)redhat.com>
Cc: linux-mm(a)kvack.org
Link: http://lkml.kernel.org/r/6327ecd907b32f79d5aa0d466f04503bbec5df88.149875120…
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Cc: Hugh Dickins <hughd(a)google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/include/asm/tlbflush.h | 8 ++++++++
arch/x86/kernel/cpu/common.c | 22 ++++++++++++++++++++++
arch/x86/xen/enlighten.c | 6 ++++++
3 files changed, 36 insertions(+)
--- a/arch/x86/include/asm/tlbflush.h
+++ b/arch/x86/include/asm/tlbflush.h
@@ -190,6 +190,14 @@ static inline void __flush_tlb_all(void)
__flush_tlb_global();
else
__flush_tlb();
+
+ /*
+ * Note: if we somehow had PCID but not PGE, then this wouldn't work --
+ * we'd end up flushing kernel translations for the current ASID but
+ * we might fail to flush kernel translations for other cached ASIDs.
+ *
+ * To avoid this issue, we force PCID off if PGE is off.
+ */
}
static inline void __flush_tlb_one(unsigned long addr)
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -321,6 +321,25 @@ static __always_inline void setup_smap(s
}
}
+static void setup_pcid(struct cpuinfo_x86 *c)
+{
+ if (cpu_has(c, X86_FEATURE_PCID)) {
+ if (cpu_has(c, X86_FEATURE_PGE)) {
+ cr4_set_bits(X86_CR4_PCIDE);
+ } else {
+ /*
+ * flush_tlb_all(), as currently implemented, won't
+ * work if PCID is on but PGE is not. Since that
+ * combination doesn't exist on real hardware, there's
+ * no reason to try to fully support it, but it's
+ * polite to avoid corrupting data if we're on
+ * an improperly configured VM.
+ */
+ clear_cpu_cap(c, X86_FEATURE_PCID);
+ }
+ }
+}
+
/*
* Some CPU features depend on higher CPUID levels, which may not always
* be available due to CPUID level capping or broken virtualization
@@ -952,6 +971,9 @@ static void identify_cpu(struct cpuinfo_
setup_smep(c);
setup_smap(c);
+ /* Set up PCID */
+ setup_pcid(c);
+
/*
* The vendor-specific functions might have changed features.
* Now we do "generic changes."
--- a/arch/x86/xen/enlighten.c
+++ b/arch/x86/xen/enlighten.c
@@ -433,6 +433,12 @@ static void __init xen_init_cpuid_mask(v
~((1 << X86_FEATURE_MTRR) | /* disable MTRR */
(1 << X86_FEATURE_ACC)); /* thermal monitoring */
+ /*
+ * Xen PV would need some work to support PCID: CR3 handling as well
+ * as xen_flush_tlb_others() would need updating.
+ */
+ cpuid_leaf1_ecx_mask &= ~(1 << X86_FEATURE_PCID); /* disable PCID */
+
if (!xen_initial_domain())
cpuid_leaf1_edx_mask &=
~((1 << X86_FEATURE_ACPI)); /* disable ACPI */
Patches currently in stable-queue which might be from luto(a)kernel.org are
queue-4.4/x86-vm86-32-switch-to-flush_tlb_mm_range-in-mark_screen_rdonly.patch
queue-4.4/x86-mm-add-the-nopcid-boot-option-to-turn-off-pcid.patch
queue-4.4/x86-mm-enable-cr4.pcide-on-supported-systems.patch
queue-4.4/x86-mm-remove-the-up-asm-tlbflush.h-code-always-use-the-formerly-smp-code.patch
queue-4.4/x86-mm-reimplement-flush_tlb_page-using-flush_tlb_mm_range.patch
queue-4.4/x86-mm-make-flush_tlb_mm_range-more-predictable.patch
queue-4.4/x86-mm-remove-flush_tlb-and-flush_tlb_current_task.patch
queue-4.4/x86-mm-disable-pcid-on-32-bit-kernels.patch
queue-4.4/x86-mm-64-fix-reboot-interaction-with-cr4.pcide.patch
This is a note to let you know that I've just added the patch titled
x86/mm/64: Fix reboot interaction with CR4.PCIDE
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-mm-64-fix-reboot-interaction-with-cr4.pcide.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
From 924c6b900cfdf376b07bccfd80e62b21914f8a5a Mon Sep 17 00:00:00 2001
From: Andy Lutomirski <luto(a)kernel.org>
Date: Sun, 8 Oct 2017 21:53:05 -0700
Subject: x86/mm/64: Fix reboot interaction with CR4.PCIDE
From: Andy Lutomirski <luto(a)kernel.org>
commit 924c6b900cfdf376b07bccfd80e62b21914f8a5a upstream.
Trying to reboot via real mode fails with PCID on: long mode cannot
be exited while CR4.PCIDE is set. (No, I have no idea why, but the
SDM and actual CPUs are in agreement here.) The result is a GPF and
a hang instead of a reboot.
I didn't catch this in testing because neither my computer nor my VM
reboots this way. I can trigger it with reboot=bios, though.
Fixes: 660da7c9228f ("x86/mm: Enable CR4.PCIDE on supported systems")
Reported-and-tested-by: Steven Rostedt (VMware) <rostedt(a)goodmis.org>
Signed-off-by: Andy Lutomirski <luto(a)kernel.org>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Borislav Petkov <bp(a)alien8.de>
Link: https://lkml.kernel.org/r/f1e7d965998018450a7a70c2823873686a8b21c0.15075247…
Cc: Hugh Dickins <hughd(a)google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/kernel/reboot.c | 4 ++++
1 file changed, 4 insertions(+)
--- a/arch/x86/kernel/reboot.c
+++ b/arch/x86/kernel/reboot.c
@@ -93,6 +93,10 @@ void __noreturn machine_real_restart(uns
load_cr3(initial_page_table);
#else
write_cr3(real_mode_header->trampoline_pgd);
+
+ /* Exiting long mode will fail if CR4.PCIDE is set. */
+ if (static_cpu_has(X86_FEATURE_PCID))
+ cr4_clear_bits(X86_CR4_PCIDE);
#endif
/* Jump to the identity-mapped low memory code */
Patches currently in stable-queue which might be from luto(a)kernel.org are
queue-4.4/x86-vm86-32-switch-to-flush_tlb_mm_range-in-mark_screen_rdonly.patch
queue-4.4/x86-mm-add-the-nopcid-boot-option-to-turn-off-pcid.patch
queue-4.4/x86-mm-enable-cr4.pcide-on-supported-systems.patch
queue-4.4/x86-mm-remove-the-up-asm-tlbflush.h-code-always-use-the-formerly-smp-code.patch
queue-4.4/x86-mm-reimplement-flush_tlb_page-using-flush_tlb_mm_range.patch
queue-4.4/x86-mm-make-flush_tlb_mm_range-more-predictable.patch
queue-4.4/x86-mm-remove-flush_tlb-and-flush_tlb_current_task.patch
queue-4.4/x86-mm-disable-pcid-on-32-bit-kernels.patch
queue-4.4/x86-mm-64-fix-reboot-interaction-with-cr4.pcide.patch
This is a note to let you know that I've just added the patch titled
x86/mm: Add the 'nopcid' boot option to turn off PCID
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-mm-add-the-nopcid-boot-option-to-turn-off-pcid.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
From 0790c9aad84901ca1bdc14746175549c8b5da215 Mon Sep 17 00:00:00 2001
From: Andy Lutomirski <luto(a)kernel.org>
Date: Thu, 29 Jun 2017 08:53:20 -0700
Subject: x86/mm: Add the 'nopcid' boot option to turn off PCID
From: Andy Lutomirski <luto(a)kernel.org>
commit 0790c9aad84901ca1bdc14746175549c8b5da215 upstream.
The parameter is only present on x86_64 systems to save a few bytes,
as PCID is always disabled on x86_32.
Signed-off-by: Andy Lutomirski <luto(a)kernel.org>
Reviewed-by: Nadav Amit <nadav.amit(a)gmail.com>
Reviewed-by: Borislav Petkov <bp(a)suse.de>
Reviewed-by: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Arjan van de Ven <arjan(a)linux.intel.com>
Cc: Borislav Petkov <bp(a)alien8.de>
Cc: Dave Hansen <dave.hansen(a)intel.com>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Mel Gorman <mgorman(a)suse.de>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Rik van Riel <riel(a)redhat.com>
Cc: linux-mm(a)kvack.org
Link: http://lkml.kernel.org/r/8bbb2e65bcd249a5f18bfb8128b4689f08ac2b60.149875120…
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Cc: Hugh Dickins <hughd(a)google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
Documentation/kernel-parameters.txt | 2 ++
arch/x86/kernel/cpu/common.c | 18 ++++++++++++++++++
2 files changed, 20 insertions(+)
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -2555,6 +2555,8 @@ bytes respectively. Such letter suffixes
nopat [X86] Disable PAT (page attribute table extension of
pagetables) support.
+ nopcid [X86-64] Disable the PCID cpu feature.
+
norandmaps Don't use address space randomization. Equivalent to
echo 0 > /proc/sys/kernel/randomize_va_space
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -162,6 +162,24 @@ static int __init x86_mpx_setup(char *s)
}
__setup("nompx", x86_mpx_setup);
+#ifdef CONFIG_X86_64
+static int __init x86_pcid_setup(char *s)
+{
+ /* require an exact match without trailing characters */
+ if (strlen(s))
+ return 0;
+
+ /* do not emit a message if the feature is not present */
+ if (!boot_cpu_has(X86_FEATURE_PCID))
+ return 1;
+
+ setup_clear_cpu_cap(X86_FEATURE_PCID);
+ pr_info("nopcid: PCID feature disabled\n");
+ return 1;
+}
+__setup("nopcid", x86_pcid_setup);
+#endif
+
static int __init x86_noinvpcid_setup(char *s)
{
/* noinvpcid doesn't accept parameters */
Patches currently in stable-queue which might be from luto(a)kernel.org are
queue-4.4/x86-vm86-32-switch-to-flush_tlb_mm_range-in-mark_screen_rdonly.patch
queue-4.4/x86-mm-add-the-nopcid-boot-option-to-turn-off-pcid.patch
queue-4.4/x86-mm-enable-cr4.pcide-on-supported-systems.patch
queue-4.4/x86-mm-remove-the-up-asm-tlbflush.h-code-always-use-the-formerly-smp-code.patch
queue-4.4/x86-mm-reimplement-flush_tlb_page-using-flush_tlb_mm_range.patch
queue-4.4/x86-mm-make-flush_tlb_mm_range-more-predictable.patch
queue-4.4/x86-mm-remove-flush_tlb-and-flush_tlb_current_task.patch
queue-4.4/x86-mm-disable-pcid-on-32-bit-kernels.patch
queue-4.4/x86-mm-64-fix-reboot-interaction-with-cr4.pcide.patch
This is a note to let you know that I've just added the patch titled
kbuild: add '-fno-stack-check' to kernel build options
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
kbuild-add-fno-stack-check-to-kernel-build-options.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
From 3ce120b16cc548472f80cf8644f90eda958cf1b6 Mon Sep 17 00:00:00 2001
From: Linus Torvalds <torvalds(a)linux-foundation.org>
Date: Fri, 29 Dec 2017 17:34:43 -0800
Subject: kbuild: add '-fno-stack-check' to kernel build options
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
From: Linus Torvalds <torvalds(a)linux-foundation.org>
commit 3ce120b16cc548472f80cf8644f90eda958cf1b6 upstream.
It appears that hardened gentoo enables "-fstack-check" by default for
gcc.
That doesn't work _at_all_ for the kernel, because the kernel stack
doesn't act like a user stack at all: it's much smaller, and it doesn't
auto-expand on use. So the extra "probe one page below the stack" code
generated by -fstack-check just breaks the kernel in horrible ways,
causing infinite double faults etc.
[ I have to say, that the particular code gcc generates looks very
stupid even for user space where it works, but that's a separate
issue. ]
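(An illustrative sketch, not from the original mail, of what the probing does.) With -fstack-check, gcc emits writes that touch memory a page or so below the current stack pointer while a frame is set up. On a user stack such a write faults into the guard region and the stack grows; a kernel stack is small and fixed-size, so the probe lands outside the stack entirely:

        /* plain user-space C, for intuition only -- not kernel code */
        void big_frame(void)
        {
                char buf[64 * 1024];    /* frame far larger than a page */

                /*
                 * -fstack-check makes gcc probe pages below the stack
                 * pointer before buf is used; a kernel stack (a few KB,
                 * with no automatic growth) has nothing there to probe.
                 */
                buf[0] = 0;
        }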
Reported-and-tested-by: Alexander Tsoy <alexander(a)tsoy.me>
Reported-and-tested-by: Toralf Förster <toralf.foerster(a)gmx.de>
Cc: Dave Hansen <dave.hansen(a)intel.com>
Cc: Jiri Kosina <jikos(a)kernel.org>
Cc: Andy Lutomirski <luto(a)amacapital.net>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
Makefile | 3 +++
1 file changed, 3 insertions(+)
--- a/Makefile
+++ b/Makefile
@@ -788,6 +788,9 @@ KBUILD_CFLAGS += $(call cc-disable-warni
# disable invalid "can't wrap" optimizations for signed / pointers
KBUILD_CFLAGS += $(call cc-option,-fno-strict-overflow)
+# Make sure -fstack-check isn't enabled (like gentoo apparently did)
+KBUILD_CFLAGS += $(call cc-option,-fno-stack-check,)
+
# conserve stack if available
KBUILD_CFLAGS += $(call cc-option,-fconserve-stack)
Patches currently in stable-queue which might be from torvalds(a)linux-foundation.org are
queue-4.9/x86-vm86-32-switch-to-flush_tlb_mm_range-in-mark_screen_rdonly.patch
queue-4.9/x86-mm-remove-the-up-asm-tlbflush.h-code-always-use-the-formerly-smp-code.patch
queue-4.9/kbuild-add-fno-stack-check-to-kernel-build-options.patch
queue-4.9/x86-mm-reimplement-flush_tlb_page-using-flush_tlb_mm_range.patch
queue-4.9/x86-mm-make-flush_tlb_mm_range-more-predictable.patch
queue-4.9/x86-mm-remove-flush_tlb-and-flush_tlb_current_task.patch
queue-4.9/x86-mm-disable-pcid-on-32-bit-kernels.patch
This is a note to let you know that I've just added the patch titled
kbuild: add '-fno-stack-check' to kernel build options
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
kbuild-add-fno-stack-check-to-kernel-build-options.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
From 3ce120b16cc548472f80cf8644f90eda958cf1b6 Mon Sep 17 00:00:00 2001
From: Linus Torvalds <torvalds(a)linux-foundation.org>
Date: Fri, 29 Dec 2017 17:34:43 -0800
Subject: kbuild: add '-fno-stack-check' to kernel build options
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
From: Linus Torvalds <torvalds(a)linux-foundation.org>
commit 3ce120b16cc548472f80cf8644f90eda958cf1b6 upstream.
It appears that hardened gentoo enables "-fstack-check" by default for
gcc.
That doesn't work _at_all_ for the kernel, because the kernel stack
doesn't act like a user stack at all: it's much smaller, and it doesn't
auto-expand on use. So the extra "probe one page below the stack" code
generated by -fstack-check just breaks the kernel in horrible ways,
causing infinite double faults etc.
[ I have to say, that the particular code gcc generates looks very
stupid even for user space where it works, but that's a separate
issue. ]
Reported-and-tested-by: Alexander Tsoy <alexander(a)tsoy.me>
Reported-and-tested-by: Toralf Förster <toralf.foerster(a)gmx.de>
Cc: Dave Hansen <dave.hansen(a)intel.com>
Cc: Jiri Kosina <jikos(a)kernel.org>
Cc: Andy Lutomirski <luto(a)amacapital.net>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
Makefile | 3 +++
1 file changed, 3 insertions(+)
--- a/Makefile
+++ b/Makefile
@@ -782,6 +782,9 @@ KBUILD_CFLAGS += $(call cc-disable-warni
# disable invalid "can't wrap" optimizations for signed / pointers
KBUILD_CFLAGS += $(call cc-option,-fno-strict-overflow)
+# Make sure -fstack-check isn't enabled (like gentoo apparently did)
+KBUILD_CFLAGS += $(call cc-option,-fno-stack-check,)
+
# conserve stack if available
KBUILD_CFLAGS += $(call cc-option,-fconserve-stack)
Patches currently in stable-queue which might be from torvalds(a)linux-foundation.org are
queue-4.4/x86-vm86-32-switch-to-flush_tlb_mm_range-in-mark_screen_rdonly.patch
queue-4.4/x86-mm-remove-the-up-asm-tlbflush.h-code-always-use-the-formerly-smp-code.patch
queue-4.4/kbuild-add-fno-stack-check-to-kernel-build-options.patch
queue-4.4/x86-mm-reimplement-flush_tlb_page-using-flush_tlb_mm_range.patch
queue-4.4/x86-mm-make-flush_tlb_mm_range-more-predictable.patch
queue-4.4/x86-mm-remove-flush_tlb-and-flush_tlb_current_task.patch
queue-4.4/x86-mm-disable-pcid-on-32-bit-kernels.patch
This is a note to let you know that I've just added the patch titled
kbuild: add '-fno-stack-check' to kernel build options
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
kbuild-add-fno-stack-check-to-kernel-build-options.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
From 3ce120b16cc548472f80cf8644f90eda958cf1b6 Mon Sep 17 00:00:00 2001
From: Linus Torvalds <torvalds(a)linux-foundation.org>
Date: Fri, 29 Dec 2017 17:34:43 -0800
Subject: kbuild: add '-fno-stack-check' to kernel build options
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
From: Linus Torvalds <torvalds(a)linux-foundation.org>
commit 3ce120b16cc548472f80cf8644f90eda958cf1b6 upstream.
It appears that hardened gentoo enables "-fstack-check" by default for
gcc.
That doesn't work _at_all_ for the kernel, because the kernel stack
doesn't act like a user stack at all: it's much smaller, and it doesn't
auto-expand on use. So the extra "probe one page below the stack" code
generated by -fstack-check just breaks the kernel in horrible ways,
causing infinite double faults etc.
[ I have to say, that the particular code gcc generates looks very
stupid even for user space where it works, but that's a separate
issue. ]
Reported-and-tested-by: Alexander Tsoy <alexander(a)tsoy.me>
Reported-and-tested-by: Toralf Förster <toralf.foerster(a)gmx.de>
Cc: Dave Hansen <dave.hansen(a)intel.com>
Cc: Jiri Kosina <jikos(a)kernel.org>
Cc: Andy Lutomirski <luto(a)amacapital.net>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
Makefile | 3 +++
1 file changed, 3 insertions(+)
--- a/Makefile
+++ b/Makefile
@@ -802,6 +802,9 @@ KBUILD_CFLAGS += $(call cc-disable-warni
# disable invalid "can't wrap" optimizations for signed / pointers
KBUILD_CFLAGS += $(call cc-option,-fno-strict-overflow)
+# Make sure -fstack-check isn't enabled (like gentoo apparently did)
+KBUILD_CFLAGS += $(call cc-option,-fno-stack-check,)
+
# conserve stack if available
KBUILD_CFLAGS += $(call cc-option,-fconserve-stack)
Patches currently in stable-queue which might be from torvalds(a)linux-foundation.org are
queue-4.14/x86-mm-pti-allow-nx-poison-to-be-set-in-p4d-pgd.patch
queue-4.14/x86-dumpstack-indicate-in-oops-whether-pti-is-configured-and-enabled.patch
queue-4.14/x86-mm-clarify-the-whole-asid-kernel-pcid-user-pcid-naming.patch
queue-4.14/x86-mm-abstract-switching-cr3.patch
queue-4.14/x86-mm-optimize-restore_cr3.patch
queue-4.14/x86-mm-pti-force-entry-through-trampoline-when-pti-active.patch
queue-4.14/x86-entry-align-entry-text-section-to-pmd-boundary.patch
queue-4.14/x86-mm-64-make-a-full-pgd-entry-size-hole-in-the-memory-map.patch
queue-4.14/x86-cpu_entry_area-add-debugstore-entries-to-cpu_entry_area.patch
queue-4.14/x86-mm-pti-share-entry-text-pmd.patch
queue-4.14/x86-ldt-make-the-ldt-mapping-ro.patch
queue-4.14/x86-mm-dump_pagetables-check-user-space-page-table-for-wx-pages.patch
queue-4.14/x86-pti-map-the-vsyscall-page-if-needed.patch
queue-4.14/x86-mm-pti-disable-global-pages-if-page_table_isolation-y.patch
queue-4.14/x86-mm-dump_pagetables-allow-dumping-current-pagetables.patch
queue-4.14/x86-mm-pti-add-infrastructure-for-page-table-isolation.patch
queue-4.14/x86-cpufeatures-add-x86_bug_cpu_insecure.patch
queue-4.14/x86-mm-use-fix-pcid-to-optimize-user-kernel-switches.patch
queue-4.14/x86-mm-use-invpcid-for-__native_flush_tlb_single.patch
queue-4.14/x86-mm-pti-prepare-the-x86-entry-assembly-code-for-entry-exit-cr3-switching.patch
queue-4.14/x86-mm-allow-flushing-for-future-asid-switches.patch
queue-4.14/x86-mm-pti-add-kconfig.patch
queue-4.14/x86-mm-pti-add-mapping-helper-functions.patch
queue-4.14/x86-mm-pti-map-espfix-into-user-space.patch
queue-4.14/kbuild-add-fno-stack-check-to-kernel-build-options.patch
queue-4.14/x86-mm-pti-share-cpu_entry_area-with-user-space-page-tables.patch
queue-4.14/x86-mm-pti-add-functions-to-clone-kernel-pmds.patch
queue-4.14/x86-pti-add-the-pti-cmdline-option-and-documentation.patch
queue-4.14/x86-events-intel-ds-map-debug-buffers-in-cpu_entry_area.patch
queue-4.14/x86-mm-pti-allocate-a-separate-user-pgd.patch
queue-4.14/x86-pti-put-the-ldt-in-its-own-pgd-if-pti-is-on.patch
queue-4.14/x86-mm-pti-populate-user-pgd.patch
queue-4.14/x86-mm-dump_pagetables-add-page-table-directory-to-the-debugfs-vfs-hierarchy.patch
This is a note to let you know that I've just added the patch titled
kbuild: add '-fno-stack-check' to kernel build options
to the 3.18-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
kbuild-add-fno-stack-check-to-kernel-build-options.patch
and it can be found in the queue-3.18 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
From 3ce120b16cc548472f80cf8644f90eda958cf1b6 Mon Sep 17 00:00:00 2001
From: Linus Torvalds <torvalds(a)linux-foundation.org>
Date: Fri, 29 Dec 2017 17:34:43 -0800
Subject: kbuild: add '-fno-stack-check' to kernel build options
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
From: Linus Torvalds <torvalds(a)linux-foundation.org>
commit 3ce120b16cc548472f80cf8644f90eda958cf1b6 upstream.
It appears that hardened gentoo enables "-fstack-check" by default for
gcc.
That doesn't work _at_all_ for the kernel, because the kernel stack
doesn't act like a user stack at all: it's much smaller, and it doesn't
auto-expand on use. So the extra "probe one page below the stack" code
generated by -fstack-check just breaks the kernel in horrible ways,
causing infinite double faults etc.
[ I have to say, that the particular code gcc generates looks very
stupid even for user space where it works, but that's a separate
issue. ]
Reported-and-tested-by: Alexander Tsoy <alexander(a)tsoy.me>
Reported-and-tested-by: Toralf Förster <toralf.foerster(a)gmx.de>
Cc: Dave Hansen <dave.hansen(a)intel.com>
Cc: Jiri Kosina <jikos(a)kernel.org>
Cc: Andy Lutomirski <luto(a)amacapital.net>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
Makefile | 3 +++
1 file changed, 3 insertions(+)
--- a/Makefile
+++ b/Makefile
@@ -762,6 +762,9 @@ KBUILD_CFLAGS += $(call cc-disable-warni
# disable invalid "can't wrap" optimizations for signed / pointers
KBUILD_CFLAGS += $(call cc-option,-fno-strict-overflow)
+# Make sure -fstack-check isn't enabled (like gentoo apparently did)
+KBUILD_CFLAGS += $(call cc-option,-fno-stack-check,)
+
# conserve stack if available
KBUILD_CFLAGS += $(call cc-option,-fconserve-stack)
Patches currently in stable-queue which might be from torvalds(a)linux-foundation.org are
queue-3.18/kbuild-add-fno-stack-check-to-kernel-build-options.patch
I just discovered that commit 7042c2b9a19e74972f5783ab3b7ef98ebdee293f
in 4.14.4 completely breaks the external displays attached to my Dell
TB16 dock.
How/where should I report this? (Does the stable tree use the regular
kernel Bugzilla?)
Thanks!
--
========================================================================
Ian Pilcher arequipeno(a)gmail.com
-------- "I grew up before Mark Zuckerberg invented friendship" --------
========================================================================
From: Eric Biggers <ebiggers(a)google.com>
If a message sent to a PF_KEY socket ended with one of the extensions
that takes a 'struct sadb_address' but there were not enough bytes
remaining in the message for the ->sa_family member of the 'struct
sockaddr' which is supposed to follow, then verify_address_len() read
past the end of the message, into uninitialized memory. Fix it by
returning -EINVAL in this case.
This bug was found using syzkaller with KMSAN.
Reproducer:
#include <linux/pfkeyv2.h>
#include <sys/socket.h>
#include <unistd.h>
int main()
{
        int sock = socket(PF_KEY, SOCK_RAW, PF_KEY_V2);
        char buf[24] = { 0 };
        struct sadb_msg *msg = (void *)buf;
        struct sadb_address *addr = (void *)(msg + 1);

        msg->sadb_msg_version = PF_KEY_V2;
        msg->sadb_msg_type = SADB_DELETE;
        msg->sadb_msg_len = 3;
        addr->sadb_address_len = 1;
        addr->sadb_address_exttype = SADB_EXT_ADDRESS_SRC;

        write(sock, buf, 24);
}
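(Editor's note, not part of the patch: the arithmetic behind the reproducer.) Assuming the usual structure sizes, sizeof(struct sadb_msg) == 16, so sadb_msg_len = 3 describes a 24-byte message; sizeof(struct sadb_address) == 8, so sadb_address_len = 1 covers only the extension header itself, with no sockaddr behind it, and reading addr->sa_family runs past the 24-byte buffer. The check added below therefore demands, in 64-bit units:

        /* minimum sadb_address_len accepted after the fix */
        DIV_ROUND_UP(sizeof(struct sadb_address) +            /*  8 bytes */
                     offsetofend(struct sockaddr, sa_family), /* +2 bytes */
                     sizeof(uint64_t))                        /* == 2     */

so the reproducer's value of 1 is rejected with -EINVAL.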
Reported-by: Alexander Potapenko <glider(a)google.com>
Cc: stable(a)vger.kernel.org
Signed-off-by: Eric Biggers <ebiggers(a)google.com>
---
net/key/af_key.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/net/key/af_key.c b/net/key/af_key.c
index 3dffb892d52c..596499cc8b2f 100644
--- a/net/key/af_key.c
+++ b/net/key/af_key.c
@@ -401,6 +401,11 @@ static int verify_address_len(const void *p)
#endif
int len;
+ if (sp->sadb_address_len <
+ DIV_ROUND_UP(sizeof(*sp) + offsetofend(typeof(*addr), sa_family),
+ sizeof(uint64_t)))
+ return -EINVAL;
+
switch (addr->sa_family) {
case AF_INET:
len = DIV_ROUND_UP(sizeof(*sp) + sizeof(*sin), sizeof(uint64_t));
--
2.15.1
I can confirm now that that kernel breaks both a desktop (a ThinkPad T440s, i5) and a headless server (i3930) setup. For the server, the attached .config works fine, but switching from CONFIG_GENERIC_CPU to CONFIG_MCORE2 makes it hang at boot w/o any messages. Similar picture at the desktop.
Both are stable Gentoo Linux hardened systems.
This issue seems to exist in mainline too, probably visible with d120cd749 (stable) and 9aaefe7b59 (upstream).
--
Toralf
PGP C4EACDDE 0076E94E
Hi Greg,
v4.4.y-stable-queue reference used to be v4.4.108-15-gbb44baf,
now it is v4.4.108. Did something get lost, or did you drop
the pending patches?
Guenter
This is a note to let you know that I've just added the patch titled
x86/mm: Reimplement flush_tlb_page() using flush_tlb_mm_range()
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-mm-reimplement-flush_tlb_page-using-flush_tlb_mm_range.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
From ca6c99c0794875c6d1db6e22f246699691ab7e6b Mon Sep 17 00:00:00 2001
From: Andy Lutomirski <luto(a)kernel.org>
Date: Mon, 22 May 2017 15:30:01 -0700
Subject: x86/mm: Reimplement flush_tlb_page() using flush_tlb_mm_range()
From: Andy Lutomirski <luto(a)kernel.org>
commit ca6c99c0794875c6d1db6e22f246699691ab7e6b upstream.
flush_tlb_page() was very similar to flush_tlb_mm_range() except that
it had a couple of issues:
- It was missing an smp_mb() in the case where
current->active_mm != mm. (This is a longstanding bug reported by Nadav Amit)
- It was missing tracepoints and vm counter updates.
The only reason that I can see for keeping it as a separate
function is that it could avoid a few branches that
flush_tlb_mm_range() needs to decide to flush just one page. This
hardly seems worthwhile. If we decide we want to get rid of those
branches again, a better way would be to introduce an
__flush_tlb_mm_range() helper and make both flush_tlb_page() and
flush_tlb_mm_range() use it.
Signed-off-by: Andy Lutomirski <luto(a)kernel.org>
Acked-by: Kees Cook <keescook(a)chromium.org>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Borislav Petkov <bpetkov(a)suse.de>
Cc: Dave Hansen <dave.hansen(a)intel.com>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Mel Gorman <mgorman(a)suse.de>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Nadav Amit <nadav.amit(a)gmail.com>
Cc: Nadav Amit <namit(a)vmware.com>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Rik van Riel <riel(a)redhat.com>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: linux-mm(a)kvack.org
Link: http://lkml.kernel.org/r/3cc3847cf888d8907577569b8bac3f01992ef8f9.149549206…
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Cc: Hugh Dickins <hughd(a)google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/include/asm/tlbflush.h | 6 +++++-
arch/x86/mm/tlb.c | 27 ---------------------------
2 files changed, 5 insertions(+), 28 deletions(-)
--- a/arch/x86/include/asm/tlbflush.h
+++ b/arch/x86/include/asm/tlbflush.h
@@ -297,11 +297,15 @@ static inline void flush_tlb_kernel_rang
flush_tlb_mm_range(vma->vm_mm, start, end, vma->vm_flags)
extern void flush_tlb_all(void);
-extern void flush_tlb_page(struct vm_area_struct *, unsigned long);
extern void flush_tlb_mm_range(struct mm_struct *mm, unsigned long start,
unsigned long end, unsigned long vmflag);
extern void flush_tlb_kernel_range(unsigned long start, unsigned long end);
+static inline void flush_tlb_page(struct vm_area_struct *vma, unsigned long a)
+{
+ flush_tlb_mm_range(vma->vm_mm, a, a + PAGE_SIZE, VM_NONE);
+}
+
void native_flush_tlb_others(const struct cpumask *cpumask,
struct mm_struct *mm,
unsigned long start, unsigned long end);
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -354,33 +354,6 @@ out:
preempt_enable();
}
-void flush_tlb_page(struct vm_area_struct *vma, unsigned long start)
-{
- struct mm_struct *mm = vma->vm_mm;
-
- preempt_disable();
-
- if (current->active_mm == mm) {
- if (current->mm) {
- /*
- * Implicit full barrier (INVLPG) that synchronizes
- * with switch_mm.
- */
- __flush_tlb_one(start);
- } else {
- leave_mm(smp_processor_id());
-
- /* Synchronize with switch_mm. */
- smp_mb();
- }
- }
-
- if (cpumask_any_but(mm_cpumask(mm), smp_processor_id()) < nr_cpu_ids)
- flush_tlb_others(mm_cpumask(mm), mm, start, start + PAGE_SIZE);
-
- preempt_enable();
-}
-
static void do_flush_tlb_all(void *info)
{
count_vm_tlb_event(NR_TLB_REMOTE_FLUSH_RECEIVED);
Patches currently in stable-queue which might be from luto(a)kernel.org are
queue-4.9/x86-vm86-32-switch-to-flush_tlb_mm_range-in-mark_screen_rdonly.patch
queue-4.9/x86-mm-remove-the-up-asm-tlbflush.h-code-always-use-the-formerly-smp-code.patch
queue-4.9/x86-mm-reimplement-flush_tlb_page-using-flush_tlb_mm_range.patch
queue-4.9/x86-mm-make-flush_tlb_mm_range-more-predictable.patch
queue-4.9/x86-mm-remove-flush_tlb-and-flush_tlb_current_task.patch
queue-4.9/x86-mm-disable-pcid-on-32-bit-kernels.patch
This is a note to let you know that I've just added the patch titled
x86/mm: Disable PCID on 32-bit kernels
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-mm-disable-pcid-on-32-bit-kernels.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
From cba4671af7550e008f7a7835f06df0763825bf3e Mon Sep 17 00:00:00 2001
From: Andy Lutomirski <luto(a)kernel.org>
Date: Thu, 29 Jun 2017 08:53:19 -0700
Subject: x86/mm: Disable PCID on 32-bit kernels
From: Andy Lutomirski <luto(a)kernel.org>
commit cba4671af7550e008f7a7835f06df0763825bf3e upstream.
32-bit kernels on new hardware will see PCID in CPUID, but PCID can
only be used in 64-bit mode. Rather than making all PCID code
conditional, just disable the feature on 32-bit builds.
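(A worked note added for context, not part of the changelog.) The disabled-features machinery below is pure preprocessor arithmetic; with the values from arch/x86/include/asm/cpufeatures.h in this era:

        /* X86_FEATURE_PCID == 4*32 + 17  ->  cpufeature word 4 (CPUID.1:ECX), bit 17 */
        /* DISABLE_PCID     == 1 << ((4*32 + 17) & 31)  ==  1 << 17                   */
        /* DISABLED_MASK4   == DISABLE_PCID on 32-bit builds, 0 on 64-bit             */

so on CONFIG_X86_32 the feature is treated as absent at compile time, and the extra setup_clear_cpu_cap() in check_bugs() clears the bit that the CPU may still enumerate.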
Signed-off-by: Andy Lutomirski <luto(a)kernel.org>
Reviewed-by: Nadav Amit <nadav.amit(a)gmail.com>
Reviewed-by: Borislav Petkov <bp(a)suse.de>
Reviewed-by: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Arjan van de Ven <arjan(a)linux.intel.com>
Cc: Borislav Petkov <bp(a)alien8.de>
Cc: Dave Hansen <dave.hansen(a)intel.com>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Mel Gorman <mgorman(a)suse.de>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Rik van Riel <riel(a)redhat.com>
Cc: linux-mm(a)kvack.org
Link: http://lkml.kernel.org/r/2e391769192a4d31b808410c383c6bf0734bc6ea.149875120…
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Cc: Hugh Dickins <hughd(a)google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/include/asm/disabled-features.h | 4 +++-
arch/x86/kernel/cpu/bugs.c | 8 ++++++++
2 files changed, 11 insertions(+), 1 deletion(-)
--- a/arch/x86/include/asm/disabled-features.h
+++ b/arch/x86/include/asm/disabled-features.h
@@ -21,11 +21,13 @@
# define DISABLE_K6_MTRR (1<<(X86_FEATURE_K6_MTRR & 31))
# define DISABLE_CYRIX_ARR (1<<(X86_FEATURE_CYRIX_ARR & 31))
# define DISABLE_CENTAUR_MCR (1<<(X86_FEATURE_CENTAUR_MCR & 31))
+# define DISABLE_PCID 0
#else
# define DISABLE_VME 0
# define DISABLE_K6_MTRR 0
# define DISABLE_CYRIX_ARR 0
# define DISABLE_CENTAUR_MCR 0
+# define DISABLE_PCID (1<<(X86_FEATURE_PCID & 31))
#endif /* CONFIG_X86_64 */
#ifdef CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS
@@ -43,7 +45,7 @@
#define DISABLED_MASK1 0
#define DISABLED_MASK2 0
#define DISABLED_MASK3 (DISABLE_CYRIX_ARR|DISABLE_CENTAUR_MCR|DISABLE_K6_MTRR)
-#define DISABLED_MASK4 0
+#define DISABLED_MASK4 (DISABLE_PCID)
#define DISABLED_MASK5 0
#define DISABLED_MASK6 0
#define DISABLED_MASK7 0
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -19,6 +19,14 @@
void __init check_bugs(void)
{
+#ifdef CONFIG_X86_32
+ /*
+ * Regardless of whether PCID is enumerated, the SDM says
+ * that it can't be enabled in 32-bit mode.
+ */
+ setup_clear_cpu_cap(X86_FEATURE_PCID);
+#endif
+
identify_boot_cpu();
#ifndef CONFIG_SMP
pr_info("CPU: ");
Patches currently in stable-queue which might be from luto(a)kernel.org are
queue-4.9/x86-vm86-32-switch-to-flush_tlb_mm_range-in-mark_screen_rdonly.patch
queue-4.9/x86-mm-remove-the-up-asm-tlbflush.h-code-always-use-the-formerly-smp-code.patch
queue-4.9/x86-mm-reimplement-flush_tlb_page-using-flush_tlb_mm_range.patch
queue-4.9/x86-mm-make-flush_tlb_mm_range-more-predictable.patch
queue-4.9/x86-mm-remove-flush_tlb-and-flush_tlb_current_task.patch
queue-4.9/x86-mm-disable-pcid-on-32-bit-kernels.patch
This is a note to let you know that I've just added the patch titled
x86/mm: Reimplement flush_tlb_page() using flush_tlb_mm_range()
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-mm-reimplement-flush_tlb_page-using-flush_tlb_mm_range.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
From ca6c99c0794875c6d1db6e22f246699691ab7e6b Mon Sep 17 00:00:00 2001
From: Andy Lutomirski <luto(a)kernel.org>
Date: Mon, 22 May 2017 15:30:01 -0700
Subject: x86/mm: Reimplement flush_tlb_page() using flush_tlb_mm_range()
From: Andy Lutomirski <luto(a)kernel.org>
commit ca6c99c0794875c6d1db6e22f246699691ab7e6b upstream.
flush_tlb_page() was very similar to flush_tlb_mm_range() except that
it had a couple of issues:
- It was missing an smp_mb() in the case where
current->active_mm != mm. (This is a longstanding bug reported by Nadav Amit)
- It was missing tracepoints and vm counter updates.
The only reason that I can see for keeping it as a separate
function is that it could avoid a few branches that
flush_tlb_mm_range() needs to decide to flush just one page. This
hardly seems worthwhile. If we decide we want to get rid of those
branches again, a better way would be to introduce an
__flush_tlb_mm_range() helper and make both flush_tlb_page() and
flush_tlb_mm_range() use it.
Signed-off-by: Andy Lutomirski <luto(a)kernel.org>
Acked-by: Kees Cook <keescook(a)chromium.org>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Borislav Petkov <bpetkov(a)suse.de>
Cc: Dave Hansen <dave.hansen(a)intel.com>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Mel Gorman <mgorman(a)suse.de>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Nadav Amit <nadav.amit(a)gmail.com>
Cc: Nadav Amit <namit(a)vmware.com>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Rik van Riel <riel(a)redhat.com>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: linux-mm(a)kvack.org
Link: http://lkml.kernel.org/r/3cc3847cf888d8907577569b8bac3f01992ef8f9.149549206…
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Cc: Hugh Dickins <hughd(a)google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/include/asm/tlbflush.h | 6 +++++-
arch/x86/mm/tlb.c | 27 ---------------------------
2 files changed, 5 insertions(+), 28 deletions(-)
--- a/arch/x86/include/asm/tlbflush.h
+++ b/arch/x86/include/asm/tlbflush.h
@@ -296,11 +296,15 @@ static inline void flush_tlb_kernel_rang
flush_tlb_mm_range(vma->vm_mm, start, end, vma->vm_flags)
extern void flush_tlb_all(void);
-extern void flush_tlb_page(struct vm_area_struct *, unsigned long);
extern void flush_tlb_mm_range(struct mm_struct *mm, unsigned long start,
unsigned long end, unsigned long vmflag);
extern void flush_tlb_kernel_range(unsigned long start, unsigned long end);
+static inline void flush_tlb_page(struct vm_area_struct *vma, unsigned long a)
+{
+ flush_tlb_mm_range(vma->vm_mm, a, a + PAGE_SIZE, VM_NONE);
+}
+
void native_flush_tlb_others(const struct cpumask *cpumask,
struct mm_struct *mm,
unsigned long start, unsigned long end);
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -339,33 +339,6 @@ out:
preempt_enable();
}
-void flush_tlb_page(struct vm_area_struct *vma, unsigned long start)
-{
- struct mm_struct *mm = vma->vm_mm;
-
- preempt_disable();
-
- if (current->active_mm == mm) {
- if (current->mm) {
- /*
- * Implicit full barrier (INVLPG) that synchronizes
- * with switch_mm.
- */
- __flush_tlb_one(start);
- } else {
- leave_mm(smp_processor_id());
-
- /* Synchronize with switch_mm. */
- smp_mb();
- }
- }
-
- if (cpumask_any_but(mm_cpumask(mm), smp_processor_id()) < nr_cpu_ids)
- flush_tlb_others(mm_cpumask(mm), mm, start, start + PAGE_SIZE);
-
- preempt_enable();
-}
-
static void do_flush_tlb_all(void *info)
{
count_vm_tlb_event(NR_TLB_REMOTE_FLUSH_RECEIVED);
Patches currently in stable-queue which might be from luto(a)kernel.org are
queue-4.4/x86-vm86-32-switch-to-flush_tlb_mm_range-in-mark_screen_rdonly.patch
queue-4.4/x86-mm-remove-the-up-asm-tlbflush.h-code-always-use-the-formerly-smp-code.patch
queue-4.4/x86-mm-reimplement-flush_tlb_page-using-flush_tlb_mm_range.patch
queue-4.4/x86-mm-make-flush_tlb_mm_range-more-predictable.patch
queue-4.4/x86-mm-remove-flush_tlb-and-flush_tlb_current_task.patch
queue-4.4/x86-mm-disable-pcid-on-32-bit-kernels.patch
This is a note to let you know that I've just added the patch titled
x86/mm: Disable PCID on 32-bit kernels
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-mm-disable-pcid-on-32-bit-kernels.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
From cba4671af7550e008f7a7835f06df0763825bf3e Mon Sep 17 00:00:00 2001
From: Andy Lutomirski <luto(a)kernel.org>
Date: Thu, 29 Jun 2017 08:53:19 -0700
Subject: x86/mm: Disable PCID on 32-bit kernels
From: Andy Lutomirski <luto(a)kernel.org>
commit cba4671af7550e008f7a7835f06df0763825bf3e upstream.
32-bit kernels on new hardware will see PCID in CPUID, but PCID can
only be used in 64-bit mode. Rather than making all PCID code
conditional, just disable the feature on 32-bit builds.
Signed-off-by: Andy Lutomirski <luto(a)kernel.org>
Reviewed-by: Nadav Amit <nadav.amit(a)gmail.com>
Reviewed-by: Borislav Petkov <bp(a)suse.de>
Reviewed-by: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Arjan van de Ven <arjan(a)linux.intel.com>
Cc: Borislav Petkov <bp(a)alien8.de>
Cc: Dave Hansen <dave.hansen(a)intel.com>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Mel Gorman <mgorman(a)suse.de>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Rik van Riel <riel(a)redhat.com>
Cc: linux-mm(a)kvack.org
Link: http://lkml.kernel.org/r/2e391769192a4d31b808410c383c6bf0734bc6ea.149875120…
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Cc: Hugh Dickins <hughd(a)google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/include/asm/disabled-features.h | 4 +++-
arch/x86/kernel/cpu/bugs.c | 8 ++++++++
2 files changed, 11 insertions(+), 1 deletion(-)
--- a/arch/x86/include/asm/disabled-features.h
+++ b/arch/x86/include/asm/disabled-features.h
@@ -21,11 +21,13 @@
# define DISABLE_K6_MTRR (1<<(X86_FEATURE_K6_MTRR & 31))
# define DISABLE_CYRIX_ARR (1<<(X86_FEATURE_CYRIX_ARR & 31))
# define DISABLE_CENTAUR_MCR (1<<(X86_FEATURE_CENTAUR_MCR & 31))
+# define DISABLE_PCID 0
#else
# define DISABLE_VME 0
# define DISABLE_K6_MTRR 0
# define DISABLE_CYRIX_ARR 0
# define DISABLE_CENTAUR_MCR 0
+# define DISABLE_PCID (1<<(X86_FEATURE_PCID & 31))
#endif /* CONFIG_X86_64 */
/*
@@ -35,7 +37,7 @@
#define DISABLED_MASK1 0
#define DISABLED_MASK2 0
#define DISABLED_MASK3 (DISABLE_CYRIX_ARR|DISABLE_CENTAUR_MCR|DISABLE_K6_MTRR)
-#define DISABLED_MASK4 0
+#define DISABLED_MASK4 (DISABLE_PCID)
#define DISABLED_MASK5 0
#define DISABLED_MASK6 0
#define DISABLED_MASK7 0
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -19,6 +19,14 @@
void __init check_bugs(void)
{
+#ifdef CONFIG_X86_32
+ /*
+ * Regardless of whether PCID is enumerated, the SDM says
+ * that it can't be enabled in 32-bit mode.
+ */
+ setup_clear_cpu_cap(X86_FEATURE_PCID);
+#endif
+
identify_boot_cpu();
#ifndef CONFIG_SMP
pr_info("CPU: ");
Patches currently in stable-queue which might be from luto(a)kernel.org are
queue-4.4/x86-vm86-32-switch-to-flush_tlb_mm_range-in-mark_screen_rdonly.patch
queue-4.4/x86-mm-remove-the-up-asm-tlbflush.h-code-always-use-the-formerly-smp-code.patch
queue-4.4/x86-mm-reimplement-flush_tlb_page-using-flush_tlb_mm_range.patch
queue-4.4/x86-mm-make-flush_tlb_mm_range-more-predictable.patch
queue-4.4/x86-mm-remove-flush_tlb-and-flush_tlb_current_task.patch
queue-4.4/x86-mm-disable-pcid-on-32-bit-kernels.patch
This is a note to let you know that I've just added the patch titled
block: don't let passthrough IO go into .make_request_fn()
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
block-don-t-let-passthrough-io-go-into-.make_request_fn.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
From 14cb0dc6479dc5ebc63b3a459a5d89a2f1b39fed Mon Sep 17 00:00:00 2001
From: Ming Lei <ming.lei(a)redhat.com>
Date: Mon, 18 Dec 2017 15:40:43 +0800
Subject: block: don't let passthrough IO go into .make_request_fn()
From: Ming Lei <ming.lei(a)redhat.com>
commit 14cb0dc6479dc5ebc63b3a459a5d89a2f1b39fed upstream.
Commit a8821f3f3 ("block: Improvements to bounce-buffer handling") tries
to make sure that the bio passed to .make_request_fn won't exceed
BIO_MAX_PAGES, but ignores that passthrough I/O can use blk_queue_bounce()
too. In particular, passthrough I/O may not be sector-aligned, and the
check 'sectors < bio_sectors(*bio_orig)' inside __blk_queue_bounce() may
become true even though the max bvec number doesn't exceed BIO_MAX_PAGES;
the bio then gets split, and the original passthrough bio is submitted
to generic_make_request().
This patch fixes the issue by checking whether the bio is passthrough I/O
and using bio_kmalloc() to allocate the cloned passthrough bio.
Cc: NeilBrown <neilb(a)suse.com>
Fixes: a8821f3f3("block: Improvements to bounce-buffer handling")
Tested-by: Michele Ballabio <barra_cuda(a)katamail.com>
Signed-off-by: Ming Lei <ming.lei(a)redhat.com>
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
block/bounce.c | 6 ++++--
include/linux/blkdev.h | 21 +++++++++++++++++++--
2 files changed, 23 insertions(+), 4 deletions(-)
--- a/block/bounce.c
+++ b/block/bounce.c
@@ -200,6 +200,7 @@ static void __blk_queue_bounce(struct re
unsigned i = 0;
bool bounce = false;
int sectors = 0;
+ bool passthrough = bio_is_passthrough(*bio_orig);
bio_for_each_segment(from, *bio_orig, iter) {
if (i++ < BIO_MAX_PAGES)
@@ -210,13 +211,14 @@ static void __blk_queue_bounce(struct re
if (!bounce)
return;
- if (sectors < bio_sectors(*bio_orig)) {
+ if (!passthrough && sectors < bio_sectors(*bio_orig)) {
bio = bio_split(*bio_orig, sectors, GFP_NOIO, bounce_bio_split);
bio_chain(bio, *bio_orig);
generic_make_request(*bio_orig);
*bio_orig = bio;
}
- bio = bio_clone_bioset(*bio_orig, GFP_NOIO, bounce_bio_set);
+ bio = bio_clone_bioset(*bio_orig, GFP_NOIO, passthrough ? NULL :
+ bounce_bio_set);
bio_for_each_segment_all(to, bio, i) {
struct page *page = to->bv_page;
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -241,14 +241,24 @@ struct request {
struct request *next_rq;
};
+static inline bool blk_op_is_scsi(unsigned int op)
+{
+ return op == REQ_OP_SCSI_IN || op == REQ_OP_SCSI_OUT;
+}
+
+static inline bool blk_op_is_private(unsigned int op)
+{
+ return op == REQ_OP_DRV_IN || op == REQ_OP_DRV_OUT;
+}
+
static inline bool blk_rq_is_scsi(struct request *rq)
{
- return req_op(rq) == REQ_OP_SCSI_IN || req_op(rq) == REQ_OP_SCSI_OUT;
+ return blk_op_is_scsi(req_op(rq));
}
static inline bool blk_rq_is_private(struct request *rq)
{
- return req_op(rq) == REQ_OP_DRV_IN || req_op(rq) == REQ_OP_DRV_OUT;
+ return blk_op_is_private(req_op(rq));
}
static inline bool blk_rq_is_passthrough(struct request *rq)
@@ -256,6 +266,13 @@ static inline bool blk_rq_is_passthrough
return blk_rq_is_scsi(rq) || blk_rq_is_private(rq);
}
+static inline bool bio_is_passthrough(struct bio *bio)
+{
+ unsigned op = bio_op(bio);
+
+ return blk_op_is_scsi(op) || blk_op_is_private(op);
+}
+
static inline unsigned short req_get_ioprio(struct request *req)
{
return req->ioprio;
Patches currently in stable-queue which might be from ming.lei(a)redhat.com are
queue-4.14/block-don-t-let-passthrough-io-go-into-.make_request_fn.patch
queue-4.14/block-fix-blk_rq_append_bio.patch
This is a note to let you know that I've just added the patch titled
block: fix blk_rq_append_bio
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
block-fix-blk_rq_append_bio.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
From 0abc2a10389f0c9070f76ca906c7382788036b93 Mon Sep 17 00:00:00 2001
From: Jens Axboe <axboe(a)kernel.dk>
Date: Mon, 18 Dec 2017 15:40:44 +0800
Subject: block: fix blk_rq_append_bio
From: Jens Axboe <axboe(a)kernel.dk>
commit 0abc2a10389f0c9070f76ca906c7382788036b93 upstream.
Commit caa4b02476e3 ("blk-map: call blk_queue_bounce from blk_rq_append_bio")
moves blk_queue_bounce() into blk_rq_append_bio(), but doesn't consider
the fact that the bounced bio becomes invisible to the caller, since the
parameter type is 'struct bio *'. Make it a pointer to a pointer to
a bio, so the caller also sees the right bio after a bounce.
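(An illustrative caller-side sketch, not part of the changelog.) The point of the pointer-to-pointer is that blk_queue_bounce() may swap in a bounce clone inside blk_rq_append_bio(); with the old prototype the caller keeps holding the original, un-bounced bio:

        struct bio *bio = build_passthrough_bio();   /* hypothetical helper */

        /* old: ret = blk_rq_append_bio(rq, bio);
         *      a bounce clone created inside stays invisible, so later
         *      bio_get()/bio_endio() act on the wrong bio               */

        ret = blk_rq_append_bio(rq, &bio);           /* new prototype */
        /* 'bio' now names whatever was actually appended to the request */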
Fixes: caa4b02476e3 ("blk-map: call blk_queue_bounce from blk_rq_append_bio")
Cc: Christoph Hellwig <hch(a)lst.de>
Reported-by: Michele Ballabio <barra_cuda(a)katamail.com>
(handling failure of blk_rq_append_bio(), only call bio_get() after
blk_rq_append_bio() returns OK)
Tested-by: Michele Ballabio <barra_cuda(a)katamail.com>
Signed-off-by: Ming Lei <ming.lei(a)redhat.com>
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
block/blk-map.c | 38 +++++++++++++++++++++----------------
drivers/scsi/osd/osd_initiator.c | 4 ++-
drivers/target/target_core_pscsi.c | 4 +--
include/linux/blkdev.h | 2 -
4 files changed, 28 insertions(+), 20 deletions(-)
--- a/block/blk-map.c
+++ b/block/blk-map.c
@@ -12,22 +12,29 @@
#include "blk.h"
/*
- * Append a bio to a passthrough request. Only works can be merged into
- * the request based on the driver constraints.
+ * Append a bio to a passthrough request. Only works if the bio can be merged
+ * into the request based on the driver constraints.
*/
-int blk_rq_append_bio(struct request *rq, struct bio *bio)
+int blk_rq_append_bio(struct request *rq, struct bio **bio)
{
- blk_queue_bounce(rq->q, &bio);
+ struct bio *orig_bio = *bio;
+
+ blk_queue_bounce(rq->q, bio);
if (!rq->bio) {
- blk_rq_bio_prep(rq->q, rq, bio);
+ blk_rq_bio_prep(rq->q, rq, *bio);
} else {
- if (!ll_back_merge_fn(rq->q, rq, bio))
+ if (!ll_back_merge_fn(rq->q, rq, *bio)) {
+ if (orig_bio != *bio) {
+ bio_put(*bio);
+ *bio = orig_bio;
+ }
return -EINVAL;
+ }
- rq->biotail->bi_next = bio;
- rq->biotail = bio;
- rq->__data_len += bio->bi_iter.bi_size;
+ rq->biotail->bi_next = *bio;
+ rq->biotail = *bio;
+ rq->__data_len += (*bio)->bi_iter.bi_size;
}
return 0;
@@ -80,14 +87,12 @@ static int __blk_rq_map_user_iov(struct
* We link the bounce buffer in and could have to traverse it
* later so we have to get a ref to prevent it from being freed
*/
- ret = blk_rq_append_bio(rq, bio);
- bio_get(bio);
+ ret = blk_rq_append_bio(rq, &bio);
if (ret) {
- bio_endio(bio);
__blk_rq_unmap_user(orig_bio);
- bio_put(bio);
return ret;
}
+ bio_get(bio);
return 0;
}
@@ -220,7 +225,7 @@ int blk_rq_map_kern(struct request_queue
int reading = rq_data_dir(rq) == READ;
unsigned long addr = (unsigned long) kbuf;
int do_copy = 0;
- struct bio *bio;
+ struct bio *bio, *orig_bio;
int ret;
if (len > (queue_max_hw_sectors(q) << 9))
@@ -243,10 +248,11 @@ int blk_rq_map_kern(struct request_queue
if (do_copy)
rq->rq_flags |= RQF_COPY_USER;
- ret = blk_rq_append_bio(rq, bio);
+ orig_bio = bio;
+ ret = blk_rq_append_bio(rq, &bio);
if (unlikely(ret)) {
/* request is too big */
- bio_put(bio);
+ bio_put(orig_bio);
return ret;
}
--- a/drivers/scsi/osd/osd_initiator.c
+++ b/drivers/scsi/osd/osd_initiator.c
@@ -1576,7 +1576,9 @@ static struct request *_make_request(str
return req;
for_each_bio(bio) {
- ret = blk_rq_append_bio(req, bio);
+ struct bio *bounce_bio = bio;
+
+ ret = blk_rq_append_bio(req, &bounce_bio);
if (ret)
return ERR_PTR(ret);
}
--- a/drivers/target/target_core_pscsi.c
+++ b/drivers/target/target_core_pscsi.c
@@ -920,7 +920,7 @@ pscsi_map_sg(struct se_cmd *cmd, struct
" %d i: %d bio: %p, allocating another"
" bio\n", bio->bi_vcnt, i, bio);
- rc = blk_rq_append_bio(req, bio);
+ rc = blk_rq_append_bio(req, &bio);
if (rc) {
pr_err("pSCSI: failed to append bio\n");
goto fail;
@@ -938,7 +938,7 @@ pscsi_map_sg(struct se_cmd *cmd, struct
}
if (bio) {
- rc = blk_rq_append_bio(req, bio);
+ rc = blk_rq_append_bio(req, &bio);
if (rc) {
pr_err("pSCSI: failed to append bio\n");
goto fail;
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -952,7 +952,7 @@ extern int blk_rq_prep_clone(struct requ
extern void blk_rq_unprep_clone(struct request *rq);
extern blk_status_t blk_insert_cloned_request(struct request_queue *q,
struct request *rq);
-extern int blk_rq_append_bio(struct request *rq, struct bio *bio);
+extern int blk_rq_append_bio(struct request *rq, struct bio **bio);
extern void blk_delay_queue(struct request_queue *, unsigned long);
extern void blk_queue_split(struct request_queue *, struct bio **);
extern void blk_recount_segments(struct request_queue *, struct bio *);
Patches currently in stable-queue which might be from axboe(a)kernel.dk are
queue-4.14/block-don-t-let-passthrough-io-go-into-.make_request_fn.patch
queue-4.14/block-fix-blk_rq_append_bio.patch
Hi,
there are two commits in master that should be cherry-picked for
4.14.x:
commit 0abc2a10389f0c9070f76ca906c7382788036b93
block: fix blk_rq_append_bio
commit 14cb0dc6479dc5ebc63b3a459a5d89a2f1b39fed
block: don't let passthrough IO go into .make_request_fn()
These avoid oopses, especially on x86-32.
Thanks,
Michele Ballabio
This is a note to let you know that I've just added the patch titled
x86/mm: Remove flush_tlb() and flush_tlb_current_task()
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-mm-remove-flush_tlb-and-flush_tlb_current_task.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 29961b59a51f8c6838a26a45e871a7ed6771809b Mon Sep 17 00:00:00 2001
From: Andy Lutomirski <luto(a)kernel.org>
Date: Sat, 22 Apr 2017 00:01:20 -0700
Subject: x86/mm: Remove flush_tlb() and flush_tlb_current_task()
From: Andy Lutomirski <luto(a)kernel.org>
commit 29961b59a51f8c6838a26a45e871a7ed6771809b upstream.
I was trying to figure out how flush_tlb_current_task() would
possibly work correctly if current->mm != current->active_mm, but I
realized I could spare myself the effort: it has no callers except
the unused flush_tlb() macro.
Signed-off-by: Andy Lutomirski <luto(a)kernel.org>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Borislav Petkov <bp(a)alien8.de>
Cc: Brian Gerst <brgerst(a)gmail.com>
Cc: Dave Hansen <dave.hansen(a)intel.com>
Cc: Denys Vlasenko <dvlasenk(a)redhat.com>
Cc: H. Peter Anvin <hpa(a)zytor.com>
Cc: Josh Poimboeuf <jpoimboe(a)redhat.com>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Nadav Amit <namit(a)vmware.com>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Rik van Riel <riel(a)redhat.com>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Link: http://lkml.kernel.org/r/e52d64c11690f85e9f1d69d7b48cc2269cd2e94b.149284437…
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Cc: Hugh Dickins <hughd(a)google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/include/asm/tlbflush.h | 9 ---------
arch/x86/mm/tlb.c | 17 -----------------
2 files changed, 26 deletions(-)
--- a/arch/x86/include/asm/tlbflush.h
+++ b/arch/x86/include/asm/tlbflush.h
@@ -205,7 +205,6 @@ static inline void __flush_tlb_one(unsig
/*
* TLB flushing:
*
- * - flush_tlb() flushes the current mm struct TLBs
* - flush_tlb_all() flushes all processes TLBs
* - flush_tlb_mm(mm) flushes the specified mm context TLB's
* - flush_tlb_page(vma, vmaddr) flushes one page
@@ -237,11 +236,6 @@ static inline void flush_tlb_all(void)
__flush_tlb_all();
}
-static inline void flush_tlb(void)
-{
- __flush_tlb_up();
-}
-
static inline void local_flush_tlb(void)
{
__flush_tlb_up();
@@ -303,14 +297,11 @@ static inline void flush_tlb_kernel_rang
flush_tlb_mm_range(vma->vm_mm, start, end, vma->vm_flags)
extern void flush_tlb_all(void);
-extern void flush_tlb_current_task(void);
extern void flush_tlb_page(struct vm_area_struct *, unsigned long);
extern void flush_tlb_mm_range(struct mm_struct *mm, unsigned long start,
unsigned long end, unsigned long vmflag);
extern void flush_tlb_kernel_range(unsigned long start, unsigned long end);
-#define flush_tlb() flush_tlb_current_task()
-
void native_flush_tlb_others(const struct cpumask *cpumask,
struct mm_struct *mm,
unsigned long start, unsigned long end);
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -287,23 +287,6 @@ void native_flush_tlb_others(const struc
smp_call_function_many(cpumask, flush_tlb_func, &info, 1);
}
-void flush_tlb_current_task(void)
-{
- struct mm_struct *mm = current->mm;
-
- preempt_disable();
-
- count_vm_tlb_event(NR_TLB_LOCAL_FLUSH_ALL);
-
- /* This is an implicit full barrier that synchronizes with switch_mm. */
- local_flush_tlb();
-
- trace_tlb_flush(TLB_LOCAL_SHOOTDOWN, TLB_FLUSH_ALL);
- if (cpumask_any_but(mm_cpumask(mm), smp_processor_id()) < nr_cpu_ids)
- flush_tlb_others(mm_cpumask(mm), mm, 0UL, TLB_FLUSH_ALL);
- preempt_enable();
-}
-
/*
* See Documentation/x86/tlb.txt for details. We choose 33
* because it is large enough to cover the vast majority (at
Patches currently in stable-queue which might be from luto(a)kernel.org are
queue-4.9/x86-vm86-32-switch-to-flush_tlb_mm_range-in-mark_screen_rdonly.patch
queue-4.9/x86-mm-make-flush_tlb_mm_range-more-predictable.patch
queue-4.9/x86-mm-remove-flush_tlb-and-flush_tlb_current_task.patch
This is a note to let you know that I've just added the patch titled
x86/vm86/32: Switch to flush_tlb_mm_range() in mark_screen_rdonly()
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-vm86-32-switch-to-flush_tlb_mm_range-in-mark_screen_rdonly.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 9ccee2373f0658f234727700e619df097ba57023 Mon Sep 17 00:00:00 2001
From: Andy Lutomirski <luto(a)kernel.org>
Date: Sat, 22 Apr 2017 00:01:19 -0700
Subject: x86/vm86/32: Switch to flush_tlb_mm_range() in mark_screen_rdonly()
From: Andy Lutomirski <luto(a)kernel.org>
commit 9ccee2373f0658f234727700e619df097ba57023 upstream.
mark_screen_rdonly() is the last remaining caller of flush_tlb().
flush_tlb_mm_range() is potentially faster and isn't obsolete.
Compile-tested only because I don't know whether software that uses
this mechanism even exists.
Signed-off-by: Andy Lutomirski <luto(a)kernel.org>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Borislav Petkov <bp(a)alien8.de>
Cc: Brian Gerst <brgerst(a)gmail.com>
Cc: Dave Hansen <dave.hansen(a)intel.com>
Cc: Denys Vlasenko <dvlasenk(a)redhat.com>
Cc: H. Peter Anvin <hpa(a)zytor.com>
Cc: Josh Poimboeuf <jpoimboe(a)redhat.com>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Nadav Amit <namit(a)vmware.com>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Rik van Riel <riel(a)redhat.com>
Cc: Sasha Levin <sasha.levin(a)oracle.com>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Link: http://lkml.kernel.org/r/791a644076fc3577ba7f7b7cafd643cc089baa7d.149284437…
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Cc: Hugh Dickins <hughd(a)google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/kernel/vm86_32.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/x86/kernel/vm86_32.c
+++ b/arch/x86/kernel/vm86_32.c
@@ -191,7 +191,7 @@ static void mark_screen_rdonly(struct mm
pte_unmap_unlock(pte, ptl);
out:
up_write(&mm->mmap_sem);
- flush_tlb();
+ flush_tlb_mm_range(mm, 0xA0000, 0xA0000 + 32*PAGE_SIZE, 0UL);
}
Patches currently in stable-queue which might be from luto(a)kernel.org are
queue-4.9/x86-vm86-32-switch-to-flush_tlb_mm_range-in-mark_screen_rdonly.patch
queue-4.9/x86-mm-make-flush_tlb_mm_range-more-predictable.patch
queue-4.9/x86-mm-remove-flush_tlb-and-flush_tlb_current_task.patch
This is a note to let you know that I've just added the patch titled
x86/mm: Make flush_tlb_mm_range() more predictable
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-mm-make-flush_tlb_mm_range-more-predictable.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From ce27374fabf553153c3f53efcaa9bfab9216bd8c Mon Sep 17 00:00:00 2001
From: Andy Lutomirski <luto(a)kernel.org>
Date: Sat, 22 Apr 2017 00:01:21 -0700
Subject: x86/mm: Make flush_tlb_mm_range() more predictable
From: Andy Lutomirski <luto(a)kernel.org>
commit ce27374fabf553153c3f53efcaa9bfab9216bd8c upstream.
I'm about to rewrite the function almost completely, but first I
want to get a functional change out of the way. Currently, if
flush_tlb_mm_range() does not flush the local TLB at all, it will
never do individual page flushes on remote CPUs. This seems to be
an accident, and preserving it will be awkward. Let's change it
first so that any regressions in the rewrite will be easier to
bisect and so that the rewrite can attempt to change no visible
behavior at all.
The fix is simple: we can simply avoid short-circuiting the
calculation of base_pages_to_flush.
As a side effect, this also eliminates a potential corner case: if
tlb_single_page_flush_ceiling == TLB_FLUSH_ALL, flush_tlb_mm_range()
could have ended up flushing the entire address space one page at a
time.
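Condensed, the flow in flush_tlb_mm_range() after this change looks like the sketch below (early-out paths elided; not the literal function body):

	unsigned long base_pages_to_flush = TLB_FLUSH_ALL;

	if ((end != TLB_FLUSH_ALL) && !(vmflag & VM_HUGETLB))
		base_pages_to_flush = (end - start) >> PAGE_SHIFT;
	if (base_pages_to_flush > tlb_single_page_flush_ceiling)
		base_pages_to_flush = TLB_FLUSH_ALL;

	/* ... remote-CPU and lazy-TLB early outs ... */

	if (base_pages_to_flush == TLB_FLUSH_ALL) {
		/* one full local flush */
		local_flush_tlb();
	} else {
		/* flush the range one page at a time */
	}

Because the final branch now tests for TLB_FLUSH_ALL directly rather than re-comparing against the ceiling, the tlb_single_page_flush_ceiling == TLB_FLUSH_ALL corner case can no longer fall into the page-by-page loop.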
Signed-off-by: Andy Lutomirski <luto(a)kernel.org>
Acked-by: Dave Hansen <dave.hansen(a)intel.com>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Borislav Petkov <bp(a)alien8.de>
Cc: Brian Gerst <brgerst(a)gmail.com>
Cc: Denys Vlasenko <dvlasenk(a)redhat.com>
Cc: H. Peter Anvin <hpa(a)zytor.com>
Cc: Josh Poimboeuf <jpoimboe(a)redhat.com>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Nadav Amit <namit(a)vmware.com>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Rik van Riel <riel(a)redhat.com>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Link: http://lkml.kernel.org/r/4b29b771d9975aad7154c314534fec235618175a.149284437…
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Cc: Hugh Dickins <hughd(a)google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/mm/tlb.c | 12 +++++++-----
1 file changed, 7 insertions(+), 5 deletions(-)
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -307,6 +307,12 @@ void flush_tlb_mm_range(struct mm_struct
unsigned long base_pages_to_flush = TLB_FLUSH_ALL;
preempt_disable();
+
+ if ((end != TLB_FLUSH_ALL) && !(vmflag & VM_HUGETLB))
+ base_pages_to_flush = (end - start) >> PAGE_SHIFT;
+ if (base_pages_to_flush > tlb_single_page_flush_ceiling)
+ base_pages_to_flush = TLB_FLUSH_ALL;
+
if (current->active_mm != mm) {
/* Synchronize with switch_mm. */
smp_mb();
@@ -323,15 +329,11 @@ void flush_tlb_mm_range(struct mm_struct
goto out;
}
- if ((end != TLB_FLUSH_ALL) && !(vmflag & VM_HUGETLB))
- base_pages_to_flush = (end - start) >> PAGE_SHIFT;
-
/*
* Both branches below are implicit full barriers (MOV to CR or
* INVLPG) that synchronize with switch_mm.
*/
- if (base_pages_to_flush > tlb_single_page_flush_ceiling) {
- base_pages_to_flush = TLB_FLUSH_ALL;
+ if (base_pages_to_flush == TLB_FLUSH_ALL) {
count_vm_tlb_event(NR_TLB_LOCAL_FLUSH_ALL);
local_flush_tlb();
} else {
Patches currently in stable-queue which might be from luto(a)kernel.org are
queue-4.9/x86-vm86-32-switch-to-flush_tlb_mm_range-in-mark_screen_rdonly.patch
queue-4.9/x86-mm-make-flush_tlb_mm_range-more-predictable.patch
queue-4.9/x86-mm-remove-flush_tlb-and-flush_tlb_current_task.patch
This is a note to let you know that I've just added the patch titled
x86/vm86/32: Switch to flush_tlb_mm_range() in mark_screen_rdonly()
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-vm86-32-switch-to-flush_tlb_mm_range-in-mark_screen_rdonly.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 9ccee2373f0658f234727700e619df097ba57023 Mon Sep 17 00:00:00 2001
From: Andy Lutomirski <luto(a)kernel.org>
Date: Sat, 22 Apr 2017 00:01:19 -0700
Subject: x86/vm86/32: Switch to flush_tlb_mm_range() in mark_screen_rdonly()
From: Andy Lutomirski <luto(a)kernel.org>
commit 9ccee2373f0658f234727700e619df097ba57023 upstream.
mark_screen_rdonly() is the last remaining caller of flush_tlb().
flush_tlb_mm_range() is potentially faster and isn't obsolete.
Compile-tested only because I don't know whether software that uses
this mechanism even exists.
Signed-off-by: Andy Lutomirski <luto(a)kernel.org>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Borislav Petkov <bp(a)alien8.de>
Cc: Brian Gerst <brgerst(a)gmail.com>
Cc: Dave Hansen <dave.hansen(a)intel.com>
Cc: Denys Vlasenko <dvlasenk(a)redhat.com>
Cc: H. Peter Anvin <hpa(a)zytor.com>
Cc: Josh Poimboeuf <jpoimboe(a)redhat.com>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Nadav Amit <namit(a)vmware.com>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Rik van Riel <riel(a)redhat.com>
Cc: Sasha Levin <sasha.levin(a)oracle.com>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Link: http://lkml.kernel.org/r/791a644076fc3577ba7f7b7cafd643cc089baa7d.149284437…
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Cc: Hugh Dickins <hughd(a)google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/kernel/vm86_32.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/x86/kernel/vm86_32.c
+++ b/arch/x86/kernel/vm86_32.c
@@ -187,7 +187,7 @@ static void mark_screen_rdonly(struct mm
pte_unmap_unlock(pte, ptl);
out:
up_write(&mm->mmap_sem);
- flush_tlb();
+ flush_tlb_mm_range(mm, 0xA0000, 0xA0000 + 32*PAGE_SIZE, 0UL);
}
Patches currently in stable-queue which might be from luto(a)kernel.org are
queue-4.4/x86-vm86-32-switch-to-flush_tlb_mm_range-in-mark_screen_rdonly.patch
queue-4.4/x86-mm-make-flush_tlb_mm_range-more-predictable.patch
queue-4.4/x86-mm-remove-flush_tlb-and-flush_tlb_current_task.patch
This is a note to let you know that I've just added the patch titled
x86/mm: Make flush_tlb_mm_range() more predictable
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-mm-make-flush_tlb_mm_range-more-predictable.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From ce27374fabf553153c3f53efcaa9bfab9216bd8c Mon Sep 17 00:00:00 2001
From: Andy Lutomirski <luto(a)kernel.org>
Date: Sat, 22 Apr 2017 00:01:21 -0700
Subject: x86/mm: Make flush_tlb_mm_range() more predictable
From: Andy Lutomirski <luto(a)kernel.org>
commit ce27374fabf553153c3f53efcaa9bfab9216bd8c upstream.
I'm about to rewrite the function almost completely, but first I
want to get a functional change out of the way. Currently, if
flush_tlb_mm_range() does not flush the local TLB at all, it will
never do individual page flushes on remote CPUs. This seems to be
an accident, and preserving it will be awkward. Let's change it
first so that any regressions in the rewrite will be easier to
bisect and so that the rewrite can attempt to change no visible
behavior at all.
The fix is simple: we can simply avoid short-circuiting the
calculation of base_pages_to_flush.
As a side effect, this also eliminates a potential corner case: if
tlb_single_page_flush_ceiling == TLB_FLUSH_ALL, flush_tlb_mm_range()
could have ended up flushing the entire address space one page at a
time.
Signed-off-by: Andy Lutomirski <luto(a)kernel.org>
Acked-by: Dave Hansen <dave.hansen(a)intel.com>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Borislav Petkov <bp(a)alien8.de>
Cc: Brian Gerst <brgerst(a)gmail.com>
Cc: Denys Vlasenko <dvlasenk(a)redhat.com>
Cc: H. Peter Anvin <hpa(a)zytor.com>
Cc: Josh Poimboeuf <jpoimboe(a)redhat.com>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Nadav Amit <namit(a)vmware.com>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Rik van Riel <riel(a)redhat.com>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Link: http://lkml.kernel.org/r/4b29b771d9975aad7154c314534fec235618175a.149284437…
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Cc: Hugh Dickins <hughd(a)google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/mm/tlb.c | 12 +++++++-----
1 file changed, 7 insertions(+), 5 deletions(-)
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -292,6 +292,12 @@ void flush_tlb_mm_range(struct mm_struct
unsigned long base_pages_to_flush = TLB_FLUSH_ALL;
preempt_disable();
+
+ if ((end != TLB_FLUSH_ALL) && !(vmflag & VM_HUGETLB))
+ base_pages_to_flush = (end - start) >> PAGE_SHIFT;
+ if (base_pages_to_flush > tlb_single_page_flush_ceiling)
+ base_pages_to_flush = TLB_FLUSH_ALL;
+
if (current->active_mm != mm) {
/* Synchronize with switch_mm. */
smp_mb();
@@ -308,15 +314,11 @@ void flush_tlb_mm_range(struct mm_struct
goto out;
}
- if ((end != TLB_FLUSH_ALL) && !(vmflag & VM_HUGETLB))
- base_pages_to_flush = (end - start) >> PAGE_SHIFT;
-
/*
* Both branches below are implicit full barriers (MOV to CR or
* INVLPG) that synchronize with switch_mm.
*/
- if (base_pages_to_flush > tlb_single_page_flush_ceiling) {
- base_pages_to_flush = TLB_FLUSH_ALL;
+ if (base_pages_to_flush == TLB_FLUSH_ALL) {
count_vm_tlb_event(NR_TLB_LOCAL_FLUSH_ALL);
local_flush_tlb();
} else {
Patches currently in stable-queue which might be from luto(a)kernel.org are
queue-4.4/x86-vm86-32-switch-to-flush_tlb_mm_range-in-mark_screen_rdonly.patch
queue-4.4/x86-mm-make-flush_tlb_mm_range-more-predictable.patch
queue-4.4/x86-mm-remove-flush_tlb-and-flush_tlb_current_task.patch
This is a note to let you know that I've just added the patch titled
x86/mm: Remove flush_tlb() and flush_tlb_current_task()
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-mm-remove-flush_tlb-and-flush_tlb_current_task.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 29961b59a51f8c6838a26a45e871a7ed6771809b Mon Sep 17 00:00:00 2001
From: Andy Lutomirski <luto(a)kernel.org>
Date: Sat, 22 Apr 2017 00:01:20 -0700
Subject: x86/mm: Remove flush_tlb() and flush_tlb_current_task()
From: Andy Lutomirski <luto(a)kernel.org>
commit 29961b59a51f8c6838a26a45e871a7ed6771809b upstream.
I was trying to figure out how flush_tlb_current_task() would
possibly work correctly if current->mm != current->active_mm, but I
realized I could spare myself the effort: it has no callers except
the unused flush_tlb() macro.
Signed-off-by: Andy Lutomirski <luto(a)kernel.org>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Borislav Petkov <bp(a)alien8.de>
Cc: Brian Gerst <brgerst(a)gmail.com>
Cc: Dave Hansen <dave.hansen(a)intel.com>
Cc: Denys Vlasenko <dvlasenk(a)redhat.com>
Cc: H. Peter Anvin <hpa(a)zytor.com>
Cc: Josh Poimboeuf <jpoimboe(a)redhat.com>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Nadav Amit <namit(a)vmware.com>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Rik van Riel <riel(a)redhat.com>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Link: http://lkml.kernel.org/r/e52d64c11690f85e9f1d69d7b48cc2269cd2e94b.149284437…
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Cc: Hugh Dickins <hughd(a)google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/include/asm/tlbflush.h | 9 ---------
arch/x86/mm/tlb.c | 17 -----------------
2 files changed, 26 deletions(-)
--- a/arch/x86/include/asm/tlbflush.h
+++ b/arch/x86/include/asm/tlbflush.h
@@ -204,7 +204,6 @@ static inline void __flush_tlb_one(unsig
/*
* TLB flushing:
*
- * - flush_tlb() flushes the current mm struct TLBs
* - flush_tlb_all() flushes all processes TLBs
* - flush_tlb_mm(mm) flushes the specified mm context TLB's
* - flush_tlb_page(vma, vmaddr) flushes one page
@@ -236,11 +235,6 @@ static inline void flush_tlb_all(void)
__flush_tlb_all();
}
-static inline void flush_tlb(void)
-{
- __flush_tlb_up();
-}
-
static inline void local_flush_tlb(void)
{
__flush_tlb_up();
@@ -302,14 +296,11 @@ static inline void flush_tlb_kernel_rang
flush_tlb_mm_range(vma->vm_mm, start, end, vma->vm_flags)
extern void flush_tlb_all(void);
-extern void flush_tlb_current_task(void);
extern void flush_tlb_page(struct vm_area_struct *, unsigned long);
extern void flush_tlb_mm_range(struct mm_struct *mm, unsigned long start,
unsigned long end, unsigned long vmflag);
extern void flush_tlb_kernel_range(unsigned long start, unsigned long end);
-#define flush_tlb() flush_tlb_current_task()
-
void native_flush_tlb_others(const struct cpumask *cpumask,
struct mm_struct *mm,
unsigned long start, unsigned long end);
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -272,23 +272,6 @@ void native_flush_tlb_others(const struc
smp_call_function_many(cpumask, flush_tlb_func, &info, 1);
}
-void flush_tlb_current_task(void)
-{
- struct mm_struct *mm = current->mm;
-
- preempt_disable();
-
- count_vm_tlb_event(NR_TLB_LOCAL_FLUSH_ALL);
-
- /* This is an implicit full barrier that synchronizes with switch_mm. */
- local_flush_tlb();
-
- trace_tlb_flush(TLB_LOCAL_SHOOTDOWN, TLB_FLUSH_ALL);
- if (cpumask_any_but(mm_cpumask(mm), smp_processor_id()) < nr_cpu_ids)
- flush_tlb_others(mm_cpumask(mm), mm, 0UL, TLB_FLUSH_ALL);
- preempt_enable();
-}
-
/*
* See Documentation/x86/tlb.txt for details. We choose 33
* because it is large enough to cover the vast majority (at
Patches currently in stable-queue which might be from luto(a)kernel.org are
queue-4.4/x86-vm86-32-switch-to-flush_tlb_mm_range-in-mark_screen_rdonly.patch
queue-4.4/x86-mm-make-flush_tlb_mm_range-more-predictable.patch
queue-4.4/x86-mm-remove-flush_tlb-and-flush_tlb_current_task.patch
This is a note to let you know that I've just added the patch titled
ring-buffer: Mask out the info bits when returning buffer page length
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
ring-buffer-mask-out-the-info-bits-when-returning-buffer-page-length.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 45d8b80c2ac5d21cd1e2954431fb676bc2b1e099 Mon Sep 17 00:00:00 2001
From: "Steven Rostedt (VMware)" <rostedt(a)goodmis.org>
Date: Fri, 22 Dec 2017 20:32:35 -0500
Subject: ring-buffer: Mask out the info bits when returning buffer page length
From: Steven Rostedt (VMware) <rostedt(a)goodmis.org>
commit 45d8b80c2ac5d21cd1e2954431fb676bc2b1e099 upstream.
Two info bits were added to the "commit" part of the ring buffer data page
when returned to be consumed. This was to inform the user space readers that
events have been missed, and that the count may be stored at the end of the
page.
What wasn't handled was the splice code that actually called a function to
return the length of the data in order to zero out the rest of the page
before sending it up to user space. These info bits were returned along with
the length, making the value negative, and that negative value was not checked.
It was compared to PAGE_SIZE, and only used if the size was less than
PAGE_SIZE. Luckily, PAGE_SIZE is an unsigned long, which made the compare an
unsigned compare, meaning the negative size value did not end up causing a
large portion of memory to be randomly zeroed out.
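In other words, the two flags live in the top bits of the 32-bit commit counter, and the length helper just has to mask them off. A condensed sketch of the fix (in the upstream source, the pre-existing RB_MISSED_EVENTS flag is bit 31, defined just above the context shown in the hunk below):

	#define RB_MISSED_EVENTS	(1 << 31)	/* events were dropped */
	#define RB_MISSED_STORED	(1 << 30)	/* dropped count stored at end of page */
	#define RB_MISSED_FLAGS		(RB_MISSED_EVENTS|RB_MISSED_STORED)

	/* data length = commit counter with the info bits cleared */
	len = (local_read(&bpage->commit) & ~RB_MISSED_FLAGS) + BUF_PAGE_HDR_SIZE;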
Fixes: 66a8cb95ed040 ("ring-buffer: Add place holder recording of dropped events")
Signed-off-by: Steven Rostedt (VMware) <rostedt(a)goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
kernel/trace/ring_buffer.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -280,6 +280,8 @@ EXPORT_SYMBOL_GPL(ring_buffer_event_data
/* Missed count stored at end */
#define RB_MISSED_STORED (1 << 30)
+#define RB_MISSED_FLAGS (RB_MISSED_EVENTS|RB_MISSED_STORED)
+
struct buffer_data_page {
u64 time_stamp; /* page time stamp */
local_t commit; /* write committed index */
@@ -331,7 +333,9 @@ static void rb_init_page(struct buffer_d
*/
size_t ring_buffer_page_len(void *page)
{
- return local_read(&((struct buffer_data_page *)page)->commit)
+ struct buffer_data_page *bpage = page;
+
+ return (local_read(&bpage->commit) & ~RB_MISSED_FLAGS)
+ BUF_PAGE_HDR_SIZE;
}
Patches currently in stable-queue which might be from rostedt(a)goodmis.org are
queue-4.9/tracing-fix-crash-when-it-fails-to-alloc-ring-buffer.patch
queue-4.9/tracing-remove-extra-zeroing-out-of-the-ring-buffer-page.patch
queue-4.9/ring-buffer-mask-out-the-info-bits-when-returning-buffer-page-length.patch
queue-4.9/tracing-fix-possible-double-free-on-failure-of-allocating-trace-buffer.patch
This is a note to let you know that I've just added the patch titled
ASoC: wm_adsp: Fix validation of firmware and coeff lengths
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
asoc-wm_adsp-fix-validation-of-firmware-and-coeff-lengths.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 50dd2ea8ef67a1617e0c0658bcbec4b9fb03b936 Mon Sep 17 00:00:00 2001
From: Ben Hutchings <ben.hutchings(a)codethink.co.uk>
Date: Fri, 8 Dec 2017 16:15:20 +0000
Subject: ASoC: wm_adsp: Fix validation of firmware and coeff lengths
From: Ben Hutchings <ben.hutchings(a)codethink.co.uk>
commit 50dd2ea8ef67a1617e0c0658bcbec4b9fb03b936 upstream.
The checks for whether another region/block header could be present
are subtracting the size from the current offset. Obviously we should
instead subtract the offset from the size.
The checks for whether the region/block data fit in the file are
adding the data size to the current offset and header size, without
checking for integer overflow. Rearrange these so that overflow is
impossible.
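The safe pattern is to keep the arithmetic on the "bytes remaining" side, so nothing is ever added to an attacker-controlled length. An illustrative sketch (hdr, data, size and payload_len are hypothetical names, not the driver's):

	/* another header fits only if the remaining bytes can hold one */
	while (pos < size && sizeof(*hdr) < size - pos) {
		hdr = (void *)&data[pos];
		payload_len = le32_to_cpu(hdr->len);

		/*
		 * The payload fits only if it is no larger than what remains
		 * after the header; written this way the check cannot overflow.
		 */
		if (payload_len > size - pos - sizeof(*hdr))
			return -EINVAL;

		pos += sizeof(*hdr) + payload_len;
	}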
Signed-off-by: Ben Hutchings <ben.hutchings(a)codethink.co.uk>
Acked-by: Charles Keepax <ckeepax(a)opensource.cirrus.com>
Tested-by: Charles Keepax <ckeepax(a)opensource.cirrus.com>
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
sound/soc/codecs/wm_adsp.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
--- a/sound/soc/codecs/wm_adsp.c
+++ b/sound/soc/codecs/wm_adsp.c
@@ -1465,7 +1465,7 @@ static int wm_adsp_load(struct wm_adsp *
le64_to_cpu(footer->timestamp));
while (pos < firmware->size &&
- pos - firmware->size > sizeof(*region)) {
+ sizeof(*region) < firmware->size - pos) {
region = (void *)&(firmware->data[pos]);
region_name = "Unknown";
reg = 0;
@@ -1526,8 +1526,8 @@ static int wm_adsp_load(struct wm_adsp *
regions, le32_to_cpu(region->len), offset,
region_name);
- if ((pos + le32_to_cpu(region->len) + sizeof(*region)) >
- firmware->size) {
+ if (le32_to_cpu(region->len) >
+ firmware->size - pos - sizeof(*region)) {
adsp_err(dsp,
"%s.%d: %s region len %d bytes exceeds file length %zu\n",
file, regions, region_name,
@@ -1992,7 +1992,7 @@ static int wm_adsp_load_coeff(struct wm_
blocks = 0;
while (pos < firmware->size &&
- pos - firmware->size > sizeof(*blk)) {
+ sizeof(*blk) < firmware->size - pos) {
blk = (void *)(&firmware->data[pos]);
type = le16_to_cpu(blk->type);
@@ -2066,8 +2066,8 @@ static int wm_adsp_load_coeff(struct wm_
}
if (reg) {
- if ((pos + le32_to_cpu(blk->len) + sizeof(*blk)) >
- firmware->size) {
+ if (le32_to_cpu(blk->len) >
+ firmware->size - pos - sizeof(*blk)) {
adsp_err(dsp,
"%s.%d: %s region len %d bytes exceeds file length %zu\n",
file, blocks, region_name,
Patches currently in stable-queue which might be from ben.hutchings(a)codethink.co.uk are
queue-4.9/asoc-wm_adsp-fix-validation-of-firmware-and-coeff-lengths.patch
This is a note to let you know that I've just added the patch titled
iw_cxgb4: Only validate the MSN for successful completions
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
iw_cxgb4-only-validate-the-msn-for-successful-completions.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From f55688c45442bc863f40ad678c638785b26cdce6 Mon Sep 17 00:00:00 2001
From: Steve Wise <swise(a)opengridcomputing.com>
Date: Mon, 18 Dec 2017 13:10:00 -0800
Subject: iw_cxgb4: Only validate the MSN for successful completions
From: Steve Wise <swise(a)opengridcomputing.com>
commit f55688c45442bc863f40ad678c638785b26cdce6 upstream.
If the RECV CQE is in error, ignore the MSN check. This was causing
recvs that were flushed into the sw cq to be completed with the wrong
status (BAD_MSN instead of FLUSHED).
Signed-off-by: Steve Wise <swise(a)opengridcomputing.com>
Signed-off-by: Jason Gunthorpe <jgg(a)mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/infiniband/hw/cxgb4/cq.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
--- a/drivers/infiniband/hw/cxgb4/cq.c
+++ b/drivers/infiniband/hw/cxgb4/cq.c
@@ -575,10 +575,10 @@ static int poll_cq(struct t4_wq *wq, str
ret = -EAGAIN;
goto skip_cqe;
}
- if (unlikely((CQE_WRID_MSN(hw_cqe) != (wq->rq.msn)))) {
+ if (unlikely(!CQE_STATUS(hw_cqe) &&
+ CQE_WRID_MSN(hw_cqe) != wq->rq.msn)) {
t4_set_wq_in_error(wq);
- hw_cqe->header |= htonl(CQE_STATUS_V(T4_ERR_MSN));
- goto proc_cqe;
+ hw_cqe->header |= cpu_to_be32(CQE_STATUS_V(T4_ERR_MSN));
}
goto proc_cqe;
}
Patches currently in stable-queue which might be from swise(a)opengridcomputing.com are
queue-4.9/iw_cxgb4-only-validate-the-msn-for-successful-completions.patch
This is a note to let you know that I've just added the patch titled
ASoC: twl4030: fix child-node lookup
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
asoc-twl4030-fix-child-node-lookup.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 15f8c5f2415bfac73f33a14bcd83422bcbfb5298 Mon Sep 17 00:00:00 2001
From: Johan Hovold <johan(a)kernel.org>
Date: Mon, 13 Nov 2017 12:12:56 +0100
Subject: ASoC: twl4030: fix child-node lookup
From: Johan Hovold <johan(a)kernel.org>
commit 15f8c5f2415bfac73f33a14bcd83422bcbfb5298 upstream.
Fix child-node lookup during probe, which ended up searching the whole
device tree depth-first starting at the parent rather than just matching
on its children.
To make things worse, the parent codec node was also prematurely freed,
while the child node was leaked.
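The fixed lookup follows the standard child-only pattern, including dropping the reference that the lookup takes. A generic sketch (the 'parent' node is hypothetical, not the driver code):

	struct device_node *child;

	child = of_get_child_by_name(parent, "codec");	/* matches direct children only */
	if (child) {
		/* ... read properties from 'child' ... */
		of_node_put(child);	/* drop the reference taken by the lookup */
	}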
Fixes: 2d6d649a2e0f ("ASoC: twl4030: Support for DT booted kernel")
Signed-off-by: Johan Hovold <johan(a)kernel.org>
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
sound/soc/codecs/twl4030.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
--- a/sound/soc/codecs/twl4030.c
+++ b/sound/soc/codecs/twl4030.c
@@ -232,7 +232,7 @@ static struct twl4030_codec_data *twl403
struct twl4030_codec_data *pdata = dev_get_platdata(codec->dev);
struct device_node *twl4030_codec_node = NULL;
- twl4030_codec_node = of_find_node_by_name(codec->dev->parent->of_node,
+ twl4030_codec_node = of_get_child_by_name(codec->dev->parent->of_node,
"codec");
if (!pdata && twl4030_codec_node) {
@@ -241,9 +241,11 @@ static struct twl4030_codec_data *twl403
GFP_KERNEL);
if (!pdata) {
dev_err(codec->dev, "Can not allocate memory\n");
+ of_node_put(twl4030_codec_node);
return NULL;
}
twl4030_setup_pdata_of(pdata, twl4030_codec_node);
+ of_node_put(twl4030_codec_node);
}
return pdata;
Patches currently in stable-queue which might be from johan(a)kernel.org are
queue-4.9/asoc-twl4030-fix-child-node-lookup.patch
queue-4.9/asoc-da7218-fix-fix-child-node-lookup.patch
This is a note to let you know that I've just added the patch titled
ASoC: tlv320aic31xx: Fix GPIO1 register definition
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
asoc-tlv320aic31xx-fix-gpio1-register-definition.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 737e0b7b67bdfe24090fab2852044bb283282fc5 Mon Sep 17 00:00:00 2001
From: "Andrew F. Davis" <afd(a)ti.com>
Date: Wed, 29 Nov 2017 15:32:46 -0600
Subject: ASoC: tlv320aic31xx: Fix GPIO1 register definition
From: Andrew F. Davis <afd(a)ti.com>
commit 737e0b7b67bdfe24090fab2852044bb283282fc5 upstream.
The GPIO1 control register is number 51; fix the definition here.
Fixes: bafcbfe429eb ("ASoC: tlv320aic31xx: Make the register values human readable")
Signed-off-by: Andrew F. Davis <afd(a)ti.com>
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
sound/soc/codecs/tlv320aic31xx.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/sound/soc/codecs/tlv320aic31xx.h
+++ b/sound/soc/codecs/tlv320aic31xx.h
@@ -115,7 +115,7 @@ struct aic31xx_pdata {
/* INT2 interrupt control */
#define AIC31XX_INT2CTRL AIC31XX_REG(0, 49)
/* GPIO1 control */
-#define AIC31XX_GPIO1 AIC31XX_REG(0, 50)
+#define AIC31XX_GPIO1 AIC31XX_REG(0, 51)
#define AIC31XX_DACPRB AIC31XX_REG(0, 60)
/* ADC Instruction Set Register */
Patches currently in stable-queue which might be from afd(a)ti.com are
queue-4.9/asoc-tlv320aic31xx-fix-gpio1-register-definition.patch
This is a note to let you know that I've just added the patch titled
ASoC: fsl_ssi: AC'97 ops need regmap, clock and cleaning up on failure
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
asoc-fsl_ssi-ac-97-ops-need-regmap-clock-and-cleaning-up-on-failure.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 695b78b548d8a26288f041e907ff17758df9e1d5 Mon Sep 17 00:00:00 2001
From: "Maciej S. Szmigiero" <mail(a)maciej.szmigiero.name>
Date: Mon, 20 Nov 2017 23:14:55 +0100
Subject: ASoC: fsl_ssi: AC'97 ops need regmap, clock and cleaning up on failure
From: Maciej S. Szmigiero <mail(a)maciej.szmigiero.name>
commit 695b78b548d8a26288f041e907ff17758df9e1d5 upstream.
The AC'97 ops (register read / write) need the SSI regmap and clock, so they
can only be set up after those have been initialized.
We also need to set these ops back to NULL if the probe fails.
Signed-off-by: Maciej S. Szmigiero <mail(a)maciej.szmigiero.name>
Acked-by: Nicolin Chen <nicoleotsuka(a)gmail.com>
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
sound/soc/fsl/fsl_ssi.c | 18 ++++++++++++------
1 file changed, 12 insertions(+), 6 deletions(-)
--- a/sound/soc/fsl/fsl_ssi.c
+++ b/sound/soc/fsl/fsl_ssi.c
@@ -1467,12 +1467,6 @@ static int fsl_ssi_probe(struct platform
sizeof(fsl_ssi_ac97_dai));
fsl_ac97_data = ssi_private;
-
- ret = snd_soc_set_ac97_ops_of_reset(&fsl_ssi_ac97_ops, pdev);
- if (ret) {
- dev_err(&pdev->dev, "could not set AC'97 ops\n");
- return ret;
- }
} else {
/* Initialize this copy of the CPU DAI driver structure */
memcpy(&ssi_private->cpu_dai_drv, &fsl_ssi_dai_template,
@@ -1583,6 +1577,14 @@ static int fsl_ssi_probe(struct platform
return ret;
}
+ if (fsl_ssi_is_ac97(ssi_private)) {
+ ret = snd_soc_set_ac97_ops_of_reset(&fsl_ssi_ac97_ops, pdev);
+ if (ret) {
+ dev_err(&pdev->dev, "could not set AC'97 ops\n");
+ goto error_ac97_ops;
+ }
+ }
+
ret = devm_snd_soc_register_component(&pdev->dev, &fsl_ssi_component,
&ssi_private->cpu_dai_drv, 1);
if (ret) {
@@ -1666,6 +1668,10 @@ error_sound_card:
fsl_ssi_debugfs_remove(&ssi_private->dbg_stats);
error_asoc_register:
+ if (fsl_ssi_is_ac97(ssi_private))
+ snd_soc_set_ac97_ops(NULL);
+
+error_ac97_ops:
if (ssi_private->soc->imx)
fsl_ssi_imx_clean(pdev, ssi_private);
Patches currently in stable-queue which might be from mail(a)maciej.szmigiero.name are
queue-4.9/asoc-fsl_ssi-ac-97-ops-need-regmap-clock-and-cleaning-up-on-failure.patch
This is a note to let you know that I've just added the patch titled
ASoC: da7218: fix fix child-node lookup
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
asoc-da7218-fix-fix-child-node-lookup.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From bc6476d6c1edcb9b97621b5131bd169aa81f27db Mon Sep 17 00:00:00 2001
From: Johan Hovold <johan(a)kernel.org>
Date: Mon, 13 Nov 2017 12:12:55 +0100
Subject: ASoC: da7218: fix fix child-node lookup
From: Johan Hovold <johan(a)kernel.org>
commit bc6476d6c1edcb9b97621b5131bd169aa81f27db upstream.
Fix child-node lookup during probe, which ended up searching the whole
device tree depth-first starting at the parent rather than just matching
on its children.
To make things worse, the parent codec node was also prematurely freed.
Fixes: 4d50934abd22 ("ASoC: da7218: Add da7218 codec driver")
Signed-off-by: Johan Hovold <johan(a)kernel.org>
Acked-by: Adam Thomson <Adam.Thomson.Opensource(a)diasemi.com>
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
sound/soc/codecs/da7218.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/sound/soc/codecs/da7218.c
+++ b/sound/soc/codecs/da7218.c
@@ -2519,7 +2519,7 @@ static struct da7218_pdata *da7218_of_to
}
if (da7218->dev_id == DA7218_DEV_ID) {
- hpldet_np = of_find_node_by_name(np, "da7218_hpldet");
+ hpldet_np = of_get_child_by_name(np, "da7218_hpldet");
if (!hpldet_np)
return pdata;
Patches currently in stable-queue which might be from johan(a)kernel.org are
queue-4.9/asoc-twl4030-fix-child-node-lookup.patch
queue-4.9/asoc-da7218-fix-fix-child-node-lookup.patch
This is a note to let you know that I've just added the patch titled
ALSA: hda - fix headset mic detection issue on a Dell machine
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
alsa-hda-fix-headset-mic-detection-issue-on-a-dell-machine.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 285d5ddcffafa5d5e68c586f4c9eaa8b24a2897d Mon Sep 17 00:00:00 2001
From: Hui Wang <hui.wang(a)canonical.com>
Date: Fri, 22 Dec 2017 11:17:45 +0800
Subject: ALSA: hda - fix headset mic detection issue on a Dell machine
From: Hui Wang <hui.wang(a)canonical.com>
commit 285d5ddcffafa5d5e68c586f4c9eaa8b24a2897d upstream.
The machine has the alc256 codec; add its pin definition to the pin quirk
table to let it apply ALC255_FIXUP_DELL1_MIC_NO_PRESENCE.
Signed-off-by: Hui Wang <hui.wang(a)canonical.com>
Signed-off-by: Takashi Iwai <tiwai(a)suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
sound/pci/hda/patch_realtek.c | 5 +++++
1 file changed, 5 insertions(+)
--- a/sound/pci/hda/patch_realtek.c
+++ b/sound/pci/hda/patch_realtek.c
@@ -5972,6 +5972,11 @@ static const struct snd_hda_pin_quirk al
{0x1b, 0x01011020},
{0x21, 0x02211010}),
SND_HDA_PIN_QUIRK(0x10ec0256, 0x1028, "Dell", ALC255_FIXUP_DELL1_MIC_NO_PRESENCE,
+ {0x12, 0x90a60130},
+ {0x14, 0x90170110},
+ {0x1b, 0x01011020},
+ {0x21, 0x0221101f}),
+ SND_HDA_PIN_QUIRK(0x10ec0256, 0x1028, "Dell", ALC255_FIXUP_DELL1_MIC_NO_PRESENCE,
{0x12, 0x90a60160},
{0x14, 0x90170120},
{0x21, 0x02211030}),
Patches currently in stable-queue which might be from hui.wang(a)canonical.com are
queue-4.9/alsa-hda-fix-headset-mic-detection-issue-on-a-dell-machine.patch
This is a note to let you know that I've just added the patch titled
ALSA: hda: Drop useless WARN_ON()
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
alsa-hda-drop-useless-warn_on.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From a36c2638380c0a4676647a1f553b70b20d3ebce1 Mon Sep 17 00:00:00 2001
From: Takashi Iwai <tiwai(a)suse.de>
Date: Fri, 22 Dec 2017 10:45:07 +0100
Subject: ALSA: hda: Drop useless WARN_ON()
From: Takashi Iwai <tiwai(a)suse.de>
commit a36c2638380c0a4676647a1f553b70b20d3ebce1 upstream.
Since commit 97cc2ed27e5a ("ALSA: hda - Fix yet another i915
pointer leftover in error path") cleared the hdac_acomp pointer, the
WARN_ON() non-NULL check in snd_hdac_i915_register_notifier() may give
a false-positive warning, as the function gets called whether or not
the component is registered. To fix this, get rid of the spurious
WARN_ON().
Fixes: 97cc2ed27e5a ("ALSA: hda - Fix yet another i915 pointer leftover in error path")
Reported-by: Kouta Okamoto <kouta.okamoto(a)toshiba.co.jp>
Signed-off-by: Takashi Iwai <tiwai(a)suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
sound/hda/hdac_i915.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/sound/hda/hdac_i915.c
+++ b/sound/hda/hdac_i915.c
@@ -319,7 +319,7 @@ static int hdac_component_master_match(s
*/
int snd_hdac_i915_register_notifier(const struct i915_audio_component_audio_ops *aops)
{
- if (WARN_ON(!hdac_acomp))
+ if (!hdac_acomp)
return -ENODEV;
hdac_acomp->audio_ops = aops;
Patches currently in stable-queue which might be from tiwai(a)suse.de are
queue-4.9/alsa-hda-drop-useless-warn_on.patch
queue-4.9/alsa-hda-fix-headset-mic-detection-issue-on-a-dell-machine.patch
This is a note to let you know that I've just added the patch titled
ring-buffer: Mask out the info bits when returning buffer page length
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
ring-buffer-mask-out-the-info-bits-when-returning-buffer-page-length.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 45d8b80c2ac5d21cd1e2954431fb676bc2b1e099 Mon Sep 17 00:00:00 2001
From: "Steven Rostedt (VMware)" <rostedt(a)goodmis.org>
Date: Fri, 22 Dec 2017 20:32:35 -0500
Subject: ring-buffer: Mask out the info bits when returning buffer page length
From: Steven Rostedt (VMware) <rostedt(a)goodmis.org>
commit 45d8b80c2ac5d21cd1e2954431fb676bc2b1e099 upstream.
Two info bits were added to the "commit" part of the ring buffer data page
when returned to be consumed. This was to inform the user space readers that
events have been missed, and that the count may be stored at the end of the
page.
What wasn't handled was the splice code that actually called a function to
return the length of the data in order to zero out the rest of the page
before sending it up to user space. These info bits were returned along with
the length, making the value negative, and that negative value was not checked.
It was compared to PAGE_SIZE, and only used if the size was less than
PAGE_SIZE. Luckily, PAGE_SIZE is an unsigned long, which made the compare an
unsigned compare, meaning the negative size value did not end up causing a
large portion of memory to be randomly zeroed out.
Fixes: 66a8cb95ed040 ("ring-buffer: Add place holder recording of dropped events")
Signed-off-by: Steven Rostedt (VMware) <rostedt(a)goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
kernel/trace/ring_buffer.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -280,6 +280,8 @@ EXPORT_SYMBOL_GPL(ring_buffer_event_data
/* Missed count stored at end */
#define RB_MISSED_STORED (1 << 30)
+#define RB_MISSED_FLAGS (RB_MISSED_EVENTS|RB_MISSED_STORED)
+
struct buffer_data_page {
u64 time_stamp; /* page time stamp */
local_t commit; /* write committed index */
@@ -331,7 +333,9 @@ static void rb_init_page(struct buffer_d
*/
size_t ring_buffer_page_len(void *page)
{
- return local_read(&((struct buffer_data_page *)page)->commit)
+ struct buffer_data_page *bpage = page;
+
+ return (local_read(&bpage->commit) & ~RB_MISSED_FLAGS)
+ BUF_PAGE_HDR_SIZE;
}
Patches currently in stable-queue which might be from rostedt(a)goodmis.org are
queue-4.4/tracing-fix-crash-when-it-fails-to-alloc-ring-buffer.patch
queue-4.4/tracing-remove-extra-zeroing-out-of-the-ring-buffer-page.patch
queue-4.4/ring-buffer-mask-out-the-info-bits-when-returning-buffer-page-length.patch
queue-4.4/tracing-fix-possible-double-free-on-failure-of-allocating-trace-buffer.patch
This is a note to let you know that I've just added the patch titled
ASoC: twl4030: fix child-node lookup
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
asoc-twl4030-fix-child-node-lookup.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 15f8c5f2415bfac73f33a14bcd83422bcbfb5298 Mon Sep 17 00:00:00 2001
From: Johan Hovold <johan(a)kernel.org>
Date: Mon, 13 Nov 2017 12:12:56 +0100
Subject: ASoC: twl4030: fix child-node lookup
From: Johan Hovold <johan(a)kernel.org>
commit 15f8c5f2415bfac73f33a14bcd83422bcbfb5298 upstream.
Fix child-node lookup during probe, which ended up searching the whole
device tree depth-first starting at the parent rather than just matching
on its children.
To make things worse, the parent codec node was also prematurely freed,
while the child node was leaked.
Fixes: 2d6d649a2e0f ("ASoC: twl4030: Support for DT booted kernel")
Signed-off-by: Johan Hovold <johan(a)kernel.org>
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
sound/soc/codecs/twl4030.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
--- a/sound/soc/codecs/twl4030.c
+++ b/sound/soc/codecs/twl4030.c
@@ -232,7 +232,7 @@ static struct twl4030_codec_data *twl403
struct twl4030_codec_data *pdata = dev_get_platdata(codec->dev);
struct device_node *twl4030_codec_node = NULL;
- twl4030_codec_node = of_find_node_by_name(codec->dev->parent->of_node,
+ twl4030_codec_node = of_get_child_by_name(codec->dev->parent->of_node,
"codec");
if (!pdata && twl4030_codec_node) {
@@ -241,9 +241,11 @@ static struct twl4030_codec_data *twl403
GFP_KERNEL);
if (!pdata) {
dev_err(codec->dev, "Can not allocate memory\n");
+ of_node_put(twl4030_codec_node);
return NULL;
}
twl4030_setup_pdata_of(pdata, twl4030_codec_node);
+ of_node_put(twl4030_codec_node);
}
return pdata;
Patches currently in stable-queue which might be from johan(a)kernel.org are
queue-4.4/mfd-twl4030-audio-fix-sibling-node-lookup.patch
queue-4.4/asoc-twl4030-fix-child-node-lookup.patch
queue-4.4/mfd-twl6040-fix-child-node-lookup.patch
This is a note to let you know that I've just added the patch titled
iw_cxgb4: Only validate the MSN for successful completions
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
iw_cxgb4-only-validate-the-msn-for-successful-completions.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From f55688c45442bc863f40ad678c638785b26cdce6 Mon Sep 17 00:00:00 2001
From: Steve Wise <swise(a)opengridcomputing.com>
Date: Mon, 18 Dec 2017 13:10:00 -0800
Subject: iw_cxgb4: Only validate the MSN for successful completions
From: Steve Wise <swise(a)opengridcomputing.com>
commit f55688c45442bc863f40ad678c638785b26cdce6 upstream.
If the RECV CQE is in error, ignore the MSN check. This was causing
recvs that were flushed into the sw cq to be completed with the wrong
status (BAD_MSN instead of FLUSHED).
Signed-off-by: Steve Wise <swise(a)opengridcomputing.com>
Signed-off-by: Jason Gunthorpe <jgg(a)mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/infiniband/hw/cxgb4/cq.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
--- a/drivers/infiniband/hw/cxgb4/cq.c
+++ b/drivers/infiniband/hw/cxgb4/cq.c
@@ -579,10 +579,10 @@ static int poll_cq(struct t4_wq *wq, str
ret = -EAGAIN;
goto skip_cqe;
}
- if (unlikely((CQE_WRID_MSN(hw_cqe) != (wq->rq.msn)))) {
+ if (unlikely(!CQE_STATUS(hw_cqe) &&
+ CQE_WRID_MSN(hw_cqe) != wq->rq.msn)) {
t4_set_wq_in_error(wq);
- hw_cqe->header |= htonl(CQE_STATUS_V(T4_ERR_MSN));
- goto proc_cqe;
+ hw_cqe->header |= cpu_to_be32(CQE_STATUS_V(T4_ERR_MSN));
}
goto proc_cqe;
}
Patches currently in stable-queue which might be from swise(a)opengridcomputing.com are
queue-4.4/iw_cxgb4-only-validate-the-msn-for-successful-completions.patch
This is a note to let you know that I've just added the patch titled
ASoC: fsl_ssi: AC'97 ops need regmap, clock and cleaning up on failure
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
asoc-fsl_ssi-ac-97-ops-need-regmap-clock-and-cleaning-up-on-failure.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 695b78b548d8a26288f041e907ff17758df9e1d5 Mon Sep 17 00:00:00 2001
From: "Maciej S. Szmigiero" <mail(a)maciej.szmigiero.name>
Date: Mon, 20 Nov 2017 23:14:55 +0100
Subject: ASoC: fsl_ssi: AC'97 ops need regmap, clock and cleaning up on failure
From: Maciej S. Szmigiero <mail(a)maciej.szmigiero.name>
commit 695b78b548d8a26288f041e907ff17758df9e1d5 upstream.
The AC'97 ops (register read / write) need the SSI regmap and clock, so they
can only be set up after those have been initialized.
We also need to set these ops back to NULL if the probe fails.
Signed-off-by: Maciej S. Szmigiero <mail(a)maciej.szmigiero.name>
Acked-by: Nicolin Chen <nicoleotsuka(a)gmail.com>
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
sound/soc/fsl/fsl_ssi.c | 18 ++++++++++++------
1 file changed, 12 insertions(+), 6 deletions(-)
--- a/sound/soc/fsl/fsl_ssi.c
+++ b/sound/soc/fsl/fsl_ssi.c
@@ -1408,12 +1408,6 @@ static int fsl_ssi_probe(struct platform
sizeof(fsl_ssi_ac97_dai));
fsl_ac97_data = ssi_private;
-
- ret = snd_soc_set_ac97_ops_of_reset(&fsl_ssi_ac97_ops, pdev);
- if (ret) {
- dev_err(&pdev->dev, "could not set AC'97 ops\n");
- return ret;
- }
} else {
/* Initialize this copy of the CPU DAI driver structure */
memcpy(&ssi_private->cpu_dai_drv, &fsl_ssi_dai_template,
@@ -1473,6 +1467,14 @@ static int fsl_ssi_probe(struct platform
return ret;
}
+ if (fsl_ssi_is_ac97(ssi_private)) {
+ ret = snd_soc_set_ac97_ops_of_reset(&fsl_ssi_ac97_ops, pdev);
+ if (ret) {
+ dev_err(&pdev->dev, "could not set AC'97 ops\n");
+ goto error_ac97_ops;
+ }
+ }
+
ret = devm_snd_soc_register_component(&pdev->dev, &fsl_ssi_component,
&ssi_private->cpu_dai_drv, 1);
if (ret) {
@@ -1556,6 +1558,10 @@ error_sound_card:
fsl_ssi_debugfs_remove(&ssi_private->dbg_stats);
error_asoc_register:
+ if (fsl_ssi_is_ac97(ssi_private))
+ snd_soc_set_ac97_ops(NULL);
+
+error_ac97_ops:
if (ssi_private->soc->imx)
fsl_ssi_imx_clean(pdev, ssi_private);
Patches currently in stable-queue which might be from mail(a)maciej.szmigiero.name are
queue-4.4/asoc-fsl_ssi-ac-97-ops-need-regmap-clock-and-cleaning-up-on-failure.patch
This is a note to let you know that I've just added the patch titled
ALSA: hda: Drop useless WARN_ON()
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
alsa-hda-drop-useless-warn_on.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From a36c2638380c0a4676647a1f553b70b20d3ebce1 Mon Sep 17 00:00:00 2001
From: Takashi Iwai <tiwai(a)suse.de>
Date: Fri, 22 Dec 2017 10:45:07 +0100
Subject: ALSA: hda: Drop useless WARN_ON()
From: Takashi Iwai <tiwai(a)suse.de>
commit a36c2638380c0a4676647a1f553b70b20d3ebce1 upstream.
Since commit 97cc2ed27e5a ("ALSA: hda - Fix yet another i915
pointer leftover in error path") cleared the hdac_acomp pointer, the
WARN_ON() non-NULL check in snd_hdac_i915_register_notifier() may give
a false-positive warning, as the function gets called whether or not
the component is registered. To fix this, get rid of the spurious
WARN_ON().
Fixes: 97cc2ed27e5a ("ALSA: hda - Fix yet another i915 pointer leftover in error path")
Reported-by: Kouta Okamoto <kouta.okamoto(a)toshiba.co.jp>
Signed-off-by: Takashi Iwai <tiwai(a)suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
sound/hda/hdac_i915.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/sound/hda/hdac_i915.c
+++ b/sound/hda/hdac_i915.c
@@ -183,7 +183,7 @@ static int hdac_component_master_match(s
*/
int snd_hdac_i915_register_notifier(const struct i915_audio_component_audio_ops *aops)
{
- if (WARN_ON(!hdac_acomp))
+ if (!hdac_acomp)
return -ENODEV;
hdac_acomp->audio_ops = aops;
Patches currently in stable-queue which might be from tiwai(a)suse.de are
queue-4.4/alsa-usb-audio-fix-the-missing-ctl-name-suffix-at-parsing-su.patch
queue-4.4/alsa-hda-drop-useless-warn_on.patch
queue-4.4/alsa-rawmidi-avoid-racy-info-ioctl-via-ctl-device.patch
queue-4.4/alsa-hda-fix-headset-mic-detection-issue-on-a-dell-machine.patch
queue-4.4/acpi-apei-erst-fix-missing-error-handling-in-erst_reader.patch
This is a note to let you know that I've just added the patch titled
ALSA: hda - fix headset mic detection issue on a Dell machine
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
alsa-hda-fix-headset-mic-detection-issue-on-a-dell-machine.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 285d5ddcffafa5d5e68c586f4c9eaa8b24a2897d Mon Sep 17 00:00:00 2001
From: Hui Wang <hui.wang(a)canonical.com>
Date: Fri, 22 Dec 2017 11:17:45 +0800
Subject: ALSA: hda - fix headset mic detection issue on a Dell machine
From: Hui Wang <hui.wang(a)canonical.com>
commit 285d5ddcffafa5d5e68c586f4c9eaa8b24a2897d upstream.
The machine has the ALC256 codec; add its pin definition to the pin
quirk table so that ALC255_FIXUP_DELL1_MIC_NO_PRESENCE is applied to it.
Signed-off-by: Hui Wang <hui.wang(a)canonical.com>
Signed-off-by: Takashi Iwai <tiwai(a)suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
sound/pci/hda/patch_realtek.c | 5 +++++
1 file changed, 5 insertions(+)
--- a/sound/pci/hda/patch_realtek.c
+++ b/sound/pci/hda/patch_realtek.c
@@ -5954,6 +5954,11 @@ static const struct snd_hda_pin_quirk al
{0x1b, 0x01011020},
{0x21, 0x02211010}),
SND_HDA_PIN_QUIRK(0x10ec0256, 0x1028, "Dell", ALC255_FIXUP_DELL1_MIC_NO_PRESENCE,
+ {0x12, 0x90a60130},
+ {0x14, 0x90170110},
+ {0x1b, 0x01011020},
+ {0x21, 0x0221101f}),
+ SND_HDA_PIN_QUIRK(0x10ec0256, 0x1028, "Dell", ALC255_FIXUP_DELL1_MIC_NO_PRESENCE,
{0x12, 0x90a60160},
{0x14, 0x90170120},
{0x21, 0x02211030}),
Patches currently in stable-queue which might be from hui.wang(a)canonical.com are
queue-4.4/alsa-hda-fix-headset-mic-detection-issue-on-a-dell-machine.patch
This is a note to let you know that I've just added the patch titled
ring-buffer: Mask out the info bits when returning buffer page length
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
ring-buffer-mask-out-the-info-bits-when-returning-buffer-page-length.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 45d8b80c2ac5d21cd1e2954431fb676bc2b1e099 Mon Sep 17 00:00:00 2001
From: "Steven Rostedt (VMware)" <rostedt(a)goodmis.org>
Date: Fri, 22 Dec 2017 20:32:35 -0500
Subject: ring-buffer: Mask out the info bits when returning buffer page length
From: Steven Rostedt (VMware) <rostedt(a)goodmis.org>
commit 45d8b80c2ac5d21cd1e2954431fb676bc2b1e099 upstream.
Two info bits were added to the "commit" part of the ring buffer data page
when returned to be consumed. This was to inform the user space readers that
events have been missed, and that the count may be stored at the end of the
page.
What wasn't handled was the splice code that actually called a function to
return the length of the data in order to zero out the rest of the page
before sending it up to user space. These info bits were returned as part of
the length, making the value negative, and that negative value was not checked.
It was compared to PAGE_SIZE, and only used if the size was less than
PAGE_SIZE. Luckily PAGE_SIZE is unsigned long which made the compare an
unsigned compare, meaning the negative size value did not end up causing a
large portion of memory to be randomly zeroed out.
Fixes: 66a8cb95ed040 ("ring-buffer: Add place holder recording of dropped events")
Signed-off-by: Steven Rostedt (VMware) <rostedt(a)goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
kernel/trace/ring_buffer.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -281,6 +281,8 @@ EXPORT_SYMBOL_GPL(ring_buffer_event_data
/* Missed count stored at end */
#define RB_MISSED_STORED (1 << 30)
+#define RB_MISSED_FLAGS (RB_MISSED_EVENTS|RB_MISSED_STORED)
+
struct buffer_data_page {
u64 time_stamp; /* page time stamp */
local_t commit; /* write committed index */
@@ -332,7 +334,9 @@ static void rb_init_page(struct buffer_d
*/
size_t ring_buffer_page_len(void *page)
{
- return local_read(&((struct buffer_data_page *)page)->commit)
+ struct buffer_data_page *bpage = page;
+
+ return (local_read(&bpage->commit) & ~RB_MISSED_FLAGS)
+ BUF_PAGE_HDR_SIZE;
}
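For illustration only (this is not part of the patch): a minimal userspace
sketch of the arithmetic described in the changelog, assuming the same bit
positions as the kernel's RB_MISSED_* defines. It shows how the raw commit
value with info bits set fails the unsigned "< PAGE_SIZE" test, and how
masking the flag bits recovers the real data length.

#include <stdio.h>

#define PAGE_SIZE        4096UL
#define RB_MISSED_EVENTS (1UL << 31)
#define RB_MISSED_STORED (1UL << 30)
#define RB_MISSED_FLAGS  (RB_MISSED_EVENTS | RB_MISSED_STORED)

int main(void)
{
	/* commit field with both info bits set and a real data length of 100 */
	unsigned long commit = RB_MISSED_EVENTS | RB_MISSED_STORED | 100;

	/* Unmasked, the value is far larger than PAGE_SIZE, so the splice
	 * path's "len < PAGE_SIZE" test happens to reject it - the "lucky"
	 * unsigned compare the changelog mentions. */
	printf("raw value %#lx, below PAGE_SIZE? %d\n",
	       commit, commit < PAGE_SIZE);

	/* Masking the info bits, as the fix does, recovers the real length. */
	printf("masked length %lu\n", commit & ~RB_MISSED_FLAGS);
	return 0;
}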
Patches currently in stable-queue which might be from rostedt(a)goodmis.org are
queue-4.14/ring-buffer-do-no-reuse-reader-page-if-still-in-use.patch
queue-4.14/tracing-fix-crash-when-it-fails-to-alloc-ring-buffer.patch
queue-4.14/tracing-remove-extra-zeroing-out-of-the-ring-buffer-page.patch
queue-4.14/ring-buffer-mask-out-the-info-bits-when-returning-buffer-page-length.patch
queue-4.14/tracing-fix-possible-double-free-on-failure-of-allocating-trace-buffer.patch
This is a note to let you know that I've just added the patch titled
iw_cxgb4: Only validate the MSN for successful completions
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
iw_cxgb4-only-validate-the-msn-for-successful-completions.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From f55688c45442bc863f40ad678c638785b26cdce6 Mon Sep 17 00:00:00 2001
From: Steve Wise <swise(a)opengridcomputing.com>
Date: Mon, 18 Dec 2017 13:10:00 -0800
Subject: iw_cxgb4: Only validate the MSN for successful completions
From: Steve Wise <swise(a)opengridcomputing.com>
commit f55688c45442bc863f40ad678c638785b26cdce6 upstream.
If the RECV CQE is in error, ignore the MSN check. This was causing
recvs that were flushed into the sw cq to be completed with the wrong
status (BAD_MSN instead of FLUSHED).
Signed-off-by: Steve Wise <swise(a)opengridcomputing.com>
Signed-off-by: Jason Gunthorpe <jgg(a)mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/infiniband/hw/cxgb4/cq.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
--- a/drivers/infiniband/hw/cxgb4/cq.c
+++ b/drivers/infiniband/hw/cxgb4/cq.c
@@ -586,10 +586,10 @@ static int poll_cq(struct t4_wq *wq, str
ret = -EAGAIN;
goto skip_cqe;
}
- if (unlikely((CQE_WRID_MSN(hw_cqe) != (wq->rq.msn)))) {
+ if (unlikely(!CQE_STATUS(hw_cqe) &&
+ CQE_WRID_MSN(hw_cqe) != wq->rq.msn)) {
t4_set_wq_in_error(wq);
- hw_cqe->header |= htonl(CQE_STATUS_V(T4_ERR_MSN));
- goto proc_cqe;
+ hw_cqe->header |= cpu_to_be32(CQE_STATUS_V(T4_ERR_MSN));
}
goto proc_cqe;
}
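Purely as an illustration of the changed logic (the struct and names below
are invented, not the driver's real types): the MSN sanity check is now run
only for successful completions, so a flushed or errored CQE keeps its
original status instead of being rewritten to BAD_MSN.

#include <stdbool.h>
#include <stdio.h>

/* Hypothetical, simplified completion entry. */
struct cqe {
	int status;        /* 0 on success, non-zero on error/flush */
	unsigned int msn;  /* message sequence number */
};

/* Only flag a bad MSN for successful completions. */
static bool msn_is_bad(const struct cqe *cqe, unsigned int expected_msn)
{
	return cqe->status == 0 && cqe->msn != expected_msn;
}

int main(void)
{
	struct cqe flushed = { .status = 5, .msn = 0 };
	struct cqe good = { .status = 0, .msn = 7 };

	printf("flushed CQE flagged as bad MSN? %d\n", msn_is_bad(&flushed, 7));
	printf("successful CQE with wrong MSN flagged? %d\n", msn_is_bad(&good, 8));
	return 0;
}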
Patches currently in stable-queue which might be from swise(a)opengridcomputing.com are
queue-4.14/iw_cxgb4-only-validate-the-msn-for-successful-completions.patch
This is a note to let you know that I've just added the patch titled
ring-buffer: Do no reuse reader page if still in use
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
ring-buffer-do-no-reuse-reader-page-if-still-in-use.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From ae415fa4c5248a8cf4faabd5a3c20576cb1ad607 Mon Sep 17 00:00:00 2001
From: "Steven Rostedt (VMware)" <rostedt(a)goodmis.org>
Date: Fri, 22 Dec 2017 21:19:29 -0500
Subject: ring-buffer: Do no reuse reader page if still in use
From: Steven Rostedt (VMware) <rostedt(a)goodmis.org>
commit ae415fa4c5248a8cf4faabd5a3c20576cb1ad607 upstream.
To free the reader page that is allocated with ring_buffer_alloc_read_page(),
ring_buffer_free_read_page() must be called. For faster performance, this
page can be reused by the ring buffer to avoid having to free and allocate
new pages.
The issue arises when the page is used with a splice pipe into the
networking code. The networking code may up the page counter for the page
and keep it active while it is queued to be sent to the network.
Incrementing the page ref does not prevent the ring buffer from reusing the
page, so the page that is on its way out to the network can be overwritten
with new data before it is actually sent.
Add a check to the page ref counter, and only reuse the page if it is not
being used anywhere else.
Fixes: 73a757e63114d ("ring-buffer: Return reader page back into existing ring buffer")
Signed-off-by: Steven Rostedt (VMware) <rostedt(a)goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
kernel/trace/ring_buffer.c | 6 ++++++
1 file changed, 6 insertions(+)
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -4443,8 +4443,13 @@ void ring_buffer_free_read_page(struct r
{
struct ring_buffer_per_cpu *cpu_buffer = buffer->buffers[cpu];
struct buffer_data_page *bpage = data;
+ struct page *page = virt_to_page(bpage);
unsigned long flags;
+ /* If the page is still in use someplace else, we can't reuse it */
+ if (page_ref_count(page) > 1)
+ goto out;
+
local_irq_save(flags);
arch_spin_lock(&cpu_buffer->lock);
@@ -4456,6 +4461,7 @@ void ring_buffer_free_read_page(struct r
arch_spin_unlock(&cpu_buffer->lock);
local_irq_restore(flags);
+ out:
free_page((unsigned long)bpage);
}
EXPORT_SYMBOL_GPL(ring_buffer_free_read_page);
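As an aside, a toy userspace sketch of the reuse rule the fix introduces
(the types and the one-slot cache are invented for illustration): a buffer
is only handed back to the reuse cache when the caller holds the sole
remaining reference.

#include <stdatomic.h>
#include <stdio.h>

/* Toy page descriptor: just a reference count. */
struct page_desc {
	atomic_int refcount;
};

static struct page_desc *reader_page_cache;  /* one-slot reuse cache */

/* Only hand the page back to the reuse cache when we are the only
 * remaining user; a splice consumer still holding a reference could
 * otherwise see the page being rewritten under it. */
static void release_reader_page(struct page_desc *p)
{
	if (atomic_load(&p->refcount) == 1 && !reader_page_cache)
		reader_page_cache = p;          /* safe to recycle later */
	atomic_fetch_sub(&p->refcount, 1);      /* drop our reference either way */
}

int main(void)
{
	struct page_desc page = { .refcount = 2 };  /* splice still holds one */

	release_reader_page(&page);
	printf("cached for reuse? %s\n", reader_page_cache ? "yes" : "no");
	return 0;
}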
Patches currently in stable-queue which might be from rostedt(a)goodmis.org are
queue-4.14/ring-buffer-do-no-reuse-reader-page-if-still-in-use.patch
queue-4.14/tracing-fix-crash-when-it-fails-to-alloc-ring-buffer.patch
queue-4.14/tracing-remove-extra-zeroing-out-of-the-ring-buffer-page.patch
queue-4.14/ring-buffer-mask-out-the-info-bits-when-returning-buffer-page-length.patch
queue-4.14/tracing-fix-possible-double-free-on-failure-of-allocating-trace-buffer.patch
This is a note to let you know that I've just added the patch titled
IB/mlx5: Serialize access to the VMA list
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
ib-mlx5-serialize-access-to-the-vma-list.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From ad9a3668a434faca1339789ed2f043d679199309 Mon Sep 17 00:00:00 2001
From: Majd Dibbiny <majd(a)mellanox.com>
Date: Sun, 24 Dec 2017 13:54:56 +0200
Subject: IB/mlx5: Serialize access to the VMA list
From: Majd Dibbiny <majd(a)mellanox.com>
commit ad9a3668a434faca1339789ed2f043d679199309 upstream.
User-space applications can do mmap and munmap directly at
any time.
Since the VMA list is not protected with a mutex, concurrent
accesses to it from the mmap and munmap paths can cause
data corruption. Add a mutex around the list.
Fixes: 7c2344c3bbf9 ("IB/mlx5: Implements disassociate_ucontext API")
Reviewed-by: Yishai Hadas <yishaih(a)mellanox.com>
Signed-off-by: Majd Dibbiny <majd(a)mellanox.com>
Signed-off-by: Leon Romanovsky <leon(a)kernel.org>
Signed-off-by: Jason Gunthorpe <jgg(a)mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/infiniband/hw/mlx5/main.c | 8 ++++++++
drivers/infiniband/hw/mlx5/mlx5_ib.h | 4 ++++
2 files changed, 12 insertions(+)
--- a/drivers/infiniband/hw/mlx5/main.c
+++ b/drivers/infiniband/hw/mlx5/main.c
@@ -1415,6 +1415,7 @@ static struct ib_ucontext *mlx5_ib_alloc
}
INIT_LIST_HEAD(&context->vma_private_list);
+ mutex_init(&context->vma_private_list_mutex);
INIT_LIST_HEAD(&context->db_page_list);
mutex_init(&context->db_page_mutex);
@@ -1576,7 +1577,9 @@ static void mlx5_ib_vma_close(struct vm
* mlx5_ib_disassociate_ucontext().
*/
mlx5_ib_vma_priv_data->vma = NULL;
+ mutex_lock(mlx5_ib_vma_priv_data->vma_private_list_mutex);
list_del(&mlx5_ib_vma_priv_data->list);
+ mutex_unlock(mlx5_ib_vma_priv_data->vma_private_list_mutex);
kfree(mlx5_ib_vma_priv_data);
}
@@ -1596,10 +1599,13 @@ static int mlx5_ib_set_vma_data(struct v
return -ENOMEM;
vma_prv->vma = vma;
+ vma_prv->vma_private_list_mutex = &ctx->vma_private_list_mutex;
vma->vm_private_data = vma_prv;
vma->vm_ops = &mlx5_ib_vm_ops;
+ mutex_lock(&ctx->vma_private_list_mutex);
list_add(&vma_prv->list, vma_head);
+ mutex_unlock(&ctx->vma_private_list_mutex);
return 0;
}
@@ -1642,6 +1648,7 @@ static void mlx5_ib_disassociate_ucontex
* mlx5_ib_vma_close.
*/
down_write(&owning_mm->mmap_sem);
+ mutex_lock(&context->vma_private_list_mutex);
list_for_each_entry_safe(vma_private, n, &context->vma_private_list,
list) {
vma = vma_private->vma;
@@ -1656,6 +1663,7 @@ static void mlx5_ib_disassociate_ucontex
list_del(&vma_private->list);
kfree(vma_private);
}
+ mutex_unlock(&context->vma_private_list_mutex);
up_write(&owning_mm->mmap_sem);
mmput(owning_mm);
put_task_struct(owning_process);
--- a/drivers/infiniband/hw/mlx5/mlx5_ib.h
+++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h
@@ -115,6 +115,8 @@ enum {
struct mlx5_ib_vma_private_data {
struct list_head list;
struct vm_area_struct *vma;
+ /* protect vma_private_list add/del */
+ struct mutex *vma_private_list_mutex;
};
struct mlx5_ib_ucontext {
@@ -129,6 +131,8 @@ struct mlx5_ib_ucontext {
/* Transport Domain number */
u32 tdn;
struct list_head vma_private_list;
+ /* protect vma_private_list add/del */
+ struct mutex vma_private_list_mutex;
unsigned long upd_xlt_page;
/* protect ODP/KSM */
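A generic sketch of the locking pattern being added, using pthreads and a
toy singly-linked list rather than the mlx5 structures (all names below are
made up): both the add path and the delete path take the same mutex, so a
concurrent mmap and munmap can no longer corrupt the list links.

#include <pthread.h>
#include <stdlib.h>

struct node {
	struct node *next;
	void *vma;                 /* stand-in for the tracked object */
};

struct ctx {
	pthread_mutex_t list_lock; /* protects add/del on vma_list */
	struct node *vma_list;
};

static int ctx_track(struct ctx *c, void *vma)
{
	struct node *n = malloc(sizeof(*n));

	if (!n)
		return -1;
	n->vma = vma;
	pthread_mutex_lock(&c->list_lock);
	n->next = c->vma_list;
	c->vma_list = n;
	pthread_mutex_unlock(&c->list_lock);
	return 0;
}

static void ctx_untrack(struct ctx *c, void *vma)
{
	struct node **pp;

	pthread_mutex_lock(&c->list_lock);
	for (pp = &c->vma_list; *pp; pp = &(*pp)->next) {
		if ((*pp)->vma == vma) {
			struct node *dead = *pp;

			*pp = dead->next;
			free(dead);
			break;
		}
	}
	pthread_mutex_unlock(&c->list_lock);
}

int main(void)
{
	struct ctx c = { .vma_list = NULL };
	int dummy;

	pthread_mutex_init(&c.list_lock, NULL);
	ctx_track(&c, &dummy);
	ctx_untrack(&c, &dummy);
	pthread_mutex_destroy(&c.list_lock);
	return 0;
}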
Patches currently in stable-queue which might be from majd(a)mellanox.com are
queue-4.14/ib-mlx5-serialize-access-to-the-vma-list.patch
This is a note to let you know that I've just added the patch titled
IB/uverbs: Fix command checking as part of ib_uverbs_ex_modify_qp()
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
ib-uverbs-fix-command-checking-as-part-of-ib_uverbs_ex_modify_qp.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 05d14e7b0c138cb07ba30e464f47b39434f3fdef Mon Sep 17 00:00:00 2001
From: Moni Shoua <monis(a)mellanox.com>
Date: Sun, 24 Dec 2017 13:54:57 +0200
Subject: IB/uverbs: Fix command checking as part of ib_uverbs_ex_modify_qp()
From: Moni Shoua <monis(a)mellanox.com>
commit 05d14e7b0c138cb07ba30e464f47b39434f3fdef upstream.
If the input command length is larger than the kernel supports, an error
should be returned when the unsupported bytes are not cleared, instead of
the other way around. This matches what all other callers of
ib_is_udata_cleared() do and will avoid user ABI problems in the future.
Fixes: 189aba99e700 ("IB/uverbs: Extend modify_qp and support packet pacing")
Reviewed-by: Yishai Hadas <yishaih(a)mellanox.com>
Signed-off-by: Moni Shoua <monis(a)mellanox.com>
Signed-off-by: Leon Romanovsky <leon(a)kernel.org>
Signed-off-by: Jason Gunthorpe <jgg(a)mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/infiniband/core/uverbs_cmd.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/infiniband/core/uverbs_cmd.c
+++ b/drivers/infiniband/core/uverbs_cmd.c
@@ -2085,8 +2085,8 @@ int ib_uverbs_ex_modify_qp(struct ib_uve
return -EOPNOTSUPP;
if (ucore->inlen > sizeof(cmd)) {
- if (ib_is_udata_cleared(ucore, sizeof(cmd),
- ucore->inlen - sizeof(cmd)))
+ if (!ib_is_udata_cleared(ucore, sizeof(cmd),
+ ucore->inlen - sizeof(cmd)))
return -EOPNOTSUPP;
}
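For illustration, here is the intended semantics in plain userspace C (the
helper below is a made-up stand-in, not the ib_is_udata_cleared()
implementation): the request is rejected only when the bytes beyond the
structure the kernel knows about are not all zero.

#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* True if every byte in buf[from..len) is zero, i.e. the part of the user
 * command this kernel does not understand carries no information. */
static bool tail_is_cleared(const unsigned char *buf, size_t from, size_t len)
{
	for (size_t i = from; i < len; i++)
		if (buf[i])
			return false;
	return true;
}

int main(void)
{
	unsigned char cmd[8] = { 1, 2, 3, 4, 0, 0, 0, 9 };
	size_t known = 4;   /* size of the structure this kernel knows about */

	/* Correct logic: reject only when the tail is NOT cleared. */
	if (!tail_is_cleared(cmd, known, sizeof(cmd)))
		printf("-EOPNOTSUPP: unknown trailing bytes are non-zero\n");
	return 0;
}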
Patches currently in stable-queue which might be from monis(a)mellanox.com are
queue-4.14/ib-core-verify-that-qp-is-security-enabled-in-create-and-destroy.patch
queue-4.14/ib-uverbs-fix-command-checking-as-part-of-ib_uverbs_ex_modify_qp.patch
This is a note to let you know that I've just added the patch titled
IB/core: Verify that QP is security enabled in create and destroy
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
ib-core-verify-that-qp-is-security-enabled-in-create-and-destroy.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 4a50881bbac309e6f0684816a180bc3c14e1485d Mon Sep 17 00:00:00 2001
From: Moni Shoua <monis(a)mellanox.com>
Date: Sun, 24 Dec 2017 13:54:58 +0200
Subject: IB/core: Verify that QP is security enabled in create and destroy
From: Moni Shoua <monis(a)mellanox.com>
commit 4a50881bbac309e6f0684816a180bc3c14e1485d upstream.
The XRC target QP create flow sets up qp_sec only if there is an IB link with
LSM security enabled. However, several other related uAPI entry points blindly
follow the qp_sec NULL pointer, resulting in a possible oops.
Check for NULL before using qp_sec.
Fixes: d291f1a65232 ("IB/core: Enforce PKey security on QPs")
Reviewed-by: Daniel Jurgens <danielj(a)mellanox.com>
Signed-off-by: Moni Shoua <monis(a)mellanox.com>
Signed-off-by: Leon Romanovsky <leon(a)kernel.org>
Signed-off-by: Jason Gunthorpe <jgg(a)mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/infiniband/core/security.c | 3 +++
drivers/infiniband/core/verbs.c | 3 ++-
2 files changed, 5 insertions(+), 1 deletion(-)
--- a/drivers/infiniband/core/security.c
+++ b/drivers/infiniband/core/security.c
@@ -386,6 +386,9 @@ int ib_open_shared_qp_security(struct ib
if (ret)
return ret;
+ if (!qp->qp_sec)
+ return 0;
+
mutex_lock(&real_qp->qp_sec->mutex);
ret = check_qp_port_pkey_settings(real_qp->qp_sec->ports_pkeys,
qp->qp_sec);
--- a/drivers/infiniband/core/verbs.c
+++ b/drivers/infiniband/core/verbs.c
@@ -1400,7 +1400,8 @@ int ib_close_qp(struct ib_qp *qp)
spin_unlock_irqrestore(&real_qp->device->event_handler_lock, flags);
atomic_dec(&real_qp->usecnt);
- ib_close_shared_qp_security(qp->qp_sec);
+ if (qp->qp_sec)
+ ib_close_shared_qp_security(qp->qp_sec);
kfree(qp);
return 0;
Patches currently in stable-queue which might be from monis(a)mellanox.com are
queue-4.14/ib-core-verify-that-qp-is-security-enabled-in-create-and-destroy.patch
queue-4.14/ib-uverbs-fix-command-checking-as-part-of-ib_uverbs_ex_modify_qp.patch
This is a note to let you know that I've just added the patch titled
IB/hfi: Only read capability registers if the capability exists
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
ib-hfi-only-read-capability-registers-if-the-capability-exists.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 4c009af473b2026caaa26107e34d7cc68dad7756 Mon Sep 17 00:00:00 2001
From: "Michael J. Ruhl" <michael.j.ruhl(a)intel.com>
Date: Fri, 22 Dec 2017 08:47:20 -0800
Subject: IB/hfi: Only read capability registers if the capability exists
From: Michael J. Ruhl <michael.j.ruhl(a)intel.com>
commit 4c009af473b2026caaa26107e34d7cc68dad7756 upstream.
During driver init, various registers are saved to allow restoration
after an FLR or gen3 bump. Some of these registers are not available
in some circumstances (e.g. virtual machines).
This bug makes the driver unusable when the PCI device is passed into
a VM; it fails during probe.
Delete unnecessary register read/write, and only access register if
the capability exists.
Fixes: a618b7e40af2 ("IB/hfi1: Move saving PCI values to a separate function")
Reviewed-by: Mike Marciniszyn <mike.marciniszyn(a)intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl(a)intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro(a)intel.com>
Signed-off-by: Jason Gunthorpe <jgg(a)mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/infiniband/hw/hfi1/hfi.h | 1 -
drivers/infiniband/hw/hfi1/pcie.c | 30 ++++++++++++------------------
2 files changed, 12 insertions(+), 19 deletions(-)
--- a/drivers/infiniband/hw/hfi1/hfi.h
+++ b/drivers/infiniband/hw/hfi1/hfi.h
@@ -1129,7 +1129,6 @@ struct hfi1_devdata {
u16 pcie_lnkctl;
u16 pcie_devctl2;
u32 pci_msix0;
- u32 pci_lnkctl3;
u32 pci_tph2;
/*
--- a/drivers/infiniband/hw/hfi1/pcie.c
+++ b/drivers/infiniband/hw/hfi1/pcie.c
@@ -411,15 +411,12 @@ int restore_pci_variables(struct hfi1_de
if (ret)
goto error;
- ret = pci_write_config_dword(dd->pcidev, PCIE_CFG_SPCIE1,
- dd->pci_lnkctl3);
- if (ret)
- goto error;
-
- ret = pci_write_config_dword(dd->pcidev, PCIE_CFG_TPH2, dd->pci_tph2);
- if (ret)
- goto error;
-
+ if (pci_find_ext_capability(dd->pcidev, PCI_EXT_CAP_ID_TPH)) {
+ ret = pci_write_config_dword(dd->pcidev, PCIE_CFG_TPH2,
+ dd->pci_tph2);
+ if (ret)
+ goto error;
+ }
return 0;
error:
@@ -469,15 +466,12 @@ int save_pci_variables(struct hfi1_devda
if (ret)
goto error;
- ret = pci_read_config_dword(dd->pcidev, PCIE_CFG_SPCIE1,
- &dd->pci_lnkctl3);
- if (ret)
- goto error;
-
- ret = pci_read_config_dword(dd->pcidev, PCIE_CFG_TPH2, &dd->pci_tph2);
- if (ret)
- goto error;
-
+ if (pci_find_ext_capability(dd->pcidev, PCI_EXT_CAP_ID_TPH)) {
+ ret = pci_read_config_dword(dd->pcidev, PCIE_CFG_TPH2,
+ &dd->pci_tph2);
+ if (ret)
+ goto error;
+ }
return 0;
error:
Patches currently in stable-queue which might be from michael.j.ruhl(a)intel.com are
queue-4.14/ib-hfi-only-read-capability-registers-if-the-capability-exists.patch
This is a note to let you know that I've just added the patch titled
gpio: fix "gpio-line-names" property retrieval
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
gpio-fix-gpio-line-names-property-retrieval.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 822703354774ec935169cbbc8d503236bcb54fda Mon Sep 17 00:00:00 2001
From: Christophe Leroy <christophe.leroy(a)c-s.fr>
Date: Fri, 15 Dec 2017 15:02:33 +0100
Subject: gpio: fix "gpio-line-names" property retrieval
From: Christophe Leroy <christophe.leroy(a)c-s.fr>
commit 822703354774ec935169cbbc8d503236bcb54fda upstream.
Following commit 9427ecbed46cc ("gpio: Rework of_gpiochip_set_names()
to use device property accessors"), "gpio-line-names" DT property is
not retrieved anymore when chip->parent is not set by the driver.
This is due to OF based property reads having been replaced by device
based property reads.
This patch fixes that by making use of
fwnode_property_read_string_array() instead of
device_property_read_string_array() and handing over either
of_fwnode_handle(chip->of_node) or dev_fwnode(chip->parent)
to that function.
Fixes: 9427ecbed46cc ("gpio: Rework of_gpiochip_set_names() to use device property accessors")
Signed-off-by: Christophe Leroy <christophe.leroy(a)c-s.fr>
Acked-by: Mika Westerberg <mika.westerberg(a)linux.intel.com>
Signed-off-by: Linus Walleij <linus.walleij(a)linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/gpio/gpiolib-acpi.c | 2 +-
drivers/gpio/gpiolib-devprop.c | 17 +++++++----------
drivers/gpio/gpiolib-of.c | 3 ++-
drivers/gpio/gpiolib.h | 3 ++-
4 files changed, 12 insertions(+), 13 deletions(-)
--- a/drivers/gpio/gpiolib-acpi.c
+++ b/drivers/gpio/gpiolib-acpi.c
@@ -1074,7 +1074,7 @@ void acpi_gpiochip_add(struct gpio_chip
}
if (!chip->names)
- devprop_gpiochip_set_names(chip);
+ devprop_gpiochip_set_names(chip, dev_fwnode(chip->parent));
acpi_gpiochip_request_regions(acpi_gpio);
acpi_gpiochip_scan_gpios(acpi_gpio);
--- a/drivers/gpio/gpiolib-devprop.c
+++ b/drivers/gpio/gpiolib-devprop.c
@@ -19,30 +19,27 @@
/**
* devprop_gpiochip_set_names - Set GPIO line names using device properties
* @chip: GPIO chip whose lines should be named, if possible
+ * @fwnode: Property Node containing the gpio-line-names property
*
* Looks for device property "gpio-line-names" and if it exists assigns
* GPIO line names for the chip. The memory allocated for the assigned
* names belong to the underlying firmware node and should not be released
* by the caller.
*/
-void devprop_gpiochip_set_names(struct gpio_chip *chip)
+void devprop_gpiochip_set_names(struct gpio_chip *chip,
+ const struct fwnode_handle *fwnode)
{
struct gpio_device *gdev = chip->gpiodev;
const char **names;
int ret, i;
- if (!chip->parent) {
- dev_warn(&gdev->dev, "GPIO chip parent is NULL\n");
- return;
- }
-
- ret = device_property_read_string_array(chip->parent, "gpio-line-names",
+ ret = fwnode_property_read_string_array(fwnode, "gpio-line-names",
NULL, 0);
if (ret < 0)
return;
if (ret != gdev->ngpio) {
- dev_warn(chip->parent,
+ dev_warn(&gdev->dev,
"names %d do not match number of GPIOs %d\n", ret,
gdev->ngpio);
return;
@@ -52,10 +49,10 @@ void devprop_gpiochip_set_names(struct g
if (!names)
return;
- ret = device_property_read_string_array(chip->parent, "gpio-line-names",
+ ret = fwnode_property_read_string_array(fwnode, "gpio-line-names",
names, gdev->ngpio);
if (ret < 0) {
- dev_warn(chip->parent, "failed to read GPIO line names\n");
+ dev_warn(&gdev->dev, "failed to read GPIO line names\n");
kfree(names);
return;
}
--- a/drivers/gpio/gpiolib-of.c
+++ b/drivers/gpio/gpiolib-of.c
@@ -493,7 +493,8 @@ int of_gpiochip_add(struct gpio_chip *ch
/* If the chip defines names itself, these take precedence */
if (!chip->names)
- devprop_gpiochip_set_names(chip);
+ devprop_gpiochip_set_names(chip,
+ of_fwnode_handle(chip->of_node));
of_node_get(chip->of_node);
--- a/drivers/gpio/gpiolib.h
+++ b/drivers/gpio/gpiolib.h
@@ -224,7 +224,8 @@ static inline int gpio_chip_hwgpio(const
return desc - &desc->gdev->descs[0];
}
-void devprop_gpiochip_set_names(struct gpio_chip *chip);
+void devprop_gpiochip_set_names(struct gpio_chip *chip,
+ const struct fwnode_handle *fwnode);
/* With descriptor prefix */
Patches currently in stable-queue which might be from christophe.leroy(a)c-s.fr are
queue-4.14/gpio-fix-gpio-line-names-property-retrieval.patch
This is a note to let you know that I've just added the patch titled
cpufreq: schedutil: Use idle_calls counter of the remote CPU
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
cpufreq-schedutil-use-idle_calls-counter-of-the-remote-cpu.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 466a2b42d67644447a1765276259a3ea5531ddff Mon Sep 17 00:00:00 2001
From: Joel Fernandes <joelaf(a)google.com>
Date: Thu, 21 Dec 2017 02:22:45 +0100
Subject: cpufreq: schedutil: Use idle_calls counter of the remote CPU
From: Joel Fernandes <joelaf(a)google.com>
commit 466a2b42d67644447a1765276259a3ea5531ddff upstream.
Since the recent remote cpufreq callback work, it's possible that a cpufreq
update is triggered from a remote CPU. For single policies, however, the current
code uses the local CPU when trying to determine if the remote sg_cpu entered
idle or is busy. This is incorrect. To remedy this, compare with the nohz tick
idle_calls counter of the remote CPU.
Fixes: 674e75411fc2 (sched: cpufreq: Allow remote cpufreq callbacks)
Acked-by: Viresh Kumar <viresh.kumar(a)linaro.org>
Acked-by: Peter Zijlstra (Intel) <peterz(a)infradead.org>
Signed-off-by: Joel Fernandes <joelaf(a)google.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki(a)intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
include/linux/tick.h | 1 +
kernel/sched/cpufreq_schedutil.c | 2 +-
kernel/time/tick-sched.c | 13 +++++++++++++
3 files changed, 15 insertions(+), 1 deletion(-)
--- a/include/linux/tick.h
+++ b/include/linux/tick.h
@@ -119,6 +119,7 @@ extern void tick_nohz_idle_exit(void);
extern void tick_nohz_irq_exit(void);
extern ktime_t tick_nohz_get_sleep_length(void);
extern unsigned long tick_nohz_get_idle_calls(void);
+extern unsigned long tick_nohz_get_idle_calls_cpu(int cpu);
extern u64 get_cpu_idle_time_us(int cpu, u64 *last_update_time);
extern u64 get_cpu_iowait_time_us(int cpu, u64 *last_update_time);
#else /* !CONFIG_NO_HZ_COMMON */
--- a/kernel/sched/cpufreq_schedutil.c
+++ b/kernel/sched/cpufreq_schedutil.c
@@ -244,7 +244,7 @@ static void sugov_iowait_boost(struct su
#ifdef CONFIG_NO_HZ_COMMON
static bool sugov_cpu_is_busy(struct sugov_cpu *sg_cpu)
{
- unsigned long idle_calls = tick_nohz_get_idle_calls();
+ unsigned long idle_calls = tick_nohz_get_idle_calls_cpu(sg_cpu->cpu);
bool ret = idle_calls == sg_cpu->saved_idle_calls;
sg_cpu->saved_idle_calls = idle_calls;
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -1010,6 +1010,19 @@ ktime_t tick_nohz_get_sleep_length(void)
}
/**
+ * tick_nohz_get_idle_calls_cpu - return the current idle calls counter value
+ * for a particular CPU.
+ *
+ * Called from the schedutil frequency scaling governor in scheduler context.
+ */
+unsigned long tick_nohz_get_idle_calls_cpu(int cpu)
+{
+ struct tick_sched *ts = tick_get_tick_sched(cpu);
+
+ return ts->idle_calls;
+}
+
+/**
* tick_nohz_get_idle_calls - return the current idle calls counter value
*
* Called from the schedutil frequency scaling governor in scheduler context.
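A userspace illustration of the logic, with an invented per-CPU counter
array standing in for the nohz tick statistics: busyness is inferred by
comparing the target CPU's idle-calls counter against the value saved at
the previous update, which is exactly what goes wrong if the local CPU's
counter is read during a remote update.

#include <stdbool.h>
#include <stdio.h>

#define NR_CPUS 4

static unsigned long idle_calls[NR_CPUS]; /* per-CPU idle-entry counters */

struct sg_cpu {
	int cpu;
	unsigned long saved_idle_calls;
};

/* If the counter has not moved since the last update, the CPU never went
 * idle in between, i.e. it is busy.  The key point of the fix: read the
 * counter of sg_cpu->cpu, not of whichever CPU happens to run this code. */
static bool cpu_is_busy(struct sg_cpu *sg_cpu)
{
	unsigned long calls = idle_calls[sg_cpu->cpu];
	bool busy = calls == sg_cpu->saved_idle_calls;

	sg_cpu->saved_idle_calls = calls;
	return busy;
}

int main(void)
{
	struct sg_cpu target = { .cpu = 2, .saved_idle_calls = 10 };

	idle_calls[2] = 10;  /* CPU 2 did not enter idle since the last check */
	printf("CPU 2 busy? %d\n", cpu_is_busy(&target));
	return 0;
}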
Patches currently in stable-queue which might be from joelaf(a)google.com are
queue-4.14/cpufreq-schedutil-use-idle_calls-counter-of-the-remote-cpu.patch
This is a note to let you know that I've just added the patch titled
ASoC: wm_adsp: Fix validation of firmware and coeff lengths
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
asoc-wm_adsp-fix-validation-of-firmware-and-coeff-lengths.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 50dd2ea8ef67a1617e0c0658bcbec4b9fb03b936 Mon Sep 17 00:00:00 2001
From: Ben Hutchings <ben.hutchings(a)codethink.co.uk>
Date: Fri, 8 Dec 2017 16:15:20 +0000
Subject: ASoC: wm_adsp: Fix validation of firmware and coeff lengths
From: Ben Hutchings <ben.hutchings(a)codethink.co.uk>
commit 50dd2ea8ef67a1617e0c0658bcbec4b9fb03b936 upstream.
The checks for whether another region/block header could be present
are subtracting the size from the current offset. Obviously we should
instead subtract the offset from the size.
The checks for whether the region/block data fit in the file are
adding the data size to the current offset and header size, without
checking for integer overflow. Rearrange these so that overflow is
impossible.
Signed-off-by: Ben Hutchings <ben.hutchings(a)codethink.co.uk>
Acked-by: Charles Keepax <ckeepax(a)opensource.cirrus.com>
Tested-by: Charles Keepax <ckeepax(a)opensource.cirrus.com>
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
sound/soc/codecs/wm_adsp.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
--- a/sound/soc/codecs/wm_adsp.c
+++ b/sound/soc/codecs/wm_adsp.c
@@ -1733,7 +1733,7 @@ static int wm_adsp_load(struct wm_adsp *
le64_to_cpu(footer->timestamp));
while (pos < firmware->size &&
- pos - firmware->size > sizeof(*region)) {
+ sizeof(*region) < firmware->size - pos) {
region = (void *)&(firmware->data[pos]);
region_name = "Unknown";
reg = 0;
@@ -1782,8 +1782,8 @@ static int wm_adsp_load(struct wm_adsp *
regions, le32_to_cpu(region->len), offset,
region_name);
- if ((pos + le32_to_cpu(region->len) + sizeof(*region)) >
- firmware->size) {
+ if (le32_to_cpu(region->len) >
+ firmware->size - pos - sizeof(*region)) {
adsp_err(dsp,
"%s.%d: %s region len %d bytes exceeds file length %zu\n",
file, regions, region_name,
@@ -2253,7 +2253,7 @@ static int wm_adsp_load_coeff(struct wm_
blocks = 0;
while (pos < firmware->size &&
- pos - firmware->size > sizeof(*blk)) {
+ sizeof(*blk) < firmware->size - pos) {
blk = (void *)(&firmware->data[pos]);
type = le16_to_cpu(blk->type);
@@ -2327,8 +2327,8 @@ static int wm_adsp_load_coeff(struct wm_
}
if (reg) {
- if ((pos + le32_to_cpu(blk->len) + sizeof(*blk)) >
- firmware->size) {
+ if (le32_to_cpu(blk->len) >
+ firmware->size - pos - sizeof(*blk)) {
adsp_err(dsp,
"%s.%d: %s region len %d bytes exceeds file length %zu\n",
file, blocks, region_name,
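To make the overflow argument concrete, a small standalone demonstration
with 32-bit quantities (values invented; this is not the wm_adsp code): the
old-style addition can wrap and accept a bogus length, while the rearranged
subtraction cannot, because the loop condition has already established that
pos plus the header size fits within the file.

#include <stdint.h>
#include <stdio.h>

int main(void)
{
	uint32_t size = 1000;            /* total firmware file size */
	uint32_t pos = 100;              /* current parse offset */
	uint32_t hdr = 8;                /* size of the region header */
	uint32_t len = UINT32_MAX - 50;  /* hostile length field */

	/* Old style: the addition wraps around, so the bogus length passes. */
	int old_check_rejects = (pos + len + hdr) > size;

	/* New style: no addition that can wrap; size - pos - hdr is known
	 * to be non-negative, so the comparison is safe. */
	int new_check_rejects = len > size - pos - hdr;

	printf("old check rejects bogus len? %d\n", old_check_rejects);
	printf("new check rejects bogus len? %d\n", new_check_rejects);
	return 0;
}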
Patches currently in stable-queue which might be from ben.hutchings(a)codethink.co.uk are
queue-4.14/asoc-wm_adsp-fix-validation-of-firmware-and-coeff-lengths.patch
This is a note to let you know that I've just added the patch titled
ASoC: twl4030: fix child-node lookup
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
asoc-twl4030-fix-child-node-lookup.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 15f8c5f2415bfac73f33a14bcd83422bcbfb5298 Mon Sep 17 00:00:00 2001
From: Johan Hovold <johan(a)kernel.org>
Date: Mon, 13 Nov 2017 12:12:56 +0100
Subject: ASoC: twl4030: fix child-node lookup
From: Johan Hovold <johan(a)kernel.org>
commit 15f8c5f2415bfac73f33a14bcd83422bcbfb5298 upstream.
Fix child-node lookup during probe, which ended up searching the whole
device tree depth-first starting at the parent rather than just matching
on its children.
To make things worse, the parent codec node was also prematurely freed,
while the child node was leaked.
Fixes: 2d6d649a2e0f ("ASoC: twl4030: Support for DT booted kernel")
Signed-off-by: Johan Hovold <johan(a)kernel.org>
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
sound/soc/codecs/twl4030.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
--- a/sound/soc/codecs/twl4030.c
+++ b/sound/soc/codecs/twl4030.c
@@ -232,7 +232,7 @@ static struct twl4030_codec_data *twl403
struct twl4030_codec_data *pdata = dev_get_platdata(codec->dev);
struct device_node *twl4030_codec_node = NULL;
- twl4030_codec_node = of_find_node_by_name(codec->dev->parent->of_node,
+ twl4030_codec_node = of_get_child_by_name(codec->dev->parent->of_node,
"codec");
if (!pdata && twl4030_codec_node) {
@@ -241,9 +241,11 @@ static struct twl4030_codec_data *twl403
GFP_KERNEL);
if (!pdata) {
dev_err(codec->dev, "Can not allocate memory\n");
+ of_node_put(twl4030_codec_node);
return NULL;
}
twl4030_setup_pdata_of(pdata, twl4030_codec_node);
+ of_node_put(twl4030_codec_node);
}
return pdata;
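As a side note, a toy tree-search program illustrating the difference the
changelog describes (node names invented): a whole-subtree search can match
a "codec" node that belongs to a sibling device, while a direct-child lookup
cannot.

#include <stdio.h>
#include <string.h>

struct node {
	const char *name;
	struct node *child;
	struct node *sibling;
};

/* Depth-first search of a whole subtree (roughly the old behaviour). */
static struct node *find_anywhere(struct node *n, const char *name)
{
	if (!n)
		return NULL;
	if (!strcmp(n->name, name))
		return n;
	struct node *hit = find_anywhere(n->child, name);
	return hit ? hit : find_anywhere(n->sibling, name);
}

/* Direct children only (the behaviour the fix switches to). */
static struct node *find_child(struct node *parent, const char *name)
{
	for (struct node *c = parent->child; c; c = c->sibling)
		if (!strcmp(c->name, name))
			return c;
	return NULL;
}

int main(void)
{
	/* parent -> other-device -> codec: "codec" is NOT a child of parent */
	struct node codec = { "codec", NULL, NULL };
	struct node other = { "other-device", &codec, NULL };
	struct node parent = { "parent", &other, NULL };

	printf("subtree search: %s\n", find_anywhere(parent.child, "codec")->name);
	printf("child lookup:   %s\n",
	       find_child(&parent, "codec") ? "found" : "no direct child named codec");
	return 0;
}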
Patches currently in stable-queue which might be from johan(a)kernel.org are
queue-4.14/asoc-twl4030-fix-child-node-lookup.patch
queue-4.14/asoc-da7218-fix-fix-child-node-lookup.patch
This is a note to let you know that I've just added the patch titled
ASoC: da7218: fix fix child-node lookup
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
asoc-da7218-fix-fix-child-node-lookup.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From bc6476d6c1edcb9b97621b5131bd169aa81f27db Mon Sep 17 00:00:00 2001
From: Johan Hovold <johan(a)kernel.org>
Date: Mon, 13 Nov 2017 12:12:55 +0100
Subject: ASoC: da7218: fix fix child-node lookup
From: Johan Hovold <johan(a)kernel.org>
commit bc6476d6c1edcb9b97621b5131bd169aa81f27db upstream.
Fix child-node lookup during probe, which ended up searching the whole
device tree depth-first starting at the parent rather than just matching
on its children.
To make things worse, the parent codec node was also prematurely freed.
Fixes: 4d50934abd22 ("ASoC: da7218: Add da7218 codec driver")
Signed-off-by: Johan Hovold <johan(a)kernel.org>
Acked-by: Adam Thomson <Adam.Thomson.Opensource(a)diasemi.com>
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
sound/soc/codecs/da7218.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/sound/soc/codecs/da7218.c
+++ b/sound/soc/codecs/da7218.c
@@ -2520,7 +2520,7 @@ static struct da7218_pdata *da7218_of_to
}
if (da7218->dev_id == DA7218_DEV_ID) {
- hpldet_np = of_find_node_by_name(np, "da7218_hpldet");
+ hpldet_np = of_get_child_by_name(np, "da7218_hpldet");
if (!hpldet_np)
return pdata;
Patches currently in stable-queue which might be from johan(a)kernel.org are
queue-4.14/asoc-twl4030-fix-child-node-lookup.patch
queue-4.14/asoc-da7218-fix-fix-child-node-lookup.patch
This is a note to let you know that I've just added the patch titled
ASoC: tlv320aic31xx: Fix GPIO1 register definition
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
asoc-tlv320aic31xx-fix-gpio1-register-definition.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 737e0b7b67bdfe24090fab2852044bb283282fc5 Mon Sep 17 00:00:00 2001
From: "Andrew F. Davis" <afd(a)ti.com>
Date: Wed, 29 Nov 2017 15:32:46 -0600
Subject: ASoC: tlv320aic31xx: Fix GPIO1 register definition
From: Andrew F. Davis <afd(a)ti.com>
commit 737e0b7b67bdfe24090fab2852044bb283282fc5 upstream.
GPIO1 control register is number 51, fix this here.
Fixes: bafcbfe429eb ("ASoC: tlv320aic31xx: Make the register values human readable")
Signed-off-by: Andrew F. Davis <afd(a)ti.com>
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
sound/soc/codecs/tlv320aic31xx.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/sound/soc/codecs/tlv320aic31xx.h
+++ b/sound/soc/codecs/tlv320aic31xx.h
@@ -116,7 +116,7 @@ struct aic31xx_pdata {
/* INT2 interrupt control */
#define AIC31XX_INT2CTRL AIC31XX_REG(0, 49)
/* GPIO1 control */
-#define AIC31XX_GPIO1 AIC31XX_REG(0, 50)
+#define AIC31XX_GPIO1 AIC31XX_REG(0, 51)
#define AIC31XX_DACPRB AIC31XX_REG(0, 60)
/* ADC Instruction Set Register */
Patches currently in stable-queue which might be from afd(a)ti.com are
queue-4.14/asoc-tlv320aic31xx-fix-gpio1-register-definition.patch
This is a note to let you know that I've just added the patch titled
ASoC: fsl_ssi: AC'97 ops need regmap, clock and cleaning up on failure
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
asoc-fsl_ssi-ac-97-ops-need-regmap-clock-and-cleaning-up-on-failure.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 695b78b548d8a26288f041e907ff17758df9e1d5 Mon Sep 17 00:00:00 2001
From: "Maciej S. Szmigiero" <mail(a)maciej.szmigiero.name>
Date: Mon, 20 Nov 2017 23:14:55 +0100
Subject: ASoC: fsl_ssi: AC'97 ops need regmap, clock and cleaning up on failure
From: Maciej S. Szmigiero <mail(a)maciej.szmigiero.name>
commit 695b78b548d8a26288f041e907ff17758df9e1d5 upstream.
AC'97 ops (register read/write) need the SSI regmap and clock, so they have
to be set after those are in place.
We also need to set these ops back to NULL if the probe fails.
Signed-off-by: Maciej S. Szmigiero <mail(a)maciej.szmigiero.name>
Acked-by: Nicolin Chen <nicoleotsuka(a)gmail.com>
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
sound/soc/fsl/fsl_ssi.c | 18 ++++++++++++------
1 file changed, 12 insertions(+), 6 deletions(-)
--- a/sound/soc/fsl/fsl_ssi.c
+++ b/sound/soc/fsl/fsl_ssi.c
@@ -1452,12 +1452,6 @@ static int fsl_ssi_probe(struct platform
sizeof(fsl_ssi_ac97_dai));
fsl_ac97_data = ssi_private;
-
- ret = snd_soc_set_ac97_ops_of_reset(&fsl_ssi_ac97_ops, pdev);
- if (ret) {
- dev_err(&pdev->dev, "could not set AC'97 ops\n");
- return ret;
- }
} else {
/* Initialize this copy of the CPU DAI driver structure */
memcpy(&ssi_private->cpu_dai_drv, &fsl_ssi_dai_template,
@@ -1568,6 +1562,14 @@ static int fsl_ssi_probe(struct platform
return ret;
}
+ if (fsl_ssi_is_ac97(ssi_private)) {
+ ret = snd_soc_set_ac97_ops_of_reset(&fsl_ssi_ac97_ops, pdev);
+ if (ret) {
+ dev_err(&pdev->dev, "could not set AC'97 ops\n");
+ goto error_ac97_ops;
+ }
+ }
+
ret = devm_snd_soc_register_component(&pdev->dev, &fsl_ssi_component,
&ssi_private->cpu_dai_drv, 1);
if (ret) {
@@ -1651,6 +1653,10 @@ error_sound_card:
fsl_ssi_debugfs_remove(&ssi_private->dbg_stats);
error_asoc_register:
+ if (fsl_ssi_is_ac97(ssi_private))
+ snd_soc_set_ac97_ops(NULL);
+
+error_ac97_ops:
if (ssi_private->soc->imx)
fsl_ssi_imx_clean(pdev, ssi_private);
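A generic sketch of the goto-unwind pattern extended by this patch (all
function names below are invented stand-ins, not the fsl_ssi symbols): each
successful setup step gains a matching teardown that runs, in reverse order,
when a later step fails, which is why the AC'97 ops now get cleared on the
error path.

#include <stdio.h>

static int setup_regmap(void)     { return 0; }
static int setup_ac97_ops(void)   { return 0; }
static int register_dai(void)     { return -1; }  /* pretend this fails */
static void clear_ac97_ops(void)  { puts("AC'97 ops cleared"); }
static void teardown_regmap(void) { puts("regmap torn down"); }

static int probe(void)
{
	int ret;

	ret = setup_regmap();
	if (ret)
		goto err_regmap;

	ret = setup_ac97_ops();   /* must come after the regmap/clock setup */
	if (ret)
		goto err_ac97;

	ret = register_dai();
	if (ret)
		goto err_register;

	return 0;

err_register:
	clear_ac97_ops();         /* undo the ops when a later step fails */
err_ac97:
	teardown_regmap();
err_regmap:
	return ret;
}

int main(void)
{
	printf("probe returned %d\n", probe());
	return 0;
}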
Patches currently in stable-queue which might be from mail(a)maciej.szmigiero.name are
queue-4.14/asoc-fsl_ssi-ac-97-ops-need-regmap-clock-and-cleaning-up-on-failure.patch
This is a note to let you know that I've just added the patch titled
ALSA: hda - Fix missing COEF init for ALC225/295/299
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
alsa-hda-fix-missing-coef-init-for-alc225-295-299.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 44be77c590f381bc629815ac789b8b15ecc4ddcf Mon Sep 17 00:00:00 2001
From: Takashi Iwai <tiwai(a)suse.de>
Date: Wed, 27 Dec 2017 08:53:59 +0100
Subject: ALSA: hda - Fix missing COEF init for ALC225/295/299
From: Takashi Iwai <tiwai(a)suse.de>
commit 44be77c590f381bc629815ac789b8b15ecc4ddcf upstream.
There was a long-standing problem on the HP Spectre X360 with Kabylake
where the front speaker output is missing in some situations, and
other products show similar behavior. The culprit
seems to be the missing COEF setup on ALC codecs, ALC225/295/299,
which are all compatible.
This patch adds the proper COEF setup (to initialize idx 0x67 / bits
0x3000) for addressing the issue.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=195457
Signed-off-by: Takashi Iwai <tiwai(a)suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
sound/pci/hda/patch_realtek.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)
--- a/sound/pci/hda/patch_realtek.c
+++ b/sound/pci/hda/patch_realtek.c
@@ -324,8 +324,12 @@ static void alc_fill_eapd_coef(struct hd
case 0x10ec0292:
alc_update_coef_idx(codec, 0x4, 1<<15, 0);
break;
- case 0x10ec0215:
case 0x10ec0225:
+ case 0x10ec0295:
+ case 0x10ec0299:
+ alc_update_coef_idx(codec, 0x67, 0xf000, 0x3000);
+ /* fallthrough */
+ case 0x10ec0215:
case 0x10ec0233:
case 0x10ec0236:
case 0x10ec0255:
@@ -336,10 +340,8 @@ static void alc_fill_eapd_coef(struct hd
case 0x10ec0286:
case 0x10ec0288:
case 0x10ec0285:
- case 0x10ec0295:
case 0x10ec0298:
case 0x10ec0289:
- case 0x10ec0299:
alc_update_coef_idx(codec, 0x10, 1<<9, 0);
break;
case 0x10ec0275:
Patches currently in stable-queue which might be from tiwai(a)suse.de are
queue-4.14/alsa-hda-drop-useless-warn_on.patch
queue-4.14/alsa-hda-fix-missing-coef-init-for-alc225-295-299.patch
queue-4.14/alsa-hda-fix-headset-mic-detection-issue-on-a-dell-machine.patch
queue-4.14/alsa-hda-change-the-location-for-one-mic-on-a-lenovo-machine.patch
queue-4.14/alsa-hda-add-mic_no_presence-fixup-for-2-hp-machines.patch
This is a note to let you know that I've just added the patch titled
ASoC: codecs: msm8916-wcd: Fix supported formats
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
asoc-codecs-msm8916-wcd-fix-supported-formats.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 51f493ae71adc2c49a317a13c38e54e1cdf46005 Mon Sep 17 00:00:00 2001
From: Srinivas Kandagatla <srinivas.kandagatla(a)linaro.org>
Date: Thu, 30 Nov 2017 10:15:02 +0000
Subject: ASoC: codecs: msm8916-wcd: Fix supported formats
From: Srinivas Kandagatla <srinivas.kandagatla(a)linaro.org>
commit 51f493ae71adc2c49a317a13c38e54e1cdf46005 upstream.
This codec is configurable for only 16-bit and 32-bit samples, so reflect
this in the supported formats, and remove the 24-bit sample format from the
supported list.
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla(a)linaro.org>
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
sound/soc/codecs/msm8916-wcd-analog.c | 2 +-
sound/soc/codecs/msm8916-wcd-digital.c | 4 ++--
2 files changed, 3 insertions(+), 3 deletions(-)
--- a/sound/soc/codecs/msm8916-wcd-analog.c
+++ b/sound/soc/codecs/msm8916-wcd-analog.c
@@ -267,7 +267,7 @@
#define MSM8916_WCD_ANALOG_RATES (SNDRV_PCM_RATE_8000 | SNDRV_PCM_RATE_16000 |\
SNDRV_PCM_RATE_32000 | SNDRV_PCM_RATE_48000)
#define MSM8916_WCD_ANALOG_FORMATS (SNDRV_PCM_FMTBIT_S16_LE |\
- SNDRV_PCM_FMTBIT_S24_LE)
+ SNDRV_PCM_FMTBIT_S32_LE)
static int btn_mask = SND_JACK_BTN_0 | SND_JACK_BTN_1 |
SND_JACK_BTN_2 | SND_JACK_BTN_3 | SND_JACK_BTN_4;
--- a/sound/soc/codecs/msm8916-wcd-digital.c
+++ b/sound/soc/codecs/msm8916-wcd-digital.c
@@ -194,7 +194,7 @@
SNDRV_PCM_RATE_32000 | \
SNDRV_PCM_RATE_48000)
#define MSM8916_WCD_DIGITAL_FORMATS (SNDRV_PCM_FMTBIT_S16_LE |\
- SNDRV_PCM_FMTBIT_S24_LE)
+ SNDRV_PCM_FMTBIT_S32_LE)
struct msm8916_wcd_digital_priv {
struct clk *ahbclk, *mclk;
@@ -645,7 +645,7 @@ static int msm8916_wcd_digital_hw_params
RX_I2S_CTL_RX_I2S_MODE_MASK,
RX_I2S_CTL_RX_I2S_MODE_16);
break;
- case SNDRV_PCM_FORMAT_S24_LE:
+ case SNDRV_PCM_FORMAT_S32_LE:
snd_soc_update_bits(dai->codec, LPASS_CDC_CLK_TX_I2S_CTL,
TX_I2S_CTL_TX_I2S_MODE_MASK,
TX_I2S_CTL_TX_I2S_MODE_32);
Patches currently in stable-queue which might be from srinivas.kandagatla(a)linaro.org are
queue-4.14/asoc-codecs-msm8916-wcd-fix-supported-formats.patch
This is a note to let you know that I've just added the patch titled
ALSA: hda - change the location for one mic on a Lenovo machine
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
alsa-hda-change-the-location-for-one-mic-on-a-lenovo-machine.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 8da5bbfc7cbba909f4f32d5e1dda3750baa5d853 Mon Sep 17 00:00:00 2001
From: Hui Wang <hui.wang(a)canonical.com>
Date: Fri, 22 Dec 2017 11:17:46 +0800
Subject: ALSA: hda - change the location for one mic on a Lenovo machine
From: Hui Wang <hui.wang(a)canonical.com>
commit 8da5bbfc7cbba909f4f32d5e1dda3750baa5d853 upstream.
There are two front mics on this machine, and the current driver assigns
the same name "Mic" to both of them, but pulseaudio can't handle that.
As a workaround, we change the location for one of them, so the
driver will assign "Front Mic" and "Mic" to them respectively.
Signed-off-by: Hui Wang <hui.wang(a)canonical.com>
Signed-off-by: Takashi Iwai <tiwai(a)suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
sound/pci/hda/patch_realtek.c | 1 +
1 file changed, 1 insertion(+)
--- a/sound/pci/hda/patch_realtek.c
+++ b/sound/pci/hda/patch_realtek.c
@@ -6305,6 +6305,7 @@ static const struct snd_pci_quirk alc269
SND_PCI_QUIRK(0x17aa, 0x30bb, "ThinkCentre AIO", ALC233_FIXUP_LENOVO_LINE2_MIC_HOTKEY),
SND_PCI_QUIRK(0x17aa, 0x30e2, "ThinkCentre AIO", ALC233_FIXUP_LENOVO_LINE2_MIC_HOTKEY),
SND_PCI_QUIRK(0x17aa, 0x310c, "ThinkCentre Station", ALC294_FIXUP_LENOVO_MIC_LOCATION),
+ SND_PCI_QUIRK(0x17aa, 0x313c, "ThinkCentre Station", ALC294_FIXUP_LENOVO_MIC_LOCATION),
SND_PCI_QUIRK(0x17aa, 0x3112, "ThinkCentre AIO", ALC233_FIXUP_LENOVO_LINE2_MIC_HOTKEY),
SND_PCI_QUIRK(0x17aa, 0x3902, "Lenovo E50-80", ALC269_FIXUP_DMIC_THINKPAD_ACPI),
SND_PCI_QUIRK(0x17aa, 0x3977, "IdeaPad S210", ALC283_FIXUP_INT_MIC),
Patches currently in stable-queue which might be from hui.wang(a)canonical.com are
queue-4.14/alsa-hda-fix-headset-mic-detection-issue-on-a-dell-machine.patch
queue-4.14/alsa-hda-change-the-location-for-one-mic-on-a-lenovo-machine.patch
queue-4.14/alsa-hda-add-mic_no_presence-fixup-for-2-hp-machines.patch
This is a note to let you know that I've just added the patch titled
ALSA: hda: Drop useless WARN_ON()
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
alsa-hda-drop-useless-warn_on.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From a36c2638380c0a4676647a1f553b70b20d3ebce1 Mon Sep 17 00:00:00 2001
From: Takashi Iwai <tiwai(a)suse.de>
Date: Fri, 22 Dec 2017 10:45:07 +0100
Subject: ALSA: hda: Drop useless WARN_ON()
From: Takashi Iwai <tiwai(a)suse.de>
commit a36c2638380c0a4676647a1f553b70b20d3ebce1 upstream.
Since the commit 97cc2ed27e5a ("ALSA: hda - Fix yet another i915
pointer leftover in error path") cleared hdac_acomp pointer, the
WARN_ON() non-NULL check in snd_hdac_i915_register_notifier() may give
a false-positive warning, as the function gets called no matter
whether the component is registered or not. For fixing it, let's get
rid of the spurious WARN_ON().
Fixes: 97cc2ed27e5a ("ALSA: hda - Fix yet another i915 pointer leftover in error path")
Reported-by: Kouta Okamoto <kouta.okamoto(a)toshiba.co.jp>
Signed-off-by: Takashi Iwai <tiwai(a)suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
sound/hda/hdac_i915.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/sound/hda/hdac_i915.c
+++ b/sound/hda/hdac_i915.c
@@ -325,7 +325,7 @@ static int hdac_component_master_match(s
*/
int snd_hdac_i915_register_notifier(const struct i915_audio_component_audio_ops *aops)
{
- if (WARN_ON(!hdac_acomp))
+ if (!hdac_acomp)
return -ENODEV;
hdac_acomp->audio_ops = aops;
Patches currently in stable-queue which might be from tiwai(a)suse.de are
queue-4.14/alsa-hda-drop-useless-warn_on.patch
queue-4.14/alsa-hda-fix-missing-coef-init-for-alc225-295-299.patch
queue-4.14/alsa-hda-fix-headset-mic-detection-issue-on-a-dell-machine.patch
queue-4.14/alsa-hda-change-the-location-for-one-mic-on-a-lenovo-machine.patch
queue-4.14/alsa-hda-add-mic_no_presence-fixup-for-2-hp-machines.patch
This is a note to let you know that I've just added the patch titled
ALSA: hda - fix headset mic detection issue on a Dell machine
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
alsa-hda-fix-headset-mic-detection-issue-on-a-dell-machine.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 285d5ddcffafa5d5e68c586f4c9eaa8b24a2897d Mon Sep 17 00:00:00 2001
From: Hui Wang <hui.wang(a)canonical.com>
Date: Fri, 22 Dec 2017 11:17:45 +0800
Subject: ALSA: hda - fix headset mic detection issue on a Dell machine
From: Hui Wang <hui.wang(a)canonical.com>
commit 285d5ddcffafa5d5e68c586f4c9eaa8b24a2897d upstream.
The machine has the ALC256 codec; add its pin definition to the pin
quirk table so that ALC255_FIXUP_DELL1_MIC_NO_PRESENCE is applied to it.
Signed-off-by: Hui Wang <hui.wang(a)canonical.com>
Signed-off-by: Takashi Iwai <tiwai(a)suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
sound/pci/hda/patch_realtek.c | 5 +++++
1 file changed, 5 insertions(+)
--- a/sound/pci/hda/patch_realtek.c
+++ b/sound/pci/hda/patch_realtek.c
@@ -6559,6 +6559,11 @@ static const struct snd_hda_pin_quirk al
{0x1b, 0x01011020},
{0x21, 0x02211010}),
SND_HDA_PIN_QUIRK(0x10ec0256, 0x1028, "Dell", ALC255_FIXUP_DELL1_MIC_NO_PRESENCE,
+ {0x12, 0x90a60130},
+ {0x14, 0x90170110},
+ {0x1b, 0x01011020},
+ {0x21, 0x0221101f}),
+ SND_HDA_PIN_QUIRK(0x10ec0256, 0x1028, "Dell", ALC255_FIXUP_DELL1_MIC_NO_PRESENCE,
{0x12, 0x90a60160},
{0x14, 0x90170120},
{0x21, 0x02211030}),
Patches currently in stable-queue which might be from hui.wang(a)canonical.com are
queue-4.14/alsa-hda-fix-headset-mic-detection-issue-on-a-dell-machine.patch
queue-4.14/alsa-hda-change-the-location-for-one-mic-on-a-lenovo-machine.patch
queue-4.14/alsa-hda-add-mic_no_presence-fixup-for-2-hp-machines.patch
This is a note to let you know that I've just added the patch titled
ALSA: hda - Add MIC_NO_PRESENCE fixup for 2 HP machines
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
alsa-hda-add-mic_no_presence-fixup-for-2-hp-machines.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 322f74ede933b3e2cb78768b6a6fdbfbf478a0c1 Mon Sep 17 00:00:00 2001
From: Hui Wang <hui.wang(a)canonical.com>
Date: Fri, 22 Dec 2017 11:17:44 +0800
Subject: ALSA: hda - Add MIC_NO_PRESENCE fixup for 2 HP machines
From: Hui Wang <hui.wang(a)canonical.com>
commit 322f74ede933b3e2cb78768b6a6fdbfbf478a0c1 upstream.
There is a headset jack on the front panel; when we plug a headset
into it, the headset mic can't trigger unsol events, and
read_pin_sense() can't detect its presence either. So add this fixup
to address the issue.
Signed-off-by: Hui Wang <hui.wang(a)canonical.com>
Signed-off-by: Takashi Iwai <tiwai(a)suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
sound/pci/hda/patch_conexant.c | 29 +++++++++++++++++++++++++++++
1 file changed, 29 insertions(+)
--- a/sound/pci/hda/patch_conexant.c
+++ b/sound/pci/hda/patch_conexant.c
@@ -271,6 +271,8 @@ enum {
CXT_FIXUP_HP_SPECTRE,
CXT_FIXUP_HP_GATE_MIC,
CXT_FIXUP_MUTE_LED_GPIO,
+ CXT_FIXUP_HEADSET_MIC,
+ CXT_FIXUP_HP_MIC_NO_PRESENCE,
};
/* for hda_fixup_thinkpad_acpi() */
@@ -350,6 +352,18 @@ static void cxt_fixup_headphone_mic(stru
}
}
+static void cxt_fixup_headset_mic(struct hda_codec *codec,
+ const struct hda_fixup *fix, int action)
+{
+ struct conexant_spec *spec = codec->spec;
+
+ switch (action) {
+ case HDA_FIXUP_ACT_PRE_PROBE:
+ spec->parse_flags |= HDA_PINCFG_HEADSET_MIC;
+ break;
+ }
+}
+
/* OPLC XO 1.5 fixup */
/* OLPC XO-1.5 supports DC input mode (e.g. for use with analog sensors)
@@ -880,6 +894,19 @@ static const struct hda_fixup cxt_fixups
.type = HDA_FIXUP_FUNC,
.v.func = cxt_fixup_mute_led_gpio,
},
+ [CXT_FIXUP_HEADSET_MIC] = {
+ .type = HDA_FIXUP_FUNC,
+ .v.func = cxt_fixup_headset_mic,
+ },
+ [CXT_FIXUP_HP_MIC_NO_PRESENCE] = {
+ .type = HDA_FIXUP_PINS,
+ .v.pins = (const struct hda_pintbl[]) {
+ { 0x1a, 0x02a1113c },
+ { }
+ },
+ .chained = true,
+ .chain_id = CXT_FIXUP_HEADSET_MIC,
+ },
};
static const struct snd_pci_quirk cxt5045_fixups[] = {
@@ -934,6 +961,8 @@ static const struct snd_pci_quirk cxt506
SND_PCI_QUIRK(0x103c, 0x8115, "HP Z1 Gen3", CXT_FIXUP_HP_GATE_MIC),
SND_PCI_QUIRK(0x103c, 0x814f, "HP ZBook 15u G3", CXT_FIXUP_MUTE_LED_GPIO),
SND_PCI_QUIRK(0x103c, 0x822e, "HP ProBook 440 G4", CXT_FIXUP_MUTE_LED_GPIO),
+ SND_PCI_QUIRK(0x103c, 0x8299, "HP 800 G3 SFF", CXT_FIXUP_HP_MIC_NO_PRESENCE),
+ SND_PCI_QUIRK(0x103c, 0x829a, "HP 800 G3 DM", CXT_FIXUP_HP_MIC_NO_PRESENCE),
SND_PCI_QUIRK(0x1043, 0x138d, "Asus", CXT_FIXUP_HEADPHONE_MIC_PIN),
SND_PCI_QUIRK(0x152d, 0x0833, "OLPC XO-1.5", CXT_FIXUP_OLPC_XO),
SND_PCI_QUIRK(0x17aa, 0x20f2, "Lenovo T400", CXT_PINCFG_LENOVO_TP410),
Patches currently in stable-queue which might be from hui.wang(a)canonical.com are
queue-4.14/alsa-hda-fix-headset-mic-detection-issue-on-a-dell-machine.patch
queue-4.14/alsa-hda-change-the-location-for-one-mic-on-a-lenovo-machine.patch
queue-4.14/alsa-hda-add-mic_no_presence-fixup-for-2-hp-machines.patch
This is a note to let you know that I've just added the patch titled
ring-buffer: Mask out the info bits when returning buffer page length
to the 3.18-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
ring-buffer-mask-out-the-info-bits-when-returning-buffer-page-length.patch
and it can be found in the queue-3.18 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 45d8b80c2ac5d21cd1e2954431fb676bc2b1e099 Mon Sep 17 00:00:00 2001
From: "Steven Rostedt (VMware)" <rostedt(a)goodmis.org>
Date: Fri, 22 Dec 2017 20:32:35 -0500
Subject: ring-buffer: Mask out the info bits when returning buffer page length
From: Steven Rostedt (VMware) <rostedt(a)goodmis.org>
commit 45d8b80c2ac5d21cd1e2954431fb676bc2b1e099 upstream.
Two info bits were added to the "commit" part of the ring buffer data page
when returned to be consumed. This was to inform the user space readers that
events have been missed, and that the count may be stored at the end of the
page.
What wasn't handled was the splice code that actually called a function to
return the length of the data in order to zero out the rest of the page
before sending it up to user space. These info bits were returned as part of
the length, making the value negative, and that negative value was not checked.
It was compared to PAGE_SIZE, and only used if the size was less than
PAGE_SIZE. Luckily PAGE_SIZE is unsigned long which made the compare an
unsigned compare, meaning the negative size value did not end up causing a
large portion of memory to be randomly zeroed out.
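As a stand-alone illustration (assumed values, not the actual ring-buffer code)
of why the unsigned compare was the saving grace described above: once an info
bit is set in the returned length, the value is negative as a signed integer,
but comparing it against the unsigned long PAGE_SIZE promotes it to a huge
unsigned number, so the "less than PAGE_SIZE" test fails and the bogus length
is never used for zeroing:

#include <stdio.h>

#define PAGE_SIZE	4096UL		/* unsigned long, as in the kernel */
#define MISSED_EVENTS	(1u << 31)	/* illustrative info bit in the length */

int main(void)
{
	int len = (int)(100u | MISSED_EVENTS);	/* negative once the bit is set */

	/* 'len' is promoted to unsigned long here; compilers even warn about it. */
	if (len < PAGE_SIZE)
		printf("dangerous path: len looks small, page would be over-zeroed\n");
	else
		printf("len = %d (%#lx as unsigned): compare fails, nothing is zeroed\n",
		       len, (unsigned long)len);
	return 0;
}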
Fixes: 66a8cb95ed040 ("ring-buffer: Add place holder recording of dropped events")
Signed-off-by: Steven Rostedt (VMware) <rostedt(a)goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
kernel/trace/ring_buffer.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -336,6 +336,8 @@ EXPORT_SYMBOL_GPL(ring_buffer_event_data
/* Missed count stored at end */
#define RB_MISSED_STORED (1 << 30)
+#define RB_MISSED_FLAGS (RB_MISSED_EVENTS|RB_MISSED_STORED)
+
struct buffer_data_page {
u64 time_stamp; /* page time stamp */
local_t commit; /* write committed index */
@@ -387,7 +389,9 @@ static void rb_init_page(struct buffer_d
*/
size_t ring_buffer_page_len(void *page)
{
- return local_read(&((struct buffer_data_page *)page)->commit)
+ struct buffer_data_page *bpage = page;
+
+ return (local_read(&bpage->commit) & ~RB_MISSED_FLAGS)
+ BUF_PAGE_HDR_SIZE;
}
Patches currently in stable-queue which might be from rostedt(a)goodmis.org are
queue-3.18/tracing-fix-crash-when-it-fails-to-alloc-ring-buffer.patch
queue-3.18/tracing-remove-extra-zeroing-out-of-the-ring-buffer-page.patch
queue-3.18/ring-buffer-mask-out-the-info-bits-when-returning-buffer-page-length.patch
queue-3.18/tracing-fix-possible-double-free-on-failure-of-allocating-trace-buffer.patch
This is a note to let you know that I've just added the patch titled
ASoC: twl4030: fix child-node lookup
to the 3.18-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
asoc-twl4030-fix-child-node-lookup.patch
and it can be found in the queue-3.18 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 15f8c5f2415bfac73f33a14bcd83422bcbfb5298 Mon Sep 17 00:00:00 2001
From: Johan Hovold <johan(a)kernel.org>
Date: Mon, 13 Nov 2017 12:12:56 +0100
Subject: ASoC: twl4030: fix child-node lookup
From: Johan Hovold <johan(a)kernel.org>
commit 15f8c5f2415bfac73f33a14bcd83422bcbfb5298 upstream.
Fix child-node lookup during probe, which ended up searching the whole
device tree depth-first starting at the parent rather than just matching
on its children.
To make things worse, the parent codec node was also prematurely freed,
while the child node was leaked.
Fixes: 2d6d649a2e0f ("ASoC: twl4030: Support for DT booted kernel")
Signed-off-by: Johan Hovold <johan(a)kernel.org>
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
sound/soc/codecs/twl4030.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
--- a/sound/soc/codecs/twl4030.c
+++ b/sound/soc/codecs/twl4030.c
@@ -232,7 +232,7 @@ static struct twl4030_codec_data *twl403
struct twl4030_codec_data *pdata = dev_get_platdata(codec->dev);
struct device_node *twl4030_codec_node = NULL;
- twl4030_codec_node = of_find_node_by_name(codec->dev->parent->of_node,
+ twl4030_codec_node = of_get_child_by_name(codec->dev->parent->of_node,
"codec");
if (!pdata && twl4030_codec_node) {
@@ -241,9 +241,11 @@ static struct twl4030_codec_data *twl403
GFP_KERNEL);
if (!pdata) {
dev_err(codec->dev, "Can not allocate memory\n");
+ of_node_put(twl4030_codec_node);
return NULL;
}
twl4030_setup_pdata_of(pdata, twl4030_codec_node);
+ of_node_put(twl4030_codec_node);
}
return pdata;
Patches currently in stable-queue which might be from johan(a)kernel.org are
queue-3.18/mfd-twl4030-audio-fix-sibling-node-lookup.patch
queue-3.18/asoc-twl4030-fix-child-node-lookup.patch
queue-3.18/mfd-twl6040-fix-child-node-lookup.patch
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From d14587334580bc94d3ee11e8320e0c157f91ae8f Mon Sep 17 00:00:00 2001
From: Steve Wise <swise(a)opengridcomputing.com>
Date: Tue, 19 Dec 2017 14:02:10 -0800
Subject: [PATCH] iw_cxgb4: when flushing, complete all wrs in a chain
If a wr chain was posted and needed to be flushed, only the first
wr in the chain was completed with FLUSHED status. The rest were
never completed. This caused isert to hang on shutdown due to the
missing completions which left iscsi IO commands referenced, stalling
the shutdown.
Fixes: 4fe7c2962e11 ("iw_cxgb4: refactor sq/rq drain logic")
Cc: stable(a)vger.kernel.org
Signed-off-by: Steve Wise <swise(a)opengridcomputing.com>
Signed-off-by: Jason Gunthorpe <jgg(a)mellanox.com>
diff --git a/drivers/infiniband/hw/cxgb4/qp.c b/drivers/infiniband/hw/cxgb4/qp.c
index 21495f917bcc..d5c92fc520d6 100644
--- a/drivers/infiniband/hw/cxgb4/qp.c
+++ b/drivers/infiniband/hw/cxgb4/qp.c
@@ -858,6 +858,22 @@ static int complete_sq_drain_wr(struct c4iw_qp *qhp, struct ib_send_wr *wr)
return 0;
}
+static int complete_sq_drain_wrs(struct c4iw_qp *qhp, struct ib_send_wr *wr,
+ struct ib_send_wr **bad_wr)
+{
+ int ret = 0;
+
+ while (wr) {
+ ret = complete_sq_drain_wr(qhp, wr);
+ if (ret) {
+ *bad_wr = wr;
+ break;
+ }
+ wr = wr->next;
+ }
+ return ret;
+}
+
static void complete_rq_drain_wr(struct c4iw_qp *qhp, struct ib_recv_wr *wr)
{
struct t4_cqe cqe = {};
@@ -890,6 +906,14 @@ static void complete_rq_drain_wr(struct c4iw_qp *qhp, struct ib_recv_wr *wr)
}
}
+static void complete_rq_drain_wrs(struct c4iw_qp *qhp, struct ib_recv_wr *wr)
+{
+ while (wr) {
+ complete_rq_drain_wr(qhp, wr);
+ wr = wr->next;
+ }
+}
+
int c4iw_post_send(struct ib_qp *ibqp, struct ib_send_wr *wr,
struct ib_send_wr **bad_wr)
{
@@ -913,7 +937,7 @@ int c4iw_post_send(struct ib_qp *ibqp, struct ib_send_wr *wr,
*/
if (qhp->wq.flushed) {
spin_unlock_irqrestore(&qhp->lock, flag);
- err = complete_sq_drain_wr(qhp, wr);
+ err = complete_sq_drain_wrs(qhp, wr, bad_wr);
return err;
}
num_wrs = t4_sq_avail(&qhp->wq);
@@ -1061,7 +1085,7 @@ int c4iw_post_receive(struct ib_qp *ibqp, struct ib_recv_wr *wr,
*/
if (qhp->wq.flushed) {
spin_unlock_irqrestore(&qhp->lock, flag);
- complete_rq_drain_wr(qhp, wr);
+ complete_rq_drain_wrs(qhp, wr);
return err;
}
num_wrs = t4_rq_avail(&qhp->wq);
This is a note to let you know that I've just added the patch titled
x86/pti: Put the LDT in its own PGD if PTI is on
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-pti-put-the-ldt-in-its-own-pgd-if-pti-is-on.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From f55f0501cbf65ec41cca5058513031b711730b1d Mon Sep 17 00:00:00 2001
From: Andy Lutomirski <luto(a)kernel.org>
Date: Tue, 12 Dec 2017 07:56:45 -0800
Subject: x86/pti: Put the LDT in its own PGD if PTI is on
From: Andy Lutomirski <luto(a)kernel.org>
commit f55f0501cbf65ec41cca5058513031b711730b1d upstream.
With PTI enabled, the LDT must be mapped in the usermode tables somewhere.
The LDT is per process, i.e. per mm.
An earlier approach mapped the LDT on context switch into a fixmap area,
but that's a big overhead and exhausted the fixmap space when NR_CPUS got
big.
Take advantage of the fact that there is an address space hole which
provides a completely unused pgd. Use this pgd to manage per-mm LDT
mappings.
This has a down side: the LDT isn't (currently) randomized, and an attack
that can write the LDT is instant root due to call gates (thanks, AMD, for
leaving call gates in AMD64 but designing them wrong so they're only useful
for exploits). This can be mitigated by making the LDT read-only or
randomizing the mapping, either of which is straightforward on top of this
patch.
This will significantly slow down LDT users, but that shouldn't matter for
important workloads -- the LDT is only used by DOSEMU(2), Wine, and very
old libc implementations.
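A quick stand-alone sanity check of the addresses involved (illustrative
userspace arithmetic, not kernel code): with 4-level paging PGDIR_SHIFT is 39
and the patch picks PGD entry -4, with 5-level paging it is 48 and entry -112,
which lands the LDT remap exactly at the ranges added to
Documentation/x86/x86_64/mm.txt below; one slot spans the maximum LDT size of
8192 entries * 8 bytes = 64 KiB:

#include <stdio.h>

int main(void)
{
	unsigned long ldt_base_4l = (unsigned long)-4 << 39;	/* LDT_PGD_ENTRY << PGDIR_SHIFT */
	unsigned long ldt_base_5l = (unsigned long)-112 << 48;
	unsigned long slot_stride = 8192UL * 8;			/* LDT_ENTRIES * LDT_ENTRY_SIZE */

	printf("LDT remap, 4-level:    %#lx\n", ldt_base_4l);	/* 0xfffffe0000000000 */
	printf("LDT remap, 5-level:    %#lx\n", ldt_base_5l);	/* 0xff90000000000000 */
	printf("slot 1 alias, 4-level: %#lx\n", ldt_base_4l + slot_stride);
	return 0;
}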
[ tglx: Cleaned it up. ]
Signed-off-by: Andy Lutomirski <luto(a)kernel.org>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Borislav Petkov <bp(a)alien8.de>
Cc: Brian Gerst <brgerst(a)gmail.com>
Cc: Dave Hansen <dave.hansen(a)intel.com>
Cc: Dave Hansen <dave.hansen(a)linux.intel.com>
Cc: David Laight <David.Laight(a)aculab.com>
Cc: H. Peter Anvin <hpa(a)zytor.com>
Cc: Josh Poimboeuf <jpoimboe(a)redhat.com>
Cc: Juergen Gross <jgross(a)suse.com>
Cc: Kees Cook <keescook(a)chromium.org>
Cc: Kirill A. Shutemov <kirill(a)shutemov.name>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
Documentation/x86/x86_64/mm.txt | 3
arch/x86/include/asm/mmu_context.h | 59 ++++++++++++-
arch/x86/include/asm/pgtable_64_types.h | 4
arch/x86/include/asm/processor.h | 23 +++--
arch/x86/kernel/ldt.c | 139 +++++++++++++++++++++++++++++++-
arch/x86/mm/dump_pagetables.c | 9 ++
6 files changed, 220 insertions(+), 17 deletions(-)
--- a/Documentation/x86/x86_64/mm.txt
+++ b/Documentation/x86/x86_64/mm.txt
@@ -12,6 +12,7 @@ ffffea0000000000 - ffffeaffffffffff (=40
... unused hole ...
ffffec0000000000 - fffffbffffffffff (=44 bits) kasan shadow memory (16TB)
... unused hole ...
+fffffe0000000000 - fffffe7fffffffff (=39 bits) LDT remap for PTI
fffffe8000000000 - fffffeffffffffff (=39 bits) cpu_entry_area mapping
ffffff0000000000 - ffffff7fffffffff (=39 bits) %esp fixup stacks
... unused hole ...
@@ -29,7 +30,7 @@ Virtual memory map with 5 level page tab
hole caused by [56:63] sign extension
ff00000000000000 - ff0fffffffffffff (=52 bits) guard hole, reserved for hypervisor
ff10000000000000 - ff8fffffffffffff (=55 bits) direct mapping of all phys. memory
-ff90000000000000 - ff9fffffffffffff (=52 bits) hole
+ff90000000000000 - ff9fffffffffffff (=52 bits) LDT remap for PTI
ffa0000000000000 - ffd1ffffffffffff (=54 bits) vmalloc/ioremap space (12800 TB)
ffd2000000000000 - ffd3ffffffffffff (=49 bits) hole
ffd4000000000000 - ffd5ffffffffffff (=49 bits) virtual memory map (512TB)
--- a/arch/x86/include/asm/mmu_context.h
+++ b/arch/x86/include/asm/mmu_context.h
@@ -50,10 +50,33 @@ struct ldt_struct {
* call gates. On native, we could merge the ldt_struct and LDT
* allocations, but it's not worth trying to optimize.
*/
- struct desc_struct *entries;
- unsigned int nr_entries;
+ struct desc_struct *entries;
+ unsigned int nr_entries;
+
+ /*
+ * If PTI is in use, then the entries array is not mapped while we're
+ * in user mode. The whole array will be aliased at the addressed
+ * given by ldt_slot_va(slot). We use two slots so that we can allocate
+ * and map, and enable a new LDT without invalidating the mapping
+ * of an older, still-in-use LDT.
+ *
+ * slot will be -1 if this LDT doesn't have an alias mapping.
+ */
+ int slot;
};
+/* This is a multiple of PAGE_SIZE. */
+#define LDT_SLOT_STRIDE (LDT_ENTRIES * LDT_ENTRY_SIZE)
+
+static inline void *ldt_slot_va(int slot)
+{
+#ifdef CONFIG_X86_64
+ return (void *)(LDT_BASE_ADDR + LDT_SLOT_STRIDE * slot);
+#else
+ BUG();
+#endif
+}
+
/*
* Used for LDT copy/destruction.
*/
@@ -64,6 +87,7 @@ static inline void init_new_context_ldt(
}
int ldt_dup_context(struct mm_struct *oldmm, struct mm_struct *mm);
void destroy_context_ldt(struct mm_struct *mm);
+void ldt_arch_exit_mmap(struct mm_struct *mm);
#else /* CONFIG_MODIFY_LDT_SYSCALL */
static inline void init_new_context_ldt(struct mm_struct *mm) { }
static inline int ldt_dup_context(struct mm_struct *oldmm,
@@ -71,7 +95,8 @@ static inline int ldt_dup_context(struct
{
return 0;
}
-static inline void destroy_context_ldt(struct mm_struct *mm) {}
+static inline void destroy_context_ldt(struct mm_struct *mm) { }
+static inline void ldt_arch_exit_mmap(struct mm_struct *mm) { }
#endif
static inline void load_mm_ldt(struct mm_struct *mm)
@@ -96,10 +121,31 @@ static inline void load_mm_ldt(struct mm
* that we can see.
*/
- if (unlikely(ldt))
- set_ldt(ldt->entries, ldt->nr_entries);
- else
+ if (unlikely(ldt)) {
+ if (static_cpu_has(X86_FEATURE_PTI)) {
+ if (WARN_ON_ONCE((unsigned long)ldt->slot > 1)) {
+ /*
+ * Whoops -- either the new LDT isn't mapped
+ * (if slot == -1) or is mapped into a bogus
+ * slot (if slot > 1).
+ */
+ clear_LDT();
+ return;
+ }
+
+ /*
+ * If page table isolation is enabled, ldt->entries
+ * will not be mapped in the userspace pagetables.
+ * Tell the CPU to access the LDT through the alias
+ * at ldt_slot_va(ldt->slot).
+ */
+ set_ldt(ldt_slot_va(ldt->slot), ldt->nr_entries);
+ } else {
+ set_ldt(ldt->entries, ldt->nr_entries);
+ }
+ } else {
clear_LDT();
+ }
#else
clear_LDT();
#endif
@@ -194,6 +240,7 @@ static inline int arch_dup_mmap(struct m
static inline void arch_exit_mmap(struct mm_struct *mm)
{
paravirt_arch_exit_mmap(mm);
+ ldt_arch_exit_mmap(mm);
}
#ifdef CONFIG_X86_64
--- a/arch/x86/include/asm/pgtable_64_types.h
+++ b/arch/x86/include/asm/pgtable_64_types.h
@@ -82,10 +82,14 @@ typedef struct { pteval_t pte; } pte_t;
# define VMALLOC_SIZE_TB _AC(12800, UL)
# define __VMALLOC_BASE _AC(0xffa0000000000000, UL)
# define __VMEMMAP_BASE _AC(0xffd4000000000000, UL)
+# define LDT_PGD_ENTRY _AC(-112, UL)
+# define LDT_BASE_ADDR (LDT_PGD_ENTRY << PGDIR_SHIFT)
#else
# define VMALLOC_SIZE_TB _AC(32, UL)
# define __VMALLOC_BASE _AC(0xffffc90000000000, UL)
# define __VMEMMAP_BASE _AC(0xffffea0000000000, UL)
+# define LDT_PGD_ENTRY _AC(-4, UL)
+# define LDT_BASE_ADDR (LDT_PGD_ENTRY << PGDIR_SHIFT)
#endif
#ifdef CONFIG_RANDOMIZE_MEMORY
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -851,13 +851,22 @@ static inline void spin_lock_prefetch(co
#else
/*
- * User space process size. 47bits minus one guard page. The guard
- * page is necessary on Intel CPUs: if a SYSCALL instruction is at
- * the highest possible canonical userspace address, then that
- * syscall will enter the kernel with a non-canonical return
- * address, and SYSRET will explode dangerously. We avoid this
- * particular problem by preventing anything from being mapped
- * at the maximum canonical address.
+ * User space process size. This is the first address outside the user range.
+ * There are a few constraints that determine this:
+ *
+ * On Intel CPUs, if a SYSCALL instruction is at the highest canonical
+ * address, then that syscall will enter the kernel with a
+ * non-canonical return address, and SYSRET will explode dangerously.
+ * We avoid this particular problem by preventing anything executable
+ * from being mapped at the maximum canonical address.
+ *
+ * On AMD CPUs in the Ryzen family, there's a nasty bug in which the
+ * CPUs malfunction if they execute code from the highest canonical page.
+ * They'll speculate right off the end of the canonical space, and
+ * bad things happen. This is worked around in the same way as the
+ * Intel problem.
+ *
+ * With page table isolation enabled, we map the LDT in ... [stay tuned]
*/
#define TASK_SIZE_MAX ((1UL << __VIRTUAL_MASK_SHIFT) - PAGE_SIZE)
--- a/arch/x86/kernel/ldt.c
+++ b/arch/x86/kernel/ldt.c
@@ -24,6 +24,7 @@
#include <linux/uaccess.h>
#include <asm/ldt.h>
+#include <asm/tlb.h>
#include <asm/desc.h>
#include <asm/mmu_context.h>
#include <asm/syscalls.h>
@@ -51,13 +52,11 @@ static void refresh_ldt_segments(void)
static void flush_ldt(void *__mm)
{
struct mm_struct *mm = __mm;
- mm_context_t *pc;
if (this_cpu_read(cpu_tlbstate.loaded_mm) != mm)
return;
- pc = &mm->context;
- set_ldt(pc->ldt->entries, pc->ldt->nr_entries);
+ load_mm_ldt(mm);
refresh_ldt_segments();
}
@@ -94,10 +93,121 @@ static struct ldt_struct *alloc_ldt_stru
return NULL;
}
+ /* The new LDT isn't aliased for PTI yet. */
+ new_ldt->slot = -1;
+
new_ldt->nr_entries = num_entries;
return new_ldt;
}
+/*
+ * If PTI is enabled, this maps the LDT into the kernelmode and
+ * usermode tables for the given mm.
+ *
+ * There is no corresponding unmap function. Even if the LDT is freed, we
+ * leave the PTEs around until the slot is reused or the mm is destroyed.
+ * This is harmless: the LDT is always in ordinary memory, and no one will
+ * access the freed slot.
+ *
+ * If we wanted to unmap freed LDTs, we'd also need to do a flush to make
+ * it useful, and the flush would slow down modify_ldt().
+ */
+static int
+map_ldt_struct(struct mm_struct *mm, struct ldt_struct *ldt, int slot)
+{
+#ifdef CONFIG_PAGE_TABLE_ISOLATION
+ bool is_vmalloc, had_top_level_entry;
+ unsigned long va;
+ spinlock_t *ptl;
+ pgd_t *pgd;
+ int i;
+
+ if (!static_cpu_has(X86_FEATURE_PTI))
+ return 0;
+
+ /*
+ * Any given ldt_struct should have map_ldt_struct() called at most
+ * once.
+ */
+ WARN_ON(ldt->slot != -1);
+
+ /*
+ * Did we already have the top level entry allocated? We can't
+ * use pgd_none() for this because it doens't do anything on
+ * 4-level page table kernels.
+ */
+ pgd = pgd_offset(mm, LDT_BASE_ADDR);
+ had_top_level_entry = (pgd->pgd != 0);
+
+ is_vmalloc = is_vmalloc_addr(ldt->entries);
+
+ for (i = 0; i * PAGE_SIZE < ldt->nr_entries * LDT_ENTRY_SIZE; i++) {
+ unsigned long offset = i << PAGE_SHIFT;
+ const void *src = (char *)ldt->entries + offset;
+ unsigned long pfn;
+ pte_t pte, *ptep;
+
+ va = (unsigned long)ldt_slot_va(slot) + offset;
+ pfn = is_vmalloc ? vmalloc_to_pfn(src) :
+ page_to_pfn(virt_to_page(src));
+ /*
+ * Treat the PTI LDT range as a *userspace* range.
+ * get_locked_pte() will allocate all needed pagetables
+ * and account for them in this mm.
+ */
+ ptep = get_locked_pte(mm, va, &ptl);
+ if (!ptep)
+ return -ENOMEM;
+ pte = pfn_pte(pfn, __pgprot(__PAGE_KERNEL & ~_PAGE_GLOBAL));
+ set_pte_at(mm, va, ptep, pte);
+ pte_unmap_unlock(ptep, ptl);
+ }
+
+ if (mm->context.ldt) {
+ /*
+ * We already had an LDT. The top-level entry should already
+ * have been allocated and synchronized with the usermode
+ * tables.
+ */
+ WARN_ON(!had_top_level_entry);
+ if (static_cpu_has(X86_FEATURE_PTI))
+ WARN_ON(!kernel_to_user_pgdp(pgd)->pgd);
+ } else {
+ /*
+ * This is the first time we're mapping an LDT for this process.
+ * Sync the pgd to the usermode tables.
+ */
+ WARN_ON(had_top_level_entry);
+ if (static_cpu_has(X86_FEATURE_PTI)) {
+ WARN_ON(kernel_to_user_pgdp(pgd)->pgd);
+ set_pgd(kernel_to_user_pgdp(pgd), *pgd);
+ }
+ }
+
+ va = (unsigned long)ldt_slot_va(slot);
+ flush_tlb_mm_range(mm, va, va + LDT_SLOT_STRIDE, 0);
+
+ ldt->slot = slot;
+#endif
+ return 0;
+}
+
+static void free_ldt_pgtables(struct mm_struct *mm)
+{
+#ifdef CONFIG_PAGE_TABLE_ISOLATION
+ struct mmu_gather tlb;
+ unsigned long start = LDT_BASE_ADDR;
+ unsigned long end = start + (1UL << PGDIR_SHIFT);
+
+ if (!static_cpu_has(X86_FEATURE_PTI))
+ return;
+
+ tlb_gather_mmu(&tlb, mm, start, end);
+ free_pgd_range(&tlb, start, end, start, end);
+ tlb_finish_mmu(&tlb, start, end);
+#endif
+}
+
/* After calling this, the LDT is immutable. */
static void finalize_ldt_struct(struct ldt_struct *ldt)
{
@@ -156,6 +266,12 @@ int ldt_dup_context(struct mm_struct *ol
new_ldt->nr_entries * LDT_ENTRY_SIZE);
finalize_ldt_struct(new_ldt);
+ retval = map_ldt_struct(mm, new_ldt, 0);
+ if (retval) {
+ free_ldt_pgtables(mm);
+ free_ldt_struct(new_ldt);
+ goto out_unlock;
+ }
mm->context.ldt = new_ldt;
out_unlock:
@@ -174,6 +290,11 @@ void destroy_context_ldt(struct mm_struc
mm->context.ldt = NULL;
}
+void ldt_arch_exit_mmap(struct mm_struct *mm)
+{
+ free_ldt_pgtables(mm);
+}
+
static int read_ldt(void __user *ptr, unsigned long bytecount)
{
struct mm_struct *mm = current->mm;
@@ -287,6 +408,18 @@ static int write_ldt(void __user *ptr, u
new_ldt->entries[ldt_info.entry_number] = ldt;
finalize_ldt_struct(new_ldt);
+ /*
+ * If we are using PTI, map the new LDT into the userspace pagetables.
+ * If there is already an LDT, use the other slot so that other CPUs
+ * will continue to use the old LDT until install_ldt() switches
+ * them over to the new LDT.
+ */
+ error = map_ldt_struct(mm, new_ldt, old_ldt ? !old_ldt->slot : 0);
+ if (error) {
+ free_ldt_struct(old_ldt);
+ goto out_unlock;
+ }
+
install_ldt(mm, new_ldt);
free_ldt_struct(old_ldt);
error = 0;
--- a/arch/x86/mm/dump_pagetables.c
+++ b/arch/x86/mm/dump_pagetables.c
@@ -52,12 +52,18 @@ enum address_markers_idx {
USER_SPACE_NR = 0,
KERNEL_SPACE_NR,
LOW_KERNEL_NR,
+#if defined(CONFIG_MODIFY_LDT_SYSCALL) && defined(CONFIG_X86_5LEVEL)
+ LDT_NR,
+#endif
VMALLOC_START_NR,
VMEMMAP_START_NR,
#ifdef CONFIG_KASAN
KASAN_SHADOW_START_NR,
KASAN_SHADOW_END_NR,
#endif
+#if defined(CONFIG_MODIFY_LDT_SYSCALL) && !defined(CONFIG_X86_5LEVEL)
+ LDT_NR,
+#endif
CPU_ENTRY_AREA_NR,
#ifdef CONFIG_X86_ESPFIX64
ESPFIX_START_NR,
@@ -82,6 +88,9 @@ static struct addr_marker address_marker
[KASAN_SHADOW_START_NR] = { KASAN_SHADOW_START, "KASAN shadow" },
[KASAN_SHADOW_END_NR] = { KASAN_SHADOW_END, "KASAN shadow end" },
#endif
+#ifdef CONFIG_MODIFY_LDT_SYSCALL
+ [LDT_NR] = { LDT_BASE_ADDR, "LDT remap" },
+#endif
[CPU_ENTRY_AREA_NR] = { CPU_ENTRY_AREA_BASE,"CPU entry Area" },
#ifdef CONFIG_X86_ESPFIX64
[ESPFIX_START_NR] = { ESPFIX_BASE_ADDR, "ESPfix Area", 16 },
Patches currently in stable-queue which might be from luto(a)kernel.org are
queue-4.14/x86-mm-pti-allow-nx-poison-to-be-set-in-p4d-pgd.patch
queue-4.14/x86-dumpstack-indicate-in-oops-whether-pti-is-configured-and-enabled.patch
queue-4.14/x86-mm-clarify-the-whole-asid-kernel-pcid-user-pcid-naming.patch
queue-4.14/x86-mm-abstract-switching-cr3.patch
queue-4.14/x86-mm-optimize-restore_cr3.patch
queue-4.14/x86-mm-pti-force-entry-through-trampoline-when-pti-active.patch
queue-4.14/x86-entry-align-entry-text-section-to-pmd-boundary.patch
queue-4.14/x86-mm-64-make-a-full-pgd-entry-size-hole-in-the-memory-map.patch
queue-4.14/x86-cpu_entry_area-add-debugstore-entries-to-cpu_entry_area.patch
queue-4.14/x86-mm-pti-share-entry-text-pmd.patch
queue-4.14/x86-ldt-make-the-ldt-mapping-ro.patch
queue-4.14/x86-mm-dump_pagetables-check-user-space-page-table-for-wx-pages.patch
queue-4.14/x86-pti-map-the-vsyscall-page-if-needed.patch
queue-4.14/x86-mm-pti-disable-global-pages-if-page_table_isolation-y.patch
queue-4.14/x86-mm-dump_pagetables-allow-dumping-current-pagetables.patch
queue-4.14/x86-mm-pti-add-infrastructure-for-page-table-isolation.patch
queue-4.14/x86-cpufeatures-add-x86_bug_cpu_insecure.patch
queue-4.14/x86-mm-use-fix-pcid-to-optimize-user-kernel-switches.patch
queue-4.14/x86-mm-use-invpcid-for-__native_flush_tlb_single.patch
queue-4.14/x86-mm-pti-prepare-the-x86-entry-assembly-code-for-entry-exit-cr3-switching.patch
queue-4.14/x86-mm-allow-flushing-for-future-asid-switches.patch
queue-4.14/x86-mm-pti-add-kconfig.patch
queue-4.14/x86-mm-pti-add-mapping-helper-functions.patch
queue-4.14/x86-mm-pti-map-espfix-into-user-space.patch
queue-4.14/x86-mm-pti-share-cpu_entry_area-with-user-space-page-tables.patch
queue-4.14/x86-mm-pti-add-functions-to-clone-kernel-pmds.patch
queue-4.14/x86-pti-add-the-pti-cmdline-option-and-documentation.patch
queue-4.14/x86-events-intel-ds-map-debug-buffers-in-cpu_entry_area.patch
queue-4.14/x86-mm-pti-allocate-a-separate-user-pgd.patch
queue-4.14/x86-pti-put-the-ldt-in-its-own-pgd-if-pti-is-on.patch
queue-4.14/x86-mm-pti-populate-user-pgd.patch
queue-4.14/x86-mm-dump_pagetables-add-page-table-directory-to-the-debugfs-vfs-hierarchy.patch
This is a note to let you know that I've just added the patch titled
x86/pti: Map the vsyscall page if needed
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-pti-map-the-vsyscall-page-if-needed.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 85900ea51577e31b186e523c8f4e068c79ecc7d3 Mon Sep 17 00:00:00 2001
From: Andy Lutomirski <luto(a)kernel.org>
Date: Tue, 12 Dec 2017 07:56:42 -0800
Subject: x86/pti: Map the vsyscall page if needed
From: Andy Lutomirski <luto(a)kernel.org>
commit 85900ea51577e31b186e523c8f4e068c79ecc7d3 upstream.
Make VSYSCALLs work fully in PTI mode by mapping them properly to the user
space visible page tables.
[ tglx: Hide unused functions (Patch by Arnd Bergmann) ]
Signed-off-by: Andy Lutomirski <luto(a)kernel.org>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Borislav Petkov <bp(a)alien8.de>
Cc: Brian Gerst <brgerst(a)gmail.com>
Cc: Dave Hansen <dave.hansen(a)linux.intel.com>
Cc: David Laight <David.Laight(a)aculab.com>
Cc: H. Peter Anvin <hpa(a)zytor.com>
Cc: Josh Poimboeuf <jpoimboe(a)redhat.com>
Cc: Juergen Gross <jgross(a)suse.com>
Cc: Kees Cook <keescook(a)chromium.org>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/entry/vsyscall/vsyscall_64.c | 6 +--
arch/x86/include/asm/vsyscall.h | 1
arch/x86/mm/pti.c | 65 ++++++++++++++++++++++++++++++++++
3 files changed, 69 insertions(+), 3 deletions(-)
--- a/arch/x86/entry/vsyscall/vsyscall_64.c
+++ b/arch/x86/entry/vsyscall/vsyscall_64.c
@@ -344,14 +344,14 @@ int in_gate_area_no_mm(unsigned long add
* vsyscalls but leave the page not present. If so, we skip calling
* this.
*/
-static void __init set_vsyscall_pgtable_user_bits(void)
+void __init set_vsyscall_pgtable_user_bits(pgd_t *root)
{
pgd_t *pgd;
p4d_t *p4d;
pud_t *pud;
pmd_t *pmd;
- pgd = pgd_offset_k(VSYSCALL_ADDR);
+ pgd = pgd_offset_pgd(root, VSYSCALL_ADDR);
set_pgd(pgd, __pgd(pgd_val(*pgd) | _PAGE_USER));
p4d = p4d_offset(pgd, VSYSCALL_ADDR);
#if CONFIG_PGTABLE_LEVELS >= 5
@@ -373,7 +373,7 @@ void __init map_vsyscall(void)
vsyscall_mode == NATIVE
? PAGE_KERNEL_VSYSCALL
: PAGE_KERNEL_VVAR);
- set_vsyscall_pgtable_user_bits();
+ set_vsyscall_pgtable_user_bits(swapper_pg_dir);
}
BUILD_BUG_ON((unsigned long)__fix_to_virt(VSYSCALL_PAGE) !=
--- a/arch/x86/include/asm/vsyscall.h
+++ b/arch/x86/include/asm/vsyscall.h
@@ -7,6 +7,7 @@
#ifdef CONFIG_X86_VSYSCALL_EMULATION
extern void map_vsyscall(void);
+extern void set_vsyscall_pgtable_user_bits(pgd_t *root);
/*
* Called on instruction fetch fault in vsyscall page.
--- a/arch/x86/mm/pti.c
+++ b/arch/x86/mm/pti.c
@@ -38,6 +38,7 @@
#include <asm/cpufeature.h>
#include <asm/hypervisor.h>
+#include <asm/vsyscall.h>
#include <asm/cmdline.h>
#include <asm/pti.h>
#include <asm/pgtable.h>
@@ -223,6 +224,69 @@ static pmd_t *pti_user_pagetable_walk_pm
return pmd_offset(pud, address);
}
+#ifdef CONFIG_X86_VSYSCALL_EMULATION
+/*
+ * Walk the shadow copy of the page tables (optionally) trying to allocate
+ * page table pages on the way down. Does not support large pages.
+ *
+ * Note: this is only used when mapping *new* kernel data into the
+ * user/shadow page tables. It is never used for userspace data.
+ *
+ * Returns a pointer to a PTE on success, or NULL on failure.
+ */
+static __init pte_t *pti_user_pagetable_walk_pte(unsigned long address)
+{
+ gfp_t gfp = (GFP_KERNEL | __GFP_NOTRACK | __GFP_ZERO);
+ pmd_t *pmd = pti_user_pagetable_walk_pmd(address);
+ pte_t *pte;
+
+ /* We can't do anything sensible if we hit a large mapping. */
+ if (pmd_large(*pmd)) {
+ WARN_ON(1);
+ return NULL;
+ }
+
+ if (pmd_none(*pmd)) {
+ unsigned long new_pte_page = __get_free_page(gfp);
+ if (!new_pte_page)
+ return NULL;
+
+ if (pmd_none(*pmd)) {
+ set_pmd(pmd, __pmd(_KERNPG_TABLE | __pa(new_pte_page)));
+ new_pte_page = 0;
+ }
+ if (new_pte_page)
+ free_page(new_pte_page);
+ }
+
+ pte = pte_offset_kernel(pmd, address);
+ if (pte_flags(*pte) & _PAGE_USER) {
+ WARN_ONCE(1, "attempt to walk to user pte\n");
+ return NULL;
+ }
+ return pte;
+}
+
+static void __init pti_setup_vsyscall(void)
+{
+ pte_t *pte, *target_pte;
+ unsigned int level;
+
+ pte = lookup_address(VSYSCALL_ADDR, &level);
+ if (!pte || WARN_ON(level != PG_LEVEL_4K) || pte_none(*pte))
+ return;
+
+ target_pte = pti_user_pagetable_walk_pte(VSYSCALL_ADDR);
+ if (WARN_ON(!target_pte))
+ return;
+
+ *target_pte = *pte;
+ set_vsyscall_pgtable_user_bits(kernel_to_user_pgdp(swapper_pg_dir));
+}
+#else
+static void __init pti_setup_vsyscall(void) { }
+#endif
+
static void __init
pti_clone_pmds(unsigned long start, unsigned long end, pmdval_t clear)
{
@@ -319,4 +383,5 @@ void __init pti_init(void)
pti_clone_user_shared();
pti_clone_entry_text();
pti_setup_espfix64();
+ pti_setup_vsyscall();
}
Patches currently in stable-queue which might be from luto(a)kernel.org are
queue-4.14/x86-mm-pti-allow-nx-poison-to-be-set-in-p4d-pgd.patch
queue-4.14/x86-dumpstack-indicate-in-oops-whether-pti-is-configured-and-enabled.patch
queue-4.14/x86-mm-clarify-the-whole-asid-kernel-pcid-user-pcid-naming.patch
queue-4.14/x86-mm-abstract-switching-cr3.patch
queue-4.14/x86-mm-optimize-restore_cr3.patch
queue-4.14/x86-mm-pti-force-entry-through-trampoline-when-pti-active.patch
queue-4.14/x86-entry-align-entry-text-section-to-pmd-boundary.patch
queue-4.14/x86-mm-64-make-a-full-pgd-entry-size-hole-in-the-memory-map.patch
queue-4.14/x86-cpu_entry_area-add-debugstore-entries-to-cpu_entry_area.patch
queue-4.14/x86-mm-pti-share-entry-text-pmd.patch
queue-4.14/x86-ldt-make-the-ldt-mapping-ro.patch
queue-4.14/x86-mm-dump_pagetables-check-user-space-page-table-for-wx-pages.patch
queue-4.14/x86-pti-map-the-vsyscall-page-if-needed.patch
queue-4.14/x86-mm-pti-disable-global-pages-if-page_table_isolation-y.patch
queue-4.14/x86-mm-dump_pagetables-allow-dumping-current-pagetables.patch
queue-4.14/x86-mm-pti-add-infrastructure-for-page-table-isolation.patch
queue-4.14/x86-cpufeatures-add-x86_bug_cpu_insecure.patch
queue-4.14/x86-mm-use-fix-pcid-to-optimize-user-kernel-switches.patch
queue-4.14/x86-mm-use-invpcid-for-__native_flush_tlb_single.patch
queue-4.14/x86-mm-pti-prepare-the-x86-entry-assembly-code-for-entry-exit-cr3-switching.patch
queue-4.14/x86-mm-allow-flushing-for-future-asid-switches.patch
queue-4.14/x86-mm-pti-add-kconfig.patch
queue-4.14/x86-mm-pti-add-mapping-helper-functions.patch
queue-4.14/x86-mm-pti-map-espfix-into-user-space.patch
queue-4.14/x86-mm-pti-share-cpu_entry_area-with-user-space-page-tables.patch
queue-4.14/x86-mm-pti-add-functions-to-clone-kernel-pmds.patch
queue-4.14/x86-pti-add-the-pti-cmdline-option-and-documentation.patch
queue-4.14/x86-events-intel-ds-map-debug-buffers-in-cpu_entry_area.patch
queue-4.14/x86-mm-pti-allocate-a-separate-user-pgd.patch
queue-4.14/x86-pti-put-the-ldt-in-its-own-pgd-if-pti-is-on.patch
queue-4.14/x86-mm-pti-populate-user-pgd.patch
queue-4.14/x86-mm-dump_pagetables-add-page-table-directory-to-the-debugfs-vfs-hierarchy.patch
This is a note to let you know that I've just added the patch titled
x86/pti: Add the pti= cmdline option and documentation
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-pti-add-the-pti-cmdline-option-and-documentation.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 41f4c20b57a4890ea7f56ff8717cc83fefb8d537 Mon Sep 17 00:00:00 2001
From: Borislav Petkov <bp(a)suse.de>
Date: Tue, 12 Dec 2017 14:39:52 +0100
Subject: x86/pti: Add the pti= cmdline option and documentation
From: Borislav Petkov <bp(a)suse.de>
commit 41f4c20b57a4890ea7f56ff8717cc83fefb8d537 upstream.
Keep the "nopti" optional for traditional reasons.
[ tglx: Don't allow force on when running on XEN PV and made 'on'
printout conditional ]
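For illustration only (not part of the patch), the resulting boot-time choices
look roughly like this on the kernel command line, with the legacy "nopti"
behaving like "pti=off":

    pti=auto    # default: enable only if the CPU is marked X86_BUG_CPU_INSECURE
    pti=off     # disable page table isolation
    pti=on      # force-enable even on CPUs not marked insecure (XEN PV still disables it first)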
Requested-by: Linus Torvalds <torvalds(a)linux-foundation.org>
Signed-off-by: Borislav Petkov <bp(a)suse.de>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Andy Lutomirski <luto(a)kernel.org>
Cc: Andy Lutomirsky <luto(a)kernel.org>
Cc: Boris Ostrovsky <boris.ostrovsky(a)oracle.com>
Cc: Borislav Petkov <bp(a)alien8.de>
Cc: Brian Gerst <brgerst(a)gmail.com>
Cc: Dave Hansen <dave.hansen(a)intel.com>
Cc: Dave Hansen <dave.hansen(a)linux.intel.com>
Cc: David Laight <David.Laight(a)aculab.com>
Cc: Denys Vlasenko <dvlasenk(a)redhat.com>
Cc: Eduardo Valentin <eduval(a)amazon.com>
Cc: Greg KH <gregkh(a)linuxfoundation.org>
Cc: H. Peter Anvin <hpa(a)zytor.com>
Cc: Josh Poimboeuf <jpoimboe(a)redhat.com>
Cc: Juergen Gross <jgross(a)suse.com>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Will Deacon <will.deacon(a)arm.com>
Cc: aliguori(a)amazon.com
Cc: daniel.gruss(a)iaik.tugraz.at
Cc: hughd(a)google.com
Cc: keescook(a)google.com
Link: https://lkml.kernel.org/r/20171212133952.10177-1-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
Documentation/admin-guide/kernel-parameters.txt | 6 +++++
arch/x86/mm/pti.c | 26 +++++++++++++++++++++++-
2 files changed, 31 insertions(+), 1 deletion(-)
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -3255,6 +3255,12 @@
pt. [PARIDE]
See Documentation/blockdev/paride.txt.
+ pti= [X86_64]
+ Control user/kernel address space isolation:
+ on - enable
+ off - disable
+ auto - default setting
+
pty.legacy_count=
[KNL] Number of legacy pty's. Overwrites compiled-in
default number.
--- a/arch/x86/mm/pti.c
+++ b/arch/x86/mm/pti.c
@@ -54,21 +54,45 @@ static void __init pti_print_if_insecure
pr_info("%s\n", reason);
}
+static void __init pti_print_if_secure(const char *reason)
+{
+ if (!boot_cpu_has_bug(X86_BUG_CPU_INSECURE))
+ pr_info("%s\n", reason);
+}
+
void __init pti_check_boottime_disable(void)
{
+ char arg[5];
+ int ret;
+
if (hypervisor_is_type(X86_HYPER_XEN_PV)) {
pti_print_if_insecure("disabled on XEN PV.");
return;
}
+ ret = cmdline_find_option(boot_command_line, "pti", arg, sizeof(arg));
+ if (ret > 0) {
+ if (ret == 3 && !strncmp(arg, "off", 3)) {
+ pti_print_if_insecure("disabled on command line.");
+ return;
+ }
+ if (ret == 2 && !strncmp(arg, "on", 2)) {
+ pti_print_if_secure("force enabled on command line.");
+ goto enable;
+ }
+ if (ret == 4 && !strncmp(arg, "auto", 4))
+ goto autosel;
+ }
+
if (cmdline_find_option_bool(boot_command_line, "nopti")) {
pti_print_if_insecure("disabled on command line.");
return;
}
+autosel:
if (!boot_cpu_has_bug(X86_BUG_CPU_INSECURE))
return;
-
+enable:
setup_force_cpu_cap(X86_FEATURE_PTI);
}
Patches currently in stable-queue which might be from bp(a)suse.de are
queue-4.14/x86-mm-pti-allow-nx-poison-to-be-set-in-p4d-pgd.patch
queue-4.14/x86-mm-pti-force-entry-through-trampoline-when-pti-active.patch
queue-4.14/x86-mm-pti-disable-global-pages-if-page_table_isolation-y.patch
queue-4.14/x86-mm-pti-add-infrastructure-for-page-table-isolation.patch
queue-4.14/x86-mm-pti-prepare-the-x86-entry-assembly-code-for-entry-exit-cr3-switching.patch
queue-4.14/x86-pti-add-the-pti-cmdline-option-and-documentation.patch
queue-4.14/x86-mm-pti-allocate-a-separate-user-pgd.patch
queue-4.14/x86-mm-pti-populate-user-pgd.patch
queue-4.14/x86-mm-dump_pagetables-add-page-table-directory-to-the-debugfs-vfs-hierarchy.patch
This is a note to let you know that I've just added the patch titled
x86/mm: Use INVPCID for __native_flush_tlb_single()
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-mm-use-invpcid-for-__native_flush_tlb_single.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 6cff64b86aaaa07f89f50498055a20e45754b0c1 Mon Sep 17 00:00:00 2001
From: Dave Hansen <dave.hansen(a)linux.intel.com>
Date: Mon, 4 Dec 2017 15:08:01 +0100
Subject: x86/mm: Use INVPCID for __native_flush_tlb_single()
From: Dave Hansen <dave.hansen(a)linux.intel.com>
commit 6cff64b86aaaa07f89f50498055a20e45754b0c1 upstream.
This uses INVPCID to shoot down individual lines of the user mapping
instead of marking the entire user map as invalid. This
could/might/possibly be faster.
This for sure needs tlb_single_page_flush_ceiling to be redetermined;
esp. since INVPCID is _slow_.
A detailed performance analysis is available here:
https://lkml.kernel.org/r/3062e486-3539-8a1f-5724-16199420be71@intel.com
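Whether this path can ever be taken on a given machine depends on the CPU
advertising both PCID and INVPCID; the synthetic X86_FEATURE_INVPCID_SINGLE
flag is only set when both are present and CR4.PCIDE gets enabled. As a
stand-alone sketch using a recent GCC/Clang cpuid.h, with the usual CPUID bit
positions assumed (CR4.PCIDE itself is not visible from user space):

#include <stdio.h>
#include <cpuid.h>

int main(void)
{
	unsigned int eax, ebx, ecx, edx;

	if (__get_cpuid(1, &eax, &ebx, &ecx, &edx))
		printf("PCID:    %s\n", (ecx & (1u << 17)) ? "yes" : "no");

	if (__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx))
		printf("INVPCID: %s\n", (ebx & (1u << 10)) ? "yes" : "no");

	return 0;
}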
[ Peterz: Split out from big combo patch ]
Signed-off-by: Dave Hansen <dave.hansen(a)linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz(a)infradead.org>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Andy Lutomirski <luto(a)kernel.org>
Cc: Boris Ostrovsky <boris.ostrovsky(a)oracle.com>
Cc: Borislav Petkov <bp(a)alien8.de>
Cc: Brian Gerst <brgerst(a)gmail.com>
Cc: Denys Vlasenko <dvlasenk(a)redhat.com>
Cc: Eduardo Valentin <eduval(a)amazon.com>
Cc: Greg KH <gregkh(a)linuxfoundation.org>
Cc: H. Peter Anvin <hpa(a)zytor.com>
Cc: Josh Poimboeuf <jpoimboe(a)redhat.com>
Cc: Juergen Gross <jgross(a)suse.com>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Will Deacon <will.deacon(a)arm.com>
Cc: aliguori(a)amazon.com
Cc: daniel.gruss(a)iaik.tugraz.at
Cc: hughd(a)google.com
Cc: keescook(a)google.com
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/include/asm/cpufeatures.h | 1
arch/x86/include/asm/tlbflush.h | 23 ++++++++++++-
arch/x86/mm/init.c | 64 +++++++++++++++++++++----------------
3 files changed, 60 insertions(+), 28 deletions(-)
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -197,6 +197,7 @@
#define X86_FEATURE_CAT_L3 ( 7*32+ 4) /* Cache Allocation Technology L3 */
#define X86_FEATURE_CAT_L2 ( 7*32+ 5) /* Cache Allocation Technology L2 */
#define X86_FEATURE_CDP_L3 ( 7*32+ 6) /* Code and Data Prioritization L3 */
+#define X86_FEATURE_INVPCID_SINGLE ( 7*32+ 7) /* Effectively INVPCID && CR4.PCIDE=1 */
#define X86_FEATURE_HW_PSTATE ( 7*32+ 8) /* AMD HW-PState */
#define X86_FEATURE_PROC_FEEDBACK ( 7*32+ 9) /* AMD ProcFeedbackInterface */
--- a/arch/x86/include/asm/tlbflush.h
+++ b/arch/x86/include/asm/tlbflush.h
@@ -85,6 +85,18 @@ static inline u16 kern_pcid(u16 asid)
return asid + 1;
}
+/*
+ * The user PCID is just the kernel one, plus the "switch bit".
+ */
+static inline u16 user_pcid(u16 asid)
+{
+ u16 ret = kern_pcid(asid);
+#ifdef CONFIG_PAGE_TABLE_ISOLATION
+ ret |= 1 << X86_CR3_PTI_SWITCH_BIT;
+#endif
+ return ret;
+}
+
struct pgd_t;
static inline unsigned long build_cr3(pgd_t *pgd, u16 asid)
{
@@ -335,6 +347,8 @@ static inline void __native_flush_tlb_gl
/*
* Using INVPCID is considerably faster than a pair of writes
* to CR4 sandwiched inside an IRQ flag save/restore.
+ *
+ * Note, this works with CR4.PCIDE=0 or 1.
*/
invpcid_flush_all();
return;
@@ -368,7 +382,14 @@ static inline void __native_flush_tlb_si
if (!static_cpu_has(X86_FEATURE_PTI))
return;
- invalidate_user_asid(loaded_mm_asid);
+ /*
+ * Some platforms #GP if we call invpcid(type=1/2) before CR4.PCIDE=1.
+ * Just use invalidate_user_asid() in case we are called early.
+ */
+ if (!this_cpu_has(X86_FEATURE_INVPCID_SINGLE))
+ invalidate_user_asid(loaded_mm_asid);
+ else
+ invpcid_flush_one(user_pcid(loaded_mm_asid), addr);
}
/*
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -203,34 +203,44 @@ static void __init probe_page_size_mask(
static void setup_pcid(void)
{
-#ifdef CONFIG_X86_64
- if (boot_cpu_has(X86_FEATURE_PCID)) {
- if (boot_cpu_has(X86_FEATURE_PGE)) {
- /*
- * This can't be cr4_set_bits_and_update_boot() --
- * the trampoline code can't handle CR4.PCIDE and
- * it wouldn't do any good anyway. Despite the name,
- * cr4_set_bits_and_update_boot() doesn't actually
- * cause the bits in question to remain set all the
- * way through the secondary boot asm.
- *
- * Instead, we brute-force it and set CR4.PCIDE
- * manually in start_secondary().
- */
- cr4_set_bits(X86_CR4_PCIDE);
- } else {
- /*
- * flush_tlb_all(), as currently implemented, won't
- * work if PCID is on but PGE is not. Since that
- * combination doesn't exist on real hardware, there's
- * no reason to try to fully support it, but it's
- * polite to avoid corrupting data if we're on
- * an improperly configured VM.
- */
- setup_clear_cpu_cap(X86_FEATURE_PCID);
- }
+ if (!IS_ENABLED(CONFIG_X86_64))
+ return;
+
+ if (!boot_cpu_has(X86_FEATURE_PCID))
+ return;
+
+ if (boot_cpu_has(X86_FEATURE_PGE)) {
+ /*
+ * This can't be cr4_set_bits_and_update_boot() -- the
+ * trampoline code can't handle CR4.PCIDE and it wouldn't
+ * do any good anyway. Despite the name,
+ * cr4_set_bits_and_update_boot() doesn't actually cause
+ * the bits in question to remain set all the way through
+ * the secondary boot asm.
+ *
+ * Instead, we brute-force it and set CR4.PCIDE manually in
+ * start_secondary().
+ */
+ cr4_set_bits(X86_CR4_PCIDE);
+
+ /*
+ * INVPCID's single-context modes (2/3) only work if we set
+ * X86_CR4_PCIDE, *and* we INVPCID support. It's unusable
+ * on systems that have X86_CR4_PCIDE clear, or that have
+ * no INVPCID support at all.
+ */
+ if (boot_cpu_has(X86_FEATURE_INVPCID))
+ setup_force_cpu_cap(X86_FEATURE_INVPCID_SINGLE);
+ } else {
+ /*
+ * flush_tlb_all(), as currently implemented, won't work if
+ * PCID is on but PGE is not. Since that combination
+ * doesn't exist on real hardware, there's no reason to try
+ * to fully support it, but it's polite to avoid corrupting
+ * data if we're on an improperly configured VM.
+ */
+ setup_clear_cpu_cap(X86_FEATURE_PCID);
}
-#endif
}
#ifdef CONFIG_X86_32
Patches currently in stable-queue which might be from dave.hansen(a)linux.intel.com are
queue-4.14/x86-mm-pti-allow-nx-poison-to-be-set-in-p4d-pgd.patch
queue-4.14/x86-dumpstack-indicate-in-oops-whether-pti-is-configured-and-enabled.patch
queue-4.14/x86-mm-clarify-the-whole-asid-kernel-pcid-user-pcid-naming.patch
queue-4.14/x86-mm-abstract-switching-cr3.patch
queue-4.14/x86-mm-optimize-restore_cr3.patch
queue-4.14/x86-mm-pti-force-entry-through-trampoline-when-pti-active.patch
queue-4.14/x86-entry-align-entry-text-section-to-pmd-boundary.patch
queue-4.14/x86-mm-64-make-a-full-pgd-entry-size-hole-in-the-memory-map.patch
queue-4.14/x86-cpu_entry_area-add-debugstore-entries-to-cpu_entry_area.patch
queue-4.14/x86-mm-pti-share-entry-text-pmd.patch
queue-4.14/x86-ldt-make-the-ldt-mapping-ro.patch
queue-4.14/x86-mm-dump_pagetables-check-user-space-page-table-for-wx-pages.patch
queue-4.14/x86-pti-map-the-vsyscall-page-if-needed.patch
queue-4.14/x86-mm-pti-disable-global-pages-if-page_table_isolation-y.patch
queue-4.14/x86-mm-dump_pagetables-allow-dumping-current-pagetables.patch
queue-4.14/x86-mm-pti-add-infrastructure-for-page-table-isolation.patch
queue-4.14/x86-cpufeatures-add-x86_bug_cpu_insecure.patch
queue-4.14/x86-mm-use-fix-pcid-to-optimize-user-kernel-switches.patch
queue-4.14/x86-mm-use-invpcid-for-__native_flush_tlb_single.patch
queue-4.14/x86-mm-pti-prepare-the-x86-entry-assembly-code-for-entry-exit-cr3-switching.patch
queue-4.14/x86-mm-allow-flushing-for-future-asid-switches.patch
queue-4.14/x86-mm-pti-add-kconfig.patch
queue-4.14/x86-mm-pti-add-mapping-helper-functions.patch
queue-4.14/x86-mm-pti-map-espfix-into-user-space.patch
queue-4.14/x86-mm-pti-share-cpu_entry_area-with-user-space-page-tables.patch
queue-4.14/x86-mm-pti-add-functions-to-clone-kernel-pmds.patch
queue-4.14/x86-pti-add-the-pti-cmdline-option-and-documentation.patch
queue-4.14/x86-events-intel-ds-map-debug-buffers-in-cpu_entry_area.patch
queue-4.14/x86-mm-pti-allocate-a-separate-user-pgd.patch
queue-4.14/x86-pti-put-the-ldt-in-its-own-pgd-if-pti-is-on.patch
queue-4.14/x86-mm-pti-populate-user-pgd.patch
queue-4.14/x86-mm-dump_pagetables-add-page-table-directory-to-the-debugfs-vfs-hierarchy.patch
This is a note to let you know that I've just added the patch titled
x86/mm: Use/Fix PCID to optimize user/kernel switches
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-mm-use-fix-pcid-to-optimize-user-kernel-switches.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 6fd166aae78c0ab738d49bda653cbd9e3b1491cf Mon Sep 17 00:00:00 2001
From: Peter Zijlstra <peterz(a)infradead.org>
Date: Mon, 4 Dec 2017 15:07:59 +0100
Subject: x86/mm: Use/Fix PCID to optimize user/kernel switches
From: Peter Zijlstra <peterz(a)infradead.org>
commit 6fd166aae78c0ab738d49bda653cbd9e3b1491cf upstream.
We can use PCID to retain the TLBs across CR3 switches, including those now
part of the user/kernel switch. This increases performance of kernel
entry/exit at the cost of more expensive/complicated TLB flushing.
Now that we have two address spaces, one for kernel and one for user space,
we need two PCIDs per mm. We use the top PCID bit to indicate a user PCID
(just like we use the PFN LSB for the PGD). Since we do TLB invalidation
from kernel space, the existing code will only invalidate the kernel PCID;
we augment that by marking the corresponding user PCID invalid, and upon
switching back to userspace, use a flushing CR3 write for the switch.
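As a stand-alone illustration of the resulting CR3 layout (assumed values
mirroring the defines added below, not the kernel's own helpers): kernel ASID n
becomes hardware PCID n+1, the user PCID additionally sets bit 11, the user PGD
half is selected by bit 12, and bit 63 asks the CPU not to flush on the CR3
write:

#include <stdio.h>

#define PTI_SWITCH_BIT	11	/* user/kernel PCID bit, as added by this patch */
#define PAGE_SHIFT	12	/* selects the user half of the 8k PGD */
#define NOFLUSH_BIT	63	/* X86_CR3_PCID_NOFLUSH */

int main(void)
{
	unsigned long pgd = 0x12345000UL;	/* illustrative kernel PGD address */
	unsigned int asid = 1;			/* a dynamically assigned ASID */

	unsigned long kern_pcid = asid + 1;
	unsigned long user_pcid = kern_pcid | (1UL << PTI_SWITCH_BIT);

	unsigned long kernel_cr3  = pgd | kern_pcid;
	unsigned long user_cr3    = (pgd | (1UL << PAGE_SHIFT)) | user_pcid;
	unsigned long user_cr3_nf = user_cr3 | (1UL << NOFLUSH_BIT);

	printf("kernel CR3:          %#lx\n", kernel_cr3);
	printf("user CR3 (flush):    %#lx\n", user_cr3);
	printf("user CR3 (no flush): %#lx\n", user_cr3_nf);
	return 0;
}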
In order to access the user_pcid_flush_mask we use PER_CPU storage, which
means the previously established SWAPGS vs CR3 ordering is now mandatory
and required.
Having to do this memory access does require additional registers; most
sites have a functioning stack and we can spill one (RAX), while sites without
a functional stack need to provide the second scratch register some other way.
Note: PCID is generally available on Intel Sandybridge and later CPUs.
Note: Up until this point TLB flushing was broken in this series.
Based-on-code-from: Dave Hansen <dave.hansen(a)linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz(a)infradead.org>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Andy Lutomirski <luto(a)kernel.org>
Cc: Boris Ostrovsky <boris.ostrovsky(a)oracle.com>
Cc: Borislav Petkov <bp(a)alien8.de>
Cc: Brian Gerst <brgerst(a)gmail.com>
Cc: Dave Hansen <dave.hansen(a)linux.intel.com>
Cc: David Laight <David.Laight(a)aculab.com>
Cc: Denys Vlasenko <dvlasenk(a)redhat.com>
Cc: Eduardo Valentin <eduval(a)amazon.com>
Cc: Greg KH <gregkh(a)linuxfoundation.org>
Cc: H. Peter Anvin <hpa(a)zytor.com>
Cc: Josh Poimboeuf <jpoimboe(a)redhat.com>
Cc: Juergen Gross <jgross(a)suse.com>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Will Deacon <will.deacon(a)arm.com>
Cc: aliguori(a)amazon.com
Cc: daniel.gruss(a)iaik.tugraz.at
Cc: hughd(a)google.com
Cc: keescook(a)google.com
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/entry/calling.h | 72 ++++++++++++++++++----
arch/x86/entry/entry_64.S | 9 +-
arch/x86/entry/entry_64_compat.S | 4 -
arch/x86/include/asm/processor-flags.h | 5 +
arch/x86/include/asm/tlbflush.h | 91 ++++++++++++++++++++++++----
arch/x86/include/uapi/asm/processor-flags.h | 7 +-
arch/x86/kernel/asm-offsets.c | 4 +
arch/x86/mm/init.c | 2
arch/x86/mm/tlb.c | 1
9 files changed, 162 insertions(+), 33 deletions(-)
--- a/arch/x86/entry/calling.h
+++ b/arch/x86/entry/calling.h
@@ -3,6 +3,9 @@
#include <asm/unwind_hints.h>
#include <asm/cpufeatures.h>
#include <asm/page_types.h>
+#include <asm/percpu.h>
+#include <asm/asm-offsets.h>
+#include <asm/processor-flags.h>
/*
@@ -191,17 +194,21 @@ For 32-bit we have the following convent
#ifdef CONFIG_PAGE_TABLE_ISOLATION
-/* PAGE_TABLE_ISOLATION PGDs are 8k. Flip bit 12 to switch between the two halves: */
-#define PTI_SWITCH_MASK (1<<PAGE_SHIFT)
+/*
+ * PAGE_TABLE_ISOLATION PGDs are 8k. Flip bit 12 to switch between the two
+ * halves:
+ */
+#define PTI_SWITCH_PGTABLES_MASK (1<<PAGE_SHIFT)
+#define PTI_SWITCH_MASK (PTI_SWITCH_PGTABLES_MASK|(1<<X86_CR3_PTI_SWITCH_BIT))
-.macro ADJUST_KERNEL_CR3 reg:req
- /* Clear "PAGE_TABLE_ISOLATION bit", point CR3 at kernel pagetables: */
- andq $(~PTI_SWITCH_MASK), \reg
+.macro SET_NOFLUSH_BIT reg:req
+ bts $X86_CR3_PCID_NOFLUSH_BIT, \reg
.endm
-.macro ADJUST_USER_CR3 reg:req
- /* Move CR3 up a page to the user page tables: */
- orq $(PTI_SWITCH_MASK), \reg
+.macro ADJUST_KERNEL_CR3 reg:req
+ ALTERNATIVE "", "SET_NOFLUSH_BIT \reg", X86_FEATURE_PCID
+ /* Clear PCID and "PAGE_TABLE_ISOLATION bit", point CR3 at kernel pagetables: */
+ andq $(~PTI_SWITCH_MASK), \reg
.endm
.macro SWITCH_TO_KERNEL_CR3 scratch_reg:req
@@ -212,21 +219,58 @@ For 32-bit we have the following convent
.Lend_\@:
.endm
-.macro SWITCH_TO_USER_CR3 scratch_reg:req
+#define THIS_CPU_user_pcid_flush_mask \
+ PER_CPU_VAR(cpu_tlbstate) + TLB_STATE_user_pcid_flush_mask
+
+.macro SWITCH_TO_USER_CR3_NOSTACK scratch_reg:req scratch_reg2:req
ALTERNATIVE "jmp .Lend_\@", "", X86_FEATURE_PTI
mov %cr3, \scratch_reg
- ADJUST_USER_CR3 \scratch_reg
+
+ ALTERNATIVE "jmp .Lwrcr3_\@", "", X86_FEATURE_PCID
+
+ /*
+ * Test if the ASID needs a flush.
+ */
+ movq \scratch_reg, \scratch_reg2
+ andq $(0x7FF), \scratch_reg /* mask ASID */
+ bt \scratch_reg, THIS_CPU_user_pcid_flush_mask
+ jnc .Lnoflush_\@
+
+ /* Flush needed, clear the bit */
+ btr \scratch_reg, THIS_CPU_user_pcid_flush_mask
+ movq \scratch_reg2, \scratch_reg
+ jmp .Lwrcr3_\@
+
+.Lnoflush_\@:
+ movq \scratch_reg2, \scratch_reg
+ SET_NOFLUSH_BIT \scratch_reg
+
+.Lwrcr3_\@:
+ /* Flip the PGD and ASID to the user version */
+ orq $(PTI_SWITCH_MASK), \scratch_reg
mov \scratch_reg, %cr3
.Lend_\@:
.endm
+.macro SWITCH_TO_USER_CR3_STACK scratch_reg:req
+ pushq %rax
+ SWITCH_TO_USER_CR3_NOSTACK scratch_reg=\scratch_reg scratch_reg2=%rax
+ popq %rax
+.endm
+
.macro SAVE_AND_SWITCH_TO_KERNEL_CR3 scratch_reg:req save_reg:req
ALTERNATIVE "jmp .Ldone_\@", "", X86_FEATURE_PTI
movq %cr3, \scratch_reg
movq \scratch_reg, \save_reg
/*
- * Is the switch bit zero? This means the address is
- * up in real PAGE_TABLE_ISOLATION patches in a moment.
+ * Is the "switch mask" all zero? That means that both of
+ * these are zero:
+ *
+ * 1. The user/kernel PCID bit, and
+ * 2. The user/kernel "bit" that points CR3 to the
+ * bottom half of the 8k PGD
+ *
+ * That indicates a kernel CR3 value, not a user CR3.
*/
testq $(PTI_SWITCH_MASK), \scratch_reg
jz .Ldone_\@
@@ -251,7 +295,9 @@ For 32-bit we have the following convent
.macro SWITCH_TO_KERNEL_CR3 scratch_reg:req
.endm
-.macro SWITCH_TO_USER_CR3 scratch_reg:req
+.macro SWITCH_TO_USER_CR3_NOSTACK scratch_reg:req scratch_reg2:req
+.endm
+.macro SWITCH_TO_USER_CR3_STACK scratch_reg:req
.endm
.macro SAVE_AND_SWITCH_TO_KERNEL_CR3 scratch_reg:req save_reg:req
.endm
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -23,7 +23,6 @@
#include <asm/segment.h>
#include <asm/cache.h>
#include <asm/errno.h>
-#include "calling.h"
#include <asm/asm-offsets.h>
#include <asm/msr.h>
#include <asm/unistd.h>
@@ -40,6 +39,8 @@
#include <asm/frame.h>
#include <linux/err.h>
+#include "calling.h"
+
.code64
.section .entry.text, "ax"
@@ -406,7 +407,7 @@ syscall_return_via_sysret:
* We are on the trampoline stack. All regs except RDI are live.
* We can do future final exit work right here.
*/
- SWITCH_TO_USER_CR3 scratch_reg=%rdi
+ SWITCH_TO_USER_CR3_STACK scratch_reg=%rdi
popq %rdi
popq %rsp
@@ -744,7 +745,7 @@ GLOBAL(swapgs_restore_regs_and_return_to
* We can do future final exit work right here.
*/
- SWITCH_TO_USER_CR3 scratch_reg=%rdi
+ SWITCH_TO_USER_CR3_STACK scratch_reg=%rdi
/* Restore RDI. */
popq %rdi
@@ -857,7 +858,7 @@ native_irq_return_ldt:
*/
orq PER_CPU_VAR(espfix_stack), %rax
- SWITCH_TO_USER_CR3 scratch_reg=%rdi /* to user CR3 */
+ SWITCH_TO_USER_CR3_STACK scratch_reg=%rdi
SWAPGS /* to user GS */
popq %rdi /* Restore user RDI */
--- a/arch/x86/entry/entry_64_compat.S
+++ b/arch/x86/entry/entry_64_compat.S
@@ -275,9 +275,9 @@ sysret32_from_system_call:
* switch until after after the last reference to the process
* stack.
*
- * %r8 is zeroed before the sysret, thus safe to clobber.
+ * %r8/%r9 are zeroed before the sysret, thus safe to clobber.
*/
- SWITCH_TO_USER_CR3 scratch_reg=%r8
+ SWITCH_TO_USER_CR3_NOSTACK scratch_reg=%r8 scratch_reg2=%r9
xorq %r8, %r8
xorq %r9, %r9
--- a/arch/x86/include/asm/processor-flags.h
+++ b/arch/x86/include/asm/processor-flags.h
@@ -38,6 +38,11 @@
#define CR3_ADDR_MASK __sme_clr(0x7FFFFFFFFFFFF000ull)
#define CR3_PCID_MASK 0xFFFull
#define CR3_NOFLUSH BIT_ULL(63)
+
+#ifdef CONFIG_PAGE_TABLE_ISOLATION
+# define X86_CR3_PTI_SWITCH_BIT 11
+#endif
+
#else
/*
* CR3_ADDR_MASK needs at least bits 31:5 set on PAE systems, and we save
--- a/arch/x86/include/asm/tlbflush.h
+++ b/arch/x86/include/asm/tlbflush.h
@@ -10,6 +10,8 @@
#include <asm/special_insns.h>
#include <asm/smp.h>
#include <asm/invpcid.h>
+#include <asm/pti.h>
+#include <asm/processor-flags.h>
static inline u64 inc_mm_tlb_gen(struct mm_struct *mm)
{
@@ -24,24 +26,54 @@ static inline u64 inc_mm_tlb_gen(struct
/* There are 12 bits of space for ASIDS in CR3 */
#define CR3_HW_ASID_BITS 12
+
/*
* When enabled, PAGE_TABLE_ISOLATION consumes a single bit for
* user/kernel switches
*/
-#define PTI_CONSUMED_ASID_BITS 0
+#ifdef CONFIG_PAGE_TABLE_ISOLATION
+# define PTI_CONSUMED_PCID_BITS 1
+#else
+# define PTI_CONSUMED_PCID_BITS 0
+#endif
+
+#define CR3_AVAIL_PCID_BITS (X86_CR3_PCID_BITS - PTI_CONSUMED_PCID_BITS)
-#define CR3_AVAIL_ASID_BITS (CR3_HW_ASID_BITS - PTI_CONSUMED_ASID_BITS)
/*
* ASIDs are zero-based: 0->MAX_AVAIL_ASID are valid. -1 below to account
* for them being zero-based. Another -1 is because ASID 0 is reserved for
* use by non-PCID-aware users.
*/
-#define MAX_ASID_AVAILABLE ((1 << CR3_AVAIL_ASID_BITS) - 2)
+#define MAX_ASID_AVAILABLE ((1 << CR3_AVAIL_PCID_BITS) - 2)
+
+/*
+ * 6 because 6 should be plenty and struct tlb_state will fit in two cache
+ * lines.
+ */
+#define TLB_NR_DYN_ASIDS 6
static inline u16 kern_pcid(u16 asid)
{
VM_WARN_ON_ONCE(asid > MAX_ASID_AVAILABLE);
+
+#ifdef CONFIG_PAGE_TABLE_ISOLATION
/*
+ * Make sure that the dynamic ASID space does not confict with the
+ * bit we are using to switch between user and kernel ASIDs.
+ */
+ BUILD_BUG_ON(TLB_NR_DYN_ASIDS >= (1 << X86_CR3_PTI_SWITCH_BIT));
+
+ /*
+ * The ASID being passed in here should have respected the
+ * MAX_ASID_AVAILABLE and thus never have the switch bit set.
+ */
+ VM_WARN_ON_ONCE(asid & (1 << X86_CR3_PTI_SWITCH_BIT));
+#endif
+ /*
+ * The dynamically-assigned ASIDs that get passed in are small
+ * (<TLB_NR_DYN_ASIDS). They never have the high switch bit set,
+ * so do not bother to clear it.
+ *
* If PCID is on, ASID-aware code paths put the ASID+1 into the
* PCID bits. This serves two purposes. It prevents a nasty
* situation in which PCID-unaware code saves CR3, loads some other
@@ -95,12 +127,6 @@ static inline bool tlb_defer_switch_to_i
return !static_cpu_has(X86_FEATURE_PCID);
}
-/*
- * 6 because 6 should be plenty and struct tlb_state will fit in
- * two cache lines.
- */
-#define TLB_NR_DYN_ASIDS 6
-
struct tlb_context {
u64 ctx_id;
u64 tlb_gen;
@@ -146,6 +172,13 @@ struct tlb_state {
bool invalidate_other;
/*
+ * Mask that contains TLB_NR_DYN_ASIDS+1 bits to indicate
+ * the corresponding user PCID needs a flush next time we
+ * switch to it; see SWITCH_TO_USER_CR3.
+ */
+ unsigned short user_pcid_flush_mask;
+
+ /*
* Access to this CR4 shadow and to H/W CR4 is protected by
* disabling interrupts when modifying either one.
*/
@@ -250,14 +283,41 @@ static inline void cr4_set_bits_and_upda
extern void initialize_tlbstate_and_flush(void);
/*
+ * Given an ASID, flush the corresponding user ASID. We can delay this
+ * until the next time we switch to it.
+ *
+ * See SWITCH_TO_USER_CR3.
+ */
+static inline void invalidate_user_asid(u16 asid)
+{
+ /* There is no user ASID if address space separation is off */
+ if (!IS_ENABLED(CONFIG_PAGE_TABLE_ISOLATION))
+ return;
+
+ /*
+ * We only have a single ASID if PCID is off and the CR3
+ * write will have flushed it.
+ */
+ if (!cpu_feature_enabled(X86_FEATURE_PCID))
+ return;
+
+ if (!static_cpu_has(X86_FEATURE_PTI))
+ return;
+
+ __set_bit(kern_pcid(asid),
+ (unsigned long *)this_cpu_ptr(&cpu_tlbstate.user_pcid_flush_mask));
+}
+
+/*
* flush the entire current user mapping
*/
static inline void __native_flush_tlb(void)
{
+ invalidate_user_asid(this_cpu_read(cpu_tlbstate.loaded_mm_asid));
/*
- * If current->mm == NULL then we borrow a mm which may change during a
- * task switch and therefore we must not be preempted while we write CR3
- * back:
+ * If current->mm == NULL then we borrow a mm which may change
+ * during a task switch and therefore we must not be preempted
+ * while we write CR3 back:
*/
preempt_disable();
native_write_cr3(__native_read_cr3());
@@ -301,7 +361,14 @@ static inline void __native_flush_tlb_gl
*/
static inline void __native_flush_tlb_single(unsigned long addr)
{
+ u32 loaded_mm_asid = this_cpu_read(cpu_tlbstate.loaded_mm_asid);
+
asm volatile("invlpg (%0)" ::"r" (addr) : "memory");
+
+ if (!static_cpu_has(X86_FEATURE_PTI))
+ return;
+
+ invalidate_user_asid(loaded_mm_asid);
}
/*
--- a/arch/x86/include/uapi/asm/processor-flags.h
+++ b/arch/x86/include/uapi/asm/processor-flags.h
@@ -78,7 +78,12 @@
#define X86_CR3_PWT _BITUL(X86_CR3_PWT_BIT)
#define X86_CR3_PCD_BIT 4 /* Page Cache Disable */
#define X86_CR3_PCD _BITUL(X86_CR3_PCD_BIT)
-#define X86_CR3_PCID_MASK _AC(0x00000fff,UL) /* PCID Mask */
+
+#define X86_CR3_PCID_BITS 12
+#define X86_CR3_PCID_MASK (_AC((1UL << X86_CR3_PCID_BITS) - 1, UL))
+
+#define X86_CR3_PCID_NOFLUSH_BIT 63 /* Preserve old PCID */
+#define X86_CR3_PCID_NOFLUSH _BITULL(X86_CR3_PCID_NOFLUSH_BIT)
/*
* Intel CPU features in CR4
--- a/arch/x86/kernel/asm-offsets.c
+++ b/arch/x86/kernel/asm-offsets.c
@@ -17,6 +17,7 @@
#include <asm/sigframe.h>
#include <asm/bootparam.h>
#include <asm/suspend.h>
+#include <asm/tlbflush.h>
#ifdef CONFIG_XEN
#include <xen/interface/xen.h>
@@ -94,6 +95,9 @@ void common(void) {
BLANK();
DEFINE(PTREGS_SIZE, sizeof(struct pt_regs));
+ /* TLB state for the entry code */
+ OFFSET(TLB_STATE_user_pcid_flush_mask, tlb_state, user_pcid_flush_mask);
+
/* Layout info for cpu_entry_area */
OFFSET(CPU_ENTRY_AREA_tss, cpu_entry_area, tss);
OFFSET(CPU_ENTRY_AREA_entry_trampoline, cpu_entry_area, entry_trampoline);
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -855,7 +855,7 @@ void __init zone_sizes_init(void)
free_area_init_nodes(max_zone_pfns);
}
-DEFINE_PER_CPU_SHARED_ALIGNED(struct tlb_state, cpu_tlbstate) = {
+__visible DEFINE_PER_CPU_SHARED_ALIGNED(struct tlb_state, cpu_tlbstate) = {
.loaded_mm = &init_mm,
.next_asid = 1,
.cr4 = ~0UL, /* fail hard if we screw up cr4 shadow initialization */
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -105,6 +105,7 @@ static void load_new_mm_cr3(pgd_t *pgdir
unsigned long new_mm_cr3;
if (need_flush) {
+ invalidate_user_asid(new_asid);
new_mm_cr3 = build_cr3(pgdir, new_asid);
} else {
new_mm_cr3 = build_cr3_noflush(pgdir, new_asid);
Patches currently in stable-queue which might be from peterz(a)infradead.org are
queue-4.14/x86-mm-pti-allow-nx-poison-to-be-set-in-p4d-pgd.patch
queue-4.14/x86-dumpstack-indicate-in-oops-whether-pti-is-configured-and-enabled.patch
queue-4.14/x86-mm-clarify-the-whole-asid-kernel-pcid-user-pcid-naming.patch
queue-4.14/x86-mm-abstract-switching-cr3.patch
queue-4.14/x86-mm-optimize-restore_cr3.patch
queue-4.14/x86-mm-pti-force-entry-through-trampoline-when-pti-active.patch
queue-4.14/x86-entry-align-entry-text-section-to-pmd-boundary.patch
queue-4.14/x86-mm-64-make-a-full-pgd-entry-size-hole-in-the-memory-map.patch
queue-4.14/x86-cpu_entry_area-add-debugstore-entries-to-cpu_entry_area.patch
queue-4.14/x86-mm-pti-share-entry-text-pmd.patch
queue-4.14/x86-ldt-make-the-ldt-mapping-ro.patch
queue-4.14/x86-mm-dump_pagetables-check-user-space-page-table-for-wx-pages.patch
queue-4.14/x86-pti-map-the-vsyscall-page-if-needed.patch
queue-4.14/x86-mm-pti-disable-global-pages-if-page_table_isolation-y.patch
queue-4.14/x86-mm-dump_pagetables-allow-dumping-current-pagetables.patch
queue-4.14/x86-mm-pti-add-infrastructure-for-page-table-isolation.patch
queue-4.14/x86-cpufeatures-add-x86_bug_cpu_insecure.patch
queue-4.14/x86-mm-use-fix-pcid-to-optimize-user-kernel-switches.patch
queue-4.14/x86-mm-use-invpcid-for-__native_flush_tlb_single.patch
queue-4.14/x86-mm-pti-prepare-the-x86-entry-assembly-code-for-entry-exit-cr3-switching.patch
queue-4.14/x86-mm-allow-flushing-for-future-asid-switches.patch
queue-4.14/x86-mm-pti-add-kconfig.patch
queue-4.14/x86-mm-pti-add-mapping-helper-functions.patch
queue-4.14/x86-mm-pti-map-espfix-into-user-space.patch
queue-4.14/x86-mm-pti-share-cpu_entry_area-with-user-space-page-tables.patch
queue-4.14/x86-mm-pti-add-functions-to-clone-kernel-pmds.patch
queue-4.14/x86-pti-add-the-pti-cmdline-option-and-documentation.patch
queue-4.14/x86-events-intel-ds-map-debug-buffers-in-cpu_entry_area.patch
queue-4.14/x86-mm-pti-allocate-a-separate-user-pgd.patch
queue-4.14/x86-pti-put-the-ldt-in-its-own-pgd-if-pti-is-on.patch
queue-4.14/x86-mm-pti-populate-user-pgd.patch
queue-4.14/x86-mm-dump_pagetables-add-page-table-directory-to-the-debugfs-vfs-hierarchy.patch
This is a note to let you know that I've just added the patch titled
x86/mm/pti: Share entry text PMD
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-mm-pti-share-entry-text-pmd.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 6dc72c3cbca0580642808d677181cad4c6433893 Mon Sep 17 00:00:00 2001
From: Thomas Gleixner <tglx(a)linutronix.de>
Date: Mon, 4 Dec 2017 15:07:47 +0100
Subject: x86/mm/pti: Share entry text PMD
From: Thomas Gleixner <tglx(a)linutronix.de>
commit 6dc72c3cbca0580642808d677181cad4c6433893 upstream.
Share the entry text PMD of the kernel mapping with the user space
mapping. If large pages are enabled this is a single PMD entry and at the
point where it is copied into the user page table the RW bit has not been
cleared yet. Clear it right away so the user space visible map becomes RX.
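For orientation, the idea can be sketched in C roughly like this (a minimal
sketch, not the patch below; it assumes the kernel's pmd_val()/__pmd()
helpers and the _PAGE_RW flag, and sketch_clone_pmd() is a hypothetical
name -- the real interface used below is pti_clone_pmds()):

/*
 * Hedged sketch: copy one kernel PMD entry into the user page table
 * while masking off the bits passed in 'clear' (here _PAGE_RW), so the
 * user-visible alias of the entry text ends up read-only + executable.
 */
static void sketch_clone_pmd(pmd_t *user_pmd, pmd_t kernel_pmd,
			     unsigned long clear)
{
	*user_pmd = __pmd(pmd_val(kernel_pmd) & ~clear);
}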
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Andy Lutomirski <luto(a)kernel.org>
Cc: Boris Ostrovsky <boris.ostrovsky(a)oracle.com>
Cc: Borislav Petkov <bp(a)alien8.de>
Cc: Brian Gerst <brgerst(a)gmail.com>
Cc: Dave Hansen <dave.hansen(a)linux.intel.com>
Cc: David Laight <David.Laight(a)aculab.com>
Cc: Denys Vlasenko <dvlasenk(a)redhat.com>
Cc: Eduardo Valentin <eduval(a)amazon.com>
Cc: Greg KH <gregkh(a)linuxfoundation.org>
Cc: H. Peter Anvin <hpa(a)zytor.com>
Cc: Josh Poimboeuf <jpoimboe(a)redhat.com>
Cc: Juergen Gross <jgross(a)suse.com>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Will Deacon <will.deacon(a)arm.com>
Cc: aliguori(a)amazon.com
Cc: daniel.gruss(a)iaik.tugraz.at
Cc: hughd(a)google.com
Cc: keescook(a)google.com
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/mm/pti.c | 10 ++++++++++
1 file changed, 10 insertions(+)
--- a/arch/x86/mm/pti.c
+++ b/arch/x86/mm/pti.c
@@ -288,6 +288,15 @@ static void __init pti_clone_user_shared
}
/*
+ * Clone the populated PMDs of the entry and irqentry text and force it RO.
+ */
+static void __init pti_clone_entry_text(void)
+{
+ pti_clone_pmds((unsigned long) __entry_text_start,
+ (unsigned long) __irqentry_text_end, _PAGE_RW);
+}
+
+/*
* Initialize kernel page table isolation
*/
void __init pti_init(void)
@@ -298,4 +307,5 @@ void __init pti_init(void)
pr_info("enabled\n");
pti_clone_user_shared();
+ pti_clone_entry_text();
}
Patches currently in stable-queue which might be from tglx(a)linutronix.de are
queue-4.14/x86-mm-pti-allow-nx-poison-to-be-set-in-p4d-pgd.patch
queue-4.14/x86-dumpstack-indicate-in-oops-whether-pti-is-configured-and-enabled.patch
queue-4.14/x86-mm-clarify-the-whole-asid-kernel-pcid-user-pcid-naming.patch
queue-4.14/x86-mm-abstract-switching-cr3.patch
queue-4.14/x86-mm-optimize-restore_cr3.patch
queue-4.14/x86-mm-pti-force-entry-through-trampoline-when-pti-active.patch
queue-4.14/x86-entry-align-entry-text-section-to-pmd-boundary.patch
queue-4.14/x86-mm-64-make-a-full-pgd-entry-size-hole-in-the-memory-map.patch
queue-4.14/x86-cpu_entry_area-add-debugstore-entries-to-cpu_entry_area.patch
queue-4.14/x86-mm-pti-share-entry-text-pmd.patch
queue-4.14/x86-ldt-make-the-ldt-mapping-ro.patch
queue-4.14/x86-mm-dump_pagetables-check-user-space-page-table-for-wx-pages.patch
queue-4.14/x86-pti-map-the-vsyscall-page-if-needed.patch
queue-4.14/x86-mm-pti-disable-global-pages-if-page_table_isolation-y.patch
queue-4.14/x86-mm-dump_pagetables-allow-dumping-current-pagetables.patch
queue-4.14/x86-mm-pti-add-infrastructure-for-page-table-isolation.patch
queue-4.14/x86-cpufeatures-add-x86_bug_cpu_insecure.patch
queue-4.14/x86-mm-use-fix-pcid-to-optimize-user-kernel-switches.patch
queue-4.14/x86-mm-use-invpcid-for-__native_flush_tlb_single.patch
queue-4.14/x86-mm-pti-prepare-the-x86-entry-assembly-code-for-entry-exit-cr3-switching.patch
queue-4.14/x86-mm-allow-flushing-for-future-asid-switches.patch
queue-4.14/x86-mm-pti-add-kconfig.patch
queue-4.14/x86-mm-pti-add-mapping-helper-functions.patch
queue-4.14/x86-mm-pti-map-espfix-into-user-space.patch
queue-4.14/x86-mm-pti-share-cpu_entry_area-with-user-space-page-tables.patch
queue-4.14/x86-mm-pti-add-functions-to-clone-kernel-pmds.patch
queue-4.14/x86-pti-add-the-pti-cmdline-option-and-documentation.patch
queue-4.14/x86-events-intel-ds-map-debug-buffers-in-cpu_entry_area.patch
queue-4.14/x86-mm-pti-allocate-a-separate-user-pgd.patch
queue-4.14/x86-pti-put-the-ldt-in-its-own-pgd-if-pti-is-on.patch
queue-4.14/x86-mm-pti-populate-user-pgd.patch
queue-4.14/x86-mm-dump_pagetables-add-page-table-directory-to-the-debugfs-vfs-hierarchy.patch
This is a note to let you know that I've just added the patch titled
x86/mm/pti: Share cpu_entry_area with user space page tables
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-mm-pti-share-cpu_entry_area-with-user-space-page-tables.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From f7cfbee91559ca7e3e961a00ffac921208a115ad Mon Sep 17 00:00:00 2001
From: Andy Lutomirski <luto(a)kernel.org>
Date: Mon, 4 Dec 2017 15:07:45 +0100
Subject: x86/mm/pti: Share cpu_entry_area with user space page tables
From: Andy Lutomirski <luto(a)kernel.org>
commit f7cfbee91559ca7e3e961a00ffac921208a115ad upstream.
Share the cpu entry area so the user space and kernel space page tables
have the same P4D page.
Signed-off-by: Andy Lutomirski <luto(a)kernel.org>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Boris Ostrovsky <boris.ostrovsky(a)oracle.com>
Cc: Borislav Petkov <bp(a)alien8.de>
Cc: Brian Gerst <brgerst(a)gmail.com>
Cc: Dave Hansen <dave.hansen(a)linux.intel.com>
Cc: David Laight <David.Laight(a)aculab.com>
Cc: Denys Vlasenko <dvlasenk(a)redhat.com>
Cc: Eduardo Valentin <eduval(a)amazon.com>
Cc: Greg KH <gregkh(a)linuxfoundation.org>
Cc: H. Peter Anvin <hpa(a)zytor.com>
Cc: Josh Poimboeuf <jpoimboe(a)redhat.com>
Cc: Juergen Gross <jgross(a)suse.com>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Will Deacon <will.deacon(a)arm.com>
Cc: aliguori(a)amazon.com
Cc: daniel.gruss(a)iaik.tugraz.at
Cc: hughd(a)google.com
Cc: keescook(a)google.com
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/mm/pti.c | 25 +++++++++++++++++++++++++
1 file changed, 25 insertions(+)
--- a/arch/x86/mm/pti.c
+++ b/arch/x86/mm/pti.c
@@ -265,6 +265,29 @@ pti_clone_pmds(unsigned long start, unsi
}
/*
+ * Clone a single p4d (i.e. a top-level entry on 4-level systems and a
+ * next-level entry on 5-level systems).
+ */
+static void __init pti_clone_p4d(unsigned long addr)
+{
+ p4d_t *kernel_p4d, *user_p4d;
+ pgd_t *kernel_pgd;
+
+ user_p4d = pti_user_pagetable_walk_p4d(addr);
+ kernel_pgd = pgd_offset_k(addr);
+ kernel_p4d = p4d_offset(kernel_pgd, addr);
+ *user_p4d = *kernel_p4d;
+}
+
+/*
+ * Clone the CPU_ENTRY_AREA into the user space visible page table.
+ */
+static void __init pti_clone_user_shared(void)
+{
+ pti_clone_p4d(CPU_ENTRY_AREA_BASE);
+}
+
+/*
* Initialize kernel page table isolation
*/
void __init pti_init(void)
@@ -273,4 +296,6 @@ void __init pti_init(void)
return;
pr_info("enabled\n");
+
+ pti_clone_user_shared();
}
Patches currently in stable-queue which might be from luto(a)kernel.org are
queue-4.14/x86-mm-pti-allow-nx-poison-to-be-set-in-p4d-pgd.patch
queue-4.14/x86-dumpstack-indicate-in-oops-whether-pti-is-configured-and-enabled.patch
queue-4.14/x86-mm-clarify-the-whole-asid-kernel-pcid-user-pcid-naming.patch
queue-4.14/x86-mm-abstract-switching-cr3.patch
queue-4.14/x86-mm-optimize-restore_cr3.patch
queue-4.14/x86-mm-pti-force-entry-through-trampoline-when-pti-active.patch
queue-4.14/x86-entry-align-entry-text-section-to-pmd-boundary.patch
queue-4.14/x86-mm-64-make-a-full-pgd-entry-size-hole-in-the-memory-map.patch
queue-4.14/x86-cpu_entry_area-add-debugstore-entries-to-cpu_entry_area.patch
queue-4.14/x86-mm-pti-share-entry-text-pmd.patch
queue-4.14/x86-ldt-make-the-ldt-mapping-ro.patch
queue-4.14/x86-mm-dump_pagetables-check-user-space-page-table-for-wx-pages.patch
queue-4.14/x86-pti-map-the-vsyscall-page-if-needed.patch
queue-4.14/x86-mm-pti-disable-global-pages-if-page_table_isolation-y.patch
queue-4.14/x86-mm-dump_pagetables-allow-dumping-current-pagetables.patch
queue-4.14/x86-mm-pti-add-infrastructure-for-page-table-isolation.patch
queue-4.14/x86-cpufeatures-add-x86_bug_cpu_insecure.patch
queue-4.14/x86-mm-use-fix-pcid-to-optimize-user-kernel-switches.patch
queue-4.14/x86-mm-use-invpcid-for-__native_flush_tlb_single.patch
queue-4.14/x86-mm-pti-prepare-the-x86-entry-assembly-code-for-entry-exit-cr3-switching.patch
queue-4.14/x86-mm-allow-flushing-for-future-asid-switches.patch
queue-4.14/x86-mm-pti-add-kconfig.patch
queue-4.14/x86-mm-pti-add-mapping-helper-functions.patch
queue-4.14/x86-mm-pti-map-espfix-into-user-space.patch
queue-4.14/x86-mm-pti-share-cpu_entry_area-with-user-space-page-tables.patch
queue-4.14/x86-mm-pti-add-functions-to-clone-kernel-pmds.patch
queue-4.14/x86-pti-add-the-pti-cmdline-option-and-documentation.patch
queue-4.14/x86-events-intel-ds-map-debug-buffers-in-cpu_entry_area.patch
queue-4.14/x86-mm-pti-allocate-a-separate-user-pgd.patch
queue-4.14/x86-pti-put-the-ldt-in-its-own-pgd-if-pti-is-on.patch
queue-4.14/x86-mm-pti-populate-user-pgd.patch
queue-4.14/x86-mm-dump_pagetables-add-page-table-directory-to-the-debugfs-vfs-hierarchy.patch
This is a note to let you know that I've just added the patch titled
x86/mm/pti: Prepare the x86/entry assembly code for entry/exit CR3 switching
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-mm-pti-prepare-the-x86-entry-assembly-code-for-entry-exit-cr3-switching.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 8a09317b895f073977346779df52f67c1056d81d Mon Sep 17 00:00:00 2001
From: Dave Hansen <dave.hansen(a)linux.intel.com>
Date: Mon, 4 Dec 2017 15:07:35 +0100
Subject: x86/mm/pti: Prepare the x86/entry assembly code for entry/exit CR3 switching
From: Dave Hansen <dave.hansen(a)linux.intel.com>
commit 8a09317b895f073977346779df52f67c1056d81d upstream.
PAGE_TABLE_ISOLATION needs to switch to a different CR3 value when it
enters the kernel and switch back when it exits. This essentially needs to
be done before leaving assembly code.
This is extra challenging because the switching context is tricky: the
registers that can be clobbered can vary. It is also hard to store things
on the stack because either there is an established ABI (ptregs) or the
stack is entirely unsafe to use.
Establish a set of macros that allow changing to the user and kernel CR3
values.
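Roughly, what these macros boil down to can be sketched in C (illustration
only, assuming the 8k PGD layout used by this series; the real switching is
done by the assembly macros added below, and the sketch_* names are
hypothetical):

/* With PTI the kernel and user PGDs are the two 4k halves of one 8k,
 * 8k-aligned block, so moving between the two address spaces is just
 * toggling bit 12 (PAGE_SHIFT) of the CR3 value. */
#define SKETCH_PTI_SWITCH_MASK	(1UL << 12)

static unsigned long sketch_kernel_cr3(unsigned long cr3)
{
	return cr3 & ~SKETCH_PTI_SWITCH_MASK;	/* clear bit 12: kernel half */
}

static unsigned long sketch_user_cr3(unsigned long cr3)
{
	return cr3 | SKETCH_PTI_SWITCH_MASK;	/* set bit 12: user half */
}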
Interactions with SWAPGS:
Previous versions of the PAGE_TABLE_ISOLATION code relied on having
per-CPU scratch space to save/restore a register that can be used for the
CR3 MOV. The %GS register is used to index into our per-CPU space, so
SWAPGS *had* to be done before the CR3 switch. That scratch space is gone
now, but the semantic that SWAPGS must be done before the CR3 MOV is
retained. This is good to keep because it is not that hard to do, and it
allows us to do things like adding per-CPU debugging information.
What this does in the NMI code is worth pointing out. NMIs can interrupt
*any* context and they can also be nested with NMIs interrupting other
NMIs. The comments below ".Lnmi_from_kernel" explain the format of the
stack during this situation. Changing the format of this stack is hard.
Instead of storing the old CR3 value on the stack, this depends on the
*regular* register save/restore mechanism and then uses %r14 to keep CR3
during the NMI. It is callee-saved and will not be clobbered by the C NMI
handlers that get called.
[ PeterZ: ESPFIX optimization ]
Based-on-code-from: Andy Lutomirski <luto(a)kernel.org>
Signed-off-by: Dave Hansen <dave.hansen(a)linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Reviewed-by: Borislav Petkov <bp(a)suse.de>
Reviewed-by: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Andy Lutomirski <luto(a)kernel.org>
Cc: Boris Ostrovsky <boris.ostrovsky(a)oracle.com>
Cc: Borislav Petkov <bp(a)alien8.de>
Cc: Brian Gerst <brgerst(a)gmail.com>
Cc: David Laight <David.Laight(a)aculab.com>
Cc: Denys Vlasenko <dvlasenk(a)redhat.com>
Cc: Eduardo Valentin <eduval(a)amazon.com>
Cc: Greg KH <gregkh(a)linuxfoundation.org>
Cc: H. Peter Anvin <hpa(a)zytor.com>
Cc: Josh Poimboeuf <jpoimboe(a)redhat.com>
Cc: Juergen Gross <jgross(a)suse.com>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Will Deacon <will.deacon(a)arm.com>
Cc: aliguori(a)amazon.com
Cc: daniel.gruss(a)iaik.tugraz.at
Cc: hughd(a)google.com
Cc: keescook(a)google.com
Cc: linux-mm(a)kvack.org
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/entry/calling.h | 66 +++++++++++++++++++++++++++++++++++++++
arch/x86/entry/entry_64.S | 45 +++++++++++++++++++++++---
arch/x86/entry/entry_64_compat.S | 24 +++++++++++++-
3 files changed, 128 insertions(+), 7 deletions(-)
--- a/arch/x86/entry/calling.h
+++ b/arch/x86/entry/calling.h
@@ -1,6 +1,8 @@
/* SPDX-License-Identifier: GPL-2.0 */
#include <linux/jump_label.h>
#include <asm/unwind_hints.h>
+#include <asm/cpufeatures.h>
+#include <asm/page_types.h>
/*
@@ -187,6 +189,70 @@ For 32-bit we have the following convent
#endif
.endm
+#ifdef CONFIG_PAGE_TABLE_ISOLATION
+
+/* PAGE_TABLE_ISOLATION PGDs are 8k. Flip bit 12 to switch between the two halves: */
+#define PTI_SWITCH_MASK (1<<PAGE_SHIFT)
+
+.macro ADJUST_KERNEL_CR3 reg:req
+ /* Clear "PAGE_TABLE_ISOLATION bit", point CR3 at kernel pagetables: */
+ andq $(~PTI_SWITCH_MASK), \reg
+.endm
+
+.macro ADJUST_USER_CR3 reg:req
+ /* Move CR3 up a page to the user page tables: */
+ orq $(PTI_SWITCH_MASK), \reg
+.endm
+
+.macro SWITCH_TO_KERNEL_CR3 scratch_reg:req
+ mov %cr3, \scratch_reg
+ ADJUST_KERNEL_CR3 \scratch_reg
+ mov \scratch_reg, %cr3
+.endm
+
+.macro SWITCH_TO_USER_CR3 scratch_reg:req
+ mov %cr3, \scratch_reg
+ ADJUST_USER_CR3 \scratch_reg
+ mov \scratch_reg, %cr3
+.endm
+
+.macro SAVE_AND_SWITCH_TO_KERNEL_CR3 scratch_reg:req save_reg:req
+ movq %cr3, \scratch_reg
+ movq \scratch_reg, \save_reg
+ /*
+ * Is the switch bit zero? If so, CR3 already points at the
+ * kernel page tables and no switch is needed.
+ */
+ testq $(PTI_SWITCH_MASK), \scratch_reg
+ jz .Ldone_\@
+
+ ADJUST_KERNEL_CR3 \scratch_reg
+ movq \scratch_reg, %cr3
+
+.Ldone_\@:
+.endm
+
+.macro RESTORE_CR3 save_reg:req
+ /*
+ * The CR3 write could be avoided when not changing its value,
+ * but would require a CR3 read *and* a scratch register.
+ */
+ movq \save_reg, %cr3
+.endm
+
+#else /* CONFIG_PAGE_TABLE_ISOLATION=n: */
+
+.macro SWITCH_TO_KERNEL_CR3 scratch_reg:req
+.endm
+.macro SWITCH_TO_USER_CR3 scratch_reg:req
+.endm
+.macro SAVE_AND_SWITCH_TO_KERNEL_CR3 scratch_reg:req save_reg:req
+.endm
+.macro RESTORE_CR3 save_reg:req
+.endm
+
+#endif
+
#endif /* CONFIG_X86_64 */
/*
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -164,6 +164,9 @@ ENTRY(entry_SYSCALL_64_trampoline)
/* Stash the user RSP. */
movq %rsp, RSP_SCRATCH
+ /* Note: using %rsp as a scratch reg. */
+ SWITCH_TO_KERNEL_CR3 scratch_reg=%rsp
+
/* Load the top of the task stack into RSP */
movq CPU_ENTRY_AREA_tss + TSS_sp1 + CPU_ENTRY_AREA, %rsp
@@ -203,6 +206,10 @@ ENTRY(entry_SYSCALL_64)
*/
swapgs
+ /*
+ * This path is not taken when PAGE_TABLE_ISOLATION is disabled so it
+ * is not required to switch CR3.
+ */
movq %rsp, PER_CPU_VAR(rsp_scratch)
movq PER_CPU_VAR(cpu_current_top_of_stack), %rsp
@@ -399,6 +406,7 @@ syscall_return_via_sysret:
* We are on the trampoline stack. All regs except RDI are live.
* We can do future final exit work right here.
*/
+ SWITCH_TO_USER_CR3 scratch_reg=%rdi
popq %rdi
popq %rsp
@@ -736,6 +744,8 @@ GLOBAL(swapgs_restore_regs_and_return_to
* We can do future final exit work right here.
*/
+ SWITCH_TO_USER_CR3 scratch_reg=%rdi
+
/* Restore RDI. */
popq %rdi
SWAPGS
@@ -818,7 +828,9 @@ native_irq_return_ldt:
*/
pushq %rdi /* Stash user RDI */
- SWAPGS
+ SWAPGS /* to kernel GS */
+ SWITCH_TO_KERNEL_CR3 scratch_reg=%rdi /* to kernel CR3 */
+
movq PER_CPU_VAR(espfix_waddr), %rdi
movq %rax, (0*8)(%rdi) /* user RAX */
movq (1*8)(%rsp), %rax /* user RIP */
@@ -834,7 +846,6 @@ native_irq_return_ldt:
/* Now RAX == RSP. */
andl $0xffff0000, %eax /* RAX = (RSP & 0xffff0000) */
- popq %rdi /* Restore user RDI */
/*
* espfix_stack[31:16] == 0. The page tables are set up such that
@@ -845,7 +856,11 @@ native_irq_return_ldt:
* still points to an RO alias of the ESPFIX stack.
*/
orq PER_CPU_VAR(espfix_stack), %rax
- SWAPGS
+
+ SWITCH_TO_USER_CR3 scratch_reg=%rdi /* to user CR3 */
+ SWAPGS /* to user GS */
+ popq %rdi /* Restore user RDI */
+
movq %rax, %rsp
UNWIND_HINT_IRET_REGS offset=8
@@ -945,6 +960,8 @@ ENTRY(switch_to_thread_stack)
UNWIND_HINT_FUNC
pushq %rdi
+ /* Need to switch before accessing the thread stack. */
+ SWITCH_TO_KERNEL_CR3 scratch_reg=%rdi
movq %rsp, %rdi
movq PER_CPU_VAR(cpu_current_top_of_stack), %rsp
UNWIND_HINT sp_offset=16 sp_reg=ORC_REG_DI
@@ -1244,7 +1261,11 @@ ENTRY(paranoid_entry)
js 1f /* negative -> in kernel */
SWAPGS
xorl %ebx, %ebx
-1: ret
+
+1:
+ SAVE_AND_SWITCH_TO_KERNEL_CR3 scratch_reg=%rax save_reg=%r14
+
+ ret
END(paranoid_entry)
/*
@@ -1266,6 +1287,7 @@ ENTRY(paranoid_exit)
testl %ebx, %ebx /* swapgs needed? */
jnz .Lparanoid_exit_no_swapgs
TRACE_IRQS_IRETQ
+ RESTORE_CR3 save_reg=%r14
SWAPGS_UNSAFE_STACK
jmp .Lparanoid_exit_restore
.Lparanoid_exit_no_swapgs:
@@ -1293,6 +1315,8 @@ ENTRY(error_entry)
* from user mode due to an IRET fault.
*/
SWAPGS
+ /* We have user CR3. Change to kernel CR3. */
+ SWITCH_TO_KERNEL_CR3 scratch_reg=%rax
.Lerror_entry_from_usermode_after_swapgs:
/* Put us onto the real thread stack. */
@@ -1339,6 +1363,7 @@ ENTRY(error_entry)
* .Lgs_change's error handler with kernel gsbase.
*/
SWAPGS
+ SWITCH_TO_KERNEL_CR3 scratch_reg=%rax
jmp .Lerror_entry_done
.Lbstep_iret:
@@ -1348,10 +1373,11 @@ ENTRY(error_entry)
.Lerror_bad_iret:
/*
- * We came from an IRET to user mode, so we have user gsbase.
- * Switch to kernel gsbase:
+ * We came from an IRET to user mode, so we have user
+ * gsbase and CR3. Switch to kernel gsbase and CR3:
*/
SWAPGS
+ SWITCH_TO_KERNEL_CR3 scratch_reg=%rax
/*
* Pretend that the exception came from user mode: set up pt_regs
@@ -1383,6 +1409,10 @@ END(error_exit)
/*
* Runs on exception stack. Xen PV does not go through this path at all,
* so we can use real assembly here.
+ *
+ * Registers:
+ * %r14: Used to save/restore the CR3 of the interrupted context
+ * when PAGE_TABLE_ISOLATION is in use. Do not clobber.
*/
ENTRY(nmi)
UNWIND_HINT_IRET_REGS
@@ -1446,6 +1476,7 @@ ENTRY(nmi)
swapgs
cld
+ SWITCH_TO_KERNEL_CR3 scratch_reg=%rdx
movq %rsp, %rdx
movq PER_CPU_VAR(cpu_current_top_of_stack), %rsp
UNWIND_HINT_IRET_REGS base=%rdx offset=8
@@ -1698,6 +1729,8 @@ end_repeat_nmi:
movq $-1, %rsi
call do_nmi
+ RESTORE_CR3 save_reg=%r14
+
testl %ebx, %ebx /* swapgs needed? */
jnz nmi_restore
nmi_swapgs:
--- a/arch/x86/entry/entry_64_compat.S
+++ b/arch/x86/entry/entry_64_compat.S
@@ -49,6 +49,10 @@
ENTRY(entry_SYSENTER_compat)
/* Interrupts are off on entry. */
SWAPGS
+
+ /* We are about to clobber %rsp anyway, clobbering here is OK */
+ SWITCH_TO_KERNEL_CR3 scratch_reg=%rsp
+
movq PER_CPU_VAR(cpu_current_top_of_stack), %rsp
/*
@@ -216,6 +220,12 @@ GLOBAL(entry_SYSCALL_compat_after_hwfram
pushq $0 /* pt_regs->r15 = 0 */
/*
+ * We just saved %rdi so it is safe to clobber. It is not
+ * preserved during the C calls inside TRACE_IRQS_OFF anyway.
+ */
+ SWITCH_TO_KERNEL_CR3 scratch_reg=%rdi
+
+ /*
* User mode is traced as though IRQs are on, and SYSENTER
* turned them off.
*/
@@ -256,10 +266,22 @@ sysret32_from_system_call:
* when the system call started, which is already known to user
* code. We zero R8-R10 to avoid info leaks.
*/
+ movq RSP-ORIG_RAX(%rsp), %rsp
+
+ /*
+ * The original userspace %rsp (RSP-ORIG_RAX(%rsp)) is stored
+ * on the process stack which is not mapped to userspace and
+ * not readable after we SWITCH_TO_USER_CR3. Delay the CR3
+ * switch until after the last reference to the process
+ * stack.
+ *
+ * %r8 is zeroed before the sysret, thus safe to clobber.
+ */
+ SWITCH_TO_USER_CR3 scratch_reg=%r8
+
xorq %r8, %r8
xorq %r9, %r9
xorq %r10, %r10
- movq RSP-ORIG_RAX(%rsp), %rsp
swapgs
sysretl
END(entry_SYSCALL_compat)
Patches currently in stable-queue which might be from dave.hansen(a)linux.intel.com are
queue-4.14/x86-mm-pti-allow-nx-poison-to-be-set-in-p4d-pgd.patch
queue-4.14/x86-dumpstack-indicate-in-oops-whether-pti-is-configured-and-enabled.patch
queue-4.14/x86-mm-clarify-the-whole-asid-kernel-pcid-user-pcid-naming.patch
queue-4.14/x86-mm-abstract-switching-cr3.patch
queue-4.14/x86-mm-optimize-restore_cr3.patch
queue-4.14/x86-mm-pti-force-entry-through-trampoline-when-pti-active.patch
queue-4.14/x86-entry-align-entry-text-section-to-pmd-boundary.patch
queue-4.14/x86-mm-64-make-a-full-pgd-entry-size-hole-in-the-memory-map.patch
queue-4.14/x86-cpu_entry_area-add-debugstore-entries-to-cpu_entry_area.patch
queue-4.14/x86-mm-pti-share-entry-text-pmd.patch
queue-4.14/x86-ldt-make-the-ldt-mapping-ro.patch
queue-4.14/x86-mm-dump_pagetables-check-user-space-page-table-for-wx-pages.patch
queue-4.14/x86-pti-map-the-vsyscall-page-if-needed.patch
queue-4.14/x86-mm-pti-disable-global-pages-if-page_table_isolation-y.patch
queue-4.14/x86-mm-dump_pagetables-allow-dumping-current-pagetables.patch
queue-4.14/x86-mm-pti-add-infrastructure-for-page-table-isolation.patch
queue-4.14/x86-cpufeatures-add-x86_bug_cpu_insecure.patch
queue-4.14/x86-mm-use-fix-pcid-to-optimize-user-kernel-switches.patch
queue-4.14/x86-mm-use-invpcid-for-__native_flush_tlb_single.patch
queue-4.14/x86-mm-pti-prepare-the-x86-entry-assembly-code-for-entry-exit-cr3-switching.patch
queue-4.14/x86-mm-allow-flushing-for-future-asid-switches.patch
queue-4.14/x86-mm-pti-add-kconfig.patch
queue-4.14/x86-mm-pti-add-mapping-helper-functions.patch
queue-4.14/x86-mm-pti-map-espfix-into-user-space.patch
queue-4.14/x86-mm-pti-share-cpu_entry_area-with-user-space-page-tables.patch
queue-4.14/x86-mm-pti-add-functions-to-clone-kernel-pmds.patch
queue-4.14/x86-pti-add-the-pti-cmdline-option-and-documentation.patch
queue-4.14/x86-events-intel-ds-map-debug-buffers-in-cpu_entry_area.patch
queue-4.14/x86-mm-pti-allocate-a-separate-user-pgd.patch
queue-4.14/x86-pti-put-the-ldt-in-its-own-pgd-if-pti-is-on.patch
queue-4.14/x86-mm-pti-populate-user-pgd.patch
queue-4.14/x86-mm-dump_pagetables-add-page-table-directory-to-the-debugfs-vfs-hierarchy.patch
This is a note to let you know that I've just added the patch titled
x86/mm/pti: Populate user PGD
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-mm-pti-populate-user-pgd.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From fc2fbc8512ed08d1de7720936fd7d2e4ce02c3a2 Mon Sep 17 00:00:00 2001
From: Dave Hansen <dave.hansen(a)linux.intel.com>
Date: Mon, 4 Dec 2017 15:07:40 +0100
Subject: x86/mm/pti: Populate user PGD
From: Dave Hansen <dave.hansen(a)linux.intel.com>
commit fc2fbc8512ed08d1de7720936fd7d2e4ce02c3a2 upstream.
In clone_pgd_range() copy the init user PGDs which cover the kernel half of
the address space, so a process has all the required kernel mappings
visible.
[ tglx: Split out from the big kaiser dump and folded Andys simplification ]
Signed-off-by: Dave Hansen <dave.hansen(a)linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Reviewed-by: Borislav Petkov <bp(a)suse.de>
Cc: Andy Lutomirski <luto(a)kernel.org>
Cc: Boris Ostrovsky <boris.ostrovsky(a)oracle.com>
Cc: Borislav Petkov <bp(a)alien8.de>
Cc: Brian Gerst <brgerst(a)gmail.com>
Cc: David Laight <David.Laight(a)aculab.com>
Cc: Denys Vlasenko <dvlasenk(a)redhat.com>
Cc: Eduardo Valentin <eduval(a)amazon.com>
Cc: Greg KH <gregkh(a)linuxfoundation.org>
Cc: H. Peter Anvin <hpa(a)zytor.com>
Cc: Josh Poimboeuf <jpoimboe(a)redhat.com>
Cc: Juergen Gross <jgross(a)suse.com>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Will Deacon <will.deacon(a)arm.com>
Cc: aliguori(a)amazon.com
Cc: daniel.gruss(a)iaik.tugraz.at
Cc: hughd(a)google.com
Cc: keescook(a)google.com
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/include/asm/pgtable.h | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -1125,7 +1125,14 @@ static inline int pud_write(pud_t pud)
*/
static inline void clone_pgd_range(pgd_t *dst, pgd_t *src, int count)
{
- memcpy(dst, src, count * sizeof(pgd_t));
+ memcpy(dst, src, count * sizeof(pgd_t));
+#ifdef CONFIG_PAGE_TABLE_ISOLATION
+ if (!static_cpu_has(X86_FEATURE_PTI))
+ return;
+ /* Clone the user space pgd as well */
+ memcpy(kernel_to_user_pgdp(dst), kernel_to_user_pgdp(src),
+ count * sizeof(pgd_t));
+#endif
}
#define PTE_SHIFT ilog2(PTRS_PER_PTE)
Patches currently in stable-queue which might be from dave.hansen(a)linux.intel.com are
queue-4.14/x86-mm-pti-allow-nx-poison-to-be-set-in-p4d-pgd.patch
queue-4.14/x86-dumpstack-indicate-in-oops-whether-pti-is-configured-and-enabled.patch
queue-4.14/x86-mm-clarify-the-whole-asid-kernel-pcid-user-pcid-naming.patch
queue-4.14/x86-mm-abstract-switching-cr3.patch
queue-4.14/x86-mm-optimize-restore_cr3.patch
queue-4.14/x86-mm-pti-force-entry-through-trampoline-when-pti-active.patch
queue-4.14/x86-entry-align-entry-text-section-to-pmd-boundary.patch
queue-4.14/x86-mm-64-make-a-full-pgd-entry-size-hole-in-the-memory-map.patch
queue-4.14/x86-cpu_entry_area-add-debugstore-entries-to-cpu_entry_area.patch
queue-4.14/x86-mm-pti-share-entry-text-pmd.patch
queue-4.14/x86-ldt-make-the-ldt-mapping-ro.patch
queue-4.14/x86-mm-dump_pagetables-check-user-space-page-table-for-wx-pages.patch
queue-4.14/x86-pti-map-the-vsyscall-page-if-needed.patch
queue-4.14/x86-mm-pti-disable-global-pages-if-page_table_isolation-y.patch
queue-4.14/x86-mm-dump_pagetables-allow-dumping-current-pagetables.patch
queue-4.14/x86-mm-pti-add-infrastructure-for-page-table-isolation.patch
queue-4.14/x86-cpufeatures-add-x86_bug_cpu_insecure.patch
queue-4.14/x86-mm-use-fix-pcid-to-optimize-user-kernel-switches.patch
queue-4.14/x86-mm-use-invpcid-for-__native_flush_tlb_single.patch
queue-4.14/x86-mm-pti-prepare-the-x86-entry-assembly-code-for-entry-exit-cr3-switching.patch
queue-4.14/x86-mm-allow-flushing-for-future-asid-switches.patch
queue-4.14/x86-mm-pti-add-kconfig.patch
queue-4.14/x86-mm-pti-add-mapping-helper-functions.patch
queue-4.14/x86-mm-pti-map-espfix-into-user-space.patch
queue-4.14/x86-mm-pti-share-cpu_entry_area-with-user-space-page-tables.patch
queue-4.14/x86-mm-pti-add-functions-to-clone-kernel-pmds.patch
queue-4.14/x86-pti-add-the-pti-cmdline-option-and-documentation.patch
queue-4.14/x86-events-intel-ds-map-debug-buffers-in-cpu_entry_area.patch
queue-4.14/x86-mm-pti-allocate-a-separate-user-pgd.patch
queue-4.14/x86-pti-put-the-ldt-in-its-own-pgd-if-pti-is-on.patch
queue-4.14/x86-mm-pti-populate-user-pgd.patch
queue-4.14/x86-mm-dump_pagetables-add-page-table-directory-to-the-debugfs-vfs-hierarchy.patch
This is a note to let you know that I've just added the patch titled
x86/mm/pti: Map ESPFIX into user space
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-mm-pti-map-espfix-into-user-space.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 4b6bbe95b87966ba08999574db65c93c5e925a36 Mon Sep 17 00:00:00 2001
From: Andy Lutomirski <luto(a)kernel.org>
Date: Fri, 15 Dec 2017 22:08:18 +0100
Subject: x86/mm/pti: Map ESPFIX into user space
From: Andy Lutomirski <luto(a)kernel.org>
commit 4b6bbe95b87966ba08999574db65c93c5e925a36 upstream.
Map the ESPFIX pages into user space when PTI is enabled.
Signed-off-by: Andy Lutomirski <luto(a)kernel.org>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Borislav Petkov <bp(a)alien8.de>
Cc: Brian Gerst <brgerst(a)gmail.com>
Cc: Dave Hansen <dave.hansen(a)linux.intel.com>
Cc: David Laight <David.Laight(a)aculab.com>
Cc: H. Peter Anvin <hpa(a)zytor.com>
Cc: Josh Poimboeuf <jpoimboe(a)redhat.com>
Cc: Juergen Gross <jgross(a)suse.com>
Cc: Kees Cook <keescook(a)chromium.org>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/mm/pti.c | 11 +++++++++++
1 file changed, 11 insertions(+)
--- a/arch/x86/mm/pti.c
+++ b/arch/x86/mm/pti.c
@@ -288,6 +288,16 @@ static void __init pti_clone_user_shared
}
/*
+ * Clone the ESPFIX P4D into the user space visible page table
+ */
+static void __init pti_setup_espfix64(void)
+{
+#ifdef CONFIG_X86_ESPFIX64
+ pti_clone_p4d(ESPFIX_BASE_ADDR);
+#endif
+}
+
+/*
* Clone the populated PMDs of the entry and irqentry text and force it RO.
*/
static void __init pti_clone_entry_text(void)
@@ -308,4 +318,5 @@ void __init pti_init(void)
pti_clone_user_shared();
pti_clone_entry_text();
+ pti_setup_espfix64();
}
Patches currently in stable-queue which might be from luto(a)kernel.org are
queue-4.14/x86-mm-pti-allow-nx-poison-to-be-set-in-p4d-pgd.patch
queue-4.14/x86-dumpstack-indicate-in-oops-whether-pti-is-configured-and-enabled.patch
queue-4.14/x86-mm-clarify-the-whole-asid-kernel-pcid-user-pcid-naming.patch
queue-4.14/x86-mm-abstract-switching-cr3.patch
queue-4.14/x86-mm-optimize-restore_cr3.patch
queue-4.14/x86-mm-pti-force-entry-through-trampoline-when-pti-active.patch
queue-4.14/x86-entry-align-entry-text-section-to-pmd-boundary.patch
queue-4.14/x86-mm-64-make-a-full-pgd-entry-size-hole-in-the-memory-map.patch
queue-4.14/x86-cpu_entry_area-add-debugstore-entries-to-cpu_entry_area.patch
queue-4.14/x86-mm-pti-share-entry-text-pmd.patch
queue-4.14/x86-ldt-make-the-ldt-mapping-ro.patch
queue-4.14/x86-mm-dump_pagetables-check-user-space-page-table-for-wx-pages.patch
queue-4.14/x86-pti-map-the-vsyscall-page-if-needed.patch
queue-4.14/x86-mm-pti-disable-global-pages-if-page_table_isolation-y.patch
queue-4.14/x86-mm-dump_pagetables-allow-dumping-current-pagetables.patch
queue-4.14/x86-mm-pti-add-infrastructure-for-page-table-isolation.patch
queue-4.14/x86-cpufeatures-add-x86_bug_cpu_insecure.patch
queue-4.14/x86-mm-use-fix-pcid-to-optimize-user-kernel-switches.patch
queue-4.14/x86-mm-use-invpcid-for-__native_flush_tlb_single.patch
queue-4.14/x86-mm-pti-prepare-the-x86-entry-assembly-code-for-entry-exit-cr3-switching.patch
queue-4.14/x86-mm-allow-flushing-for-future-asid-switches.patch
queue-4.14/x86-mm-pti-add-kconfig.patch
queue-4.14/x86-mm-pti-add-mapping-helper-functions.patch
queue-4.14/x86-mm-pti-map-espfix-into-user-space.patch
queue-4.14/x86-mm-pti-share-cpu_entry_area-with-user-space-page-tables.patch
queue-4.14/x86-mm-pti-add-functions-to-clone-kernel-pmds.patch
queue-4.14/x86-pti-add-the-pti-cmdline-option-and-documentation.patch
queue-4.14/x86-events-intel-ds-map-debug-buffers-in-cpu_entry_area.patch
queue-4.14/x86-mm-pti-allocate-a-separate-user-pgd.patch
queue-4.14/x86-pti-put-the-ldt-in-its-own-pgd-if-pti-is-on.patch
queue-4.14/x86-mm-pti-populate-user-pgd.patch
queue-4.14/x86-mm-dump_pagetables-add-page-table-directory-to-the-debugfs-vfs-hierarchy.patch
This is a note to let you know that I've just added the patch titled
x86/mm/pti: Force entry through trampoline when PTI active
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-mm-pti-force-entry-through-trampoline-when-pti-active.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 8d4b067895791ab9fdb1aadfc505f64d71239dd2 Mon Sep 17 00:00:00 2001
From: Thomas Gleixner <tglx(a)linutronix.de>
Date: Mon, 4 Dec 2017 15:07:43 +0100
Subject: x86/mm/pti: Force entry through trampoline when PTI active
From: Thomas Gleixner <tglx(a)linutronix.de>
commit 8d4b067895791ab9fdb1aadfc505f64d71239dd2 upstream.
Force the entry through the trampoline only when PTI is active. Otherwise
go through the normal entry code.
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Reviewed-by: Borislav Petkov <bp(a)suse.de>
Cc: Andy Lutomirski <luto(a)kernel.org>
Cc: Boris Ostrovsky <boris.ostrovsky(a)oracle.com>
Cc: Borislav Petkov <bp(a)alien8.de>
Cc: Brian Gerst <brgerst(a)gmail.com>
Cc: Dave Hansen <dave.hansen(a)linux.intel.com>
Cc: David Laight <David.Laight(a)aculab.com>
Cc: Denys Vlasenko <dvlasenk(a)redhat.com>
Cc: Eduardo Valentin <eduval(a)amazon.com>
Cc: Greg KH <gregkh(a)linuxfoundation.org>
Cc: H. Peter Anvin <hpa(a)zytor.com>
Cc: Josh Poimboeuf <jpoimboe(a)redhat.com>
Cc: Juergen Gross <jgross(a)suse.com>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Will Deacon <will.deacon(a)arm.com>
Cc: aliguori(a)amazon.com
Cc: daniel.gruss(a)iaik.tugraz.at
Cc: hughd(a)google.com
Cc: keescook(a)google.com
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/kernel/cpu/common.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1339,7 +1339,10 @@ void syscall_init(void)
(entry_SYSCALL_64_trampoline - _entry_trampoline);
wrmsr(MSR_STAR, 0, (__USER32_CS << 16) | __KERNEL_CS);
- wrmsrl(MSR_LSTAR, SYSCALL64_entry_trampoline);
+ if (static_cpu_has(X86_FEATURE_PTI))
+ wrmsrl(MSR_LSTAR, SYSCALL64_entry_trampoline);
+ else
+ wrmsrl(MSR_LSTAR, (unsigned long)entry_SYSCALL_64);
#ifdef CONFIG_IA32_EMULATION
wrmsrl(MSR_CSTAR, (unsigned long)entry_SYSCALL_compat);
Patches currently in stable-queue which might be from tglx(a)linutronix.de are
queue-4.14/x86-mm-pti-allow-nx-poison-to-be-set-in-p4d-pgd.patch
queue-4.14/x86-dumpstack-indicate-in-oops-whether-pti-is-configured-and-enabled.patch
queue-4.14/x86-mm-clarify-the-whole-asid-kernel-pcid-user-pcid-naming.patch
queue-4.14/x86-mm-abstract-switching-cr3.patch
queue-4.14/x86-mm-optimize-restore_cr3.patch
queue-4.14/x86-mm-pti-force-entry-through-trampoline-when-pti-active.patch
queue-4.14/x86-entry-align-entry-text-section-to-pmd-boundary.patch
queue-4.14/x86-mm-64-make-a-full-pgd-entry-size-hole-in-the-memory-map.patch
queue-4.14/x86-cpu_entry_area-add-debugstore-entries-to-cpu_entry_area.patch
queue-4.14/x86-mm-pti-share-entry-text-pmd.patch
queue-4.14/x86-ldt-make-the-ldt-mapping-ro.patch
queue-4.14/x86-mm-dump_pagetables-check-user-space-page-table-for-wx-pages.patch
queue-4.14/x86-pti-map-the-vsyscall-page-if-needed.patch
queue-4.14/x86-mm-pti-disable-global-pages-if-page_table_isolation-y.patch
queue-4.14/x86-mm-dump_pagetables-allow-dumping-current-pagetables.patch
queue-4.14/x86-mm-pti-add-infrastructure-for-page-table-isolation.patch
queue-4.14/x86-cpufeatures-add-x86_bug_cpu_insecure.patch
queue-4.14/x86-mm-use-fix-pcid-to-optimize-user-kernel-switches.patch
queue-4.14/x86-mm-use-invpcid-for-__native_flush_tlb_single.patch
queue-4.14/x86-mm-pti-prepare-the-x86-entry-assembly-code-for-entry-exit-cr3-switching.patch
queue-4.14/x86-mm-allow-flushing-for-future-asid-switches.patch
queue-4.14/x86-mm-pti-add-kconfig.patch
queue-4.14/x86-mm-pti-add-mapping-helper-functions.patch
queue-4.14/x86-mm-pti-map-espfix-into-user-space.patch
queue-4.14/x86-mm-pti-share-cpu_entry_area-with-user-space-page-tables.patch
queue-4.14/x86-mm-pti-add-functions-to-clone-kernel-pmds.patch
queue-4.14/x86-pti-add-the-pti-cmdline-option-and-documentation.patch
queue-4.14/x86-events-intel-ds-map-debug-buffers-in-cpu_entry_area.patch
queue-4.14/x86-mm-pti-allocate-a-separate-user-pgd.patch
queue-4.14/x86-pti-put-the-ldt-in-its-own-pgd-if-pti-is-on.patch
queue-4.14/x86-mm-pti-populate-user-pgd.patch
queue-4.14/x86-mm-dump_pagetables-add-page-table-directory-to-the-debugfs-vfs-hierarchy.patch
This is a note to let you know that I've just added the patch titled
x86/mm/pti: Disable global pages if PAGE_TABLE_ISOLATION=y
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-mm-pti-disable-global-pages-if-page_table_isolation-y.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From c313ec66317d421fb5768d78c56abed2dc862264 Mon Sep 17 00:00:00 2001
From: Dave Hansen <dave.hansen(a)linux.intel.com>
Date: Mon, 4 Dec 2017 15:07:34 +0100
Subject: x86/mm/pti: Disable global pages if PAGE_TABLE_ISOLATION=y
From: Dave Hansen <dave.hansen(a)linux.intel.com>
commit c313ec66317d421fb5768d78c56abed2dc862264 upstream.
Global pages stay in the TLB across context switches. Since all contexts
share the same kernel mapping, these mappings are marked as global pages
so kernel entries in the TLB are not flushed out on a context switch.
But, even having these entries in the TLB opens up something that an
attacker can use, such as the double-page-fault attack:
http://www.ieee-security.org/TC/SP2013/papers/4977a191.pdf
That means that even when PAGE_TABLE_ISOLATION switches page tables
on return to user space the global pages would stay in the TLB cache.
Disable global pages so that kernel TLB entries can be flushed before
returning to user space. This way, all accesses to kernel addresses from
userspace result in a TLB miss independent of the existence of a kernel
mapping.
Suppress global pages via the __supported_pte_mask. The user space
mappings set PAGE_GLOBAL for the minimal kernel mappings which are
required for entry/exit. These mappings are set up manually so the
filtering does not take place.
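The mechanism behind suppressing global pages via __supported_pte_mask is
roughly the following (a hedged sketch, assuming the usual x86 pteval_t and
__supported_pte_mask declarations; the actual masking happens in the
pte/pgprot construction helpers, and sketch_filter_pte_flags() is a
hypothetical name):

/*
 * Sketch of the filtering: requested PTE flags are ANDed with
 * __supported_pte_mask when mappings are built, so once _PAGE_GLOBAL is
 * cleared from that mask the Global bit is dropped from every mapping
 * created through the normal helpers.
 */
static pteval_t sketch_filter_pte_flags(pteval_t flags)
{
	return flags & __supported_pte_mask;
}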
[ The __supported_pte_mask simplification was written by Thomas Gleixner. ]
Signed-off-by: Dave Hansen <dave.hansen(a)linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Reviewed-by: Borislav Petkov <bp(a)suse.de>
Cc: Andy Lutomirski <luto(a)kernel.org>
Cc: Boris Ostrovsky <boris.ostrovsky(a)oracle.com>
Cc: Borislav Petkov <bp(a)alien8.de>
Cc: Brian Gerst <brgerst(a)gmail.com>
Cc: David Laight <David.Laight(a)aculab.com>
Cc: Denys Vlasenko <dvlasenk(a)redhat.com>
Cc: Eduardo Valentin <eduval(a)amazon.com>
Cc: Greg KH <gregkh(a)linuxfoundation.org>
Cc: H. Peter Anvin <hpa(a)zytor.com>
Cc: Josh Poimboeuf <jpoimboe(a)redhat.com>
Cc: Juergen Gross <jgross(a)suse.com>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Will Deacon <will.deacon(a)arm.com>
Cc: aliguori(a)amazon.com
Cc: daniel.gruss(a)iaik.tugraz.at
Cc: hughd(a)google.com
Cc: keescook(a)google.com
Cc: linux-mm(a)kvack.org
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/mm/init.c | 12 +++++++++---
1 file changed, 9 insertions(+), 3 deletions(-)
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -161,6 +161,12 @@ struct map_range {
static int page_size_mask;
+static void enable_global_pages(void)
+{
+ if (!static_cpu_has(X86_FEATURE_PTI))
+ __supported_pte_mask |= _PAGE_GLOBAL;
+}
+
static void __init probe_page_size_mask(void)
{
/*
@@ -179,11 +185,11 @@ static void __init probe_page_size_mask(
cr4_set_bits_and_update_boot(X86_CR4_PSE);
/* Enable PGE if available */
+ __supported_pte_mask &= ~_PAGE_GLOBAL;
if (boot_cpu_has(X86_FEATURE_PGE)) {
cr4_set_bits_and_update_boot(X86_CR4_PGE);
- __supported_pte_mask |= _PAGE_GLOBAL;
- } else
- __supported_pte_mask &= ~_PAGE_GLOBAL;
+ enable_global_pages();
+ }
/* Enable 1 GB linear kernel mappings if available: */
if (direct_gbpages && boot_cpu_has(X86_FEATURE_GBPAGES)) {
Patches currently in stable-queue which might be from dave.hansen(a)linux.intel.com are
queue-4.14/x86-mm-pti-allow-nx-poison-to-be-set-in-p4d-pgd.patch
queue-4.14/x86-dumpstack-indicate-in-oops-whether-pti-is-configured-and-enabled.patch
queue-4.14/x86-mm-clarify-the-whole-asid-kernel-pcid-user-pcid-naming.patch
queue-4.14/x86-mm-abstract-switching-cr3.patch
queue-4.14/x86-mm-optimize-restore_cr3.patch
queue-4.14/x86-mm-pti-force-entry-through-trampoline-when-pti-active.patch
queue-4.14/x86-entry-align-entry-text-section-to-pmd-boundary.patch
queue-4.14/x86-mm-64-make-a-full-pgd-entry-size-hole-in-the-memory-map.patch
queue-4.14/x86-cpu_entry_area-add-debugstore-entries-to-cpu_entry_area.patch
queue-4.14/x86-mm-pti-share-entry-text-pmd.patch
queue-4.14/x86-ldt-make-the-ldt-mapping-ro.patch
queue-4.14/x86-mm-dump_pagetables-check-user-space-page-table-for-wx-pages.patch
queue-4.14/x86-pti-map-the-vsyscall-page-if-needed.patch
queue-4.14/x86-mm-pti-disable-global-pages-if-page_table_isolation-y.patch
queue-4.14/x86-mm-dump_pagetables-allow-dumping-current-pagetables.patch
queue-4.14/x86-mm-pti-add-infrastructure-for-page-table-isolation.patch
queue-4.14/x86-cpufeatures-add-x86_bug_cpu_insecure.patch
queue-4.14/x86-mm-use-fix-pcid-to-optimize-user-kernel-switches.patch
queue-4.14/x86-mm-use-invpcid-for-__native_flush_tlb_single.patch
queue-4.14/x86-mm-pti-prepare-the-x86-entry-assembly-code-for-entry-exit-cr3-switching.patch
queue-4.14/x86-mm-allow-flushing-for-future-asid-switches.patch
queue-4.14/x86-mm-pti-add-kconfig.patch
queue-4.14/x86-mm-pti-add-mapping-helper-functions.patch
queue-4.14/x86-mm-pti-map-espfix-into-user-space.patch
queue-4.14/x86-mm-pti-share-cpu_entry_area-with-user-space-page-tables.patch
queue-4.14/x86-mm-pti-add-functions-to-clone-kernel-pmds.patch
queue-4.14/x86-pti-add-the-pti-cmdline-option-and-documentation.patch
queue-4.14/x86-events-intel-ds-map-debug-buffers-in-cpu_entry_area.patch
queue-4.14/x86-mm-pti-allocate-a-separate-user-pgd.patch
queue-4.14/x86-pti-put-the-ldt-in-its-own-pgd-if-pti-is-on.patch
queue-4.14/x86-mm-pti-populate-user-pgd.patch
queue-4.14/x86-mm-dump_pagetables-add-page-table-directory-to-the-debugfs-vfs-hierarchy.patch
This is a note to let you know that I've just added the patch titled
x86/mm/pti: Allow NX poison to be set in p4d/pgd
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-mm-pti-allow-nx-poison-to-be-set-in-p4d-pgd.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 1c4de1ff4fe50453b968579ee86fac3da80dd783 Mon Sep 17 00:00:00 2001
From: Dave Hansen <dave.hansen(a)linux.intel.com>
Date: Mon, 4 Dec 2017 15:07:38 +0100
Subject: x86/mm/pti: Allow NX poison to be set in p4d/pgd
From: Dave Hansen <dave.hansen(a)linux.intel.com>
commit 1c4de1ff4fe50453b968579ee86fac3da80dd783 upstream.
With PAGE_TABLE_ISOLATION the user portion of the kernel page tables is
poisoned with the NX bit so if the entry code exits with the kernel page
tables selected in CR3, userspace crashes.
But doing so trips the p4d/pgd_bad() checks. Make sure it does not do
that.
Signed-off-by: Dave Hansen <dave.hansen(a)linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Reviewed-by: Borislav Petkov <bp(a)suse.de>
Cc: Andy Lutomirski <luto(a)kernel.org>
Cc: Boris Ostrovsky <boris.ostrovsky(a)oracle.com>
Cc: Borislav Petkov <bp(a)alien8.de>
Cc: Brian Gerst <brgerst(a)gmail.com>
Cc: David Laight <David.Laight(a)aculab.com>
Cc: Denys Vlasenko <dvlasenk(a)redhat.com>
Cc: Eduardo Valentin <eduval(a)amazon.com>
Cc: Greg KH <gregkh(a)linuxfoundation.org>
Cc: H. Peter Anvin <hpa(a)zytor.com>
Cc: Josh Poimboeuf <jpoimboe(a)redhat.com>
Cc: Juergen Gross <jgross(a)suse.com>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Will Deacon <will.deacon(a)arm.com>
Cc: aliguori(a)amazon.com
Cc: daniel.gruss(a)iaik.tugraz.at
Cc: hughd(a)google.com
Cc: keescook(a)google.com
Cc: linux-kernel(a)vger.kernel.org
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/include/asm/pgtable.h | 14 ++++++++++++--
1 file changed, 12 insertions(+), 2 deletions(-)
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -846,7 +846,12 @@ static inline pud_t *pud_offset(p4d_t *p
static inline int p4d_bad(p4d_t p4d)
{
- return (p4d_flags(p4d) & ~(_KERNPG_TABLE | _PAGE_USER)) != 0;
+ unsigned long ignore_flags = _KERNPG_TABLE | _PAGE_USER;
+
+ if (IS_ENABLED(CONFIG_PAGE_TABLE_ISOLATION))
+ ignore_flags |= _PAGE_NX;
+
+ return (p4d_flags(p4d) & ~ignore_flags) != 0;
}
#endif /* CONFIG_PGTABLE_LEVELS > 3 */
@@ -880,7 +885,12 @@ static inline p4d_t *p4d_offset(pgd_t *p
static inline int pgd_bad(pgd_t pgd)
{
- return (pgd_flags(pgd) & ~_PAGE_USER) != _KERNPG_TABLE;
+ unsigned long ignore_flags = _PAGE_USER;
+
+ if (IS_ENABLED(CONFIG_PAGE_TABLE_ISOLATION))
+ ignore_flags |= _PAGE_NX;
+
+ return (pgd_flags(pgd) & ~ignore_flags) != _KERNPG_TABLE;
}
static inline int pgd_none(pgd_t pgd)
Patches currently in stable-queue which might be from dave.hansen(a)linux.intel.com are
queue-4.14/x86-mm-pti-allow-nx-poison-to-be-set-in-p4d-pgd.patch
queue-4.14/x86-dumpstack-indicate-in-oops-whether-pti-is-configured-and-enabled.patch
queue-4.14/x86-mm-clarify-the-whole-asid-kernel-pcid-user-pcid-naming.patch
queue-4.14/x86-mm-abstract-switching-cr3.patch
queue-4.14/x86-mm-optimize-restore_cr3.patch
queue-4.14/x86-mm-pti-force-entry-through-trampoline-when-pti-active.patch
queue-4.14/x86-entry-align-entry-text-section-to-pmd-boundary.patch
queue-4.14/x86-mm-64-make-a-full-pgd-entry-size-hole-in-the-memory-map.patch
queue-4.14/x86-cpu_entry_area-add-debugstore-entries-to-cpu_entry_area.patch
queue-4.14/x86-mm-pti-share-entry-text-pmd.patch
queue-4.14/x86-ldt-make-the-ldt-mapping-ro.patch
queue-4.14/x86-mm-dump_pagetables-check-user-space-page-table-for-wx-pages.patch
queue-4.14/x86-pti-map-the-vsyscall-page-if-needed.patch
queue-4.14/x86-mm-pti-disable-global-pages-if-page_table_isolation-y.patch
queue-4.14/x86-mm-dump_pagetables-allow-dumping-current-pagetables.patch
queue-4.14/x86-mm-pti-add-infrastructure-for-page-table-isolation.patch
queue-4.14/x86-cpufeatures-add-x86_bug_cpu_insecure.patch
queue-4.14/x86-mm-use-fix-pcid-to-optimize-user-kernel-switches.patch
queue-4.14/x86-mm-use-invpcid-for-__native_flush_tlb_single.patch
queue-4.14/x86-mm-pti-prepare-the-x86-entry-assembly-code-for-entry-exit-cr3-switching.patch
queue-4.14/x86-mm-allow-flushing-for-future-asid-switches.patch
queue-4.14/x86-mm-pti-add-kconfig.patch
queue-4.14/x86-mm-pti-add-mapping-helper-functions.patch
queue-4.14/x86-mm-pti-map-espfix-into-user-space.patch
queue-4.14/x86-mm-pti-share-cpu_entry_area-with-user-space-page-tables.patch
queue-4.14/x86-mm-pti-add-functions-to-clone-kernel-pmds.patch
queue-4.14/x86-pti-add-the-pti-cmdline-option-and-documentation.patch
queue-4.14/x86-events-intel-ds-map-debug-buffers-in-cpu_entry_area.patch
queue-4.14/x86-mm-pti-allocate-a-separate-user-pgd.patch
queue-4.14/x86-pti-put-the-ldt-in-its-own-pgd-if-pti-is-on.patch
queue-4.14/x86-mm-pti-populate-user-pgd.patch
queue-4.14/x86-mm-dump_pagetables-add-page-table-directory-to-the-debugfs-vfs-hierarchy.patch
This is a note to let you know that I've just added the patch titled
x86/mm/pti: Allocate a separate user PGD
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-mm-pti-allocate-a-separate-user-pgd.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From d9e9a6418065bb376e5de8d93ce346939b9a37a6 Mon Sep 17 00:00:00 2001
From: Dave Hansen <dave.hansen(a)linux.intel.com>
Date: Mon, 4 Dec 2017 15:07:39 +0100
Subject: x86/mm/pti: Allocate a separate user PGD
From: Dave Hansen <dave.hansen(a)linux.intel.com>
commit d9e9a6418065bb376e5de8d93ce346939b9a37a6 upstream.
Kernel page table isolation requires two PGDs: one for the kernel, which
contains the full kernel mapping plus the user space mapping, and one for
user space, which contains the user space mappings and the minimal set of
kernel mappings the architecture needs to transition to and from user
space.
Add the necessary preliminaries.
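To make the layout concrete, here is a rough sketch of how the two halves
relate (not part of this patch; the real helper with this shape,
kernel_to_user_pgdp(), is added elsewhere in this series, and
sketch_kernel_to_user_pgdp() is a hypothetical name):

/*
 * With the order-1, 8k-aligned allocation the kernel PGD occupies the
 * first 4k page and the user PGD the second, so the user counterpart of
 * a kernel PGD pointer is reached by setting bit 12 of the address.
 */
static pgd_t *sketch_kernel_to_user_pgdp(pgd_t *kernel_pgd)
{
	return (pgd_t *)((unsigned long)kernel_pgd | (1UL << PAGE_SHIFT));
}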
[ tglx: Split out from the big kaiser dump. EFI fixup from Kirill ]
Signed-off-by: Dave Hansen <dave.hansen(a)linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Reviewed-by: Borislav Petkov <bp(a)suse.de>
Cc: Andy Lutomirski <luto(a)kernel.org>
Cc: Boris Ostrovsky <boris.ostrovsky(a)oracle.com>
Cc: Borislav Petkov <bp(a)alien8.de>
Cc: Brian Gerst <brgerst(a)gmail.com>
Cc: David Laight <David.Laight(a)aculab.com>
Cc: Denys Vlasenko <dvlasenk(a)redhat.com>
Cc: Eduardo Valentin <eduval(a)amazon.com>
Cc: Greg KH <gregkh(a)linuxfoundation.org>
Cc: H. Peter Anvin <hpa(a)zytor.com>
Cc: Josh Poimboeuf <jpoimboe(a)redhat.com>
Cc: Juergen Gross <jgross(a)suse.com>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Will Deacon <will.deacon(a)arm.com>
Cc: aliguori(a)amazon.com
Cc: daniel.gruss(a)iaik.tugraz.at
Cc: hughd(a)google.com
Cc: keescook(a)google.com
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/include/asm/pgalloc.h | 11 +++++++++++
arch/x86/kernel/head_64.S | 30 +++++++++++++++++++++++++++---
arch/x86/mm/pgtable.c | 5 +++--
arch/x86/platform/efi/efi_64.c | 5 ++++-
4 files changed, 45 insertions(+), 6 deletions(-)
--- a/arch/x86/include/asm/pgalloc.h
+++ b/arch/x86/include/asm/pgalloc.h
@@ -30,6 +30,17 @@ static inline void paravirt_release_p4d(
*/
extern gfp_t __userpte_alloc_gfp;
+#ifdef CONFIG_PAGE_TABLE_ISOLATION
+/*
+ * Instead of one PGD, we acquire two PGDs. Being order-1, it is
+ * both 8k in size and 8k-aligned. That lets us just flip bit 12
+ * in a pointer to swap between the two 4k halves.
+ */
+#define PGD_ALLOCATION_ORDER 1
+#else
+#define PGD_ALLOCATION_ORDER 0
+#endif
+
/*
* Allocate and free page tables.
*/
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -341,6 +341,27 @@ GLOBAL(early_recursion_flag)
.balign PAGE_SIZE; \
GLOBAL(name)
+#ifdef CONFIG_PAGE_TABLE_ISOLATION
+/*
+ * Each PGD needs to be 8k long and 8k aligned. We do not
+ * ever go out to userspace with these, so we do not
+ * strictly *need* the second page, but this allows us to
+ * have a single set_pgd() implementation that does not
+ * need to worry about whether it has 4k or 8k to work
+ * with.
+ *
+ * This ensures PGDs are 8k long:
+ */
+#define PTI_USER_PGD_FILL 512
+/* This ensures they are 8k-aligned: */
+#define NEXT_PGD_PAGE(name) \
+ .balign 2 * PAGE_SIZE; \
+GLOBAL(name)
+#else
+#define NEXT_PGD_PAGE(name) NEXT_PAGE(name)
+#define PTI_USER_PGD_FILL 0
+#endif
+
/* Automate the creation of 1 to 1 mapping pmd entries */
#define PMDS(START, PERM, COUNT) \
i = 0 ; \
@@ -350,13 +371,14 @@ GLOBAL(name)
.endr
__INITDATA
-NEXT_PAGE(early_top_pgt)
+NEXT_PGD_PAGE(early_top_pgt)
.fill 511,8,0
#ifdef CONFIG_X86_5LEVEL
.quad level4_kernel_pgt - __START_KERNEL_map + _PAGE_TABLE_NOENC
#else
.quad level3_kernel_pgt - __START_KERNEL_map + _PAGE_TABLE_NOENC
#endif
+ .fill PTI_USER_PGD_FILL,8,0
NEXT_PAGE(early_dynamic_pgts)
.fill 512*EARLY_DYNAMIC_PAGE_TABLES,8,0
@@ -364,13 +386,14 @@ NEXT_PAGE(early_dynamic_pgts)
.data
#if defined(CONFIG_XEN_PV) || defined(CONFIG_XEN_PVH)
-NEXT_PAGE(init_top_pgt)
+NEXT_PGD_PAGE(init_top_pgt)
.quad level3_ident_pgt - __START_KERNEL_map + _KERNPG_TABLE_NOENC
.org init_top_pgt + PGD_PAGE_OFFSET*8, 0
.quad level3_ident_pgt - __START_KERNEL_map + _KERNPG_TABLE_NOENC
.org init_top_pgt + PGD_START_KERNEL*8, 0
/* (2^48-(2*1024*1024*1024))/(2^39) = 511 */
.quad level3_kernel_pgt - __START_KERNEL_map + _PAGE_TABLE_NOENC
+ .fill PTI_USER_PGD_FILL,8,0
NEXT_PAGE(level3_ident_pgt)
.quad level2_ident_pgt - __START_KERNEL_map + _KERNPG_TABLE_NOENC
@@ -381,8 +404,9 @@ NEXT_PAGE(level2_ident_pgt)
*/
PMDS(0, __PAGE_KERNEL_IDENT_LARGE_EXEC, PTRS_PER_PMD)
#else
-NEXT_PAGE(init_top_pgt)
+NEXT_PGD_PAGE(init_top_pgt)
.fill 512,8,0
+ .fill PTI_USER_PGD_FILL,8,0
#endif
#ifdef CONFIG_X86_5LEVEL
--- a/arch/x86/mm/pgtable.c
+++ b/arch/x86/mm/pgtable.c
@@ -355,14 +355,15 @@ static inline void _pgd_free(pgd_t *pgd)
kmem_cache_free(pgd_cache, pgd);
}
#else
+
static inline pgd_t *_pgd_alloc(void)
{
- return (pgd_t *)__get_free_page(PGALLOC_GFP);
+ return (pgd_t *)__get_free_pages(PGALLOC_GFP, PGD_ALLOCATION_ORDER);
}
static inline void _pgd_free(pgd_t *pgd)
{
- free_page((unsigned long)pgd);
+ free_pages((unsigned long)pgd, PGD_ALLOCATION_ORDER);
}
#endif /* CONFIG_X86_PAE */
--- a/arch/x86/platform/efi/efi_64.c
+++ b/arch/x86/platform/efi/efi_64.c
@@ -195,6 +195,9 @@ static pgd_t *efi_pgd;
* because we want to avoid inserting EFI region mappings (EFI_VA_END
* to EFI_VA_START) into the standard kernel page tables. Everything
* else can be shared, see efi_sync_low_kernel_mappings().
+ *
+ * We don't want the pgd on the pgd_list and cannot use pgd_alloc() for the
+ * allocation.
*/
int __init efi_alloc_page_tables(void)
{
@@ -207,7 +210,7 @@ int __init efi_alloc_page_tables(void)
return 0;
gfp_mask = GFP_KERNEL | __GFP_NOTRACK | __GFP_ZERO;
- efi_pgd = (pgd_t *)__get_free_page(gfp_mask);
+ efi_pgd = (pgd_t *)__get_free_pages(gfp_mask, PGD_ALLOCATION_ORDER);
if (!efi_pgd)
return -ENOMEM;
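The bit-12 trick this patch relies on is easy to see outside the kernel. Below
is a minimal user-space sketch (illustration only; aligned_alloc() stands in
for __get_free_pages() and the printed addresses are arbitrary, this is not
the kernel implementation):

/* An order-1, 8k-aligned allocation: flipping bit 12 of the pointer
 * moves between the kernel half and the user half of the PGD. */
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#define PAGE_SIZE 4096UL

int main(void)
{
	/* analogue of __get_free_pages(gfp, PGD_ALLOCATION_ORDER): 8k, 8k-aligned */
	void *pgd = aligned_alloc(2 * PAGE_SIZE, 2 * PAGE_SIZE);
	if (!pgd)
		return 1;

	uintptr_t kernel_half = (uintptr_t)pgd;
	uintptr_t user_half   = kernel_half | PAGE_SIZE;	/* set bit 12 */

	printf("kernel pgd: %p\n", (void *)kernel_half);
	printf("user   pgd: %p\n", (void *)user_half);
	printf("clearing bit 12 gives the kernel half back: %p\n",
	       (void *)(user_half & ~PAGE_SIZE));

	free(pgd);
	return 0;
}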
Patches currently in stable-queue which might be from dave.hansen(a)linux.intel.com are
queue-4.14/x86-mm-pti-allow-nx-poison-to-be-set-in-p4d-pgd.patch
queue-4.14/x86-dumpstack-indicate-in-oops-whether-pti-is-configured-and-enabled.patch
queue-4.14/x86-mm-clarify-the-whole-asid-kernel-pcid-user-pcid-naming.patch
queue-4.14/x86-mm-abstract-switching-cr3.patch
queue-4.14/x86-mm-optimize-restore_cr3.patch
queue-4.14/x86-mm-pti-force-entry-through-trampoline-when-pti-active.patch
queue-4.14/x86-entry-align-entry-text-section-to-pmd-boundary.patch
queue-4.14/x86-mm-64-make-a-full-pgd-entry-size-hole-in-the-memory-map.patch
queue-4.14/x86-cpu_entry_area-add-debugstore-entries-to-cpu_entry_area.patch
queue-4.14/x86-mm-pti-share-entry-text-pmd.patch
queue-4.14/x86-ldt-make-the-ldt-mapping-ro.patch
queue-4.14/x86-mm-dump_pagetables-check-user-space-page-table-for-wx-pages.patch
queue-4.14/x86-pti-map-the-vsyscall-page-if-needed.patch
queue-4.14/x86-mm-pti-disable-global-pages-if-page_table_isolation-y.patch
queue-4.14/x86-mm-dump_pagetables-allow-dumping-current-pagetables.patch
queue-4.14/x86-mm-pti-add-infrastructure-for-page-table-isolation.patch
queue-4.14/x86-cpufeatures-add-x86_bug_cpu_insecure.patch
queue-4.14/x86-mm-use-fix-pcid-to-optimize-user-kernel-switches.patch
queue-4.14/x86-mm-use-invpcid-for-__native_flush_tlb_single.patch
queue-4.14/x86-mm-pti-prepare-the-x86-entry-assembly-code-for-entry-exit-cr3-switching.patch
queue-4.14/x86-mm-allow-flushing-for-future-asid-switches.patch
queue-4.14/x86-mm-pti-add-kconfig.patch
queue-4.14/x86-mm-pti-add-mapping-helper-functions.patch
queue-4.14/x86-mm-pti-map-espfix-into-user-space.patch
queue-4.14/x86-mm-pti-share-cpu_entry_area-with-user-space-page-tables.patch
queue-4.14/x86-mm-pti-add-functions-to-clone-kernel-pmds.patch
queue-4.14/x86-pti-add-the-pti-cmdline-option-and-documentation.patch
queue-4.14/x86-events-intel-ds-map-debug-buffers-in-cpu_entry_area.patch
queue-4.14/x86-mm-pti-allocate-a-separate-user-pgd.patch
queue-4.14/x86-pti-put-the-ldt-in-its-own-pgd-if-pti-is-on.patch
queue-4.14/x86-mm-pti-populate-user-pgd.patch
queue-4.14/x86-mm-dump_pagetables-add-page-table-directory-to-the-debugfs-vfs-hierarchy.patch
This is a note to let you know that I've just added the patch titled
x86/mm/pti: Add mapping helper functions
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-mm-pti-add-mapping-helper-functions.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 61e9b3671007a5da8127955a1a3bda7e0d5f42e8 Mon Sep 17 00:00:00 2001
From: Dave Hansen <dave.hansen(a)linux.intel.com>
Date: Mon, 4 Dec 2017 15:07:37 +0100
Subject: x86/mm/pti: Add mapping helper functions
From: Dave Hansen <dave.hansen(a)linux.intel.com>
commit 61e9b3671007a5da8127955a1a3bda7e0d5f42e8 upstream.
Add the pagetable helper functions to manage the separate user space page
tables.
[ tglx: Split out from the big combo kaiser patch. Folded Andy's
simplification and made it out of line as Boris suggested ]
Signed-off-by: Dave Hansen <dave.hansen(a)linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Andy Lutomirski <luto(a)kernel.org>
Cc: Boris Ostrovsky <boris.ostrovsky(a)oracle.com>
Cc: Borislav Petkov <bp(a)alien8.de>
Cc: Brian Gerst <brgerst(a)gmail.com>
Cc: David Laight <David.Laight(a)aculab.com>
Cc: Denys Vlasenko <dvlasenk(a)redhat.com>
Cc: Eduardo Valentin <eduval(a)amazon.com>
Cc: Greg KH <gregkh(a)linuxfoundation.org>
Cc: H. Peter Anvin <hpa(a)zytor.com>
Cc: Josh Poimboeuf <jpoimboe(a)redhat.com>
Cc: Juergen Gross <jgross(a)suse.com>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Will Deacon <will.deacon(a)arm.com>
Cc: aliguori(a)amazon.com
Cc: daniel.gruss(a)iaik.tugraz.at
Cc: hughd(a)google.com
Cc: keescook(a)google.com
Cc: linux-kernel(a)vger.kernel.org
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/include/asm/pgtable.h | 6 ++
arch/x86/include/asm/pgtable_64.h | 92 ++++++++++++++++++++++++++++++++++++++
arch/x86/mm/pti.c | 41 ++++++++++++++++
3 files changed, 138 insertions(+), 1 deletion(-)
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -909,7 +909,11 @@ static inline int pgd_none(pgd_t pgd)
* pgd_offset() returns a (pgd_t *)
* pgd_index() is used get the offset into the pgd page's array of pgd_t's;
*/
-#define pgd_offset(mm, address) ((mm)->pgd + pgd_index((address)))
+#define pgd_offset_pgd(pgd, address) (pgd + pgd_index((address)))
+/*
+ * a shortcut to get a pgd_t in a given mm
+ */
+#define pgd_offset(mm, address) pgd_offset_pgd((mm)->pgd, (address))
/*
* a shortcut which implies the use of the kernel's pgd, instead
* of a process's
--- a/arch/x86/include/asm/pgtable_64.h
+++ b/arch/x86/include/asm/pgtable_64.h
@@ -131,9 +131,97 @@ static inline pud_t native_pudp_get_and_
#endif
}
+#ifdef CONFIG_PAGE_TABLE_ISOLATION
+/*
+ * All top-level PAGE_TABLE_ISOLATION page tables are order-1 pages
+ * (8k-aligned and 8k in size). The kernel one is at the beginning 4k and
+ * the user one is in the last 4k. To switch between them, you
+ * just need to flip the 12th bit in their addresses.
+ */
+#define PTI_PGTABLE_SWITCH_BIT PAGE_SHIFT
+
+/*
+ * This generates better code than the inline assembly in
+ * __set_bit().
+ */
+static inline void *ptr_set_bit(void *ptr, int bit)
+{
+ unsigned long __ptr = (unsigned long)ptr;
+
+ __ptr |= BIT(bit);
+ return (void *)__ptr;
+}
+static inline void *ptr_clear_bit(void *ptr, int bit)
+{
+ unsigned long __ptr = (unsigned long)ptr;
+
+ __ptr &= ~BIT(bit);
+ return (void *)__ptr;
+}
+
+static inline pgd_t *kernel_to_user_pgdp(pgd_t *pgdp)
+{
+ return ptr_set_bit(pgdp, PTI_PGTABLE_SWITCH_BIT);
+}
+
+static inline pgd_t *user_to_kernel_pgdp(pgd_t *pgdp)
+{
+ return ptr_clear_bit(pgdp, PTI_PGTABLE_SWITCH_BIT);
+}
+
+static inline p4d_t *kernel_to_user_p4dp(p4d_t *p4dp)
+{
+ return ptr_set_bit(p4dp, PTI_PGTABLE_SWITCH_BIT);
+}
+
+static inline p4d_t *user_to_kernel_p4dp(p4d_t *p4dp)
+{
+ return ptr_clear_bit(p4dp, PTI_PGTABLE_SWITCH_BIT);
+}
+#endif /* CONFIG_PAGE_TABLE_ISOLATION */
+
+/*
+ * Page table pages are page-aligned. The lower half of the top
+ * level is used for userspace and the top half for the kernel.
+ *
+ * Returns true for parts of the PGD that map userspace and
+ * false for the parts that map the kernel.
+ */
+static inline bool pgdp_maps_userspace(void *__ptr)
+{
+ unsigned long ptr = (unsigned long)__ptr;
+
+ return (ptr & ~PAGE_MASK) < (PAGE_SIZE / 2);
+}
+
+#ifdef CONFIG_PAGE_TABLE_ISOLATION
+pgd_t __pti_set_user_pgd(pgd_t *pgdp, pgd_t pgd);
+
+/*
+ * Take a PGD location (pgdp) and a pgd value that needs to be set there.
+ * Populates the user and returns the resulting PGD that must be set in
+ * the kernel copy of the page tables.
+ */
+static inline pgd_t pti_set_user_pgd(pgd_t *pgdp, pgd_t pgd)
+{
+ if (!static_cpu_has(X86_FEATURE_PTI))
+ return pgd;
+ return __pti_set_user_pgd(pgdp, pgd);
+}
+#else
+static inline pgd_t pti_set_user_pgd(pgd_t *pgdp, pgd_t pgd)
+{
+ return pgd;
+}
+#endif
+
static inline void native_set_p4d(p4d_t *p4dp, p4d_t p4d)
{
+#if defined(CONFIG_PAGE_TABLE_ISOLATION) && !defined(CONFIG_X86_5LEVEL)
+ p4dp->pgd = pti_set_user_pgd(&p4dp->pgd, p4d.pgd);
+#else
*p4dp = p4d;
+#endif
}
static inline void native_p4d_clear(p4d_t *p4d)
@@ -147,7 +235,11 @@ static inline void native_p4d_clear(p4d_
static inline void native_set_pgd(pgd_t *pgdp, pgd_t pgd)
{
+#ifdef CONFIG_PAGE_TABLE_ISOLATION
+ *pgdp = pti_set_user_pgd(pgdp, pgd);
+#else
*pgdp = pgd;
+#endif
}
static inline void native_pgd_clear(pgd_t *pgd)
--- a/arch/x86/mm/pti.c
+++ b/arch/x86/mm/pti.c
@@ -96,6 +96,47 @@ enable:
setup_force_cpu_cap(X86_FEATURE_PTI);
}
+pgd_t __pti_set_user_pgd(pgd_t *pgdp, pgd_t pgd)
+{
+ /*
+ * Changes to the high (kernel) portion of the kernelmode page
+ * tables are not automatically propagated to the usermode tables.
+ *
+ * Users should keep in mind that, unlike the kernelmode tables,
+ * there is no vmalloc_fault equivalent for the usermode tables.
+ * Top-level entries added to init_mm's usermode pgd after boot
+ * will not be automatically propagated to other mms.
+ */
+ if (!pgdp_maps_userspace(pgdp))
+ return pgd;
+
+ /*
+ * The user page tables get the full PGD, accessible from
+ * userspace:
+ */
+ kernel_to_user_pgdp(pgdp)->pgd = pgd.pgd;
+
+ /*
+ * If this is normal user memory, make it NX in the kernel
+ * pagetables so that, if we somehow screw up and return to
+ * usermode with the kernel CR3 loaded, we'll get a page fault
+ * instead of allowing user code to execute with the wrong CR3.
+ *
+ * As exceptions, we don't set NX if:
+ * - _PAGE_USER is not set. This could be an executable
+ * EFI runtime mapping or something similar, and the kernel
+ * may execute from it
+ * - we don't have NX support
+ * - we're clearing the PGD (i.e. the new pgd is not present).
+ */
+ if ((pgd.pgd & (_PAGE_USER|_PAGE_PRESENT)) == (_PAGE_USER|_PAGE_PRESENT) &&
+ (__supported_pte_mask & _PAGE_NX))
+ pgd.pgd |= _PAGE_NX;
+
+ /* return the copy of the PGD we want the kernel to use: */
+ return pgd;
+}
+
/*
* Initialize kernel page table isolation
*/
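How the new helpers fit together can be sketched in plain C. The following
user-space illustration mirrors kernel_to_user_pgdp() and pgdp_maps_userspace()
(the fake PGD address and entry offsets are made up for the example; this is
not the kernel code):

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define PAGE_SHIFT 12
#define PAGE_SIZE  (1UL << PAGE_SHIFT)
#define PAGE_MASK  (~(PAGE_SIZE - 1))

/* The user PGD lives in the 4k page right after the kernel PGD. */
static void *kernel_to_user(void *pgdp)
{
	return (void *)((uintptr_t)pgdp | (1UL << PAGE_SHIFT));
}

/* An entry maps userspace iff its offset within the page is in the lower half. */
static bool maps_userspace(void *ptr)
{
	return ((uintptr_t)ptr & ~PAGE_MASK) < PAGE_SIZE / 2;
}

int main(void)
{
	uintptr_t pgd = 0x100000;		/* pretend 8k-aligned kernel PGD */
	void *low_entry  = (void *)(pgd + 0x008);	/* lower half: user range */
	void *high_entry = (void *)(pgd + 0x900);	/* upper half: kernel range */

	printf("user copy of PGD at %p\n", kernel_to_user((void *)pgd));
	printf("low entry maps userspace:  %d\n", maps_userspace(low_entry));
	printf("high entry maps userspace: %d\n", maps_userspace(high_entry));
	return 0;
}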
Patches currently in stable-queue which might be from dave.hansen(a)linux.intel.com are
queue-4.14/x86-mm-pti-allow-nx-poison-to-be-set-in-p4d-pgd.patch
queue-4.14/x86-dumpstack-indicate-in-oops-whether-pti-is-configured-and-enabled.patch
queue-4.14/x86-mm-clarify-the-whole-asid-kernel-pcid-user-pcid-naming.patch
queue-4.14/x86-mm-abstract-switching-cr3.patch
queue-4.14/x86-mm-optimize-restore_cr3.patch
queue-4.14/x86-mm-pti-force-entry-through-trampoline-when-pti-active.patch
queue-4.14/x86-entry-align-entry-text-section-to-pmd-boundary.patch
queue-4.14/x86-mm-64-make-a-full-pgd-entry-size-hole-in-the-memory-map.patch
queue-4.14/x86-cpu_entry_area-add-debugstore-entries-to-cpu_entry_area.patch
queue-4.14/x86-mm-pti-share-entry-text-pmd.patch
queue-4.14/x86-ldt-make-the-ldt-mapping-ro.patch
queue-4.14/x86-mm-dump_pagetables-check-user-space-page-table-for-wx-pages.patch
queue-4.14/x86-pti-map-the-vsyscall-page-if-needed.patch
queue-4.14/x86-mm-pti-disable-global-pages-if-page_table_isolation-y.patch
queue-4.14/x86-mm-dump_pagetables-allow-dumping-current-pagetables.patch
queue-4.14/x86-mm-pti-add-infrastructure-for-page-table-isolation.patch
queue-4.14/x86-cpufeatures-add-x86_bug_cpu_insecure.patch
queue-4.14/x86-mm-use-fix-pcid-to-optimize-user-kernel-switches.patch
queue-4.14/x86-mm-use-invpcid-for-__native_flush_tlb_single.patch
queue-4.14/x86-mm-pti-prepare-the-x86-entry-assembly-code-for-entry-exit-cr3-switching.patch
queue-4.14/x86-mm-allow-flushing-for-future-asid-switches.patch
queue-4.14/x86-mm-pti-add-kconfig.patch
queue-4.14/x86-mm-pti-add-mapping-helper-functions.patch
queue-4.14/x86-mm-pti-map-espfix-into-user-space.patch
queue-4.14/x86-mm-pti-share-cpu_entry_area-with-user-space-page-tables.patch
queue-4.14/x86-mm-pti-add-functions-to-clone-kernel-pmds.patch
queue-4.14/x86-pti-add-the-pti-cmdline-option-and-documentation.patch
queue-4.14/x86-events-intel-ds-map-debug-buffers-in-cpu_entry_area.patch
queue-4.14/x86-mm-pti-allocate-a-separate-user-pgd.patch
queue-4.14/x86-pti-put-the-ldt-in-its-own-pgd-if-pti-is-on.patch
queue-4.14/x86-mm-pti-populate-user-pgd.patch
queue-4.14/x86-mm-dump_pagetables-add-page-table-directory-to-the-debugfs-vfs-hierarchy.patch
This is a note to let you know that I've just added the patch titled
x86/mm/pti: Add Kconfig
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-mm-pti-add-kconfig.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 385ce0ea4c078517fa51c261882c4e72fba53005 Mon Sep 17 00:00:00 2001
From: Dave Hansen <dave.hansen(a)linux.intel.com>
Date: Mon, 4 Dec 2017 15:08:03 +0100
Subject: x86/mm/pti: Add Kconfig
From: Dave Hansen <dave.hansen(a)linux.intel.com>
commit 385ce0ea4c078517fa51c261882c4e72fba53005 upstream.
Finally allow CONFIG_PAGE_TABLE_ISOLATION to be enabled.
PARAVIRT generally requires that the kernel not manage its own page tables.
It also means that the hypervisor and kernel must agree wholeheartedly
about what format the page tables are in and what they contain.
PAGE_TABLE_ISOLATION, unfortunately, changes the rules and the two
cannot be used together.
I've seen conflicting feedback from maintainers lately about whether they
want the Kconfig magic to go first or last in a patch series. It's going
last here because the partially-applied series leads to kernels that can
not boot in a bunch of cases. I did a run through the entire series with
CONFIG_PAGE_TABLE_ISOLATION=y to look for build errors, though.
[ tglx: Removed SMP and !PARAVIRT dependencies as they no longer exist ]
Signed-off-by: Dave Hansen <dave.hansen(a)linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Andy Lutomirski <luto(a)kernel.org>
Cc: Boris Ostrovsky <boris.ostrovsky(a)oracle.com>
Cc: Borislav Petkov <bp(a)alien8.de>
Cc: Brian Gerst <brgerst(a)gmail.com>
Cc: David Laight <David.Laight(a)aculab.com>
Cc: Denys Vlasenko <dvlasenk(a)redhat.com>
Cc: Eduardo Valentin <eduval(a)amazon.com>
Cc: Greg KH <gregkh(a)linuxfoundation.org>
Cc: H. Peter Anvin <hpa(a)zytor.com>
Cc: Josh Poimboeuf <jpoimboe(a)redhat.com>
Cc: Juergen Gross <jgross(a)suse.com>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Will Deacon <will.deacon(a)arm.com>
Cc: aliguori(a)amazon.com
Cc: daniel.gruss(a)iaik.tugraz.at
Cc: hughd(a)google.com
Cc: keescook(a)google.com
Cc: linux-mm(a)kvack.org
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
security/Kconfig | 10 ++++++++++
1 file changed, 10 insertions(+)
--- a/security/Kconfig
+++ b/security/Kconfig
@@ -54,6 +54,16 @@ config SECURITY_NETWORK
implement socket and networking access controls.
If you are unsure how to answer this question, answer N.
+config PAGE_TABLE_ISOLATION
+ bool "Remove the kernel mapping in user mode"
+ depends on X86_64 && !UML
+ help
+ This feature reduces the number of hardware side channels by
+ ensuring that the majority of kernel addresses are not mapped
+ into userspace.
+
+ See Documentation/x86/pagetable-isolation.txt for more details.
+
config SECURITY_INFINIBAND
bool "Infiniband Security Hooks"
depends on SECURITY && INFINIBAND
Patches currently in stable-queue which might be from dave.hansen(a)linux.intel.com are
queue-4.14/x86-mm-pti-allow-nx-poison-to-be-set-in-p4d-pgd.patch
queue-4.14/x86-dumpstack-indicate-in-oops-whether-pti-is-configured-and-enabled.patch
queue-4.14/x86-mm-clarify-the-whole-asid-kernel-pcid-user-pcid-naming.patch
queue-4.14/x86-mm-abstract-switching-cr3.patch
queue-4.14/x86-mm-optimize-restore_cr3.patch
queue-4.14/x86-mm-pti-force-entry-through-trampoline-when-pti-active.patch
queue-4.14/x86-entry-align-entry-text-section-to-pmd-boundary.patch
queue-4.14/x86-mm-64-make-a-full-pgd-entry-size-hole-in-the-memory-map.patch
queue-4.14/x86-cpu_entry_area-add-debugstore-entries-to-cpu_entry_area.patch
queue-4.14/x86-mm-pti-share-entry-text-pmd.patch
queue-4.14/x86-ldt-make-the-ldt-mapping-ro.patch
queue-4.14/x86-mm-dump_pagetables-check-user-space-page-table-for-wx-pages.patch
queue-4.14/x86-pti-map-the-vsyscall-page-if-needed.patch
queue-4.14/x86-mm-pti-disable-global-pages-if-page_table_isolation-y.patch
queue-4.14/x86-mm-dump_pagetables-allow-dumping-current-pagetables.patch
queue-4.14/x86-mm-pti-add-infrastructure-for-page-table-isolation.patch
queue-4.14/x86-cpufeatures-add-x86_bug_cpu_insecure.patch
queue-4.14/x86-mm-use-fix-pcid-to-optimize-user-kernel-switches.patch
queue-4.14/x86-mm-use-invpcid-for-__native_flush_tlb_single.patch
queue-4.14/x86-mm-pti-prepare-the-x86-entry-assembly-code-for-entry-exit-cr3-switching.patch
queue-4.14/x86-mm-allow-flushing-for-future-asid-switches.patch
queue-4.14/x86-mm-pti-add-kconfig.patch
queue-4.14/x86-mm-pti-add-mapping-helper-functions.patch
queue-4.14/x86-mm-pti-map-espfix-into-user-space.patch
queue-4.14/x86-mm-pti-share-cpu_entry_area-with-user-space-page-tables.patch
queue-4.14/x86-mm-pti-add-functions-to-clone-kernel-pmds.patch
queue-4.14/x86-pti-add-the-pti-cmdline-option-and-documentation.patch
queue-4.14/x86-events-intel-ds-map-debug-buffers-in-cpu_entry_area.patch
queue-4.14/x86-mm-pti-allocate-a-separate-user-pgd.patch
queue-4.14/x86-pti-put-the-ldt-in-its-own-pgd-if-pti-is-on.patch
queue-4.14/x86-mm-pti-populate-user-pgd.patch
queue-4.14/x86-mm-dump_pagetables-add-page-table-directory-to-the-debugfs-vfs-hierarchy.patch
This is a note to let you know that I've just added the patch titled
x86/mm/pti: Add infrastructure for page table isolation
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-mm-pti-add-infrastructure-for-page-table-isolation.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From aa8c6248f8c75acfd610fe15d8cae23cf70d9d09 Mon Sep 17 00:00:00 2001
From: Thomas Gleixner <tglx(a)linutronix.de>
Date: Mon, 4 Dec 2017 15:07:36 +0100
Subject: x86/mm/pti: Add infrastructure for page table isolation
From: Thomas Gleixner <tglx(a)linutronix.de>
commit aa8c6248f8c75acfd610fe15d8cae23cf70d9d09 upstream.
Add the initial files for kernel page table isolation, with a minimal init
function and the boot time detection for this misfeature.
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Reviewed-by: Borislav Petkov <bp(a)suse.de>
Cc: Andy Lutomirski <luto(a)kernel.org>
Cc: Boris Ostrovsky <boris.ostrovsky(a)oracle.com>
Cc: Borislav Petkov <bp(a)alien8.de>
Cc: Brian Gerst <brgerst(a)gmail.com>
Cc: Dave Hansen <dave.hansen(a)linux.intel.com>
Cc: David Laight <David.Laight(a)aculab.com>
Cc: Denys Vlasenko <dvlasenk(a)redhat.com>
Cc: Eduardo Valentin <eduval(a)amazon.com>
Cc: Greg KH <gregkh(a)linuxfoundation.org>
Cc: H. Peter Anvin <hpa(a)zytor.com>
Cc: Josh Poimboeuf <jpoimboe(a)redhat.com>
Cc: Juergen Gross <jgross(a)suse.com>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Will Deacon <will.deacon(a)arm.com>
Cc: aliguori(a)amazon.com
Cc: daniel.gruss(a)iaik.tugraz.at
Cc: hughd(a)google.com
Cc: keescook(a)google.com
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
Documentation/admin-guide/kernel-parameters.txt | 2
arch/x86/boot/compressed/pagetable.c | 3
arch/x86/entry/calling.h | 7 ++
arch/x86/include/asm/pti.h | 14 ++++
arch/x86/mm/Makefile | 7 +-
arch/x86/mm/init.c | 2
arch/x86/mm/pti.c | 84 ++++++++++++++++++++++++
include/linux/pti.h | 11 +++
init/main.c | 3
9 files changed, 130 insertions(+), 3 deletions(-)
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -2685,6 +2685,8 @@
steal time is computed, but won't influence scheduler
behaviour
+ nopti [X86-64] Disable kernel page table isolation
+
nolapic [X86-32,APIC] Do not enable or use the local APIC.
nolapic_timer [X86-32,APIC] Do not use the local APIC timer.
--- a/arch/x86/boot/compressed/pagetable.c
+++ b/arch/x86/boot/compressed/pagetable.c
@@ -23,6 +23,9 @@
*/
#undef CONFIG_AMD_MEM_ENCRYPT
+/* No PAGE_TABLE_ISOLATION support needed either: */
+#undef CONFIG_PAGE_TABLE_ISOLATION
+
#include "misc.h"
/* These actually do the work of building the kernel identity maps. */
--- a/arch/x86/entry/calling.h
+++ b/arch/x86/entry/calling.h
@@ -205,18 +205,23 @@ For 32-bit we have the following convent
.endm
.macro SWITCH_TO_KERNEL_CR3 scratch_reg:req
+ ALTERNATIVE "jmp .Lend_\@", "", X86_FEATURE_PTI
mov %cr3, \scratch_reg
ADJUST_KERNEL_CR3 \scratch_reg
mov \scratch_reg, %cr3
+.Lend_\@:
.endm
.macro SWITCH_TO_USER_CR3 scratch_reg:req
+ ALTERNATIVE "jmp .Lend_\@", "", X86_FEATURE_PTI
mov %cr3, \scratch_reg
ADJUST_USER_CR3 \scratch_reg
mov \scratch_reg, %cr3
+.Lend_\@:
.endm
.macro SAVE_AND_SWITCH_TO_KERNEL_CR3 scratch_reg:req save_reg:req
+ ALTERNATIVE "jmp .Ldone_\@", "", X86_FEATURE_PTI
movq %cr3, \scratch_reg
movq \scratch_reg, \save_reg
/*
@@ -233,11 +238,13 @@ For 32-bit we have the following convent
.endm
.macro RESTORE_CR3 save_reg:req
+ ALTERNATIVE "jmp .Lend_\@", "", X86_FEATURE_PTI
/*
* The CR3 write could be avoided when not changing its value,
* but would require a CR3 read *and* a scratch register.
*/
movq \save_reg, %cr3
+.Lend_\@:
.endm
#else /* CONFIG_PAGE_TABLE_ISOLATION=n: */
--- /dev/null
+++ b/arch/x86/include/asm/pti.h
@@ -0,0 +1,14 @@
+// SPDX-License-Identifier: GPL-2.0
+#ifndef _ASM_X86_PTI_H
+#define _ASM_X86_PTI_H
+#ifndef __ASSEMBLY__
+
+#ifdef CONFIG_PAGE_TABLE_ISOLATION
+extern void pti_init(void);
+extern void pti_check_boottime_disable(void);
+#else
+static inline void pti_check_boottime_disable(void) { }
+#endif
+
+#endif /* __ASSEMBLY__ */
+#endif /* _ASM_X86_PTI_H */
--- a/arch/x86/mm/Makefile
+++ b/arch/x86/mm/Makefile
@@ -43,9 +43,10 @@ obj-$(CONFIG_AMD_NUMA) += amdtopology.o
obj-$(CONFIG_ACPI_NUMA) += srat.o
obj-$(CONFIG_NUMA_EMU) += numa_emulation.o
-obj-$(CONFIG_X86_INTEL_MPX) += mpx.o
-obj-$(CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS) += pkeys.o
-obj-$(CONFIG_RANDOMIZE_MEMORY) += kaslr.o
+obj-$(CONFIG_X86_INTEL_MPX) += mpx.o
+obj-$(CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS) += pkeys.o
+obj-$(CONFIG_RANDOMIZE_MEMORY) += kaslr.o
+obj-$(CONFIG_PAGE_TABLE_ISOLATION) += pti.o
obj-$(CONFIG_AMD_MEM_ENCRYPT) += mem_encrypt.o
obj-$(CONFIG_AMD_MEM_ENCRYPT) += mem_encrypt_boot.o
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -20,6 +20,7 @@
#include <asm/kaslr.h>
#include <asm/hypervisor.h>
#include <asm/cpufeature.h>
+#include <asm/pti.h>
/*
* We need to define the tracepoints somewhere, and tlb.c
@@ -630,6 +631,7 @@ void __init init_mem_mapping(void)
{
unsigned long end;
+ pti_check_boottime_disable();
probe_page_size_mask();
setup_pcid();
--- /dev/null
+++ b/arch/x86/mm/pti.c
@@ -0,0 +1,84 @@
+/*
+ * Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of version 2 of the GNU General Public License as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * This code is based in part on work published here:
+ *
+ * https://github.com/IAIK/KAISER
+ *
+ * The original work was written by and signed off for the Linux
+ * kernel by:
+ *
+ * Signed-off-by: Richard Fellner <richard.fellner(a)student.tugraz.at>
+ * Signed-off-by: Moritz Lipp <moritz.lipp(a)iaik.tugraz.at>
+ * Signed-off-by: Daniel Gruss <daniel.gruss(a)iaik.tugraz.at>
+ * Signed-off-by: Michael Schwarz <michael.schwarz(a)iaik.tugraz.at>
+ *
+ * Major changes to the original code by: Dave Hansen <dave.hansen(a)intel.com>
+ * Mostly rewritten by Thomas Gleixner <tglx(a)linutronix.de> and
+ * Andy Lutomirsky <luto(a)amacapital.net>
+ */
+#include <linux/kernel.h>
+#include <linux/errno.h>
+#include <linux/string.h>
+#include <linux/types.h>
+#include <linux/bug.h>
+#include <linux/init.h>
+#include <linux/spinlock.h>
+#include <linux/mm.h>
+#include <linux/uaccess.h>
+
+#include <asm/cpufeature.h>
+#include <asm/hypervisor.h>
+#include <asm/cmdline.h>
+#include <asm/pti.h>
+#include <asm/pgtable.h>
+#include <asm/pgalloc.h>
+#include <asm/tlbflush.h>
+#include <asm/desc.h>
+
+#undef pr_fmt
+#define pr_fmt(fmt) "Kernel/User page tables isolation: " fmt
+
+static void __init pti_print_if_insecure(const char *reason)
+{
+ if (boot_cpu_has_bug(X86_BUG_CPU_INSECURE))
+ pr_info("%s\n", reason);
+}
+
+void __init pti_check_boottime_disable(void)
+{
+ if (hypervisor_is_type(X86_HYPER_XEN_PV)) {
+ pti_print_if_insecure("disabled on XEN PV.");
+ return;
+ }
+
+ if (cmdline_find_option_bool(boot_command_line, "nopti")) {
+ pti_print_if_insecure("disabled on command line.");
+ return;
+ }
+
+ if (!boot_cpu_has_bug(X86_BUG_CPU_INSECURE))
+ return;
+
+ setup_force_cpu_cap(X86_FEATURE_PTI);
+}
+
+/*
+ * Initialize kernel page table isolation
+ */
+void __init pti_init(void)
+{
+ if (!static_cpu_has(X86_FEATURE_PTI))
+ return;
+
+ pr_info("enabled\n");
+}
--- /dev/null
+++ b/include/linux/pti.h
@@ -0,0 +1,11 @@
+// SPDX-License-Identifier: GPL-2.0
+#ifndef _INCLUDE_PTI_H
+#define _INCLUDE_PTI_H
+
+#ifdef CONFIG_PAGE_TABLE_ISOLATION
+#include <asm/pti.h>
+#else
+static inline void pti_init(void) { }
+#endif
+
+#endif
--- a/init/main.c
+++ b/init/main.c
@@ -75,6 +75,7 @@
#include <linux/slab.h>
#include <linux/perf_event.h>
#include <linux/ptrace.h>
+#include <linux/pti.h>
#include <linux/blkdev.h>
#include <linux/elevator.h>
#include <linux/sched_clock.h>
@@ -506,6 +507,8 @@ static void __init mm_init(void)
ioremap_huge_init();
/* Should be run before the first non-init thread is created */
init_espfix_bsp();
+ /* Should be run after espfix64 is set up. */
+ pti_init();
}
asmlinkage __visible void __init start_kernel(void)
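The boot-time policy added by this patch boils down to three checks. A hedged
user-space sketch of the same decision (cmdline_has() is a simplified stand-in
for cmdline_find_option_bool(), and the booleans stand in for the hypervisor
and CPU-bug checks; not the kernel code):

#include <stdbool.h>
#include <stdio.h>
#include <string.h>

static bool cmdline_has(const char *cmdline, const char *opt)
{
	return strstr(cmdline, opt) != NULL;	/* simplified matching */
}

static bool decide_pti(const char *cmdline, bool cpu_insecure, bool xen_pv)
{
	if (xen_pv)
		return false;			/* disabled on XEN PV */
	if (cmdline_has(cmdline, "nopti"))
		return false;			/* disabled on command line */
	return cpu_insecure;			/* only forced on affected CPUs */
}

int main(void)
{
	printf("%d\n", decide_pti("root=/dev/sda1 nopti", true, false));	/* 0 */
	printf("%d\n", decide_pti("root=/dev/sda1", true, false));		/* 1 */
	return 0;
}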
Patches currently in stable-queue which might be from tglx(a)linutronix.de are
queue-4.14/x86-mm-pti-allow-nx-poison-to-be-set-in-p4d-pgd.patch
queue-4.14/x86-dumpstack-indicate-in-oops-whether-pti-is-configured-and-enabled.patch
queue-4.14/x86-mm-clarify-the-whole-asid-kernel-pcid-user-pcid-naming.patch
queue-4.14/x86-mm-abstract-switching-cr3.patch
queue-4.14/x86-mm-optimize-restore_cr3.patch
queue-4.14/x86-mm-pti-force-entry-through-trampoline-when-pti-active.patch
queue-4.14/x86-entry-align-entry-text-section-to-pmd-boundary.patch
queue-4.14/x86-mm-64-make-a-full-pgd-entry-size-hole-in-the-memory-map.patch
queue-4.14/x86-cpu_entry_area-add-debugstore-entries-to-cpu_entry_area.patch
queue-4.14/x86-mm-pti-share-entry-text-pmd.patch
queue-4.14/x86-ldt-make-the-ldt-mapping-ro.patch
queue-4.14/x86-mm-dump_pagetables-check-user-space-page-table-for-wx-pages.patch
queue-4.14/x86-pti-map-the-vsyscall-page-if-needed.patch
queue-4.14/x86-mm-pti-disable-global-pages-if-page_table_isolation-y.patch
queue-4.14/x86-mm-dump_pagetables-allow-dumping-current-pagetables.patch
queue-4.14/x86-mm-pti-add-infrastructure-for-page-table-isolation.patch
queue-4.14/x86-cpufeatures-add-x86_bug_cpu_insecure.patch
queue-4.14/x86-mm-use-fix-pcid-to-optimize-user-kernel-switches.patch
queue-4.14/x86-mm-use-invpcid-for-__native_flush_tlb_single.patch
queue-4.14/x86-mm-pti-prepare-the-x86-entry-assembly-code-for-entry-exit-cr3-switching.patch
queue-4.14/x86-mm-allow-flushing-for-future-asid-switches.patch
queue-4.14/x86-mm-pti-add-kconfig.patch
queue-4.14/x86-mm-pti-add-mapping-helper-functions.patch
queue-4.14/x86-mm-pti-map-espfix-into-user-space.patch
queue-4.14/x86-mm-pti-share-cpu_entry_area-with-user-space-page-tables.patch
queue-4.14/x86-mm-pti-add-functions-to-clone-kernel-pmds.patch
queue-4.14/x86-pti-add-the-pti-cmdline-option-and-documentation.patch
queue-4.14/x86-events-intel-ds-map-debug-buffers-in-cpu_entry_area.patch
queue-4.14/x86-mm-pti-allocate-a-separate-user-pgd.patch
queue-4.14/x86-pti-put-the-ldt-in-its-own-pgd-if-pti-is-on.patch
queue-4.14/x86-mm-pti-populate-user-pgd.patch
queue-4.14/x86-mm-dump_pagetables-add-page-table-directory-to-the-debugfs-vfs-hierarchy.patch
This is a note to let you know that I've just added the patch titled
x86/mm/pti: Add functions to clone kernel PMDs
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-mm-pti-add-functions-to-clone-kernel-pmds.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 03f4424f348e8be95eb1bbeba09461cd7b867828 Mon Sep 17 00:00:00 2001
From: Andy Lutomirski <luto(a)kernel.org>
Date: Mon, 4 Dec 2017 15:07:42 +0100
Subject: x86/mm/pti: Add functions to clone kernel PMDs
From: Andy Lutomirski <luto(a)kernel.org>
commit 03f4424f348e8be95eb1bbeba09461cd7b867828 upstream.
Provide infrastructure to:
- find a kernel PMD for a mapping which must be visible to user space for
the entry/exit code to work.
- walk an address range and share the kernel PMD with it.
This reuses a small part of the original KAISER patches to populate the
user space page table.
[ tglx: Made it universally usable so it can be used for any kind of shared
    mapping. Added a mechanism to clear specific bits in the user space
    visible PMD entry. Folded Andy's simplifications ]
Originally-by: Dave Hansen <dave.hansen(a)linux.intel.com>
Signed-off-by: Andy Lutomirski <luto(a)kernel.org>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Boris Ostrovsky <boris.ostrovsky(a)oracle.com>
Cc: Borislav Petkov <bp(a)alien8.de>
Cc: Borislav Petkov <bpetkov(a)suse.de>
Cc: Brian Gerst <brgerst(a)gmail.com>
Cc: Dave Hansen <dave.hansen(a)linux.intel.com>
Cc: David Laight <David.Laight(a)aculab.com>
Cc: Denys Vlasenko <dvlasenk(a)redhat.com>
Cc: Eduardo Valentin <eduval(a)amazon.com>
Cc: Greg KH <gregkh(a)linuxfoundation.org>
Cc: H. Peter Anvin <hpa(a)zytor.com>
Cc: Josh Poimboeuf <jpoimboe(a)redhat.com>
Cc: Juergen Gross <jgross(a)suse.com>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Will Deacon <will.deacon(a)arm.com>
Cc: aliguori(a)amazon.com
Cc: daniel.gruss(a)iaik.tugraz.at
Cc: hughd(a)google.com
Cc: keescook(a)google.com
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/mm/pti.c | 127 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 127 insertions(+)
--- a/arch/x86/mm/pti.c
+++ b/arch/x86/mm/pti.c
@@ -48,6 +48,11 @@
#undef pr_fmt
#define pr_fmt(fmt) "Kernel/User page tables isolation: " fmt
+/* Backporting helper */
+#ifndef __GFP_NOTRACK
+#define __GFP_NOTRACK 0
+#endif
+
static void __init pti_print_if_insecure(const char *reason)
{
if (boot_cpu_has_bug(X86_BUG_CPU_INSECURE))
@@ -138,6 +143,128 @@ pgd_t __pti_set_user_pgd(pgd_t *pgdp, pg
}
/*
+ * Walk the user copy of the page tables (optionally) trying to allocate
+ * page table pages on the way down.
+ *
+ * Returns a pointer to a P4D on success, or NULL on failure.
+ */
+static p4d_t *pti_user_pagetable_walk_p4d(unsigned long address)
+{
+ pgd_t *pgd = kernel_to_user_pgdp(pgd_offset_k(address));
+ gfp_t gfp = (GFP_KERNEL | __GFP_NOTRACK | __GFP_ZERO);
+
+ if (address < PAGE_OFFSET) {
+ WARN_ONCE(1, "attempt to walk user address\n");
+ return NULL;
+ }
+
+ if (pgd_none(*pgd)) {
+ unsigned long new_p4d_page = __get_free_page(gfp);
+ if (!new_p4d_page)
+ return NULL;
+
+ if (pgd_none(*pgd)) {
+ set_pgd(pgd, __pgd(_KERNPG_TABLE | __pa(new_p4d_page)));
+ new_p4d_page = 0;
+ }
+ if (new_p4d_page)
+ free_page(new_p4d_page);
+ }
+ BUILD_BUG_ON(pgd_large(*pgd) != 0);
+
+ return p4d_offset(pgd, address);
+}
+
+/*
+ * Walk the user copy of the page tables (optionally) trying to allocate
+ * page table pages on the way down.
+ *
+ * Returns a pointer to a PMD on success, or NULL on failure.
+ */
+static pmd_t *pti_user_pagetable_walk_pmd(unsigned long address)
+{
+ gfp_t gfp = (GFP_KERNEL | __GFP_NOTRACK | __GFP_ZERO);
+ p4d_t *p4d = pti_user_pagetable_walk_p4d(address);
+ pud_t *pud;
+
+ BUILD_BUG_ON(p4d_large(*p4d) != 0);
+ if (p4d_none(*p4d)) {
+ unsigned long new_pud_page = __get_free_page(gfp);
+ if (!new_pud_page)
+ return NULL;
+
+ if (p4d_none(*p4d)) {
+ set_p4d(p4d, __p4d(_KERNPG_TABLE | __pa(new_pud_page)));
+ new_pud_page = 0;
+ }
+ if (new_pud_page)
+ free_page(new_pud_page);
+ }
+
+ pud = pud_offset(p4d, address);
+ /* The user page tables do not use large mappings: */
+ if (pud_large(*pud)) {
+ WARN_ON(1);
+ return NULL;
+ }
+ if (pud_none(*pud)) {
+ unsigned long new_pmd_page = __get_free_page(gfp);
+ if (!new_pmd_page)
+ return NULL;
+
+ if (pud_none(*pud)) {
+ set_pud(pud, __pud(_KERNPG_TABLE | __pa(new_pmd_page)));
+ new_pmd_page = 0;
+ }
+ if (new_pmd_page)
+ free_page(new_pmd_page);
+ }
+
+ return pmd_offset(pud, address);
+}
+
+static void __init
+pti_clone_pmds(unsigned long start, unsigned long end, pmdval_t clear)
+{
+ unsigned long addr;
+
+ /*
+ * Clone the populated PMDs which cover start to end. These PMD areas
+ * can have holes.
+ */
+ for (addr = start; addr < end; addr += PMD_SIZE) {
+ pmd_t *pmd, *target_pmd;
+ pgd_t *pgd;
+ p4d_t *p4d;
+ pud_t *pud;
+
+ pgd = pgd_offset_k(addr);
+ if (WARN_ON(pgd_none(*pgd)))
+ return;
+ p4d = p4d_offset(pgd, addr);
+ if (WARN_ON(p4d_none(*p4d)))
+ return;
+ pud = pud_offset(p4d, addr);
+ if (pud_none(*pud))
+ continue;
+ pmd = pmd_offset(pud, addr);
+ if (pmd_none(*pmd))
+ continue;
+
+ target_pmd = pti_user_pagetable_walk_pmd(addr);
+ if (WARN_ON(!target_pmd))
+ return;
+
+ /*
+ * Copy the PMD. That is, the kernelmode and usermode
+ * tables will share the last-level page tables of this
+ * address range
+ */
+ *target_pmd = pmd_clear_flags(*pmd, clear);
+ }
+}
+
+/*
* Initialize kernel page table isolation
*/
void __init pti_init(void)
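The cloning loop above walks the range in PMD_SIZE steps, skips holes and
copies each populated entry with selected flags cleared. A small user-space
sketch of that loop (the arrays, the flag value and the 2 MiB step are
illustrative only; this is not the kernel walker):

#include <stdint.h>
#include <stdio.h>

#define PMD_SIZE	(2UL << 20)	/* 2 MiB covered per PMD on x86-64 */
#define FLAG_GLOBAL	0x100UL		/* stand-in for a flag to clear */

static uint64_t kernel_pmd[8] = { 0x1001, 0, 0x2101, 0x3001, 0, 0, 0x4101, 0 };
static uint64_t user_pmd[8];

static void clone_pmds(unsigned long start, unsigned long end, uint64_t clear)
{
	for (unsigned long addr = start; addr < end; addr += PMD_SIZE) {
		unsigned long idx = addr / PMD_SIZE;

		if (!kernel_pmd[idx])		/* hole: nothing to share */
			continue;
		user_pmd[idx] = kernel_pmd[idx] & ~clear;
	}
}

int main(void)
{
	clone_pmds(0, 8 * PMD_SIZE, FLAG_GLOBAL);
	for (int i = 0; i < 8; i++)
		printf("pmd[%d]: kernel=%#llx user=%#llx\n", i,
		       (unsigned long long)kernel_pmd[i],
		       (unsigned long long)user_pmd[i]);
	return 0;
}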
Patches currently in stable-queue which might be from luto(a)kernel.org are
queue-4.14/x86-mm-pti-allow-nx-poison-to-be-set-in-p4d-pgd.patch
queue-4.14/x86-dumpstack-indicate-in-oops-whether-pti-is-configured-and-enabled.patch
queue-4.14/x86-mm-clarify-the-whole-asid-kernel-pcid-user-pcid-naming.patch
queue-4.14/x86-mm-abstract-switching-cr3.patch
queue-4.14/x86-mm-optimize-restore_cr3.patch
queue-4.14/x86-mm-pti-force-entry-through-trampoline-when-pti-active.patch
queue-4.14/x86-entry-align-entry-text-section-to-pmd-boundary.patch
queue-4.14/x86-mm-64-make-a-full-pgd-entry-size-hole-in-the-memory-map.patch
queue-4.14/x86-cpu_entry_area-add-debugstore-entries-to-cpu_entry_area.patch
queue-4.14/x86-mm-pti-share-entry-text-pmd.patch
queue-4.14/x86-ldt-make-the-ldt-mapping-ro.patch
queue-4.14/x86-mm-dump_pagetables-check-user-space-page-table-for-wx-pages.patch
queue-4.14/x86-pti-map-the-vsyscall-page-if-needed.patch
queue-4.14/x86-mm-pti-disable-global-pages-if-page_table_isolation-y.patch
queue-4.14/x86-mm-dump_pagetables-allow-dumping-current-pagetables.patch
queue-4.14/x86-mm-pti-add-infrastructure-for-page-table-isolation.patch
queue-4.14/x86-cpufeatures-add-x86_bug_cpu_insecure.patch
queue-4.14/x86-mm-use-fix-pcid-to-optimize-user-kernel-switches.patch
queue-4.14/x86-mm-use-invpcid-for-__native_flush_tlb_single.patch
queue-4.14/x86-mm-pti-prepare-the-x86-entry-assembly-code-for-entry-exit-cr3-switching.patch
queue-4.14/x86-mm-allow-flushing-for-future-asid-switches.patch
queue-4.14/x86-mm-pti-add-kconfig.patch
queue-4.14/x86-mm-pti-add-mapping-helper-functions.patch
queue-4.14/x86-mm-pti-map-espfix-into-user-space.patch
queue-4.14/x86-mm-pti-share-cpu_entry_area-with-user-space-page-tables.patch
queue-4.14/x86-mm-pti-add-functions-to-clone-kernel-pmds.patch
queue-4.14/x86-pti-add-the-pti-cmdline-option-and-documentation.patch
queue-4.14/x86-events-intel-ds-map-debug-buffers-in-cpu_entry_area.patch
queue-4.14/x86-mm-pti-allocate-a-separate-user-pgd.patch
queue-4.14/x86-pti-put-the-ldt-in-its-own-pgd-if-pti-is-on.patch
queue-4.14/x86-mm-pti-populate-user-pgd.patch
queue-4.14/x86-mm-dump_pagetables-add-page-table-directory-to-the-debugfs-vfs-hierarchy.patch
This is a note to let you know that I've just added the patch titled
x86/mm: Optimize RESTORE_CR3
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-mm-optimize-restore_cr3.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 21e94459110252d41b45c0c8ba50fd72a664d50c Mon Sep 17 00:00:00 2001
From: Peter Zijlstra <peterz(a)infradead.org>
Date: Mon, 4 Dec 2017 15:08:00 +0100
Subject: x86/mm: Optimize RESTORE_CR3
From: Peter Zijlstra <peterz(a)infradead.org>
commit 21e94459110252d41b45c0c8ba50fd72a664d50c upstream.
Most NMI/paranoid exceptions will not in fact change the page tables and thus
do not require TLB flushing; RESTORE_CR3, however, currently always uses
flushing CR3 writes.
Restores to kernel PCIDs can be NOFLUSH, because we explicitly flush the
kernel mappings; and now that we track which user PCIDs need flushing, those
flushes can be avoided too when possible.
This does mean RESTORE_CR3 needs an additional scratch_reg; luckily, both
call sites have plenty available.
Signed-off-by: Peter Zijlstra (Intel) <peterz(a)infradead.org>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Andy Lutomirski <luto(a)kernel.org>
Cc: Boris Ostrovsky <boris.ostrovsky(a)oracle.com>
Cc: Borislav Petkov <bp(a)alien8.de>
Cc: Brian Gerst <brgerst(a)gmail.com>
Cc: Dave Hansen <dave.hansen(a)linux.intel.com>
Cc: David Laight <David.Laight(a)aculab.com>
Cc: Denys Vlasenko <dvlasenk(a)redhat.com>
Cc: Eduardo Valentin <eduval(a)amazon.com>
Cc: Greg KH <gregkh(a)linuxfoundation.org>
Cc: H. Peter Anvin <hpa(a)zytor.com>
Cc: Josh Poimboeuf <jpoimboe(a)redhat.com>
Cc: Juergen Gross <jgross(a)suse.com>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Will Deacon <will.deacon(a)arm.com>
Cc: aliguori(a)amazon.com
Cc: daniel.gruss(a)iaik.tugraz.at
Cc: hughd(a)google.com
Cc: keescook(a)google.com
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/entry/calling.h | 30 ++++++++++++++++++++++++++++--
arch/x86/entry/entry_64.S | 4 ++--
2 files changed, 30 insertions(+), 4 deletions(-)
--- a/arch/x86/entry/calling.h
+++ b/arch/x86/entry/calling.h
@@ -281,8 +281,34 @@ For 32-bit we have the following convent
.Ldone_\@:
.endm
-.macro RESTORE_CR3 save_reg:req
+.macro RESTORE_CR3 scratch_reg:req save_reg:req
ALTERNATIVE "jmp .Lend_\@", "", X86_FEATURE_PTI
+
+ ALTERNATIVE "jmp .Lwrcr3_\@", "", X86_FEATURE_PCID
+
+ /*
+ * KERNEL pages can always resume with NOFLUSH as we do
+ * explicit flushes.
+ */
+ bt $X86_CR3_PTI_SWITCH_BIT, \save_reg
+ jnc .Lnoflush_\@
+
+ /*
+ * Check if there's a pending flush for the user ASID we're
+ * about to set.
+ */
+ movq \save_reg, \scratch_reg
+ andq $(0x7FF), \scratch_reg
+ bt \scratch_reg, THIS_CPU_user_pcid_flush_mask
+ jnc .Lnoflush_\@
+
+ btr \scratch_reg, THIS_CPU_user_pcid_flush_mask
+ jmp .Lwrcr3_\@
+
+.Lnoflush_\@:
+ SET_NOFLUSH_BIT \save_reg
+
+.Lwrcr3_\@:
/*
* The CR3 write could be avoided when not changing its value,
* but would require a CR3 read *and* a scratch register.
@@ -301,7 +327,7 @@ For 32-bit we have the following convent
.endm
.macro SAVE_AND_SWITCH_TO_KERNEL_CR3 scratch_reg:req save_reg:req
.endm
-.macro RESTORE_CR3 save_reg:req
+.macro RESTORE_CR3 scratch_reg:req save_reg:req
.endm
#endif
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -1288,7 +1288,7 @@ ENTRY(paranoid_exit)
testl %ebx, %ebx /* swapgs needed? */
jnz .Lparanoid_exit_no_swapgs
TRACE_IRQS_IRETQ
- RESTORE_CR3 save_reg=%r14
+ RESTORE_CR3 scratch_reg=%rbx save_reg=%r14
SWAPGS_UNSAFE_STACK
jmp .Lparanoid_exit_restore
.Lparanoid_exit_no_swapgs:
@@ -1730,7 +1730,7 @@ end_repeat_nmi:
movq $-1, %rsi
call do_nmi
- RESTORE_CR3 save_reg=%r14
+ RESTORE_CR3 scratch_reg=%r15 save_reg=%r14
testl %ebx, %ebx /* swapgs needed? */
jnz nmi_restore
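The assembly above encodes a small decision tree: kernel PCIDs always get
NOFLUSH, and user PCIDs get NOFLUSH only when no flush is pending for that
ASID. A hedged C sketch of the same logic (constants and the flush-mask layout
are simplified; the real mask is per-CPU and the real values live in CR3):

#include <stdint.h>
#include <stdio.h>

#define CR3_NOFLUSH	(1ULL << 63)
#define PTI_SWITCH_BIT	11		/* bit distinguishing user PCIDs */
#define ASID_MASK	0x7ffULL

static uint64_t user_pcid_flush_mask;	/* per-CPU in the real code */

static uint64_t restore_cr3(uint64_t saved_cr3)
{
	uint64_t asid = saved_cr3 & ASID_MASK;

	if (!(saved_cr3 & (1ULL << PTI_SWITCH_BIT)))
		return saved_cr3 | CR3_NOFLUSH;		/* kernel PCID */

	if (user_pcid_flush_mask & (1ULL << asid)) {
		user_pcid_flush_mask &= ~(1ULL << asid);
		return saved_cr3;			/* flushing write */
	}
	return saved_cr3 | CR3_NOFLUSH;			/* nothing pending */
}

int main(void)
{
	user_pcid_flush_mask = 1ULL << 3;
	/* first restore of this user PCID flushes, the second one does not */
	printf("%#llx\n", (unsigned long long)restore_cr3(0x1000ULL | (1 << 11) | 3));
	printf("%#llx\n", (unsigned long long)restore_cr3(0x1000ULL | (1 << 11) | 3));
	return 0;
}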
Patches currently in stable-queue which might be from peterz(a)infradead.org are
queue-4.14/x86-mm-pti-allow-nx-poison-to-be-set-in-p4d-pgd.patch
queue-4.14/x86-dumpstack-indicate-in-oops-whether-pti-is-configured-and-enabled.patch
queue-4.14/x86-mm-clarify-the-whole-asid-kernel-pcid-user-pcid-naming.patch
queue-4.14/x86-mm-abstract-switching-cr3.patch
queue-4.14/x86-mm-optimize-restore_cr3.patch
queue-4.14/x86-mm-pti-force-entry-through-trampoline-when-pti-active.patch
queue-4.14/x86-entry-align-entry-text-section-to-pmd-boundary.patch
queue-4.14/x86-mm-64-make-a-full-pgd-entry-size-hole-in-the-memory-map.patch
queue-4.14/x86-cpu_entry_area-add-debugstore-entries-to-cpu_entry_area.patch
queue-4.14/x86-mm-pti-share-entry-text-pmd.patch
queue-4.14/x86-ldt-make-the-ldt-mapping-ro.patch
queue-4.14/x86-mm-dump_pagetables-check-user-space-page-table-for-wx-pages.patch
queue-4.14/x86-pti-map-the-vsyscall-page-if-needed.patch
queue-4.14/x86-mm-pti-disable-global-pages-if-page_table_isolation-y.patch
queue-4.14/x86-mm-dump_pagetables-allow-dumping-current-pagetables.patch
queue-4.14/x86-mm-pti-add-infrastructure-for-page-table-isolation.patch
queue-4.14/x86-cpufeatures-add-x86_bug_cpu_insecure.patch
queue-4.14/x86-mm-use-fix-pcid-to-optimize-user-kernel-switches.patch
queue-4.14/x86-mm-use-invpcid-for-__native_flush_tlb_single.patch
queue-4.14/x86-mm-pti-prepare-the-x86-entry-assembly-code-for-entry-exit-cr3-switching.patch
queue-4.14/x86-mm-allow-flushing-for-future-asid-switches.patch
queue-4.14/x86-mm-pti-add-kconfig.patch
queue-4.14/x86-mm-pti-add-mapping-helper-functions.patch
queue-4.14/x86-mm-pti-map-espfix-into-user-space.patch
queue-4.14/x86-mm-pti-share-cpu_entry_area-with-user-space-page-tables.patch
queue-4.14/x86-mm-pti-add-functions-to-clone-kernel-pmds.patch
queue-4.14/x86-pti-add-the-pti-cmdline-option-and-documentation.patch
queue-4.14/x86-events-intel-ds-map-debug-buffers-in-cpu_entry_area.patch
queue-4.14/x86-mm-pti-allocate-a-separate-user-pgd.patch
queue-4.14/x86-pti-put-the-ldt-in-its-own-pgd-if-pti-is-on.patch
queue-4.14/x86-mm-pti-populate-user-pgd.patch
queue-4.14/x86-mm-dump_pagetables-add-page-table-directory-to-the-debugfs-vfs-hierarchy.patch
This is a note to let you know that I've just added the patch titled
x86/mm/dump_pagetables: Add page table directory to the debugfs VFS hierarchy
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-mm-dump_pagetables-add-page-table-directory-to-the-debugfs-vfs-hierarchy.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 75298aa179d56cd64f54e58a19fffc8ab922b4c0 Mon Sep 17 00:00:00 2001
From: Borislav Petkov <bp(a)suse.de>
Date: Mon, 4 Dec 2017 15:08:04 +0100
Subject: x86/mm/dump_pagetables: Add page table directory to the debugfs VFS hierarchy
From: Borislav Petkov <bp(a)suse.de>
commit 75298aa179d56cd64f54e58a19fffc8ab922b4c0 upstream.
The upcoming support for dumping the kernel and the user space page tables
of the current process would create more random files in the top level
debugfs directory.
Add a page table directory and move the existing file to it.
Signed-off-by: Borislav Petkov <bp(a)suse.de>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Andy Lutomirski <luto(a)kernel.org>
Cc: Boris Ostrovsky <boris.ostrovsky(a)oracle.com>
Cc: Borislav Petkov <bp(a)alien8.de>
Cc: Brian Gerst <brgerst(a)gmail.com>
Cc: Dave Hansen <dave.hansen(a)linux.intel.com>
Cc: David Laight <David.Laight(a)aculab.com>
Cc: Denys Vlasenko <dvlasenk(a)redhat.com>
Cc: Eduardo Valentin <eduval(a)amazon.com>
Cc: Greg KH <gregkh(a)linuxfoundation.org>
Cc: H. Peter Anvin <hpa(a)zytor.com>
Cc: Josh Poimboeuf <jpoimboe(a)redhat.com>
Cc: Juergen Gross <jgross(a)suse.com>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Will Deacon <will.deacon(a)arm.com>
Cc: aliguori(a)amazon.com
Cc: daniel.gruss(a)iaik.tugraz.at
Cc: hughd(a)google.com
Cc: keescook(a)google.com
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/mm/debug_pagetables.c | 15 ++++++++++-----
1 file changed, 10 insertions(+), 5 deletions(-)
--- a/arch/x86/mm/debug_pagetables.c
+++ b/arch/x86/mm/debug_pagetables.c
@@ -22,21 +22,26 @@ static const struct file_operations ptdu
.release = single_release,
};
-static struct dentry *pe;
+static struct dentry *dir, *pe;
static int __init pt_dump_debug_init(void)
{
- pe = debugfs_create_file("kernel_page_tables", S_IRUSR, NULL, NULL,
- &ptdump_fops);
- if (!pe)
+ dir = debugfs_create_dir("page_tables", NULL);
+ if (!dir)
return -ENOMEM;
+ pe = debugfs_create_file("kernel", 0400, dir, NULL, &ptdump_fops);
+ if (!pe)
+ goto err;
return 0;
+err:
+ debugfs_remove_recursive(dir);
+ return -ENOMEM;
}
static void __exit pt_dump_debug_exit(void)
{
- debugfs_remove_recursive(pe);
+ debugfs_remove_recursive(dir);
}
module_init(pt_dump_debug_init);
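The pattern here (create the directory first, then the file, and tear the
directory down if the second step fails) looks the same in user space. A rough
analogue, with mkdir()/fopen() standing in for the debugfs calls; illustration
only:

#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

int main(void)
{
	if (mkdir("page_tables", 0700) != 0) {
		perror("mkdir");
		return 1;
	}

	FILE *f = fopen("page_tables/kernel", "w");
	if (!f) {
		perror("fopen");
		rmdir("page_tables");	/* analogue of debugfs_remove_recursive() */
		return 1;
	}
	fclose(f);

	/* cleanup for the demo */
	unlink("page_tables/kernel");
	rmdir("page_tables");
	return 0;
}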
Patches currently in stable-queue which might be from bp(a)suse.de are
queue-4.14/x86-mm-pti-allow-nx-poison-to-be-set-in-p4d-pgd.patch
queue-4.14/x86-mm-pti-force-entry-through-trampoline-when-pti-active.patch
queue-4.14/x86-mm-pti-disable-global-pages-if-page_table_isolation-y.patch
queue-4.14/x86-mm-pti-add-infrastructure-for-page-table-isolation.patch
queue-4.14/x86-mm-pti-prepare-the-x86-entry-assembly-code-for-entry-exit-cr3-switching.patch
queue-4.14/x86-pti-add-the-pti-cmdline-option-and-documentation.patch
queue-4.14/x86-mm-pti-allocate-a-separate-user-pgd.patch
queue-4.14/x86-mm-pti-populate-user-pgd.patch
queue-4.14/x86-mm-dump_pagetables-add-page-table-directory-to-the-debugfs-vfs-hierarchy.patch
This is a note to let you know that I've just added the patch titled
x86/mm: Clarify the whole ASID/kernel PCID/user PCID naming
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-mm-clarify-the-whole-asid-kernel-pcid-user-pcid-naming.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 0a126abd576ebc6403f063dbe20cf7416c9d9393 Mon Sep 17 00:00:00 2001
From: Peter Zijlstra <peterz(a)infradead.org>
Date: Tue, 5 Dec 2017 13:34:53 +0100
Subject: x86/mm: Clarify the whole ASID/kernel PCID/user PCID naming
From: Peter Zijlstra <peterz(a)infradead.org>
commit 0a126abd576ebc6403f063dbe20cf7416c9d9393 upstream.
Ideally we'd also use sparse to enforce this separation so it becomes much
more difficult to mess up.
Signed-off-by: Peter Zijlstra (Intel) <peterz(a)infradead.org>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Andy Lutomirski <luto(a)kernel.org>
Cc: Boris Ostrovsky <boris.ostrovsky(a)oracle.com>
Cc: Borislav Petkov <bp(a)alien8.de>
Cc: Brian Gerst <brgerst(a)gmail.com>
Cc: Dave Hansen <dave.hansen(a)linux.intel.com>
Cc: David Laight <David.Laight(a)aculab.com>
Cc: Denys Vlasenko <dvlasenk(a)redhat.com>
Cc: Eduardo Valentin <eduval(a)amazon.com>
Cc: Greg KH <gregkh(a)linuxfoundation.org>
Cc: H. Peter Anvin <hpa(a)zytor.com>
Cc: Josh Poimboeuf <jpoimboe(a)redhat.com>
Cc: Juergen Gross <jgross(a)suse.com>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Will Deacon <will.deacon(a)arm.com>
Cc: aliguori(a)amazon.com
Cc: daniel.gruss(a)iaik.tugraz.at
Cc: hughd(a)google.com
Cc: keescook(a)google.com
Cc: linux-mm(a)kvack.org
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/include/asm/tlbflush.h | 55 +++++++++++++++++++++++++++++++---------
1 file changed, 43 insertions(+), 12 deletions(-)
--- a/arch/x86/include/asm/tlbflush.h
+++ b/arch/x86/include/asm/tlbflush.h
@@ -13,16 +13,33 @@
#include <asm/pti.h>
#include <asm/processor-flags.h>
-static inline u64 inc_mm_tlb_gen(struct mm_struct *mm)
-{
- /*
- * Bump the generation count. This also serves as a full barrier
- * that synchronizes with switch_mm(): callers are required to order
- * their read of mm_cpumask after their writes to the paging
- * structures.
- */
- return atomic64_inc_return(&mm->context.tlb_gen);
-}
+/*
+ * The x86 feature is called PCID (Process Context IDentifier). It is similar
+ * to what is traditionally called ASID on the RISC processors.
+ *
+ * We don't use the traditional ASID implementation, where each process/mm gets
+ * its own ASID and flush/restart when we run out of ASID space.
+ *
+ * Instead we have a small per-cpu array of ASIDs and cache the last few mm's
+ * that came by on this CPU, allowing cheaper switch_mm between processes on
+ * this CPU.
+ *
+ * We end up with different spaces for different things. To avoid confusion we
+ * use different names for each of them:
+ *
+ * ASID - [0, TLB_NR_DYN_ASIDS-1]
+ * the canonical identifier for an mm
+ *
+ * kPCID - [1, TLB_NR_DYN_ASIDS]
+ * the value we write into the PCID part of CR3; corresponds to the
+ * ASID+1, because PCID 0 is special.
+ *
+ * uPCID - [2048 + 1, 2048 + TLB_NR_DYN_ASIDS]
+ * for KPTI each mm has two address spaces and thus needs two
+ * PCID values, but we can still do with a single ASID denomination
+ * for each mm. Corresponds to kPCID + 2048.
+ *
+ */
/* There are 12 bits of space for ASIDS in CR3 */
#define CR3_HW_ASID_BITS 12
@@ -41,7 +58,7 @@ static inline u64 inc_mm_tlb_gen(struct
/*
* ASIDs are zero-based: 0->MAX_AVAIL_ASID are valid. -1 below to account
- * for them being zero-based. Another -1 is because ASID 0 is reserved for
+ * for them being zero-based. Another -1 is because PCID 0 is reserved for
* use by non-PCID-aware users.
*/
#define MAX_ASID_AVAILABLE ((1 << CR3_AVAIL_PCID_BITS) - 2)
@@ -52,6 +69,9 @@ static inline u64 inc_mm_tlb_gen(struct
*/
#define TLB_NR_DYN_ASIDS 6
+/*
+ * Given @asid, compute kPCID
+ */
static inline u16 kern_pcid(u16 asid)
{
VM_WARN_ON_ONCE(asid > MAX_ASID_AVAILABLE);
@@ -86,7 +106,7 @@ static inline u16 kern_pcid(u16 asid)
}
/*
- * The user PCID is just the kernel one, plus the "switch bit".
+ * Given @asid, compute uPCID
*/
static inline u16 user_pcid(u16 asid)
{
@@ -484,6 +504,17 @@ static inline void flush_tlb_page(struct
void native_flush_tlb_others(const struct cpumask *cpumask,
const struct flush_tlb_info *info);
+static inline u64 inc_mm_tlb_gen(struct mm_struct *mm)
+{
+ /*
+ * Bump the generation count. This also serves as a full barrier
+ * that synchronizes with switch_mm(): callers are required to order
+ * their read of mm_cpumask after their writes to the paging
+ * structures.
+ */
+ return atomic64_inc_return(&mm->context.tlb_gen);
+}
+
static inline void arch_tlbbatch_add_mm(struct arch_tlbflush_unmap_batch *batch,
struct mm_struct *mm)
{
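The mapping between the three name spaces can be spelled out in a few lines of
C. A user-space sketch (the 2048 offset matches the comment above; the helpers
are simplified illustrations, not the kernel's kern_pcid()/user_pcid()):

#include <stdio.h>

#define TLB_NR_DYN_ASIDS	6
#define PTI_USER_PCID_OFFSET	2048	/* 1 << X86_CR3_PTI_SWITCH_BIT */

/* kPCID: what goes into CR3; ASID + 1 because PCID 0 is reserved. */
static unsigned int kern_pcid(unsigned int asid) { return asid + 1; }
/* uPCID: the user half of the same mm gets its own hardware PCID. */
static unsigned int user_pcid(unsigned int asid) { return kern_pcid(asid) + PTI_USER_PCID_OFFSET; }

int main(void)
{
	for (unsigned int asid = 0; asid < TLB_NR_DYN_ASIDS; asid++)
		printf("ASID %u -> kPCID %u, uPCID %u\n",
		       asid, kern_pcid(asid), user_pcid(asid));
	return 0;
}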
Patches currently in stable-queue which might be from peterz(a)infradead.org are
queue-4.14/x86-mm-pti-allow-nx-poison-to-be-set-in-p4d-pgd.patch
queue-4.14/x86-dumpstack-indicate-in-oops-whether-pti-is-configured-and-enabled.patch
queue-4.14/x86-mm-clarify-the-whole-asid-kernel-pcid-user-pcid-naming.patch
queue-4.14/x86-mm-abstract-switching-cr3.patch
queue-4.14/x86-mm-optimize-restore_cr3.patch
queue-4.14/x86-mm-pti-force-entry-through-trampoline-when-pti-active.patch
queue-4.14/x86-entry-align-entry-text-section-to-pmd-boundary.patch
queue-4.14/x86-mm-64-make-a-full-pgd-entry-size-hole-in-the-memory-map.patch
queue-4.14/x86-cpu_entry_area-add-debugstore-entries-to-cpu_entry_area.patch
queue-4.14/x86-mm-pti-share-entry-text-pmd.patch
queue-4.14/x86-ldt-make-the-ldt-mapping-ro.patch
queue-4.14/x86-mm-dump_pagetables-check-user-space-page-table-for-wx-pages.patch
queue-4.14/x86-pti-map-the-vsyscall-page-if-needed.patch
queue-4.14/x86-mm-pti-disable-global-pages-if-page_table_isolation-y.patch
queue-4.14/x86-mm-dump_pagetables-allow-dumping-current-pagetables.patch
queue-4.14/x86-mm-pti-add-infrastructure-for-page-table-isolation.patch
queue-4.14/x86-cpufeatures-add-x86_bug_cpu_insecure.patch
queue-4.14/x86-mm-use-fix-pcid-to-optimize-user-kernel-switches.patch
queue-4.14/x86-mm-use-invpcid-for-__native_flush_tlb_single.patch
queue-4.14/x86-mm-pti-prepare-the-x86-entry-assembly-code-for-entry-exit-cr3-switching.patch
queue-4.14/x86-mm-allow-flushing-for-future-asid-switches.patch
queue-4.14/x86-mm-pti-add-kconfig.patch
queue-4.14/x86-mm-pti-add-mapping-helper-functions.patch
queue-4.14/x86-mm-pti-map-espfix-into-user-space.patch
queue-4.14/x86-mm-pti-share-cpu_entry_area-with-user-space-page-tables.patch
queue-4.14/x86-mm-pti-add-functions-to-clone-kernel-pmds.patch
queue-4.14/x86-pti-add-the-pti-cmdline-option-and-documentation.patch
queue-4.14/x86-events-intel-ds-map-debug-buffers-in-cpu_entry_area.patch
queue-4.14/x86-mm-pti-allocate-a-separate-user-pgd.patch
queue-4.14/x86-pti-put-the-ldt-in-its-own-pgd-if-pti-is-on.patch
queue-4.14/x86-mm-pti-populate-user-pgd.patch
queue-4.14/x86-mm-dump_pagetables-add-page-table-directory-to-the-debugfs-vfs-hierarchy.patch
This is a note to let you know that I've just added the patch titled
x86/mm: Allow flushing for future ASID switches
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-mm-allow-flushing-for-future-asid-switches.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 2ea907c4fe7b78e5840c1dc07800eae93248cad1 Mon Sep 17 00:00:00 2001
From: Dave Hansen <dave.hansen(a)linux.intel.com>
Date: Mon, 4 Dec 2017 15:07:57 +0100
Subject: x86/mm: Allow flushing for future ASID switches
From: Dave Hansen <dave.hansen(a)linux.intel.com>
commit 2ea907c4fe7b78e5840c1dc07800eae93248cad1 upstream.
If changing the page tables in such a way that an invalidation of all
contexts (aka. PCIDs / ASIDs) is required, they can be actively invalidated
by:
1. INVPCID for each PCID (works for single pages too).
2. Load CR3 with each PCID without the NOFLUSH bit set
3. Load CR3 with the NOFLUSH bit set for each and do INVLPG for each address.
But, none of these are really feasible since there are ~6 ASIDs (12 with
PAGE_TABLE_ISOLATION) at the time that invalidation is required.
Instead of actively invalidating them, invalidate the *current* context and
also mark the cpu_tlbstate _quickly_ to indicate that a future invalidation
will be required.
At the next context switch, look for this indicator
('invalidate_other' being set) and invalidate all of the
cpu_tlbstate.ctxs[] entries.
This ensures that any future context switches will do a full flush
of the TLB, picking up the previous changes.
[ tglx: Folded more fixups from Peter ]
Signed-off-by: Dave Hansen <dave.hansen(a)linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz(a)infradead.org>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Andy Lutomirski <luto(a)kernel.org>
Cc: Boris Ostrovsky <boris.ostrovsky(a)oracle.com>
Cc: Borislav Petkov <bp(a)alien8.de>
Cc: Brian Gerst <brgerst(a)gmail.com>
Cc: David Laight <David.Laight(a)aculab.com>
Cc: Denys Vlasenko <dvlasenk(a)redhat.com>
Cc: Eduardo Valentin <eduval(a)amazon.com>
Cc: Greg KH <gregkh(a)linuxfoundation.org>
Cc: H. Peter Anvin <hpa(a)zytor.com>
Cc: Josh Poimboeuf <jpoimboe(a)redhat.com>
Cc: Juergen Gross <jgross(a)suse.com>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Will Deacon <will.deacon(a)arm.com>
Cc: aliguori(a)amazon.com
Cc: daniel.gruss(a)iaik.tugraz.at
Cc: hughd(a)google.com
Cc: keescook(a)google.com
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/include/asm/tlbflush.h | 37 +++++++++++++++++++++++++++++--------
arch/x86/mm/tlb.c | 35 +++++++++++++++++++++++++++++++++++
2 files changed, 64 insertions(+), 8 deletions(-)
--- a/arch/x86/include/asm/tlbflush.h
+++ b/arch/x86/include/asm/tlbflush.h
@@ -135,6 +135,17 @@ struct tlb_state {
bool is_lazy;
/*
+ * If set we changed the page tables in such a way that we
+ * needed an invalidation of all contexts (aka. PCIDs / ASIDs).
+ * This tells us to go invalidate all the non-loaded ctxs[]
+ * on the next context switch.
+ *
+ * The current ctx was kept up-to-date as it ran and does not
+ * need to be invalidated.
+ */
+ bool invalidate_other;
+
+ /*
* Access to this CR4 shadow and to H/W CR4 is protected by
* disabling interrupts when modifying either one.
*/
@@ -212,6 +223,14 @@ static inline unsigned long cr4_read_sha
}
/*
+ * Mark all other ASIDs as invalid, preserves the current.
+ */
+static inline void invalidate_other_asid(void)
+{
+ this_cpu_write(cpu_tlbstate.invalidate_other, true);
+}
+
+/*
* Save some of cr4 feature set we're using (e.g. Pentium 4MB
* enable and PPro Global page enable), so that any CPU's that boot
* up after us can get the correct flags. This should only be used
@@ -298,14 +317,6 @@ static inline void __flush_tlb_all(void)
*/
__flush_tlb();
}
-
- /*
- * Note: if we somehow had PCID but not PGE, then this wouldn't work --
- * we'd end up flushing kernel translations for the current ASID but
- * we might fail to flush kernel translations for other cached ASIDs.
- *
- * To avoid this issue, we force PCID off if PGE is off.
- */
}
/*
@@ -315,6 +326,16 @@ static inline void __flush_tlb_one(unsig
{
count_vm_tlb_event(NR_TLB_LOCAL_FLUSH_ONE);
__flush_tlb_single(addr);
+
+ if (!static_cpu_has(X86_FEATURE_PTI))
+ return;
+
+ /*
+ * __flush_tlb_single() will have cleared the TLB entry for this ASID,
+ * but since kernel space is replicated across all, we must also
+ * invalidate all others.
+ */
+ invalidate_other_asid();
}
#define TLB_FLUSH_ALL -1UL
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -28,6 +28,38 @@
* Implement flush IPI by CALL_FUNCTION_VECTOR, Alex Shi
*/
+/*
+ * We get here when we do something requiring a TLB invalidation
+ * but could not go invalidate all of the contexts. We do the
+ * necessary invalidation by clearing out the 'ctx_id' which
+ * forces a TLB flush when the context is loaded.
+ */
+void clear_asid_other(void)
+{
+ u16 asid;
+
+ /*
+ * This is only expected to be set if we have disabled
+ * kernel _PAGE_GLOBAL pages.
+ */
+ if (!static_cpu_has(X86_FEATURE_PTI)) {
+ WARN_ON_ONCE(1);
+ return;
+ }
+
+ for (asid = 0; asid < TLB_NR_DYN_ASIDS; asid++) {
+ /* Do not need to flush the current asid */
+ if (asid == this_cpu_read(cpu_tlbstate.loaded_mm_asid))
+ continue;
+ /*
+ * Make sure the next time we go to switch to
+ * this asid, we do a flush:
+ */
+ this_cpu_write(cpu_tlbstate.ctxs[asid].ctx_id, 0);
+ }
+ this_cpu_write(cpu_tlbstate.invalidate_other, false);
+}
+
atomic64_t last_mm_ctx_id = ATOMIC64_INIT(1);
@@ -42,6 +74,9 @@ static void choose_new_asid(struct mm_st
return;
}
+ if (this_cpu_read(cpu_tlbstate.invalidate_other))
+ clear_asid_other();
+
for (asid = 0; asid < TLB_NR_DYN_ASIDS; asid++) {
if (this_cpu_read(cpu_tlbstate.ctxs[asid].ctx_id) !=
next->context.ctx_id)
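For illustration only, here is a minimal, self-contained C sketch of the deferred-invalidation pattern the commit message describes: flag once at invalidation time, then lazily wipe the other contexts at the next switch. The names below (ctxs, invalidate_other, switch_to) are stand-ins for the per-CPU cpu_tlbstate machinery, not the kernel's actual API.
#include <stdbool.h>
#include <stdio.h>
#define NR_ASIDS 6
struct ctx {
        unsigned long ctx_id;   /* 0 means "flush before reuse" */
};
static struct ctx ctxs[NR_ASIDS];
static int loaded_asid;
static bool invalidate_other;
/* Cheap step done at invalidation time: set a flag and move on. */
static void mark_other_asids_stale(void)
{
        invalidate_other = true;
}
/* Expensive walk, deferred until the next context switch. */
static void clear_stale_asids(void)
{
        for (int asid = 0; asid < NR_ASIDS; asid++) {
                if (asid == loaded_asid)
                        continue;              /* current ctx stayed up to date */
                ctxs[asid].ctx_id = 0;         /* force a flush on reload */
        }
        invalidate_other = false;
}
static void switch_to(int asid, unsigned long ctx_id)
{
        if (invalidate_other)
                clear_stale_asids();
        if (ctxs[asid].ctx_id != ctx_id)
                printf("asid %d: full TLB flush needed\n", asid);
        ctxs[asid].ctx_id = ctx_id;
        loaded_asid = asid;
}
int main(void)
{
        switch_to(0, 1);
        switch_to(1, 2);
        mark_other_asids_stale();   /* e.g. a kernel mapping changed */
        switch_to(0, 1);            /* reloading asid 0 now forces a flush */
        return 0;
}
The point of the pattern is the cost split: the flush-time work is a single store, and the per-ASID walk happens only once, on the next switch, as in clear_asid_other() above.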
Patches currently in stable-queue which might be from dave.hansen(a)linux.intel.com are
queue-4.14/x86-mm-pti-allow-nx-poison-to-be-set-in-p4d-pgd.patch
queue-4.14/x86-dumpstack-indicate-in-oops-whether-pti-is-configured-and-enabled.patch
queue-4.14/x86-mm-clarify-the-whole-asid-kernel-pcid-user-pcid-naming.patch
queue-4.14/x86-mm-abstract-switching-cr3.patch
queue-4.14/x86-mm-optimize-restore_cr3.patch
queue-4.14/x86-mm-pti-force-entry-through-trampoline-when-pti-active.patch
queue-4.14/x86-entry-align-entry-text-section-to-pmd-boundary.patch
queue-4.14/x86-mm-64-make-a-full-pgd-entry-size-hole-in-the-memory-map.patch
queue-4.14/x86-cpu_entry_area-add-debugstore-entries-to-cpu_entry_area.patch
queue-4.14/x86-mm-pti-share-entry-text-pmd.patch
queue-4.14/x86-ldt-make-the-ldt-mapping-ro.patch
queue-4.14/x86-mm-dump_pagetables-check-user-space-page-table-for-wx-pages.patch
queue-4.14/x86-pti-map-the-vsyscall-page-if-needed.patch
queue-4.14/x86-mm-pti-disable-global-pages-if-page_table_isolation-y.patch
queue-4.14/x86-mm-dump_pagetables-allow-dumping-current-pagetables.patch
queue-4.14/x86-mm-pti-add-infrastructure-for-page-table-isolation.patch
queue-4.14/x86-cpufeatures-add-x86_bug_cpu_insecure.patch
queue-4.14/x86-mm-use-fix-pcid-to-optimize-user-kernel-switches.patch
queue-4.14/x86-mm-use-invpcid-for-__native_flush_tlb_single.patch
queue-4.14/x86-mm-pti-prepare-the-x86-entry-assembly-code-for-entry-exit-cr3-switching.patch
queue-4.14/x86-mm-allow-flushing-for-future-asid-switches.patch
queue-4.14/x86-mm-pti-add-kconfig.patch
queue-4.14/x86-mm-pti-add-mapping-helper-functions.patch
queue-4.14/x86-mm-pti-map-espfix-into-user-space.patch
queue-4.14/x86-mm-pti-share-cpu_entry_area-with-user-space-page-tables.patch
queue-4.14/x86-mm-pti-add-functions-to-clone-kernel-pmds.patch
queue-4.14/x86-pti-add-the-pti-cmdline-option-and-documentation.patch
queue-4.14/x86-events-intel-ds-map-debug-buffers-in-cpu_entry_area.patch
queue-4.14/x86-mm-pti-allocate-a-separate-user-pgd.patch
queue-4.14/x86-pti-put-the-ldt-in-its-own-pgd-if-pti-is-on.patch
queue-4.14/x86-mm-pti-populate-user-pgd.patch
queue-4.14/x86-mm-dump_pagetables-add-page-table-directory-to-the-debugfs-vfs-hierarchy.patch
This is a note to let you know that I've just added the patch titled
x86/mm: Abstract switching CR3
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-mm-abstract-switching-cr3.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 48e111982cda033fec832c6b0592c2acedd85d04 Mon Sep 17 00:00:00 2001
From: Dave Hansen <dave.hansen(a)linux.intel.com>
Date: Mon, 4 Dec 2017 15:07:58 +0100
Subject: x86/mm: Abstract switching CR3
From: Dave Hansen <dave.hansen(a)linux.intel.com>
commit 48e111982cda033fec832c6b0592c2acedd85d04 upstream.
In preparation for adding additional PCID flushing, abstract the
loading of a new ASID into CR3.
[ PeterZ: Split out from big combo patch ]
Signed-off-by: Dave Hansen <dave.hansen(a)linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz(a)infradead.org>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Andy Lutomirski <luto(a)kernel.org>
Cc: Boris Ostrovsky <boris.ostrovsky(a)oracle.com>
Cc: Borislav Petkov <bp(a)alien8.de>
Cc: Brian Gerst <brgerst(a)gmail.com>
Cc: David Laight <David.Laight(a)aculab.com>
Cc: Denys Vlasenko <dvlasenk(a)redhat.com>
Cc: Eduardo Valentin <eduval(a)amazon.com>
Cc: Greg KH <gregkh(a)linuxfoundation.org>
Cc: H. Peter Anvin <hpa(a)zytor.com>
Cc: Josh Poimboeuf <jpoimboe(a)redhat.com>
Cc: Juergen Gross <jgross(a)suse.com>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Will Deacon <will.deacon(a)arm.com>
Cc: aliguori(a)amazon.com
Cc: daniel.gruss(a)iaik.tugraz.at
Cc: hughd(a)google.com
Cc: keescook(a)google.com
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/mm/tlb.c | 22 ++++++++++++++++++++--
1 file changed, 20 insertions(+), 2 deletions(-)
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -100,6 +100,24 @@ static void choose_new_asid(struct mm_st
*need_flush = true;
}
+static void load_new_mm_cr3(pgd_t *pgdir, u16 new_asid, bool need_flush)
+{
+ unsigned long new_mm_cr3;
+
+ if (need_flush) {
+ new_mm_cr3 = build_cr3(pgdir, new_asid);
+ } else {
+ new_mm_cr3 = build_cr3_noflush(pgdir, new_asid);
+ }
+
+ /*
+ * Caution: many callers of this function expect
+ * that load_cr3() is serializing and orders TLB
+ * fills with respect to the mm_cpumask writes.
+ */
+ write_cr3(new_mm_cr3);
+}
+
void leave_mm(int cpu)
{
struct mm_struct *loaded_mm = this_cpu_read(cpu_tlbstate.loaded_mm);
@@ -230,7 +248,7 @@ void switch_mm_irqs_off(struct mm_struct
if (need_flush) {
this_cpu_write(cpu_tlbstate.ctxs[new_asid].ctx_id, next->context.ctx_id);
this_cpu_write(cpu_tlbstate.ctxs[new_asid].tlb_gen, next_tlb_gen);
- write_cr3(build_cr3(next->pgd, new_asid));
+ load_new_mm_cr3(next->pgd, new_asid, true);
/*
* NB: This gets called via leave_mm() in the idle path
@@ -243,7 +261,7 @@ void switch_mm_irqs_off(struct mm_struct
trace_tlb_flush_rcuidle(TLB_FLUSH_ON_TASK_SWITCH, TLB_FLUSH_ALL);
} else {
/* The new ASID is already up to date. */
- write_cr3(build_cr3_noflush(next->pgd, new_asid));
+ load_new_mm_cr3(next->pgd, new_asid, false);
/* See above wrt _rcuidle. */
trace_tlb_flush_rcuidle(TLB_FLUSH_ON_TASK_SWITCH, 0);
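For illustration only, a sketch of the two CR3 values load_new_mm_cr3() chooses between: per the SDM, the PCID occupies the low 12 bits of CR3, and setting bit 63 on a CR3 write (with CR4.PCIDE enabled) asks the CPU not to flush that PCID's TLB entries. The helper names and the 1:1 asid-to-PCID mapping below are the editor's assumptions, not the kernel's build_cr3()/build_cr3_noflush() implementations.
#include <stdint.h>
#include <stdio.h>
#define CR3_PCID_MASK 0xfffull          /* PCID lives in CR3 bits 11:0 */
#define CR3_NOFLUSH   (1ull << 63)      /* "do not flush" hint on CR3 load */
/* Assumed 1:1 asid->PCID mapping; the kernel's mapping differs. */
static uint64_t sketch_build_cr3(uint64_t pgd_pa, uint16_t asid)
{
        return pgd_pa | (asid & CR3_PCID_MASK);
}
static uint64_t sketch_build_cr3_noflush(uint64_t pgd_pa, uint16_t asid)
{
        return sketch_build_cr3(pgd_pa, asid) | CR3_NOFLUSH;
}
int main(void)
{
        uint64_t pgd_pa = 0x123456000ull;   /* page-aligned physical address */
        printf("flushing load:     %#llx\n",
               (unsigned long long)sketch_build_cr3(pgd_pa, 2));
        printf("non-flushing load: %#llx\n",
               (unsigned long long)sketch_build_cr3_noflush(pgd_pa, 2));
        return 0;
}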
Patches currently in stable-queue which might be from dave.hansen(a)linux.intel.com are
queue-4.14/x86-mm-pti-allow-nx-poison-to-be-set-in-p4d-pgd.patch
queue-4.14/x86-dumpstack-indicate-in-oops-whether-pti-is-configured-and-enabled.patch
queue-4.14/x86-mm-clarify-the-whole-asid-kernel-pcid-user-pcid-naming.patch
queue-4.14/x86-mm-abstract-switching-cr3.patch
queue-4.14/x86-mm-optimize-restore_cr3.patch
queue-4.14/x86-mm-pti-force-entry-through-trampoline-when-pti-active.patch
queue-4.14/x86-entry-align-entry-text-section-to-pmd-boundary.patch
queue-4.14/x86-mm-64-make-a-full-pgd-entry-size-hole-in-the-memory-map.patch
queue-4.14/x86-cpu_entry_area-add-debugstore-entries-to-cpu_entry_area.patch
queue-4.14/x86-mm-pti-share-entry-text-pmd.patch
queue-4.14/x86-ldt-make-the-ldt-mapping-ro.patch
queue-4.14/x86-mm-dump_pagetables-check-user-space-page-table-for-wx-pages.patch
queue-4.14/x86-pti-map-the-vsyscall-page-if-needed.patch
queue-4.14/x86-mm-pti-disable-global-pages-if-page_table_isolation-y.patch
queue-4.14/x86-mm-dump_pagetables-allow-dumping-current-pagetables.patch
queue-4.14/x86-mm-pti-add-infrastructure-for-page-table-isolation.patch
queue-4.14/x86-cpufeatures-add-x86_bug_cpu_insecure.patch
queue-4.14/x86-mm-use-fix-pcid-to-optimize-user-kernel-switches.patch
queue-4.14/x86-mm-use-invpcid-for-__native_flush_tlb_single.patch
queue-4.14/x86-mm-pti-prepare-the-x86-entry-assembly-code-for-entry-exit-cr3-switching.patch
queue-4.14/x86-mm-allow-flushing-for-future-asid-switches.patch
queue-4.14/x86-mm-pti-add-kconfig.patch
queue-4.14/x86-mm-pti-add-mapping-helper-functions.patch
queue-4.14/x86-mm-pti-map-espfix-into-user-space.patch
queue-4.14/x86-mm-pti-share-cpu_entry_area-with-user-space-page-tables.patch
queue-4.14/x86-mm-pti-add-functions-to-clone-kernel-pmds.patch
queue-4.14/x86-pti-add-the-pti-cmdline-option-and-documentation.patch
queue-4.14/x86-events-intel-ds-map-debug-buffers-in-cpu_entry_area.patch
queue-4.14/x86-mm-pti-allocate-a-separate-user-pgd.patch
queue-4.14/x86-pti-put-the-ldt-in-its-own-pgd-if-pti-is-on.patch
queue-4.14/x86-mm-pti-populate-user-pgd.patch
queue-4.14/x86-mm-dump_pagetables-add-page-table-directory-to-the-debugfs-vfs-hierarchy.patch
This is a note to let you know that I've just added the patch titled
x86/mm/64: Make a full PGD-entry size hole in the memory map
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-mm-64-make-a-full-pgd-entry-size-hole-in-the-memory-map.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 9f449772a3106bcdd4eb8fdeb281147b0e99fb30 Mon Sep 17 00:00:00 2001
From: Andy Lutomirski <luto(a)kernel.org>
Date: Tue, 12 Dec 2017 07:56:44 -0800
Subject: x86/mm/64: Make a full PGD-entry size hole in the memory map
From: Andy Lutomirski <luto(a)kernel.org>
commit 9f449772a3106bcdd4eb8fdeb281147b0e99fb30 upstream.
Shrink vmalloc space from 16384TiB to 12800TiB to enlarge the hole starting
at 0xff90000000000000 to be a full PGD entry.
A subsequent patch will use this hole for the pagetable isolation LDT
alias.
Signed-off-by: Andy Lutomirski <luto(a)kernel.org>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Borislav Petkov <bp(a)alien8.de>
Cc: Brian Gerst <brgerst(a)gmail.com>
Cc: Dave Hansen <dave.hansen(a)intel.com>
Cc: Dave Hansen <dave.hansen(a)linux.intel.com>
Cc: David Laight <David.Laight(a)aculab.com>
Cc: H. Peter Anvin <hpa(a)zytor.com>
Cc: Josh Poimboeuf <jpoimboe(a)redhat.com>
Cc: Juergen Gross <jgross(a)suse.com>
Cc: Kees Cook <keescook(a)chromium.org>
Cc: Kirill A. Shutemov <kirill(a)shutemov.name>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
Documentation/x86/x86_64/mm.txt | 4 ++--
arch/x86/include/asm/pgtable_64_types.h | 4 ++--
2 files changed, 4 insertions(+), 4 deletions(-)
--- a/Documentation/x86/x86_64/mm.txt
+++ b/Documentation/x86/x86_64/mm.txt
@@ -29,8 +29,8 @@ Virtual memory map with 5 level page tab
hole caused by [56:63] sign extension
ff00000000000000 - ff0fffffffffffff (=52 bits) guard hole, reserved for hypervisor
ff10000000000000 - ff8fffffffffffff (=55 bits) direct mapping of all phys. memory
-ff90000000000000 - ff91ffffffffffff (=49 bits) hole
-ff92000000000000 - ffd1ffffffffffff (=54 bits) vmalloc/ioremap space
+ff90000000000000 - ff9fffffffffffff (=52 bits) hole
+ffa0000000000000 - ffd1ffffffffffff (=54 bits) vmalloc/ioremap space (12800 TB)
ffd2000000000000 - ffd3ffffffffffff (=49 bits) hole
ffd4000000000000 - ffd5ffffffffffff (=49 bits) virtual memory map (512TB)
... unused hole ...
--- a/arch/x86/include/asm/pgtable_64_types.h
+++ b/arch/x86/include/asm/pgtable_64_types.h
@@ -79,8 +79,8 @@ typedef struct { pteval_t pte; } pte_t;
#define MAXMEM _AC(__AC(1, UL) << MAX_PHYSMEM_BITS, UL)
#ifdef CONFIG_X86_5LEVEL
-# define VMALLOC_SIZE_TB _AC(16384, UL)
-# define __VMALLOC_BASE _AC(0xff92000000000000, UL)
+# define VMALLOC_SIZE_TB _AC(12800, UL)
+# define __VMALLOC_BASE _AC(0xffa0000000000000, UL)
# define __VMEMMAP_BASE _AC(0xffd4000000000000, UL)
#else
# define VMALLOC_SIZE_TB _AC(32, UL)
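As a quick consistency check of the numbers in this hunk (editor's arithmetic, not part of the patch): the hole grows from 2^49 to 2^52 bytes, and the vmalloc area shrinks by exactly that difference, 3584 TiB (16384 - 12800).
#include <stdio.h>
int main(void)
{
        unsigned long long old_hole = 0xff92000000000000ull - 0xff90000000000000ull;
        unsigned long long new_hole = 0xffa0000000000000ull - 0xff90000000000000ull;
        unsigned long long tib = 1ull << 40;
        printf("old hole:       %llu TiB\n", old_hole / tib);              /* 512  */
        printf("new hole:       %llu TiB\n", new_hole / tib);              /* 4096 */
        printf("vmalloc shrink: %llu TiB\n", (new_hole - old_hole) / tib); /* 3584 */
        return 0;
}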
Patches currently in stable-queue which might be from luto(a)kernel.org are
queue-4.14/x86-mm-pti-allow-nx-poison-to-be-set-in-p4d-pgd.patch
queue-4.14/x86-dumpstack-indicate-in-oops-whether-pti-is-configured-and-enabled.patch
queue-4.14/x86-mm-clarify-the-whole-asid-kernel-pcid-user-pcid-naming.patch
queue-4.14/x86-mm-abstract-switching-cr3.patch
queue-4.14/x86-mm-optimize-restore_cr3.patch
queue-4.14/x86-mm-pti-force-entry-through-trampoline-when-pti-active.patch
queue-4.14/x86-entry-align-entry-text-section-to-pmd-boundary.patch
queue-4.14/x86-mm-64-make-a-full-pgd-entry-size-hole-in-the-memory-map.patch
queue-4.14/x86-cpu_entry_area-add-debugstore-entries-to-cpu_entry_area.patch
queue-4.14/x86-mm-pti-share-entry-text-pmd.patch
queue-4.14/x86-ldt-make-the-ldt-mapping-ro.patch
queue-4.14/x86-mm-dump_pagetables-check-user-space-page-table-for-wx-pages.patch
queue-4.14/x86-pti-map-the-vsyscall-page-if-needed.patch
queue-4.14/x86-mm-pti-disable-global-pages-if-page_table_isolation-y.patch
queue-4.14/x86-mm-dump_pagetables-allow-dumping-current-pagetables.patch
queue-4.14/x86-mm-pti-add-infrastructure-for-page-table-isolation.patch
queue-4.14/x86-cpufeatures-add-x86_bug_cpu_insecure.patch
queue-4.14/x86-mm-use-fix-pcid-to-optimize-user-kernel-switches.patch
queue-4.14/x86-mm-use-invpcid-for-__native_flush_tlb_single.patch
queue-4.14/x86-mm-pti-prepare-the-x86-entry-assembly-code-for-entry-exit-cr3-switching.patch
queue-4.14/x86-mm-allow-flushing-for-future-asid-switches.patch
queue-4.14/x86-mm-pti-add-kconfig.patch
queue-4.14/x86-mm-pti-add-mapping-helper-functions.patch
queue-4.14/x86-mm-pti-map-espfix-into-user-space.patch
queue-4.14/x86-mm-pti-share-cpu_entry_area-with-user-space-page-tables.patch
queue-4.14/x86-mm-pti-add-functions-to-clone-kernel-pmds.patch
queue-4.14/x86-pti-add-the-pti-cmdline-option-and-documentation.patch
queue-4.14/x86-events-intel-ds-map-debug-buffers-in-cpu_entry_area.patch
queue-4.14/x86-mm-pti-allocate-a-separate-user-pgd.patch
queue-4.14/x86-pti-put-the-ldt-in-its-own-pgd-if-pti-is-on.patch
queue-4.14/x86-mm-pti-populate-user-pgd.patch
queue-4.14/x86-mm-dump_pagetables-add-page-table-directory-to-the-debugfs-vfs-hierarchy.patch
This is a note to let you know that I've just added the patch titled
x86/ldt: Make the LDT mapping RO
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-ldt-make-the-ldt-mapping-ro.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 9f5cb6b32d9e0a3a7453222baaf15664d92adbf2 Mon Sep 17 00:00:00 2001
From: Thomas Gleixner <tglx(a)linutronix.de>
Date: Fri, 15 Dec 2017 20:35:11 +0100
Subject: x86/ldt: Make the LDT mapping RO
From: Thomas Gleixner <tglx(a)linutronix.de>
commit 9f5cb6b32d9e0a3a7453222baaf15664d92adbf2 upstream.
Now that the LDT mapping is in a known area when PAGE_TABLE_ISOLATION is
enabled, it's a primary target for attacks if a user space interface fails
to validate a write address correctly. That can never happen, right?
The SDM states:
If the segment descriptors in the GDT or an LDT are placed in ROM, the
processor can enter an indefinite loop if software or the processor
attempts to update (write to) the ROM-based segment descriptors. To
prevent this problem, set the accessed bits for all segment descriptors
placed in a ROM. Also, remove operating-system or executive code that
attempts to modify segment descriptors located in ROM.
So it's a valid approach to set the ACCESS bit when setting up the LDT entry
and to map the table RO. Fix up the selftest so it can handle that new mode.
Remove the manual ACCESS bit setter in set_tls_desc() as this is now
pointless. Folded in the patch from Peter Zijlstra.
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Andy Lutomirski <luto(a)kernel.org>
Cc: Borislav Petkov <bp(a)alien8.de>
Cc: Dave Hansen <dave.hansen(a)linux.intel.com>
Cc: H. Peter Anvin <hpa(a)zytor.com>
Cc: Josh Poimboeuf <jpoimboe(a)redhat.com>
Cc: Juergen Gross <jgross(a)suse.com>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/include/asm/desc.h | 2 ++
arch/x86/kernel/ldt.c | 7 ++++++-
arch/x86/kernel/tls.c | 11 ++---------
tools/testing/selftests/x86/ldt_gdt.c | 3 +--
4 files changed, 11 insertions(+), 12 deletions(-)
--- a/arch/x86/include/asm/desc.h
+++ b/arch/x86/include/asm/desc.h
@@ -21,6 +21,8 @@ static inline void fill_ldt(struct desc_
desc->type = (info->read_exec_only ^ 1) << 1;
desc->type |= info->contents << 2;
+ /* Set the ACCESS bit so it can be mapped RO */
+ desc->type |= 1;
desc->s = 1;
desc->dpl = 0x3;
--- a/arch/x86/kernel/ldt.c
+++ b/arch/x86/kernel/ldt.c
@@ -158,7 +158,12 @@ map_ldt_struct(struct mm_struct *mm, str
ptep = get_locked_pte(mm, va, &ptl);
if (!ptep)
return -ENOMEM;
- pte = pfn_pte(pfn, __pgprot(__PAGE_KERNEL & ~_PAGE_GLOBAL));
+ /*
+ * Map it RO so the easy to find address is not a primary
+ * target via some kernel interface which misses a
+ * permission check.
+ */
+ pte = pfn_pte(pfn, __pgprot(__PAGE_KERNEL_RO & ~_PAGE_GLOBAL));
set_pte_at(mm, va, ptep, pte);
pte_unmap_unlock(ptep, ptl);
}
--- a/arch/x86/kernel/tls.c
+++ b/arch/x86/kernel/tls.c
@@ -93,17 +93,10 @@ static void set_tls_desc(struct task_str
cpu = get_cpu();
while (n-- > 0) {
- if (LDT_empty(info) || LDT_zero(info)) {
+ if (LDT_empty(info) || LDT_zero(info))
memset(desc, 0, sizeof(*desc));
- } else {
+ else
fill_ldt(desc, info);
-
- /*
- * Always set the accessed bit so that the CPU
- * doesn't try to write to the (read-only) GDT.
- */
- desc->type |= 1;
- }
++info;
++desc;
}
--- a/tools/testing/selftests/x86/ldt_gdt.c
+++ b/tools/testing/selftests/x86/ldt_gdt.c
@@ -122,8 +122,7 @@ static void check_valid_segment(uint16_t
* NB: Different Linux versions do different things with the
* accessed bit in set_thread_area().
*/
- if (ar != expected_ar &&
- (ldt || ar != (expected_ar | AR_ACCESSED))) {
+ if (ar != expected_ar && ar != (expected_ar | AR_ACCESSED)) {
printf("[FAIL]\t%s entry %hu has AR 0x%08X but expected 0x%08X\n",
(ldt ? "LDT" : "GDT"), index, ar, expected_ar);
nerrs++;
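For illustration only, a small C sketch of the type-field construction that fill_ldt() performs in the hunk above: for code/data descriptors the low bit of the 4-bit type field is the accessed flag, and pre-setting it means the CPU never has to write the descriptor back, which is what makes the read-only LDT mapping safe. struct user_desc_info and sketch_type() are hypothetical stand-ins, not kernel definitions.
#include <stdio.h>
struct user_desc_info {         /* subset of the fields fill_ldt() reads */
        unsigned int read_exec_only;
        unsigned int contents;
};
static unsigned int sketch_type(const struct user_desc_info *info)
{
        unsigned int type;
        type  = (info->read_exec_only ^ 1) << 1;  /* bit 1: writable/readable */
        type |= info->contents << 2;              /* bits 2-3: expand-down/code */
        type |= 1;                                /* bit 0: accessed, pre-set */
        return type;
}
int main(void)
{
        struct user_desc_info data_seg = { .read_exec_only = 0, .contents = 0 };
        /* 0x3: writable data segment whose accessed bit is already set. */
        printf("type = %#x\n", sketch_type(&data_seg));
        return 0;
}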
Patches currently in stable-queue which might be from tglx(a)linutronix.de are
queue-4.14/x86-mm-pti-allow-nx-poison-to-be-set-in-p4d-pgd.patch
queue-4.14/x86-dumpstack-indicate-in-oops-whether-pti-is-configured-and-enabled.patch
queue-4.14/x86-mm-clarify-the-whole-asid-kernel-pcid-user-pcid-naming.patch
queue-4.14/x86-mm-abstract-switching-cr3.patch
queue-4.14/x86-mm-optimize-restore_cr3.patch
queue-4.14/x86-mm-pti-force-entry-through-trampoline-when-pti-active.patch
queue-4.14/x86-entry-align-entry-text-section-to-pmd-boundary.patch
queue-4.14/x86-mm-64-make-a-full-pgd-entry-size-hole-in-the-memory-map.patch
queue-4.14/x86-cpu_entry_area-add-debugstore-entries-to-cpu_entry_area.patch
queue-4.14/x86-mm-pti-share-entry-text-pmd.patch
queue-4.14/x86-ldt-make-the-ldt-mapping-ro.patch
queue-4.14/x86-mm-dump_pagetables-check-user-space-page-table-for-wx-pages.patch
queue-4.14/x86-pti-map-the-vsyscall-page-if-needed.patch
queue-4.14/x86-mm-pti-disable-global-pages-if-page_table_isolation-y.patch
queue-4.14/x86-mm-dump_pagetables-allow-dumping-current-pagetables.patch
queue-4.14/x86-mm-pti-add-infrastructure-for-page-table-isolation.patch
queue-4.14/x86-cpufeatures-add-x86_bug_cpu_insecure.patch
queue-4.14/x86-mm-use-fix-pcid-to-optimize-user-kernel-switches.patch
queue-4.14/x86-mm-use-invpcid-for-__native_flush_tlb_single.patch
queue-4.14/x86-mm-pti-prepare-the-x86-entry-assembly-code-for-entry-exit-cr3-switching.patch
queue-4.14/x86-mm-allow-flushing-for-future-asid-switches.patch
queue-4.14/x86-mm-pti-add-kconfig.patch
queue-4.14/x86-mm-pti-add-mapping-helper-functions.patch
queue-4.14/x86-mm-pti-map-espfix-into-user-space.patch
queue-4.14/x86-mm-pti-share-cpu_entry_area-with-user-space-page-tables.patch
queue-4.14/x86-mm-pti-add-functions-to-clone-kernel-pmds.patch
queue-4.14/x86-pti-add-the-pti-cmdline-option-and-documentation.patch
queue-4.14/x86-events-intel-ds-map-debug-buffers-in-cpu_entry_area.patch
queue-4.14/x86-mm-pti-allocate-a-separate-user-pgd.patch
queue-4.14/x86-pti-put-the-ldt-in-its-own-pgd-if-pti-is-on.patch
queue-4.14/x86-mm-pti-populate-user-pgd.patch
queue-4.14/x86-mm-dump_pagetables-add-page-table-directory-to-the-debugfs-vfs-hierarchy.patch