From: "Steven Rostedt (VMware)" <rostedt(a)goodmis.org>
Two info bits were added to the "commit" part of the ring buffer data page
when returned to be consumed. This was to inform the user space readers that
events have been missed, and that the count may be stored at the end of the
page.
What wasn't handled, was the splice code that actually called a function to
return the length of the data in order to zero out the rest of the page
before sending it up to user space. These data bits were returned with the
length making the value negative, and that negative value was not checked.
It was compared to PAGE_SIZE, and only used if the size was less than
PAGE_SIZE. Luckily PAGE_SIZE is unsigned long which made the compare an
unsigned compare, meaning the negative size value did not end up causing a
large portion of memory to be randomly zeroed out.
Cc: stable(a)vger.kernel.org
Fixes: 66a8cb95ed040 ("ring-buffer: Add place holder recording of dropped events")
Signed-off-by: Steven Rostedt (VMware) <rostedt(a)goodmis.org>
---
kernel/trace/ring_buffer.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index c87766c1c204..e06cde093f76 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -280,6 +280,8 @@ EXPORT_SYMBOL_GPL(ring_buffer_event_data);
/* Missed count stored at end */
#define RB_MISSED_STORED (1 << 30)
+#define RB_MISSED_FLAGS (RB_MISSED_EVENTS|RB_MISSED_STORED)
+
struct buffer_data_page {
u64 time_stamp; /* page time stamp */
local_t commit; /* write committed index */
@@ -331,7 +333,9 @@ static void rb_init_page(struct buffer_data_page *bpage)
*/
size_t ring_buffer_page_len(void *page)
{
- return local_read(&((struct buffer_data_page *)page)->commit)
+ struct buffer_data_page *bpage = page;
+
+ return (local_read(&bpage->commit) & ~RB_MISSED_FLAGS)
+ BUF_PAGE_HDR_SIZE;
}
--
2.13.2
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 5663d8f9bbe4bf15488f7351efb61ea20fa6de06 Mon Sep 17 00:00:00 2001
From: Peter Xu <peterx(a)redhat.com>
Date: Tue, 12 Dec 2017 17:15:02 +0100
Subject: [PATCH] kvm: x86: fix WARN due to uninitialized guest FPU state
------------[ cut here ]------------
Bad FPU state detected at kvm_put_guest_fpu+0xd8/0x2d0 [kvm], reinitializing FPU registers.
WARNING: CPU: 1 PID: 4594 at arch/x86/mm/extable.c:103 ex_handler_fprestore+0x88/0x90
CPU: 1 PID: 4594 Comm: qemu-system-x86 Tainted: G B OE 4.15.0-rc2+ #10
RIP: 0010:ex_handler_fprestore+0x88/0x90
Call Trace:
fixup_exception+0x4e/0x60
do_general_protection+0xff/0x270
general_protection+0x22/0x30
RIP: 0010:kvm_put_guest_fpu+0xd8/0x2d0 [kvm]
RSP: 0018:ffff8803d5627810 EFLAGS: 00010246
kvm_vcpu_reset+0x3b4/0x3c0 [kvm]
kvm_apic_accept_events+0x1c0/0x240 [kvm]
kvm_arch_vcpu_ioctl_run+0x1658/0x2fb0 [kvm]
kvm_vcpu_ioctl+0x479/0x880 [kvm]
do_vfs_ioctl+0x142/0x9a0
SyS_ioctl+0x74/0x80
do_syscall_64+0x15f/0x600
where kvm_put_guest_fpu is called without a prior kvm_load_guest_fpu.
To fix it, move kvm_load_guest_fpu to the very beginning of
kvm_arch_vcpu_ioctl_run.
Cc: stable(a)vger.kernel.org
Fixes: f775b13eedee2f7f3c6fdd4e90fb79090ce5d339
Signed-off-by: Peter Xu <peterx(a)redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 154ea27746e9..56d036b9ad75 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -7264,13 +7264,12 @@ static int complete_emulated_mmio(struct kvm_vcpu *vcpu)
int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
{
- struct fpu *fpu = ¤t->thread.fpu;
int r;
- fpu__initialize(fpu);
-
kvm_sigset_activate(vcpu);
+ kvm_load_guest_fpu(vcpu);
+
if (unlikely(vcpu->arch.mp_state == KVM_MP_STATE_UNINITIALIZED)) {
if (kvm_run->immediate_exit) {
r = -EINTR;
@@ -7296,14 +7295,12 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
}
}
- kvm_load_guest_fpu(vcpu);
-
if (unlikely(vcpu->arch.complete_userspace_io)) {
int (*cui)(struct kvm_vcpu *) = vcpu->arch.complete_userspace_io;
vcpu->arch.complete_userspace_io = NULL;
r = cui(vcpu);
if (r <= 0)
- goto out_fpu;
+ goto out;
} else
WARN_ON(vcpu->arch.pio.count || vcpu->mmio_needed);
@@ -7312,9 +7309,8 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
else
r = vcpu_run(vcpu);
-out_fpu:
- kvm_put_guest_fpu(vcpu);
out:
+ kvm_put_guest_fpu(vcpu);
post_kvm_run_save(vcpu);
kvm_sigset_deactivate(vcpu);
From: Corey Minyard <cminyard(a)mvista.com>
This reverts commit c97e41076a298dbc4e910c33048e553658388eed.
A backport was requested of c0a32fe13cd32 "ipmi_si: fix memory leak on
new_smi", but the backport shouldn't have been done. This change needs
to be reverted, as it can result in an oops and the previous code is
correct.
Reverts: c97e41076a29 ("ipmi_si: fix memory leak on new_smi")
Link: https://bbs.archlinux.org/viewtopic.php?pid=1757130#p1757130
Reported-by: Neil Romig <neil(a)sixtythree.me.uk>
Cc: Sasha Levin <alexander.levin(a)verizon.com>
Signed-off-by: Corey Minyard <cminyard(a)mvista.com>
---
This is for 4.14 stable tree only. In hindsight, I should scrutinize
stable kernel requests from others in the IPMI tree. Sorry about that.
drivers/char/ipmi/ipmi_si_intf.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/drivers/char/ipmi/ipmi_si_intf.c b/drivers/char/ipmi/ipmi_si_intf.c
index e1cbb78..c04aa11 100644
--- a/drivers/char/ipmi/ipmi_si_intf.c
+++ b/drivers/char/ipmi/ipmi_si_intf.c
@@ -3469,7 +3469,6 @@ static int add_smi(struct smi_info *new_smi)
ipmi_addr_src_to_str(new_smi->addr_source),
si_to_str[new_smi->si_type]);
rv = -EBUSY;
- kfree(new_smi);
goto out_err;
}
}
--
2.7.4
This is a note to let you know that I've just added the patch titled
libnvdimm, dax: fix 1GB-aligned namespaces vs physical misalignment
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
libnvdimm-dax-fix-1gb-aligned-namespaces-vs-physical-misalignment.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 41fce90f26333c4fa82e8e43b9ace86c4e8a0120 Mon Sep 17 00:00:00 2001
From: Dan Williams <dan.j.williams(a)intel.com>
Date: Mon, 4 Dec 2017 14:07:43 -0800
Subject: libnvdimm, dax: fix 1GB-aligned namespaces vs physical misalignment
From: Dan Williams <dan.j.williams(a)intel.com>
commit 41fce90f26333c4fa82e8e43b9ace86c4e8a0120 upstream.
The following namespace configuration attempt:
# ndctl create-namespace -e namespace0.0 -m devdax -a 1G -f
libndctl: ndctl_dax_enable: dax0.1: failed to enable
Error: namespace0.0: failed to enable
failed to reconfigure namespace: No such device or address
...fails when the backing memory range is not physically aligned to 1G:
# cat /proc/iomem | grep Persistent
210000000-30fffffff : Persistent Memory (legacy)
In the above example the 4G persistent memory range starts and ends on a
256MB boundary.
We handle this case correctly when needing to handle cases that violate
section alignment (128MB) collisions against "System RAM", and we simply
need to extend that padding/truncation for the 1GB alignment use case.
Fixes: 315c562536c4 ("libnvdimm, pfn: add 'align' attribute...")
Reported-and-tested-by: Jane Chu <jane.chu(a)oracle.com>
Signed-off-by: Dan Williams <dan.j.williams(a)intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/nvdimm/pfn_devs.c | 15 ++++++++++++---
1 file changed, 12 insertions(+), 3 deletions(-)
--- a/drivers/nvdimm/pfn_devs.c
+++ b/drivers/nvdimm/pfn_devs.c
@@ -562,6 +562,12 @@ static struct vmem_altmap *__nvdimm_setu
return altmap;
}
+static u64 phys_pmem_align_down(struct nd_pfn *nd_pfn, u64 phys)
+{
+ return min_t(u64, PHYS_SECTION_ALIGN_DOWN(phys),
+ ALIGN_DOWN(phys, nd_pfn->align));
+}
+
static int nd_pfn_init(struct nd_pfn *nd_pfn)
{
u32 dax_label_reserve = is_nd_dax(&nd_pfn->dev) ? SZ_128K : 0;
@@ -617,13 +623,16 @@ static int nd_pfn_init(struct nd_pfn *nd
start = nsio->res.start;
size = PHYS_SECTION_ALIGN_UP(start + size) - start;
if (region_intersects(start, size, IORESOURCE_SYSTEM_RAM,
- IORES_DESC_NONE) == REGION_MIXED) {
+ IORES_DESC_NONE) == REGION_MIXED
+ || !IS_ALIGNED(start + resource_size(&nsio->res),
+ nd_pfn->align)) {
size = resource_size(&nsio->res);
- end_trunc = start + size - PHYS_SECTION_ALIGN_DOWN(start + size);
+ end_trunc = start + size - phys_pmem_align_down(nd_pfn,
+ start + size);
}
if (start_pad + end_trunc)
- dev_info(&nd_pfn->dev, "%s section collision, truncate %d bytes\n",
+ dev_info(&nd_pfn->dev, "%s alignment collision, truncate %d bytes\n",
dev_name(&ndns->dev), start_pad + end_trunc);
/*
Patches currently in stable-queue which might be from dan.j.williams(a)intel.com are
queue-4.9/libnvdimm-pfn-fix-start_pad-handling-for-aligned-namespaces.patch
queue-4.9/libnvdimm-dax-fix-1gb-aligned-namespaces-vs-physical-misalignment.patch
queue-4.9/acpi-nfit-fix-health-event-notification.patch
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 41fce90f26333c4fa82e8e43b9ace86c4e8a0120 Mon Sep 17 00:00:00 2001
From: Dan Williams <dan.j.williams(a)intel.com>
Date: Mon, 4 Dec 2017 14:07:43 -0800
Subject: [PATCH] libnvdimm, dax: fix 1GB-aligned namespaces vs physical
misalignment
The following namespace configuration attempt:
# ndctl create-namespace -e namespace0.0 -m devdax -a 1G -f
libndctl: ndctl_dax_enable: dax0.1: failed to enable
Error: namespace0.0: failed to enable
failed to reconfigure namespace: No such device or address
...fails when the backing memory range is not physically aligned to 1G:
# cat /proc/iomem | grep Persistent
210000000-30fffffff : Persistent Memory (legacy)
In the above example the 4G persistent memory range starts and ends on a
256MB boundary.
We handle this case correctly when needing to handle cases that violate
section alignment (128MB) collisions against "System RAM", and we simply
need to extend that padding/truncation for the 1GB alignment use case.
Cc: <stable(a)vger.kernel.org>
Fixes: 315c562536c4 ("libnvdimm, pfn: add 'align' attribute...")
Reported-and-tested-by: Jane Chu <jane.chu(a)oracle.com>
Signed-off-by: Dan Williams <dan.j.williams(a)intel.com>
diff --git a/drivers/nvdimm/pfn_devs.c b/drivers/nvdimm/pfn_devs.c
index db2fc7c02e01..2adada1a5855 100644
--- a/drivers/nvdimm/pfn_devs.c
+++ b/drivers/nvdimm/pfn_devs.c
@@ -583,6 +583,12 @@ static struct vmem_altmap *__nvdimm_setup_pfn(struct nd_pfn *nd_pfn,
return altmap;
}
+static u64 phys_pmem_align_down(struct nd_pfn *nd_pfn, u64 phys)
+{
+ return min_t(u64, PHYS_SECTION_ALIGN_DOWN(phys),
+ ALIGN_DOWN(phys, nd_pfn->align));
+}
+
static int nd_pfn_init(struct nd_pfn *nd_pfn)
{
u32 dax_label_reserve = is_nd_dax(&nd_pfn->dev) ? SZ_128K : 0;
@@ -638,13 +644,16 @@ static int nd_pfn_init(struct nd_pfn *nd_pfn)
start = nsio->res.start;
size = PHYS_SECTION_ALIGN_UP(start + size) - start;
if (region_intersects(start, size, IORESOURCE_SYSTEM_RAM,
- IORES_DESC_NONE) == REGION_MIXED) {
+ IORES_DESC_NONE) == REGION_MIXED
+ || !IS_ALIGNED(start + resource_size(&nsio->res),
+ nd_pfn->align)) {
size = resource_size(&nsio->res);
- end_trunc = start + size - PHYS_SECTION_ALIGN_DOWN(start + size);
+ end_trunc = start + size - phys_pmem_align_down(nd_pfn,
+ start + size);
}
if (start_pad + end_trunc)
- dev_info(&nd_pfn->dev, "%s section collision, truncate %d bytes\n",
+ dev_info(&nd_pfn->dev, "%s alignment collision, truncate %d bytes\n",
dev_name(&ndns->dev), start_pad + end_trunc);
/*
This is a note to let you know that I've just added the patch titled
pinctrl: cherryview: Mask all interrupts on Intel_Strago based systems
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
pinctrl-cherryview-mask-all-interrupts-on-intel_strago-based-systems.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From d2b3c353595a855794f8b9df5b5bdbe8deb0c413 Mon Sep 17 00:00:00 2001
From: Mika Westerberg <mika.westerberg(a)linux.intel.com>
Date: Mon, 4 Dec 2017 12:11:02 +0300
Subject: pinctrl: cherryview: Mask all interrupts on Intel_Strago based systems
From: Mika Westerberg <mika.westerberg(a)linux.intel.com>
commit d2b3c353595a855794f8b9df5b5bdbe8deb0c413 upstream.
Guenter Roeck reported an interrupt storm on a prototype system which is
based on Cyan Chromebook. The root cause turned out to be a incorrectly
configured pin that triggers spurious interrupts. This will be fixed in
coreboot but currently we need to prevent the interrupt storm from
happening by masking all interrupts (but not GPEs) on those systems.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=197953
Fixes: bcb48cca23ec ("pinctrl: cherryview: Do not mask all interrupts in probe")
Reported-and-tested-by: Guenter Roeck <linux(a)roeck-us.net>
Reported-by: Dmitry Torokhov <dmitry.torokhov(a)gmail.com>
Signed-off-by: Mika Westerberg <mika.westerberg(a)linux.intel.com>
Signed-off-by: Linus Walleij <linus.walleij(a)linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/pinctrl/intel/pinctrl-cherryview.c | 16 ++++++++++++++++
1 file changed, 16 insertions(+)
--- a/drivers/pinctrl/intel/pinctrl-cherryview.c
+++ b/drivers/pinctrl/intel/pinctrl-cherryview.c
@@ -1466,6 +1466,22 @@ static int chv_gpio_probe(struct chv_pin
offset += range->npins;
}
+ /*
+ * The same set of machines in chv_no_valid_mask[] have incorrectly
+ * configured GPIOs that generate spurious interrupts so we use
+ * this same list to apply another quirk for them.
+ *
+ * See also https://bugzilla.kernel.org/show_bug.cgi?id=197953.
+ */
+ if (!need_valid_mask) {
+ /*
+ * Mask all interrupts the community is able to generate
+ * but leave the ones that can only generate GPEs unmasked.
+ */
+ chv_writel(GENMASK(31, pctrl->community->nirqs),
+ pctrl->regs + CHV_INTMASK);
+ }
+
/* Clear all interrupts */
chv_writel(0xffff, pctrl->regs + CHV_INTSTAT);
Patches currently in stable-queue which might be from mika.westerberg(a)linux.intel.com are
queue-4.4/pinctrl-cherryview-mask-all-interrupts-on-intel_strago-based-systems.patch
This is a note to let you know that I've just added the patch titled
x86/vsyscall/64: Warn and fail vsyscall emulation in NATIVE mode
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-vsyscall-64-warn-and-fail-vsyscall-emulation-in-native-mode.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 4831b779403a836158917d59a7ca880483c67378 Mon Sep 17 00:00:00 2001
From: Andy Lutomirski <luto(a)kernel.org>
Date: Sun, 10 Dec 2017 22:47:20 -0800
Subject: x86/vsyscall/64: Warn and fail vsyscall emulation in NATIVE mode
From: Andy Lutomirski <luto(a)kernel.org>
commit 4831b779403a836158917d59a7ca880483c67378 upstream.
If something goes wrong with pagetable setup, vsyscall=native will
accidentally fall back to emulation. Make it warn and fail so that we
notice.
Signed-off-by: Andy Lutomirski <luto(a)kernel.org>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Borislav Petkov <bp(a)alien8.de>
Cc: Brian Gerst <brgerst(a)gmail.com>
Cc: Dave Hansen <dave.hansen(a)linux.intel.com>
Cc: David Laight <David.Laight(a)aculab.com>
Cc: H. Peter Anvin <hpa(a)zytor.com>
Cc: Josh Poimboeuf <jpoimboe(a)redhat.com>
Cc: Juergen Gross <jgross(a)suse.com>
Cc: Kees Cook <keescook(a)chromium.org>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/entry/vsyscall/vsyscall_64.c | 4 ++++
1 file changed, 4 insertions(+)
--- a/arch/x86/entry/vsyscall/vsyscall_64.c
+++ b/arch/x86/entry/vsyscall/vsyscall_64.c
@@ -139,6 +139,10 @@ bool emulate_vsyscall(struct pt_regs *re
WARN_ON_ONCE(address != regs->ip);
+ /* This should be unreachable in NATIVE mode. */
+ if (WARN_ON(vsyscall_mode == NATIVE))
+ return false;
+
if (vsyscall_mode == NONE) {
warn_bad_vsyscall(KERN_INFO, regs,
"vsyscall attempted with vsyscall=none");
Patches currently in stable-queue which might be from luto(a)kernel.org are
queue-4.14/x86-entry-rename-sysenter_stack-to-cpu_entry_area_entry_stack.patch
queue-4.14/x86-mm-put-mmu-to-hardware-asid-translation-in-one-place.patch
queue-4.14/x86-vsyscall-64-explicitly-set-_page_user-in-the-pagetable-hierarchy.patch
queue-4.14/x86-uv-use-the-right-tlb-flush-api.patch
queue-4.14/x86-mm-dump_pagetables-check-page_present-for-real.patch
queue-4.14/x86-ldt-prevent-ldt-inheritance-on-exec.patch
queue-4.14/x86-microcode-dont-abuse-the-tlb-flush-interface.patch
queue-4.14/x86-doc-remove-obvious-weirdnesses-from-the-x86-mm-layout-documentation.patch
queue-4.14/init-invoke-init_espfix_bsp-from-mm_init.patch
queue-4.14/x86-cpu_entry_area-move-it-to-a-separate-unit.patch
queue-4.14/x86-vsyscall-64-warn-and-fail-vsyscall-emulation-in-native-mode.patch
queue-4.14/x86-mm-create-asm-invpcid.h.patch
queue-4.14/x86-mm-remove-superfluous-barriers.patch
queue-4.14/x86-ldt-rework-locking.patch
queue-4.14/arch-mm-allow-arch_dup_mmap-to-fail.patch
queue-4.14/x86-cpu_entry_area-move-it-out-of-the-fixmap.patch
queue-4.14/x86-mm-remove-hard-coded-asid-limit-checks.patch
queue-4.14/x86-kconfig-limit-nr_cpus-on-32-bit-to-a-sane-amount.patch
queue-4.14/x86-mm-add-comments-to-clarify-which-tlb-flush-functions-are-supposed-to-flush-what.patch
queue-4.14/x86-mm-move-the-cr3-construction-functions-to-tlbflush.h.patch
queue-4.14/x86-mm-dump_pagetables-make-the-address-hints-correct-and-readable.patch
queue-4.14/x86-insn-eval-add-utility-functions-to-get-segment-selector.patch
queue-4.14/x86-mm-use-__flush_tlb_one-for-kernel-memory.patch
queue-4.14/x86-mm-64-improve-the-memory-map-documentation.patch
This is a note to let you know that I've just added the patch titled
x86/vsyscall/64: Explicitly set _PAGE_USER in the pagetable hierarchy
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-vsyscall-64-explicitly-set-_page_user-in-the-pagetable-hierarchy.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 49275fef986abfb8b476e4708aaecc07e7d3e087 Mon Sep 17 00:00:00 2001
From: Andy Lutomirski <luto(a)kernel.org>
Date: Sun, 10 Dec 2017 22:47:19 -0800
Subject: x86/vsyscall/64: Explicitly set _PAGE_USER in the pagetable hierarchy
From: Andy Lutomirski <luto(a)kernel.org>
commit 49275fef986abfb8b476e4708aaecc07e7d3e087 upstream.
The kernel is very erratic as to which pagetables have _PAGE_USER set. The
vsyscall page gets lucky: it seems that all of the relevant pagetables are
among the apparently arbitrary ones that set _PAGE_USER. Rather than
relying on chance, just explicitly set _PAGE_USER.
This will let us clean up pagetable setup to stop setting _PAGE_USER. The
added code can also be reused by pagetable isolation to manage the
_PAGE_USER bit in the usermode tables.
[ tglx: Folded paravirt fix from Juergen Gross ]
Signed-off-by: Andy Lutomirski <luto(a)kernel.org>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Borislav Petkov <bp(a)alien8.de>
Cc: Brian Gerst <brgerst(a)gmail.com>
Cc: Dave Hansen <dave.hansen(a)linux.intel.com>
Cc: David Laight <David.Laight(a)aculab.com>
Cc: H. Peter Anvin <hpa(a)zytor.com>
Cc: Josh Poimboeuf <jpoimboe(a)redhat.com>
Cc: Juergen Gross <jgross(a)suse.com>
Cc: Kees Cook <keescook(a)chromium.org>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/entry/vsyscall/vsyscall_64.c | 34 +++++++++++++++++++++++++++++++++-
1 file changed, 33 insertions(+), 1 deletion(-)
--- a/arch/x86/entry/vsyscall/vsyscall_64.c
+++ b/arch/x86/entry/vsyscall/vsyscall_64.c
@@ -37,6 +37,7 @@
#include <asm/unistd.h>
#include <asm/fixmap.h>
#include <asm/traps.h>
+#include <asm/paravirt.h>
#define CREATE_TRACE_POINTS
#include "vsyscall_trace.h"
@@ -329,16 +330,47 @@ int in_gate_area_no_mm(unsigned long add
return vsyscall_mode != NONE && (addr & PAGE_MASK) == VSYSCALL_ADDR;
}
+/*
+ * The VSYSCALL page is the only user-accessible page in the kernel address
+ * range. Normally, the kernel page tables can have _PAGE_USER clear, but
+ * the tables covering VSYSCALL_ADDR need _PAGE_USER set if vsyscalls
+ * are enabled.
+ *
+ * Some day we may create a "minimal" vsyscall mode in which we emulate
+ * vsyscalls but leave the page not present. If so, we skip calling
+ * this.
+ */
+static void __init set_vsyscall_pgtable_user_bits(void)
+{
+ pgd_t *pgd;
+ p4d_t *p4d;
+ pud_t *pud;
+ pmd_t *pmd;
+
+ pgd = pgd_offset_k(VSYSCALL_ADDR);
+ set_pgd(pgd, __pgd(pgd_val(*pgd) | _PAGE_USER));
+ p4d = p4d_offset(pgd, VSYSCALL_ADDR);
+#if CONFIG_PGTABLE_LEVELS >= 5
+ p4d->p4d |= _PAGE_USER;
+#endif
+ pud = pud_offset(p4d, VSYSCALL_ADDR);
+ set_pud(pud, __pud(pud_val(*pud) | _PAGE_USER));
+ pmd = pmd_offset(pud, VSYSCALL_ADDR);
+ set_pmd(pmd, __pmd(pmd_val(*pmd) | _PAGE_USER));
+}
+
void __init map_vsyscall(void)
{
extern char __vsyscall_page;
unsigned long physaddr_vsyscall = __pa_symbol(&__vsyscall_page);
- if (vsyscall_mode != NONE)
+ if (vsyscall_mode != NONE) {
__set_fixmap(VSYSCALL_PAGE, physaddr_vsyscall,
vsyscall_mode == NATIVE
? PAGE_KERNEL_VSYSCALL
: PAGE_KERNEL_VVAR);
+ set_vsyscall_pgtable_user_bits();
+ }
BUILD_BUG_ON((unsigned long)__fix_to_virt(VSYSCALL_PAGE) !=
(unsigned long)VSYSCALL_ADDR);
Patches currently in stable-queue which might be from luto(a)kernel.org are
queue-4.14/x86-entry-rename-sysenter_stack-to-cpu_entry_area_entry_stack.patch
queue-4.14/x86-mm-put-mmu-to-hardware-asid-translation-in-one-place.patch
queue-4.14/x86-vsyscall-64-explicitly-set-_page_user-in-the-pagetable-hierarchy.patch
queue-4.14/x86-uv-use-the-right-tlb-flush-api.patch
queue-4.14/x86-mm-dump_pagetables-check-page_present-for-real.patch
queue-4.14/x86-ldt-prevent-ldt-inheritance-on-exec.patch
queue-4.14/x86-microcode-dont-abuse-the-tlb-flush-interface.patch
queue-4.14/x86-doc-remove-obvious-weirdnesses-from-the-x86-mm-layout-documentation.patch
queue-4.14/init-invoke-init_espfix_bsp-from-mm_init.patch
queue-4.14/x86-cpu_entry_area-move-it-to-a-separate-unit.patch
queue-4.14/x86-vsyscall-64-warn-and-fail-vsyscall-emulation-in-native-mode.patch
queue-4.14/x86-mm-create-asm-invpcid.h.patch
queue-4.14/x86-mm-remove-superfluous-barriers.patch
queue-4.14/x86-ldt-rework-locking.patch
queue-4.14/arch-mm-allow-arch_dup_mmap-to-fail.patch
queue-4.14/x86-cpu_entry_area-move-it-out-of-the-fixmap.patch
queue-4.14/x86-mm-remove-hard-coded-asid-limit-checks.patch
queue-4.14/x86-kconfig-limit-nr_cpus-on-32-bit-to-a-sane-amount.patch
queue-4.14/x86-mm-add-comments-to-clarify-which-tlb-flush-functions-are-supposed-to-flush-what.patch
queue-4.14/x86-mm-move-the-cr3-construction-functions-to-tlbflush.h.patch
queue-4.14/x86-mm-dump_pagetables-make-the-address-hints-correct-and-readable.patch
queue-4.14/x86-insn-eval-add-utility-functions-to-get-segment-selector.patch
queue-4.14/x86-mm-use-__flush_tlb_one-for-kernel-memory.patch
queue-4.14/x86-mm-64-improve-the-memory-map-documentation.patch
This is a note to let you know that I've just added the patch titled
x86/uv: Use the right TLB-flush API
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-uv-use-the-right-tlb-flush-api.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 3e46e0f5ee3643a1239be9046c7ba6c66ca2b329 Mon Sep 17 00:00:00 2001
From: Peter Zijlstra <peterz(a)infradead.org>
Date: Tue, 5 Dec 2017 13:34:50 +0100
Subject: x86/uv: Use the right TLB-flush API
From: Peter Zijlstra <peterz(a)infradead.org>
commit 3e46e0f5ee3643a1239be9046c7ba6c66ca2b329 upstream.
Since uv_flush_tlb_others() implements flush_tlb_others() which is
about flushing user mappings, we should use __flush_tlb_single(),
which too is about flushing user mappings.
Signed-off-by: Peter Zijlstra (Intel) <peterz(a)infradead.org>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Acked-by: Andrew Banman <abanman(a)hpe.com>
Cc: Andy Lutomirski <luto(a)kernel.org>
Cc: Boris Ostrovsky <boris.ostrovsky(a)oracle.com>
Cc: Borislav Petkov <bp(a)alien8.de>
Cc: Brian Gerst <brgerst(a)gmail.com>
Cc: Dave Hansen <dave.hansen(a)linux.intel.com>
Cc: David Laight <David.Laight(a)aculab.com>
Cc: Denys Vlasenko <dvlasenk(a)redhat.com>
Cc: Eduardo Valentin <eduval(a)amazon.com>
Cc: Greg KH <gregkh(a)linuxfoundation.org>
Cc: H. Peter Anvin <hpa(a)zytor.com>
Cc: Josh Poimboeuf <jpoimboe(a)redhat.com>
Cc: Juergen Gross <jgross(a)suse.com>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Mike Travis <mike.travis(a)hpe.com>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Will Deacon <will.deacon(a)arm.com>
Cc: aliguori(a)amazon.com
Cc: daniel.gruss(a)iaik.tugraz.at
Cc: hughd(a)google.com
Cc: keescook(a)google.com
Cc: linux-mm(a)kvack.org
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/platform/uv/tlb_uv.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/x86/platform/uv/tlb_uv.c
+++ b/arch/x86/platform/uv/tlb_uv.c
@@ -299,7 +299,7 @@ static void bau_process_message(struct m
local_flush_tlb();
stat->d_alltlb++;
} else {
- __flush_tlb_one(msg->address);
+ __flush_tlb_single(msg->address);
stat->d_onetlb++;
}
stat->d_requestee++;
Patches currently in stable-queue which might be from peterz(a)infradead.org are
queue-4.14/x86-entry-rename-sysenter_stack-to-cpu_entry_area_entry_stack.patch
queue-4.14/x86-mm-put-mmu-to-hardware-asid-translation-in-one-place.patch
queue-4.14/x86-vsyscall-64-explicitly-set-_page_user-in-the-pagetable-hierarchy.patch
queue-4.14/x86-uv-use-the-right-tlb-flush-api.patch
queue-4.14/x86-decoder-fix-and-update-the-opcodes-map.patch
queue-4.14/x86-mm-dump_pagetables-check-page_present-for-real.patch
queue-4.14/x86-ldt-prevent-ldt-inheritance-on-exec.patch
queue-4.14/x86-microcode-dont-abuse-the-tlb-flush-interface.patch
queue-4.14/x86-doc-remove-obvious-weirdnesses-from-the-x86-mm-layout-documentation.patch
queue-4.14/init-invoke-init_espfix_bsp-from-mm_init.patch
queue-4.14/x86-cpu_entry_area-move-it-to-a-separate-unit.patch
queue-4.14/x86-vsyscall-64-warn-and-fail-vsyscall-emulation-in-native-mode.patch
queue-4.14/x86-mm-create-asm-invpcid.h.patch
queue-4.14/x86-mm-remove-superfluous-barriers.patch
queue-4.14/x86-ldt-rework-locking.patch
queue-4.14/arch-mm-allow-arch_dup_mmap-to-fail.patch
queue-4.14/x86-cpu_entry_area-move-it-out-of-the-fixmap.patch
queue-4.14/tools-headers-sync-objtool-uapi-header.patch
queue-4.14/x86-mm-remove-hard-coded-asid-limit-checks.patch
queue-4.14/x86-kconfig-limit-nr_cpus-on-32-bit-to-a-sane-amount.patch
queue-4.14/objtool-fix-64-bit-build-on-32-bit-host.patch
queue-4.14/x86-mm-add-comments-to-clarify-which-tlb-flush-functions-are-supposed-to-flush-what.patch
queue-4.14/x86-mm-move-the-cr3-construction-functions-to-tlbflush.h.patch
queue-4.14/x86-mm-dump_pagetables-make-the-address-hints-correct-and-readable.patch
queue-4.14/x86-insn-eval-add-utility-functions-to-get-segment-selector.patch
queue-4.14/objtool-move-synced-files-to-their-original-relative-locations.patch
queue-4.14/x86-mm-use-__flush_tlb_one-for-kernel-memory.patch
queue-4.14/objtool-move-kernel-headers-code-sync-check-to-a-script.patch
queue-4.14/x86-mm-64-improve-the-memory-map-documentation.patch
queue-4.14/objtool-fix-cross-build.patch