This is a note to let you know that I've just added the patch titled
bna: integer overflow bug in debugfs
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
bna-integer-overflow-bug-in-debugfs.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Dec 21 10:35:49 CET 2017
From: Dan Carpenter <dan.carpenter(a)oracle.com>
Date: Fri, 17 Mar 2017 23:52:35 +0300
Subject: bna: integer overflow bug in debugfs
From: Dan Carpenter <dan.carpenter(a)oracle.com>
[ Upstream commit 13e2d5187f6b965ba3556caedb914baf81b98ed2 ]
We could allocate less memory than intended because we do:
bnad->regdata = kzalloc(len << 2, GFP_KERNEL);
The shift can overflow, leading to a crash. This is debugfs code, so the
impact is very small.
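
For illustration, a minimal userspace sketch (values invented) of the
wrap-around that the new len > UINT_MAX >> 2 check rejects:

    #include <limits.h>
    #include <stdio.h>

    int main(void)
    {
            unsigned int len = 0x40000001;  /* hypothetical value from sscanf("%x") */

            /* len << 2 wraps modulo 2^32: 0x40000001 << 2 == 0x4,
             * so kzalloc() would get a far smaller size than intended */
            printf("len << 2 = %#x\n", len << 2);

            /* the guard added by the patch rejects any len whose shift would wrap */
            if (len > UINT_MAX >> 2)
                    printf("rejected: shift would overflow\n");
            return 0;
    }

With len = 0x40000001 the allocated size is 4 bytes, while the driver
would still go on to read len registers into that buffer, hence the crash.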
Fixes: 7afc5dbde091 ("bna: Add debugfs interface.")
Signed-off-by: Dan Carpenter <dan.carpenter(a)oracle.com>
Acked-by: Rasesh Mody <rasesh.mody(a)cavium.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin(a)verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/net/ethernet/brocade/bna/bnad_debugfs.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/net/ethernet/brocade/bna/bnad_debugfs.c
+++ b/drivers/net/ethernet/brocade/bna/bnad_debugfs.c
@@ -324,7 +324,7 @@ bnad_debugfs_write_regrd(struct file *fi
return PTR_ERR(kern_buf);
rc = sscanf(kern_buf, "%x:%x", &addr, &len);
- if (rc < 2) {
+ if (rc < 2 || len > UINT_MAX >> 2) {
netdev_warn(bnad->netdev, "failed to read user buffer\n");
kfree(kern_buf);
return -EINVAL;
Patches currently in stable-queue which might be from dan.carpenter(a)oracle.com are
queue-4.4/bna-integer-overflow-bug-in-debugfs.patch
This is a note to let you know that I've just added the patch titled
bna: avoid writing uninitialized data into hw registers
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
bna-avoid-writing-uninitialized-data-into-hw-registers.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Dec 21 10:35:49 CET 2017
From: Arnd Bergmann <arnd(a)arndb.de>
Date: Thu, 23 Mar 2017 17:07:26 +0100
Subject: bna: avoid writing uninitialized data into hw registers
From: Arnd Bergmann <arnd(a)arndb.de>
[ Upstream commit a5af83925363eb85d467933e3d6ec5a87001eb7c ]
The latest gcc-7 snapshot warns about bfa_ioc_send_enable/bfa_ioc_send_disable
writing undefined values into the hardware registers:
drivers/net/ethernet/brocade/bna/bfa_ioc.c: In function 'bfa_iocpf_sm_disabling_entry':
arch/arm/include/asm/io.h:109:22: error: '*((void *)&disable_req+4)' is used uninitialized in this function [-Werror=uninitialized]
arch/arm/include/asm/io.h:109:22: error: '*((void *)&disable_req+8)' is used uninitialized in this function [-Werror=uninitialized]
The two functions look like they should do the same thing, but only one
of them initializes the time stamp and clscode field. The fact that we
only get a warning for one of the two functions seems to be arbitrary,
based on the inlining decisions in the compiler.
To address this, I'm making both functions do the same thing:
- set the clscode from the ioc structure in both
- set the time stamp from ktime_get_real_seconds (which also
avoids the signed-integer overflow in 2038 and extends the
well-defined behavior until 2106).
- zero-fill the reserved field
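
As a back-of-the-envelope check of the 2038/2106 dates above (a
standalone sketch, not part of the patch):

    #include <stdint.h>
    #include <stdio.h>

    #define SECS_PER_YEAR 31556952u     /* average Gregorian year */

    int main(void)
    {
            /* a signed 32-bit seconds counter (the old struct timeval
             * path) overflows in early 2038 */
            printf("s32 lasts until ~%u\n", 1970 + INT32_MAX / SECS_PER_YEAR);

            /* truncating ktime_get_real_seconds() into an unsigned 32-bit
             * wire field stays well defined until 2106 */
            printf("u32 lasts until ~%u\n", 1970 + UINT32_MAX / SECS_PER_YEAR);
            return 0;
    }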
Fixes: 8b230ed8ec96 ("bna: Brocade 10Gb Ethernet device driver")
Signed-off-by: Arnd Bergmann <arnd(a)arndb.de>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin(a)verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/net/ethernet/brocade/bna/bfa_ioc.c | 10 +++++++---
1 file changed, 7 insertions(+), 3 deletions(-)
--- a/drivers/net/ethernet/brocade/bna/bfa_ioc.c
+++ b/drivers/net/ethernet/brocade/bna/bfa_ioc.c
@@ -1930,13 +1930,13 @@ static void
bfa_ioc_send_enable(struct bfa_ioc *ioc)
{
struct bfi_ioc_ctrl_req enable_req;
- struct timeval tv;
bfi_h2i_set(enable_req.mh, BFI_MC_IOC, BFI_IOC_H2I_ENABLE_REQ,
bfa_ioc_portid(ioc));
enable_req.clscode = htons(ioc->clscode);
- do_gettimeofday(&tv);
- enable_req.tv_sec = ntohl(tv.tv_sec);
+ enable_req.rsvd = htons(0);
+ /* overflow in 2106 */
+ enable_req.tv_sec = ntohl(ktime_get_real_seconds());
bfa_ioc_mbox_send(ioc, &enable_req, sizeof(struct bfi_ioc_ctrl_req));
}
@@ -1947,6 +1947,10 @@ bfa_ioc_send_disable(struct bfa_ioc *ioc
bfi_h2i_set(disable_req.mh, BFI_MC_IOC, BFI_IOC_H2I_DISABLE_REQ,
bfa_ioc_portid(ioc));
+ disable_req.clscode = htons(ioc->clscode);
+ disable_req.rsvd = htons(0);
+ /* overflow in 2106 */
+ disable_req.tv_sec = ntohl(ktime_get_real_seconds());
bfa_ioc_mbox_send(ioc, &disable_req, sizeof(struct bfi_ioc_ctrl_req));
}
Patches currently in stable-queue which might be from arnd(a)arndb.de are
queue-4.4/hwmon-asus_atk0110-fix-uninitialized-data-access.patch
queue-4.4/arm-hide-finish_arch_post_lock_switch-from-modules.patch
queue-4.4/bna-avoid-writing-uninitialized-data-into-hw-registers.patch
queue-4.4/isdn-kcapi-avoid-uninitialized-data.patch
This is a note to let you know that I've just added the patch titled
backlight: pwm_bl: Fix overflow condition
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
backlight-pwm_bl-fix-overflow-condition.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Dec 21 10:35:49 CET 2017
From: Derek Basehore <dbasehore(a)chromium.org>
Date: Tue, 29 Aug 2017 13:34:34 -0700
Subject: backlight: pwm_bl: Fix overflow condition
From: Derek Basehore <dbasehore(a)chromium.org>
[ Upstream commit 5d0c49acebc9488e37db95f1d4a55644e545ffe7 ]
This fixes an overflow condition that can happen in compute_duty_cycle()
with high max-brightness and period values. It does so by using a 64-bit
variable for computing the duty cycle.
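
A standalone sketch (made-up period and scale values) of the difference;
the kernel uses do_div() for the 64-bit division, plain division stands
in for it here:

    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
            unsigned int period = 5000000;  /* 5 ms PWM period in ns (assumed) */
            unsigned int scale = 65535;     /* max brightness level (assumed) */
            unsigned int brightness = 65535;
            unsigned int lth = 0;

            /* 32-bit math: 65535 * 5000000 wraps modulo 2^32 */
            unsigned int bad = brightness * (period - lth) / scale + lth;

            /* 64-bit math as in the patch: widen first, then divide */
            uint64_t duty = (uint64_t)brightness * (period - lth);
            duty /= scale;                  /* do_div(duty, pb->scale) upstream */

            printf("32-bit: %u, 64-bit: %llu\n",
                   bad, (unsigned long long)(duty + lth));
            return 0;
    }

The 32-bit computation silently returns a bogus duty cycle, while the
widened one returns the full period as expected.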
Signed-off-by: Derek Basehore <dbasehore(a)chromium.org>
Acked-by: Thierry Reding <thierry.reding(a)gmail.com>
Reviewed-by: Brian Norris <briannorris(a)chromium.org>
Signed-off-by: Lee Jones <lee.jones(a)linaro.org>
Signed-off-by: Sasha Levin <alexander.levin(a)verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/video/backlight/pwm_bl.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
--- a/drivers/video/backlight/pwm_bl.c
+++ b/drivers/video/backlight/pwm_bl.c
@@ -79,14 +79,17 @@ static void pwm_backlight_power_off(stru
static int compute_duty_cycle(struct pwm_bl_data *pb, int brightness)
{
unsigned int lth = pb->lth_brightness;
- int duty_cycle;
+ u64 duty_cycle;
if (pb->levels)
duty_cycle = pb->levels[brightness];
else
duty_cycle = brightness;
- return (duty_cycle * (pb->period - lth) / pb->scale) + lth;
+ duty_cycle *= pb->period - lth;
+ do_div(duty_cycle, pb->scale);
+
+ return duty_cycle + lth;
}
static int pwm_backlight_update_status(struct backlight_device *bl)
Patches currently in stable-queue which might be from dbasehore(a)chromium.org are
queue-4.4/backlight-pwm_bl-fix-overflow-condition.patch
This is a note to let you know that I've just added the patch titled
arm: kprobes: Fix the return address of multiple kretprobes
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
arm-kprobes-fix-the-return-address-of-multiple-kretprobes.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Dec 21 10:35:49 CET 2017
From: Masami Hiramatsu <mhiramat(a)kernel.org>
Date: Tue, 14 Feb 2017 00:05:59 +0900
Subject: arm: kprobes: Fix the return address of multiple kretprobes
From: Masami Hiramatsu <mhiramat(a)kernel.org>
[ Upstream commit 06553175f585b52509c7df37d6f4a50aacb7b211 ]
This is the arm port of commit 737480a0d525 ("kprobes/x86:
Fix the return address of multiple kretprobes").
Fix the return address of subsequent kretprobes when multiple
kretprobes are set on the same function.
For example:
# cd /sys/kernel/debug/tracing
# echo "r:event1 sys_symlink" > kprobe_events
# echo "r:event2 sys_symlink" >> kprobe_events
# echo 1 > events/kprobes/enable
# ln -s /tmp/foo /tmp/bar
(without this patch)
# cat trace | grep -v ^#
ln-82 [000] dn.2 68.446525: event1: (kretprobe_trampoline+0x0/0x18 <- SyS_symlink)
ln-82 [000] dn.2 68.447831: event2: (ret_fast_syscall+0x0/0x1c <- SyS_symlink)
(with this patch)
# cat trace | grep -v ^#
ln-81 [000] dn.1 39.463469: event1: (ret_fast_syscall+0x0/0x1c <- SyS_symlink)
ln-81 [000] dn.1 39.464701: event2: (ret_fast_syscall+0x0/0x1c <- SyS_symlink)
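
A rough userspace model of the fix (invented addresses, a plain array
instead of the kretprobe hash list): the trampoline first walks the
instances to find the one real return address for this call, then walks
again so every handler reports that corrected address:

    #include <stdio.h>

    #define TRAMPOLINE 0xdeadUL     /* stands in for kretprobe_trampoline */

    struct instance {
            const char *event;
            unsigned long ret_addr; /* return address saved at function entry */
    };

    /* two kretprobes on one function: the instance pushed second saved the
     * trampoline address, because the first had already hijacked LR */
    static struct instance head[] = {
            { "event2", TRAMPOLINE },
            { "event1", 0xc0ffeeUL },       /* the real caller */
    };

    int main(void)
    {
            unsigned long correct = 0;
            unsigned int i;

            /* pass 1: find the real return address for this call */
            for (i = 0; i < 2; i++) {
                    correct = head[i].ret_addr;
                    if (correct != TRAMPOLINE)
                            break;  /* deeper instances belong to other calls */
            }

            /* pass 2: run the handlers with the corrected address */
            for (i = 0; i < 2; i++)
                    printf("%s: <- %#lx\n", head[i].event, correct);
            return 0;
    }

Before the fix, handlers ran during a single pass and saw the raw saved
value, which for all but the last instance is the trampoline itself, as
in the "without this patch" trace above.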
Signed-off-by: Masami Hiramatsu <mhiramat(a)kernel.org>
Cc: KUMANO Syuhei <kumano.prog(a)gmail.com>
Signed-off-by: Jon Medhurst <tixy(a)linaro.org>
Signed-off-by: Sasha Levin <alexander.levin(a)verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/arm/probes/kprobes/core.c | 24 ++++++++++++++++++++++--
1 file changed, 22 insertions(+), 2 deletions(-)
--- a/arch/arm/probes/kprobes/core.c
+++ b/arch/arm/probes/kprobes/core.c
@@ -433,6 +433,7 @@ static __used __kprobes void *trampoline
struct hlist_node *tmp;
unsigned long flags, orig_ret_address = 0;
unsigned long trampoline_address = (unsigned long)&kretprobe_trampoline;
+ kprobe_opcode_t *correct_ret_addr = NULL;
INIT_HLIST_HEAD(&empty_rp);
kretprobe_hash_lock(current, &head, &flags);
@@ -455,14 +456,34 @@ static __used __kprobes void *trampoline
/* another task is sharing our hash bucket */
continue;
+ orig_ret_address = (unsigned long)ri->ret_addr;
+
+ if (orig_ret_address != trampoline_address)
+ /*
+ * This is the real return address. Any other
+ * instances associated with this task are for
+ * other calls deeper on the call stack
+ */
+ break;
+ }
+
+ kretprobe_assert(ri, orig_ret_address, trampoline_address);
+
+ correct_ret_addr = ri->ret_addr;
+ hlist_for_each_entry_safe(ri, tmp, head, hlist) {
+ if (ri->task != current)
+ /* another task is sharing our hash bucket */
+ continue;
+
+ orig_ret_address = (unsigned long)ri->ret_addr;
if (ri->rp && ri->rp->handler) {
__this_cpu_write(current_kprobe, &ri->rp->kp);
get_kprobe_ctlblk()->kprobe_status = KPROBE_HIT_ACTIVE;
+ ri->ret_addr = correct_ret_addr;
ri->rp->handler(ri, regs);
__this_cpu_write(current_kprobe, NULL);
}
- orig_ret_address = (unsigned long)ri->ret_addr;
recycle_rp_inst(ri, &empty_rp);
if (orig_ret_address != trampoline_address)
@@ -474,7 +495,6 @@ static __used __kprobes void *trampoline
break;
}
- kretprobe_assert(ri, orig_ret_address, trampoline_address);
kretprobe_hash_unlock(current, &flags);
hlist_for_each_entry_safe(ri, tmp, &empty_rp, hlist) {
Patches currently in stable-queue which might be from mhiramat(a)kernel.org are
queue-4.4/arm-kprobes-fix-the-return-address-of-multiple-kretprobes.patch
This is a note to let you know that I've just added the patch titled
arm: kprobes: Align stack to 8-bytes in test code
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
arm-kprobes-align-stack-to-8-bytes-in-test-code.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Dec 21 10:35:49 CET 2017
From: Jon Medhurst <tixy(a)linaro.org>
Date: Thu, 2 Mar 2017 13:04:09 +0000
Subject: arm: kprobes: Align stack to 8-bytes in test code
From: Jon Medhurst <tixy(a)linaro.org>
[ Upstream commit 974310d047f3c7788a51d10c8d255eebdb1fa857 ]
kprobes test cases need to have a stack that is aligned to an 8-byte
boundary because they call other functions (and the ARM ABI mandates
that alignment) and because test cases include 64-bit accesses to the
stack. Unfortunately, GCC doesn't ensure this alignment for inline
assembler, and for the code in question it seems to always misalign the
stack by pushing just the LR register onto it. We therefore need to
explicitly perform stack alignment at the start of each test case.
Without this fix, some test cases will generate alignment faults on
systems where alignment is enforced. Even if the kernel is configured to
handle these faults in software, triggering them is ugly. It also
exposes limitations in the fault handling code which doesn't cope with
writes to the stack. E.g. when handling this instruction
strd r6, [sp, #-64]!
the fault handling code will write to a stack location below the SP
value at the point the fault occurred, which coincides with where the
exception handler has pushed the saved register context. This results in
corruption of those registers.
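
The alignment itself is just a mask of the low three bits (the "bic" in
the patch); a trivial sketch with an invented pointer value:

    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
            uintptr_t sp = 0x9ffefc;        /* a misaligned SP (made up) */

            /* equivalent of "bic r3, r2, #7": round down to the next
             * 8-byte boundary; the original SP is kept (in r2) so the
             * epilogue can restore it exactly */
            uintptr_t aligned = sp & ~(uintptr_t)7;

            printf("%#lx -> %#lx\n", (unsigned long)sp, (unsigned long)aligned);
            return 0;
    }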
Signed-off-by: Jon Medhurst <tixy(a)linaro.org>
Signed-off-by: Sasha Levin <alexander.levin(a)verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/arm/probes/kprobes/test-core.c | 11 ++++++++---
1 file changed, 8 insertions(+), 3 deletions(-)
--- a/arch/arm/probes/kprobes/test-core.c
+++ b/arch/arm/probes/kprobes/test-core.c
@@ -976,7 +976,10 @@ static void coverage_end(void)
void __naked __kprobes_test_case_start(void)
{
__asm__ __volatile__ (
- "stmdb sp!, {r4-r11} \n\t"
+ "mov r2, sp \n\t"
+ "bic r3, r2, #7 \n\t"
+ "mov sp, r3 \n\t"
+ "stmdb sp!, {r2-r11} \n\t"
"sub sp, sp, #"__stringify(TEST_MEMORY_SIZE)"\n\t"
"bic r0, lr, #1 @ r0 = inline data \n\t"
"mov r1, sp \n\t"
@@ -996,7 +999,8 @@ void __naked __kprobes_test_case_end_32(
"movne pc, r0 \n\t"
"mov r0, r4 \n\t"
"add sp, sp, #"__stringify(TEST_MEMORY_SIZE)"\n\t"
- "ldmia sp!, {r4-r11} \n\t"
+ "ldmia sp!, {r2-r11} \n\t"
+ "mov sp, r2 \n\t"
"mov pc, r0 \n\t"
);
}
@@ -1012,7 +1016,8 @@ void __naked __kprobes_test_case_end_16(
"bxne r0 \n\t"
"mov r0, r4 \n\t"
"add sp, sp, #"__stringify(TEST_MEMORY_SIZE)"\n\t"
- "ldmia sp!, {r4-r11} \n\t"
+ "ldmia sp!, {r2-r11} \n\t"
+ "mov sp, r2 \n\t"
"bx r0 \n\t"
);
}
Patches currently in stable-queue which might be from tixy(a)linaro.org are
queue-4.4/arm-kprobes-fix-the-return-address-of-multiple-kretprobes.patch
queue-4.4/arm-kprobes-align-stack-to-8-bytes-in-test-code.patch
This is a note to let you know that I've just added the patch titled
ARM: Hide finish_arch_post_lock_switch() from modules
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
arm-hide-finish_arch_post_lock_switch-from-modules.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From ef0491ea17f8019821c7e9c8e801184ecf17f85a Mon Sep 17 00:00:00 2001
From: Steven Rostedt <rostedt(a)goodmis.org>
Date: Fri, 13 May 2016 15:30:13 +0200
Subject: ARM: Hide finish_arch_post_lock_switch() from modules
From: Steven Rostedt <rostedt(a)goodmis.org>
commit ef0491ea17f8019821c7e9c8e801184ecf17f85a upstream.
The introduction of switch_mm_irqs_off() brought back an old bug
regarding the use of preempt_enable_no_resched:
As part of:
62b94a08da1b ("sched/preempt: Take away preempt_enable_no_resched() from modules")
the definition of preempt_enable_no_resched() is only available in
built-in code, not in loadable modules, so we can't generally use
it from header files.
However, the ARM version of finish_arch_post_lock_switch()
calls preempt_enable_no_resched() and is defined as a static
inline function in asm/mmu_context.h. This in turn means we cannot
include asm/mmu_context.h from modules.
With today's tip tree, asm/mmu_context.h gets included from
linux/mmu_context.h, which is normally the exact pattern one would
expect, but unfortunately, linux/mmu_context.h can be included from
the vhost driver that is a loadable module, now causing this compile
time error with modular configs:
In file included from ../include/linux/mmu_context.h:4:0,
from ../drivers/vhost/vhost.c:18:
../arch/arm/include/asm/mmu_context.h: In function 'finish_arch_post_lock_switch':
../arch/arm/include/asm/mmu_context.h:88:3: error: implicit declaration of function 'preempt_enable_no_resched' [-Werror=implicit-function-declaration]
preempt_enable_no_resched();
Andy already tried to fix the bug by including linux/preempt.h
from asm/mmu_context.h, but that didn't help. Arnd suggested reordering
the header files, which wasn't popular, so let's use this
workaround instead:
The finish_arch_post_lock_switch() definition is now hidden behind
#ifndef MODULE, so modules don't see anything referencing
preempt_enable_no_resched() from a header file. I've built a
few hundred randconfig kernels with this, and did not see any
new problems.
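
The pattern, reduced to a compilable sketch (not the real header; the
body is elided): built-in code gets the inline helper, while a module
build, which defines MODULE, never sees the reference:

    /* sketch of the #ifndef MODULE guard used by the patch */
    #ifndef MODULE
    #define finish_arch_post_lock_switch finish_arch_post_lock_switch
    static inline void finish_arch_post_lock_switch(void)
    {
            /* may freely use built-in-only symbols such as
             * preempt_enable_no_resched() */
    }
    #endif /* !MODULE */

    int main(void)
    {
    #ifdef finish_arch_post_lock_switch
            finish_arch_post_lock_switch();
    #endif
            return 0;
    }

Compiling with -DMODULE drops both the definition and the call, which is
why a modular vhost no longer trips over the built-in-only symbol.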
Tested-by: Guenter Roeck <linux(a)roeck-us.net>
Signed-off-by: Steven Rostedt <rostedt(a)goodmis.org>
Signed-off-by: Arnd Bergmann <arnd(a)arndb.de>
Acked-by: Russell King <rmk+kernel(a)arm.linux.org.uk>
Cc: Alexander Shishkin <alexander.shishkin(a)linux.intel.com>
Cc: Andy Lutomirski <luto(a)amacapital.net>
Cc: Andy Lutomirski <luto(a)kernel.org>
Cc: Ard Biesheuvel <ard.biesheuvel(a)linaro.org>
Cc: Arnaldo Carvalho de Melo <acme(a)redhat.com>
Cc: Borislav Petkov <bp(a)suse.de>
Cc: Frederic Weisbecker <fweisbec(a)gmail.com>
Cc: Jiri Olsa <jolsa(a)redhat.com>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Mel Gorman <mgorman(a)techsingularity.net>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Russell King - ARM Linux <linux(a)armlinux.org.uk>
Cc: Stephane Eranian <eranian(a)google.com>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Vince Weaver <vincent.weaver(a)maine.edu>
Cc: linux-arm-kernel(a)lists.infradead.org
Fixes: f98db6013c55 ("sched/core: Add switch_mm_irqs_off() and use it in the scheduler")
Link: http://lkml.kernel.org/r/1463146234-161304-1-git-send-email-arnd@arndb.de
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/arm/include/asm/mmu_context.h | 2 ++
1 file changed, 2 insertions(+)
--- a/arch/arm/include/asm/mmu_context.h
+++ b/arch/arm/include/asm/mmu_context.h
@@ -61,6 +61,7 @@ static inline void check_and_switch_cont
cpu_switch_mm(mm->pgd, mm);
}
+#ifndef MODULE
#define finish_arch_post_lock_switch \
finish_arch_post_lock_switch
static inline void finish_arch_post_lock_switch(void)
@@ -82,6 +83,7 @@ static inline void finish_arch_post_lock
preempt_enable_no_resched();
}
}
+#endif /* !MODULE */
#endif /* CONFIG_MMU */
Patches currently in stable-queue which might be from rostedt(a)goodmis.org are
queue-4.4/arm-hide-finish_arch_post_lock_switch-from-modules.patch
This is a note to let you know that I've just added the patch titled
ARM: dts: ti: fix PCI bus dtc warnings
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
arm-dts-ti-fix-pci-bus-dtc-warnings.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Dec 21 10:35:49 CET 2017
From: Rob Herring <robh(a)kernel.org>
Date: Tue, 21 Mar 2017 21:03:01 -0500
Subject: ARM: dts: ti: fix PCI bus dtc warnings
From: Rob Herring <robh(a)kernel.org>
[ Upstream commit 7d79f6098d82f8c09914d7799bc96891ad9c3baf ]
dtc recently added PCI bus checks. Fix these warnings.
Signed-off-by: Rob Herring <robh(a)kernel.org>
Cc: "Benoît Cousson" <bcousson(a)baylibre.com>
Cc: Tony Lindgren <tony(a)atomide.com>
Cc: linux-omap(a)vger.kernel.org
Signed-off-by: Tony Lindgren <tony(a)atomide.com>
Signed-off-by: Sasha Levin <alexander.levin(a)verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/arm/boot/dts/dra7.dtsi | 2 ++
1 file changed, 2 insertions(+)
--- a/arch/arm/boot/dts/dra7.dtsi
+++ b/arch/arm/boot/dts/dra7.dtsi
@@ -227,6 +227,7 @@
device_type = "pci";
ranges = <0x81000000 0 0 0x03000 0 0x00010000
0x82000000 0 0x20013000 0x13000 0 0xffed000>;
+ bus-range = <0x00 0xff>;
#interrupt-cells = <1>;
num-lanes = <1>;
ti,hwmods = "pcie1";
@@ -262,6 +263,7 @@
device_type = "pci";
ranges = <0x81000000 0 0 0x03000 0 0x00010000
0x82000000 0 0x30013000 0x13000 0 0xffed000>;
+ bus-range = <0x00 0xff>;
#interrupt-cells = <1>;
num-lanes = <1>;
ti,hwmods = "pcie2";
Patches currently in stable-queue which might be from robh(a)kernel.org are
queue-4.4/arm-dts-ti-fix-pci-bus-dtc-warnings.patch
This is a note to let you know that I've just added the patch titled
ARM: dts: am335x-evmsk: adjust mmc2 param to allow suspend
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
arm-dts-am335x-evmsk-adjust-mmc2-param-to-allow-suspend.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Dec 21 10:35:49 CET 2017
From: "Reizer, Eyal" <eyalr(a)ti.com>
Date: Sun, 26 Mar 2017 08:53:10 +0000
Subject: ARM: dts: am335x-evmsk: adjust mmc2 param to allow suspend
From: "Reizer, Eyal" <eyalr(a)ti.com>
[ Upstream commit 9bcf53f34a2c1cebc45cc12e273dcd5f51fbc099 ]
mmc2, used for wl12xx, was missing the keep-power-in-suspend
parameter. As a result, the board couldn't reach suspend state.
Signed-off-by: Eyal Reizer <eyalr(a)ti.com>
Signed-off-by: Tony Lindgren <tony(a)atomide.com>
Signed-off-by: Sasha Levin <alexander.levin(a)verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/arm/boot/dts/am335x-evmsk.dts | 1 +
1 file changed, 1 insertion(+)
--- a/arch/arm/boot/dts/am335x-evmsk.dts
+++ b/arch/arm/boot/dts/am335x-evmsk.dts
@@ -668,6 +668,7 @@
ti,non-removable;
bus-width = <4>;
cap-power-off-card;
+ keep-power-in-suspend;
pinctrl-names = "default";
pinctrl-0 = <&mmc2_pins>;
Patches currently in stable-queue which might be from eyalr(a)ti.com are
queue-4.4/arm-dts-am335x-evmsk-adjust-mmc2-param-to-allow-suspend.patch
This is a note to let you know that I've just added the patch titled
ARM: dma-mapping: disallow dma_get_sgtable() for non-kernel managed memory
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
arm-dma-mapping-disallow-dma_get_sgtable-for-non-kernel-managed-memory.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Dec 21 10:35:49 CET 2017
From: Russell King <rmk+kernel(a)armlinux.org.uk>
Date: Wed, 29 Mar 2017 17:12:47 +0100
Subject: ARM: dma-mapping: disallow dma_get_sgtable() for non-kernel managed memory
From: Russell King <rmk+kernel(a)armlinux.org.uk>
[ Upstream commit 916a008b4b8ecc02fbd035cfb133773dba1ff3d7 ]
dma_get_sgtable() tries to create a scatterlist table containing valid
struct page pointers for the coherent memory allocation passed in to it.
However, memory can be declared via dma_declare_coherent_memory(), or
via other reservation schemes which means that coherent memory is not
guaranteed to be backed by struct pages. In such cases, the resulting
scatterlist table contains pointers to invalid pages, which causes
a kernel oops later.
This patch adds detection of such memory, and refuses to create a
scatterlist table for such memory.
Reported-by: Shuah Khan <shuahkhan(a)gmail.com>
Signed-off-by: Russell King <rmk+kernel(a)armlinux.org.uk>
Signed-off-by: Sasha Levin <alexander.levin(a)verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/arm/mm/dma-mapping.c | 20 +++++++++++++++++++-
1 file changed, 19 insertions(+), 1 deletion(-)
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -774,13 +774,31 @@ static void arm_coherent_dma_free(struct
__arm_dma_free(dev, size, cpu_addr, handle, attrs, true);
}
+/*
+ * The whole dma_get_sgtable() idea is fundamentally unsafe - it seems
+ * that the intention is to allow exporting memory allocated via the
+ * coherent DMA APIs through the dma_buf API, which only accepts a
+ * scattertable. This presents a couple of problems:
+ * 1. Not all memory allocated via the coherent DMA APIs is backed by
+ * a struct page
+ * 2. Passing coherent DMA memory into the streaming APIs is not allowed
+ * as we will try to flush the memory through a different alias to that
+ * actually being used (and the flushes are redundant.)
+ */
int arm_dma_get_sgtable(struct device *dev, struct sg_table *sgt,
void *cpu_addr, dma_addr_t handle, size_t size,
struct dma_attrs *attrs)
{
- struct page *page = pfn_to_page(dma_to_pfn(dev, handle));
+ unsigned long pfn = dma_to_pfn(dev, handle);
+ struct page *page;
int ret;
+ /* If the PFN is not valid, we do not have a struct page */
+ if (!pfn_valid(pfn))
+ return -ENXIO;
+
+ page = pfn_to_page(pfn);
+
ret = sg_alloc_table(sgt, 1, GFP_KERNEL);
if (unlikely(ret))
return ret;
Patches currently in stable-queue which might be from rmk+kernel(a)armlinux.org.uk are
queue-4.4/arm-dma-mapping-disallow-dma_get_sgtable-for-non-kernel-managed-memory.patch
queue-4.4/rtc-pl031-make-interrupt-optional.patch
This is a note to let you know that I've just added the patch titled
x86: unify tss_struct
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-unify-tss_struct.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From ca241c75037b32e0216a68e39ad2801d04fa1f87 Mon Sep 17 00:00:00 2001
From: Glauber de Oliveira Costa <gcosta(a)redhat.com>
Date: Wed, 30 Jan 2008 13:31:31 +0100
Subject: x86: unify tss_struct
From: Glauber de Oliveira Costa <gcosta(a)redhat.com>
commit ca241c75037b32e0216a68e39ad2801d04fa1f87 upstream.
Although slightly different, the tss_struct is very similar on x86_64 and
i386. The really different part, which matches the hardware view of it, is
now called x86_hw_tss, and each of the architectures provides its own.
It's then used as a field in the outer tss_struct.
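
Roughly, the layout pattern being introduced (field names abbreviated,
not the real processor.h definitions):

    #include <stdio.h>

    struct x86_hw_tss {             /* the part the hardware actually reads */
            unsigned int sp0;
            unsigned short io_bitmap_base;
    } __attribute__((packed));      /* packed only; see the hunk below */

    struct tss_struct {             /* the outer, software-side structure */
            struct x86_hw_tss x86_tss;      /* hardware view as a field */
            unsigned long io_bitmap[4];     /* software bookkeeping */
    } __attribute__((aligned(64))); /* alignment belongs to the container */

    int main(void)
    {
            printf("sizeof(struct tss_struct) = %zu\n",
                   sizeof(struct tss_struct));
            return 0;
    }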
Signed-off-by: Glauber de Oliveira Costa <gcosta(a)redhat.com>
Signed-off-by: Ingo Molnar <mingo(a)elte.hu>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Signed-off-by: Eduardo Valentin <eduval(a)amazon.com>
Signed-off-by: Eduardo Valentin <edubezval(a)gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/include/asm/processor.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -272,7 +272,7 @@ struct x86_hw_tss {
u16 reserved5;
u16 io_bitmap_base;
-} __attribute__((packed)) ____cacheline_aligned;
+} __attribute__((packed));
#endif
/*
Patches currently in stable-queue which might be from gcosta(a)redhat.com are
queue-4.9/x86-unify-tss_struct.patch
This is a note to let you know that I've just added the patch titled
xhci: plat: Register shutdown for xhci_plat
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
xhci-plat-register-shutdown-for-xhci_plat.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Dec 21 09:02:40 CET 2017
From: Adam Wallis <awallis(a)codeaurora.org>
Date: Tue, 28 Mar 2017 15:55:28 +0300
Subject: xhci: plat: Register shutdown for xhci_plat
From: Adam Wallis <awallis(a)codeaurora.org>
[ Upstream commit b07c12517f2aed0add8ce18146bb426b14099392 ]
Shutdown should be called for xhci_plat devices, especially for
situations where kexec might be used, so that DMA transactions are
stopped.
Signed-off-by: Adam Wallis <awallis(a)codeaurora.org>
Signed-off-by: Mathias Nyman <mathias.nyman(a)linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Signed-off-by: Sasha Levin <alexander.levin(a)verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/usb/host/xhci-plat.c | 1 +
1 file changed, 1 insertion(+)
--- a/drivers/usb/host/xhci-plat.c
+++ b/drivers/usb/host/xhci-plat.c
@@ -335,6 +335,7 @@ MODULE_DEVICE_TABLE(acpi, usb_xhci_acpi_
static struct platform_driver usb_xhci_driver = {
.probe = xhci_plat_probe,
.remove = xhci_plat_remove,
+ .shutdown = usb_hcd_platform_shutdown,
.driver = {
.name = "xhci-hcd",
.pm = DEV_PM_OPS,
Patches currently in stable-queue which might be from awallis(a)codeaurora.org are
queue-4.9/xhci-plat-register-shutdown-for-xhci_plat.patch
This is a note to let you know that I've just added the patch titled
x86/mm: Use new merged flush logic in arch_tlbbatch_flush()
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-mm-use-new-merged-flush-logic-in-arch_tlbbatch_flush.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 3f79e4c7c9c2f5c30751ea5c8dd9fd1d56b81947 Mon Sep 17 00:00:00 2001
From: Andy Lutomirski <luto(a)kernel.org>
Date: Sun, 28 May 2017 10:00:13 -0700
Subject: x86/mm: Use new merged flush logic in arch_tlbbatch_flush()
From: Andy Lutomirski <luto(a)kernel.org>
commit 3f79e4c7c9c2f5c30751ea5c8dd9fd1d56b81947 upstream.
Now there's only one copy of the local tlb flush logic for
non-kernel pages on SMP kernels.
The only functional change is that arch_tlbbatch_flush() will now
leave_mm() on the local CPU if that CPU is in the batch and is in
TLBSTATE_LAZY mode.
Signed-off-by: Andy Lutomirski <luto(a)kernel.org>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Arjan van de Ven <arjan(a)linux.intel.com>
Cc: Borislav Petkov <bpetkov(a)suse.de>
Cc: Dave Hansen <dave.hansen(a)intel.com>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Mel Gorman <mgorman(a)suse.de>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Nadav Amit <nadav.amit(a)gmail.com>
Cc: Nadav Amit <namit(a)vmware.com>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Rik van Riel <riel(a)redhat.com>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: linux-mm(a)kvack.org
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Signed-off-by: Eduardo Valentin <eduval(a)amazon.com>
Signed-off-by: Eduardo Valentin <edubezval(a)gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/mm/tlb.c | 8 ++------
1 file changed, 2 insertions(+), 6 deletions(-)
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -405,12 +405,8 @@ void arch_tlbbatch_flush(struct arch_tlb
int cpu = get_cpu();
- if (cpumask_test_cpu(cpu, &batch->cpumask)) {
- count_vm_tlb_event(NR_TLB_LOCAL_FLUSH_ALL);
- local_flush_tlb();
- trace_tlb_flush(TLB_LOCAL_SHOOTDOWN, TLB_FLUSH_ALL);
- }
-
+ if (cpumask_test_cpu(cpu, &batch->cpumask))
+ flush_tlb_func_local(&info, TLB_LOCAL_SHOOTDOWN);
if (cpumask_any_but(&batch->cpumask, cpu) < nr_cpu_ids)
flush_tlb_others(&batch->cpumask, &info);
cpumask_clear(&batch->cpumask);
Patches currently in stable-queue which might be from luto(a)kernel.org are
queue-4.9/x86-mm-refactor-flush_tlb_mm_range-to-merge-local-and-remote-cases.patch
queue-4.9/x86-mm-pass-flush_tlb_info-to-flush_tlb_others-etc.patch
queue-4.9/x86-mm-rework-lazy-tlb-to-track-the-actual-loaded-mm.patch
queue-4.9/x86-mm-kvm-teach-kvm-s-vmx-code-that-cr3-isn-t-a-constant.patch
queue-4.9/x86-mm-use-new-merged-flush-logic-in-arch_tlbbatch_flush.patch
queue-4.9/x86-kvm-vmx-simplify-segment_base.patch
queue-4.9/x86-entry-unwind-create-stack-frames-for-saved-interrupt-registers.patch
queue-4.9/x86-mm-reduce-indentation-in-flush_tlb_func.patch
queue-4.9/x86-mm-remove-the-up-asm-tlbflush.h-code-always-use-the-formerly-smp-code.patch
queue-4.9/x86-mm-reimplement-flush_tlb_page-using-flush_tlb_mm_range.patch
queue-4.9/mm-x86-mm-make-the-batched-unmap-tlb-flush-api-more-generic.patch
queue-4.9/x86-kvm-vmx-defer-tr-reload-after-vm-exit.patch
queue-4.9/x86-mm-change-the-leave_mm-condition-for-local-tlb-flushes.patch
queue-4.9/x86-mm-be-more-consistent-wrt-page_shift-vs-page_size-in-tlb-flush-code.patch
This is a note to let you know that I've just added the patch titled
x86/mm: Rework lazy TLB to track the actual loaded mm
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-mm-rework-lazy-tlb-to-track-the-actual-loaded-mm.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 3d28ebceaffab40f30afa87e33331560148d7b8b Mon Sep 17 00:00:00 2001
From: Andy Lutomirski <luto(a)kernel.org>
Date: Sun, 28 May 2017 10:00:15 -0700
Subject: x86/mm: Rework lazy TLB to track the actual loaded mm
From: Andy Lutomirski <luto(a)kernel.org>
commit 3d28ebceaffab40f30afa87e33331560148d7b8b upstream.
Lazy TLB state is currently managed in a rather baroque manner.
AFAICT, there are three possible states:
- Non-lazy. This means that we're running a user thread or a
kernel thread that has called use_mm(). current->mm ==
current->active_mm == cpu_tlbstate.active_mm and
cpu_tlbstate.state == TLBSTATE_OK.
- Lazy with user mm. We're running a kernel thread without an mm
and we're borrowing an mm_struct. We have current->mm == NULL,
current->active_mm == cpu_tlbstate.active_mm, cpu_tlbstate.state
!= TLBSTATE_OK (i.e. TLBSTATE_LAZY or 0). The current cpu is set
in mm_cpumask(current->active_mm). CR3 points to
current->active_mm->pgd. The TLB is up to date.
- Lazy with init_mm. This happens when we call leave_mm(). We
have current->mm == NULL, current->active_mm ==
cpu_tlbstate.active_mm, but that mm is only relevant insofar as
the scheduler is tracking it for refcounting. cpu_tlbstate.state
!= TLBSTATE_OK. The current cpu is clear in
mm_cpumask(current->active_mm). CR3 points to swapper_pg_dir,
i.e. init_mm->pgd.
This patch simplifies the situation. Other than perf, x86 stops
caring about current->active_mm at all. We have
cpu_tlbstate.loaded_mm pointing to the mm that CR3 references. The
TLB is always up to date for that mm. leave_mm() just switches us
to init_mm. There are no longer any special cases for mm_cpumask,
and switch_mm() switches mms without worrying about laziness.
After this patch, cpu_tlbstate.state serves only to tell the TLB
flush code whether it may switch to init_mm instead of doing a
normal flush.
This makes fairly extensive changes to xen_exit_mmap(), which used
to look a bit like black magic.
Perf is unchanged. With or without this change, perf may behave a bit
erratically if it tries to read user memory in kernel thread context.
We should build on this patch to teach perf to never look at user
memory when cpu_tlbstate.loaded_mm != current->mm.
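
For orientation, the three states above paraphrased as a tiny userspace
model (names invented; nothing here is kernel code):

    #include <stdio.h>

    struct mm { const char *name; };

    static struct mm init_mm = { "init_mm" };
    static struct mm user_mm = { "user_mm" };

    /* after the patch there is one authoritative field: loaded_mm always
     * names what CR3 references, and a single lazy flag says whether the
     * flush code may switch to init_mm instead of flushing */
    struct tlb_state {
            struct mm *loaded_mm;
            int lazy;
    };

    int main(void)
    {
            struct tlb_state s = { &user_mm, 0 };   /* non-lazy, user thread */

            s.lazy = 1;                 /* kernel thread borrowing user_mm */
            s.loaded_mm = &init_mm;     /* leave_mm(): switch_mm() to init_mm */

            printf("loaded_mm=%s lazy=%d\n", s.loaded_mm->name, s.lazy);
            return 0;
    }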
Signed-off-by: Andy Lutomirski <luto(a)kernel.org>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Arjan van de Ven <arjan(a)linux.intel.com>
Cc: Borislav Petkov <bpetkov(a)suse.de>
Cc: Dave Hansen <dave.hansen(a)intel.com>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Mel Gorman <mgorman(a)suse.de>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Nadav Amit <nadav.amit(a)gmail.com>
Cc: Nadav Amit <namit(a)vmware.com>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Rik van Riel <riel(a)redhat.com>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: linux-mm(a)kvack.org
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Signed-off-by: Eduardo Valentin <eduval(a)amazon.com>
Signed-off-by: Eduardo Valentin <edubezval(a)gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/events/core.c | 3
arch/x86/include/asm/tlbflush.h | 12 +-
arch/x86/kernel/ldt.c | 7 -
arch/x86/mm/init.c | 2
arch/x86/mm/tlb.c | 216 ++++++++++++++++++++--------------------
arch/x86/xen/mmu.c | 51 ++++-----
6 files changed, 147 insertions(+), 144 deletions(-)
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -2100,8 +2100,7 @@ static int x86_pmu_event_init(struct per
static void refresh_pce(void *ignored)
{
- if (current->active_mm)
- load_mm_cr4(current->active_mm);
+ load_mm_cr4(this_cpu_read(cpu_tlbstate.loaded_mm));
}
static void x86_pmu_event_mapped(struct perf_event *event)
--- a/arch/x86/include/asm/tlbflush.h
+++ b/arch/x86/include/asm/tlbflush.h
@@ -66,7 +66,13 @@ static inline void invpcid_flush_all_non
#endif
struct tlb_state {
- struct mm_struct *active_mm;
+ /*
+ * cpu_tlbstate.loaded_mm should match CR3 whenever interrupts
+ * are on. This means that it may not match current->active_mm,
+ * which will contain the previous user mm when we're in lazy TLB
+ * mode even if we've already switched back to swapper_pg_dir.
+ */
+ struct mm_struct *loaded_mm;
int state;
/*
@@ -249,7 +255,9 @@ void native_flush_tlb_others(const struc
static inline void reset_lazy_tlbstate(void)
{
this_cpu_write(cpu_tlbstate.state, 0);
- this_cpu_write(cpu_tlbstate.active_mm, &init_mm);
+ this_cpu_write(cpu_tlbstate.loaded_mm, &init_mm);
+
+ WARN_ON(read_cr3() != __pa_symbol(swapper_pg_dir));
}
static inline void arch_tlbbatch_add_mm(struct arch_tlbflush_unmap_batch *batch,
--- a/arch/x86/kernel/ldt.c
+++ b/arch/x86/kernel/ldt.c
@@ -23,14 +23,15 @@
#include <asm/syscalls.h>
/* context.lock is held for us, so we don't need any locking. */
-static void flush_ldt(void *current_mm)
+static void flush_ldt(void *__mm)
{
+ struct mm_struct *mm = __mm;
mm_context_t *pc;
- if (current->active_mm != current_mm)
+ if (this_cpu_read(cpu_tlbstate.loaded_mm) != mm)
return;
- pc = &current->active_mm->context;
+ pc = &mm->context;
set_ldt(pc->ldt->entries, pc->ldt->size);
}
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -764,7 +764,7 @@ void __init zone_sizes_init(void)
}
DEFINE_PER_CPU_SHARED_ALIGNED(struct tlb_state, cpu_tlbstate) = {
- .active_mm = &init_mm,
+ .loaded_mm = &init_mm,
.state = 0,
.cr4 = ~0UL, /* fail hard if we screw up cr4 shadow initialization */
};
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -28,26 +28,25 @@
* Implement flush IPI by CALL_FUNCTION_VECTOR, Alex Shi
*/
-/*
- * We cannot call mmdrop() because we are in interrupt context,
- * instead update mm->cpu_vm_mask.
- */
void leave_mm(int cpu)
{
- struct mm_struct *active_mm = this_cpu_read(cpu_tlbstate.active_mm);
+ struct mm_struct *loaded_mm = this_cpu_read(cpu_tlbstate.loaded_mm);
+
+ /*
+ * It's plausible that we're in lazy TLB mode while our mm is init_mm.
+ * If so, our callers still expect us to flush the TLB, but there
+ * aren't any user TLB entries in init_mm to worry about.
+ *
+ * This needs to happen before any other sanity checks due to
+ * intel_idle's shenanigans.
+ */
+ if (loaded_mm == &init_mm)
+ return;
+
if (this_cpu_read(cpu_tlbstate.state) == TLBSTATE_OK)
BUG();
- if (cpumask_test_cpu(cpu, mm_cpumask(active_mm))) {
- cpumask_clear_cpu(cpu, mm_cpumask(active_mm));
- load_cr3(swapper_pg_dir);
- /*
- * This gets called in the idle path where RCU
- * functions differently. Tracing normally
- * uses RCU, so we have to call the tracepoint
- * specially here.
- */
- trace_tlb_flush_rcuidle(TLB_FLUSH_ON_TASK_SWITCH, TLB_FLUSH_ALL);
- }
+
+ switch_mm(NULL, &init_mm, NULL);
}
EXPORT_SYMBOL_GPL(leave_mm);
@@ -65,108 +64,109 @@ void switch_mm_irqs_off(struct mm_struct
struct task_struct *tsk)
{
unsigned cpu = smp_processor_id();
+ struct mm_struct *real_prev = this_cpu_read(cpu_tlbstate.loaded_mm);
- if (likely(prev != next)) {
- if (IS_ENABLED(CONFIG_VMAP_STACK)) {
- /*
- * If our current stack is in vmalloc space and isn't
- * mapped in the new pgd, we'll double-fault. Forcibly
- * map it.
- */
- unsigned int stack_pgd_index = pgd_index(current_stack_pointer());
-
- pgd_t *pgd = next->pgd + stack_pgd_index;
-
- if (unlikely(pgd_none(*pgd)))
- set_pgd(pgd, init_mm.pgd[stack_pgd_index]);
- }
+ /*
+ * NB: The scheduler will call us with prev == next when
+ * switching from lazy TLB mode to normal mode if active_mm
+ * isn't changing. When this happens, there is no guarantee
+ * that CR3 (and hence cpu_tlbstate.loaded_mm) matches next.
+ *
+ * NB: leave_mm() calls us with prev == NULL and tsk == NULL.
+ */
- this_cpu_write(cpu_tlbstate.state, TLBSTATE_OK);
- this_cpu_write(cpu_tlbstate.active_mm, next);
+ this_cpu_write(cpu_tlbstate.state, TLBSTATE_OK);
- cpumask_set_cpu(cpu, mm_cpumask(next));
+ if (real_prev == next) {
+ /*
+ * There's nothing to do: we always keep the per-mm control
+ * regs in sync with cpu_tlbstate.loaded_mm. Just
+ * sanity-check mm_cpumask.
+ */
+ if (WARN_ON_ONCE(!cpumask_test_cpu(cpu, mm_cpumask(next))))
+ cpumask_set_cpu(cpu, mm_cpumask(next));
+ return;
+ }
+ if (IS_ENABLED(CONFIG_VMAP_STACK)) {
/*
- * Re-load page tables.
- *
- * This logic has an ordering constraint:
- *
- * CPU 0: Write to a PTE for 'next'
- * CPU 0: load bit 1 in mm_cpumask. if nonzero, send IPI.
- * CPU 1: set bit 1 in next's mm_cpumask
- * CPU 1: load from the PTE that CPU 0 writes (implicit)
- *
- * We need to prevent an outcome in which CPU 1 observes
- * the new PTE value and CPU 0 observes bit 1 clear in
- * mm_cpumask. (If that occurs, then the IPI will never
- * be sent, and CPU 0's TLB will contain a stale entry.)
- *
- * The bad outcome can occur if either CPU's load is
- * reordered before that CPU's store, so both CPUs must
- * execute full barriers to prevent this from happening.
- *
- * Thus, switch_mm needs a full barrier between the
- * store to mm_cpumask and any operation that could load
- * from next->pgd. TLB fills are special and can happen
- * due to instruction fetches or for no reason at all,
- * and neither LOCK nor MFENCE orders them.
- * Fortunately, load_cr3() is serializing and gives the
- * ordering guarantee we need.
- *
+ * If our current stack is in vmalloc space and isn't
+ * mapped in the new pgd, we'll double-fault. Forcibly
+ * map it.
*/
- load_cr3(next->pgd);
+ unsigned int stack_pgd_index = pgd_index(current_stack_pointer());
- trace_tlb_flush(TLB_FLUSH_ON_TASK_SWITCH, TLB_FLUSH_ALL);
+ pgd_t *pgd = next->pgd + stack_pgd_index;
- /* Stop flush ipis for the previous mm */
- cpumask_clear_cpu(cpu, mm_cpumask(prev));
+ if (unlikely(pgd_none(*pgd)))
+ set_pgd(pgd, init_mm.pgd[stack_pgd_index]);
+ }
- /* Load per-mm CR4 state */
- load_mm_cr4(next);
+ this_cpu_write(cpu_tlbstate.loaded_mm, next);
-#ifdef CONFIG_MODIFY_LDT_SYSCALL
- /*
- * Load the LDT, if the LDT is different.
- *
- * It's possible that prev->context.ldt doesn't match
- * the LDT register. This can happen if leave_mm(prev)
- * was called and then modify_ldt changed
- * prev->context.ldt but suppressed an IPI to this CPU.
- * In this case, prev->context.ldt != NULL, because we
- * never set context.ldt to NULL while the mm still
- * exists. That means that next->context.ldt !=
- * prev->context.ldt, because mms never share an LDT.
- */
- if (unlikely(prev->context.ldt != next->context.ldt))
- load_mm_ldt(next);
-#endif
- } else {
- this_cpu_write(cpu_tlbstate.state, TLBSTATE_OK);
- BUG_ON(this_cpu_read(cpu_tlbstate.active_mm) != next);
+ WARN_ON_ONCE(cpumask_test_cpu(cpu, mm_cpumask(next)));
+ cpumask_set_cpu(cpu, mm_cpumask(next));
- if (!cpumask_test_cpu(cpu, mm_cpumask(next))) {
- /*
- * On established mms, the mm_cpumask is only changed
- * from irq context, from ptep_clear_flush() while in
- * lazy tlb mode, and here. Irqs are blocked during
- * schedule, protecting us from simultaneous changes.
- */
- cpumask_set_cpu(cpu, mm_cpumask(next));
+ /*
+ * Re-load page tables.
+ *
+ * This logic has an ordering constraint:
+ *
+ * CPU 0: Write to a PTE for 'next'
+ * CPU 0: load bit 1 in mm_cpumask. if nonzero, send IPI.
+ * CPU 1: set bit 1 in next's mm_cpumask
+ * CPU 1: load from the PTE that CPU 0 writes (implicit)
+ *
+ * We need to prevent an outcome in which CPU 1 observes
+ * the new PTE value and CPU 0 observes bit 1 clear in
+ * mm_cpumask. (If that occurs, then the IPI will never
+ * be sent, and CPU 0's TLB will contain a stale entry.)
+ *
+ * The bad outcome can occur if either CPU's load is
+ * reordered before that CPU's store, so both CPUs must
+ * execute full barriers to prevent this from happening.
+ *
+ * Thus, switch_mm needs a full barrier between the
+ * store to mm_cpumask and any operation that could load
+ * from next->pgd. TLB fills are special and can happen
+ * due to instruction fetches or for no reason at all,
+ * and neither LOCK nor MFENCE orders them.
+ * Fortunately, load_cr3() is serializing and gives the
+ * ordering guarantee we need.
+ */
+ load_cr3(next->pgd);
+
+ /*
+ * This gets called via leave_mm() in the idle path where RCU
+ * functions differently. Tracing normally uses RCU, so we have to
+ * call the tracepoint specially here.
+ */
+ trace_tlb_flush_rcuidle(TLB_FLUSH_ON_TASK_SWITCH, TLB_FLUSH_ALL);
+
+ /* Stop flush ipis for the previous mm */
+ WARN_ON_ONCE(!cpumask_test_cpu(cpu, mm_cpumask(real_prev)) &&
+ real_prev != &init_mm);
+ cpumask_clear_cpu(cpu, mm_cpumask(real_prev));
- /*
- * We were in lazy tlb mode and leave_mm disabled
- * tlb flush IPI delivery. We must reload CR3
- * to make sure to use no freed page tables.
- *
- * As above, load_cr3() is serializing and orders TLB
- * fills with respect to the mm_cpumask write.
- */
- load_cr3(next->pgd);
- trace_tlb_flush(TLB_FLUSH_ON_TASK_SWITCH, TLB_FLUSH_ALL);
- load_mm_cr4(next);
- load_mm_ldt(next);
- }
- }
+ /* Load per-mm CR4 state */
+ load_mm_cr4(next);
+
+#ifdef CONFIG_MODIFY_LDT_SYSCALL
+ /*
+ * Load the LDT, if the LDT is different.
+ *
+ * It's possible that prev->context.ldt doesn't match
+ * the LDT register. This can happen if leave_mm(prev)
+ * was called and then modify_ldt changed
+ * prev->context.ldt but suppressed an IPI to this CPU.
+ * In this case, prev->context.ldt != NULL, because we
+ * never set context.ldt to NULL while the mm still
+ * exists. That means that next->context.ldt !=
+ * prev->context.ldt, because mms never share an LDT.
+ */
+ if (unlikely(real_prev->context.ldt != next->context.ldt))
+ load_mm_ldt(next);
+#endif
}
/*
@@ -246,7 +246,7 @@ static void flush_tlb_func_remote(void *
inc_irq_stat(irq_tlb_count);
- if (f->mm && f->mm != this_cpu_read(cpu_tlbstate.active_mm))
+ if (f->mm && f->mm != this_cpu_read(cpu_tlbstate.loaded_mm))
return;
count_vm_tlb_event(NR_TLB_REMOTE_FLUSH_RECEIVED);
@@ -337,7 +337,7 @@ void flush_tlb_mm_range(struct mm_struct
info.end = TLB_FLUSH_ALL;
}
- if (mm == current->active_mm)
+ if (mm == this_cpu_read(cpu_tlbstate.loaded_mm))
flush_tlb_func_local(&info, TLB_LOCAL_MM_SHOOTDOWN);
if (cpumask_any_but(mm_cpumask(mm), cpu) < nr_cpu_ids)
flush_tlb_others(mm_cpumask(mm), &info);
--- a/arch/x86/xen/mmu.c
+++ b/arch/x86/xen/mmu.c
@@ -998,37 +998,32 @@ static void xen_dup_mmap(struct mm_struc
spin_unlock(&mm->page_table_lock);
}
-
-#ifdef CONFIG_SMP
-/* Another cpu may still have their %cr3 pointing at the pagetable, so
- we need to repoint it somewhere else before we can unpin it. */
-static void drop_other_mm_ref(void *info)
+static void drop_mm_ref_this_cpu(void *info)
{
struct mm_struct *mm = info;
- struct mm_struct *active_mm;
-
- active_mm = this_cpu_read(cpu_tlbstate.active_mm);
- if (active_mm == mm && this_cpu_read(cpu_tlbstate.state) != TLBSTATE_OK)
+ if (this_cpu_read(cpu_tlbstate.loaded_mm) == mm)
leave_mm(smp_processor_id());
- /* If this cpu still has a stale cr3 reference, then make sure
- it has been flushed. */
+ /*
+ * If this cpu still has a stale cr3 reference, then make sure
+ * it has been flushed.
+ */
if (this_cpu_read(xen_current_cr3) == __pa(mm->pgd))
- load_cr3(swapper_pg_dir);
+ xen_mc_flush();
}
+#ifdef CONFIG_SMP
+/*
+ * Another cpu may still have their %cr3 pointing at the pagetable, so
+ * we need to repoint it somewhere else before we can unpin it.
+ */
static void xen_drop_mm_ref(struct mm_struct *mm)
{
cpumask_var_t mask;
unsigned cpu;
- if (current->active_mm == mm) {
- if (current->mm == mm)
- load_cr3(swapper_pg_dir);
- else
- leave_mm(smp_processor_id());
- }
+ drop_mm_ref_this_cpu(mm);
/* Get the "official" set of cpus referring to our pagetable. */
if (!alloc_cpumask_var(&mask, GFP_ATOMIC)) {
@@ -1036,31 +1031,31 @@ static void xen_drop_mm_ref(struct mm_st
if (!cpumask_test_cpu(cpu, mm_cpumask(mm))
&& per_cpu(xen_current_cr3, cpu) != __pa(mm->pgd))
continue;
- smp_call_function_single(cpu, drop_other_mm_ref, mm, 1);
+ smp_call_function_single(cpu, drop_mm_ref_this_cpu, mm, 1);
}
return;
}
cpumask_copy(mask, mm_cpumask(mm));
- /* It's possible that a vcpu may have a stale reference to our
- cr3, because its in lazy mode, and it hasn't yet flushed
- its set of pending hypercalls yet. In this case, we can
- look at its actual current cr3 value, and force it to flush
- if needed. */
+ /*
+ * It's possible that a vcpu may have a stale reference to our
+ * cr3, because its in lazy mode, and it hasn't yet flushed
+ * its set of pending hypercalls yet. In this case, we can
+ * look at its actual current cr3 value, and force it to flush
+ * if needed.
+ */
for_each_online_cpu(cpu) {
if (per_cpu(xen_current_cr3, cpu) == __pa(mm->pgd))
cpumask_set_cpu(cpu, mask);
}
- if (!cpumask_empty(mask))
- smp_call_function_many(mask, drop_other_mm_ref, mm, 1);
+ smp_call_function_many(mask, drop_mm_ref_this_cpu, mm, 1);
free_cpumask_var(mask);
}
#else
static void xen_drop_mm_ref(struct mm_struct *mm)
{
- if (current->active_mm == mm)
- load_cr3(swapper_pg_dir);
+ drop_mm_ref_this_cpu(mm);
}
#endif
Patches currently in stable-queue which might be from luto(a)kernel.org are
queue-4.9/x86-mm-refactor-flush_tlb_mm_range-to-merge-local-and-remote-cases.patch
queue-4.9/x86-mm-pass-flush_tlb_info-to-flush_tlb_others-etc.patch
queue-4.9/x86-mm-rework-lazy-tlb-to-track-the-actual-loaded-mm.patch
queue-4.9/x86-mm-kvm-teach-kvm-s-vmx-code-that-cr3-isn-t-a-constant.patch
queue-4.9/x86-mm-use-new-merged-flush-logic-in-arch_tlbbatch_flush.patch
queue-4.9/x86-kvm-vmx-simplify-segment_base.patch
queue-4.9/x86-entry-unwind-create-stack-frames-for-saved-interrupt-registers.patch
queue-4.9/x86-mm-reduce-indentation-in-flush_tlb_func.patch
queue-4.9/x86-mm-remove-the-up-asm-tlbflush.h-code-always-use-the-formerly-smp-code.patch
queue-4.9/x86-mm-reimplement-flush_tlb_page-using-flush_tlb_mm_range.patch
queue-4.9/mm-x86-mm-make-the-batched-unmap-tlb-flush-api-more-generic.patch
queue-4.9/x86-kvm-vmx-defer-tr-reload-after-vm-exit.patch
queue-4.9/x86-mm-change-the-leave_mm-condition-for-local-tlb-flushes.patch
queue-4.9/x86-mm-be-more-consistent-wrt-page_shift-vs-page_size-in-tlb-flush-code.patch
This is a note to let you know that I've just added the patch titled
x86/mm: Reimplement flush_tlb_page() using flush_tlb_mm_range()
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-mm-reimplement-flush_tlb_page-using-flush_tlb_mm_range.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From ca6c99c0794875c6d1db6e22f246699691ab7e6b Mon Sep 17 00:00:00 2001
From: Andy Lutomirski <luto(a)kernel.org>
Date: Mon, 22 May 2017 15:30:01 -0700
Subject: x86/mm: Reimplement flush_tlb_page() using flush_tlb_mm_range()
From: Andy Lutomirski <luto(a)kernel.org>
commit ca6c99c0794875c6d1db6e22f246699691ab7e6b upstream.
flush_tlb_page() was very similar to flush_tlb_mm_range() except that
it had a couple of issues:
- It was missing an smp_mb() in the case where
current->active_mm != mm. (This is a longstanding bug reported by Nadav Amit)
- It was missing tracepoints and vm counter updates.
The only reason that I can see for keeping it as a separate
function is that it could avoid a few branches that
flush_tlb_mm_range() needs to decide to flush just one page. This
hardly seems worthwhile. If we decide we want to get rid of those
branches again, a better way would be to introduce an
__flush_tlb_mm_range() helper and make both flush_tlb_page() and
flush_tlb_mm_range() use it.
Signed-off-by: Andy Lutomirski <luto(a)kernel.org>
Acked-by: Kees Cook <keescook(a)chromium.org>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Borislav Petkov <bpetkov(a)suse.de>
Cc: Dave Hansen <dave.hansen(a)intel.com>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Mel Gorman <mgorman(a)suse.de>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Nadav Amit <nadav.amit(a)gmail.com>
Cc: Nadav Amit <namit(a)vmware.com>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Rik van Riel <riel(a)redhat.com>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: linux-mm(a)kvack.org
Link: http://lkml.kernel.org/r/3cc3847cf888d8907577569b8bac3f01992ef8f9.149549206…
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Signed-off-by: Eduardo Valentin <eduval(a)amazon.com>
Signed-off-by: Eduardo Valentin <edubezval(a)gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/include/asm/tlbflush.h | 5 ++++-
arch/x86/mm/tlb.c | 27 ---------------------------
2 files changed, 4 insertions(+), 28 deletions(-)
--- a/arch/x86/include/asm/tlbflush.h
+++ b/arch/x86/include/asm/tlbflush.h
@@ -304,12 +304,15 @@ static inline void flush_tlb_kernel_rang
extern void flush_tlb_all(void);
extern void flush_tlb_current_task(void);
-extern void flush_tlb_page(struct vm_area_struct *, unsigned long);
extern void flush_tlb_mm_range(struct mm_struct *mm, unsigned long start,
unsigned long end, unsigned long vmflag);
extern void flush_tlb_kernel_range(unsigned long start, unsigned long end);
#define flush_tlb() flush_tlb_current_task()
+static inline void flush_tlb_page(struct vm_area_struct *vma, unsigned long a)
+{
+ flush_tlb_mm_range(vma->vm_mm, a, a + PAGE_SIZE, VM_NONE);
+}
void native_flush_tlb_others(const struct cpumask *cpumask,
struct mm_struct *mm,
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -369,33 +369,6 @@ out:
preempt_enable();
}
-void flush_tlb_page(struct vm_area_struct *vma, unsigned long start)
-{
- struct mm_struct *mm = vma->vm_mm;
-
- preempt_disable();
-
- if (current->active_mm == mm) {
- if (current->mm) {
- /*
- * Implicit full barrier (INVLPG) that synchronizes
- * with switch_mm.
- */
- __flush_tlb_one(start);
- } else {
- leave_mm(smp_processor_id());
-
- /* Synchronize with switch_mm. */
- smp_mb();
- }
- }
-
- if (cpumask_any_but(mm_cpumask(mm), smp_processor_id()) < nr_cpu_ids)
- flush_tlb_others(mm_cpumask(mm), mm, start, start + PAGE_SIZE);
-
- preempt_enable();
-}
-
static void do_flush_tlb_all(void *info)
{
count_vm_tlb_event(NR_TLB_REMOTE_FLUSH_RECEIVED);
Patches currently in stable-queue which might be from luto(a)kernel.org are
queue-4.9/x86-mm-refactor-flush_tlb_mm_range-to-merge-local-and-remote-cases.patch
queue-4.9/x86-mm-pass-flush_tlb_info-to-flush_tlb_others-etc.patch
queue-4.9/x86-mm-rework-lazy-tlb-to-track-the-actual-loaded-mm.patch
queue-4.9/x86-mm-kvm-teach-kvm-s-vmx-code-that-cr3-isn-t-a-constant.patch
queue-4.9/x86-mm-use-new-merged-flush-logic-in-arch_tlbbatch_flush.patch
queue-4.9/x86-kvm-vmx-simplify-segment_base.patch
queue-4.9/x86-entry-unwind-create-stack-frames-for-saved-interrupt-registers.patch
queue-4.9/x86-mm-reduce-indentation-in-flush_tlb_func.patch
queue-4.9/x86-mm-remove-the-up-asm-tlbflush.h-code-always-use-the-formerly-smp-code.patch
queue-4.9/x86-mm-reimplement-flush_tlb_page-using-flush_tlb_mm_range.patch
queue-4.9/mm-x86-mm-make-the-batched-unmap-tlb-flush-api-more-generic.patch
queue-4.9/x86-kvm-vmx-defer-tr-reload-after-vm-exit.patch
queue-4.9/x86-mm-change-the-leave_mm-condition-for-local-tlb-flushes.patch
queue-4.9/x86-mm-be-more-consistent-wrt-page_shift-vs-page_size-in-tlb-flush-code.patch
This is a note to let you know that I've just added the patch titled
x86/mm: Refactor flush_tlb_mm_range() to merge local and remote cases
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-mm-refactor-flush_tlb_mm_range-to-merge-local-and-remote-cases.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 454bbad9793f59f5656ce5971ee473a8be736ef5 Mon Sep 17 00:00:00 2001
From: Andy Lutomirski <luto(a)kernel.org>
Date: Sun, 28 May 2017 10:00:12 -0700
Subject: x86/mm: Refactor flush_tlb_mm_range() to merge local and remote cases
From: Andy Lutomirski <luto(a)kernel.org>
commit 454bbad9793f59f5656ce5971ee473a8be736ef5 upstream.
The local flush path is very similar to the remote flush path.
Merge them.
This is intended to make no difference to behavior whatsoever. It
removes some code and will make future changes to the flushing
mechanics simpler.
This patch does remove one small optimization: flush_tlb_mm_range()
now has an unconditional smp_mb() instead of using MOV to CR3 or
INVLPG as a full barrier when applicable. I think this is okay for
a few reasons. First, smp_mb() is quite cheap compared to the cost
of a TLB flush. Second, this rearrangement makes a bigger
optimization available: with some work on the SMP function call
code, we could do the local and remote flushes in parallel. Third,
I'm planning a rework of the TLB flush algorithm that will require
an atomic operation at the beginning of each flush, and that
operation will replace the smp_mb().
Signed-off-by: Andy Lutomirski <luto(a)kernel.org>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Arjan van de Ven <arjan(a)linux.intel.com>
Cc: Borislav Petkov <bpetkov(a)suse.de>
Cc: Dave Hansen <dave.hansen(a)intel.com>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Mel Gorman <mgorman(a)suse.de>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Nadav Amit <nadav.amit(a)gmail.com>
Cc: Nadav Amit <namit(a)vmware.com>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Rik van Riel <riel(a)redhat.com>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: linux-mm(a)kvack.org
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Signed-off-by: Eduardo Valentin <eduval(a)amazon.com>
Signed-off-by: Eduardo Valentin <edubezval(a)gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/include/asm/tlbflush.h | 1 -
arch/x86/mm/tlb.c | 111 +++++++++++++++++-----------------------
2 files changed, 48 insertions(+), 64 deletions(-)
--- a/arch/x86/include/asm/tlbflush.h
+++ b/arch/x86/include/asm/tlbflush.h
@@ -216,7 +216,6 @@ static inline void __flush_tlb_one(unsig
* ..but the i386 has somewhat limited tlb flushing capabilities,
* and page-granular flushes are available only on i486 and up.
*/
-
struct flush_tlb_info {
struct mm_struct *mm;
unsigned long start;
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -216,22 +216,9 @@ void switch_mm_irqs_off(struct mm_struct
* write/read ordering problems.
*/
-/*
- * TLB flush funcation:
- * 1) Flush the tlb entries if the cpu uses the mm that's being flushed.
- * 2) Leave the mm if we are in the lazy tlb mode.
- */
-static void flush_tlb_func(void *info)
+static void flush_tlb_func_common(const struct flush_tlb_info *f,
+ bool local, enum tlb_flush_reason reason)
{
- const struct flush_tlb_info *f = info;
-
- inc_irq_stat(irq_tlb_count);
-
- if (f->mm && f->mm != this_cpu_read(cpu_tlbstate.active_mm))
- return;
-
- count_vm_tlb_event(NR_TLB_REMOTE_FLUSH_RECEIVED);
-
if (this_cpu_read(cpu_tlbstate.state) != TLBSTATE_OK) {
leave_mm(smp_processor_id());
return;
@@ -239,7 +226,9 @@ static void flush_tlb_func(void *info)
if (f->end == TLB_FLUSH_ALL) {
local_flush_tlb();
- trace_tlb_flush(TLB_REMOTE_SHOOTDOWN, TLB_FLUSH_ALL);
+ if (local)
+ count_vm_tlb_event(NR_TLB_LOCAL_FLUSH_ALL);
+ trace_tlb_flush(reason, TLB_FLUSH_ALL);
} else {
unsigned long addr;
unsigned long nr_pages =
@@ -249,10 +238,32 @@ static void flush_tlb_func(void *info)
__flush_tlb_single(addr);
addr += PAGE_SIZE;
}
- trace_tlb_flush(TLB_REMOTE_SHOOTDOWN, nr_pages);
+ if (local)
+ count_vm_tlb_events(NR_TLB_LOCAL_FLUSH_ONE, nr_pages);
+ trace_tlb_flush(reason, nr_pages);
}
}
+static void flush_tlb_func_local(void *info, enum tlb_flush_reason reason)
+{
+ const struct flush_tlb_info *f = info;
+
+ flush_tlb_func_common(f, true, reason);
+}
+
+static void flush_tlb_func_remote(void *info)
+{
+ const struct flush_tlb_info *f = info;
+
+ inc_irq_stat(irq_tlb_count);
+
+ if (f->mm && f->mm != this_cpu_read(cpu_tlbstate.active_mm))
+ return;
+
+ count_vm_tlb_event(NR_TLB_REMOTE_FLUSH_RECEIVED);
+ flush_tlb_func_common(f, false, TLB_REMOTE_SHOOTDOWN);
+}
+
void native_flush_tlb_others(const struct cpumask *cpumask,
const struct flush_tlb_info *info)
{
@@ -269,11 +280,11 @@ void native_flush_tlb_others(const struc
cpu = smp_processor_id();
cpumask = uv_flush_tlb_others(cpumask, info);
if (cpumask)
- smp_call_function_many(cpumask, flush_tlb_func,
+ smp_call_function_many(cpumask, flush_tlb_func_remote,
(void *)info, 1);
return;
}
- smp_call_function_many(cpumask, flush_tlb_func,
+ smp_call_function_many(cpumask, flush_tlb_func_remote,
(void *)info, 1);
}
@@ -315,59 +326,33 @@ static unsigned long tlb_single_page_flu
void flush_tlb_mm_range(struct mm_struct *mm, unsigned long start,
unsigned long end, unsigned long vmflag)
{
- unsigned long addr;
- struct flush_tlb_info info;
- /* do a global flush by default */
- unsigned long base_pages_to_flush = TLB_FLUSH_ALL;
-
- preempt_disable();
- if (current->active_mm != mm) {
- /* Synchronize with switch_mm. */
- smp_mb();
+ int cpu;
- goto out;
- }
-
- if (this_cpu_read(cpu_tlbstate.state) != TLBSTATE_OK) {
- leave_mm(smp_processor_id());
-
- /* Synchronize with switch_mm. */
- smp_mb();
+ struct flush_tlb_info info = {
+ .mm = mm,
+ };
- goto out;
- }
+ cpu = get_cpu();
- if ((end != TLB_FLUSH_ALL) && !(vmflag & VM_HUGETLB))
- base_pages_to_flush = (end - start) >> PAGE_SHIFT;
+ /* Synchronize with switch_mm. */
+ smp_mb();
- /*
- * Both branches below are implicit full barriers (MOV to CR or
- * INVLPG) that synchronize with switch_mm.
- */
- if (base_pages_to_flush > tlb_single_page_flush_ceiling) {
- base_pages_to_flush = TLB_FLUSH_ALL;
- count_vm_tlb_event(NR_TLB_LOCAL_FLUSH_ALL);
- local_flush_tlb();
+ /* Should we flush just the requested range? */
+ if ((end != TLB_FLUSH_ALL) &&
+ !(vmflag & VM_HUGETLB) &&
+ ((end - start) >> PAGE_SHIFT) <= tlb_single_page_flush_ceiling) {
+ info.start = start;
+ info.end = end;
} else {
- /* flush range by one by one 'invlpg' */
- for (addr = start; addr < end; addr += PAGE_SIZE) {
- count_vm_tlb_event(NR_TLB_LOCAL_FLUSH_ONE);
- __flush_tlb_single(addr);
- }
- }
- trace_tlb_flush(TLB_LOCAL_MM_SHOOTDOWN, base_pages_to_flush);
-out:
- info.mm = mm;
- if (base_pages_to_flush == TLB_FLUSH_ALL) {
info.start = 0UL;
info.end = TLB_FLUSH_ALL;
- } else {
- info.start = start;
- info.end = end;
}
- if (cpumask_any_but(mm_cpumask(mm), smp_processor_id()) < nr_cpu_ids)
+
+ if (mm == current->active_mm)
+ flush_tlb_func_local(&info, TLB_LOCAL_MM_SHOOTDOWN);
+ if (cpumask_any_but(mm_cpumask(mm), cpu) < nr_cpu_ids)
flush_tlb_others(mm_cpumask(mm), &info);
- preempt_enable();
+ put_cpu();
}
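The shape of the merge, as a stand-alone sketch: one common worker and two
thin entry points. Everything below is a simplified user-space model with
stand-in types, not the kernel's flush machinery:
#include <stdbool.h>
#include <stdio.h>

enum tlb_flush_reason { TLB_LOCAL_MM_SHOOTDOWN, TLB_REMOTE_SHOOTDOWN };

struct flush_tlb_info { unsigned long start, end; };

/* All of the flushing logic now lives in one place. */
static void flush_tlb_func_common(const struct flush_tlb_info *f,
				  bool local, enum tlb_flush_reason reason)
{
	printf("%s flush of [%#lx, %#lx), reason %d\n",
	       local ? "local" : "remote", f->start, f->end, reason);
}

/* Local path: called directly, so the caller supplies the reason. */
static void flush_tlb_func_local(void *info, enum tlb_flush_reason reason)
{
	flush_tlb_func_common(info, true, reason);
}

/* Remote path: shaped like an IPI handler, with a fixed reason. */
static void flush_tlb_func_remote(void *info)
{
	flush_tlb_func_common(info, false, TLB_REMOTE_SHOOTDOWN);
}

int main(void)
{
	struct flush_tlb_info info = { .start = 0x1000, .end = 0x3000 };

	flush_tlb_func_local(&info, TLB_LOCAL_MM_SHOOTDOWN);
	flush_tlb_func_remote(&info);
	return 0;
}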
Patches currently in stable-queue which might be from luto(a)kernel.org are
queue-4.9/x86-mm-refactor-flush_tlb_mm_range-to-merge-local-and-remote-cases.patch
queue-4.9/x86-mm-pass-flush_tlb_info-to-flush_tlb_others-etc.patch
queue-4.9/x86-mm-rework-lazy-tlb-to-track-the-actual-loaded-mm.patch
queue-4.9/x86-mm-kvm-teach-kvm-s-vmx-code-that-cr3-isn-t-a-constant.patch
queue-4.9/x86-mm-use-new-merged-flush-logic-in-arch_tlbbatch_flush.patch
queue-4.9/x86-kvm-vmx-simplify-segment_base.patch
queue-4.9/x86-entry-unwind-create-stack-frames-for-saved-interrupt-registers.patch
queue-4.9/x86-mm-reduce-indentation-in-flush_tlb_func.patch
queue-4.9/x86-mm-remove-the-up-asm-tlbflush.h-code-always-use-the-formerly-smp-code.patch
queue-4.9/x86-mm-reimplement-flush_tlb_page-using-flush_tlb_mm_range.patch
queue-4.9/mm-x86-mm-make-the-batched-unmap-tlb-flush-api-more-generic.patch
queue-4.9/x86-kvm-vmx-defer-tr-reload-after-vm-exit.patch
queue-4.9/x86-mm-change-the-leave_mm-condition-for-local-tlb-flushes.patch
queue-4.9/x86-mm-be-more-consistent-wrt-page_shift-vs-page_size-in-tlb-flush-code.patch
This is a note to let you know that I've just added the patch titled
x86/mm: Reduce indentation in flush_tlb_func()
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-mm-reduce-indentation-in-flush_tlb_func.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From b3b90e5af7976e46541f5029a369c9c38c5e4cea Mon Sep 17 00:00:00 2001
From: Andy Lutomirski <luto(a)kernel.org>
Date: Mon, 22 May 2017 15:30:02 -0700
Subject: x86/mm: Reduce indentation in flush_tlb_func()
From: Andy Lutomirski <luto(a)kernel.org>
commit b3b90e5af7976e46541f5029a369c9c38c5e4cea upstream.
The leave_mm() case can just exit the function early so we don't
need to indent the entire remainder of the function.
Signed-off-by: Andy Lutomirski <luto(a)kernel.org>
Acked-by: Kees Cook <keescook(a)chromium.org>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Borislav Petkov <bpetkov(a)suse.de>
Cc: Dave Hansen <dave.hansen(a)intel.com>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Mel Gorman <mgorman(a)suse.de>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Nadav Amit <nadav.amit(a)gmail.com>
Cc: Nadav Amit <namit(a)vmware.com>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Rik van Riel <riel(a)redhat.com>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: linux-mm(a)kvack.org
Link: http://lkml.kernel.org/r/97901ddcc9821d7bc7b296d2918d1179f08aaf22.149549206…
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Signed-off-by: Eduardo Valentin <eduval(a)amazon.com>
Signed-off-by: Eduardo Valentin <edubezval(a)gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/mm/tlb.c | 34 ++++++++++++++++++----------------
1 file changed, 18 insertions(+), 16 deletions(-)
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -237,24 +237,26 @@ static void flush_tlb_func(void *info)
return;
count_vm_tlb_event(NR_TLB_REMOTE_FLUSH_RECEIVED);
- if (this_cpu_read(cpu_tlbstate.state) == TLBSTATE_OK) {
- if (f->flush_end == TLB_FLUSH_ALL) {
- local_flush_tlb();
- trace_tlb_flush(TLB_REMOTE_SHOOTDOWN, TLB_FLUSH_ALL);
- } else {
- unsigned long addr;
- unsigned long nr_pages =
- (f->flush_end - f->flush_start) / PAGE_SIZE;
- addr = f->flush_start;
- while (addr < f->flush_end) {
- __flush_tlb_single(addr);
- addr += PAGE_SIZE;
- }
- trace_tlb_flush(TLB_REMOTE_SHOOTDOWN, nr_pages);
- }
- } else
+
+ if (this_cpu_read(cpu_tlbstate.state) != TLBSTATE_OK) {
leave_mm(smp_processor_id());
+ return;
+ }
+ if (f->flush_end == TLB_FLUSH_ALL) {
+ local_flush_tlb();
+ trace_tlb_flush(TLB_REMOTE_SHOOTDOWN, TLB_FLUSH_ALL);
+ } else {
+ unsigned long addr;
+ unsigned long nr_pages =
+ (f->flush_end - f->flush_start) / PAGE_SIZE;
+ addr = f->flush_start;
+ while (addr < f->flush_end) {
+ __flush_tlb_single(addr);
+ addr += PAGE_SIZE;
+ }
+ trace_tlb_flush(TLB_REMOTE_SHOOTDOWN, nr_pages);
+ }
}
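The refactoring pattern in isolation, as a user-space sketch with stand-in
helpers rather than the kernel's actual checks: handle the exceptional case
first and return, so the main path runs at the top indentation level.
#include <stdio.h>

static int state_ok(void)  { return 1; }	/* stand-in for TLBSTATE_OK */
static void leave(void)    { puts("leave_mm"); }
static void do_flush(void) { puts("flush"); }

/* Before: the interesting work sits inside a large if-block. */
static void nested(void)
{
	if (state_ok()) {
		do_flush();
	} else
		leave();
}

/* After: bail out early; the flush logic needs no extra indentation. */
static void early_return(void)
{
	if (!state_ok()) {
		leave();
		return;
	}
	do_flush();
}

int main(void)
{
	nested();
	early_return();
	return 0;
}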
void native_flush_tlb_others(const struct cpumask *cpumask,
Patches currently in stable-queue which might be from luto(a)kernel.org are
queue-4.9/x86-mm-refactor-flush_tlb_mm_range-to-merge-local-and-remote-cases.patch
queue-4.9/x86-mm-pass-flush_tlb_info-to-flush_tlb_others-etc.patch
queue-4.9/x86-mm-rework-lazy-tlb-to-track-the-actual-loaded-mm.patch
queue-4.9/x86-mm-kvm-teach-kvm-s-vmx-code-that-cr3-isn-t-a-constant.patch
queue-4.9/x86-mm-use-new-merged-flush-logic-in-arch_tlbbatch_flush.patch
queue-4.9/x86-kvm-vmx-simplify-segment_base.patch
queue-4.9/x86-entry-unwind-create-stack-frames-for-saved-interrupt-registers.patch
queue-4.9/x86-mm-reduce-indentation-in-flush_tlb_func.patch
queue-4.9/x86-mm-remove-the-up-asm-tlbflush.h-code-always-use-the-formerly-smp-code.patch
queue-4.9/x86-mm-reimplement-flush_tlb_page-using-flush_tlb_mm_range.patch
queue-4.9/mm-x86-mm-make-the-batched-unmap-tlb-flush-api-more-generic.patch
queue-4.9/x86-kvm-vmx-defer-tr-reload-after-vm-exit.patch
queue-4.9/x86-mm-change-the-leave_mm-condition-for-local-tlb-flushes.patch
queue-4.9/x86-mm-be-more-consistent-wrt-page_shift-vs-page_size-in-tlb-flush-code.patch
This is a note to let you know that I've just added the patch titled
x86/mm, KVM: Teach KVM's VMX code that CR3 isn't a constant
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-mm-kvm-teach-kvm-s-vmx-code-that-cr3-isn-t-a-constant.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From d6e41f1151feeb118eee776c09323aceb4a415d9 Mon Sep 17 00:00:00 2001
From: Andy Lutomirski <luto(a)kernel.org>
Date: Sun, 28 May 2017 10:00:17 -0700
Subject: x86/mm, KVM: Teach KVM's VMX code that CR3 isn't a constant
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
From: Andy Lutomirski <luto(a)kernel.org>
commit d6e41f1151feeb118eee776c09323aceb4a415d9 upstream.
When PCID is enabled, CR3's PCID bits can change during context
switches, so KVM won't be able to treat CR3 as a per-mm constant any
more.
I structured this like the existing CR4 handling. Under ordinary
circumstances (PCID disabled or if the current PCID and the value
that's already in the VMCS match), then we won't do an extra VMCS
write, and we'll never do an extra direct CR3 read. The overhead
should be minimal.
I disallowed using the new helper in non-atomic context because
PCID support will cause CR3 to stop being constant in non-atomic
process context.
(Frankly, it also scares me a bit that KVM ever treated CR3 as
constant, but it looks like it was okay before.)
Signed-off-by: Andy Lutomirski <luto(a)kernel.org>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Arjan van de Ven <arjan(a)linux.intel.com>
Cc: Borislav Petkov <bpetkov(a)suse.de>
Cc: Dave Hansen <dave.hansen(a)intel.com>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Mel Gorman <mgorman(a)suse.de>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Nadav Amit <nadav.amit(a)gmail.com>
Cc: Nadav Amit <namit(a)vmware.com>
Cc: Paolo Bonzini <pbonzini(a)redhat.com>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Radim Krčmář <rkrcmar(a)redhat.com>
Cc: Rik van Riel <riel(a)redhat.com>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: kvm(a)vger.kernel.org
Cc: linux-mm(a)kvack.org
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Signed-off-by: Eduardo Valentin <eduval(a)amazon.com>
Signed-off-by: Eduardo Valentin <edubezval(a)gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/include/asm/mmu_context.h | 19 +++++++++++++++++++
arch/x86/kvm/vmx.c | 25 +++++++++++++++++++++----
2 files changed, 40 insertions(+), 4 deletions(-)
--- a/arch/x86/include/asm/mmu_context.h
+++ b/arch/x86/include/asm/mmu_context.h
@@ -268,4 +268,23 @@ static inline bool arch_pte_access_permi
{
return __pkru_allows_pkey(pte_flags_pkey(pte_flags(pte)), write);
}
+
+/*
+ * This can be used from process context to figure out what the value of
+ * CR3 is without needing to do a (slow) read_cr3().
+ *
+ * It's intended to be used for code like KVM that sneakily changes CR3
+ * and needs to restore it. It needs to be used very carefully.
+ */
+static inline unsigned long __get_current_cr3_fast(void)
+{
+ unsigned long cr3 = __pa(this_cpu_read(cpu_tlbstate.loaded_mm)->pgd);
+
+ /* For now, be very restrictive about when this can be called. */
+ VM_WARN_ON(in_nmi() || !in_atomic());
+
+ VM_BUG_ON(cr3 != read_cr3());
+ return cr3;
+}
+
#endif /* _ASM_X86_MMU_CONTEXT_H */
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -48,6 +48,7 @@
#include <asm/kexec.h>
#include <asm/apic.h>
#include <asm/irq_remapping.h>
+#include <asm/mmu_context.h>
#include "trace.h"
#include "pmu.h"
@@ -572,6 +573,7 @@ struct vcpu_vmx {
int gs_ldt_reload_needed;
int fs_reload_needed;
u64 msr_host_bndcfgs;
+ unsigned long vmcs_host_cr3; /* May not match real cr3 */
unsigned long vmcs_host_cr4; /* May not match real cr4 */
} host_state;
struct {
@@ -4857,10 +4859,19 @@ static void vmx_set_constant_host_state(
u32 low32, high32;
unsigned long tmpl;
struct desc_ptr dt;
- unsigned long cr4;
+ unsigned long cr0, cr3, cr4;
- vmcs_writel(HOST_CR0, read_cr0() & ~X86_CR0_TS); /* 22.2.3 */
- vmcs_writel(HOST_CR3, read_cr3()); /* 22.2.3 FIXME: shadow tables */
+ cr0 = read_cr0();
+ WARN_ON(cr0 & X86_CR0_TS);
+ vmcs_writel(HOST_CR0, cr0); /* 22.2.3 */
+
+ /*
+ * Save the most likely value for this task's CR3 in the VMCS.
+ * We can't use __get_current_cr3_fast() because we're not atomic.
+ */
+ cr3 = read_cr3();
+ vmcs_writel(HOST_CR3, cr3); /* 22.2.3 FIXME: shadow tables */
+ vmx->host_state.vmcs_host_cr3 = cr3;
/* Save the most likely value for this task's CR4 in the VMCS. */
cr4 = cr4_read_shadow();
@@ -8836,7 +8847,7 @@ void vmx_arm_hv_timer(struct kvm_vcpu *v
static void __noclone vmx_vcpu_run(struct kvm_vcpu *vcpu)
{
struct vcpu_vmx *vmx = to_vmx(vcpu);
- unsigned long debugctlmsr, cr4;
+ unsigned long debugctlmsr, cr3, cr4;
/* Record the guest's net vcpu time for enforced NMI injections. */
if (unlikely(!cpu_has_virtual_nmis() && vmx->soft_vnmi_blocked))
@@ -8862,6 +8873,12 @@ static void __noclone vmx_vcpu_run(struc
if (test_bit(VCPU_REGS_RIP, (unsigned long *)&vcpu->arch.regs_dirty))
vmcs_writel(GUEST_RIP, vcpu->arch.regs[VCPU_REGS_RIP]);
+ cr3 = __get_current_cr3_fast();
+ if (unlikely(cr3 != vmx->host_state.vmcs_host_cr3)) {
+ vmcs_writel(HOST_CR3, cr3);
+ vmx->host_state.vmcs_host_cr3 = cr3;
+ }
+
cr4 = cr4_read_shadow();
if (unlikely(cr4 != vmx->host_state.vmcs_host_cr4)) {
vmcs_writel(HOST_CR4, cr4);
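The caching pattern this patch applies to HOST_CR3, as a stand-alone
sketch; the stubs below are stand-ins for __get_current_cr3_fast() and
vmcs_writel(), not the real KVM API:
#include <stdio.h>

static unsigned long current_cr3 = 0x1000;	/* pretend CR3 value */

static unsigned long get_current_cr3_fast(void) { return current_cr3; }

/* Stand-in for the comparatively expensive VMCS write we want to avoid. */
static void vmcs_write_host_cr3(unsigned long cr3)
{
	printf("VMCS write: HOST_CR3 = %#lx\n", cr3);
}

/* Cached copy of what was last written into the VMCS. */
static unsigned long vmcs_host_cr3;

/* Pay for a VMCS write only when CR3 actually changed. */
static void sync_host_cr3(void)
{
	unsigned long cr3 = get_current_cr3_fast();

	if (cr3 != vmcs_host_cr3) {
		vmcs_write_host_cr3(cr3);
		vmcs_host_cr3 = cr3;
	}
}

int main(void)
{
	sync_host_cr3();	/* first entry: writes */
	sync_host_cr3();	/* unchanged: skipped */
	current_cr3 = 0x2000;	/* e.g. the PCID bits changed */
	sync_host_cr3();	/* writes again */
	return 0;
}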
Patches currently in stable-queue which might be from luto(a)kernel.org are
queue-4.9/x86-mm-refactor-flush_tlb_mm_range-to-merge-local-and-remote-cases.patch
queue-4.9/x86-mm-pass-flush_tlb_info-to-flush_tlb_others-etc.patch
queue-4.9/x86-mm-rework-lazy-tlb-to-track-the-actual-loaded-mm.patch
queue-4.9/x86-mm-kvm-teach-kvm-s-vmx-code-that-cr3-isn-t-a-constant.patch
queue-4.9/x86-mm-use-new-merged-flush-logic-in-arch_tlbbatch_flush.patch
queue-4.9/x86-kvm-vmx-simplify-segment_base.patch
queue-4.9/x86-entry-unwind-create-stack-frames-for-saved-interrupt-registers.patch
queue-4.9/x86-mm-reduce-indentation-in-flush_tlb_func.patch
queue-4.9/x86-mm-remove-the-up-asm-tlbflush.h-code-always-use-the-formerly-smp-code.patch
queue-4.9/x86-mm-reimplement-flush_tlb_page-using-flush_tlb_mm_range.patch
queue-4.9/mm-x86-mm-make-the-batched-unmap-tlb-flush-api-more-generic.patch
queue-4.9/x86-kvm-vmx-defer-tr-reload-after-vm-exit.patch
queue-4.9/x86-mm-change-the-leave_mm-condition-for-local-tlb-flushes.patch
queue-4.9/x86-mm-be-more-consistent-wrt-page_shift-vs-page_size-in-tlb-flush-code.patch
This is a note to let you know that I've just added the patch titled
x86/mm: Change the leave_mm() condition for local TLB flushes
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-mm-change-the-leave_mm-condition-for-local-tlb-flushes.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 59f537c1dea04287165bb11407921e095250dc80 Mon Sep 17 00:00:00 2001
From: Andy Lutomirski <luto(a)kernel.org>
Date: Sun, 28 May 2017 10:00:11 -0700
Subject: x86/mm: Change the leave_mm() condition for local TLB flushes
From: Andy Lutomirski <luto(a)kernel.org>
commit 59f537c1dea04287165bb11407921e095250dc80 upstream.
On a remote TLB flush, we leave_mm() if we're TLBSTATE_LAZY. For a
local flush_tlb_mm_range(), we leave_mm() if !current->mm. These
are approximately the same condition -- the scheduler sets lazy TLB
mode when switching to a thread with no mm.
I'm about to merge the local and remote flush code, but for ease of
verifying and bisecting the patch, I want the local and remote flush
behavior to match first. This patch changes the local code to match
the remote code.
Signed-off-by: Andy Lutomirski <luto(a)kernel.org>
Acked-by: Rik van Riel <riel(a)redhat.com>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Arjan van de Ven <arjan(a)linux.intel.com>
Cc: Borislav Petkov <bpetkov(a)suse.de>
Cc: Dave Hansen <dave.hansen(a)intel.com>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Mel Gorman <mgorman(a)suse.de>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Nadav Amit <nadav.amit(a)gmail.com>
Cc: Nadav Amit <namit(a)vmware.com>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Rik van Riel <riel(a)redhat.com>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: linux-mm(a)kvack.org
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Signed-off-by: Eduardo Valentin <eduval(a)amazon.com>
Signed-off-by: Eduardo Valentin <edubezval(a)gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/mm/tlb.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -328,7 +328,7 @@ void flush_tlb_mm_range(struct mm_struct
goto out;
}
- if (!current->mm) {
+ if (this_cpu_read(cpu_tlbstate.state) != TLBSTATE_OK) {
leave_mm(smp_processor_id());
/* Synchronize with switch_mm. */
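A toy model of the two lazy-mode tests being unified; the structures below
are simplified stand-ins for current->mm and the per-CPU cpu_tlbstate, not
the kernel's definitions:
#include <stdbool.h>
#include <stdio.h>

enum { TLBSTATE_OK = 1, TLBSTATE_LAZY = 2 };

struct task { void *mm; };	/* NULL for kernel threads */
struct tlb_state { int state; };

/* Old local-path test: kernel threads have no mm. */
static bool is_lazy_old(const struct task *current_task)
{
	return current_task->mm == NULL;
}

/* New test, matching the remote path: use the scheduler's bookkeeping. */
static bool is_lazy_new(const struct tlb_state *ts)
{
	return ts->state != TLBSTATE_OK;
}

int main(void)
{
	struct task kthread = { .mm = NULL };
	struct tlb_state ts = { .state = TLBSTATE_LAZY };

	/* The scheduler enters lazy TLB mode when switching to a thread
	 * with no mm, so the two tests agree in the common cases. */
	printf("old: %d, new: %d\n", is_lazy_old(&kthread), is_lazy_new(&ts));
	return 0;
}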
Patches currently in stable-queue which might be from luto(a)kernel.org are
queue-4.9/x86-mm-refactor-flush_tlb_mm_range-to-merge-local-and-remote-cases.patch
queue-4.9/x86-mm-pass-flush_tlb_info-to-flush_tlb_others-etc.patch
queue-4.9/x86-mm-rework-lazy-tlb-to-track-the-actual-loaded-mm.patch
queue-4.9/x86-mm-kvm-teach-kvm-s-vmx-code-that-cr3-isn-t-a-constant.patch
queue-4.9/x86-mm-use-new-merged-flush-logic-in-arch_tlbbatch_flush.patch
queue-4.9/x86-kvm-vmx-simplify-segment_base.patch
queue-4.9/x86-entry-unwind-create-stack-frames-for-saved-interrupt-registers.patch
queue-4.9/x86-mm-reduce-indentation-in-flush_tlb_func.patch
queue-4.9/x86-mm-remove-the-up-asm-tlbflush.h-code-always-use-the-formerly-smp-code.patch
queue-4.9/x86-mm-reimplement-flush_tlb_page-using-flush_tlb_mm_range.patch
queue-4.9/mm-x86-mm-make-the-batched-unmap-tlb-flush-api-more-generic.patch
queue-4.9/x86-kvm-vmx-defer-tr-reload-after-vm-exit.patch
queue-4.9/x86-mm-change-the-leave_mm-condition-for-local-tlb-flushes.patch
queue-4.9/x86-mm-be-more-consistent-wrt-page_shift-vs-page_size-in-tlb-flush-code.patch
This is a note to let you know that I've just added the patch titled
x86/mm: Be more consistent wrt PAGE_SHIFT vs PAGE_SIZE in tlb flush code
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-mm-be-more-consistent-wrt-page_shift-vs-page_size-in-tlb-flush-code.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From be4ffc0d787fafb22b89a2f29e71fea3b119205e Mon Sep 17 00:00:00 2001
From: Andy Lutomirski <luto(a)kernel.org>
Date: Sun, 28 May 2017 10:00:16 -0700
Subject: x86/mm: Be more consistent wrt PAGE_SHIFT vs PAGE_SIZE in tlb flush code
From: Andy Lutomirski <luto(a)kernel.org>
commit be4ffc0d787fafb22b89a2f29e71fea3b119205e upstream.
Nadav pointed out that some code used PAGE_SIZE and other code used
PAGE_SHIFT. Use PAGE_SHIFT instead of multiplying or dividing by
PAGE_SIZE.
Requested-by: Nadav Amit <nadav.amit(a)gmail.com>
Signed-off-by: Andy Lutomirski <luto(a)kernel.org>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Arjan van de Ven <arjan(a)linux.intel.com>
Cc: Borislav Petkov <bpetkov(a)suse.de>
Cc: Dave Hansen <dave.hansen(a)intel.com>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Mel Gorman <mgorman(a)suse.de>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Nadav Amit <nadav.amit(a)gmail.com>
Cc: Nadav Amit <namit(a)vmware.com>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Rik van Riel <riel(a)redhat.com>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: linux-mm(a)kvack.org
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Signed-off-by: Eduardo Valentin <eduval(a)amazon.com>
Signed-off-by: Eduardo Valentin <edubezval(a)gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/mm/tlb.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -220,8 +220,7 @@ static void flush_tlb_func_common(const
trace_tlb_flush(reason, TLB_FLUSH_ALL);
} else {
unsigned long addr;
- unsigned long nr_pages =
- (f->end - f->start) / PAGE_SIZE;
+ unsigned long nr_pages = (f->end - f->start) >> PAGE_SHIFT;
addr = f->start;
while (addr < f->end) {
__flush_tlb_single(addr);
@@ -374,7 +373,7 @@ void flush_tlb_kernel_range(unsigned lon
/* Balance as user space task's flush, a bit conservative */
if (end == TLB_FLUSH_ALL ||
- (end - start) > tlb_single_page_flush_ceiling * PAGE_SIZE) {
+ (end - start) > tlb_single_page_flush_ceiling << PAGE_SHIFT) {
on_each_cpu(do_flush_tlb_all, NULL, 1);
} else {
struct flush_tlb_info info;
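The equivalence the patch relies on, checked in a stand-alone snippet. One
detail worth noting: << binds tighter than >, so the rewritten comparison
above needs no extra parentheses.
#include <assert.h>
#include <stdio.h>

#define PAGE_SHIFT 12
#define PAGE_SIZE  (1UL << PAGE_SHIFT)

int main(void)
{
	unsigned long start = 0x10000, end = 0x15000;
	unsigned long ceiling = 33;	/* like tlb_single_page_flush_ceiling */

	/* For a power-of-two page size the two spellings are identical;
	 * the patch simply uses the shift form everywhere. */
	assert((end - start) / PAGE_SIZE == (end - start) >> PAGE_SHIFT);
	assert(ceiling * PAGE_SIZE == ceiling << PAGE_SHIFT);

	printf("nr_pages = %lu\n", (end - start) >> PAGE_SHIFT);
	return 0;
}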
Patches currently in stable-queue which might be from luto(a)kernel.org are
queue-4.9/x86-mm-refactor-flush_tlb_mm_range-to-merge-local-and-remote-cases.patch
queue-4.9/x86-mm-pass-flush_tlb_info-to-flush_tlb_others-etc.patch
queue-4.9/x86-mm-rework-lazy-tlb-to-track-the-actual-loaded-mm.patch
queue-4.9/x86-mm-kvm-teach-kvm-s-vmx-code-that-cr3-isn-t-a-constant.patch
queue-4.9/x86-mm-use-new-merged-flush-logic-in-arch_tlbbatch_flush.patch
queue-4.9/x86-kvm-vmx-simplify-segment_base.patch
queue-4.9/x86-entry-unwind-create-stack-frames-for-saved-interrupt-registers.patch
queue-4.9/x86-mm-reduce-indentation-in-flush_tlb_func.patch
queue-4.9/x86-mm-remove-the-up-asm-tlbflush.h-code-always-use-the-formerly-smp-code.patch
queue-4.9/x86-mm-reimplement-flush_tlb_page-using-flush_tlb_mm_range.patch
queue-4.9/mm-x86-mm-make-the-batched-unmap-tlb-flush-api-more-generic.patch
queue-4.9/x86-kvm-vmx-defer-tr-reload-after-vm-exit.patch
queue-4.9/x86-mm-change-the-leave_mm-condition-for-local-tlb-flushes.patch
queue-4.9/x86-mm-be-more-consistent-wrt-page_shift-vs-page_size-in-tlb-flush-code.patch
This is a note to let you know that I've just added the patch titled
x86/kvm/vmx: Simplify segment_base()
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-kvm-vmx-simplify-segment_base.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 8c2e41f7ae1234c192ef497472ad306227c77c03 Mon Sep 17 00:00:00 2001
From: Andy Lutomirski <luto(a)kernel.org>
Date: Mon, 20 Feb 2017 08:56:12 -0800
Subject: x86/kvm/vmx: Simplify segment_base()
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
From: Andy Lutomirski <luto(a)kernel.org>
commit 8c2e41f7ae1234c192ef497472ad306227c77c03 upstream.
Use actual pointer types for pointers (instead of unsigned long) and
replace hardcoded constants with the appropriate self-documenting
macros.
The function is still a bit messy, but this seems a lot better than
before to me.
This is mostly borrowed from a patch by Thomas Garnier.
Cc: Thomas Garnier <thgarnie(a)google.com>
Cc: Jim Mattson <jmattson(a)google.com>
Cc: Radim Krčmář <rkrcmar(a)redhat.com>
Cc: Paolo Bonzini <pbonzini(a)redhat.com>
Signed-off-by: Andy Lutomirski <luto(a)kernel.org>
Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
Signed-off-by: Eduardo Valentin <eduval(a)amazon.com>
Signed-off-by: Eduardo Valentin <edubezval(a)gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/kvm/vmx.c | 19 +++++++------------
1 file changed, 7 insertions(+), 12 deletions(-)
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -2030,28 +2030,23 @@ static unsigned long segment_base(u16 se
{
struct desc_ptr *gdt = this_cpu_ptr(&host_gdt);
struct desc_struct *d;
- unsigned long table_base;
+ struct desc_struct *table;
unsigned long v;
- if (!(selector & ~3))
+ if (!(selector & ~SEGMENT_RPL_MASK))
return 0;
- table_base = gdt->address;
+ table = (struct desc_struct *)gdt->address;
- if (selector & 4) { /* from ldt */
+ if ((selector & SEGMENT_TI_MASK) == SEGMENT_LDT) {
u16 ldt_selector = kvm_read_ldt();
- if (!(ldt_selector & ~3))
+ if (!(ldt_selector & ~SEGMENT_RPL_MASK))
return 0;
- table_base = segment_base(ldt_selector);
+ table = (struct desc_struct *)segment_base(ldt_selector);
}
- d = (struct desc_struct *)(table_base + (selector & ~7));
- v = get_desc_base(d);
-#ifdef CONFIG_X86_64
- if (d->s == 0 && (d->type == 2 || d->type == 9 || d->type == 11))
- v |= ((unsigned long)((struct ldttss_desc64 *)d)->base3) << 32;
-#endif
+ v = get_desc_base(&table[selector >> 3]);
return v;
}
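The selector arithmetic that the cleanup makes explicit, as a stand-alone
sketch; the mask values follow the x86 segment selector layout (RPL in
bits 0-1, table indicator in bit 2, descriptor index in bits 3-15):
#include <stdio.h>
#include <stdint.h>

#define SEGMENT_RPL_MASK 0x3
#define SEGMENT_TI_MASK  0x4

int main(void)
{
	uint16_t selector = 0x23;	/* index 4, GDT, RPL 3 */

	/* Each GDT/LDT entry is 8 bytes, so the old byte offset
	 * (selector & ~7) names the same descriptor as indexing a
	 * desc_struct array with selector >> 3. */
	printf("index = %d, table = %s, RPL = %d\n",
	       selector >> 3,
	       (selector & SEGMENT_TI_MASK) ? "LDT" : "GDT",
	       selector & SEGMENT_RPL_MASK);
	return 0;
}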
Patches currently in stable-queue which might be from luto(a)kernel.org are
queue-4.9/x86-mm-refactor-flush_tlb_mm_range-to-merge-local-and-remote-cases.patch
queue-4.9/x86-mm-pass-flush_tlb_info-to-flush_tlb_others-etc.patch
queue-4.9/x86-mm-rework-lazy-tlb-to-track-the-actual-loaded-mm.patch
queue-4.9/x86-mm-kvm-teach-kvm-s-vmx-code-that-cr3-isn-t-a-constant.patch
queue-4.9/x86-mm-use-new-merged-flush-logic-in-arch_tlbbatch_flush.patch
queue-4.9/x86-kvm-vmx-simplify-segment_base.patch
queue-4.9/x86-entry-unwind-create-stack-frames-for-saved-interrupt-registers.patch
queue-4.9/x86-mm-reduce-indentation-in-flush_tlb_func.patch
queue-4.9/x86-mm-remove-the-up-asm-tlbflush.h-code-always-use-the-formerly-smp-code.patch
queue-4.9/x86-mm-reimplement-flush_tlb_page-using-flush_tlb_mm_range.patch
queue-4.9/mm-x86-mm-make-the-batched-unmap-tlb-flush-api-more-generic.patch
queue-4.9/x86-kvm-vmx-defer-tr-reload-after-vm-exit.patch
queue-4.9/x86-mm-change-the-leave_mm-condition-for-local-tlb-flushes.patch
queue-4.9/x86-mm-be-more-consistent-wrt-page_shift-vs-page_size-in-tlb-flush-code.patch
This is a note to let you know that I've just added the patch titled
x86/kvm/vmx: remove unused variable in segment_base()
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-kvm-vmx-remove-unused-variable-in-segment_base.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 0fce546f9f07b94ccc9de09cf48d35e18946d2fa Mon Sep 17 00:00:00 2001
From: Jérémy Lefaure <jeremy.lefaure(a)lse.epita.fr>
Date: Sat, 25 Feb 2017 17:46:53 -0500
Subject: x86/kvm/vmx: remove unused variable in segment_base()
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
From: Jérémy Lefaure <jeremy.lefaure(a)lse.epita.fr>
commit 0fce546f9f07b94ccc9de09cf48d35e18946d2fa upstream.
The pointer 'struct desc_struct *d' is unused since commit 8c2e41f7ae12
("x86/kvm/vmx: Simplify segment_base()") so let's remove it.
Signed-off-by: Jérémy Lefaure <jeremy.lefaure(a)lse.epita.fr>
Reviewed-by: David Hildenbrand <david(a)redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar(a)redhat.com>
Signed-off-by: Eduardo Valentin <eduval(a)amazon.com>
Signed-off-by: Eduardo Valentin <edubezval(a)gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/kvm/vmx.c | 1 -
1 file changed, 1 deletion(-)
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -2016,7 +2016,6 @@ static bool update_transition_efer(struc
static unsigned long segment_base(u16 selector)
{
struct desc_ptr *gdt = this_cpu_ptr(&host_gdt);
- struct desc_struct *d;
struct desc_struct *table;
unsigned long v;
Patches currently in stable-queue which might be from jeremy.lefaure(a)lse.epita.fr are
queue-4.9/x86-kvm-vmx-remove-unused-variable-in-segment_base.patch