From: "Kirill A. Shutemov" <kirill.shutemov(a)linux.intel.com>
commit ced108037c2aa542b3ed8b7afd1576064ad1362a upstream
In case prot_numa, we are under down_read(mmap_sem). It's critical to
not clear pmd intermittently to avoid race with MADV_DONTNEED which is
also under down_read(mmap_sem):
  CPU0:                               CPU1:
                                      change_huge_pmd(prot_numa=1)
                                       pmdp_huge_get_and_clear_notify()
  madvise_dontneed()
   zap_pmd_range()
    pmd_trans_huge(*pmd) == 0 (without ptl)
    // skip the pmd
                                       set_pmd_at();
                                       // pmd is re-established
The race makes MADV_DONTNEED miss the huge pmd and not clear it,
which may break userspace.
Found by code analysis; never seen triggered.
Link: http://lkml.kernel.org/r/20170302151034.27829-3-kirill.shutemov@linux.intel…
Signed-off-by: Kirill A. Shutemov <kirill.shutemov(a)linux.intel.com>
Cc: Andrea Arcangeli <aarcange(a)redhat.com>
Cc: Hillf Danton <hillf.zj(a)alibaba-inc.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
[jwang: adjust context for 4.4]
Signed-off-by: Jack Wang <jinpu.wang(a)profitbricks.com>
---
mm/huge_memory.c | 34 +++++++++++++++++++++++++++++++++-
1 file changed, 33 insertions(+), 1 deletion(-)
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index ea013cb..0127b78 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1588,7 +1588,39 @@ int change_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd,
if (prot_numa && pmd_protnone(*pmd))
goto unlock;
- entry = pmdp_huge_get_and_clear_notify(mm, addr, pmd);
+ /*
+ * In case prot_numa, we are under down_read(mmap_sem). It's critical
+ * to not clear pmd intermittently to avoid race with MADV_DONTNEED
+ * which is also under down_read(mmap_sem):
+ *
+ *  CPU0:                               CPU1:
+ *                                      change_huge_pmd(prot_numa=1)
+ *                                       pmdp_huge_get_and_clear_notify()
+ *  madvise_dontneed()
+ *   zap_pmd_range()
+ *    pmd_trans_huge(*pmd) == 0 (without ptl)
+ *    // skip the pmd
+ *                                       set_pmd_at();
+ *                                       // pmd is re-established
+ *
+ * The race makes MADV_DONTNEED miss the huge pmd and not clear it,
+ * which may break userspace.
+ *
+ * pmdp_invalidate() is required to make sure we don't miss
+ * dirty/young flags set by hardware.
+ */
+ entry = *pmd;
+ pmdp_invalidate(vma, addr, pmd);
+
+ /*
+ * Recover dirty/young flags. It relies on pmdp_invalidate to not
+ * corrupt them.
+ */
+ if (pmd_dirty(*pmd))
+ entry = pmd_mkdirty(entry);
+ if (pmd_young(*pmd))
+ entry = pmd_mkyoung(entry);
+
entry = pmd_modify(entry, newprot);
if (preserve_write)
entry = pmd_mkwrite(entry);
--
2.7.4
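For reference, the other half of the race lives in the MADV_DONTNEED zap path. The sketch below is illustrative only (it is not the exact mm/memory.c code, and zap_pmd_sketch is a made-up name); it shows why a transiently cleared pmd is silently skipped.

/*
 * Illustrative sketch of the MADV_DONTNEED side of the race, not the
 * exact zap_pmd_range() code.
 */
static void zap_pmd_sketch(pmd_t *pmd)
{
	if (!pmd_trans_huge(*pmd))	/* lockless check, no ptl held */
		return;			/* pmd looks empty: huge page is skipped */

	/*
	 * Otherwise the zap proceeds under the ptl.  With the old
	 * pmdp_huge_get_and_clear_notify(), CPU0 could clear the pmd just
	 * before the check above and set_pmd_at() it back afterwards, so
	 * the entry survived MADV_DONTNEED.  pmdp_invalidate() keeps the
	 * pmd visibly present (but invalid) for the whole window instead.
	 */
}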
This is a note to let you know that I've just added the patch titled
dma-buf/sw_sync: force signal all unsignaled fences on dying timeline
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
dma-buf-sw_sync-force-signal-all-unsignaled-fences-on-dying-timeline.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
From ea4d5a270b57fa8d4871f372ca9b97b7697fdfda Mon Sep 17 00:00:00 2001
From: Dominik Behr <dbehr(a)chromium.org>
Date: Thu, 7 Sep 2017 16:02:46 -0300
Subject: dma-buf/sw_sync: force signal all unsignaled fences on dying timeline
From: Dominik Behr <dbehr(a)chromium.org>
commit ea4d5a270b57fa8d4871f372ca9b97b7697fdfda upstream.
To avoid hanging userspace components that might have been waiting on the
active fences of the destroyed timeline, we need to signal all remaining
fences on that timeline with an error.
This restores the default behaviour of the Android sw_sync framework, which
Android still relies on. It was broken in the dma fence conversion a few
years ago and never fixed.
v2: Do not bother with cleanup of the list (Chris Wilson)
Reviewed-by: Chris Wilson <chris(a)chris-wilson.co.uk>
Signed-off-by: Dominik Behr <dbehr(a)chromium.org>
Signed-off-by: Gustavo Padovan <gustavo.padovan(a)collabora.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170907190246.16425-2-gustav…
Cc: Jisheng Zhang <Jisheng.Zhang(a)synaptics.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/dma-buf/sw_sync.c | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)
--- a/drivers/dma-buf/sw_sync.c
+++ b/drivers/dma-buf/sw_sync.c
@@ -321,8 +321,16 @@ static int sw_sync_debugfs_open(struct i
static int sw_sync_debugfs_release(struct inode *inode, struct file *file)
{
struct sync_timeline *obj = file->private_data;
+ struct sync_pt *pt, *next;
- smp_wmb();
+ spin_lock_irq(&obj->lock);
+
+ list_for_each_entry_safe(pt, next, &obj->pt_list, link) {
+ dma_fence_set_error(&pt->base, -ENOENT);
+ dma_fence_signal_locked(&pt->base);
+ }
+
+ spin_unlock_irq(&obj->lock);
sync_timeline_put(obj);
return 0;
Patches currently in stable-queue which might be from dbehr(a)chromium.org are
queue-4.14/dma-buf-sw_sync-force-signal-all-unsignaled-fences-on-dying-timeline.patch
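As a rough illustration of what the forced signal means for a waiter (a sketch only; wait_on_timeline_fence() is a made-up helper, while dma_fence_wait() and dma_fence_get_status() are the regular dma-fence API):

#include <linux/dma-fence.h>

/*
 * Sketch: a waiter blocked on a fence from the dying timeline is woken
 * by the forced signal and can read the error back afterwards.
 */
static int wait_on_timeline_fence(struct dma_fence *fence)
{
	long ret;

	ret = dma_fence_wait(fence, true);	/* unblocks once signalled */
	if (ret < 0)
		return ret;			/* e.g. -ERESTARTSYS */

	/* After the forced signal, the status carries the -ENOENT error. */
	return dma_fence_get_status(fence);	/* < 0: error, 1: signalled OK */
}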
The commit e948bc8fbee0 ("cpufreq: Cap the default transition delay
value to 10 ms") caused a regression on the EPIA-M mini-ITX computer,
where shutdown or reboot occasionally hangs with messages like:
longhaul: Warning: Timeout while waiting for idle PCI bus
cpufreq: __target_index: Failed to change cpu frequency: -16
This probably happens because the cpufreq governor tries to change the
frequency of the CPU faster than allowed by the hardware.
Before the above commit, the default transition delay came out to 200 ms
for a transition_latency of 200000 ns. Let's revert to that transition
delay value to fix it. Note that several other transition delay values
were tested (e.g. 20 ms and 30 ms) and none of them resolved the system
hang issue completely.
Fixes: e948bc8fbee0 ("cpufreq: Cap the default transition delay value to 10 ms")
Cc: 4.14+ <stable(a)vger.kernel.org> # 4.14+
Reported-by: Meelis Roos <mroos(a)linux.ee>
Suggested-by: Rafael J. Wysocki <rjw(a)rjwysocki.net>
Signed-off-by: Viresh Kumar <viresh.kumar(a)linaro.org>
---
V1->V2:
- s/20 ms/200 ms.
drivers/cpufreq/longhaul.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/cpufreq/longhaul.c b/drivers/cpufreq/longhaul.c
index c46a12df40dd..5faa37c5b091 100644
--- a/drivers/cpufreq/longhaul.c
+++ b/drivers/cpufreq/longhaul.c
@@ -894,7 +894,7 @@ static int longhaul_cpu_init(struct cpufreq_policy *policy)
if ((longhaul_version != TYPE_LONGHAUL_V1) && (scale_voltage != 0))
longhaul_setup_voltagescaling();
- policy->cpuinfo.transition_latency = 200000; /* nsec */
+ policy->transition_delay_us = 200000; /* usec */
return cpufreq_table_validate_and_show(policy, longhaul_table);
}
--
2.14.1
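Back-of-the-envelope arithmetic behind the two delays (a sketch under stated assumptions: the 1000x multiplier and the 10 ms cap are taken from the changelogs, this is not the exact cpufreq helper, and the function name is made up):

/*
 * Sketch of the default-delay arithmetic described above.
 */
static unsigned int default_delay_us_sketch(unsigned int latency_ns, int capped)
{
	unsigned int latency_us = latency_ns / 1000;	/* 200000 ns -> 200 us */
	unsigned int delay_us = latency_us * 1000;	/* old default: 200000 us = 200 ms */

	if (capped && delay_us > 10000)
		delay_us = 10000;			/* after e948bc8fbee0: 10 ms */

	return delay_us;	/* longhaul now requests 200000 us explicitly */
}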
When VHE is not present, KVM needs to save and restore PMSCR_EL1 when
possible. If SPE is used by the host, the value of PMSCR_EL1 cannot be
saved for the guest.
If the host starts using SPE between two save+restore cycles on the same
vcpu, the restore will write back the PMSCR_EL1 value read during the
first save.
Make sure __debug_save_spe_nvhe clears the saved PMSCR_EL1 value when the
guest cannot use SPE.
Signed-off-by: Julien Thierry <julien.thierry(a)arm.com>
Cc: Christoffer Dall <christoffer.dall(a)linaro.org>
Cc: Marc Zyngier <marc.zyngier(a)arm.com>
Cc: Catalin Marinas <catalin.marinas(a)arm.com>
Cc: Will Deacon <will.deacon(a)arm.com>
Cc: <stable(a)vger.kernel.org>
---
arch/arm64/kvm/hyp/debug-sr.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/arch/arm64/kvm/hyp/debug-sr.c b/arch/arm64/kvm/hyp/debug-sr.c
index 321c9c0..f4363d4 100644
--- a/arch/arm64/kvm/hyp/debug-sr.c
+++ b/arch/arm64/kvm/hyp/debug-sr.c
@@ -74,6 +74,9 @@ static void __hyp_text __debug_save_spe_nvhe(u64 *pmscr_el1)
{
u64 reg;
+ /* Clear pmscr in case of early return */
+ *pmscr_el1 = 0;
+
/* SPE present on this CPU? */
if (!cpuid_feature_extract_unsigned_field(read_sysreg(id_aa64dfr0_el1),
ID_AA64DFR0_PMSVER_SHIFT))
--
1.9.1
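The restore side is what makes a stale value dangerous; below is a simplified sketch of the save/restore pairing (illustrative only, not the exact hyp code, and host_has_spe() is a hypothetical helper):

/*
 * Simplified pairing: if the save path returns early without writing
 * *pmscr_el1, a later restore replays whatever value a previous save
 * left behind.
 */
static void __debug_save_spe_sketch(u64 *pmscr_el1)
{
	*pmscr_el1 = 0;			/* the fix: no stale value on early return */

	if (!host_has_spe())		/* hypothetical helper */
		return;

	/* ... read PMSCR_EL1, drain the buffer, stash the value ... */
}

static void __debug_restore_spe_sketch(u64 pmscr_el1)
{
	if (!pmscr_el1)
		return;			/* nothing was saved, nothing to restore */

	/* ... write pmscr_el1 back to PMSCR_EL1 ... */
}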
This is a note to let you know that I've just added the patch titled
powerpc/kprobes: Disable preemption before invoking probe handler for optprobes
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
powerpc-kprobes-disable-preemption-before-invoking-probe-handler-for-optprobes.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
From 8a2d71a3f2737e2448aa68de2b6052cb570d3d2a Mon Sep 17 00:00:00 2001
From: "Naveen N. Rao" <naveen.n.rao(a)linux.vnet.ibm.com>
Date: Mon, 23 Oct 2017 22:07:38 +0530
Subject: powerpc/kprobes: Disable preemption before invoking probe handler for optprobes
From: Naveen N. Rao <naveen.n.rao(a)linux.vnet.ibm.com>
commit 8a2d71a3f2737e2448aa68de2b6052cb570d3d2a upstream.
Per Documentation/kprobes.txt, probe handlers need to be invoked with
preemption disabled. Update optimized_callback() to do so. Also move the
get_kprobe_ctlblk() invocation after the preemption disable, since it
accesses per-cpu data.
This was not an issue so far since optprobes wasn't selected if
CONFIG_PREEMPT was enabled. Commit a30b85df7d599f ("kprobes: Use
synchronize_rcu_tasks() for optprobe with CONFIG_PREEMPT=y") changes
this.
Signed-off-by: Naveen N. Rao <naveen.n.rao(a)linux.vnet.ibm.com>
Acked-by: Masami Hiramatsu <mhiramat(a)kernel.org>
Signed-off-by: Michael Ellerman <mpe(a)ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/powerpc/kernel/optprobes.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
--- a/arch/powerpc/kernel/optprobes.c
+++ b/arch/powerpc/kernel/optprobes.c
@@ -115,7 +115,6 @@ static unsigned long can_optimize(struct
static void optimized_callback(struct optimized_kprobe *op,
struct pt_regs *regs)
{
- struct kprobe_ctlblk *kcb = get_kprobe_ctlblk();
unsigned long flags;
/* This is possible if op is under delayed unoptimizing */
@@ -124,13 +123,14 @@ static void optimized_callback(struct op
local_irq_save(flags);
hard_irq_disable();
+ preempt_disable();
if (kprobe_running()) {
kprobes_inc_nmissed_count(&op->kp);
} else {
__this_cpu_write(current_kprobe, &op->kp);
regs->nip = (unsigned long)op->kp.addr;
- kcb->kprobe_status = KPROBE_HIT_ACTIVE;
+ get_kprobe_ctlblk()->kprobe_status = KPROBE_HIT_ACTIVE;
opt_pre_handler(&op->kp, regs);
__this_cpu_write(current_kprobe, NULL);
}
@@ -140,6 +140,7 @@ static void optimized_callback(struct op
* local_irq_restore() will re-enable interrupts,
* if they were hard disabled.
*/
+ preempt_enable_no_resched();
local_irq_restore(flags);
}
NOKPROBE_SYMBOL(optimized_callback);
Patches currently in stable-queue which might be from naveen.n.rao(a)linux.vnet.ibm.com are
queue-4.14/kprobes-use-synchronize_rcu_tasks-for-optprobe-with-config_preempt-y.patch
queue-4.14/powerpc-kprobes-disable-preemption-before-invoking-probe-handler-for-optprobes.patch
queue-4.14/powerpc-jprobes-disable-preemption-when-triggered-through-ftrace.patch
This is a note to let you know that I've just added the patch titled
powerpc/jprobes: Disable preemption when triggered through ftrace
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
powerpc-jprobes-disable-preemption-when-triggered-through-ftrace.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
From 6baea433bc84cd148af1c524389a8d756f67412e Mon Sep 17 00:00:00 2001
From: "Naveen N. Rao" <naveen.n.rao(a)linux.vnet.ibm.com>
Date: Fri, 22 Sep 2017 14:40:47 +0530
Subject: powerpc/jprobes: Disable preemption when triggered through ftrace
From: Naveen N. Rao <naveen.n.rao(a)linux.vnet.ibm.com>
commit 6baea433bc84cd148af1c524389a8d756f67412e upstream.
KPROBES_SANITY_TEST throws the below splat when CONFIG_PREEMPT is
enabled:
Kprobe smoke test: started
DEBUG_LOCKS_WARN_ON(val > preempt_count())
------------[ cut here ]------------
WARNING: CPU: 19 PID: 1 at kernel/sched/core.c:3094 preempt_count_sub+0xcc/0x140
Modules linked in:
CPU: 19 PID: 1 Comm: swapper/0 Not tainted 4.13.0-rc7-nnr+ #97
task: c0000000fea80000 task.stack: c0000000feb00000
NIP: c00000000011d3dc LR: c00000000011d3d8 CTR: c000000000a090d0
REGS: c0000000feb03400 TRAP: 0700 Not tainted (4.13.0-rc7-nnr+)
MSR: 8000000000021033 <SF,ME,IR,DR,RI,LE> CR: 28000282 XER: 00000000
CFAR: c00000000015aa18 SOFTE: 0
<snip>
NIP preempt_count_sub+0xcc/0x140
LR preempt_count_sub+0xc8/0x140
Call Trace:
preempt_count_sub+0xc8/0x140 (unreliable)
kprobe_handler+0x228/0x4b0
program_check_exception+0x58/0x3b0
program_check_common+0x16c/0x170
--- interrupt: 0 at kprobe_target+0x8/0x20
LR = init_test_probes+0x248/0x7d0
kp+0x0/0x80 (unreliable)
livepatch_handler+0x38/0x74
init_kprobes+0x1d8/0x208
do_one_initcall+0x68/0x1d0
kernel_init_freeable+0x298/0x374
kernel_init+0x24/0x160
ret_from_kernel_thread+0x5c/0x70
Instruction dump:
419effdc 3d22001b 39299240 81290000 2f890000 409effc8 3c82ffcb 3c62ffcb
3884bc68 3863bc18 4803d5fd 60000000 <0fe00000> 4bffffa8 60000000 60000000
---[ end trace 432dd46b4ce3d29f ]---
Kprobe smoke test: passed successfully
The issue is that we aren't disabling preemption in
kprobe_ftrace_handler(). Disable it.
Fixes: ead514d5fb30a0 ("powerpc/kprobes: Add support for KPROBES_ON_FTRACE")
Acked-by: Masami Hiramatsu <mhiramat(a)kernel.org>
Signed-off-by: Naveen N. Rao <naveen.n.rao(a)linux.vnet.ibm.com>
[mpe: Trim oops a little for formatting]
Signed-off-by: Michael Ellerman <mpe(a)ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/powerpc/kernel/kprobes-ftrace.c | 15 +++++++++++----
1 file changed, 11 insertions(+), 4 deletions(-)
--- a/arch/powerpc/kernel/kprobes-ftrace.c
+++ b/arch/powerpc/kernel/kprobes-ftrace.c
@@ -65,6 +65,7 @@ void kprobe_ftrace_handler(unsigned long
/* Disable irq for emulating a breakpoint and avoiding preempt */
local_irq_save(flags);
hard_irq_disable();
+ preempt_disable();
p = get_kprobe((kprobe_opcode_t *)nip);
if (unlikely(!p) || kprobe_disabled(p))
@@ -86,12 +87,18 @@ void kprobe_ftrace_handler(unsigned long
kcb->kprobe_status = KPROBE_HIT_ACTIVE;
if (!p->pre_handler || !p->pre_handler(p, regs))
__skip_singlestep(p, regs, kcb, orig_nip);
- /*
- * If pre_handler returns !0, it sets regs->nip and
- * resets current kprobe.
- */
+ else {
+ /*
+ * If pre_handler returns !0, it sets regs->nip and
+ * resets current kprobe. In this case, we still need
+ * to restore irq, but not preemption.
+ */
+ local_irq_restore(flags);
+ return;
+ }
}
end:
+ preempt_enable_no_resched();
local_irq_restore(flags);
}
NOKPROBE_SYMBOL(kprobe_ftrace_handler);
Patches currently in stable-queue which might be from naveen.n.rao(a)linux.vnet.ibm.com are
queue-4.14/kprobes-use-synchronize_rcu_tasks-for-optprobe-with-config_preempt-y.patch
queue-4.14/powerpc-kprobes-disable-preemption-before-invoking-probe-handler-for-optprobes.patch
queue-4.14/powerpc-jprobes-disable-preemption-when-triggered-through-ftrace.patch
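The subtle part of this fix is the early-return path. A condensed sketch of the control flow it establishes (illustrative only, with simplified names):

/* Condensed control flow, illustrative only */
static void ftrace_probe_sketch(struct kprobe *p, struct pt_regs *regs)
{
	unsigned long flags;

	local_irq_save(flags);
	hard_irq_disable();
	preempt_disable();

	if (p->pre_handler && p->pre_handler(p, regs)) {
		/*
		 * pre_handler took over control flow (e.g. a jprobe): it set
		 * regs->nip and reset the current kprobe, and preemption is
		 * left disabled for the rest of the kprobes machinery to
		 * drop.  Only interrupts are restored on this path.
		 */
		local_irq_restore(flags);
		return;
	}

	/* Normal completion: drop both. */
	preempt_enable_no_resched();
	local_irq_restore(flags);
}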
This is a note to let you know that I've just added the patch titled
kprobes: Use synchronize_rcu_tasks() for optprobe with CONFIG_PREEMPT=y
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
kprobes-use-synchronize_rcu_tasks-for-optprobe-with-config_preempt-y.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
From foo@baz Wed Dec 6 18:04:41 CET 2017
From: Masami Hiramatsu <mhiramat(a)kernel.org>
Date: Fri, 20 Oct 2017 08:43:39 +0900
Subject: kprobes: Use synchronize_rcu_tasks() for optprobe with CONFIG_PREEMPT=y
From: Masami Hiramatsu <mhiramat(a)kernel.org>
[ Upstream commit a30b85df7d599f626973e9cd3056fe755bd778e0 ]
We want to wait for all potentially preempted kprobe trampoline
executions to have completed. This guarantees that any freed trampoline
memory is no longer in use by any task in the system.
synchronize_rcu_tasks() gives such a guarantee, so use it.
Also, this guarantees that we wait for all tasks that were potentially
preempted on the instructions which will be replaced with a jump.
Since this becomes a problem only when CONFIG_PREEMPT=y, enable
CONFIG_TASKS_RCU=y for synchronize_rcu_tasks() in that case.
Signed-off-by: Masami Hiramatsu <mhiramat(a)kernel.org>
Acked-by: Paul E. McKenney <paulmck(a)linux.vnet.ibm.com>
Cc: Ananth N Mavinakayanahalli <ananth(a)linux.vnet.ibm.com>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Naveen N . Rao <naveen.n.rao(a)linux.vnet.ibm.com>
Cc: Paul E . McKenney <paulmck(a)linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Steven Rostedt <rostedt(a)goodmis.org>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Link: http://lkml.kernel.org/r/150845661962.5443.17724352636247312231.stgit@devbox
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Signed-off-by: Sasha Levin <alexander.levin(a)verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/Kconfig | 2 +-
kernel/kprobes.c | 14 ++++++++------
2 files changed, 9 insertions(+), 7 deletions(-)
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -91,7 +91,7 @@ config STATIC_KEYS_SELFTEST
config OPTPROBES
def_bool y
depends on KPROBES && HAVE_OPTPROBES
- depends on !PREEMPT
+ select TASKS_RCU if PREEMPT
config KPROBES_ON_FTRACE
def_bool y
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -573,13 +573,15 @@ static void kprobe_optimizer(struct work
do_unoptimize_kprobes();
/*
- * Step 2: Wait for quiesence period to ensure all running interrupts
- * are done. Because optprobe may modify multiple instructions
- * there is a chance that Nth instruction is interrupted. In that
- * case, running interrupt can return to 2nd-Nth byte of jump
- * instruction. This wait is for avoiding it.
+ * Step 2: Wait for quiescence period to ensure all potentially
+ * preempted tasks to have normally scheduled. Because optprobe
+ * may modify multiple instructions, there is a chance that Nth
+ * instruction is preempted. In that case, such tasks can return
+ * to 2nd-Nth byte of jump instruction. This wait is for avoiding it.
+ * Note that on non-preemptive kernel, this is transparently converted
+ * to synchronize_sched() to wait for all interrupts to have completed.
*/
- synchronize_sched();
+ synchronize_rcu_tasks();
/* Step 3: Optimize kprobes after quiesence period */
do_optimize_kprobes();
Patches currently in stable-queue which might be from mhiramat(a)kernel.org are
queue-4.14/kprobes-use-synchronize_rcu_tasks-for-optprobe-with-config_preempt-y.patch
queue-4.14/kprobes-x86-disable-preemption-in-ftrace-based-jprobes.patch
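Conceptually, the guarantee being relied on looks like this (a sketch: synchronize_rcu_tasks() is the real API, while free_optimized_trampoline() and the wrapper name are made up):

/*
 * Conceptual sketch only: freeing a trampoline that a preempted task may
 * still be executing is safe only after every task has passed a
 * voluntary scheduling point at least once.
 */
static void release_trampoline_sketch(struct optimized_kprobe *op)
{
	/*
	 * Waits until no task can still be preempted inside the trampoline
	 * or in the middle of the instructions being replaced.
	 */
	synchronize_rcu_tasks();

	free_optimized_trampoline(op);		/* hypothetical helper */
}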