This is the start of the stable review cycle for the 4.9.147 release. There are 61 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sat Dec 22 08:58:31 UTC 2018. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.9.147-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.9.y and the diffstat can be found below.
thanks,
greg k-h
------------- Pseudo-Shortlog of commits:
Greg Kroah-Hartman gregkh@linuxfoundation.org Linux 4.9.147-rc1
Trent Piepho tpiepho@impinj.com rtc: snvs: Add timeouts to avoid kernel lockups
Guy Shapiro guy.shapiro@mobi-wize.com rtc: snvs: add a missing write sync
Israel Rukshin israelr@mellanox.com nvmet-rdma: fix response use after free
Hans de Goede hdegoede@redhat.com i2c: scmi: Fix probe error on devices with an empty SMB0001 ACPI device node
Adamski, Krzysztof (Nokia - PL/Wroclaw) krzysztof.adamski@nokia.com i2c: axxia: properly handle master timeout
Stefan Hajnoczi stefanha@redhat.com vhost/vsock: fix reset orphans race with close timeout
Steve French stfrench@microsoft.com cifs: In Kconfig CONFIG_CIFS_POSIX needs depends on legacy (insecure cifs)
Sam Bobroff sbobroff@linux.ibm.com drm/ast: Fix connector leak during driver unload
Nicolas Saenz Julienne nsaenzjulienne@suse.de ethernet: fman: fix wrong of_node_put() in probe function
Vladimir Murzin vladimir.murzin@arm.com ARM: 8815/1: V7M: align v7m_dma_inv_range() with v7 counterpart
Chris Cole chris@sageembedded.com ARM: 8814/1: mm: improve/fix ARM v7_dma_inv_range() unaligned address handling
Alexei Starovoitov ast@kernel.org bpf: check pending signals while verifying programs
Saeed Mahameed saeedm@mellanox.com net/mlx4_en: Fix build break when CONFIG_INET is off
Anderson Luiz Alves alacn1@gmail.com mv88e6060: disable hardware level MAC learning
Juha-Matti Tilli juha-matti.tilli@iki.fi libata: whitelist all SAMSUNG MZ7KM* solid-state disks
Tony Lindgren tony@atomide.com Input: omap-keypad - fix keyboard debounce configuration
Dan Carpenter dan.carpenter@oracle.com clk: mmp: Off by one in mmp_clk_add()
Dan Carpenter dan.carpenter@oracle.com clk: mvebu: Off by one bugs in cp110_of_clk_get()
Yangtao Li tiny.windzz@gmail.com ide: pmac: add of_node_put()
Yangtao Li tiny.windzz@gmail.com drivers/tty: add missing of_node_put()
Yangtao Li tiny.windzz@gmail.com drivers/sbus/char: add of_node_put()
Yangtao Li tiny.windzz@gmail.com sbus: char: add of_node_put()
Trond Myklebust trond.myklebust@hammerspace.com SUNRPC: Fix a potential race in xprt_connect()
Dave Kleikamp dave.kleikamp@oracle.com nfs: don't dirty kernel pages read by direct-io
Toni Peltonen peltzi@peltzi.fi bonding: fix 802.3ad state sent to partner when unbinding slave
Jose Abreu joabreu@synopsys.com ARC: io.h: Implement reads{x}()/writes{x}()
Sean Paul seanpaul@chromium.org drm/msm: Grab a vblank reference when waiting for commit_done
YiFei Zhu zhuyifei1999@gmail.com x86/earlyprintk/efi: Fix infinite loop on some screen widths
Cathy Avery cavery@redhat.com scsi: vmw_pscsi: Rearrange code to avoid multiple calls to free_irq during unload
Fred Herard fred.herard@oracle.com scsi: libiscsi: Fix NULL pointer dereference in iscsi_eh_session_reset
Alexey Khoroshilov khoroshilov@ispras.ru mac80211_hwsim: fix module init error paths for netlink
Steven Rostedt (VMware) rostedt@goodmis.org locking/qspinlock: Fix build for anonymous union in older GCC compilers
Peter Zijlstra peterz@infradead.org locking/qspinlock, x86: Provide liveness guarantee
Will Deacon will.deacon@arm.com locking/qspinlock/x86: Increase _Q_PENDING_LOOPS upper bound
Peter Zijlstra peterz@infradead.org locking/qspinlock: Re-order code
Will Deacon will.deacon@arm.com locking/qspinlock: Kill cmpxchg() loop when claiming lock from head of queue
Will Deacon will.deacon@arm.com locking/qspinlock: Remove duplicate clear_pending() function from PV code
Will Deacon will.deacon@arm.com locking/qspinlock: Remove unbounded cmpxchg() loop from locking slowpath
Will Deacon will.deacon@arm.com locking/qspinlock: Merge 'struct __qspinlock' into 'struct qspinlock'
Will Deacon will.deacon@arm.com locking/qspinlock: Bound spinning on pending->locked transition in slowpath
Will Deacon will.deacon@arm.com locking/qspinlock: Ensure node is initialised before updating prev->next
Paul E. McKenney paulmck@linux.vnet.ibm.com locking: Remove smp_read_barrier_depends() from queued_spin_lock_slowpath()
Michael J. Ruhl michael.j.ruhl@intel.com IB/hfi1: Remove race conditions in user_sdma send path
Ilan Peer ilan.peer@intel.com mac80211: Fix condition validating WMM IE
Emmanuel Grumbach emmanuel.grumbach@intel.com mac80211: don't WARN on bad WMM parameters from buggy APs
Chris Wilson chris@chris-wilson.co.uk drm/i915/execlists: Apply a full mb before execution for Braswell
Brian Norris briannorris@chromium.org Revert "drm/rockchip: Allow driver to be shutdown on reboot/kexec"
Radu Rendec radu.rendec@gmail.com powerpc/msi: Fix NULL pointer access in teardown code
Steven Rostedt (VMware) rostedt@goodmis.org tracing: Fix memory leak of instance function hash filters
Steven Rostedt (VMware) rostedt@goodmis.org tracing: Fix memory leak in set_trigger_filter()
Lubomir Rintel lkundrak@v3.sk ARM: mmp/mmp2: fix cpu_is_mmp2() on mmp2-dt
Aaro Koskinen aaro.koskinen@iki.fi MMC: OMAP: fix broken MMC on OMAP15XX/OMAP5910/OMAP310
Jeff Moyer jmoyer@redhat.com aio: fix spectre gadget in lookup_ioctx
Chen-Yu Tsai wens@csie.org pinctrl: sunxi: a83t: Fix IRQ offset typo for PH11
Ingo Molnar mingo@kernel.org timer/debug: Change /proc/timer_list from 0444 to 0400
Davidlohr Bueso dave@stgolabs.net lib/interval_tree_test.c: allow users to limit scope of endpoint
Davidlohr Bueso dave@stgolabs.net lib/rbtree-test: lower default params
Davidlohr Bueso dave@stgolabs.net lib/rbtree_test.c: make input module parameters
Davidlohr Bueso dave@stgolabs.net lib/interval_tree_test.c: allow full tree search
Davidlohr Bueso dave@stgolabs.net lib/interval_tree_test.c: make test options module parameters
Will Deacon will.deacon@arm.com signal: Introduce COMPAT_SIGMINSTKSZ for use in compat_sys_sigaltstack
-------------
Diffstat:
Makefile | 4 +- arch/arc/include/asm/io.h | 72 ++++++++++ arch/arm/mach-mmp/cputype.h | 6 +- arch/arm/mm/cache-v7.S | 8 +- arch/arm/mm/cache-v7m.S | 14 +- arch/powerpc/kernel/msi.c | 7 +- arch/x86/include/asm/qspinlock.h | 25 +++- arch/x86/include/asm/qspinlock_paravirt.h | 3 +- arch/x86/platform/efi/early_printk.c | 2 +- drivers/ata/libata-core.c | 1 + drivers/clk/mmp/clk.c | 2 +- drivers/clk/mvebu/cp110-system-controller.c | 4 +- drivers/gpu/drm/ast/ast_fb.c | 1 + drivers/gpu/drm/i915/intel_lrc.c | 7 +- drivers/gpu/drm/msm/msm_atomic.c | 5 + drivers/gpu/drm/rockchip/rockchip_drm_drv.c | 6 - drivers/i2c/busses/i2c-axxia.c | 40 ++++-- drivers/i2c/busses/i2c-scmi.c | 10 +- drivers/ide/pmac.c | 1 + drivers/infiniband/hw/hfi1/user_sdma.c | 28 ++-- drivers/infiniband/hw/hfi1/user_sdma.h | 7 +- drivers/input/keyboard/omap4-keypad.c | 18 ++- drivers/mmc/host/omap.c | 11 +- drivers/net/bonding/bond_3ad.c | 3 + drivers/net/dsa/mv88e6060.c | 10 +- drivers/net/ethernet/freescale/fman/fman.c | 5 +- drivers/net/ethernet/mellanox/mlx4/Kconfig | 2 +- drivers/net/wireless/mac80211_hwsim.c | 12 +- drivers/nvme/target/rdma.c | 3 +- drivers/pinctrl/sunxi/pinctrl-sun8i-a83t.c | 2 +- drivers/rtc/rtc-snvs.c | 104 ++++++++++----- drivers/sbus/char/display7seg.c | 1 + drivers/sbus/char/envctrl.c | 2 + drivers/scsi/libiscsi.c | 4 +- drivers/scsi/vmw_pvscsi.c | 4 +- drivers/tty/serial/suncore.c | 1 + drivers/vhost/vsock.c | 22 +++- fs/aio.c | 2 + fs/cifs/Kconfig | 2 +- fs/nfs/direct.c | 9 +- include/asm-generic/qspinlock_types.h | 32 ++++- include/linux/compat.h | 3 + kernel/bpf/verifier.c | 3 + kernel/locking/qspinlock.c | 195 ++++++++++++++-------------- kernel/locking/qspinlock_paravirt.h | 42 ++---- kernel/signal.c | 17 ++- kernel/time/timer_list.c | 2 +- kernel/trace/ftrace.c | 1 + kernel/trace/trace_events_trigger.c | 6 +- lib/interval_tree_test.c | 93 ++++++++----- lib/rbtree_test.c | 55 +++++--- net/mac80211/mlme.c | 3 +- net/sunrpc/xprt.c | 11 +- 53 files changed, 604 insertions(+), 329 deletions(-)
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
From: Will Deacon will.deacon@arm.com
commit 22839869f21ab3850fbbac9b425ccc4c0023926f upstream.
The sigaltstack(2) system call fails with -ENOMEM if the new alternative signal stack is found to be smaller than SIGMINSTKSZ. On architectures such as arm64, where the native value for SIGMINSTKSZ is larger than the compat value, this can result in an unexpected error being reported to a compat task. See, for example:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=904385
This patch fixes the problem by extending do_sigaltstack to take the minimum signal stack size as an additional parameter, allowing the native and compat system call entry code to pass in their respective values. COMPAT_SIGMINSTKSZ is just defined as SIGMINSTKSZ if it has not been defined by the architecture.
Cc: Arnd Bergmann arnd@arndb.de Cc: Dominik Brodowski linux@dominikbrodowski.net Cc: "Eric W. Biederman" ebiederm@xmission.com Cc: Andrew Morton akpm@linux-foundation.org Cc: Al Viro viro@zeniv.linux.org.uk Cc: Oleg Nesterov oleg@redhat.com Reported-by: Steve McIntyre steve.mcintyre@arm.com Tested-by: Steve McIntyre 93sam@debian.org Signed-off-by: Will Deacon will.deacon@arm.com Signed-off-by: Catalin Marinas catalin.marinas@arm.com [signal: Fix up cherry-pick conflicts for 22839869f21a] Signed-off-by: Steve McIntyre 93sam@debian.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/linux/compat.h | 3 +++ kernel/signal.c | 17 ++++++++++------- 2 files changed, 13 insertions(+), 7 deletions(-)
--- a/include/linux/compat.h +++ b/include/linux/compat.h @@ -67,6 +67,9 @@ typedef struct compat_sigaltstack { compat_size_t ss_size; } compat_stack_t; #endif +#ifndef COMPAT_MINSIGSTKSZ +#define COMPAT_MINSIGSTKSZ MINSIGSTKSZ +#endif
#define compat_jiffies_to_clock_t(x) \ (((unsigned long)(x) * COMPAT_USER_HZ) / HZ) --- a/kernel/signal.c +++ b/kernel/signal.c @@ -3116,7 +3116,8 @@ int do_sigaction(int sig, struct k_sigac }
static int -do_sigaltstack (const stack_t __user *uss, stack_t __user *uoss, unsigned long sp) +do_sigaltstack (const stack_t __user *uss, stack_t __user *uoss, unsigned long sp, + size_t min_ss_size) { stack_t oss; int error; @@ -3155,9 +3156,8 @@ do_sigaltstack (const stack_t __user *us ss_size = 0; ss_sp = NULL; } else { - error = -ENOMEM; - if (ss_size < MINSIGSTKSZ) - goto out; + if (unlikely(ss_size < min_ss_size)) + return -ENOMEM; }
current->sas_ss_sp = (unsigned long) ss_sp; @@ -3180,12 +3180,14 @@ out: } SYSCALL_DEFINE2(sigaltstack,const stack_t __user *,uss, stack_t __user *,uoss) { - return do_sigaltstack(uss, uoss, current_user_stack_pointer()); + return do_sigaltstack(uss, uoss, current_user_stack_pointer(), + MINSIGSTKSZ); }
int restore_altstack(const stack_t __user *uss) { - int err = do_sigaltstack(uss, NULL, current_user_stack_pointer()); + int err = do_sigaltstack(uss, NULL, current_user_stack_pointer(), + MINSIGSTKSZ); /* squash all but EFAULT for now */ return err == -EFAULT ? err : 0; } @@ -3226,7 +3228,8 @@ COMPAT_SYSCALL_DEFINE2(sigaltstack, set_fs(KERNEL_DS); ret = do_sigaltstack((stack_t __force __user *) (uss_ptr ? &uss : NULL), (stack_t __force __user *) &uoss, - compat_user_stack_pointer()); + compat_user_stack_pointer(), + COMPAT_MINSIGSTKSZ); set_fs(seg); if (ret >= 0 && uoss_ptr) { if (!access_ok(VERIFY_WRITE, uoss_ptr, sizeof(compat_stack_t)) ||
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit a54dae0338b7f01eb0f9c7571fb9b74f791d1c6b ]
Allows for more flexible debugging.
Link: http://lkml.kernel.org/r/20170518174936.20265-3-dave@stgolabs.net Signed-off-by: Davidlohr Bueso dbueso@suse.de Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- lib/interval_tree_test.c | 57 ++++++++++++++++++++++++++++------------ 1 file changed, 40 insertions(+), 17 deletions(-)
diff --git a/lib/interval_tree_test.c b/lib/interval_tree_test.c index 245900b98c8e..1093f0496d5e 100644 --- a/lib/interval_tree_test.c +++ b/lib/interval_tree_test.c @@ -1,16 +1,25 @@ #include <linux/module.h> +#include <linux/moduleparam.h> #include <linux/interval_tree.h> #include <linux/random.h> +#include <linux/slab.h> #include <asm/timex.h>
-#define NODES 100 -#define PERF_LOOPS 100000 -#define SEARCHES 100 -#define SEARCH_LOOPS 10000 +#define __param(type, name, init, msg) \ + static type name = init; \ + module_param(name, type, 0444); \ + MODULE_PARM_DESC(name, msg); + +__param(int, nnodes, 100, "Number of nodes in the interval tree"); +__param(int, perf_loops, 100000, "Number of iterations modifying the tree"); + +__param(int, nsearches, 100, "Number of searches to the interval tree"); +__param(int, search_loops, 10000, "Number of iterations searching the tree"); +
static struct rb_root root = RB_ROOT; -static struct interval_tree_node nodes[NODES]; -static u32 queries[SEARCHES]; +static struct interval_tree_node *nodes = NULL; +static u32 *queries = NULL;
static struct rnd_state rnd;
@@ -29,7 +38,8 @@ search(unsigned long query, struct rb_root *root) static void init(void) { int i; - for (i = 0; i < NODES; i++) { + + for (i = 0; i < nnodes; i++) { u32 a = prandom_u32_state(&rnd); u32 b = prandom_u32_state(&rnd); if (a <= b) { @@ -40,7 +50,7 @@ static void init(void) nodes[i].last = a; } } - for (i = 0; i < SEARCHES; i++) + for (i = 0; i < nsearches; i++) queries[i] = prandom_u32_state(&rnd); }
@@ -50,6 +60,16 @@ static int interval_tree_test_init(void) unsigned long results; cycles_t time1, time2, time;
+ nodes = kmalloc(nnodes * sizeof(struct interval_tree_node), GFP_KERNEL); + if (!nodes) + return -ENOMEM; + + queries = kmalloc(nsearches * sizeof(int), GFP_KERNEL); + if (!queries) { + kfree(nodes); + return -ENOMEM; + } + printk(KERN_ALERT "interval tree insert/remove");
prandom_seed_state(&rnd, 3141592653589793238ULL); @@ -57,39 +77,42 @@ static int interval_tree_test_init(void)
time1 = get_cycles();
- for (i = 0; i < PERF_LOOPS; i++) { - for (j = 0; j < NODES; j++) + for (i = 0; i < perf_loops; i++) { + for (j = 0; j < nnodes; j++) interval_tree_insert(nodes + j, &root); - for (j = 0; j < NODES; j++) + for (j = 0; j < nnodes; j++) interval_tree_remove(nodes + j, &root); }
time2 = get_cycles(); time = time2 - time1;
- time = div_u64(time, PERF_LOOPS); + time = div_u64(time, perf_loops); printk(" -> %llu cycles\n", (unsigned long long)time);
printk(KERN_ALERT "interval tree search");
- for (j = 0; j < NODES; j++) + for (j = 0; j < nnodes; j++) interval_tree_insert(nodes + j, &root);
time1 = get_cycles();
results = 0; - for (i = 0; i < SEARCH_LOOPS; i++) - for (j = 0; j < SEARCHES; j++) + for (i = 0; i < search_loops; i++) + for (j = 0; j < nsearches; j++) results += search(queries[j], &root);
time2 = get_cycles(); time = time2 - time1;
- time = div_u64(time, SEARCH_LOOPS); - results = div_u64(results, SEARCH_LOOPS); + time = div_u64(time, search_loops); + results = div_u64(results, search_loops); printk(" -> %llu cycles (%lu results)\n", (unsigned long long)time, results);
+ kfree(queries); + kfree(nodes); + return -EAGAIN; /* Fail will directly unload the module */ }
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit c46ecce431ebe6b1a9551d1f530eb432dae5c39b ]
... such that a user can specify visiting all the nodes in the tree (intersects with the world). This is a nice opposite from the very basic default query which is a single point.
Link: http://lkml.kernel.org/r/20170518174936.20265-5-dave@stgolabs.net Signed-off-by: Davidlohr Bueso dbueso@suse.de Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- lib/interval_tree_test.c | 15 ++++++++++----- 1 file changed, 10 insertions(+), 5 deletions(-)
diff --git a/lib/interval_tree_test.c b/lib/interval_tree_test.c index 1093f0496d5e..409383463879 100644 --- a/lib/interval_tree_test.c +++ b/lib/interval_tree_test.c @@ -15,6 +15,7 @@ __param(int, perf_loops, 100000, "Number of iterations modifying the tree");
__param(int, nsearches, 100, "Number of searches to the interval tree"); __param(int, search_loops, 10000, "Number of iterations searching the tree"); +__param(bool, search_all, false, "Searches will iterate all nodes in the tree");
static struct rb_root root = RB_ROOT; @@ -24,13 +25,13 @@ static u32 *queries = NULL; static struct rnd_state rnd;
static inline unsigned long -search(unsigned long query, struct rb_root *root) +search(struct rb_root *root, unsigned long start, unsigned long last) { struct interval_tree_node *node; unsigned long results = 0;
- for (node = interval_tree_iter_first(root, query, query); node; - node = interval_tree_iter_next(node, query, query)) + for (node = interval_tree_iter_first(root, start, last); node; + node = interval_tree_iter_next(node, start, last)) results++; return results; } @@ -99,8 +100,12 @@ static int interval_tree_test_init(void)
results = 0; for (i = 0; i < search_loops; i++) - for (j = 0; j < nsearches; j++) - results += search(queries[j], &root); + for (j = 0; j < nsearches; j++) { + unsigned long start = search_all ? 0 : queries[j]; + unsigned long last = search_all ? max_endpoint : queries[j]; + + results += search(&root, start, last); + }
time2 = get_cycles(); time = time2 - time1;
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 223f8911eace60c787f8767c25148b80ece9732a ]
Allows for more flexible debugging.
Link: http://lkml.kernel.org/r/20170719014603.19029-5-dave@stgolabs.net Signed-off-by: Davidlohr Bueso dbueso@suse.de Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- lib/rbtree_test.c | 55 +++++++++++++++++++++++++++++------------------ 1 file changed, 34 insertions(+), 21 deletions(-)
diff --git a/lib/rbtree_test.c b/lib/rbtree_test.c index 8b3c9dc88262..e83331aa1b7f 100644 --- a/lib/rbtree_test.c +++ b/lib/rbtree_test.c @@ -1,11 +1,18 @@ #include <linux/module.h> +#include <linux/moduleparam.h> #include <linux/rbtree_augmented.h> #include <linux/random.h> +#include <linux/slab.h> #include <asm/timex.h>
-#define NODES 100 -#define PERF_LOOPS 100000 -#define CHECK_LOOPS 100 +#define __param(type, name, init, msg) \ + static type name = init; \ + module_param(name, type, 0444); \ + MODULE_PARM_DESC(name, msg); + +__param(int, nnodes, 100, "Number of nodes in the rb-tree"); +__param(int, perf_loops, 100000, "Number of iterations modifying the rb-tree"); +__param(int, check_loops, 100, "Number of iterations modifying and verifying the rb-tree");
struct test_node { u32 key; @@ -17,7 +24,7 @@ struct test_node { };
static struct rb_root root = RB_ROOT; -static struct test_node nodes[NODES]; +static struct test_node *nodes = NULL;
static struct rnd_state rnd;
@@ -95,7 +102,7 @@ static void erase_augmented(struct test_node *node, struct rb_root *root) static void init(void) { int i; - for (i = 0; i < NODES; i++) { + for (i = 0; i < nnodes; i++) { nodes[i].key = prandom_u32_state(&rnd); nodes[i].val = prandom_u32_state(&rnd); } @@ -177,6 +184,10 @@ static int __init rbtree_test_init(void) int i, j; cycles_t time1, time2, time;
+ nodes = kmalloc(nnodes * sizeof(*nodes), GFP_KERNEL); + if (!nodes) + return -ENOMEM; + printk(KERN_ALERT "rbtree testing");
prandom_seed_state(&rnd, 3141592653589793238ULL); @@ -184,27 +195,27 @@ static int __init rbtree_test_init(void)
time1 = get_cycles();
- for (i = 0; i < PERF_LOOPS; i++) { - for (j = 0; j < NODES; j++) + for (i = 0; i < perf_loops; i++) { + for (j = 0; j < nnodes; j++) insert(nodes + j, &root); - for (j = 0; j < NODES; j++) + for (j = 0; j < nnodes; j++) erase(nodes + j, &root); }
time2 = get_cycles(); time = time2 - time1;
- time = div_u64(time, PERF_LOOPS); + time = div_u64(time, perf_loops); printk(" -> %llu cycles\n", (unsigned long long)time);
- for (i = 0; i < CHECK_LOOPS; i++) { + for (i = 0; i < check_loops; i++) { init(); - for (j = 0; j < NODES; j++) { + for (j = 0; j < nnodes; j++) { check(j); insert(nodes + j, &root); } - for (j = 0; j < NODES; j++) { - check(NODES - j); + for (j = 0; j < nnodes; j++) { + check(nnodes - j); erase(nodes + j, &root); } check(0); @@ -216,32 +227,34 @@ static int __init rbtree_test_init(void)
time1 = get_cycles();
- for (i = 0; i < PERF_LOOPS; i++) { - for (j = 0; j < NODES; j++) + for (i = 0; i < perf_loops; i++) { + for (j = 0; j < nnodes; j++) insert_augmented(nodes + j, &root); - for (j = 0; j < NODES; j++) + for (j = 0; j < nnodes; j++) erase_augmented(nodes + j, &root); }
time2 = get_cycles(); time = time2 - time1;
- time = div_u64(time, PERF_LOOPS); + time = div_u64(time, perf_loops); printk(" -> %llu cycles\n", (unsigned long long)time);
- for (i = 0; i < CHECK_LOOPS; i++) { + for (i = 0; i < check_loops; i++) { init(); - for (j = 0; j < NODES; j++) { + for (j = 0; j < nnodes; j++) { check_augmented(j); insert_augmented(nodes + j, &root); } - for (j = 0; j < NODES; j++) { - check_augmented(NODES - j); + for (j = 0; j < nnodes; j++) { + check_augmented(nnodes - j); erase_augmented(nodes + j, &root); } check_augmented(0); }
+ kfree(nodes); + return -EAGAIN; /* Fail will directly unload the module */ }
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 0b548e33e6cb2bff240fdaf1783783be15c29080 ]
Fengguang reported soft lockups while running the rbtree and interval tree test modules. The logic for these tests all occur in init phase, and we currently are pounding with the default values for number of nodes and number of iterations of each test. Reduce the latter by two orders of magnitude. This does not influence the value of the tests in that one thousand times by default is enough to get the picture.
Link: http://lkml.kernel.org/r/20171109161715.xai2dtwqw2frhkcm@linux-n805 Signed-off-by: Davidlohr Bueso dbueso@suse.de Reported-by: Fengguang Wu fengguang.wu@intel.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- lib/interval_tree_test.c | 4 ++-- lib/rbtree_test.c | 2 +- 2 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/lib/interval_tree_test.c b/lib/interval_tree_test.c index 409383463879..bababcf7ffdd 100644 --- a/lib/interval_tree_test.c +++ b/lib/interval_tree_test.c @@ -11,10 +11,10 @@ MODULE_PARM_DESC(name, msg);
__param(int, nnodes, 100, "Number of nodes in the interval tree"); -__param(int, perf_loops, 100000, "Number of iterations modifying the tree"); +__param(int, perf_loops, 1000, "Number of iterations modifying the tree");
__param(int, nsearches, 100, "Number of searches to the interval tree"); -__param(int, search_loops, 10000, "Number of iterations searching the tree"); +__param(int, search_loops, 1000, "Number of iterations searching the tree"); __param(bool, search_all, false, "Searches will iterate all nodes in the tree");
diff --git a/lib/rbtree_test.c b/lib/rbtree_test.c index e83331aa1b7f..afedd3770562 100644 --- a/lib/rbtree_test.c +++ b/lib/rbtree_test.c @@ -11,7 +11,7 @@ MODULE_PARM_DESC(name, msg);
__param(int, nnodes, 100, "Number of nodes in the rb-tree"); -__param(int, perf_loops, 100000, "Number of iterations modifying the rb-tree"); +__param(int, perf_loops, 1000, "Number of iterations modifying the rb-tree"); __param(int, check_loops, 100, "Number of iterations modifying and verifying the rb-tree");
struct test_node {
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit a8ec14d4f6aa8e245efacc992c8ee6ea0464ce2a ]
Add a 'max_endpoint' parameter such that users may easily limit the size of the intervals that are randomly generated.
Link: http://lkml.kernel.org/r/20170518174936.20265-4-dave@stgolabs.net Signed-off-by: Davidlohr Bueso dbueso@suse.de Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- lib/interval_tree_test.c | 23 +++++++++++++---------- 1 file changed, 13 insertions(+), 10 deletions(-)
diff --git a/lib/interval_tree_test.c b/lib/interval_tree_test.c index bababcf7ffdd..222c8010bda0 100644 --- a/lib/interval_tree_test.c +++ b/lib/interval_tree_test.c @@ -17,6 +17,7 @@ __param(int, nsearches, 100, "Number of searches to the interval tree"); __param(int, search_loops, 1000, "Number of iterations searching the tree"); __param(bool, search_all, false, "Searches will iterate all nodes in the tree");
+__param(uint, max_endpoint, ~0, "Largest value for the interval's endpoint");
static struct rb_root root = RB_ROOT; static struct interval_tree_node *nodes = NULL; @@ -41,18 +42,20 @@ static void init(void) int i;
for (i = 0; i < nnodes; i++) { - u32 a = prandom_u32_state(&rnd); - u32 b = prandom_u32_state(&rnd); - if (a <= b) { - nodes[i].start = a; - nodes[i].last = b; - } else { - nodes[i].start = b; - nodes[i].last = a; - } + u32 b = (prandom_u32_state(&rnd) >> 4) % max_endpoint; + u32 a = (prandom_u32_state(&rnd) >> 4) % b; + + nodes[i].start = a; + nodes[i].last = b; } + + /* + * Limit the search scope to what the user defined. + * Otherwise we are merely measuring empty walks, + * which is pointless. + */ for (i = 0; i < nsearches; i++) - queries[i] = prandom_u32_state(&rnd); + queries[i] = (prandom_u32_state(&rnd) >> 4) % max_endpoint; }
static int interval_tree_test_init(void)
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 8e7df2b5b7f245c9bd11064712db5cb69044a362 ]
While it uses %pK, there's still few reasons to read this file as non-root.
Suggested-by: Linus Torvalds torvalds@linux-foundation.org Acked-by: Thomas Gleixner tglx@linutronix.de Cc: Peter Zijlstra peterz@infradead.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/time/timer_list.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/time/timer_list.c b/kernel/time/timer_list.c index ef4f16e81283..1407ed20ea93 100644 --- a/kernel/time/timer_list.c +++ b/kernel/time/timer_list.c @@ -399,7 +399,7 @@ static int __init init_timer_list_procfs(void) { struct proc_dir_entry *pe;
- pe = proc_create("timer_list", 0444, NULL, &timer_list_fops); + pe = proc_create("timer_list", 0400, NULL, &timer_list_fops); if (!pe) return -ENOMEM; return 0;
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
From: Chen-Yu Tsai wens@csie.org
commit 478b6767ad26ab86d9ecc341027dd09a87b1f997 upstream.
Pin PH11 is used on various A83T board to detect a change in the OTG port's ID pin, as in when an OTG host cable is plugged in.
The incorrect offset meant the gpiochip/irqchip was activating the wrong pin for interrupts.
Fixes: 4730f33f0d82 ("pinctrl: sunxi: add allwinner A83T PIO controller support") Cc: stable@vger.kernel.org Signed-off-by: Chen-Yu Tsai wens@csie.org Acked-by: Maxime Ripard maxime.ripard@bootlin.com Signed-off-by: Linus Walleij linus.walleij@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/pinctrl/sunxi/pinctrl-sun8i-a83t.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/pinctrl/sunxi/pinctrl-sun8i-a83t.c +++ b/drivers/pinctrl/sunxi/pinctrl-sun8i-a83t.c @@ -568,7 +568,7 @@ static const struct sunxi_desc_pin sun8i SUNXI_PIN(SUNXI_PINCTRL_PIN(H, 11), SUNXI_FUNCTION(0x0, "gpio_in"), SUNXI_FUNCTION(0x1, "gpio_out"), - SUNXI_FUNCTION_IRQ_BANK(0x6, 2, 1)), /* PH_EINT11 */ + SUNXI_FUNCTION_IRQ_BANK(0x6, 2, 11)), /* PH_EINT11 */ };
static const struct sunxi_pinctrl_desc sun8i_a83t_pinctrl_data = {
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jeff Moyer jmoyer@redhat.com
commit a538e3ff9dabcdf6c3f477a373c629213d1c3066 upstream.
Matthew pointed out that the ioctx_table is susceptible to spectre v1, because the index can be controlled by an attacker. The below patch should mitigate the attack for all of the aio system calls.
Cc: stable@vger.kernel.org Reported-by: Matthew Wilcox willy@infradead.org Reported-by: Dan Carpenter dan.carpenter@oracle.com Signed-off-by: Jeff Moyer jmoyer@redhat.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/aio.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/fs/aio.c +++ b/fs/aio.c @@ -40,6 +40,7 @@ #include <linux/ramfs.h> #include <linux/percpu-refcount.h> #include <linux/mount.h> +#include <linux/nospec.h>
#include <asm/kmap_types.h> #include <asm/uaccess.h> @@ -1071,6 +1072,7 @@ static struct kioctx *lookup_ioctx(unsig if (!table || id >= table->nr) goto out;
+ id = array_index_nospec(id, table->nr); ctx = rcu_dereference(table->table[id]); if (ctx && ctx->user_id == ctx_id) { if (percpu_ref_tryget_live(&ctx->users))
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
From: Aaro Koskinen aaro.koskinen@iki.fi
commit e8cde625bfe8a714a856e1366bcbb259d7346095 upstream.
Since v2.6.22 or so there has been reports [1] about OMAP MMC being broken on OMAP15XX based hardware (OMAP5910 and OMAP310). The breakage seems to have been caused by commit 46a6730e3ff9 ("mmc-omap: Fix omap to use MMC_POWER_ON") that changed clock enabling to be done on MMC_POWER_ON. This can happen multiple times in a row, and on 15XX the hardware doesn't seem to like it and the MMC just stops responding. Fix by memorizing the power mode and do the init only when necessary.
Before the patch (on Palm TE):
mmc0: new SD card at address b368 mmcblk0: mmc0:b368 SDC 977 MiB mmci-omap mmci-omap.0: command timeout (CMD18) mmci-omap mmci-omap.0: command timeout (CMD13) mmci-omap mmci-omap.0: command timeout (CMD13) mmci-omap mmci-omap.0: command timeout (CMD12) [x 6] mmci-omap mmci-omap.0: command timeout (CMD13) [x 6] mmcblk0: error -110 requesting status mmci-omap mmci-omap.0: command timeout (CMD8) mmci-omap mmci-omap.0: command timeout (CMD18) mmci-omap mmci-omap.0: command timeout (CMD13) mmci-omap mmci-omap.0: command timeout (CMD13) mmci-omap mmci-omap.0: command timeout (CMD12) [x 6] mmci-omap mmci-omap.0: command timeout (CMD13) [x 6] mmcblk0: error -110 requesting status mmcblk0: recovery failed! print_req_error: I/O error, dev mmcblk0, sector 0 Buffer I/O error on dev mmcblk0, logical block 0, async page read mmcblk0: unable to read partition table
After the patch:
mmc0: new SD card at address b368 mmcblk0: mmc0:b368 SDC 977 MiB mmcblk0: p1
The patch is based on a fix and analysis done by Ladislav Michl.
Tested on OMAP15XX/OMAP310 (Palm TE), OMAP1710 (Nokia 770) and OMAP2420 (Nokia N810).
[1] https://marc.info/?t=123175197000003&r=1&w=2
Fixes: 46a6730e3ff9 ("mmc-omap: Fix omap to use MMC_POWER_ON") Reported-by: Ladislav Michl ladis@linux-mips.org Reported-by: Andrzej Zaborowski balrogg@gmail.com Tested-by: Ladislav Michl ladis@linux-mips.org Acked-by: Tony Lindgren tony@atomide.com Signed-off-by: Aaro Koskinen aaro.koskinen@iki.fi Cc: stable@vger.kernel.org Signed-off-by: Ulf Hansson ulf.hansson@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/mmc/host/omap.c | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-)
--- a/drivers/mmc/host/omap.c +++ b/drivers/mmc/host/omap.c @@ -104,6 +104,7 @@ struct mmc_omap_slot { unsigned int vdd; u16 saved_con; u16 bus_mode; + u16 power_mode; unsigned int fclk_freq;
struct tasklet_struct cover_tasklet; @@ -1157,7 +1158,7 @@ static void mmc_omap_set_ios(struct mmc_ struct mmc_omap_slot *slot = mmc_priv(mmc); struct mmc_omap_host *host = slot->host; int i, dsor; - int clk_enabled; + int clk_enabled, init_stream;
mmc_omap_select_slot(slot, 0);
@@ -1167,6 +1168,7 @@ static void mmc_omap_set_ios(struct mmc_ slot->vdd = ios->vdd;
clk_enabled = 0; + init_stream = 0; switch (ios->power_mode) { case MMC_POWER_OFF: mmc_omap_set_power(slot, 0, ios->vdd); @@ -1174,13 +1176,17 @@ static void mmc_omap_set_ios(struct mmc_ case MMC_POWER_UP: /* Cannot touch dsor yet, just power up MMC */ mmc_omap_set_power(slot, 1, ios->vdd); + slot->power_mode = ios->power_mode; goto exit; case MMC_POWER_ON: mmc_omap_fclk_enable(host, 1); clk_enabled = 1; dsor |= 1 << 11; + if (slot->power_mode != MMC_POWER_ON) + init_stream = 1; break; } + slot->power_mode = ios->power_mode;
if (slot->bus_mode != ios->bus_mode) { if (slot->pdata->set_bus_mode != NULL) @@ -1196,7 +1202,7 @@ static void mmc_omap_set_ios(struct mmc_ for (i = 0; i < 2; i++) OMAP_MMC_WRITE(host, CON, dsor); slot->saved_con = dsor; - if (ios->power_mode == MMC_POWER_ON) { + if (init_stream) { /* worst case at 400kHz, 80 cycles makes 200 microsecs */ int usecs = 250;
@@ -1234,6 +1240,7 @@ static int mmc_omap_new_slot(struct mmc_ slot->host = host; slot->mmc = mmc; slot->id = id; + slot->power_mode = MMC_POWER_UNDEFINED; slot->pdata = &host->pdata->slots[id];
host->slots[id] = slot;
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
From: Lubomir Rintel lkundrak@v3.sk
commit 76f4e2c3b6a560cdd7a75b87df543e04d05a9e5f upstream.
cpu_is_mmp2() was equivalent to cpu_is_pj4(), wouldn't be correct for multiplatform kernels. Fix it by also considering mmp_chip_id, as is done for cpu_is_pxa168() and cpu_is_pxa910() above.
Moreover, it is only available with CONFIG_CPU_MMP2 and thus doesn't work on DT-based MMP2 machines. Enable it on CONFIG_MACH_MMP2_DT too.
Note: CONFIG_CPU_MMP2 is only used for machines that use board files instead of DT. It should perhaps be renamed. I'm not doing it now, because I don't have a better idea.
Signed-off-by: Lubomir Rintel lkundrak@v3.sk Acked-by: Arnd Bergmann arnd@arndb.de Cc: stable@vger.kernel.org Signed-off-by: Olof Johansson olof@lixom.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/arm/mach-mmp/cputype.h | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
--- a/arch/arm/mach-mmp/cputype.h +++ b/arch/arm/mach-mmp/cputype.h @@ -43,10 +43,12 @@ static inline int cpu_is_pxa910(void) #define cpu_is_pxa910() (0) #endif
-#ifdef CONFIG_CPU_MMP2 +#if defined(CONFIG_CPU_MMP2) || defined(CONFIG_MACH_MMP2_DT) static inline int cpu_is_mmp2(void) { - return (((read_cpuid_id() >> 8) & 0xff) == 0x58); + return (((read_cpuid_id() >> 8) & 0xff) == 0x58) && + (((mmp_chip_id & 0xfff) == 0x410) || + ((mmp_chip_id & 0xfff) == 0x610)); } #else #define cpu_is_mmp2() (0)
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
From: Steven Rostedt (VMware) rostedt@goodmis.org
commit 3cec638b3d793b7cacdec5b8072364b41caeb0e1 upstream.
When create_event_filter() fails in set_trigger_filter(), the filter may still be allocated and needs to be freed. The caller expects the data->filter to be updated with the new filter, even if the new filter failed (we could add an error message by setting set_str parameter of create_event_filter(), but that's another update).
But because the error would just exit, filter was left hanging and nothing could free it.
Found by kmemleak detector.
Cc: stable@vger.kernel.org Fixes: bac5fb97a173a ("tracing: Add and use generic set_trigger_filter() implementation") Reviewed-by: Tom Zanussi tom.zanussi@linux.intel.com Signed-off-by: Steven Rostedt (VMware) rostedt@goodmis.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- kernel/trace/trace_events_trigger.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
--- a/kernel/trace/trace_events_trigger.c +++ b/kernel/trace/trace_events_trigger.c @@ -742,8 +742,10 @@ int set_trigger_filter(char *filter_str,
/* The filter is for the 'trigger' event, not the triggered event */ ret = create_event_filter(file->event_call, filter_str, false, &filter); - if (ret) - goto out; + /* + * If create_event_filter() fails, filter still needs to be freed. + * Which the calling code will do with data->filter. + */ assign: tmp = rcu_access_pointer(data->filter);
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
From: Steven Rostedt (VMware) rostedt@goodmis.org
commit 2840f84f74035e5a535959d5f17269c69fa6edc5 upstream.
The following commands will cause a memory leak:
# cd /sys/kernel/tracing # mkdir instances/foo # echo schedule > instance/foo/set_ftrace_filter # rmdir instances/foo
The reason is that the hashes that hold the filters to set_ftrace_filter and set_ftrace_notrace are not freed if they contain any data on the instance and the instance is removed.
Found by kmemleak detector.
Cc: stable@vger.kernel.org Fixes: 591dffdade9f ("ftrace: Allow for function tracing instance to filter functions") Signed-off-by: Steven Rostedt (VMware) rostedt@goodmis.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- kernel/trace/ftrace.c | 1 + 1 file changed, 1 insertion(+)
--- a/kernel/trace/ftrace.c +++ b/kernel/trace/ftrace.c @@ -4836,6 +4836,7 @@ void ftrace_destroy_filter_files(struct if (ops->flags & FTRACE_OPS_FL_ENABLED) ftrace_shutdown(ops, 0); ops->flags |= FTRACE_OPS_FL_DELETED; + ftrace_free_filter(ops); mutex_unlock(&ftrace_lock); }
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
From: Radu Rendec radu.rendec@gmail.com
commit 78e7b15e17ac175e7eed9e21c6f92d03d3b0a6fa upstream.
The arch_teardown_msi_irqs() function assumes that controller ops pointers were already checked in arch_setup_msi_irqs(), but this assumption is wrong: arch_teardown_msi_irqs() can be called even when arch_setup_msi_irqs() returns an error (-ENOSYS).
This can happen in the following scenario: - msi_capability_init() calls pci_msi_setup_msi_irqs() - pci_msi_setup_msi_irqs() returns -ENOSYS - msi_capability_init() notices the error and calls free_msi_irqs() - free_msi_irqs() calls pci_msi_teardown_msi_irqs()
This is easier to see when CONFIG_PCI_MSI_IRQ_DOMAIN is not set and pci_msi_setup_msi_irqs() and pci_msi_teardown_msi_irqs() are just aliases to arch_setup_msi_irqs() and arch_teardown_msi_irqs().
The call to free_msi_irqs() upon pci_msi_setup_msi_irqs() failure seems legit, as it does additional cleanup; e.g. list_del(&entry->list) and kfree(entry) inside free_msi_irqs() do happen (MSI descriptors are allocated before pci_msi_setup_msi_irqs() is called and need to be cleaned up if that fails).
Fixes: 6b2fd7efeb88 ("PCI/MSI/PPC: Remove arch_msi_check_device()") Cc: stable@vger.kernel.org # v3.18+ Signed-off-by: Radu Rendec radu.rendec@gmail.com Signed-off-by: Michael Ellerman mpe@ellerman.id.au Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/powerpc/kernel/msi.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
--- a/arch/powerpc/kernel/msi.c +++ b/arch/powerpc/kernel/msi.c @@ -34,5 +34,10 @@ void arch_teardown_msi_irqs(struct pci_d { struct pci_controller *phb = pci_bus_to_host(dev->bus);
- phb->controller_ops.teardown_msi_irqs(dev); + /* + * We can be called even when arch_setup_msi_irqs() returns -ENOSYS, + * so check the pointer again. + */ + if (phb->controller_ops.teardown_msi_irqs) + phb->controller_ops.teardown_msi_irqs(dev); }
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
From: Brian Norris briannorris@chromium.org
commit 63238173b2faf3d6b85a416f1c69af6c7be2413f upstream.
This reverts commit 7f3ef5dedb146e3d5063b6845781ad1bb59b92b5.
It causes new warnings [1] on shutdown when running the Google Kevin or Scarlet (RK3399) boards under Chrome OS. Presumably our usage of DRM is different than what Marc and Heiko test.
We're looking at a different approach (e.g., [2]) to replace this, but IMO the revert should be taken first, as it already propagated to -stable.
[1] Report here: http://lkml.kernel.org/lkml/20181205030127.GA200921@google.com
WARNING: CPU: 4 PID: 2035 at drivers/gpu/drm/drm_mode_config.c:477 drm_mode_config_cleanup+0x1c4/0x294 ... Call trace: drm_mode_config_cleanup+0x1c4/0x294 rockchip_drm_unbind+0x4c/0x8c component_master_del+0x88/0xb8 rockchip_drm_platform_remove+0x2c/0x44 rockchip_drm_platform_shutdown+0x20/0x2c platform_drv_shutdown+0x2c/0x38 device_shutdown+0x164/0x1b8 kernel_restart_prepare+0x40/0x48 kernel_restart+0x20/0x68 ... Memory manager not clean during takedown. WARNING: CPU: 4 PID: 2035 at drivers/gpu/drm/drm_mm.c:950 drm_mm_takedown+0x34/0x44 ... drm_mm_takedown+0x34/0x44 rockchip_drm_unbind+0x64/0x8c component_master_del+0x88/0xb8 rockchip_drm_platform_remove+0x2c/0x44 rockchip_drm_platform_shutdown+0x20/0x2c platform_drv_shutdown+0x2c/0x38 device_shutdown+0x164/0x1b8 kernel_restart_prepare+0x40/0x48 kernel_restart+0x20/0x68 ...
[2] https://patchwork.kernel.org/patch/10556151/ https://www.spinics.net/lists/linux-rockchip/msg21342.html [PATCH] drm/rockchip: shutdown drm subsystem on shutdown
Fixes: 7f3ef5dedb14 ("drm/rockchip: Allow driver to be shutdown on reboot/kexec") Cc: Jeffy Chen jeffy.chen@rock-chips.com Cc: Robin Murphy robin.murphy@arm.com Cc: Vicente Bergas vicencb@gmail.com Cc: Marc Zyngier marc.zyngier@arm.com Cc: Heiko Stuebner heiko@sntech.de Cc: stable@vger.kernel.org Signed-off-by: Brian Norris briannorris@chromium.org Signed-off-by: Heiko Stuebner heiko@sntech.de Link: https://patchwork.freedesktop.org/patch/msgid/20181205181657.177703-1-briann... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/gpu/drm/rockchip/rockchip_drm_drv.c | 6 ------ 1 file changed, 6 deletions(-)
--- a/drivers/gpu/drm/rockchip/rockchip_drm_drv.c +++ b/drivers/gpu/drm/rockchip/rockchip_drm_drv.c @@ -478,11 +478,6 @@ static int rockchip_drm_platform_remove( return 0; }
-static void rockchip_drm_platform_shutdown(struct platform_device *pdev) -{ - rockchip_drm_platform_remove(pdev); -} - static const struct of_device_id rockchip_drm_dt_ids[] = { { .compatible = "rockchip,display-subsystem", }, { /* sentinel */ }, @@ -492,7 +487,6 @@ MODULE_DEVICE_TABLE(of, rockchip_drm_dt_ static struct platform_driver rockchip_drm_platform_driver = { .probe = rockchip_drm_platform_probe, .remove = rockchip_drm_platform_remove, - .shutdown = rockchip_drm_platform_shutdown, .driver = { .name = "rockchip-drm", .of_match_table = rockchip_drm_dt_ids,
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
From: Chris Wilson chris@chris-wilson.co.uk
commit cf66b8a0ba142fbd1bf10ac8f3ae92d1b0cb7b8f upstream.
Braswell is really picky about having our writes posted to memory before we execute or else the GPU may see stale values. A wmb() is insufficient as it only ensures the writes are visible to other cores, we need a full mb() to ensure the writes are in memory and visible to the GPU.
The most frequent failure in flushing before execution is that we see stale PTE values and execute the wrong pages.
References: 987abd5c62f9 ("drm/i915/execlists: Force write serialisation into context image vs execution") Signed-off-by: Chris Wilson chris@chris-wilson.co.uk Cc: Mika Kuoppala mika.kuoppala@linux.intel.com Cc: Tvrtko Ursulin tvrtko.ursulin@intel.com Cc: Joonas Lahtinen joonas.lahtinen@linux.intel.com Cc: stable@vger.kernel.org Reviewed-by: Tvrtko Ursulin tvrtko.ursulin@intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20181206084431.9805-3-chris@ch... (cherry picked from commit 490b8c65b9db45896769e1095e78725775f47b3e) Signed-off-by: Joonas Lahtinen joonas.lahtinen@linux.intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/gpu/drm/i915/intel_lrc.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
--- a/drivers/gpu/drm/i915/intel_lrc.c +++ b/drivers/gpu/drm/i915/intel_lrc.c @@ -386,8 +386,13 @@ static u64 execlists_update_context(stru * may not be visible to the HW prior to the completion of the UC * register write and that we may begin execution from the context * before its image is complete leading to invalid PD chasing. + * + * Furthermore, Braswell, at least, wants a full mb to be sure that + * the writes are coherent in memory (visible to the GPU) prior to + * execution, and not just visible to other CPUs (as is the result of + * wmb). */ - wmb(); + mb(); return ce->lrc_desc; }
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit c470bdc1aaf36669e04ba65faf1092b2d1c6cabe ]
Apparently, some APs are buggy enough to send a zeroed WMM IE. Don't WARN on this since this is not caused by a bug on the client's system.
This aligns the condition of the WARNING in drv_conf_tx with the validity check in ieee80211_sta_wmm_params. We will now pick the default values whenever we get a zeroed WMM IE.
This has been reported here: https://bugzilla.kernel.org/show_bug.cgi?id=199161
Fixes: f409079bb678 ("mac80211: sanity check CW_min/CW_max towards driver") Signed-off-by: Emmanuel Grumbach emmanuel.grumbach@intel.com Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/mac80211/mlme.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/net/mac80211/mlme.c b/net/mac80211/mlme.c index 39451c84c785..7811650e76c4 100644 --- a/net/mac80211/mlme.c +++ b/net/mac80211/mlme.c @@ -1867,7 +1867,8 @@ static bool ieee80211_sta_wmm_params(struct ieee80211_local *local, params[ac].acm = acm; params[ac].uapsd = uapsd;
- if (params[ac].cw_min > params[ac].cw_max) { + if (params->cw_min == 0 || + params[ac].cw_min > params[ac].cw_max) { sdata_info(sdata, "AP has invalid WMM params (CWmin/max=%d/%d for ACI %d), using defaults\n", params[ac].cw_min, params[ac].cw_max, aci);
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 911a26484c33e10de6237228ca1d7293548e9f49 ]
Commit c470bdc1aaf3 ("mac80211: don't WARN on bad WMM parameters from buggy APs") handled cases where an AP reports a zeroed WMM IE. However, the condition that checks the validity accessed the wrong index in the ieee80211_tx_queue_params array, thus wrongly deducing that the parameters are invalid. Fix it.
Fixes: c470bdc1aaf3 ("mac80211: don't WARN on bad WMM parameters from buggy APs") Signed-off-by: Ilan Peer ilan.peer@intel.com Signed-off-by: Luca Coelho luciano.coelho@intel.com Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/mac80211/mlme.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/mac80211/mlme.c b/net/mac80211/mlme.c index 7811650e76c4..6e0aa296f134 100644 --- a/net/mac80211/mlme.c +++ b/net/mac80211/mlme.c @@ -1867,7 +1867,7 @@ static bool ieee80211_sta_wmm_params(struct ieee80211_local *local, params[ac].acm = acm; params[ac].uapsd = uapsd;
- if (params->cw_min == 0 || + if (params[ac].cw_min == 0 || params[ac].cw_min > params[ac].cw_max) { sdata_info(sdata, "AP has invalid WMM params (CWmin/max=%d/%d for ACI %d), using defaults\n",
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
commit 28a9a9e83ceae2cee25b9af9ad20d53aaa9ab951 upstream
Packet queue state is over used to determine SDMA descriptor availablitity and packet queue request state.
cpu 0 ret = user_sdma_send_pkts(req, pcount); cpu 0 if (atomic_read(&pq->n_reqs)) cpu 1 IRQ user_sdma_txreq_cb calls pq_update() (state to _INACTIVE) cpu 0 xchg(&pq->state, SDMA_PKT_Q_ACTIVE);
At this point pq->n_reqs == 0 and pq->state is incorrectly SDMA_PKT_Q_ACTIVE. The close path will hang waiting for the state to return to _INACTIVE.
This can also change the state from _DEFERRED to _ACTIVE. However, this is a mostly benign race.
Remove the racy code path.
Use n_reqs to determine if a packet queue is active or not.
Cc: stable@vger.kernel.org # 4.9.0 Reviewed-by: Mitko Haralanov mitko.haralanov@intel.com Reviewed-by: Mike Marciniszyn mike.marciniszyn@intel.com Signed-off-by: Michael J. Ruhl michael.j.ruhl@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/infiniband/hw/hfi1/user_sdma.c | 28 +++++++++----------------- drivers/infiniband/hw/hfi1/user_sdma.h | 7 ++++++- 2 files changed, 16 insertions(+), 19 deletions(-)
diff --git a/drivers/infiniband/hw/hfi1/user_sdma.c b/drivers/infiniband/hw/hfi1/user_sdma.c index 619475c7d761..4c111162d552 100644 --- a/drivers/infiniband/hw/hfi1/user_sdma.c +++ b/drivers/infiniband/hw/hfi1/user_sdma.c @@ -151,10 +151,6 @@ MODULE_PARM_DESC(sdma_comp_size, "Size of User SDMA completion ring. Default: 12 #define SDMA_REQ_HAVE_AHG 1 #define SDMA_REQ_HAS_ERROR 2
-#define SDMA_PKT_Q_INACTIVE BIT(0) -#define SDMA_PKT_Q_ACTIVE BIT(1) -#define SDMA_PKT_Q_DEFERRED BIT(2) - /* * Maximum retry attempts to submit a TX request * before putting the process to sleep. @@ -408,7 +404,6 @@ int hfi1_user_sdma_alloc_queues(struct hfi1_ctxtdata *uctxt, struct file *fp) pq->ctxt = uctxt->ctxt; pq->subctxt = fd->subctxt; pq->n_max_reqs = hfi1_sdma_comp_ring_size; - pq->state = SDMA_PKT_Q_INACTIVE; atomic_set(&pq->n_reqs, 0); init_waitqueue_head(&pq->wait); atomic_set(&pq->n_locked, 0); @@ -491,7 +486,7 @@ int hfi1_user_sdma_free_queues(struct hfi1_filedata *fd) /* Wait until all requests have been freed. */ wait_event_interruptible( pq->wait, - (ACCESS_ONCE(pq->state) == SDMA_PKT_Q_INACTIVE)); + !atomic_read(&pq->n_reqs)); kfree(pq->reqs); kfree(pq->req_in_use); kmem_cache_destroy(pq->txreq_cache); @@ -527,6 +522,13 @@ static u8 dlid_to_selector(u16 dlid) return mapping[hash]; }
+/** + * hfi1_user_sdma_process_request() - Process and start a user sdma request + * @fp: valid file pointer + * @iovec: array of io vectors to process + * @dim: overall iovec array size + * @count: number of io vector array entries processed + */ int hfi1_user_sdma_process_request(struct file *fp, struct iovec *iovec, unsigned long dim, unsigned long *count) { @@ -768,20 +770,12 @@ int hfi1_user_sdma_process_request(struct file *fp, struct iovec *iovec, }
set_comp_state(pq, cq, info.comp_idx, QUEUED, 0); + pq->state = SDMA_PKT_Q_ACTIVE; /* Send the first N packets in the request to buy us some time */ ret = user_sdma_send_pkts(req, pcount); if (unlikely(ret < 0 && ret != -EBUSY)) goto free_req;
- /* - * It is possible that the SDMA engine would have processed all the - * submitted packets by the time we get here. Therefore, only set - * packet queue state to ACTIVE if there are still uncompleted - * requests. - */ - if (atomic_read(&pq->n_reqs)) - xchg(&pq->state, SDMA_PKT_Q_ACTIVE); - /* * This is a somewhat blocking send implementation. * The driver will block the caller until all packets of the @@ -1526,10 +1520,8 @@ static void user_sdma_txreq_cb(struct sdma_txreq *txreq, int status)
static inline void pq_update(struct hfi1_user_sdma_pkt_q *pq) { - if (atomic_dec_and_test(&pq->n_reqs)) { - xchg(&pq->state, SDMA_PKT_Q_INACTIVE); + if (atomic_dec_and_test(&pq->n_reqs)) wake_up(&pq->wait); - } }
static void user_sdma_free_request(struct user_sdma_request *req, bool unpin) diff --git a/drivers/infiniband/hw/hfi1/user_sdma.h b/drivers/infiniband/hw/hfi1/user_sdma.h index 39001714f551..09dd843a13de 100644 --- a/drivers/infiniband/hw/hfi1/user_sdma.h +++ b/drivers/infiniband/hw/hfi1/user_sdma.h @@ -53,6 +53,11 @@
extern uint extended_psn;
+enum pkt_q_sdma_state { + SDMA_PKT_Q_ACTIVE, + SDMA_PKT_Q_DEFERRED, +}; + struct hfi1_user_sdma_pkt_q { struct list_head list; unsigned ctxt; @@ -65,7 +70,7 @@ struct hfi1_user_sdma_pkt_q { struct user_sdma_request *reqs; unsigned long *req_in_use; struct iowait busy; - unsigned state; + enum pkt_q_sdma_state state; wait_queue_head_t wait; unsigned long unpinned; struct mmu_rb_handler *handler;
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
commit 548095dea63ffc016d39c35b32c628d033638aca upstream.
Queued spinlocks are not used by DEC Alpha, and furthermore operations such as READ_ONCE() and release/relaxed RMW atomics are being changed to imply smp_read_barrier_depends(). This commit therefore removes the now-redundant smp_read_barrier_depends() from queued_spin_lock_slowpath(), and adjusts the comments accordingly.
Signed-off-by: Paul E. McKenney paulmck@linux.vnet.ibm.com Cc: Peter Zijlstra peterz@infradead.org Cc: Ingo Molnar mingo@redhat.com Signed-off-by: Sebastian Andrzej Siewior bigeasy@linutronix.de Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/locking/qspinlock.c | 12 +++++------- 1 file changed, 5 insertions(+), 7 deletions(-)
diff --git a/kernel/locking/qspinlock.c b/kernel/locking/qspinlock.c index a72f5df643f8..8710fbe8d26c 100644 --- a/kernel/locking/qspinlock.c +++ b/kernel/locking/qspinlock.c @@ -169,7 +169,7 @@ static __always_inline void clear_pending_set_locked(struct qspinlock *lock) * @tail : The new queue tail code word * Return: The previous queue tail code word * - * xchg(lock, tail) + * xchg(lock, tail), which heads an address dependency * * p,*,* -> n,*,* ; prev = xchg(lock, node) */ @@ -533,13 +533,11 @@ queue: if (old & _Q_TAIL_MASK) { prev = decode_tail(old); /* - * The above xchg_tail() is also a load of @lock which generates, - * through decode_tail(), a pointer. - * - * The address dependency matches the RELEASE of xchg_tail() - * such that the access to @prev must happen after. + * The above xchg_tail() is also a load of @lock which + * generates, through decode_tail(), a pointer. The address + * dependency matches the RELEASE of xchg_tail() such that + * the subsequent access to @prev happens after. */ - smp_read_barrier_depends();
WRITE_ONCE(prev->next, node);
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
commit 95bcade33a8af38755c9b0636e36a36ad3789fe6 upstream.
When a locker ends up queuing on the qspinlock locking slowpath, we initialise the relevant mcs node and publish it indirectly by updating the tail portion of the lock word using xchg_tail. If we find that there was a pre-existing locker in the queue, we subsequently update their ->next field to point at our node so that we are notified when it's our turn to take the lock.
This can be roughly illustrated as follows:
/* Initialise the fields in node and encode a pointer to node in tail */ tail = initialise_node(node);
/* * Exchange tail into the lockword using an atomic read-modify-write * operation with release semantics */ old = xchg_tail(lock, tail);
/* If there was a pre-existing waiter ... */ if (old & _Q_TAIL_MASK) { prev = decode_tail(old); smp_read_barrier_depends();
/* ... then update their ->next field to point to node. WRITE_ONCE(prev->next, node); }
The conditional update of prev->next therefore relies on the address dependency from the result of xchg_tail ensuring order against the prior initialisation of node. However, since the release semantics of the xchg_tail operation apply only to the write portion of the RmW, then this ordering is not guaranteed and it is possible for the CPU to return old before the writes to node have been published, consequently allowing us to point prev->next to an uninitialised node.
This patch fixes the problem by making the update of prev->next a RELEASE operation, which also removes the reliance on dependency ordering.
Signed-off-by: Will Deacon will.deacon@arm.com Acked-by: Peter Zijlstra (Intel) peterz@infradead.org Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Thomas Gleixner tglx@linutronix.de Link: http://lkml.kernel.org/r/1518528177-19169-2-git-send-email-will.deacon@arm.c... Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Sebastian Andrzej Siewior bigeasy@linutronix.de Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/locking/qspinlock.c | 13 +++++++------ 1 file changed, 7 insertions(+), 6 deletions(-)
diff --git a/kernel/locking/qspinlock.c b/kernel/locking/qspinlock.c index 8710fbe8d26c..6fce84401dba 100644 --- a/kernel/locking/qspinlock.c +++ b/kernel/locking/qspinlock.c @@ -532,14 +532,15 @@ queue: */ if (old & _Q_TAIL_MASK) { prev = decode_tail(old); + /* - * The above xchg_tail() is also a load of @lock which - * generates, through decode_tail(), a pointer. The address - * dependency matches the RELEASE of xchg_tail() such that - * the subsequent access to @prev happens after. + * We must ensure that the stores to @node are observed before + * the write to prev->next. The address dependency from + * xchg_tail is not sufficient to ensure this because the read + * component of xchg_tail is unordered with respect to the + * initialisation of @node. */ - - WRITE_ONCE(prev->next, node); + smp_store_release(&prev->next, node);
pv_wait_node(node, prev); arch_mcs_spin_lock_contended(&node->locked);
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
commit 6512276d97b160d90b53285bd06f7f201459a7e3 upstream.
If a locker taking the qspinlock slowpath reads a lock value indicating that only the pending bit is set, then it will spin whilst the concurrent pending->locked transition takes effect.
Unfortunately, there is no guarantee that such a transition will ever be observed since concurrent lockers could continuously set pending and hand over the lock amongst themselves, leading to starvation. Whilst this would probably resolve in practice, it means that it is not possible to prove liveness properties about the lock and means that lock acquisition time is unbounded.
Rather than removing the pending->locked spinning from the slowpath altogether (which has been shown to heavily penalise a 2-threaded locking stress test on x86), this patch replaces the explicit spinning with a call to atomic_cond_read_relaxed and allows the architecture to provide a bound on the number of spins. For architectures that can respond to changes in cacheline state in their smp_cond_load implementation, it should be sufficient to use the default bound of 1.
Suggested-by: Waiman Long longman@redhat.com Signed-off-by: Will Deacon will.deacon@arm.com Acked-by: Peter Zijlstra (Intel) peterz@infradead.org Acked-by: Waiman Long longman@redhat.com Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Thomas Gleixner tglx@linutronix.de Cc: boqun.feng@gmail.com Cc: linux-arm-kernel@lists.infradead.org Cc: paulmck@linux.vnet.ibm.com Link: http://lkml.kernel.org/r/1524738868-31318-4-git-send-email-will.deacon@arm.c... Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Sebastian Andrzej Siewior bigeasy@linutronix.de Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/locking/qspinlock.c | 20 +++++++++++++++++--- 1 file changed, 17 insertions(+), 3 deletions(-)
diff --git a/kernel/locking/qspinlock.c b/kernel/locking/qspinlock.c index 6fce84401dba..a8da1fc5222e 100644 --- a/kernel/locking/qspinlock.c +++ b/kernel/locking/qspinlock.c @@ -75,6 +75,18 @@ #define MAX_NODES 4 #endif
+/* + * The pending bit spinning loop count. + * This heuristic is used to limit the number of lockword accesses + * made by atomic_cond_read_relaxed when waiting for the lock to + * transition out of the "== _Q_PENDING_VAL" state. We don't spin + * indefinitely because there's no guarantee that we'll make forward + * progress. + */ +#ifndef _Q_PENDING_LOOPS +#define _Q_PENDING_LOOPS 1 +#endif + /* * Per-CPU queue node structures; we can never have more than 4 nested * contexts: task, softirq, hardirq, nmi. @@ -422,13 +434,15 @@ void queued_spin_lock_slowpath(struct qspinlock *lock, u32 val) return;
/* - * wait for in-progress pending->locked hand-overs + * Wait for in-progress pending->locked hand-overs with a bounded + * number of spins so that we guarantee forward progress. * * 0,1,0 -> 0,0,1 */ if (val == _Q_PENDING_VAL) { - while ((val = atomic_read(&lock->val)) == _Q_PENDING_VAL) - cpu_relax(); + int cnt = _Q_PENDING_LOOPS; + val = smp_cond_load_acquire(&lock->val.counter, + (VAL != _Q_PENDING_VAL) || !cnt--); }
/*
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
commit 625e88be1f41b53cec55827c984e4a89ea8ee9f9 upstream.
'struct __qspinlock' provides a handy union of fields so that subcomponents of the lockword can be accessed by name, without having to manage shifts and masks explicitly and take endianness into account.
This is useful in qspinlock.h and also potentially in arch headers, so move the 'struct __qspinlock' into 'struct qspinlock' and kill the extra definition.
Signed-off-by: Will Deacon will.deacon@arm.com Acked-by: Peter Zijlstra (Intel) peterz@infradead.org Acked-by: Waiman Long longman@redhat.com Acked-by: Boqun Feng boqun.feng@gmail.com Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Thomas Gleixner tglx@linutronix.de Cc: linux-arm-kernel@lists.infradead.org Cc: paulmck@linux.vnet.ibm.com Link: http://lkml.kernel.org/r/1524738868-31318-3-git-send-email-will.deacon@arm.c... Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Sebastian Andrzej Siewior bigeasy@linutronix.de Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/include/asm/qspinlock.h | 2 +- arch/x86/include/asm/qspinlock_paravirt.h | 3 +- include/asm-generic/qspinlock_types.h | 32 +++++++++++++++- kernel/locking/qspinlock.c | 46 ++--------------------- kernel/locking/qspinlock_paravirt.h | 34 ++++++----------- 5 files changed, 46 insertions(+), 71 deletions(-)
diff --git a/arch/x86/include/asm/qspinlock.h b/arch/x86/include/asm/qspinlock.h index eaba08076030..e07cc206919d 100644 --- a/arch/x86/include/asm/qspinlock.h +++ b/arch/x86/include/asm/qspinlock.h @@ -14,7 +14,7 @@ */ static inline void native_queued_spin_unlock(struct qspinlock *lock) { - smp_store_release((u8 *)lock, 0); + smp_store_release(&lock->locked, 0); }
#ifdef CONFIG_PARAVIRT_SPINLOCKS diff --git a/arch/x86/include/asm/qspinlock_paravirt.h b/arch/x86/include/asm/qspinlock_paravirt.h index 9d55f9b6e167..fc75415ae971 100644 --- a/arch/x86/include/asm/qspinlock_paravirt.h +++ b/arch/x86/include/asm/qspinlock_paravirt.h @@ -21,8 +21,7 @@ PV_CALLEE_SAVE_REGS_THUNK(__pv_queued_spin_unlock_slowpath); * * void __pv_queued_spin_unlock(struct qspinlock *lock) * { - * struct __qspinlock *l = (void *)lock; - * u8 lockval = cmpxchg(&l->locked, _Q_LOCKED_VAL, 0); + * u8 lockval = cmpxchg(&lock->locked, _Q_LOCKED_VAL, 0); * * if (likely(lockval == _Q_LOCKED_VAL)) * return; diff --git a/include/asm-generic/qspinlock_types.h b/include/asm-generic/qspinlock_types.h index 034acd0c4956..0763f065b975 100644 --- a/include/asm-generic/qspinlock_types.h +++ b/include/asm-generic/qspinlock_types.h @@ -29,13 +29,41 @@ #endif
typedef struct qspinlock { - atomic_t val; + union { + atomic_t val; + + /* + * By using the whole 2nd least significant byte for the + * pending bit, we can allow better optimization of the lock + * acquisition for the pending bit holder. + */ +#ifdef __LITTLE_ENDIAN + struct { + u8 locked; + u8 pending; + }; + struct { + u16 locked_pending; + u16 tail; + }; +#else + struct { + u16 tail; + u16 locked_pending; + }; + struct { + u8 reserved[2]; + u8 pending; + u8 locked; + }; +#endif + }; } arch_spinlock_t;
/* * Initializier */ -#define __ARCH_SPIN_LOCK_UNLOCKED { ATOMIC_INIT(0) } +#define __ARCH_SPIN_LOCK_UNLOCKED { .val = ATOMIC_INIT(0) }
/* * Bitfields in the atomic value: diff --git a/kernel/locking/qspinlock.c b/kernel/locking/qspinlock.c index a8da1fc5222e..cc98050e8bec 100644 --- a/kernel/locking/qspinlock.c +++ b/kernel/locking/qspinlock.c @@ -125,40 +125,6 @@ static inline __pure struct mcs_spinlock *decode_tail(u32 tail)
#define _Q_LOCKED_PENDING_MASK (_Q_LOCKED_MASK | _Q_PENDING_MASK)
-/* - * By using the whole 2nd least significant byte for the pending bit, we - * can allow better optimization of the lock acquisition for the pending - * bit holder. - * - * This internal structure is also used by the set_locked function which - * is not restricted to _Q_PENDING_BITS == 8. - */ -struct __qspinlock { - union { - atomic_t val; -#ifdef __LITTLE_ENDIAN - struct { - u8 locked; - u8 pending; - }; - struct { - u16 locked_pending; - u16 tail; - }; -#else - struct { - u16 tail; - u16 locked_pending; - }; - struct { - u8 reserved[2]; - u8 pending; - u8 locked; - }; -#endif - }; -}; - #if _Q_PENDING_BITS == 8 /** * clear_pending_set_locked - take ownership and clear the pending bit. @@ -170,9 +136,7 @@ struct __qspinlock { */ static __always_inline void clear_pending_set_locked(struct qspinlock *lock) { - struct __qspinlock *l = (void *)lock; - - WRITE_ONCE(l->locked_pending, _Q_LOCKED_VAL); + WRITE_ONCE(lock->locked_pending, _Q_LOCKED_VAL); }
/* @@ -187,13 +151,11 @@ static __always_inline void clear_pending_set_locked(struct qspinlock *lock) */ static __always_inline u32 xchg_tail(struct qspinlock *lock, u32 tail) { - struct __qspinlock *l = (void *)lock; - /* * Use release semantics to make sure that the MCS node is properly * initialized before changing the tail code. */ - return (u32)xchg_release(&l->tail, + return (u32)xchg_release(&lock->tail, tail >> _Q_TAIL_OFFSET) << _Q_TAIL_OFFSET; }
@@ -248,9 +210,7 @@ static __always_inline u32 xchg_tail(struct qspinlock *lock, u32 tail) */ static __always_inline void set_locked(struct qspinlock *lock) { - struct __qspinlock *l = (void *)lock; - - WRITE_ONCE(l->locked, _Q_LOCKED_VAL); + WRITE_ONCE(lock->locked, _Q_LOCKED_VAL); }
diff --git a/kernel/locking/qspinlock_paravirt.h b/kernel/locking/qspinlock_paravirt.h index e3b5520005db..3e33336911ff 100644 --- a/kernel/locking/qspinlock_paravirt.h +++ b/kernel/locking/qspinlock_paravirt.h @@ -69,10 +69,8 @@ struct pv_node { #define queued_spin_trylock(l) pv_queued_spin_steal_lock(l) static inline bool pv_queued_spin_steal_lock(struct qspinlock *lock) { - struct __qspinlock *l = (void *)lock; - if (!(atomic_read(&lock->val) & _Q_LOCKED_PENDING_MASK) && - (cmpxchg(&l->locked, 0, _Q_LOCKED_VAL) == 0)) { + (cmpxchg(&lock->locked, 0, _Q_LOCKED_VAL) == 0)) { qstat_inc(qstat_pv_lock_stealing, true); return true; } @@ -87,16 +85,12 @@ static inline bool pv_queued_spin_steal_lock(struct qspinlock *lock) #if _Q_PENDING_BITS == 8 static __always_inline void set_pending(struct qspinlock *lock) { - struct __qspinlock *l = (void *)lock; - - WRITE_ONCE(l->pending, 1); + WRITE_ONCE(lock->pending, 1); }
static __always_inline void clear_pending(struct qspinlock *lock) { - struct __qspinlock *l = (void *)lock; - - WRITE_ONCE(l->pending, 0); + WRITE_ONCE(lock->pending, 0); }
/* @@ -106,10 +100,8 @@ static __always_inline void clear_pending(struct qspinlock *lock) */ static __always_inline int trylock_clear_pending(struct qspinlock *lock) { - struct __qspinlock *l = (void *)lock; - - return !READ_ONCE(l->locked) && - (cmpxchg(&l->locked_pending, _Q_PENDING_VAL, _Q_LOCKED_VAL) + return !READ_ONCE(lock->locked) && + (cmpxchg(&lock->locked_pending, _Q_PENDING_VAL, _Q_LOCKED_VAL) == _Q_PENDING_VAL); } #else /* _Q_PENDING_BITS == 8 */ @@ -353,7 +345,6 @@ static void pv_wait_node(struct mcs_spinlock *node, struct mcs_spinlock *prev) static void pv_kick_node(struct qspinlock *lock, struct mcs_spinlock *node) { struct pv_node *pn = (struct pv_node *)node; - struct __qspinlock *l = (void *)lock;
/* * If the vCPU is indeed halted, advance its state to match that of @@ -372,7 +363,7 @@ static void pv_kick_node(struct qspinlock *lock, struct mcs_spinlock *node) * the hash table later on at unlock time, no atomic instruction is * needed. */ - WRITE_ONCE(l->locked, _Q_SLOW_VAL); + WRITE_ONCE(lock->locked, _Q_SLOW_VAL); (void)pv_hash(lock, pn); }
@@ -387,7 +378,6 @@ static u32 pv_wait_head_or_lock(struct qspinlock *lock, struct mcs_spinlock *node) { struct pv_node *pn = (struct pv_node *)node; - struct __qspinlock *l = (void *)lock; struct qspinlock **lp = NULL; int waitcnt = 0; int loop; @@ -438,13 +428,13 @@ pv_wait_head_or_lock(struct qspinlock *lock, struct mcs_spinlock *node) * * Matches the smp_rmb() in __pv_queued_spin_unlock(). */ - if (xchg(&l->locked, _Q_SLOW_VAL) == 0) { + if (xchg(&lock->locked, _Q_SLOW_VAL) == 0) { /* * The lock was free and now we own the lock. * Change the lock value back to _Q_LOCKED_VAL * and unhash the table. */ - WRITE_ONCE(l->locked, _Q_LOCKED_VAL); + WRITE_ONCE(lock->locked, _Q_LOCKED_VAL); WRITE_ONCE(*lp, NULL); goto gotlock; } @@ -452,7 +442,7 @@ pv_wait_head_or_lock(struct qspinlock *lock, struct mcs_spinlock *node) WRITE_ONCE(pn->state, vcpu_hashed); qstat_inc(qstat_pv_wait_head, true); qstat_inc(qstat_pv_wait_again, waitcnt); - pv_wait(&l->locked, _Q_SLOW_VAL); + pv_wait(&lock->locked, _Q_SLOW_VAL);
/* * Because of lock stealing, the queue head vCPU may not be @@ -477,7 +467,6 @@ gotlock: __visible void __pv_queued_spin_unlock_slowpath(struct qspinlock *lock, u8 locked) { - struct __qspinlock *l = (void *)lock; struct pv_node *node;
if (unlikely(locked != _Q_SLOW_VAL)) { @@ -506,7 +495,7 @@ __pv_queued_spin_unlock_slowpath(struct qspinlock *lock, u8 locked) * Now that we have a reference to the (likely) blocked pv_node, * release the lock. */ - smp_store_release(&l->locked, 0); + smp_store_release(&lock->locked, 0);
/* * At this point the memory pointed at by lock can be freed/reused, @@ -532,7 +521,6 @@ __pv_queued_spin_unlock_slowpath(struct qspinlock *lock, u8 locked) #ifndef __pv_queued_spin_unlock __visible void __pv_queued_spin_unlock(struct qspinlock *lock) { - struct __qspinlock *l = (void *)lock; u8 locked;
/* @@ -540,7 +528,7 @@ __visible void __pv_queued_spin_unlock(struct qspinlock *lock) * unhash. Otherwise it would be possible to have multiple @lock * entries, which would be BAD. */ - locked = cmpxchg_release(&l->locked, _Q_LOCKED_VAL, 0); + locked = cmpxchg_release(&lock->locked, _Q_LOCKED_VAL, 0); if (likely(locked == _Q_LOCKED_VAL)) return;
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
commit 59fb586b4a07b4e1a0ee577140ab4842ba451acd upstream.
The qspinlock locking slowpath utilises a "pending" bit as a simple form of an embedded test-and-set lock that can avoid the overhead of explicit queuing in cases where the lock is held but uncontended. This bit is managed using a cmpxchg() loop which tries to transition the uncontended lock word from (0,0,0) -> (0,0,1) or (0,0,1) -> (0,1,1).
Unfortunately, the cmpxchg() loop is unbounded and lockers can be starved indefinitely if the lock word is seen to oscillate between unlocked (0,0,0) and locked (0,0,1). This could happen if concurrent lockers are able to take the lock in the cmpxchg() loop without queuing and pass it around amongst themselves.
This patch fixes the problem by unconditionally setting _Q_PENDING_VAL using atomic_fetch_or, and then inspecting the old value to see whether we need to spin on the current lock owner, or whether we now effectively hold the lock. The tricky scenario is when concurrent lockers end up queuing on the lock and the lock becomes available, causing us to see a lockword of (n,0,0). With pending now set, simply queuing could lead to deadlock as the head of the queue may not have observed the pending flag being cleared. Conversely, if the head of the queue did observe pending being cleared, then it could transition the lock from (n,0,0) -> (0,0,1) meaning that any attempt to "undo" our setting of the pending bit could race with a concurrent locker trying to set it.
We handle this race by preserving the pending bit when taking the lock after reaching the head of the queue and leaving the tail entry intact if we saw pending set, because we know that the tail is going to be updated shortly.
Signed-off-by: Will Deacon will.deacon@arm.com Acked-by: Peter Zijlstra (Intel) peterz@infradead.org Acked-by: Waiman Long longman@redhat.com Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Thomas Gleixner tglx@linutronix.de Cc: boqun.feng@gmail.com Cc: linux-arm-kernel@lists.infradead.org Cc: paulmck@linux.vnet.ibm.com Link: http://lkml.kernel.org/r/1524738868-31318-6-git-send-email-will.deacon@arm.c... Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Sebastian Andrzej Siewior bigeasy@linutronix.de Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/locking/qspinlock.c | 102 ++++++++++++++++------------ kernel/locking/qspinlock_paravirt.h | 5 -- 2 files changed, 58 insertions(+), 49 deletions(-)
diff --git a/kernel/locking/qspinlock.c b/kernel/locking/qspinlock.c index cc98050e8bec..e7ab99a1f438 100644 --- a/kernel/locking/qspinlock.c +++ b/kernel/locking/qspinlock.c @@ -126,6 +126,17 @@ static inline __pure struct mcs_spinlock *decode_tail(u32 tail) #define _Q_LOCKED_PENDING_MASK (_Q_LOCKED_MASK | _Q_PENDING_MASK)
#if _Q_PENDING_BITS == 8 +/** + * clear_pending - clear the pending bit. + * @lock: Pointer to queued spinlock structure + * + * *,1,* -> *,0,* + */ +static __always_inline void clear_pending(struct qspinlock *lock) +{ + WRITE_ONCE(lock->pending, 0); +} + /** * clear_pending_set_locked - take ownership and clear the pending bit. * @lock: Pointer to queued spinlock structure @@ -161,6 +172,17 @@ static __always_inline u32 xchg_tail(struct qspinlock *lock, u32 tail)
#else /* _Q_PENDING_BITS == 8 */
+/** + * clear_pending - clear the pending bit. + * @lock: Pointer to queued spinlock structure + * + * *,1,* -> *,0,* + */ +static __always_inline void clear_pending(struct qspinlock *lock) +{ + atomic_andnot(_Q_PENDING_VAL, &lock->val); +} + /** * clear_pending_set_locked - take ownership and clear the pending bit. * @lock: Pointer to queued spinlock structure @@ -382,7 +404,7 @@ EXPORT_SYMBOL(queued_spin_unlock_wait); void queued_spin_lock_slowpath(struct qspinlock *lock, u32 val) { struct mcs_spinlock *prev, *next, *node; - u32 new, old, tail; + u32 old, tail; int idx;
BUILD_BUG_ON(CONFIG_NR_CPUS >= (1U << _Q_TAIL_CPU_BITS)); @@ -405,59 +427,51 @@ void queued_spin_lock_slowpath(struct qspinlock *lock, u32 val) (VAL != _Q_PENDING_VAL) || !cnt--); }
+ /* + * If we observe any contention; queue. + */ + if (val & ~_Q_LOCKED_MASK) + goto queue; + /* * trylock || pending * * 0,0,0 -> 0,0,1 ; trylock * 0,0,1 -> 0,1,1 ; pending */ - for (;;) { + val = atomic_fetch_or_acquire(_Q_PENDING_VAL, &lock->val); + if (!(val & ~_Q_LOCKED_MASK)) { /* - * If we observe any contention; queue. + * We're pending, wait for the owner to go away. + * + * *,1,1 -> *,1,0 + * + * this wait loop must be a load-acquire such that we match the + * store-release that clears the locked bit and create lock + * sequentiality; this is because not all + * clear_pending_set_locked() implementations imply full + * barriers. */ - if (val & ~_Q_LOCKED_MASK) - goto queue; - - new = _Q_LOCKED_VAL; - if (val == new) - new |= _Q_PENDING_VAL; + if (val & _Q_LOCKED_MASK) { + smp_cond_load_acquire(&lock->val.counter, + !(VAL & _Q_LOCKED_MASK)); + }
/* - * Acquire semantic is required here as the function may - * return immediately if the lock was free. + * take ownership and clear the pending bit. + * + * *,1,0 -> *,0,1 */ - old = atomic_cmpxchg_acquire(&lock->val, val, new); - if (old == val) - break; - - val = old; - } - - /* - * we won the trylock - */ - if (new == _Q_LOCKED_VAL) + clear_pending_set_locked(lock); return; + }
/* - * we're pending, wait for the owner to go away. - * - * *,1,1 -> *,1,0 - * - * this wait loop must be a load-acquire such that we match the - * store-release that clears the locked bit and create lock - * sequentiality; this is because not all clear_pending_set_locked() - * implementations imply full barriers. - */ - smp_cond_load_acquire(&lock->val.counter, !(VAL & _Q_LOCKED_MASK)); - - /* - * take ownership and clear the pending bit. - * - * *,1,0 -> *,0,1 + * If pending was clear but there are waiters in the queue, then + * we need to undo our setting of pending before we queue ourselves. */ - clear_pending_set_locked(lock); - return; + if (!(val & _Q_PENDING_MASK)) + clear_pending(lock);
/* * End of pending bit optimistic spinning and beginning of MCS @@ -561,15 +575,15 @@ locked: * claim the lock: * * n,0,0 -> 0,0,1 : lock, uncontended - * *,0,0 -> *,0,1 : lock, contended + * *,*,0 -> *,*,1 : lock, contended * - * If the queue head is the only one in the queue (lock value == tail), - * clear the tail code and grab the lock. Otherwise, we only need - * to grab the lock. + * If the queue head is the only one in the queue (lock value == tail) + * and nobody is pending, clear the tail code and grab the lock. + * Otherwise, we only need to grab the lock. */ for (;;) { /* In the PV case we might already have _Q_LOCKED_VAL set */ - if ((val & _Q_TAIL_MASK) != tail) { + if ((val & _Q_TAIL_MASK) != tail || (val & _Q_PENDING_MASK)) { set_locked(lock); break; } diff --git a/kernel/locking/qspinlock_paravirt.h b/kernel/locking/qspinlock_paravirt.h index 3e33336911ff..9c07c72fb10e 100644 --- a/kernel/locking/qspinlock_paravirt.h +++ b/kernel/locking/qspinlock_paravirt.h @@ -88,11 +88,6 @@ static __always_inline void set_pending(struct qspinlock *lock) WRITE_ONCE(lock->pending, 1); }
-static __always_inline void clear_pending(struct qspinlock *lock) -{ - WRITE_ONCE(lock->pending, 0); -} - /* * The pending bit check in pv_queued_spin_steal_lock() isn't a memory * barrier. Therefore, an atomic cmpxchg() is used to acquire the lock
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
commit 3bea9adc96842b8a7345c7fb202c16ae9c8d5b25 upstream.
The native clear_pending() function is identical to the PV version, so the latter can simply be removed.
This fixes the build for systems with >= 16K CPUs using the PV lock implementation.
Reported-by: Waiman Long longman@redhat.com Signed-off-by: Will Deacon will.deacon@arm.com Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Peter Zijlstra peterz@infradead.org Cc: Thomas Gleixner tglx@linutronix.de Cc: boqun.feng@gmail.com Cc: linux-arm-kernel@lists.infradead.org Cc: paulmck@linux.vnet.ibm.com Link: http://lkml.kernel.org/r/20180427101619.GB21705@arm.com Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Sebastian Andrzej Siewior bigeasy@linutronix.de Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/locking/qspinlock_paravirt.h | 5 ----- 1 file changed, 5 deletions(-)
diff --git a/kernel/locking/qspinlock_paravirt.h b/kernel/locking/qspinlock_paravirt.h index 9c07c72fb10e..af2a24d484aa 100644 --- a/kernel/locking/qspinlock_paravirt.h +++ b/kernel/locking/qspinlock_paravirt.h @@ -105,11 +105,6 @@ static __always_inline void set_pending(struct qspinlock *lock) atomic_or(_Q_PENDING_VAL, &lock->val); }
-static __always_inline void clear_pending(struct qspinlock *lock) -{ - atomic_andnot(_Q_PENDING_VAL, &lock->val); -} - static __always_inline int trylock_clear_pending(struct qspinlock *lock) { int val = atomic_read(&lock->val);
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
commit c61da58d8a9ba9238250a548f00826eaf44af0f7 upstream.
When a queued locker reaches the head of the queue, it claims the lock by setting _Q_LOCKED_VAL in the lockword. If there isn't contention, it must also clear the tail as part of this operation so that subsequent lockers can avoid taking the slowpath altogether.
Currently this is expressed as a cmpxchg() loop that practically only runs up to two iterations. This is confusing to the reader and unhelpful to the compiler. Rewrite the cmpxchg() loop without the loop, so that a failed cmpxchg() implies that there is contention and we just need to write to _Q_LOCKED_VAL without considering the rest of the lockword.
Signed-off-by: Will Deacon will.deacon@arm.com Acked-by: Peter Zijlstra (Intel) peterz@infradead.org Acked-by: Waiman Long longman@redhat.com Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Thomas Gleixner tglx@linutronix.de Cc: boqun.feng@gmail.com Cc: linux-arm-kernel@lists.infradead.org Cc: paulmck@linux.vnet.ibm.com Link: http://lkml.kernel.org/r/1524738868-31318-7-git-send-email-will.deacon@arm.c... Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Sebastian Andrzej Siewior bigeasy@linutronix.de Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/locking/qspinlock.c | 19 ++++++++----------- 1 file changed, 8 insertions(+), 11 deletions(-)
diff --git a/kernel/locking/qspinlock.c b/kernel/locking/qspinlock.c index e7ab99a1f438..ba5dc86a4d83 100644 --- a/kernel/locking/qspinlock.c +++ b/kernel/locking/qspinlock.c @@ -581,24 +581,21 @@ locked: * and nobody is pending, clear the tail code and grab the lock. * Otherwise, we only need to grab the lock. */ - for (;;) { - /* In the PV case we might already have _Q_LOCKED_VAL set */ - if ((val & _Q_TAIL_MASK) != tail || (val & _Q_PENDING_MASK)) { - set_locked(lock); - break; - } + + /* In the PV case we might already have _Q_LOCKED_VAL set */ + if ((val & _Q_TAIL_MASK) == tail) { /* * The smp_cond_load_acquire() call above has provided the - * necessary acquire semantics required for locking. At most - * two iterations of this loop may be ran. + * necessary acquire semantics required for locking. */ old = atomic_cmpxchg_relaxed(&lock->val, val, _Q_LOCKED_VAL); if (old == val) - goto release; /* No contention */ - - val = old; + goto release; /* No contention */ }
+ /* Either somebody is queued behind us or _Q_PENDING_VAL is set */ + set_locked(lock); + /* * contended path; wait for next if not observed yet, release. */
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
commit 53bf57fab7321fb42b703056a4c80fc9d986d170 upstream.
Flip the branch condition after atomic_fetch_or_acquire(_Q_PENDING_VAL) such that we loose the indent. This also result in a more natural code flow IMO.
Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Acked-by: Will Deacon will.deacon@arm.com Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Peter Zijlstra peterz@infradead.org Cc: Thomas Gleixner tglx@linutronix.de Cc: andrea.parri@amarulasolutions.com Cc: longman@redhat.com Link: https://lkml.kernel.org/r/20181003130257.156322446@infradead.org Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Sebastian Andrzej Siewior bigeasy@linutronix.de Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/locking/qspinlock.c | 54 ++++++++++++++++++-------------------- 1 file changed, 26 insertions(+), 28 deletions(-)
diff --git a/kernel/locking/qspinlock.c b/kernel/locking/qspinlock.c index ba5dc86a4d83..f493a4fce624 100644 --- a/kernel/locking/qspinlock.c +++ b/kernel/locking/qspinlock.c @@ -440,38 +440,36 @@ void queued_spin_lock_slowpath(struct qspinlock *lock, u32 val) * 0,0,1 -> 0,1,1 ; pending */ val = atomic_fetch_or_acquire(_Q_PENDING_VAL, &lock->val); - if (!(val & ~_Q_LOCKED_MASK)) { - /* - * We're pending, wait for the owner to go away. - * - * *,1,1 -> *,1,0 - * - * this wait loop must be a load-acquire such that we match the - * store-release that clears the locked bit and create lock - * sequentiality; this is because not all - * clear_pending_set_locked() implementations imply full - * barriers. - */ - if (val & _Q_LOCKED_MASK) { - smp_cond_load_acquire(&lock->val.counter, - !(VAL & _Q_LOCKED_MASK)); - } - - /* - * take ownership and clear the pending bit. - * - * *,1,0 -> *,0,1 - */ - clear_pending_set_locked(lock); - return; + /* + * If we observe any contention; undo and queue. + */ + if (unlikely(val & ~_Q_LOCKED_MASK)) { + if (!(val & _Q_PENDING_MASK)) + clear_pending(lock); + goto queue; }
/* - * If pending was clear but there are waiters in the queue, then - * we need to undo our setting of pending before we queue ourselves. + * We're pending, wait for the owner to go away. + * + * 0,1,1 -> 0,1,0 + * + * this wait loop must be a load-acquire such that we match the + * store-release that clears the locked bit and create lock + * sequentiality; this is because not all + * clear_pending_set_locked() implementations imply full + * barriers. + */ + if (val & _Q_LOCKED_MASK) + smp_cond_load_acquire(&lock->val.counter, !(VAL & _Q_LOCKED_MASK)); + + /* + * take ownership and clear the pending bit. + * + * 0,1,0 -> 0,0,1 */ - if (!(val & _Q_PENDING_MASK)) - clear_pending(lock); + clear_pending_set_locked(lock); + return;
/* * End of pending bit optimistic spinning and beginning of MCS
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
commit b247be3fe89b6aba928bf80f4453d1c4ba8d2063 upstream.
On x86, atomic_cond_read_relaxed will busy-wait with a cpu_relax() loop, so it is desirable to increase the number of times we spin on the qspinlock lockword when it is found to be transitioning from pending to locked.
According to Waiman Long:
| Ideally, the spinning times should be at least a few times the typical | cacheline load time from memory which I think can be down to 100ns or | so for each cacheline load with the newest systems or up to several | hundreds ns for older systems.
which in his benchmarking corresponded to 512 iterations.
Suggested-by: Waiman Long longman@redhat.com Signed-off-by: Will Deacon will.deacon@arm.com Acked-by: Peter Zijlstra (Intel) peterz@infradead.org Acked-by: Waiman Long longman@redhat.com Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Thomas Gleixner tglx@linutronix.de Cc: boqun.feng@gmail.com Cc: linux-arm-kernel@lists.infradead.org Cc: paulmck@linux.vnet.ibm.com Link: http://lkml.kernel.org/r/1524738868-31318-5-git-send-email-will.deacon@arm.c... Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Sebastian Andrzej Siewior bigeasy@linutronix.de Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/include/asm/qspinlock.h | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/arch/x86/include/asm/qspinlock.h b/arch/x86/include/asm/qspinlock.h index e07cc206919d..8b1ba1607091 100644 --- a/arch/x86/include/asm/qspinlock.h +++ b/arch/x86/include/asm/qspinlock.h @@ -5,6 +5,8 @@ #include <asm-generic/qspinlock_types.h> #include <asm/paravirt.h>
+#define _Q_PENDING_LOOPS (1 << 9) + #define queued_spin_unlock queued_spin_unlock /** * queued_spin_unlock - release a queued spinlock
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
commit 7aa54be2976550f17c11a1c3e3630002dea39303 upstream.
On x86 we cannot do fetch_or() with a single instruction and thus end up using a cmpxchg loop, this reduces determinism. Replace the fetch_or() with a composite operation: tas-pending + load.
Using two instructions of course opens a window we previously did not have. Consider the scenario:
CPU0 CPU1 CPU2
1) lock trylock -> (0,0,1)
2) lock trylock /* fail */
3) unlock -> (0,0,0)
4) lock trylock -> (0,0,1)
5) tas-pending -> (0,1,1) load-val <- (0,1,0) from 3
6) clear-pending-set-locked -> (0,0,1)
FAIL: _2_ owners
where 5) is our new composite operation. When we consider each part of the qspinlock state as a separate variable (as we can when _Q_PENDING_BITS == 8) then the above is entirely possible, because tas-pending will only RmW the pending byte, so the later load is able to observe prior tail and lock state (but not earlier than its own trylock, which operates on the whole word, due to coherence).
To avoid this we need 2 things:
- the load must come after the tas-pending (obviously, otherwise it can trivially observe prior state).
- the tas-pending must be a full word RmW instruction, it cannot be an XCHGB for example, such that we cannot observe other state prior to setting pending.
On x86 we can realize this by using "LOCK BTS m32, r32" for tas-pending followed by a regular load.
Note that observing later state is not a problem:
- if we fail to observe a later unlock, we'll simply spin-wait for that store to become visible.
- if we observe a later xchg_tail(), there is no difference from that xchg_tail() having taken place before the tas-pending.
Suggested-by: Will Deacon will.deacon@arm.com Reported-by: Thomas Gleixner tglx@linutronix.de Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Reviewed-by: Will Deacon will.deacon@arm.com Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Peter Zijlstra peterz@infradead.org Cc: andrea.parri@amarulasolutions.com Cc: longman@redhat.com Fixes: 59fb586b4a07 ("locking/qspinlock: Remove unbounded cmpxchg() loop from locking slowpath") Link: https://lkml.kernel.org/r/20181003130957.183726335@infradead.org Signed-off-by: Ingo Molnar mingo@kernel.org [bigeasy: GEN_BINARY_RMWcc macro redo] Signed-off-by: Sebastian Andrzej Siewior bigeasy@linutronix.de Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/include/asm/qspinlock.h | 21 +++++++++++++++++++++ kernel/locking/qspinlock.c | 17 ++++++++++++++++- 2 files changed, 37 insertions(+), 1 deletion(-)
diff --git a/arch/x86/include/asm/qspinlock.h b/arch/x86/include/asm/qspinlock.h index 8b1ba1607091..9e78e963afb8 100644 --- a/arch/x86/include/asm/qspinlock.h +++ b/arch/x86/include/asm/qspinlock.h @@ -4,9 +4,30 @@ #include <asm/cpufeature.h> #include <asm-generic/qspinlock_types.h> #include <asm/paravirt.h> +#include <asm/rmwcc.h>
#define _Q_PENDING_LOOPS (1 << 9)
+#define queued_fetch_set_pending_acquire queued_fetch_set_pending_acquire + +static __always_inline bool __queued_RMW_btsl(struct qspinlock *lock) +{ + GEN_BINARY_RMWcc(LOCK_PREFIX "btsl", lock->val.counter, + "I", _Q_PENDING_OFFSET, "%0", c); +} + +static __always_inline u32 queued_fetch_set_pending_acquire(struct qspinlock *lock) +{ + u32 val = 0; + + if (__queued_RMW_btsl(lock)) + val |= _Q_PENDING_VAL; + + val |= atomic_read(&lock->val) & ~_Q_PENDING_MASK; + + return val; +} + #define queued_spin_unlock queued_spin_unlock /** * queued_spin_unlock - release a queued spinlock diff --git a/kernel/locking/qspinlock.c b/kernel/locking/qspinlock.c index f493a4fce624..0ed478e10071 100644 --- a/kernel/locking/qspinlock.c +++ b/kernel/locking/qspinlock.c @@ -224,6 +224,20 @@ static __always_inline u32 xchg_tail(struct qspinlock *lock, u32 tail) } #endif /* _Q_PENDING_BITS == 8 */
+/** + * queued_fetch_set_pending_acquire - fetch the whole lock value and set pending + * @lock : Pointer to queued spinlock structure + * Return: The previous lock value + * + * *,*,* -> *,1,* + */ +#ifndef queued_fetch_set_pending_acquire +static __always_inline u32 queued_fetch_set_pending_acquire(struct qspinlock *lock) +{ + return atomic_fetch_or_acquire(_Q_PENDING_VAL, &lock->val); +} +#endif + /** * set_locked - Set the lock bit and own the lock * @lock: Pointer to queued spinlock structure @@ -439,7 +453,8 @@ void queued_spin_lock_slowpath(struct qspinlock *lock, u32 val) * 0,0,0 -> 0,0,1 ; trylock * 0,0,1 -> 0,1,1 ; pending */ - val = atomic_fetch_or_acquire(_Q_PENDING_VAL, &lock->val); + val = queued_fetch_set_pending_acquire(lock); + /* * If we observe any contention; undo and queue. */
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 6cc65be4f6f2a7186af8f3e09900787c7912dad2 ]
One of my tests compiles the kernel with gcc 4.5.3, and I hit the following build error:
include/linux/semaphore.h: In function 'sema_init': include/linux/semaphore.h:35:17: error: unknown field 'val' specified in initializer include/linux/semaphore.h:35:17: warning: missing braces around initializer include/linux/semaphore.h:35:17: warning: (near initialization for '(anonymous).raw_lock.<anonymous>.val')
I bisected it down to:
625e88be1f41 ("locking/qspinlock: Merge 'struct __qspinlock' into 'struct qspinlock'")
... which makes qspinlock have an anonymous union, which makes initializing it special for older compilers. By adding strategic brackets, it makes the build happy again.
Signed-off-by: Steven Rostedt (VMware) rostedt@goodmis.org Acked-by: Waiman Long longman@redhat.com Cc: Andrew Morton akpm@linux-foundation.org Cc: Boqun Feng boqun.feng@gmail.com Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Peter Zijlstra (Intel) peterz@infradead.org Cc: Peter Zijlstra peterz@infradead.org Cc: Thomas Gleixner tglx@linutronix.de Cc: Will Deacon will.deacon@arm.com Cc: linux-arm-kernel@lists.infradead.org Fixes: 625e88be1f41 ("locking/qspinlock: Merge 'struct __qspinlock' into 'struct qspinlock'") Link: http://lkml.kernel.org/r/20180621203526.172ab5c4@vmware.local.home Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- include/asm-generic/qspinlock_types.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/asm-generic/qspinlock_types.h b/include/asm-generic/qspinlock_types.h index 0763f065b975..d10f1e7d6ba8 100644 --- a/include/asm-generic/qspinlock_types.h +++ b/include/asm-generic/qspinlock_types.h @@ -63,7 +63,7 @@ typedef struct qspinlock { /* * Initializier */ -#define __ARCH_SPIN_LOCK_UNLOCKED { .val = ATOMIC_INIT(0) } +#define __ARCH_SPIN_LOCK_UNLOCKED { { .val = ATOMIC_INIT(0) } }
/* * Bitfields in the atomic value:
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 05cc09de4c017663a217630682041066f2f9a5cd ]
There is no unregister netlink notifier and family on error paths in init_mac80211_hwsim(). Also there is an error path where hwsim_class is not destroyed.
Found by Linux Driver Verification project (linuxtesting.org).
Signed-off-by: Alexey Khoroshilov khoroshilov@ispras.ru Fixes: 62759361eb49 ("mac80211-hwsim: Provide multicast event for HWSIM_CMD_NEW_RADIO") Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/mac80211_hwsim.c | 12 +++++++----- 1 file changed, 7 insertions(+), 5 deletions(-)
diff --git a/drivers/net/wireless/mac80211_hwsim.c b/drivers/net/wireless/mac80211_hwsim.c index 0852a1aad075..780acf23fd19 100644 --- a/drivers/net/wireless/mac80211_hwsim.c +++ b/drivers/net/wireless/mac80211_hwsim.c @@ -3403,16 +3403,16 @@ static int __init init_mac80211_hwsim(void) if (err) goto out_unregister_pernet;
+ err = hwsim_init_netlink(); + if (err) + goto out_unregister_driver; + hwsim_class = class_create(THIS_MODULE, "mac80211_hwsim"); if (IS_ERR(hwsim_class)) { err = PTR_ERR(hwsim_class); - goto out_unregister_driver; + goto out_exit_netlink; }
- err = hwsim_init_netlink(); - if (err < 0) - goto out_unregister_driver; - for (i = 0; i < radios; i++) { struct hwsim_new_radio_params param = { 0 };
@@ -3518,6 +3518,8 @@ out_free_mon: free_netdev(hwsim_mon); out_free_radios: mac80211_hwsim_free(); +out_exit_netlink: + hwsim_exit_netlink(); out_unregister_driver: platform_driver_unregister(&mac80211_hwsim_driver); out_unregister_pernet:
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 5db6dd14b31397e8cccaaddab2ff44ebec1acf25 ]
This commit addresses NULL pointer dereference in iscsi_eh_session_reset. Reference should not be made to session->leadconn when session->state is set to ISCSI_STATE_TERMINATE.
Signed-off-by: Fred Herard fred.herard@oracle.com Reviewed-by: Konrad Rzeszutek Wilk konrad.wilk@oracle.com Reviewed-by: Lee Duncan lduncan@suse.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/libiscsi.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c index cc8f2a7c2463..c79743de48f9 100644 --- a/drivers/scsi/libiscsi.c +++ b/drivers/scsi/libiscsi.c @@ -2414,8 +2414,8 @@ int iscsi_eh_session_reset(struct scsi_cmnd *sc) failed: ISCSI_DBG_EH(session, "failing session reset: Could not log back into " - "%s, %s [age %d]\n", session->targetname, - conn->persistent_address, session->age); + "%s [age %d]\n", session->targetname, + session->age); spin_unlock_bh(&session->frwd_lock); mutex_unlock(&session->eh_mutex); return FAILED;
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 02f425f811cefcc4d325d7a72272651e622dc97e ]
Currently pvscsi_remove calls free_irq more than once as pvscsi_release_resources and __pvscsi_shutdown both call pvscsi_shutdown_intr. This results in a 'Trying to free already-free IRQ' warning and stack trace. To solve the problem pvscsi_shutdown_intr has been moved out of pvscsi_release_resources.
Signed-off-by: Cathy Avery cavery@redhat.com Reviewed-by: Ewan D. Milne emilne@redhat.com Reviewed-by: Dan Carpenter dan.carpenter@oracle.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/vmw_pvscsi.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/scsi/vmw_pvscsi.c b/drivers/scsi/vmw_pvscsi.c index 874e9f085326..fcfbe2dcd025 100644 --- a/drivers/scsi/vmw_pvscsi.c +++ b/drivers/scsi/vmw_pvscsi.c @@ -1233,8 +1233,6 @@ static void pvscsi_shutdown_intr(struct pvscsi_adapter *adapter)
static void pvscsi_release_resources(struct pvscsi_adapter *adapter) { - pvscsi_shutdown_intr(adapter); - if (adapter->workqueue) destroy_workqueue(adapter->workqueue);
@@ -1563,6 +1561,7 @@ static int pvscsi_probe(struct pci_dev *pdev, const struct pci_device_id *id) out_reset_adapter: ll_adapter_reset(adapter); out_release_resources: + pvscsi_shutdown_intr(adapter); pvscsi_release_resources(adapter); scsi_host_put(host); out_disable_device: @@ -1571,6 +1570,7 @@ out_disable_device: return error;
out_release_resources_and_disable: + pvscsi_shutdown_intr(adapter); pvscsi_release_resources(adapter); goto out_disable_device; }
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 79c2206d369b87b19ac29cb47601059b6bf5c291 ]
An affected screen resolution is 1366 x 768, which width is not divisible by 8, the default font width. On such screens, when longer lines are earlyprintk'ed, overflow-to-next-line can never trigger, due to the left-most x-coordinate of the next character always less than the screen width. Earlyprintk will infinite loop in trying to print the rest of the string but unable to, due to the line being full.
This patch makes the trigger consider the right-most x-coordinate, instead of left-most, as the value to compare against the screen width threshold.
Signed-off-by: YiFei Zhu zhuyifei1999@gmail.com Signed-off-by: Ard Biesheuvel ard.biesheuvel@linaro.org Cc: Andy Lutomirski luto@kernel.org Cc: Arend van Spriel arend.vanspriel@broadcom.com Cc: Bhupesh Sharma bhsharma@redhat.com Cc: Borislav Petkov bp@alien8.de Cc: Dave Hansen dave.hansen@intel.com Cc: Eric Snowberg eric.snowberg@oracle.com Cc: Hans de Goede hdegoede@redhat.com Cc: Joe Perches joe@perches.com Cc: Jon Hunter jonathanh@nvidia.com Cc: Julien Thierry julien.thierry@arm.com Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Marc Zyngier marc.zyngier@arm.com Cc: Matt Fleming matt@codeblueprint.co.uk Cc: Nathan Chancellor natechancellor@gmail.com Cc: Peter Zijlstra peterz@infradead.org Cc: Sai Praneeth Prakhya sai.praneeth.prakhya@intel.com Cc: Sedat Dilek sedat.dilek@gmail.com Cc: Thomas Gleixner tglx@linutronix.de Cc: linux-efi@vger.kernel.org Link: http://lkml.kernel.org/r/20181129171230.18699-12-ard.biesheuvel@linaro.org Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/platform/efi/early_printk.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/platform/efi/early_printk.c b/arch/x86/platform/efi/early_printk.c index 5fdacb322ceb..c3e6be110b7d 100644 --- a/arch/x86/platform/efi/early_printk.c +++ b/arch/x86/platform/efi/early_printk.c @@ -179,7 +179,7 @@ early_efi_write(struct console *con, const char *str, unsigned int num) num--; }
- if (efi_x >= si->lfb_width) { + if (efi_x + font->width > si->lfb_width) { efi_x = 0; efi_y += font->height; }
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 3b712e43e3876b42b38321ecf790a1f5fe59c834 ]
Similar to the atomic helpers, we should enable vblank while we're waiting for the commit to finish. DPU needs this, MDP5 seems to work fine without it.
Reviewed-by: Abhinav Kumar abhinavk@codeaurora.org Signed-off-by: Sean Paul seanpaul@chromium.org Signed-off-by: Rob Clark robdclark@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/msm/msm_atomic.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/drivers/gpu/drm/msm/msm_atomic.c b/drivers/gpu/drm/msm/msm_atomic.c index 73bae382eac3..5c58a98f67c0 100644 --- a/drivers/gpu/drm/msm/msm_atomic.c +++ b/drivers/gpu/drm/msm/msm_atomic.c @@ -98,7 +98,12 @@ static void msm_atomic_wait_for_commit_done(struct drm_device *dev, if (old_state->legacy_cursor_update) continue;
+ if (drm_crtc_vblank_get(crtc)) + continue; + kms->funcs->wait_for_crtc_commit_done(kms, crtc); + + drm_crtc_vblank_put(crtc); } }
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 10d443431dc2bb733cf7add99b453e3fb9047a2e ]
Some ARC CPU's do not support unaligned loads/stores. Currently, generic implementation of reads{b/w/l}()/writes{b/w/l}() is being used with ARC. This can lead to misfunction of some drivers as generic functions do a plain dereference of a pointer that can be unaligned.
Let's use {get/put}_unaligned() helpers instead of plain dereference of pointer in order to fix. The helpers allow to get and store data from an unaligned address whilst preserving the CPU internal alignment. According to [1], the use of these helpers are costly in terms of performance so we added an initial check for a buffer already aligned so that the usage of the helpers can be avoided, when possible.
[1] Documentation/unaligned-memory-access.txt
Cc: Alexey Brodkin abrodkin@synopsys.com Cc: Joao Pinto jpinto@synopsys.com Cc: David Laight David.Laight@ACULAB.COM Tested-by: Vitor Soares soares@synopsys.com Signed-off-by: Jose Abreu joabreu@synopsys.com Signed-off-by: Vineet Gupta vgupta@synopsys.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arc/include/asm/io.h | 72 +++++++++++++++++++++++++++++++++++++++ 1 file changed, 72 insertions(+)
diff --git a/arch/arc/include/asm/io.h b/arch/arc/include/asm/io.h index c22b181e8206..2f39d9b3886e 100644 --- a/arch/arc/include/asm/io.h +++ b/arch/arc/include/asm/io.h @@ -12,6 +12,7 @@ #include <linux/types.h> #include <asm/byteorder.h> #include <asm/page.h> +#include <asm/unaligned.h>
#ifdef CONFIG_ISA_ARCV2 #include <asm/barrier.h> @@ -94,6 +95,42 @@ static inline u32 __raw_readl(const volatile void __iomem *addr) return w; }
+/* + * {read,write}s{b,w,l}() repeatedly access the same IO address in + * native endianness in 8-, 16-, 32-bit chunks {into,from} memory, + * @count times + */ +#define __raw_readsx(t,f) \ +static inline void __raw_reads##f(const volatile void __iomem *addr, \ + void *ptr, unsigned int count) \ +{ \ + bool is_aligned = ((unsigned long)ptr % ((t) / 8)) == 0; \ + u##t *buf = ptr; \ + \ + if (!count) \ + return; \ + \ + /* Some ARC CPU's don't support unaligned accesses */ \ + if (is_aligned) { \ + do { \ + u##t x = __raw_read##f(addr); \ + *buf++ = x; \ + } while (--count); \ + } else { \ + do { \ + u##t x = __raw_read##f(addr); \ + put_unaligned(x, buf++); \ + } while (--count); \ + } \ +} + +#define __raw_readsb __raw_readsb +__raw_readsx(8, b) +#define __raw_readsw __raw_readsw +__raw_readsx(16, w) +#define __raw_readsl __raw_readsl +__raw_readsx(32, l) + #define __raw_writeb __raw_writeb static inline void __raw_writeb(u8 b, volatile void __iomem *addr) { @@ -126,6 +163,35 @@ static inline void __raw_writel(u32 w, volatile void __iomem *addr)
}
+#define __raw_writesx(t,f) \ +static inline void __raw_writes##f(volatile void __iomem *addr, \ + const void *ptr, unsigned int count) \ +{ \ + bool is_aligned = ((unsigned long)ptr % ((t) / 8)) == 0; \ + const u##t *buf = ptr; \ + \ + if (!count) \ + return; \ + \ + /* Some ARC CPU's don't support unaligned accesses */ \ + if (is_aligned) { \ + do { \ + __raw_write##f(*buf++, addr); \ + } while (--count); \ + } else { \ + do { \ + __raw_write##f(get_unaligned(buf++), addr); \ + } while (--count); \ + } \ +} + +#define __raw_writesb __raw_writesb +__raw_writesx(8, b) +#define __raw_writesw __raw_writesw +__raw_writesx(16, w) +#define __raw_writesl __raw_writesl +__raw_writesx(32, l) + /* * MMIO can also get buffered/optimized in micro-arch, so barriers needed * Based on ARM model for the typical use case @@ -141,10 +207,16 @@ static inline void __raw_writel(u32 w, volatile void __iomem *addr) #define readb(c) ({ u8 __v = readb_relaxed(c); __iormb(); __v; }) #define readw(c) ({ u16 __v = readw_relaxed(c); __iormb(); __v; }) #define readl(c) ({ u32 __v = readl_relaxed(c); __iormb(); __v; }) +#define readsb(p,d,l) ({ __raw_readsb(p,d,l); __iormb(); }) +#define readsw(p,d,l) ({ __raw_readsw(p,d,l); __iormb(); }) +#define readsl(p,d,l) ({ __raw_readsl(p,d,l); __iormb(); })
#define writeb(v,c) ({ __iowmb(); writeb_relaxed(v,c); }) #define writew(v,c) ({ __iowmb(); writew_relaxed(v,c); }) #define writel(v,c) ({ __iowmb(); writel_relaxed(v,c); }) +#define writesb(p,d,l) ({ __iowmb(); __raw_writesb(p,d,l); }) +#define writesw(p,d,l) ({ __iowmb(); __raw_writesw(p,d,l); }) +#define writesl(p,d,l) ({ __iowmb(); __raw_writesl(p,d,l); })
/* * Relaxed API for drivers which can handle barrier ordering themselves
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 3b5b3a3331d141e8f2a7aaae3a94dfa1e61ecbe4 ]
Previously when unbinding a slave the 802.3ad implementation only told partner that the port is not suitable for aggregation by setting the port aggregation state from aggregatable to individual. This is not enough. If the physical layer still stays up and we only unbinded this port from the bond there is nothing in the aggregation status alone to prevent the partner from sending traffic towards us. To ensure that the partner doesn't consider this port at all anymore we should also disable collecting and distributing to signal that this actor is going away. Also clear AD_STATE_SYNCHRONIZATION to ensure partner exits collecting + distributing state.
I have tested this behaviour againts Arista EOS switches with mlx5 cards (physical link stays up even when interface is down) and simulated the same situation virtually Linux <-> Linux with two network namespaces running two veth device pairs. In both cases setting aggregation to individual doesn't alone prevent traffic from being to sent towards this port given that the link stays up in partners end. Partner still keeps it's end in collecting + distributing state and continues until timeout is reached. In most cases this means we are losing the traffic partner sends towards our port while we wait for timeout. This is most visible with slow periodic time (LACP rate slow).
Other open source implementations like Open VSwitch and libreswitch, and vendor implementations like Arista EOS, seem to disable collecting + distributing to when doing similar port disabling/detaching/removing change. With this patch kernel implementation would behave the same way and ensure partner doesn't consider our actor viable anymore.
Signed-off-by: Toni Peltonen peltzi@peltzi.fi Signed-off-by: Jay Vosburgh jay.vosburgh@canonical.com Acked-by: Jonathan Toppins jtoppins@redhat.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/bonding/bond_3ad.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/net/bonding/bond_3ad.c b/drivers/net/bonding/bond_3ad.c index 6dcc42d79cab..1e2ee97b9240 100644 --- a/drivers/net/bonding/bond_3ad.c +++ b/drivers/net/bonding/bond_3ad.c @@ -2050,6 +2050,9 @@ void bond_3ad_unbind_slave(struct slave *slave) aggregator->aggregator_identifier);
/* Tell the partner that this port is not suitable for aggregation */ + port->actor_oper_port_state &= ~AD_STATE_SYNCHRONIZATION; + port->actor_oper_port_state &= ~AD_STATE_COLLECTING; + port->actor_oper_port_state &= ~AD_STATE_DISTRIBUTING; port->actor_oper_port_state &= ~AD_STATE_AGGREGATION; __update_lacpdu_from_port(port); ad_lacpdu_send(port);
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit ad3cba223ac02dc769c3bbe88efe277bbb457566 ]
When we use direct_IO with an NFS backing store, we can trigger a WARNING in __set_page_dirty(), as below, since we're dirtying the page unnecessarily in nfs_direct_read_completion().
To fix, replicate the logic in commit 53cbf3b157a0 ("fs: direct-io: don't dirtying pages for ITER_BVEC/ITER_KVEC direct read").
Other filesystems that implement direct_IO handle this; most use blockdev_direct_IO(). ceph and cifs have similar logic.
mount 127.0.0.1:/export /nfs dd if=/dev/zero of=/nfs/image bs=1M count=200 losetup --direct-io=on -f /nfs/image mkfs.btrfs /dev/loop0 mount -t btrfs /dev/loop0 /mnt/
kernel: WARNING: CPU: 0 PID: 8067 at fs/buffer.c:580 __set_page_dirty+0xaf/0xd0 kernel: Modules linked in: loop(E) nfsv3(E) rpcsec_gss_krb5(E) nfsv4(E) dns_resolver(E) nfs(E) fscache(E) nfsd(E) auth_rpcgss(E) nfs_acl(E) lockd(E) grace(E) fuse(E) tun(E) ip6t_rpfilter(E) ipt_REJECT(E) nf_ kernel: snd_seq(E) snd_seq_device(E) snd_pcm(E) video(E) snd_timer(E) snd(E) soundcore(E) ip_tables(E) xfs(E) libcrc32c(E) sd_mod(E) sr_mod(E) cdrom(E) ata_generic(E) pata_acpi(E) crc32c_intel(E) ahci(E) li kernel: CPU: 0 PID: 8067 Comm: kworker/0:2 Tainted: G E 4.20.0-rc1.master.20181111.ol7.x86_64 #1 kernel: Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 kernel: Workqueue: nfsiod rpc_async_release [sunrpc] kernel: RIP: 0010:__set_page_dirty+0xaf/0xd0 kernel: Code: c3 48 8b 02 f6 c4 04 74 d4 48 89 df e8 ba 05 f7 ff 48 89 c6 eb cb 48 8b 43 08 a8 01 75 1f 48 89 d8 48 8b 00 a8 04 74 02 eb 87 <0f> 0b eb 83 48 83 e8 01 eb 9f 48 83 ea 01 0f 1f 00 eb 8b 48 83 e8 kernel: RSP: 0000:ffffc1c8825b7d78 EFLAGS: 00013046 kernel: RAX: 000fffffc0020089 RBX: fffff2b603308b80 RCX: 0000000000000001 kernel: RDX: 0000000000000001 RSI: ffff9d11478115c8 RDI: ffff9d11478115d0 kernel: RBP: ffffc1c8825b7da0 R08: 0000646f6973666e R09: 8080808080808080 kernel: R10: 0000000000000001 R11: 0000000000000000 R12: ffff9d11478115d0 kernel: R13: ffff9d11478115c8 R14: 0000000000003246 R15: 0000000000000001 kernel: FS: 0000000000000000(0000) GS:ffff9d115ba00000(0000) knlGS:0000000000000000 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 kernel: CR2: 00007f408686f640 CR3: 0000000104d8e004 CR4: 00000000000606f0 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 kernel: Call Trace: kernel: __set_page_dirty_buffers+0xb6/0x110 kernel: set_page_dirty+0x52/0xb0 kernel: nfs_direct_read_completion+0xc4/0x120 [nfs] kernel: nfs_pgio_release+0x10/0x20 [nfs] kernel: rpc_free_task+0x30/0x70 [sunrpc] kernel: rpc_async_release+0x12/0x20 [sunrpc] kernel: process_one_work+0x174/0x390 kernel: worker_thread+0x4f/0x3e0 kernel: kthread+0x102/0x140 kernel: ? drain_workqueue+0x130/0x130 kernel: ? kthread_stop+0x110/0x110 kernel: ret_from_fork+0x35/0x40 kernel: ---[ end trace 01341980905412c9 ]---
Signed-off-by: Dave Kleikamp dave.kleikamp@oracle.com Signed-off-by: Santosh Shilimkar santosh.shilimkar@oracle.com
[forward-ported to v4.20] Signed-off-by: Calum Mackay calum.mackay@oracle.com Reviewed-by: Dave Kleikamp dave.kleikamp@oracle.com Reviewed-by: Chuck Lever chuck.lever@oracle.com Signed-off-by: Trond Myklebust trond.myklebust@hammerspace.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/nfs/direct.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/fs/nfs/direct.c b/fs/nfs/direct.c index 1ab91124a93e..53f0012ace42 100644 --- a/fs/nfs/direct.c +++ b/fs/nfs/direct.c @@ -98,8 +98,11 @@ struct nfs_direct_req { struct pnfs_ds_commit_info ds_cinfo; /* Storage for cinfo */ struct work_struct work; int flags; + /* for write */ #define NFS_ODIRECT_DO_COMMIT (1) /* an unstable reply was received */ #define NFS_ODIRECT_RESCHED_WRITES (2) /* write verification failed */ + /* for read */ +#define NFS_ODIRECT_SHOULD_DIRTY (3) /* dirty user-space page after read */ struct nfs_writeverf verf; /* unstable write verifier */ };
@@ -422,7 +425,8 @@ static void nfs_direct_read_completion(struct nfs_pgio_header *hdr) struct nfs_page *req = nfs_list_entry(hdr->pages.next); struct page *page = req->wb_page;
- if (!PageCompound(page) && bytes < hdr->good_bytes) + if (!PageCompound(page) && bytes < hdr->good_bytes && + (dreq->flags == NFS_ODIRECT_SHOULD_DIRTY)) set_page_dirty(page); bytes += req->wb_bytes; nfs_list_remove_request(req); @@ -597,6 +601,9 @@ ssize_t nfs_file_direct_read(struct kiocb *iocb, struct iov_iter *iter) if (!is_sync_kiocb(iocb)) dreq->iocb = iocb;
+ if (iter_is_iovec(iter)) + dreq->flags = NFS_ODIRECT_SHOULD_DIRTY; + nfs_start_io_direct(inode);
NFS_I(inode)->read_io += count;
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 0a9a4304f3614e25d9de9b63502ca633c01c0d70 ]
If an asynchronous connection attempt completes while another task is in xprt_connect(), then the call to rpc_sleep_on() could end up racing with the call to xprt_wake_pending_tasks(). So add a second test of the connection state after we've put the task to sleep and set the XPRT_CONNECTING flag, when we know that there can be no asynchronous connection attempts still in progress.
Fixes: 0b9e79431377d ("SUNRPC: Move the test for XPRT_CONNECTING into...") Signed-off-by: Trond Myklebust trond.myklebust@hammerspace.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/sunrpc/xprt.c | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-)
diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c index 685e6d225414..1a8df242d26a 100644 --- a/net/sunrpc/xprt.c +++ b/net/sunrpc/xprt.c @@ -778,8 +778,15 @@ void xprt_connect(struct rpc_task *task) return; if (xprt_test_and_set_connecting(xprt)) return; - xprt->stat.connect_start = jiffies; - xprt->ops->connect(xprt, task); + /* Race breaker */ + if (!xprt_connected(xprt)) { + xprt->stat.connect_start = jiffies; + xprt->ops->connect(xprt, task); + } else { + xprt_clear_connecting(xprt); + task->tk_status = 0; + rpc_wake_up_queued_task(&xprt->pending, task); + } } xprt_release_write(xprt, task); }
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 87d81a23e24f24ebe014891e8bdf3ff8785031e8 ]
use of_node_put() to release the refcount.
Signed-off-by: Yangtao Li tiny.windzz@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/sbus/char/display7seg.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/sbus/char/display7seg.c b/drivers/sbus/char/display7seg.c index 33fbe8249fd5..044cffbc45e8 100644 --- a/drivers/sbus/char/display7seg.c +++ b/drivers/sbus/char/display7seg.c @@ -221,6 +221,7 @@ static int d7s_probe(struct platform_device *op) dev_set_drvdata(&op->dev, p); d7s_device = p; err = 0; + of_node_put(opts);
out: return err;
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 6bd520ab7cf69486ea81fd3cdfd2d5a390ad1100 ]
use of_node_put() to release the refcount.
Signed-off-by: Yangtao Li tiny.windzz@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/sbus/char/envctrl.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/sbus/char/envctrl.c b/drivers/sbus/char/envctrl.c index 5609b602c54d..baa9b322520b 100644 --- a/drivers/sbus/char/envctrl.c +++ b/drivers/sbus/char/envctrl.c @@ -910,8 +910,10 @@ static void envctrl_init_i2c_child(struct device_node *dp, for (len = 0; len < PCF8584_MAX_CHANNELS; ++len) { pchild->mon_type[len] = ENVCTRL_NOMON; } + of_node_put(root_node); return; } + of_node_put(root_node); }
/* Get the monitor channels. */
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit dac097c4546e4c5b16dd303a1e97c1d319c8ab3e ]
of_find_node_by_path() acquires a reference to the node returned by it and that reference needs to be dropped by its caller. This place is not doing this, so fix it.
Signed-off-by: Yangtao Li tiny.windzz@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/tty/serial/suncore.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/tty/serial/suncore.c b/drivers/tty/serial/suncore.c index 127472bd6a7c..209f314745ab 100644 --- a/drivers/tty/serial/suncore.c +++ b/drivers/tty/serial/suncore.c @@ -111,6 +111,7 @@ void sunserial_console_termios(struct console *con, struct device_node *uart_dp) mode = of_get_property(dp, mode_prop, NULL); if (!mode) mode = "9600,8,n,1,-"; + of_node_put(dp); }
cflag = CREAD | HUPCL | CLOCAL;
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit a51921c0db3fd26c4ed83dc0ec5d32988fa02aa5 ]
use of_node_put() to release the refcount.
Signed-off-by: Yangtao Li tiny.windzz@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/ide/pmac.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/ide/pmac.c b/drivers/ide/pmac.c index 0c5d3a99468e..b20025a5a8d9 100644 --- a/drivers/ide/pmac.c +++ b/drivers/ide/pmac.c @@ -920,6 +920,7 @@ static u8 pmac_ide_cable_detect(ide_hwif_t *hwif) struct device_node *root = of_find_node_by_path("/"); const char *model = of_get_property(root, "model", NULL);
+ of_node_put(root); /* Get cable type from device-tree. */ if (cable && !strncmp(cable, "80-", 3)) { /* Some drives fail to detect 80c cable in PowerBook */
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit d9f5b7f5dd0fa74a89de5a7ac1e26366f211ccee ]
These > comparisons should be >= to prevent reading beyond the end of of the clk_data->hws[] buffer.
The clk_data->hws[] array is allocated in cp110_syscon_common_probe() when we do: cp110_clk_data = devm_kzalloc(dev, sizeof(*cp110_clk_data) + sizeof(struct clk_hw *) * CP110_CLK_NUM, GFP_KERNEL); As you can see, it has CP110_CLK_NUM elements which is equivalent to CP110_MAX_CORE_CLOCKS + CP110_MAX_GATABLE_CLOCKS.
Fixes: d3da3eaef7f4 ("clk: mvebu: new driver for Armada CP110 system controller") Signed-off-by: Dan Carpenter dan.carpenter@oracle.com Signed-off-by: Stephen Boyd sboyd@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/clk/mvebu/cp110-system-controller.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/clk/mvebu/cp110-system-controller.c b/drivers/clk/mvebu/cp110-system-controller.c index f2303da7fda7..465953c75320 100644 --- a/drivers/clk/mvebu/cp110-system-controller.c +++ b/drivers/clk/mvebu/cp110-system-controller.c @@ -172,11 +172,11 @@ static struct clk *cp110_of_clk_get(struct of_phandle_args *clkspec, void *data) unsigned int idx = clkspec->args[1];
if (type == CP110_CLK_TYPE_CORE) { - if (idx > CP110_MAX_CORE_CLOCKS) + if (idx >= CP110_MAX_CORE_CLOCKS) return ERR_PTR(-EINVAL); return clk_data->clks[idx]; } else if (type == CP110_CLK_TYPE_GATABLE) { - if (idx > CP110_MAX_GATABLE_CLOCKS) + if (idx >= CP110_MAX_GATABLE_CLOCKS) return ERR_PTR(-EINVAL); return clk_data->clks[CP110_MAX_CORE_CLOCKS + idx]; }
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 2e85c57493e391b93445c1e0d530b36b95becc64 ]
The > comparison should be >= or we write one element beyond the end of the unit->clk_table[] array.
(The unit->clk_table[] array is allocated in the mmp_clk_init() function and it has unit->nr_clks elements).
Fixes: 4661fda10f8b ("clk: mmp: add basic support functions for DT support") Signed-off-by: Dan Carpenter dan.carpenter@oracle.com Signed-off-by: Stephen Boyd sboyd@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/clk/mmp/clk.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/clk/mmp/clk.c b/drivers/clk/mmp/clk.c index 61893fe73251..18b6c9b55b95 100644 --- a/drivers/clk/mmp/clk.c +++ b/drivers/clk/mmp/clk.c @@ -182,7 +182,7 @@ void mmp_clk_add(struct mmp_clk_unit *unit, unsigned int id, pr_err("CLK %d has invalid pointer %p\n", id, clk); return; } - if (id > unit->nr_clks) { + if (id >= unit->nr_clks) { pr_err("CLK %d is invalid\n", id); return; }
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 6c3516fed7b61a3527459ccfa67fab130d910610 ]
I noticed that the Android v3.0.8 kernel on droid4 is using different keypad values from the mainline kernel and does not have issues with keys occasionally being stuck until pressed again. Turns out there was an earlier patch posted to fix this as "Input: omap-keypad: errata i689: Correct debounce time", but it was never reposted to fix use macros for timing calculations.
This updated version is using macros, and also fixes the use of the input clock rate to use 32768KiHz instead of 32000KiHz. And we want to use the known good Android kernel values of 3 and 6 instead of 2 and 6 in the earlier patch.
Reported-by: Pavel Machek pavel@ucw.cz Signed-off-by: Tony Lindgren tony@atomide.com Signed-off-by: Dmitry Torokhov dmitry.torokhov@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/input/keyboard/omap4-keypad.c | 18 ++++++++++++++---- 1 file changed, 14 insertions(+), 4 deletions(-)
diff --git a/drivers/input/keyboard/omap4-keypad.c b/drivers/input/keyboard/omap4-keypad.c index 6639b2b8528a..f78c464899db 100644 --- a/drivers/input/keyboard/omap4-keypad.c +++ b/drivers/input/keyboard/omap4-keypad.c @@ -60,8 +60,18 @@
/* OMAP4 values */ #define OMAP4_VAL_IRQDISABLE 0x0 -#define OMAP4_VAL_DEBOUNCINGTIME 0x7 -#define OMAP4_VAL_PVT 0x7 + +/* + * Errata i689: If a key is released for a time shorter than debounce time, + * the keyboard will idle and never detect the key release. The workaround + * is to use at least a 12ms debounce time. See omap5432 TRM chapter + * "26.4.6.2 Keyboard Controller Timer" for more information. + */ +#define OMAP4_KEYPAD_PTV_DIV_128 0x6 +#define OMAP4_KEYPAD_DEBOUNCINGTIME_MS(dbms, ptv) \ + ((((dbms) * 1000) / ((1 << ((ptv) + 1)) * (1000000 / 32768))) - 1) +#define OMAP4_VAL_DEBOUNCINGTIME_16MS \ + OMAP4_KEYPAD_DEBOUNCINGTIME_MS(16, OMAP4_KEYPAD_PTV_DIV_128)
enum { KBD_REVISION_OMAP4 = 0, @@ -181,9 +191,9 @@ static int omap4_keypad_open(struct input_dev *input)
kbd_writel(keypad_data, OMAP4_KBD_CTRL, OMAP4_DEF_CTRL_NOSOFTMODE | - (OMAP4_VAL_PVT << OMAP4_DEF_CTRL_PTV_SHIFT)); + (OMAP4_KEYPAD_PTV_DIV_128 << OMAP4_DEF_CTRL_PTV_SHIFT)); kbd_writel(keypad_data, OMAP4_KBD_DEBOUNCINGTIME, - OMAP4_VAL_DEBOUNCINGTIME); + OMAP4_VAL_DEBOUNCINGTIME_16MS); /* clear pending interrupts */ kbd_write_irqreg(keypad_data, OMAP4_KBD_IRQSTATUS, kbd_read_irqreg(keypad_data, OMAP4_KBD_IRQSTATUS));
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit fd6f32f78645db32b6b95a42e45da2ddd6de0e67 ]
These devices support read zero after trim (RZAT), as they advertise to the OS. However, the OS doesn't believe the SSDs unless they are explicitly whitelisted.
Acked-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Juha-Matti Tilli juha-matti.tilli@iki.fi Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/ata/libata-core.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c index a166359ad5d4..35be49f5791d 100644 --- a/drivers/ata/libata-core.c +++ b/drivers/ata/libata-core.c @@ -4476,6 +4476,7 @@ static const struct ata_blacklist_entry ata_device_blacklist [] = { { "SSD*INTEL*", NULL, ATA_HORKAGE_ZERO_AFTER_TRIM, }, { "Samsung*SSD*", NULL, ATA_HORKAGE_ZERO_AFTER_TRIM, }, { "SAMSUNG*SSD*", NULL, ATA_HORKAGE_ZERO_AFTER_TRIM, }, + { "SAMSUNG*MZ7KM*", NULL, ATA_HORKAGE_ZERO_AFTER_TRIM, }, { "ST[1248][0248]0[FH]*", NULL, ATA_HORKAGE_ZERO_AFTER_TRIM, },
/*
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit a74515604a7b171f2702bdcbd1e231225fb456d0 ]
Disable hardware level MAC learning because it breaks station roaming. When enabled it drops all frames that arrive from a MAC address that is on a different port at learning table.
Signed-off-by: Anderson Luiz Alves alacn1@gmail.com Reviewed-by: Andrew Lunn andrew@lunn.ch Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/dsa/mv88e6060.c | 10 +++------- 1 file changed, 3 insertions(+), 7 deletions(-)
diff --git a/drivers/net/dsa/mv88e6060.c b/drivers/net/dsa/mv88e6060.c index 7ce36dbd9b62..a3607d083332 100644 --- a/drivers/net/dsa/mv88e6060.c +++ b/drivers/net/dsa/mv88e6060.c @@ -114,8 +114,7 @@ static int mv88e6060_switch_reset(struct dsa_switch *ds) /* Reset the switch. */ REG_WRITE(REG_GLOBAL, GLOBAL_ATU_CONTROL, GLOBAL_ATU_CONTROL_SWRESET | - GLOBAL_ATU_CONTROL_ATUSIZE_1024 | - GLOBAL_ATU_CONTROL_ATE_AGE_5MIN); + GLOBAL_ATU_CONTROL_LEARNDIS);
/* Wait up to one second for reset to complete. */ timeout = jiffies + 1 * HZ; @@ -140,13 +139,10 @@ static int mv88e6060_setup_global(struct dsa_switch *ds) */ REG_WRITE(REG_GLOBAL, GLOBAL_CONTROL, GLOBAL_CONTROL_MAX_FRAME_1536);
- /* Enable automatic address learning, set the address - * database size to 1024 entries, and set the default aging - * time to 5 minutes. + /* Disable automatic address learning. */ REG_WRITE(REG_GLOBAL, GLOBAL_ATU_CONTROL, - GLOBAL_ATU_CONTROL_ATUSIZE_1024 | - GLOBAL_ATU_CONTROL_ATE_AGE_5MIN); + GLOBAL_ATU_CONTROL_LEARNDIS);
return 0; }
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 1b603f9e4313348608f256b564ed6e3d9e67f377 ]
MLX4_EN depends on NETDEVICES, ETHERNET and INET Kconfigs. Make sure they are listed in MLX4_EN Kconfig dependencies.
This fixes the following build break:
drivers/net/ethernet/mellanox/mlx4/en_rx.c:582:18: warning: ‘struct iphdr’ declared inside parameter list [enabled by default] struct iphdr *iph) ^ drivers/net/ethernet/mellanox/mlx4/en_rx.c:582:18: warning: its scope is only this definition or declaration, which is probably not what you want [enabled by default] drivers/net/ethernet/mellanox/mlx4/en_rx.c: In function ‘get_fixed_ipv4_csum’: drivers/net/ethernet/mellanox/mlx4/en_rx.c:586:20: error: dereferencing pointer to incomplete type _u8 ipproto = iph->protocol;
Signed-off-by: Saeed Mahameed saeedm@mellanox.com Signed-off-by: Tariq Toukan tariqt@mellanox.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/mellanox/mlx4/Kconfig | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/mellanox/mlx4/Kconfig b/drivers/net/ethernet/mellanox/mlx4/Kconfig index 5098e7f21987..a0eb4e4bc525 100644 --- a/drivers/net/ethernet/mellanox/mlx4/Kconfig +++ b/drivers/net/ethernet/mellanox/mlx4/Kconfig @@ -5,7 +5,7 @@ config MLX4_EN tristate "Mellanox Technologies 1/10/40Gbit Ethernet support" depends on MAY_USE_DEVLINK - depends on PCI + depends on PCI && NETDEVICES && ETHERNET && INET select MLX4_CORE select PTP_1588_CLOCK ---help---
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit c3494801cd1785e2c25f1a5735fa19ddcf9665da ]
Malicious user space may try to force the verifier to use as much cpu time and memory as possible. Hence check for pending signals while verifying the program. Note that suspend of sys_bpf(PROG_LOAD) syscall will lead to EAGAIN, since the kernel has to release the resources used for program verification.
Reported-by: Anatoly Trosinenko anatoly.trosinenko@gmail.com Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Daniel Borkmann daniel@iogearbox.net Acked-by: Edward Cree ecree@solarflare.com Signed-off-by: Daniel Borkmann daniel@iogearbox.net Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/bpf/verifier.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 1438b7396cb4..335c00209f74 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -2919,6 +2919,9 @@ static int do_check(struct bpf_verifier_env *env) goto process_bpf_exit; }
+ if (signal_pending(current)) + return -EAGAIN; + if (need_resched()) cond_resched();
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit a1208f6a822ac29933e772ef1f637c5d67838da9 ]
This patch addresses possible memory corruption when v7_dma_inv_range(start_address, end_address) address parameters are not aligned to whole cache lines. This function issues "invalidate" cache management operations to all cache lines from start_address (inclusive) to end_address (exclusive). When start_address and/or end_address are not aligned, the start and/or end cache lines are first issued "clean & invalidate" operation. The assumption is this is done to ensure that any dirty data addresses outside the address range (but part of the first or last cache lines) are cleaned/flushed so that data is not lost, which could happen if just an invalidate is issued.
The problem is that these first/last partial cache lines are issued "clean & invalidate" and then "invalidate". This second "invalidate" is not required and worse can cause "lost" writes to addresses outside the address range but part of the cache line. If another component writes to its part of the cache line between the "clean & invalidate" and "invalidate" operations, the write can get lost. This fix is to remove the extra "invalidate" operation when unaligned addressed are used.
A kernel module is available that has a stress test to reproduce the issue and a unit test of the updated v7_dma_inv_range(). It can be downloaded from http://ftp.sageembedded.com/outgoing/linux/cache-test-20181107.tgz.
v7_dma_inv_range() is call by dmac_[un]map_area(addr, len, direction) when the direction is DMA_FROM_DEVICE. One can (I believe) successfully argue that DMA from a device to main memory should use buffers aligned to cache line size, because the "clean & invalidate" might overwrite data that the device just wrote using DMA. But if a driver does use unaligned buffers, at least this fix will prevent memory corruption outside the buffer.
Signed-off-by: Chris Cole chris@sageembedded.com Signed-off-by: Russell King rmk+kernel@armlinux.org.uk Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm/mm/cache-v7.S | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/arch/arm/mm/cache-v7.S b/arch/arm/mm/cache-v7.S index a134d8a13d00..11d699af30ed 100644 --- a/arch/arm/mm/cache-v7.S +++ b/arch/arm/mm/cache-v7.S @@ -359,14 +359,16 @@ v7_dma_inv_range: ALT_UP(W(nop)) #endif mcrne p15, 0, r0, c7, c14, 1 @ clean & invalidate D / U line + addne r0, r0, r2
tst r1, r3 bic r1, r1, r3 mcrne p15, 0, r1, c7, c14, 1 @ clean & invalidate D / U line -1: - mcr p15, 0, r0, c7, c6, 1 @ invalidate D / U line - add r0, r0, r2 cmp r0, r1 +1: + mcrlo p15, 0, r0, c7, c6, 1 @ invalidate D / U line + addlo r0, r0, r2 + cmplo r0, r1 blo 1b dsb st ret lr
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 3d0358d0ba048c5afb1385787aaec8fa5ad78fcc ]
Chris has discovered and reported that v7_dma_inv_range() may corrupt memory if address range is not aligned to cache line size.
Since the whole cache-v7m.S was lifted form cache-v7.S the same observation applies to v7m_dma_inv_range(). So the fix just mirrors what has been done for v7 with a little specific of M-class.
Cc: Chris Cole chris@sageembedded.com Signed-off-by: Vladimir Murzin vladimir.murzin@arm.com Signed-off-by: Russell King rmk+kernel@armlinux.org.uk Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm/mm/cache-v7m.S | 14 +++++++++----- 1 file changed, 9 insertions(+), 5 deletions(-)
diff --git a/arch/arm/mm/cache-v7m.S b/arch/arm/mm/cache-v7m.S index 816a7e44e6f1..d29927740a19 100644 --- a/arch/arm/mm/cache-v7m.S +++ b/arch/arm/mm/cache-v7m.S @@ -73,9 +73,11 @@ /* * dcimvac: Invalidate data cache line by MVA to PoC */ -.macro dcimvac, rt, tmp - v7m_cacheop \rt, \tmp, V7M_SCB_DCIMVAC +.irp c,,eq,ne,cs,cc,mi,pl,vs,vc,hi,ls,ge,lt,gt,le,hs,lo +.macro dcimvac\c, rt, tmp + v7m_cacheop \rt, \tmp, V7M_SCB_DCIMVAC, \c .endm +.endr
/* * dccmvau: Clean data cache line by MVA to PoU @@ -369,14 +371,16 @@ v7m_dma_inv_range: tst r0, r3 bic r0, r0, r3 dccimvacne r0, r3 + addne r0, r0, r2 subne r3, r2, #1 @ restore r3, corrupted by v7m's dccimvac tst r1, r3 bic r1, r1, r3 dccimvacne r1, r3 -1: - dcimvac r0, r3 - add r0, r0, r2 cmp r0, r1 +1: + dcimvaclo r0, r3 + addlo r0, r0, r2 + cmplo r0, r1 blo 1b dsb st ret lr
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit ecb239d96d369c23c33d41708646df646de669f4 ]
After getting a reference to the platform device's of_node the probe function ends up calling of_find_matching_node() using the node as an argument. The function takes care of decreasing the refcount on it. We are then incorrectly decreasing the refcount on that node again.
This patch removes the unwarranted call to of_node_put().
Fixes: 414fd46e7762 ("fsl/fman: Add FMan support") Signed-off-by: Nicolas Saenz Julienne nsaenzjulienne@suse.de Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/freescale/fman/fman.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/drivers/net/ethernet/freescale/fman/fman.c b/drivers/net/ethernet/freescale/fman/fman.c index dafd9e1baba2..380c4a2f6516 100644 --- a/drivers/net/ethernet/freescale/fman/fman.c +++ b/drivers/net/ethernet/freescale/fman/fman.c @@ -2817,7 +2817,7 @@ static struct fman *read_dts_node(struct platform_device *of_dev) if (!muram_node) { dev_err(&of_dev->dev, "%s: could not find MURAM node\n", __func__); - goto fman_node_put; + goto fman_free; }
err = of_address_to_resource(muram_node, 0, @@ -2826,11 +2826,10 @@ static struct fman *read_dts_node(struct platform_device *of_dev) of_node_put(muram_node); dev_err(&of_dev->dev, "%s: of_address_to_resource() = %d\n", __func__, err); - goto fman_node_put; + goto fman_free; }
of_node_put(muram_node); - of_node_put(fm_node);
err = devm_request_irq(&of_dev->dev, irq, fman_irq, 0, "fman", fman); if (err < 0) {
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit e594a5e349ddbfdaca1951bb3f8d72f3f1660d73 ]
When unloading the ast driver, a warning message is printed by drm_mode_config_cleanup() because a reference is still held to one of the drm_connector structs.
Correct this by calling drm_crtc_force_disable_all() in ast_fbdev_destroy().
Signed-off-by: Sam Bobroff sbobroff@linux.ibm.com Reviewed-by: Daniel Vetter daniel.vetter@ffwll.ch Signed-off-by: Dave Airlie airlied@redhat.com Link: https://patchwork.freedesktop.org/patch/msgid/1e613f3c630c7bbc72e04a44b17825... Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/ast/ast_fb.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/gpu/drm/ast/ast_fb.c b/drivers/gpu/drm/ast/ast_fb.c index 7a86e24e2687..5e0d3e561b04 100644 --- a/drivers/gpu/drm/ast/ast_fb.c +++ b/drivers/gpu/drm/ast/ast_fb.c @@ -286,6 +286,7 @@ static void ast_fbdev_destroy(struct drm_device *dev, { struct ast_framebuffer *afb = &afbdev->afb;
+ drm_crtc_force_disable_all(dev); drm_fb_helper_unregister_fbi(&afbdev->helper); drm_fb_helper_release_fbi(&afbdev->helper);
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 6e785302dad32228819d8066e5376acd15d0e6ba ]
Missing a dependency. Shouldn't show cifs posix extensions in Kconfig if CONFIG_CIFS_ALLOW_INSECURE_DIALECTS (ie SMB1 protocol) is disabled.
Signed-off-by: Steve French stfrench@microsoft.com Reviewed-by: Pavel Shilovsky pshilov@microsoft.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/cifs/Kconfig | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/cifs/Kconfig b/fs/cifs/Kconfig index e7b478b49985..8bef27b8f85d 100644 --- a/fs/cifs/Kconfig +++ b/fs/cifs/Kconfig @@ -111,7 +111,7 @@ config CIFS_XATTR
config CIFS_POSIX bool "CIFS POSIX Extensions" - depends on CIFS_XATTR + depends on CIFS && CIFS_ALLOW_INSECURE_LEGACY && CIFS_XATTR help Enabling this option will cause the cifs client to attempt to negotiate a newer dialect with servers, such as Samba 3.0.5
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit c38f57da428b033f2721b611d84b1f40bde674a8 ]
If a local process has closed a connected socket and hasn't received a RST packet yet, then the socket remains in the table until a timeout expires.
When a vhost_vsock instance is released with the timeout still pending, the socket is never freed because vhost_vsock has already set the SOCK_DONE flag.
Check if the close timer is pending and let it close the socket. This prevents the race which can leak sockets.
Reported-by: Maximilian Riemensberger riemensberger@cadami.net Cc: Graham Whaley graham.whaley@gmail.com Signed-off-by: Stefan Hajnoczi stefanha@redhat.com Signed-off-by: Michael S. Tsirkin mst@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/vhost/vsock.c | 22 +++++++++++++++------- 1 file changed, 15 insertions(+), 7 deletions(-)
diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c index f800f89068db..46f966d7c328 100644 --- a/drivers/vhost/vsock.c +++ b/drivers/vhost/vsock.c @@ -559,13 +559,21 @@ static void vhost_vsock_reset_orphans(struct sock *sk) * executing. */
- if (!vhost_vsock_get(vsk->remote_addr.svm_cid)) { - sock_set_flag(sk, SOCK_DONE); - vsk->peer_shutdown = SHUTDOWN_MASK; - sk->sk_state = SS_UNCONNECTED; - sk->sk_err = ECONNRESET; - sk->sk_error_report(sk); - } + /* If the peer is still valid, no need to reset connection */ + if (vhost_vsock_get(vsk->remote_addr.svm_cid)) + return; + + /* If the close timeout is pending, let it expire. This avoids races + * with the timeout callback. + */ + if (vsk->close_work_scheduled) + return; + + sock_set_flag(sk, SOCK_DONE); + vsk->peer_shutdown = SHUTDOWN_MASK; + sk->sk_state = SS_UNCONNECTED; + sk->sk_err = ECONNRESET; + sk->sk_error_report(sk); }
static int vhost_vsock_dev_release(struct inode *inode, struct file *file)
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 6c7f25cae54b840302e4f1b371dbf318fbf09ab2 ]
According to Intel (R) Axxia TM Lionfish Communication Processor Peripheral Subsystem Hardware Reference Manual, the AXXIA I2C module have a programmable Master Wait Timer, which among others, checks the time between commands send in manual mode. When a timeout (25ms) passes, TSS bit is set in Master Interrupt Status register and a Stop command is issued by the hardware.
The axxia_i2c_xfer(), does not properly handle this situation, however. For each message a separate axxia_i2c_xfer_msg() is called and this function incorrectly assumes that any interrupt might happen only when waiting for completion. This is mostly correct but there is one exception - a master timeout can trigger if enough time has passed between individual transfers. It will, by definition, happen between transfers when the interrupts are disabled by the code. If that happens, the hardware issues Stop command.
The interrupt indicating timeout will not be triggered as soon as we enable them since the Master Interrupt Status is cleared when master mode is entered again (which happens before enabling irqs) meaning this error is lost and the transfer is continued even though the Stop was issued on the bus. The subsequent operations completes without error but a bogus value (0xFF in case of read) is read as the client device is confused because aborted transfer. No error is returned from master_xfer() making caller believe that a valid value was read.
To fix the problem, the TSS bit (indicating timeout) in Master Interrupt Status register is checked before each transfer. If it is set, there was a timeout before this transfer and (as described above) the hardware already issued Stop command so the transaction should be aborted thus -ETIMEOUT is returned from the master_xfer() callback. In order to be sure no timeout was issued we can't just read the status just before starting new transaction as there will always be a small window of time (few CPU cycles at best) where this might still happen. For this reason we have to temporally disable the timer before checking for TSS bit. Disabling it will, however, clear the TSS bit so in order to preserve that information, we have to read it in ISR so we have to ensure that the TSS interrupt is not masked between transfers of one transaction. There is no need to call bus recovery or controller reinitialization if that happens so it's skipped.
Signed-off-by: Krzysztof Adamski krzysztof.adamski@nokia.com Reviewed-by: Alexander Sverdlin alexander.sverdlin@nokia.com Signed-off-by: Wolfram Sang wsa@the-dreams.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/i2c/busses/i2c-axxia.c | 40 ++++++++++++++++++++++++---------- 1 file changed, 29 insertions(+), 11 deletions(-)
diff --git a/drivers/i2c/busses/i2c-axxia.c b/drivers/i2c/busses/i2c-axxia.c index 4351a9343058..96a6d5df9b26 100644 --- a/drivers/i2c/busses/i2c-axxia.c +++ b/drivers/i2c/busses/i2c-axxia.c @@ -74,8 +74,7 @@ MST_STATUS_ND) #define MST_STATUS_ERR (MST_STATUS_NAK | \ MST_STATUS_AL | \ - MST_STATUS_IP | \ - MST_STATUS_TSS) + MST_STATUS_IP) #define MST_TX_BYTES_XFRD 0x50 #define MST_RX_BYTES_XFRD 0x54 #define SCL_HIGH_PERIOD 0x80 @@ -241,7 +240,7 @@ static int axxia_i2c_empty_rx_fifo(struct axxia_i2c_dev *idev) */ if (c <= 0 || c > I2C_SMBUS_BLOCK_MAX) { idev->msg_err = -EPROTO; - i2c_int_disable(idev, ~0); + i2c_int_disable(idev, ~MST_STATUS_TSS); complete(&idev->msg_complete); break; } @@ -299,14 +298,19 @@ static irqreturn_t axxia_i2c_isr(int irq, void *_dev)
if (status & MST_STATUS_SCC) { /* Stop completed */ - i2c_int_disable(idev, ~0); + i2c_int_disable(idev, ~MST_STATUS_TSS); complete(&idev->msg_complete); } else if (status & MST_STATUS_SNS) { /* Transfer done */ - i2c_int_disable(idev, ~0); + i2c_int_disable(idev, ~MST_STATUS_TSS); if (i2c_m_rd(idev->msg) && idev->msg_xfrd < idev->msg->len) axxia_i2c_empty_rx_fifo(idev); complete(&idev->msg_complete); + } else if (status & MST_STATUS_TSS) { + /* Transfer timeout */ + idev->msg_err = -ETIMEDOUT; + i2c_int_disable(idev, ~MST_STATUS_TSS); + complete(&idev->msg_complete); } else if (unlikely(status & MST_STATUS_ERR)) { /* Transfer error */ i2c_int_disable(idev, ~0); @@ -339,10 +343,10 @@ static int axxia_i2c_xfer_msg(struct axxia_i2c_dev *idev, struct i2c_msg *msg) u32 rx_xfer, tx_xfer; u32 addr_1, addr_2; unsigned long time_left; + unsigned int wt_value;
idev->msg = msg; idev->msg_xfrd = 0; - idev->msg_err = 0; reinit_completion(&idev->msg_complete);
if (i2c_m_ten(msg)) { @@ -382,9 +386,18 @@ static int axxia_i2c_xfer_msg(struct axxia_i2c_dev *idev, struct i2c_msg *msg) else if (axxia_i2c_fill_tx_fifo(idev) != 0) int_mask |= MST_STATUS_TFL;
+ wt_value = WT_VALUE(readl(idev->base + WAIT_TIMER_CONTROL)); + /* Disable wait timer temporarly */ + writel(wt_value, idev->base + WAIT_TIMER_CONTROL); + /* Check if timeout error happened */ + if (idev->msg_err) + goto out; + /* Start manual mode */ writel(CMD_MANUAL, idev->base + MST_COMMAND);
+ writel(WT_EN | wt_value, idev->base + WAIT_TIMER_CONTROL); + i2c_int_enable(idev, int_mask);
time_left = wait_for_completion_timeout(&idev->msg_complete, @@ -395,13 +408,15 @@ static int axxia_i2c_xfer_msg(struct axxia_i2c_dev *idev, struct i2c_msg *msg) if (readl(idev->base + MST_COMMAND) & CMD_BUSY) dev_warn(idev->dev, "busy after xfer\n");
- if (time_left == 0) + if (time_left == 0) { idev->msg_err = -ETIMEDOUT; - - if (idev->msg_err == -ETIMEDOUT) i2c_recover_bus(&idev->adapter); + axxia_i2c_init(idev); + }
- if (unlikely(idev->msg_err) && idev->msg_err != -ENXIO) +out: + if (unlikely(idev->msg_err) && idev->msg_err != -ENXIO && + idev->msg_err != -ETIMEDOUT) axxia_i2c_init(idev);
return idev->msg_err; @@ -409,7 +424,7 @@ static int axxia_i2c_xfer_msg(struct axxia_i2c_dev *idev, struct i2c_msg *msg)
static int axxia_i2c_stop(struct axxia_i2c_dev *idev) { - u32 int_mask = MST_STATUS_ERR | MST_STATUS_SCC; + u32 int_mask = MST_STATUS_ERR | MST_STATUS_SCC | MST_STATUS_TSS; unsigned long time_left;
reinit_completion(&idev->msg_complete); @@ -436,6 +451,9 @@ axxia_i2c_xfer(struct i2c_adapter *adap, struct i2c_msg msgs[], int num) int i; int ret = 0;
+ idev->msg_err = 0; + i2c_int_enable(idev, MST_STATUS_TSS); + for (i = 0; ret == 0 && i < num; ++i) ret = axxia_i2c_xfer_msg(idev, &msgs[i]);
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 0544ee4b1ad574aec3b6379af5f5cdee42840971 ]
Some AMD based HP laptops have a SMB0001 ACPI device node which does not define any methods.
This leads to the following error in dmesg:
[ 5.222731] cmi: probe of SMB0001:00 failed with error -5
This commit makes acpi_smbus_cmi_add() return -ENODEV instead in this case silencing the error. In case of a failure of the i2c_add_adapter() call this commit now propagates the error from that call instead of -EIO.
Signed-off-by: Hans de Goede hdegoede@redhat.com Signed-off-by: Wolfram Sang wsa@the-dreams.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/i2c/busses/i2c-scmi.c | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-)
diff --git a/drivers/i2c/busses/i2c-scmi.c b/drivers/i2c/busses/i2c-scmi.c index efefcfa24a4c..d2178f701b41 100644 --- a/drivers/i2c/busses/i2c-scmi.c +++ b/drivers/i2c/busses/i2c-scmi.c @@ -364,6 +364,7 @@ static int acpi_smbus_cmi_add(struct acpi_device *device) { struct acpi_smbus_cmi *smbus_cmi; const struct acpi_device_id *id; + int ret;
smbus_cmi = kzalloc(sizeof(struct acpi_smbus_cmi), GFP_KERNEL); if (!smbus_cmi) @@ -385,8 +386,10 @@ static int acpi_smbus_cmi_add(struct acpi_device *device) acpi_walk_namespace(ACPI_TYPE_METHOD, smbus_cmi->handle, 1, acpi_smbus_cmi_query_methods, NULL, smbus_cmi, NULL);
- if (smbus_cmi->cap_info == 0) + if (smbus_cmi->cap_info == 0) { + ret = -ENODEV; goto err; + }
snprintf(smbus_cmi->adapter.name, sizeof(smbus_cmi->adapter.name), "SMBus CMI adapter %s", @@ -397,7 +400,8 @@ static int acpi_smbus_cmi_add(struct acpi_device *device) smbus_cmi->adapter.class = I2C_CLASS_HWMON | I2C_CLASS_SPD; smbus_cmi->adapter.dev.parent = &device->dev;
- if (i2c_add_adapter(&smbus_cmi->adapter)) { + ret = i2c_add_adapter(&smbus_cmi->adapter); + if (ret) { dev_err(&device->dev, "Couldn't register adapter!\n"); goto err; } @@ -407,7 +411,7 @@ static int acpi_smbus_cmi_add(struct acpi_device *device) err: kfree(smbus_cmi); device->driver_data = NULL; - return -EIO; + return ret; }
static int acpi_smbus_cmi_remove(struct acpi_device *device)
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit d7dcdf9d4e15189ecfda24cc87339a3425448d5c ]
nvmet_rdma_release_rsp() may free the response before using it at error flow.
Fixes: 8407879 ("nvmet-rdma: fix possible bogus dereference under heavy load") Signed-off-by: Israel Rukshin israelr@mellanox.com Reviewed-by: Sagi Grimberg sagi@grimberg.me Reviewed-by: Max Gurtovoy maxg@mellanox.com Signed-off-by: Christoph Hellwig hch@lst.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/nvme/target/rdma.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/nvme/target/rdma.c b/drivers/nvme/target/rdma.c index 2dfd877974d7..486393fa4f3e 100644 --- a/drivers/nvme/target/rdma.c +++ b/drivers/nvme/target/rdma.c @@ -524,6 +524,7 @@ static void nvmet_rdma_send_done(struct ib_cq *cq, struct ib_wc *wc) { struct nvmet_rdma_rsp *rsp = container_of(wc->wr_cqe, struct nvmet_rdma_rsp, send_cqe); + struct nvmet_rdma_queue *queue = cq->cq_context;
nvmet_rdma_release_rsp(rsp);
@@ -531,7 +532,7 @@ static void nvmet_rdma_send_done(struct ib_cq *cq, struct ib_wc *wc) wc->status != IB_WC_WR_FLUSH_ERR)) { pr_err("SEND for CQE 0x%p failed with status %s (%d).\n", wc->wr_cqe, ib_wc_status_msg(wc->status), wc->status); - nvmet_rdma_error_comp(rsp->queue); + nvmet_rdma_error_comp(queue); } }
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 7bb633b1a9812a6b9f3e49d0cf17f60a633914e5 ]
The clear of the LPTA_EN flag should be synced before writing to the alarm register. Omitting this synchronization creates a race when trying to change existing alarm.
Signed-off-by: Guy Shapiro guy.shapiro@mobi-wize.com Signed-off-by: Alexandre Belloni alexandre.belloni@free-electrons.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/rtc/rtc-snvs.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/rtc/rtc-snvs.c b/drivers/rtc/rtc-snvs.c index 3e8fd33c2576..9af591d5223c 100644 --- a/drivers/rtc/rtc-snvs.c +++ b/drivers/rtc/rtc-snvs.c @@ -187,6 +187,7 @@ static int snvs_rtc_set_alarm(struct device *dev, struct rtc_wkalrm *alrm) rtc_tm_to_time(alrm_tm, &time);
regmap_update_bits(data->regmap, data->offset + SNVS_LPCR, SNVS_LPCR_LPTA_EN, 0); + rtc_write_sync_lp(data); regmap_write(data->regmap, data->offset + SNVS_LPTAR, time);
/* Clear alarm interrupt status bit */
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit cd7f3a249dbed2858e6c2f30e5be7f1f7a709ee2 ]
In order to read correctly from asynchronously updated RTC registers, it's necessary to read repeatedly until their values do not change from read to read. It's also necessary to wait for three RTC clock ticks for certain operations. There are no timeouts in this code and these operations could possibly loop forever.
To avoid kernel hangs, put in timeouts.
The iMX7d can be configured to stop the SRTC on a tamper event, which will lockup the kernel inside this driver as described above.
These hangs can happen when running under qemu, which doesn't emulate the SNVS RTC, though currently the driver will refuse to load on qemu due to a timeout in the driver probe method.
It could also happen if the SRTC block where somehow placed into reset or the slow speed clock that drives the SRTC counter (but not the CPU) were to stop.
The symptoms on a two core iMX7d are a work queue hang on rtc_timer_do_work(), which eventually blocks a systemd fsnotify operation that triggers a work queue flush, causing systemd to hang and thus causing all services that should be started by systemd, like a console getty, to fail to start or stop.
Also optimize the wait code to wait less. It only needs to wait for the clock to advance three ticks, not to see it change three times.
Cc: Alexandre Belloni alexandre.belloni@free-electrons.com Cc: Alessandro Zummo a.zummo@towertech.it Cc: Fabio Estevam fabio.estevam@nxp.com Cc: Shawn Guo shawn.guo@linaro.org Cc: Bryan O'Donoghue pure.logic@nexus-software.ie Signed-off-by: Trent Piepho tpiepho@impinj.com Signed-off-by: Alexandre Belloni alexandre.belloni@bootlin.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/rtc/rtc-snvs.c | 105 +++++++++++++++++++++++++++-------------- 1 file changed, 70 insertions(+), 35 deletions(-)
diff --git a/drivers/rtc/rtc-snvs.c b/drivers/rtc/rtc-snvs.c index 9af591d5223c..71eee39520f0 100644 --- a/drivers/rtc/rtc-snvs.c +++ b/drivers/rtc/rtc-snvs.c @@ -47,49 +47,83 @@ struct snvs_rtc_data { struct clk *clk; };
+/* Read 64 bit timer register, which could be in inconsistent state */ +static u64 rtc_read_lpsrt(struct snvs_rtc_data *data) +{ + u32 msb, lsb; + + regmap_read(data->regmap, data->offset + SNVS_LPSRTCMR, &msb); + regmap_read(data->regmap, data->offset + SNVS_LPSRTCLR, &lsb); + return (u64)msb << 32 | lsb; +} + +/* Read the secure real time counter, taking care to deal with the cases of the + * counter updating while being read. + */ static u32 rtc_read_lp_counter(struct snvs_rtc_data *data) { u64 read1, read2; - u32 val; + unsigned int timeout = 100;
+ /* As expected, the registers might update between the read of the LSB + * reg and the MSB reg. It's also possible that one register might be + * in partially modified state as well. + */ + read1 = rtc_read_lpsrt(data); do { - regmap_read(data->regmap, data->offset + SNVS_LPSRTCMR, &val); - read1 = val; - read1 <<= 32; - regmap_read(data->regmap, data->offset + SNVS_LPSRTCLR, &val); - read1 |= val; - - regmap_read(data->regmap, data->offset + SNVS_LPSRTCMR, &val); - read2 = val; - read2 <<= 32; - regmap_read(data->regmap, data->offset + SNVS_LPSRTCLR, &val); - read2 |= val; - } while (read1 != read2); + read2 = read1; + read1 = rtc_read_lpsrt(data); + } while (read1 != read2 && --timeout); + if (!timeout) + dev_err(&data->rtc->dev, "Timeout trying to get valid LPSRT Counter read\n");
/* Convert 47-bit counter to 32-bit raw second count */ return (u32) (read1 >> CNTR_TO_SECS_SH); }
-static void rtc_write_sync_lp(struct snvs_rtc_data *data) +/* Just read the lsb from the counter, dealing with inconsistent state */ +static int rtc_read_lp_counter_lsb(struct snvs_rtc_data *data, u32 *lsb) { - u32 count1, count2, count3; - int i; - - /* Wait for 3 CKIL cycles */ - for (i = 0; i < 3; i++) { - do { - regmap_read(data->regmap, data->offset + SNVS_LPSRTCLR, &count1); - regmap_read(data->regmap, data->offset + SNVS_LPSRTCLR, &count2); - } while (count1 != count2); - - /* Now wait until counter value changes */ - do { - do { - regmap_read(data->regmap, data->offset + SNVS_LPSRTCLR, &count2); - regmap_read(data->regmap, data->offset + SNVS_LPSRTCLR, &count3); - } while (count2 != count3); - } while (count3 == count1); + u32 count1, count2; + unsigned int timeout = 100; + + regmap_read(data->regmap, data->offset + SNVS_LPSRTCLR, &count1); + do { + count2 = count1; + regmap_read(data->regmap, data->offset + SNVS_LPSRTCLR, &count1); + } while (count1 != count2 && --timeout); + if (!timeout) { + dev_err(&data->rtc->dev, "Timeout trying to get valid LPSRT Counter read\n"); + return -ETIMEDOUT; } + + *lsb = count1; + return 0; +} + +static int rtc_write_sync_lp(struct snvs_rtc_data *data) +{ + u32 count1, count2; + u32 elapsed; + unsigned int timeout = 1000; + int ret; + + ret = rtc_read_lp_counter_lsb(data, &count1); + if (ret) + return ret; + + /* Wait for 3 CKIL cycles, about 61.0-91.5 µs */ + do { + ret = rtc_read_lp_counter_lsb(data, &count2); + if (ret) + return ret; + elapsed = count2 - count1; /* wrap around _is_ handled! */ + } while (elapsed < 3 && --timeout); + if (!timeout) { + dev_err(&data->rtc->dev, "Timeout waiting for LPSRT Counter to change\n"); + return -ETIMEDOUT; + } + return 0; }
static int snvs_rtc_enable(struct snvs_rtc_data *data, bool enable) @@ -173,9 +207,7 @@ static int snvs_rtc_alarm_irq_enable(struct device *dev, unsigned int enable) (SNVS_LPCR_LPTA_EN | SNVS_LPCR_LPWUI_EN), enable ? (SNVS_LPCR_LPTA_EN | SNVS_LPCR_LPWUI_EN) : 0);
- rtc_write_sync_lp(data); - - return 0; + return rtc_write_sync_lp(data); }
static int snvs_rtc_set_alarm(struct device *dev, struct rtc_wkalrm *alrm) @@ -183,11 +215,14 @@ static int snvs_rtc_set_alarm(struct device *dev, struct rtc_wkalrm *alrm) struct snvs_rtc_data *data = dev_get_drvdata(dev); struct rtc_time *alrm_tm = &alrm->time; unsigned long time; + int ret;
rtc_tm_to_time(alrm_tm, &time);
regmap_update_bits(data->regmap, data->offset + SNVS_LPCR, SNVS_LPCR_LPTA_EN, 0); - rtc_write_sync_lp(data); + ret = rtc_write_sync_lp(data); + if (ret) + return ret; regmap_write(data->regmap, data->offset + SNVS_LPTAR, time);
/* Clear alarm interrupt status bit */
On Thu, 20 Dec 2018 at 14:53, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 4.9.147 release. There are 61 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sat Dec 22 08:58:31 UTC 2018. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.9.147-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.9.y and the diffstat can be found below.
thanks,
greg k-h
Results from Linaro’s test farm. No regressions on arm64, arm, x86_64, and i386.
Summary ------------------------------------------------------------------------
kernel: 4.9.147-rc1 git repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git git branch: linux-4.9.y git commit: 2758b137b9ed759bd1827db35c31571beab263be git describe: v4.9.146-62-g2758b137b9ed Test details: https://qa-reports.linaro.org/lkft/linux-stable-rc-4.9-oe/build/v4.9.146-62-...
No regressions (compared to build v4.9.146)
No fixes (compared to build v4.9.146)
Ran 21452 total tests in the following environments and test suites.
Environments -------------- - dragonboard-410c - arm64 - hi6220-hikey - arm64 - i386 - juno-r2 - arm64 - qemu_arm - qemu_arm64 - qemu_i386 - qemu_x86_64 - x15 - arm - x86_64
Test Suites ----------- * boot * install-android-platform-tools-r2600 * kselftest * libhugetlbfs * ltp-cap_bounds-tests * ltp-containers-tests * ltp-cpuhotplug-tests * ltp-cve-tests * ltp-fcntl-locktests-tests * ltp-filecaps-tests * ltp-fs-tests * ltp-fs_bind-tests * ltp-fs_perms_simple-tests * ltp-fsx-tests * ltp-hugetlb-tests * ltp-io-tests * ltp-ipc-tests * ltp-math-tests * ltp-nptl-tests * ltp-pty-tests * ltp-sched-tests * ltp-securebits-tests * ltp-syscalls-tests * ltp-timers-tests * spectre-meltdown-checker-test * ltp-open-posix-tests * kselftest-vsyscall-mode-native * kselftest-vsyscall-mode-none
On Thu, Dec 20, 2018 at 10:18:00AM +0100, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 4.9.147 release. There are 61 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sat Dec 22 08:58:31 UTC 2018. Anything received after that time might be too late.
Build results: total: 171 pass: 171 fail: 0 Qemu test results: total: 301 pass: 301 fail: 0
Details are available at https://kerneltests.org/builders/.
Guenter
On 12/20/18 2:18 AM, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 4.9.147 release. There are 61 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sat Dec 22 08:58:31 UTC 2018. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.9.147-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.9.y and the diffstat can be found below.
thanks,
greg k-h
Compiled and booted on my test system. No dmesg regressions.
thanks, -- Shuah
On 20/12/2018 09:18, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 4.9.147 release. There are 61 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sat Dec 22 08:58:31 UTC 2018. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.9.147-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.9.y and the diffstat can be found below.
thanks,
greg k-h
All tests passing for Tegra ...
Test results for stable-v4.9: 8 builds: 8 pass, 0 fail 16 boots: 16 pass, 0 fail 14 tests: 14 pass, 0 fail
Linux version: 4.9.147-rc1-g2758b13 Boards tested: tegra124-jetson-tk1, tegra20-ventana, tegra210-p2371-2180, tegra30-cardhu-a04
Cheers Jon
linux-stable-mirror@lists.linaro.org