This is the start of the stable review cycle for the 4.14.89 release. There are 89 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sun Dec 16 11:57:01 UTC 2018. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.14.89-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.14.y and the diffstat can be found below.
thanks,
greg k-h
------------- Pseudo-Shortlog of commits:
Greg Kroah-Hartman gregkh@linuxfoundation.org Linux 4.14.89-rc1
Piotr Stankiewicz piotr.stankiewicz@intel.com IB/hfi1: Fix an out-of-bounds access in get_hw_stats
Kailang Yang kailang@realtek.com ALSA: hda/realtek - Fixed headphone issue for ALC700
Takashi Sakamoto o-takashi@sakamocchi.jp ALSA: fireface: fix reference to wrong register for clock configuration
Guenter Roeck linux@roeck-us.net staging: speakup: Replace strncpy with memcpy
Tigran Mkrtchyan tigran.mkrtchyan@desy.de flexfiles: enforce per-mirror stateid only for v4 DSes
Davidlohr Bueso dave@stgolabs.net lib/rbtree-test: lower default params
Petr Mladek pmladek@suse.com printk: Wake klogd when passing console_lock owner
Sergey Senozhatsky sergey.senozhatsky.work@gmail.com printk: Never set console_may_schedule in console_trylock()
Petr Mladek pmladek@suse.com printk: Hide console waiter logic into helpers
Steven Rostedt (VMware) rostedt@goodmis.org printk: Add console owner and waiter logic to load balance console writes
Sasha Levin sashal@kernel.org Revert "printk: Never set console_may_schedule in console_trylock()"
Pan Bian bianpan2016@163.com ocfs2: fix potential use after free
Qian Cai cai@gmx.us debugobjects: avoid recursive calls with kmemleak
Pan Bian bianpan2016@163.com hfsplus: do not free node before using
Pan Bian bianpan2016@163.com hfs: do not free node before using
Wei Yang richard.weiyang@gmail.com mm/page_alloc.c: fix calculation of pgdat->nr_zones
Larry Chen lchen@suse.com ocfs2: fix deadlock caused by ocfs2_defrag_extent()
Lorenzo Pieralisi lorenzo.pieralisi@arm.com ACPI/IORT: Fix iort_get_platform_device_domain() uninitialized pointer value
Sagi Grimberg sagi@grimberg.me nvme: flush namespace scanning work just before removing namespaces
Colin Ian King colin.king@canonical.com fscache, cachefiles: remove redundant variable 'cache'
NeilBrown neilb@suse.com fscache: fix race between enablement and dropping of object
Kees Cook keescook@chromium.org pstore/ram: Correctly calculate usable PRZ bytes
Igor Druzhinin igor.druzhinin@citrix.com Revert "xen/balloon: Mark unallocated host memory as UNUSABLE"
Srikanth Boddepalli boddepalli.srikanth@gmail.com xen: xlate_mmu: add missing header to fix 'W=1' warning
Y.C. Chen yc_chen@aspeedtech.com drm/ast: fixed reading monitor EDID not stable issue
shaoyunl shaoyun.liu@amd.com drm/amdgpu: Add delay after enable RLC ucode
Pan Bian bianpan2016@163.com net: hisilicon: remove unexpected free_netdev
Josh Elsasser jelsasser@appneta.com ixgbe: recognize 1000BaseLX SFP modules as 1Gbps
Yunjian Wang wangyunjian@huawei.com igb: fix uninitialized variables
Kiran Kumar Modukuri kiran.modukuri@gmail.com cachefiles: Fix page leak in cachefiles_read_backing_file while vmscan is active
Taehee Yoo ap420073@gmail.com netfilter: nf_tables: deactivate expressions in rule replecement routine
Marek Szyprowski m.szyprowski@samsung.com usb: gadget: u_ether: fix unsafe list iteration
Lorenzo Bianconi lorenzo.bianconi@redhat.com net: thunderx: fix NULL pointer dereference in nic_remove
Yi Wang wang.yi59@zte.com.cn x86/kvm/vmx: fix old-style function declaration
Yi Wang wang.yi59@zte.com.cn KVM: x86: fix empty-body warnings
Artemy Kovalyov artemyko@mellanox.com IB/mlx5: Fix page fault handling for MW
Alin Nastac alin.nastac@gmail.com netfilter: ipv6: Preserve link scope traffic original oif
Christian Hewitt christianshewitt@gmail.com drm/meson: add support for 1080p25 mode
Aaro Koskinen aaro.koskinen@iki.fi USB: omap_udc: fix rejection of out transfers when DMA is used
Aaro Koskinen aaro.koskinen@iki.fi USB: omap_udc: fix USB gadget functionality on Palm Tungsten E
Aaro Koskinen aaro.koskinen@iki.fi USB: omap_udc: fix omap_udc_start() on 15xx machines
Aaro Koskinen aaro.koskinen@iki.fi USB: omap_udc: fix crashes on probe error and module removal
Aaro Koskinen aaro.koskinen@iki.fi USB: omap_udc: use devm_request_irq()
Xin Long lucien.xin@gmail.com ipvs: call ip_vs_dst_notifier earlier than ipv6_dev_notf
Martynas Pumputis m@lambda.lt bpf: fix check of allowed specifiers in bpf_trace_printk
Pan Bian bianpan2016@163.com exportfs: do not read dentry after free
Peter Ujfalusi peter.ujfalusi@ti.com ASoC: omap-dmic: Add pm_qos handling to avoid overruns with CPU_IDLE
Peter Ujfalusi peter.ujfalusi@ti.com ASoC: omap-mcpdm: Add pm_qos handling to avoid under/overruns with CPU_IDLE
Peter Ujfalusi peter.ujfalusi@ti.com ASoC: omap-mcbsp: Fix latency value calculation for pm_qos
Kamal Heib kamalheib1@gmail.com RDMA/rdmavt: Fix rvt_create_ah function signature
Majd Dibbiny majd@mellanox.com RDMA/mlx5: Fix fence type for IB_WR_LOCAL_INV WR
Robbie Ko robbieko@synology.com Btrfs: send, fix infinite loop due to directory rename dependencies
Romain Izard romain.izard.pro@gmail.com ARM: dts: at91: sama5d2: use the divided clock for SMC
Artem Savkov asavkov@redhat.com objtool: Fix segfault in .cold detection with -ffunction-sections
Artem Savkov asavkov@redhat.com objtool: Fix double-free in .cold detection error path
Trent Piepho tpiepho@impinj.com PCI: imx6: Fix link training status detection in link up check
Jiri Olsa jolsa@kernel.org perf tools: Restore proper cwd on return from mnt namespace
Huacai Chen chenhc@lemote.com hwmon: (w83795) temp4_type has writable permission
Taehee Yoo ap420073@gmail.com netfilter: xt_hashlimit: fix a possible memory leak in htable_create()
Hans de Goede hdegoede@redhat.com iio/hid-sensors: Fix IIO_CHAN_INFO_RAW returning wrong values for signed numbers
Tzung-Bi Shih tzungbi@google.com ASoC: dapm: Recalculate audio map forcely when card instantiated
Peter Ujfalusi peter.ujfalusi@ti.com ASoC: omap-abe-twl6040: Fix missing audio card caused by deferred probing
Nicolin Chen nicoleotsuka@gmail.com hwmon: (ina2xx) Fix current value calculation
Thomas Richter tmricht@linux.ibm.com s390/cpum_cf: Reject request for sampling in event initialization
Richard Fitzgerald rf@opensource.cirrus.com ASoC: wm_adsp: Fix dma-unsafe read of scratch registers
Nicolin Chen nicoleotsuka@gmail.com hwmon (ina2xx) Fix NULL id pointer in probe()
Florian Westphal fw@strlen.de netfilter: nf_tables: fix use-after-free when deleting compat expressions
Florian Westphal fw@strlen.de selftests: add script to stress-test nft packet path vs. control plane
YueHaibing yuehaibing@huawei.com sysv: return 'err' instead of 0 in __sysv_write_inode
Janusz Krzysztofik jmkrzyszt@gmail.com ARM: OMAP1: ams-delta: Fix possible use of uninitialized field
Adam Ford aford173@gmail.com ARM: dts: logicpd-somlv: Fix interrupt on mmc3_dat1
Christophe JAILLET christophe.jaillet@wanadoo.fr staging: rtl8723bs: Fix the return value in case of error in 'rtw_wx_read32()'
Kuninori Morimoto kuninori.morimoto.gx@renesas.com ASoC: rsnd: fixup clock start checker
Nathan Chancellor natechancellor@gmail.com ARM: OMAP2+: prm44xx: Fix section annotation on omap44xx_prm_enable_io_wakeup
Jason Wang jasowang@redhat.com virtio-net: keep vnet header zeroed after processing XDP
Nicolas Dichtel nicolas.dichtel@6wind.com tun: forbid iface creation with rtnl ops
Yuchung Cheng ycheng@google.com tcp: fix NULL ref in tail loss probe
Eric Dumazet edumazet@google.com tcp: Do not underestimate rwnd_limited
Xin Long lucien.xin@gmail.com sctp: kfree_rcu asoc
Eric Dumazet edumazet@google.com rtnetlink: ndo_dflt_fdb_dump() only work for ARPHRD_ETHER devices
Christoph Paasch cpaasch@apple.com net: Prevent invalid access to skb->prev in __qdisc_drop_all
Heiner Kallweit hkallweit1@gmail.com net: phy: don't allow __set_phy_supported to add unsupported modes
Eran Ben Elisha eranbe@mellanox.com net/mlx4_en: Change min MTU size to ETH_MIN_MTU
Tarick Bedeir tarick@google.com net/mlx4_core: Correctly set PFC param if global pause is turned off.
Su Yanjun suyj.fnst@cn.fujitsu.com net: 8139cp: fix a BUG triggered by changing mtu with network traffic
Shmulik Ladkani shmulik@metanetworks.com ipv6: sr: properly initialize flowi6 prior passing to ip6_route_output
Stefano Brivio sbrivio@redhat.com neighbour: Avoid writing before skb->head in neigh_hh_output()
Stefano Brivio sbrivio@redhat.com ipv6: Check available headroom in ip6_xmit() even without options
Jiri Wiesner jwiesner@suse.com ipv4: ipv6: netfilter: Adjust the frag mem limit when truesize changes
-------------
Diffstat:
Makefile | 4 +- arch/arm/boot/dts/logicpd-som-lv.dtsi | 2 +- arch/arm/boot/dts/sama5d2.dtsi | 2 +- arch/arm/mach-omap1/board-ams-delta.c | 3 + arch/arm/mach-omap2/prm44xx.c | 2 +- arch/s390/kernel/perf_cpum_cf.c | 2 + arch/x86/kvm/lapic.c | 2 +- arch/x86/kvm/vmx.c | 8 +- arch/x86/xen/enlighten.c | 78 ---------- arch/x86/xen/setup.c | 6 +- drivers/acpi/arm64/iort.c | 2 +- drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 7 +- drivers/gpu/drm/ast/ast_mode.c | 36 ++++- drivers/gpu/drm/meson/meson_venc.c | 1 + drivers/hid/hid-sensor-custom.c | 2 +- drivers/hid/hid-sensor-hub.c | 13 +- drivers/hwmon/ina2xx.c | 6 +- drivers/hwmon/w83795.c | 2 +- drivers/iio/accel/hid-sensor-accel-3d.c | 5 +- drivers/iio/gyro/hid-sensor-gyro-3d.c | 5 +- drivers/iio/humidity/hid-sensor-humidity.c | 3 +- drivers/iio/light/hid-sensor-als.c | 8 +- drivers/iio/light/hid-sensor-prox.c | 8 +- drivers/iio/magnetometer/hid-sensor-magn-3d.c | 8 +- drivers/iio/orientation/hid-sensor-incl-3d.c | 8 +- drivers/iio/pressure/hid-sensor-press.c | 8 +- drivers/iio/temperature/hid-sensor-temperature.c | 3 +- drivers/infiniband/hw/hfi1/chip.c | 3 +- drivers/infiniband/hw/hfi1/hfi.h | 2 + drivers/infiniband/hw/hfi1/verbs.c | 2 +- drivers/infiniband/hw/mlx5/odp.c | 1 + drivers/infiniband/hw/mlx5/qp.c | 19 +-- drivers/infiniband/sw/rdmavt/ah.c | 4 +- drivers/infiniband/sw/rdmavt/ah.h | 3 +- drivers/net/ethernet/cavium/thunder/nic_main.c | 3 + drivers/net/ethernet/hisilicon/hip04_eth.c | 4 +- drivers/net/ethernet/intel/igb/e1000_i210.c | 1 + drivers/net/ethernet/intel/ixgbe/ixgbe_x550.c | 4 +- drivers/net/ethernet/mellanox/mlx4/en_ethtool.c | 4 +- drivers/net/ethernet/mellanox/mlx4/en_netdev.c | 4 +- drivers/net/ethernet/mellanox/mlx4/mlx4_en.h | 1 - drivers/net/ethernet/realtek/8139cp.c | 5 + drivers/net/phy/phy_device.c | 19 ++- drivers/net/tun.c | 6 +- drivers/net/virtio_net.c | 14 +- drivers/nvme/host/core.c | 4 +- drivers/pci/dwc/pci-imx6.c | 10 +- drivers/rtc/rtc-hid-sensor-time.c | 2 +- drivers/staging/rtl8723bs/os_dep/ioctl_linux.c | 2 +- drivers/staging/speakup/kobjects.c | 4 +- drivers/usb/gadget/function/u_ether.c | 11 +- drivers/usb/gadget/udc/omap_udc.c | 88 ++++-------- drivers/xen/balloon.c | 65 ++------- drivers/xen/xlate_mmu.c | 1 + fs/btrfs/send.c | 11 +- fs/cachefiles/rdwr.c | 9 +- fs/exportfs/expfs.c | 2 +- fs/fscache/object.c | 3 + fs/hfs/btree.c | 3 +- fs/hfsplus/btree.c | 3 +- fs/nfs/flexfilelayout/flexfilelayout.c | 6 +- fs/ocfs2/export.c | 2 +- fs/ocfs2/move_extents.c | 47 +++--- fs/pstore/ram.c | 15 +- fs/sysv/inode.c | 2 +- include/linux/hid-sensor-hub.h | 4 +- include/linux/pstore.h | 5 +- include/net/neighbour.h | 28 +++- include/net/sctp/structs.h | 2 + include/xen/balloon.h | 5 - kernel/printk/printk.c | 160 ++++++++++++++++++++- kernel/trace/bpf_trace.c | 8 +- lib/debugobjects.c | 5 +- lib/interval_tree_test.c | 4 +- lib/rbtree_test.c | 2 +- mm/page_alloc.c | 4 +- net/core/rtnetlink.c | 3 + net/ipv4/ip_fragment.c | 7 + net/ipv4/tcp_output.c | 17 ++- net/ipv6/ip6_output.c | 42 +++--- net/ipv6/netfilter.c | 3 +- net/ipv6/netfilter/nf_conntrack_reasm.c | 8 +- net/ipv6/reassembly.c | 8 +- net/ipv6/seg6_iptunnel.c | 1 + net/netfilter/ipvs/ip_vs_ctl.c | 3 + net/netfilter/nf_tables_api.c | 20 +-- net/netfilter/nft_compat.c | 3 +- net/netfilter/xt_hashlimit.c | 9 +- net/sched/sch_netem.c | 3 + net/sctp/associola.c | 2 +- sound/firewire/fireface/ff-protocol-ff400.c | 2 +- sound/pci/hda/patch_realtek.c | 33 +++++ sound/soc/codecs/wm_adsp.c | 37 ++--- sound/soc/omap/omap-abe-twl6040.c | 67 ++++----- sound/soc/omap/omap-dmic.c | 9 ++ sound/soc/omap/omap-mcbsp.c | 6 +- sound/soc/omap/omap-mcpdm.c | 43 +++++- sound/soc/sh/rcar/ssi.c | 2 +- sound/soc/soc-core.c | 1 + tools/objtool/elf.c | 19 ++- tools/perf/util/namespaces.c | 17 ++- tools/perf/util/namespaces.h | 1 + tools/testing/selftests/Makefile | 1 + tools/testing/selftests/netfilter/Makefile | 6 + tools/testing/selftests/netfilter/config | 2 + .../selftests/netfilter/nft_trans_stress.sh | 78 ++++++++++ 106 files changed, 820 insertions(+), 483 deletions(-)
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jiri Wiesner jwiesner@suse.com
[ Upstream commit ebaf39e6032faf77218220707fc3fa22487784e0 ]
The *_frag_reasm() functions are susceptible to miscalculating the byte count of packet fragments in case the truesize of a head buffer changes. The truesize member may be changed by the call to skb_unclone(), leaving the fragment memory limit counter unbalanced even if all fragments are processed. This miscalculation goes unnoticed as long as the network namespace which holds the counter is not destroyed.
Should an attempt be made to destroy a network namespace that holds an unbalanced fragment memory limit counter the cleanup of the namespace never finishes. The thread handling the cleanup gets stuck in inet_frags_exit_net() waiting for the percpu counter to reach zero. The thread is usually in running state with a stacktrace similar to:
PID: 1073 TASK: ffff880626711440 CPU: 1 COMMAND: "kworker/u48:4" #5 [ffff880621563d48] _raw_spin_lock at ffffffff815f5480 #6 [ffff880621563d48] inet_evict_bucket at ffffffff8158020b #7 [ffff880621563d80] inet_frags_exit_net at ffffffff8158051c #8 [ffff880621563db0] ops_exit_list at ffffffff814f5856 #9 [ffff880621563dd8] cleanup_net at ffffffff814f67c0 #10 [ffff880621563e38] process_one_work at ffffffff81096f14
It is not possible to create new network namespaces, and processes that call unshare() end up being stuck in uninterruptible sleep state waiting to acquire the net_mutex.
The bug was observed in the IPv6 netfilter code by Per Sundstrom. I thank him for his analysis of the problem. The parts of this patch that apply to IPv4 and IPv6 fragment reassembly are preemptive measures.
Signed-off-by: Jiri Wiesner jwiesner@suse.com Reported-by: Per Sundstrom per.sundstrom@redqube.se Acked-by: Peter Oskolkov posk@google.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv4/ip_fragment.c | 7 +++++++ net/ipv6/netfilter/nf_conntrack_reasm.c | 8 +++++++- net/ipv6/reassembly.c | 8 +++++++- 3 files changed, 21 insertions(+), 2 deletions(-)
--- a/net/ipv4/ip_fragment.c +++ b/net/ipv4/ip_fragment.c @@ -513,6 +513,7 @@ static int ip_frag_reasm(struct ipq *qp, struct rb_node *rbn; int len; int ihlen; + int delta; int err; u8 ecn;
@@ -554,10 +555,16 @@ static int ip_frag_reasm(struct ipq *qp, if (len > 65535) goto out_oversize;
+ delta = - head->truesize; + /* Head of list must not be cloned. */ if (skb_unclone(head, GFP_ATOMIC)) goto out_nomem;
+ delta += head->truesize; + if (delta) + add_frag_mem_limit(qp->q.net, delta); + /* If the first fragment is fragmented itself, we split * it to two chunks: the first with data and paged part * and the second, holding only fragments. */ --- a/net/ipv6/netfilter/nf_conntrack_reasm.c +++ b/net/ipv6/netfilter/nf_conntrack_reasm.c @@ -349,7 +349,7 @@ static bool nf_ct_frag6_reasm(struct frag_queue *fq, struct sk_buff *prev, struct net_device *dev) { struct sk_buff *fp, *head = fq->q.fragments; - int payload_len; + int payload_len, delta; u8 ecn;
inet_frag_kill(&fq->q); @@ -371,10 +371,16 @@ nf_ct_frag6_reasm(struct frag_queue *fq, return false; }
+ delta = - head->truesize; + /* Head of list must not be cloned. */ if (skb_unclone(head, GFP_ATOMIC)) return false;
+ delta += head->truesize; + if (delta) + add_frag_mem_limit(fq->q.net, delta); + /* If the first fragment is fragmented itself, we split * it to two chunks: the first with data and paged part * and the second, holding only fragments. */ --- a/net/ipv6/reassembly.c +++ b/net/ipv6/reassembly.c @@ -348,7 +348,7 @@ static int ip6_frag_reasm(struct frag_qu { struct net *net = container_of(fq->q.net, struct net, ipv6.frags); struct sk_buff *fp, *head = fq->q.fragments; - int payload_len; + int payload_len, delta; unsigned int nhoff; int sum_truesize; u8 ecn; @@ -389,10 +389,16 @@ static int ip6_frag_reasm(struct frag_qu if (payload_len > IPV6_MAXPLEN) goto out_oversize;
+ delta = - head->truesize; + /* Head of list must not be cloned. */ if (skb_unclone(head, GFP_ATOMIC)) goto out_oom;
+ delta += head->truesize; + if (delta) + add_frag_mem_limit(fq->q.net, delta); + /* If the first fragment is fragmented itself, we split * it to two chunks: the first with data and paged part * and the second, holding only fragments. */
On 2018-12-14 12:59, Greg Kroah-Hartman wrote:
4.14-stable review patch. If anyone has any objections, please let me know.
From: Jiri Wiesner jwiesner@suse.com
[ Upstream commit ebaf39e6032faf77218220707fc3fa22487784e0 ]
I am sorry for forgetting to mention (and to include a tag) that the patch actually fixed v4.10-rc4-868-g158f323b9868, which introduced changing the truesize member in pskb_expand_head(). The patch under review needs to be applied to 4.14-stable and 4.19-stable only.
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Stefano Brivio sbrivio@redhat.com
[ Upstream commit 66033f47ca60294a95fc85ec3a3cc909dab7b765 ]
Even if we send an IPv6 packet without options, MAX_HEADER might not be enough to account for the additional headroom required by alignment of hardware headers.
On a configuration without HYPERV_NET, WLAN, AX25, and with IPV6_TUNNEL, sending short SCTP packets over IPv4 over L2TP over IPv6, we start with 100 bytes of allocated headroom in sctp_packet_transmit(), end up with 54 bytes after l2tp_xmit_skb(), and 14 bytes in ip6_finish_output2().
Those would be enough to append our 14 bytes header, but we're going to align that to 16 bytes, and write 2 bytes out of the allocated slab in neigh_hh_output().
KASan says:
[ 264.967848] ================================================================== [ 264.967861] BUG: KASAN: slab-out-of-bounds in ip6_finish_output2+0x1aec/0x1c70 [ 264.967866] Write of size 16 at addr 000000006af1c7fe by task netperf/6201 [ 264.967870] [ 264.967876] CPU: 0 PID: 6201 Comm: netperf Not tainted 4.20.0-rc4+ #1 [ 264.967881] Hardware name: IBM 2827 H43 400 (z/VM 6.4.0) [ 264.967887] Call Trace: [ 264.967896] ([<00000000001347d6>] show_stack+0x56/0xa0) [ 264.967903] [<00000000017e379c>] dump_stack+0x23c/0x290 [ 264.967912] [<00000000007bc594>] print_address_description+0xf4/0x290 [ 264.967919] [<00000000007bc8fc>] kasan_report+0x13c/0x240 [ 264.967927] [<000000000162f5e4>] ip6_finish_output2+0x1aec/0x1c70 [ 264.967935] [<000000000163f890>] ip6_finish_output+0x430/0x7f0 [ 264.967943] [<000000000163fe44>] ip6_output+0x1f4/0x580 [ 264.967953] [<000000000163882a>] ip6_xmit+0xfea/0x1ce8 [ 264.967963] [<00000000017396e2>] inet6_csk_xmit+0x282/0x3f8 [ 264.968033] [<000003ff805fb0ba>] l2tp_xmit_skb+0xe02/0x13e0 [l2tp_core] [ 264.968037] [<000003ff80631192>] l2tp_eth_dev_xmit+0xda/0x150 [l2tp_eth] [ 264.968041] [<0000000001220020>] dev_hard_start_xmit+0x268/0x928 [ 264.968069] [<0000000001330e8e>] sch_direct_xmit+0x7ae/0x1350 [ 264.968071] [<000000000122359c>] __dev_queue_xmit+0x2b7c/0x3478 [ 264.968075] [<00000000013d2862>] ip_finish_output2+0xce2/0x11a0 [ 264.968078] [<00000000013d9b14>] ip_finish_output+0x56c/0x8c8 [ 264.968081] [<00000000013ddd1e>] ip_output+0x226/0x4c0 [ 264.968083] [<00000000013dbd6c>] __ip_queue_xmit+0x894/0x1938 [ 264.968100] [<000003ff80bc3a5c>] sctp_packet_transmit+0x29d4/0x3648 [sctp] [ 264.968116] [<000003ff80b7bf68>] sctp_outq_flush_ctrl.constprop.5+0x8d0/0xe50 [sctp] [ 264.968131] [<000003ff80b7c716>] sctp_outq_flush+0x22e/0x7d8 [sctp] [ 264.968146] [<000003ff80b35c68>] sctp_cmd_interpreter.isra.16+0x530/0x6800 [sctp] [ 264.968161] [<000003ff80b3410a>] sctp_do_sm+0x222/0x648 [sctp] [ 264.968177] [<000003ff80bbddac>] sctp_primitive_ASSOCIATE+0xbc/0xf8 [sctp] [ 264.968192] [<000003ff80b93328>] __sctp_connect+0x830/0xc20 [sctp] [ 264.968208] [<000003ff80bb11ce>] sctp_inet_connect+0x2e6/0x378 [sctp] [ 264.968212] [<0000000001197942>] __sys_connect+0x21a/0x450 [ 264.968215] [<000000000119aff8>] sys_socketcall+0x3d0/0xb08 [ 264.968218] [<000000000184ea7a>] system_call+0x2a2/0x2c0
[...]
Just like ip_finish_output2() does for IPv4, check that we have enough headroom in ip6_xmit(), and reallocate it if we don't.
This issue is older than git history.
Reported-by: Jianlin Shi jishi@redhat.com Signed-off-by: Stefano Brivio sbrivio@redhat.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv6/ip6_output.c | 42 +++++++++++++++++++++--------------------- 1 file changed, 21 insertions(+), 21 deletions(-)
--- a/net/ipv6/ip6_output.c +++ b/net/ipv6/ip6_output.c @@ -195,37 +195,37 @@ int ip6_xmit(const struct sock *sk, stru const struct ipv6_pinfo *np = inet6_sk(sk); struct in6_addr *first_hop = &fl6->daddr; struct dst_entry *dst = skb_dst(skb); + unsigned int head_room; struct ipv6hdr *hdr; u8 proto = fl6->flowi6_proto; int seg_len = skb->len; int hlimit = -1; u32 mtu;
- if (opt) { - unsigned int head_room; - - /* First: exthdrs may take lots of space (~8K for now) - MAX_HEADER is not enough. - */ - head_room = opt->opt_nflen + opt->opt_flen; - seg_len += head_room; - head_room += sizeof(struct ipv6hdr) + LL_RESERVED_SPACE(dst->dev); + head_room = sizeof(struct ipv6hdr) + LL_RESERVED_SPACE(dst->dev); + if (opt) + head_room += opt->opt_nflen + opt->opt_flen;
- if (skb_headroom(skb) < head_room) { - struct sk_buff *skb2 = skb_realloc_headroom(skb, head_room); - if (!skb2) { - IP6_INC_STATS(net, ip6_dst_idev(skb_dst(skb)), - IPSTATS_MIB_OUTDISCARDS); - kfree_skb(skb); - return -ENOBUFS; - } - if (skb->sk) - skb_set_owner_w(skb2, skb->sk); - consume_skb(skb); - skb = skb2; + if (unlikely(skb_headroom(skb) < head_room)) { + struct sk_buff *skb2 = skb_realloc_headroom(skb, head_room); + if (!skb2) { + IP6_INC_STATS(net, ip6_dst_idev(skb_dst(skb)), + IPSTATS_MIB_OUTDISCARDS); + kfree_skb(skb); + return -ENOBUFS; } + if (skb->sk) + skb_set_owner_w(skb2, skb->sk); + consume_skb(skb); + skb = skb2; + } + + if (opt) { + seg_len += opt->opt_nflen + opt->opt_flen; + if (opt->opt_flen) ipv6_push_frag_opts(skb, opt, &proto); + if (opt->opt_nflen) ipv6_push_nfrag_opts(skb, opt, &proto, &first_hop, &fl6->saddr);
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Stefano Brivio sbrivio@redhat.com
[ Upstream commit e6ac64d4c4d095085d7dd71cbd05704ac99829b2 ]
While skb_push() makes the kernel panic if the skb headroom is less than the unaligned hardware header size, it will proceed normally in case we copy more than that because of alignment, and we'll silently corrupt adjacent slabs.
In the case fixed by the previous patch, "ipv6: Check available headroom in ip6_xmit() even without options", we end up in neigh_hh_output() with 14 bytes headroom, 14 bytes hardware header and write 16 bytes, starting 2 bytes before the allocated buffer.
Always check we're not writing before skb->head and, if the headroom is not enough, warn and drop the packet.
v2: - instead of panicking with BUG_ON(), WARN_ON_ONCE() and drop the packet (Eric Dumazet) - if we avoid the panic, though, we need to explicitly check the headroom before the memcpy(), otherwise we'll have corrupted slabs on a running kernel, after we warn - use __skb_push() instead of skb_push(), as the headroom check is already implemented here explicitly (Eric Dumazet)
Signed-off-by: Stefano Brivio sbrivio@redhat.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/net/neighbour.h | 28 +++++++++++++++++++++++----- 1 file changed, 23 insertions(+), 5 deletions(-)
--- a/include/net/neighbour.h +++ b/include/net/neighbour.h @@ -452,6 +452,7 @@ static inline int neigh_hh_bridge(struct
static inline int neigh_hh_output(const struct hh_cache *hh, struct sk_buff *skb) { + unsigned int hh_alen = 0; unsigned int seq; unsigned int hh_len;
@@ -459,16 +460,33 @@ static inline int neigh_hh_output(const seq = read_seqbegin(&hh->hh_lock); hh_len = hh->hh_len; if (likely(hh_len <= HH_DATA_MOD)) { - /* this is inlined by gcc */ - memcpy(skb->data - HH_DATA_MOD, hh->hh_data, HH_DATA_MOD); + hh_alen = HH_DATA_MOD; + + /* skb_push() would proceed silently if we have room for + * the unaligned size but not for the aligned size: + * check headroom explicitly. + */ + if (likely(skb_headroom(skb) >= HH_DATA_MOD)) { + /* this is inlined by gcc */ + memcpy(skb->data - HH_DATA_MOD, hh->hh_data, + HH_DATA_MOD); + } } else { - unsigned int hh_alen = HH_DATA_ALIGN(hh_len); + hh_alen = HH_DATA_ALIGN(hh_len);
- memcpy(skb->data - hh_alen, hh->hh_data, hh_alen); + if (likely(skb_headroom(skb) >= hh_alen)) { + memcpy(skb->data - hh_alen, hh->hh_data, + hh_alen); + } } } while (read_seqretry(&hh->hh_lock, seq));
- skb_push(skb, hh_len); + if (WARN_ON_ONCE(skb_headroom(skb) < hh_alen)) { + kfree_skb(skb); + return NET_XMIT_DROP; + } + + __skb_push(skb, hh_len); return dev_queue_xmit(skb); }
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Shmulik Ladkani shmulik@metanetworks.com
[ Upstream commit 1b4e5ad5d6b9f15cd0b5121f86d4719165958417 ]
In 'seg6_output', stack variable 'struct flowi6 fl6' was missing initialization.
Fixes: 6c8702c60b88 ("ipv6: sr: add support for SRH encapsulation and injection with lwtunnels") Signed-off-by: Shmulik Ladkani shmulik.ladkani@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv6/seg6_iptunnel.c | 1 + 1 file changed, 1 insertion(+)
--- a/net/ipv6/seg6_iptunnel.c +++ b/net/ipv6/seg6_iptunnel.c @@ -327,6 +327,7 @@ static int seg6_output(struct net *net, struct ipv6hdr *hdr = ipv6_hdr(skb); struct flowi6 fl6;
+ memset(&fl6, 0, sizeof(fl6)); fl6.daddr = hdr->daddr; fl6.saddr = hdr->saddr; fl6.flowlabel = ip6_flowinfo(hdr);
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Su Yanjun suyj.fnst@cn.fujitsu.com
[ Upstream commit a5d4a89245ead1f37ed135213653c5beebea4237 ]
When changing mtu many times with traffic, a bug is triggered:
[ 1035.684037] kernel BUG at lib/dynamic_queue_limits.c:26! [ 1035.684042] invalid opcode: 0000 [#1] SMP [ 1035.684049] Modules linked in: loop binfmt_misc 8139cp(OE) macsec tcp_diag udp_diag inet_diag unix_diag af_packet_diag netlink_diag tcp_lp fuse uinput xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 tun bridge stp llc ebtable_filter ebtables ip6table_filter devlink ip6_tables iptable_filter sunrpc snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep ppdev snd_seq iosf_mbi crc32_pclmul parport_pc snd_seq_device ghash_clmulni_intel parport snd_pcm aesni_intel joydev lrw snd_timer virtio_balloon sg gf128mul glue_helper ablk_helper cryptd snd soundcore i2c_piix4 pcspkr ip_tables xfs libcrc32c sr_mod sd_mod cdrom crc_t10dif crct10dif_generic ata_generic [ 1035.684102] pata_acpi virtio_console qxl drm_kms_helper syscopyarea sysfillrect sysimgblt floppy fb_sys_fops crct10dif_pclmul crct10dif_common ttm crc32c_intel serio_raw ata_piix drm libata 8139too virtio_pci drm_panel_orientation_quirks virtio_ring virtio mii dm_mirror dm_region_hash dm_log dm_mod [last unloaded: 8139cp] [ 1035.684132] CPU: 9 PID: 25140 Comm: if-mtu-change Kdump: loaded Tainted: G OE ------------ T 3.10.0-957.el7.x86_64 #1 [ 1035.684134] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 [ 1035.684136] task: ffff8f59b1f5a080 ti: ffff8f5a2e32c000 task.ti: ffff8f5a2e32c000 [ 1035.684149] RIP: 0010:[<ffffffffba3a40d0>] [<ffffffffba3a40d0>] dql_completed+0x180/0x190 [ 1035.684162] RSP: 0000:ffff8f5a75483e50 EFLAGS: 00010093 [ 1035.684162] RAX: 00000000000000c2 RBX: ffff8f5a6f91c000 RCX: 0000000000000000 [ 1035.684162] RDX: 0000000000000000 RSI: 0000000000000184 RDI: ffff8f599fea3ec0 [ 1035.684162] RBP: ffff8f5a75483ea8 R08: 00000000000000c2 R09: 0000000000000000 [ 1035.684162] R10: 00000000000616ef R11: ffff8f5a75483b56 R12: ffff8f599fea3e00 [ 1035.684162] R13: 0000000000000001 R14: 0000000000000000 R15: 0000000000000184 [ 1035.684162] FS: 00007fa8434de740(0000) GS:ffff8f5a75480000(0000) knlGS:0000000000000000 [ 1035.684162] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1035.684162] CR2: 00000000004305d0 CR3: 000000024eb66000 CR4: 00000000001406e0 [ 1035.684162] Call Trace: [ 1035.684162] <IRQ> [ 1035.684162] [<ffffffffc08cbaf8>] ? cp_interrupt+0x478/0x580 [8139cp] [ 1035.684162] [<ffffffffba14a294>] __handle_irq_event_percpu+0x44/0x1c0 [ 1035.684162] [<ffffffffba14a442>] handle_irq_event_percpu+0x32/0x80 [ 1035.684162] [<ffffffffba14a4cc>] handle_irq_event+0x3c/0x60 [ 1035.684162] [<ffffffffba14db29>] handle_fasteoi_irq+0x59/0x110 [ 1035.684162] [<ffffffffba02e554>] handle_irq+0xe4/0x1a0 [ 1035.684162] [<ffffffffba7795dd>] do_IRQ+0x4d/0xf0 [ 1035.684162] [<ffffffffba76b362>] common_interrupt+0x162/0x162 [ 1035.684162] <EOI> [ 1035.684162] [<ffffffffba0c2ae4>] ? __wake_up_bit+0x24/0x70 [ 1035.684162] [<ffffffffba1e46f5>] ? do_set_pte+0xd5/0x120 [ 1035.684162] [<ffffffffba1b64fb>] unlock_page+0x2b/0x30 [ 1035.684162] [<ffffffffba1e4879>] do_read_fault.isra.61+0x139/0x1b0 [ 1035.684162] [<ffffffffba1e9134>] handle_pte_fault+0x2f4/0xd10 [ 1035.684162] [<ffffffffba1ebc6d>] handle_mm_fault+0x39d/0x9b0 [ 1035.684162] [<ffffffffba76f5e3>] __do_page_fault+0x203/0x500 [ 1035.684162] [<ffffffffba76f9c6>] trace_do_page_fault+0x56/0x150 [ 1035.684162] [<ffffffffba76ef42>] do_async_page_fault+0x22/0xf0 [ 1035.684162] [<ffffffffba76b788>] async_page_fault+0x28/0x30 [ 1035.684162] Code: 54 c7 47 54 ff ff ff ff 44 0f 49 ce 48 8b 35 48 2f 9c 00 48 89 77 58 e9 fe fe ff ff 0f 1f 80 00 00 00 00 41 89 d1 e9 ef fe ff ff <0f> 0b 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 55 8d 42 ff 48 [ 1035.684162] RIP [<ffffffffba3a40d0>] dql_completed+0x180/0x190 [ 1035.684162] RSP <ffff8f5a75483e50>
It's not the same as in 7fe0ee09 patch described. As 8139cp uses shared irq mode, other device irq will trigger cp_interrupt to execute.
cp_change_mtu -> cp_close -> cp_open
In cp_close routine just before free_irq(), some interrupt may occur. In my environment, cp_interrupt exectutes and IntrStatus is 0x4, exactly TxOk. That will cause cp_tx to wake device queue.
As device queue is started, cp_start_xmit and cp_open will run at same time which will cause kernel BUG.
For example: [#] for tx descriptor
At start:
[#][#][#] num_queued=3
After cp_init_hw->cp_start_hw->netdev_reset_queue:
[#][#][#] num_queued=0
When 8139cp starts to work then cp_tx will check num_queued mismatchs the complete_bytes.
The patch will check IntrMask before check IntrStatus in cp_interrupt. When 8139cp interrupt is disabled, just return.
Signed-off-by: Su Yanjun suyj.fnst@cn.fujitsu.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/realtek/8139cp.c | 5 +++++ 1 file changed, 5 insertions(+)
--- a/drivers/net/ethernet/realtek/8139cp.c +++ b/drivers/net/ethernet/realtek/8139cp.c @@ -571,6 +571,7 @@ static irqreturn_t cp_interrupt (int irq struct cp_private *cp; int handled = 0; u16 status; + u16 mask;
if (unlikely(dev == NULL)) return IRQ_NONE; @@ -578,6 +579,10 @@ static irqreturn_t cp_interrupt (int irq
spin_lock(&cp->lock);
+ mask = cpr16(IntrMask); + if (!mask) + goto out_unlock; + status = cpr16(IntrStatus); if (!status || (status == 0xFFFF)) goto out_unlock;
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tarick Bedeir tarick@google.com
[ Upstream commit bd5122cd1e0644d8bd8dd84517c932773e999766 ]
rx_ppp and tx_ppp can be set between 0 and 255, so don't clamp to 1.
Fixes: 6e8814ceb7e8 ("net/mlx4_en: Fix mixed PFC and Global pause user control requests") Signed-off-by: Tarick Bedeir tarick@google.com Reviewed-by: Eran Ben Elisha eranbe@mellanox.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/mellanox/mlx4/en_ethtool.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/net/ethernet/mellanox/mlx4/en_ethtool.c +++ b/drivers/net/ethernet/mellanox/mlx4/en_ethtool.c @@ -1070,8 +1070,8 @@ static int mlx4_en_set_pauseparam(struct
tx_pause = !!(pause->tx_pause); rx_pause = !!(pause->rx_pause); - rx_ppp = priv->prof->rx_ppp && !(tx_pause || rx_pause); - tx_ppp = priv->prof->tx_ppp && !(tx_pause || rx_pause); + rx_ppp = (tx_pause || rx_pause) ? 0 : priv->prof->rx_ppp; + tx_ppp = (tx_pause || rx_pause) ? 0 : priv->prof->tx_ppp;
err = mlx4_SET_PORT_general(mdev->dev, priv->port, priv->rx_skb_size + ETH_FCS_LEN,
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eran Ben Elisha eranbe@mellanox.com
[ Upstream commit 24be19e47779d604d1492c114459dca9a92acf78 ]
NIC driver minimal MTU size shall be set to ETH_MIN_MTU, as defined in the RFC791 and in the network stack. Remove old mlx4_en only define for it, which was set to wrong value.
Fixes: b80f71f5816f ("ethernet/mellanox: use core min/max MTU checking") Signed-off-by: Eran Ben Elisha eranbe@mellanox.com Signed-off-by: Tariq Toukan tariqt@mellanox.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/mellanox/mlx4/en_netdev.c | 4 ++-- drivers/net/ethernet/mellanox/mlx4/mlx4_en.h | 1 - 2 files changed, 2 insertions(+), 3 deletions(-)
--- a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c +++ b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c @@ -3505,8 +3505,8 @@ int mlx4_en_init_netdev(struct mlx4_en_d dev->gso_partial_features = NETIF_F_GSO_UDP_TUNNEL_CSUM; }
- /* MTU range: 46 - hw-specific max */ - dev->min_mtu = MLX4_EN_MIN_MTU; + /* MTU range: 68 - hw-specific max */ + dev->min_mtu = ETH_MIN_MTU; dev->max_mtu = priv->max_mtu;
mdev->pndev[port] = dev; --- a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h +++ b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h @@ -157,7 +157,6 @@ #define HEADER_COPY_SIZE (128 - NET_IP_ALIGN) #define MLX4_LOOPBACK_TEST_PAYLOAD (HEADER_COPY_SIZE - ETH_HLEN)
-#define MLX4_EN_MIN_MTU 46 /* VLAN_HLEN is added twice,to support skb vlan tagged with multiple * headers. (For example: ETH_P_8021Q and ETH_P_8021AD). */
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Heiner Kallweit hkallweit1@gmail.com
[ Upstream commit d2a36971ef595069b7a600d1144c2e0881a930a1 ]
Currently __set_phy_supported allows to add modes w/o checking whether the PHY supports them. This is wrong, it should never add modes but only remove modes we don't want to support.
The commit marked as fixed didn't do anything wrong, it just copied existing functionality to the helper which is being fixed now.
Fixes: f3a6bd393c2c ("phylib: Add phy_set_max_speed helper") Signed-off-by: Heiner Kallweit hkallweit1@gmail.com Reviewed-by: Andrew Lunn andrew@lunn.ch Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/phy/phy_device.c | 19 ++++++++----------- 1 file changed, 8 insertions(+), 11 deletions(-)
--- a/drivers/net/phy/phy_device.c +++ b/drivers/net/phy/phy_device.c @@ -1703,20 +1703,17 @@ EXPORT_SYMBOL(genphy_loopback);
static int __set_phy_supported(struct phy_device *phydev, u32 max_speed) { - phydev->supported &= ~(PHY_1000BT_FEATURES | PHY_100BT_FEATURES | - PHY_10BT_FEATURES); - switch (max_speed) { - default: - return -ENOTSUPP; - case SPEED_1000: - phydev->supported |= PHY_1000BT_FEATURES; + case SPEED_10: + phydev->supported &= ~PHY_100BT_FEATURES; /* fall through */ case SPEED_100: - phydev->supported |= PHY_100BT_FEATURES; - /* fall through */ - case SPEED_10: - phydev->supported |= PHY_10BT_FEATURES; + phydev->supported &= ~PHY_1000BT_FEATURES; + break; + case SPEED_1000: + break; + default: + return -ENOTSUPP; }
return 0;
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Christoph Paasch cpaasch@apple.com
[ Upstream commit 9410d386d0a829ace9558336263086c2fbbe8aed ]
__qdisc_drop_all() accesses skb->prev to get to the tail of the segment-list.
With commit 68d2f84a1368 ("net: gro: properly remove skb from list") the skb-list handling has been changed to set skb->next to NULL and set the list-poison on skb->prev.
With that change, __qdisc_drop_all() will panic when it tries to dereference skb->prev.
Since commit 992cba7e276d ("net: Add and use skb_list_del_init().") __list_del_entry is used, leaving skb->prev unchanged (thus, pointing to the list-head if it's the first skb of the list). This will make __qdisc_drop_all modify the next-pointer of the list-head and result in a panic later on:
[ 34.501053] general protection fault: 0000 [#1] SMP KASAN PTI [ 34.501968] CPU: 2 PID: 0 Comm: swapper/2 Not tainted 4.20.0-rc2.mptcp #108 [ 34.502887] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.5.1 01/01/2011 [ 34.504074] RIP: 0010:dev_gro_receive+0x343/0x1f90 [ 34.504751] Code: e0 48 c1 e8 03 42 80 3c 30 00 0f 85 4a 1c 00 00 4d 8b 24 24 4c 39 65 d0 0f 84 0a 04 00 00 49 8d 7c 24 38 48 89 f8 48 c1 e8 03 <42> 0f b6 04 30 84 c0 74 08 3c 04 [ 34.507060] RSP: 0018:ffff8883af507930 EFLAGS: 00010202 [ 34.507761] RAX: 0000000000000007 RBX: ffff8883970b2c80 RCX: 1ffff11072e165a6 [ 34.508640] RDX: 1ffff11075867008 RSI: ffff8883ac338040 RDI: 0000000000000038 [ 34.509493] RBP: ffff8883af5079d0 R08: ffff8883970b2d40 R09: 0000000000000062 [ 34.510346] R10: 0000000000000034 R11: 0000000000000000 R12: 0000000000000000 [ 34.511215] R13: 0000000000000000 R14: dffffc0000000000 R15: ffff8883ac338008 [ 34.512082] FS: 0000000000000000(0000) GS:ffff8883af500000(0000) knlGS:0000000000000000 [ 34.513036] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 34.513741] CR2: 000055ccc3e9d020 CR3: 00000003abf32000 CR4: 00000000000006e0 [ 34.514593] Call Trace: [ 34.514893] <IRQ> [ 34.515157] napi_gro_receive+0x93/0x150 [ 34.515632] receive_buf+0x893/0x3700 [ 34.516094] ? __netif_receive_skb+0x1f/0x1a0 [ 34.516629] ? virtnet_probe+0x1b40/0x1b40 [ 34.517153] ? __stable_node_chain+0x4d0/0x850 [ 34.517684] ? kfree+0x9a/0x180 [ 34.518067] ? __kasan_slab_free+0x171/0x190 [ 34.518582] ? detach_buf+0x1df/0x650 [ 34.519061] ? lapic_next_event+0x5a/0x90 [ 34.519539] ? virtqueue_get_buf_ctx+0x280/0x7f0 [ 34.520093] virtnet_poll+0x2df/0xd60 [ 34.520533] ? receive_buf+0x3700/0x3700 [ 34.521027] ? qdisc_watchdog_schedule_ns+0xd5/0x140 [ 34.521631] ? htb_dequeue+0x1817/0x25f0 [ 34.522107] ? sch_direct_xmit+0x142/0xf30 [ 34.522595] ? virtqueue_napi_schedule+0x26/0x30 [ 34.523155] net_rx_action+0x2f6/0xc50 [ 34.523601] ? napi_complete_done+0x2f0/0x2f0 [ 34.524126] ? kasan_check_read+0x11/0x20 [ 34.524608] ? _raw_spin_lock+0x7d/0xd0 [ 34.525070] ? _raw_spin_lock_bh+0xd0/0xd0 [ 34.525563] ? kvm_guest_apic_eoi_write+0x6b/0x80 [ 34.526130] ? apic_ack_irq+0x9e/0xe0 [ 34.526567] __do_softirq+0x188/0x4b5 [ 34.527015] irq_exit+0x151/0x180 [ 34.527417] do_IRQ+0xdb/0x150 [ 34.527783] common_interrupt+0xf/0xf [ 34.528223] </IRQ>
This patch makes sure that skb->prev is set to NULL when entering netem_enqueue.
Cc: Prashant Bhole bhole_prashant_q7@lab.ntt.co.jp Cc: Tyler Hicks tyhicks@canonical.com Cc: Eric Dumazet eric.dumazet@gmail.com Fixes: 68d2f84a1368 ("net: gro: properly remove skb from list") Suggested-by: Eric Dumazet eric.dumazet@gmail.com Signed-off-by: Christoph Paasch cpaasch@apple.com Reviewed-by: Eric Dumazet edumazet@google.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/sched/sch_netem.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/net/sched/sch_netem.c +++ b/net/sched/sch_netem.c @@ -436,6 +436,9 @@ static int netem_enqueue(struct sk_buff int count = 1; int rc = NET_XMIT_SUCCESS;
+ /* Do not fool qdisc_drop_all() */ + skb->prev = NULL; + /* Random duplication */ if (q->duplicate && q->duplicate >= get_crandom(&q->dup_cor)) ++count;
Hello Greg,
On 14/12/18 - 12:59:22, Greg Kroah-Hartman wrote:
4.14-stable review patch. If anyone has any objections, please let me know.
From: Christoph Paasch cpaasch@apple.com
[ Upstream commit 9410d386d0a829ace9558336263086c2fbbe8aed ]
__qdisc_drop_all() accesses skb->prev to get to the tail of the segment-list.
With commit 68d2f84a1368 ("net: gro: properly remove skb from list") the skb-list handling has been changed to set skb->next to NULL and set the list-poison on skb->prev.
With that change, __qdisc_drop_all() will panic when it tries to dereference skb->prev.
Since commit 992cba7e276d ("net: Add and use skb_list_del_init().") __list_del_entry is used, leaving skb->prev unchanged (thus, pointing to the list-head if it's the first skb of the list). This will make __qdisc_drop_all modify the next-pointer of the list-head and result in a panic later on:
[ 34.501053] general protection fault: 0000 [#1] SMP KASAN PTI [ 34.501968] CPU: 2 PID: 0 Comm: swapper/2 Not tainted 4.20.0-rc2.mptcp #108 [ 34.502887] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.5.1 01/01/2011 [ 34.504074] RIP: 0010:dev_gro_receive+0x343/0x1f90 [ 34.504751] Code: e0 48 c1 e8 03 42 80 3c 30 00 0f 85 4a 1c 00 00 4d 8b 24 24 4c 39 65 d0 0f 84 0a 04 00 00 49 8d 7c 24 38 48 89 f8 48 c1 e8 03 <42> 0f b6 04 30 84 c0 74 08 3c 04 [ 34.507060] RSP: 0018:ffff8883af507930 EFLAGS: 00010202 [ 34.507761] RAX: 0000000000000007 RBX: ffff8883970b2c80 RCX: 1ffff11072e165a6 [ 34.508640] RDX: 1ffff11075867008 RSI: ffff8883ac338040 RDI: 0000000000000038 [ 34.509493] RBP: ffff8883af5079d0 R08: ffff8883970b2d40 R09: 0000000000000062 [ 34.510346] R10: 0000000000000034 R11: 0000000000000000 R12: 0000000000000000 [ 34.511215] R13: 0000000000000000 R14: dffffc0000000000 R15: ffff8883ac338008 [ 34.512082] FS: 0000000000000000(0000) GS:ffff8883af500000(0000) knlGS:0000000000000000 [ 34.513036] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 34.513741] CR2: 000055ccc3e9d020 CR3: 00000003abf32000 CR4: 00000000000006e0 [ 34.514593] Call Trace: [ 34.514893] <IRQ> [ 34.515157] napi_gro_receive+0x93/0x150 [ 34.515632] receive_buf+0x893/0x3700 [ 34.516094] ? __netif_receive_skb+0x1f/0x1a0 [ 34.516629] ? virtnet_probe+0x1b40/0x1b40 [ 34.517153] ? __stable_node_chain+0x4d0/0x850 [ 34.517684] ? kfree+0x9a/0x180 [ 34.518067] ? __kasan_slab_free+0x171/0x190 [ 34.518582] ? detach_buf+0x1df/0x650 [ 34.519061] ? lapic_next_event+0x5a/0x90 [ 34.519539] ? virtqueue_get_buf_ctx+0x280/0x7f0 [ 34.520093] virtnet_poll+0x2df/0xd60 [ 34.520533] ? receive_buf+0x3700/0x3700 [ 34.521027] ? qdisc_watchdog_schedule_ns+0xd5/0x140 [ 34.521631] ? htb_dequeue+0x1817/0x25f0 [ 34.522107] ? sch_direct_xmit+0x142/0xf30 [ 34.522595] ? virtqueue_napi_schedule+0x26/0x30 [ 34.523155] net_rx_action+0x2f6/0xc50 [ 34.523601] ? napi_complete_done+0x2f0/0x2f0 [ 34.524126] ? kasan_check_read+0x11/0x20 [ 34.524608] ? _raw_spin_lock+0x7d/0xd0 [ 34.525070] ? _raw_spin_lock_bh+0xd0/0xd0 [ 34.525563] ? kvm_guest_apic_eoi_write+0x6b/0x80 [ 34.526130] ? apic_ack_irq+0x9e/0xe0 [ 34.526567] __do_softirq+0x188/0x4b5 [ 34.527015] irq_exit+0x151/0x180 [ 34.527417] do_IRQ+0xdb/0x150 [ 34.527783] common_interrupt+0xf/0xf [ 34.528223] </IRQ>
This patch makes sure that skb->prev is set to NULL when entering netem_enqueue.
Cc: Prashant Bhole bhole_prashant_q7@lab.ntt.co.jp Cc: Tyler Hicks tyhicks@canonical.com Cc: Eric Dumazet eric.dumazet@gmail.com Fixes: 68d2f84a1368 ("net: gro: properly remove skb from list") Suggested-by: Eric Dumazet eric.dumazet@gmail.com Signed-off-by: Christoph Paasch cpaasch@apple.com Reviewed-by: Eric Dumazet edumazet@google.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
I don't think this patch needs to be backported. I haven't hit this panic on v4.14.x (although, I only have been using v4.14.79 for now - not yet the latest one). And the commits that seem to have introduced the issue haven't been backported either AFAICS. That being said, I don't think a backport would cause any side-effects.
Or have you seen reports of this panic in the stable branches?
(same for the other stable branches v4.4,... - cfr., the other mails regarding this patch here)
Cheers, Christoph
net/sched/sch_netem.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/net/sched/sch_netem.c +++ b/net/sched/sch_netem.c @@ -436,6 +436,9 @@ static int netem_enqueue(struct sk_buff int count = 1; int rc = NET_XMIT_SUCCESS;
- /* Do not fool qdisc_drop_all() */
- skb->prev = NULL;
- /* Random duplication */ if (q->duplicate && q->duplicate >= get_crandom(&q->dup_cor)) ++count;
On 14/12/18 - 07:52:26, Christoph Paasch wrote:
Hello Greg,
On 14/12/18 - 12:59:22, Greg Kroah-Hartman wrote:
4.14-stable review patch. If anyone has any objections, please let me know.
From: Christoph Paasch cpaasch@apple.com
[ Upstream commit 9410d386d0a829ace9558336263086c2fbbe8aed ]
__qdisc_drop_all() accesses skb->prev to get to the tail of the segment-list.
With commit 68d2f84a1368 ("net: gro: properly remove skb from list") the skb-list handling has been changed to set skb->next to NULL and set the list-poison on skb->prev.
With that change, __qdisc_drop_all() will panic when it tries to dereference skb->prev.
Since commit 992cba7e276d ("net: Add and use skb_list_del_init().") __list_del_entry is used, leaving skb->prev unchanged (thus, pointing to the list-head if it's the first skb of the list). This will make __qdisc_drop_all modify the next-pointer of the list-head and result in a panic later on:
[ 34.501053] general protection fault: 0000 [#1] SMP KASAN PTI [ 34.501968] CPU: 2 PID: 0 Comm: swapper/2 Not tainted 4.20.0-rc2.mptcp #108 [ 34.502887] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.5.1 01/01/2011 [ 34.504074] RIP: 0010:dev_gro_receive+0x343/0x1f90 [ 34.504751] Code: e0 48 c1 e8 03 42 80 3c 30 00 0f 85 4a 1c 00 00 4d 8b 24 24 4c 39 65 d0 0f 84 0a 04 00 00 49 8d 7c 24 38 48 89 f8 48 c1 e8 03 <42> 0f b6 04 30 84 c0 74 08 3c 04 [ 34.507060] RSP: 0018:ffff8883af507930 EFLAGS: 00010202 [ 34.507761] RAX: 0000000000000007 RBX: ffff8883970b2c80 RCX: 1ffff11072e165a6 [ 34.508640] RDX: 1ffff11075867008 RSI: ffff8883ac338040 RDI: 0000000000000038 [ 34.509493] RBP: ffff8883af5079d0 R08: ffff8883970b2d40 R09: 0000000000000062 [ 34.510346] R10: 0000000000000034 R11: 0000000000000000 R12: 0000000000000000 [ 34.511215] R13: 0000000000000000 R14: dffffc0000000000 R15: ffff8883ac338008 [ 34.512082] FS: 0000000000000000(0000) GS:ffff8883af500000(0000) knlGS:0000000000000000 [ 34.513036] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 34.513741] CR2: 000055ccc3e9d020 CR3: 00000003abf32000 CR4: 00000000000006e0 [ 34.514593] Call Trace: [ 34.514893] <IRQ> [ 34.515157] napi_gro_receive+0x93/0x150 [ 34.515632] receive_buf+0x893/0x3700 [ 34.516094] ? __netif_receive_skb+0x1f/0x1a0 [ 34.516629] ? virtnet_probe+0x1b40/0x1b40 [ 34.517153] ? __stable_node_chain+0x4d0/0x850 [ 34.517684] ? kfree+0x9a/0x180 [ 34.518067] ? __kasan_slab_free+0x171/0x190 [ 34.518582] ? detach_buf+0x1df/0x650 [ 34.519061] ? lapic_next_event+0x5a/0x90 [ 34.519539] ? virtqueue_get_buf_ctx+0x280/0x7f0 [ 34.520093] virtnet_poll+0x2df/0xd60 [ 34.520533] ? receive_buf+0x3700/0x3700 [ 34.521027] ? qdisc_watchdog_schedule_ns+0xd5/0x140 [ 34.521631] ? htb_dequeue+0x1817/0x25f0 [ 34.522107] ? sch_direct_xmit+0x142/0xf30 [ 34.522595] ? virtqueue_napi_schedule+0x26/0x30 [ 34.523155] net_rx_action+0x2f6/0xc50 [ 34.523601] ? napi_complete_done+0x2f0/0x2f0 [ 34.524126] ? kasan_check_read+0x11/0x20 [ 34.524608] ? _raw_spin_lock+0x7d/0xd0 [ 34.525070] ? _raw_spin_lock_bh+0xd0/0xd0 [ 34.525563] ? kvm_guest_apic_eoi_write+0x6b/0x80 [ 34.526130] ? apic_ack_irq+0x9e/0xe0 [ 34.526567] __do_softirq+0x188/0x4b5 [ 34.527015] irq_exit+0x151/0x180 [ 34.527417] do_IRQ+0xdb/0x150 [ 34.527783] common_interrupt+0xf/0xf [ 34.528223] </IRQ>
This patch makes sure that skb->prev is set to NULL when entering netem_enqueue.
Cc: Prashant Bhole bhole_prashant_q7@lab.ntt.co.jp Cc: Tyler Hicks tyhicks@canonical.com Cc: Eric Dumazet eric.dumazet@gmail.com Fixes: 68d2f84a1368 ("net: gro: properly remove skb from list") Suggested-by: Eric Dumazet eric.dumazet@gmail.com Signed-off-by: Christoph Paasch cpaasch@apple.com Reviewed-by: Eric Dumazet edumazet@google.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
I don't think this patch needs to be backported. I haven't hit this panic on v4.14.x (although, I only have been using v4.14.79 for now - not yet the latest one). And the commits that seem to have introduced the issue haven't been backported either AFAICS. That being said, I don't think a backport would cause any side-effects.
Or have you seen reports of this panic in the stable branches?
(same for the other stable branches v4.4,... - cfr., the other mails regarding this patch here)
To clarify: Backport to v4.19.x is needed. The backport to v4.14 and older should not be needed.
Christoph
Cheers, Christoph
net/sched/sch_netem.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/net/sched/sch_netem.c +++ b/net/sched/sch_netem.c @@ -436,6 +436,9 @@ static int netem_enqueue(struct sk_buff int count = 1; int rc = NET_XMIT_SUCCESS;
- /* Do not fool qdisc_drop_all() */
- skb->prev = NULL;
- /* Random duplication */ if (q->duplicate && q->duplicate >= get_crandom(&q->dup_cor)) ++count;
On Fri, Dec 14, 2018 at 7:54 AM Christoph Paasch cpaasch@apple.com wrote:
To clarify: Backport to v4.19.x is needed. The backport to v4.14 and older should not be needed.
Sure, but this does not cause any harm.
No further action needed :)
From: Christoph Paasch cpaasch@apple.com Date: Fri, 14 Dec 2018 07:52:26 -0800
I don't think this patch needs to be backported. I haven't hit this panic on v4.14.x (although, I only have been using v4.14.79 for now - not yet the latest one). And the commits that seem to have introduced the issue haven't been backported either AFAICS. That being said, I don't think a backport would cause any side-effects.
Or have you seen reports of this panic in the stable branches?
(same for the other stable branches v4.4,... - cfr., the other mails regarding this patch here)
I was being ultra paranoid when I decided to submit this to 4.14 too, and as Eric says it does not harm.
On 14/12/18 - 11:05:30, David Miller wrote:
From: Christoph Paasch cpaasch@apple.com Date: Fri, 14 Dec 2018 07:52:26 -0800
I don't think this patch needs to be backported. I haven't hit this panic on v4.14.x (although, I only have been using v4.14.79 for now - not yet the latest one). And the commits that seem to have introduced the issue haven't been backported either AFAICS. That being said, I don't think a backport would cause any side-effects.
Or have you seen reports of this panic in the stable branches?
(same for the other stable branches v4.4,... - cfr., the other mails regarding this patch here)
I was being ultra paranoid when I decided to submit this to 4.14 too, and as Eric says it does not harm.
Sounds good to me. I just wanted to point out that I didn't reproduce the issue nor tested the patch on the older stable branches.
Christoph
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Dumazet edumazet@google.com
[ Upstream commit 688838934c231bb08f46db687e57f6d8bf82709c ]
kmsan was able to trigger a kernel-infoleak using a gre device [1]
nlmsg_populate_fdb_fill() has a hard coded assumption that dev->addr_len is ETH_ALEN, as normally guaranteed for ARPHRD_ETHER devices.
A similar issue was fixed recently in commit da71577545a5 ("rtnetlink: Disallow FDB configuration for non-Ethernet device")
[1] BUG: KMSAN: kernel-infoleak in copyout lib/iov_iter.c:143 [inline] BUG: KMSAN: kernel-infoleak in _copy_to_iter+0x4c0/0x2700 lib/iov_iter.c:576 CPU: 0 PID: 6697 Comm: syz-executor310 Not tainted 4.20.0-rc3+ #95 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x32d/0x480 lib/dump_stack.c:113 kmsan_report+0x12c/0x290 mm/kmsan/kmsan.c:683 kmsan_internal_check_memory+0x32a/0xa50 mm/kmsan/kmsan.c:743 kmsan_copy_to_user+0x78/0xd0 mm/kmsan/kmsan_hooks.c:634 copyout lib/iov_iter.c:143 [inline] _copy_to_iter+0x4c0/0x2700 lib/iov_iter.c:576 copy_to_iter include/linux/uio.h:143 [inline] skb_copy_datagram_iter+0x4e2/0x1070 net/core/datagram.c:431 skb_copy_datagram_msg include/linux/skbuff.h:3316 [inline] netlink_recvmsg+0x6f9/0x19d0 net/netlink/af_netlink.c:1975 sock_recvmsg_nosec net/socket.c:794 [inline] sock_recvmsg+0x1d1/0x230 net/socket.c:801 ___sys_recvmsg+0x444/0xae0 net/socket.c:2278 __sys_recvmsg net/socket.c:2327 [inline] __do_sys_recvmsg net/socket.c:2337 [inline] __se_sys_recvmsg+0x2fa/0x450 net/socket.c:2334 __x64_sys_recvmsg+0x4a/0x70 net/socket.c:2334 do_syscall_64+0xcf/0x110 arch/x86/entry/common.c:291 entry_SYSCALL_64_after_hwframe+0x63/0xe7 RIP: 0033:0x441119 Code: 18 89 d0 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 db 0a fc ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007fffc7f008a8 EFLAGS: 00000207 ORIG_RAX: 000000000000002f RAX: ffffffffffffffda RBX: 00000000004002c8 RCX: 0000000000441119 RDX: 0000000000000040 RSI: 00000000200005c0 RDI: 0000000000000003 RBP: 00000000006cc018 R08: 0000000000000100 R09: 0000000000000100 R10: 0000000000000100 R11: 0000000000000207 R12: 0000000000402080 R13: 0000000000402110 R14: 0000000000000000 R15: 0000000000000000
Uninit was stored to memory at: kmsan_save_stack_with_flags mm/kmsan/kmsan.c:246 [inline] kmsan_save_stack mm/kmsan/kmsan.c:261 [inline] kmsan_internal_chain_origin+0x13d/0x240 mm/kmsan/kmsan.c:469 kmsan_memcpy_memmove_metadata+0x1a9/0xf70 mm/kmsan/kmsan.c:344 kmsan_memcpy_metadata+0xb/0x10 mm/kmsan/kmsan.c:362 __msan_memcpy+0x61/0x70 mm/kmsan/kmsan_instr.c:162 __nla_put lib/nlattr.c:744 [inline] nla_put+0x20a/0x2d0 lib/nlattr.c:802 nlmsg_populate_fdb_fill+0x444/0x810 net/core/rtnetlink.c:3466 nlmsg_populate_fdb net/core/rtnetlink.c:3775 [inline] ndo_dflt_fdb_dump+0x73a/0x960 net/core/rtnetlink.c:3807 rtnl_fdb_dump+0x1318/0x1cb0 net/core/rtnetlink.c:3979 netlink_dump+0xc79/0x1c90 net/netlink/af_netlink.c:2244 __netlink_dump_start+0x10c4/0x11d0 net/netlink/af_netlink.c:2352 netlink_dump_start include/linux/netlink.h:216 [inline] rtnetlink_rcv_msg+0x141b/0x1540 net/core/rtnetlink.c:4910 netlink_rcv_skb+0x394/0x640 net/netlink/af_netlink.c:2477 rtnetlink_rcv+0x50/0x60 net/core/rtnetlink.c:4965 netlink_unicast_kernel net/netlink/af_netlink.c:1310 [inline] netlink_unicast+0x1699/0x1740 net/netlink/af_netlink.c:1336 netlink_sendmsg+0x13c7/0x1440 net/netlink/af_netlink.c:1917 sock_sendmsg_nosec net/socket.c:621 [inline] sock_sendmsg net/socket.c:631 [inline] ___sys_sendmsg+0xe3b/0x1240 net/socket.c:2116 __sys_sendmsg net/socket.c:2154 [inline] __do_sys_sendmsg net/socket.c:2163 [inline] __se_sys_sendmsg+0x305/0x460 net/socket.c:2161 __x64_sys_sendmsg+0x4a/0x70 net/socket.c:2161 do_syscall_64+0xcf/0x110 arch/x86/entry/common.c:291 entry_SYSCALL_64_after_hwframe+0x63/0xe7
Uninit was created at: kmsan_save_stack_with_flags mm/kmsan/kmsan.c:246 [inline] kmsan_internal_poison_shadow+0x6d/0x130 mm/kmsan/kmsan.c:170 kmsan_kmalloc+0xa1/0x100 mm/kmsan/kmsan_hooks.c:186 __kmalloc+0x14c/0x4d0 mm/slub.c:3825 kmalloc include/linux/slab.h:551 [inline] __hw_addr_create_ex net/core/dev_addr_lists.c:34 [inline] __hw_addr_add_ex net/core/dev_addr_lists.c:80 [inline] __dev_mc_add+0x357/0x8a0 net/core/dev_addr_lists.c:670 dev_mc_add+0x6d/0x80 net/core/dev_addr_lists.c:687 ip_mc_filter_add net/ipv4/igmp.c:1128 [inline] igmp_group_added+0x4d4/0xb80 net/ipv4/igmp.c:1311 __ip_mc_inc_group+0xea9/0xf70 net/ipv4/igmp.c:1444 ip_mc_inc_group net/ipv4/igmp.c:1453 [inline] ip_mc_up+0x1c3/0x400 net/ipv4/igmp.c:1775 inetdev_event+0x1d03/0x1d80 net/ipv4/devinet.c:1522 notifier_call_chain kernel/notifier.c:93 [inline] __raw_notifier_call_chain kernel/notifier.c:394 [inline] raw_notifier_call_chain+0x13d/0x240 kernel/notifier.c:401 __dev_notify_flags+0x3da/0x860 net/core/dev.c:1733 dev_change_flags+0x1ac/0x230 net/core/dev.c:7569 do_setlink+0x165f/0x5ea0 net/core/rtnetlink.c:2492 rtnl_newlink+0x2ad7/0x35a0 net/core/rtnetlink.c:3111 rtnetlink_rcv_msg+0x1148/0x1540 net/core/rtnetlink.c:4947 netlink_rcv_skb+0x394/0x640 net/netlink/af_netlink.c:2477 rtnetlink_rcv+0x50/0x60 net/core/rtnetlink.c:4965 netlink_unicast_kernel net/netlink/af_netlink.c:1310 [inline] netlink_unicast+0x1699/0x1740 net/netlink/af_netlink.c:1336 netlink_sendmsg+0x13c7/0x1440 net/netlink/af_netlink.c:1917 sock_sendmsg_nosec net/socket.c:621 [inline] sock_sendmsg net/socket.c:631 [inline] ___sys_sendmsg+0xe3b/0x1240 net/socket.c:2116 __sys_sendmsg net/socket.c:2154 [inline] __do_sys_sendmsg net/socket.c:2163 [inline] __se_sys_sendmsg+0x305/0x460 net/socket.c:2161 __x64_sys_sendmsg+0x4a/0x70 net/socket.c:2161 do_syscall_64+0xcf/0x110 arch/x86/entry/common.c:291 entry_SYSCALL_64_after_hwframe+0x63/0xe7
Bytes 36-37 of 105 are uninitialized Memory access of size 105 starts at ffff88819686c000 Data copied to user address 0000000020000380
Fixes: d83b06036048 ("net: add fdb generic dump routine") Signed-off-by: Eric Dumazet edumazet@google.com Cc: John Fastabend john.fastabend@gmail.com Cc: Ido Schimmel idosch@mellanox.com Cc: David Ahern dsahern@gmail.com Reviewed-by: Ido Schimmel idosch@mellanox.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/core/rtnetlink.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/net/core/rtnetlink.c +++ b/net/core/rtnetlink.c @@ -3280,6 +3280,9 @@ int ndo_dflt_fdb_dump(struct sk_buff *sk { int err;
+ if (dev->type != ARPHRD_ETHER) + return -EINVAL; + netif_addr_lock_bh(dev); err = nlmsg_populate_fdb(skb, cb, dev, idx, &dev->uc); if (err)
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Xin Long lucien.xin@gmail.com
[ Upstream commit fb6df5a6234c38a9c551559506a49a677ac6f07a ]
In sctp_hash_transport/sctp_epaddr_lookup_transport, it dereferences a transport's asoc under rcu_read_lock while asoc is freed not after a grace period, which leads to a use-after-free panic.
This patch fixes it by calling kfree_rcu to make asoc be freed after a grace period.
Note that only the asoc's memory is delayed to free in the patch, it won't cause sk to linger longer.
Thanks Neil and Marcelo to make this clear.
Fixes: 7fda702f9315 ("sctp: use new rhlist interface on sctp transport rhashtable") Fixes: cd2b70875058 ("sctp: check duplicate node before inserting a new transport") Reported-by: syzbot+0b05d8aa7cb185107483@syzkaller.appspotmail.com Reported-by: syzbot+aad231d51b1923158444@syzkaller.appspotmail.com Suggested-by: Neil Horman nhorman@tuxdriver.com Signed-off-by: Xin Long lucien.xin@gmail.com Acked-by: Marcelo Ricardo Leitner marcelo.leitner@gmail.com Acked-by: Neil Horman nhorman@tuxdriver.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/net/sctp/structs.h | 2 ++ net/sctp/associola.c | 2 +- 2 files changed, 3 insertions(+), 1 deletion(-)
--- a/include/net/sctp/structs.h +++ b/include/net/sctp/structs.h @@ -1902,6 +1902,8 @@ struct sctp_association {
__u64 abandoned_unsent[SCTP_PR_INDEX(MAX) + 1]; __u64 abandoned_sent[SCTP_PR_INDEX(MAX) + 1]; + + struct rcu_head rcu; };
--- a/net/sctp/associola.c +++ b/net/sctp/associola.c @@ -432,7 +432,7 @@ static void sctp_association_destroy(str
WARN_ON(atomic_read(&asoc->rmem_alloc));
- kfree(asoc); + kfree_rcu(asoc, rcu); SCTP_DBG_OBJCNT_DEC(assoc); }
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Dumazet edumazet@google.com
[ Upstream commit 41727549de3e7281feb174d568c6e46823db8684 ]
If available rwnd is too small, tcp_tso_should_defer() can decide it is worth waiting before splitting a TSO packet.
This really means we are rwnd limited.
Fixes: 5615f88614a4 ("tcp: instrument how long TCP is limited by receive window") Signed-off-by: Eric Dumazet edumazet@google.com Acked-by: Soheil Hassas Yeganeh soheil@google.com Reviewed-by: Yuchung Cheng ycheng@google.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv4/tcp_output.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
--- a/net/ipv4/tcp_output.c +++ b/net/ipv4/tcp_output.c @@ -2328,8 +2328,11 @@ static bool tcp_write_xmit(struct sock * } else { if (!push_one && tcp_tso_should_defer(sk, skb, &is_cwnd_limited, - max_segs)) + max_segs)) { + if (!is_cwnd_limited) + is_rwnd_limited = true; break; + } }
limit = mss_now;
Hi Greg,
On Fri, Dec 14, 2018 at 12:08 PM Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
4.14-stable review patch. If anyone has any objections, please let me know.
From: Eric Dumazet edumazet@google.com
[ Upstream commit 41727549de3e7281feb174d568c6e46823db8684 ]
There is one upstream commit which fixes this one. f9bfe4e6a9d0 ("tcp: lack of available data can also cause TSO defer")
On Fri, Dec 14, 2018 at 02:03:55PM +0000, Sudip Mukherjee wrote:
Hi Greg,
On Fri, Dec 14, 2018 at 12:08 PM Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
4.14-stable review patch. If anyone has any objections, please let me know.
From: Eric Dumazet edumazet@google.com
[ Upstream commit 41727549de3e7281feb174d568c6e46823db8684 ]
There is one upstream commit which fixes this one. f9bfe4e6a9d0 ("tcp: lack of available data can also cause TSO defer")
I can take this if Eric and/or David ack it :)
thanks,
greg k-h
From: Greg Kroah-Hartman gregkh@linuxfoundation.org Date: Fri, 14 Dec 2018 15:26:21 +0100
On Fri, Dec 14, 2018 at 02:03:55PM +0000, Sudip Mukherjee wrote:
Hi Greg,
On Fri, Dec 14, 2018 at 12:08 PM Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
4.14-stable review patch. If anyone has any objections, please let me know.
From: Eric Dumazet edumazet@google.com
[ Upstream commit 41727549de3e7281feb174d568c6e46823db8684 ]
There is one upstream commit which fixes this one. f9bfe4e6a9d0 ("tcp: lack of available data can also cause TSO defer")
I can take this if Eric and/or David ack it :)
Acked-by: David S. Miller davem@davemloft.net
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yuchung Cheng ycheng@google.com
[ Upstream commit b2b7af861122a0c0f6260155c29a1b2e594cd5b5 ]
TCP loss probe timer may fire when the retranmission queue is empty but has a non-zero tp->packets_out counter. tcp_send_loss_probe will call tcp_rearm_rto which triggers NULL pointer reference by fetching the retranmission queue head in its sub-routines.
Add a more detailed warning to help catch the root cause of the inflight accounting inconsistency.
Reported-by: Rafael Tinoco rafael.tinoco@linaro.org Signed-off-by: Yuchung Cheng ycheng@google.com Signed-off-by: Eric Dumazet edumazet@google.com Signed-off-by: Neal Cardwell ncardwell@google.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv4/tcp_output.c | 12 ++++++++---- 1 file changed, 8 insertions(+), 4 deletions(-)
--- a/net/ipv4/tcp_output.c +++ b/net/ipv4/tcp_output.c @@ -2476,14 +2476,18 @@ void tcp_send_loss_probe(struct sock *sk skb = tcp_write_queue_tail(sk); }
+ if (unlikely(!skb)) { + WARN_ONCE(tp->packets_out, + "invalid inflight: %u state %u cwnd %u mss %d\n", + tp->packets_out, sk->sk_state, tp->snd_cwnd, mss); + inet_csk(sk)->icsk_pending = 0; + return; + } + /* At most one outstanding TLP retransmission. */ if (tp->tlp_high_seq) goto rearm_timer;
- /* Retransmit last segment. */ - if (WARN_ON(!skb)) - goto rearm_timer; - if (skb_still_in_host_queue(sk, skb)) goto rearm_timer;
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nicolas Dichtel nicolas.dichtel@6wind.com
[ Upstream commit 35b827b6d06199841a83839e8bb69c0cd13a28be ]
It's not supported right now (the goal of the initial patch was to support 'ip link del' only).
Before the patch: $ ip link add foo type tun [ 239.632660] BUG: unable to handle kernel NULL pointer dereference at 0000000000000000 [snip] [ 239.636410] RIP: 0010:register_netdevice+0x8e/0x3a0
This panic occurs because dev->netdev_ops is not set by tun_setup(). But to have something usable, it will require more than just setting netdev_ops.
Fixes: f019a7a594d9 ("tun: Implement ip link del tunXXX") CC: Eric W. Biederman ebiederm@xmission.com Signed-off-by: Nicolas Dichtel nicolas.dichtel@6wind.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/tun.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
--- a/drivers/net/tun.c +++ b/drivers/net/tun.c @@ -1818,9 +1818,9 @@ static void tun_setup(struct net_device static int tun_validate(struct nlattr *tb[], struct nlattr *data[], struct netlink_ext_ack *extack) { - if (!data) - return 0; - return -EINVAL; + NL_SET_ERR_MSG(extack, + "tun/tap creation via rtnetlink is not supported."); + return -EOPNOTSUPP; }
static struct rtnl_link_ops tun_link_ops __read_mostly = {
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jason Wang jasowang@redhat.com
[ Upstream commit 436c9453a1ac0944b82870ef2e0d9be956b396d9 ]
We copy vnet header unconditionally in page_to_skb() this is wrong since XDP may modify the packet data. So let's keep a zeroed vnet header for not confusing the conversion between vnet header and skb metadata.
In the future, we should able to detect whether or not the packet was modified and keep using the vnet header when packet was not touched.
Fixes: f600b6905015 ("virtio_net: Add XDP support") Reported-by: Pavel Popa pashinho1990@gmail.com Signed-off-by: Jason Wang jasowang@redhat.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/virtio_net.c | 14 +++++++++----- 1 file changed, 9 insertions(+), 5 deletions(-)
--- a/drivers/net/virtio_net.c +++ b/drivers/net/virtio_net.c @@ -309,7 +309,8 @@ static unsigned int mergeable_ctx_to_tru static struct sk_buff *page_to_skb(struct virtnet_info *vi, struct receive_queue *rq, struct page *page, unsigned int offset, - unsigned int len, unsigned int truesize) + unsigned int len, unsigned int truesize, + bool hdr_valid) { struct sk_buff *skb; struct virtio_net_hdr_mrg_rxbuf *hdr; @@ -331,7 +332,8 @@ static struct sk_buff *page_to_skb(struc else hdr_padded_len = sizeof(struct padded_vnet_hdr);
- memcpy(hdr, p, hdr_len); + if (hdr_valid) + memcpy(hdr, p, hdr_len);
len -= hdr_len; offset += hdr_padded_len; @@ -594,7 +596,8 @@ static struct sk_buff *receive_big(struc unsigned int len) { struct page *page = buf; - struct sk_buff *skb = page_to_skb(vi, rq, page, 0, len, PAGE_SIZE); + struct sk_buff *skb = page_to_skb(vi, rq, page, 0, len, + PAGE_SIZE, true);
if (unlikely(!skb)) goto err; @@ -678,7 +681,8 @@ static struct sk_buff *receive_mergeable rcu_read_unlock(); put_page(page); head_skb = page_to_skb(vi, rq, xdp_page, - offset, len, PAGE_SIZE); + offset, len, + PAGE_SIZE, false); ewma_pkt_len_add(&rq->mrg_avg_pkt_len, len); return head_skb; } @@ -712,7 +716,7 @@ static struct sk_buff *receive_mergeable goto err_skb; }
- head_skb = page_to_skb(vi, rq, page, offset, len, truesize); + head_skb = page_to_skb(vi, rq, page, offset, len, truesize, !xdp_prog); curr_skb = head_skb;
if (unlikely(!curr_skb))
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit eef3dc34a1e0b01d53328b88c25237bcc7323777 ]
When building the kernel with Clang, the following section mismatch warning appears:
WARNING: vmlinux.o(.text+0x38b3c): Section mismatch in reference from the function omap44xx_prm_late_init() to the function .init.text:omap44xx_prm_enable_io_wakeup() The function omap44xx_prm_late_init() references the function __init omap44xx_prm_enable_io_wakeup(). This is often because omap44xx_prm_late_init lacks a __init annotation or the annotation of omap44xx_prm_enable_io_wakeup is wrong.
Remove the __init annotation from omap44xx_prm_enable_io_wakeup so there is no more mismatch.
Signed-off-by: Nathan Chancellor natechancellor@gmail.com Signed-off-by: Tony Lindgren tony@atomide.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm/mach-omap2/prm44xx.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm/mach-omap2/prm44xx.c b/arch/arm/mach-omap2/prm44xx.c index 1c0c1663f078..5affa9f5300b 100644 --- a/arch/arm/mach-omap2/prm44xx.c +++ b/arch/arm/mach-omap2/prm44xx.c @@ -344,7 +344,7 @@ static void omap44xx_prm_reconfigure_io_chain(void) * to occur, WAKEUPENABLE bits must be set in the pad mux registers, and * omap44xx_prm_reconfigure_io_chain() must be called. No return value. */ -static void __init omap44xx_prm_enable_io_wakeup(void) +static void omap44xx_prm_enable_io_wakeup(void) { s32 inst = omap4_prmst_get_prm_dev_inst();
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 3ee9a76a8c5a10e1bfb04b81db767c6d562ddaf3 ]
commit 4d230d12710646 ("ASoC: rsnd: fixup not to call clk_get/set under non-atomic") fixuped clock start timing. But it exchanged clock start checker from ssi->usrcnt to ssi->rate.
Current rsnd_ssi_master_clk_start() is called from .prepare, but some player (for example GStreamer) might calls it many times. In such case, the checker might returns error even though it was not error. It should check ssi->usrcnt instead of ssi->rate. This patch fixup it. Without this patch, GStreamer can't switch 48kHz / 44.1kHz.
Reported-by: Yusuke Goda yusuke.goda.sx@renesas.com Signed-off-by: Kuninori Morimoto kuninori.morimoto.gx@renesas.com Tested-by: Yusuke Goda yusuke.goda.sx@renesas.com Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/sh/rcar/ssi.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/sound/soc/sh/rcar/ssi.c b/sound/soc/sh/rcar/ssi.c index 34223c8c28a8..0db2791f7035 100644 --- a/sound/soc/sh/rcar/ssi.c +++ b/sound/soc/sh/rcar/ssi.c @@ -280,7 +280,7 @@ static int rsnd_ssi_master_clk_start(struct rsnd_mod *mod, if (rsnd_ssi_is_multi_slave(mod, io)) return 0;
- if (ssi->rate) { + if (ssi->usrcnt > 1) { if (ssi->rate != rate) { dev_err(dev, "SSI parent/child should use same rate\n"); return -EINVAL;
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit c3e43d8b958bd6849817393483e805d8638a8ab7 ]
We return 0 unconditionally in 'rtw_wx_read32()'. However, 'ret' is set to some error codes in several error handling paths.
Return 'ret' instead to propagate the error code.
Fixes: 554c0a3abf216 ("staging: Add rtl8723bs sdio wifi driver") Signed-off-by: Christophe JAILLET christophe.jaillet@wanadoo.fr Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/staging/rtl8723bs/os_dep/ioctl_linux.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/staging/rtl8723bs/os_dep/ioctl_linux.c b/drivers/staging/rtl8723bs/os_dep/ioctl_linux.c index d5e5f830f2a1..1b61da61690b 100644 --- a/drivers/staging/rtl8723bs/os_dep/ioctl_linux.c +++ b/drivers/staging/rtl8723bs/os_dep/ioctl_linux.c @@ -2383,7 +2383,7 @@ static int rtw_wx_read32(struct net_device *dev, exit: kfree(ptmp);
- return 0; + return ret; }
static int rtw_wx_write32(struct net_device *dev,
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 3d8b804bc528d3720ec0c39c212af92dafaf6e84 ]
The interrupt on mmc3_dat1 is wrong which prevents this from appearing in /proc/interrupts.
Fixes: ab8dd3aed011 ("ARM: DTS: Add minimal Support for Logic PD DM3730 SOM-LV") #Kernel 4.9+
Signed-off-by: Adam Ford aford173@gmail.com Signed-off-by: Tony Lindgren tony@atomide.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm/boot/dts/logicpd-som-lv.dtsi | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm/boot/dts/logicpd-som-lv.dtsi b/arch/arm/boot/dts/logicpd-som-lv.dtsi index c335b923753a..a7883676f675 100644 --- a/arch/arm/boot/dts/logicpd-som-lv.dtsi +++ b/arch/arm/boot/dts/logicpd-som-lv.dtsi @@ -123,7 +123,7 @@ };
&mmc3 { - interrupts-extended = <&intc 94 &omap3_pmx_core2 0x46>; + interrupts-extended = <&intc 94 &omap3_pmx_core 0x136>; pinctrl-0 = <&mmc3_pins &wl127x_gpio>; pinctrl-names = "default"; vmmc-supply = <&wl12xx_vmmc>;
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit cec83ff1241ec98113a19385ea9e9cfa9aa4125b ]
While playing with initialization order of modem device, it has been discovered that under some circumstances (early console init, I believe) its .pm() callback may be called before the uart_port->private_data pointer is initialized from plat_serial8250_port->private_data, resulting in NULL pointer dereference. Fix it by checking for uninitialized pointer before using it in modem_pm().
Fixes: aabf31737a6a ("ARM: OMAP1: ams-delta: update the modem to use regulator API") Signed-off-by: Janusz Krzysztofik jmkrzyszt@gmail.com Signed-off-by: Tony Lindgren tony@atomide.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm/mach-omap1/board-ams-delta.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/arch/arm/mach-omap1/board-ams-delta.c b/arch/arm/mach-omap1/board-ams-delta.c index 6cbc69c92913..4174fa86bfb1 100644 --- a/arch/arm/mach-omap1/board-ams-delta.c +++ b/arch/arm/mach-omap1/board-ams-delta.c @@ -512,6 +512,9 @@ static void modem_pm(struct uart_port *port, unsigned int state, unsigned old) struct modem_private_data *priv = port->private_data; int ret;
+ if (!priv) + return; + if (IS_ERR(priv->regulator)) return;
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit c4b7d1ba7d263b74bb72e9325262a67139605cde ]
Fixes gcc '-Wunused-but-set-variable' warning:
fs/sysv/inode.c: In function '__sysv_write_inode': fs/sysv/inode.c:239:6: warning: variable 'err' set but not used [-Wunused-but-set-variable]
__sysv_write_inode should return 'err' instead of 0
Fixes: 05459ca81ac3 ("repair sysv_write_inode(), switch sysv to simple_fsync()") Signed-off-by: YueHaibing yuehaibing@huawei.com Signed-off-by: Al Viro viro@zeniv.linux.org.uk Signed-off-by: Sasha Levin sashal@kernel.org --- fs/sysv/inode.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/sysv/inode.c b/fs/sysv/inode.c index 3c47b7d5d4cf..9e0874d1524c 100644 --- a/fs/sysv/inode.c +++ b/fs/sysv/inode.c @@ -275,7 +275,7 @@ static int __sysv_write_inode(struct inode *inode, int wait) } } brelse(bh); - return 0; + return err; }
int sysv_write_inode(struct inode *inode, struct writeback_control *wbc)
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 25d8bcedbf4329895dbaf9dd67baa6f18dad918c ]
Start flood ping for each cpu while loading/flushing rulesets to make sure we do not access already-free'd rules from nf_tables evaluation loop.
Also add this to TARGETS so 'make run_tests' in selftest dir runs it automatically.
This would have caught the bug fixed in previous change ("netfilter: nf_tables: do not skip inactive chains during generation update") sooner.
Signed-off-by: Florian Westphal fw@strlen.de Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Sasha Levin sashal@kernel.org --- tools/testing/selftests/Makefile | 1 + tools/testing/selftests/netfilter/Makefile | 6 ++ tools/testing/selftests/netfilter/config | 2 + .../selftests/netfilter/nft_trans_stress.sh | 78 +++++++++++++++++++ 4 files changed, 87 insertions(+) create mode 100644 tools/testing/selftests/netfilter/Makefile create mode 100644 tools/testing/selftests/netfilter/config create mode 100755 tools/testing/selftests/netfilter/nft_trans_stress.sh
diff --git a/tools/testing/selftests/Makefile b/tools/testing/selftests/Makefile index ea300e7818a7..10b89f5b9af7 100644 --- a/tools/testing/selftests/Makefile +++ b/tools/testing/selftests/Makefile @@ -20,6 +20,7 @@ TARGETS += memory-hotplug TARGETS += mount TARGETS += mqueue TARGETS += net +TARGETS += netfilter TARGETS += nsfs TARGETS += powerpc TARGETS += pstore diff --git a/tools/testing/selftests/netfilter/Makefile b/tools/testing/selftests/netfilter/Makefile new file mode 100644 index 000000000000..47ed6cef93fb --- /dev/null +++ b/tools/testing/selftests/netfilter/Makefile @@ -0,0 +1,6 @@ +# SPDX-License-Identifier: GPL-2.0 +# Makefile for netfilter selftests + +TEST_PROGS := nft_trans_stress.sh + +include ../lib.mk diff --git a/tools/testing/selftests/netfilter/config b/tools/testing/selftests/netfilter/config new file mode 100644 index 000000000000..1017313e41a8 --- /dev/null +++ b/tools/testing/selftests/netfilter/config @@ -0,0 +1,2 @@ +CONFIG_NET_NS=y +NF_TABLES_INET=y diff --git a/tools/testing/selftests/netfilter/nft_trans_stress.sh b/tools/testing/selftests/netfilter/nft_trans_stress.sh new file mode 100755 index 000000000000..f1affd12c4b1 --- /dev/null +++ b/tools/testing/selftests/netfilter/nft_trans_stress.sh @@ -0,0 +1,78 @@ +#!/bin/bash +# +# This test is for stress-testing the nf_tables config plane path vs. +# packet path processing: Make sure we never release rules that are +# still visible to other cpus. +# +# set -e + +# Kselftest framework requirement - SKIP code is 4. +ksft_skip=4 + +testns=testns1 +tables="foo bar baz quux" + +nft --version > /dev/null 2>&1 +if [ $? -ne 0 ];then + echo "SKIP: Could not run test without nft tool" + exit $ksft_skip +fi + +ip -Version > /dev/null 2>&1 +if [ $? -ne 0 ];then + echo "SKIP: Could not run test without ip tool" + exit $ksft_skip +fi + +tmp=$(mktemp) + +for table in $tables; do + echo add table inet "$table" >> "$tmp" + echo flush table inet "$table" >> "$tmp" + + echo "add chain inet $table INPUT { type filter hook input priority 0; }" >> "$tmp" + echo "add chain inet $table OUTPUT { type filter hook output priority 0; }" >> "$tmp" + for c in $(seq 1 400); do + chain=$(printf "chain%03u" "$c") + echo "add chain inet $table $chain" >> "$tmp" + done + + for c in $(seq 1 400); do + chain=$(printf "chain%03u" "$c") + for BASE in INPUT OUTPUT; do + echo "add rule inet $table $BASE counter jump $chain" >> "$tmp" + done + echo "add rule inet $table $chain counter return" >> "$tmp" + done +done + +ip netns add "$testns" +ip -netns "$testns" link set lo up + +lscpu | grep ^CPU(s): | ( read cpu cpunum ; +cpunum=$((cpunum-1)) +for i in $(seq 0 $cpunum);do + mask=$(printf 0x%x $((1<<$i))) + ip netns exec "$testns" taskset $mask ping -4 127.0.0.1 -fq > /dev/null & + ip netns exec "$testns" taskset $mask ping -6 ::1 -fq > /dev/null & +done) + +sleep 1 + +for i in $(seq 1 10) ; do ip netns exec "$testns" nft -f "$tmp" & done + +for table in $tables;do + randsleep=$((RANDOM%10)) + sleep $randsleep + ip netns exec "$testns" nft delete table inet $table 2>/dev/null +done + +randsleep=$((RANDOM%10)) +sleep $randsleep + +pkill -9 ping + +wait + +rm -f "$tmp" +ip netns del "$testns"
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 29e3880109e357fdc607b4393f8308cef6af9413 ]
nft_compat ops do not have static storage duration, unlike all other expressions.
When nf_tables_expr_destroy() returns, expr->ops might have been free'd already, so we need to store next address before calling expression destructor.
For same reason, we can't deref match pointer after nft_xt_put().
This can be easily reproduced by adding msleep() before nft_match_destroy() returns.
Fixes: 0ca743a55991 ("netfilter: nf_tables: add compatibility layer for x_tables") Reported-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Florian Westphal fw@strlen.de Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/netfilter/nf_tables_api.c | 5 +++-- net/netfilter/nft_compat.c | 3 ++- 2 files changed, 5 insertions(+), 3 deletions(-)
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c index 3ae365f92bff..ea1e57daf50e 100644 --- a/net/netfilter/nf_tables_api.c +++ b/net/netfilter/nf_tables_api.c @@ -2252,7 +2252,7 @@ static int nf_tables_getrule(struct net *net, struct sock *nlsk, static void nf_tables_rule_destroy(const struct nft_ctx *ctx, struct nft_rule *rule) { - struct nft_expr *expr; + struct nft_expr *expr, *next;
/* * Careful: some expressions might not be initialized in case this @@ -2260,8 +2260,9 @@ static void nf_tables_rule_destroy(const struct nft_ctx *ctx, */ expr = nft_expr_first(rule); while (expr != nft_expr_last(rule) && expr->ops) { + next = nft_expr_next(expr); nf_tables_expr_destroy(ctx, expr); - expr = nft_expr_next(expr); + expr = next; } kfree(rule); } diff --git a/net/netfilter/nft_compat.c b/net/netfilter/nft_compat.c index 6da1cec1494a..7533c2fd6b76 100644 --- a/net/netfilter/nft_compat.c +++ b/net/netfilter/nft_compat.c @@ -497,6 +497,7 @@ __nft_match_destroy(const struct nft_ctx *ctx, const struct nft_expr *expr, void *info) { struct xt_match *match = expr->ops->data; + struct module *me = match->me; struct xt_mtdtor_param par;
par.net = ctx->net; @@ -507,7 +508,7 @@ __nft_match_destroy(const struct nft_ctx *ctx, const struct nft_expr *expr, par.match->destroy(&par);
if (nft_xt_put(container_of(expr->ops, struct nft_xt, ops))) - module_put(match->me); + module_put(me); }
static void
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 70df9ebbd82c794ddfbb49d45b337f18d5588dc2 ]
When using DT configurations, the id pointer might turn out to be NULL. Then the driver encounters NULL pointer access:
Unable to handle kernel read from unreadable memory at vaddr 00000018 [...] PC is at ina2xx_probe+0x114/0x200 LR is at ina2xx_probe+0x10c/0x200 [...] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
The reason is that i2c core returns the id pointer by matching id_table with client->name, while the client->name is actually using the name from the first string in the DT compatible list, not the best one. So i2c core would fail to match the id_table if the best matched compatible string isn't the first one, and then would return a NULL id pointer.
This probably should be fixed in i2c core. But it doesn't hurt to make the driver robust. So this patch fixes it by using the "chip" that's added to unify both DT and non-DT configurations.
Additionally, since id pointer could be null, so as id->name: ina2xx 10-0047: power monitor (null) (Rshunt = 1000 uOhm) ina2xx 10-0048: power monitor (null) (Rshunt = 10000 uOhm)
So this patch also fixes NULL name pointer, using client->name to play safe and to align with hwmon->name.
Fixes: bd0ddd4d0883 ("hwmon: (ina2xx) Add OF device ID table") Signed-off-by: Nicolin Chen nicoleotsuka@gmail.com Signed-off-by: Guenter Roeck linux@roeck-us.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/hwmon/ina2xx.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/hwmon/ina2xx.c b/drivers/hwmon/ina2xx.c index 71d3445ba869..c2252cf452f5 100644 --- a/drivers/hwmon/ina2xx.c +++ b/drivers/hwmon/ina2xx.c @@ -491,7 +491,7 @@ static int ina2xx_probe(struct i2c_client *client, }
data->groups[group++] = &ina2xx_group; - if (id->driver_data == ina226) + if (chip == ina226) data->groups[group++] = &ina226_group;
hwmon_dev = devm_hwmon_device_register_with_groups(dev, client->name, @@ -500,7 +500,7 @@ static int ina2xx_probe(struct i2c_client *client, return PTR_ERR(hwmon_dev);
dev_info(dev, "power monitor %s (Rshunt = %li uOhm)\n", - id->name, data->rshunt); + client->name, data->rshunt);
return 0; }
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 20e00db2f59bdddf8a8e241473ef8be94631d3ae ]
Stack memory isn't DMA-safe so it isn't safe to use either regmap_raw_read or regmap_bulk_read to read into stack memory.
The two functions to read the scratch registers were using stack memory and regmap_raw_read. It's not worth allocating memory just for this trivial read, and it isn't time-critical. A simple regmap_read for each register is sufficient.
Signed-off-by: Richard Fitzgerald rf@opensource.cirrus.com Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/codecs/wm_adsp.c | 37 ++++++++++++++++++++----------------- 1 file changed, 20 insertions(+), 17 deletions(-)
diff --git a/sound/soc/codecs/wm_adsp.c b/sound/soc/codecs/wm_adsp.c index 989d093abda7..67330b6ab204 100644 --- a/sound/soc/codecs/wm_adsp.c +++ b/sound/soc/codecs/wm_adsp.c @@ -787,38 +787,41 @@ static unsigned int wm_adsp_region_to_reg(struct wm_adsp_region const *mem,
static void wm_adsp2_show_fw_status(struct wm_adsp *dsp) { - u16 scratch[4]; + unsigned int scratch[4]; + unsigned int addr = dsp->base + ADSP2_SCRATCH0; + unsigned int i; int ret;
- ret = regmap_raw_read(dsp->regmap, dsp->base + ADSP2_SCRATCH0, - scratch, sizeof(scratch)); - if (ret) { - adsp_err(dsp, "Failed to read SCRATCH regs: %d\n", ret); - return; + for (i = 0; i < ARRAY_SIZE(scratch); ++i) { + ret = regmap_read(dsp->regmap, addr + i, &scratch[i]); + if (ret) { + adsp_err(dsp, "Failed to read SCRATCH%u: %d\n", i, ret); + return; + } }
adsp_dbg(dsp, "FW SCRATCH 0:0x%x 1:0x%x 2:0x%x 3:0x%x\n", - be16_to_cpu(scratch[0]), - be16_to_cpu(scratch[1]), - be16_to_cpu(scratch[2]), - be16_to_cpu(scratch[3])); + scratch[0], scratch[1], scratch[2], scratch[3]); }
static void wm_adsp2v2_show_fw_status(struct wm_adsp *dsp) { - u32 scratch[2]; + unsigned int scratch[2]; int ret;
- ret = regmap_raw_read(dsp->regmap, dsp->base + ADSP2V2_SCRATCH0_1, - scratch, sizeof(scratch)); - + ret = regmap_read(dsp->regmap, dsp->base + ADSP2V2_SCRATCH0_1, + &scratch[0]); if (ret) { - adsp_err(dsp, "Failed to read SCRATCH regs: %d\n", ret); + adsp_err(dsp, "Failed to read SCRATCH0_1: %d\n", ret); return; }
- scratch[0] = be32_to_cpu(scratch[0]); - scratch[1] = be32_to_cpu(scratch[1]); + ret = regmap_read(dsp->regmap, dsp->base + ADSP2V2_SCRATCH2_3, + &scratch[1]); + if (ret) { + adsp_err(dsp, "Failed to read SCRATCH2_3: %d\n", ret); + return; + }
adsp_dbg(dsp, "FW SCRATCH 0:0x%x 1:0x%x 2:0x%x 3:0x%x\n", scratch[0] & 0xFFFF,
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 613a41b0d16e617f46776a93b975a1eeea96417c ]
On s390 command perf top fails [root@s35lp76 perf] # ./perf top -F100000 --stdio Error: cycles: PMU Hardware doesn't support sampling/overflow-interrupts. Try 'perf stat' [root@s35lp76 perf] #
Using event -e rb0000 works as designed. Event rb0000 is the event number of the sampling facility for basic sampling.
During system start up the following PMUs are installed in the kernel's PMU list (from head to tail): cpum_cf --> s390 PMU counter facility device driver cpum_sf --> s390 PMU sampling facility device driver uprobe kprobe tracepoint task_clock cpu_clock
Perf top executes following functions and calls perf_event_open(2) system call with different parameters many times:
cmd_top --> __cmd_top --> perf_evlist__add_default --> __perf_evlist__add_default --> perf_evlist__new_cycles (creates event type:0 (HW) config 0 (CPU_CYCLES) --> perf_event_attr__set_max_precise_ip Uses perf_event_open(2) to detect correct precise_ip level. Fails 3 times on s390 which is ok.
Then functions cmd_top --> __cmd_top --> perf_top__start_counters -->perf_evlist__config --> perf_can_comm_exec --> perf_probe_api This functions test support for the following events: "cycles:u", "instructions:u", "cpu-clock:u" using --> perf_do_probe_api --> perf_event_open_cloexec Test the close on exec flag support with perf_event_open(2). perf_do_probe_api returns true if the event is supported. The function returns true because event cpu-clock is supported by the PMU cpu_clock. This is achieved by many calls to perf_event_open(2).
Function perf_top__start_counters now calls perf_evsel__open() for every event, which is the default event cpu_cycles (config:0) and type HARDWARE (type:0) which a predfined frequence of 4000.
Given the above order of the PMU list, the PMU cpum_cf gets called first and returns 0, which indicates support for this sampling. The event is fully allocated in the function perf_event_open (file kernel/event/core.c near line 10521 and the following check fails:
event = perf_event_alloc(&attr, cpu, task, group_leader, NULL, NULL, NULL, cgroup_fd); if (IS_ERR(event)) { err = PTR_ERR(event); goto err_cred; }
if (is_sampling_event(event)) { if (event->pmu->capabilities & PERF_PMU_CAP_NO_INTERRUPT) { err = -EOPNOTSUPP; goto err_alloc; } }
The check for the interrupt capabilities fails and the system call perf_event_open() returns -EOPNOTSUPP (-95).
Add a check to return -ENODEV when sampling is requested in PMU cpum_cf. This allows common kernel code in the perf_event_open() system call to test the next PMU in above list.
Fixes: 97b1198fece0 (" "s390, perf: Use common PMU interrupt disabled code") Signed-off-by: Thomas Richter tmricht@linux.ibm.com Reviewed-by: Hendrik Brueckner brueckner@linux.ibm.com Signed-off-by: Martin Schwidefsky schwidefsky@de.ibm.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/s390/kernel/perf_cpum_cf.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/arch/s390/kernel/perf_cpum_cf.c b/arch/s390/kernel/perf_cpum_cf.c index 61e91fee8467..edf6a61f0a64 100644 --- a/arch/s390/kernel/perf_cpum_cf.c +++ b/arch/s390/kernel/perf_cpum_cf.c @@ -349,6 +349,8 @@ static int __hw_perf_event_init(struct perf_event *event) break;
case PERF_TYPE_HARDWARE: + if (is_sampling_event(event)) /* No sampling support */ + return -ENOENT; ev = attr->config; /* Count user space (problem-state) only */ if (!attr->exclude_user && attr->exclude_kernel) {
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 38cd989ee38c16388cde89db5b734f9d55b905f9 ]
The current register (04h) has a sign bit at MSB. The comments for this calculation also mention that it's a signed register.
However, the regval is unsigned type so result of calculation turns out to be an incorrect value when current is negative.
This patch simply fixes this by adding a casting to s16.
Fixes: 5d389b125186c ("hwmon: (ina2xx) Make calibration register value fixed") Signed-off-by: Nicolin Chen nicoleotsuka@gmail.com Signed-off-by: Guenter Roeck linux@roeck-us.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/hwmon/ina2xx.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/hwmon/ina2xx.c b/drivers/hwmon/ina2xx.c index c2252cf452f5..07ee19573b3f 100644 --- a/drivers/hwmon/ina2xx.c +++ b/drivers/hwmon/ina2xx.c @@ -274,7 +274,7 @@ static int ina2xx_get_value(struct ina2xx_data *data, u8 reg, break; case INA2XX_CURRENT: /* signed register, result in mA */ - val = regval * data->current_lsb_uA; + val = (s16)regval * data->current_lsb_uA; val = DIV_ROUND_CLOSEST(val, 1000); break; case INA2XX_CALIBRATION:
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 76836fd354922ebe4798a64fda01f8dc6a8b0984 ]
The machine driver fails to probe in next-20181113 with:
[ 2.539093] omap-abe-twl6040 sound: ASoC: CODEC DAI twl6040-legacy not registered [ 2.546630] omap-abe-twl6040 sound: devm_snd_soc_register_card() failed: -517 ... [ 3.693206] omap-abe-twl6040 sound: ASoC: Both platform name/of_node are set for TWL6040 [ 3.701446] omap-abe-twl6040 sound: ASoC: failed to init link TWL6040 [ 3.708007] omap-abe-twl6040 sound: devm_snd_soc_register_card() failed: -22 [ 3.715148] omap-abe-twl6040: probe of sound failed with error -22
Bisect pointed to a merge commit: first bad commit: [0f688ab20a540aafa984c5dbd68a71debebf4d7f] Merge remote-tracking branch 'net-next/master'
and a diff between a working kernel does not reveal anything which would explain the change in behavior.
Further investigation showed that on the second try of loading fails because the dai_link->platform is no longer NULL and it might be pointing to uninitialized memory.
The fix is to move the snd_soc_dai_link and snd_soc_card inside of the abe_twl6040 struct, which is dynamically allocated every time the driver probes.
Signed-off-by: Peter Ujfalusi peter.ujfalusi@ti.com Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/omap/omap-abe-twl6040.c | 67 +++++++++++++------------------ 1 file changed, 29 insertions(+), 38 deletions(-)
diff --git a/sound/soc/omap/omap-abe-twl6040.c b/sound/soc/omap/omap-abe-twl6040.c index 614b18d2f631..6fd143799534 100644 --- a/sound/soc/omap/omap-abe-twl6040.c +++ b/sound/soc/omap/omap-abe-twl6040.c @@ -36,6 +36,8 @@ #include "../codecs/twl6040.h"
struct abe_twl6040 { + struct snd_soc_card card; + struct snd_soc_dai_link dai_links[2]; int jack_detection; /* board can detect jack events */ int mclk_freq; /* MCLK frequency speed for twl6040 */ }; @@ -208,40 +210,10 @@ static int omap_abe_dmic_init(struct snd_soc_pcm_runtime *rtd) ARRAY_SIZE(dmic_audio_map)); }
-/* Digital audio interface glue - connects codec <--> CPU */ -static struct snd_soc_dai_link abe_twl6040_dai_links[] = { - { - .name = "TWL6040", - .stream_name = "TWL6040", - .codec_dai_name = "twl6040-legacy", - .codec_name = "twl6040-codec", - .init = omap_abe_twl6040_init, - .ops = &omap_abe_ops, - }, - { - .name = "DMIC", - .stream_name = "DMIC Capture", - .codec_dai_name = "dmic-hifi", - .codec_name = "dmic-codec", - .init = omap_abe_dmic_init, - .ops = &omap_abe_dmic_ops, - }, -}; - -/* Audio machine driver */ -static struct snd_soc_card omap_abe_card = { - .owner = THIS_MODULE, - - .dapm_widgets = twl6040_dapm_widgets, - .num_dapm_widgets = ARRAY_SIZE(twl6040_dapm_widgets), - .dapm_routes = audio_map, - .num_dapm_routes = ARRAY_SIZE(audio_map), -}; - static int omap_abe_probe(struct platform_device *pdev) { struct device_node *node = pdev->dev.of_node; - struct snd_soc_card *card = &omap_abe_card; + struct snd_soc_card *card; struct device_node *dai_node; struct abe_twl6040 *priv; int num_links = 0; @@ -252,12 +224,18 @@ static int omap_abe_probe(struct platform_device *pdev) return -ENODEV; }
- card->dev = &pdev->dev; - priv = devm_kzalloc(&pdev->dev, sizeof(struct abe_twl6040), GFP_KERNEL); if (priv == NULL) return -ENOMEM;
+ card = &priv->card; + card->dev = &pdev->dev; + card->owner = THIS_MODULE; + card->dapm_widgets = twl6040_dapm_widgets; + card->num_dapm_widgets = ARRAY_SIZE(twl6040_dapm_widgets); + card->dapm_routes = audio_map; + card->num_dapm_routes = ARRAY_SIZE(audio_map); + if (snd_soc_of_parse_card_name(card, "ti,model")) { dev_err(&pdev->dev, "Card name is not provided\n"); return -ENODEV; @@ -274,14 +252,27 @@ static int omap_abe_probe(struct platform_device *pdev) dev_err(&pdev->dev, "McPDM node is not provided\n"); return -EINVAL; } - abe_twl6040_dai_links[0].cpu_of_node = dai_node; - abe_twl6040_dai_links[0].platform_of_node = dai_node; + + priv->dai_links[0].name = "DMIC"; + priv->dai_links[0].stream_name = "TWL6040"; + priv->dai_links[0].cpu_of_node = dai_node; + priv->dai_links[0].platform_of_node = dai_node; + priv->dai_links[0].codec_dai_name = "twl6040-legacy"; + priv->dai_links[0].codec_name = "twl6040-codec"; + priv->dai_links[0].init = omap_abe_twl6040_init; + priv->dai_links[0].ops = &omap_abe_ops;
dai_node = of_parse_phandle(node, "ti,dmic", 0); if (dai_node) { num_links = 2; - abe_twl6040_dai_links[1].cpu_of_node = dai_node; - abe_twl6040_dai_links[1].platform_of_node = dai_node; + priv->dai_links[1].name = "TWL6040"; + priv->dai_links[1].stream_name = "DMIC Capture"; + priv->dai_links[1].cpu_of_node = dai_node; + priv->dai_links[1].platform_of_node = dai_node; + priv->dai_links[1].codec_dai_name = "dmic-hifi"; + priv->dai_links[1].codec_name = "dmic-codec"; + priv->dai_links[1].init = omap_abe_dmic_init; + priv->dai_links[1].ops = &omap_abe_dmic_ops; } else { num_links = 1; } @@ -300,7 +291,7 @@ static int omap_abe_probe(struct platform_device *pdev) return -ENODEV; }
- card->dai_link = abe_twl6040_dai_links; + card->dai_link = priv->dai_links; card->num_links = num_links;
snd_soc_card_set_drvdata(card, priv);
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 882eab6c28d23a970ae73b7eb831b169a672d456 ]
Audio map are possible in wrong state before card->instantiated has been set to true. Imaging the following examples:
time 1: at the beginning
in:-1 in:-1 in:-1 in:-1 out:-1 out:-1 out:-1 out:-1 SIGGEN A B Spk
time 2: after someone called snd_soc_dapm_new_widgets() (e.g. create_fill_widget_route_map() in sound/soc/codecs/hdac_hdmi.c)
in:1 in:0 in:0 in:0 out:0 out:0 out:0 out:1 SIGGEN A B Spk
time 3: routes added
in:1 in:0 in:0 in:0 out:0 out:0 out:0 out:1 SIGGEN -----> A -----> B ---> Spk
In the end, the path should be powered on but it did not. At time 3, "in" of SIGGEN and "out" of Spk did not propagate to their neighbors because snd_soc_dapm_add_path() will not invalidate the paths if the card has not instantiated (i.e. card->instantiated is false). To correct the state of audio map, recalculate the whole map forcely.
Signed-off-by: Tzung-Bi Shih tzungbi@google.com Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/soc-core.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/sound/soc/soc-core.c b/sound/soc/soc-core.c index fee4b0ef5566..42c2a3065b77 100644 --- a/sound/soc/soc-core.c +++ b/sound/soc/soc-core.c @@ -2307,6 +2307,7 @@ static int snd_soc_instantiate_card(struct snd_soc_card *card) }
card->instantiated = 1; + dapm_mark_endpoints_dirty(card); snd_soc_dapm_sync(&card->dapm); mutex_unlock(&card->mutex); mutex_unlock(&client_mutex);
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 0145b50566e7de5637e80ecba96c7f0e6fff1aad ]
Before this commit sensor_hub_input_attr_get_raw_value() failed to take the signedness of 16 and 8 bit values into account, returning e.g. 65436 instead of -100 for the z-axis reading of an accelerometer.
This commit adds a new is_signed parameter to the function and makes all callers pass the appropriate value for this.
While at it, this commit also fixes up some neighboring lines where statements were needlessly split over 2 lines to improve readability.
Signed-off-by: Hans de Goede hdegoede@redhat.com Acked-by: Srinivas Pandruvada srinivas.pandruvada@linux.intel.com Acked-by: Benjamin Tissoires benjamin.tissoires@redhat.com Cc: Stable@vger.kernel.org Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/hid/hid-sensor-custom.c | 2 +- drivers/hid/hid-sensor-hub.c | 13 ++++++++++--- drivers/iio/accel/hid-sensor-accel-3d.c | 5 ++++- drivers/iio/gyro/hid-sensor-gyro-3d.c | 5 ++++- drivers/iio/humidity/hid-sensor-humidity.c | 3 ++- drivers/iio/light/hid-sensor-als.c | 8 +++++--- drivers/iio/light/hid-sensor-prox.c | 8 +++++--- drivers/iio/magnetometer/hid-sensor-magn-3d.c | 8 +++++--- drivers/iio/orientation/hid-sensor-incl-3d.c | 8 +++++--- drivers/iio/pressure/hid-sensor-press.c | 8 +++++--- drivers/iio/temperature/hid-sensor-temperature.c | 3 ++- drivers/rtc/rtc-hid-sensor-time.c | 2 +- include/linux/hid-sensor-hub.h | 4 +++- 13 files changed, 52 insertions(+), 25 deletions(-)
diff --git a/drivers/hid/hid-sensor-custom.c b/drivers/hid/hid-sensor-custom.c index 0bcf041368c7..574126b649e9 100644 --- a/drivers/hid/hid-sensor-custom.c +++ b/drivers/hid/hid-sensor-custom.c @@ -358,7 +358,7 @@ static ssize_t show_value(struct device *dev, struct device_attribute *attr, sensor_inst->hsdev, sensor_inst->hsdev->usage, usage, report_id, - SENSOR_HUB_SYNC); + SENSOR_HUB_SYNC, false); } else if (!strncmp(name, "units", strlen("units"))) value = sensor_inst->fields[field_index].attribute.units; else if (!strncmp(name, "unit-expo", strlen("unit-expo"))) diff --git a/drivers/hid/hid-sensor-hub.c b/drivers/hid/hid-sensor-hub.c index faba542d1b07..b5bd5cb7d532 100644 --- a/drivers/hid/hid-sensor-hub.c +++ b/drivers/hid/hid-sensor-hub.c @@ -299,7 +299,8 @@ EXPORT_SYMBOL_GPL(sensor_hub_get_feature); int sensor_hub_input_attr_get_raw_value(struct hid_sensor_hub_device *hsdev, u32 usage_id, u32 attr_usage_id, u32 report_id, - enum sensor_hub_read_flags flag) + enum sensor_hub_read_flags flag, + bool is_signed) { struct sensor_hub_data *data = hid_get_drvdata(hsdev->hdev); unsigned long flags; @@ -331,10 +332,16 @@ int sensor_hub_input_attr_get_raw_value(struct hid_sensor_hub_device *hsdev, &hsdev->pending.ready, HZ*5); switch (hsdev->pending.raw_size) { case 1: - ret_val = *(u8 *)hsdev->pending.raw_data; + if (is_signed) + ret_val = *(s8 *)hsdev->pending.raw_data; + else + ret_val = *(u8 *)hsdev->pending.raw_data; break; case 2: - ret_val = *(u16 *)hsdev->pending.raw_data; + if (is_signed) + ret_val = *(s16 *)hsdev->pending.raw_data; + else + ret_val = *(u16 *)hsdev->pending.raw_data; break; case 4: ret_val = *(u32 *)hsdev->pending.raw_data; diff --git a/drivers/iio/accel/hid-sensor-accel-3d.c b/drivers/iio/accel/hid-sensor-accel-3d.c index 2238a26aba63..f573d9c61fc3 100644 --- a/drivers/iio/accel/hid-sensor-accel-3d.c +++ b/drivers/iio/accel/hid-sensor-accel-3d.c @@ -149,6 +149,7 @@ static int accel_3d_read_raw(struct iio_dev *indio_dev, int report_id = -1; u32 address; int ret_type; + s32 min; struct hid_sensor_hub_device *hsdev = accel_state->common_attributes.hsdev;
@@ -158,12 +159,14 @@ static int accel_3d_read_raw(struct iio_dev *indio_dev, case 0: hid_sensor_power_state(&accel_state->common_attributes, true); report_id = accel_state->accel[chan->scan_index].report_id; + min = accel_state->accel[chan->scan_index].logical_minimum; address = accel_3d_addresses[chan->scan_index]; if (report_id >= 0) *val = sensor_hub_input_attr_get_raw_value( accel_state->common_attributes.hsdev, hsdev->usage, address, report_id, - SENSOR_HUB_SYNC); + SENSOR_HUB_SYNC, + min < 0); else { *val = 0; hid_sensor_power_state(&accel_state->common_attributes, diff --git a/drivers/iio/gyro/hid-sensor-gyro-3d.c b/drivers/iio/gyro/hid-sensor-gyro-3d.c index c67ce2ac4715..d9192eb41131 100644 --- a/drivers/iio/gyro/hid-sensor-gyro-3d.c +++ b/drivers/iio/gyro/hid-sensor-gyro-3d.c @@ -111,6 +111,7 @@ static int gyro_3d_read_raw(struct iio_dev *indio_dev, int report_id = -1; u32 address; int ret_type; + s32 min;
*val = 0; *val2 = 0; @@ -118,13 +119,15 @@ static int gyro_3d_read_raw(struct iio_dev *indio_dev, case 0: hid_sensor_power_state(&gyro_state->common_attributes, true); report_id = gyro_state->gyro[chan->scan_index].report_id; + min = gyro_state->gyro[chan->scan_index].logical_minimum; address = gyro_3d_addresses[chan->scan_index]; if (report_id >= 0) *val = sensor_hub_input_attr_get_raw_value( gyro_state->common_attributes.hsdev, HID_USAGE_SENSOR_GYRO_3D, address, report_id, - SENSOR_HUB_SYNC); + SENSOR_HUB_SYNC, + min < 0); else { *val = 0; hid_sensor_power_state(&gyro_state->common_attributes, diff --git a/drivers/iio/humidity/hid-sensor-humidity.c b/drivers/iio/humidity/hid-sensor-humidity.c index 6e09c1acfe51..e53914d51ec3 100644 --- a/drivers/iio/humidity/hid-sensor-humidity.c +++ b/drivers/iio/humidity/hid-sensor-humidity.c @@ -75,7 +75,8 @@ static int humidity_read_raw(struct iio_dev *indio_dev, HID_USAGE_SENSOR_HUMIDITY, HID_USAGE_SENSOR_ATMOSPHERIC_HUMIDITY, humid_st->humidity_attr.report_id, - SENSOR_HUB_SYNC); + SENSOR_HUB_SYNC, + humid_st->humidity_attr.logical_minimum < 0); hid_sensor_power_state(&humid_st->common_attributes, false);
return IIO_VAL_INT; diff --git a/drivers/iio/light/hid-sensor-als.c b/drivers/iio/light/hid-sensor-als.c index 059d964772c7..95ca86f50434 100644 --- a/drivers/iio/light/hid-sensor-als.c +++ b/drivers/iio/light/hid-sensor-als.c @@ -93,6 +93,7 @@ static int als_read_raw(struct iio_dev *indio_dev, int report_id = -1; u32 address; int ret_type; + s32 min;
*val = 0; *val2 = 0; @@ -102,8 +103,8 @@ static int als_read_raw(struct iio_dev *indio_dev, case CHANNEL_SCAN_INDEX_INTENSITY: case CHANNEL_SCAN_INDEX_ILLUM: report_id = als_state->als_illum.report_id; - address = - HID_USAGE_SENSOR_LIGHT_ILLUM; + min = als_state->als_illum.logical_minimum; + address = HID_USAGE_SENSOR_LIGHT_ILLUM; break; default: report_id = -1; @@ -116,7 +117,8 @@ static int als_read_raw(struct iio_dev *indio_dev, als_state->common_attributes.hsdev, HID_USAGE_SENSOR_ALS, address, report_id, - SENSOR_HUB_SYNC); + SENSOR_HUB_SYNC, + min < 0); hid_sensor_power_state(&als_state->common_attributes, false); } else { diff --git a/drivers/iio/light/hid-sensor-prox.c b/drivers/iio/light/hid-sensor-prox.c index 73fced8a63b7..8c017abc4ee2 100644 --- a/drivers/iio/light/hid-sensor-prox.c +++ b/drivers/iio/light/hid-sensor-prox.c @@ -73,6 +73,7 @@ static int prox_read_raw(struct iio_dev *indio_dev, int report_id = -1; u32 address; int ret_type; + s32 min;
*val = 0; *val2 = 0; @@ -81,8 +82,8 @@ static int prox_read_raw(struct iio_dev *indio_dev, switch (chan->scan_index) { case CHANNEL_SCAN_INDEX_PRESENCE: report_id = prox_state->prox_attr.report_id; - address = - HID_USAGE_SENSOR_HUMAN_PRESENCE; + min = prox_state->prox_attr.logical_minimum; + address = HID_USAGE_SENSOR_HUMAN_PRESENCE; break; default: report_id = -1; @@ -95,7 +96,8 @@ static int prox_read_raw(struct iio_dev *indio_dev, prox_state->common_attributes.hsdev, HID_USAGE_SENSOR_PROX, address, report_id, - SENSOR_HUB_SYNC); + SENSOR_HUB_SYNC, + min < 0); hid_sensor_power_state(&prox_state->common_attributes, false); } else { diff --git a/drivers/iio/magnetometer/hid-sensor-magn-3d.c b/drivers/iio/magnetometer/hid-sensor-magn-3d.c index 0e791b02ed4a..b495107bd173 100644 --- a/drivers/iio/magnetometer/hid-sensor-magn-3d.c +++ b/drivers/iio/magnetometer/hid-sensor-magn-3d.c @@ -163,21 +163,23 @@ static int magn_3d_read_raw(struct iio_dev *indio_dev, int report_id = -1; u32 address; int ret_type; + s32 min;
*val = 0; *val2 = 0; switch (mask) { case 0: hid_sensor_power_state(&magn_state->magn_flux_attributes, true); - report_id = - magn_state->magn[chan->address].report_id; + report_id = magn_state->magn[chan->address].report_id; + min = magn_state->magn[chan->address].logical_minimum; address = magn_3d_addresses[chan->address]; if (report_id >= 0) *val = sensor_hub_input_attr_get_raw_value( magn_state->magn_flux_attributes.hsdev, HID_USAGE_SENSOR_COMPASS_3D, address, report_id, - SENSOR_HUB_SYNC); + SENSOR_HUB_SYNC, + min < 0); else { *val = 0; hid_sensor_power_state( diff --git a/drivers/iio/orientation/hid-sensor-incl-3d.c b/drivers/iio/orientation/hid-sensor-incl-3d.c index fd1b3696ee42..16c744bef021 100644 --- a/drivers/iio/orientation/hid-sensor-incl-3d.c +++ b/drivers/iio/orientation/hid-sensor-incl-3d.c @@ -111,21 +111,23 @@ static int incl_3d_read_raw(struct iio_dev *indio_dev, int report_id = -1; u32 address; int ret_type; + s32 min;
*val = 0; *val2 = 0; switch (mask) { case IIO_CHAN_INFO_RAW: hid_sensor_power_state(&incl_state->common_attributes, true); - report_id = - incl_state->incl[chan->scan_index].report_id; + report_id = incl_state->incl[chan->scan_index].report_id; + min = incl_state->incl[chan->scan_index].logical_minimum; address = incl_3d_addresses[chan->scan_index]; if (report_id >= 0) *val = sensor_hub_input_attr_get_raw_value( incl_state->common_attributes.hsdev, HID_USAGE_SENSOR_INCLINOMETER_3D, address, report_id, - SENSOR_HUB_SYNC); + SENSOR_HUB_SYNC, + min < 0); else { hid_sensor_power_state(&incl_state->common_attributes, false); diff --git a/drivers/iio/pressure/hid-sensor-press.c b/drivers/iio/pressure/hid-sensor-press.c index 6848d8c80eff..1c49ef78f888 100644 --- a/drivers/iio/pressure/hid-sensor-press.c +++ b/drivers/iio/pressure/hid-sensor-press.c @@ -77,6 +77,7 @@ static int press_read_raw(struct iio_dev *indio_dev, int report_id = -1; u32 address; int ret_type; + s32 min;
*val = 0; *val2 = 0; @@ -85,8 +86,8 @@ static int press_read_raw(struct iio_dev *indio_dev, switch (chan->scan_index) { case CHANNEL_SCAN_INDEX_PRESSURE: report_id = press_state->press_attr.report_id; - address = - HID_USAGE_SENSOR_ATMOSPHERIC_PRESSURE; + min = press_state->press_attr.logical_minimum; + address = HID_USAGE_SENSOR_ATMOSPHERIC_PRESSURE; break; default: report_id = -1; @@ -99,7 +100,8 @@ static int press_read_raw(struct iio_dev *indio_dev, press_state->common_attributes.hsdev, HID_USAGE_SENSOR_PRESSURE, address, report_id, - SENSOR_HUB_SYNC); + SENSOR_HUB_SYNC, + min < 0); hid_sensor_power_state(&press_state->common_attributes, false); } else { diff --git a/drivers/iio/temperature/hid-sensor-temperature.c b/drivers/iio/temperature/hid-sensor-temperature.c index c01efeca4002..6ed5cd5742f1 100644 --- a/drivers/iio/temperature/hid-sensor-temperature.c +++ b/drivers/iio/temperature/hid-sensor-temperature.c @@ -76,7 +76,8 @@ static int temperature_read_raw(struct iio_dev *indio_dev, HID_USAGE_SENSOR_TEMPERATURE, HID_USAGE_SENSOR_DATA_ENVIRONMENTAL_TEMPERATURE, temp_st->temperature_attr.report_id, - SENSOR_HUB_SYNC); + SENSOR_HUB_SYNC, + temp_st->temperature_attr.logical_minimum < 0); hid_sensor_power_state( &temp_st->common_attributes, false); diff --git a/drivers/rtc/rtc-hid-sensor-time.c b/drivers/rtc/rtc-hid-sensor-time.c index 2751dba850c6..3e1abb455472 100644 --- a/drivers/rtc/rtc-hid-sensor-time.c +++ b/drivers/rtc/rtc-hid-sensor-time.c @@ -213,7 +213,7 @@ static int hid_rtc_read_time(struct device *dev, struct rtc_time *tm) /* get a report with all values through requesting one value */ sensor_hub_input_attr_get_raw_value(time_state->common_attributes.hsdev, HID_USAGE_SENSOR_TIME, hid_time_addresses[0], - time_state->info[0].report_id, SENSOR_HUB_SYNC); + time_state->info[0].report_id, SENSOR_HUB_SYNC, false); /* wait for all values (event) */ ret = wait_for_completion_killable_timeout( &time_state->comp_last_time, HZ*6); diff --git a/include/linux/hid-sensor-hub.h b/include/linux/hid-sensor-hub.h index fc7aae64dcde..000de6da3b1b 100644 --- a/include/linux/hid-sensor-hub.h +++ b/include/linux/hid-sensor-hub.h @@ -177,6 +177,7 @@ int sensor_hub_input_get_attribute_info(struct hid_sensor_hub_device *hsdev, * @attr_usage_id: Attribute usage id as per spec * @report_id: Report id to look for * @flag: Synchronous or asynchronous read +* @is_signed: If true then fields < 32 bits will be sign-extended * * Issues a synchronous or asynchronous read request for an input attribute. * Returns data upto 32 bits. @@ -190,7 +191,8 @@ enum sensor_hub_read_flags { int sensor_hub_input_attr_get_raw_value(struct hid_sensor_hub_device *hsdev, u32 usage_id, u32 attr_usage_id, u32 report_id, - enum sensor_hub_read_flags flag + enum sensor_hub_read_flags flag, + bool is_signed );
/**
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit b4e955e9f372035361fbc6f07b21fe2cc6a5be4a ]
In the htable_create(), hinfo is allocated by vmalloc() So that if error occurred, hinfo should be freed.
Fixes: 11d5f15723c9 ("netfilter: xt_hashlimit: Create revision 2 to support higher pps rates") Signed-off-by: Taehee Yoo ap420073@gmail.com Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/netfilter/xt_hashlimit.c | 9 +++------ 1 file changed, 3 insertions(+), 6 deletions(-)
diff --git a/net/netfilter/xt_hashlimit.c b/net/netfilter/xt_hashlimit.c index 0c034597b9b8..fe8e8a1622b5 100644 --- a/net/netfilter/xt_hashlimit.c +++ b/net/netfilter/xt_hashlimit.c @@ -295,9 +295,10 @@ static int htable_create(struct net *net, struct hashlimit_cfg3 *cfg,
/* copy match config into hashtable config */ ret = cfg_copy(&hinfo->cfg, (void *)cfg, 3); - - if (ret) + if (ret) { + vfree(hinfo); return ret; + }
hinfo->cfg.size = size; if (hinfo->cfg.max == 0) @@ -814,7 +815,6 @@ hashlimit_mt_v1(const struct sk_buff *skb, struct xt_action_param *par) int ret;
ret = cfg_copy(&cfg, (void *)&info->cfg, 1); - if (ret) return ret;
@@ -830,7 +830,6 @@ hashlimit_mt_v2(const struct sk_buff *skb, struct xt_action_param *par) int ret;
ret = cfg_copy(&cfg, (void *)&info->cfg, 2); - if (ret) return ret;
@@ -920,7 +919,6 @@ static int hashlimit_mt_check_v1(const struct xt_mtchk_param *par) return ret;
ret = cfg_copy(&cfg, (void *)&info->cfg, 1); - if (ret) return ret;
@@ -939,7 +937,6 @@ static int hashlimit_mt_check_v2(const struct xt_mtchk_param *par) return ret;
ret = cfg_copy(&cfg, (void *)&info->cfg, 2); - if (ret) return ret;
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 09aaf6813cfca4c18034fda7a43e68763f34abb1 ]
Both datasheet and comments of store_temp_mode() tell us that temp1~4_type is writable, so fix it.
Signed-off-by: Yao Wang wangyao@lemote.com Signed-off-by: Huacai Chen chenhc@lemote.com Fixes: 39deb6993e7c (" hwmon: (w83795) Simplify temperature sensor type handling") Signed-off-by: Guenter Roeck linux@roeck-us.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/hwmon/w83795.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/hwmon/w83795.c b/drivers/hwmon/w83795.c index 49276bbdac3d..1bb80f992aa8 100644 --- a/drivers/hwmon/w83795.c +++ b/drivers/hwmon/w83795.c @@ -1691,7 +1691,7 @@ store_sf_setup(struct device *dev, struct device_attribute *attr, * somewhere else in the code */ #define SENSOR_ATTR_TEMP(index) { \ - SENSOR_ATTR_2(temp##index##_type, S_IRUGO | (index < 4 ? S_IWUSR : 0), \ + SENSOR_ATTR_2(temp##index##_type, S_IRUGO | (index < 5 ? S_IWUSR : 0), \ show_temp_mode, store_temp_mode, NOT_USED, index - 1), \ SENSOR_ATTR_2(temp##index##_input, S_IRUGO, show_temp, \ NULL, TEMP_READ, index - 1), \
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit b01c1f69c8660eaeab7d365cd570103c5c073a02 ]
When reporting on 'record' server we try to retrieve/use the mnt namespace of the profiled tasks. We use following API with cookie to hold the return namespace, roughly:
nsinfo__mountns_enter(struct nsinfo *nsi, struct nscookie *nc) setns(newns, 0); ... new ns related open.. ... nsinfo__mountns_exit(struct nscookie *nc) setns(nc->oldns)
Once finished we setns to old namespace, which also sets the current working directory (cwd) to "/", trashing the cwd we had.
This is mostly fine, because we use absolute paths almost everywhere, but it screws up 'perf diff':
# perf diff failed to open perf.data: No such file or directory (try 'perf record' first) ...
Adding the current working directory to be part of the cookie and restoring it in the nsinfo__mountns_exit call.
Signed-off-by: Jiri Olsa jolsa@kernel.org Cc: Alexander Shishkin alexander.shishkin@linux.intel.com Cc: Krister Johansen kjlx@templeofstupid.com Cc: Namhyung Kim namhyung@kernel.org Cc: Peter Zijlstra peterz@infradead.org Fixes: 843ff37bb59e ("perf symbols: Find symbols in different mount namespace") Link: http://lkml.kernel.org/r/20181101170001.30019-1-jolsa@kernel.org [ No need to check for NULL args for free(), use zfree() for struct members ] Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/perf/util/namespaces.c | 17 +++++++++++++++-- tools/perf/util/namespaces.h | 1 + 2 files changed, 16 insertions(+), 2 deletions(-)
diff --git a/tools/perf/util/namespaces.c b/tools/perf/util/namespaces.c index 1ef0049860a8..eadc7ddacbf6 100644 --- a/tools/perf/util/namespaces.c +++ b/tools/perf/util/namespaces.c @@ -17,6 +17,7 @@ #include <stdio.h> #include <string.h> #include <unistd.h> +#include <asm/bug.h>
struct namespaces *namespaces__new(struct namespaces_event *event) { @@ -185,6 +186,7 @@ void nsinfo__mountns_enter(struct nsinfo *nsi, char curpath[PATH_MAX]; int oldns = -1; int newns = -1; + char *oldcwd = NULL;
if (nc == NULL) return; @@ -198,9 +200,13 @@ void nsinfo__mountns_enter(struct nsinfo *nsi, if (snprintf(curpath, PATH_MAX, "/proc/self/ns/mnt") >= PATH_MAX) return;
+ oldcwd = get_current_dir_name(); + if (!oldcwd) + return; + oldns = open(curpath, O_RDONLY); if (oldns < 0) - return; + goto errout;
newns = open(nsi->mntns_path, O_RDONLY); if (newns < 0) @@ -209,11 +215,13 @@ void nsinfo__mountns_enter(struct nsinfo *nsi, if (setns(newns, CLONE_NEWNS) < 0) goto errout;
+ nc->oldcwd = oldcwd; nc->oldns = oldns; nc->newns = newns; return;
errout: + free(oldcwd); if (oldns > -1) close(oldns); if (newns > -1) @@ -222,11 +230,16 @@ void nsinfo__mountns_enter(struct nsinfo *nsi,
void nsinfo__mountns_exit(struct nscookie *nc) { - if (nc == NULL || nc->oldns == -1 || nc->newns == -1) + if (nc == NULL || nc->oldns == -1 || nc->newns == -1 || !nc->oldcwd) return;
setns(nc->oldns, CLONE_NEWNS);
+ if (nc->oldcwd) { + WARN_ON_ONCE(chdir(nc->oldcwd)); + zfree(&nc->oldcwd); + } + if (nc->oldns > -1) { close(nc->oldns); nc->oldns = -1; diff --git a/tools/perf/util/namespaces.h b/tools/perf/util/namespaces.h index 05d82601c9a6..23584a6dd048 100644 --- a/tools/perf/util/namespaces.h +++ b/tools/perf/util/namespaces.h @@ -36,6 +36,7 @@ struct nsinfo { struct nscookie { int oldns; int newns; + char *oldcwd; };
int nsinfo__init(struct nsinfo *nsi);
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 68bc10bf992180f269816ff3d22eb30383138577 ]
This bug was introduced in the interaction for two commits on either branch of the merge commit 562df5c8521e ("Merge branch 'pci/host-designware' into next").
Commit 4d107d3b5a68 ("PCI: imx6: Move link up check into imx6_pcie_wait_for_link()"), changed imx6_pcie_wait_for_link() to poll the link status register directly, checking for link up and not training, and made imx6_pcie_link_up() only check the link up bit (once, not a polling loop).
While commit 886bc5ceb5cc ("PCI: designware: Add generic dw_pcie_wait_for_link()"), replaced the loop in imx6_pcie_wait_for_link() with a call to a new dwc core function, which polled imx6_pcie_link_up(), which still checked both link up and not training in a loop.
When these two commits were merged, the version of imx6_pcie_wait_for_link() from 886bc5ceb5cc was kept, which eliminated the link training check placed there by 4d107d3b5a68. However, the version of imx6_pcie_link_up() from 4d107d3b5a68 was kept, which eliminated the link training check that had been there and was moved to imx6_pcie_wait_for_link().
The result was the link training check got lost for the imx6 driver.
Eliminate imx6_pcie_link_up() so that the default handler, dw_pcie_link_up(), is used instead. The default handler has the correct code, which checks for link up and also that it still is not training, fixing the regression.
Fixes: 562df5c8521e ("Merge branch 'pci/host-designware' into next") Signed-off-by: Trent Piepho tpiepho@impinj.com [lorenzo.pieralisi@arm.com: rewrote the commit log] Signed-off-by: Lorenzo Pieralisi lorenzo.pieralisi@arm.com Reviewed-by: Lucas Stach l.stach@pengutronix.de Cc: Bjorn Helgaas bhelgaas@google.com Cc: Joao Pinto Joao.Pinto@synopsys.com Cc: Lorenzo Pieralisi lorenzo.pieralisi@arm.com Cc: Richard Zhu hongxing.zhu@nxp.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pci/dwc/pci-imx6.c | 10 +--------- 1 file changed, 1 insertion(+), 9 deletions(-)
diff --git a/drivers/pci/dwc/pci-imx6.c b/drivers/pci/dwc/pci-imx6.c index b73483534a5b..1f1069b70e45 100644 --- a/drivers/pci/dwc/pci-imx6.c +++ b/drivers/pci/dwc/pci-imx6.c @@ -83,8 +83,6 @@ struct imx6_pcie { #define PCIE_PL_PFLR_FORCE_LINK (1 << 15) #define PCIE_PHY_DEBUG_R0 (PL_OFFSET + 0x28) #define PCIE_PHY_DEBUG_R1 (PL_OFFSET + 0x2c) -#define PCIE_PHY_DEBUG_R1_XMLH_LINK_IN_TRAINING (1 << 29) -#define PCIE_PHY_DEBUG_R1_XMLH_LINK_UP (1 << 4)
#define PCIE_PHY_CTRL (PL_OFFSET + 0x114) #define PCIE_PHY_CTRL_DATA_LOC 0 @@ -653,12 +651,6 @@ static int imx6_pcie_host_init(struct pcie_port *pp) return 0; }
-static int imx6_pcie_link_up(struct dw_pcie *pci) -{ - return dw_pcie_readl_dbi(pci, PCIE_PHY_DEBUG_R1) & - PCIE_PHY_DEBUG_R1_XMLH_LINK_UP; -} - static const struct dw_pcie_host_ops imx6_pcie_host_ops = { .host_init = imx6_pcie_host_init, }; @@ -701,7 +693,7 @@ static int imx6_add_pcie_port(struct imx6_pcie *imx6_pcie, }
static const struct dw_pcie_ops dw_pcie_ops = { - .link_up = imx6_pcie_link_up, + /* No special ops needed, but pcie-designware still expects this struct */ };
static int imx6_pcie_probe(struct platform_device *pdev)
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 0b9301fb632f7111a3293a30cc5b20f1b82ed08d ]
If read_symbols() fails during second list traversal (the one dealing with ".cold" subfunctions) it frees the symbol, but never deletes it from the list/hash_table resulting in symbol being freed again in elf_close(). Fix it by just returning an error, leaving cleanup to elf_close().
Signed-off-by: Artem Savkov asavkov@redhat.com Signed-off-by: Josh Poimboeuf jpoimboe@redhat.com Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Peter Zijlstra peterz@infradead.org Cc: Thomas Gleixner tglx@linutronix.de Fixes: 13810435b9a7 ("objtool: Support GCC 8's cold subfunctions") Link: http://lkml.kernel.org/r/beac5a9b7da9e8be90223459dcbe07766ae437dd.1542736240... Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- tools/objtool/elf.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/objtool/elf.c b/tools/objtool/elf.c index 0d1acb704f64..3616d626991e 100644 --- a/tools/objtool/elf.c +++ b/tools/objtool/elf.c @@ -312,7 +312,7 @@ static int read_symbols(struct elf *elf) if (!pfunc) { WARN("%s(): can't find parent function", sym->name); - goto err; + return -1; }
sym->pfunc = pfunc;
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 22566c1603030f0a036ad564634b064ad1a55db2 ]
Because find_symbol_by_name() traverses the same lists as read_symbols(), changing sym->name in place without copying it affects the result of find_symbol_by_name(). In the case where a ".cold" function precedes its parent in sec->symbol_list, it can result in a function being considered a parent of itself. This leads to function length being set to 0 and other consequent side-effects including a segfault in add_switch_table(). The effects of this bug are only visible when building with -ffunction-sections in KCFLAGS.
Fix by copying the search string instead of modifying it in place.
Signed-off-by: Artem Savkov asavkov@redhat.com Signed-off-by: Josh Poimboeuf jpoimboe@redhat.com Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Peter Zijlstra peterz@infradead.org Cc: Thomas Gleixner tglx@linutronix.de Fixes: 13810435b9a7 ("objtool: Support GCC 8's cold subfunctions") Link: http://lkml.kernel.org/r/910abd6b5a4945130fd44f787c24e07b9e07c8da.1542736240... Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- tools/objtool/elf.c | 17 ++++++++++++++--- 1 file changed, 14 insertions(+), 3 deletions(-)
diff --git a/tools/objtool/elf.c b/tools/objtool/elf.c index 3616d626991e..dd4ed7c3c062 100644 --- a/tools/objtool/elf.c +++ b/tools/objtool/elf.c @@ -31,6 +31,8 @@ #include "elf.h" #include "warn.h"
+#define MAX_NAME_LEN 128 + struct section *find_section_by_name(struct elf *elf, const char *name) { struct section *sec; @@ -298,6 +300,8 @@ static int read_symbols(struct elf *elf) /* Create parent/child links for any cold subfunctions */ list_for_each_entry(sec, &elf->sections, list) { list_for_each_entry(sym, &sec->symbol_list, list) { + char pname[MAX_NAME_LEN + 1]; + size_t pnamelen; if (sym->type != STT_FUNC) continue; sym->pfunc = sym->cfunc = sym; @@ -305,9 +309,16 @@ static int read_symbols(struct elf *elf) if (!coldstr) continue;
- coldstr[0] = '\0'; - pfunc = find_symbol_by_name(elf, sym->name); - coldstr[0] = '.'; + pnamelen = coldstr - sym->name; + if (pnamelen > MAX_NAME_LEN) { + WARN("%s(): parent function name exceeds maximum length of %d characters", + sym->name, MAX_NAME_LEN); + return -1; + } + + strncpy(pname, sym->name, pnamelen); + pname[pnamelen] = '\0'; + pfunc = find_symbol_by_name(elf, pname);
if (!pfunc) { WARN("%s(): can't find parent function",
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 4ab7ca092c3c7ac8b16aa28eba723a8868f82f14 ]
The SAMA5D2 is different from SAMA5D3 and SAMA5D4, as there are two different clocks for the peripherals in the SoC. The Static Memory controller is connected to the divided master clock.
Unfortunately, the device tree does not correctly show this and uses the master clock directly. This clock is then used by the code for the NAND controller to calculate the timings for the controller, and we end up with slow NAND Flash access.
Fix the device tree, and the performance of Flash access is improved.
Signed-off-by: Romain Izard romain.izard.pro@gmail.com Signed-off-by: Alexandre Belloni alexandre.belloni@bootlin.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm/boot/dts/sama5d2.dtsi | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm/boot/dts/sama5d2.dtsi b/arch/arm/boot/dts/sama5d2.dtsi index b1a26b42d190..a8e4b89097d9 100644 --- a/arch/arm/boot/dts/sama5d2.dtsi +++ b/arch/arm/boot/dts/sama5d2.dtsi @@ -308,7 +308,7 @@ 0x1 0x0 0x60000000 0x10000000 0x2 0x0 0x70000000 0x10000000 0x3 0x0 0x80000000 0x10000000>; - clocks = <&mck>; + clocks = <&h32ck>; status = "disabled";
nand_controller: nand-controller {
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit a4390aee72713d9e73f1132bcdeb17d72fbbf974 ]
When doing an incremental send, due to the need of delaying directory move (rename) operations we can end up in infinite loop at apply_children_dir_moves().
An example scenario that triggers this problem is described below, where directory names correspond to the numbers of their respective inodes.
Parent snapshot:
. |--- 261/ |--- 271/ |--- 266/ |--- 259/ |--- 260/ | |--- 267 | |--- 264/ | |--- 258/ | |--- 257/ | |--- 265/ |--- 268/ |--- 269/ | |--- 262/ | |--- 270/ |--- 272/ | |--- 263/ | |--- 275/ | |--- 274/ |--- 273/
Send snapshot:
. |-- 275/ |-- 274/ |-- 273/ |-- 262/ |-- 269/ |-- 258/ |-- 271/ |-- 268/ |-- 267/ |-- 270/ |-- 259/ | |-- 265/ | |-- 272/ |-- 257/ |-- 260/ |-- 264/ |-- 263/ |-- 261/ |-- 266/
When processing inode 257 we delay its move (rename) operation because its new parent in the send snapshot, inode 272, was not yet processed. Then when processing inode 272, we delay the move operation for that inode because inode 274 is its ancestor in the send snapshot. Finally we delay the move operation for inode 274 when processing it because inode 275 is its new parent in the send snapshot and was not yet moved.
When finishing processing inode 275, we start to do the move operations that were previously delayed (at apply_children_dir_moves()), resulting in the following iterations:
1) We issue the move operation for inode 274;
2) Because inode 262 depended on the move operation of inode 274 (it was delayed because 274 is its ancestor in the send snapshot), we issue the move operation for inode 262;
3) We issue the move operation for inode 272, because it was delayed by inode 274 too (ancestor of 272 in the send snapshot);
4) We issue the move operation for inode 269 (it was delayed by 262);
5) We issue the move operation for inode 257 (it was delayed by 272);
6) We issue the move operation for inode 260 (it was delayed by 272);
7) We issue the move operation for inode 258 (it was delayed by 269);
8) We issue the move operation for inode 264 (it was delayed by 257);
9) We issue the move operation for inode 271 (it was delayed by 258);
10) We issue the move operation for inode 263 (it was delayed by 264);
11) We issue the move operation for inode 268 (it was delayed by 271);
12) We verify if we can issue the move operation for inode 270 (it was delayed by 271). We detect a path loop in the current state, because inode 267 needs to be moved first before we can issue the move operation for inode 270. So we delay again the move operation for inode 270, this time we will attempt to do it after inode 267 is moved;
13) We issue the move operation for inode 261 (it was delayed by 263);
14) We verify if we can issue the move operation for inode 266 (it was delayed by 263). We detect a path loop in the current state, because inode 270 needs to be moved first before we can issue the move operation for inode 266. So we delay again the move operation for inode 266, this time we will attempt to do it after inode 270 is moved (its move operation was delayed in step 12);
15) We issue the move operation for inode 267 (it was delayed by 268);
16) We verify if we can issue the move operation for inode 266 (it was delayed by 270). We detect a path loop in the current state, because inode 270 needs to be moved first before we can issue the move operation for inode 266. So we delay again the move operation for inode 266, this time we will attempt to do it after inode 270 is moved (its move operation was delayed in step 12). So here we added again the same delayed move operation that we added in step 14;
17) We attempt again to see if we can issue the move operation for inode 266, and as in step 16, we realize we can not due to a path loop in the current state due to a dependency on inode 270. Again we delay inode's 266 rename to happen after inode's 270 move operation, adding the same dependency to the empty stack that we did in steps 14 and 16. The next iteration will pick the same move dependency on the stack (the only entry) and realize again there is still a path loop and then again the same dependency to the stack, over and over, resulting in an infinite loop.
So fix this by preventing adding the same move dependency entries to the stack by removing each pending move record from the red black tree of pending moves. This way the next call to get_pending_dir_moves() will not return anything for the current parent inode.
A test case for fstests, with this reproducer, follows soon.
Signed-off-by: Robbie Ko robbieko@synology.com Reviewed-by: Filipe Manana fdmanana@suse.com [Wrote changelog with example and more clear explanation] Signed-off-by: Filipe Manana fdmanana@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/btrfs/send.c | 11 ++++++++--- 1 file changed, 8 insertions(+), 3 deletions(-)
diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c index baf5a4cd7ffc..3f22af96d63b 100644 --- a/fs/btrfs/send.c +++ b/fs/btrfs/send.c @@ -3354,7 +3354,8 @@ static void free_pending_move(struct send_ctx *sctx, struct pending_dir_move *m) kfree(m); }
-static void tail_append_pending_moves(struct pending_dir_move *moves, +static void tail_append_pending_moves(struct send_ctx *sctx, + struct pending_dir_move *moves, struct list_head *stack) { if (list_empty(&moves->list)) { @@ -3365,6 +3366,10 @@ static void tail_append_pending_moves(struct pending_dir_move *moves, list_add_tail(&moves->list, stack); list_splice_tail(&list, stack); } + if (!RB_EMPTY_NODE(&moves->node)) { + rb_erase(&moves->node, &sctx->pending_dir_moves); + RB_CLEAR_NODE(&moves->node); + } }
static int apply_children_dir_moves(struct send_ctx *sctx) @@ -3379,7 +3384,7 @@ static int apply_children_dir_moves(struct send_ctx *sctx) return 0;
INIT_LIST_HEAD(&stack); - tail_append_pending_moves(pm, &stack); + tail_append_pending_moves(sctx, pm, &stack);
while (!list_empty(&stack)) { pm = list_first_entry(&stack, struct pending_dir_move, list); @@ -3390,7 +3395,7 @@ static int apply_children_dir_moves(struct send_ctx *sctx) goto out; pm = get_pending_dir_moves(sctx, parent_ino); if (pm) - tail_append_pending_moves(pm, &stack); + tail_append_pending_moves(sctx, pm, &stack); } return 0;
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 074fca3a18e7e1e0d4d7dcc9d7badc43b90232f4 ]
Currently, for IB_WR_LOCAL_INV WR, when the next fence is None, the current fence will be SMALL instead of Normal Fence.
Without this patch krping doesn't work on CX-5 devices and throws following error:
The error messages are from CX5 driver are: (from server side) [ 710.434014] mlx5_0:dump_cqe:278:(pid 2712): dump error cqe [ 710.434016] 00000000 00000000 00000000 00000000 [ 710.434016] 00000000 00000000 00000000 00000000 [ 710.434017] 00000000 00000000 00000000 00000000 [ 710.434018] 00000000 93003204 100000b8 000524d2 [ 710.434019] krping: cq completion failed with wr_id 0 status 4 opcode 128 vender_err 32
Fixed the logic to set the correct fence type.
Fixes: 6e8484c5cf07 ("RDMA/mlx5: set UMR wqe fence according to HCA cap") Signed-off-by: Majd Dibbiny majd@mellanox.com Signed-off-by: Leon Romanovsky leonro@mellanox.com Signed-off-by: Jason Gunthorpe jgg@mellanox.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/infiniband/hw/mlx5/qp.c | 19 ++++++++++--------- 1 file changed, 10 insertions(+), 9 deletions(-)
diff --git a/drivers/infiniband/hw/mlx5/qp.c b/drivers/infiniband/hw/mlx5/qp.c index dfc190055167..964c3a0bbf16 100644 --- a/drivers/infiniband/hw/mlx5/qp.c +++ b/drivers/infiniband/hw/mlx5/qp.c @@ -3928,17 +3928,18 @@ int mlx5_ib_post_send(struct ib_qp *ibqp, struct ib_send_wr *wr, goto out; }
- if (wr->opcode == IB_WR_LOCAL_INV || - wr->opcode == IB_WR_REG_MR) { + if (wr->opcode == IB_WR_REG_MR) { fence = dev->umr_fence; next_fence = MLX5_FENCE_MODE_INITIATOR_SMALL; - } else if (wr->send_flags & IB_SEND_FENCE) { - if (qp->next_fence) - fence = MLX5_FENCE_MODE_SMALL_AND_FENCE; - else - fence = MLX5_FENCE_MODE_FENCE; - } else { - fence = qp->next_fence; + } else { + if (wr->send_flags & IB_SEND_FENCE) { + if (qp->next_fence) + fence = MLX5_FENCE_MODE_SMALL_AND_FENCE; + else + fence = MLX5_FENCE_MODE_FENCE; + } else { + fence = qp->next_fence; + } }
switch (ibqp->qp_type) {
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 4f32fb921b153ae9ea280e02a3e91509fffc03d3 ]
rdmavt uses a crazy system that looses the type checking when assinging functions to struct ib_device function pointers. Because of this the signature to this function was not changed when the below commit revised things.
Fix the signature so we are not calling a function pointer with a mismatched signature.
Fixes: 477864c8fcd9 ("IB/core: Let create_ah return extended response to user") Signed-off-by: Kamal Heib kamalheib1@gmail.com Reviewed-by: Dennis Dalessandro dennis.dalessandro@intel.com Signed-off-by: Jason Gunthorpe jgg@mellanox.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/infiniband/sw/rdmavt/ah.c | 4 +++- drivers/infiniband/sw/rdmavt/ah.h | 3 ++- 2 files changed, 5 insertions(+), 2 deletions(-)
diff --git a/drivers/infiniband/sw/rdmavt/ah.c b/drivers/infiniband/sw/rdmavt/ah.c index ba3639a0d77c..48ea5b8207f0 100644 --- a/drivers/infiniband/sw/rdmavt/ah.c +++ b/drivers/infiniband/sw/rdmavt/ah.c @@ -91,13 +91,15 @@ EXPORT_SYMBOL(rvt_check_ah); * rvt_create_ah - create an address handle * @pd: the protection domain * @ah_attr: the attributes of the AH + * @udata: pointer to user's input output buffer information. * * This may be called from interrupt context. * * Return: newly allocated ah */ struct ib_ah *rvt_create_ah(struct ib_pd *pd, - struct rdma_ah_attr *ah_attr) + struct rdma_ah_attr *ah_attr, + struct ib_udata *udata) { struct rvt_ah *ah; struct rvt_dev_info *dev = ib_to_rvt(pd->device); diff --git a/drivers/infiniband/sw/rdmavt/ah.h b/drivers/infiniband/sw/rdmavt/ah.h index 16105af99189..25271b48a683 100644 --- a/drivers/infiniband/sw/rdmavt/ah.h +++ b/drivers/infiniband/sw/rdmavt/ah.h @@ -51,7 +51,8 @@ #include <rdma/rdma_vt.h>
struct ib_ah *rvt_create_ah(struct ib_pd *pd, - struct rdma_ah_attr *ah_attr); + struct rdma_ah_attr *ah_attr, + struct ib_udata *udata); int rvt_destroy_ah(struct ib_ah *ibah); int rvt_modify_ah(struct ib_ah *ibah, struct rdma_ah_attr *ah_attr); int rvt_query_ah(struct ib_ah *ibah, struct rdma_ah_attr *ah_attr);
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit dd2f52d8991af9fe0928d59ec502ba52be7bc38d ]
The latency number is in usec for the pm_qos. Correct the calculation to give us the time in usec
Signed-off-by: Peter Ujfalusi peter.ujfalusi@ti.com Acked-by: Jarkko Nikula jarkko.nikula@bitmer.com Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/omap/omap-mcbsp.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/sound/soc/omap/omap-mcbsp.c b/sound/soc/omap/omap-mcbsp.c index 6b40bdbef336..47c2ed5ca492 100644 --- a/sound/soc/omap/omap-mcbsp.c +++ b/sound/soc/omap/omap-mcbsp.c @@ -308,9 +308,9 @@ static int omap_mcbsp_dai_hw_params(struct snd_pcm_substream *substream, pkt_size = channels; }
- latency = ((((buffer_size - pkt_size) / channels) * 1000) - / (params->rate_num / params->rate_den)); - + latency = (buffer_size - pkt_size) / channels; + latency = latency * USEC_PER_SEC / + (params->rate_num / params->rate_den); mcbsp->latency[substream->stream] = latency;
omap_mcbsp_set_threshold(substream, pkt_size);
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 373a500e34aea97971c9d71e45edad458d3da98f ]
We need to block sleep states which would require longer time to leave than the time the DMA must react to the DMA request in order to keep the FIFO serviced without under of overrun.
Signed-off-by: Peter Ujfalusi peter.ujfalusi@ti.com Acked-by: Jarkko Nikula jarkko.nikula@bitmer.com Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/omap/omap-mcpdm.c | 43 ++++++++++++++++++++++++++++++++++++- 1 file changed, 42 insertions(+), 1 deletion(-)
diff --git a/sound/soc/omap/omap-mcpdm.c b/sound/soc/omap/omap-mcpdm.c index 64609c77a79d..44ffeb71cd1d 100644 --- a/sound/soc/omap/omap-mcpdm.c +++ b/sound/soc/omap/omap-mcpdm.c @@ -54,6 +54,8 @@ struct omap_mcpdm { unsigned long phys_base; void __iomem *io_base; int irq; + struct pm_qos_request pm_qos_req; + int latency[2];
struct mutex mutex;
@@ -277,6 +279,9 @@ static void omap_mcpdm_dai_shutdown(struct snd_pcm_substream *substream, struct snd_soc_dai *dai) { struct omap_mcpdm *mcpdm = snd_soc_dai_get_drvdata(dai); + int tx = (substream->stream == SNDRV_PCM_STREAM_PLAYBACK); + int stream1 = tx ? SNDRV_PCM_STREAM_PLAYBACK : SNDRV_PCM_STREAM_CAPTURE; + int stream2 = tx ? SNDRV_PCM_STREAM_CAPTURE : SNDRV_PCM_STREAM_PLAYBACK;
mutex_lock(&mcpdm->mutex);
@@ -289,6 +294,14 @@ static void omap_mcpdm_dai_shutdown(struct snd_pcm_substream *substream, } }
+ if (mcpdm->latency[stream2]) + pm_qos_update_request(&mcpdm->pm_qos_req, + mcpdm->latency[stream2]); + else if (mcpdm->latency[stream1]) + pm_qos_remove_request(&mcpdm->pm_qos_req); + + mcpdm->latency[stream1] = 0; + mutex_unlock(&mcpdm->mutex); }
@@ -300,7 +313,7 @@ static int omap_mcpdm_dai_hw_params(struct snd_pcm_substream *substream, int stream = substream->stream; struct snd_dmaengine_dai_dma_data *dma_data; u32 threshold; - int channels; + int channels, latency; int link_mask = 0;
channels = params_channels(params); @@ -340,14 +353,25 @@ static int omap_mcpdm_dai_hw_params(struct snd_pcm_substream *substream,
dma_data->maxburst = (MCPDM_DN_THRES_MAX - threshold) * channels; + latency = threshold; } else { /* If playback is not running assume a stereo stream to come */ if (!mcpdm->config[!stream].link_mask) mcpdm->config[!stream].link_mask = (0x3 << 3);
dma_data->maxburst = threshold * channels; + latency = (MCPDM_DN_THRES_MAX - threshold); }
+ /* + * The DMA must act to a DMA request within latency time (usec) to avoid + * under/overflow + */ + mcpdm->latency[stream] = latency * USEC_PER_SEC / params_rate(params); + + if (!mcpdm->latency[stream]) + mcpdm->latency[stream] = 10; + /* Check if we need to restart McPDM with this stream */ if (mcpdm->config[stream].link_mask && mcpdm->config[stream].link_mask != link_mask) @@ -362,6 +386,20 @@ static int omap_mcpdm_prepare(struct snd_pcm_substream *substream, struct snd_soc_dai *dai) { struct omap_mcpdm *mcpdm = snd_soc_dai_get_drvdata(dai); + struct pm_qos_request *pm_qos_req = &mcpdm->pm_qos_req; + int tx = (substream->stream == SNDRV_PCM_STREAM_PLAYBACK); + int stream1 = tx ? SNDRV_PCM_STREAM_PLAYBACK : SNDRV_PCM_STREAM_CAPTURE; + int stream2 = tx ? SNDRV_PCM_STREAM_CAPTURE : SNDRV_PCM_STREAM_PLAYBACK; + int latency = mcpdm->latency[stream2]; + + /* Prevent omap hardware from hitting off between FIFO fills */ + if (!latency || mcpdm->latency[stream1] < latency) + latency = mcpdm->latency[stream1]; + + if (pm_qos_request_active(pm_qos_req)) + pm_qos_update_request(pm_qos_req, latency); + else if (latency) + pm_qos_add_request(pm_qos_req, PM_QOS_CPU_DMA_LATENCY, latency);
if (!omap_mcpdm_active(mcpdm)) { omap_mcpdm_start(mcpdm); @@ -423,6 +461,9 @@ static int omap_mcpdm_remove(struct snd_soc_dai *dai) free_irq(mcpdm->irq, (void *)mcpdm); pm_runtime_disable(mcpdm->dev);
+ if (pm_qos_request_active(&mcpdm->pm_qos_req)) + pm_qos_remove_request(&mcpdm->pm_qos_req); + return 0; }
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit ffdcc3638c58d55a6fa68b6e5dfd4fb4109652eb ]
We need to block sleep states which would require longer time to leave than the time the DMA must react to the DMA request in order to keep the FIFO serviced without overrun.
Signed-off-by: Peter Ujfalusi peter.ujfalusi@ti.com Acked-by: Jarkko Nikula jarkko.nikula@bitmer.com Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/omap/omap-dmic.c | 9 +++++++++ 1 file changed, 9 insertions(+)
diff --git a/sound/soc/omap/omap-dmic.c b/sound/soc/omap/omap-dmic.c index 09db2aec12a3..776e809a8aab 100644 --- a/sound/soc/omap/omap-dmic.c +++ b/sound/soc/omap/omap-dmic.c @@ -48,6 +48,8 @@ struct omap_dmic { struct device *dev; void __iomem *io_base; struct clk *fclk; + struct pm_qos_request pm_qos_req; + int latency; int fclk_freq; int out_freq; int clk_div; @@ -124,6 +126,8 @@ static void omap_dmic_dai_shutdown(struct snd_pcm_substream *substream,
mutex_lock(&dmic->mutex);
+ pm_qos_remove_request(&dmic->pm_qos_req); + if (!dai->active) dmic->active = 0;
@@ -226,6 +230,8 @@ static int omap_dmic_dai_hw_params(struct snd_pcm_substream *substream, /* packet size is threshold * channels */ dma_data = snd_soc_dai_get_dma_data(dai, substream); dma_data->maxburst = dmic->threshold * channels; + dmic->latency = (OMAP_DMIC_THRES_MAX - dmic->threshold) * USEC_PER_SEC / + params_rate(params);
return 0; } @@ -236,6 +242,9 @@ static int omap_dmic_dai_prepare(struct snd_pcm_substream *substream, struct omap_dmic *dmic = snd_soc_dai_get_drvdata(dai); u32 ctrl;
+ if (pm_qos_request_active(&dmic->pm_qos_req)) + pm_qos_update_request(&dmic->pm_qos_req, dmic->latency); + /* Configure uplink threshold */ omap_dmic_write(dmic, OMAP_DMIC_FIFO_CTRL_REG, dmic->threshold);
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 2084ac6c505a58f7efdec13eba633c6aaa085ca5 ]
The function dentry_connected calls dput(dentry) to drop the previously acquired reference to dentry. In this case, dentry can be released. After that, IS_ROOT(dentry) checks the condition (dentry == dentry->d_parent), which may result in a use-after-free bug. This patch directly compares dentry with its parent obtained before dropping the reference.
Fixes: a056cc8934c("exportfs: stop retrying once we race with rename/remove")
Signed-off-by: Pan Bian bianpan2016@163.com Signed-off-by: Al Viro viro@zeniv.linux.org.uk Signed-off-by: Sasha Levin sashal@kernel.org --- fs/exportfs/expfs.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/exportfs/expfs.c b/fs/exportfs/expfs.c index 329a5d103846..c22cc9d2a5c9 100644 --- a/fs/exportfs/expfs.c +++ b/fs/exportfs/expfs.c @@ -77,7 +77,7 @@ static bool dentry_connected(struct dentry *dentry) struct dentry *parent = dget_parent(dentry);
dput(dentry); - if (IS_ROOT(dentry)) { + if (dentry == parent) { dput(parent); return false; }
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 1efb6ee3edea57f57f9fb05dba8dcb3f7333f61f ]
A format string consisting of "%p" or "%s" followed by an invalid specifier (e.g. "%p%\n" or "%s%") could pass the check which would make format_decode (lib/vsprintf.c) to warn.
Fixes: 9c959c863f82 ("tracing: Allow BPF programs to call bpf_trace_printk()") Reported-by: syzbot+1ec5c5ec949c4adaa0c4@syzkaller.appspotmail.com Signed-off-by: Martynas Pumputis m@lambda.lt Signed-off-by: Daniel Borkmann daniel@iogearbox.net Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/trace/bpf_trace.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c index 6350f64d5aa4..f9dd8fd055a6 100644 --- a/kernel/trace/bpf_trace.c +++ b/kernel/trace/bpf_trace.c @@ -161,11 +161,13 @@ BPF_CALL_5(bpf_trace_printk, char *, fmt, u32, fmt_size, u64, arg1, i++; } else if (fmt[i] == 'p' || fmt[i] == 's') { mod[fmt_cnt]++; - i++; - if (!isspace(fmt[i]) && !ispunct(fmt[i]) && fmt[i] != 0) + /* disallow any further format extensions */ + if (fmt[i + 1] != 0 && + !isspace(fmt[i + 1]) && + !ispunct(fmt[i + 1])) return -EINVAL; fmt_cnt++; - if (fmt[i - 1] == 's') { + if (fmt[i] == 's') { if (str_seen) /* allow only one '%s' per fmt string */ return -EINVAL;
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 2a31e4bd9ad255ee40809b5c798c4b1c2b09703b ]
ip_vs_dst_event is supposed to clean up all dst used in ipvs' destinations when a net dev is going down. But it works only when the dst's dev is the same as the dev from the event.
Now with the same priority but late registration, ip_vs_dst_notifier is always called later than ipv6_dev_notf where the dst's dev is set to lo for NETDEV_DOWN event.
As the dst's dev lo is not the same as the dev from the event in ip_vs_dst_event, ip_vs_dst_notifier doesn't actually work. Also as these dst have to wait for dest_trash_timer to clean them up. It would cause some non-permanent kernel warnings:
unregister_netdevice: waiting for br0 to become free. Usage count = 3
To fix it, call ip_vs_dst_notifier earlier than ipv6_dev_notf by increasing its priority to ADDRCONF_NOTIFY_PRIORITY + 5.
Note that for ipv4 route fib_netdev_notifier doesn't set dst's dev to lo in NETDEV_DOWN event, so this fix is only needed when IP_VS_IPV6 is defined.
Fixes: 7a4f0761fce3 ("IPVS: init and cleanup restructuring") Reported-by: Li Shuang shuali@redhat.com Signed-off-by: Xin Long lucien.xin@gmail.com Acked-by: Julian Anastasov ja@ssi.bg Acked-by: Simon Horman horms@verge.net.au Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/netfilter/ipvs/ip_vs_ctl.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/net/netfilter/ipvs/ip_vs_ctl.c b/net/netfilter/ipvs/ip_vs_ctl.c index 327ebe786eeb..2f45c3ce77ef 100644 --- a/net/netfilter/ipvs/ip_vs_ctl.c +++ b/net/netfilter/ipvs/ip_vs_ctl.c @@ -4012,6 +4012,9 @@ static void __net_exit ip_vs_control_net_cleanup_sysctl(struct netns_ipvs *ipvs)
static struct notifier_block ip_vs_dst_notifier = { .notifier_call = ip_vs_dst_event, +#ifdef CONFIG_IP_VS_IPV6 + .priority = ADDRCONF_NOTIFY_PRIORITY + 5, +#endif };
int __net_init ip_vs_control_net_init(struct netns_ipvs *ipvs)
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 286afdde1640d8ea8916a0f05e811441fbbf4b9d ]
The current code fails to release the third irq on the error path (observed by reading the code), and we get also multiple WARNs with failing gadget drivers due to duplicate IRQ releases. Fix by using devm_request_irq().
Signed-off-by: Aaro Koskinen aaro.koskinen@iki.fi Signed-off-by: Felipe Balbi felipe.balbi@linux.intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/usb/gadget/udc/omap_udc.c | 37 +++++++++---------------------- 1 file changed, 10 insertions(+), 27 deletions(-)
diff --git a/drivers/usb/gadget/udc/omap_udc.c b/drivers/usb/gadget/udc/omap_udc.c index f05ba6825bfe..e515c85ef0c5 100644 --- a/drivers/usb/gadget/udc/omap_udc.c +++ b/drivers/usb/gadget/udc/omap_udc.c @@ -2886,8 +2886,8 @@ static int omap_udc_probe(struct platform_device *pdev) udc->clr_halt = UDC_RESET_EP;
/* USB general purpose IRQ: ep0, state changes, dma, etc */ - status = request_irq(pdev->resource[1].start, omap_udc_irq, - 0, driver_name, udc); + status = devm_request_irq(&pdev->dev, pdev->resource[1].start, + omap_udc_irq, 0, driver_name, udc); if (status != 0) { ERR("can't get irq %d, err %d\n", (int) pdev->resource[1].start, status); @@ -2895,20 +2895,20 @@ static int omap_udc_probe(struct platform_device *pdev) }
/* USB "non-iso" IRQ (PIO for all but ep0) */ - status = request_irq(pdev->resource[2].start, omap_udc_pio_irq, - 0, "omap_udc pio", udc); + status = devm_request_irq(&pdev->dev, pdev->resource[2].start, + omap_udc_pio_irq, 0, "omap_udc pio", udc); if (status != 0) { ERR("can't get irq %d, err %d\n", (int) pdev->resource[2].start, status); - goto cleanup2; + goto cleanup1; } #ifdef USE_ISO - status = request_irq(pdev->resource[3].start, omap_udc_iso_irq, - 0, "omap_udc iso", udc); + status = devm_request_irq(&pdev->dev, pdev->resource[3].start, + omap_udc_iso_irq, 0, "omap_udc iso", udc); if (status != 0) { ERR("can't get irq %d, err %d\n", (int) pdev->resource[3].start, status); - goto cleanup3; + goto cleanup1; } #endif if (cpu_is_omap16xx() || cpu_is_omap7xx()) { @@ -2921,22 +2921,11 @@ static int omap_udc_probe(struct platform_device *pdev) create_proc_file(); status = usb_add_gadget_udc_release(&pdev->dev, &udc->gadget, omap_udc_release); - if (status) - goto cleanup4; - - return 0; + if (!status) + return 0;
-cleanup4: remove_proc_file();
-#ifdef USE_ISO -cleanup3: - free_irq(pdev->resource[2].start, udc); -#endif - -cleanup2: - free_irq(pdev->resource[1].start, udc); - cleanup1: kfree(udc); udc = NULL; @@ -2980,12 +2969,6 @@ static int omap_udc_remove(struct platform_device *pdev)
remove_proc_file();
-#ifdef USE_ISO - free_irq(pdev->resource[3].start, udc); -#endif - free_irq(pdev->resource[2].start, udc); - free_irq(pdev->resource[1].start, udc); - if (udc->dc_clk) { if (udc->clk_requested) omap_udc_enable_clock(0);
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 99f700366fcea1aa2fa3c49c99f371670c3c62f8 ]
We currently crash if usb_add_gadget_udc_release() fails, since the udc->done is not initialized until in the remove function. Furthermore, on module removal the udc data is accessed although the release function is already triggered by usb_del_gadget_udc() early in the function.
Fix by rewriting the release and remove functions, basically moving all the cleanup into the release function, and doing the completion only in the module removal case.
The patch fixes omap_udc module probe with a failing gadged, and also allows the removal of omap_udc. Tested by running "modprobe omap_udc; modprobe -r omap_udc" in a loop.
Signed-off-by: Aaro Koskinen aaro.koskinen@iki.fi Signed-off-by: Felipe Balbi felipe.balbi@linux.intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/usb/gadget/udc/omap_udc.c | 50 ++++++++++++------------------- 1 file changed, 19 insertions(+), 31 deletions(-)
diff --git a/drivers/usb/gadget/udc/omap_udc.c b/drivers/usb/gadget/udc/omap_udc.c index e515c85ef0c5..d45dc14ef0a2 100644 --- a/drivers/usb/gadget/udc/omap_udc.c +++ b/drivers/usb/gadget/udc/omap_udc.c @@ -2612,9 +2612,22 @@ omap_ep_setup(char *name, u8 addr, u8 type,
static void omap_udc_release(struct device *dev) { - complete(udc->done); + pullup_disable(udc); + if (!IS_ERR_OR_NULL(udc->transceiver)) { + usb_put_phy(udc->transceiver); + udc->transceiver = NULL; + } + omap_writew(0, UDC_SYSCON1); + remove_proc_file(); + if (udc->dc_clk) { + if (udc->clk_requested) + omap_udc_enable_clock(0); + clk_put(udc->hhc_clk); + clk_put(udc->dc_clk); + } + if (udc->done) + complete(udc->done); kfree(udc); - udc = NULL; }
static int @@ -2919,12 +2932,8 @@ static int omap_udc_probe(struct platform_device *pdev) }
create_proc_file(); - status = usb_add_gadget_udc_release(&pdev->dev, &udc->gadget, - omap_udc_release); - if (!status) - return 0; - - remove_proc_file(); + return usb_add_gadget_udc_release(&pdev->dev, &udc->gadget, + omap_udc_release);
cleanup1: kfree(udc); @@ -2951,36 +2960,15 @@ static int omap_udc_remove(struct platform_device *pdev) { DECLARE_COMPLETION_ONSTACK(done);
- if (!udc) - return -ENODEV; - - usb_del_gadget_udc(&udc->gadget); - if (udc->driver) - return -EBUSY; - udc->done = &done;
- pullup_disable(udc); - if (!IS_ERR_OR_NULL(udc->transceiver)) { - usb_put_phy(udc->transceiver); - udc->transceiver = NULL; - } - omap_writew(0, UDC_SYSCON1); - - remove_proc_file(); + usb_del_gadget_udc(&udc->gadget);
- if (udc->dc_clk) { - if (udc->clk_requested) - omap_udc_enable_clock(0); - clk_put(udc->hhc_clk); - clk_put(udc->dc_clk); - } + wait_for_completion(&done);
release_mem_region(pdev->resource[0].start, pdev->resource[0].end - pdev->resource[0].start + 1);
- wait_for_completion(&done); - return 0; }
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 6ca6695f576b8453fe68865e84d25946d63b10ad ]
On OMAP 15xx machines there are no transceivers, and omap_udc_start() always fails as it forgot to adjust the default return value.
Signed-off-by: Aaro Koskinen aaro.koskinen@iki.fi Signed-off-by: Felipe Balbi felipe.balbi@linux.intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/usb/gadget/udc/omap_udc.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/usb/gadget/udc/omap_udc.c b/drivers/usb/gadget/udc/omap_udc.c index d45dc14ef0a2..9060b3af27ff 100644 --- a/drivers/usb/gadget/udc/omap_udc.c +++ b/drivers/usb/gadget/udc/omap_udc.c @@ -2045,7 +2045,7 @@ static inline int machine_without_vbus_sense(void) static int omap_udc_start(struct usb_gadget *g, struct usb_gadget_driver *driver) { - int status = -ENODEV; + int status; struct omap_ep *ep; unsigned long flags;
@@ -2083,6 +2083,7 @@ static int omap_udc_start(struct usb_gadget *g, goto done; } } else { + status = 0; if (can_pullup(udc)) pullup_enable(udc); else
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 2c2322fbcab8102b8cadc09d66714700a2da42c2 ]
On Palm TE nothing happens when you try to use gadget drivers and plug the USB cable. Fix by adding the board to the vbus sense quirk list.
Signed-off-by: Aaro Koskinen aaro.koskinen@iki.fi Signed-off-by: Felipe Balbi felipe.balbi@linux.intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/usb/gadget/udc/omap_udc.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/usb/gadget/udc/omap_udc.c b/drivers/usb/gadget/udc/omap_udc.c index 9060b3af27ff..c8facc8aa87e 100644 --- a/drivers/usb/gadget/udc/omap_udc.c +++ b/drivers/usb/gadget/udc/omap_udc.c @@ -2037,6 +2037,7 @@ static inline int machine_without_vbus_sense(void) { return machine_is_omap_innovator() || machine_is_omap_osk() + || machine_is_omap_palmte() || machine_is_sx1() /* No known omap7xx boards with vbus sense */ || cpu_is_omap7xx();
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 069caf5950dfa75d0526cd89c439ff9d9d3136d8 ]
Commit 387f869d2579 ("usb: gadget: u_ether: conditionally align transfer size") started aligning transfer size only if requested, breaking omap_udc DMA mode. Set quirk_ep_out_aligned_size to restore the old behaviour.
Fixes: 387f869d2579 ("usb: gadget: u_ether: conditionally align transfer size") Signed-off-by: Aaro Koskinen aaro.koskinen@iki.fi Signed-off-by: Felipe Balbi felipe.balbi@linux.intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/usb/gadget/udc/omap_udc.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/usb/gadget/udc/omap_udc.c b/drivers/usb/gadget/udc/omap_udc.c index c8facc8aa87e..ee0b87a0773c 100644 --- a/drivers/usb/gadget/udc/omap_udc.c +++ b/drivers/usb/gadget/udc/omap_udc.c @@ -2661,6 +2661,7 @@ omap_udc_setup(struct platform_device *odev, struct usb_phy *xceiv) udc->gadget.speed = USB_SPEED_UNKNOWN; udc->gadget.max_speed = USB_SPEED_FULL; udc->gadget.name = driver_name; + udc->gadget.quirk_ep_out_aligned_size = 1; udc->transceiver = xceiv;
/* ep0 is special; put it right after the SETUP buffer */
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 31e1ab494559fb46de304cc6c2aed1528f94b298 ]
This essential mode for PAL users is missing, so add it.
Fixes: 335e3713afb87 ("drm/meson: Add support for HDMI venc modes and settings") Signed-off-by: Christian Hewitt christianshewitt@gmail.com Acked-by: Neil Armstrong narmstrong@baylibre.com Signed-off-by: Neil Armstrong narmstrong@baylibre.com Link: https://patchwork.freedesktop.org/patch/msgid/1542793169-13008-1-git-send-em... Signed-off-by: Sean Paul seanpaul@chromium.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/meson/meson_venc.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/gpu/drm/meson/meson_venc.c b/drivers/gpu/drm/meson/meson_venc.c index 9509017dbded..d5dfe7045cc6 100644 --- a/drivers/gpu/drm/meson/meson_venc.c +++ b/drivers/gpu/drm/meson/meson_venc.c @@ -714,6 +714,7 @@ struct meson_hdmi_venc_vic_mode { { 5, &meson_hdmi_encp_mode_1080i60 }, { 20, &meson_hdmi_encp_mode_1080i50 }, { 32, &meson_hdmi_encp_mode_1080p24 }, + { 33, &meson_hdmi_encp_mode_1080p50 }, { 34, &meson_hdmi_encp_mode_1080p30 }, { 31, &meson_hdmi_encp_mode_1080p50 }, { 16, &meson_hdmi_encp_mode_1080p60 },
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 508b09046c0f21678652fb66fd1e9959d55591d2 ]
When ip6_route_me_harder is invoked, it resets outgoing interface of: - link-local scoped packets sent by neighbor discovery - multicast packets sent by MLD host - multicast packets send by MLD proxy daemon that sets outgoing interface through IPV6_PKTINFO ipi6_ifindex
Link-local and multicast packets must keep their original oif after ip6_route_me_harder is called.
Signed-off-by: Alin Nastac alin.nastac@gmail.com Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/ipv6/netfilter.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/net/ipv6/netfilter.c b/net/ipv6/netfilter.c index 9bf260459f83..1f8b1a433b5d 100644 --- a/net/ipv6/netfilter.c +++ b/net/ipv6/netfilter.c @@ -25,7 +25,8 @@ int ip6_route_me_harder(struct net *net, struct sk_buff *skb) unsigned int hh_len; struct dst_entry *dst; struct flowi6 fl6 = { - .flowi6_oif = sk ? sk->sk_bound_dev_if : 0, + .flowi6_oif = sk && sk->sk_bound_dev_if ? sk->sk_bound_dev_if : + rt6_need_strict(&iph->daddr) ? skb_dst(skb)->dev->ifindex : 0, .flowi6_mark = skb->mark, .flowi6_uid = sock_net_uid(net, sk), .daddr = iph->daddr,
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 75b7b86bdb0df37e08e44b6c1f99010967f81944 ]
Memory windows are implemented with an indirect MKey, when a page fault event comes for a MW Mkey we need to find the MR at the end of the list of the indirect MKeys by iterating on all items from the first to the last.
The offset calculated during this process has to be zeroed after the first iteration or the next iteration will start from a wrong address, resulting incorrect ODP faulting behavior.
Fixes: db570d7deafb ("IB/mlx5: Add ODP support to MW") Signed-off-by: Artemy Kovalyov artemyko@mellanox.com Signed-off-by: Moni Shoua monis@mellanox.com Signed-off-by: Leon Romanovsky leonro@mellanox.com Signed-off-by: Jason Gunthorpe jgg@mellanox.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/infiniband/hw/mlx5/odp.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/infiniband/hw/mlx5/odp.c b/drivers/infiniband/hw/mlx5/odp.c index 3d701c7a4c91..1ed94b6c0b0a 100644 --- a/drivers/infiniband/hw/mlx5/odp.c +++ b/drivers/infiniband/hw/mlx5/odp.c @@ -723,6 +723,7 @@ static int pagefault_single_data_segment(struct mlx5_ib_dev *dev, head = frame;
bcnt -= frame->bcnt; + offset = 0; } break;
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 354cb410d87314e2eda344feea84809e4261570a ]
We get the following warnings about empty statements when building with 'W=1':
arch/x86/kvm/lapic.c:632:53: warning: suggest braces around empty body in an ‘if’ statement [-Wempty-body] arch/x86/kvm/lapic.c:1907:42: warning: suggest braces around empty body in an ‘if’ statement [-Wempty-body] arch/x86/kvm/lapic.c:1936:65: warning: suggest braces around empty body in an ‘if’ statement [-Wempty-body] arch/x86/kvm/lapic.c:1975:44: warning: suggest braces around empty body in an ‘if’ statement [-Wempty-body]
Rework the debug helper macro to get rid of these warnings.
Signed-off-by: Yi Wang wang.yi59@zte.com.cn Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/kvm/lapic.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c index 13dfb55b84db..f7c34184342a 100644 --- a/arch/x86/kvm/lapic.c +++ b/arch/x86/kvm/lapic.c @@ -55,7 +55,7 @@ #define PRIo64 "o"
/* #define apic_debug(fmt,arg...) printk(KERN_WARNING fmt,##arg) */ -#define apic_debug(fmt, arg...) +#define apic_debug(fmt, arg...) do {} while (0)
/* 14 is the version for Xeon and Pentium 8.4.8*/ #define APIC_VERSION (0x14UL | ((KVM_APIC_LVT_NUM - 1) << 16))
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 1e4329ee2c52692ea42cc677fb2133519718b34a ]
The inline keyword which is not at the beginning of the function declaration may trigger the following build warnings, so let's fix it:
arch/x86/kvm/vmx.c:1309:1: warning: ‘inline’ is not at beginning of declaration [-Wold-style-declaration] arch/x86/kvm/vmx.c:5947:1: warning: ‘inline’ is not at beginning of declaration [-Wold-style-declaration] arch/x86/kvm/vmx.c:5985:1: warning: ‘inline’ is not at beginning of declaration [-Wold-style-declaration] arch/x86/kvm/vmx.c:6023:1: warning: ‘inline’ is not at beginning of declaration [-Wold-style-declaration]
Signed-off-by: Yi Wang wang.yi59@zte.com.cn Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/kvm/vmx.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c index ec588cf4fe95..4353580b659a 100644 --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -1089,7 +1089,7 @@ static void vmx_set_nmi_mask(struct kvm_vcpu *vcpu, bool masked); static bool nested_vmx_is_page_fault_vmexit(struct vmcs12 *vmcs12, u16 error_code); static void vmx_update_msr_bitmap(struct kvm_vcpu *vcpu); -static void __always_inline vmx_disable_intercept_for_msr(unsigned long *msr_bitmap, +static __always_inline void vmx_disable_intercept_for_msr(unsigned long *msr_bitmap, u32 msr, int type);
static DEFINE_PER_CPU(struct vmcs *, vmxarea); @@ -5227,7 +5227,7 @@ static void free_vpid(int vpid) spin_unlock(&vmx_vpid_lock); }
-static void __always_inline vmx_disable_intercept_for_msr(unsigned long *msr_bitmap, +static __always_inline void vmx_disable_intercept_for_msr(unsigned long *msr_bitmap, u32 msr, int type) { int f = sizeof(unsigned long); @@ -5262,7 +5262,7 @@ static void __always_inline vmx_disable_intercept_for_msr(unsigned long *msr_bit } }
-static void __always_inline vmx_enable_intercept_for_msr(unsigned long *msr_bitmap, +static __always_inline void vmx_enable_intercept_for_msr(unsigned long *msr_bitmap, u32 msr, int type) { int f = sizeof(unsigned long); @@ -5297,7 +5297,7 @@ static void __always_inline vmx_enable_intercept_for_msr(unsigned long *msr_bitm } }
-static void __always_inline vmx_set_intercept_for_msr(unsigned long *msr_bitmap, +static __always_inline void vmx_set_intercept_for_msr(unsigned long *msr_bitmap, u32 msr, int type, bool value) { if (value)
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 24a6d2dd263bc910de018c78d1148b3e33b94512 ]
Fix a possible NULL pointer dereference in nic_remove routine removing the nicpf module if nic_probe fails. The issue can be triggered with the following reproducer:
$rmmod nicvf $rmmod nicpf
[ 521.412008] Unable to handle kernel access to user memory outside uaccess routines at virtual address 0000000000000014 [ 521.422777] Mem abort info: [ 521.425561] ESR = 0x96000004 [ 521.428624] Exception class = DABT (current EL), IL = 32 bits [ 521.434535] SET = 0, FnV = 0 [ 521.437579] EA = 0, S1PTW = 0 [ 521.440730] Data abort info: [ 521.443603] ISV = 0, ISS = 0x00000004 [ 521.447431] CM = 0, WnR = 0 [ 521.450417] user pgtable: 4k pages, 48-bit VAs, pgdp = 0000000072a3da42 [ 521.457022] [0000000000000014] pgd=0000000000000000 [ 521.461916] Internal error: Oops: 96000004 [#1] SMP [ 521.511801] Hardware name: GIGABYTE H270-T70/MT70-HD0, BIOS T49 02/02/2018 [ 521.518664] pstate: 80400005 (Nzcv daif +PAN -UAO) [ 521.523451] pc : nic_remove+0x24/0x88 [nicpf] [ 521.527808] lr : pci_device_remove+0x48/0xd8 [ 521.532066] sp : ffff000013433cc0 [ 521.535370] x29: ffff000013433cc0 x28: ffff810f6ac50000 [ 521.540672] x27: 0000000000000000 x26: 0000000000000000 [ 521.545974] x25: 0000000056000000 x24: 0000000000000015 [ 521.551274] x23: ffff8007ff89a110 x22: ffff000001667070 [ 521.556576] x21: ffff8007ffb170b0 x20: ffff8007ffb17000 [ 521.561877] x19: 0000000000000000 x18: 0000000000000025 [ 521.567178] x17: 0000000000000000 x16: 000000000000010ffc33ff98 x8 : 0000000000000000 [ 521.593683] x7 : 0000000000000000 x6 : 0000000000000001 [ 521.598983] x5 : 0000000000000002 x4 : 0000000000000003 [ 521.604284] x3 : ffff8007ffb17184 x2 : ffff8007ffb17184 [ 521.609585] x1 : ffff000001662118 x0 : ffff000008557be0 [ 521.614887] Process rmmod (pid: 1897, stack limit = 0x00000000859535c3) [ 521.621490] Call trace: [ 521.623928] nic_remove+0x24/0x88 [nicpf] [ 521.627927] pci_device_remove+0x48/0xd8 [ 521.631847] device_release_driver_internal+0x1b0/0x248 [ 521.637062] driver_detach+0x50/0xc0 [ 521.640628] bus_remove_driver+0x60/0x100 [ 521.644627] driver_unregister+0x34/0x60 [ 521.648538] pci_unregister_driver+0x24/0xd8 [ 521.652798] nic_cleanup_module+0x14/0x111c [nicpf] [ 521.657672] __arm64_sys_delete_module+0x150/0x218 [ 521.662460] el0_svc_handler+0x94/0x110 [ 521.666287] el0_svc+0x8/0xc [ 521.669160] Code: aa1e03e0 9102c295 d503201f f9404eb3 (b9401660)
Fixes: 4863dea3fab0 ("net: Adding support for Cavium ThunderX network controller") Signed-off-by: Lorenzo Bianconi lorenzo.bianconi@redhat.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/cavium/thunder/nic_main.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/net/ethernet/cavium/thunder/nic_main.c b/drivers/net/ethernet/cavium/thunder/nic_main.c index fb770b0182d3..d89ec4724efd 100644 --- a/drivers/net/ethernet/cavium/thunder/nic_main.c +++ b/drivers/net/ethernet/cavium/thunder/nic_main.c @@ -1376,6 +1376,9 @@ static void nic_remove(struct pci_dev *pdev) { struct nicpf *nic = pci_get_drvdata(pdev);
+ if (!nic) + return; + if (nic->flags & NIC_SRIOV_ENABLED) pci_disable_sriov(pdev);
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit c9287fa657b3328b4549c0ab39ea7f197a3d6a50 ]
list_for_each_entry_safe() is not safe for deleting entries from the list if the spin lock, which protects it, is released and reacquired during the list iteration. Fix this issue by replacing this construction with a simple check if list is empty and removing the first entry in each iteration. This is almost equivalent to a revert of the commit mentioned in the Fixes: tag.
This patch fixes following issue: --->8--- Unable to handle kernel NULL pointer dereference at virtual address 00000104 pgd = (ptrval) [00000104] *pgd=00000000 Internal error: Oops: 817 [#1] PREEMPT SMP ARM Modules linked in: CPU: 1 PID: 84 Comm: kworker/1:1 Not tainted 4.20.0-rc2-next-20181114-00009-g8266b35ec404 #1061 Hardware name: SAMSUNG EXYNOS (Flattened Device Tree) Workqueue: events eth_work PC is at rx_fill+0x60/0xac LR is at _raw_spin_lock_irqsave+0x50/0x5c pc : [<c065fee0>] lr : [<c0a056b8>] psr: 80000093 sp : ee7fbee8 ip : 00000100 fp : 00000000 r10: 006000c0 r9 : c10b0ab0 r8 : ee7eb5c0 r7 : ee7eb614 r6 : ee7eb5ec r5 : 000000dc r4 : ee12ac00 r3 : ee12ac24 r2 : 00000200 r1 : 60000013 r0 : ee7eb5ec Flags: Nzcv IRQs off FIQs on Mode SVC_32 ISA ARM Segment none Control: 10c5387d Table: 6d5dc04a DAC: 00000051 Process kworker/1:1 (pid: 84, stack limit = 0x(ptrval)) Stack: (0xee7fbee8 to 0xee7fc000) ... [<c065fee0>] (rx_fill) from [<c0143b7c>] (process_one_work+0x200/0x738) [<c0143b7c>] (process_one_work) from [<c0144118>] (worker_thread+0x2c/0x4c8) [<c0144118>] (worker_thread) from [<c014a8a4>] (kthread+0x128/0x164) [<c014a8a4>] (kthread) from [<c01010b4>] (ret_from_fork+0x14/0x20) Exception stack(0xee7fbfb0 to 0xee7fbff8) ... ---[ end trace 64480bc835eba7d6 ]---
Fixes: fea14e68ff5e ("usb: gadget: u_ether: use better list accessors") Signed-off-by: Marek Szyprowski m.szyprowski@samsung.com Signed-off-by: Felipe Balbi felipe.balbi@linux.intel.com
Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/usb/gadget/function/u_ether.c | 11 ++++++----- 1 file changed, 6 insertions(+), 5 deletions(-)
diff --git a/drivers/usb/gadget/function/u_ether.c b/drivers/usb/gadget/function/u_ether.c index bdbc3fdc7c4f..3a0e4f5d7b83 100644 --- a/drivers/usb/gadget/function/u_ether.c +++ b/drivers/usb/gadget/function/u_ether.c @@ -405,12 +405,12 @@ static int alloc_requests(struct eth_dev *dev, struct gether *link, unsigned n) static void rx_fill(struct eth_dev *dev, gfp_t gfp_flags) { struct usb_request *req; - struct usb_request *tmp; unsigned long flags;
/* fill unused rxq slots with some skb */ spin_lock_irqsave(&dev->req_lock, flags); - list_for_each_entry_safe(req, tmp, &dev->rx_reqs, list) { + while (!list_empty(&dev->rx_reqs)) { + req = list_first_entry(&dev->rx_reqs, struct usb_request, list); list_del_init(&req->list); spin_unlock_irqrestore(&dev->req_lock, flags);
@@ -1125,7 +1125,6 @@ void gether_disconnect(struct gether *link) { struct eth_dev *dev = link->ioport; struct usb_request *req; - struct usb_request *tmp;
WARN_ON(!dev); if (!dev) @@ -1142,7 +1141,8 @@ void gether_disconnect(struct gether *link) */ usb_ep_disable(link->in_ep); spin_lock(&dev->req_lock); - list_for_each_entry_safe(req, tmp, &dev->tx_reqs, list) { + while (!list_empty(&dev->tx_reqs)) { + req = list_first_entry(&dev->tx_reqs, struct usb_request, list); list_del(&req->list);
spin_unlock(&dev->req_lock); @@ -1154,7 +1154,8 @@ void gether_disconnect(struct gether *link)
usb_ep_disable(link->out_ep); spin_lock(&dev->req_lock); - list_for_each_entry_safe(req, tmp, &dev->rx_reqs, list) { + while (!list_empty(&dev->rx_reqs)) { + req = list_first_entry(&dev->rx_reqs, struct usb_request, list); list_del(&req->list);
spin_unlock(&dev->req_lock);
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit ca08987885a147643817d02bf260bc4756ce8cd4 ]
There is no expression deactivation call from the rule replacement path, hence, chain counter is not decremented. A few steps to reproduce the problem:
%nft add table ip filter %nft add chain ip filter c1 %nft add chain ip filter c1 %nft add rule ip filter c1 jump c2 %nft replace rule ip filter c1 handle 3 accept %nft flush ruleset
<jump c2> expression means immediate NFT_JUMP to chain c2. Reference count of chain c2 is increased when the rule is added.
When rule is deleted or replaced, the reference counter of c2 should be decreased via nft_rule_expr_deactivate() which calls nft_immediate_deactivate().
Splat looks like: [ 214.396453] WARNING: CPU: 1 PID: 21 at net/netfilter/nf_tables_api.c:1432 nf_tables_chain_destroy.isra.38+0x2f9/0x3a0 [nf_tables] [ 214.398983] Modules linked in: nf_tables nfnetlink [ 214.398983] CPU: 1 PID: 21 Comm: kworker/1:1 Not tainted 4.20.0-rc2+ #44 [ 214.398983] Workqueue: events nf_tables_trans_destroy_work [nf_tables] [ 214.398983] RIP: 0010:nf_tables_chain_destroy.isra.38+0x2f9/0x3a0 [nf_tables] [ 214.398983] Code: 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 8e 00 00 00 48 8b 7b 58 e8 e1 2c 4e c6 48 89 df e8 d9 2c 4e c6 eb 9a <0f> 0b eb 96 0f 0b e9 7e fe ff ff e8 a7 7e 4e c6 e9 a4 fe ff ff e8 [ 214.398983] RSP: 0018:ffff8881152874e8 EFLAGS: 00010202 [ 214.398983] RAX: 0000000000000001 RBX: ffff88810ef9fc28 RCX: ffff8881152876f0 [ 214.398983] RDX: dffffc0000000000 RSI: 1ffff11022a50ede RDI: ffff88810ef9fc78 [ 214.398983] RBP: 1ffff11022a50e9d R08: 0000000080000000 R09: 0000000000000000 [ 214.398983] R10: 0000000000000000 R11: 0000000000000000 R12: 1ffff11022a50eba [ 214.398983] R13: ffff888114446e08 R14: ffff8881152876f0 R15: ffffed1022a50ed6 [ 214.398983] FS: 0000000000000000(0000) GS:ffff888116400000(0000) knlGS:0000000000000000 [ 214.398983] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 214.398983] CR2: 00007fab9bb5f868 CR3: 000000012aa16000 CR4: 00000000001006e0 [ 214.398983] Call Trace: [ 214.398983] ? nf_tables_table_destroy.isra.37+0x100/0x100 [nf_tables] [ 214.398983] ? __kasan_slab_free+0x145/0x180 [ 214.398983] ? nf_tables_trans_destroy_work+0x439/0x830 [nf_tables] [ 214.398983] ? kfree+0xdb/0x280 [ 214.398983] nf_tables_trans_destroy_work+0x5f5/0x830 [nf_tables] [ ... ]
Fixes: bb7b40aecbf7 ("netfilter: nf_tables: bogus EBUSY in chain deletions") Reported by: Christoph Anton Mitterer calestyo@scientia.net Link: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=914505 Link: https://bugzilla.kernel.org/show_bug.cgi?id=201791 Signed-off-by: Taehee Yoo ap420073@gmail.com Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/netfilter/nf_tables_api.c | 15 ++++----------- 1 file changed, 4 insertions(+), 11 deletions(-)
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c index ea1e57daf50e..623ec29ade26 100644 --- a/net/netfilter/nf_tables_api.c +++ b/net/netfilter/nf_tables_api.c @@ -2400,21 +2400,14 @@ static int nf_tables_newrule(struct net *net, struct sock *nlsk, }
if (nlh->nlmsg_flags & NLM_F_REPLACE) { - if (!nft_is_active_next(net, old_rule)) { - err = -ENOENT; - goto err2; - } - trans = nft_trans_rule_add(&ctx, NFT_MSG_DELRULE, - old_rule); + trans = nft_trans_rule_add(&ctx, NFT_MSG_NEWRULE, rule); if (trans == NULL) { err = -ENOMEM; goto err2; } - nft_deactivate_next(net, old_rule); - chain->use--; - - if (nft_trans_rule_add(&ctx, NFT_MSG_NEWRULE, rule) == NULL) { - err = -ENOMEM; + err = nft_delrule(&ctx, old_rule); + if (err < 0) { + nft_trans_destroy(trans); goto err2; }
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 9a24ce5b66f9c8190d63b15f4473600db4935f1f ]
[Description]
In a heavily loaded system where the system pagecache is nearing memory limits and fscache is enabled, pages can be leaked by fscache while trying read pages from cachefiles backend. This can happen because two applications can be reading same page from a single mount, two threads can be trying to read the backing page at same time. This results in one of the threads finding that a page for the backing file or netfs file is already in the radix tree. During the error handling cachefiles does not clean up the reference on backing page, leading to page leak.
[Fix] The fix is straightforward, to decrement the reference when error is encountered.
[dhowells: Note that I've removed the clearance and put of newpage as they aren't attested in the commit message and don't appear to actually achieve anything since a new page is only allocated is newpage!=NULL and any residual new page is cleared before returning.]
[Testing] I have tested the fix using following method for 12+ hrs.
1) mkdir -p /mnt/nfs ; mount -o vers=3,fsc <server_ip>:/export /mnt/nfs 2) create 10000 files of 2.8MB in a NFS mount. 3) start a thread to simulate heavy VM presssure (while true ; do echo 3 > /proc/sys/vm/drop_caches ; sleep 1 ; done)& 4) start multiple parallel reader for data set at same time find /mnt/nfs -type f | xargs -P 80 cat > /dev/null & find /mnt/nfs -type f | xargs -P 80 cat > /dev/null & find /mnt/nfs -type f | xargs -P 80 cat > /dev/null & .. .. find /mnt/nfs -type f | xargs -P 80 cat > /dev/null & find /mnt/nfs -type f | xargs -P 80 cat > /dev/null & 5) finally check using cat /proc/fs/fscache/stats | grep -i pages ; free -h , cat /proc/meminfo and page-types -r -b lru to ensure all pages are freed.
Reviewed-by: Daniel Axtens dja@axtens.net Signed-off-by: Shantanu Goel sgoel01@yahoo.com Signed-off-by: Kiran Kumar Modukuri kiran.modukuri@gmail.com [dja: forward ported to current upstream] Signed-off-by: Daniel Axtens dja@axtens.net Signed-off-by: David Howells dhowells@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/cachefiles/rdwr.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/fs/cachefiles/rdwr.c b/fs/cachefiles/rdwr.c index 199eb396a1bb..54379cf7db7f 100644 --- a/fs/cachefiles/rdwr.c +++ b/fs/cachefiles/rdwr.c @@ -537,7 +537,10 @@ static int cachefiles_read_backing_file(struct cachefiles_object *object, netpage->index, cachefiles_gfp); if (ret < 0) { if (ret == -EEXIST) { + put_page(backpage); + backpage = NULL; put_page(netpage); + netpage = NULL; fscache_retrieval_complete(op, 1); continue; } @@ -610,7 +613,10 @@ static int cachefiles_read_backing_file(struct cachefiles_object *object, netpage->index, cachefiles_gfp); if (ret < 0) { if (ret == -EEXIST) { + put_page(backpage); + backpage = NULL; put_page(netpage); + netpage = NULL; fscache_retrieval_complete(op, 1); continue; }
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit e4c39f7926b4de355f7df75651d75003806aae09 ]
This patch fixes the variable 'phy_word' may be used uninitialized.
Signed-off-by: Yunjian Wang wangyunjian@huawei.com Tested-by: Aaron Brown aaron.f.brown@intel.com Signed-off-by: Jeff Kirsher jeffrey.t.kirsher@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/igb/e1000_i210.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/net/ethernet/intel/igb/e1000_i210.c b/drivers/net/ethernet/intel/igb/e1000_i210.c index 07d48f2e3369..6766081f5ab9 100644 --- a/drivers/net/ethernet/intel/igb/e1000_i210.c +++ b/drivers/net/ethernet/intel/igb/e1000_i210.c @@ -862,6 +862,7 @@ s32 igb_pll_workaround_i210(struct e1000_hw *hw) nvm_word = E1000_INVM_DEFAULT_AL; tmp_nvm = nvm_word | E1000_INVM_PLL_WO_VAL; igb_write_phy_reg_82580(hw, I347AT4_PAGE_SELECT, E1000_PHY_PLL_FREQ_PAGE); + phy_word = E1000_PHY_PLL_UNCONF; for (i = 0; i < E1000_MAX_PLL_TRIES; i++) { /* check current state directly from internal PHY */ igb_read_phy_reg_82580(hw, E1000_PHY_PLL_FREQ_REG, &phy_word);
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit a8bf879af7b1999eba36303ce9cc60e0e7dd816c ]
Add the two 1000BaseLX enum values to the X550's check for 1Gbps modules, allowing the core driver code to establish a link over this SFP type.
This is done by the out-of-tree driver but the fix wasn't in mainline.
Fixes: e23f33367882 ("ixgbe: Fix 1G and 10G link stability for X550EM_x SFP+”) Fixes: 6a14ee0cfb19 ("ixgbe: Add X550 support function pointers") Signed-off-by: Josh Elsasser jelsasser@appneta.com Tested-by: Andrew Bowers andrewx.bowers@intel.com Signed-off-by: Jeff Kirsher jeffrey.t.kirsher@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/ixgbe/ixgbe_x550.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_x550.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_x550.c index cf6a245db6d5..a37c951b0753 100644 --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_x550.c +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_x550.c @@ -2257,7 +2257,9 @@ static s32 ixgbe_get_link_capabilities_X550em(struct ixgbe_hw *hw, *autoneg = false;
if (hw->phy.sfp_type == ixgbe_sfp_type_1g_sx_core0 || - hw->phy.sfp_type == ixgbe_sfp_type_1g_sx_core1) { + hw->phy.sfp_type == ixgbe_sfp_type_1g_sx_core1 || + hw->phy.sfp_type == ixgbe_sfp_type_1g_lx_core0 || + hw->phy.sfp_type == ixgbe_sfp_type_1g_lx_core1) { *speed = IXGBE_LINK_SPEED_1GB_FULL; return 0; }
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit c758940158bf29fe14e9d0f89d5848f227b48134 ]
The net device ndev is freed via free_netdev when failing to register the device. The control flow then jumps to the error handling code block. ndev is used and freed again. Resulting in a use-after-free bug.
Signed-off-by: Pan Bian bianpan2016@163.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/hisilicon/hip04_eth.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/drivers/net/ethernet/hisilicon/hip04_eth.c b/drivers/net/ethernet/hisilicon/hip04_eth.c index 0cec06bec63e..c27054b8ce81 100644 --- a/drivers/net/ethernet/hisilicon/hip04_eth.c +++ b/drivers/net/ethernet/hisilicon/hip04_eth.c @@ -914,10 +914,8 @@ static int hip04_mac_probe(struct platform_device *pdev) }
ret = register_netdev(ndev); - if (ret) { - free_netdev(ndev); + if (ret) goto alloc_fail; - }
return 0;
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit ad97d9de45835b6a0f71983b0ae0cffd7306730a ]
Driver shouldn't try to access any GFX registers until RLC is idle. During the test, it took 12 seconds for RLC to clear the BUSY bit in RLC_GPM_STAT register which is un-acceptable for driver. As per RLC engineer, it would take RLC Ucode less than 10,000 GFXCLK cycles to finish its critical section. In a lowest 300M enginer clock setting(default from vbios), 50 us delay is enough.
This commit fix the hang when RLC introduce the work around for XGMI which requires more cycles to setup more registers than normal
Signed-off-by: shaoyunl shaoyun.liu@amd.com Acked-by: Felix Kuehling Felix.Kuehling@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c index 3981915e2311..b2eecfc9042e 100644 --- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c @@ -1992,12 +1992,13 @@ static void gfx_v9_0_rlc_start(struct amdgpu_device *adev) #endif
WREG32_FIELD15(GC, 0, RLC_CNTL, RLC_ENABLE_F32, 1); + udelay(50);
/* carrizo do enable cp interrupt after cp inited */ - if (!(adev->flags & AMD_IS_APU)) + if (!(adev->flags & AMD_IS_APU)) { gfx_v9_0_enable_gui_idle_interrupt(adev, true); - - udelay(50); + udelay(50); + }
#ifdef AMDGPU_RLC_DEBUG_RETRY /* RLC_GPM_GENERAL_6 : RLC Ucode version */
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 300625620314194d9e6d4f6dda71f2dc9cf62d9f ]
v1: over-sample data to increase the stability with some specific monitors v2: refine to avoid infinite loop v3: remove un-necessary "volatile" declaration
[airlied: fix two checkpatch warnings]
Signed-off-by: Y.C. Chen yc_chen@aspeedtech.com Signed-off-by: Dave Airlie airlied@redhat.com Link: https://patchwork.freedesktop.org/patch/msgid/1542858988-1127-1-git-send-ema... Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/ast/ast_mode.c | 36 ++++++++++++++++++++++++++++------ 1 file changed, 30 insertions(+), 6 deletions(-)
diff --git a/drivers/gpu/drm/ast/ast_mode.c b/drivers/gpu/drm/ast/ast_mode.c index fae1176b2472..343867b182dd 100644 --- a/drivers/gpu/drm/ast/ast_mode.c +++ b/drivers/gpu/drm/ast/ast_mode.c @@ -973,9 +973,21 @@ static int get_clock(void *i2c_priv) { struct ast_i2c_chan *i2c = i2c_priv; struct ast_private *ast = i2c->dev->dev_private; - uint32_t val; + uint32_t val, val2, count, pass; + + count = 0; + pass = 0; + val = (ast_get_index_reg_mask(ast, AST_IO_CRTC_PORT, 0xb7, 0x10) >> 4) & 0x01; + do { + val2 = (ast_get_index_reg_mask(ast, AST_IO_CRTC_PORT, 0xb7, 0x10) >> 4) & 0x01; + if (val == val2) { + pass++; + } else { + pass = 0; + val = (ast_get_index_reg_mask(ast, AST_IO_CRTC_PORT, 0xb7, 0x10) >> 4) & 0x01; + } + } while ((pass < 5) && (count++ < 0x10000));
- val = ast_get_index_reg_mask(ast, AST_IO_CRTC_PORT, 0xb7, 0x10) >> 4; return val & 1 ? 1 : 0; }
@@ -983,9 +995,21 @@ static int get_data(void *i2c_priv) { struct ast_i2c_chan *i2c = i2c_priv; struct ast_private *ast = i2c->dev->dev_private; - uint32_t val; + uint32_t val, val2, count, pass; + + count = 0; + pass = 0; + val = (ast_get_index_reg_mask(ast, AST_IO_CRTC_PORT, 0xb7, 0x20) >> 5) & 0x01; + do { + val2 = (ast_get_index_reg_mask(ast, AST_IO_CRTC_PORT, 0xb7, 0x20) >> 5) & 0x01; + if (val == val2) { + pass++; + } else { + pass = 0; + val = (ast_get_index_reg_mask(ast, AST_IO_CRTC_PORT, 0xb7, 0x20) >> 5) & 0x01; + } + } while ((pass < 5) && (count++ < 0x10000));
- val = ast_get_index_reg_mask(ast, AST_IO_CRTC_PORT, 0xb7, 0x20) >> 5; return val & 1 ? 1 : 0; }
@@ -998,7 +1022,7 @@ static void set_clock(void *i2c_priv, int clock)
for (i = 0; i < 0x10000; i++) { ujcrb7 = ((clock & 0x01) ? 0 : 1); - ast_set_index_reg_mask(ast, AST_IO_CRTC_PORT, 0xb7, 0xfe, ujcrb7); + ast_set_index_reg_mask(ast, AST_IO_CRTC_PORT, 0xb7, 0xf4, ujcrb7); jtemp = ast_get_index_reg_mask(ast, AST_IO_CRTC_PORT, 0xb7, 0x01); if (ujcrb7 == jtemp) break; @@ -1014,7 +1038,7 @@ static void set_data(void *i2c_priv, int data)
for (i = 0; i < 0x10000; i++) { ujcrb7 = ((data & 0x01) ? 0 : 1) << 2; - ast_set_index_reg_mask(ast, AST_IO_CRTC_PORT, 0xb7, 0xfb, ujcrb7); + ast_set_index_reg_mask(ast, AST_IO_CRTC_PORT, 0xb7, 0xf1, ujcrb7); jtemp = ast_get_index_reg_mask(ast, AST_IO_CRTC_PORT, 0xb7, 0x04); if (ujcrb7 == jtemp) break;
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 72791ac854fea36034fa7976b748fde585008e78 ]
Add a missing header otherwise compiler warns about missed prototype:
drivers/xen/xlate_mmu.c:183:5: warning: no previous prototype for 'xen_xlate_unmap_gfn_range?' [-Wmissing-prototypes] int xen_xlate_unmap_gfn_range(struct vm_area_struct *vma, ^~~~~~~~~~~~~~~~~~~~~~~~~
Signed-off-by: Srikanth Boddepalli boddepalli.srikanth@gmail.com Reviewed-by: Boris Ostrovsky boris.ostrovsky@oracle.com Reviewed-by: Joey Pabalinas joeypabalinas@gmail.com Signed-off-by: Juergen Gross jgross@suse.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/xen/xlate_mmu.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/xen/xlate_mmu.c b/drivers/xen/xlate_mmu.c index 23f1387b3ef7..e7df65d32c91 100644 --- a/drivers/xen/xlate_mmu.c +++ b/drivers/xen/xlate_mmu.c @@ -36,6 +36,7 @@ #include <asm/xen/hypervisor.h>
#include <xen/xen.h> +#include <xen/xen-ops.h> #include <xen/page.h> #include <xen/interface/xen.h> #include <xen/interface/memory.h>
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 123664101aa2156d05251704fc63f9bcbf77741a ]
This reverts commit b3cf8528bb21febb650a7ecbf080d0647be40b9f.
That commit unintentionally broke Xen balloon memory hotplug with "hotplug_unpopulated" set to 1. As long as "System RAM" resource got assigned under a new "Unusable memory" resource in IO/Mem tree any attempt to online this memory would fail due to general kernel restrictions on having "System RAM" resources as 1st level only.
The original issue that commit has tried to workaround fa564ad96366 ("x86/PCI: Enable a 64bit BAR on AMD Family 15h (Models 00-1f, 30-3f, 60-7f)") also got amended by the following 03a551734 ("x86/PCI: Move and shrink AMD 64-bit window to avoid conflict") which made the original fix to Xen ballooning unnecessary.
Signed-off-by: Igor Druzhinin igor.druzhinin@citrix.com Reviewed-by: Boris Ostrovsky boris.ostrovsky@oracle.com Signed-off-by: Juergen Gross jgross@suse.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/xen/enlighten.c | 78 ---------------------------------------- arch/x86/xen/setup.c | 6 ++-- drivers/xen/balloon.c | 65 +++++---------------------------- include/xen/balloon.h | 5 --- 4 files changed, 13 insertions(+), 141 deletions(-)
diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c index df208af3cd74..515d5e4414c2 100644 --- a/arch/x86/xen/enlighten.c +++ b/arch/x86/xen/enlighten.c @@ -7,7 +7,6 @@
#include <xen/features.h> #include <xen/page.h> -#include <xen/interface/memory.h>
#include <asm/xen/hypercall.h> #include <asm/xen/hypervisor.h> @@ -336,80 +335,3 @@ void xen_arch_unregister_cpu(int num) } EXPORT_SYMBOL(xen_arch_unregister_cpu); #endif - -#ifdef CONFIG_XEN_BALLOON_MEMORY_HOTPLUG -void __init arch_xen_balloon_init(struct resource *hostmem_resource) -{ - struct xen_memory_map memmap; - int rc; - unsigned int i, last_guest_ram; - phys_addr_t max_addr = PFN_PHYS(max_pfn); - struct e820_table *xen_e820_table; - const struct e820_entry *entry; - struct resource *res; - - if (!xen_initial_domain()) - return; - - xen_e820_table = kmalloc(sizeof(*xen_e820_table), GFP_KERNEL); - if (!xen_e820_table) - return; - - memmap.nr_entries = ARRAY_SIZE(xen_e820_table->entries); - set_xen_guest_handle(memmap.buffer, xen_e820_table->entries); - rc = HYPERVISOR_memory_op(XENMEM_machine_memory_map, &memmap); - if (rc) { - pr_warn("%s: Can't read host e820 (%d)\n", __func__, rc); - goto out; - } - - last_guest_ram = 0; - for (i = 0; i < memmap.nr_entries; i++) { - if (xen_e820_table->entries[i].addr >= max_addr) - break; - if (xen_e820_table->entries[i].type == E820_TYPE_RAM) - last_guest_ram = i; - } - - entry = &xen_e820_table->entries[last_guest_ram]; - if (max_addr >= entry->addr + entry->size) - goto out; /* No unallocated host RAM. */ - - hostmem_resource->start = max_addr; - hostmem_resource->end = entry->addr + entry->size; - - /* - * Mark non-RAM regions between the end of dom0 RAM and end of host RAM - * as unavailable. The rest of that region can be used for hotplug-based - * ballooning. - */ - for (; i < memmap.nr_entries; i++) { - entry = &xen_e820_table->entries[i]; - - if (entry->type == E820_TYPE_RAM) - continue; - - if (entry->addr >= hostmem_resource->end) - break; - - res = kzalloc(sizeof(*res), GFP_KERNEL); - if (!res) - goto out; - - res->name = "Unavailable host RAM"; - res->start = entry->addr; - res->end = (entry->addr + entry->size < hostmem_resource->end) ? - entry->addr + entry->size : hostmem_resource->end; - rc = insert_resource(hostmem_resource, res); - if (rc) { - pr_warn("%s: Can't insert [%llx - %llx) (%d)\n", - __func__, res->start, res->end, rc); - kfree(res); - goto out; - } - } - - out: - kfree(xen_e820_table); -} -#endif /* CONFIG_XEN_BALLOON_MEMORY_HOTPLUG */ diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c index 6e0d2086eacb..c114ca767b3b 100644 --- a/arch/x86/xen/setup.c +++ b/arch/x86/xen/setup.c @@ -808,6 +808,7 @@ char * __init xen_memory_setup(void) addr = xen_e820_table.entries[0].addr; size = xen_e820_table.entries[0].size; while (i < xen_e820_table.nr_entries) { + bool discard = false;
chunk_size = size; type = xen_e820_table.entries[i].type; @@ -823,10 +824,11 @@ char * __init xen_memory_setup(void) xen_add_extra_mem(pfn_s, n_pfns); xen_max_p2m_pfn = pfn_s + n_pfns; } else - type = E820_TYPE_UNUSABLE; + discard = true; }
- xen_align_and_add_e820_region(addr, chunk_size, type); + if (!discard) + xen_align_and_add_e820_region(addr, chunk_size, type);
addr += chunk_size; size -= chunk_size; diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c index 065f0b607373..f77e499afddd 100644 --- a/drivers/xen/balloon.c +++ b/drivers/xen/balloon.c @@ -257,25 +257,10 @@ static void release_memory_resource(struct resource *resource) kfree(resource); }
-/* - * Host memory not allocated to dom0. We can use this range for hotplug-based - * ballooning. - * - * It's a type-less resource. Setting IORESOURCE_MEM will make resource - * management algorithms (arch_remove_reservations()) look into guest e820, - * which we don't want. - */ -static struct resource hostmem_resource = { - .name = "Host RAM", -}; - -void __attribute__((weak)) __init arch_xen_balloon_init(struct resource *res) -{} - static struct resource *additional_memory_resource(phys_addr_t size) { - struct resource *res, *res_hostmem; - int ret = -ENOMEM; + struct resource *res; + int ret;
res = kzalloc(sizeof(*res), GFP_KERNEL); if (!res) @@ -284,42 +269,13 @@ static struct resource *additional_memory_resource(phys_addr_t size) res->name = "System RAM"; res->flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
- res_hostmem = kzalloc(sizeof(*res), GFP_KERNEL); - if (res_hostmem) { - /* Try to grab a range from hostmem */ - res_hostmem->name = "Host memory"; - ret = allocate_resource(&hostmem_resource, res_hostmem, - size, 0, -1, - PAGES_PER_SECTION * PAGE_SIZE, NULL, NULL); - } - - if (!ret) { - /* - * Insert this resource into iomem. Because hostmem_resource - * tracks portion of guest e820 marked as UNUSABLE noone else - * should try to use it. - */ - res->start = res_hostmem->start; - res->end = res_hostmem->end; - ret = insert_resource(&iomem_resource, res); - if (ret < 0) { - pr_err("Can't insert iomem_resource [%llx - %llx]\n", - res->start, res->end); - release_memory_resource(res_hostmem); - res_hostmem = NULL; - res->start = res->end = 0; - } - } - - if (ret) { - ret = allocate_resource(&iomem_resource, res, - size, 0, -1, - PAGES_PER_SECTION * PAGE_SIZE, NULL, NULL); - if (ret < 0) { - pr_err("Cannot allocate new System RAM resource\n"); - kfree(res); - return NULL; - } + ret = allocate_resource(&iomem_resource, res, + size, 0, -1, + PAGES_PER_SECTION * PAGE_SIZE, NULL, NULL); + if (ret < 0) { + pr_err("Cannot allocate new System RAM resource\n"); + kfree(res); + return NULL; }
#ifdef CONFIG_SPARSEMEM @@ -331,7 +287,6 @@ static struct resource *additional_memory_resource(phys_addr_t size) pr_err("New System RAM resource outside addressable RAM (%lu > %lu)\n", pfn, limit); release_memory_resource(res); - release_memory_resource(res_hostmem); return NULL; } } @@ -810,8 +765,6 @@ static int __init balloon_init(void) set_online_page_callback(&xen_online_page); register_memory_notifier(&xen_memory_nb); register_sysctl_table(xen_root); - - arch_xen_balloon_init(&hostmem_resource); #endif
#ifdef CONFIG_XEN_PV diff --git a/include/xen/balloon.h b/include/xen/balloon.h index 61f410fd74e4..4914b93a23f2 100644 --- a/include/xen/balloon.h +++ b/include/xen/balloon.h @@ -44,8 +44,3 @@ static inline void xen_balloon_init(void) { } #endif - -#ifdef CONFIG_XEN_BALLOON_MEMORY_HOTPLUG -struct resource; -void arch_xen_balloon_init(struct resource *hostmem_resource); -#endif
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 89d328f637b9904b6d4c9af73c8a608b8dd4d6f8 ]
The actual number of bytes stored in a PRZ is smaller than the bytes requested by platform data, since there is a header on each PRZ. Additionally, if ECC is enabled, there are trailing bytes used as well. Normally this mismatch doesn't matter since PRZs are circular buffers and the leading "overflow" bytes are just thrown away. However, in the case of a compressed record, this rather badly corrupts the results.
This corruption was visible with "ramoops.mem_size=204800 ramoops.ecc=1". Any stored crashes would not be uncompressable (producing a pstorefs "dmesg-*.enc.z" file), and triggering errors at boot:
[ 2.790759] pstore: crypto_comp_decompress failed, ret = -22!
Backporting this depends on commit 70ad35db3321 ("pstore: Convert console write to use ->write_buf")
Reported-by: Joel Fernandes joel@joelfernandes.org Fixes: b0aad7a99c1d ("pstore: Add compression support to pstore") Signed-off-by: Kees Cook keescook@chromium.org Reviewed-by: Joel Fernandes (Google) joel@joelfernandes.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/pstore/ram.c | 15 ++++++--------- include/linux/pstore.h | 5 ++++- 2 files changed, 10 insertions(+), 10 deletions(-)
diff --git a/fs/pstore/ram.c b/fs/pstore/ram.c index 7125b398d312..9f7e546d7050 100644 --- a/fs/pstore/ram.c +++ b/fs/pstore/ram.c @@ -804,17 +804,14 @@ static int ramoops_probe(struct platform_device *pdev)
cxt->pstore.data = cxt; /* - * Console can handle any buffer size, so prefer LOG_LINE_MAX. If we - * have to handle dumps, we must have at least record_size buffer. And - * for ftrace, bufsize is irrelevant (if bufsize is 0, buf will be - * ZERO_SIZE_PTR). + * Since bufsize is only used for dmesg crash dumps, it + * must match the size of the dprz record (after PRZ header + * and ECC bytes have been accounted for). */ - if (cxt->console_size) - cxt->pstore.bufsize = 1024; /* LOG_LINE_MAX */ - cxt->pstore.bufsize = max(cxt->record_size, cxt->pstore.bufsize); - cxt->pstore.buf = kmalloc(cxt->pstore.bufsize, GFP_KERNEL); + cxt->pstore.bufsize = cxt->dprzs[0]->buffer_size; + cxt->pstore.buf = kzalloc(cxt->pstore.bufsize, GFP_KERNEL); if (!cxt->pstore.buf) { - pr_err("cannot allocate pstore buffer\n"); + pr_err("cannot allocate pstore crash dump buffer\n"); err = -ENOMEM; goto fail_clear; } diff --git a/include/linux/pstore.h b/include/linux/pstore.h index 61f806a7fe29..170bb981d2fd 100644 --- a/include/linux/pstore.h +++ b/include/linux/pstore.h @@ -90,7 +90,10 @@ struct pstore_record { * * @buf_lock: spinlock to serialize access to @buf * @buf: preallocated crash dump buffer - * @bufsize: size of @buf available for crash dump writes + * @bufsize: size of @buf available for crash dump bytes (must match + * smallest number of bytes available for writing to a + * backend entry, since compressed bytes don't take kindly + * to being truncated) * * @read_mutex: serializes @open, @read, @close, and @erase callbacks * @flags: bitfield of frontends the backend can accept writes for
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit c5a94f434c82529afda290df3235e4d85873c5b4 ]
It was observed that a process blocked indefintely in __fscache_read_or_alloc_page(), waiting for FSCACHE_COOKIE_LOOKING_UP to be cleared via fscache_wait_for_deferred_lookup().
At this time, ->backing_objects was empty, which would normaly prevent __fscache_read_or_alloc_page() from getting to the point of waiting. This implies that ->backing_objects was cleared *after* __fscache_read_or_alloc_page was was entered.
When an object is "killed" and then "dropped", FSCACHE_COOKIE_LOOKING_UP is cleared in fscache_lookup_failure(), then KILL_OBJECT and DROP_OBJECT are "called" and only in DROP_OBJECT is ->backing_objects cleared. This leaves a window where something else can set FSCACHE_COOKIE_LOOKING_UP and __fscache_read_or_alloc_page() can start waiting, before ->backing_objects is cleared
There is some uncertainty in this analysis, but it seems to be fit the observations. Adding the wake in this patch will be handled correctly by __fscache_read_or_alloc_page(), as it checks if ->backing_objects is empty again, after waiting.
Customer which reported the hang, also report that the hang cannot be reproduced with this fix.
The backtrace for the blocked process looked like:
PID: 29360 TASK: ffff881ff2ac0f80 CPU: 3 COMMAND: "zsh" #0 [ffff881ff43efbf8] schedule at ffffffff815e56f1 #1 [ffff881ff43efc58] bit_wait at ffffffff815e64ed #2 [ffff881ff43efc68] __wait_on_bit at ffffffff815e61b8 #3 [ffff881ff43efca0] out_of_line_wait_on_bit at ffffffff815e625e #4 [ffff881ff43efd08] fscache_wait_for_deferred_lookup at ffffffffa04f2e8f [fscache] #5 [ffff881ff43efd18] __fscache_read_or_alloc_page at ffffffffa04f2ffe [fscache] #6 [ffff881ff43efd58] __nfs_readpage_from_fscache at ffffffffa0679668 [nfs] #7 [ffff881ff43efd78] nfs_readpage at ffffffffa067092b [nfs] #8 [ffff881ff43efda0] generic_file_read_iter at ffffffff81187a73 #9 [ffff881ff43efe50] nfs_file_read at ffffffffa066544b [nfs] #10 [ffff881ff43efe70] __vfs_read at ffffffff811fc756 #11 [ffff881ff43efee8] vfs_read at ffffffff811fccfa #12 [ffff881ff43eff18] sys_read at ffffffff811fda62 #13 [ffff881ff43eff50] entry_SYSCALL_64_fastpath at ffffffff815e986e
Signed-off-by: NeilBrown neilb@suse.com Signed-off-by: David Howells dhowells@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/fscache/object.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/fs/fscache/object.c b/fs/fscache/object.c index 7a182c87f378..ab1d7f35f6c2 100644 --- a/fs/fscache/object.c +++ b/fs/fscache/object.c @@ -715,6 +715,9 @@ static const struct fscache_state *fscache_drop_object(struct fscache_object *ob
if (awaken) wake_up_bit(&cookie->flags, FSCACHE_COOKIE_INVALIDATING); + if (test_and_clear_bit(FSCACHE_COOKIE_LOOKING_UP, &cookie->flags)) + wake_up_bit(&cookie->flags, FSCACHE_COOKIE_LOOKING_UP); +
/* Prevent a race with our last child, which has to signal EV_CLEARED * before dropping our spinlock.
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 31ffa563833576bd49a8bf53120568312755e6e2 ]
Variable 'cache' is being assigned but is never used hence it is redundant and can be removed.
Cleans up clang warning: warning: variable 'cache' set but not used [-Wunused-but-set-variable]
Signed-off-by: Colin Ian King colin.king@canonical.com Signed-off-by: David Howells dhowells@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/cachefiles/rdwr.c | 3 --- 1 file changed, 3 deletions(-)
diff --git a/fs/cachefiles/rdwr.c b/fs/cachefiles/rdwr.c index 54379cf7db7f..5e9176ec0d3a 100644 --- a/fs/cachefiles/rdwr.c +++ b/fs/cachefiles/rdwr.c @@ -969,11 +969,8 @@ int cachefiles_write_page(struct fscache_storage *op, struct page *page) void cachefiles_uncache_page(struct fscache_object *_object, struct page *page) { struct cachefiles_object *object; - struct cachefiles_cache *cache;
object = container_of(_object, struct cachefiles_object, fscache); - cache = container_of(object->fscache.cache, - struct cachefiles_cache, cache);
_enter("%p,{%lu}", object, page->index);
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit f6c8e432cb0479255322c5d0335b9f1699a0270c ]
nvme_stop_ctrl can be called also for reset flow and there is no need to flush the scan_work as namespaces are not being removed. This can cause deadlock in rdma, fc and loop drivers since nvme_stop_ctrl barriers before controller teardown (and specifically I/O cancellation of the scan_work itself) takes place, but the scan_work will be blocked anyways so there is no need to flush it.
Instead, move scan_work flush to nvme_remove_namespaces() where it really needs to flush.
Reported-by: Ming Lei ming.lei@redhat.com Signed-off-by: Sagi Grimberg sagi@grimberg.me Reviewed-by: Keith Busch keith.busch@intel.com Reviewed by: James Smart jsmart2021@gmail.com Tested-by: Ewan D. Milne emilne@redhat.com Signed-off-by: Christoph Hellwig hch@lst.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/nvme/host/core.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c index 3a63d58d2ca9..65f3f1a34b6b 100644 --- a/drivers/nvme/host/core.c +++ b/drivers/nvme/host/core.c @@ -2572,6 +2572,9 @@ void nvme_remove_namespaces(struct nvme_ctrl *ctrl) { struct nvme_ns *ns, *next;
+ /* prevent racing with ns scanning */ + flush_work(&ctrl->scan_work); + /* * The dead states indicates the controller was not gracefully * disconnected. In that case, we won't be able to flush any data while @@ -2743,7 +2746,6 @@ void nvme_stop_ctrl(struct nvme_ctrl *ctrl) { nvme_stop_keep_alive(ctrl); flush_work(&ctrl->async_event_work); - flush_work(&ctrl->scan_work); cancel_work_sync(&ctrl->fw_act_work); } EXPORT_SYMBOL_GPL(nvme_stop_ctrl);
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit ea2412dc21cc790335d319181dddc43682aef164 ]
Running the Clang static analyzer on IORT code detected the following error:
Logic error: Branch condition evaluates to a garbage value
in
iort_get_platform_device_domain()
If the named component associated with a given device has no IORT mappings, iort_get_platform_device_domain() exits its MSI mapping loop with msi_parent pointer containing garbage, which can lead to erroneous code path execution.
Initialize the msi_parent pointer, fixing the bug.
Fixes: d4f54a186667 ("ACPI: platform: setup MSI domain for ACPI based platform device") Reported-by: Patrick Bellasi patrick.bellasi@arm.com Reviewed-by: Hanjun Guo hanjun.guo@linaro.org Acked-by: Will Deacon will.deacon@arm.com Cc: Sudeep Holla sudeep.holla@arm.com Cc: "Rafael J. Wysocki" rjw@rjwysocki.net Signed-off-by: Lorenzo Pieralisi lorenzo.pieralisi@arm.com Signed-off-by: Catalin Marinas catalin.marinas@arm.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/acpi/arm64/iort.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/acpi/arm64/iort.c b/drivers/acpi/arm64/iort.c index de56394dd161..ca414910710e 100644 --- a/drivers/acpi/arm64/iort.c +++ b/drivers/acpi/arm64/iort.c @@ -547,7 +547,7 @@ struct irq_domain *iort_get_device_domain(struct device *dev, u32 req_id) */ static struct irq_domain *iort_get_platform_device_domain(struct device *dev) { - struct acpi_iort_node *node, *msi_parent; + struct acpi_iort_node *node, *msi_parent = NULL; struct fwnode_handle *iort_fwnode; struct acpi_iort_its_group *its; int i;
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit e21e57445a64598b29a6f629688f9b9a39e7242a ]
ocfs2_defrag_extent may fall into deadlock.
ocfs2_ioctl_move_extents ocfs2_ioctl_move_extents ocfs2_move_extents ocfs2_defrag_extent ocfs2_lock_allocators_move_extents
ocfs2_reserve_clusters inode_lock GLOBAL_BITMAP_SYSTEM_INODE
__ocfs2_flush_truncate_log inode_lock GLOBAL_BITMAP_SYSTEM_INODE
As backtrace shows above, ocfs2_reserve_clusters() will call inode_lock against the global bitmap if local allocator has not sufficient cluters. Once global bitmap could meet the demand, ocfs2_reserve_cluster will return success with global bitmap locked.
After ocfs2_reserve_cluster(), if truncate log is full, __ocfs2_flush_truncate_log() will definitely fall into deadlock because it needs to inode_lock global bitmap, which has already been locked.
To fix this bug, we could remove from ocfs2_lock_allocators_move_extents() the code which intends to lock global allocator, and put the removed code after __ocfs2_flush_truncate_log().
ocfs2_lock_allocators_move_extents() is referred by 2 places, one is here, the other does not need the data allocator context, which means this patch does not affect the caller so far.
Link: http://lkml.kernel.org/r/20181101071422.14470-1-lchen@suse.com Signed-off-by: Larry Chen lchen@suse.com Reviewed-by: Changwei Ge ge.changwei@h3c.com Cc: Mark Fasheh mark@fasheh.com Cc: Joel Becker jlbec@evilplan.org Cc: Junxiao Bi junxiao.bi@oracle.com Cc: Joseph Qi jiangqi903@gmail.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/ocfs2/move_extents.c | 47 +++++++++++++++++++++++------------------ 1 file changed, 26 insertions(+), 21 deletions(-)
diff --git a/fs/ocfs2/move_extents.c b/fs/ocfs2/move_extents.c index 7eb3b0a6347e..f55f82ca3425 100644 --- a/fs/ocfs2/move_extents.c +++ b/fs/ocfs2/move_extents.c @@ -156,18 +156,14 @@ static int __ocfs2_move_extent(handle_t *handle, }
/* - * lock allocators, and reserving appropriate number of bits for - * meta blocks and data clusters. - * - * in some cases, we don't need to reserve clusters, just let data_ac - * be NULL. + * lock allocator, and reserve appropriate number of bits for + * meta blocks. */ -static int ocfs2_lock_allocators_move_extents(struct inode *inode, +static int ocfs2_lock_meta_allocator_move_extents(struct inode *inode, struct ocfs2_extent_tree *et, u32 clusters_to_move, u32 extents_to_split, struct ocfs2_alloc_context **meta_ac, - struct ocfs2_alloc_context **data_ac, int extra_blocks, int *credits) { @@ -192,13 +188,6 @@ static int ocfs2_lock_allocators_move_extents(struct inode *inode, goto out; }
- if (data_ac) { - ret = ocfs2_reserve_clusters(osb, clusters_to_move, data_ac); - if (ret) { - mlog_errno(ret); - goto out; - } - }
*credits += ocfs2_calc_extend_credits(osb->sb, et->et_root_el);
@@ -257,10 +246,10 @@ static int ocfs2_defrag_extent(struct ocfs2_move_extents_context *context, } }
- ret = ocfs2_lock_allocators_move_extents(inode, &context->et, *len, 1, - &context->meta_ac, - &context->data_ac, - extra_blocks, &credits); + ret = ocfs2_lock_meta_allocator_move_extents(inode, &context->et, + *len, 1, + &context->meta_ac, + extra_blocks, &credits); if (ret) { mlog_errno(ret); goto out; @@ -283,6 +272,21 @@ static int ocfs2_defrag_extent(struct ocfs2_move_extents_context *context, } }
+ /* + * Make sure ocfs2_reserve_cluster is called after + * __ocfs2_flush_truncate_log, otherwise, dead lock may happen. + * + * If ocfs2_reserve_cluster is called + * before __ocfs2_flush_truncate_log, dead lock on global bitmap + * may happen. + * + */ + ret = ocfs2_reserve_clusters(osb, *len, &context->data_ac); + if (ret) { + mlog_errno(ret); + goto out_unlock_mutex; + } + handle = ocfs2_start_trans(osb, credits); if (IS_ERR(handle)) { ret = PTR_ERR(handle); @@ -600,9 +604,10 @@ static int ocfs2_move_extent(struct ocfs2_move_extents_context *context, } }
- ret = ocfs2_lock_allocators_move_extents(inode, &context->et, len, 1, - &context->meta_ac, - NULL, extra_blocks, &credits); + ret = ocfs2_lock_meta_allocator_move_extents(inode, &context->et, + len, 1, + &context->meta_ac, + extra_blocks, &credits); if (ret) { mlog_errno(ret); goto out;
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 8f416836c0d50b198cad1225132e5abebf8980dc ]
init_currently_empty_zone() will adjust pgdat->nr_zones and set it to 'zone_idx(zone) + 1' unconditionally. This is correct in the normal case, while not exact in hot-plug situation.
This function is used in two places:
* free_area_init_core() * move_pfn_range_to_zone()
In the first case, we are sure zone index increase monotonically. While in the second one, this is under users control.
One way to reproduce this is: ----------------------------
1. create a virtual machine with empty node1
-m 4G,slots=32,maxmem=32G \ -smp 4,maxcpus=8 \ -numa node,nodeid=0,mem=4G,cpus=0-3 \ -numa node,nodeid=1,mem=0G,cpus=4-7
2. hot-add cpu 3-7
cpu-add [3-7]
2. hot-add memory to nod1
object_add memory-backend-ram,id=ram0,size=1G device_add pc-dimm,id=dimm0,memdev=ram0,node=1
3. online memory with following order
echo online_movable > memory47/state echo online > memory40/state
After this, node1 will have its nr_zones equals to (ZONE_NORMAL + 1) instead of (ZONE_MOVABLE + 1).
Michal said: "Having an incorrect nr_zones might result in all sorts of problems which would be quite hard to debug (e.g. reclaim not considering the movable zone). I do not expect many users would suffer from this it but still this is trivial and obviously right thing to do so backporting to the stable tree shouldn't be harmful (last famous words)"
Link: http://lkml.kernel.org/r/20181117022022.9956-1-richard.weiyang@gmail.com Fixes: f1dd2cd13c4b ("mm, memory_hotplug: do not associate hotadded memory to zones until online") Signed-off-by: Wei Yang richard.weiyang@gmail.com Acked-by: Michal Hocko mhocko@suse.com Reviewed-by: Oscar Salvador osalvador@suse.de Cc: Anshuman Khandual anshuman.khandual@arm.com Cc: Dave Hansen dave.hansen@intel.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org
Signed-off-by: Sasha Levin sashal@kernel.org --- mm/page_alloc.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 6be91a1a00d9..a2f365f40433 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -5544,8 +5544,10 @@ void __meminit init_currently_empty_zone(struct zone *zone, unsigned long size) { struct pglist_data *pgdat = zone->zone_pgdat; + int zone_idx = zone_idx(zone) + 1;
- pgdat->nr_zones = zone_idx(zone) + 1; + if (zone_idx > pgdat->nr_zones) + pgdat->nr_zones = zone_idx;
zone->zone_start_pfn = zone_start_pfn;
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit ce96a407adef126870b3f4a1b73529dd8aa80f49 ]
hfs_bmap_free() frees the node via hfs_bnode_put(node). However, it then reads node->this when dumping error message on an error path, which may result in a use-after-free bug. This patch frees the node only when it is never again used.
Link: http://lkml.kernel.org/r/1542963889-128825-1-git-send-email-bianpan2016@163.... Fixes: a1185ffa2fc ("HFS rewrite") Signed-off-by: Pan Bian bianpan2016@163.com Reviewed-by: Andrew Morton akpm@linux-foundation.org Cc: Joe Perches joe@perches.com Cc: Ernesto A. Fernandez ernesto.mnd.fernandez@gmail.com Cc: Viacheslav Dubeyko slava@dubeyko.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/hfs/btree.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/fs/hfs/btree.c b/fs/hfs/btree.c index 374b5688e29e..9bdff5e40626 100644 --- a/fs/hfs/btree.c +++ b/fs/hfs/btree.c @@ -329,13 +329,14 @@ void hfs_bmap_free(struct hfs_bnode *node)
nidx -= len * 8; i = node->next; - hfs_bnode_put(node); if (!i) { /* panic */; pr_crit("unable to free bnode %u. bmap not found!\n", node->this); + hfs_bnode_put(node); return; } + hfs_bnode_put(node); node = hfs_bnode_find(tree, i); if (IS_ERR(node)) return;
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit c7d7d620dcbd2a1c595092280ca943f2fced7bbd ]
hfs_bmap_free() frees node via hfs_bnode_put(node). However it then reads node->this when dumping error message on an error path, which may result in a use-after-free bug. This patch frees node only when it is never used.
Link: http://lkml.kernel.org/r/1543053441-66942-1-git-send-email-bianpan2016@163.c... Signed-off-by: Pan Bian bianpan2016@163.com Reviewed-by: Andrew Morton akpm@linux-foundation.org Cc: Ernesto A. Fernandez ernesto.mnd.fernandez@gmail.com Cc: Joe Perches joe@perches.com Cc: Viacheslav Dubeyko slava@dubeyko.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/hfsplus/btree.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/fs/hfsplus/btree.c b/fs/hfsplus/btree.c index de14b2b6881b..3de3bc4918b5 100644 --- a/fs/hfsplus/btree.c +++ b/fs/hfsplus/btree.c @@ -454,14 +454,15 @@ void hfs_bmap_free(struct hfs_bnode *node)
nidx -= len * 8; i = node->next; - hfs_bnode_put(node); if (!i) { /* panic */; pr_crit("unable to free bnode %u. " "bmap not found!\n", node->this); + hfs_bnode_put(node); return; } + hfs_bnode_put(node); node = hfs_bnode_find(tree, i); if (IS_ERR(node)) return;
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 8de456cf87ba863e028c4dd01bae44255ce3d835 ]
CONFIG_DEBUG_OBJECTS_RCU_HEAD does not play well with kmemleak due to recursive calls.
fill_pool kmemleak_ignore make_black_object put_object __call_rcu (kernel/rcu/tree.c) debug_rcu_head_queue debug_object_activate debug_object_init fill_pool kmemleak_ignore make_black_object ...
So add SLAB_NOLEAKTRACE to kmem_cache_create() to not register newly allocated debug objects at all.
Link: http://lkml.kernel.org/r/20181126165343.2339-1-cai@gmx.us Signed-off-by: Qian Cai cai@gmx.us Suggested-by: Catalin Marinas catalin.marinas@arm.com Acked-by: Waiman Long longman@redhat.com Acked-by: Catalin Marinas catalin.marinas@arm.com Cc: Thomas Gleixner tglx@linutronix.de Cc: Yang Shi yang.shi@linux.alibaba.com Cc: Arnd Bergmann arnd@arndb.de Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- lib/debugobjects.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/lib/debugobjects.c b/lib/debugobjects.c index 99308479b1c8..bacb00a9cd9f 100644 --- a/lib/debugobjects.c +++ b/lib/debugobjects.c @@ -111,7 +111,6 @@ static void fill_pool(void) if (!new) return;
- kmemleak_ignore(new); raw_spin_lock_irqsave(&pool_lock, flags); hlist_add_head(&new->node, &obj_pool); debug_objects_allocated++; @@ -1085,7 +1084,6 @@ static int __init debug_objects_replace_static_objects(void) obj = kmem_cache_zalloc(obj_cache, GFP_KERNEL); if (!obj) goto free; - kmemleak_ignore(obj); hlist_add_head(&obj->node, &objects); }
@@ -1141,7 +1139,8 @@ void __init debug_objects_mem_init(void)
obj_cache = kmem_cache_create("debug_objects_cache", sizeof (struct debug_obj), 0, - SLAB_DEBUG_OBJECTS, NULL); + SLAB_DEBUG_OBJECTS | SLAB_NOLEAKTRACE, + NULL);
if (!obj_cache || debug_objects_replace_static_objects()) { debug_objects_enabled = 0;
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 164f7e586739d07eb56af6f6d66acebb11f315c8 ]
ocfs2_get_dentry() calls iput(inode) to drop the reference count of inode, and if the reference count hits 0, inode is freed. However, in this function, it then reads inode->i_generation, which may result in a use after free bug. Move the put operation later.
Link: http://lkml.kernel.org/r/1543109237-110227-1-git-send-email-bianpan2016@163.... Fixes: 781f200cb7a("ocfs2: Remove masklog ML_EXPORT.") Signed-off-by: Pan Bian bianpan2016@163.com Reviewed-by: Andrew Morton akpm@linux-foundation.org Cc: Mark Fasheh mark@fasheh.com Cc: Joel Becker jlbec@evilplan.org Cc: Junxiao Bi junxiao.bi@oracle.com Cc: Joseph Qi jiangqi903@gmail.com Cc: Changwei Ge ge.changwei@h3c.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/ocfs2/export.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/ocfs2/export.c b/fs/ocfs2/export.c index 9f88188060db..4bf8d5854b27 100644 --- a/fs/ocfs2/export.c +++ b/fs/ocfs2/export.c @@ -125,10 +125,10 @@ static struct dentry *ocfs2_get_dentry(struct super_block *sb,
check_gen: if (handle->ih_generation != inode->i_generation) { - iput(inode); trace_ocfs2_get_dentry_generation((unsigned long long)blkno, handle->ih_generation, inode->i_generation); + iput(inode); result = ERR_PTR(-ESTALE); goto bail; }
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
This reverts commit c9b8d580b3fb0ab65d37c372aef19a318fda3199.
This is just a technical revert to make the printk fix apply cleanly, this patch will be re-picked in about 3 commits. --- kernel/printk/printk.c | 22 ++++++++++++++-------- 1 file changed, 14 insertions(+), 8 deletions(-)
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c index a9cf2e15f6a3..7161312593dd 100644 --- a/kernel/printk/printk.c +++ b/kernel/printk/printk.c @@ -1762,12 +1762,6 @@ asmlinkage int vprintk_emit(int facility, int level,
/* If called from the scheduler, we can not call up(). */ if (!in_sched) { - /* - * Disable preemption to avoid being preempted while holding - * console_sem which would prevent anyone from printing to - * console - */ - preempt_disable(); /* * Try to acquire and then immediately release the console * semaphore. The release will print out buffers and wake up @@ -1775,7 +1769,6 @@ asmlinkage int vprintk_emit(int facility, int level, */ if (console_trylock()) console_unlock(); - preempt_enable(); }
return printed_len; @@ -2090,7 +2083,20 @@ int console_trylock(void) return 0; } console_locked = 1; - console_may_schedule = 0; + /* + * When PREEMPT_COUNT disabled we can't reliably detect if it's + * safe to schedule (e.g. calling printk while holding a spin_lock), + * because preempt_disable()/preempt_enable() are just barriers there + * and preempt_count() is always 0. + * + * RCU read sections have a separate preemption counter when + * PREEMPT_RCU enabled thus we must take extra care and check + * rcu_preempt_depth(), otherwise RCU read sections modify + * preempt_count(). + */ + console_may_schedule = !oops_in_progress && + preemptible() && + !rcu_preempt_depth(); return 1; } EXPORT_SYMBOL(console_trylock);
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit dbdda842fe96f8932bae554f0adf463c27c42bc7 ]
This patch implements what I discussed in Kernel Summit. I added lockdep annotation (hopefully correctly), and it hasn't had any splats (since I fixed some bugs in the first iterations). It did catch problems when I had the owner covering too much. But now that the owner is only set when actively calling the consoles, lockdep has stayed quiet.
Here's the design again:
I added a "console_owner" which is set to a task that is actively writing to the consoles. It is *not* the same as the owner of the console_lock. It is only set when doing the calls to the console functions. It is protected by a console_owner_lock which is a raw spin lock.
There is a console_waiter. This is set when there is an active console owner that is not current, and waiter is not set. This too is protected by console_owner_lock.
In printk() when it tries to write to the consoles, we have:
if (console_trylock()) console_unlock();
Now I added an else, which will check if there is an active owner, and no current waiter. If that is the case, then console_waiter is set, and the task goes into a spin until it is no longer set.
When the active console owner finishes writing the current message to the consoles, it grabs the console_owner_lock and sees if there is a waiter, and clears console_owner.
If there is a waiter, then it breaks out of the loop, clears the waiter flag (because that will release the waiter from its spin), and exits. Note, it does *not* release the console semaphore. Because it is a semaphore, there is no owner. Another task may release it. This means that the waiter is guaranteed to be the new console owner! Which it becomes.
Then the waiter calls console_unlock() and continues to write to the consoles.
If another task comes along and does a printk() it too can become the new waiter, and we wash rinse and repeat!
By Petr Mladek about possible new deadlocks:
The thing is that we move console_sem only to printk() call that normally calls console_unlock() as well. It means that the transferred owner should not bring new type of dependencies. As Steven said somewhere: "If there is a deadlock, it was there even before."
We could look at it from this side. The possible deadlock would look like:
CPU0 CPU1
console_unlock()
console_owner = current;
spin_lockA() printk() spin = true; while (...)
call_console_drivers() spin_lockA()
This would be a deadlock. CPU0 would wait for the lock A. While CPU1 would own the lockA and would wait for CPU0 to finish calling the console drivers and pass the console_sem owner.
But if the above is true than the following scenario was already possible before:
CPU0
spin_lockA() printk() console_unlock() call_console_drivers() spin_lockA()
By other words, this deadlock was there even before. Such deadlocks are prevented by using printk_deferred() in the sections guarded by the lock A.
By Steven Rostedt:
To demonstrate the issue, this module has been shown to lock up a system with 4 CPUs and a slow console (like a serial console). It is also able to lock up a 8 CPU system with only a fast (VGA) console, by passing in "loops=100". The changes in this commit prevent this module from locking up the system.
#include <linux/module.h> #include <linux/delay.h> #include <linux/sched.h> #include <linux/mutex.h> #include <linux/workqueue.h> #include <linux/hrtimer.h>
static bool stop_testing; static unsigned int loops = 1;
static void preempt_printk_workfn(struct work_struct *work) { int i;
while (!READ_ONCE(stop_testing)) { for (i = 0; i < loops && !READ_ONCE(stop_testing); i++) { preempt_disable(); pr_emerg("%5d%-75s\n", smp_processor_id(), " XXX NOPREEMPT"); preempt_enable(); } msleep(1); } }
static struct work_struct __percpu *works;
static void finish(void) { int cpu;
WRITE_ONCE(stop_testing, true); for_each_online_cpu(cpu) flush_work(per_cpu_ptr(works, cpu)); free_percpu(works); }
static int __init test_init(void) { int cpu;
works = alloc_percpu(struct work_struct); if (!works) return -ENOMEM;
/* * This is just a test module. This will break if you * do any CPU hot plugging between loading and * unloading the module. */
for_each_online_cpu(cpu) { struct work_struct *work = per_cpu_ptr(works, cpu);
INIT_WORK(work, &preempt_printk_workfn); schedule_work_on(cpu, work); }
return 0; }
static void __exit test_exit(void) { finish(); }
module_param(loops, uint, 0); module_init(test_init); module_exit(test_exit); MODULE_LICENSE("GPL");
Link: http://lkml.kernel.org/r/20180110132418.7080-2-pmladek@suse.com Cc: akpm@linux-foundation.org Cc: linux-mm@kvack.org Cc: Cong Wang xiyou.wangcong@gmail.com Cc: Dave Hansen dave.hansen@intel.com Cc: Johannes Weiner hannes@cmpxchg.org Cc: Mel Gorman mgorman@suse.de Cc: Michal Hocko mhocko@kernel.org Cc: Vlastimil Babka vbabka@suse.cz Cc: Peter Zijlstra peterz@infradead.org Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Jan Kara jack@suse.cz Cc: Mathieu Desnoyers mathieu.desnoyers@efficios.com Cc: Tetsuo Handa penguin-kernel@I-love.SAKURA.ne.jp Cc: Byungchul Park byungchul.park@lge.com Cc: Tejun Heo tj@kernel.org Cc: Pavel Machek pavel@ucw.cz Cc: linux-kernel@vger.kernel.org Signed-off-by: Steven Rostedt (VMware) rostedt@goodmis.org [pmladek@suse.com: Commit message about possible deadlocks] Acked-by: Sergey Senozhatsky sergey.senozhatsky@gmail.com Signed-off-by: Petr Mladek pmladek@suse.com Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/printk/printk.c | 108 ++++++++++++++++++++++++++++++++++++++++- 1 file changed, 107 insertions(+), 1 deletion(-)
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c index 7161312593dd..b88b402444d6 100644 --- a/kernel/printk/printk.c +++ b/kernel/printk/printk.c @@ -86,8 +86,15 @@ EXPORT_SYMBOL_GPL(console_drivers); static struct lockdep_map console_lock_dep_map = { .name = "console_lock" }; +static struct lockdep_map console_owner_dep_map = { + .name = "console_owner" +}; #endif
+static DEFINE_RAW_SPINLOCK(console_owner_lock); +static struct task_struct *console_owner; +static bool console_waiter; + enum devkmsg_log_bits { __DEVKMSG_LOG_BIT_ON = 0, __DEVKMSG_LOG_BIT_OFF, @@ -1767,8 +1774,56 @@ asmlinkage int vprintk_emit(int facility, int level, * semaphore. The release will print out buffers and wake up * /dev/kmsg and syslog() users. */ - if (console_trylock()) + if (console_trylock()) { console_unlock(); + } else { + struct task_struct *owner = NULL; + bool waiter; + bool spin = false; + + printk_safe_enter_irqsave(flags); + + raw_spin_lock(&console_owner_lock); + owner = READ_ONCE(console_owner); + waiter = READ_ONCE(console_waiter); + if (!waiter && owner && owner != current) { + WRITE_ONCE(console_waiter, true); + spin = true; + } + raw_spin_unlock(&console_owner_lock); + + /* + * If there is an active printk() writing to the + * consoles, instead of having it write our data too, + * see if we can offload that load from the active + * printer, and do some printing ourselves. + * Go into a spin only if there isn't already a waiter + * spinning, and there is an active printer, and + * that active printer isn't us (recursive printk?). + */ + if (spin) { + /* We spin waiting for the owner to release us */ + spin_acquire(&console_owner_dep_map, 0, 0, _THIS_IP_); + /* Owner will clear console_waiter on hand off */ + while (READ_ONCE(console_waiter)) + cpu_relax(); + + spin_release(&console_owner_dep_map, 1, _THIS_IP_); + printk_safe_exit_irqrestore(flags); + + /* + * The owner passed the console lock to us. + * Since we did not spin on console lock, annotate + * this as a trylock. Otherwise lockdep will + * complain. + */ + mutex_acquire(&console_lock_dep_map, 0, 1, _THIS_IP_); + console_unlock(); + printk_safe_enter_irqsave(flags); + } + printk_safe_exit_irqrestore(flags); + + } }
return printed_len; @@ -2155,6 +2210,7 @@ void console_unlock(void) static u64 seen_seq; unsigned long flags; bool wake_klogd = false; + bool waiter = false; bool do_cond_resched, retry;
if (console_suspended) { @@ -2243,14 +2299,64 @@ void console_unlock(void) console_seq++; raw_spin_unlock(&logbuf_lock);
+ /* + * While actively printing out messages, if another printk() + * were to occur on another CPU, it may wait for this one to + * finish. This task can not be preempted if there is a + * waiter waiting to take over. + */ + raw_spin_lock(&console_owner_lock); + console_owner = current; + raw_spin_unlock(&console_owner_lock); + + /* The waiter may spin on us after setting console_owner */ + spin_acquire(&console_owner_dep_map, 0, 0, _THIS_IP_); + stop_critical_timings(); /* don't trace print latency */ call_console_drivers(ext_text, ext_len, text, len); start_critical_timings(); + + raw_spin_lock(&console_owner_lock); + waiter = READ_ONCE(console_waiter); + console_owner = NULL; + raw_spin_unlock(&console_owner_lock); + + /* + * If there is a waiter waiting for us, then pass the + * rest of the work load over to that waiter. + */ + if (waiter) + break; + + /* There was no waiter, and nothing will spin on us here */ + spin_release(&console_owner_dep_map, 1, _THIS_IP_); + printk_safe_exit_irqrestore(flags);
if (do_cond_resched) cond_resched(); } + + /* + * If there is an active waiter waiting on the console_lock. + * Pass off the printing to the waiter, and the waiter + * will continue printing on its CPU, and when all writing + * has finished, the last printer will wake up klogd. + */ + if (waiter) { + WRITE_ONCE(console_waiter, false); + /* The waiter is now free to continue */ + spin_release(&console_owner_dep_map, 1, _THIS_IP_); + /* + * Hand off console_lock to waiter. The waiter will perform + * the up(). After this, the waiter is the console_lock owner. + */ + mutex_release(&console_lock_dep_map, 1, _THIS_IP_); + printk_safe_exit_irqrestore(flags); + /* Note, if waiter is set, logbuf_lock is not held */ + return; + } + console_locked = 0;
/* Release the exclusive_console once it is used */
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit c162d5b4338d72deed61aa65ed0f2f4ba2bbc8ab ]
The commit ("printk: Add console owner and waiter logic to load balance console writes") made vprintk_emit() and console_unlock() even more complicated.
This patch extracts the new code into 3 helper functions. They should help to keep it rather self-contained. It will be easier to use and maintain.
This patch just shuffles the existing code. It does not change the functionality.
Link: http://lkml.kernel.org/r/20180112160837.GD24497@linux.suse Cc: akpm@linux-foundation.org Cc: linux-mm@kvack.org Cc: Cong Wang xiyou.wangcong@gmail.com Cc: Dave Hansen dave.hansen@intel.com Cc: Johannes Weiner hannes@cmpxchg.org Cc: Mel Gorman mgorman@suse.de Cc: Michal Hocko mhocko@kernel.org Cc: Vlastimil Babka vbabka@suse.cz Cc: Peter Zijlstra peterz@infradead.org Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Jan Kara jack@suse.cz Cc: Mathieu Desnoyers mathieu.desnoyers@efficios.com Cc: Tetsuo Handa penguin-kernel@I-love.SAKURA.ne.jp Cc: rostedt@home.goodmis.org Cc: Byungchul Park byungchul.park@lge.com Cc: Tejun Heo tj@kernel.org Cc: Pavel Machek pavel@ucw.cz Cc: linux-kernel@vger.kernel.org Reviewed-by: Steven Rostedt (VMware) rostedt@goodmis.org Acked-by: Sergey Senozhatsky sergey.senozhatsky@gmail.com Signed-off-by: Petr Mladek pmladek@suse.com Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/printk/printk.c | 245 +++++++++++++++++++++++++---------------- 1 file changed, 148 insertions(+), 97 deletions(-)
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c index b88b402444d6..2d1c2700bd85 100644 --- a/kernel/printk/printk.c +++ b/kernel/printk/printk.c @@ -86,15 +86,8 @@ EXPORT_SYMBOL_GPL(console_drivers); static struct lockdep_map console_lock_dep_map = { .name = "console_lock" }; -static struct lockdep_map console_owner_dep_map = { - .name = "console_owner" -}; #endif
-static DEFINE_RAW_SPINLOCK(console_owner_lock); -static struct task_struct *console_owner; -static bool console_waiter; - enum devkmsg_log_bits { __DEVKMSG_LOG_BIT_ON = 0, __DEVKMSG_LOG_BIT_OFF, @@ -1555,6 +1548,146 @@ SYSCALL_DEFINE3(syslog, int, type, char __user *, buf, int, len) return do_syslog(type, buf, len, SYSLOG_FROM_READER); }
+/* + * Special console_lock variants that help to reduce the risk of soft-lockups. + * They allow to pass console_lock to another printk() call using a busy wait. + */ + +#ifdef CONFIG_LOCKDEP +static struct lockdep_map console_owner_dep_map = { + .name = "console_owner" +}; +#endif + +static DEFINE_RAW_SPINLOCK(console_owner_lock); +static struct task_struct *console_owner; +static bool console_waiter; + +/** + * console_lock_spinning_enable - mark beginning of code where another + * thread might safely busy wait + * + * This basically converts console_lock into a spinlock. This marks + * the section where the console_lock owner can not sleep, because + * there may be a waiter spinning (like a spinlock). Also it must be + * ready to hand over the lock at the end of the section. + */ +static void console_lock_spinning_enable(void) +{ + raw_spin_lock(&console_owner_lock); + console_owner = current; + raw_spin_unlock(&console_owner_lock); + + /* The waiter may spin on us after setting console_owner */ + spin_acquire(&console_owner_dep_map, 0, 0, _THIS_IP_); +} + +/** + * console_lock_spinning_disable_and_check - mark end of code where another + * thread was able to busy wait and check if there is a waiter + * + * This is called at the end of the section where spinning is allowed. + * It has two functions. First, it is a signal that it is no longer + * safe to start busy waiting for the lock. Second, it checks if + * there is a busy waiter and passes the lock rights to her. + * + * Important: Callers lose the lock if there was a busy waiter. + * They must not touch items synchronized by console_lock + * in this case. + * + * Return: 1 if the lock rights were passed, 0 otherwise. + */ +static int console_lock_spinning_disable_and_check(void) +{ + int waiter; + + raw_spin_lock(&console_owner_lock); + waiter = READ_ONCE(console_waiter); + console_owner = NULL; + raw_spin_unlock(&console_owner_lock); + + if (!waiter) { + spin_release(&console_owner_dep_map, 1, _THIS_IP_); + return 0; + } + + /* The waiter is now free to continue */ + WRITE_ONCE(console_waiter, false); + + spin_release(&console_owner_dep_map, 1, _THIS_IP_); + + /* + * Hand off console_lock to waiter. The waiter will perform + * the up(). After this, the waiter is the console_lock owner. + */ + mutex_release(&console_lock_dep_map, 1, _THIS_IP_); + return 1; +} + +/** + * console_trylock_spinning - try to get console_lock by busy waiting + * + * This allows to busy wait for the console_lock when the current + * owner is running in specially marked sections. It means that + * the current owner is running and cannot reschedule until it + * is ready to lose the lock. + * + * Return: 1 if we got the lock, 0 othrewise + */ +static int console_trylock_spinning(void) +{ + struct task_struct *owner = NULL; + bool waiter; + bool spin = false; + unsigned long flags; + + if (console_trylock()) + return 1; + + printk_safe_enter_irqsave(flags); + + raw_spin_lock(&console_owner_lock); + owner = READ_ONCE(console_owner); + waiter = READ_ONCE(console_waiter); + if (!waiter && owner && owner != current) { + WRITE_ONCE(console_waiter, true); + spin = true; + } + raw_spin_unlock(&console_owner_lock); + + /* + * If there is an active printk() writing to the + * consoles, instead of having it write our data too, + * see if we can offload that load from the active + * printer, and do some printing ourselves. + * Go into a spin only if there isn't already a waiter + * spinning, and there is an active printer, and + * that active printer isn't us (recursive printk?). + */ + if (!spin) { + printk_safe_exit_irqrestore(flags); + return 0; + } + + /* We spin waiting for the owner to release us */ + spin_acquire(&console_owner_dep_map, 0, 0, _THIS_IP_); + /* Owner will clear console_waiter on hand off */ + while (READ_ONCE(console_waiter)) + cpu_relax(); + spin_release(&console_owner_dep_map, 1, _THIS_IP_); + + printk_safe_exit_irqrestore(flags); + /* + * The owner passed the console lock to us. + * Since we did not spin on console lock, annotate + * this as a trylock. Otherwise lockdep will + * complain. + */ + mutex_acquire(&console_lock_dep_map, 0, 1, _THIS_IP_); + + return 1; +} + /* * Call the console drivers, asking them to write out * log_buf[start] to log_buf[end - 1]. @@ -1774,56 +1907,8 @@ asmlinkage int vprintk_emit(int facility, int level, * semaphore. The release will print out buffers and wake up * /dev/kmsg and syslog() users. */ - if (console_trylock()) { + if (console_trylock_spinning()) console_unlock(); - } else { - struct task_struct *owner = NULL; - bool waiter; - bool spin = false; - - printk_safe_enter_irqsave(flags); - - raw_spin_lock(&console_owner_lock); - owner = READ_ONCE(console_owner); - waiter = READ_ONCE(console_waiter); - if (!waiter && owner && owner != current) { - WRITE_ONCE(console_waiter, true); - spin = true; - } - raw_spin_unlock(&console_owner_lock); - - /* - * If there is an active printk() writing to the - * consoles, instead of having it write our data too, - * see if we can offload that load from the active - * printer, and do some printing ourselves. - * Go into a spin only if there isn't already a waiter - * spinning, and there is an active printer, and - * that active printer isn't us (recursive printk?). - */ - if (spin) { - /* We spin waiting for the owner to release us */ - spin_acquire(&console_owner_dep_map, 0, 0, _THIS_IP_); - /* Owner will clear console_waiter on hand off */ - while (READ_ONCE(console_waiter)) - cpu_relax(); - - spin_release(&console_owner_dep_map, 1, _THIS_IP_); - printk_safe_exit_irqrestore(flags); - - /* - * The owner passed the console lock to us. - * Since we did not spin on console lock, annotate - * this as a trylock. Otherwise lockdep will - * complain. - */ - mutex_acquire(&console_lock_dep_map, 0, 1, _THIS_IP_); - console_unlock(); - printk_safe_enter_irqsave(flags); - } - printk_safe_exit_irqrestore(flags); - - } }
return printed_len; @@ -1924,6 +2009,8 @@ static ssize_t msg_print_ext_header(char *buf, size_t size, static ssize_t msg_print_ext_body(char *buf, size_t size, char *dict, size_t dict_len, char *text, size_t text_len) { return 0; } +static void console_lock_spinning_enable(void) { } +static int console_lock_spinning_disable_and_check(void) { return 0; } static void call_console_drivers(const char *ext_text, size_t ext_len, const char *text, size_t len) {} static size_t msg_print_text(const struct printk_log *msg, @@ -2210,7 +2297,6 @@ void console_unlock(void) static u64 seen_seq; unsigned long flags; bool wake_klogd = false; - bool waiter = false; bool do_cond_resched, retry;
if (console_suspended) { @@ -2305,31 +2391,16 @@ void console_unlock(void) * finish. This task can not be preempted if there is a * waiter waiting to take over. */ - raw_spin_lock(&console_owner_lock); - console_owner = current; - raw_spin_unlock(&console_owner_lock); - - /* The waiter may spin on us after setting console_owner */ - spin_acquire(&console_owner_dep_map, 0, 0, _THIS_IP_); + console_lock_spinning_enable();
stop_critical_timings(); /* don't trace print latency */ call_console_drivers(ext_text, ext_len, text, len); start_critical_timings();
- raw_spin_lock(&console_owner_lock); - waiter = READ_ONCE(console_waiter); - console_owner = NULL; - raw_spin_unlock(&console_owner_lock); - - /* - * If there is a waiter waiting for us, then pass the - * rest of the work load over to that waiter. - */ - if (waiter) - break; - - /* There was no waiter, and nothing will spin on us here */ - spin_release(&console_owner_dep_map, 1, _THIS_IP_); + if (console_lock_spinning_disable_and_check()) { + printk_safe_exit_irqrestore(flags); + return; + }
printk_safe_exit_irqrestore(flags);
@@ -2337,26 +2408,6 @@ void console_unlock(void) cond_resched(); }
- /* - * If there is an active waiter waiting on the console_lock. - * Pass off the printing to the waiter, and the waiter - * will continue printing on its CPU, and when all writing - * has finished, the last printer will wake up klogd. - */ - if (waiter) { - WRITE_ONCE(console_waiter, false); - /* The waiter is now free to continue */ - spin_release(&console_owner_dep_map, 1, _THIS_IP_); - /* - * Hand off console_lock to waiter. The waiter will perform - * the up(). After this, the waiter is the console_lock owner. - */ - mutex_release(&console_lock_dep_map, 1, _THIS_IP_); - printk_safe_exit_irqrestore(flags); - /* Note, if waiter is set, logbuf_lock is not held */ - return; - } - console_locked = 0;
/* Release the exclusive_console once it is used */
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit fd5f7cde1b85d4c8e09ca46ce948e008a2377f64 ]
This patch, basically, reverts commit 6b97a20d3a79 ("printk: set may_schedule for some of console_trylock() callers"). That commit was a mistake, it introduced a big dependency on the scheduler, by enabling preemption under console_sem in printk()->console_unlock() path, which is rather too critical. The patch did not significantly reduce the possibilities of printk() lockups, but made it possible to stall printk(), as has been reported by Tetsuo Handa [1].
Another issues is that preemption under console_sem also messes up with Steven Rostedt's hand off scheme, by making it possible to sleep with console_sem both in console_unlock() and in vprintk_emit(), after acquiring the console_sem ownership (anywhere between printk_safe_exit_irqrestore() in console_trylock_spinning() and printk_safe_enter_irqsave() in console_unlock()). This makes hand off less likely and, at the same time, may result in a significant amount of pending logbuf messages. Preempted console_sem owner makes it impossible for other CPUs to emit logbuf messages, but does not make it impossible for other CPUs to append new messages to the logbuf.
Reinstate the old behavior and make printk() non-preemptible. Should any printk() lockup reports arrive they must be handled in a different way.
[1] http://lkml.kernel.org/r/201603022101.CAH73907.OVOOMFHFFtQJSL%20()%20I-love%... Fixes: 6b97a20d3a79 ("printk: set may_schedule for some of console_trylock() callers") Link: http://lkml.kernel.org/r/20180116044716.GE6607@jagdpanzerIV To: Tetsuo Handa penguin-kernel@I-love.SAKURA.ne.jp Cc: Sergey Senozhatsky sergey.senozhatsky@gmail.com Cc: Tejun Heo tj@kernel.org Cc: akpm@linux-foundation.org Cc: linux-mm@kvack.org Cc: Cong Wang xiyou.wangcong@gmail.com Cc: Dave Hansen dave.hansen@intel.com Cc: Johannes Weiner hannes@cmpxchg.org Cc: Mel Gorman mgorman@suse.de Cc: Michal Hocko mhocko@kernel.org Cc: Vlastimil Babka vbabka@suse.cz Cc: Peter Zijlstra peterz@infradead.org Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Jan Kara jack@suse.cz Cc: Mathieu Desnoyers mathieu.desnoyers@efficios.com Cc: Byungchul Park byungchul.park@lge.com Cc: Pavel Machek pavel@ucw.cz Cc: linux-kernel@vger.kernel.org Signed-off-by: Sergey Senozhatsky sergey.senozhatsky@gmail.com Reported-by: Tetsuo Handa penguin-kernel@I-love.SAKURA.ne.jp Reviewed-by: Steven Rostedt (VMware) rostedt@goodmis.org Signed-off-by: Petr Mladek pmladek@suse.com Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/printk/printk.c | 22 ++++++++-------------- 1 file changed, 8 insertions(+), 14 deletions(-)
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c index 2d1c2700bd85..2f654a79f80b 100644 --- a/kernel/printk/printk.c +++ b/kernel/printk/printk.c @@ -1902,6 +1902,12 @@ asmlinkage int vprintk_emit(int facility, int level,
/* If called from the scheduler, we can not call up(). */ if (!in_sched) { + /* + * Disable preemption to avoid being preempted while holding + * console_sem which would prevent anyone from printing to + * console + */ + preempt_disable(); /* * Try to acquire and then immediately release the console * semaphore. The release will print out buffers and wake up @@ -1909,6 +1915,7 @@ asmlinkage int vprintk_emit(int facility, int level, */ if (console_trylock_spinning()) console_unlock(); + preempt_enable(); }
return printed_len; @@ -2225,20 +2232,7 @@ int console_trylock(void) return 0; } console_locked = 1; - /* - * When PREEMPT_COUNT disabled we can't reliably detect if it's - * safe to schedule (e.g. calling printk while holding a spin_lock), - * because preempt_disable()/preempt_enable() are just barriers there - * and preempt_count() is always 0. - * - * RCU read sections have a separate preemption counter when - * PREEMPT_RCU enabled thus we must take extra care and check - * rcu_preempt_depth(), otherwise RCU read sections modify - * preempt_count(). - */ - console_may_schedule = !oops_in_progress && - preemptible() && - !rcu_preempt_depth(); + console_may_schedule = 0; return 1; } EXPORT_SYMBOL(console_trylock);
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit c14376de3a1befa70d9811ca2872d47367b48767 ]
wake_klogd is a local variable in console_unlock(). The information is lost when the console_lock owner using the busy wait added by the commit dbdda842fe96f8932 ("printk: Add console owner and waiter logic to load balance console writes"). The following race is possible:
CPU0 CPU1 console_unlock()
for (;;) /* calling console for last message */
printk() log_store() log_next_seq++;
/* see new message */ if (seen_seq != log_next_seq) { wake_klogd = true; seen_seq = log_next_seq; }
console_lock_spinning_enable();
if (console_trylock_spinning()) /* spinning */
if (console_lock_spinning_disable_and_check()) { printk_safe_exit_irqrestore(flags); return;
console_unlock() if (seen_seq != log_next_seq) { /* already seen */ /* nothing to do */
Result: Nobody would wakeup klogd.
One solution would be to make a global variable from wake_klogd. But then we would need to manipulate it under a lock or so.
This patch wakes klogd also when console_lock is passed to the spinning waiter. It looks like the right way to go. Also userspace should have a chance to see and store any "flood" of messages.
Note that the very late klogd wake up was a historic solution. It made sense on single CPU systems or when sys_syslog() operations were synchronized using the big kernel lock like in v2.1.113. But it is questionable these days.
Fixes: dbdda842fe96f8932 ("printk: Add console owner and waiter logic to load balance console writes") Link: http://lkml.kernel.org/r/20180226155734.dzwg3aovqnwtvkoy@pathway.suse.cz Cc: Steven Rostedt rostedt@goodmis.org Cc: linux-kernel@vger.kernel.org Cc: Tejun Heo tj@kernel.org Suggested-by: Sergey Senozhatsky sergey.senozhatsky@gmail.com Reviewed-by: Sergey Senozhatsky sergey.senozhatsky@gmail.com Signed-off-by: Petr Mladek pmladek@suse.com Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/printk/printk.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c index 2f654a79f80b..2e2c86dd226f 100644 --- a/kernel/printk/printk.c +++ b/kernel/printk/printk.c @@ -2393,7 +2393,7 @@ void console_unlock(void)
if (console_lock_spinning_disable_and_check()) { printk_safe_exit_irqrestore(flags); - return; + goto out; }
printk_safe_exit_irqrestore(flags); @@ -2426,6 +2426,7 @@ void console_unlock(void) if (retry && console_trylock()) goto again;
+out: if (wake_klogd) wake_up_klogd(); }
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Davidlohr Bueso dave@stgolabs.net
commit 0b548e33e6cb2bff240fdaf1783783be15c29080 upstream.
Fengguang reported soft lockups while running the rbtree and interval tree test modules. The logic for these tests all occur in init phase, and we currently are pounding with the default values for number of nodes and number of iterations of each test. Reduce the latter by two orders of magnitude. This does not influence the value of the tests in that one thousand times by default is enough to get the picture.
Link: http://lkml.kernel.org/r/20171109161715.xai2dtwqw2frhkcm@linux-n805 Signed-off-by: Davidlohr Bueso dbueso@suse.de Reported-by: Fengguang Wu fengguang.wu@intel.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Cc: Guenter Roeck linux@roeck-us.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- lib/interval_tree_test.c | 4 ++-- lib/rbtree_test.c | 2 +- 2 files changed, 3 insertions(+), 3 deletions(-)
--- a/lib/interval_tree_test.c +++ b/lib/interval_tree_test.c @@ -11,10 +11,10 @@ MODULE_PARM_DESC(name, msg);
__param(int, nnodes, 100, "Number of nodes in the interval tree"); -__param(int, perf_loops, 100000, "Number of iterations modifying the tree"); +__param(int, perf_loops, 1000, "Number of iterations modifying the tree");
__param(int, nsearches, 100, "Number of searches to the interval tree"); -__param(int, search_loops, 10000, "Number of iterations searching the tree"); +__param(int, search_loops, 1000, "Number of iterations searching the tree"); __param(bool, search_all, false, "Searches will iterate all nodes in the tree");
__param(uint, max_endpoint, ~0, "Largest value for the interval's endpoint"); --- a/lib/rbtree_test.c +++ b/lib/rbtree_test.c @@ -11,7 +11,7 @@ MODULE_PARM_DESC(name, msg);
__param(int, nnodes, 100, "Number of nodes in the rb-tree"); -__param(int, perf_loops, 100000, "Number of iterations modifying the rb-tree"); +__param(int, perf_loops, 1000, "Number of iterations modifying the rb-tree"); __param(int, check_loops, 100, "Number of iterations modifying and verifying the rb-tree");
struct test_node {
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tigran Mkrtchyan tigran.mkrtchyan@desy.de
commit 320f35b7bf8cccf1997ca3126843535e1b95e9c4 upstream.
Since commit bb21ce0ad227 we always enforce per-mirror stateid. However, this makes sense only for v4+ servers.
Signed-off-by: Tigran Mkrtchyan tigran.mkrtchyan@desy.de Signed-off-by: Trond Myklebust trond.myklebust@hammerspace.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/nfs/flexfilelayout/flexfilelayout.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
--- a/fs/nfs/flexfilelayout/flexfilelayout.c +++ b/fs/nfs/flexfilelayout/flexfilelayout.c @@ -1725,7 +1725,8 @@ ff_layout_read_pagelist(struct nfs_pgio_ if (fh) hdr->args.fh = fh;
- if (!nfs4_ff_layout_select_ds_stateid(lseg, idx, &hdr->args.stateid)) + if (vers == 4 && + !nfs4_ff_layout_select_ds_stateid(lseg, idx, &hdr->args.stateid)) goto out_failed;
/* @@ -1790,7 +1791,8 @@ ff_layout_write_pagelist(struct nfs_pgio if (fh) hdr->args.fh = fh;
- if (!nfs4_ff_layout_select_ds_stateid(lseg, idx, &hdr->args.stateid)) + if (vers == 4 && + !nfs4_ff_layout_select_ds_stateid(lseg, idx, &hdr->args.stateid)) goto out_failed;
/*
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Guenter Roeck linux@roeck-us.net
commit fd29edc7232bc19f969e8f463138afc5472b3d5f upstream.
gcc 8.1.0 generates the following warnings.
drivers/staging/speakup/kobjects.c: In function 'punc_store': drivers/staging/speakup/kobjects.c:522:2: warning: 'strncpy' output truncated before terminating nul copying as many bytes from a string as its length drivers/staging/speakup/kobjects.c:504:6: note: length computed here
drivers/staging/speakup/kobjects.c: In function 'synth_store': drivers/staging/speakup/kobjects.c:391:2: warning: 'strncpy' output truncated before terminating nul copying as many bytes from a string as its length drivers/staging/speakup/kobjects.c:388:8: note: length computed here
Using strncpy() is indeed less than perfect since the length of data to be copied has already been determined with strlen(). Replace strncpy() with memcpy() to address the warning and optimize the code a little.
Signed-off-by: Guenter Roeck linux@roeck-us.net Reviewed-by: Samuel Thibault samuel.thibault@ens-lyon.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/staging/speakup/kobjects.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/staging/speakup/kobjects.c +++ b/drivers/staging/speakup/kobjects.c @@ -387,7 +387,7 @@ static ssize_t synth_store(struct kobjec len = strlen(buf); if (len < 2 || len > 9) return -EINVAL; - strncpy(new_synth_name, buf, len); + memcpy(new_synth_name, buf, len); if (new_synth_name[len - 1] == '\n') len--; new_synth_name[len] = '\0'; @@ -518,7 +518,7 @@ static ssize_t punc_store(struct kobject return -EINVAL; }
- strncpy(punc_buf, buf, x); + memcpy(punc_buf, buf, x);
while (x && punc_buf[x - 1] == '\n') x--;
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Takashi Sakamoto o-takashi@sakamocchi.jp
commit fa9c98e4b975bb3192ed6af09d9fa282ed3cd8a0 upstream.
In an initial commit, 'SYNC_STATUS' register is referred to get clock configuration, however this is wrong, according to my local note at hand for reverse-engineering about packet dump. It should be 'CLOCK_CONFIG' register. Actually, ff400_dump_clock_config() is correctly programmed.
This commit fixes the bug.
Cc: stable@vger.kernel.org # v4.12+ Fixes: 76fdb3a9e13a ('ALSA: fireface: add support for Fireface 400') Signed-off-by: Takashi Sakamoto o-takashi@sakamocchi.jp Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- sound/firewire/fireface/ff-protocol-ff400.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/sound/firewire/fireface/ff-protocol-ff400.c +++ b/sound/firewire/fireface/ff-protocol-ff400.c @@ -30,7 +30,7 @@ static int ff400_get_clock(struct snd_ff int err;
err = snd_fw_transaction(ff->unit, TCODE_READ_QUADLET_REQUEST, - FF400_SYNC_STATUS, ®, sizeof(reg), 0); + FF400_CLOCK_CONFIG, ®, sizeof(reg), 0); if (err < 0) return err; data = le32_to_cpu(reg);
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kailang Yang kailang@realtek.com
commit bde1a7459623a66c2abec4d0a841e4b06cc88d9a upstream.
If it plugged headphone or headset into the jack, then do the reboot, it will have a chance to cause headphone no sound. It just need to run the headphone mode procedure after boot time. The issue will be fixed. It also suitable for ALC234 ALC274 and ALC294.
Signed-off-by: Kailang Yang kailang@realtek.com Cc: stable@vger.kernel.org Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- sound/pci/hda/patch_realtek.c | 33 +++++++++++++++++++++++++++++++++ 1 file changed, 33 insertions(+)
--- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -6965,6 +6965,37 @@ static void alc269_fill_coef(struct hda_ alc_update_coef_idx(codec, 0x4, 0, 1<<11); }
+static void alc294_hp_init(struct hda_codec *codec) +{ + struct alc_spec *spec = codec->spec; + hda_nid_t hp_pin = spec->gen.autocfg.hp_pins[0]; + int i, val; + + if (!hp_pin) + return; + + snd_hda_codec_write(codec, hp_pin, 0, + AC_VERB_SET_AMP_GAIN_MUTE, AMP_OUT_MUTE); + + msleep(100); + + snd_hda_codec_write(codec, hp_pin, 0, + AC_VERB_SET_PIN_WIDGET_CONTROL, 0x0); + + alc_update_coef_idx(codec, 0x6f, 0x000f, 0);/* Set HP depop to manual mode */ + alc_update_coefex_idx(codec, 0x58, 0x00, 0x8000, 0x8000); /* HP depop procedure start */ + + /* Wait for depop procedure finish */ + val = alc_read_coefex_idx(codec, 0x58, 0x01); + for (i = 0; i < 20 && val & 0x0080; i++) { + msleep(50); + val = alc_read_coefex_idx(codec, 0x58, 0x01); + } + /* Set HP depop to auto mode */ + alc_update_coef_idx(codec, 0x6f, 0x000f, 0x000b); + msleep(50); +} + /* */ static int patch_alc269(struct hda_codec *codec) @@ -7101,6 +7132,7 @@ static int patch_alc269(struct hda_codec spec->codec_variant = ALC269_TYPE_ALC294; spec->gen.mixer_nid = 0; /* ALC2x4 does not have any loopback mixer path */ alc_update_coef_idx(codec, 0x6b, 0x0018, (1<<4) | (1<<3)); /* UAJ MIC Vref control by verb */ + alc294_hp_init(codec); break; case 0x10ec0300: spec->codec_variant = ALC269_TYPE_ALC300; @@ -7112,6 +7144,7 @@ static int patch_alc269(struct hda_codec spec->codec_variant = ALC269_TYPE_ALC700; spec->gen.mixer_nid = 0; /* ALC700 does not have any loopback mixer path */ alc_update_coef_idx(codec, 0x4a, 1 << 15, 0); /* Combo jack auto trigger control */ + alc294_hp_init(codec); break;
}
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Piotr Stankiewicz piotr.stankiewicz@intel.com
commit 36d842194a57f1b21fbc6a6875f2fa2f9a7f8679 upstream.
When running with KASAN, the following trace is produced:
[ 62.535888]
================================================================== [ 62.544930] BUG: KASAN: slab-out-of-bounds in gut_hw_stats+0x122/0x230 [hfi1] [ 62.553856] Write of size 8 at addr ffff88080e8d6330 by task kworker/0:1/14
[ 62.565333] CPU: 0 PID: 14 Comm: kworker/0:1 Not tainted 4.19.0-test-build-kasan+ #8 [ 62.575087] Hardware name: Intel Corporation S2600KPR/S2600KPR, BIOS SE5C610.86B.01.01.0019.101220160604 10/12/2016 [ 62.587951] Workqueue: events work_for_cpu_fn [ 62.594050] Call Trace: [ 62.598023] dump_stack+0xc6/0x14c [ 62.603089] ? dump_stack_print_info.cold.1+0x2f/0x2f [ 62.610041] ? kmsg_dump_rewind_nolock+0x59/0x59 [ 62.616615] ? get_hw_stats+0x122/0x230 [hfi1] [ 62.622985] print_address_description+0x6c/0x23c [ 62.629744] ? get_hw_stats+0x122/0x230 [hfi1] [ 62.636108] kasan_report.cold.6+0x241/0x308 [ 62.642365] get_hw_stats+0x122/0x230 [hfi1] [ 62.648703] ? hfi1_alloc_rn+0x40/0x40 [hfi1] [ 62.655088] ? __kmalloc+0x110/0x240 [ 62.660695] ? hfi1_alloc_rn+0x40/0x40 [hfi1] [ 62.667142] setup_hw_stats+0xd8/0x430 [ib_core] [ 62.673972] ? show_hfi+0x50/0x50 [hfi1] [ 62.680026] ib_device_register_sysfs+0x165/0x180 [ib_core] [ 62.687995] ib_register_device+0x5a2/0xa10 [ib_core] [ 62.695340] ? show_hfi+0x50/0x50 [hfi1] [ 62.701421] ? ib_unregister_device+0x2e0/0x2e0 [ib_core] [ 62.709222] ? __vmalloc_node_range+0x2d0/0x380 [ 62.716131] ? rvt_driver_mr_init+0x11f/0x2d0 [rdmavt] [ 62.723735] ? vmalloc_node+0x5c/0x70 [ 62.729697] ? rvt_driver_mr_init+0x11f/0x2d0 [rdmavt] [ 62.737347] ? rvt_driver_mr_init+0x1f5/0x2d0 [rdmavt] [ 62.744998] ? __rvt_alloc_mr+0x110/0x110 [rdmavt] [ 62.752315] ? rvt_rc_error+0x140/0x140 [rdmavt] [ 62.759434] ? rvt_vma_open+0x30/0x30 [rdmavt] [ 62.766364] ? mutex_unlock+0x1d/0x40 [ 62.772445] ? kmem_cache_create_usercopy+0x15d/0x230 [ 62.780115] rvt_register_device+0x1f6/0x360 [rdmavt] [ 62.787823] ? rvt_get_port_immutable+0x180/0x180 [rdmavt] [ 62.796058] ? __get_txreq+0x400/0x400 [hfi1] [ 62.802969] ? memcpy+0x34/0x50 [ 62.808611] hfi1_register_ib_device+0xde6/0xeb0 [hfi1] [ 62.816601] ? hfi1_get_npkeys+0x10/0x10 [hfi1] [ 62.823760] ? hfi1_init+0x89f/0x9a0 [hfi1] [ 62.830469] ? hfi1_setup_eagerbufs+0xad0/0xad0 [hfi1] [ 62.838204] ? pcie_capability_clear_and_set_word+0xcd/0xe0 [ 62.846429] ? pcie_capability_read_word+0xd0/0xd0 [ 62.853791] ? hfi1_pcie_init+0x187/0x4b0 [hfi1] [ 62.860958] init_one+0x67f/0xae0 [hfi1] [ 62.867301] ? hfi1_init+0x9a0/0x9a0 [hfi1] [ 62.873876] ? wait_woken+0x130/0x130 [ 62.879860] ? read_word_at_a_time+0xe/0x20 [ 62.886329] ? strscpy+0x14b/0x280 [ 62.891998] ? hfi1_init+0x9a0/0x9a0 [hfi1] [ 62.898405] local_pci_probe+0x70/0xd0 [ 62.904295] ? pci_device_shutdown+0x90/0x90 [ 62.910833] work_for_cpu_fn+0x29/0x40 [ 62.916750] process_one_work+0x584/0x960 [ 62.922974] ? rcu_work_rcufn+0x40/0x40 [ 62.928991] ? __schedule+0x396/0xdc0 [ 62.934806] ? __sched_text_start+0x8/0x8 [ 62.941020] ? pick_next_task_fair+0x68b/0xc60 [ 62.947674] ? run_rebalance_domains+0x260/0x260 [ 62.954471] ? __list_add_valid+0x29/0xa0 [ 62.960607] ? move_linked_works+0x1c7/0x230 [ 62.967077] ? trace_event_raw_event_workqueue_execute_start+0x140/0x140 [ 62.976248] ? mutex_lock+0xa6/0x100 [ 62.982029] ? __mutex_lock_slowpath+0x10/0x10 [ 62.988795] ? __switch_to+0x37a/0x710 [ 62.994731] worker_thread+0x62e/0x9d0 [ 63.000602] ? max_active_store+0xf0/0xf0 [ 63.006828] ? __switch_to_asm+0x40/0x70 [ 63.012932] ? __switch_to_asm+0x34/0x70 [ 63.019013] ? __switch_to_asm+0x40/0x70 [ 63.025042] ? __switch_to_asm+0x34/0x70 [ 63.031030] ? __switch_to_asm+0x40/0x70 [ 63.037006] ? __schedule+0x396/0xdc0 [ 63.042660] ? kmem_cache_alloc_trace+0xf3/0x1f0 [ 63.049323] ? kthread+0x59/0x1d0 [ 63.054594] ? ret_from_fork+0x35/0x40 [ 63.060257] ? __sched_text_start+0x8/0x8 [ 63.066212] ? schedule+0xcf/0x250 [ 63.071529] ? __wake_up_common+0x110/0x350 [ 63.077794] ? __schedule+0xdc0/0xdc0 [ 63.083348] ? wait_woken+0x130/0x130 [ 63.088963] ? finish_task_switch+0x1f1/0x520 [ 63.095258] ? kasan_unpoison_shadow+0x30/0x40 [ 63.101792] ? __init_waitqueue_head+0xa0/0xd0 [ 63.108183] ? replenish_dl_entity.cold.60+0x18/0x18 [ 63.115151] ? _raw_spin_lock_irqsave+0x25/0x50 [ 63.121754] ? max_active_store+0xf0/0xf0 [ 63.127753] kthread+0x1ae/0x1d0 [ 63.132894] ? kthread_bind+0x30/0x30 [ 63.138422] ret_from_fork+0x35/0x40
[ 63.146973] Allocated by task 14: [ 63.152077] kasan_kmalloc+0xbf/0xe0 [ 63.157471] __kmalloc+0x110/0x240 [ 63.162804] init_cntrs+0x34d/0xdf0 [hfi1] [ 63.168883] hfi1_init_dd+0x29a3/0x2f90 [hfi1] [ 63.175244] init_one+0x551/0xae0 [hfi1] [ 63.181065] local_pci_probe+0x70/0xd0 [ 63.186759] work_for_cpu_fn+0x29/0x40 [ 63.192310] process_one_work+0x584/0x960 [ 63.198163] worker_thread+0x62e/0x9d0 [ 63.203843] kthread+0x1ae/0x1d0 [ 63.208874] ret_from_fork+0x35/0x40
[ 63.217203] Freed by task 1: [ 63.221844] __kasan_slab_free+0x12e/0x180 [ 63.227844] kfree+0x92/0x1a0 [ 63.232570] single_release+0x3a/0x60 [ 63.238024] __fput+0x1d9/0x480 [ 63.242911] task_work_run+0x139/0x190 [ 63.248440] exit_to_usermode_loop+0x191/0x1a0 [ 63.254814] do_syscall_64+0x301/0x330 [ 63.260283] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 63.270199] The buggy address belongs to the object at ffff88080e8d5500 which belongs to the cache kmalloc-4096 of size 4096 [ 63.287247] The buggy address is located 3632 bytes inside of 4096-byte region [ffff88080e8d5500, ffff88080e8d6500) [ 63.303564] The buggy address belongs to the page: [ 63.310447] page:ffffea00203a3400 count:1 mapcount:0 mapping:ffff88081380e840 index:0x0 compound_mapcount: 0 [ 63.323102] flags: 0x2fffff80008100(slab|head) [ 63.329775] raw: 002fffff80008100 0000000000000000 0000000100000001 ffff88081380e840 [ 63.340175] raw: 0000000000000000 0000000000070007 00000001ffffffff 0000000000000000 [ 63.350564] page dumped because: kasan: bad access detected
[ 63.361974] Memory state around the buggy address: [ 63.369137] ffff88080e8d6200: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 63.379082] ffff88080e8d6280: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 63.389032] >ffff88080e8d6300: 00 00 00 00 00 00 fc fc fc fc fc fc fc fc fc fc [ 63.398944] ^ [ 63.406141] ffff88080e8d6380: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 63.416109] ffff88080e8d6400: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 63.426099] ==================================================================
The trace happens because get_hw_stats() assumes there is room in the memory allocated in init_cntrs() to accommodate the driver counters. Unfortunately, that routine only allocated space for the device counters.
Fix by insuring the allocation has room for the additional driver counters.
Cc: Stable@vger.kernel.org # v4.14+ Fixes: b7481944b06e9 ("IB/hfi1: Show statistics counters under IB stats interface") Reviewed-by: Mike Marciniczyn mike.marciniszyn@intel.com Reviewed-by: Mike Ruhl michael.j.ruhl@intel.com Signed-off-by: Piotr Stankiewicz piotr.stankiewicz@intel.com Signed-off-by: Dennis Dalessandro dennis.dalessandro@intel.com Signed-off-by: Doug Ledford dledford@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/infiniband/hw/hfi1/chip.c | 3 ++- drivers/infiniband/hw/hfi1/hfi.h | 2 ++ drivers/infiniband/hw/hfi1/verbs.c | 2 +- 3 files changed, 5 insertions(+), 2 deletions(-)
--- a/drivers/infiniband/hw/hfi1/chip.c +++ b/drivers/infiniband/hw/hfi1/chip.c @@ -12449,7 +12449,8 @@ static int init_cntrs(struct hfi1_devdat }
/* allocate space for the counter values */ - dd->cntrs = kcalloc(dd->ndevcntrs, sizeof(u64), GFP_KERNEL); + dd->cntrs = kcalloc(dd->ndevcntrs + num_driver_cntrs, sizeof(u64), + GFP_KERNEL); if (!dd->cntrs) goto bail;
--- a/drivers/infiniband/hw/hfi1/hfi.h +++ b/drivers/infiniband/hw/hfi1/hfi.h @@ -152,6 +152,8 @@ struct hfi1_ib_stats { extern struct hfi1_ib_stats hfi1_stats; extern const struct pci_error_handlers hfi1_pci_err_handler;
+extern int num_driver_cntrs; + /* * First-cut criterion for "device is active" is * two thousand dwords combined Tx, Rx traffic per --- a/drivers/infiniband/hw/hfi1/verbs.c +++ b/drivers/infiniband/hw/hfi1/verbs.c @@ -1693,7 +1693,7 @@ static const char * const driver_cntr_na static DEFINE_MUTEX(cntr_names_lock); /* protects the *_cntr_names bufers */ static const char **dev_cntr_names; static const char **port_cntr_names; -static int num_driver_cntrs = ARRAY_SIZE(driver_cntr_names); +int num_driver_cntrs = ARRAY_SIZE(driver_cntr_names); static int num_dev_cntrs; static int num_port_cntrs; static int cntr_names_initialized;
stable-rc/linux-4.14.y boot: 116 boots: 1 failed, 111 passed with 2 offline, 2 untried/unknown (v4.14.88-90-g159976987c2f)
Full Boot Summary: https://kernelci.org/boot/all/job/stable-rc/branch/linux-4.14.y/kernel/v4.14... Full Build Summary: https://kernelci.org/build/stable-rc/branch/linux-4.14.y/kernel/v4.14.88-90-...
Tree: stable-rc Branch: linux-4.14.y Git Describe: v4.14.88-90-g159976987c2f Git Commit: 159976987c2f89308c6293a31e8fa9543549b94b Git URL: http://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git Tested: 60 unique boards, 23 SoC families, 12 builds out of 197
Boot Regressions Detected:
arm64:
defconfig: rk3399-firefly: lab-baylibre-seattle: failing since 5 days (last pass: v4.14.86-56-g555f40d53cca - first fail: v4.14.87)
Boot Failure Detected:
arm64:
defconfig rk3399-firefly: 1 failed lab
Offline Platforms:
arm:
multi_v7_defconfig: stih410-b2120: 1 offline lab
arm64:
defconfig: sun50i-a64-pine64-plus: 1 offline lab
--- For more info write to info@kernelci.org
On 12/14/18 4:59 AM, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 4.14.89 release. There are 89 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sun Dec 16 11:57:01 UTC 2018. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.14.89-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.14.y and the diffstat can be found below.
thanks,
greg k-h
Compiled and booted on my test system. No dmesg regressions.
thanks, -- Shuah
On 12/14/18 3:59 AM, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 4.14.89 release. There are 89 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sun Dec 16 11:57:01 UTC 2018. Anything received after that time might be too late.
Build results: total: 171 pass: 171 fail: 0 Qemu test results: total: 322 pass: 322 fail: 0
Details are available at https://kerneltests.org/builders/.
Guenter
On Fri, Dec 14, 2018 at 12:59:13PM +0100, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 4.14.89 release. There are 89 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sun Dec 16 11:57:01 UTC 2018. Anything received after that time might be too late.
Results from Linaro’s test farm. No regressions on arm64, arm, x86_64, and i386.
Summary ------------------------------------------------------------------------
kernel: 4.14.89-rc1 git repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git git branch: linux-4.14.y git commit: 159976987c2f89308c6293a31e8fa9543549b94b git describe: v4.14.88-90-g159976987c2f Test details: https://qa-reports.linaro.org/lkft/linux-stable-rc-4.14-oe/build/v4.14.88-90...
No regressions (compared to build v4.14.88)
No fixes (compared to build v4.14.88)
Ran 21597 total tests in the following environments and test suites.
Environments -------------- - dragonboard-410c - arm64 - hi6220-hikey - arm64 - i386 - juno-r2 - arm64 - qemu_arm - qemu_arm64 - qemu_i386 - qemu_x86_64 - x15 - arm - x86_64
Test Suites ----------- * boot * install-android-platform-tools-r2600 * kselftest * libhugetlbfs * ltp-cap_bounds-tests * ltp-containers-tests * ltp-cpuhotplug-tests * ltp-cve-tests * ltp-fcntl-locktests-tests * ltp-filecaps-tests * ltp-fs-tests * ltp-fs_bind-tests * ltp-fs_perms_simple-tests * ltp-fsx-tests * ltp-hugetlb-tests * ltp-io-tests * ltp-ipc-tests * ltp-math-tests * ltp-nptl-tests * ltp-pty-tests * ltp-sched-tests * ltp-securebits-tests * ltp-syscalls-tests * ltp-timers-tests * spectre-meltdown-checker-test * ltp-open-posix-tests * kselftest-vsyscall-mode-native * kselftest-vsyscall-mode-none
linux-stable-mirror@lists.linaro.org