This is the start of the stable review cycle for the 4.18.17 release. There are 150 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sun Nov 4 18:28:05 UTC 2018. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.18.17-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.18.y and the diffstat can be found below.
thanks,
greg k-h
------------- Pseudo-Shortlog of commits:
Greg Kroah-Hartman gregkh@linuxfoundation.org Linux 4.18.17-rc1
Nikolay Aleksandrov nikolay@cumulusnetworks.com net: bridge: remove ipv6 zero address check in mcast queries
David S. Miller davem@davemloft.net sparc: Throttle perf events properly.
David S. Miller davem@davemloft.net sparc: Fix syscall fallback bugs in VDSO.
David S. Miller davem@davemloft.net sparc: Fix single-pcr perf event counter management.
David S. Miller davem@davemloft.net sparc64: Wire up compat getpeername and getsockname.
David S. Miller davem@davemloft.net sparc64: Set %l4 properly on trap return after handling signals.
David S. Miller davem@davemloft.net sparc64: Make proc_id signed.
David Miller davem@redhat.com sparc64: Make corrupted user stacks more debuggable.
David S. Miller davem@davemloft.net sparc64: Export __node_distance.
Shalom Toledo shalomt@mellanox.com mlxsw: core: Fix devlink unregister flow
Tariq Toukan tariqt@mellanox.com net/mlx5: WQ, fixes for fragmented WQ buffers API
Dimitris Michailidis dmichail@google.com net: fix pskb_trim_rcsum_slow() with odd trim offset
Cong Wang xiyou.wangcong@gmail.com net: drop skb on failure in ip_check_defrag()
Taehee Yoo ap420073@gmail.com net: bpfilter: use get_pid_task instead of pid_task
Petr Machata petrm@mellanox.com mlxsw: spectrum_switchdev: Don't ignore deletions of learned MACs
Karsten Graul kgraul@linux.ibm.com net/smc: fix smc_buf_unuse to use the lgr pointer
Talat Batheesh talatb@mellanox.com net/mlx5: Fix memory leak when setting fpga ipsec caps
Xin Long lucien.xin@gmail.com sctp: not free the new asoc when sctp_wait_for_connect returns err
Xin Long lucien.xin@gmail.com sctp: fix the data size calculation in sctp_data_size
David Ahern dsahern@gmail.com net/ipv6: Allow onlink routes to have a device mismatch if it is the default route
Davide Caratti dcaratti@redhat.com net/sched: cls_api: add missing validation of netlink attributes
Phil Sutter phil@nwl.cc net: sched: Fix for duplicate class dump
Florian Fainelli f.fainelli@gmail.com net: bcmgenet: Poll internal PHY for GENETv5
Huy Nguyen huyn@mellanox.com net/mlx5: Take only bit 24-26 of wqe.pftype_wq for page fault type
Nikolay Aleksandrov nikolay@cumulusnetworks.com net: ipmr: fix unresolved entry dumps
Jaime Caamaño Ruiz jcaamano@suse.com openvswitch: Fix push/pop ethernet validation
Stefano Brivio sbrivio@redhat.com ip6_tunnel: Fix encapsulation layout
Tobias Jungel tobias.jungel@gmail.com bonding: fix length of actor system
Wenwen Wang wang6495@umn.edu ethtool: fix a privilege escalation bug
Ake Koomsin ake@igel.co.jp virtio_net: avoid using netif_tx_disable() for serializing tx routine
Jason Wang jasowang@redhat.com vhost: Fix Spectre V1 vulnerability
Paolo Abeni pabeni@redhat.com udp6: fix encap return code for resubmitting
Tung Nguyen tung.q.nguyen@dektech.com.au tipc: fix unsafe rcu locking when accessing publication list
Marcelo Ricardo Leitner marcelo.leitner@gmail.com sctp: fix race on sctp_id2asoc
Ido Schimmel idosch@mellanox.com rtnetlink: Disallow FDB configuration for non-Ethernet device
Heiner Kallweit hkallweit1@gmail.com r8169: fix NAPI handling under high load
Sean Tranchetti stranche@codeaurora.org net: udp: fix handling of CHECKSUM_COMPLETE packets
Niklas Cassel niklas.cassel@linaro.org net: stmmac: Fix stmmac_mdio_reset() when building stmmac as modules
Wenwen Wang wang6495@umn.edu net: socket: fix a missing-check bug
Jakub Kicinski jakub.kicinski@netronome.com net: sched: gred: pass the right attribute to gred_change_table_def()
Eric Dumazet edumazet@google.com net/mlx5e: fix csum adjustments caused by RXFCS
David Ahern dsahern@gmail.com net/ipv6: Fix index counter for unicast addresses in in6_dump_addrs
Fugang Duan fugang.duan@nxp.com net: fec: don't dump RX FIFO register when not available
Cong Wang xiyou.wangcong@gmail.com llc: set SOCK_RCU_FREE in llc_sap_add_socket()
Sabrina Dubroca sd@queasysnail.net ipv6: rate-limit probes for neighbourless routes
Stefano Brivio sbrivio@redhat.com ipv6/ndisc: Preserve IPv6 control buffer if protocol error handlers are called
Eric Dumazet edumazet@google.com ipv6: mcast: fix a use-after-free in inet6_mc_check
Hangbin Liu liuhangbin@gmail.com bridge: do not add port to router list when receives query with source 0.0.0.0
Rasmus Villemoes linux@rasmusvillemoes.dk perf tools: Disable parallelism for 'make clean'
Sasha Levin sashal@kernel.org Revert "netfilter: ipv6: nf_defrag: drop skb dst before queueing"
Sasha Levin sashal@kernel.org Revert "mm: slowly shrink slabs with a relatively small number of objects"
Khazhismel Kumykov khazhy@google.com fs/fat/fatent.c: add cond_resched() to fat_count_free_clusters()
David Howells dhowells@redhat.com afs: Fix cell proc list
Peter Oberparleiter oberpar@linux.ibm.com vmlinux.lds.h: Fix linker warnings about orphan .LPBX sections
Peter Oberparleiter oberpar@linux.ibm.com vmlinux.lds.h: Fix incomplete .text.exit discards
Paolo Abeni pabeni@redhat.com selftests: udpgso_bench.sh explicitly requires bash
Paolo Abeni pabeni@redhat.com selftests: rtnetlink.sh explicitly requires bash.
Ka-Cheong Poon ka-cheong.poon@oracle.com rds: RDS (tcp) hangs on sendto() to unresponding address
Valentine Fatiev Valentinef@mellanox.com IB/mlx5: Unmap DMA addr from HCA before IOMMU
Stephen Boyd swboyd@chromium.org gpio: Assign gpio_irq_chip::parents to non-stack pointer
Arthur Kiyanovski akiyano@amazon.com net: ena: fix NULL dereference due to untimely napi initialization
Arthur Kiyanovski akiyano@amazon.com net: ena: fix rare bug when failed restart/resume is followed by driver removal
Arthur Kiyanovski akiyano@amazon.com net: ena: fix warning in rmmod caused by double iounmap
Paolo Bonzini pbonzini@redhat.com KVM: x86: support CONFIG_KVM_AMD=y with CONFIG_CRYPTO_DEV_CCP_DD=m
David Howells dhowells@redhat.com rxrpc: Fix connection-level abort handling
David Howells dhowells@redhat.com rxrpc: Only take the rwind and mtu values from latest ACK
David Howells dhowells@redhat.com rxrpc: Carry call state out of locked section in rxrpc_rotate_tx_window()
David Howells dhowells@redhat.com rxrpc: Don't check RXRPC_CALL_TX_LAST after calling rxrpc_rotate_tx_window()
Milian Wolff milian.wolff@kdab.com perf record: Use unmapped IP for inline callchain cursors
Arnaldo Carvalho de Melo acme@redhat.com perf python: Use -Wno-redundant-decls to build with PYTHON=python3
Sascha Hauer s.hauer@pengutronix.de ARM: dts: imx53-qsb: disable 1.2GHz OPP
Paul Burton paul.burton@mips.com compiler.h: Allow arch-specific asm/compiler.h
Hans de Goede hdegoede@redhat.com HID: i2c-hid: Remove RESEND_REPORT_DESCR quirk and its handling
Doron Roberts-Kedes doronrk@fb.com tls: Fix improper revert in zerocopy_from_iter
Milian Wolff milian.wolff@kdab.com perf report: Don't try to map ip to invalid map
Daniel Mack daniel@zonque.org libertas: call into generic suspend code before turning off power
Anders Roxell anders.roxell@linaro.org clk: mvebu: armada-37xx-periph: Remove unused var num_parents
Dan Carpenter dan.carpenter@oracle.com x86/paravirt: Fix some warning messages
Anshuman Khandual anshuman.khandual@arm.com mm/migrate.c: split only transparent huge pages when allocation fails
YueHaibing yuehaibing@huawei.com mm/gup_benchmark: fix unsigned comparison to zero in __gup_benchmark_ioctl
Larry Chen lchen@suse.com ocfs2: fix crash in ocfs2_duplicate_clusters_by_page()
Wenwen Wang wang6495@umn.edu yam: fix a missing-check bug
Wenwen Wang wang6495@umn.edu net: cxgb3_main: fix a missing-check bug
Srikar Dronamraju srikar@linux.vnet.ibm.com powerpc/numa: Skip onlining a offline node in kdump path
Davide Caratti dcaratti@redhat.com be2net: don't flip hw_features when VXLANs are added/deleted
Shirish S shirish.s@amd.com drm/amd/display: Signal hw_done() after waiting for flip_done()
Guenter Roeck linux@roeck-us.net locking/ww_mutex: Fix runtime warning in the WW mutex selftest
Guenter Roeck linux@roeck-us.net Revert "serial: 8250_dw: Fix runtime PM handling"
Atish Patra atish.patra@wdc.com RISCV: Fix end PFN for low memory
Maciej W. Rozycki macro@linux-mips.org declance: Fix continuation with the adapter identification message
Rickard x Andersson rickaran@axis.com net: fec: fix rare tx timeout
Mika Westerberg mika.westerberg@linux.intel.com thunderbolt: Initialize after IOMMUs
Mika Westerberg mika.westerberg@linux.intel.com thunderbolt: Do not handle ICM events after domain is stopped
Natarajan, Janakarajan Janakarajan.Natarajan@amd.com perf/x86/amd/uncore: Set ThreadMask and SliceMask for L3 Cache perf events
Kan Liang kan.liang@linux.intel.com perf/x86/intel/uncore: Fix PCI BDF address of M3UPI on SKX
Jiri Olsa jolsa@redhat.com perf/ring_buffer: Prevent concurent ring buffer access
Masayoshi Mizuma m.mizuma@jp.fujitsu.com perf/x86/intel/uncore: Use boot_cpu_data.phys_proc_id instead of hardcorded physical package ID 0
Peter Zijlstra peterz@infradead.org perf/core: Fix perf_pmu_unregister() locking
Liran Alon liran.alon@oracle.com KVM: nVMX: Fix emulation of VM_ENTRY_LOAD_BNDCFGS
Liran Alon liran.alon@oracle.com KVM: x86: Do not use kvm_x86_ops->mpx_supported() directly
Liran Alon liran.alon@oracle.com KVM: nVMX: Do not expose MPX VMX controls when guest MPX disabled
Yu Zhao yuzhao@google.com cfg80211: fix use-after-free in reg_process_hint()
Florian Fainelli f.fainelli@gmail.com smsc95xx: Check for Wake-on-LAN modes
Florian Fainelli f.fainelli@gmail.com smsc75xx: Check for Wake-on-LAN modes
Florian Fainelli f.fainelli@gmail.com r8152: Check for supported Wake-on-LAN Modes
Florian Fainelli f.fainelli@gmail.com sr9800: Check for supported Wake-on-LAN modes
Florian Fainelli f.fainelli@gmail.com lan78xx: Check for supported Wake-on-LAN modes
Florian Fainelli f.fainelli@gmail.com ax88179_178a: Check for supported Wake-on-LAN modes
Florian Fainelli f.fainelli@gmail.com asix: Check for supported Wake-on-LAN modes
Florian Westphal fw@strlen.de netfilter: avoid erronous array bounds warning
Taehee Yoo ap420073@gmail.com netfilter: nft_set_rbtree: add missing rb_erase() in GC routine
David Howells dhowells@redhat.com rxrpc: Fix error distribution
David Howells dhowells@redhat.com rxrpc: Fix transport sockopts to get IPv4 errors on an IPv6 socket
David Howells dhowells@redhat.com rxrpc: Fix RTT gathering
David Howells dhowells@redhat.com rxrpc: Fix checks as to whether we should set up a new call
Nilesh Javali nilesh.javali@cavium.com scsi: qedi: Initialize the stats mutex lock
Masashi Honma masashi.honma@gmail.com nl80211: Fix possible Spectre-v1 for CQM RSSI thresholds
Nathan Chancellor natechancellor@gmail.com qed: Avoid implicit enum conversion in qed_iwarp_parse_rx_pkt
Nathan Chancellor natechancellor@gmail.com qed: Avoid constant logical operation warning in qed_vf_pf_acquire
Nathan Chancellor natechancellor@gmail.com qed: Avoid implicit enum conversion in qed_roce_mode_to_flavor
Nathan Chancellor natechancellor@gmail.com qed: Fix mask parameter in qed_vf_prep_tunn_req_tlv
Nathan Chancellor natechancellor@gmail.com qed: Avoid implicit enum conversion in qed_set_tunn_cls_info
Lubomir Rintel lkundrak@v3.sk pxa168fb: prepare the clock
Matias Karhumaa matias.karhumaa@gmail.com Bluetooth: SMP: fix crash in unpairing
Martin Willi martin@strongswan.org mac80211_hwsim: do not omit multicast announce of first added radio
Martin Willi martin@strongswan.org mac80211_hwsim: fix race in radio destruction from netlink notifier
Martin Willi martin@strongswan.org mac80211_hwsim: fix locking when iterating radios during ns exit
Masashi Honma masashi.honma@gmail.com nl80211: Fix possible Spectre-v1 for NL80211_TXRATE_HT
Zhao Qiang qiang.zhao@nxp.com soc: fsl: qe: Fix copy/paste bug in ucc_get_tdm_sync_shift()
Alexandre Belloni alexandre.belloni@bootlin.com soc: fsl: qbman: qman: avoid allocating from non existing gen_pool
Michal Simek michal.simek@xilinx.com net: macb: Clean 64b dma addresses if they are not detected
Florian Fainelli f.fainelli@gmail.com ARM: dts: BCM63xx: Fix incorrect interrupt specifiers
Steve Capper steve.capper@arm.com arm64: hugetlb: Fix handling of young ptes
zhong jiang zhongjiang@huawei.com netfilter: conntrack: get rid of double sizeof
David Ahern dsahern@gmail.com netfilter: bridge: Don't sabotage nf_hook calls from an l3mdev
Hans Verkuil hverkuil@xs4all.nl drm/i2c: tda9950: set MAX_RETRIES for errors only
Colin Ian King colin.king@canonical.com drm/i2c: tda9950: fix timeout counter check
Sean Tranchetti stranche@codeaurora.org xfrm: validate template mode
Thomas Petazzoni thomas.petazzoni@free-electrons.com ARM: 8799/1: mm: fix pci_ioremap_io() offset check
Steffen Klassert steffen.klassert@secunet.com xfrm: Fix NULL pointer dereference when skb_dst_force clears the dst_entry.
Yuan-Chi Pang fu3mo6goo@gmail.com mac80211: fix TX status reporting for ieee80211s
Johannes Berg johannes.berg@intel.com mac80211: TDLS: fix skb queue/priority assignment
Jouni Malinen jouni@codeaurora.org cfg80211: Address some corner cases in scan result channel updating
Bob Copeland me@bobcopeland.com mac80211: fix pending queue hang due to TX_DROP
Andrei Otcheretianski andrei.otcheretianski@intel.com cfg80211: reg: Init wiphy_idx in regulatory_hint_core()
Andrei Otcheretianski andrei.otcheretianski@intel.com mac80211: Always report TX status
Sowmini Varadhan sowmini.varadhan@oracle.com xfrm: reset crypto_done when iterating over multiple input xfrms
Sowmini Varadhan sowmini.varadhan@oracle.com xfrm: reset transport header back to network header after all input transforms ahave been applied
Thadeu Lima de Souza Cascardo cascardo@canonical.com xfrm6: call kfree_skb when skb is toobig
Steffen Klassert steffen.klassert@secunet.com xfrm: Validate address prefix lengths in the xfrm selector.
-------------
Diffstat:
Makefile | 4 +- arch/Kconfig | 8 ++ arch/arm/boot/dts/bcm63138.dtsi | 14 ++-- arch/arm/boot/dts/imx53-qsb-common.dtsi | 11 +++ arch/arm/kernel/vmlinux.lds.h | 2 + arch/arm/mm/ioremap.c | 2 +- arch/arm64/mm/hugetlbpage.c | 12 ++- arch/powerpc/mm/numa.c | 5 +- arch/riscv/kernel/setup.c | 2 +- arch/sparc/include/asm/cpudata_64.h | 2 +- arch/sparc/include/asm/switch_to_64.h | 3 +- arch/sparc/kernel/perf_event.c | 26 ++++++- arch/sparc/kernel/process_64.c | 25 ++++-- arch/sparc/kernel/rtrap_64.S | 4 +- arch/sparc/kernel/signal32.c | 12 ++- arch/sparc/kernel/signal_64.c | 6 +- arch/sparc/kernel/systbls_64.S | 4 +- arch/sparc/mm/init_64.c | 1 + arch/sparc/vdso/vclock_gettime.c | 12 ++- arch/x86/events/amd/uncore.c | 10 +++ arch/x86/events/intel/uncore_snbep.c | 14 ++-- arch/x86/include/asm/perf_event.h | 8 ++ arch/x86/kernel/paravirt.c | 4 +- arch/x86/kvm/svm.c | 6 +- arch/x86/kvm/vmx.c | 39 ++++++++-- arch/x86/kvm/x86.c | 2 +- drivers/clk/mvebu/armada-37xx-periph.c | 1 - drivers/gpio/gpiolib.c | 3 +- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 10 ++- drivers/gpu/drm/i2c/tda9950.c | 5 +- drivers/hid/hid-ids.h | 1 - drivers/hid/i2c-hid/i2c-hid.c | 18 +---- drivers/infiniband/hw/mlx5/mr.c | 12 ++- drivers/net/bonding/bond_netlink.c | 3 +- drivers/net/ethernet/amazon/ena/ena_netdev.c | 22 +++--- drivers/net/ethernet/amd/declance.c | 10 ++- drivers/net/ethernet/broadcom/genet/bcmmii.c | 7 +- drivers/net/ethernet/cadence/macb_main.c | 1 + drivers/net/ethernet/chelsio/cxgb3/cxgb3_main.c | 17 +++++ drivers/net/ethernet/emulex/benet/be_main.c | 5 +- drivers/net/ethernet/freescale/fec.h | 4 + drivers/net/ethernet/freescale/fec_main.c | 24 ++++-- drivers/net/ethernet/mellanox/mlx5/core/en_rx.c | 57 ++++---------- drivers/net/ethernet/mellanox/mlx5/core/en_tx.c | 22 +++--- drivers/net/ethernet/mellanox/mlx5/core/eq.c | 2 +- .../net/ethernet/mellanox/mlx5/core/fpga/ipsec.c | 9 +-- .../net/ethernet/mellanox/mlx5/core/ipoib/ipoib.h | 5 +- drivers/net/ethernet/mellanox/mlx5/core/wq.c | 5 -- drivers/net/ethernet/mellanox/mlx5/core/wq.h | 11 ++- drivers/net/ethernet/mellanox/mlxsw/core.c | 24 ++++-- .../ethernet/mellanox/mlxsw/spectrum_switchdev.c | 2 - drivers/net/ethernet/qlogic/qed/qed_iwarp.c | 4 +- drivers/net/ethernet/qlogic/qed/qed_roce.c | 15 +--- drivers/net/ethernet/qlogic/qed/qed_sp_commands.c | 2 +- drivers/net/ethernet/qlogic/qed/qed_vf.c | 5 +- drivers/net/ethernet/realtek/r8169.c | 8 +- drivers/net/ethernet/stmicro/stmmac/stmmac_mdio.c | 2 +- drivers/net/hamradio/yam.c | 4 + drivers/net/usb/asix_common.c | 3 + drivers/net/usb/ax88179_178a.c | 3 + drivers/net/usb/lan78xx.c | 17 +---- drivers/net/usb/r8152.c | 3 + drivers/net/usb/smsc75xx.c | 3 + drivers/net/usb/smsc95xx.c | 3 + drivers/net/usb/sr9800.c | 3 + drivers/net/virtio_net.c | 5 +- drivers/net/wireless/mac80211_hwsim.c | 36 +++++---- drivers/net/wireless/marvell/libertas/if_sdio.c | 4 + drivers/scsi/qedi/qedi_main.c | 1 + drivers/soc/fsl/qbman/qman.c | 3 + drivers/soc/fsl/qe/ucc.c | 2 +- drivers/thunderbolt/icm.c | 49 +++++------- drivers/thunderbolt/nhi.c | 2 +- drivers/tty/serial/8250/8250_dw.c | 4 - drivers/vhost/vhost.c | 2 + drivers/video/fbdev/pxa168fb.c | 6 +- fs/afs/cell.c | 17 ++++- fs/afs/dynroot.c | 2 +- fs/afs/internal.h | 4 +- fs/afs/main.c | 2 +- fs/afs/proc.c | 7 +- fs/fat/fatent.c | 1 + fs/ocfs2/refcounttree.c | 16 +++- include/asm-generic/vmlinux.lds.h | 6 +- include/linux/compiler_types.h | 12 +++ include/linux/gpio/driver.h | 7 ++ include/linux/mlx5/driver.h | 8 ++ include/linux/netfilter.h | 2 + include/net/ip6_fib.h | 4 + include/net/sctp/sm.h | 2 +- include/trace/events/rxrpc.h | 4 +- kernel/events/core.c | 11 +-- kernel/locking/test-ww_mutex.c | 10 ++- mm/gup_benchmark.c | 3 +- mm/migrate.c | 2 +- mm/vmscan.c | 11 --- net/bluetooth/mgmt.c | 7 +- net/bluetooth/smp.c | 29 ++++++- net/bluetooth/smp.h | 3 +- net/bpfilter/bpfilter_kern.c | 6 +- net/bridge/br_multicast.c | 9 ++- net/bridge/br_netfilter_hooks.c | 3 +- net/core/datagram.c | 5 +- net/core/ethtool.c | 8 +- net/core/rtnetlink.c | 10 +++ net/core/skbuff.c | 5 +- net/ipv4/ip_fragment.c | 12 ++- net/ipv4/ipmr_base.c | 2 - net/ipv4/udp.c | 20 ++++- net/ipv4/xfrm4_input.c | 1 + net/ipv4/xfrm4_mode_transport.c | 4 +- net/ipv6/addrconf.c | 6 +- net/ipv6/ip6_checksum.c | 20 ++++- net/ipv6/ip6_tunnel.c | 9 ++- net/ipv6/mcast.c | 16 ++-- net/ipv6/ndisc.c | 3 +- net/ipv6/netfilter/nf_conntrack_reasm.c | 2 - net/ipv6/route.c | 14 ++-- net/ipv6/udp.c | 6 +- net/ipv6/xfrm6_input.c | 1 + net/ipv6/xfrm6_mode_transport.c | 4 +- net/ipv6/xfrm6_output.c | 2 + net/llc/llc_conn.c | 1 + net/mac80211/mesh.h | 3 +- net/mac80211/mesh_hwmp.c | 9 +-- net/mac80211/status.c | 11 ++- net/mac80211/tdls.c | 8 +- net/mac80211/tx.c | 2 +- net/netfilter/nf_conntrack_proto_tcp.c | 4 +- net/netfilter/nft_set_rbtree.c | 28 +++---- net/openvswitch/flow_netlink.c | 4 +- net/rds/send.c | 13 +++- net/rxrpc/ar-internal.h | 19 +++-- net/rxrpc/call_accept.c | 4 +- net/rxrpc/call_object.c | 2 +- net/rxrpc/conn_client.c | 4 +- net/rxrpc/conn_event.c | 26 ++++--- net/rxrpc/conn_object.c | 4 +- net/rxrpc/input.c | 88 ++++++++++++---------- net/rxrpc/local_object.c | 32 +++++--- net/rxrpc/output.c | 31 ++++---- net/rxrpc/peer_event.c | 46 +++-------- net/rxrpc/peer_object.c | 17 ----- net/sched/cls_api.c | 11 ++- net/sched/sch_api.c | 3 +- net/sched/sch_gred.c | 2 +- net/sctp/socket.c | 9 ++- net/smc/smc_core.c | 23 +++--- net/socket.c | 11 ++- net/tipc/name_distr.c | 4 +- net/tls/tls_sw.c | 12 ++- net/wireless/nl80211.c | 20 +++-- net/wireless/reg.c | 8 +- net/wireless/scan.c | 58 +++++++++++--- net/xfrm/xfrm_input.c | 1 + net/xfrm/xfrm_output.c | 4 + net/xfrm/xfrm_policy.c | 4 + net/xfrm/xfrm_user.c | 15 ++++ tools/perf/Makefile | 4 +- tools/perf/util/machine.c | 8 +- tools/perf/util/setup.py | 2 +- tools/testing/selftests/net/fib-onlink-tests.sh | 14 ++-- tools/testing/selftests/net/rtnetlink.sh | 2 +- tools/testing/selftests/net/udpgso_bench.sh | 2 +- 164 files changed, 977 insertions(+), 641 deletions(-)
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 07bf7908950a8b14e81aa1807e3c667eab39287a ]
We don't validate the address prefix lengths in the xfrm selector we got from userspace. This can lead to undefined behaviour in the address matching functions if the prefix is too big for the given address family. Fix this by checking the prefixes and refuse SA/policy insertation when a prefix is invalid.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Reported-by: Air Icy icytxw@gmail.com Signed-off-by: Steffen Klassert steffen.klassert@secunet.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/xfrm/xfrm_user.c | 12 ++++++++++++ 1 file changed, 12 insertions(+)
diff --git a/net/xfrm/xfrm_user.c b/net/xfrm/xfrm_user.c index 33878e6e0d0a..5151b3ebf068 100644 --- a/net/xfrm/xfrm_user.c +++ b/net/xfrm/xfrm_user.c @@ -151,10 +151,16 @@ static int verify_newsa_info(struct xfrm_usersa_info *p, err = -EINVAL; switch (p->family) { case AF_INET: + if (p->sel.prefixlen_d > 32 || p->sel.prefixlen_s > 32) + goto out; + break;
case AF_INET6: #if IS_ENABLED(CONFIG_IPV6) + if (p->sel.prefixlen_d > 128 || p->sel.prefixlen_s > 128) + goto out; + break; #else err = -EAFNOSUPPORT; @@ -1359,10 +1365,16 @@ static int verify_newpolicy_info(struct xfrm_userpolicy_info *p)
switch (p->sel.family) { case AF_INET: + if (p->sel.prefixlen_d > 32 || p->sel.prefixlen_s > 32) + return -EINVAL; + break;
case AF_INET6: #if IS_ENABLED(CONFIG_IPV6) + if (p->sel.prefixlen_d > 128 || p->sel.prefixlen_s > 128) + return -EINVAL; + break; #else return -EAFNOSUPPORT;
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 215ab0f021c9fea3c18b75e7d522400ee6a49990 ]
After commit d6990976af7c5d8f55903bfb4289b6fb030bf754 ("vti6: fix PMTU caching and reporting on xmit"), some too big skbs might be potentially passed down to __xfrm6_output, causing it to fail to transmit but not free the skb, causing a leak of skb, and consequentially a leak of dst references.
After running pmtu.sh, that shows as failure to unregister devices in a namespace:
[ 311.397671] unregister_netdevice: waiting for veth_b to become free. Usage count = 1
The fix is to call kfree_skb in case of transmit failures.
Fixes: dd767856a36e ("xfrm6: Don't call icmpv6_send on local error") Signed-off-by: Thadeu Lima de Souza Cascardo cascardo@canonical.com Reviewed-by: Sabrina Dubroca sd@queasysnail.net Signed-off-by: Steffen Klassert steffen.klassert@secunet.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/ipv6/xfrm6_output.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/net/ipv6/xfrm6_output.c b/net/ipv6/xfrm6_output.c index 5959ce9620eb..6a74080005cf 100644 --- a/net/ipv6/xfrm6_output.c +++ b/net/ipv6/xfrm6_output.c @@ -170,9 +170,11 @@ static int __xfrm6_output(struct net *net, struct sock *sk, struct sk_buff *skb)
if (toobig && xfrm6_local_dontfrag(skb)) { xfrm6_local_rxpmtu(skb, mtu); + kfree_skb(skb); return -EMSGSIZE; } else if (!skb->ignore_df && toobig && skb->sk) { xfrm_local_error(skb, mtu); + kfree_skb(skb); return -EMSGSIZE; }
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit bfc0698bebcb16d19ecfc89574ad4d696955e5d3 ]
A policy may have been set up with multiple transforms (e.g., ESP and ipcomp). In this situation, the ingress IPsec processing iterates in xfrm_input() and applies each transform in turn, processing the nexthdr to find any additional xfrm that may apply.
This patch resets the transport header back to network header only after the last transformation so that subsequent xfrms can find the correct transport header.
Fixes: 7785bba299a8 ("esp: Add a software GRO codepath") Suggested-by: Steffen Klassert steffen.klassert@secunet.com Signed-off-by: Sowmini Varadhan sowmini.varadhan@oracle.com Signed-off-by: Steffen Klassert steffen.klassert@secunet.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/ipv4/xfrm4_input.c | 1 + net/ipv4/xfrm4_mode_transport.c | 4 +--- net/ipv6/xfrm6_input.c | 1 + net/ipv6/xfrm6_mode_transport.c | 4 +--- 4 files changed, 4 insertions(+), 6 deletions(-)
diff --git a/net/ipv4/xfrm4_input.c b/net/ipv4/xfrm4_input.c index bcfc00e88756..f8de2482a529 100644 --- a/net/ipv4/xfrm4_input.c +++ b/net/ipv4/xfrm4_input.c @@ -67,6 +67,7 @@ int xfrm4_transport_finish(struct sk_buff *skb, int async)
if (xo && (xo->flags & XFRM_GRO)) { skb_mac_header_rebuild(skb); + skb_reset_transport_header(skb); return 0; }
diff --git a/net/ipv4/xfrm4_mode_transport.c b/net/ipv4/xfrm4_mode_transport.c index 3d36644890bb..1ad2c2c4e250 100644 --- a/net/ipv4/xfrm4_mode_transport.c +++ b/net/ipv4/xfrm4_mode_transport.c @@ -46,7 +46,6 @@ static int xfrm4_transport_output(struct xfrm_state *x, struct sk_buff *skb) static int xfrm4_transport_input(struct xfrm_state *x, struct sk_buff *skb) { int ihl = skb->data - skb_transport_header(skb); - struct xfrm_offload *xo = xfrm_offload(skb);
if (skb->transport_header != skb->network_header) { memmove(skb_transport_header(skb), @@ -54,8 +53,7 @@ static int xfrm4_transport_input(struct xfrm_state *x, struct sk_buff *skb) skb->network_header = skb->transport_header; } ip_hdr(skb)->tot_len = htons(skb->len + ihl); - if (!xo || !(xo->flags & XFRM_GRO)) - skb_reset_transport_header(skb); + skb_reset_transport_header(skb); return 0; }
diff --git a/net/ipv6/xfrm6_input.c b/net/ipv6/xfrm6_input.c index 841f4a07438e..9ef490dddcea 100644 --- a/net/ipv6/xfrm6_input.c +++ b/net/ipv6/xfrm6_input.c @@ -59,6 +59,7 @@ int xfrm6_transport_finish(struct sk_buff *skb, int async)
if (xo && (xo->flags & XFRM_GRO)) { skb_mac_header_rebuild(skb); + skb_reset_transport_header(skb); return -1; }
diff --git a/net/ipv6/xfrm6_mode_transport.c b/net/ipv6/xfrm6_mode_transport.c index 9ad07a91708e..3c29da5defe6 100644 --- a/net/ipv6/xfrm6_mode_transport.c +++ b/net/ipv6/xfrm6_mode_transport.c @@ -51,7 +51,6 @@ static int xfrm6_transport_output(struct xfrm_state *x, struct sk_buff *skb) static int xfrm6_transport_input(struct xfrm_state *x, struct sk_buff *skb) { int ihl = skb->data - skb_transport_header(skb); - struct xfrm_offload *xo = xfrm_offload(skb);
if (skb->transport_header != skb->network_header) { memmove(skb_transport_header(skb), @@ -60,8 +59,7 @@ static int xfrm6_transport_input(struct xfrm_state *x, struct sk_buff *skb) } ipv6_hdr(skb)->payload_len = htons(skb->len + ihl - sizeof(struct ipv6hdr)); - if (!xo || !(xo->flags & XFRM_GRO)) - skb_reset_transport_header(skb); + skb_reset_transport_header(skb); return 0; }
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 782710e333a526780d65918d669cb96646983ba2 ]
We only support one offloaded xfrm (we do not have devices that can handle more than one offload), so reset crypto_done in xfrm_input() when iterating over multiple transforms in xfrm_input, so that we can invoke the appropriate x->type->input for the non-offloaded transforms
Fixes: d77e38e612a0 ("xfrm: Add an IPsec hardware offloading API") Signed-off-by: Sowmini Varadhan sowmini.varadhan@oracle.com Signed-off-by: Steffen Klassert steffen.klassert@secunet.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/xfrm/xfrm_input.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/net/xfrm/xfrm_input.c b/net/xfrm/xfrm_input.c index 352abca2605f..86f5afbd0a0c 100644 --- a/net/xfrm/xfrm_input.c +++ b/net/xfrm/xfrm_input.c @@ -453,6 +453,7 @@ resume: XFRM_INC_STATS(net, LINUX_MIB_XFRMINHDRERROR); goto drop; } + crypto_done = false; } while (!err);
err = xfrm_rcv_cb(skb, family, x->type->proto, 0);
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 8682250b3c1b75a45feb7452bc413d004cfe3778 ]
If a frame is dropped for any reason, mac80211 wouldn't report the TX status back to user space.
As the user space may rely on the TX_STATUS to kick its state machines, resends etc, it's better to just report this frame as not acked instead.
Signed-off-by: Andrei Otcheretianski andrei.otcheretianski@intel.com Signed-off-by: Luca Coelho luciano.coelho@intel.com Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/mac80211/status.c | 7 ++----- 1 file changed, 2 insertions(+), 5 deletions(-)
diff --git a/net/mac80211/status.c b/net/mac80211/status.c index 9a6d7208bf4f..001a869c059c 100644 --- a/net/mac80211/status.c +++ b/net/mac80211/status.c @@ -479,11 +479,6 @@ static void ieee80211_report_ack_skb(struct ieee80211_local *local, if (!skb) return;
- if (dropped) { - dev_kfree_skb_any(skb); - return; - } - if (info->flags & IEEE80211_TX_INTFL_NL80211_FRAME_TX) { u64 cookie = IEEE80211_SKB_CB(skb)->ack.cookie; struct ieee80211_sub_if_data *sdata; @@ -506,6 +501,8 @@ static void ieee80211_report_ack_skb(struct ieee80211_local *local, } rcu_read_unlock();
+ dev_kfree_skb_any(skb); + } else if (dropped) { dev_kfree_skb_any(skb); } else { /* consumes skb */
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 24f33e64fcd0d50a4b1a8e5b41bd0257aa66b0e8 ]
Core regulatory hints didn't set wiphy_idx to WIPHY_IDX_INVALID. Since the regulatory request is zeroed, wiphy_idx was always implicitly set to 0. This resulted in updating only phy #0. Fix that.
Fixes: 806a9e39670b ("cfg80211: make regulatory_request use wiphy_idx instead of wiphy") Signed-off-by: Andrei Otcheretianski andrei.otcheretianski@intel.com Signed-off-by: Luca Coelho luciano.coelho@intel.com [add fixes tag] Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/wireless/reg.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/net/wireless/reg.c b/net/wireless/reg.c index 2f702adf2912..765dedb12361 100644 --- a/net/wireless/reg.c +++ b/net/wireless/reg.c @@ -2867,6 +2867,7 @@ static int regulatory_hint_core(const char *alpha2) request->alpha2[0] = alpha2[0]; request->alpha2[1] = alpha2[1]; request->initiator = NL80211_REGDOM_SET_BY_CORE; + request->wiphy_idx = WIPHY_IDX_INVALID;
queue_regulatory_request(request);
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 6eae4a6c2be387fec41b0d2782c4fffb57159498 ]
In our environment running lots of mesh nodes, we are seeing the pending queue hang periodically, with the debugfs queues file showing lines such as:
00: 0x00000000/348
i.e. there are a large number of frames but no stop reason set.
One way this could happen is if queue processing from the pending tasklet exited early without processing all frames, and without having some future event (incoming frame, stop reason flag, ...) to reschedule it.
Exactly this can occur today if ieee80211_tx() returns false due to packet drops or power-save buffering in the tx handlers. In the past, this function would return true in such cases, and the change to false doesn't seem to be intentional. Fix this case by reverting to the previous behavior.
Fixes: bb42f2d13ffc ("mac80211: Move reorder-sensitive TX handlers to after TXQ dequeue") Signed-off-by: Bob Copeland bobcopeland@fb.com Acked-by: Toke Høiland-Jørgensen toke@toke.dk Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/mac80211/tx.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/mac80211/tx.c b/net/mac80211/tx.c index 9b3b069e418a..361f2f6cc839 100644 --- a/net/mac80211/tx.c +++ b/net/mac80211/tx.c @@ -1886,7 +1886,7 @@ static bool ieee80211_tx(struct ieee80211_sub_if_data *sdata, sdata->vif.hw_queue[skb_get_queue_mapping(skb)];
if (invoke_tx_handlers_early(&tx)) - return false; + return true;
if (ieee80211_queue_skb(local, sdata, tx.sta, tx.skb)) return true;
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 119f94a6fefcc76d47075b83d2b73d04c895df78 ]
cfg80211_get_bss_channel() is used to update the RX channel based on the available frame payload information (channel number from DSSS Parameter Set element or HT Operation element). This is needed on 2.4 GHz channels where frames may be received on neighboring channels due to overlapping frequency range.
This might of some use on the 5 GHz band in some corner cases, but things are more complex there since there is no n:1 or 1:n mapping between channel numbers and frequencies due to multiple different starting frequencies in different operating classes. This could result in ieee80211_channel_to_frequency() returning incorrect frequency and ieee80211_get_channel() returning incorrect channel information (or indication of no match). In the previous implementation, this could result in some scan results being dropped completely, e.g., for the 4.9 GHz channels. That prevented connection to such BSSs.
Fix this by using the driver-provided channel pointer if ieee80211_get_channel() does not find matching channel data for the channel number in the frame payload and if the scan is done with 5 MHz or 10 MHz channel bandwidth. While doing this, also add comments describing what the function is trying to achieve to make it easier to understand what happens here and why.
Signed-off-by: Jouni Malinen jouni@codeaurora.org Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/wireless/scan.c | 58 ++++++++++++++++++++++++++++++++++++++------- 1 file changed, 49 insertions(+), 9 deletions(-)
diff --git a/net/wireless/scan.c b/net/wireless/scan.c index d36c3eb7b931..d0e7472dd9fd 100644 --- a/net/wireless/scan.c +++ b/net/wireless/scan.c @@ -1058,13 +1058,23 @@ cfg80211_bss_update(struct cfg80211_registered_device *rdev, return NULL; }
+/* + * Update RX channel information based on the available frame payload + * information. This is mainly for the 2.4 GHz band where frames can be received + * from neighboring channels and the Beacon frames use the DSSS Parameter Set + * element to indicate the current (transmitting) channel, but this might also + * be needed on other bands if RX frequency does not match with the actual + * operating channel of a BSS. + */ static struct ieee80211_channel * cfg80211_get_bss_channel(struct wiphy *wiphy, const u8 *ie, size_t ielen, - struct ieee80211_channel *channel) + struct ieee80211_channel *channel, + enum nl80211_bss_scan_width scan_width) { const u8 *tmp; u32 freq; int channel_number = -1; + struct ieee80211_channel *alt_channel;
tmp = cfg80211_find_ie(WLAN_EID_DS_PARAMS, ie, ielen); if (tmp && tmp[1] == 1) { @@ -1078,16 +1088,45 @@ cfg80211_get_bss_channel(struct wiphy *wiphy, const u8 *ie, size_t ielen, } }
- if (channel_number < 0) + if (channel_number < 0) { + /* No channel information in frame payload */ return channel; + }
freq = ieee80211_channel_to_frequency(channel_number, channel->band); - channel = ieee80211_get_channel(wiphy, freq); - if (!channel) - return NULL; - if (channel->flags & IEEE80211_CHAN_DISABLED) + alt_channel = ieee80211_get_channel(wiphy, freq); + if (!alt_channel) { + if (channel->band == NL80211_BAND_2GHZ) { + /* + * Better not allow unexpected channels when that could + * be going beyond the 1-11 range (e.g., discovering + * BSS on channel 12 when radio is configured for + * channel 11. + */ + return NULL; + } + + /* No match for the payload channel number - ignore it */ + return channel; + } + + if (scan_width == NL80211_BSS_CHAN_WIDTH_10 || + scan_width == NL80211_BSS_CHAN_WIDTH_5) { + /* + * Ignore channel number in 5 and 10 MHz channels where there + * may not be an n:1 or 1:n mapping between frequencies and + * channel numbers. + */ + return channel; + } + + /* + * Use the channel determined through the payload channel number + * instead of the RX channel reported by the driver. + */ + if (alt_channel->flags & IEEE80211_CHAN_DISABLED) return NULL; - return channel; + return alt_channel; }
/* Returned bss is reference counted and must be cleaned up appropriately. */ @@ -1112,7 +1151,8 @@ cfg80211_inform_bss_data(struct wiphy *wiphy, (data->signal < 0 || data->signal > 100))) return NULL;
- channel = cfg80211_get_bss_channel(wiphy, ie, ielen, data->chan); + channel = cfg80211_get_bss_channel(wiphy, ie, ielen, data->chan, + data->scan_width); if (!channel) return NULL;
@@ -1210,7 +1250,7 @@ cfg80211_inform_bss_frame_data(struct wiphy *wiphy, return NULL;
channel = cfg80211_get_bss_channel(wiphy, mgmt->u.beacon.variable, - ielen, data->chan); + ielen, data->chan, data->scan_width); if (!channel) return NULL;
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit cb59bc14e830028d2244861216df038165d7625d ]
If the TDLS setup happens over a connection to an AP that doesn't have QoS, we nevertheless assign a non-zero TID (skb->priority) and queue mapping, which may confuse us or drivers later.
Fix it by just assigning the special skb->priority and then using ieee80211_select_queue() just like other data frames would go through.
Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/mac80211/tdls.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/net/mac80211/tdls.c b/net/mac80211/tdls.c index 5cd5e6e5834e..6c647f425e05 100644 --- a/net/mac80211/tdls.c +++ b/net/mac80211/tdls.c @@ -16,6 +16,7 @@ #include "ieee80211_i.h" #include "driver-ops.h" #include "rate.h" +#include "wme.h"
/* give usermode some time for retries in setting up the TDLS session */ #define TDLS_PEER_SETUP_TIMEOUT (15 * HZ) @@ -1010,14 +1011,13 @@ ieee80211_tdls_prep_mgmt_packet(struct wiphy *wiphy, struct net_device *dev, switch (action_code) { case WLAN_TDLS_SETUP_REQUEST: case WLAN_TDLS_SETUP_RESPONSE: - skb_set_queue_mapping(skb, IEEE80211_AC_BK); - skb->priority = 2; + skb->priority = 256 + 2; break; default: - skb_set_queue_mapping(skb, IEEE80211_AC_VI); - skb->priority = 5; + skb->priority = 256 + 5; break; } + skb_set_queue_mapping(skb, ieee80211_select_queue(sdata, skb));
/* * Set the WLAN_TDLS_TEARDOWN flag to indicate a teardown in progress.
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit c42055105785580563535e6d3143cad95c7ac7ee ]
TX status reporting to ieee80211s is through ieee80211s_update_metric. There are two problems about ieee80211s_update_metric:
1. The purpose is to estimate the fail probability to a specific link. No need to restrict to data frame.
2. Current implementation does not work if wireless driver does not pass tx_status with skb.
Fix this by removing ieee80211_is_data condition, passing ieee80211_tx_status directly to ieee80211s_update_metric, and putting it in both __ieee80211_tx_status and ieee80211_tx_status_ext.
Signed-off-by: Yuan-Chi Pang fu3mo6goo@gmail.com Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/mac80211/mesh.h | 3 ++- net/mac80211/mesh_hwmp.c | 9 +++------ net/mac80211/status.c | 4 +++- 3 files changed, 8 insertions(+), 8 deletions(-)
diff --git a/net/mac80211/mesh.h b/net/mac80211/mesh.h index ee56f18cad3f..21526630bf65 100644 --- a/net/mac80211/mesh.h +++ b/net/mac80211/mesh.h @@ -217,7 +217,8 @@ void mesh_rmc_free(struct ieee80211_sub_if_data *sdata); int mesh_rmc_init(struct ieee80211_sub_if_data *sdata); void ieee80211s_init(void); void ieee80211s_update_metric(struct ieee80211_local *local, - struct sta_info *sta, struct sk_buff *skb); + struct sta_info *sta, + struct ieee80211_tx_status *st); void ieee80211_mesh_init_sdata(struct ieee80211_sub_if_data *sdata); void ieee80211_mesh_teardown_sdata(struct ieee80211_sub_if_data *sdata); int ieee80211_start_mesh(struct ieee80211_sub_if_data *sdata); diff --git a/net/mac80211/mesh_hwmp.c b/net/mac80211/mesh_hwmp.c index daf9db3c8f24..6950cd0bf594 100644 --- a/net/mac80211/mesh_hwmp.c +++ b/net/mac80211/mesh_hwmp.c @@ -295,15 +295,12 @@ int mesh_path_error_tx(struct ieee80211_sub_if_data *sdata, }
void ieee80211s_update_metric(struct ieee80211_local *local, - struct sta_info *sta, struct sk_buff *skb) + struct sta_info *sta, + struct ieee80211_tx_status *st) { - struct ieee80211_tx_info *txinfo = IEEE80211_SKB_CB(skb); - struct ieee80211_hdr *hdr = (struct ieee80211_hdr *) skb->data; + struct ieee80211_tx_info *txinfo = st->info; int failed;
- if (!ieee80211_is_data(hdr->frame_control)) - return; - failed = !(txinfo->flags & IEEE80211_TX_STAT_ACK);
/* moving average, scaled to 100. diff --git a/net/mac80211/status.c b/net/mac80211/status.c index 001a869c059c..91d7c0cd1882 100644 --- a/net/mac80211/status.c +++ b/net/mac80211/status.c @@ -808,7 +808,7 @@ static void __ieee80211_tx_status(struct ieee80211_hw *hw,
rate_control_tx_status(local, sband, status); if (ieee80211_vif_is_mesh(&sta->sdata->vif)) - ieee80211s_update_metric(local, sta, skb); + ieee80211s_update_metric(local, sta, status);
if (!(info->flags & IEEE80211_TX_CTL_INJECTED) && acked) ieee80211_frame_acked(sta, skb); @@ -969,6 +969,8 @@ void ieee80211_tx_status_ext(struct ieee80211_hw *hw, }
rate_control_tx_status(local, sband, status); + if (ieee80211_vif_is_mesh(&sta->sdata->vif)) + ieee80211s_update_metric(local, sta, status); }
if (acked || noack_success) {
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 9e1437937807b0122e8da1ca8765be2adca9aee6 ]
Since commit 222d7dbd258d ("net: prevent dst uses after free") skb_dst_force() might clear the dst_entry attached to the skb. The xfrm code don't expect this to happen, so we crash with a NULL pointer dereference in this case. Fix it by checking skb_dst(skb) for NULL after skb_dst_force() and drop the packet in cast the dst_entry was cleared.
Fixes: 222d7dbd258d ("net: prevent dst uses after free") Reported-by: Tobias Hommel netdev-list@genoetigt.de Reported-by: Kristian Evensen kristian.evensen@gmail.com Reported-by: Wolfgang Walter linux@stwm.de Signed-off-by: Steffen Klassert steffen.klassert@secunet.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/xfrm/xfrm_output.c | 4 ++++ net/xfrm/xfrm_policy.c | 4 ++++ 2 files changed, 8 insertions(+)
diff --git a/net/xfrm/xfrm_output.c b/net/xfrm/xfrm_output.c index 89b178a78dc7..36d15a38ce5e 100644 --- a/net/xfrm/xfrm_output.c +++ b/net/xfrm/xfrm_output.c @@ -101,6 +101,10 @@ static int xfrm_output_one(struct sk_buff *skb, int err) spin_unlock_bh(&x->lock);
skb_dst_force(skb); + if (!skb_dst(skb)) { + XFRM_INC_STATS(net, LINUX_MIB_XFRMOUTERROR); + goto error_nolock; + }
if (xfrm_offload(skb)) { x->type_offload->encap(x, skb); diff --git a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c index a94983e03a8b..526e6814ed4b 100644 --- a/net/xfrm/xfrm_policy.c +++ b/net/xfrm/xfrm_policy.c @@ -2551,6 +2551,10 @@ int __xfrm_route_forward(struct sk_buff *skb, unsigned short family) }
skb_dst_force(skb); + if (!skb_dst(skb)) { + XFRM_INC_STATS(net, LINUX_MIB_XFRMFWDHDRERROR); + return 0; + }
dst = xfrm_lookup(net, skb_dst(skb), &fl, NULL, XFRM_LOOKUP_QUEUE); if (IS_ERR(dst)) {
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 3a58ac65e2d7969bcdf1b6acb70fa4d12a88e53e ]
IO_SPACE_LIMIT is the ending address of the PCI IO space, i.e something like 0xfffff (and not 0x100000).
Therefore, when offset = 0xf0000 is passed as argument, this function fails even though the offset + SZ_64K fits below the IO_SPACE_LIMIT. This makes the last chunk of 64 KB of the I/O space not usable as it cannot be mapped.
This patch fixes that by substracing 1 to offset + SZ_64K, so that we compare the addrss of the last byte of the I/O space against IO_SPACE_LIMIT instead of the address of the first byte of what is after the I/O space.
Fixes: c2794437091a4 ("ARM: Add fixed PCI i/o mapping") Signed-off-by: Thomas Petazzoni thomas.petazzoni@bootlin.com Acked-by: Nicolas Pitre nico@linaro.org Signed-off-by: Russell King rmk+kernel@armlinux.org.uk Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm/mm/ioremap.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm/mm/ioremap.c b/arch/arm/mm/ioremap.c index fc91205ff46c..5bf9443cfbaa 100644 --- a/arch/arm/mm/ioremap.c +++ b/arch/arm/mm/ioremap.c @@ -473,7 +473,7 @@ void pci_ioremap_set_mem_type(int mem_type)
int pci_ioremap_io(unsigned int offset, phys_addr_t phys_addr) { - BUG_ON(offset + SZ_64K > IO_SPACE_LIMIT); + BUG_ON(offset + SZ_64K - 1 > IO_SPACE_LIMIT);
return ioremap_page_range(PCI_IO_VIRT_BASE + offset, PCI_IO_VIRT_BASE + offset + SZ_64K,
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 32bf94fb5c2ec4ec842152d0e5937cd4bb6738fa ]
XFRM mode parameters passed as part of the user templates in the IP_XFRM_POLICY are never properly validated. Passing values other than valid XFRM modes can cause stack-out-of-bounds reads to occur later in the XFRM processing:
[ 140.535608] ================================================================ [ 140.543058] BUG: KASAN: stack-out-of-bounds in xfrm_state_find+0x17e4/0x1cc4 [ 140.550306] Read of size 4 at addr ffffffc0238a7a58 by task repro/5148 [ 140.557369] [ 140.558927] Call trace: [ 140.558936] dump_backtrace+0x0/0x388 [ 140.558940] show_stack+0x24/0x30 [ 140.558946] __dump_stack+0x24/0x2c [ 140.558949] dump_stack+0x8c/0xd0 [ 140.558956] print_address_description+0x74/0x234 [ 140.558960] kasan_report+0x240/0x264 [ 140.558963] __asan_report_load4_noabort+0x2c/0x38 [ 140.558967] xfrm_state_find+0x17e4/0x1cc4 [ 140.558971] xfrm_resolve_and_create_bundle+0x40c/0x1fb8 [ 140.558975] xfrm_lookup+0x238/0x1444 [ 140.558977] xfrm_lookup_route+0x48/0x11c [ 140.558984] ip_route_output_flow+0x88/0xc4 [ 140.558991] raw_sendmsg+0xa74/0x266c [ 140.558996] inet_sendmsg+0x258/0x3b0 [ 140.559002] sock_sendmsg+0xbc/0xec [ 140.559005] SyS_sendto+0x3a8/0x5a8 [ 140.559008] el0_svc_naked+0x34/0x38 [ 140.559009] [ 140.592245] page dumped because: kasan: bad access detected [ 140.597981] page_owner info is not active (free page?) [ 140.603267] [ 140.653503] ================================================================
Signed-off-by: Sean Tranchetti stranche@codeaurora.org Signed-off-by: Steffen Klassert steffen.klassert@secunet.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/xfrm/xfrm_user.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/net/xfrm/xfrm_user.c b/net/xfrm/xfrm_user.c index 5151b3ebf068..d0672c400c2f 100644 --- a/net/xfrm/xfrm_user.c +++ b/net/xfrm/xfrm_user.c @@ -1455,6 +1455,9 @@ static int validate_tmpl(int nr, struct xfrm_user_tmpl *ut, u16 family) (ut[i].family != prev_family)) return -EINVAL;
+ if (ut[i].mode >= XFRM_MODE_MAX) + return -EINVAL; + prev_family = ut[i].family;
switch (ut[i].family) {
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit d98627d1360d55e3b28f702caca8b6342c4a4e45 ]
Currently the check to see if the timeout has reached zero is incorrect and the check is instead checking if the timeout is non-zero and not zero, hence it will break out of the loop on the first iteration and the msleep is never executed. Fix this by breaking from the loop when timeout is zero.
Detected by CoverityScan, CID#1469404 ("Logically Dead Code")
Fixes: f0316f93897c ("drm/i2c: tda9950: add CEC driver") Signed-off-by: Colin Ian King colin.king@canonical.com Signed-off-by: Russell King rmk+kernel@armlinux.org.uk Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/i2c/tda9950.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i2c/tda9950.c b/drivers/gpu/drm/i2c/tda9950.c index 3f7396caad48..f2186409f0cf 100644 --- a/drivers/gpu/drm/i2c/tda9950.c +++ b/drivers/gpu/drm/i2c/tda9950.c @@ -307,7 +307,7 @@ static void tda9950_release(struct tda9950_priv *priv) /* Wait up to .5s for it to signal non-busy */ do { csr = tda9950_read(client, REG_CSR); - if (!(csr & CSR_BUSY) || --timeout) + if (!(csr & CSR_BUSY) || !--timeout) break; msleep(10); } while (1);
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit e0dccce1193f87597548d0db6ecc942fb92c04cd ]
The CEC_TX_STATUS_MAX_RETRIES should be set for errors only to prevent the CEC framework from retrying the transmit. If the transmit was successful, then don't set this flag.
Found by running 'cec-compliance -A' on a beaglebone box.
Signed-off-by: Hans Verkuil hans.verkuil@cisco.com Signed-off-by: Russell King rmk+kernel@armlinux.org.uk Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/i2c/tda9950.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i2c/tda9950.c b/drivers/gpu/drm/i2c/tda9950.c index f2186409f0cf..ccd355d0c123 100644 --- a/drivers/gpu/drm/i2c/tda9950.c +++ b/drivers/gpu/drm/i2c/tda9950.c @@ -188,7 +188,8 @@ static irqreturn_t tda9950_irq(int irq, void *data) break; } /* TDA9950 executes all retries for us */ - tx_status |= CEC_TX_STATUS_MAX_RETRIES; + if (tx_status != CEC_TX_STATUS_OK) + tx_status |= CEC_TX_STATUS_MAX_RETRIES; cec_transmit_done(priv->adap, tx_status, arb_lost_cnt, nack_cnt, 0, err_cnt); break;
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit a173f066c7cfc031acb8f541708041e009fc9812 ]
For starters, the bridge netfilter code registers operations that are invoked any time nh_hook is called. Specifically, ip_sabotage_in watches for nested calls for NF_INET_PRE_ROUTING when a bridge is in the stack.
Packet wise, the bridge netfilter hook runs first. br_nf_pre_routing allocates nf_bridge, sets in_prerouting to 1 and calls NF_HOOK for NF_INET_PRE_ROUTING. It's finish function, br_nf_pre_routing_finish, then resets in_prerouting flag to 0 and the packet continues up the stack. The packet eventually makes it to the VRF driver and it invokes nf_hook for NF_INET_PRE_ROUTING in case any rules have been added against the vrf device.
Because of the registered operations the call to nf_hook causes ip_sabotage_in to be invoked. That function sees the nf_bridge on the skb and that in_prerouting is not set. Thinking it is an invalid nested call it steals (drops) the packet.
Update ip_sabotage_in to recognize that the bridge or one of its upper devices (e.g., vlan) can be enslaved to a VRF (L3 master device) and allow the packet to go through the nf_hook a second time.
Fixes: 73e20b761acf ("net: vrf: Add support for PREROUTING rules on vrf device") Reported-by: D'Souza, Nelson ndsouza@ciena.com Signed-off-by: David Ahern dsahern@gmail.com Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/bridge/br_netfilter_hooks.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/net/bridge/br_netfilter_hooks.c b/net/bridge/br_netfilter_hooks.c index 9b16eaf33819..58240cc185e7 100644 --- a/net/bridge/br_netfilter_hooks.c +++ b/net/bridge/br_netfilter_hooks.c @@ -834,7 +834,8 @@ static unsigned int ip_sabotage_in(void *priv, struct sk_buff *skb, const struct nf_hook_state *state) { - if (skb->nf_bridge && !skb->nf_bridge->in_prerouting) { + if (skb->nf_bridge && !skb->nf_bridge->in_prerouting && + !netif_is_l3_master(skb->dev)) { state->okfn(state->net, state->sk, skb); return NF_STOLEN; }
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 346fa83d10934cf206e2fd0f514bf8ce186f08fe ]
sizeof(sizeof()) is quite strange and does not seem to be what is wanted here.
The issue is detected with the help of Coccinelle.
Fixes: 39215846740a ("netfilter: conntrack: remove nlattr_size pointer from l4proto trackers") Signed-off-by: zhong jiang zhongjiang@huawei.com Acked-by: Florian Westphal fw@strlen.de Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/netfilter/nf_conntrack_proto_tcp.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/net/netfilter/nf_conntrack_proto_tcp.c b/net/netfilter/nf_conntrack_proto_tcp.c index 8e67910185a0..1004fb5930de 100644 --- a/net/netfilter/nf_conntrack_proto_tcp.c +++ b/net/netfilter/nf_conntrack_proto_tcp.c @@ -1239,8 +1239,8 @@ static const struct nla_policy tcp_nla_policy[CTA_PROTOINFO_TCP_MAX+1] = { #define TCP_NLATTR_SIZE ( \ NLA_ALIGN(NLA_HDRLEN + 1) + \ NLA_ALIGN(NLA_HDRLEN + 1) + \ - NLA_ALIGN(NLA_HDRLEN + sizeof(sizeof(struct nf_ct_tcp_flags))) + \ - NLA_ALIGN(NLA_HDRLEN + sizeof(sizeof(struct nf_ct_tcp_flags)))) + NLA_ALIGN(NLA_HDRLEN + sizeof(struct nf_ct_tcp_flags)) + \ + NLA_ALIGN(NLA_HDRLEN + sizeof(struct nf_ct_tcp_flags)))
static int nlattr_to_tcp(struct nlattr *cda[], struct nf_conn *ct) {
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 469ed9d823b7d240d6b9574f061ded7c3834c167 ]
In the contiguous bit hugetlb break-before-make code we assume that all hugetlb pages are young.
In fact, remove_migration_pte is able to place an old hugetlb pte so this assumption is not valid.
This patch fixes the contiguous hugetlb scanning code to preserve young ptes.
Fixes: d8bdcff28764 ("arm64: hugetlb: Add break-before-make logic for contiguous entries") Signed-off-by: Steve Capper steve.capper@arm.com Signed-off-by: Will Deacon will.deacon@arm.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm64/mm/hugetlbpage.c | 12 +++++++++--- 1 file changed, 9 insertions(+), 3 deletions(-)
diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c index 192b3ba07075..f85be2f8b140 100644 --- a/arch/arm64/mm/hugetlbpage.c +++ b/arch/arm64/mm/hugetlbpage.c @@ -117,11 +117,14 @@ static pte_t get_clear_flush(struct mm_struct *mm,
/* * If HW_AFDBM is enabled, then the HW could turn on - * the dirty bit for any page in the set, so check - * them all. All hugetlb entries are already young. + * the dirty or accessed bit for any page in the set, + * so check them all. */ if (pte_dirty(pte)) orig_pte = pte_mkdirty(orig_pte); + + if (pte_young(pte)) + orig_pte = pte_mkyoung(orig_pte); }
if (valid) { @@ -340,10 +343,13 @@ int huge_ptep_set_access_flags(struct vm_area_struct *vma, if (!pte_same(orig_pte, pte)) changed = 1;
- /* Make sure we don't lose the dirty state */ + /* Make sure we don't lose the dirty or young state */ if (pte_dirty(orig_pte)) pte = pte_mkdirty(pte);
+ if (pte_young(orig_pte)) + pte = pte_mkyoung(pte); + hugeprot = pte_pgprot(pte); for (i = 0; i < ncontig; i++, ptep++, addr += pgsize, pfn += dpfn) set_pte_at(vma->vm_mm, addr, ptep, pfn_pte(pfn, hugeprot));
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 3ab97942d0213b6583a5408630a8cbbfbf54730f ]
A number of our interrupts were incorrectly specified, fix both the PPI and SPI interrupts to be correct.
Fixes: b5762cacc411 ("ARM: bcm63138: add NAND DT support") Fixes: 46d4bca0445a ("ARM: BCM63XX: add BCM63138 minimal Device Tree") Signed-off-by: Florian Fainelli f.fainelli@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm/boot/dts/bcm63138.dtsi | 14 ++++++++------ 1 file changed, 8 insertions(+), 6 deletions(-)
diff --git a/arch/arm/boot/dts/bcm63138.dtsi b/arch/arm/boot/dts/bcm63138.dtsi index 43ee992ccdcf..6df61518776f 100644 --- a/arch/arm/boot/dts/bcm63138.dtsi +++ b/arch/arm/boot/dts/bcm63138.dtsi @@ -106,21 +106,23 @@ global_timer: timer@1e200 { compatible = "arm,cortex-a9-global-timer"; reg = <0x1e200 0x20>; - interrupts = <GIC_PPI 11 IRQ_TYPE_LEVEL_HIGH>; + interrupts = <GIC_PPI 11 IRQ_TYPE_EDGE_RISING>; clocks = <&axi_clk>; };
local_timer: local-timer@1e600 { compatible = "arm,cortex-a9-twd-timer"; reg = <0x1e600 0x20>; - interrupts = <GIC_PPI 13 IRQ_TYPE_LEVEL_HIGH>; + interrupts = <GIC_PPI 13 (GIC_CPU_MASK_SIMPLE(2) | + IRQ_TYPE_EDGE_RISING)>; clocks = <&axi_clk>; };
twd_watchdog: watchdog@1e620 { compatible = "arm,cortex-a9-twd-wdt"; reg = <0x1e620 0x20>; - interrupts = <GIC_PPI 14 IRQ_TYPE_LEVEL_HIGH>; + interrupts = <GIC_PPI 14 (GIC_CPU_MASK_SIMPLE(2) | + IRQ_TYPE_LEVEL_HIGH)>; };
armpll: armpll { @@ -158,7 +160,7 @@ serial0: serial@600 { compatible = "brcm,bcm6345-uart"; reg = <0x600 0x1b>; - interrupts = <GIC_SPI 32 0>; + interrupts = <GIC_SPI 32 IRQ_TYPE_LEVEL_HIGH>; clocks = <&periph_clk>; clock-names = "periph"; status = "disabled"; @@ -167,7 +169,7 @@ serial1: serial@620 { compatible = "brcm,bcm6345-uart"; reg = <0x620 0x1b>; - interrupts = <GIC_SPI 33 0>; + interrupts = <GIC_SPI 33 IRQ_TYPE_LEVEL_HIGH>; clocks = <&periph_clk>; clock-names = "periph"; status = "disabled"; @@ -180,7 +182,7 @@ reg = <0x2000 0x600>, <0xf0 0x10>; reg-names = "nand", "nand-int-base"; status = "disabled"; - interrupts = <GIC_SPI 38 0>; + interrupts = <GIC_SPI 38 IRQ_TYPE_LEVEL_HIGH>; interrupt-names = "nand"; };
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit e1e5d8a9fe737d94ccc0ccbaf0c97f69a8f3e000 ]
Clear ADDR64 dma bit in DMACFG register in case that HW_DMA_CAP_64B is not detected on 64bit system. The issue was observed when bootloader(u-boot) does not check macb feature at DCFG6 register (DAW64_OFFSET) and enabling 64bit dma support by default. Then macb driver is reading DMACFG register back and only adding 64bit dma configuration but not cleaning it out.
Signed-off-by: Michal Simek michal.simek@xilinx.com Acked-by: Nicolas Ferre nicolas.ferre@microchip.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/cadence/macb_main.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/net/ethernet/cadence/macb_main.c b/drivers/net/ethernet/cadence/macb_main.c index dfa045f22ef1..db568232ff3e 100644 --- a/drivers/net/ethernet/cadence/macb_main.c +++ b/drivers/net/ethernet/cadence/macb_main.c @@ -2089,6 +2089,7 @@ static void macb_configure_dma(struct macb *bp) else dmacfg &= ~GEM_BIT(TXCOEN);
+ dmacfg &= ~GEM_BIT(ADDR64); #ifdef CONFIG_ARCH_DMA_ADDR_T_64BIT if (bp->hw_dma_cap & HW_DMA_CAP_64B) dmacfg |= GEM_BIT(ADDR64);
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 64e9e22e68512da8df3c9a7430f07621e48db3c2 ]
If the qman driver didn't probe, calling qman_alloc_fqid_range, qman_alloc_pool_range or qman_alloc_cgrid_range (as done in dpaa_eth) will pass a NULL pointer to gen_pool_alloc, leading to a NULL pointer dereference.
Signed-off-by: Alexandre Belloni alexandre.belloni@bootlin.com Reviewed-by: Roy Pledge roy.pledge@nxp.com Signed-off-by: Li Yang leoyang.li@nxp.com (cherry picked from commit f72487a2788aa70c3aee1d0ebd5470de9bac953a) Signed-off-by: Olof Johansson olof@lixom.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/soc/fsl/qbman/qman.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/soc/fsl/qbman/qman.c b/drivers/soc/fsl/qbman/qman.c index ecb22749df0b..8cc015183043 100644 --- a/drivers/soc/fsl/qbman/qman.c +++ b/drivers/soc/fsl/qbman/qman.c @@ -2729,6 +2729,9 @@ static int qman_alloc_range(struct gen_pool *p, u32 *result, u32 cnt) { unsigned long addr;
+ if (!p) + return -ENODEV; + addr = gen_pool_alloc(p, cnt); if (!addr) return -ENOMEM;
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 96fc74333f84cfdf8d434c6c07254e215e2aad00 ]
There is a copy and paste bug so we accidentally use the RX_ shift when we're in TX_ mode.
Fixes: bb8b2062aff3 ("fsl/qe: setup clock source for TDM mode") Signed-off-by: Dan Carpenter dan.carpenter@oracle.com Signed-off-by: Zhao Qiang qiang.zhao@nxp.com Signed-off-by: Li Yang leoyang.li@nxp.com (cherry picked from commit 3cb31b634052ed458922e0c8e2b4b093d7fb60b9) Signed-off-by: Olof Johansson olof@lixom.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/soc/fsl/qe/ucc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/soc/fsl/qe/ucc.c b/drivers/soc/fsl/qe/ucc.c index c646d8713861..681f7d4b7724 100644 --- a/drivers/soc/fsl/qe/ucc.c +++ b/drivers/soc/fsl/qe/ucc.c @@ -626,7 +626,7 @@ static u32 ucc_get_tdm_sync_shift(enum comm_dir mode, u32 tdm_num) { u32 shift;
- shift = (mode == COMM_DIR_RX) ? RX_SYNC_SHIFT_BASE : RX_SYNC_SHIFT_BASE; + shift = (mode == COMM_DIR_RX) ? RX_SYNC_SHIFT_BASE : TX_SYNC_SHIFT_BASE; shift -= tdm_num * 2;
return shift;
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 30fe6d50eb088783c8729c7d930f65296b2b3fa7 ]
Use array_index_nospec() to sanitize ridx with respect to speculation.
Signed-off-by: Masashi Honma masashi.honma@gmail.com Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/wireless/nl80211.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/net/wireless/nl80211.c b/net/wireless/nl80211.c index 733ccf867972..3b80cf012438 100644 --- a/net/wireless/nl80211.c +++ b/net/wireless/nl80211.c @@ -3699,6 +3699,7 @@ static bool ht_rateset_to_mask(struct ieee80211_supported_band *sband, return false;
/* check availability */ + ridx = array_index_nospec(ridx, IEEE80211_HT_MCS_MASK_LEN); if (sband->ht_cap.mcs.rx_mask[ridx] & rbit) mcs[ridx] |= rbit; else
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 628980e5c8f038f730582c6ee50b7410741cd96e ]
The cleanup of radios during namespace exit has recently been reworked to directly delete a radio while temporarily releasing the spinlock, fixing a race condition between the work-queue execution and namespace exits. However, the temporary unlock allows unsafe modifications on the iterated list, resulting in a potential crash when continuing the iteration of additional radios.
Move radios about to destroy to a temporary list, and clean that up after releasing the spinlock once iteration is complete.
Fixes: 8cfd36a0b53a ("mac80211_hwsim: fix use-after-free bug in hwsim_exit_net") Signed-off-by: Martin Willi martin@strongswan.org Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/mac80211_hwsim.c | 11 +++++++---- 1 file changed, 7 insertions(+), 4 deletions(-)
diff --git a/drivers/net/wireless/mac80211_hwsim.c b/drivers/net/wireless/mac80211_hwsim.c index 80e2c8595c7c..6b90bef58293 100644 --- a/drivers/net/wireless/mac80211_hwsim.c +++ b/drivers/net/wireless/mac80211_hwsim.c @@ -3523,6 +3523,7 @@ static __net_init int hwsim_init_net(struct net *net) static void __net_exit hwsim_exit_net(struct net *net) { struct mac80211_hwsim_data *data, *tmp; + LIST_HEAD(list);
spin_lock_bh(&hwsim_radio_lock); list_for_each_entry_safe(data, tmp, &hwsim_radios, list) { @@ -3533,17 +3534,19 @@ static void __net_exit hwsim_exit_net(struct net *net) if (data->netgroup == hwsim_net_get_netgroup(&init_net)) continue;
- list_del(&data->list); + list_move(&data->list, &list); rhashtable_remove_fast(&hwsim_radios_rht, &data->rht, hwsim_rht_params); hwsim_radios_generation++; - spin_unlock_bh(&hwsim_radio_lock); + } + spin_unlock_bh(&hwsim_radio_lock); + + list_for_each_entry_safe(data, tmp, &list, list) { + list_del(&data->list); mac80211_hwsim_del_radio(data, wiphy_name(data->hw->wiphy), NULL); - spin_lock_bh(&hwsim_radio_lock); } - spin_unlock_bh(&hwsim_radio_lock);
ida_simple_remove(&hwsim_netgroup_ida, hwsim_net_get_netgroup(net)); }
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit f1c47eb61d52379de5747d02bb36be20d7a2d0d3 ]
The asynchronous destruction from a work-queue of radios tagged with destroy-on-close may race with the owning namespace about to exit, resulting in potential use-after-free of that namespace.
Instead of using a work-queue, move radios about to destroy to a temporary list, which can be worked on synchronously after releasing the lock. This should be safe to do from the netlink socket notifier, as the namespace is guaranteed to not get released.
Signed-off-by: Martin Willi martin@strongswan.org Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/mac80211_hwsim.c | 22 +++++++++------------- 1 file changed, 9 insertions(+), 13 deletions(-)
diff --git a/drivers/net/wireless/mac80211_hwsim.c b/drivers/net/wireless/mac80211_hwsim.c index 6b90bef58293..cfd0c58aa02a 100644 --- a/drivers/net/wireless/mac80211_hwsim.c +++ b/drivers/net/wireless/mac80211_hwsim.c @@ -519,7 +519,6 @@ struct mac80211_hwsim_data { int channels, idx; bool use_chanctx; bool destroy_on_close; - struct work_struct destroy_work; u32 portid; char alpha2[2]; const struct ieee80211_regdomain *regd; @@ -3442,30 +3441,27 @@ static struct genl_family hwsim_genl_family __ro_after_init = { .n_mcgrps = ARRAY_SIZE(hwsim_mcgrps), };
-static void destroy_radio(struct work_struct *work) -{ - struct mac80211_hwsim_data *data = - container_of(work, struct mac80211_hwsim_data, destroy_work); - - hwsim_radios_generation++; - mac80211_hwsim_del_radio(data, wiphy_name(data->hw->wiphy), NULL); -} - static void remove_user_radios(u32 portid) { struct mac80211_hwsim_data *entry, *tmp; + LIST_HEAD(list);
spin_lock_bh(&hwsim_radio_lock); list_for_each_entry_safe(entry, tmp, &hwsim_radios, list) { if (entry->destroy_on_close && entry->portid == portid) { - list_del(&entry->list); + list_move(&entry->list, &list); rhashtable_remove_fast(&hwsim_radios_rht, &entry->rht, hwsim_rht_params); - INIT_WORK(&entry->destroy_work, destroy_radio); - queue_work(hwsim_wq, &entry->destroy_work); + hwsim_radios_generation++; } } spin_unlock_bh(&hwsim_radio_lock); + + list_for_each_entry_safe(entry, tmp, &list, list) { + list_del(&entry->list); + mac80211_hwsim_del_radio(entry, wiphy_name(entry->hw->wiphy), + NULL); + } }
static int mac80211_hwsim_netlink_notify(struct notifier_block *nb,
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 28ef8b49a338dc1844e86b7954cfffc7dfa2660a ]
The allocation of hwsim radio identifiers uses a post-increment from 0, so the first radio has idx 0. This idx is explicitly excluded from multicast announcements ever since, but it is unclear why.
Drop that idx check and announce the first radio as well. This makes userspace happy if it relies on these events.
Signed-off-by: Martin Willi martin@strongswan.org Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/mac80211_hwsim.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/net/wireless/mac80211_hwsim.c b/drivers/net/wireless/mac80211_hwsim.c index cfd0c58aa02a..58dd217811c8 100644 --- a/drivers/net/wireless/mac80211_hwsim.c +++ b/drivers/net/wireless/mac80211_hwsim.c @@ -2811,8 +2811,7 @@ static int mac80211_hwsim_new_radio(struct genl_info *info, hwsim_radios_generation++; spin_unlock_bh(&hwsim_radio_lock);
- if (idx > 0) - hwsim_mcast_new_radio(idx, info, param); + hwsim_mcast_new_radio(idx, info, param);
return idx;
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit cb28c306b93b71f2741ce1a5a66289db26715f4d ]
In case unpair_device() was called through mgmt interface at the same time when pairing was in progress, Bluetooth kernel module crash was seen.
[ 600.351225] general protection fault: 0000 [#1] SMP PTI [ 600.351235] CPU: 1 PID: 11096 Comm: btmgmt Tainted: G OE 4.19.0-rc1+ #1 [ 600.351238] Hardware name: Dell Inc. Latitude E5440/08RCYC, BIOS A18 05/14/2017 [ 600.351272] RIP: 0010:smp_chan_destroy.isra.10+0xce/0x2c0 [bluetooth] [ 600.351276] Code: c0 0f 84 b4 01 00 00 80 78 28 04 0f 84 53 01 00 00 4d 85 ed 0f 85 ab 00 00 00 48 8b 08 48 8b 50 08 be 10 00 00 00 48 89 51 08 <48> 89 0a 48 b9 00 02 00 00 00 00 ad de 48 89 48 08 48 8b 83 00 01 [ 600.351279] RSP: 0018:ffffa9be839b3b50 EFLAGS: 00010246 [ 600.351282] RAX: ffff9c999ac565a0 RBX: ffff9c9996e98c00 RCX: ffff9c999aa28b60 [ 600.351285] RDX: dead000000000200 RSI: 0000000000000010 RDI: ffff9c999e403500 [ 600.351287] RBP: ffffa9be839b3b70 R08: 0000000000000000 R09: ffffffff92a25c00 [ 600.351290] R10: ffffa9be839b3ae8 R11: 0000000000000001 R12: ffff9c995375b800 [ 600.351292] R13: 0000000000000000 R14: ffff9c99619a5000 R15: ffff9c9962a01c00 [ 600.351295] FS: 00007fb2be27c700(0000) GS:ffff9c999e880000(0000) knlGS:0000000000000000 [ 600.351298] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 600.351300] CR2: 00007fb2bdadbad0 CR3: 000000041c328001 CR4: 00000000001606e0 [ 600.351302] Call Trace: [ 600.351325] smp_failure+0x4f/0x70 [bluetooth] [ 600.351345] smp_cancel_pairing+0x74/0x80 [bluetooth] [ 600.351370] unpair_device+0x1c1/0x330 [bluetooth] [ 600.351399] hci_sock_sendmsg+0x960/0x9f0 [bluetooth] [ 600.351409] ? apparmor_socket_sendmsg+0x1e/0x20 [ 600.351417] sock_sendmsg+0x3e/0x50 [ 600.351422] sock_write_iter+0x85/0xf0 [ 600.351429] do_iter_readv_writev+0x12b/0x1b0 [ 600.351434] do_iter_write+0x87/0x1a0 [ 600.351439] vfs_writev+0x98/0x110 [ 600.351443] ? ep_poll+0x16d/0x3d0 [ 600.351447] ? ep_modify+0x73/0x170 [ 600.351451] do_writev+0x61/0xf0 [ 600.351455] ? do_writev+0x61/0xf0 [ 600.351460] __x64_sys_writev+0x1c/0x20 [ 600.351465] do_syscall_64+0x5a/0x110 [ 600.351471] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 600.351474] RIP: 0033:0x7fb2bdb62fe0 [ 600.351477] Code: 73 01 c3 48 8b 0d b8 6e 2c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 83 3d 69 c7 2c 00 00 75 10 b8 14 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 de 80 01 00 48 89 04 24 [ 600.351479] RSP: 002b:00007ffe062cb8f8 EFLAGS: 00000246 ORIG_RAX: 0000000000000014 [ 600.351484] RAX: ffffffffffffffda RBX: 000000000255b3d0 RCX: 00007fb2bdb62fe0 [ 600.351487] RDX: 0000000000000001 RSI: 00007ffe062cb920 RDI: 0000000000000004 [ 600.351490] RBP: 00007ffe062cb920 R08: 000000000255bd80 R09: 0000000000000000 [ 600.351494] R10: 0000000000000353 R11: 0000000000000246 R12: 0000000000000001 [ 600.351497] R13: 00007ffe062cbbe0 R14: 0000000000000000 R15: 0000000000000000 [ 600.351501] Modules linked in: algif_hash algif_skcipher af_alg cmac ipt_MASQUERADE nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo iptable_nat nf_nat_ipv4 xt_addrtype iptable_filter ip_tables xt_conntrack x_tables nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c br_netfilter bridge stp llc overlay arc4 nls_iso8859_1 dm_crypt intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp dell_laptop kvm_intel crct10dif_pclmul dell_smm_hwmon crc32_pclmul ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd cryptd glue_helper intel_cstate intel_rapl_perf uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_common videodev media hid_multitouch input_leds joydev serio_raw dell_wmi snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic dell_smbios dcdbas sparse_keymap [ 600.351569] snd_hda_intel btusb snd_hda_codec btrtl btbcm btintel snd_hda_core bluetooth(OE) snd_hwdep snd_pcm iwlmvm ecdh_generic wmi_bmof dell_wmi_descriptor snd_seq_midi mac80211 snd_seq_midi_event lpc_ich iwlwifi snd_rawmidi snd_seq snd_seq_device snd_timer cfg80211 snd soundcore mei_me mei dell_rbtn dell_smo8800 mac_hid parport_pc ppdev lp parport autofs4 hid_generic usbhid hid i915 nouveau kvmgt vfio_mdev mdev vfio_iommu_type1 vfio kvm irqbypass i2c_algo_bit ttm drm_kms_helper syscopyarea sysfillrect sysimgblt mxm_wmi psmouse ahci sdhci_pci cqhci libahci fb_sys_fops sdhci drm e1000e video wmi [ 600.351637] ---[ end trace e49e9f1df09c94fb ]--- [ 600.351664] RIP: 0010:smp_chan_destroy.isra.10+0xce/0x2c0 [bluetooth] [ 600.351666] Code: c0 0f 84 b4 01 00 00 80 78 28 04 0f 84 53 01 00 00 4d 85 ed 0f 85 ab 00 00 00 48 8b 08 48 8b 50 08 be 10 00 00 00 48 89 51 08 <48> 89 0a 48 b9 00 02 00 00 00 00 ad de 48 89 48 08 48 8b 83 00 01 [ 600.351669] RSP: 0018:ffffa9be839b3b50 EFLAGS: 00010246 [ 600.351672] RAX: ffff9c999ac565a0 RBX: ffff9c9996e98c00 RCX: ffff9c999aa28b60 [ 600.351674] RDX: dead000000000200 RSI: 0000000000000010 RDI: ffff9c999e403500 [ 600.351676] RBP: ffffa9be839b3b70 R08: 0000000000000000 R09: ffffffff92a25c00 [ 600.351679] R10: ffffa9be839b3ae8 R11: 0000000000000001 R12: ffff9c995375b800 [ 600.351681] R13: 0000000000000000 R14: ffff9c99619a5000 R15: ffff9c9962a01c00 [ 600.351684] FS: 00007fb2be27c700(0000) GS:ffff9c999e880000(0000) knlGS:0000000000000000 [ 600.351686] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 600.351689] CR2: 00007fb2bdadbad0 CR3: 000000041c328001 CR4: 00000000001606e0
Crash happened because list_del_rcu() was called twice for smp->ltk. This was possible if unpair_device was called right after ltk was generated but before keys were distributed.
In this commit smp_cancel_pairing was refactored to cancel pairing if it is in progress and otherwise just removes keys. Once keys are removed from rcu list, pointers to smp context's keys are set to NULL to make sure removed list items are not accessed later.
This commit also adjusts the functionality of mgmt unpair_device() little bit. Previously pairing was canceled only if pairing was in state that keys were already generated. With this commit unpair_device() cancels pairing already in earlier states.
Bug was found by fuzzing kernel SMP implementation using Synopsys Defensics.
Reported-by: Pekka Oikarainen pekka.oikarainen@synopsys.com Signed-off-by: Matias Karhumaa matias.karhumaa@gmail.com Signed-off-by: Johan Hedberg johan.hedberg@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/bluetooth/mgmt.c | 7 ++----- net/bluetooth/smp.c | 29 +++++++++++++++++++++++++---- net/bluetooth/smp.h | 3 ++- 3 files changed, 29 insertions(+), 10 deletions(-)
diff --git a/net/bluetooth/mgmt.c b/net/bluetooth/mgmt.c index 8a80d48d89c4..1b9984f653dd 100644 --- a/net/bluetooth/mgmt.c +++ b/net/bluetooth/mgmt.c @@ -2298,9 +2298,8 @@ static int unpair_device(struct sock *sk, struct hci_dev *hdev, void *data, /* LE address type */ addr_type = le_addr_type(cp->addr.type);
- hci_remove_irk(hdev, &cp->addr.bdaddr, addr_type); - - err = hci_remove_ltk(hdev, &cp->addr.bdaddr, addr_type); + /* Abort any ongoing SMP pairing. Removes ltk and irk if they exist. */ + err = smp_cancel_and_remove_pairing(hdev, &cp->addr.bdaddr, addr_type); if (err < 0) { err = mgmt_cmd_complete(sk, hdev->id, MGMT_OP_UNPAIR_DEVICE, MGMT_STATUS_NOT_PAIRED, &rp, @@ -2314,8 +2313,6 @@ static int unpair_device(struct sock *sk, struct hci_dev *hdev, void *data, goto done; }
- /* Abort any ongoing SMP pairing */ - smp_cancel_pairing(conn);
/* Defer clearing up the connection parameters until closing to * give a chance of keeping them if a repairing happens. diff --git a/net/bluetooth/smp.c b/net/bluetooth/smp.c index 3a7b0773536b..73f7211d0431 100644 --- a/net/bluetooth/smp.c +++ b/net/bluetooth/smp.c @@ -2422,30 +2422,51 @@ unlock: return ret; }
-void smp_cancel_pairing(struct hci_conn *hcon) +int smp_cancel_and_remove_pairing(struct hci_dev *hdev, bdaddr_t *bdaddr, + u8 addr_type) { - struct l2cap_conn *conn = hcon->l2cap_data; + struct hci_conn *hcon; + struct l2cap_conn *conn; struct l2cap_chan *chan; struct smp_chan *smp; + int err; + + err = hci_remove_ltk(hdev, bdaddr, addr_type); + hci_remove_irk(hdev, bdaddr, addr_type); + + hcon = hci_conn_hash_lookup_le(hdev, bdaddr, addr_type); + if (!hcon) + goto done;
+ conn = hcon->l2cap_data; if (!conn) - return; + goto done;
chan = conn->smp; if (!chan) - return; + goto done;
l2cap_chan_lock(chan);
smp = chan->data; if (smp) { + /* Set keys to NULL to make sure smp_failure() does not try to + * remove and free already invalidated rcu list entries. */ + smp->ltk = NULL; + smp->slave_ltk = NULL; + smp->remote_irk = NULL; + if (test_bit(SMP_FLAG_COMPLETE, &smp->flags)) smp_failure(conn, 0); else smp_failure(conn, SMP_UNSPECIFIED); + err = 0; }
l2cap_chan_unlock(chan); + +done: + return err; }
static int smp_cmd_encrypt_info(struct l2cap_conn *conn, struct sk_buff *skb) diff --git a/net/bluetooth/smp.h b/net/bluetooth/smp.h index 0ff6247eaa6c..121edadd5f8d 100644 --- a/net/bluetooth/smp.h +++ b/net/bluetooth/smp.h @@ -181,7 +181,8 @@ enum smp_key_pref { };
/* SMP Commands */ -void smp_cancel_pairing(struct hci_conn *hcon); +int smp_cancel_and_remove_pairing(struct hci_dev *hdev, bdaddr_t *bdaddr, + u8 addr_type); bool smp_sufficient_security(struct hci_conn *hcon, u8 sec_level, enum smp_key_pref key_pref); int smp_conn_security(struct hci_conn *hcon, __u8 sec_level);
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit d85536cde91fcfed6fb8d983783bd2b92c843939 ]
Add missing prepare/unprepare operations for fbi->clk, this fixes following kernel warning:
------------[ cut here ]------------ WARNING: CPU: 0 PID: 1 at drivers/clk/clk.c:874 clk_core_enable+0x2c/0x1b0 Enabling unprepared disp0_clk Modules linked in: CPU: 0 PID: 1 Comm: swapper Not tainted 4.18.0-rc8-00032-g02b43ddd4f21-dirty #25 Hardware name: Marvell MMP2 (Device Tree Support) [<c010f7cc>] (unwind_backtrace) from [<c010cc6c>] (show_stack+0x10/0x14) [<c010cc6c>] (show_stack) from [<c011dab4>] (__warn+0xd8/0xf0) [<c011dab4>] (__warn) from [<c011db10>] (warn_slowpath_fmt+0x44/0x6c) [<c011db10>] (warn_slowpath_fmt) from [<c043898c>] (clk_core_enable+0x2c/0x1b0) [<c043898c>] (clk_core_enable) from [<c0439ec8>] (clk_core_enable_lock+0x18/0x2c) [<c0439ec8>] (clk_core_enable_lock) from [<c0436698>] (pxa168fb_probe+0x464/0x6ac) [<c0436698>] (pxa168fb_probe) from [<c04779a0>] (platform_drv_probe+0x48/0x94) [<c04779a0>] (platform_drv_probe) from [<c0475bec>] (driver_probe_device+0x328/0x470) [<c0475bec>] (driver_probe_device) from [<c0475de4>] (__driver_attach+0xb0/0x124) [<c0475de4>] (__driver_attach) from [<c0473c38>] (bus_for_each_dev+0x64/0xa0) [<c0473c38>] (bus_for_each_dev) from [<c0474ee0>] (bus_add_driver+0x1b8/0x230) [<c0474ee0>] (bus_add_driver) from [<c0476a20>] (driver_register+0xac/0xf0) [<c0476a20>] (driver_register) from [<c0102dd4>] (do_one_initcall+0xb8/0x1f0) [<c0102dd4>] (do_one_initcall) from [<c0b010a0>] (kernel_init_freeable+0x294/0x2e0) [<c0b010a0>] (kernel_init_freeable) from [<c07e9eb8>] (kernel_init+0x8/0x10c) [<c07e9eb8>] (kernel_init) from [<c01010e8>] (ret_from_fork+0x14/0x2c) Exception stack(0xd008bfb0 to 0xd008bff8) bfa0: 00000000 00000000 00000000 00000000 bfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 bfe0: 00000000 00000000 00000000 00000000 00000013 00000000 ---[ end trace c0af40f9e2ed7cb4 ]---
Signed-off-by: Lubomir Rintel lkundrak@v3.sk [b.zolnierkie: enhance patch description a bit] Signed-off-by: Bartlomiej Zolnierkiewicz b.zolnierkie@samsung.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/video/fbdev/pxa168fb.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/video/fbdev/pxa168fb.c b/drivers/video/fbdev/pxa168fb.c index def3a501acd6..d059d04c63ac 100644 --- a/drivers/video/fbdev/pxa168fb.c +++ b/drivers/video/fbdev/pxa168fb.c @@ -712,7 +712,7 @@ static int pxa168fb_probe(struct platform_device *pdev) /* * enable controller clock */ - clk_enable(fbi->clk); + clk_prepare_enable(fbi->clk);
pxa168fb_set_par(info);
@@ -767,7 +767,7 @@ static int pxa168fb_probe(struct platform_device *pdev) failed_free_cmap: fb_dealloc_cmap(&info->cmap); failed_free_clk: - clk_disable(fbi->clk); + clk_disable_unprepare(fbi->clk); failed_free_fbmem: dma_free_coherent(fbi->dev, info->fix.smem_len, info->screen_base, fbi->fb_start_dma); @@ -807,7 +807,7 @@ static int pxa168fb_remove(struct platform_device *pdev) dma_free_wc(fbi->dev, PAGE_ALIGN(info->fix.smem_len), info->screen_base, info->fix.smem_start);
- clk_disable(fbi->clk); + clk_disable_unprepare(fbi->clk);
framebuffer_release(info);
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit a898fba32229efd5e6b6154f83fa86a7145156b9 ]
Clang warns when one enumerated type is implicitly converted to another.
drivers/net/ethernet/qlogic/qed/qed_sp_commands.c:163:25: warning: implicit conversion from enumeration type 'enum tunnel_clss' to different enumeration type 'enum qed_tunn_clss' [-Wenum-conversion] p_tun->vxlan.tun_cls = type; ~ ^~~~ drivers/net/ethernet/qlogic/qed/qed_sp_commands.c:165:26: warning: implicit conversion from enumeration type 'enum tunnel_clss' to different enumeration type 'enum qed_tunn_clss' [-Wenum-conversion] p_tun->l2_gre.tun_cls = type; ~ ^~~~ drivers/net/ethernet/qlogic/qed/qed_sp_commands.c:167:26: warning: implicit conversion from enumeration type 'enum tunnel_clss' to different enumeration type 'enum qed_tunn_clss' [-Wenum-conversion] p_tun->ip_gre.tun_cls = type; ~ ^~~~ drivers/net/ethernet/qlogic/qed/qed_sp_commands.c:169:29: warning: implicit conversion from enumeration type 'enum tunnel_clss' to different enumeration type 'enum qed_tunn_clss' [-Wenum-conversion] p_tun->l2_geneve.tun_cls = type; ~ ^~~~ drivers/net/ethernet/qlogic/qed/qed_sp_commands.c:171:29: warning: implicit conversion from enumeration type 'enum tunnel_clss' to different enumeration type 'enum qed_tunn_clss' [-Wenum-conversion] p_tun->ip_geneve.tun_cls = type; ~ ^~~~ 5 warnings generated.
Avoid this by changing type to an int.
Link: https://github.com/ClangBuiltLinux/linux/issues/125 Signed-off-by: Nathan Chancellor natechancellor@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/qlogic/qed/qed_sp_commands.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/qlogic/qed/qed_sp_commands.c b/drivers/net/ethernet/qlogic/qed/qed_sp_commands.c index 8de644b4721e..77b6248ad3b9 100644 --- a/drivers/net/ethernet/qlogic/qed/qed_sp_commands.c +++ b/drivers/net/ethernet/qlogic/qed/qed_sp_commands.c @@ -154,7 +154,7 @@ qed_set_pf_update_tunn_mode(struct qed_tunnel_info *p_tun, static void qed_set_tunn_cls_info(struct qed_tunnel_info *p_tun, struct qed_tunnel_info *p_src) { - enum tunnel_clss type; + int type;
p_tun->b_update_rx_cls = p_src->b_update_rx_cls; p_tun->b_update_tx_cls = p_src->b_update_tx_cls;
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit db803f36e56f23b5a2266807e190d1dc11554d54 ]
Clang complains when one enumerated type is implicitly converted to another.
drivers/net/ethernet/qlogic/qed/qed_vf.c:686:6: warning: implicit conversion from enumeration type 'enum qed_tunn_mode' to different enumeration type 'enum qed_tunn_clss' [-Wenum-conversion] QED_MODE_L2GENEVE_TUNN, ^~~~~~~~~~~~~~~~~~~~~~
Update mask's parameter to expect qed_tunn_mode, which is what was intended.
Link: https://github.com/ClangBuiltLinux/linux/issues/125 Signed-off-by: Nathan Chancellor natechancellor@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/qlogic/qed/qed_vf.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/qlogic/qed/qed_vf.c b/drivers/net/ethernet/qlogic/qed/qed_vf.c index be6ddde1a104..ac3f54bbe9b9 100644 --- a/drivers/net/ethernet/qlogic/qed/qed_vf.c +++ b/drivers/net/ethernet/qlogic/qed/qed_vf.c @@ -572,7 +572,7 @@ free_p_iov: static void __qed_vf_prep_tunn_req_tlv(struct vfpf_update_tunn_param_tlv *p_req, struct qed_tunn_update_type *p_src, - enum qed_tunn_clss mask, u8 *p_cls) + enum qed_tunn_mode mask, u8 *p_cls) { if (p_src->b_update_mode) { p_req->tun_mode_update_mask |= BIT(mask); @@ -587,7 +587,7 @@ __qed_vf_prep_tunn_req_tlv(struct vfpf_update_tunn_param_tlv *p_req, static void qed_vf_prep_tunn_req_tlv(struct vfpf_update_tunn_param_tlv *p_req, struct qed_tunn_update_type *p_src, - enum qed_tunn_clss mask, + enum qed_tunn_mode mask, u8 *p_cls, struct qed_tunn_update_udp_port *p_port, u8 *p_update_port, u16 *p_udp_port) {
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit d3a315795b4ce8b105a64a90699103121bde04a8 ]
Clang warns when one enumerated type is implicitly converted to another.
drivers/net/ethernet/qlogic/qed/qed_roce.c:153:12: warning: implicit conversion from enumeration type 'enum roce_mode' to different enumeration type 'enum roce_flavor' [-Wenum-conversion] flavor = ROCE_V2_IPV6; ~ ^~~~~~~~~~~~ drivers/net/ethernet/qlogic/qed/qed_roce.c:156:12: warning: implicit conversion from enumeration type 'enum roce_mode' to different enumeration type 'enum roce_flavor' [-Wenum-conversion] flavor = MAX_ROCE_MODE; ~ ^~~~~~~~~~~~~ 2 warnings generated.
Use the appropriate values from the expected type, roce_flavor:
ROCE_V2_IPV6 = RROCE_IPV6 = 2 MAX_ROCE_MODE = MAX_ROCE_FLAVOR = 3
While we're add it, ditch the local variable flavor, we can just return the value directly from the switch statement.
Link: https://github.com/ClangBuiltLinux/linux/issues/125 Signed-off-by: Nathan Chancellor natechancellor@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/qlogic/qed/qed_roce.c | 15 ++++----------- 1 file changed, 4 insertions(+), 11 deletions(-)
diff --git a/drivers/net/ethernet/qlogic/qed/qed_roce.c b/drivers/net/ethernet/qlogic/qed/qed_roce.c index b5ce1581645f..79424e6f0976 100644 --- a/drivers/net/ethernet/qlogic/qed/qed_roce.c +++ b/drivers/net/ethernet/qlogic/qed/qed_roce.c @@ -138,23 +138,16 @@ static void qed_rdma_copy_gids(struct qed_rdma_qp *qp, __le32 *src_gid,
static enum roce_flavor qed_roce_mode_to_flavor(enum roce_mode roce_mode) { - enum roce_flavor flavor; - switch (roce_mode) { case ROCE_V1: - flavor = PLAIN_ROCE; - break; + return PLAIN_ROCE; case ROCE_V2_IPV4: - flavor = RROCE_IPV4; - break; + return RROCE_IPV4; case ROCE_V2_IPV6: - flavor = ROCE_V2_IPV6; - break; + return RROCE_IPV6; default: - flavor = MAX_ROCE_MODE; - break; + return MAX_ROCE_FLAVOR; } - return flavor; }
void qed_roce_free_cid_pair(struct qed_hwfn *p_hwfn, u16 cid)
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 1c492a9d55ba99079210ed901dd8a5423f980487 ]
Clang warns when a constant is used in a boolean context as it thinks a bitwise operation may have been intended.
drivers/net/ethernet/qlogic/qed/qed_vf.c:415:27: warning: use of logical '&&' with constant operand [-Wconstant-logical-operand] if (!p_iov->b_pre_fp_hsi && ^ drivers/net/ethernet/qlogic/qed/qed_vf.c:415:27: note: use '&' for a bitwise operation if (!p_iov->b_pre_fp_hsi && ^~ & drivers/net/ethernet/qlogic/qed/qed_vf.c:415:27: note: remove constant to silence this warning if (!p_iov->b_pre_fp_hsi && ~^~ 1 warning generated.
This has been here since commit 1fe614d10f45 ("qed: Relax VF firmware requirements") and I am not entirely sure why since 0 isn't a special case. Just remove the statement causing Clang to warn since it isn't required.
Link: https://github.com/ClangBuiltLinux/linux/issues/126 Signed-off-by: Nathan Chancellor natechancellor@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/qlogic/qed/qed_vf.c | 1 - 1 file changed, 1 deletion(-)
diff --git a/drivers/net/ethernet/qlogic/qed/qed_vf.c b/drivers/net/ethernet/qlogic/qed/qed_vf.c index ac3f54bbe9b9..c4766e4ac485 100644 --- a/drivers/net/ethernet/qlogic/qed/qed_vf.c +++ b/drivers/net/ethernet/qlogic/qed/qed_vf.c @@ -413,7 +413,6 @@ static int qed_vf_pf_acquire(struct qed_hwfn *p_hwfn) }
if (!p_iov->b_pre_fp_hsi && - ETH_HSI_VER_MINOR && (resp->pfdev_info.minor_fp_hsi < ETH_HSI_VER_MINOR)) { DP_INFO(p_hwfn, "PF is using older fastpath HSI; %02x.%02x is configured\n",
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 77f2d753819b7d50c16abfb778caf1fe075faed0 ]
Clang warns when one enumerated type is implicitly converted to another.
drivers/net/ethernet/qlogic/qed/qed_iwarp.c:1713:25: warning: implicit conversion from enumeration type 'enum tcp_ip_version' to different enumeration type 'enum qed_tcp_ip_version' [-Wenum-conversion] cm_info->ip_version = TCP_IPV4; ~ ^~~~~~~~ drivers/net/ethernet/qlogic/qed/qed_iwarp.c:1733:25: warning: implicit conversion from enumeration type 'enum tcp_ip_version' to different enumeration type 'enum qed_tcp_ip_version' [-Wenum-conversion] cm_info->ip_version = TCP_IPV6; ~ ^~~~~~~~ 2 warnings generated.
Use the appropriate values from the expected type, qed_tcp_ip_version:
TCP_IPV4 = QED_TCP_IPV4 = 0 TCP_IPV6 = QED_TCP_IPV6 = 1
Link: https://github.com/ClangBuiltLinux/linux/issues/125 Signed-off-by: Nathan Chancellor natechancellor@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/qlogic/qed/qed_iwarp.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/qlogic/qed/qed_iwarp.c b/drivers/net/ethernet/qlogic/qed/qed_iwarp.c index 90a2b53096e2..51bbb0e5b514 100644 --- a/drivers/net/ethernet/qlogic/qed/qed_iwarp.c +++ b/drivers/net/ethernet/qlogic/qed/qed_iwarp.c @@ -1710,7 +1710,7 @@ qed_iwarp_parse_rx_pkt(struct qed_hwfn *p_hwfn,
cm_info->local_ip[0] = ntohl(iph->daddr); cm_info->remote_ip[0] = ntohl(iph->saddr); - cm_info->ip_version = TCP_IPV4; + cm_info->ip_version = QED_TCP_IPV4;
ip_hlen = (iph->ihl) * sizeof(u32); *payload_len = ntohs(iph->tot_len) - ip_hlen; @@ -1730,7 +1730,7 @@ qed_iwarp_parse_rx_pkt(struct qed_hwfn *p_hwfn, cm_info->remote_ip[i] = ntohl(ip6h->saddr.in6_u.u6_addr32[i]); } - cm_info->ip_version = TCP_IPV6; + cm_info->ip_version = QED_TCP_IPV6;
ip_hlen = sizeof(*ip6h); *payload_len = ntohs(ip6h->payload_len);
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 1222a16014888ed9733c11e221730d4a8196222b ]
Use array_index_nospec() to sanitize i with respect to speculation.
Note that the user doesn't control i directly, but can make it out of bounds by not finding a threshold in the array.
Signed-off-by: Masashi Honma masashi.honma@gmail.com [add note about user control, as explained by Masashi] Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/wireless/nl80211.c | 19 ++++++++++++++----- 1 file changed, 14 insertions(+), 5 deletions(-)
diff --git a/net/wireless/nl80211.c b/net/wireless/nl80211.c index 3b80cf012438..214f9ef79a64 100644 --- a/net/wireless/nl80211.c +++ b/net/wireless/nl80211.c @@ -10125,7 +10125,7 @@ static int cfg80211_cqm_rssi_update(struct cfg80211_registered_device *rdev, struct wireless_dev *wdev = dev->ieee80211_ptr; s32 last, low, high; u32 hyst; - int i, n; + int i, n, low_index; int err;
/* RSSI reporting disabled? */ @@ -10162,10 +10162,19 @@ static int cfg80211_cqm_rssi_update(struct cfg80211_registered_device *rdev, if (last < wdev->cqm_config->rssi_thresholds[i]) break;
- low = i > 0 ? - (wdev->cqm_config->rssi_thresholds[i - 1] - hyst) : S32_MIN; - high = i < n ? - (wdev->cqm_config->rssi_thresholds[i] + hyst - 1) : S32_MAX; + low_index = i - 1; + if (low_index >= 0) { + low_index = array_index_nospec(low_index, n); + low = wdev->cqm_config->rssi_thresholds[low_index] - hyst; + } else { + low = S32_MIN; + } + if (i < n) { + i = array_index_nospec(i, n); + high = wdev->cqm_config->rssi_thresholds[i] + hyst - 1; + } else { + high = S32_MAX; + }
return rdev_set_cqm_rssi_range_config(rdev, dev, low, high); }
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 3cc5746e5ad7688e274e193fa71278d98aa52759 ]
Fix kernel NULL pointer dereference,
Call Trace: [<ffffffff9b7658e6>] __mutex_lock_slowpath+0xa6/0x1d0 [<ffffffff9b764cef>] mutex_lock+0x1f/0x2f [<ffffffffc061b5e1>] qedi_get_protocol_tlv_data+0x61/0x450 [qedi] [<ffffffff9b1f9d8e>] ? map_vm_area+0x2e/0x40 [<ffffffff9b1fc370>] ? __vmalloc_node_range+0x170/0x280 [<ffffffffc0b81c3d>] ? qed_mfw_process_tlv_req+0x27d/0xbd0 [qed] [<ffffffffc0b6461b>] qed_mfw_fill_tlv_data+0x4b/0xb0 [qed] [<ffffffffc0b81c59>] qed_mfw_process_tlv_req+0x299/0xbd0 [qed] [<ffffffff9b02a59e>] ? __switch_to+0xce/0x580 [<ffffffffc0b61e5b>] qed_slowpath_task+0x5b/0x80 [qed]
Signed-off-by: Nilesh Javali nilesh.javali@cavium.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/qedi/qedi_main.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/scsi/qedi/qedi_main.c b/drivers/scsi/qedi/qedi_main.c index 3e18a68c2b03..054e66d93ed6 100644 --- a/drivers/scsi/qedi/qedi_main.c +++ b/drivers/scsi/qedi/qedi_main.c @@ -2472,6 +2472,7 @@ static int __qedi_probe(struct pci_dev *pdev, int mode) /* start qedi context */ spin_lock_init(&qedi->hba_lock); spin_lock_init(&qedi->task_idx_lock); + mutex_init(&qedi->stats_lock); } qedi_ops->ll2->register_cb_ops(qedi->cdev, &qedi_ll2_cb_ops, qedi); qedi_ops->ll2->start(qedi->cdev, ¶ms);
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit dc71db34e4f3c06b8277c8f3c2ff014610607a8c ]
There's a check in rxrpc_data_ready() that's checking the CLIENT_INITIATED flag in the packet type field rather than in the packet flags field.
Fix this by creating a pair of helper functions to check whether the packet is going to the client or to the server and use them generally.
Fixes: 248f219cb8bc ("rxrpc: Rewrite the data and ack handling code") Signed-off-by: David Howells dhowells@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/rxrpc/ar-internal.h | 10 ++++++++++ net/rxrpc/conn_object.c | 2 +- net/rxrpc/input.c | 12 ++++-------- 3 files changed, 15 insertions(+), 9 deletions(-)
diff --git a/net/rxrpc/ar-internal.h b/net/rxrpc/ar-internal.h index 707630ab4713..5069193d2cc1 100644 --- a/net/rxrpc/ar-internal.h +++ b/net/rxrpc/ar-internal.h @@ -462,6 +462,16 @@ struct rxrpc_connection { u8 out_clientflag; /* RXRPC_CLIENT_INITIATED if we are client */ };
+static inline bool rxrpc_to_server(const struct rxrpc_skb_priv *sp) +{ + return sp->hdr.flags & RXRPC_CLIENT_INITIATED; +} + +static inline bool rxrpc_to_client(const struct rxrpc_skb_priv *sp) +{ + return !rxrpc_to_server(sp); +} + /* * Flags in call->flags. */ diff --git a/net/rxrpc/conn_object.c b/net/rxrpc/conn_object.c index 4c77a78a252a..c37bf8e282b9 100644 --- a/net/rxrpc/conn_object.c +++ b/net/rxrpc/conn_object.c @@ -99,7 +99,7 @@ struct rxrpc_connection *rxrpc_find_connection_rcu(struct rxrpc_local *local, k.epoch = sp->hdr.epoch; k.cid = sp->hdr.cid & RXRPC_CIDMASK;
- if (sp->hdr.flags & RXRPC_CLIENT_INITIATED) { + if (rxrpc_to_server(sp)) { /* We need to look up service connections by the full protocol * parameter set. We look up the peer first as an intermediate * step and then the connection from the peer's tree. diff --git a/net/rxrpc/input.c b/net/rxrpc/input.c index 608d078a4981..338fbbf216a9 100644 --- a/net/rxrpc/input.c +++ b/net/rxrpc/input.c @@ -1171,10 +1171,6 @@ void rxrpc_data_ready(struct sock *udp_sk)
trace_rxrpc_rx_packet(sp);
- _net("Rx RxRPC %s ep=%x call=%x:%x", - sp->hdr.flags & RXRPC_CLIENT_INITIATED ? "ToServer" : "ToClient", - sp->hdr.epoch, sp->hdr.cid, sp->hdr.callNumber); - if (sp->hdr.type >= RXRPC_N_PACKET_TYPES || !((RXRPC_SUPPORTED_PACKET_TYPES >> sp->hdr.type) & 1)) { _proto("Rx Bad Packet Type %u", sp->hdr.type); @@ -1183,13 +1179,13 @@ void rxrpc_data_ready(struct sock *udp_sk)
switch (sp->hdr.type) { case RXRPC_PACKET_TYPE_VERSION: - if (!(sp->hdr.flags & RXRPC_CLIENT_INITIATED)) + if (rxrpc_to_client(sp)) goto discard; rxrpc_post_packet_to_local(local, skb); goto out;
case RXRPC_PACKET_TYPE_BUSY: - if (sp->hdr.flags & RXRPC_CLIENT_INITIATED) + if (rxrpc_to_server(sp)) goto discard; /* Fall through */
@@ -1269,7 +1265,7 @@ void rxrpc_data_ready(struct sock *udp_sk) call = rcu_dereference(chan->call);
if (sp->hdr.callNumber > chan->call_id) { - if (!(sp->hdr.flags & RXRPC_CLIENT_INITIATED)) { + if (rxrpc_to_client(sp)) { rcu_read_unlock(); goto reject_packet; } @@ -1292,7 +1288,7 @@ void rxrpc_data_ready(struct sock *udp_sk) }
if (!call || atomic_read(&call->usage) == 0) { - if (!(sp->hdr.type & RXRPC_CLIENT_INITIATED) || + if (rxrpc_to_client(sp) || sp->hdr.callNumber == 0 || sp->hdr.type != RXRPC_PACKET_TYPE_DATA) goto bad_message_unlock;
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit b604dd9883f783a94020d772e4fe03160f455372 ]
Fix RTT information gathering in AF_RXRPC by the following means:
(1) Enable Rx timestamping on the transport socket with SO_TIMESTAMPNS.
(2) If the sk_buff doesn't have a timestamp set when rxrpc_data_ready() collects it, set it at that point.
(3) Allow ACKs to be requested on the last packet of a client call, but not a service call. We need to be careful lest we undo:
bf7d620abf22c321208a4da4f435e7af52551a21 Author: David Howells dhowells@redhat.com Date: Thu Oct 6 08:11:51 2016 +0100 rxrpc: Don't request an ACK on the last DATA packet of a call's Tx phase
but that only really applies to service calls that we're handling, since the client side gets to send the final ACK (or not).
(4) When about to transmit an ACK or DATA packet, record the Tx timestamp before only; don't update the timestamp afterwards.
(5) Switch the ordering between recording the serial and recording the timestamp to always set the serial number first. The serial number shouldn't be seen referenced by an ACK packet until we've transmitted the packet bearing it - so in the Rx path, we don't need the timestamp until we've checked the serial number.
Fixes: cf1a6474f807 ("rxrpc: Add per-peer RTT tracker") Signed-off-by: David Howells dhowells@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/rxrpc/input.c | 8 ++++++-- net/rxrpc/local_object.c | 9 +++++++++ net/rxrpc/output.c | 31 ++++++++++++++++++------------- 3 files changed, 33 insertions(+), 15 deletions(-)
diff --git a/net/rxrpc/input.c b/net/rxrpc/input.c index 338fbbf216a9..f6027c875876 100644 --- a/net/rxrpc/input.c +++ b/net/rxrpc/input.c @@ -616,13 +616,14 @@ static void rxrpc_input_requested_ack(struct rxrpc_call *call, if (!skb) continue;
+ sent_at = skb->tstamp; + smp_rmb(); /* Read timestamp before serial. */ sp = rxrpc_skb(skb); if (sp->hdr.serial != orig_serial) continue; - smp_rmb(); - sent_at = skb->tstamp; goto found; } + return;
found: @@ -1137,6 +1138,9 @@ void rxrpc_data_ready(struct sock *udp_sk) return; }
+ if (skb->tstamp == 0) + skb->tstamp = ktime_get_real(); + rxrpc_new_skb(skb, rxrpc_skb_rx_received);
_net("recv skb %p", skb); diff --git a/net/rxrpc/local_object.c b/net/rxrpc/local_object.c index b493e6b62740..5d89ea5c1976 100644 --- a/net/rxrpc/local_object.c +++ b/net/rxrpc/local_object.c @@ -173,6 +173,15 @@ static int rxrpc_open_socket(struct rxrpc_local *local, struct net *net) _debug("setsockopt failed"); goto error; } + + /* We want receive timestamps. */ + opt = 1; + ret = kernel_setsockopt(local->socket, SOL_SOCKET, SO_TIMESTAMPNS, + (char *)&opt, sizeof(opt)); + if (ret < 0) { + _debug("setsockopt failed"); + goto error; + } break;
default: diff --git a/net/rxrpc/output.c b/net/rxrpc/output.c index 4774c8f5634d..6ac21bb2071d 100644 --- a/net/rxrpc/output.c +++ b/net/rxrpc/output.c @@ -124,7 +124,6 @@ int rxrpc_send_ack_packet(struct rxrpc_call *call, bool ping, struct kvec iov[2]; rxrpc_serial_t serial; rxrpc_seq_t hard_ack, top; - ktime_t now; size_t len, n; int ret; u8 reason; @@ -196,9 +195,7 @@ int rxrpc_send_ack_packet(struct rxrpc_call *call, bool ping, /* We need to stick a time in before we send the packet in case * the reply gets back before kernel_sendmsg() completes - but * asking UDP to send the packet can take a relatively long - * time, so we update the time after, on the assumption that - * the packet transmission is more likely to happen towards the - * end of the kernel_sendmsg() call. + * time. */ call->ping_time = ktime_get_real(); set_bit(RXRPC_CALL_PINGING, &call->flags); @@ -206,9 +203,6 @@ int rxrpc_send_ack_packet(struct rxrpc_call *call, bool ping, }
ret = kernel_sendmsg(conn->params.local->socket, &msg, iov, 2, len); - now = ktime_get_real(); - if (ping) - call->ping_time = now; conn->params.peer->last_tx_at = ktime_get_seconds(); if (ret < 0) trace_rxrpc_tx_fail(call->debug_id, serial, ret, @@ -357,8 +351,14 @@ int rxrpc_send_data_packet(struct rxrpc_call *call, struct sk_buff *skb,
/* If our RTT cache needs working on, request an ACK. Also request * ACKs if a DATA packet appears to have been lost. + * + * However, we mustn't request an ACK on the last reply packet of a + * service call, lest OpenAFS incorrectly send us an ACK with some + * soft-ACKs in it and then never follow up with a proper hard ACK. */ - if (!(sp->hdr.flags & RXRPC_LAST_PACKET) && + if ((!(sp->hdr.flags & RXRPC_LAST_PACKET) || + rxrpc_to_server(sp) + ) && (test_and_clear_bit(RXRPC_CALL_EV_ACK_LOST, &call->events) || retrans || call->cong_mode == RXRPC_CALL_SLOW_START || @@ -384,6 +384,11 @@ int rxrpc_send_data_packet(struct rxrpc_call *call, struct sk_buff *skb, goto send_fragmentable;
down_read(&conn->params.local->defrag_sem); + + sp->hdr.serial = serial; + smp_wmb(); /* Set serial before timestamp */ + skb->tstamp = ktime_get_real(); + /* send the packet by UDP * - returns -EMSGSIZE if UDP would have to fragment the packet * to go out of the interface @@ -404,12 +409,8 @@ done: trace_rxrpc_tx_data(call, sp->hdr.seq, serial, whdr.flags, retrans, lost); if (ret >= 0) { - ktime_t now = ktime_get_real(); - skb->tstamp = now; - smp_wmb(); - sp->hdr.serial = serial; if (whdr.flags & RXRPC_REQUEST_ACK) { - call->peer->rtt_last_req = now; + call->peer->rtt_last_req = skb->tstamp; trace_rxrpc_rtt_tx(call, rxrpc_rtt_tx_data, serial); if (call->peer->rtt_usage > 1) { unsigned long nowj = jiffies, ack_lost_at; @@ -448,6 +449,10 @@ send_fragmentable:
down_write(&conn->params.local->defrag_sem);
+ sp->hdr.serial = serial; + smp_wmb(); /* Set serial before timestamp */ + skb->tstamp = ktime_get_real(); + switch (conn->params.local->srx.transport.family) { case AF_INET: opt = IP_PMTUDISC_DONT;
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 37a675e768d7606fe8a53e0c459c9b53e121ac20 ]
It seems that enabling IPV6_RECVERR on an IPv6 socket doesn't also turn on IP_RECVERR, so neither local errors nor ICMP-transported remote errors from IPv4 peer addresses are returned to the AF_RXRPC protocol.
Make the sockopt setting code in rxrpc_open_socket() fall through from the AF_INET6 case to the AF_INET case to turn on all the AF_INET options too in the AF_INET6 case.
Fixes: f2aeed3a591f ("rxrpc: Fix error reception on AF_INET6 sockets") Signed-off-by: David Howells dhowells@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/rxrpc/local_object.c | 23 +++++++++++++---------- 1 file changed, 13 insertions(+), 10 deletions(-)
diff --git a/net/rxrpc/local_object.c b/net/rxrpc/local_object.c index 5d89ea5c1976..386dc1f20c73 100644 --- a/net/rxrpc/local_object.c +++ b/net/rxrpc/local_object.c @@ -135,10 +135,10 @@ static int rxrpc_open_socket(struct rxrpc_local *local, struct net *net) }
switch (local->srx.transport.family) { - case AF_INET: - /* we want to receive ICMP errors */ + case AF_INET6: + /* we want to receive ICMPv6 errors */ opt = 1; - ret = kernel_setsockopt(local->socket, SOL_IP, IP_RECVERR, + ret = kernel_setsockopt(local->socket, SOL_IPV6, IPV6_RECVERR, (char *) &opt, sizeof(opt)); if (ret < 0) { _debug("setsockopt failed"); @@ -146,19 +146,22 @@ static int rxrpc_open_socket(struct rxrpc_local *local, struct net *net) }
/* we want to set the don't fragment bit */ - opt = IP_PMTUDISC_DO; - ret = kernel_setsockopt(local->socket, SOL_IP, IP_MTU_DISCOVER, + opt = IPV6_PMTUDISC_DO; + ret = kernel_setsockopt(local->socket, SOL_IPV6, IPV6_MTU_DISCOVER, (char *) &opt, sizeof(opt)); if (ret < 0) { _debug("setsockopt failed"); goto error; } - break;
- case AF_INET6: + /* Fall through and set IPv4 options too otherwise we don't get + * errors from IPv4 packets sent through the IPv6 socket. + */ + + case AF_INET: /* we want to receive ICMP errors */ opt = 1; - ret = kernel_setsockopt(local->socket, SOL_IPV6, IPV6_RECVERR, + ret = kernel_setsockopt(local->socket, SOL_IP, IP_RECVERR, (char *) &opt, sizeof(opt)); if (ret < 0) { _debug("setsockopt failed"); @@ -166,8 +169,8 @@ static int rxrpc_open_socket(struct rxrpc_local *local, struct net *net) }
/* we want to set the don't fragment bit */ - opt = IPV6_PMTUDISC_DO; - ret = kernel_setsockopt(local->socket, SOL_IPV6, IPV6_MTU_DISCOVER, + opt = IP_PMTUDISC_DO; + ret = kernel_setsockopt(local->socket, SOL_IP, IP_MTU_DISCOVER, (char *) &opt, sizeof(opt)); if (ret < 0) { _debug("setsockopt failed");
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit f334430316e7fd37c4821ebec627e27714bb5d76 ]
Fix error distribution by immediately delivering the errors to all the affected calls rather than deferring them to a worker thread. The problem with the latter is that retries and things can happen in the meantime when we want to stop that sooner.
To this end:
(1) Stop the error distributor from removing calls from the error_targets list so that peer->lock isn't needed to synchronise against other adds and removals.
(2) Require the peer's error_targets list to be accessed with RCU, thereby avoiding the need to take peer->lock over distribution.
(3) Don't attempt to affect a call's state if it is already marked complete.
Signed-off-by: David Howells dhowells@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- include/trace/events/rxrpc.h | 4 +--- net/rxrpc/ar-internal.h | 5 ---- net/rxrpc/call_object.c | 2 +- net/rxrpc/conn_client.c | 4 ++-- net/rxrpc/conn_object.c | 2 +- net/rxrpc/peer_event.c | 46 +++++++++--------------------------- net/rxrpc/peer_object.c | 17 ------------- 7 files changed, 16 insertions(+), 64 deletions(-)
diff --git a/include/trace/events/rxrpc.h b/include/trace/events/rxrpc.h index 4fff00e9da8a..0a774b64fc29 100644 --- a/include/trace/events/rxrpc.h +++ b/include/trace/events/rxrpc.h @@ -56,7 +56,6 @@ enum rxrpc_peer_trace { rxrpc_peer_new, rxrpc_peer_processing, rxrpc_peer_put, - rxrpc_peer_queued_error, };
enum rxrpc_conn_trace { @@ -257,8 +256,7 @@ enum rxrpc_tx_fail_trace { EM(rxrpc_peer_got, "GOT") \ EM(rxrpc_peer_new, "NEW") \ EM(rxrpc_peer_processing, "PRO") \ - EM(rxrpc_peer_put, "PUT") \ - E_(rxrpc_peer_queued_error, "QER") + E_(rxrpc_peer_put, "PUT")
#define rxrpc_conn_traces \ EM(rxrpc_conn_got, "GOT") \ diff --git a/net/rxrpc/ar-internal.h b/net/rxrpc/ar-internal.h index 5069193d2cc1..4718d08c0af1 100644 --- a/net/rxrpc/ar-internal.h +++ b/net/rxrpc/ar-internal.h @@ -293,7 +293,6 @@ struct rxrpc_peer { struct hlist_node hash_link; struct rxrpc_local *local; struct hlist_head error_targets; /* targets for net error distribution */ - struct work_struct error_distributor; struct rb_root service_conns; /* Service connections */ struct list_head keepalive_link; /* Link in net->peer_keepalive[] */ time64_t last_tx_at; /* Last time packet sent here */ @@ -304,8 +303,6 @@ struct rxrpc_peer { unsigned int maxdata; /* data size (MTU - hdrsize) */ unsigned short hdrsize; /* header size (IP + UDP + RxRPC) */ int debug_id; /* debug ID for printks */ - int error_report; /* Net (+0) or local (+1000000) to distribute */ -#define RXRPC_LOCAL_ERROR_OFFSET 1000000 struct sockaddr_rxrpc srx; /* remote address */
/* calculated RTT cache */ @@ -1039,7 +1036,6 @@ void rxrpc_send_keepalive(struct rxrpc_peer *); * peer_event.c */ void rxrpc_error_report(struct sock *); -void rxrpc_peer_error_distributor(struct work_struct *); void rxrpc_peer_add_rtt(struct rxrpc_call *, enum rxrpc_rtt_rx_trace, rxrpc_serial_t, rxrpc_serial_t, ktime_t, ktime_t); void rxrpc_peer_keepalive_worker(struct work_struct *); @@ -1058,7 +1054,6 @@ void rxrpc_destroy_all_peers(struct rxrpc_net *); struct rxrpc_peer *rxrpc_get_peer(struct rxrpc_peer *); struct rxrpc_peer *rxrpc_get_peer_maybe(struct rxrpc_peer *); void rxrpc_put_peer(struct rxrpc_peer *); -void __rxrpc_queue_peer_error(struct rxrpc_peer *);
/* * proc.c diff --git a/net/rxrpc/call_object.c b/net/rxrpc/call_object.c index f6734d8cb01a..ed69257203c2 100644 --- a/net/rxrpc/call_object.c +++ b/net/rxrpc/call_object.c @@ -400,7 +400,7 @@ void rxrpc_incoming_call(struct rxrpc_sock *rx, rcu_assign_pointer(conn->channels[chan].call, call);
spin_lock(&conn->params.peer->lock); - hlist_add_head(&call->error_link, &conn->params.peer->error_targets); + hlist_add_head_rcu(&call->error_link, &conn->params.peer->error_targets); spin_unlock(&conn->params.peer->lock);
_net("CALL incoming %d on CONN %d", call->debug_id, call->conn->debug_id); diff --git a/net/rxrpc/conn_client.c b/net/rxrpc/conn_client.c index 5736f643c516..0be19132202b 100644 --- a/net/rxrpc/conn_client.c +++ b/net/rxrpc/conn_client.c @@ -709,8 +709,8 @@ int rxrpc_connect_call(struct rxrpc_call *call, }
spin_lock_bh(&call->conn->params.peer->lock); - hlist_add_head(&call->error_link, - &call->conn->params.peer->error_targets); + hlist_add_head_rcu(&call->error_link, + &call->conn->params.peer->error_targets); spin_unlock_bh(&call->conn->params.peer->lock);
out: diff --git a/net/rxrpc/conn_object.c b/net/rxrpc/conn_object.c index c37bf8e282b9..e0d6d0fb7426 100644 --- a/net/rxrpc/conn_object.c +++ b/net/rxrpc/conn_object.c @@ -214,7 +214,7 @@ void rxrpc_disconnect_call(struct rxrpc_call *call) call->peer->cong_cwnd = call->cong_cwnd;
spin_lock_bh(&conn->params.peer->lock); - hlist_del_init(&call->error_link); + hlist_del_rcu(&call->error_link); spin_unlock_bh(&conn->params.peer->lock);
if (rxrpc_is_client_call(call)) diff --git a/net/rxrpc/peer_event.c b/net/rxrpc/peer_event.c index 4f9da2f51c69..f3e6fc670da2 100644 --- a/net/rxrpc/peer_event.c +++ b/net/rxrpc/peer_event.c @@ -23,6 +23,8 @@ #include "ar-internal.h"
static void rxrpc_store_error(struct rxrpc_peer *, struct sock_exterr_skb *); +static void rxrpc_distribute_error(struct rxrpc_peer *, int, + enum rxrpc_call_completion);
/* * Find the peer associated with an ICMP packet. @@ -194,8 +196,6 @@ void rxrpc_error_report(struct sock *sk) rcu_read_unlock(); rxrpc_free_skb(skb, rxrpc_skb_rx_freed);
- /* The ref we obtained is passed off to the work item */ - __rxrpc_queue_peer_error(peer); _leave(""); }
@@ -205,6 +205,7 @@ void rxrpc_error_report(struct sock *sk) static void rxrpc_store_error(struct rxrpc_peer *peer, struct sock_exterr_skb *serr) { + enum rxrpc_call_completion compl = RXRPC_CALL_NETWORK_ERROR; struct sock_extended_err *ee; int err;
@@ -255,7 +256,7 @@ static void rxrpc_store_error(struct rxrpc_peer *peer, case SO_EE_ORIGIN_NONE: case SO_EE_ORIGIN_LOCAL: _proto("Rx Received local error { error=%d }", err); - err += RXRPC_LOCAL_ERROR_OFFSET; + compl = RXRPC_CALL_LOCAL_ERROR; break;
case SO_EE_ORIGIN_ICMP6: @@ -264,48 +265,23 @@ static void rxrpc_store_error(struct rxrpc_peer *peer, break; }
- peer->error_report = err; + rxrpc_distribute_error(peer, err, compl); }
/* - * Distribute an error that occurred on a peer + * Distribute an error that occurred on a peer. */ -void rxrpc_peer_error_distributor(struct work_struct *work) +static void rxrpc_distribute_error(struct rxrpc_peer *peer, int error, + enum rxrpc_call_completion compl) { - struct rxrpc_peer *peer = - container_of(work, struct rxrpc_peer, error_distributor); struct rxrpc_call *call; - enum rxrpc_call_completion compl; - int error; - - _enter(""); - - error = READ_ONCE(peer->error_report); - if (error < RXRPC_LOCAL_ERROR_OFFSET) { - compl = RXRPC_CALL_NETWORK_ERROR; - } else { - compl = RXRPC_CALL_LOCAL_ERROR; - error -= RXRPC_LOCAL_ERROR_OFFSET; - }
- _debug("ISSUE ERROR %s %d", rxrpc_call_completions[compl], error); - - spin_lock_bh(&peer->lock); - - while (!hlist_empty(&peer->error_targets)) { - call = hlist_entry(peer->error_targets.first, - struct rxrpc_call, error_link); - hlist_del_init(&call->error_link); + hlist_for_each_entry_rcu(call, &peer->error_targets, error_link) { rxrpc_see_call(call); - - if (rxrpc_set_call_completion(call, compl, 0, -error)) + if (call->state < RXRPC_CALL_COMPLETE && + rxrpc_set_call_completion(call, compl, 0, -error)) rxrpc_notify_socket(call); } - - spin_unlock_bh(&peer->lock); - - rxrpc_put_peer(peer); - _leave(""); }
/* diff --git a/net/rxrpc/peer_object.c b/net/rxrpc/peer_object.c index 24ec7cdcf332..ef4c2e8a35cc 100644 --- a/net/rxrpc/peer_object.c +++ b/net/rxrpc/peer_object.c @@ -222,8 +222,6 @@ struct rxrpc_peer *rxrpc_alloc_peer(struct rxrpc_local *local, gfp_t gfp) atomic_set(&peer->usage, 1); peer->local = local; INIT_HLIST_HEAD(&peer->error_targets); - INIT_WORK(&peer->error_distributor, - &rxrpc_peer_error_distributor); peer->service_conns = RB_ROOT; seqlock_init(&peer->service_conn_lock); spin_lock_init(&peer->lock); @@ -415,21 +413,6 @@ struct rxrpc_peer *rxrpc_get_peer_maybe(struct rxrpc_peer *peer) return peer; }
-/* - * Queue a peer record. This passes the caller's ref to the workqueue. - */ -void __rxrpc_queue_peer_error(struct rxrpc_peer *peer) -{ - const void *here = __builtin_return_address(0); - int n; - - n = atomic_read(&peer->usage); - if (rxrpc_queue_work(&peer->error_distributor)) - trace_rxrpc_peer(peer, rxrpc_peer_queued_error, n, here); - else - rxrpc_put_peer(peer); -} - /* * Discard a peer record. */
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit a13f814a67b12a2f29d1decf4b4f4e700658a517 ]
The nft_set_gc_batch_check() checks whether gc buffer is full. If gc buffer is full, gc buffer is released by the nft_set_gc_batch_complete() internally. In case of rbtree, the rb_erase() should be called before calling the nft_set_gc_batch_complete(). therefore the rb_erase() should be called before calling the nft_set_gc_batch_check() too.
test commands: table ip filter { set set1 { type ipv4_addr; flags interval, timeout; gc-interval 10s; timeout 1s; elements = { 1-2, 3-4, 5-6, ... 10000-10001, } } } %nft -f test.nft
splat looks like: [ 430.273885] kasan: GPF could be caused by NULL-ptr deref or user memory access [ 430.282158] general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI [ 430.283116] CPU: 1 PID: 190 Comm: kworker/1:2 Tainted: G B 4.18.0+ #7 [ 430.283116] Workqueue: events_power_efficient nft_rbtree_gc [nf_tables_set] [ 430.313559] RIP: 0010:rb_next+0x81/0x130 [ 430.313559] Code: 08 49 bd 00 00 00 00 00 fc ff df 48 bb 00 00 00 00 00 fc ff df 48 85 c0 75 05 eb 58 48 89 d4 [ 430.313559] RSP: 0018:ffff88010cdb7680 EFLAGS: 00010207 [ 430.313559] RAX: 0000000000b84854 RBX: dffffc0000000000 RCX: ffffffff83f01973 [ 430.313559] RDX: 000000000017090c RSI: 0000000000000008 RDI: 0000000000b84864 [ 430.313559] RBP: ffff8801060d4588 R08: fffffbfff09bc349 R09: fffffbfff09bc349 [ 430.313559] R10: 0000000000000001 R11: fffffbfff09bc348 R12: ffff880100f081a8 [ 430.313559] R13: dffffc0000000000 R14: ffff880100ff8688 R15: dffffc0000000000 [ 430.313559] FS: 0000000000000000(0000) GS:ffff88011b400000(0000) knlGS:0000000000000000 [ 430.313559] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 430.313559] CR2: 0000000001551008 CR3: 000000005dc16000 CR4: 00000000001006e0 [ 430.313559] Call Trace: [ 430.313559] nft_rbtree_gc+0x112/0x5c0 [nf_tables_set] [ 430.313559] process_one_work+0xc13/0x1ec0 [ 430.313559] ? _raw_spin_unlock_irq+0x29/0x40 [ 430.313559] ? pwq_dec_nr_in_flight+0x3c0/0x3c0 [ 430.313559] ? set_load_weight+0x270/0x270 [ 430.313559] ? __switch_to_asm+0x34/0x70 [ 430.313559] ? __switch_to_asm+0x40/0x70 [ 430.313559] ? __switch_to_asm+0x34/0x70 [ 430.313559] ? __switch_to_asm+0x34/0x70 [ 430.313559] ? __switch_to_asm+0x40/0x70 [ 430.313559] ? __switch_to_asm+0x34/0x70 [ 430.313559] ? __switch_to_asm+0x40/0x70 [ 430.313559] ? __switch_to_asm+0x34/0x70 [ 430.313559] ? __switch_to_asm+0x34/0x70 [ 430.313559] ? __switch_to_asm+0x40/0x70 [ 430.313559] ? __switch_to_asm+0x34/0x70 [ 430.313559] ? __schedule+0x6d3/0x1f50 [ 430.313559] ? find_held_lock+0x39/0x1c0 [ 430.313559] ? __sched_text_start+0x8/0x8 [ 430.313559] ? cyc2ns_read_end+0x10/0x10 [ 430.313559] ? save_trace+0x300/0x300 [ 430.313559] ? sched_clock_local+0xd4/0x140 [ 430.313559] ? find_held_lock+0x39/0x1c0 [ 430.313559] ? worker_thread+0x353/0x1120 [ 430.313559] ? worker_thread+0x353/0x1120 [ 430.313559] ? lock_contended+0xe70/0xe70 [ 430.313559] ? __lock_acquire+0x4500/0x4500 [ 430.535635] ? do_raw_spin_unlock+0xa5/0x330 [ 430.535635] ? do_raw_spin_trylock+0x101/0x1a0 [ 430.535635] ? do_raw_spin_lock+0x1f0/0x1f0 [ 430.535635] ? _raw_spin_lock_irq+0x10/0x70 [ 430.535635] worker_thread+0x15d/0x1120 [ ... ]
Fixes: 8d8540c4f5e0 ("netfilter: nft_set_rbtree: add timeout support") Signed-off-by: Taehee Yoo ap420073@gmail.com Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/netfilter/nft_set_rbtree.c | 28 ++++++++++++++-------------- 1 file changed, 14 insertions(+), 14 deletions(-)
diff --git a/net/netfilter/nft_set_rbtree.c b/net/netfilter/nft_set_rbtree.c index 9873d734b494..8ad78b82c8e2 100644 --- a/net/netfilter/nft_set_rbtree.c +++ b/net/netfilter/nft_set_rbtree.c @@ -355,12 +355,11 @@ cont:
static void nft_rbtree_gc(struct work_struct *work) { + struct nft_rbtree_elem *rbe, *rbe_end = NULL, *rbe_prev = NULL; struct nft_set_gc_batch *gcb = NULL; - struct rb_node *node, *prev = NULL; - struct nft_rbtree_elem *rbe; struct nft_rbtree *priv; + struct rb_node *node; struct nft_set *set; - int i;
priv = container_of(work, struct nft_rbtree, gc_work.work); set = nft_set_container_of(priv); @@ -371,7 +370,7 @@ static void nft_rbtree_gc(struct work_struct *work) rbe = rb_entry(node, struct nft_rbtree_elem, node);
if (nft_rbtree_interval_end(rbe)) { - prev = node; + rbe_end = rbe; continue; } if (!nft_set_elem_expired(&rbe->ext)) @@ -379,29 +378,30 @@ static void nft_rbtree_gc(struct work_struct *work) if (nft_set_elem_mark_busy(&rbe->ext)) continue;
+ if (rbe_prev) { + rb_erase(&rbe_prev->node, &priv->root); + rbe_prev = NULL; + } gcb = nft_set_gc_batch_check(set, gcb, GFP_ATOMIC); if (!gcb) break;
atomic_dec(&set->nelems); nft_set_gc_batch_add(gcb, rbe); + rbe_prev = rbe;
- if (prev) { - rbe = rb_entry(prev, struct nft_rbtree_elem, node); + if (rbe_end) { atomic_dec(&set->nelems); - nft_set_gc_batch_add(gcb, rbe); - prev = NULL; + nft_set_gc_batch_add(gcb, rbe_end); + rb_erase(&rbe_end->node, &priv->root); + rbe_end = NULL; } node = rb_next(node); if (!node) break; } - if (gcb) { - for (i = 0; i < gcb->head.cnt; i++) { - rbe = gcb->elems[i]; - rb_erase(&rbe->node, &priv->root); - } - } + if (rbe_prev) + rb_erase(&rbe_prev->node, &priv->root); write_seqcount_end(&priv->count); write_unlock_bh(&priv->lock);
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 421c119f558761556afca6a62ad183bc2d8659e0 ]
Unfortunately some versions of gcc emit following warning: $ make net/xfrm/xfrm_output.o linux/compiler.h:252:20: warning: array subscript is above array bounds [-Warray-bounds] hook_head = rcu_dereference(net->nf.hooks_arp[hook]); ^~~~~~~~~~~~~~~~~~~~~ xfrm_output_resume passes skb_dst(skb)->ops->family as its 'pf' arg so compiler can't know that we'll never access hooks_arp[]. (NFPROTO_IPV4 or NFPROTO_IPV6 are only possible cases).
Avoid this by adding an explicit WARN_ON_ONCE() check.
This patch has no effect if the family is a compile-time constant as gcc will remove the switch() construct entirely.
Reported-by: David Ahern dsahern@gmail.com Signed-off-by: Florian Westphal fw@strlen.de Reviewed-by: David Ahern dsahern@gmail.com Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/netfilter.h | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/include/linux/netfilter.h b/include/linux/netfilter.h index dd2052f0efb7..11b7b8ab0696 100644 --- a/include/linux/netfilter.h +++ b/include/linux/netfilter.h @@ -215,6 +215,8 @@ static inline int nf_hook(u_int8_t pf, unsigned int hook, struct net *net, break; case NFPROTO_ARP: #ifdef CONFIG_NETFILTER_FAMILY_ARP + if (WARN_ON_ONCE(hook >= ARRAY_SIZE(net->nf.hooks_arp))) + break; hook_head = rcu_dereference(net->nf.hooks_arp[hook]); #endif break;
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit c4ce446e33d7a0e978256ac6fea4c80e59d9de5f ]
The driver currently silently accepts unsupported Wake-on-LAN modes (other than WAKE_PHY or WAKE_MAGIC) without reporting that to the user, which is confusing.
Fixes: 2e55cc7210fe ("[PATCH] USB: usbnet (3/9) module for ASIX Ethernet adapters") Signed-off-by: Florian Fainelli f.fainelli@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/usb/asix_common.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/net/usb/asix_common.c b/drivers/net/usb/asix_common.c index e95dd12edec4..023b8d0bf175 100644 --- a/drivers/net/usb/asix_common.c +++ b/drivers/net/usb/asix_common.c @@ -607,6 +607,9 @@ int asix_set_wol(struct net_device *net, struct ethtool_wolinfo *wolinfo) struct usbnet *dev = netdev_priv(net); u8 opt = 0;
+ if (wolinfo->wolopts & ~(WAKE_PHY | WAKE_MAGIC)) + return -EINVAL; + if (wolinfo->wolopts & WAKE_PHY) opt |= AX_MONITOR_LINK; if (wolinfo->wolopts & WAKE_MAGIC)
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 5ba6b4aa9a410c5e2c6417df52b5e2118ea9b467 ]
The driver currently silently accepts unsupported Wake-on-LAN modes (other than WAKE_PHY or WAKE_MAGIC) without reporting that to the user, which is confusing.
Fixes: e2ca90c276e1 ("ax88179_178a: ASIX AX88179_178A USB 3.0/2.0 to gigabit ethernet adapter driver") Signed-off-by: Florian Fainelli f.fainelli@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/usb/ax88179_178a.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/net/usb/ax88179_178a.c b/drivers/net/usb/ax88179_178a.c index 9e8ad372f419..2207f7a7d1ff 100644 --- a/drivers/net/usb/ax88179_178a.c +++ b/drivers/net/usb/ax88179_178a.c @@ -566,6 +566,9 @@ ax88179_set_wol(struct net_device *net, struct ethtool_wolinfo *wolinfo) struct usbnet *dev = netdev_priv(net); u8 opt = 0;
+ if (wolinfo->wolopts & ~(WAKE_PHY | WAKE_MAGIC)) + return -EINVAL; + if (wolinfo->wolopts & WAKE_PHY) opt |= AX_MONITOR_MODE_RWLC; if (wolinfo->wolopts & WAKE_MAGIC)
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit eb9ad088f96653a26b340f7c447c44cf023d5cdc ]
The driver supports a fair amount of Wake-on-LAN modes, but is not checking that the user specified one that is supported.
Fixes: 55d7de9de6c3 ("Microchip's LAN7800 family USB 2/3 to 10/100/1000 Ethernet device driver") Signed-off-by: Florian Fainelli f.fainelli@gmail.com Reviewed-by: Woojung Huh Woojung.Huh@Microchip.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/usb/lan78xx.c | 17 ++++------------- 1 file changed, 4 insertions(+), 13 deletions(-)
diff --git a/drivers/net/usb/lan78xx.c b/drivers/net/usb/lan78xx.c index aeca484a75b8..2bb3a081ff10 100644 --- a/drivers/net/usb/lan78xx.c +++ b/drivers/net/usb/lan78xx.c @@ -1401,19 +1401,10 @@ static int lan78xx_set_wol(struct net_device *netdev, if (ret < 0) return ret;
- pdata->wol = 0; - if (wol->wolopts & WAKE_UCAST) - pdata->wol |= WAKE_UCAST; - if (wol->wolopts & WAKE_MCAST) - pdata->wol |= WAKE_MCAST; - if (wol->wolopts & WAKE_BCAST) - pdata->wol |= WAKE_BCAST; - if (wol->wolopts & WAKE_MAGIC) - pdata->wol |= WAKE_MAGIC; - if (wol->wolopts & WAKE_PHY) - pdata->wol |= WAKE_PHY; - if (wol->wolopts & WAKE_ARP) - pdata->wol |= WAKE_ARP; + if (wol->wolopts & ~WAKE_ALL) + return -EINVAL; + + pdata->wol = wol->wolopts;
device_set_wakeup_enable(&dev->udev->dev, (bool)wol->wolopts);
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit c5cb93e994ffb43b7b3b1ff10b9f928f54574a36 ]
The driver currently silently accepts unsupported Wake-on-LAN modes (other than WAKE_PHY or WAKE_MAGIC) without reporting that to the user, which is confusing.
Fixes: 19a38d8e0aa3 ("USB2NET : SR9800 : One chip USB2.0 USB2NET SR9800 Device Driver Support") Signed-off-by: Florian Fainelli f.fainelli@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/usb/sr9800.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/net/usb/sr9800.c b/drivers/net/usb/sr9800.c index 9277a0f228df..35f39f23d881 100644 --- a/drivers/net/usb/sr9800.c +++ b/drivers/net/usb/sr9800.c @@ -421,6 +421,9 @@ sr_set_wol(struct net_device *net, struct ethtool_wolinfo *wolinfo) struct usbnet *dev = netdev_priv(net); u8 opt = 0;
+ if (wolinfo->wolopts & ~(WAKE_PHY | WAKE_MAGIC)) + return -EINVAL; + if (wolinfo->wolopts & WAKE_PHY) opt |= SR_MONITOR_LINK; if (wolinfo->wolopts & WAKE_MAGIC)
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit f2750df1548bd8a2b060eb609fc43ca82811af4c ]
The driver does not check for Wake-on-LAN modes specified by an user, but will conditionally set the device as wake-up enabled or not based on that, which could be a very confusing user experience.
Fixes: 21ff2e8976b1 ("r8152: support WOL") Signed-off-by: Florian Fainelli f.fainelli@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/usb/r8152.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/net/usb/r8152.c b/drivers/net/usb/r8152.c index 1b07bb5e110d..9a55d75f7f10 100644 --- a/drivers/net/usb/r8152.c +++ b/drivers/net/usb/r8152.c @@ -4503,6 +4503,9 @@ static int rtl8152_set_wol(struct net_device *dev, struct ethtool_wolinfo *wol) if (!rtl_can_wakeup(tp)) return -EOPNOTSUPP;
+ if (wol->wolopts & ~WAKE_ANY) + return -EINVAL; + ret = usb_autopm_get_interface(tp->intf); if (ret < 0) goto out_set_wol;
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 9c734b2769a73eea2e9e9767c0e0bf839ff23679 ]
The driver does not check for Wake-on-LAN modes specified by an user, but will conditionally set the device as wake-up enabled or not based on that, which could be a very confusing user experience.
Fixes: 6c636503260d ("smsc75xx: add wol magic packet support") Signed-off-by: Florian Fainelli f.fainelli@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/usb/smsc75xx.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/net/usb/smsc75xx.c b/drivers/net/usb/smsc75xx.c index b64b1ee56d2d..ec287c9741e8 100644 --- a/drivers/net/usb/smsc75xx.c +++ b/drivers/net/usb/smsc75xx.c @@ -731,6 +731,9 @@ static int smsc75xx_ethtool_set_wol(struct net_device *net, struct smsc75xx_priv *pdata = (struct smsc75xx_priv *)(dev->data[0]); int ret;
+ if (wolinfo->wolopts & ~SUPPORTED_WAKE) + return -EINVAL; + pdata->wolopts = wolinfo->wolopts & SUPPORTED_WAKE;
ret = device_set_wakeup_enable(&dev->udev->dev, pdata->wolopts);
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit c530c471ba37bdd9fe1c7185b01455c00ae606fb ]
The driver does not check for Wake-on-LAN modes specified by an user, but will conditionally set the device as wake-up enabled or not based on that, which could be a very confusing user experience.
Fixes: e0e474a83c18 ("smsc95xx: add wol magic packet support") Signed-off-by: Florian Fainelli f.fainelli@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/usb/smsc95xx.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/net/usb/smsc95xx.c b/drivers/net/usb/smsc95xx.c index 06b4d290784d..262e7a3c23cb 100644 --- a/drivers/net/usb/smsc95xx.c +++ b/drivers/net/usb/smsc95xx.c @@ -774,6 +774,9 @@ static int smsc95xx_ethtool_set_wol(struct net_device *net, struct smsc95xx_priv *pdata = (struct smsc95xx_priv *)(dev->data[0]); int ret;
+ if (wolinfo->wolopts & ~SUPPORTED_WAKE) + return -EINVAL; + pdata->wolopts = wolinfo->wolopts & SUPPORTED_WAKE;
ret = device_set_wakeup_enable(&dev->udev->dev, pdata->wolopts);
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 1db58529454742f67ebd96e3588315e880b72837 ]
reg_process_hint_country_ie() can free regulatory_request and return REG_REQ_ALREADY_SET. We shouldn't use regulatory_request after it's called. KASAN error was observed when this happens.
BUG: KASAN: use-after-free in reg_process_hint+0x839/0x8aa [cfg80211] Read of size 4 at addr ffff8800c430d434 by task kworker/1:3/89 <snipped> Workqueue: events reg_todo [cfg80211] Call Trace: dump_stack+0xc1/0x10c ? _atomic_dec_and_lock+0x1ad/0x1ad ? _raw_spin_lock_irqsave+0xa0/0xd2 print_address_description+0x86/0x26f ? reg_process_hint+0x839/0x8aa [cfg80211] kasan_report+0x241/0x29b reg_process_hint+0x839/0x8aa [cfg80211] reg_todo+0x204/0x5b9 [cfg80211] process_one_work+0x55f/0x8d0 ? worker_detach_from_pool+0x1b5/0x1b5 ? _raw_spin_unlock_irq+0x65/0xdd ? _raw_spin_unlock_irqrestore+0xf3/0xf3 worker_thread+0x5dd/0x841 ? kthread_parkme+0x1d/0x1d kthread+0x270/0x285 ? pr_cont_work+0xe3/0xe3 ? rcu_read_unlock_sched_notrace+0xca/0xca ret_from_fork+0x22/0x40
Allocated by task 2718: set_track+0x63/0xfa __kmalloc+0x119/0x1ac regulatory_hint_country_ie+0x38/0x329 [cfg80211] __cfg80211_connect_result+0x854/0xadd [cfg80211] cfg80211_rx_assoc_resp+0x3bc/0x4f0 [cfg80211] smsc95xx v1.0.6 ieee80211_sta_rx_queued_mgmt+0x1803/0x7ed5 [mac80211] ieee80211_iface_work+0x411/0x696 [mac80211] process_one_work+0x55f/0x8d0 worker_thread+0x5dd/0x841 kthread+0x270/0x285 ret_from_fork+0x22/0x40
Freed by task 89: set_track+0x63/0xfa kasan_slab_free+0x6a/0x87 kfree+0xdc/0x470 reg_process_hint+0x31e/0x8aa [cfg80211] reg_todo+0x204/0x5b9 [cfg80211] process_one_work+0x55f/0x8d0 worker_thread+0x5dd/0x841 kthread+0x270/0x285 ret_from_fork+0x22/0x40 <snipped>
Signed-off-by: Yu Zhao yuzhao@google.com Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/wireless/reg.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/net/wireless/reg.c b/net/wireless/reg.c index 765dedb12361..24cfa2776f50 100644 --- a/net/wireless/reg.c +++ b/net/wireless/reg.c @@ -2661,11 +2661,12 @@ static void reg_process_hint(struct regulatory_request *reg_request) { struct wiphy *wiphy = NULL; enum reg_request_treatment treatment; + enum nl80211_reg_initiator initiator = reg_request->initiator;
if (reg_request->wiphy_idx != WIPHY_IDX_INVALID) wiphy = wiphy_idx_to_wiphy(reg_request->wiphy_idx);
- switch (reg_request->initiator) { + switch (initiator) { case NL80211_REGDOM_SET_BY_CORE: treatment = reg_process_hint_core(reg_request); break; @@ -2683,7 +2684,7 @@ static void reg_process_hint(struct regulatory_request *reg_request) treatment = reg_process_hint_country_ie(wiphy, reg_request); break; default: - WARN(1, "invalid initiator %d\n", reg_request->initiator); + WARN(1, "invalid initiator %d\n", initiator); goto out_free; }
@@ -2698,7 +2699,7 @@ static void reg_process_hint(struct regulatory_request *reg_request) */ if (treatment == REG_REQ_ALREADY_SET && wiphy && wiphy->regulatory_flags & REGULATORY_STRICT_REG) { - wiphy_update_regulatory(wiphy, reg_request->initiator); + wiphy_update_regulatory(wiphy, initiator); wiphy_all_share_dfs_chan_state(wiphy); reg_check_channels(); }
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 5f76f6f5ff96587af5acd5930f7d9fea81e0d1a8 ]
Before this commit, KVM exposes MPX VMX controls to L1 guest only based on if KVM and host processor supports MPX virtualization. However, these controls should be exposed to guest only in case guest vCPU supports MPX.
Without this change, a L1 guest running with kernel which don't have commit 691bd4340bef ("kvm: vmx: allow host to access guest MSR_IA32_BNDCFGS") asserts in QEMU on the following: qemu-kvm: error: failed to set MSR 0xd90 to 0x0 qemu-kvm: .../qemu-2.10.0/target/i386/kvm.c:1801 kvm_put_msrs: Assertion 'ret == cpu->kvm_msr_buf->nmsrs failed' This is because L1 KVM kvm_init_msr_list() will see that vmx_mpx_supported() (As it only checks MPX VMX controls support) and therefore KVM_GET_MSR_INDEX_LIST IOCTL will include MSR_IA32_BNDCFGS. However, later when L1 will attempt to set this MSR via KVM_SET_MSRS IOCTL, it will fail because !guest_cpuid_has_mpx(vcpu).
Therefore, fix the issue by exposing MPX VMX controls to L1 guest only when vCPU supports MPX.
Fixes: 36be0b9deb23 ("KVM: x86: Add nested virtualization support for MPX")
Reported-by: Eyal Moscovici eyal.moscovici@oracle.com Reviewed-by: Nikita Leshchenko nikita.leshchenko@oracle.com Reviewed-by: Darren Kenny darren.kenny@oracle.com Signed-off-by: Liran Alon liran.alon@oracle.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/kvm/vmx.c | 26 ++++++++++++++++++++------ 1 file changed, 20 insertions(+), 6 deletions(-)
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c index 32721ef9652d..ea691ddfc3aa 100644 --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -3395,9 +3395,6 @@ static void nested_vmx_setup_ctls_msrs(struct nested_vmx_msrs *msrs, bool apicv) VM_EXIT_LOAD_IA32_EFER | VM_EXIT_SAVE_IA32_EFER | VM_EXIT_SAVE_VMX_PREEMPTION_TIMER | VM_EXIT_ACK_INTR_ON_EXIT;
- if (kvm_mpx_supported()) - msrs->exit_ctls_high |= VM_EXIT_CLEAR_BNDCFGS; - /* We support free control of debug control saving. */ msrs->exit_ctls_low &= ~VM_EXIT_SAVE_DEBUG_CONTROLS;
@@ -3414,8 +3411,6 @@ static void nested_vmx_setup_ctls_msrs(struct nested_vmx_msrs *msrs, bool apicv) VM_ENTRY_LOAD_IA32_PAT; msrs->entry_ctls_high |= (VM_ENTRY_ALWAYSON_WITHOUT_TRUE_MSR | VM_ENTRY_LOAD_IA32_EFER); - if (kvm_mpx_supported()) - msrs->entry_ctls_high |= VM_ENTRY_LOAD_BNDCFGS;
/* We support free control of debug control loading. */ msrs->entry_ctls_low &= ~VM_ENTRY_LOAD_DEBUG_CONTROLS; @@ -10825,6 +10820,23 @@ static void nested_vmx_cr_fixed1_bits_update(struct kvm_vcpu *vcpu) #undef cr4_fixed1_update }
+static void nested_vmx_entry_exit_ctls_update(struct kvm_vcpu *vcpu) +{ + struct vcpu_vmx *vmx = to_vmx(vcpu); + + if (kvm_mpx_supported()) { + bool mpx_enabled = guest_cpuid_has(vcpu, X86_FEATURE_MPX); + + if (mpx_enabled) { + vmx->nested.msrs.entry_ctls_high |= VM_ENTRY_LOAD_BNDCFGS; + vmx->nested.msrs.exit_ctls_high |= VM_EXIT_CLEAR_BNDCFGS; + } else { + vmx->nested.msrs.entry_ctls_high &= ~VM_ENTRY_LOAD_BNDCFGS; + vmx->nested.msrs.exit_ctls_high &= ~VM_EXIT_CLEAR_BNDCFGS; + } + } +} + static void vmx_cpuid_update(struct kvm_vcpu *vcpu) { struct vcpu_vmx *vmx = to_vmx(vcpu); @@ -10841,8 +10853,10 @@ static void vmx_cpuid_update(struct kvm_vcpu *vcpu) to_vmx(vcpu)->msr_ia32_feature_control_valid_bits &= ~FEATURE_CONTROL_VMXON_ENABLED_OUTSIDE_SMX;
- if (nested_vmx_allowed(vcpu)) + if (nested_vmx_allowed(vcpu)) { nested_vmx_cr_fixed1_bits_update(vcpu); + nested_vmx_entry_exit_ctls_update(vcpu); + } }
static void vmx_set_supported_cpuid(u32 func, struct kvm_cpuid_entry2 *entry)
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 503234b3fdcaa578395c07e393ea3e5d13958824 ]
Commit a87036add092 ("KVM: x86: disable MPX if host did not enable MPX XSAVE features") introduced kvm_mpx_supported() to return true iff MPX is enabled in the host.
However, that commit seems to have missed replacing some calls to kvm_x86_ops->mpx_supported() to kvm_mpx_supported().
Complete original commit by replacing remaining calls to kvm_mpx_supported().
Fixes: a87036add092 ("KVM: x86: disable MPX if host did not enable MPX XSAVE features")
Suggested-by: Sean Christopherson sean.j.christopherson@intel.com Signed-off-by: Liran Alon liran.alon@oracle.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/kvm/vmx.c | 2 +- arch/x86/kvm/x86.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c index ea691ddfc3aa..2e23fce5eb1f 100644 --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -11567,7 +11567,7 @@ static void prepare_vmcs02_full(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12)
set_cr4_guest_host_mask(vmx);
- if (vmx_mpx_supported()) + if (kvm_mpx_supported()) vmcs_write64(GUEST_BNDCFGS, vmcs12->guest_bndcfgs);
if (enable_vpid) { diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 97fcac34e007..3cd58a5eb449 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -4625,7 +4625,7 @@ static void kvm_init_msr_list(void) */ switch (msrs_to_save[i]) { case MSR_IA32_BNDCFGS: - if (!kvm_x86_ops->mpx_supported()) + if (!kvm_mpx_supported()) continue; break; case MSR_TSC_AUX:
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 62cf9bd8118c4009f02c477ef78c723f49e53e16 ]
L2 IA32_BNDCFGS should be updated with vmcs12->guest_bndcfgs only when VM_ENTRY_LOAD_BNDCFGS is specified in vmcs12->vm_entry_controls.
Otherwise, L2 IA32_BNDCFGS should be set to vmcs01->guest_bndcfgs which is L1 IA32_BNDCFGS.
Reviewed-by: Nikita Leshchenko nikita.leshchenko@oracle.com Reviewed-by: Darren Kenny darren.kenny@oracle.com Signed-off-by: Liran Alon liran.alon@oracle.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/kvm/vmx.c | 13 +++++++++++-- 1 file changed, 11 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c index 2e23fce5eb1f..9efe130ea2e6 100644 --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -819,6 +819,7 @@ struct nested_vmx {
/* to migrate it to L2 if VM_ENTRY_LOAD_DEBUG_CONTROLS is off */ u64 vmcs01_debugctl; + u64 vmcs01_guest_bndcfgs;
u16 vpid02; u16 last_vpid; @@ -11567,8 +11568,13 @@ static void prepare_vmcs02_full(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12)
set_cr4_guest_host_mask(vmx);
- if (kvm_mpx_supported()) - vmcs_write64(GUEST_BNDCFGS, vmcs12->guest_bndcfgs); + if (kvm_mpx_supported()) { + if (vmx->nested.nested_run_pending && + (vmcs12->vm_entry_controls & VM_ENTRY_LOAD_BNDCFGS)) + vmcs_write64(GUEST_BNDCFGS, vmcs12->guest_bndcfgs); + else + vmcs_write64(GUEST_BNDCFGS, vmx->nested.vmcs01_guest_bndcfgs); + }
if (enable_vpid) { if (nested_cpu_has_vpid(vmcs12) && vmx->nested.vpid02) @@ -12082,6 +12088,9 @@ static int enter_vmx_non_root_mode(struct kvm_vcpu *vcpu)
if (!(vmcs12->vm_entry_controls & VM_ENTRY_LOAD_DEBUG_CONTROLS)) vmx->nested.vmcs01_debugctl = vmcs_read64(GUEST_IA32_DEBUGCTL); + if (kvm_mpx_supported() && + !(vmcs12->vm_entry_controls & VM_ENTRY_LOAD_BNDCFGS)) + vmx->nested.vmcs01_guest_bndcfgs = vmcs_read64(GUEST_BNDCFGS);
vmx_switch_vmcs(vcpu, &vmx->nested.vmcs02); vmx_segment_cache_clear(vmx);
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit a9f9772114c8b07ae75bcb3654bd017461248095 ]
When we unregister a PMU, we fail to serialize the @pmu_idr properly. Fix that by doing the entire thing under pmu_lock.
Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Cc: Alexander Shishkin alexander.shishkin@linux.intel.com Cc: Arnaldo Carvalho de Melo acme@redhat.com Cc: Jiri Olsa jolsa@redhat.com Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Peter Zijlstra peterz@infradead.org Cc: Stephane Eranian eranian@google.com Cc: Thomas Gleixner tglx@linutronix.de Cc: Vince Weaver vincent.weaver@maine.edu Fixes: 2e80a82a49c4 ("perf: Dynamic pmu types") Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/events/core.c | 9 ++------- 1 file changed, 2 insertions(+), 7 deletions(-)
diff --git a/kernel/events/core.c b/kernel/events/core.c index ae22d93701db..b1ed5e99d9c6 100644 --- a/kernel/events/core.c +++ b/kernel/events/core.c @@ -9436,9 +9436,7 @@ static void free_pmu_context(struct pmu *pmu) if (pmu->task_ctx_nr > perf_invalid_context) return;
- mutex_lock(&pmus_lock); free_percpu(pmu->pmu_cpu_context); - mutex_unlock(&pmus_lock); }
/* @@ -9694,12 +9692,8 @@ EXPORT_SYMBOL_GPL(perf_pmu_register);
void perf_pmu_unregister(struct pmu *pmu) { - int remove_device; - mutex_lock(&pmus_lock); - remove_device = pmu_bus_running; list_del_rcu(&pmu->entry); - mutex_unlock(&pmus_lock);
/* * We dereference the pmu list under both SRCU and regular RCU, so @@ -9711,13 +9705,14 @@ void perf_pmu_unregister(struct pmu *pmu) free_percpu(pmu->pmu_disable_count); if (pmu->type >= PERF_TYPE_MAX) idr_remove(&pmu_idr, pmu->type); - if (remove_device) { + if (pmu_bus_running) { if (pmu->nr_addr_filters) device_remove_file(pmu->dev, &dev_attr_nr_addr_filters); device_del(pmu->dev); put_device(pmu->dev); } free_pmu_context(pmu); + mutex_unlock(&pmus_lock); } EXPORT_SYMBOL_GPL(perf_pmu_unregister);
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 6265adb9726098b7f4f7ca70bc51992b25fdd9d6 ]
Physical package id 0 doesn't always exist, we should use boot_cpu_data.phys_proc_id here.
Signed-off-by: Masayoshi Mizuma m.mizuma@jp.fujitsu.com Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Cc: Alexander Shishkin alexander.shishkin@linux.intel.com Cc: Arnaldo Carvalho de Melo acme@redhat.com Cc: H. Peter Anvin hpa@zytor.com Cc: Jiri Olsa jolsa@redhat.com Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Masayoshi Mizuma msys.mizuma@gmail.com Cc: Peter Zijlstra peterz@infradead.org Cc: Stephane Eranian eranian@google.com Cc: Thomas Gleixner tglx@linutronix.de Cc: Vince Weaver vincent.weaver@maine.edu Link: http://lkml.kernel.org/r/20180910144750.6782-1-msys.mizuma@gmail.com Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/events/intel/uncore_snbep.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/events/intel/uncore_snbep.c b/arch/x86/events/intel/uncore_snbep.c index 51d7c117e3c7..53b981dcdb42 100644 --- a/arch/x86/events/intel/uncore_snbep.c +++ b/arch/x86/events/intel/uncore_snbep.c @@ -3061,7 +3061,7 @@ static struct event_constraint bdx_uncore_pcu_constraints[] = {
void bdx_uncore_cpu_init(void) { - int pkg = topology_phys_to_logical_pkg(0); + int pkg = topology_phys_to_logical_pkg(boot_cpu_data.phys_proc_id);
if (bdx_uncore_cbox.num_boxes > boot_cpu_data.x86_max_cores) bdx_uncore_cbox.num_boxes = boot_cpu_data.x86_max_cores;
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit cd6fb677ce7e460c25bdd66f689734102ec7d642 ]
Some of the scheduling tracepoints allow the perf_tp_event code to write to ring buffer under different cpu than the code is running on.
This results in corrupted ring buffer data demonstrated in following perf commands:
# perf record -e 'sched:sched_switch,sched:sched_wakeup' perf bench sched messaging # Running 'sched/messaging' benchmark: # 20 sender and receiver processes per group # 10 groups == 400 processes run
Total time: 0.383 [sec] [ perf record: Woken up 8 times to write data ] 0x42b890 [0]: failed to process type: -1765585640 [ perf record: Captured and wrote 4.825 MB perf.data (29669 samples) ]
# perf report --stdio 0x42b890 [0]: failed to process type: -1765585640
The reason for the corruption are some of the scheduling tracepoints, that have __perf_task dfined and thus allow to store data to another cpu ring buffer:
sched_waking sched_wakeup sched_wakeup_new sched_stat_wait sched_stat_sleep sched_stat_iowait sched_stat_blocked
The perf_tp_event function first store samples for current cpu related events defined for tracepoint:
hlist_for_each_entry_rcu(event, head, hlist_entry) perf_swevent_event(event, count, &data, regs);
And then iterates events of the 'task' and store the sample for any task's event that passes tracepoint checks:
ctx = rcu_dereference(task->perf_event_ctxp[perf_sw_context]);
list_for_each_entry_rcu(event, &ctx->event_list, event_entry) { if (event->attr.type != PERF_TYPE_TRACEPOINT) continue; if (event->attr.config != entry->type) continue;
perf_swevent_event(event, count, &data, regs); }
Above code can race with same code running on another cpu, ending up with 2 cpus trying to store under the same ring buffer, which is specifically not allowed.
This patch prevents the problem, by allowing only events with the same current cpu to receive the event.
NOTE: this requires the use of (per-task-)per-cpu buffers for this feature to work; perf-record does this.
Signed-off-by: Jiri Olsa jolsa@kernel.org [peterz: small edits to Changelog] Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Cc: Alexander Shishkin alexander.shishkin@linux.intel.com Cc: Andrew Vagin avagin@openvz.org Cc: Arnaldo Carvalho de Melo acme@kernel.org Cc: Arnaldo Carvalho de Melo acme@redhat.com Cc: Jiri Olsa jolsa@redhat.com Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Namhyung Kim namhyung@kernel.org Cc: Peter Zijlstra peterz@infradead.org Cc: Stephane Eranian eranian@google.com Cc: Thomas Gleixner tglx@linutronix.de Cc: Vince Weaver vincent.weaver@maine.edu Fixes: e6dab5ffab59 ("perf/trace: Add ability to set a target task for events") Link: http://lkml.kernel.org/r/20180923161343.GB15054@krava Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/events/core.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/kernel/events/core.c b/kernel/events/core.c index b1ed5e99d9c6..fc072b7f839d 100644 --- a/kernel/events/core.c +++ b/kernel/events/core.c @@ -8319,6 +8319,8 @@ void perf_tp_event(u16 event_type, u64 count, void *record, int entry_size, goto unlock;
list_for_each_entry_rcu(event, &ctx->event_list, event_entry) { + if (event->cpu != smp_processor_id()) + continue; if (event->attr.type != PERF_TYPE_TRACEPOINT) continue; if (event->attr.config != entry->type)
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 9d92cfeaf5215158d26d2991be7f7ff865cb98f3 ]
The counters on M3UPI Link 0 and Link 3 don't count properly, and writing 0 to these counters may causes system crash on some machines.
The PCI BDF addresses of the M3UPI in the current code are incorrect.
The correct addresses should be:
D18:F1 0x204D D18:F2 0x204E D18:F5 0x204D
Signed-off-by: Kan Liang kan.liang@linux.intel.com Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Cc: Alexander Shishkin alexander.shishkin@linux.intel.com Cc: Arnaldo Carvalho de Melo acme@redhat.com Cc: Jiri Olsa jolsa@redhat.com Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Peter Zijlstra peterz@infradead.org Cc: Stephane Eranian eranian@google.com Cc: Thomas Gleixner tglx@linutronix.de Cc: Vince Weaver vincent.weaver@maine.edu Fixes: cd34cd97b7b4 ("perf/x86/intel/uncore: Add Skylake server uncore support") Link: http://lkml.kernel.org/r/1537538826-55489-1-git-send-email-kan.liang@linux.i... Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/events/intel/uncore_snbep.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/arch/x86/events/intel/uncore_snbep.c b/arch/x86/events/intel/uncore_snbep.c index 53b981dcdb42..c07bee31abe8 100644 --- a/arch/x86/events/intel/uncore_snbep.c +++ b/arch/x86/events/intel/uncore_snbep.c @@ -3931,16 +3931,16 @@ static const struct pci_device_id skx_uncore_pci_ids[] = { .driver_data = UNCORE_PCI_DEV_FULL_DATA(21, 5, SKX_PCI_UNCORE_M2PCIE, 3), }, { /* M3UPI0 Link 0 */ - PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0x204C), - .driver_data = UNCORE_PCI_DEV_FULL_DATA(18, 0, SKX_PCI_UNCORE_M3UPI, 0), + PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0x204D), + .driver_data = UNCORE_PCI_DEV_FULL_DATA(18, 1, SKX_PCI_UNCORE_M3UPI, 0), }, { /* M3UPI0 Link 1 */ - PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0x204D), - .driver_data = UNCORE_PCI_DEV_FULL_DATA(18, 1, SKX_PCI_UNCORE_M3UPI, 1), + PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0x204E), + .driver_data = UNCORE_PCI_DEV_FULL_DATA(18, 2, SKX_PCI_UNCORE_M3UPI, 1), }, { /* M3UPI1 Link 2 */ - PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0x204C), - .driver_data = UNCORE_PCI_DEV_FULL_DATA(18, 4, SKX_PCI_UNCORE_M3UPI, 2), + PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0x204D), + .driver_data = UNCORE_PCI_DEV_FULL_DATA(18, 5, SKX_PCI_UNCORE_M3UPI, 2), }, { /* end: all zeroes */ } };
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit d7cbbe49a9304520181fb8c9272d1327deec8453 ]
In Family 17h, some L3 Cache Performance events require the ThreadMask and SliceMask to be set. For other events, these fields do not affect the count either way.
Set ThreadMask and SliceMask to 0xFF and 0xF respectively.
Signed-off-by: Janakarajan Natarajan Janakarajan.Natarajan@amd.com Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Cc: Alexander Shishkin alexander.shishkin@linux.intel.com Cc: Arnaldo Carvalho de Melo acme@kernel.org Cc: Arnaldo Carvalho de Melo acme@redhat.com Cc: Borislav Petkov bp@alien8.de Cc: H . Peter Anvin hpa@zytor.com Cc: Jiri Olsa jolsa@redhat.com Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Namhyung Kim namhyung@kernel.org Cc: Peter Zijlstra peterz@infradead.org Cc: Stephane Eranian eranian@google.com Cc: Suravee Suravee.Suthikulpanit@amd.com Cc: Thomas Gleixner tglx@linutronix.de Cc: Vince Weaver vincent.weaver@maine.edu Link: http://lkml.kernel.org/r/Message-ID: Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/events/amd/uncore.c | 10 ++++++++++ arch/x86/include/asm/perf_event.h | 8 ++++++++ 2 files changed, 18 insertions(+)
diff --git a/arch/x86/events/amd/uncore.c b/arch/x86/events/amd/uncore.c index 981ba5e8241b..8671de126eac 100644 --- a/arch/x86/events/amd/uncore.c +++ b/arch/x86/events/amd/uncore.c @@ -36,6 +36,7 @@
static int num_counters_llc; static int num_counters_nb; +static bool l3_mask;
static HLIST_HEAD(uncore_unused_list);
@@ -209,6 +210,13 @@ static int amd_uncore_event_init(struct perf_event *event) hwc->config = event->attr.config & AMD64_RAW_EVENT_MASK_NB; hwc->idx = -1;
+ /* + * SliceMask and ThreadMask need to be set for certain L3 events in + * Family 17h. For other events, the two fields do not affect the count. + */ + if (l3_mask) + hwc->config |= (AMD64_L3_SLICE_MASK | AMD64_L3_THREAD_MASK); + if (event->cpu < 0) return -EINVAL;
@@ -525,6 +533,7 @@ static int __init amd_uncore_init(void) amd_llc_pmu.name = "amd_l3"; format_attr_event_df.show = &event_show_df; format_attr_event_l3.show = &event_show_l3; + l3_mask = true; } else { num_counters_nb = NUM_COUNTERS_NB; num_counters_llc = NUM_COUNTERS_L2; @@ -532,6 +541,7 @@ static int __init amd_uncore_init(void) amd_llc_pmu.name = "amd_l2"; format_attr_event_df = format_attr_event; format_attr_event_l3 = format_attr_event; + l3_mask = false; }
amd_nb_pmu.attr_groups = amd_uncore_attr_groups_df; diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h index 12f54082f4c8..78241b736f2a 100644 --- a/arch/x86/include/asm/perf_event.h +++ b/arch/x86/include/asm/perf_event.h @@ -46,6 +46,14 @@ #define INTEL_ARCH_EVENT_MASK \ (ARCH_PERFMON_EVENTSEL_UMASK | ARCH_PERFMON_EVENTSEL_EVENT)
+#define AMD64_L3_SLICE_SHIFT 48 +#define AMD64_L3_SLICE_MASK \ + ((0xFULL) << AMD64_L3_SLICE_SHIFT) + +#define AMD64_L3_THREAD_SHIFT 56 +#define AMD64_L3_THREAD_MASK \ + ((0xFFULL) << AMD64_L3_THREAD_SHIFT) + #define X86_RAW_EVENT_MASK \ (ARCH_PERFMON_EVENTSEL_EVENT | \ ARCH_PERFMON_EVENTSEL_UMASK | \
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 86da809dda64a63fc27e05a215475325c3aaae92 ]
If there is a long chain of devices connected when the driver is loaded ICM sends device connected event for each and those are put to tb->wq for later processing. Now if the driver gets unloaded in the middle, so that the work queue is not yet empty it gets flushed by tb_domain_stop(). However, by that time the root switch is already removed so the driver crashes when it tries to dereference it in ICM event handling callbacks.
Fix this by checking whether the root switch is already removed. If it is we know that the domain is stopped and we should merely skip handling the event.
Signed-off-by: Mika Westerberg mika.westerberg@linux.intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/thunderbolt/icm.c | 49 ++++++++++++++++----------------------- 1 file changed, 20 insertions(+), 29 deletions(-)
diff --git a/drivers/thunderbolt/icm.c b/drivers/thunderbolt/icm.c index 500911f16498..5bad9fdec5f8 100644 --- a/drivers/thunderbolt/icm.c +++ b/drivers/thunderbolt/icm.c @@ -653,14 +653,6 @@ icm_fr_xdomain_connected(struct tb *tb, const struct icm_pkg_header *hdr) bool approved; u64 route;
- /* - * After NVM upgrade adding root switch device fails because we - * initiated reset. During that time ICM might still send - * XDomain connected message which we ignore here. - */ - if (!tb->root_switch) - return; - link = pkg->link_info & ICM_LINK_INFO_LINK_MASK; depth = (pkg->link_info & ICM_LINK_INFO_DEPTH_MASK) >> ICM_LINK_INFO_DEPTH_SHIFT; @@ -950,14 +942,6 @@ icm_tr_device_connected(struct tb *tb, const struct icm_pkg_header *hdr) if (pkg->hdr.packet_id) return;
- /* - * After NVM upgrade adding root switch device fails because we - * initiated reset. During that time ICM might still send device - * connected message which we ignore here. - */ - if (!tb->root_switch) - return; - route = get_route(pkg->route_hi, pkg->route_lo); authorized = pkg->link_info & ICM_LINK_INFO_APPROVED; security_level = (pkg->hdr.flags & ICM_FLAGS_SLEVEL_MASK) >> @@ -1317,19 +1301,26 @@ static void icm_handle_notification(struct work_struct *work)
mutex_lock(&tb->lock);
- switch (n->pkg->code) { - case ICM_EVENT_DEVICE_CONNECTED: - icm->device_connected(tb, n->pkg); - break; - case ICM_EVENT_DEVICE_DISCONNECTED: - icm->device_disconnected(tb, n->pkg); - break; - case ICM_EVENT_XDOMAIN_CONNECTED: - icm->xdomain_connected(tb, n->pkg); - break; - case ICM_EVENT_XDOMAIN_DISCONNECTED: - icm->xdomain_disconnected(tb, n->pkg); - break; + /* + * When the domain is stopped we flush its workqueue but before + * that the root switch is removed. In that case we should treat + * the queued events as being canceled. + */ + if (tb->root_switch) { + switch (n->pkg->code) { + case ICM_EVENT_DEVICE_CONNECTED: + icm->device_connected(tb, n->pkg); + break; + case ICM_EVENT_DEVICE_DISCONNECTED: + icm->device_disconnected(tb, n->pkg); + break; + case ICM_EVENT_XDOMAIN_CONNECTED: + icm->xdomain_connected(tb, n->pkg); + break; + case ICM_EVENT_XDOMAIN_DISCONNECTED: + icm->xdomain_disconnected(tb, n->pkg); + break; + } }
mutex_unlock(&tb->lock);
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit eafa717bc145963c944bb0a64d16add683861b35 ]
If IOMMU is enabled and Thunderbolt driver is built into the kernel image, it will be probed before IOMMUs are attached to the PCI bus. Because of this DMA mappings the driver does will not go through IOMMU and start failing right after IOMMUs are enabled.
For this reason move the Thunderbolt driver initialization happen at rootfs level.
Signed-off-by: Mika Westerberg mika.westerberg@linux.intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/thunderbolt/nhi.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/thunderbolt/nhi.c b/drivers/thunderbolt/nhi.c index f5a33e88e676..2d042150e41c 100644 --- a/drivers/thunderbolt/nhi.c +++ b/drivers/thunderbolt/nhi.c @@ -1147,5 +1147,5 @@ static void __exit nhi_unload(void) tb_domain_exit(); }
-fs_initcall(nhi_init); +rootfs_initcall(nhi_init); module_exit(nhi_unload);
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 657ade07df72847f591ccdb36bd9b91ed0edbac3 ]
During certain heavy network loads TX could time out with TX ring dump. TX is sometimes never restarted after reaching "tx_stop_threshold" because function "fec_enet_tx_queue" only tests the first queue.
In addition the TX timeout callback function failed to recover because it also operated only on the first queue.
Signed-off-by: Rickard x Andersson rickaran@axis.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/freescale/fec_main.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c index c729665107f5..e10471ee0a8b 100644 --- a/drivers/net/ethernet/freescale/fec_main.c +++ b/drivers/net/ethernet/freescale/fec_main.c @@ -1157,7 +1157,7 @@ static void fec_enet_timeout_work(struct work_struct *work) napi_disable(&fep->napi); netif_tx_lock_bh(ndev); fec_restart(ndev); - netif_wake_queue(ndev); + netif_tx_wake_all_queues(ndev); netif_tx_unlock_bh(ndev); napi_enable(&fep->napi); } @@ -1272,7 +1272,7 @@ skb_done:
/* Since we have freed up a buffer, the ring is no longer full */ - if (netif_queue_stopped(ndev)) { + if (netif_tx_queue_stopped(nq)) { entries_free = fec_enet_get_free_txdesc_num(txq); if (entries_free >= txq->tx_wake_threshold) netif_tx_wake_queue(nq); @@ -1745,7 +1745,7 @@ static void fec_enet_adjust_link(struct net_device *ndev) napi_disable(&fep->napi); netif_tx_lock_bh(ndev); fec_restart(ndev); - netif_wake_queue(ndev); + netif_tx_wake_all_queues(ndev); netif_tx_unlock_bh(ndev); napi_enable(&fep->napi); } @@ -2246,7 +2246,7 @@ static int fec_enet_set_pauseparam(struct net_device *ndev, napi_disable(&fep->napi); netif_tx_lock_bh(ndev); fec_restart(ndev); - netif_wake_queue(ndev); + netif_tx_wake_all_queues(ndev); netif_tx_unlock_bh(ndev); napi_enable(&fep->napi); }
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit fe3a83af6a50199bf250fa331e94216912f79395 ]
Fix a commit 4bcc595ccd80 ("printk: reinstate KERN_CONT for printing continuation lines") regression with the `declance' driver, which caused the adapter identification message to be split between two lines, e.g.:
declance.c: v0.011 by Linux MIPS DECstation task force tc6: PMAD-AA , addr = 08:00:2b:1b:2a:6a, irq = 14 tc6: registered as eth0.
Address that properly, by printing identification with a single call, making the messages now look like:
declance.c: v0.011 by Linux MIPS DECstation task force tc6: PMAD-AA, addr = 08:00:2b:1b:2a:6a, irq = 14 tc6: registered as eth0.
Signed-off-by: Maciej W. Rozycki macro@linux-mips.org Fixes: 4bcc595ccd80 ("printk: reinstate KERN_CONT for printing continuation lines") Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/amd/declance.c | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/drivers/net/ethernet/amd/declance.c b/drivers/net/ethernet/amd/declance.c index 116997a8b593..00332a1ea84b 100644 --- a/drivers/net/ethernet/amd/declance.c +++ b/drivers/net/ethernet/amd/declance.c @@ -1031,6 +1031,7 @@ static int dec_lance_probe(struct device *bdev, const int type) int i, ret; unsigned long esar_base; unsigned char *esar; + const char *desc;
if (dec_lance_debug && version_printed++ == 0) printk(version); @@ -1216,19 +1217,20 @@ static int dec_lance_probe(struct device *bdev, const int type) */ switch (type) { case ASIC_LANCE: - printk("%s: IOASIC onboard LANCE", name); + desc = "IOASIC onboard LANCE"; break; case PMAD_LANCE: - printk("%s: PMAD-AA", name); + desc = "PMAD-AA"; break; case PMAX_LANCE: - printk("%s: PMAX onboard LANCE", name); + desc = "PMAX onboard LANCE"; break; } for (i = 0; i < 6; i++) dev->dev_addr[i] = esar[i * 4];
- printk(", addr = %pM, irq = %d\n", dev->dev_addr, dev->irq); + printk("%s: %s, addr = %pM, irq = %d\n", + name, desc, dev->dev_addr, dev->irq);
dev->netdev_ops = &lance_netdev_ops; dev->watchdog_timeo = 5*HZ;
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit ef1f2258748b675422ca0107e5bfb9ceeac675de ]
Use memblock_end_of_DRAM which provides correct last low memory PFN. Without that, DMA32 region becomes empty resulting in zero pages being allocated for DMA32.
This patch is based on earlier patch from palmer which never merged into 4.19. I just edited the commit text to make more sense.
Signed-off-by: Atish Patra atish.patra@wdc.com Signed-off-by: Palmer Dabbelt palmer@sifive.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/riscv/kernel/setup.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/riscv/kernel/setup.c b/arch/riscv/kernel/setup.c index 0efa5b29d0a3..dcff272aee06 100644 --- a/arch/riscv/kernel/setup.c +++ b/arch/riscv/kernel/setup.c @@ -165,7 +165,7 @@ static void __init setup_bootmem(void) BUG_ON(mem_size == 0);
set_max_mapnr(PFN_DOWN(mem_size)); - max_low_pfn = pfn_base + PFN_DOWN(mem_size); + max_low_pfn = memblock_end_of_DRAM();
#ifdef CONFIG_BLK_DEV_INITRD setup_initrd();
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit beeeac43b6fae5f5eaf707b6fcc2bf1e09deb785 ]
This reverts commit d76c74387e1c978b6c5524a146ab0f3f72206f98.
While commit d76c74387e1c ("serial: 8250_dw: Fix runtime PM handling") fixes runtime PM handling when using kgdb, it introduces a traceback for everyone else.
BUG: sleeping function called from invalid context at /mnt/host/source/src/third_party/kernel/next/drivers/base/power/runtime.c:1034 in_atomic(): 1, irqs_disabled(): 1, pid: 1, name: swapper/0 7 locks held by swapper/0/1: #0: 000000005ec5bc72 (&dev->mutex){....}, at: __driver_attach+0xb5/0x12b #1: 000000005d5fa9e5 (&dev->mutex){....}, at: __device_attach+0x3e/0x15b #2: 0000000047e93286 (serial_mutex){+.+.}, at: serial8250_register_8250_port+0x51/0x8bb #3: 000000003b328f07 (port_mutex){+.+.}, at: uart_add_one_port+0xab/0x8b0 #4: 00000000fa313d4d (&port->mutex){+.+.}, at: uart_add_one_port+0xcc/0x8b0 #5: 00000000090983ca (console_lock){+.+.}, at: vprintk_emit+0xdb/0x217 #6: 00000000c743e583 (console_owner){-...}, at: console_unlock+0x211/0x60f irq event stamp: 735222 __down_trylock_console_sem+0x4a/0x84 console_unlock+0x338/0x60f __do_softirq+0x4a4/0x50d irq_exit+0x64/0xe2 CPU: 2 PID: 1 Comm: swapper/0 Not tainted 4.19.0-rc5 #6 Hardware name: Google Caroline/Caroline, BIOS Google_Caroline.7820.286.0 03/15/2017 Call Trace: dump_stack+0x7d/0xbd ___might_sleep+0x238/0x259 __pm_runtime_resume+0x4e/0xa4 ? serial8250_rpm_get+0x2e/0x44 serial8250_console_write+0x44/0x301 ? lock_acquire+0x1b8/0x1fa console_unlock+0x577/0x60f vprintk_emit+0x1f0/0x217 printk+0x52/0x6e register_console+0x43b/0x524 uart_add_one_port+0x672/0x8b0 ? set_io_from_upio+0x150/0x162 serial8250_register_8250_port+0x825/0x8bb dw8250_probe+0x80c/0x8b0 ? dw8250_serial_inq+0x8e/0x8e ? dw8250_check_lcr+0x108/0x108 ? dw8250_runtime_resume+0x5b/0x5b ? dw8250_serial_outq+0xa1/0xa1 ? dw8250_remove+0x115/0x115 platform_drv_probe+0x76/0xc5 really_probe+0x1f1/0x3ee ? driver_allows_async_probing+0x5d/0x5d driver_probe_device+0xd6/0x112 ? driver_allows_async_probing+0x5d/0x5d bus_for_each_drv+0xbe/0xe5 __device_attach+0xdd/0x15b bus_probe_device+0x5a/0x10b device_add+0x501/0x894 ? _raw_write_unlock+0x27/0x3a platform_device_add+0x224/0x2b7 mfd_add_device+0x718/0x75b ? __kmalloc+0x144/0x16a ? mfd_add_devices+0x38/0xdb mfd_add_devices+0x9b/0xdb intel_lpss_probe+0x7d4/0x8ee intel_lpss_pci_probe+0xac/0xd4 pci_device_probe+0x101/0x18e ...
Revert the offending patch until a more comprehensive solution is available.
Cc: Tony Lindgren tony@atomide.com Cc: Andy Shevchenko andriy.shevchenko@linux.intel.com Cc: Phil Edworthy phil.edworthy@renesas.com Fixes: d76c74387e1c ("serial: 8250_dw: Fix runtime PM handling") Signed-off-by: Guenter Roeck linux@roeck-us.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/tty/serial/8250/8250_dw.c | 4 ---- 1 file changed, 4 deletions(-)
diff --git a/drivers/tty/serial/8250/8250_dw.c b/drivers/tty/serial/8250/8250_dw.c index af842000188c..a25f6ea5c784 100644 --- a/drivers/tty/serial/8250/8250_dw.c +++ b/drivers/tty/serial/8250/8250_dw.c @@ -576,10 +576,6 @@ static int dw8250_probe(struct platform_device *pdev) if (!data->skip_autocfg) dw8250_setup_port(p);
-#ifdef CONFIG_PM - uart.capabilities |= UART_CAP_RPM; -#endif - /* If we have a valid fifosize, try hooking up DMA */ if (p->fifosize) { data->dma.rxconf.src_maxburst = p->fifosize / 4;
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit e4a02ed2aaf447fa849e3254bfdb3b9b01e1e520 ]
If CONFIG_WW_MUTEX_SELFTEST=y is enabled, booting an image in an arm64 virtual machine results in the following traceback if 8 CPUs are enabled:
DEBUG_LOCKS_WARN_ON(__owner_task(owner) != current) WARNING: CPU: 2 PID: 537 at kernel/locking/mutex.c:1033 __mutex_unlock_slowpath+0x1a8/0x2e0 ... Call trace: __mutex_unlock_slowpath() ww_mutex_unlock() test_cycle_work() process_one_work() worker_thread() kthread() ret_from_fork()
If requesting b_mutex fails with -EDEADLK, the error variable is reassigned to the return value from calling ww_mutex_lock on a_mutex again. If this call fails, a_mutex is not locked. It is, however, unconditionally unlocked subsequently, causing the reported warning. Fix the problem by using two error variables.
With this change, the selftest still fails as follows:
cyclic deadlock not resolved, ret[7/8] = -35
However, the traceback is gone.
Signed-off-by: Guenter Roeck linux@roeck-us.net Cc: Chris Wilson chris@chris-wilson.co.uk Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Peter Zijlstra peterz@infradead.org Cc: Thomas Gleixner tglx@linutronix.de Cc: Will Deacon will.deacon@arm.com Fixes: d1b42b800e5d0 ("locking/ww_mutex: Add kselftests for resolving ww_mutex cyclic deadlocks") Link: http://lkml.kernel.org/r/1538516929-9734-1-git-send-email-linux@roeck-us.net Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/locking/test-ww_mutex.c | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/kernel/locking/test-ww_mutex.c b/kernel/locking/test-ww_mutex.c index 0e4cd64ad2c0..654977862b06 100644 --- a/kernel/locking/test-ww_mutex.c +++ b/kernel/locking/test-ww_mutex.c @@ -260,7 +260,7 @@ static void test_cycle_work(struct work_struct *work) { struct test_cycle *cycle = container_of(work, typeof(*cycle), work); struct ww_acquire_ctx ctx; - int err; + int err, erra = 0;
ww_acquire_init(&ctx, &ww_class); ww_mutex_lock(&cycle->a_mutex, &ctx); @@ -270,17 +270,19 @@ static void test_cycle_work(struct work_struct *work)
err = ww_mutex_lock(cycle->b_mutex, &ctx); if (err == -EDEADLK) { + err = 0; ww_mutex_unlock(&cycle->a_mutex); ww_mutex_lock_slow(cycle->b_mutex, &ctx); - err = ww_mutex_lock(&cycle->a_mutex, &ctx); + erra = ww_mutex_lock(&cycle->a_mutex, &ctx); }
if (!err) ww_mutex_unlock(cycle->b_mutex); - ww_mutex_unlock(&cycle->a_mutex); + if (!erra) + ww_mutex_unlock(&cycle->a_mutex); ww_acquire_fini(&ctx);
- cycle->result = err; + cycle->result = err ?: erra; }
static int __test_cycle(unsigned int nthreads)
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 987bf116445db5d63a5c2ed94c4479687d9c9973 ]
In amdgpu_dm_commit_tail(), wait until flip_done() is signaled before we signal hw_done().
[Why]
This is to temporarily address a paging error that occurs when a nonblocking commit contends with another commit, particularly in a mirrored display configuration where at least 2 CRTCs are updated. The error occurs in drm_atomic_helper_wait_for_flip_done(), when we attempt to access the contents of new_crtc_state->commit.
Here's the sequence for a mirrored 2 display setup (irrelevant steps left out for clarity):
**THREAD 1** | **THREAD 2** | Initialize atomic state for flip | | Queue worker | ...
| Do work for flip | | Signal hw_done() on CRTC 1 | Signal hw_done() on CRTC 2 | | Wait for flip_done() on CRTC 1
<---- **PREEMPTED BY THREAD 1**
Initialize atomic state for cursor | update (1) | | Do cursor update work on both CRTCs | | Clear atomic state (2) | **DONE** | ... | | Wait for flip_done() on CRTC 2 | *ERROR* |
The issue starts with (1). When the atomic state is initialized, the current CRTC states are duplicated to be the new_crtc_states, and referenced to be the old_crtc_states. (The new_crtc_states are to be filled with update data.)
Some things to note:
* Due to the mirrored configuration, the cursor updates on both CRTCs.
* At this point, the pflip IRQ has already been handled, and flip_done signaled on all CRTCs. The cursor commit can therefore continue.
* The old_crtc_states used by the cursor update are the **same states** as the new_crtc_states used by the flip worker.
At (2), the old_crtc_state is freed (*), and the cursor commit completes. We then context switch back to the flip worker, where we attempt to access the new_crtc_state->commit object. This is problematic, as this state has already been freed.
(*) Technically, 'state->crtcs[i].state' is freed, which was made to reference old_crtc_state in drm_atomic_helper_swap_state()
[How]
By moving hw_done() after wait_for_flip_done(), we're guaranteed that the new_crtc_state (from the flip worker's perspective) still exists. This is because any other commit will be blocked, waiting for the hw_done() signal.
Note that both the i915 and imx drivers have this sequence flipped already, masking this problem.
Signed-off-by: Shirish S shirish.s@amd.com Signed-off-by: Leo Li sunpeng.li@amd.com Reviewed-by: Harry Wentland harry.wentland@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c index e484d0a94bdc..5b9cc3aeaa55 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c @@ -4494,12 +4494,18 @@ static void amdgpu_dm_atomic_commit_tail(struct drm_atomic_state *state) } spin_unlock_irqrestore(&adev->ddev->event_lock, flags);
- /* Signal HW programming completion */ - drm_atomic_helper_commit_hw_done(state);
if (wait_for_vblank) drm_atomic_helper_wait_for_flip_done(dev, state);
+ /* + * FIXME: + * Delay hw_done() until flip_done() is signaled. This is to block + * another commit from freeing the CRTC state while we're still + * waiting on flip_done. + */ + drm_atomic_helper_commit_hw_done(state); + drm_atomic_helper_cleanup_planes(dev, state);
/* Finally, drop a runtime PM reference for each newly disabled CRTC,
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 2d52527e80c2dc0c5f43f50adf183781262ec565 ]
the be2net implementation of .ndo_tunnel_{add,del}() changes the value of NETIF_F_GSO_UDP_TUNNEL bit in 'features' and 'hw_features', but it forgets to call netdev_features_change(). Moreover, ethtool setting for that bit can potentially be reverted after a tunnel is added or removed.
GSO already does software segmentation when 'hw_enc_features' is 0, even if VXLAN offload is turned on. In addition, commit 096de2f83ebc ("benet: stricter vxlan offloading check in be_features_check") avoids hardware segmentation of non-VXLAN tunneled packets, or VXLAN packets having wrong destination port. So, it's safe to avoid flipping the above feature on addition/deletion of VXLAN tunnels.
Fixes: 630f4b70567f ("be2net: Export tunnel offloads only when a VxLAN tunnel is created") Signed-off-by: Davide Caratti dcaratti@redhat.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/emulex/benet/be_main.c | 5 +---- 1 file changed, 1 insertion(+), 4 deletions(-)
diff --git a/drivers/net/ethernet/emulex/benet/be_main.c b/drivers/net/ethernet/emulex/benet/be_main.c index 8f755009ff38..c8445a4135a9 100644 --- a/drivers/net/ethernet/emulex/benet/be_main.c +++ b/drivers/net/ethernet/emulex/benet/be_main.c @@ -3915,8 +3915,6 @@ static int be_enable_vxlan_offloads(struct be_adapter *adapter) netdev->hw_enc_features |= NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM | NETIF_F_TSO | NETIF_F_TSO6 | NETIF_F_GSO_UDP_TUNNEL; - netdev->hw_features |= NETIF_F_GSO_UDP_TUNNEL; - netdev->features |= NETIF_F_GSO_UDP_TUNNEL;
dev_info(dev, "Enabled VxLAN offloads for UDP port %d\n", be16_to_cpu(port)); @@ -3938,8 +3936,6 @@ static void be_disable_vxlan_offloads(struct be_adapter *adapter) adapter->vxlan_port = 0;
netdev->hw_enc_features = 0; - netdev->hw_features &= ~(NETIF_F_GSO_UDP_TUNNEL); - netdev->features &= ~(NETIF_F_GSO_UDP_TUNNEL); }
static void be_calculate_vf_res(struct be_adapter *adapter, u16 num_vfs, @@ -5232,6 +5228,7 @@ static void be_netdev_init(struct net_device *netdev) struct be_adapter *adapter = netdev_priv(netdev);
netdev->hw_features |= NETIF_F_SG | NETIF_F_TSO | NETIF_F_TSO6 | + NETIF_F_GSO_UDP_TUNNEL | NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM | NETIF_F_RXCSUM | NETIF_F_HW_VLAN_CTAG_TX; if ((be_if_cap_flags(adapter) & BE_IF_FLAGS_RSS))
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit ac1788cc7da4ce54edcfd2e499afdb0a23d5c41d ]
With commit 2ea626306810 ("powerpc/topology: Get topology for shared processors at boot"), kdump kernel on shared LPAR may crash.
The necessary conditions are - Shared LPAR with at least 2 nodes having memory and CPUs. - Memory requirement for kdump kernel must be met by the first N-1 nodes where there are at least N nodes with memory and CPUs.
Example numactl of such a machine. $ numactl -H available: 5 nodes (0,2,5-7) node 0 cpus: node 0 size: 0 MB node 0 free: 0 MB node 2 cpus: node 2 size: 255 MB node 2 free: 189 MB node 5 cpus: 24 25 26 27 28 29 30 31 node 5 size: 4095 MB node 5 free: 4024 MB node 6 cpus: 0 1 2 3 4 5 6 7 16 17 18 19 20 21 22 23 node 6 size: 6353 MB node 6 free: 5998 MB node 7 cpus: 8 9 10 11 12 13 14 15 32 33 34 35 36 37 38 39 node 7 size: 7640 MB node 7 free: 7164 MB node distances: node 0 2 5 6 7 0: 10 40 40 40 40 2: 40 10 40 40 40 5: 40 40 10 40 40 6: 40 40 40 10 20 7: 40 40 40 20 10
Steps to reproduce. 1. Load / start kdump service. 2. Trigger a kdump (for example : echo c > /proc/sysrq-trigger)
When booting a kdump kernel with 2048M:
kexec: Starting switchover sequence. I'm in purgatory Using 1TB segments hash-mmu: Initializing hash mmu with SLB Linux version 4.19.0-rc5-master+ (srikar@linux-xxu6) (gcc version 4.8.5 (SUSE Linux)) #1 SMP Thu Sep 27 19:45:00 IST 2018 Found initrd at 0xc000000009e70000:0xc00000000ae554b4 Using pSeries machine description ----------------------------------------------------- ppc64_pft_size = 0x1e phys_mem_size = 0x88000000 dcache_bsize = 0x80 icache_bsize = 0x80 cpu_features = 0x000000ff8f5d91a7 possible = 0x0000fbffcf5fb1a7 always = 0x0000006f8b5c91a1 cpu_user_features = 0xdc0065c2 0xef000000 mmu_features = 0x7c006001 firmware_features = 0x00000007c45bfc57 htab_hash_mask = 0x7fffff physical_start = 0x8000000 ----------------------------------------------------- numa: NODE_DATA [mem 0x87d5e300-0x87d67fff] numa: NODE_DATA(0) on node 6 numa: NODE_DATA [mem 0x87d54600-0x87d5e2ff] Top of RAM: 0x88000000, Total RAM: 0x88000000 Memory hole size: 0MB Zone ranges: DMA [mem 0x0000000000000000-0x0000000087ffffff] DMA32 empty Normal empty Movable zone start for each node Early memory node ranges node 6: [mem 0x0000000000000000-0x0000000087ffffff] Could not find start_pfn for node 0 Initmem setup node 0 [mem 0x0000000000000000-0x0000000000000000] On node 0 totalpages: 0 Initmem setup node 6 [mem 0x0000000000000000-0x0000000087ffffff] On node 6 totalpages: 34816
Unable to handle kernel paging request for data at address 0x00000060 Faulting instruction address: 0xc000000008703a54 Oops: Kernel access of bad area, sig: 11 [#1] LE SMP NR_CPUS=2048 NUMA pSeries Modules linked in: CPU: 11 PID: 1 Comm: swapper/11 Not tainted 4.19.0-rc5-master+ #1 NIP: c000000008703a54 LR: c000000008703a38 CTR: 0000000000000000 REGS: c00000000b673440 TRAP: 0380 Not tainted (4.19.0-rc5-master+) MSR: 8000000002009033 <SF,VEC,EE,ME,IR,DR,RI,LE> CR: 24022022 XER: 20000002 CFAR: c0000000086fc238 IRQMASK: 0 GPR00: c000000008703a38 c00000000b6736c0 c000000009281900 0000000000000000 GPR04: 0000000000000000 0000000000000000 fffffffffffff001 c00000000b660080 GPR08: 0000000000000000 0000000000000000 0000000000000000 0000000000000220 GPR12: 0000000000002200 c000000009e51400 0000000000000000 0000000000000008 GPR16: 0000000000000000 c000000008c152e8 c000000008c152a8 0000000000000000 GPR20: c000000009422fd8 c000000009412fd8 c000000009426040 0000000000000008 GPR24: 0000000000000000 0000000000000000 c000000009168bc8 c000000009168c78 GPR28: c00000000b126410 0000000000000000 c00000000916a0b8 c00000000b126400 NIP [c000000008703a54] bus_add_device+0x84/0x1e0 LR [c000000008703a38] bus_add_device+0x68/0x1e0 Call Trace: [c00000000b6736c0] [c000000008703a38] bus_add_device+0x68/0x1e0 (unreliable) [c00000000b673740] [c000000008700194] device_add+0x454/0x7c0 [c00000000b673800] [c00000000872e660] __register_one_node+0xb0/0x240 [c00000000b673860] [c00000000839a6bc] __try_online_node+0x12c/0x180 [c00000000b673900] [c00000000839b978] try_online_node+0x58/0x90 [c00000000b673930] [c0000000080846d8] find_and_online_cpu_nid+0x158/0x190 [c00000000b673a10] [c0000000080848a0] numa_update_cpu_topology+0x190/0x580 [c00000000b673c00] [c000000008d3f2e4] smp_cpus_done+0x94/0x108 [c00000000b673c70] [c000000008d5c00c] smp_init+0x174/0x19c [c00000000b673d00] [c000000008d346b8] kernel_init_freeable+0x1e0/0x450 [c00000000b673dc0] [c0000000080102e8] kernel_init+0x28/0x160 [c00000000b673e30] [c00000000800b65c] ret_from_kernel_thread+0x5c/0x80 Instruction dump: 60000000 60000000 e89e0020 7fe3fb78 4bff87d5 60000000 7c7d1b79 4082008c e8bf0050 e93e0098 3b9f0010 2fa50000 <e8690060> 38630018 419e0114 7f84e378 ---[ end trace 593577668c2daa65 ]---
However a regular kernel with 4096M (2048 gets reserved for crash kernel) boots properly.
Unlike regular kernels, which mark all available nodes as online, kdump kernel only marks just enough nodes as online and marks the rest as offline at boot. However kdump kernel boots with all available CPUs. With Commit 2ea626306810 ("powerpc/topology: Get topology for shared processors at boot"), all CPUs are onlined on their respective nodes at boot time. try_online_node() tries to online the offline nodes but fails as all needed subsystems are not yet initialized.
As part of fix, detect and skip early onlining of a offline node.
Fixes: 2ea626306810 ("powerpc/topology: Get topology for shared processors at boot") Reported-by: Pavithra Prakash pavrampu@in.ibm.com Signed-off-by: Srikar Dronamraju srikar@linux.vnet.ibm.com Tested-by: Hari Bathini hbathini@linux.ibm.com Signed-off-by: Michael Ellerman mpe@ellerman.id.au Signed-off-by: Sasha Levin sashal@kernel.org --- arch/powerpc/mm/numa.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c index 59d07bd5374a..055b211b7126 100644 --- a/arch/powerpc/mm/numa.c +++ b/arch/powerpc/mm/numa.c @@ -1217,9 +1217,10 @@ int find_and_online_cpu_nid(int cpu) * Need to ensure that NODE_DATA is initialized for a node from * available memory (see memblock_alloc_try_nid). If unable to * init the node, then default to nearest node that has memory - * installed. + * installed. Skip onlining a node if the subsystems are not + * yet initialized. */ - if (try_online_node(new_nid)) + if (!topology_inited || try_online_node(new_nid)) new_nid = first_online_node; #else /*
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 2c05d88818ab6571816b93edce4d53703870d7ae ]
In cxgb_extension_ioctl(), the command of the ioctl is firstly copied from the user-space buffer 'useraddr' to 'cmd' and checked through the switch statement. If the command is not as expected, an error code EOPNOTSUPP is returned. In the following execution, i.e., the cases of the switch statement, the whole buffer of 'useraddr' is copied again to a specific data structure, according to what kind of command is requested. However, after the second copy, there is no re-check on the newly-copied command. Given that the buffer 'useraddr' is in the user space, a malicious user can race to change the command between the two copies. By doing so, the attacker can supply malicious data to the kernel and cause undefined behavior.
This patch adds a re-check in each case of the switch statement if there is a second copy in that case, to re-check whether the command obtained in the second copy is the same as the one in the first copy. If not, an error code EINVAL is returned.
Signed-off-by: Wenwen Wang wang6495@umn.edu Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/chelsio/cxgb3/cxgb3_main.c | 17 +++++++++++++++++ 1 file changed, 17 insertions(+)
diff --git a/drivers/net/ethernet/chelsio/cxgb3/cxgb3_main.c b/drivers/net/ethernet/chelsio/cxgb3/cxgb3_main.c index a19172dbe6be..c34ea385fe4a 100644 --- a/drivers/net/ethernet/chelsio/cxgb3/cxgb3_main.c +++ b/drivers/net/ethernet/chelsio/cxgb3/cxgb3_main.c @@ -2159,6 +2159,8 @@ static int cxgb_extension_ioctl(struct net_device *dev, void __user *useraddr) return -EPERM; if (copy_from_user(&t, useraddr, sizeof(t))) return -EFAULT; + if (t.cmd != CHELSIO_SET_QSET_PARAMS) + return -EINVAL; if (t.qset_idx >= SGE_QSETS) return -EINVAL; if (!in_range(t.intr_lat, 0, M_NEWTIMER) || @@ -2258,6 +2260,9 @@ static int cxgb_extension_ioctl(struct net_device *dev, void __user *useraddr) if (copy_from_user(&t, useraddr, sizeof(t))) return -EFAULT;
+ if (t.cmd != CHELSIO_GET_QSET_PARAMS) + return -EINVAL; + /* Display qsets for all ports when offload enabled */ if (test_bit(OFFLOAD_DEVMAP_BIT, &adapter->open_device_map)) { q1 = 0; @@ -2303,6 +2308,8 @@ static int cxgb_extension_ioctl(struct net_device *dev, void __user *useraddr) return -EBUSY; if (copy_from_user(&edata, useraddr, sizeof(edata))) return -EFAULT; + if (edata.cmd != CHELSIO_SET_QSET_NUM) + return -EINVAL; if (edata.val < 1 || (edata.val > 1 && !(adapter->flags & USING_MSIX))) return -EINVAL; @@ -2343,6 +2350,8 @@ static int cxgb_extension_ioctl(struct net_device *dev, void __user *useraddr) return -EPERM; if (copy_from_user(&t, useraddr, sizeof(t))) return -EFAULT; + if (t.cmd != CHELSIO_LOAD_FW) + return -EINVAL; /* Check t.len sanity ? */ fw_data = memdup_user(useraddr + sizeof(t), t.len); if (IS_ERR(fw_data)) @@ -2366,6 +2375,8 @@ static int cxgb_extension_ioctl(struct net_device *dev, void __user *useraddr) return -EBUSY; if (copy_from_user(&m, useraddr, sizeof(m))) return -EFAULT; + if (m.cmd != CHELSIO_SETMTUTAB) + return -EINVAL; if (m.nmtus != NMTUS) return -EINVAL; if (m.mtus[0] < 81) /* accommodate SACK */ @@ -2407,6 +2418,8 @@ static int cxgb_extension_ioctl(struct net_device *dev, void __user *useraddr) return -EBUSY; if (copy_from_user(&m, useraddr, sizeof(m))) return -EFAULT; + if (m.cmd != CHELSIO_SET_PM) + return -EINVAL; if (!is_power_of_2(m.rx_pg_sz) || !is_power_of_2(m.tx_pg_sz)) return -EINVAL; /* not power of 2 */ @@ -2440,6 +2453,8 @@ static int cxgb_extension_ioctl(struct net_device *dev, void __user *useraddr) return -EIO; /* need the memory controllers */ if (copy_from_user(&t, useraddr, sizeof(t))) return -EFAULT; + if (t.cmd != CHELSIO_GET_MEM) + return -EINVAL; if ((t.addr & 7) || (t.len & 7)) return -EINVAL; if (t.mem_id == MEM_CM) @@ -2492,6 +2507,8 @@ static int cxgb_extension_ioctl(struct net_device *dev, void __user *useraddr) return -EAGAIN; if (copy_from_user(&t, useraddr, sizeof(t))) return -EFAULT; + if (t.cmd != CHELSIO_SET_TRACE_FILTER) + return -EINVAL;
tp = (const struct trace_params *)&t.sip; if (t.config_tx)
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 0781168e23a2fc8dceb989f11fc5b39b3ccacc35 ]
In yam_ioctl(), the concrete ioctl command is firstly copied from the user-space buffer 'ifr->ifr_data' to 'ioctl_cmd' and checked through the following switch statement. If the command is not as expected, an error code EINVAL is returned. In the following execution the buffer 'ifr->ifr_data' is copied again in the cases of the switch statement to specific data structures according to what kind of ioctl command is requested. However, after the second copy, no re-check is enforced on the newly-copied command. Given that the buffer 'ifr->ifr_data' is in the user space, a malicious user can race to change the command between the two copies. This way, the attacker can inject inconsistent data and cause undefined behavior.
This patch adds a re-check in each case of the switch statement if there is a second copy in that case, to re-check whether the command obtained in the second copy is the same as the one in the first copy. If not, an error code EINVAL will be returned.
Signed-off-by: Wenwen Wang wang6495@umn.edu Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/hamradio/yam.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/drivers/net/hamradio/yam.c b/drivers/net/hamradio/yam.c index 16ec7af6ab7b..ba9df430fca6 100644 --- a/drivers/net/hamradio/yam.c +++ b/drivers/net/hamradio/yam.c @@ -966,6 +966,8 @@ static int yam_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd) sizeof(struct yamdrv_ioctl_mcs)); if (IS_ERR(ym)) return PTR_ERR(ym); + if (ym->cmd != SIOCYAMSMCS) + return -EINVAL; if (ym->bitrate > YAM_MAXBITRATE) { kfree(ym); return -EINVAL; @@ -981,6 +983,8 @@ static int yam_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd) if (copy_from_user(&yi, ifr->ifr_data, sizeof(struct yamdrv_ioctl_cfg))) return -EFAULT;
+ if (yi.cmd != SIOCYAMSCFG) + return -EINVAL; if ((yi.cfg.mask & YAM_IOBASE) && netif_running(dev)) return -EINVAL; /* Cannot change this parameter when up */ if ((yi.cfg.mask & YAM_IRQ) && netif_running(dev))
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 69eb7765b9c6902444c89c54e7043242faf981e5 ]
ocfs2_duplicate_clusters_by_page() may crash if one of the extent's pages is dirty. When a page has not been written back, it is still in dirty state. If ocfs2_duplicate_clusters_by_page() is called against the dirty page, the crash happens.
To fix this bug, we can just unlock the page and wait until the page until its not dirty.
The following is the backtrace:
kernel BUG at /root/code/ocfs2/refcounttree.c:2961! [exception RIP: ocfs2_duplicate_clusters_by_page+822] __ocfs2_move_extent+0x80/0x450 [ocfs2] ? __ocfs2_claim_clusters+0x130/0x250 [ocfs2] ocfs2_defrag_extent+0x5b8/0x5e0 [ocfs2] __ocfs2_move_extents_range+0x2a4/0x470 [ocfs2] ocfs2_move_extents+0x180/0x3b0 [ocfs2] ? ocfs2_wait_for_recovery+0x13/0x70 [ocfs2] ocfs2_ioctl_move_extents+0x133/0x2d0 [ocfs2] ocfs2_ioctl+0x253/0x640 [ocfs2] do_vfs_ioctl+0x90/0x5f0 SyS_ioctl+0x74/0x80 do_syscall_64+0x74/0x140 entry_SYSCALL_64_after_hwframe+0x3d/0xa2
Once we find the page is dirty, we do not wait until it's clean, rather we use write_one_page() to write it back
Link: http://lkml.kernel.org/r/20180829074740.9438-1-lchen@suse.com [lchen@suse.com: update comments] Link: http://lkml.kernel.org/r/20180830075041.14879-1-lchen@suse.com [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Larry Chen lchen@suse.com Acked-by: Changwei Ge ge.changwei@h3c.com Cc: Mark Fasheh mark@fasheh.com Cc: Joel Becker jlbec@evilplan.org Cc: Junxiao Bi junxiao.bi@oracle.com Cc: Joseph Qi jiangqi903@gmail.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/ocfs2/refcounttree.c | 16 ++++++++++++---- 1 file changed, 12 insertions(+), 4 deletions(-)
diff --git a/fs/ocfs2/refcounttree.c b/fs/ocfs2/refcounttree.c index 7869622af22a..7a5ee145c733 100644 --- a/fs/ocfs2/refcounttree.c +++ b/fs/ocfs2/refcounttree.c @@ -2946,6 +2946,7 @@ int ocfs2_duplicate_clusters_by_page(handle_t *handle, if (map_end & (PAGE_SIZE - 1)) to = map_end & (PAGE_SIZE - 1);
+retry: page = find_or_create_page(mapping, page_index, GFP_NOFS); if (!page) { ret = -ENOMEM; @@ -2954,11 +2955,18 @@ int ocfs2_duplicate_clusters_by_page(handle_t *handle, }
/* - * In case PAGE_SIZE <= CLUSTER_SIZE, This page - * can't be dirtied before we CoW it out. + * In case PAGE_SIZE <= CLUSTER_SIZE, we do not expect a dirty + * page, so write it back. */ - if (PAGE_SIZE <= OCFS2_SB(sb)->s_clustersize) - BUG_ON(PageDirty(page)); + if (PAGE_SIZE <= OCFS2_SB(sb)->s_clustersize) { + if (PageDirty(page)) { + /* + * write_on_page will unlock the page on return + */ + ret = write_one_page(page); + goto retry; + } + }
if (!PageUptodate(page)) { ret = block_read_full_page(page, ocfs2_get_block);
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 51896864579d5a3349740847083f4db5c6487164 ]
get_user_pages_fast() will return negative value if no pages were pinned, then be converted to a unsigned, which is compared to zero, giving the wrong result.
Link: http://lkml.kernel.org/r/20180921095015.26088-1-yuehaibing@huawei.com Fixes: 09e35a4a1ca8 ("mm/gup_benchmark: handle gup failures") Signed-off-by: YueHaibing yuehaibing@huawei.com Reviewed-by: Andrew Morton akpm@linux-foundation.org Cc: Michael S. Tsirkin mst@redhat.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- mm/gup_benchmark.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/mm/gup_benchmark.c b/mm/gup_benchmark.c index 6a473709e9b6..7405c9d89d65 100644 --- a/mm/gup_benchmark.c +++ b/mm/gup_benchmark.c @@ -19,7 +19,8 @@ static int __gup_benchmark_ioctl(unsigned int cmd, struct gup_benchmark *gup) { ktime_t start_time, end_time; - unsigned long i, nr, nr_pages, addr, next; + unsigned long i, nr_pages, addr, next; + int nr; struct page **pages;
nr_pages = gup->size / PAGE_SIZE;
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit e6112fc300702f96374f34368513d57795fc6d23 ]
split_huge_page_to_list() fails on HugeTLB pages. I was experimenting with moving 32MB contig HugeTLB pages on arm64 (with a debug patch applied) and hit the following stack trace when the kernel crashed.
[ 3732.462797] Call trace: [ 3732.462835] split_huge_page_to_list+0x3b0/0x858 [ 3732.462913] migrate_pages+0x728/0xc20 [ 3732.462999] soft_offline_page+0x448/0x8b0 [ 3732.463097] __arm64_sys_madvise+0x724/0x850 [ 3732.463197] el0_svc_handler+0x74/0x110 [ 3732.463297] el0_svc+0x8/0xc [ 3732.463347] Code: d1000400 f90b0e60 f2fbd5a2 a94982a1 (f9000420)
When unmap_and_move[_huge_page]() fails due to lack of memory, the splitting should happen only for transparent huge pages not for HugeTLB pages. PageTransHuge() returns true for both THP and HugeTLB pages. Hence the conditonal check should test PagesHuge() flag to make sure that given pages is not a HugeTLB one.
Link: http://lkml.kernel.org/r/1537798495-4996-1-git-send-email-anshuman.khandual@... Fixes: 94723aafb9 ("mm: unclutter THP migration") Signed-off-by: Anshuman Khandual anshuman.khandual@arm.com Acked-by: Michal Hocko mhocko@suse.com Acked-by: Naoya Horiguchi n-horiguchi@ah.jp.nec.com Cc: Kirill A. Shutemov kirill.shutemov@linux.intel.com Cc: Zi Yan zi.yan@cs.rutgers.edu Cc: Mike Kravetz mike.kravetz@oracle.com Cc: Vlastimil Babka vbabka@suse.cz Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- mm/migrate.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/migrate.c b/mm/migrate.c index 2a55289ee9f1..f49eb9589d73 100644 --- a/mm/migrate.c +++ b/mm/migrate.c @@ -1415,7 +1415,7 @@ retry: * we encounter them after the rest of the list * is processed. */ - if (PageTransHuge(page)) { + if (PageTransHuge(page) && !PageHuge(page)) { lock_page(page); rc = split_huge_page_to_list(page, from); unlock_page(page);
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 571d0563c8881595f4ab027aef9ed1c55e3e7b7c ]
The first argument to WARN_ONCE() is a condition.
Fixes: 5800dc5c19f3 ("x86/paravirt: Fix spectre-v2 mitigations for paravirt guests") Signed-off-by: Dan Carpenter dan.carpenter@oracle.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Reviewed-by: Juergen Gross jgross@suse.com Cc: Peter Zijlstra peterz@infradead.org Cc: Alok Kataria akataria@vmware.com Cc: "H. Peter Anvin" hpa@zytor.com Cc: virtualization@lists.linux-foundation.org Cc: kernel-janitors@vger.kernel.org Link: https://lkml.kernel.org/r/20180919103553.GD9238@mwanda Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/kernel/paravirt.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kernel/paravirt.c b/arch/x86/kernel/paravirt.c index 930c88341e4e..1fbf38dde84c 100644 --- a/arch/x86/kernel/paravirt.c +++ b/arch/x86/kernel/paravirt.c @@ -90,7 +90,7 @@ unsigned paravirt_patch_call(void *insnbuf,
if (len < 5) { #ifdef CONFIG_RETPOLINE - WARN_ONCE("Failing to patch indirect CALL in %ps\n", (void *)addr); + WARN_ONCE(1, "Failing to patch indirect CALL in %ps\n", (void *)addr); #endif return len; /* call too long for patch site */ } @@ -110,7 +110,7 @@ unsigned paravirt_patch_jmp(void *insnbuf, const void *target,
if (len < 5) { #ifdef CONFIG_RETPOLINE - WARN_ONCE("Failing to patch indirect JMP in %ps\n", (void *)addr); + WARN_ONCE(1, "Failing to patch indirect JMP in %ps\n", (void *)addr); #endif return len; /* call too long for patch site */ }
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 8927c27b32703e28041ae19bf25ea53461be83a1 ]
When building armada-37xx-periph, num_parents isn't used in function clk_pm_cpu_get_parent: drivers/clk/mvebu/armada-37xx-periph.c: In function ‘clk_pm_cpu_get_parent’: drivers/clk/mvebu/armada-37xx-periph.c:419:6: warning: unused variable ‘num_parents’ [-Wunused-variable] int num_parents = clk_hw_get_num_parents(hw); ^~~~~~~~~~~ Remove the declaration of num_parents to dispose the warning.
Fixes: 616bf80d381d ("clk: mvebu: armada-37xx-periph: Fix wrong return value in get_parent") Signed-off-by: Anders Roxell anders.roxell@linaro.org Signed-off-by: Stephen Boyd sboyd@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/clk/mvebu/armada-37xx-periph.c | 1 - 1 file changed, 1 deletion(-)
diff --git a/drivers/clk/mvebu/armada-37xx-periph.c b/drivers/clk/mvebu/armada-37xx-periph.c index 6f7637b19738..e764dfdea53f 100644 --- a/drivers/clk/mvebu/armada-37xx-periph.c +++ b/drivers/clk/mvebu/armada-37xx-periph.c @@ -419,7 +419,6 @@ static unsigned int armada_3700_pm_dvfs_get_cpu_parent(struct regmap *base) static u8 clk_pm_cpu_get_parent(struct clk_hw *hw) { struct clk_pm_cpu *pm_cpu = to_clk_pm_cpu(hw); - int num_parents = clk_hw_get_num_parents(hw); u32 val;
if (armada_3700_pm_dvfs_is_enabled(pm_cpu->nb_pm_base)) {
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 4f666675cdff0b986195413215eb062b7da6586f ]
When powering down a SDIO connected card during suspend, make sure to call into the generic lbs_suspend() function before pulling the plug. This will make sure the card is successfully deregistered from the system to avoid communication to the card starving out.
Fixes: 7444a8092906 ("libertas: fix suspend and resume for SDIO connected cards") Signed-off-by: Daniel Mack daniel@zonque.org Reviewed-by: Ulf Hansson ulf.hansson@linaro.org Acked-by: Kalle Valo kvalo@codeaurora.org Signed-off-by: Ulf Hansson ulf.hansson@linaro.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/marvell/libertas/if_sdio.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/drivers/net/wireless/marvell/libertas/if_sdio.c b/drivers/net/wireless/marvell/libertas/if_sdio.c index 43743c26c071..39bf85d0ade0 100644 --- a/drivers/net/wireless/marvell/libertas/if_sdio.c +++ b/drivers/net/wireless/marvell/libertas/if_sdio.c @@ -1317,6 +1317,10 @@ static int if_sdio_suspend(struct device *dev) if (priv->wol_criteria == EHS_REMOVE_WAKEUP) { dev_info(dev, "Suspend without wake params -- powering down card\n"); if (priv->fw_ready) { + ret = lbs_suspend(priv); + if (ret) + return ret; + priv->power_up_on_resume = true; if_sdio_power_off(card); }
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit ff4ce2885af8f9e8e99864d78dbeb4673f089c76 ]
Fixes a crash when the report encounters an address that could not be associated with an mmaped region:
#0 0x00005555557bdc4a in callchain_srcline (ip=<error reading variable: Cannot access memory at address 0x38>, sym=0x0, map=0x0) at util/machine.c:2329 #1 unwind_entry (entry=entry@entry=0x7fffffff9180, arg=arg@entry=0x7ffff5642498) at util/machine.c:2329 #2 0x00005555558370af in entry (arg=0x7ffff5642498, cb=0x5555557bdb50 <unwind_entry>, thread=<optimized out>, ip=18446744073709551615) at util/unwind-libunwind-local.c:586 #3 get_entries (ui=ui@entry=0x7fffffff9620, cb=0x5555557bdb50 <unwind_entry>, arg=0x7ffff5642498, max_stack=<optimized out>) at util/unwind-libunwind-local.c:703 #4 0x0000555555837192 in _unwind__get_entries (cb=<optimized out>, arg=<optimized out>, thread=<optimized out>, data=<optimized out>, max_stack=<optimized out>) at util/unwind-libunwind-local.c:725 #5 0x00005555557c310f in thread__resolve_callchain_unwind (max_stack=127, sample=0x7fffffff9830, evsel=0x555555c7b3b0, cursor=0x7ffff5642498, thread=0x555555c7f6f0) at util/machine.c:2351 #6 thread__resolve_callchain (thread=0x555555c7f6f0, cursor=0x7ffff5642498, evsel=0x555555c7b3b0, sample=0x7fffffff9830, parent=0x7fffffff97b8, root_al=0x7fffffff9750, max_stack=127) at util/machine.c:2378 #7 0x00005555557ba4ee in sample__resolve_callchain (sample=<optimized out>, cursor=<optimized out>, parent=parent@entry=0x7fffffff97b8, evsel=<optimized out>, al=al@entry=0x7fffffff9750, max_stack=<optimized out>) at util/callchain.c:1085
Signed-off-by: Milian Wolff milian.wolff@kdab.com Tested-by: Sandipan Das sandipan@linux.ibm.com Acked-by: Jiri Olsa jolsa@kernel.org Cc: Jin Yao yao.jin@linux.intel.com Cc: Namhyung Kim namhyung@kernel.org Fixes: 2a9d5050dc84 ("perf script: Show correct offsets for DWARF-based unwinding") Link: http://lkml.kernel.org/r/20180926135207.30263-1-milian.wolff@kdab.com Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/perf/util/machine.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c index 22dbb6612b41..d49744dc46d7 100644 --- a/tools/perf/util/machine.c +++ b/tools/perf/util/machine.c @@ -2272,7 +2272,7 @@ static int unwind_entry(struct unwind_entry *entry, void *arg) { struct callchain_cursor *cursor = arg; const char *srcline = NULL; - u64 addr; + u64 addr = entry->ip;
if (symbol_conf.hide_unresolved && entry->sym == NULL) return 0; @@ -2284,7 +2284,8 @@ static int unwind_entry(struct unwind_entry *entry, void *arg) * Convert entry->ip from a virtual address to an offset in * its corresponding binary. */ - addr = map__map_ip(entry->map, entry->ip); + if (entry->map) + addr = map__map_ip(entry->map, entry->ip);
srcline = callchain_srcline(entry->map, entry->sym, addr); return callchain_cursor_append(cursor, entry->ip,
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 2da19ed3e4a87db16c0f69039da9f17a9596c350 ]
The current code is problematic because the iov_iter is reverted and never advanced in the non-error case. This patch skips the revert in the non-error case. This patch also fixes the amount by which the iov_iter is reverted. Currently, iov_iter is reverted by size, which can be greater than the amount by which the iter was actually advanced. Instead, only revert by the amount that the iter was advanced.
Fixes: 4718799817c5 ("tls: Fix zerocopy_from_iter iov handling") Signed-off-by: Doron Roberts-Kedes doronrk@fb.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/tls/tls_sw.c | 12 +++++------- 1 file changed, 5 insertions(+), 7 deletions(-)
diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c index 9fab8e5a4a5b..994ddc7ec9b1 100644 --- a/net/tls/tls_sw.c +++ b/net/tls/tls_sw.c @@ -286,7 +286,7 @@ static int zerocopy_from_iter(struct sock *sk, struct iov_iter *from, int length, int *pages_used, unsigned int *size_used, struct scatterlist *to, int to_max_pages, - bool charge, bool revert) + bool charge) { struct page *pages[MAX_SKB_FRAGS];
@@ -335,10 +335,10 @@ static int zerocopy_from_iter(struct sock *sk, struct iov_iter *from, }
out: + if (rc) + iov_iter_revert(from, size - *size_used); *size_used = size; *pages_used = num_elem; - if (revert) - iov_iter_revert(from, size);
return rc; } @@ -440,7 +440,7 @@ alloc_encrypted: &ctx->sg_plaintext_size, ctx->sg_plaintext_data, ARRAY_SIZE(ctx->sg_plaintext_data), - true, false); + true); if (ret) goto fallback_to_reg_send;
@@ -453,8 +453,6 @@ alloc_encrypted:
copied -= try_to_copy; fallback_to_reg_send: - iov_iter_revert(&msg->msg_iter, - ctx->sg_plaintext_size - orig_size); trim_sg(sk, ctx->sg_plaintext_data, &ctx->sg_plaintext_num_elem, &ctx->sg_plaintext_size, @@ -828,7 +826,7 @@ int tls_sw_recvmsg(struct sock *sk, err = zerocopy_from_iter(sk, &msg->msg_iter, to_copy, &pages, &chunk, &sgin[1], - MAX_SKB_FRAGS, false, true); + MAX_SKB_FRAGS, false); if (err < 0) goto fallback_to_reg_recv;
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit afbb1169ed5b58cfca017e368b53e019cf285853 ]
Commit 52cf93e63ee6 ("HID: i2c-hid: Don't reset device upon system resume") removes the need for the RESEND_REPORT_DESCR quirk for Raydium devices, but kept it for the SIS device id 10FB touchscreens, as the author of that commit could not determine if the quirk is still necessary there.
I've tested suspend/resume on a Toshiba Click Mini L9W-B which is the device for which this quirk was added in the first place and with the "Don't reset device upon system resume" fix the quirk is no longer necessary, so this commit removes it.
Note even better I also had some other devices with SIS touchscreens which suspend/resume issues, where the RESEND_REPORT_DESCR quirk did not help.
I've also tested these devices with the "Don't reset device upon system resume" fix and I'm happy to report that that fix also fixes touchscreen resume on the following devices:
Asus T100HA Asus T200TA Peaq C1010
Cc: Kai-Heng Feng kai.heng.feng@canonical.com Acked-by: Benjamin Tissoires benjamin.tissoires@redhat.com Signed-off-by: Hans de Goede hdegoede@redhat.com Signed-off-by: Jiri Kosina jkosina@suse.cz Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/hid/hid-ids.h | 1 - drivers/hid/i2c-hid/i2c-hid.c | 18 +++--------------- 2 files changed, 3 insertions(+), 16 deletions(-)
diff --git a/drivers/hid/hid-ids.h b/drivers/hid/hid-ids.h index eee6b79fb131..ae5b72269e27 100644 --- a/drivers/hid/hid-ids.h +++ b/drivers/hid/hid-ids.h @@ -974,7 +974,6 @@ #define USB_DEVICE_ID_SIS817_TOUCH 0x0817 #define USB_DEVICE_ID_SIS_TS 0x1013 #define USB_DEVICE_ID_SIS1030_TOUCH 0x1030 -#define USB_DEVICE_ID_SIS10FB_TOUCH 0x10fb
#define USB_VENDOR_ID_SKYCABLE 0x1223 #define USB_DEVICE_ID_SKYCABLE_WIRELESS_PRESENTER 0x3F07 diff --git a/drivers/hid/i2c-hid/i2c-hid.c b/drivers/hid/i2c-hid/i2c-hid.c index 37013b58098c..d17cf6e323b2 100644 --- a/drivers/hid/i2c-hid/i2c-hid.c +++ b/drivers/hid/i2c-hid/i2c-hid.c @@ -47,8 +47,7 @@ /* quirks to control the device */ #define I2C_HID_QUIRK_SET_PWR_WAKEUP_DEV BIT(0) #define I2C_HID_QUIRK_NO_IRQ_AFTER_RESET BIT(1) -#define I2C_HID_QUIRK_RESEND_REPORT_DESCR BIT(2) -#define I2C_HID_QUIRK_NO_RUNTIME_PM BIT(3) +#define I2C_HID_QUIRK_NO_RUNTIME_PM BIT(2)
/* flags */ #define I2C_HID_STARTED 0 @@ -172,8 +171,6 @@ static const struct i2c_hid_quirks { { I2C_VENDOR_ID_HANTICK, I2C_PRODUCT_ID_HANTICK_5288, I2C_HID_QUIRK_NO_IRQ_AFTER_RESET | I2C_HID_QUIRK_NO_RUNTIME_PM }, - { USB_VENDOR_ID_SIS_TOUCH, USB_DEVICE_ID_SIS10FB_TOUCH, - I2C_HID_QUIRK_RESEND_REPORT_DESCR }, { 0, 0 } };
@@ -1241,22 +1238,13 @@ static int i2c_hid_resume(struct device *dev)
/* Instead of resetting device, simply powers the device on. This * solves "incomplete reports" on Raydium devices 2386:3118 and - * 2386:4B33 + * 2386:4B33 and fixes various SIS touchscreens no longer sending + * data after a suspend/resume. */ ret = i2c_hid_set_power(client, I2C_HID_PWR_ON); if (ret) return ret;
- /* Some devices need to re-send report descr cmd - * after resume, after this it will be back normal. - * otherwise it issues too many incomplete reports. - */ - if (ihid->quirks & I2C_HID_QUIRK_RESEND_REPORT_DESCR) { - ret = i2c_hid_command(client, &hid_report_descr_cmd, NULL, 0); - if (ret) - return ret; - } - if (hid->driver && hid->driver->reset_resume) { ret = hid->driver->reset_resume(hid); return ret;
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 04f264d3a8b0eb25d378127bd78c3c9a0261c828 ]
We have a need to override the definition of barrier_before_unreachable() for MIPS, which means we either need to add architecture-specific code into linux/compiler-gcc.h or we need to allow the architecture to provide a header that can define the macro before the generic definition. The latter seems like the better approach.
A straightforward approach to the per-arch header is to make use of asm-generic to provide a default empty header & adjust architectures which don't need anything specific to make use of that by adding the header to generic-y. Unfortunately this doesn't work so well due to commit 28128c61e08e ("kconfig.h: Include compiler types to avoid missed struct attributes") which caused linux/compiler_types.h to be included in the compilation of every C file via the -include linux/kconfig.h flag in c_flags.
Because the -include flag is present for all C files we compile, we need the architecture-provided header to be present before any C files are compiled. If any C files can be compiled prior to the asm-generic header wrappers being generated then we hit a build failure due to missing header. Such cases do exist - one pointed out by the kbuild test robot is the compilation of arch/ia64/kernel/nr-irqs.c, which occurs as part of the archprepare target [1].
This leaves us with a few options:
1) Use generic-y & fix any build failures we find by enforcing ordering such that the asm-generic target occurs before any C compilation, such that linux/compiler_types.h can always include the generated asm-generic wrapper which in turn includes the empty asm-generic header. This would rely on us finding all the problematic cases - I don't know for sure that the ia64 issue is the only one.
2) Add an actual empty header to each architecture, so that we don't need the generated asm-generic wrapper. This seems messy.
3) Give up & add #ifdef CONFIG_MIPS or similar to linux/compiler_types.h. This seems messy too.
4) Include the arch header only when it's actually needed, removing the need for the asm-generic wrapper for all other architectures.
This patch allows us to use approach 4, by including an asm/compiler.h header from linux/compiler_types.h after the inclusion of the compiler-specific linux/compiler-*.h header(s). We do this conditionally, only when CONFIG_HAVE_ARCH_COMPILER_H is selected, in order to avoid the need for asm-generic wrappers & the associated build ordering issue described above. The asm/compiler.h header is included after the generic linux/compiler-*.h header(s) for consistency with the way linux/compiler-intel.h & linux/compiler-clang.h are included after the linux/compiler-gcc.h header that they override.
[1] https://lists.01.org/pipermail/kbuild-all/2018-August/051175.html
Signed-off-by: Paul Burton paul.burton@mips.com Reviewed-by: Masahiro Yamada yamada.masahiro@socionext.com Patchwork: https://patchwork.linux-mips.org/patch/20269/ Cc: Arnd Bergmann arnd@arndb.de Cc: James Hogan jhogan@kernel.org Cc: Masahiro Yamada yamada.masahiro@socionext.com Cc: Ralf Baechle ralf@linux-mips.org Cc: linux-arch@vger.kernel.org Cc: linux-kbuild@vger.kernel.org Cc: linux-mips@linux-mips.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/Kconfig | 8 ++++++++ include/linux/compiler_types.h | 12 ++++++++++++ 2 files changed, 20 insertions(+)
diff --git a/arch/Kconfig b/arch/Kconfig index f03b72644902..a18371a36e03 100644 --- a/arch/Kconfig +++ b/arch/Kconfig @@ -977,4 +977,12 @@ config REFCOUNT_FULL against various use-after-free conditions that can be used in security flaw exploits.
+config HAVE_ARCH_COMPILER_H + bool + help + An architecture can select this if it provides an + asm/compiler.h header that should be included after + linux/compiler-*.h in order to override macro definitions that those + headers generally provide. + source "kernel/gcov/Kconfig" diff --git a/include/linux/compiler_types.h b/include/linux/compiler_types.h index a8ba6b04152c..55e4be8b016b 100644 --- a/include/linux/compiler_types.h +++ b/include/linux/compiler_types.h @@ -78,6 +78,18 @@ extern void __chk_io_ptr(const volatile void __iomem *); #include <linux/compiler-clang.h> #endif
+/* + * Some architectures need to provide custom definitions of macros provided + * by linux/compiler-*.h, and can do so using asm/compiler.h. We include that + * conditionally rather than using an asm-generic wrapper in order to avoid + * build failures if any C compilation, which will include this file via an + * -include argument in c_flags, occurs prior to the asm-generic wrappers being + * generated. + */ +#ifdef CONFIG_HAVE_ARCH_COMPILER_H +#include <asm/compiler.h> +#endif + /* * Generic compiler-dependent macros required for kernel * build go below this comment. Actual compiler/compiler version
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit eea96566c189c77e5272585984eb2729881a2f1d ]
The maximum CPU frequency for the i.MX53 QSB is 1GHz, so disable the 1.2GHz OPP. This makes the board work again with configs that have cpufreq enabled like imx_v6_v7_defconfig on which the board stopped working with the addition of cpufreq-dt support.
Fixes: 791f416608 ("ARM: dts: imx53: add cpufreq-dt support")
Signed-off-by: Sascha Hauer s.hauer@pengutronix.de Signed-off-by: Shawn Guo shawnguo@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm/boot/dts/imx53-qsb-common.dtsi | 11 +++++++++++ 1 file changed, 11 insertions(+)
diff --git a/arch/arm/boot/dts/imx53-qsb-common.dtsi b/arch/arm/boot/dts/imx53-qsb-common.dtsi index ef7658a78836..c1548adee789 100644 --- a/arch/arm/boot/dts/imx53-qsb-common.dtsi +++ b/arch/arm/boot/dts/imx53-qsb-common.dtsi @@ -123,6 +123,17 @@ }; };
+&cpu0 { + /* CPU rated to 1GHz, not 1.2GHz as per the default settings */ + operating-points = < + /* kHz uV */ + 166666 850000 + 400000 900000 + 800000 1050000 + 1000000 1200000 + >; +}; + &esdhc1 { pinctrl-names = "default"; pinctrl-0 = <&pinctrl_esdhc1>;
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 05a2f54679861deb188750ba2a70187000b2c71f ]
When building in ClearLinux using 'make PYTHON=python3' with gcc 8.2.1 it fails with:
GEN /tmp/build/perf/python/perf.so In file included from /usr/include/python3.7m/Python.h:126, from /git/linux/tools/perf/util/python.c:2: /usr/include/python3.7m/import.h:58:24: error: redundant redeclaration of ‘_PyImport_AddModuleObject’ [-Werror=redundant-decls] PyAPI_FUNC(PyObject *) _PyImport_AddModuleObject(PyObject *, PyObject *); ^~~~~~~~~~~~~~~~~~~~~~~~~ /usr/include/python3.7m/import.h:47:24: note: previous declaration of ‘_PyImport_AddModuleObject’ was here PyAPI_FUNC(PyObject *) _PyImport_AddModuleObject(PyObject *name, ^~~~~~~~~~~~~~~~~~~~~~~~~ cc1: all warnings being treated as errors error: command 'gcc' failed with exit status 1
And indeed there is a redundant declaration in that Python.h file, one with parameter names and the other without, so just add -Wno-error=redundant-decls to the python setup instructions.
Now perf builds with gcc in ClearLinux with the following Dockerfile:
# docker.io/acmel/linux-perf-tools-build-clearlinux:latest FROM docker.io/clearlinux:latest MAINTAINER Arnaldo Carvalho de Melo acme@kernel.org RUN swupd update && \ swupd bundle-add sysadmin-basic-dev RUN mkdir -m 777 -p /git /tmp/build/perf /tmp/build/objtool /tmp/build/linux && \ groupadd -r perfbuilder && \ useradd -m -r -g perfbuilder perfbuilder && \ chown -R perfbuilder.perfbuilder /tmp/build/ /git/ USER perfbuilder COPY rx_and_build.sh / ENV EXTRA_MAKE_ARGS=PYTHON=python3 ENTRYPOINT ["/rx_and_build.sh"]
Now to figure out why the build fails with clang, that is present in the above container as detected by the rx_and_build.sh script:
clang version 6.0.1 (tags/RELEASE_601/final) Target: x86_64-unknown-linux-gnu Thread model: posix InstalledDir: /usr/sbin make: Entering directory '/git/linux/tools/perf' BUILD: Doing 'make -j4' parallel build HOSTCC /tmp/build/perf/fixdep.o HOSTLD /tmp/build/perf/fixdep-in.o LINK /tmp/build/perf/fixdep
Auto-detecting system features: ... dwarf: [ OFF ] ... dwarf_getlocations: [ OFF ] ... glibc: [ OFF ] ... gtk2: [ OFF ] ... libaudit: [ OFF ] ... libbfd: [ OFF ] ... libelf: [ OFF ] ... libnuma: [ OFF ] ... numa_num_possible_cpus: [ OFF ] ... libperl: [ OFF ] ... libpython: [ OFF ] ... libslang: [ OFF ] ... libcrypto: [ OFF ] ... libunwind: [ OFF ] ... libdw-dwarf-unwind: [ OFF ] ... zlib: [ OFF ] ... lzma: [ OFF ] ... get_cpuid: [ OFF ] ... bpf: [ OFF ]
Makefile.config:331: *** No gnu/libc-version.h found, please install glibc-dev[el]. Stop. make[1]: *** [Makefile.perf:206: sub-make] Error 2 make: *** [Makefile:70: all] Error 2 make: Leaving directory '/git/linux/tools/perf'
Cc: Adrian Hunter adrian.hunter@intel.com Cc: David Ahern dsahern@gmail.com Cc: Jiri Olsa jolsa@kernel.org Cc: Namhyung Kim namhyung@kernel.org Cc: Thiago Macieira thiago.macieira@intel.com Cc: Wang Nan wangnan0@huawei.com Link: https://lkml.kernel.org/n/tip-c3khb9ac86s00qxzjrueomme@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/perf/util/setup.py | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/perf/util/setup.py b/tools/perf/util/setup.py index 001be4f9d3b9..a5f9e236cc71 100644 --- a/tools/perf/util/setup.py +++ b/tools/perf/util/setup.py @@ -27,7 +27,7 @@ class install_lib(_install_lib):
cflags = getenv('CFLAGS', '').split() # switch off several checks (need to be at the end of cflags list) -cflags += ['-fno-strict-aliasing', '-Wno-write-strings', '-Wno-unused-parameter' ] +cflags += ['-fno-strict-aliasing', '-Wno-write-strings', '-Wno-unused-parameter', '-Wno-redundant-decls' ] if cc != "clang": cflags += ['-Wno-cast-function-type' ]
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 7a8a8fcf7b860e4b2d4edc787c844d41cad9dfcf ]
Only use the mapped IP to find inline frames, but keep using the unmapped IP for the callchain cursor. This ensures we properly show the unmapped IP when displaying a frame we received via the dso__parse_addr_inlines API for a module which does not contain sufficient debug symbols to show the srcline.
This is another follow-up to commit 19610184693c ("perf script: Show virtual addresses instead of offsets").
Signed-off-by: Milian Wolff milian.wolff@kdab.com Acked-by: Jiri Olsa jolsa@kernel.org Tested-by: Ravi Bangoria ravi.bangoria@linux.ibm.com Tested-by: Arnaldo Carvalho de Melo acme@redhat.com Cc: Jin Yao yao.jin@linux.intel.com Cc: Namhyung Kim namhyung@kernel.org Cc: Sandipan Das sandipan@linux.ibm.com Fixes: 19610184693c ("perf script: Show virtual addresses instead of offsets") Link: http://lkml.kernel.org/r/20180926135207.30263-2-milian.wolff@kdab.com Link: http://lkml.kernel.org/r/20181002073949.3297-1-milian.wolff@kdab.com [ Squashed a fix from Milian for a problem reported by Ravi, fixed up space damage ] Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/perf/util/machine.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c index d49744dc46d7..b70cce40ca97 100644 --- a/tools/perf/util/machine.c +++ b/tools/perf/util/machine.c @@ -2246,7 +2246,8 @@ static int append_inlines(struct callchain_cursor *cursor, if (!symbol_conf.inline_name || !map || !sym) return ret;
- addr = map__rip_2objdump(map, ip); + addr = map__map_ip(map, ip); + addr = map__rip_2objdump(map, addr);
inline_node = inlines__tree_find(&map->dso->inlined_nodes, addr); if (!inline_node) {
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit c479d5f2c2e1ce609da08c075054440d97ddff52 ]
We should only call the function to end a call's Tx phase if we rotated the marked-last packet out of the transmission buffer.
Make rxrpc_rotate_tx_window() return an indication of whether it just rotated the packet marked as the last out of the transmit buffer, carrying the information out of the locked section in that function.
We can then check the return value instead of examining RXRPC_CALL_TX_LAST.
Fixes: 70790dbe3f66 ("rxrpc: Pass the last Tx packet marker in the annotation buffer") Signed-off-by: David Howells dhowells@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/rxrpc/input.c | 35 +++++++++++++++++++---------------- 1 file changed, 19 insertions(+), 16 deletions(-)
--- a/net/rxrpc/input.c +++ b/net/rxrpc/input.c @@ -216,10 +216,11 @@ static void rxrpc_send_ping(struct rxrpc /* * Apply a hard ACK by advancing the Tx window. */ -static void rxrpc_rotate_tx_window(struct rxrpc_call *call, rxrpc_seq_t to, +static bool rxrpc_rotate_tx_window(struct rxrpc_call *call, rxrpc_seq_t to, struct rxrpc_ack_summary *summary) { struct sk_buff *skb, *list = NULL; + bool rot_last = false; int ix; u8 annotation;
@@ -243,15 +244,17 @@ static void rxrpc_rotate_tx_window(struc skb->next = list; list = skb;
- if (annotation & RXRPC_TX_ANNO_LAST) + if (annotation & RXRPC_TX_ANNO_LAST) { set_bit(RXRPC_CALL_TX_LAST, &call->flags); + rot_last = true; + } if ((annotation & RXRPC_TX_ANNO_MASK) != RXRPC_TX_ANNO_ACK) summary->nr_rot_new_acks++; }
spin_unlock(&call->lock);
- trace_rxrpc_transmit(call, (test_bit(RXRPC_CALL_TX_LAST, &call->flags) ? + trace_rxrpc_transmit(call, (rot_last ? rxrpc_transmit_rotate_last : rxrpc_transmit_rotate)); wake_up(&call->waitq); @@ -262,6 +265,8 @@ static void rxrpc_rotate_tx_window(struc skb->next = NULL; rxrpc_free_skb(skb, rxrpc_skb_tx_freed); } + + return rot_last; }
/* @@ -332,11 +337,11 @@ static bool rxrpc_receiving_reply(struct trace_rxrpc_timer(call, rxrpc_timer_init_for_reply, now); }
- if (!test_bit(RXRPC_CALL_TX_LAST, &call->flags)) - rxrpc_rotate_tx_window(call, top, &summary); if (!test_bit(RXRPC_CALL_TX_LAST, &call->flags)) { - rxrpc_proto_abort("TXL", call, top); - return false; + if (!rxrpc_rotate_tx_window(call, top, &summary)) { + rxrpc_proto_abort("TXL", call, top); + return false; + } } if (!rxrpc_end_tx_phase(call, true, "ETD")) return false; @@ -891,8 +896,12 @@ static void rxrpc_input_ack(struct rxrpc if (nr_acks > call->tx_top - hard_ack) return rxrpc_proto_abort("AKN", call, 0);
- if (after(hard_ack, call->tx_hard_ack)) - rxrpc_rotate_tx_window(call, hard_ack, &summary); + if (after(hard_ack, call->tx_hard_ack)) { + if (rxrpc_rotate_tx_window(call, hard_ack, &summary)) { + rxrpc_end_tx_phase(call, false, "ETA"); + return; + } + }
if (nr_acks > 0) { if (skb_copy_bits(skb, offset, buf.acks, nr_acks) < 0) @@ -901,11 +910,6 @@ static void rxrpc_input_ack(struct rxrpc &summary); }
- if (test_bit(RXRPC_CALL_TX_LAST, &call->flags)) { - rxrpc_end_tx_phase(call, false, "ETA"); - return; - } - if (call->rxtx_annotations[call->tx_top & RXRPC_RXTX_BUFF_MASK] & RXRPC_TX_ANNO_LAST && summary.nr_acks == call->tx_top - hard_ack && @@ -927,8 +931,7 @@ static void rxrpc_input_ackall(struct rx
_proto("Rx ACKALL %%%u", sp->hdr.serial);
- rxrpc_rotate_tx_window(call, call->tx_top, &summary); - if (test_bit(RXRPC_CALL_TX_LAST, &call->flags)) + if (rxrpc_rotate_tx_window(call, call->tx_top, &summary)) rxrpc_end_tx_phase(call, false, "ETL"); }
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit dfe995224693798e554ab4770f6d8a096afc60cd ]
Carry the call state out of the locked section in rxrpc_rotate_tx_window() rather than sampling it afterwards. This is only used to select tracepoint data, but could have changed by the time we do the tracepoint.
Signed-off-by: David Howells dhowells@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/rxrpc/input.c | 14 ++++++++------ 1 file changed, 8 insertions(+), 6 deletions(-)
diff --git a/net/rxrpc/input.c b/net/rxrpc/input.c index 5e180a3c2d01..fe1cf206d12a 100644 --- a/net/rxrpc/input.c +++ b/net/rxrpc/input.c @@ -278,23 +278,26 @@ static bool rxrpc_rotate_tx_window(struct rxrpc_call *call, rxrpc_seq_t to, static bool rxrpc_end_tx_phase(struct rxrpc_call *call, bool reply_begun, const char *abort_why) { + unsigned int state;
ASSERT(test_bit(RXRPC_CALL_TX_LAST, &call->flags));
write_lock(&call->state_lock);
- switch (call->state) { + state = call->state; + switch (state) { case RXRPC_CALL_CLIENT_SEND_REQUEST: case RXRPC_CALL_CLIENT_AWAIT_REPLY: if (reply_begun) - call->state = RXRPC_CALL_CLIENT_RECV_REPLY; + call->state = state = RXRPC_CALL_CLIENT_RECV_REPLY; else - call->state = RXRPC_CALL_CLIENT_AWAIT_REPLY; + call->state = state = RXRPC_CALL_CLIENT_AWAIT_REPLY; break;
case RXRPC_CALL_SERVER_AWAIT_ACK: __rxrpc_call_completed(call); rxrpc_notify_socket(call); + state = call->state; break;
default: @@ -302,11 +305,10 @@ static bool rxrpc_end_tx_phase(struct rxrpc_call *call, bool reply_begun, }
write_unlock(&call->state_lock); - if (call->state == RXRPC_CALL_CLIENT_AWAIT_REPLY) { + if (state == RXRPC_CALL_CLIENT_AWAIT_REPLY) trace_rxrpc_transmit(call, rxrpc_transmit_await_reply); - } else { + else trace_rxrpc_transmit(call, rxrpc_transmit_end); - } _leave(" = ok"); return true;
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 298bc15b2079c324e82d0a6fda39c3d762af7282 ]
Move the out-of-order and duplicate ACK packet check to before the call to rxrpc_input_ackinfo() so that the receive window size and MTU size are only checked in the latest ACK packet and don't regress.
Fixes: 248f219cb8bc ("rxrpc: Rewrite the data and ack handling code") Signed-off-by: David Howells dhowells@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/rxrpc/input.c | 19 ++++++++++--------- 1 file changed, 10 insertions(+), 9 deletions(-)
--- a/net/rxrpc/input.c +++ b/net/rxrpc/input.c @@ -862,6 +862,16 @@ static void rxrpc_input_ack(struct rxrpc rxrpc_propose_ack_respond_to_ack); }
+ /* Discard any out-of-order or duplicate ACKs. */ + if (before_eq(sp->hdr.serial, call->acks_latest)) { + _debug("discard ACK %d <= %d", + sp->hdr.serial, call->acks_latest); + return; + } + call->acks_latest_ts = skb->tstamp; + call->acks_latest = sp->hdr.serial; + + /* Parse rwind and mtu sizes if provided. */ ioffset = offset + nr_acks + 3; if (skb->len >= ioffset + sizeof(buf.info)) { if (skb_copy_bits(skb, ioffset, &buf.info, sizeof(buf.info)) < 0) @@ -883,15 +893,6 @@ static void rxrpc_input_ack(struct rxrpc return; }
- /* Discard any out-of-order or duplicate ACKs. */ - if (before_eq(sp->hdr.serial, call->acks_latest)) { - _debug("discard ACK %d <= %d", - sp->hdr.serial, call->acks_latest); - return; - } - call->acks_latest_ts = skb->tstamp; - call->acks_latest = sp->hdr.serial; - if (before(hard_ack, call->tx_hard_ack) || after(hard_ack, call->tx_top)) return rxrpc_proto_abort("AKW", call, 0);
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 647530924f47c93db472ee3cf43b7ef1425581b6 ]
Fix connection-level abort handling to cache the abort and error codes properly so that a new incoming call can be properly aborted if it races with the parent connection being aborted by another CPU.
The abort_code and error parameters can then be dropped from rxrpc_abort_calls().
Fixes: f5c17aaeb2ae ("rxrpc: Calls should only have one terminal state") Signed-off-by: David Howells dhowells@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/rxrpc/ar-internal.h | 4 ++-- net/rxrpc/call_accept.c | 4 ++-- net/rxrpc/conn_event.c | 26 +++++++++++++++----------- 3 files changed, 19 insertions(+), 15 deletions(-)
--- a/net/rxrpc/ar-internal.h +++ b/net/rxrpc/ar-internal.h @@ -446,8 +446,7 @@ struct rxrpc_connection { spinlock_t state_lock; /* state-change lock */ enum rxrpc_conn_cache_state cache_state; enum rxrpc_conn_proto_state state; /* current state of connection */ - u32 local_abort; /* local abort code */ - u32 remote_abort; /* remote abort code */ + u32 abort_code; /* Abort code of connection abort */ int debug_id; /* debug ID for printks */ atomic_t serial; /* packet serial number counter */ unsigned int hi_serial; /* highest serial number received */ @@ -457,6 +456,7 @@ struct rxrpc_connection { u8 security_size; /* security header size */ u8 security_ix; /* security type */ u8 out_clientflag; /* RXRPC_CLIENT_INITIATED if we are client */ + short error; /* Local error code */ };
static inline bool rxrpc_to_server(const struct rxrpc_skb_priv *sp) --- a/net/rxrpc/call_accept.c +++ b/net/rxrpc/call_accept.c @@ -422,11 +422,11 @@ found_service:
case RXRPC_CONN_REMOTELY_ABORTED: rxrpc_set_call_completion(call, RXRPC_CALL_REMOTELY_ABORTED, - conn->remote_abort, -ECONNABORTED); + conn->abort_code, conn->error); break; case RXRPC_CONN_LOCALLY_ABORTED: rxrpc_abort_call("CON", call, sp->hdr.seq, - conn->local_abort, -ECONNABORTED); + conn->abort_code, conn->error); break; default: BUG(); --- a/net/rxrpc/conn_event.c +++ b/net/rxrpc/conn_event.c @@ -126,7 +126,7 @@ static void rxrpc_conn_retransmit_call(s
switch (chan->last_type) { case RXRPC_PACKET_TYPE_ABORT: - _proto("Tx ABORT %%%u { %d } [re]", serial, conn->local_abort); + _proto("Tx ABORT %%%u { %d } [re]", serial, conn->abort_code); break; case RXRPC_PACKET_TYPE_ACK: trace_rxrpc_tx_ack(NULL, serial, chan->last_seq, 0, @@ -148,13 +148,12 @@ static void rxrpc_conn_retransmit_call(s * pass a connection-level abort onto all calls on that connection */ static void rxrpc_abort_calls(struct rxrpc_connection *conn, - enum rxrpc_call_completion compl, - u32 abort_code, int error) + enum rxrpc_call_completion compl) { struct rxrpc_call *call; int i;
- _enter("{%d},%x", conn->debug_id, abort_code); + _enter("{%d},%x", conn->debug_id, conn->abort_code);
spin_lock(&conn->channel_lock);
@@ -167,9 +166,11 @@ static void rxrpc_abort_calls(struct rxr trace_rxrpc_abort(call->debug_id, "CON", call->cid, call->call_id, 0, - abort_code, error); + conn->abort_code, + conn->error); if (rxrpc_set_call_completion(call, compl, - abort_code, error)) + conn->abort_code, + conn->error)) rxrpc_notify_socket(call); } } @@ -202,10 +203,12 @@ static int rxrpc_abort_connection(struct return 0; }
+ conn->error = error; + conn->abort_code = abort_code; conn->state = RXRPC_CONN_LOCALLY_ABORTED; spin_unlock_bh(&conn->state_lock);
- rxrpc_abort_calls(conn, RXRPC_CALL_LOCALLY_ABORTED, abort_code, error); + rxrpc_abort_calls(conn, RXRPC_CALL_LOCALLY_ABORTED);
msg.msg_name = &conn->params.peer->srx.transport; msg.msg_namelen = conn->params.peer->srx.transport_len; @@ -224,7 +227,7 @@ static int rxrpc_abort_connection(struct whdr._rsvd = 0; whdr.serviceId = htons(conn->service_id);
- word = htonl(conn->local_abort); + word = htonl(conn->abort_code);
iov[0].iov_base = &whdr; iov[0].iov_len = sizeof(whdr); @@ -235,7 +238,7 @@ static int rxrpc_abort_connection(struct
serial = atomic_inc_return(&conn->serial); whdr.serial = htonl(serial); - _proto("Tx CONN ABORT %%%u { %d }", serial, conn->local_abort); + _proto("Tx CONN ABORT %%%u { %d }", serial, conn->abort_code);
ret = kernel_sendmsg(conn->params.local->socket, &msg, iov, 2, len); if (ret < 0) { @@ -308,9 +311,10 @@ static int rxrpc_process_event(struct rx abort_code = ntohl(wtmp); _proto("Rx ABORT %%%u { ac=%d }", sp->hdr.serial, abort_code);
+ conn->error = -ECONNABORTED; + conn->abort_code = abort_code; conn->state = RXRPC_CONN_REMOTELY_ABORTED; - rxrpc_abort_calls(conn, RXRPC_CALL_REMOTELY_ABORTED, - abort_code, -ECONNABORTED); + rxrpc_abort_calls(conn, RXRPC_CALL_REMOTELY_ABORTED); return -ECONNABORTED;
case RXRPC_PACKET_TYPE_CHALLENGE:
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 853c110982eaff0d99dace3f66f1ba58b5bfd9d5 ]
SEV requires access to the AMD cryptographic device APIs, and this does not work when KVM is builtin and the crypto driver is a module. Actually the Kconfig conditions for CONFIG_KVM_AMD_SEV try to disable SEV in that case, but it does not work because the actual crypto calls are not culled, only sev_hardware_setup() is.
This patch adds two CONFIG_KVM_AMD_SEV checks that gate all the remaining SEV code; it fixes this particular configuration, and drops 5 KiB of code when CONFIG_KVM_AMD_SEV=n.
Reported-by: Guenter Roeck linux@roeck-us.net Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/kvm/svm.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c index ef772e5634d4..3e59a187fe30 100644 --- a/arch/x86/kvm/svm.c +++ b/arch/x86/kvm/svm.c @@ -436,14 +436,18 @@ static inline struct kvm_svm *to_kvm_svm(struct kvm *kvm)
static inline bool svm_sev_enabled(void) { - return max_sev_asid; + return IS_ENABLED(CONFIG_KVM_AMD_SEV) ? max_sev_asid : 0; }
static inline bool sev_guest(struct kvm *kvm) { +#ifdef CONFIG_KVM_AMD_SEV struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
return sev->active; +#else + return false; +#endif }
static inline int sev_get_asid(struct kvm *kvm)
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit d79c3888bde6581da7ff9f9d6f581900ecb5e632 ]
Memory mapped with devm_ioremap is automatically freed when the driver is disconnected from the device. Therefore there is no need to explicitly call devm_iounmap.
Fixes: 0857d92f71b6 ("net: ena: add missing unmap bars on device removal") Fixes: 411838e7b41c ("net: ena: fix rare kernel crash when bar memory remap fails") Signed-off-by: Arthur Kiyanovski akiyano@amazon.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/amazon/ena/ena_netdev.c | 9 +-------- 1 file changed, 1 insertion(+), 8 deletions(-)
diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.c b/drivers/net/ethernet/amazon/ena/ena_netdev.c index 1b01cd2820ba..7093b661c50e 100644 --- a/drivers/net/ethernet/amazon/ena/ena_netdev.c +++ b/drivers/net/ethernet/amazon/ena/ena_netdev.c @@ -3128,15 +3128,8 @@ err_rss_init:
static void ena_release_bars(struct ena_com_dev *ena_dev, struct pci_dev *pdev) { - int release_bars; + int release_bars = pci_select_bars(pdev, IORESOURCE_MEM) & ENA_BAR_MASK;
- if (ena_dev->mem_bar) - devm_iounmap(&pdev->dev, ena_dev->mem_bar); - - if (ena_dev->reg_bar) - devm_iounmap(&pdev->dev, ena_dev->reg_bar); - - release_bars = pci_select_bars(pdev, IORESOURCE_MEM) & ENA_BAR_MASK; pci_release_selected_regions(pdev, release_bars); }
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit d7703ddbd7c9cb1ab7c08e1b85b314ff8cea38e9 ]
In a rare scenario when ena_device_restore() fails, followed by device remove, an FLR will not be issued. In this case, the device will keep sending asynchronous AENQ keep-alive events, even after driver removal, leading to memory corruption.
Fixes: 8c5c7abdeb2d ("net: ena: add power management ops to the ENA driver") Signed-off-by: Arthur Kiyanovski akiyano@amazon.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/amazon/ena/ena_netdev.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.c b/drivers/net/ethernet/amazon/ena/ena_netdev.c index 7093b661c50e..72dbdebf4b5d 100644 --- a/drivers/net/ethernet/amazon/ena/ena_netdev.c +++ b/drivers/net/ethernet/amazon/ena/ena_netdev.c @@ -2648,7 +2648,11 @@ err_disable_msix: ena_free_mgmnt_irq(adapter); ena_disable_msix(adapter); err_device_destroy: + ena_com_abort_admin_commands(ena_dev); + ena_com_wait_for_abort_completion(ena_dev); ena_com_admin_destroy(ena_dev); + ena_com_mmio_reg_read_request_destroy(ena_dev); + ena_com_dev_reset(ena_dev, ENA_REGS_RESET_DRIVER_INVALID_STATE); err: clear_bit(ENA_FLAG_DEVICE_RUNNING, &adapter->flags); clear_bit(ENA_FLAG_ONGOING_RESET, &adapter->flags);
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 78a55d05def95144ca5fa9a64c49b2a0636a9866 ]
napi poll functions should be initialized before running request_irq(), to handle a rare condition where there is a pending interrupt, causing the ISR to fire immediately while the poll function wasn't set yet, causing a NULL dereference.
Fixes: 1738cd3ed342 ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)") Signed-off-by: Arthur Kiyanovski akiyano@amazon.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/amazon/ena/ena_netdev.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.c b/drivers/net/ethernet/amazon/ena/ena_netdev.c index 72dbdebf4b5d..000f0d42a710 100644 --- a/drivers/net/ethernet/amazon/ena/ena_netdev.c +++ b/drivers/net/ethernet/amazon/ena/ena_netdev.c @@ -1580,8 +1580,6 @@ static int ena_up_complete(struct ena_adapter *adapter) if (rc) return rc;
- ena_init_napi(adapter); - ena_change_mtu(adapter->netdev, adapter->netdev->mtu);
ena_refill_all_rx_bufs(adapter); @@ -1735,6 +1733,13 @@ static int ena_up(struct ena_adapter *adapter)
ena_setup_io_intr(adapter);
+ /* napi poll functions should be initialized before running + * request_irq(), to handle a rare condition where there is a pending + * interrupt, causing the ISR to fire immediately while the poll + * function wasn't set yet, causing a null dereference + */ + ena_init_napi(adapter); + rc = ena_request_io_irq(adapter); if (rc) goto err_req_irq;
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 3e779a2e7f909015f21428b66834127496110b6d ]
gpiochip_set_cascaded_irqchip() is passed 'parent_irq' as an argument and then the address of that argument is assigned to the gpio chips gpio_irq_chip 'parents' pointer shortly thereafter. This can't ever work, because we've just assigned some stack address to a pointer that we plan to dereference later in gpiochip_irq_map(). I ran into this issue with the KASAN report below when gpiochip_irq_map() tried to setup the parent irq with a total junk pointer for the 'parents' array.
BUG: KASAN: stack-out-of-bounds in gpiochip_irq_map+0x228/0x248 Read of size 4 at addr ffffffc0dde472e0 by task swapper/0/1
CPU: 7 PID: 1 Comm: swapper/0 Not tainted 4.14.72 #34 Call trace: [<ffffff9008093638>] dump_backtrace+0x0/0x718 [<ffffff9008093da4>] show_stack+0x20/0x2c [<ffffff90096b9224>] __dump_stack+0x20/0x28 [<ffffff90096b91c8>] dump_stack+0x80/0xbc [<ffffff900845a350>] print_address_description+0x70/0x238 [<ffffff900845a8e4>] kasan_report+0x1cc/0x260 [<ffffff900845aa14>] __asan_report_load4_noabort+0x2c/0x38 [<ffffff900897e098>] gpiochip_irq_map+0x228/0x248 [<ffffff900820cc08>] irq_domain_associate+0x114/0x2ec [<ffffff900820d13c>] irq_create_mapping+0x120/0x234 [<ffffff900820da78>] irq_create_fwspec_mapping+0x4c8/0x88c [<ffffff900820e2d8>] irq_create_of_mapping+0x180/0x210 [<ffffff900917114c>] of_irq_get+0x138/0x198 [<ffffff9008dc70ac>] spi_drv_probe+0x94/0x178 [<ffffff9008ca5168>] driver_probe_device+0x51c/0x824 [<ffffff9008ca6538>] __device_attach_driver+0x148/0x20c [<ffffff9008ca14cc>] bus_for_each_drv+0x120/0x188 [<ffffff9008ca570c>] __device_attach+0x19c/0x2dc [<ffffff9008ca586c>] device_initial_probe+0x20/0x2c [<ffffff9008ca18bc>] bus_probe_device+0x80/0x154 [<ffffff9008c9b9b4>] device_add+0x9b8/0xbdc [<ffffff9008dc7640>] spi_add_device+0x1b8/0x380 [<ffffff9008dcbaf0>] spi_register_controller+0x111c/0x1378 [<ffffff9008dd6b10>] spi_geni_probe+0x4dc/0x6f8 [<ffffff9008cab058>] platform_drv_probe+0xdc/0x130 [<ffffff9008ca5168>] driver_probe_device+0x51c/0x824 [<ffffff9008ca59cc>] __driver_attach+0x100/0x194 [<ffffff9008ca0ea8>] bus_for_each_dev+0x104/0x16c [<ffffff9008ca58c0>] driver_attach+0x48/0x54 [<ffffff9008ca1edc>] bus_add_driver+0x274/0x498 [<ffffff9008ca8448>] driver_register+0x1ac/0x230 [<ffffff9008caaf6c>] __platform_driver_register+0xcc/0xdc [<ffffff9009c4b33c>] spi_geni_driver_init+0x1c/0x24 [<ffffff9008084cb8>] do_one_initcall+0x240/0x3dc [<ffffff9009c017d0>] kernel_init_freeable+0x378/0x468 [<ffffff90096e8240>] kernel_init+0x14/0x110 [<ffffff9008086fcc>] ret_from_fork+0x10/0x18
The buggy address belongs to the page: page:ffffffbf037791c0 count:0 mapcount:0 mapping: (null) index:0x0 flags: 0x4000000000000000() raw: 4000000000000000 0000000000000000 0000000000000000 00000000ffffffff raw: ffffffbf037791e0 ffffffbf037791e0 0000000000000000 0000000000000000 page dumped because: kasan: bad access detected
Memory state around the buggy address: ffffffc0dde47180: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ffffffc0dde47200: f1 f1 f1 f1 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f2 f2
ffffffc0dde47280: f2 f2 00 00 00 00 00 00 00 00 00 00 f3 f3 f3 f3
^ ffffffc0dde47300: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ffffffc0dde47380: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Let's leave around one unsigned int in the gpio_irq_chip struct for the single parent irq case and repoint the 'parents' array at it. This way code is left mostly intact to setup parents and we waste an extra few bytes per structure of which there should be only a handful in a system.
Cc: Evan Green evgreen@chromium.org Cc: Thierry Reding treding@nvidia.com Cc: Grygorii Strashko grygorii.strashko@ti.com Fixes: e0d897289813 ("gpio: Implement tighter IRQ chip integration") Signed-off-by: Stephen Boyd swboyd@chromium.org Signed-off-by: Linus Walleij linus.walleij@linaro.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpio/gpiolib.c | 3 ++- include/linux/gpio/driver.h | 7 +++++++ 2 files changed, 9 insertions(+), 1 deletion(-)
diff --git a/drivers/gpio/gpiolib.c b/drivers/gpio/gpiolib.c index 06dce16e22bb..70f0dedca59f 100644 --- a/drivers/gpio/gpiolib.c +++ b/drivers/gpio/gpiolib.c @@ -1675,7 +1675,8 @@ static void gpiochip_set_cascaded_irqchip(struct gpio_chip *gpiochip, irq_set_chained_handler_and_data(parent_irq, parent_handler, gpiochip);
- gpiochip->irq.parents = &parent_irq; + gpiochip->irq.parent_irq = parent_irq; + gpiochip->irq.parents = &gpiochip->irq.parent_irq; gpiochip->irq.num_parents = 1; }
diff --git a/include/linux/gpio/driver.h b/include/linux/gpio/driver.h index 5382b5183b7e..82a953ec5ef0 100644 --- a/include/linux/gpio/driver.h +++ b/include/linux/gpio/driver.h @@ -94,6 +94,13 @@ struct gpio_irq_chip { */ unsigned int num_parents;
+ /** + * @parent_irq: + * + * For use by gpiochip_set_cascaded_irqchip() + */ + unsigned int parent_irq; + /** * @parents: *
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit dd9a403495704fc80fb9f399003013ef2be2ee23 ]
The function that puts back the MR in cache also removes the DMA address from the HCA. Therefore we need to call this function before we remove the DMA mapping from MMU. Otherwise the HCA may access a memory that is no longer DMA mapped.
Call trace: NMI: IOCK error (debug interrupt?) for reason 71 on CPU 0. CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.19.0-rc6+ #4 Hardware name: HP ProLiant DL360p Gen8, BIOS P71 08/20/2012 RIP: 0010:intel_idle+0x73/0x120 Code: 80 5c 01 00 0f ae 38 0f ae f0 31 d2 65 48 8b 04 25 80 5c 01 00 48 89 d1 0f 60 02 RSP: 0018:ffffffff9a403e38 EFLAGS: 00000046 RAX: 0000000000000030 RBX: 0000000000000005 RCX: 0000000000000001 RDX: 0000000000000000 RSI: ffffffff9a5790c0 RDI: 0000000000000000 RBP: 0000000000000030 R08: 0000000000000000 R09: 0000000000007cf9 R10: 000000000000030a R11: 0000000000000018 R12: 0000000000000000 R13: ffffffff9a5792b8 R14: ffffffff9a5790c0 R15: 0000002b48471e4d FS: 0000000000000000(0000) GS:ffff9c6caf400000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f5737185000 CR3: 0000000590c0a002 CR4: 00000000000606f0 Call Trace: cpuidle_enter_state+0x7e/0x2e0 do_idle+0x1ed/0x290 cpu_startup_entry+0x6f/0x80 start_kernel+0x524/0x544 ? set_init_arg+0x55/0x55 secondary_startup_64+0xa4/0xb0 DMAR: DRHD: handling fault status reg 2 DMAR: [DMA Read] Request device [04:00.0] fault addr b34d2000 [fault reason 06] PTE Read access is not set DMAR: [DMA Read] Request device [01:00.2] fault addr bff8b000 [fault reason 06] PTE Read access is not set
Fixes: f3f134f5260a ("RDMA/mlx5: Fix crash while accessing garbage pointer and freed memory") Signed-off-by: Valentine Fatiev valentinef@mellanox.com Reviewed-by: Moni Shoua monis@mellanox.com Signed-off-by: Leon Romanovsky leonro@mellanox.com Signed-off-by: Doug Ledford dledford@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/infiniband/hw/mlx5/mr.c | 12 ++++++++---- 1 file changed, 8 insertions(+), 4 deletions(-)
diff --git a/drivers/infiniband/hw/mlx5/mr.c b/drivers/infiniband/hw/mlx5/mr.c index 308456d28afb..73339fd47dd8 100644 --- a/drivers/infiniband/hw/mlx5/mr.c +++ b/drivers/infiniband/hw/mlx5/mr.c @@ -544,6 +544,9 @@ void mlx5_mr_cache_free(struct mlx5_ib_dev *dev, struct mlx5_ib_mr *mr) int shrink = 0; int c;
+ if (!mr->allocated_from_cache) + return; + c = order2idx(dev, mr->order); if (c < 0 || c >= MAX_MR_CACHE_ENTRIES) { mlx5_ib_warn(dev, "order %d, cache index %d\n", mr->order, c); @@ -1647,18 +1650,19 @@ static void dereg_mr(struct mlx5_ib_dev *dev, struct mlx5_ib_mr *mr) umem = NULL; } #endif - clean_mr(dev, mr);
+ /* + * We should unregister the DMA address from the HCA before + * remove the DMA mapping. + */ + mlx5_mr_cache_free(dev, mr); if (umem) { ib_umem_release(umem); atomic_sub(npages, &dev->mdev->priv.reg_pages); } - if (!mr->allocated_from_cache) kfree(mr); - else - mlx5_mr_cache_free(dev, mr); }
int mlx5_ib_dereg_mr(struct ib_mr *ibmr)
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 9a4890bd6d6325a1c88564a20ab310b2d56f6094 ]
In rds_send_mprds_hash(), if the calculated hash value is non-zero and the MPRDS connections are not yet up, it will wait. But it should not wait if the send is non-blocking. In this case, it should just use the base c_path for sending the message.
Signed-off-by: Ka-Cheong Poon ka-cheong.poon@oracle.com Acked-by: Santosh Shilimkar santosh.shilimkar@oracle.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/rds/send.c | 13 ++++++++++--- 1 file changed, 10 insertions(+), 3 deletions(-)
diff --git a/net/rds/send.c b/net/rds/send.c index 59f17a2335f4..0e54ca0f4e9e 100644 --- a/net/rds/send.c +++ b/net/rds/send.c @@ -1006,7 +1006,8 @@ static int rds_cmsg_send(struct rds_sock *rs, struct rds_message *rm, return ret; }
-static int rds_send_mprds_hash(struct rds_sock *rs, struct rds_connection *conn) +static int rds_send_mprds_hash(struct rds_sock *rs, + struct rds_connection *conn, int nonblock) { int hash;
@@ -1022,10 +1023,16 @@ static int rds_send_mprds_hash(struct rds_sock *rs, struct rds_connection *conn) * used. But if we are interrupted, we have to use the zero * c_path in case the connection ends up being non-MP capable. */ - if (conn->c_npaths == 0) + if (conn->c_npaths == 0) { + /* Cannot wait for the connection be made, so just use + * the base c_path. + */ + if (nonblock) + return 0; if (wait_event_interruptible(conn->c_hs_waitq, conn->c_npaths != 0)) hash = 0; + } if (conn->c_npaths == 1) hash = 0; } @@ -1170,7 +1177,7 @@ int rds_sendmsg(struct socket *sock, struct msghdr *msg, size_t payload_len) }
if (conn->c_trans->t_mp_capable) - cpath = &conn->c_path[rds_send_mprds_hash(rs, conn)]; + cpath = &conn->c_path[rds_send_mprds_hash(rs, conn, nonblock)]; else cpath = &conn->c_path[0];
On 11/3/18 2:34 AM, Greg Kroah-Hartman wrote:
4.18-stable review patch. If anyone has any objections, please let me know.
This patch has a problem. I am working on a fix for that. Please do not include it in the stable release. Thanks.
[ Upstream commit 9a4890bd6d6325a1c88564a20ab310b2d56f6094 ]
In rds_send_mprds_hash(), if the calculated hash value is non-zero and the MPRDS connections are not yet up, it will wait. But it should not wait if the send is non-blocking. In this case, it should just use the base c_path for sending the message.
Signed-off-by: Ka-Cheong Poon ka-cheong.poon@oracle.com Acked-by: Santosh Shilimkar santosh.shilimkar@oracle.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org
net/rds/send.c | 13 ++++++++++--- 1 file changed, 10 insertions(+), 3 deletions(-)
diff --git a/net/rds/send.c b/net/rds/send.c index 59f17a2335f4..0e54ca0f4e9e 100644 --- a/net/rds/send.c +++ b/net/rds/send.c @@ -1006,7 +1006,8 @@ static int rds_cmsg_send(struct rds_sock *rs, struct rds_message *rm, return ret; } -static int rds_send_mprds_hash(struct rds_sock *rs, struct rds_connection *conn) +static int rds_send_mprds_hash(struct rds_sock *rs,
{ int hash;struct rds_connection *conn, int nonblock)
@@ -1022,10 +1023,16 @@ static int rds_send_mprds_hash(struct rds_sock *rs, struct rds_connection *conn) * used. But if we are interrupted, we have to use the zero * c_path in case the connection ends up being non-MP capable. */
if (conn->c_npaths == 0)
if (conn->c_npaths == 0) {
/* Cannot wait for the connection be made, so just use
* the base c_path.
*/
if (nonblock)
return 0; if (wait_event_interruptible(conn->c_hs_waitq, conn->c_npaths != 0)) hash = 0;
if (conn->c_npaths == 1) hash = 0; }}
@@ -1170,7 +1177,7 @@ int rds_sendmsg(struct socket *sock, struct msghdr *msg, size_t payload_len) } if (conn->c_trans->t_mp_capable)
cpath = &conn->c_path[rds_send_mprds_hash(rs, conn)];
else cpath = &conn->c_path[0];cpath = &conn->c_path[rds_send_mprds_hash(rs, conn, nonblock)];
On Mon, Nov 05, 2018 at 03:38:25PM +0800, Ka-Cheong Poon wrote:
On 11/3/18 2:34 AM, Greg Kroah-Hartman wrote:
4.18-stable review patch. If anyone has any objections, please let me know.
This patch has a problem. I am working on a fix for that. Please do not include it in the stable release. Thanks.
It's already been released, sorry. When you get the fix upstream, please tell David to add it to the stable queue for everyone to get the fix as well.
thanks,
greg k-h
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 3c718e677c2b35b449992adc36ecce883c467e98 ]
the script rtnetlink.sh requires a bash-only features (sleep with sub-second precision). This may cause random test failure if the default shell is not bash. Address the above explicitly requiring bash as the script interpreter.
Fixes: 33b01b7b4f19 ("selftests: add rtnetlink test script") Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- tools/testing/selftests/net/rtnetlink.sh | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/net/rtnetlink.sh b/tools/testing/selftests/net/rtnetlink.sh index 0d7a44fa30af..8e509cbcb209 100755 --- a/tools/testing/selftests/net/rtnetlink.sh +++ b/tools/testing/selftests/net/rtnetlink.sh @@ -1,4 +1,4 @@ -#!/bin/sh +#!/bin/bash # # This test is for checking rtnetlink callpaths, and get as much coverage as possible. #
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 12a2ea962c06efb30764c47b140d2ec9d3cd7cb0 ]
The udpgso_bench.sh script requires several bash-only features. This may cause random failures if the default shell is not bash. Address the above explicitly requiring bash as the script interpreter
Fixes: 3a687bef148d ("selftests: udp gso benchmark") Signed-off-by: Paolo Abeni pabeni@redhat.com Acked-by: Willem de Bruijn willemb@google.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- tools/testing/selftests/net/udpgso_bench.sh | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/net/udpgso_bench.sh b/tools/testing/selftests/net/udpgso_bench.sh index 850767befa47..99e537ab5ad9 100755 --- a/tools/testing/selftests/net/udpgso_bench.sh +++ b/tools/testing/selftests/net/udpgso_bench.sh @@ -1,4 +1,4 @@ -#!/bin/sh +#!/bin/bash # SPDX-License-Identifier: GPL-2.0 # # Run a series of udpgso benchmarks
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 8dcf86caa1e3daf4a6ccf38e97f4f752b411f829 ]
Enabling CONFIG_GCOV_PROFILE_ALL=y causes linker errors on ARM:
`.text.exit' referenced in section `.ARM.exidx.text.exit': defined in discarded section `.text.exit'
`.text.exit' referenced in section `.fini_array.00100': defined in discarded section `.text.exit'
And related errors on NDS32:
`.text.exit' referenced in section `.dtors.65435': defined in discarded section `.text.exit'
The gcov compiler flags cause certain compiler versions to generate additional destructor-related sections that are not yet handled by the linker script, resulting in references between discarded and non-discarded sections.
Since destructors are not used in the Linux kernel, fix this by discarding these additional sections.
Reported-by: Arnd Bergmann arnd@arndb.de Tested-by: Arnd Bergmann arnd@arndb.de Acked-by: Arnd Bergmann arnd@arndb.de Reported-by: Greentime Hu green.hu@gmail.com Tested-by: Masami Hiramatsu mhiramat@kernel.org Signed-off-by: Peter Oberparleiter oberpar@linux.ibm.com Signed-off-by: Stephen Rothwell sfr@canb.auug.org.au Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm/kernel/vmlinux.lds.h | 2 ++ include/asm-generic/vmlinux.lds.h | 4 ++-- 2 files changed, 4 insertions(+), 2 deletions(-)
diff --git a/arch/arm/kernel/vmlinux.lds.h b/arch/arm/kernel/vmlinux.lds.h index ae5fdff18406..8247bc15addc 100644 --- a/arch/arm/kernel/vmlinux.lds.h +++ b/arch/arm/kernel/vmlinux.lds.h @@ -49,6 +49,8 @@ #define ARM_DISCARD \ *(.ARM.exidx.exit.text) \ *(.ARM.extab.exit.text) \ + *(.ARM.exidx.text.exit) \ + *(.ARM.extab.text.exit) \ ARM_CPU_DISCARD(*(.ARM.exidx.cpuexit.text)) \ ARM_CPU_DISCARD(*(.ARM.extab.cpuexit.text)) \ ARM_EXIT_DISCARD(EXIT_TEXT) \ diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h index e373e2e10f6a..a11f86014352 100644 --- a/include/asm-generic/vmlinux.lds.h +++ b/include/asm-generic/vmlinux.lds.h @@ -617,8 +617,8 @@
#define EXIT_DATA \ *(.exit.data .exit.data.*) \ - *(.fini_array) \ - *(.dtors) \ + *(.fini_array .fini_array.*) \ + *(.dtors .dtors.*) \ MEM_DISCARD(exit.data*) \ MEM_DISCARD(exit.rodata*)
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 52c8ee5bad8f33d02c567f6609f43d69303fc48d ]
Enabling both CONFIG_LD_DEAD_CODE_DATA_ELIMINATION=y and CONFIG_GCOV_PROFILE_ALL=y results in linker warnings:
warning: orphan section `.data..LPBX1' being placed in section `.data..LPBX1'.
LD_DEAD_CODE_DATA_ELIMINATION adds compiler flag -fdata-sections. This option causes GCC to create separate data sections for data objects, including those generated by GCC internally for gcov profiling. The names of these objects start with a dot (.LPBX0, .LPBX1), resulting in section names starting with 'data..'.
As section names starting with 'data..' are used for specific purposes in the Linux kernel, the linker script does not automatically include them in the output data section, resulting in the "orphan section" linker warnings.
Fix this by specifically including sections named "data..LPBX*" in the data section.
Reported-by: Stephen Rothwell sfr@canb.auug.org.au Tested-by: Stephen Rothwell sfr@canb.auug.org.au Tested-by: Arnd Bergmann arnd@arndb.de Acked-by: Arnd Bergmann arnd@arndb.de Signed-off-by: Peter Oberparleiter oberpar@linux.ibm.com Signed-off-by: Stephen Rothwell sfr@canb.auug.org.au Signed-off-by: Sasha Levin sashal@kernel.org --- include/asm-generic/vmlinux.lds.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h index a11f86014352..83b930988e21 100644 --- a/include/asm-generic/vmlinux.lds.h +++ b/include/asm-generic/vmlinux.lds.h @@ -70,7 +70,7 @@ */ #ifdef CONFIG_LD_DEAD_CODE_DATA_ELIMINATION #define TEXT_MAIN .text .text.[0-9a-zA-Z_]* -#define DATA_MAIN .data .data.[0-9a-zA-Z_]* +#define DATA_MAIN .data .data.[0-9a-zA-Z_]* .data..LPBX* #define SDATA_MAIN .sdata .sdata.[0-9a-zA-Z_]* #define RODATA_MAIN .rodata .rodata.[0-9a-zA-Z_]* #define BSS_MAIN .bss .bss.[0-9a-zA-Z_]*
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 6b3944e42e2e554aa5a4be681ecd70dccd459114 ]
Access to the list of cells by /proc/net/afs/cells has a couple of problems:
(1) It should be checking against SEQ_START_TOKEN for the keying the header line.
(2) It's only holding the RCU read lock, so it can't just walk over the list without following the proper RCU methods.
Fix these by using an hlist instead of an ordinary list and using the appropriate accessor functions to follow it with RCU.
Since the code that adds a cell to the list must also necessarily change, sort the list on insertion whilst we're at it.
Fixes: 989782dcdc91 ("afs: Overhaul cell database management") Signed-off-by: David Howells dhowells@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/afs/cell.c | 17 +++++++++++++++-- fs/afs/dynroot.c | 2 +- fs/afs/internal.h | 4 ++-- fs/afs/main.c | 2 +- fs/afs/proc.c | 7 +++---- 5 files changed, 22 insertions(+), 10 deletions(-)
diff --git a/fs/afs/cell.c b/fs/afs/cell.c index f3d0bef16d78..6127f0fcd62c 100644 --- a/fs/afs/cell.c +++ b/fs/afs/cell.c @@ -514,6 +514,8 @@ static int afs_alloc_anon_key(struct afs_cell *cell) */ static int afs_activate_cell(struct afs_net *net, struct afs_cell *cell) { + struct hlist_node **p; + struct afs_cell *pcell; int ret;
if (!cell->anonymous_key) { @@ -534,7 +536,18 @@ static int afs_activate_cell(struct afs_net *net, struct afs_cell *cell) return ret;
mutex_lock(&net->proc_cells_lock); - list_add_tail(&cell->proc_link, &net->proc_cells); + for (p = &net->proc_cells.first; *p; p = &(*p)->next) { + pcell = hlist_entry(*p, struct afs_cell, proc_link); + if (strcmp(cell->name, pcell->name) < 0) + break; + } + + cell->proc_link.pprev = p; + cell->proc_link.next = *p; + rcu_assign_pointer(*p, &cell->proc_link.next); + if (cell->proc_link.next) + cell->proc_link.next->pprev = &cell->proc_link.next; + afs_dynroot_mkdir(net, cell); mutex_unlock(&net->proc_cells_lock); return 0; @@ -550,7 +563,7 @@ static void afs_deactivate_cell(struct afs_net *net, struct afs_cell *cell) afs_proc_cell_remove(cell);
mutex_lock(&net->proc_cells_lock); - list_del_init(&cell->proc_link); + hlist_del_rcu(&cell->proc_link); afs_dynroot_rmdir(net, cell); mutex_unlock(&net->proc_cells_lock);
diff --git a/fs/afs/dynroot.c b/fs/afs/dynroot.c index 174e843f0633..7de7223843cc 100644 --- a/fs/afs/dynroot.c +++ b/fs/afs/dynroot.c @@ -286,7 +286,7 @@ int afs_dynroot_populate(struct super_block *sb) return -ERESTARTSYS;
net->dynroot_sb = sb; - list_for_each_entry(cell, &net->proc_cells, proc_link) { + hlist_for_each_entry(cell, &net->proc_cells, proc_link) { ret = afs_dynroot_mkdir(net, cell); if (ret < 0) goto error; diff --git a/fs/afs/internal.h b/fs/afs/internal.h index 9778df135717..270d1caa27c6 100644 --- a/fs/afs/internal.h +++ b/fs/afs/internal.h @@ -241,7 +241,7 @@ struct afs_net { seqlock_t cells_lock;
struct mutex proc_cells_lock; - struct list_head proc_cells; + struct hlist_head proc_cells;
/* Known servers. Theoretically each fileserver can only be in one * cell, but in practice, people create aliases and subsets and there's @@ -319,7 +319,7 @@ struct afs_cell { struct afs_net *net; struct key *anonymous_key; /* anonymous user key for this cell */ struct work_struct manager; /* Manager for init/deinit/dns */ - struct list_head proc_link; /* /proc cell list link */ + struct hlist_node proc_link; /* /proc cell list link */ #ifdef CONFIG_AFS_FSCACHE struct fscache_cookie *cache; /* caching cookie */ #endif diff --git a/fs/afs/main.c b/fs/afs/main.c index e84fe822a960..107427688edd 100644 --- a/fs/afs/main.c +++ b/fs/afs/main.c @@ -87,7 +87,7 @@ static int __net_init afs_net_init(struct net *net_ns) timer_setup(&net->cells_timer, afs_cells_timer, 0);
mutex_init(&net->proc_cells_lock); - INIT_LIST_HEAD(&net->proc_cells); + INIT_HLIST_HEAD(&net->proc_cells);
seqlock_init(&net->fs_lock); net->fs_servers = RB_ROOT; diff --git a/fs/afs/proc.c b/fs/afs/proc.c index 476dcbb79713..9101f62707af 100644 --- a/fs/afs/proc.c +++ b/fs/afs/proc.c @@ -33,9 +33,8 @@ static inline struct afs_net *afs_seq2net_single(struct seq_file *m) static int afs_proc_cells_show(struct seq_file *m, void *v) { struct afs_cell *cell = list_entry(v, struct afs_cell, proc_link); - struct afs_net *net = afs_seq2net(m);
- if (v == &net->proc_cells) { + if (v == SEQ_START_TOKEN) { /* display header on line 1 */ seq_puts(m, "USE NAME\n"); return 0; @@ -50,12 +49,12 @@ static void *afs_proc_cells_start(struct seq_file *m, loff_t *_pos) __acquires(rcu) { rcu_read_lock(); - return seq_list_start_head(&afs_seq2net(m)->proc_cells, *_pos); + return seq_hlist_start_head_rcu(&afs_seq2net(m)->proc_cells, *_pos); }
static void *afs_proc_cells_next(struct seq_file *m, void *v, loff_t *pos) { - return seq_list_next(v, &afs_seq2net(m)->proc_cells, pos); + return seq_hlist_next_rcu(v, &afs_seq2net(m)->proc_cells, pos); }
static void afs_proc_cells_stop(struct seq_file *m, void *v)
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit ac081c3be3fae6d0cc3e1862507fca3862d30b67 ]
On non-preempt kernels this loop can take a long time (more than 50 ticks) processing through entries.
Link: http://lkml.kernel.org/r/20181010172623.57033-1-khazhy@google.com Signed-off-by: Khazhismel Kumykov khazhy@google.com Acked-by: OGAWA Hirofumi hirofumi@mail.parknet.co.jp Reviewed-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/fat/fatent.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/fs/fat/fatent.c b/fs/fat/fatent.c index 3aef8630a4b9..95d2c716e0da 100644 --- a/fs/fat/fatent.c +++ b/fs/fat/fatent.c @@ -681,6 +681,7 @@ int fat_count_free_clusters(struct super_block *sb) if (ops->ent_get(&fatent) == FAT_ENT_FREE) free++; } while (fat_ent_next(sbi, &fatent)); + cond_resched(); } sbi->free_clusters = free; sbi->free_clus_valid = 1;
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
This reverts commit 62aad93f09c1952ede86405894df1b22012fd5ab.
Which was upstream commit 172b06c32b94 ("mm: slowly shrink slabs with a relatively small number of objects").
The upstream commit was found to cause regressions. While there is a proposed fix upstream, revent this patch from stable trees for now as testing the fix will take some time.
Signed-off-by: Sasha Levin sashal@kernel.org --- mm/vmscan.c | 11 ----------- 1 file changed, 11 deletions(-)
diff --git a/mm/vmscan.c b/mm/vmscan.c index fc0436407471..03822f86f288 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -386,17 +386,6 @@ static unsigned long do_shrink_slab(struct shrink_control *shrinkctl, delta = freeable >> priority; delta *= 4; do_div(delta, shrinker->seeks); - - /* - * Make sure we apply some minimal pressure on default priority - * even on small cgroups. Stale objects are not only consuming memory - * by themselves, but can also hold a reference to a dying cgroup, - * preventing it from being reclaimed. A dying cgroup with all - * corresponding structures like per-cpu stats and kmem caches - * can be really big, so it may lead to a significant waste of memory. - */ - delta = max_t(unsigned long long, delta, min(freeable, batch_size)); - total_scan += delta; if (total_scan < 0) { pr_err("shrink_slab: %pF negative objects to delete nr=%ld\n",
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
This reverts commit 84379c9afe011020e797e3f50a662b08a6355dcf.
From Florian Westphal fw@strlen.de:
It causes kernel crash for locally generated ipv6 fragments when netfilter ipv6 defragmentation is used.
The faulty commit is not essential for -stable, it only delays netns teardown for longer than needed when that netns still has ipv6 frags queued. Much better than crash :-/
Signed-off-by: Sasha Levin sashal@kernel.org --- net/ipv6/netfilter/nf_conntrack_reasm.c | 2 -- 1 file changed, 2 deletions(-)
diff --git a/net/ipv6/netfilter/nf_conntrack_reasm.c b/net/ipv6/netfilter/nf_conntrack_reasm.c index e4d9e6976d3c..a452d99c9f52 100644 --- a/net/ipv6/netfilter/nf_conntrack_reasm.c +++ b/net/ipv6/netfilter/nf_conntrack_reasm.c @@ -585,8 +585,6 @@ int nf_ct_frag6_gather(struct net *net, struct sk_buff *skb, u32 user) fq->q.meat == fq->q.len && nf_ct_frag6_reasm(fq, skb, dev)) ret = 0; - else - skb_dst_drop(skb);
out_unlock: spin_unlock_bh(&fq->q.lock);
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit da15fc2fa9c07b23db8f5e479bd8a9f0d741ca07 ]
The Yocto build system does a 'make clean' when rebuilding due to changed dependencies, and that consistently fails for me (causing the whole BSP build to fail) with errors such as
| find: '[...]/perf/1.0-r9/perf-1.0/plugin_mac80211.so': No such file or directory | find: '[...]/perf/1.0-r9/perf-1.0/plugin_mac80211.so': No such file or directory | find: find: '[...]/perf/1.0-r9/perf-1.0/libtraceevent.a''[...]/perf/1.0-r9/perf-1.0/libtraceevent.a': No such file or directory: No such file or directory | [...] | find: cannot delete '/mnt/xfs/devel/pil/yocto/tmp-glibc/work/wandboard-oe-linux-gnueabi/perf/1.0-r9/perf-1.0/util/.pstack.o.cmd': No such file or directory
Apparently (despite the comment), 'make clean' ends up launching multiple sub-makes that all want to remove the same things - perhaps this only happens in combination with a O=... parameter. In any case, we don't lose much by explicitly disabling the parallelism for the clean target, and it makes automated builds much more reliable.
Signed-off-by: Rasmus Villemoes linux@rasmusvillemoes.dk Acked-by: Jiri Olsa jolsa@kernel.org Cc: Alexander Shishkin alexander.shishkin@linux.intel.com Cc: Namhyung Kim namhyung@kernel.org Cc: Peter Zijlstra peterz@infradead.org Link: http://lkml.kernel.org/r/20180705131527.19749-1-linux@rasmusvillemoes.dk Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/perf/Makefile | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/tools/perf/Makefile +++ b/tools/perf/Makefile @@ -84,10 +84,10 @@ endif # has_clean endif # MAKECMDGOALS
# -# The clean target is not really parallel, don't print the jobs info: +# Explicitly disable parallelism for the clean target. # clean: - $(make) + $(make) -j1
# # The build-test target is not really parallel, don't print the jobs info,
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Hangbin Liu liuhangbin@gmail.com
[ Upstream commit 5a2de63fd1a59c30c02526d427bc014b98adf508 ]
Based on RFC 4541, 2.1.1. IGMP Forwarding Rules
The switch supporting IGMP snooping must maintain a list of multicast routers and the ports on which they are attached. This list can be constructed in any combination of the following ways:
a) This list should be built by the snooping switch sending Multicast Router Solicitation messages as described in IGMP Multicast Router Discovery [MRDISC]. It may also snoop Multicast Router Advertisement messages sent by and to other nodes.
b) The arrival port for IGMP Queries (sent by multicast routers) where the source address is not 0.0.0.0.
We should not add the port to router list when receives query with source 0.0.0.0.
Reported-by: Ying Xu yinxu@redhat.com Signed-off-by: Hangbin Liu liuhangbin@gmail.com Acked-by: Nikolay Aleksandrov nikolay@cumulusnetworks.com Acked-by: Roopa Prabhu roopa@cumulusnetworks.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/bridge/br_multicast.c | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-)
--- a/net/bridge/br_multicast.c +++ b/net/bridge/br_multicast.c @@ -1420,7 +1420,15 @@ static void br_multicast_query_received( return;
br_multicast_update_query_timer(br, query, max_delay); - br_multicast_mark_router(br, port); + + /* Based on RFC4541, section 2.1.1 IGMP Forwarding Rules, + * the arrival port for IGMP Queries where the source address + * is 0.0.0.0 should not be added to router port list. + */ + if ((saddr->proto == htons(ETH_P_IP) && saddr->u.ip4) || + (saddr->proto == htons(ETH_P_IPV6) && + !ipv6_addr_any(&saddr->u.ip6))) + br_multicast_mark_router(br, port); }
static int br_ip4_multicast_query(struct net_bridge *br,
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Dumazet edumazet@google.com
[ Upstream commit dc012f3628eaecfb5ba68404a5c30ef501daf63d ]
syzbot found a use-after-free in inet6_mc_check [1]
The problem here is that inet6_mc_check() uses rcu and read_lock(&iml->sflock)
So the fact that ip6_mc_leave_src() is called under RTNL and the socket lock does not help us, we need to acquire iml->sflock in write mode.
In the future, we should convert all this stuff to RCU.
[1] BUG: KASAN: use-after-free in ipv6_addr_equal include/net/ipv6.h:521 [inline] BUG: KASAN: use-after-free in inet6_mc_check+0xae7/0xb40 net/ipv6/mcast.c:649 Read of size 8 at addr ffff8801ce7f2510 by task syz-executor0/22432
CPU: 1 PID: 22432 Comm: syz-executor0 Not tainted 4.19.0-rc7+ #280 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x1c4/0x2b4 lib/dump_stack.c:113 print_address_description.cold.8+0x9/0x1ff mm/kasan/report.c:256 kasan_report_error mm/kasan/report.c:354 [inline] kasan_report.cold.9+0x242/0x309 mm/kasan/report.c:412 __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:433 ipv6_addr_equal include/net/ipv6.h:521 [inline] inet6_mc_check+0xae7/0xb40 net/ipv6/mcast.c:649 __raw_v6_lookup+0x320/0x3f0 net/ipv6/raw.c:98 ipv6_raw_deliver net/ipv6/raw.c:183 [inline] raw6_local_deliver+0x3d3/0xcb0 net/ipv6/raw.c:240 ip6_input_finish+0x467/0x1aa0 net/ipv6/ip6_input.c:345 NF_HOOK include/linux/netfilter.h:289 [inline] ip6_input+0xe9/0x600 net/ipv6/ip6_input.c:426 ip6_mc_input+0x48a/0xd20 net/ipv6/ip6_input.c:503 dst_input include/net/dst.h:450 [inline] ip6_rcv_finish+0x17a/0x330 net/ipv6/ip6_input.c:76 NF_HOOK include/linux/netfilter.h:289 [inline] ipv6_rcv+0x120/0x640 net/ipv6/ip6_input.c:271 __netif_receive_skb_one_core+0x14d/0x200 net/core/dev.c:4913 __netif_receive_skb+0x2c/0x1e0 net/core/dev.c:5023 netif_receive_skb_internal+0x12c/0x620 net/core/dev.c:5126 napi_frags_finish net/core/dev.c:5664 [inline] napi_gro_frags+0x75a/0xc90 net/core/dev.c:5737 tun_get_user+0x3189/0x4250 drivers/net/tun.c:1923 tun_chr_write_iter+0xb9/0x154 drivers/net/tun.c:1968 call_write_iter include/linux/fs.h:1808 [inline] do_iter_readv_writev+0x8b0/0xa80 fs/read_write.c:680 do_iter_write+0x185/0x5f0 fs/read_write.c:959 vfs_writev+0x1f1/0x360 fs/read_write.c:1004 do_writev+0x11a/0x310 fs/read_write.c:1039 __do_sys_writev fs/read_write.c:1112 [inline] __se_sys_writev fs/read_write.c:1109 [inline] __x64_sys_writev+0x75/0xb0 fs/read_write.c:1109 do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x457421 Code: 75 14 b8 14 00 00 00 0f 05 48 3d 01 f0 ff ff 0f 83 34 b5 fb ff c3 48 83 ec 08 e8 1a 2d 00 00 48 89 04 24 b8 14 00 00 00 0f 05 <48> 8b 3c 24 48 89 c2 e8 63 2d 00 00 48 89 d0 48 83 c4 08 48 3d 01 RSP: 002b:00007f2d30ecaba0 EFLAGS: 00000293 ORIG_RAX: 0000000000000014 RAX: ffffffffffffffda RBX: 000000000000003e RCX: 0000000000457421 RDX: 0000000000000001 RSI: 00007f2d30ecabf0 RDI: 00000000000000f0 RBP: 0000000020000500 R08: 00000000000000f0 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000293 R12: 00007f2d30ecb6d4 R13: 00000000004c4890 R14: 00000000004d7b90 R15: 00000000ffffffff
Allocated by task 22437: save_stack+0x43/0xd0 mm/kasan/kasan.c:448 set_track mm/kasan/kasan.c:460 [inline] kasan_kmalloc+0xc7/0xe0 mm/kasan/kasan.c:553 __do_kmalloc mm/slab.c:3718 [inline] __kmalloc+0x14e/0x760 mm/slab.c:3727 kmalloc include/linux/slab.h:518 [inline] sock_kmalloc+0x15a/0x1f0 net/core/sock.c:1983 ip6_mc_source+0x14dd/0x1960 net/ipv6/mcast.c:427 do_ipv6_setsockopt.isra.9+0x3afb/0x45d0 net/ipv6/ipv6_sockglue.c:743 ipv6_setsockopt+0xbd/0x170 net/ipv6/ipv6_sockglue.c:933 rawv6_setsockopt+0x59/0x140 net/ipv6/raw.c:1069 sock_common_setsockopt+0x9a/0xe0 net/core/sock.c:3038 __sys_setsockopt+0x1ba/0x3c0 net/socket.c:1902 __do_sys_setsockopt net/socket.c:1913 [inline] __se_sys_setsockopt net/socket.c:1910 [inline] __x64_sys_setsockopt+0xbe/0x150 net/socket.c:1910 do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe
Freed by task 22430: save_stack+0x43/0xd0 mm/kasan/kasan.c:448 set_track mm/kasan/kasan.c:460 [inline] __kasan_slab_free+0x102/0x150 mm/kasan/kasan.c:521 kasan_slab_free+0xe/0x10 mm/kasan/kasan.c:528 __cache_free mm/slab.c:3498 [inline] kfree+0xcf/0x230 mm/slab.c:3813 __sock_kfree_s net/core/sock.c:2004 [inline] sock_kfree_s+0x29/0x60 net/core/sock.c:2010 ip6_mc_leave_src+0x11a/0x1d0 net/ipv6/mcast.c:2448 __ipv6_sock_mc_close+0x20b/0x4e0 net/ipv6/mcast.c:310 ipv6_sock_mc_close+0x158/0x1d0 net/ipv6/mcast.c:328 inet6_release+0x40/0x70 net/ipv6/af_inet6.c:452 __sock_release+0xd7/0x250 net/socket.c:579 sock_close+0x19/0x20 net/socket.c:1141 __fput+0x385/0xa30 fs/file_table.c:278 ____fput+0x15/0x20 fs/file_table.c:309 task_work_run+0x1e8/0x2a0 kernel/task_work.c:113 tracehook_notify_resume include/linux/tracehook.h:193 [inline] exit_to_usermode_loop+0x318/0x380 arch/x86/entry/common.c:166 prepare_exit_to_usermode arch/x86/entry/common.c:197 [inline] syscall_return_slowpath arch/x86/entry/common.c:268 [inline] do_syscall_64+0x6be/0x820 arch/x86/entry/common.c:293 entry_SYSCALL_64_after_hwframe+0x49/0xbe
The buggy address belongs to the object at ffff8801ce7f2500 which belongs to the cache kmalloc-192 of size 192 The buggy address is located 16 bytes inside of 192-byte region [ffff8801ce7f2500, ffff8801ce7f25c0) The buggy address belongs to the page: page:ffffea000739fc80 count:1 mapcount:0 mapping:ffff8801da800040 index:0x0 flags: 0x2fffc0000000100(slab) raw: 02fffc0000000100 ffffea0006f6e548 ffffea000737b948 ffff8801da800040 raw: 0000000000000000 ffff8801ce7f2000 0000000100000010 0000000000000000 page dumped because: kasan: bad access detected
Memory state around the buggy address: ffff8801ce7f2400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff8801ce7f2480: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
ffff8801ce7f2500: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^ ffff8801ce7f2580: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc ffff8801ce7f2600: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Signed-off-by: Eric Dumazet edumazet@google.com Reported-by: syzbot syzkaller@googlegroups.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv6/mcast.c | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-)
--- a/net/ipv6/mcast.c +++ b/net/ipv6/mcast.c @@ -2436,17 +2436,17 @@ static int ip6_mc_leave_src(struct sock { int err;
- /* callers have the socket lock and rtnl lock - * so no other readers or writers of iml or its sflist - */ + write_lock_bh(&iml->sflock); if (!iml->sflist) { /* any-source empty exclude case */ - return ip6_mc_del_src(idev, &iml->addr, iml->sfmode, 0, NULL, 0); + err = ip6_mc_del_src(idev, &iml->addr, iml->sfmode, 0, NULL, 0); + } else { + err = ip6_mc_del_src(idev, &iml->addr, iml->sfmode, + iml->sflist->sl_count, iml->sflist->sl_addr, 0); + sock_kfree_s(sk, iml->sflist, IP6_SFLSIZE(iml->sflist->sl_max)); + iml->sflist = NULL; } - err = ip6_mc_del_src(idev, &iml->addr, iml->sfmode, - iml->sflist->sl_count, iml->sflist->sl_addr, 0); - sock_kfree_s(sk, iml->sflist, IP6_SFLSIZE(iml->sflist->sl_max)); - iml->sflist = NULL; + write_unlock_bh(&iml->sflock); return err; }
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Stefano Brivio sbrivio@redhat.com
[ Upstream commit ee1abcf689353f36d9322231b4320926096bdee0 ]
Commit a61bbcf28a8c ("[NET]: Store skb->timestamp as offset to a base timestamp") introduces a neighbour control buffer and zeroes it out in ndisc_rcv(), as ndisc_recv_ns() uses it.
Commit f2776ff04722 ("[IPV6]: Fix address/interface handling in UDP and DCCP, according to the scoping architecture.") introduces the usage of the IPv6 control buffer in protocol error handlers (e.g. inet6_iif() in present-day __udp6_lib_err()).
Now, with commit b94f1c0904da ("ipv6: Use icmpv6_notify() to propagate redirect, instead of rt6_redirect()."), we call protocol error handlers from ndisc_redirect_rcv(), after the control buffer is already stolen and some parts are already zeroed out. This implies that inet6_iif() on this path will always return zero.
This gives unexpected results on UDP socket lookup in __udp6_lib_err(), as we might actually need to match sockets for a given interface.
Instead of always claiming the control buffer in ndisc_rcv(), do that only when needed.
Fixes: b94f1c0904da ("ipv6: Use icmpv6_notify() to propagate redirect, instead of rt6_redirect().") Signed-off-by: Stefano Brivio sbrivio@redhat.com Reviewed-by: Sabrina Dubroca sd@queasysnail.net Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv6/ndisc.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
--- a/net/ipv6/ndisc.c +++ b/net/ipv6/ndisc.c @@ -1732,10 +1732,9 @@ int ndisc_rcv(struct sk_buff *skb) return 0; }
- memset(NEIGH_CB(skb), 0, sizeof(struct neighbour_cb)); - switch (msg->icmph.icmp6_type) { case NDISC_NEIGHBOUR_SOLICITATION: + memset(NEIGH_CB(skb), 0, sizeof(struct neighbour_cb)); ndisc_recv_ns(skb); break;
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sabrina Dubroca sd@queasysnail.net
[ Upstream commit f547fac624be53ad8b07e9ebca7654a7827ba61b ]
When commit 270972554c91 ("[IPV6]: ROUTE: Add Router Reachability Probing (RFC4191).") introduced router probing, the rt6_probe() function required that a neighbour entry existed. This neighbour entry is used to record the timestamp of the last probe via the ->updated field.
Later, commit 2152caea7196 ("ipv6: Do not depend on rt->n in rt6_probe().") removed the requirement for a neighbour entry. Neighbourless routes skip the interval check and are not rate-limited.
This patch adds rate-limiting for neighbourless routes, by recording the timestamp of the last probe in the fib6_info itself.
Fixes: 2152caea7196 ("ipv6: Do not depend on rt->n in rt6_probe().") Signed-off-by: Sabrina Dubroca sd@queasysnail.net Reviewed-by: Stefano Brivio sbrivio@redhat.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/net/ip6_fib.h | 4 ++++ net/ipv6/route.c | 12 ++++++------ 2 files changed, 10 insertions(+), 6 deletions(-)
--- a/include/net/ip6_fib.h +++ b/include/net/ip6_fib.h @@ -159,6 +159,10 @@ struct fib6_info { struct rt6_info * __percpu *rt6i_pcpu; struct rt6_exception_bucket __rcu *rt6i_exception_bucket;
+#ifdef CONFIG_IPV6_ROUTER_PREF + unsigned long last_probe; +#endif + u32 fib6_metric; u8 fib6_protocol; u8 fib6_type; --- a/net/ipv6/route.c +++ b/net/ipv6/route.c @@ -517,10 +517,11 @@ static void rt6_probe_deferred(struct wo
static void rt6_probe(struct fib6_info *rt) { - struct __rt6_probe_work *work; + struct __rt6_probe_work *work = NULL; const struct in6_addr *nh_gw; struct neighbour *neigh; struct net_device *dev; + struct inet6_dev *idev;
/* * Okay, this does not seem to be appropriate @@ -536,15 +537,12 @@ static void rt6_probe(struct fib6_info * nh_gw = &rt->fib6_nh.nh_gw; dev = rt->fib6_nh.nh_dev; rcu_read_lock_bh(); + idev = __in6_dev_get(dev); neigh = __ipv6_neigh_lookup_noref(dev, nh_gw); if (neigh) { - struct inet6_dev *idev; - if (neigh->nud_state & NUD_VALID) goto out;
- idev = __in6_dev_get(dev); - work = NULL; write_lock(&neigh->lock); if (!(neigh->nud_state & NUD_VALID) && time_after(jiffies, @@ -554,11 +552,13 @@ static void rt6_probe(struct fib6_info * __neigh_set_probe_once(neigh); } write_unlock(&neigh->lock); - } else { + } else if (time_after(jiffies, rt->last_probe + + idev->cnf.rtr_probe_interval)) { work = kmalloc(sizeof(*work), GFP_ATOMIC); }
if (work) { + rt->last_probe = jiffies; INIT_WORK(&work->work, rt6_probe_deferred); work->target = *nh_gw; dev_hold(dev);
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Cong Wang xiyou.wangcong@gmail.com
[ Upstream commit 5a8e7aea953bdb6d4da13aff6f1e7f9c62023499 ]
WHen an llc sock is added into the sk_laddr_hash of an llc_sap, it is not marked with SOCK_RCU_FREE.
This causes that the sock could be freed while it is still being read by __llc_lookup_established() with RCU read lock. sock is refcounted, but with RCU read lock, nothing prevents the readers getting a zero refcnt.
Fix it by setting SOCK_RCU_FREE in llc_sap_add_socket().
Reported-by: syzbot+11e05f04c15e03be5254@syzkaller.appspotmail.com Signed-off-by: Cong Wang xiyou.wangcong@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/llc/llc_conn.c | 1 + 1 file changed, 1 insertion(+)
--- a/net/llc/llc_conn.c +++ b/net/llc/llc_conn.c @@ -734,6 +734,7 @@ void llc_sap_add_socket(struct llc_sap * llc_sk(sk)->sap = sap;
spin_lock_bh(&sap->sk_lock); + sock_set_flag(sk, SOCK_RCU_FREE); sap->sk_count++; sk_nulls_add_node_rcu(sk, laddr_hb); hlist_add_head(&llc->dev_hash_node, dev_hb);
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Fugang Duan fugang.duan@nxp.com
[ Upstream commit ec20a63aa8b8ec3223fb25cdb2a49f9f9dfda88c ]
Commit db65f35f50e0 ("net: fec: add support of ethtool get_regs") introduce ethool "--register-dump" interface to dump all FEC registers.
But not all silicon implementations of the Freescale FEC hardware module have the FRBR (FIFO Receive Bound Register) and FRSR (FIFO Receive Start Register) register, so we should not be trying to dump them on those that don't.
To fix it we create a quirk flag, FEC_QUIRK_HAS_RFREG, and check it before dump those RX FIFO registers.
Signed-off-by: Fugang Duan fugang.duan@nxp.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/freescale/fec.h | 4 ++++ drivers/net/ethernet/freescale/fec_main.c | 16 ++++++++++++---- 2 files changed, 16 insertions(+), 4 deletions(-)
--- a/drivers/net/ethernet/freescale/fec.h +++ b/drivers/net/ethernet/freescale/fec.h @@ -452,6 +452,10 @@ struct bufdesc_ex { * initialisation. */ #define FEC_QUIRK_MIB_CLEAR (1 << 15) +/* Only i.MX25/i.MX27/i.MX28 controller supports FRBR,FRSR registers, + * those FIFO receive registers are resolved in other platforms. + */ +#define FEC_QUIRK_HAS_FRREG (1 << 16)
struct bufdesc_prop { int qid; --- a/drivers/net/ethernet/freescale/fec_main.c +++ b/drivers/net/ethernet/freescale/fec_main.c @@ -90,14 +90,16 @@ static struct platform_device_id fec_dev .driver_data = 0, }, { .name = "imx25-fec", - .driver_data = FEC_QUIRK_USE_GASKET | FEC_QUIRK_MIB_CLEAR, + .driver_data = FEC_QUIRK_USE_GASKET | FEC_QUIRK_MIB_CLEAR | + FEC_QUIRK_HAS_FRREG, }, { .name = "imx27-fec", - .driver_data = FEC_QUIRK_MIB_CLEAR, + .driver_data = FEC_QUIRK_MIB_CLEAR | FEC_QUIRK_HAS_FRREG, }, { .name = "imx28-fec", .driver_data = FEC_QUIRK_ENET_MAC | FEC_QUIRK_SWAP_FRAME | - FEC_QUIRK_SINGLE_MDIO | FEC_QUIRK_HAS_RACC, + FEC_QUIRK_SINGLE_MDIO | FEC_QUIRK_HAS_RACC | + FEC_QUIRK_HAS_FRREG, }, { .name = "imx6q-fec", .driver_data = FEC_QUIRK_ENET_MAC | FEC_QUIRK_HAS_GBIT | @@ -2163,7 +2165,13 @@ static void fec_enet_get_regs(struct net memset(buf, 0, regs->len);
for (i = 0; i < ARRAY_SIZE(fec_enet_register_offset); i++) { - off = fec_enet_register_offset[i] / 4; + off = fec_enet_register_offset[i]; + + if ((off == FEC_R_BOUND || off == FEC_R_FSTART) && + !(fep->quirks & FEC_QUIRK_HAS_FRREG)) + continue; + + off >>= 2; buf[off] = readl(&theregs[off]); } }
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: David Ahern dsahern@gmail.com
[ Upstream commit 4ba4c566ba8448a05e6257e0b98a21f1a0d55315 ]
The loop wants to skip previously dumped addresses, so loops until current index >= saved index. If the message fills it wants to save the index for the next address to dump - ie., the one that did not fit in the current message.
Currently, it is incrementing the index counter before comparing to the saved index, and then the saved index is off by 1 - it assumes the current address is going to fit in the message.
Change the index handling to increment only after a succesful dump.
Fixes: 502a2ffd7376a ("ipv6: convert idev_list to list macros") Signed-off-by: David Ahern dsahern@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv6/addrconf.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
--- a/net/ipv6/addrconf.c +++ b/net/ipv6/addrconf.c @@ -4930,8 +4930,8 @@ static int in6_dump_addrs(struct inet6_d
/* unicast address incl. temp addr */ list_for_each_entry(ifa, &idev->addr_list, if_list) { - if (++ip_idx < s_ip_idx) - continue; + if (ip_idx < s_ip_idx) + goto next; err = inet6_fill_ifaddr(skb, ifa, NETLINK_CB(cb->skb).portid, cb->nlh->nlmsg_seq, @@ -4940,6 +4940,8 @@ static int in6_dump_addrs(struct inet6_d if (err < 0) break; nl_dump_check_consistent(cb, nlmsg_hdr(skb)); +next: + ip_idx++; } break; }
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Dumazet edumazet@google.com
[ Upstream commit d48051c5b8376038c2b287c3b1bd55b8d391d567 ]
As shown by Dmitris, we need to use csum_block_add() instead of csum_add() when adding the FCS contribution to skb csum.
Before 4.18 (more exactly commit 88078d98d1bb "net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends"), the whole skb csum was thrown away, so RXFCS changes were ignored.
Then before commit d55bef5059dd ("net: fix pskb_trim_rcsum_slow() with odd trim offset") both mlx5 and pskb_trim_rcsum_slow() bugs were canceling each other.
Now we fixed pskb_trim_rcsum_slow() we need to fix mlx5.
Note that this patch also rewrites mlx5e_get_fcs() to :
- Use skb_header_pointer() instead of reinventing it. - Use __get_unaligned_cpu32() to avoid possible non aligned accesses as Dmitris pointed out.
Fixes: 902a545904c7 ("net/mlx5e: When RXFCS is set, add FCS data into checksum calculation") Reported-by: Paweł Staszewski pstaszewski@itcare.pl Signed-off-by: Eric Dumazet edumazet@google.com Cc: Eran Ben Elisha eranbe@mellanox.com Cc: Saeed Mahameed saeedm@mellanox.com Cc: Dimitris Michailidis dmichail@google.com Cc: Cong Wang xiyou.wangcong@gmail.com Cc: Paweł Staszewski pstaszewski@itcare.pl Reviewed-by: Eran Ben Elisha eranbe@mellanox.com Tested-By: Maria Pasechnik mariap@mellanox.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/mellanox/mlx5/core/en_rx.c | 45 ++++-------------------- 1 file changed, 9 insertions(+), 36 deletions(-)
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c @@ -693,43 +693,15 @@ static inline bool is_last_ethertype_ip( return (ethertype == htons(ETH_P_IP) || ethertype == htons(ETH_P_IPV6)); }
-static __be32 mlx5e_get_fcs(struct sk_buff *skb) +static u32 mlx5e_get_fcs(const struct sk_buff *skb) { - int last_frag_sz, bytes_in_prev, nr_frags; - u8 *fcs_p1, *fcs_p2; - skb_frag_t *last_frag; - __be32 fcs_bytes; - - if (!skb_is_nonlinear(skb)) - return *(__be32 *)(skb->data + skb->len - ETH_FCS_LEN); - - nr_frags = skb_shinfo(skb)->nr_frags; - last_frag = &skb_shinfo(skb)->frags[nr_frags - 1]; - last_frag_sz = skb_frag_size(last_frag); - - /* If all FCS data is in last frag */ - if (last_frag_sz >= ETH_FCS_LEN) - return *(__be32 *)(skb_frag_address(last_frag) + - last_frag_sz - ETH_FCS_LEN); - - fcs_p2 = (u8 *)skb_frag_address(last_frag); - bytes_in_prev = ETH_FCS_LEN - last_frag_sz; - - /* Find where the other part of the FCS is - Linear or another frag */ - if (nr_frags == 1) { - fcs_p1 = skb_tail_pointer(skb); - } else { - skb_frag_t *prev_frag = &skb_shinfo(skb)->frags[nr_frags - 2]; + const void *fcs_bytes; + u32 _fcs_bytes;
- fcs_p1 = skb_frag_address(prev_frag) + - skb_frag_size(prev_frag); - } - fcs_p1 -= bytes_in_prev; - - memcpy(&fcs_bytes, fcs_p1, bytes_in_prev); - memcpy(((u8 *)&fcs_bytes) + bytes_in_prev, fcs_p2, last_frag_sz); + fcs_bytes = skb_header_pointer(skb, skb->len - ETH_FCS_LEN, + ETH_FCS_LEN, &_fcs_bytes);
- return fcs_bytes; + return __get_unaligned_cpu32(fcs_bytes); }
static inline void mlx5e_handle_csum(struct net_device *netdev, @@ -762,8 +734,9 @@ static inline void mlx5e_handle_csum(str network_depth - ETH_HLEN, skb->csum); if (unlikely(netdev->features & NETIF_F_RXFCS)) - skb->csum = csum_add(skb->csum, - (__force __wsum)mlx5e_get_fcs(skb)); + skb->csum = csum_block_add(skb->csum, + (__force __wsum)mlx5e_get_fcs(skb), + skb->len - ETH_FCS_LEN); stats->csum_complete++; return; }
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jakub Kicinski jakub.kicinski@netronome.com
[ Upstream commit 38b4f18d56372e1e21771ab7b0357b853330186c ]
gred_change_table_def() takes a pointer to TCA_GRED_DPS attribute, and expects it will be able to interpret its contents as struct tc_gred_sopt. Pass the correct gred attribute, instead of TCA_OPTIONS.
This bug meant the table definition could never be changed after Qdisc was initialized (unless whatever TCA_OPTIONS contained both passed netlink validation and was a valid struct tc_gred_sopt...).
Old behaviour: $ ip link add type dummy $ tc qdisc replace dev dummy0 parent root handle 7: \ gred setup vqs 4 default 0 $ tc qdisc replace dev dummy0 parent root handle 7: \ gred setup vqs 4 default 0 RTNETLINK answers: Invalid argument
Now: $ ip link add type dummy $ tc qdisc replace dev dummy0 parent root handle 7: \ gred setup vqs 4 default 0 $ tc qdisc replace dev dummy0 parent root handle 7: \ gred setup vqs 4 default 0 $ tc qdisc replace dev dummy0 parent root handle 7: \ gred setup vqs 4 default 0
Fixes: f62d6b936df5 ("[PKT_SCHED]: GRED: Use central VQ change procedure") Signed-off-by: Jakub Kicinski jakub.kicinski@netronome.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/sched/sch_gred.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/net/sched/sch_gred.c +++ b/net/sched/sch_gred.c @@ -413,7 +413,7 @@ static int gred_change(struct Qdisc *sch if (tb[TCA_GRED_PARMS] == NULL && tb[TCA_GRED_STAB] == NULL) { if (tb[TCA_GRED_LIMIT] != NULL) sch->limit = nla_get_u32(tb[TCA_GRED_LIMIT]); - return gred_change_table_def(sch, opt); + return gred_change_table_def(sch, tb[TCA_GRED_DPS]); }
if (tb[TCA_GRED_PARMS] == NULL ||
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Wenwen Wang wang6495@umn.edu
[ Upstream commit b6168562c8ce2bd5a30e213021650422e08764dc ]
In ethtool_ioctl(), the ioctl command 'ethcmd' is checked through a switch statement to see whether it is necessary to pre-process the ethtool structure, because, as mentioned in the comment, the structure ethtool_rxnfc is defined with padding. If yes, a user-space buffer 'rxnfc' is allocated through compat_alloc_user_space(). One thing to note here is that, if 'ethcmd' is ETHTOOL_GRXCLSRLALL, the size of the buffer 'rxnfc' is partially determined by 'rule_cnt', which is actually acquired from the user-space buffer 'compat_rxnfc', i.e., 'compat_rxnfc->rule_cnt', through get_user(). After 'rxnfc' is allocated, the data in the original user-space buffer 'compat_rxnfc' is then copied to 'rxnfc' through copy_in_user(), including the 'rule_cnt' field. However, after this copy, no check is re-enforced on 'rxnfc->rule_cnt'. So it is possible that a malicious user race to change the value in the 'compat_rxnfc->rule_cnt' between these two copies. Through this way, the attacker can bypass the previous check on 'rule_cnt' and inject malicious data. This can cause undefined behavior of the kernel and introduce potential security risk.
This patch avoids the above issue via copying the value acquired by get_user() to 'rxnfc->rule_cn', if 'ethcmd' is ETHTOOL_GRXCLSRLALL.
Signed-off-by: Wenwen Wang wang6495@umn.edu Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/socket.c | 11 ++++++++--- 1 file changed, 8 insertions(+), 3 deletions(-)
--- a/net/socket.c +++ b/net/socket.c @@ -2887,9 +2887,14 @@ static int ethtool_ioctl(struct net *net copy_in_user(&rxnfc->fs.ring_cookie, &compat_rxnfc->fs.ring_cookie, (void __user *)(&rxnfc->fs.location + 1) - - (void __user *)&rxnfc->fs.ring_cookie) || - copy_in_user(&rxnfc->rule_cnt, &compat_rxnfc->rule_cnt, - sizeof(rxnfc->rule_cnt))) + (void __user *)&rxnfc->fs.ring_cookie)) + return -EFAULT; + if (ethcmd == ETHTOOL_GRXCLSRLALL) { + if (put_user(rule_cnt, &rxnfc->rule_cnt)) + return -EFAULT; + } else if (copy_in_user(&rxnfc->rule_cnt, + &compat_rxnfc->rule_cnt, + sizeof(rxnfc->rule_cnt))) return -EFAULT; }
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Niklas Cassel niklas.cassel@linaro.org
[ Upstream commit 30549aab146ccb1275230c3b4b4bc6b4181fd54e ]
When building stmmac, it is only possible to select CONFIG_DWMAC_GENERIC, or any of the glue drivers, when CONFIG_STMMAC_PLATFORM is set. The only exception is CONFIG_STMMAC_PCI.
When calling of_mdiobus_register(), it will call our ->reset() callback, which is set to stmmac_mdio_reset().
Most of the code in stmmac_mdio_reset() is protected by a "#if defined(CONFIG_STMMAC_PLATFORM)", which will evaluate to false when CONFIG_STMMAC_PLATFORM=m.
Because of this, the phy reset gpio will only be pulled when stmmac is built as built-in, but not when built as modules.
Fix this by using "#if IS_ENABLED()" instead of "#if defined()".
Signed-off-by: Niklas Cassel niklas.cassel@linaro.org Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/stmicro/stmmac/stmmac_mdio.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_mdio.c +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_mdio.c @@ -133,7 +133,7 @@ static int stmmac_mdio_write(struct mii_ */ int stmmac_mdio_reset(struct mii_bus *bus) { -#if defined(CONFIG_STMMAC_PLATFORM) +#if IS_ENABLED(CONFIG_STMMAC_PLATFORM) struct net_device *ndev = bus->priv; struct stmmac_priv *priv = netdev_priv(ndev); unsigned int mii_address = priv->hw->mii.addr;
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sean Tranchetti stranche@codeaurora.org
[ Upstream commit db4f1be3ca9b0ef7330763d07bf4ace83ad6f913 ]
Current handling of CHECKSUM_COMPLETE packets by the UDP stack is incorrect for any packet that has an incorrect checksum value.
udp4/6_csum_init() will both make a call to __skb_checksum_validate_complete() to initialize/validate the csum field when receiving a CHECKSUM_COMPLETE packet. When this packet fails validation, skb->csum will be overwritten with the pseudoheader checksum so the packet can be fully validated by software, but the skb->ip_summed value will be left as CHECKSUM_COMPLETE so that way the stack can later warn the user about their hardware spewing bad checksums. Unfortunately, leaving the SKB in this state can cause problems later on in the checksum calculation.
Since the the packet is still marked as CHECKSUM_COMPLETE, udp_csum_pull_header() will SUBTRACT the checksum of the UDP header from skb->csum instead of adding it, leaving us with a garbage value in that field. Once we try to copy the packet to userspace in the udp4/6_recvmsg(), we'll make a call to skb_copy_and_csum_datagram_msg() to checksum the packet data and add it in the garbage skb->csum value to perform our final validation check.
Since the value we're validating is not the proper checksum, it's possible that the folded value could come out to 0, causing us not to drop the packet. Instead, we believe that the packet was checksummed incorrectly by hardware since skb->ip_summed is still CHECKSUM_COMPLETE, and we attempt to warn the user with netdev_rx_csum_fault(skb->dev);
Unfortunately, since this is the UDP path, skb->dev has been overwritten by skb->dev_scratch and is no longer a valid pointer, so we end up reading invalid memory.
This patch addresses this problem in two ways: 1) Do not use the dev pointer when calling netdev_rx_csum_fault() from skb_copy_and_csum_datagram_msg(). Since this gets called from the UDP path where skb->dev has been overwritten, we have no way of knowing if the pointer is still valid. Also for the sake of consistency with the other uses of netdev_rx_csum_fault(), don't attempt to call it if the packet was checksummed by software.
2) Add better CHECKSUM_COMPLETE handling to udp4/6_csum_init(). If we receive a packet that's CHECKSUM_COMPLETE that fails verification (i.e. skb->csum_valid == 0), check who performed the calculation. It's possible that the checksum was done in software by the network stack earlier (such as Netfilter's CONNTRACK module), and if that says the checksum is bad, we can drop the packet immediately instead of waiting until we try and copy it to userspace. Otherwise, we need to mark the SKB as CHECKSUM_NONE, since the skb->csum field no longer contains the full packet checksum after the call to __skb_checksum_validate_complete().
Fixes: e6afc8ace6dd ("udp: remove headers from UDP packets before queueing") Fixes: c84d949057ca ("udp: copy skb->truesize in the first cache line") Cc: Sam Kumar samanthakumar@google.com Cc: Eric Dumazet edumazet@google.com Signed-off-by: Sean Tranchetti stranche@codeaurora.org Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/core/datagram.c | 5 +++-- net/ipv4/udp.c | 20 ++++++++++++++++++-- net/ipv6/ip6_checksum.c | 20 ++++++++++++++++++-- 3 files changed, 39 insertions(+), 6 deletions(-)
--- a/net/core/datagram.c +++ b/net/core/datagram.c @@ -808,8 +808,9 @@ int skb_copy_and_csum_datagram_msg(struc return -EINVAL; }
- if (unlikely(skb->ip_summed == CHECKSUM_COMPLETE)) - netdev_rx_csum_fault(skb->dev); + if (unlikely(skb->ip_summed == CHECKSUM_COMPLETE) && + !skb->csum_complete_sw) + netdev_rx_csum_fault(NULL); } return 0; fault: --- a/net/ipv4/udp.c +++ b/net/ipv4/udp.c @@ -2124,8 +2124,24 @@ static inline int udp4_csum_init(struct /* Note, we are only interested in != 0 or == 0, thus the * force to int. */ - return (__force int)skb_checksum_init_zero_check(skb, proto, uh->check, - inet_compute_pseudo); + err = (__force int)skb_checksum_init_zero_check(skb, proto, uh->check, + inet_compute_pseudo); + if (err) + return err; + + if (skb->ip_summed == CHECKSUM_COMPLETE && !skb->csum_valid) { + /* If SW calculated the value, we know it's bad */ + if (skb->csum_complete_sw) + return 1; + + /* HW says the value is bad. Let's validate that. + * skb->csum is no longer the full packet checksum, + * so don't treat it as such. + */ + skb_checksum_complete_unset(skb); + } + + return 0; }
/* wrapper for udp_queue_rcv_skb tacking care of csum conversion and --- a/net/ipv6/ip6_checksum.c +++ b/net/ipv6/ip6_checksum.c @@ -88,8 +88,24 @@ int udp6_csum_init(struct sk_buff *skb, * Note, we are only interested in != 0 or == 0, thus the * force to int. */ - return (__force int)skb_checksum_init_zero_check(skb, proto, uh->check, - ip6_compute_pseudo); + err = (__force int)skb_checksum_init_zero_check(skb, proto, uh->check, + ip6_compute_pseudo); + if (err) + return err; + + if (skb->ip_summed == CHECKSUM_COMPLETE && !skb->csum_valid) { + /* If SW calculated the value, we know it's bad */ + if (skb->csum_complete_sw) + return 1; + + /* HW says the value is bad. Let's validate that. + * skb->csum is no longer the full packet checksum, + * so don't treat is as such. + */ + skb_checksum_complete_unset(skb); + } + + return 0; } EXPORT_SYMBOL(udp6_csum_init);
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Heiner Kallweit hkallweit1@gmail.com
[ Upstream commit 6b839b6cf9eada30b086effb51e5d6076bafc761 ]
rtl_rx() and rtl_tx() are called only if the respective bits are set in the interrupt status register. Under high load NAPI may not be able to process all data (work_done == budget) and it will schedule subsequent calls to the poll callback. rtl_ack_events() however resets the bits in the interrupt status register, therefore subsequent calls to rtl8169_poll() won't call rtl_rx() and rtl_tx() - chip interrupts are still disabled.
Fix this by calling rtl_rx() and rtl_tx() independent of the bits set in the interrupt status register. Both functions will detect if there's nothing to do for them.
Fixes: da78dbff2e05 ("r8169: remove work from irq handler.") Signed-off-by: Heiner Kallweit hkallweit1@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/realtek/r8169.c | 8 +++----- 1 file changed, 3 insertions(+), 5 deletions(-)
--- a/drivers/net/ethernet/realtek/r8169.c +++ b/drivers/net/ethernet/realtek/r8169.c @@ -7044,17 +7044,15 @@ static int rtl8169_poll(struct napi_stru struct rtl8169_private *tp = container_of(napi, struct rtl8169_private, napi); struct net_device *dev = tp->dev; u16 enable_mask = RTL_EVENT_NAPI | tp->event_slow; - int work_done= 0; + int work_done; u16 status;
status = rtl_get_events(tp); rtl_ack_events(tp, status & ~tp->event_slow);
- if (status & RTL_EVENT_NAPI_RX) - work_done = rtl_rx(dev, tp, (u32) budget); + work_done = rtl_rx(dev, tp, (u32) budget);
- if (status & RTL_EVENT_NAPI_TX) - rtl_tx(dev, tp); + rtl_tx(dev, tp);
if (status & tp->event_slow) { enable_mask &= ~tp->event_slow;
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ido Schimmel idosch@mellanox.com
[ Upstream commit da71577545a52be3e0e9225a946e5fd79cfab015 ]
When an FDB entry is configured, the address is validated to have the length of an Ethernet address, but the device for which the address is configured can be of any type.
The above can result in the use of uninitialized memory when the address is later compared against existing addresses since 'dev->addr_len' is used and it may be greater than ETH_ALEN, as with ip6tnl devices.
Fix this by making sure that FDB entries are only configured for Ethernet devices.
BUG: KMSAN: uninit-value in memcmp+0x11d/0x180 lib/string.c:863 CPU: 1 PID: 4318 Comm: syz-executor998 Not tainted 4.19.0-rc3+ #49 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x14b/0x190 lib/dump_stack.c:113 kmsan_report+0x183/0x2b0 mm/kmsan/kmsan.c:956 __msan_warning+0x70/0xc0 mm/kmsan/kmsan_instr.c:645 memcmp+0x11d/0x180 lib/string.c:863 dev_uc_add_excl+0x165/0x7b0 net/core/dev_addr_lists.c:464 ndo_dflt_fdb_add net/core/rtnetlink.c:3463 [inline] rtnl_fdb_add+0x1081/0x1270 net/core/rtnetlink.c:3558 rtnetlink_rcv_msg+0xa0b/0x1530 net/core/rtnetlink.c:4715 netlink_rcv_skb+0x36e/0x5f0 net/netlink/af_netlink.c:2454 rtnetlink_rcv+0x50/0x60 net/core/rtnetlink.c:4733 netlink_unicast_kernel net/netlink/af_netlink.c:1317 [inline] netlink_unicast+0x1638/0x1720 net/netlink/af_netlink.c:1343 netlink_sendmsg+0x1205/0x1290 net/netlink/af_netlink.c:1908 sock_sendmsg_nosec net/socket.c:621 [inline] sock_sendmsg net/socket.c:631 [inline] ___sys_sendmsg+0xe70/0x1290 net/socket.c:2114 __sys_sendmsg net/socket.c:2152 [inline] __do_sys_sendmsg net/socket.c:2161 [inline] __se_sys_sendmsg+0x2a3/0x3d0 net/socket.c:2159 __x64_sys_sendmsg+0x4a/0x70 net/socket.c:2159 do_syscall_64+0xb8/0x100 arch/x86/entry/common.c:291 entry_SYSCALL_64_after_hwframe+0x63/0xe7 RIP: 0033:0x440ee9 Code: e8 cc ab 02 00 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 bb 0a fc ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007fff6a93b518 EFLAGS: 00000213 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 0000000000440ee9 RDX: 0000000000000000 RSI: 0000000020000240 RDI: 0000000000000003 RBP: 0000000000000000 R08: 00000000004002c8 R09: 00000000004002c8 R10: 00000000004002c8 R11: 0000000000000213 R12: 000000000000b4b0 R13: 0000000000401ec0 R14: 0000000000000000 R15: 0000000000000000
Uninit was created at: kmsan_save_stack_with_flags mm/kmsan/kmsan.c:256 [inline] kmsan_internal_poison_shadow+0xb8/0x1b0 mm/kmsan/kmsan.c:181 kmsan_kmalloc+0x98/0x100 mm/kmsan/kmsan_hooks.c:91 kmsan_slab_alloc+0x10/0x20 mm/kmsan/kmsan_hooks.c:100 slab_post_alloc_hook mm/slab.h:446 [inline] slab_alloc_node mm/slub.c:2718 [inline] __kmalloc_node_track_caller+0x9e7/0x1160 mm/slub.c:4351 __kmalloc_reserve net/core/skbuff.c:138 [inline] __alloc_skb+0x2f5/0x9e0 net/core/skbuff.c:206 alloc_skb include/linux/skbuff.h:996 [inline] netlink_alloc_large_skb net/netlink/af_netlink.c:1189 [inline] netlink_sendmsg+0xb49/0x1290 net/netlink/af_netlink.c:1883 sock_sendmsg_nosec net/socket.c:621 [inline] sock_sendmsg net/socket.c:631 [inline] ___sys_sendmsg+0xe70/0x1290 net/socket.c:2114 __sys_sendmsg net/socket.c:2152 [inline] __do_sys_sendmsg net/socket.c:2161 [inline] __se_sys_sendmsg+0x2a3/0x3d0 net/socket.c:2159 __x64_sys_sendmsg+0x4a/0x70 net/socket.c:2159 do_syscall_64+0xb8/0x100 arch/x86/entry/common.c:291 entry_SYSCALL_64_after_hwframe+0x63/0xe7
v2: * Make error message more specific (David)
Fixes: 090096bf3db1 ("net: generic fdb support for drivers without ndo_fdb_<op>") Signed-off-by: Ido Schimmel idosch@mellanox.com Reported-and-tested-by: syzbot+3a288d5f5530b901310e@syzkaller.appspotmail.com Reported-and-tested-by: syzbot+d53ab4e92a1db04110ff@syzkaller.appspotmail.com Cc: Vlad Yasevich vyasevich@gmail.com Cc: David Ahern dsahern@gmail.com Reviewed-by: David Ahern dsahern@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/core/rtnetlink.c | 10 ++++++++++ 1 file changed, 10 insertions(+)
--- a/net/core/rtnetlink.c +++ b/net/core/rtnetlink.c @@ -3480,6 +3480,11 @@ static int rtnl_fdb_add(struct sk_buff * return -EINVAL; }
+ if (dev->type != ARPHRD_ETHER) { + NL_SET_ERR_MSG(extack, "FDB delete only supported for Ethernet devices"); + return -EINVAL; + } + addr = nla_data(tb[NDA_LLADDR]);
err = fdb_vid_parse(tb[NDA_VLAN], &vid, extack); @@ -3584,6 +3589,11 @@ static int rtnl_fdb_del(struct sk_buff * return -EINVAL; }
+ if (dev->type != ARPHRD_ETHER) { + NL_SET_ERR_MSG(extack, "FDB add only supported for Ethernet devices"); + return -EINVAL; + } + addr = nla_data(tb[NDA_LLADDR]);
err = fdb_vid_parse(tb[NDA_VLAN], &vid, extack);
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Marcelo Ricardo Leitner marcelo.leitner@gmail.com
[ Upstream commit b336decab22158937975293aea79396525f92bb3 ]
syzbot reported an use-after-free involving sctp_id2asoc. Dmitry Vyukov helped to root cause it and it is because of reading the asoc after it was freed:
CPU 1 CPU 2 (working on socket 1) (working on socket 2) sctp_association_destroy sctp_id2asoc spin lock grab the asoc from idr spin unlock spin lock remove asoc from idr spin unlock free(asoc) if asoc->base.sk != sk ... [*]
This can only be hit if trying to fetch asocs from different sockets. As we have a single IDR for all asocs, in all SCTP sockets, their id is unique on the system. An application can try to send stuff on an id that matches on another socket, and the if in [*] will protect from such usage. But it didn't consider that as that asoc may belong to another socket, it may be freed in parallel (read: under another socket lock).
We fix it by moving the checks in [*] into the protected region. This fixes it because the asoc cannot be freed while the lock is held.
Reported-by: syzbot+c7dd55d7aec49d48e49a@syzkaller.appspotmail.com Acked-by: Dmitry Vyukov dvyukov@google.com Signed-off-by: Marcelo Ricardo Leitner marcelo.leitner@gmail.com Acked-by: Neil Horman nhorman@tuxdriver.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/sctp/socket.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-)
--- a/net/sctp/socket.c +++ b/net/sctp/socket.c @@ -270,11 +270,10 @@ struct sctp_association *sctp_id2assoc(s
spin_lock_bh(&sctp_assocs_id_lock); asoc = (struct sctp_association *)idr_find(&sctp_assocs_id, (int)id); + if (asoc && (asoc->base.sk != sk || asoc->base.dead)) + asoc = NULL; spin_unlock_bh(&sctp_assocs_id_lock);
- if (!asoc || (asoc->base.sk != sk) || asoc->base.dead) - return NULL; - return asoc; }
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tung Nguyen tung.q.nguyen@dektech.com.au
[ Upstream commit d3092b2efca1cd1d492d0b08499a2066c5ca8cec ]
The binding table's 'cluster_scope' list is rcu protected to handle races between threads changing the list and those traversing the list at the same moment. We have now found that the function named_distribute() uses the regular list_for_each() macro to traverse the said list. Likewise, the function tipc_named_withdraw() is removing items from the same list using the regular list_del() call. When these two functions execute in parallel we see occasional crashes.
This commit fixes this by adding the missing _rcu() suffixes.
Signed-off-by: Tung Nguyen tung.q.nguyen@dektech.com.au Signed-off-by: Jon Maloy jon.maloy@ericsson.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/tipc/name_distr.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/net/tipc/name_distr.c +++ b/net/tipc/name_distr.c @@ -115,7 +115,7 @@ struct sk_buff *tipc_named_withdraw(stru struct sk_buff *buf; struct distr_item *item;
- list_del(&publ->binding_node); + list_del_rcu(&publ->binding_node);
if (publ->scope == TIPC_NODE_SCOPE) return NULL; @@ -147,7 +147,7 @@ static void named_distribute(struct net ITEM_SIZE) * ITEM_SIZE; u32 msg_rem = msg_dsz;
- list_for_each_entry(publ, pls, binding_node) { + list_for_each_entry_rcu(publ, pls, binding_node) { /* Prepare next buffer: */ if (!skb) { skb = named_prepare_buf(net, PUBLICATION, msg_rem,
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Paolo Abeni pabeni@redhat.com
[ Upstream commit 84dad55951b0d009372ec21760b650634246e144 ]
The commit eb63f2964dbe ("udp6: add missing checks on edumux packet processing") used the same return code convention of the ipv4 counterpart, but ipv6 uses the opposite one: positive values means resubmit.
This change addresses the issue, using positive return value for resubmitting. Also update the related comment, which was broken, too.
Fixes: eb63f2964dbe ("udp6: add missing checks on edumux packet processing") Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv6/udp.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-)
--- a/net/ipv6/udp.c +++ b/net/ipv6/udp.c @@ -762,11 +762,9 @@ static int udp6_unicast_rcv_skb(struct s
ret = udpv6_queue_rcv_skb(sk, skb);
- /* a return value > 0 means to resubmit the input, but - * it wants the return to be -protocol, or 0 - */ + /* a return value > 0 means to resubmit the input */ if (ret > 0) - return -ret; + return ret; return 0; }
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jason Wang jasowang@redhat.com
[ Upstream commit ff002269a4ee9c769dbf9365acef633ebcbd6cbe ]
The idx in vhost_vring_ioctl() was controlled by userspace, hence a potential exploitation of the Spectre variant 1 vulnerability.
Fixing this by sanitizing idx before using it to index d->vqs.
Cc: Michael S. Tsirkin mst@redhat.com Cc: Josh Poimboeuf jpoimboe@redhat.com Cc: Andrea Arcangeli aarcange@redhat.com Signed-off-by: Jason Wang jasowang@redhat.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/vhost/vhost.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/drivers/vhost/vhost.c +++ b/drivers/vhost/vhost.c @@ -30,6 +30,7 @@ #include <linux/sched/mm.h> #include <linux/sched/signal.h> #include <linux/interval_tree_generic.h> +#include <linux/nospec.h>
#include "vhost.h"
@@ -1362,6 +1363,7 @@ long vhost_vring_ioctl(struct vhost_dev if (idx >= d->nvqs) return -ENOBUFS;
+ idx = array_index_nospec(idx, d->nvqs); vq = d->vqs[idx];
mutex_lock(&vq->mutex);
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ake Koomsin ake@igel.co.jp
[ Upstream commit 05c998b738fdd3e5d6a257bcacc8f34b6284d795 ]
Commit 713a98d90c5e ("virtio-net: serialize tx routine during reset") introduces netif_tx_disable() after netif_device_detach() in order to avoid use-after-free of tx queues. However, there are two issues.
1) Its operation is redundant with netif_device_detach() in case the interface is running. 2) In case of the interface is not running before suspending and resuming, the tx does not get resumed by netif_device_attach(). This results in losing network connectivity.
It is better to use netif_tx_lock_bh()/netif_tx_unlock_bh() instead for serializing tx routine during reset. This also preserves the symmetry of netif_device_detach() and netif_device_attach().
Fixes commit 713a98d90c5e ("virtio-net: serialize tx routine during reset") Signed-off-by: Ake Koomsin ake@igel.co.jp Acked-by: Jason Wang jasowang@redhat.com Acked-by: Michael S. Tsirkin mst@redhat.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/virtio_net.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
--- a/drivers/net/virtio_net.c +++ b/drivers/net/virtio_net.c @@ -2162,8 +2162,9 @@ static void virtnet_freeze_down(struct v /* Make sure no work handler is accessing the device */ flush_work(&vi->config_work);
+ netif_tx_lock_bh(vi->dev); netif_device_detach(vi->dev); - netif_tx_disable(vi->dev); + netif_tx_unlock_bh(vi->dev); cancel_delayed_work_sync(&vi->refill);
if (netif_running(vi->dev)) { @@ -2199,7 +2200,9 @@ static int virtnet_restore_up(struct vir } }
+ netif_tx_lock_bh(vi->dev); netif_device_attach(vi->dev); + netif_tx_unlock_bh(vi->dev); return err; }
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Wenwen Wang wang6495@umn.edu
[ Upstream commit 58f5bbe331c566f49c9559568f982202a278aa78 ]
In dev_ethtool(), the eth command 'ethcmd' is firstly copied from the use-space buffer 'useraddr' and checked to see whether it is ETHTOOL_PERQUEUE. If yes, the sub-command 'sub_cmd' is further copied from the user space. Otherwise, 'sub_cmd' is the same as 'ethcmd'. Next, according to 'sub_cmd', a permission check is enforced through the function ns_capable(). For example, the permission check is required if 'sub_cmd' is ETHTOOL_SCOALESCE, but it is not necessary if 'sub_cmd' is ETHTOOL_GCOALESCE, as suggested in the comment "Allow some commands to be done by anyone". The following execution invokes different handlers according to 'ethcmd'. Specifically, if 'ethcmd' is ETHTOOL_PERQUEUE, ethtool_set_per_queue() is called. In ethtool_set_per_queue(), the kernel object 'per_queue_opt' is copied again from the user-space buffer 'useraddr' and 'per_queue_opt.sub_command' is used to determine which operation should be performed. Given that the buffer 'useraddr' is in the user space, a malicious user can race to change the sub-command between the two copies. In particular, the attacker can supply ETHTOOL_PERQUEUE and ETHTOOL_GCOALESCE to bypass the permission check in dev_ethtool(). Then before ethtool_set_per_queue() is called, the attacker changes ETHTOOL_GCOALESCE to ETHTOOL_SCOALESCE. In this way, the attacker can bypass the permission check and execute ETHTOOL_SCOALESCE.
This patch enforces a check in ethtool_set_per_queue() after the second copy from 'useraddr'. If the sub-command is different from the one obtained in the first copy in dev_ethtool(), an error code EINVAL will be returned.
Fixes: f38d138a7da6 ("net/ethtool: support set coalesce per queue") Signed-off-by: Wenwen Wang wang6495@umn.edu Reviewed-by: Michal Kubecek mkubecek@suse.cz Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/core/ethtool.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
--- a/net/core/ethtool.c +++ b/net/core/ethtool.c @@ -2461,13 +2461,17 @@ roll_back: return ret; }
-static int ethtool_set_per_queue(struct net_device *dev, void __user *useraddr) +static int ethtool_set_per_queue(struct net_device *dev, + void __user *useraddr, u32 sub_cmd) { struct ethtool_per_queue_op per_queue_opt;
if (copy_from_user(&per_queue_opt, useraddr, sizeof(per_queue_opt))) return -EFAULT;
+ if (per_queue_opt.sub_command != sub_cmd) + return -EINVAL; + switch (per_queue_opt.sub_command) { case ETHTOOL_GCOALESCE: return ethtool_get_per_queue_coalesce(dev, useraddr, &per_queue_opt); @@ -2838,7 +2842,7 @@ int dev_ethtool(struct net *net, struct rc = ethtool_get_phy_stats(dev, useraddr); break; case ETHTOOL_PERQUEUE: - rc = ethtool_set_per_queue(dev, useraddr); + rc = ethtool_set_per_queue(dev, useraddr, sub_cmd); break; case ETHTOOL_GLINKSETTINGS: rc = ethtool_get_link_ksettings(dev, useraddr);
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tobias Jungel tobias.jungel@gmail.com
[ Upstream commit 414dd6fb9a1a1b59983aea7bf0f79f0085ecc5b8 ]
The attribute IFLA_BOND_AD_ACTOR_SYSTEM is sent to user space having the length of sizeof(bond->params.ad_actor_system) which is 8 byte. This patch aligns the length to ETH_ALEN to have the same MAC address exposed as using sysfs.
Fixes: f87fda00b6ed2 ("bonding: prevent out of bound accesses") Signed-off-by: Tobias Jungel tobias.jungel@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/bonding/bond_netlink.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
--- a/drivers/net/bonding/bond_netlink.c +++ b/drivers/net/bonding/bond_netlink.c @@ -638,8 +638,7 @@ static int bond_fill_info(struct sk_buff goto nla_put_failure;
if (nla_put(skb, IFLA_BOND_AD_ACTOR_SYSTEM, - sizeof(bond->params.ad_actor_system), - &bond->params.ad_actor_system)) + ETH_ALEN, &bond->params.ad_actor_system)) goto nla_put_failure; } if (!bond_3ad_get_active_agg_info(bond, &info)) {
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Stefano Brivio sbrivio@redhat.com
[ Upstream commit d4d576f5ab7edcb757bb33e6a5600666a0b1232d ]
Commit 058214a4d1df ("ip6_tun: Add infrastructure for doing encapsulation") added the ip6_tnl_encap() call in ip6_tnl_xmit(), before the call to ipv6_push_frag_opts() to append the IPv6 Tunnel Encapsulation Limit option (option 4, RFC 2473, par. 5.1) to the outer IPv6 header.
As long as the option didn't actually end up in generated packets, this wasn't an issue. Then commit 89a23c8b528b ("ip6_tunnel: Fix missing tunnel encapsulation limit option") fixed sending of this option, and the resulting layout, e.g. for FoU, is:
.-------------------.------------.----------.-------------------.----- - - | Outer IPv6 Header | UDP header | Option 4 | Inner IPv6 Header | Payload '-------------------'------------'----------'-------------------'----- - -
Needless to say, FoU and GUE (at least) won't work over IPv6. The option is appended by default, and I couldn't find a way to disable it with the current iproute2.
Turn this into a more reasonable:
.-------------------.----------.------------.-------------------.----- - - | Outer IPv6 Header | Option 4 | UDP header | Inner IPv6 Header | Payload '-------------------'----------'------------'-------------------'----- - -
With this, and with 84dad55951b0 ("udp6: fix encap return code for resubmitting"), FoU and GUE work again over IPv6.
Fixes: 058214a4d1df ("ip6_tun: Add infrastructure for doing encapsulation") Signed-off-by: Stefano Brivio sbrivio@redhat.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv6/ip6_tunnel.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-)
--- a/net/ipv6/ip6_tunnel.c +++ b/net/ipv6/ip6_tunnel.c @@ -1184,10 +1184,6 @@ route_lookup: } skb_dst_set(skb, dst);
- if (encap_limit >= 0) { - init_tel_txopt(&opt, encap_limit); - ipv6_push_frag_opts(skb, &opt.ops, &proto); - } hop_limit = hop_limit ? : ip6_dst_hoplimit(dst);
/* Calculate max headroom for all the headers and adjust @@ -1202,6 +1198,11 @@ route_lookup: if (err) return err;
+ if (encap_limit >= 0) { + init_tel_txopt(&opt, encap_limit); + ipv6_push_frag_opts(skb, &opt.ops, &proto); + } + skb_push(skb, sizeof(struct ipv6hdr)); skb_reset_network_header(skb); ipv6h = ipv6_hdr(skb);
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: "Jaime Caama�o Ruiz" jcaamano@suse.com
[ Upstream commit 46ebe2834ba5b541f28ee72e556a3fed42c47570 ]
When there are both pop and push ethernet header actions among the actions to be applied to a packet, an unexpected EINVAL (Invalid argument) error is obtained. This is due to mac_proto not being reset correctly when those actions are validated.
Reported-at: https://mail.openvswitch.org/pipermail/ovs-discuss/2018-October/047554.html Fixes: 91820da6ae85 ("openvswitch: add Ethernet push and pop actions") Signed-off-by: Jaime Caamaño Ruiz jcaamano@suse.com Tested-by: Greg Rose gvrose8192@gmail.com Reviewed-by: Greg Rose gvrose8192@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/openvswitch/flow_netlink.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/net/openvswitch/flow_netlink.c +++ b/net/openvswitch/flow_netlink.c @@ -2990,7 +2990,7 @@ static int __ovs_nla_copy_actions(struct * is already present */ if (mac_proto != MAC_PROTO_NONE) return -EINVAL; - mac_proto = MAC_PROTO_NONE; + mac_proto = MAC_PROTO_ETHERNET; break;
case OVS_ACTION_ATTR_POP_ETH: @@ -2998,7 +2998,7 @@ static int __ovs_nla_copy_actions(struct return -EINVAL; if (vlan_tci & htons(VLAN_TAG_PRESENT)) return -EINVAL; - mac_proto = MAC_PROTO_ETHERNET; + mac_proto = MAC_PROTO_NONE; break;
case OVS_ACTION_ATTR_PUSH_NSH:
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nikolay Aleksandrov nikolay@cumulusnetworks.com
[ Upstream commit eddf016b910486d2123675a6b5fd7d64f77cdca8 ]
If the skb space ends in an unresolved entry while dumping we'll miss some unresolved entries. The reason is due to zeroing the entry counter between dumping resolved and unresolved mfc entries. We should just keep counting until the whole table is dumped and zero when we move to the next as we have a separate table counter.
Reported-by: Colin Ian King colin.king@canonical.com Fixes: 8fb472c09b9d ("ipmr: improve hash scalability") Signed-off-by: Nikolay Aleksandrov nikolay@cumulusnetworks.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv4/ipmr_base.c | 2 -- 1 file changed, 2 deletions(-)
--- a/net/ipv4/ipmr_base.c +++ b/net/ipv4/ipmr_base.c @@ -295,8 +295,6 @@ int mr_rtm_dumproute(struct sk_buff *skb next_entry: e++; } - e = 0; - s_e = 0;
spin_lock_bh(lock); list_for_each_entry(mfc, &mrt->mfc_unres_queue, list) {
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Huy Nguyen huyn@mellanox.com
[ Upstream commit a48bc513159d4767f9988f0d857b2b0c38a4d614 ]
The HW spec defines only bits 24-26 of pftype_wq as the page fault type, use the required mask to ensure that.
Fixes: d9aaed838765 ("{net,IB}/mlx5: Refactor page fault handling") Signed-off-by: Huy Nguyen huyn@mellanox.com Signed-off-by: Eli Cohen eli@mellanox.com Signed-off-by: Saeed Mahameed saeedm@mellanox.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/mellanox/mlx5/core/eq.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/net/ethernet/mellanox/mlx5/core/eq.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/eq.c @@ -269,7 +269,7 @@ static void eq_pf_process(struct mlx5_eq case MLX5_PFAULT_SUBTYPE_WQE: /* WQE based event */ pfault->type = - be32_to_cpu(pf_eqe->wqe.pftype_wq) >> 24; + (be32_to_cpu(pf_eqe->wqe.pftype_wq) >> 24) & 0x7; pfault->token = be32_to_cpu(pf_eqe->wqe.token); pfault->wqe.wq_num =
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Florian Fainelli f.fainelli@gmail.com
[ Upstream commit 64bd9c8135751b561f27edaaffe93d07093f81af ]
On GENETv5, there is a hardware issue which prevents the GENET hardware from generating a link UP interrupt when the link is operating at 10Mbits/sec. Since we do not have any way to configure the link detection logic, fallback to polling in that case.
Fixes: 421380856d9c ("net: bcmgenet: add support for the GENETv5 hardware") Signed-off-by: Florian Fainelli f.fainelli@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/broadcom/genet/bcmmii.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-)
--- a/drivers/net/ethernet/broadcom/genet/bcmmii.c +++ b/drivers/net/ethernet/broadcom/genet/bcmmii.c @@ -321,9 +321,12 @@ int bcmgenet_mii_probe(struct net_device phydev->advertising = phydev->supported;
/* The internal PHY has its link interrupts routed to the - * Ethernet MAC ISRs + * Ethernet MAC ISRs. On GENETv5 there is a hardware issue + * that prevents the signaling of link UP interrupts when + * the link operates at 10Mbps, so fallback to polling for + * those versions of GENET. */ - if (priv->internal_phy) + if (priv->internal_phy && !GENET_IS_V5(priv)) dev->phydev->irq = PHY_IGNORE_INTERRUPT;
return 0;
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Phil Sutter phil@nwl.cc
[ Upstream commit 3c53ed8fef6881a864f0ee8240ed2793ef73ad0d ]
When dumping classes by parent, kernel would return classes twice:
| # tc qdisc add dev lo root prio | # tc class show dev lo | class prio 8001:1 parent 8001: | class prio 8001:2 parent 8001: | class prio 8001:3 parent 8001: | # tc class show dev lo parent 8001: | class prio 8001:1 parent 8001: | class prio 8001:2 parent 8001: | class prio 8001:3 parent 8001: | class prio 8001:1 parent 8001: | class prio 8001:2 parent 8001: | class prio 8001:3 parent 8001:
This comes from qdisc_match_from_root() potentially returning the root qdisc itself if its handle matched. Though in that case, root's classes were already dumped a few lines above.
Fixes: cb395b2010879 ("net: sched: optimize class dumps") Signed-off-by: Phil Sutter phil@nwl.cc Reviewed-by: Jiri Pirko jiri@mellanox.com Reviewed-by: Eric Dumazet edumazet@google.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/sched/sch_api.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/net/sched/sch_api.c +++ b/net/sched/sch_api.c @@ -2052,7 +2052,8 @@ static int tc_dump_tclass_root(struct Qd
if (tcm->tcm_parent) { q = qdisc_match_from_root(root, TC_H_MAJ(tcm->tcm_parent)); - if (q && tc_dump_tclass_qdisc(q, skb, tcm, cb, t_p, s_t) < 0) + if (q && q != root && + tc_dump_tclass_qdisc(q, skb, tcm, cb, t_p, s_t) < 0) return -1; return 0; }
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Davide Caratti dcaratti@redhat.com
[ Upstream commit e331473fee3d500bb0d2582a1fe598df3326d8cd ]
Similarly to what has been done in 8b4c3cdd9dd8 ("net: sched: Add policy validation for tc attributes"), fix classifier code to add validation of TCA_CHAIN and TCA_KIND netlink attributes.
tested with: # ./tdc.py -c filter
v2: Let sch_api and cls_api share nla_policy they have in common, thanks to David Ahern. v3: Avoid EXPORT_SYMBOL(), as validation of those attributes is not done by TC modules, thanks to Cong Wang. While at it, restore the 'Delete / get qdisc' comment to its orginal position, just above tc_get_qdisc() function prototype.
Fixes: 5bc1701881e39 ("net: sched: introduce multichain support for filters") Signed-off-by: Davide Caratti dcaratti@redhat.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/sched/cls_api.c | 11 +++++++---- 1 file changed, 7 insertions(+), 4 deletions(-)
--- a/net/sched/cls_api.c +++ b/net/sched/cls_api.c @@ -31,6 +31,8 @@ #include <net/pkt_sched.h> #include <net/pkt_cls.h>
+extern const struct nla_policy rtm_tca_policy[TCA_MAX + 1]; + /* The list of all installed classifier types */ static LIST_HEAD(tcf_proto_base);
@@ -1083,7 +1085,7 @@ static int tc_new_tfilter(struct sk_buff replay: tp_created = 0;
- err = nlmsg_parse(n, sizeof(*t), tca, TCA_MAX, NULL, extack); + err = nlmsg_parse(n, sizeof(*t), tca, TCA_MAX, rtm_tca_policy, extack); if (err < 0) return err;
@@ -1226,7 +1228,7 @@ static int tc_del_tfilter(struct sk_buff if (!netlink_ns_capable(skb, net->user_ns, CAP_NET_ADMIN)) return -EPERM;
- err = nlmsg_parse(n, sizeof(*t), tca, TCA_MAX, NULL, extack); + err = nlmsg_parse(n, sizeof(*t), tca, TCA_MAX, rtm_tca_policy, extack); if (err < 0) return err;
@@ -1334,7 +1336,7 @@ static int tc_get_tfilter(struct sk_buff void *fh = NULL; int err;
- err = nlmsg_parse(n, sizeof(*t), tca, TCA_MAX, NULL, extack); + err = nlmsg_parse(n, sizeof(*t), tca, TCA_MAX, rtm_tca_policy, extack); if (err < 0) return err;
@@ -1488,7 +1490,8 @@ static int tc_dump_tfilter(struct sk_buf if (nlmsg_len(cb->nlh) < sizeof(*tcm)) return skb->len;
- err = nlmsg_parse(cb->nlh, sizeof(*tcm), tca, TCA_MAX, NULL, NULL); + err = nlmsg_parse(cb->nlh, sizeof(*tcm), tca, TCA_MAX, rtm_tca_policy, + NULL); if (err) return err;
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: David Ahern dsahern@gmail.com
[ Upstream commit 4ed591c8ab44e711e56b8e021ffaf4f407c045f5 ]
The intent of ip6_route_check_nh_onlink is to make sure the gateway given for an onlink route is not actually on a connected route for a different interface (e.g., 2001:db8:1::/64 is on dev eth1 and then an onlink route has a via 2001:db8:1::1 dev eth2). If the gateway lookup hits the default route then it most likely will be a different interface than the onlink route which is ok.
Update ip6_route_check_nh_onlink to disregard the device mismatch if the gateway lookup hits the default route. Turns out the existing onlink tests are passing because there is no default route or it is an unreachable default, so update the onlink tests to have a default route other than unreachable.
Fixes: fc1e64e1092f6 ("net/ipv6: Add support for onlink flag") Signed-off-by: David Ahern dsahern@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv6/route.c | 2 ++ tools/testing/selftests/net/fib-onlink-tests.sh | 14 +++++++------- 2 files changed, 9 insertions(+), 7 deletions(-)
--- a/net/ipv6/route.c +++ b/net/ipv6/route.c @@ -2792,6 +2792,8 @@ static int ip6_route_check_nh_onlink(str grt = ip6_nh_lookup_table(net, cfg, gw_addr, tbid, 0); if (grt) { if (!grt->dst.error && + /* ignore match if it is the default route */ + grt->from && !ipv6_addr_any(&grt->from->fib6_dst.addr) && (grt->rt6i_flags & flags || dev != grt->dst.dev)) { NL_SET_ERR_MSG(extack, "Nexthop has invalid gateway or device mismatch"); --- a/tools/testing/selftests/net/fib-onlink-tests.sh +++ b/tools/testing/selftests/net/fib-onlink-tests.sh @@ -167,8 +167,8 @@ setup() # add vrf table ip li add ${VRF} type vrf table ${VRF_TABLE} ip li set ${VRF} up - ip ro add table ${VRF_TABLE} unreachable default - ip -6 ro add table ${VRF_TABLE} unreachable default + ip ro add table ${VRF_TABLE} unreachable default metric 8192 + ip -6 ro add table ${VRF_TABLE} unreachable default metric 8192
# create test interfaces ip li add ${NETIFS[p1]} type veth peer name ${NETIFS[p2]} @@ -185,20 +185,20 @@ setup() for n in 1 3 5 7; do ip li set ${NETIFS[p${n}]} up ip addr add ${V4ADDRS[p${n}]}/24 dev ${NETIFS[p${n}]} - ip addr add ${V6ADDRS[p${n}]}/64 dev ${NETIFS[p${n}]} + ip addr add ${V6ADDRS[p${n}]}/64 dev ${NETIFS[p${n}]} nodad done
# move peer interfaces to namespace and add addresses for n in 2 4 6 8; do ip li set ${NETIFS[p${n}]} netns ${PEER_NS} up ip -netns ${PEER_NS} addr add ${V4ADDRS[p${n}]}/24 dev ${NETIFS[p${n}]} - ip -netns ${PEER_NS} addr add ${V6ADDRS[p${n}]}/64 dev ${NETIFS[p${n}]} + ip -netns ${PEER_NS} addr add ${V6ADDRS[p${n}]}/64 dev ${NETIFS[p${n}]} nodad done
- set +e + ip -6 ro add default via ${V6ADDRS[p3]/::[0-9]/::64} + ip -6 ro add table ${VRF_TABLE} default via ${V6ADDRS[p7]/::[0-9]/::64}
- # let DAD complete - assume default of 1 probe - sleep 1 + set +e }
cleanup()
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Xin Long lucien.xin@gmail.com
[ Upstream commit 5660b9d9d6a29c2c3cc12f62ae44bfb56b0a15a9 ]
sctp data size should be calculated by subtracting data chunk header's length from chunk_hdr->length, not just data header.
Fixes: 668c9beb9020 ("sctp: implement assign_number for sctp_stream_interleave") Signed-off-by: Xin Long lucien.xin@gmail.com Acked-by: Marcelo Ricardo Leitner marcelo.leitner@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/net/sctp/sm.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/include/net/sctp/sm.h +++ b/include/net/sctp/sm.h @@ -347,7 +347,7 @@ static inline __u16 sctp_data_size(struc __u16 size;
size = ntohs(chunk->chunk_hdr->length); - size -= sctp_datahdr_len(&chunk->asoc->stream); + size -= sctp_datachk_len(&chunk->asoc->stream);
return size; }
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Xin Long lucien.xin@gmail.com
[ Upstream commit c863850ce22e1b0bb365d49cadf51f4765153ae4 ]
When sctp_wait_for_connect is called to wait for connect ready for sp->strm_interleave in sctp_sendmsg_to_asoc, a panic could be triggered if cpu is scheduled out and the new asoc is freed elsewhere, as it will return err and later the asoc gets freed again in sctp_sendmsg.
[ 285.840764] list_del corruption, ffff9f0f7b284078->next is LIST_POISON1 (dead000000000100) [ 285.843590] WARNING: CPU: 1 PID: 8861 at lib/list_debug.c:47 __list_del_entry_valid+0x50/0xa0 [ 285.846193] Kernel panic - not syncing: panic_on_warn set ... [ 285.846193] [ 285.848206] CPU: 1 PID: 8861 Comm: sctp_ndata Kdump: loaded Not tainted 4.19.0-rc7.label #584 [ 285.850559] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 [ 285.852164] Call Trace: ... [ 285.872210] ? __list_del_entry_valid+0x50/0xa0 [ 285.872894] sctp_association_free+0x42/0x2d0 [sctp] [ 285.873612] sctp_sendmsg+0x5a4/0x6b0 [sctp] [ 285.874236] sock_sendmsg+0x30/0x40 [ 285.874741] ___sys_sendmsg+0x27a/0x290 [ 285.875304] ? __switch_to_asm+0x34/0x70 [ 285.875872] ? __switch_to_asm+0x40/0x70 [ 285.876438] ? ptep_set_access_flags+0x2a/0x30 [ 285.877083] ? do_wp_page+0x151/0x540 [ 285.877614] __sys_sendmsg+0x58/0xa0 [ 285.878138] do_syscall_64+0x55/0x180 [ 285.878669] entry_SYSCALL_64_after_hwframe+0x44/0xa9
This is a similar issue with the one fixed in Commit ca3af4dd28cf ("sctp: do not free asoc when it is already dead in sctp_sendmsg"). But this one can't be fixed by returning -ESRCH for the dead asoc in sctp_wait_for_connect, as it will break sctp_connect's return value to users.
This patch is to simply set err to -ESRCH before it returns to sctp_sendmsg when any err is returned by sctp_wait_for_connect for sp->strm_interleave, so that no asoc would be freed due to this.
When users see this error, they will know the packet hasn't been sent. And it also makes sense to not free asoc because waiting connect fails, like the second call for sctp_wait_for_connect in sctp_sendmsg_to_asoc.
Fixes: 668c9beb9020 ("sctp: implement assign_number for sctp_stream_interleave") Signed-off-by: Xin Long lucien.xin@gmail.com Acked-by: Marcelo Ricardo Leitner marcelo.leitner@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/sctp/socket.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
--- a/net/sctp/socket.c +++ b/net/sctp/socket.c @@ -1939,8 +1939,10 @@ static int sctp_sendmsg_to_asoc(struct s if (sp->strm_interleave) { timeo = sock_sndtimeo(sk, 0); err = sctp_wait_for_connect(asoc, &timeo); - if (err) + if (err) { + err = -ESRCH; goto err; + } } else { wait_connect = true; }
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Talat Batheesh talatb@mellanox.com
[ Upstream commit fd7e848077c1a466b9187537adce16658f7cb94b ]
Allocated memory for context should be freed once finished working with it.
Fixes: d6c4f0298cec ("net/mlx5: Refactor accel IPSec code") Signed-off-by: Talat Batheesh talatb@mellanox.com Reviewed-by: Or Gerlitz ogerlitz@mellanox.com Reviewed-by: Tariq Toukan tariqt@mellanox.com Signed-off-by: Saeed Mahameed saeedm@mellanox.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/mellanox/mlx5/core/fpga/ipsec.c | 9 ++++----- 1 file changed, 4 insertions(+), 5 deletions(-)
--- a/drivers/net/ethernet/mellanox/mlx5/core/fpga/ipsec.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/fpga/ipsec.c @@ -245,7 +245,7 @@ static void *mlx5_fpga_ipsec_cmd_exec(st return ERR_PTR(res); }
- /* Context will be freed by wait func after completion */ + /* Context should be freed by the caller after completion. */ return context; }
@@ -418,10 +418,8 @@ static int mlx5_fpga_ipsec_set_caps(stru cmd.cmd = htonl(MLX5_FPGA_IPSEC_CMD_OP_SET_CAP); cmd.flags = htonl(flags); context = mlx5_fpga_ipsec_cmd_exec(mdev, &cmd, sizeof(cmd)); - if (IS_ERR(context)) { - err = PTR_ERR(context); - goto out; - } + if (IS_ERR(context)) + return PTR_ERR(context);
err = mlx5_fpga_ipsec_cmd_wait(context); if (err) @@ -435,6 +433,7 @@ static int mlx5_fpga_ipsec_set_caps(stru }
out: + kfree(context); return err; }
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Karsten Graul kgraul@linux.ibm.com
[ Upstream commit fb692ec4117f6fd25044cfb5720d6b79d400dc65 ]
The pointer to the link group is unset in the smc connection structure right before the call to smc_buf_unuse. Provide the lgr pointer to smc_buf_unuse explicitly. And move the call to smc_lgr_schedule_free_work to the end of smc_conn_free.
Fixes: a6920d1d130c ("net/smc: handle unregistered buffers") Signed-off-by: Karsten Graul kgraul@linux.ibm.com Signed-off-by: Ursula Braun ubraun@linux.ibm.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/smc/smc_core.c | 23 +++++++++++------------ 1 file changed, 11 insertions(+), 12 deletions(-)
--- a/net/smc/smc_core.c +++ b/net/smc/smc_core.c @@ -114,22 +114,17 @@ static void __smc_lgr_unregister_conn(st sock_put(&smc->sk); /* sock_hold in smc_lgr_register_conn() */ }
-/* Unregister connection and trigger lgr freeing if applicable +/* Unregister connection from lgr */ static void smc_lgr_unregister_conn(struct smc_connection *conn) { struct smc_link_group *lgr = conn->lgr; - int reduced = 0;
write_lock_bh(&lgr->conns_lock); if (conn->alert_token_local) { - reduced = 1; __smc_lgr_unregister_conn(conn); } write_unlock_bh(&lgr->conns_lock); - if (!reduced || lgr->conns_num) - return; - smc_lgr_schedule_free_work(lgr); }
static void smc_lgr_free_work(struct work_struct *work) @@ -238,7 +233,8 @@ out: return rc; }
-static void smc_buf_unuse(struct smc_connection *conn) +static void smc_buf_unuse(struct smc_connection *conn, + struct smc_link_group *lgr) { if (conn->sndbuf_desc) conn->sndbuf_desc->used = 0; @@ -248,8 +244,6 @@ static void smc_buf_unuse(struct smc_con conn->rmb_desc->used = 0; } else { /* buf registration failed, reuse not possible */ - struct smc_link_group *lgr = conn->lgr; - write_lock_bh(&lgr->rmbs_lock); list_del(&conn->rmb_desc->list); write_unlock_bh(&lgr->rmbs_lock); @@ -262,11 +256,16 @@ static void smc_buf_unuse(struct smc_con /* remove a finished connection from its link group */ void smc_conn_free(struct smc_connection *conn) { - if (!conn->lgr) + struct smc_link_group *lgr = conn->lgr; + + if (!lgr) return; smc_cdc_tx_dismiss_slots(conn); - smc_lgr_unregister_conn(conn); - smc_buf_unuse(conn); + smc_lgr_unregister_conn(conn); /* unsets conn->lgr */ + smc_buf_unuse(conn, lgr); /* allow buffer reuse */ + + if (!lgr->conns_num) + smc_lgr_schedule_free_work(lgr); }
static void smc_link_clear(struct smc_link *lnk)
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Petr Machata petrm@mellanox.com
[ Upstream commit ad0b9d94182be8356978d220c82f9837cffeb7a9 ]
Demands to remove FDB entries should be honored even if the FDB entry in question was originally learned, and not added by the user. Therefore ignore the added_by_user datum for SWITCHDEV_FDB_DEL_TO_DEVICE.
Fixes: 816a3bed9549 ("switchdev: Add fdb.added_by_user to switchdev notifications") Signed-off-by: Petr Machata petrm@mellanox.com Suggested-by: Ido Schimmel idosch@mellanox.com Signed-off-by: Ido Schimmel idosch@mellanox.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c | 2 -- 1 file changed, 2 deletions(-)
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c +++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c @@ -2307,8 +2307,6 @@ static void mlxsw_sp_switchdev_event_wor break; case SWITCHDEV_FDB_DEL_TO_DEVICE: fdb_info = &switchdev_work->fdb_info; - if (!fdb_info->added_by_user) - break; mlxsw_sp_port_fdb_set(mlxsw_sp_port, fdb_info, false); break; case SWITCHDEV_FDB_ADD_TO_BRIDGE: /* fall through */
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Taehee Yoo ap420073@gmail.com
[ Upstream commit 84258438e8ce12d6888b68a1238bba9cb25307e2 ]
pid_task() dereferences rcu protected tasks array. But there is no rcu_read_lock() in shutdown_umh() routine so that rcu_read_lock() is needed. get_pid_task() is wrapper function of pid_task. it holds rcu_read_lock() then calls pid_task(). if task isn't NULL, it increases reference count of task.
test commands: %modprobe bpfilter %modprobe -rv bpfilter
splat looks like: [15102.030932] ============================= [15102.030957] WARNING: suspicious RCU usage [15102.030985] 4.19.0-rc7+ #21 Not tainted [15102.031010] ----------------------------- [15102.031038] kernel/pid.c:330 suspicious rcu_dereference_check() usage! [15102.031063] other info that might help us debug this:
[15102.031332] rcu_scheduler_active = 2, debug_locks = 1 [15102.031363] 1 lock held by modprobe/1570: [15102.031389] #0: 00000000580ef2b0 (bpfilter_lock){+.+.}, at: stop_umh+0x13/0x52 [bpfilter] [15102.031552] stack backtrace: [15102.031583] CPU: 1 PID: 1570 Comm: modprobe Not tainted 4.19.0-rc7+ #21 [15102.031607] Hardware name: To be filled by O.E.M. To be filled by O.E.M./Aptio CRB, BIOS 5.6.5 07/08/2015 [15102.031628] Call Trace: [15102.031676] dump_stack+0xc9/0x16b [15102.031723] ? show_regs_print_info+0x5/0x5 [15102.031801] ? lockdep_rcu_suspicious+0x117/0x160 [15102.031855] pid_task+0x134/0x160 [15102.031900] ? find_vpid+0xf0/0xf0 [15102.032017] shutdown_umh.constprop.1+0x1e/0x53 [bpfilter] [15102.032055] stop_umh+0x46/0x52 [bpfilter] [15102.032092] __x64_sys_delete_module+0x47e/0x570 [ ... ]
Fixes: d2ba09c17a06 ("net: add skeleton of bpfilter kernel module") Signed-off-by: Taehee Yoo ap420073@gmail.com Acked-by: Alexei Starovoitov ast@kernel.org Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/bpfilter/bpfilter_kern.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
--- a/net/bpfilter/bpfilter_kern.c +++ b/net/bpfilter/bpfilter_kern.c @@ -23,9 +23,11 @@ static void shutdown_umh(struct umh_info
if (!info->pid) return; - tsk = pid_task(find_vpid(info->pid), PIDTYPE_PID); - if (tsk) + tsk = get_pid_task(find_vpid(info->pid), PIDTYPE_PID); + if (tsk) { force_sig(SIGKILL, tsk); + put_task_struct(tsk); + } fput(info->pipe_to_umh); fput(info->pipe_from_umh); info->pid = 0;
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Cong Wang xiyou.wangcong@gmail.com
[ Upstream commit 7de414a9dd91426318df7b63da024b2b07e53df5 ]
Most callers of pskb_trim_rcsum() simply drop the skb when it fails, however, ip_check_defrag() still continues to pass the skb up to stack. This is suspicious.
In ip_check_defrag(), after we learn the skb is an IP fragment, passing the skb to callers makes no sense, because callers expect fragments are defrag'ed on success. So, dropping the skb when we can't defrag it is reasonable.
Note, prior to commit 88078d98d1bb, this is not a big problem as checksum will be fixed up anyway. After it, the checksum is not correct on failure.
Found this during code review.
Fixes: 88078d98d1bb ("net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends") Cc: Eric Dumazet edumazet@google.com Signed-off-by: Cong Wang xiyou.wangcong@gmail.com Reviewed-by: Eric Dumazet edumazet@google.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv4/ip_fragment.c | 12 ++++++++---- 1 file changed, 8 insertions(+), 4 deletions(-)
--- a/net/ipv4/ip_fragment.c +++ b/net/ipv4/ip_fragment.c @@ -657,10 +657,14 @@ struct sk_buff *ip_check_defrag(struct n if (ip_is_fragment(&iph)) { skb = skb_share_check(skb, GFP_ATOMIC); if (skb) { - if (!pskb_may_pull(skb, netoff + iph.ihl * 4)) - return skb; - if (pskb_trim_rcsum(skb, netoff + len)) - return skb; + if (!pskb_may_pull(skb, netoff + iph.ihl * 4)) { + kfree_skb(skb); + return NULL; + } + if (pskb_trim_rcsum(skb, netoff + len)) { + kfree_skb(skb); + return NULL; + } memset(IPCB(skb), 0, sizeof(struct inet_skb_parm)); if (ip_defrag(net, skb, user)) return NULL;
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dimitris Michailidis dmichail@google.com
[ Upstream commit d55bef5059dd057bd077155375c581b49d25be7e ]
We've been getting checksum errors involving small UDP packets, usually 59B packets with 1 extra non-zero padding byte. netdev_rx_csum_fault() has been complaining that HW is providing bad checksums. Turns out the problem is in pskb_trim_rcsum_slow(), introduced in commit 88078d98d1bb ("net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends").
The source of the problem is that when the bytes we are trimming start at an odd address, as in the case of the 1 padding byte above, skb_checksum() returns a byte-swapped value. We cannot just combine this with skb->csum using csum_sub(). We need to use csum_block_sub() here that takes into account the parity of the start address and handles the swapping.
Matches existing code in __skb_postpull_rcsum() and esp_remove_trailer().
Fixes: 88078d98d1bb ("net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends") Signed-off-by: Dimitris Michailidis dmichail@google.com Reviewed-by: Eric Dumazet edumazet@google.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/core/skbuff.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
--- a/net/core/skbuff.c +++ b/net/core/skbuff.c @@ -1845,8 +1845,9 @@ int pskb_trim_rcsum_slow(struct sk_buff if (skb->ip_summed == CHECKSUM_COMPLETE) { int delta = skb->len - len;
- skb->csum = csum_sub(skb->csum, - skb_checksum(skb, len, delta, 0)); + skb->csum = csum_block_sub(skb->csum, + skb_checksum(skb, len, delta, 0), + len); } return __pskb_trim(skb, len); }
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tariq Toukan tariqt@mellanox.com
[ Upstream commit 37fdffb217a45609edccbb8b407d031143f551c0 ]
mlx5e netdevice used to calculate fragment edges by a call to mlx5_wq_cyc_get_frag_size(). This calculation did not give the correct indication for queues smaller than a PAGE_SIZE, (broken by default on PowerPC, where PAGE_SIZE == 64KB). Here it is replaced by the correct new calls/API.
Since (TX/RX) Work Queues buffers are fragmented, here we introduce changes to the API in core driver, so that it gets a stride index and returns the index of last stride on same fragment, and an additional wrapping function that returns the number of physically contiguous strides that can be written contiguously to the work queue.
This obsoletes the following API functions, and their buggy usage in EN driver: * mlx5_wq_cyc_get_frag_size() * mlx5_wq_cyc_ctr2fragix()
The new API improves modularity and hides the details of such calculation for mlx5e netdevice and mlx5_ib rdma drivers.
New calculation is also more efficient, and improves performance as follows:
Packet rate test: pktgen, UDP / IPv4, 64byte, single ring, 8K ring size.
Before: 16,477,619 pps After: 17,085,793 pps
3.7% improvement
Fixes: 3a2f70331226 ("net/mlx5: Use order-0 allocations for all WQ types") Signed-off-by: Tariq Toukan tariqt@mellanox.com Reviewed-by: Eran Ben Elisha eranbe@mellanox.com Signed-off-by: Saeed Mahameed saeedm@mellanox.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/mellanox/mlx5/core/en_rx.c | 12 ++++----- drivers/net/ethernet/mellanox/mlx5/core/en_tx.c | 22 +++++++++--------- drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.h | 5 +--- drivers/net/ethernet/mellanox/mlx5/core/wq.c | 5 ---- drivers/net/ethernet/mellanox/mlx5/core/wq.h | 11 ++++----- include/linux/mlx5/driver.h | 8 ++++++ 6 files changed, 31 insertions(+), 32 deletions(-)
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c @@ -429,10 +429,9 @@ static inline u16 mlx5e_icosq_wrap_cnt(s
static inline void mlx5e_fill_icosq_frag_edge(struct mlx5e_icosq *sq, struct mlx5_wq_cyc *wq, - u16 pi, u16 frag_pi) + u16 pi, u16 nnops) { struct mlx5e_sq_wqe_info *edge_wi, *wi = &sq->db.ico_wqe[pi]; - u8 nnops = mlx5_wq_cyc_get_frag_size(wq) - frag_pi;
edge_wi = wi + nnops;
@@ -451,15 +450,14 @@ static int mlx5e_alloc_rx_mpwqe(struct m struct mlx5_wq_cyc *wq = &sq->wq; struct mlx5e_umr_wqe *umr_wqe; u16 xlt_offset = ix << (MLX5E_LOG_ALIGNED_MPWQE_PPW - 1); - u16 pi, frag_pi; + u16 pi, contig_wqebbs_room; int err; int i;
pi = mlx5_wq_cyc_ctr2ix(wq, sq->pc); - frag_pi = mlx5_wq_cyc_ctr2fragix(wq, sq->pc); - - if (unlikely(frag_pi + MLX5E_UMR_WQEBBS > mlx5_wq_cyc_get_frag_size(wq))) { - mlx5e_fill_icosq_frag_edge(sq, wq, pi, frag_pi); + contig_wqebbs_room = mlx5_wq_cyc_get_contig_wqebbs(wq, pi); + if (unlikely(contig_wqebbs_room < MLX5E_UMR_WQEBBS)) { + mlx5e_fill_icosq_frag_edge(sq, wq, pi, contig_wqebbs_room); pi = mlx5_wq_cyc_ctr2ix(wq, sq->pc); }
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c @@ -287,10 +287,9 @@ dma_unmap_wqe_err:
static inline void mlx5e_fill_sq_frag_edge(struct mlx5e_txqsq *sq, struct mlx5_wq_cyc *wq, - u16 pi, u16 frag_pi) + u16 pi, u16 nnops) { struct mlx5e_tx_wqe_info *edge_wi, *wi = &sq->db.wqe_info[pi]; - u8 nnops = mlx5_wq_cyc_get_frag_size(wq) - frag_pi;
edge_wi = wi + nnops;
@@ -345,8 +344,8 @@ netdev_tx_t mlx5e_sq_xmit(struct mlx5e_t struct mlx5e_tx_wqe_info *wi;
struct mlx5e_sq_stats *stats = sq->stats; + u16 headlen, ihs, contig_wqebbs_room; u16 ds_cnt, ds_cnt_inl = 0; - u16 headlen, ihs, frag_pi; u8 num_wqebbs, opcode; u32 num_bytes; int num_dma; @@ -383,9 +382,9 @@ netdev_tx_t mlx5e_sq_xmit(struct mlx5e_t }
num_wqebbs = DIV_ROUND_UP(ds_cnt, MLX5_SEND_WQEBB_NUM_DS); - frag_pi = mlx5_wq_cyc_ctr2fragix(wq, sq->pc); - if (unlikely(frag_pi + num_wqebbs > mlx5_wq_cyc_get_frag_size(wq))) { - mlx5e_fill_sq_frag_edge(sq, wq, pi, frag_pi); + contig_wqebbs_room = mlx5_wq_cyc_get_contig_wqebbs(wq, pi); + if (unlikely(contig_wqebbs_room < num_wqebbs)) { + mlx5e_fill_sq_frag_edge(sq, wq, pi, contig_wqebbs_room); mlx5e_sq_fetch_wqe(sq, &wqe, &pi); }
@@ -629,7 +628,7 @@ netdev_tx_t mlx5i_sq_xmit(struct mlx5e_t struct mlx5e_tx_wqe_info *wi;
struct mlx5e_sq_stats *stats = sq->stats; - u16 headlen, ihs, pi, frag_pi; + u16 headlen, ihs, pi, contig_wqebbs_room; u16 ds_cnt, ds_cnt_inl = 0; u8 num_wqebbs, opcode; u32 num_bytes; @@ -665,13 +664,14 @@ netdev_tx_t mlx5i_sq_xmit(struct mlx5e_t }
num_wqebbs = DIV_ROUND_UP(ds_cnt, MLX5_SEND_WQEBB_NUM_DS); - frag_pi = mlx5_wq_cyc_ctr2fragix(wq, sq->pc); - if (unlikely(frag_pi + num_wqebbs > mlx5_wq_cyc_get_frag_size(wq))) { + pi = mlx5_wq_cyc_ctr2ix(wq, sq->pc); + contig_wqebbs_room = mlx5_wq_cyc_get_contig_wqebbs(wq, pi); + if (unlikely(contig_wqebbs_room < num_wqebbs)) { + mlx5e_fill_sq_frag_edge(sq, wq, pi, contig_wqebbs_room); pi = mlx5_wq_cyc_ctr2ix(wq, sq->pc); - mlx5e_fill_sq_frag_edge(sq, wq, pi, frag_pi); }
- mlx5i_sq_fetch_wqe(sq, &wqe, &pi); + mlx5i_sq_fetch_wqe(sq, &wqe, pi);
/* fill wqe */ wi = &sq->db.wqe_info[pi]; --- a/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.h @@ -109,12 +109,11 @@ struct mlx5i_tx_wqe {
static inline void mlx5i_sq_fetch_wqe(struct mlx5e_txqsq *sq, struct mlx5i_tx_wqe **wqe, - u16 *pi) + u16 pi) { struct mlx5_wq_cyc *wq = &sq->wq;
- *pi = mlx5_wq_cyc_ctr2ix(wq, sq->pc); - *wqe = mlx5_wq_cyc_get_wqe(wq, *pi); + *wqe = mlx5_wq_cyc_get_wqe(wq, pi); memset(*wqe, 0, sizeof(**wqe)); }
--- a/drivers/net/ethernet/mellanox/mlx5/core/wq.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/wq.c @@ -39,11 +39,6 @@ u32 mlx5_wq_cyc_get_size(struct mlx5_wq_ return (u32)wq->fbc.sz_m1 + 1; }
-u16 mlx5_wq_cyc_get_frag_size(struct mlx5_wq_cyc *wq) -{ - return wq->fbc.frag_sz_m1 + 1; -} - u32 mlx5_cqwq_get_size(struct mlx5_cqwq *wq) { return wq->fbc.sz_m1 + 1; --- a/drivers/net/ethernet/mellanox/mlx5/core/wq.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/wq.h @@ -80,7 +80,6 @@ int mlx5_wq_cyc_create(struct mlx5_core_ void *wqc, struct mlx5_wq_cyc *wq, struct mlx5_wq_ctrl *wq_ctrl); u32 mlx5_wq_cyc_get_size(struct mlx5_wq_cyc *wq); -u16 mlx5_wq_cyc_get_frag_size(struct mlx5_wq_cyc *wq);
int mlx5_wq_qp_create(struct mlx5_core_dev *mdev, struct mlx5_wq_param *param, void *qpc, struct mlx5_wq_qp *wq, @@ -140,11 +139,6 @@ static inline u16 mlx5_wq_cyc_ctr2ix(str return ctr & wq->fbc.sz_m1; }
-static inline u16 mlx5_wq_cyc_ctr2fragix(struct mlx5_wq_cyc *wq, u16 ctr) -{ - return ctr & wq->fbc.frag_sz_m1; -} - static inline u16 mlx5_wq_cyc_get_head(struct mlx5_wq_cyc *wq) { return mlx5_wq_cyc_ctr2ix(wq, wq->wqe_ctr); @@ -160,6 +154,11 @@ static inline void *mlx5_wq_cyc_get_wqe( return mlx5_frag_buf_get_wqe(&wq->fbc, ix); }
+static inline u16 mlx5_wq_cyc_get_contig_wqebbs(struct mlx5_wq_cyc *wq, u16 ix) +{ + return mlx5_frag_buf_get_idx_last_contig_stride(&wq->fbc, ix) - ix + 1; +} + static inline int mlx5_wq_cyc_cc_bigger(u16 cc1, u16 cc2) { int equal = (cc1 == cc2); --- a/include/linux/mlx5/driver.h +++ b/include/linux/mlx5/driver.h @@ -1022,6 +1022,14 @@ static inline void *mlx5_frag_buf_get_wq ((fbc->frag_sz_m1 & ix) << fbc->log_stride); }
+static inline u32 +mlx5_frag_buf_get_idx_last_contig_stride(struct mlx5_frag_buf_ctrl *fbc, u32 ix) +{ + u32 last_frag_stride_idx = (ix + fbc->strides_offset) | fbc->frag_sz_m1; + + return min_t(u32, last_frag_stride_idx - fbc->strides_offset, fbc->sz_m1); +} + int mlx5_cmd_init(struct mlx5_core_dev *dev); void mlx5_cmd_cleanup(struct mlx5_core_dev *dev); void mlx5_cmd_use_events(struct mlx5_core_dev *dev);
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Shalom Toledo shalomt@mellanox.com
[ Upstream commit a22712a962912faf257e857ab6857f56a93cfb34 ]
After a failed reload, the driver is still registered to devlink, its devlink instance is still allocated and the 'reload_fail' flag is set. Then, in the next reload try, the driver's allocated devlink instance will be freed without unregistering from devlink and its components (e.g, resources). This scenario can cause a use-after-free if the user tries to execute command via devlink user-space tool.
Fix by not freeing the devlink instance during reload (failed or not).
Fixes: 24cc68ad6c46 ("mlxsw: core: Add support for reload") Signed-off-by: Shalom Toledo shalomt@mellanox.com Reviewed-by: Jiri Pirko jiri@mellanox.com Signed-off-by: Ido Schimmel idosch@mellanox.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/mellanox/mlxsw/core.c | 24 +++++++++++++++++------- 1 file changed, 17 insertions(+), 7 deletions(-)
--- a/drivers/net/ethernet/mellanox/mlxsw/core.c +++ b/drivers/net/ethernet/mellanox/mlxsw/core.c @@ -985,8 +985,8 @@ static int mlxsw_devlink_core_bus_device mlxsw_core->bus, mlxsw_core->bus_priv, true, devlink); - if (err) - mlxsw_core->reload_fail = true; + mlxsw_core->reload_fail = !!err; + return err; }
@@ -1126,8 +1126,15 @@ void mlxsw_core_bus_device_unregister(st const char *device_kind = mlxsw_core->bus_info->device_kind; struct devlink *devlink = priv_to_devlink(mlxsw_core);
- if (mlxsw_core->reload_fail) - goto reload_fail; + if (mlxsw_core->reload_fail) { + if (!reload) + /* Only the parts that were not de-initialized in the + * failed reload attempt need to be de-initialized. + */ + goto reload_fail_deinit; + else + return; + }
if (mlxsw_core->driver->fini) mlxsw_core->driver->fini(mlxsw_core); @@ -1140,9 +1147,12 @@ void mlxsw_core_bus_device_unregister(st if (!reload) devlink_resources_unregister(devlink, NULL); mlxsw_core->bus->fini(mlxsw_core->bus_priv); - if (reload) - return; -reload_fail: + + return; + +reload_fail_deinit: + devlink_unregister(devlink); + devlink_resources_unregister(devlink, NULL); devlink_free(devlink); mlxsw_core_driver_put(device_kind); }
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: "David S. Miller" davem@davemloft.net
[ Upstream commit 2b4792eaa9f553764047d157365ed8b7787751a3 ]
Some drivers reference it via node_distance(), for example the NVME host driver core.
ERROR: "__node_distance" [drivers/nvme/host/nvme-core.ko] undefined! make[1]: *** [scripts/Makefile.modpost:92: __modpost] Error 1
Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/sparc/mm/init_64.c | 1 + 1 file changed, 1 insertion(+)
--- a/arch/sparc/mm/init_64.c +++ b/arch/sparc/mm/init_64.c @@ -1383,6 +1383,7 @@ int __node_distance(int from, int to) } return numa_latency[from][to]; } +EXPORT_SYMBOL(__node_distance);
static int __init find_best_numa_node_for_mlgroup(struct mdesc_mlgroup *grp) {
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: David Miller davem@redhat.com
[ Upstream commit 5b4fc3882a649c9411dd0dcad2ddb78e911d340e ]
Right now if we get a corrupted user stack frame we do a do_exit(SIGILL) which is not helpful.
If under a debugger, this behavior causes the inferior process to exit. So the register and other state cannot be examined at the time of the event.
Instead, conditionally log a rate limited kernel log message and then force a SIGSEGV.
With bits and ideas borrowed (as usual) from powerpc.
Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/sparc/include/asm/switch_to_64.h | 3 ++- arch/sparc/kernel/process_64.c | 25 +++++++++++++++++++------ arch/sparc/kernel/rtrap_64.S | 1 + arch/sparc/kernel/signal32.c | 12 ++++++++++-- arch/sparc/kernel/signal_64.c | 6 +++++- 5 files changed, 37 insertions(+), 10 deletions(-)
--- a/arch/sparc/include/asm/switch_to_64.h +++ b/arch/sparc/include/asm/switch_to_64.h @@ -67,6 +67,7 @@ do { save_and_clear_fpu(); \ } while(0)
void synchronize_user_stack(void); -void fault_in_user_windows(void); +struct pt_regs; +void fault_in_user_windows(struct pt_regs *);
#endif /* __SPARC64_SWITCH_TO_64_H */ --- a/arch/sparc/kernel/process_64.c +++ b/arch/sparc/kernel/process_64.c @@ -36,6 +36,7 @@ #include <linux/sysrq.h> #include <linux/nmi.h> #include <linux/context_tracking.h> +#include <linux/signal.h>
#include <linux/uaccess.h> #include <asm/page.h> @@ -521,7 +522,12 @@ static void stack_unaligned(unsigned lon force_sig_fault(SIGBUS, BUS_ADRALN, (void __user *) sp, 0, current); }
-void fault_in_user_windows(void) +static const char uwfault32[] = KERN_INFO \ + "%s[%d]: bad register window fault: SP %08lx (orig_sp %08lx) TPC %08lx O7 %08lx\n"; +static const char uwfault64[] = KERN_INFO \ + "%s[%d]: bad register window fault: SP %016lx (orig_sp %016lx) TPC %08lx O7 %016lx\n"; + +void fault_in_user_windows(struct pt_regs *regs) { struct thread_info *t = current_thread_info(); unsigned long window; @@ -534,9 +540,9 @@ void fault_in_user_windows(void) do { struct reg_window *rwin = &t->reg_window[window]; int winsize = sizeof(struct reg_window); - unsigned long sp; + unsigned long sp, orig_sp;
- sp = t->rwbuf_stkptrs[window]; + orig_sp = sp = t->rwbuf_stkptrs[window];
if (test_thread_64bit_stack(sp)) sp += STACK_BIAS; @@ -547,8 +553,16 @@ void fault_in_user_windows(void) stack_unaligned(sp);
if (unlikely(copy_to_user((char __user *)sp, - rwin, winsize))) + rwin, winsize))) { + if (show_unhandled_signals) + printk_ratelimited(is_compat_task() ? + uwfault32 : uwfault64, + current->comm, current->pid, + sp, orig_sp, + regs->tpc, + regs->u_regs[UREG_I7]); goto barf; + } } while (window--); } set_thread_wsaved(0); @@ -556,8 +570,7 @@ void fault_in_user_windows(void)
barf: set_thread_wsaved(window + 1); - user_exit(); - do_exit(SIGILL); + force_sig(SIGSEGV, current); }
asmlinkage long sparc_do_fork(unsigned long clone_flags, --- a/arch/sparc/kernel/rtrap_64.S +++ b/arch/sparc/kernel/rtrap_64.S @@ -39,6 +39,7 @@ __handle_preemption: wrpr %g0, RTRAP_PSTATE_IRQOFF, %pstate
__handle_user_windows: + add %sp, PTREGS_OFF, %o0 call fault_in_user_windows 661: wrpr %g0, RTRAP_PSTATE, %pstate /* If userspace is using ADI, it could potentially pass --- a/arch/sparc/kernel/signal32.c +++ b/arch/sparc/kernel/signal32.c @@ -371,7 +371,11 @@ static int setup_frame32(struct ksignal get_sigframe(ksig, regs, sigframe_size); if (invalid_frame_pointer(sf, sigframe_size)) { - do_exit(SIGILL); + if (show_unhandled_signals) + pr_info("%s[%d] bad frame in setup_frame32: %08lx TPC %08lx O7 %08lx\n", + current->comm, current->pid, (unsigned long)sf, + regs->tpc, regs->u_regs[UREG_I7]); + force_sigsegv(ksig->sig, current); return -EINVAL; }
@@ -501,7 +505,11 @@ static int setup_rt_frame32(struct ksign get_sigframe(ksig, regs, sigframe_size); if (invalid_frame_pointer(sf, sigframe_size)) { - do_exit(SIGILL); + if (show_unhandled_signals) + pr_info("%s[%d] bad frame in setup_rt_frame32: %08lx TPC %08lx O7 %08lx\n", + current->comm, current->pid, (unsigned long)sf, + regs->tpc, regs->u_regs[UREG_I7]); + force_sigsegv(ksig->sig, current); return -EINVAL; }
--- a/arch/sparc/kernel/signal_64.c +++ b/arch/sparc/kernel/signal_64.c @@ -370,7 +370,11 @@ setup_rt_frame(struct ksignal *ksig, str get_sigframe(ksig, regs, sf_size);
if (invalid_frame_pointer (sf)) { - do_exit(SIGILL); /* won't return, actually */ + if (show_unhandled_signals) + pr_info("%s[%d] bad frame in setup_rt_frame: %016lx TPC %016lx O7 %016lx\n", + current->comm, current->pid, (unsigned long)sf, + regs->tpc, regs->u_regs[UREG_I7]); + force_sigsegv(ksig->sig, current); return -EINVAL; }
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: "David S. Miller" davem@davemloft.net
[ Upstream commit b3e1eb8e7ac9aaa283989496651d99267c4cad6c ]
So that when it is unset, ie. '-1', userspace can see it properly.
Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/sparc/include/asm/cpudata_64.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/sparc/include/asm/cpudata_64.h +++ b/arch/sparc/include/asm/cpudata_64.h @@ -28,7 +28,7 @@ typedef struct { unsigned short sock_id; /* physical package */ unsigned short core_id; unsigned short max_cache_id; /* groupings of highest shared cache */ - unsigned short proc_id; /* strand (aka HW thread) id */ + signed short proc_id; /* strand (aka HW thread) id */ } cpuinfo_sparc;
DECLARE_PER_CPU(cpuinfo_sparc, __cpu_data);
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: "David S. Miller" davem@davemloft.net
[ Upstream commit d1f1f98c6d1708a90436e1a3b2aff5e93946731b ]
If we did some signal processing, we have to reload the pt_regs tstate register because it's value may have changed.
In doing so we also have to extract the %pil value contained in there anre load that into %l4.
This value is at bit 20 and thus needs to be shifted down before we later write it into the %pil register.
Most of the time this is harmless as we are returning to userspace and the %pil is zero for that case.
Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/sparc/kernel/rtrap_64.S | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/arch/sparc/kernel/rtrap_64.S +++ b/arch/sparc/kernel/rtrap_64.S @@ -85,8 +85,9 @@ __handle_signal: ldx [%sp + PTREGS_OFF + PT_V9_TSTATE], %l1 sethi %hi(0xf << 20), %l4 and %l1, %l4, %l4 + andn %l1, %l4, %l1 ba,pt %xcc, __handle_preemption_continue - andn %l1, %l4, %l1 + srl %l4, 20, %l4
/* When returning from a NMI (%pil==15) interrupt we want to * avoid running softirqs, doing IRQ tracing, preempting, etc.
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: "David S. Miller" davem@davemloft.net
[ Upstream commit 1f2b5b8e2df4591fbca430aff9c5a072dcc0f408 ]
Fixes: 8b30ca73b7cc ("sparc: Add all necessary direct socket system calls.") Reported-by: Joseph Myers joseph@codesourcery.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/sparc/kernel/systbls_64.S | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/arch/sparc/kernel/systbls_64.S +++ b/arch/sparc/kernel/systbls_64.S @@ -47,9 +47,9 @@ sys_call_table32: .word sys_recvfrom, sys_setreuid16, sys_setregid16, sys_rename, compat_sys_truncate /*130*/ .word compat_sys_ftruncate, sys_flock, compat_sys_lstat64, sys_sendto, sys_shutdown .word sys_socketpair, sys_mkdir, sys_rmdir, compat_sys_utimes, compat_sys_stat64 -/*140*/ .word sys_sendfile64, sys_nis_syscall, compat_sys_futex, sys_gettid, compat_sys_getrlimit +/*140*/ .word sys_sendfile64, sys_getpeername, compat_sys_futex, sys_gettid, compat_sys_getrlimit .word compat_sys_setrlimit, sys_pivot_root, sys_prctl, sys_pciconfig_read, sys_pciconfig_write -/*150*/ .word sys_nis_syscall, sys_inotify_init, sys_inotify_add_watch, sys_poll, sys_getdents64 +/*150*/ .word sys_getsockname, sys_inotify_init, sys_inotify_add_watch, sys_poll, sys_getdents64 .word compat_sys_fcntl64, sys_inotify_rm_watch, compat_sys_statfs, compat_sys_fstatfs, sys_oldumount /*160*/ .word compat_sys_sched_setaffinity, compat_sys_sched_getaffinity, sys_getdomainname, sys_setdomainname, sys_nis_syscall .word sys_quotactl, sys_set_tid_address, compat_sys_mount, compat_sys_ustat, sys_setxattr
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: "David S. Miller" davem@davemloft.net
[ Upstream commit cfdc3170d214046b9509183fe9b9544dc644d40b ]
It is important to clear the hw->state value for non-stopped events when they are added into the PMU. Otherwise when the event is scheduled out, we won't read the counter because HES_UPTODATE is still set. This breaks 'perf stat' and similar use cases, causing all the events to show zero.
This worked for multi-pcr because we make explicit sparc_pmu_start() calls in calculate_multiple_pcrs(). calculate_single_pcr() doesn't do this because the idea there is to accumulate all of the counter settings into the single pcr value. So we have to add explicit hw->state handling there.
Like x86, we use the PERF_HES_ARCH bit to track truly stopped events so that we don't accidently start them on a reload.
Related to all of this, sparc_pmu_start() is missing a userpage update so add it.
Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/sparc/kernel/perf_event.c | 17 +++++++++++++---- 1 file changed, 13 insertions(+), 4 deletions(-)
--- a/arch/sparc/kernel/perf_event.c +++ b/arch/sparc/kernel/perf_event.c @@ -927,6 +927,8 @@ static void read_in_all_counters(struct sparc_perf_event_update(cp, &cp->hw, cpuc->current_idx[i]); cpuc->current_idx[i] = PIC_NO_INDEX; + if (cp->hw.state & PERF_HES_STOPPED) + cp->hw.state |= PERF_HES_ARCH; } } } @@ -959,10 +961,12 @@ static void calculate_single_pcr(struct
enc = perf_event_get_enc(cpuc->events[i]); cpuc->pcr[0] &= ~mask_for_index(idx); - if (hwc->state & PERF_HES_STOPPED) + if (hwc->state & PERF_HES_ARCH) { cpuc->pcr[0] |= nop_for_index(idx); - else + } else { cpuc->pcr[0] |= event_encoding(enc, idx); + hwc->state = 0; + } } out: cpuc->pcr[0] |= cpuc->event[0]->hw.config_base; @@ -988,6 +992,9 @@ static void calculate_multiple_pcrs(stru
cpuc->current_idx[i] = idx;
+ if (cp->hw.state & PERF_HES_ARCH) + continue; + sparc_pmu_start(cp, PERF_EF_RELOAD); } out: @@ -1079,6 +1086,8 @@ static void sparc_pmu_start(struct perf_ event->hw.state = 0;
sparc_pmu_enable_event(cpuc, &event->hw, idx); + + perf_event_update_userpage(event); }
static void sparc_pmu_stop(struct perf_event *event, int flags) @@ -1371,9 +1380,9 @@ static int sparc_pmu_add(struct perf_eve cpuc->events[n0] = event->hw.event_base; cpuc->current_idx[n0] = PIC_NO_INDEX;
- event->hw.state = PERF_HES_UPTODATE; + event->hw.state = PERF_HES_UPTODATE | PERF_HES_STOPPED; if (!(ef_flags & PERF_EF_START)) - event->hw.state |= PERF_HES_STOPPED; + event->hw.state |= PERF_HES_ARCH;
/* * If group events scheduling transaction was started,
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: "David S. Miller" davem@davemloft.net
[ Upstream commit 776ca1543b5fe673aaf1beb244fcc2429d378083 ]
First, the trap number for 32-bit syscalls is 0x10.
Also, only negate the return value when syscall error is indicated by the carry bit being set.
Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/sparc/vdso/vclock_gettime.c | 12 +++++++++++- 1 file changed, 11 insertions(+), 1 deletion(-)
--- a/arch/sparc/vdso/vclock_gettime.c +++ b/arch/sparc/vdso/vclock_gettime.c @@ -33,9 +33,19 @@ #define TICK_PRIV_BIT (1ULL << 63) #endif
+#ifdef CONFIG_SPARC64 #define SYSCALL_STRING \ "ta 0x6d;" \ - "sub %%g0, %%o0, %%o0;" \ + "bcs,a 1f;" \ + " sub %%g0, %%o0, %%o0;" \ + "1:" +#else +#define SYSCALL_STRING \ + "ta 0x10;" \ + "bcs,a 1f;" \ + " sub %%g0, %%o0, %%o0;" \ + "1:" +#endif
#define SYSCALL_CLOBBERS \ "f0", "f1", "f2", "f3", "f4", "f5", "f6", "f7", \
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: "David S. Miller" davem@davemloft.net
[ Upstream commit 455adb3174d2c8518cef1a61140c211f6ac224d2 ]
Like x86 and arm, call perf_sample_event_took() in perf event NMI interrupt handler.
Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/sparc/kernel/perf_event.c | 9 +++++++++ 1 file changed, 9 insertions(+)
--- a/arch/sparc/kernel/perf_event.c +++ b/arch/sparc/kernel/perf_event.c @@ -24,6 +24,7 @@ #include <asm/cpudata.h> #include <linux/uaccess.h> #include <linux/atomic.h> +#include <linux/sched/clock.h> #include <asm/nmi.h> #include <asm/pcr.h> #include <asm/cacheflush.h> @@ -1612,6 +1613,8 @@ static int __kprobes perf_event_nmi_hand struct perf_sample_data data; struct cpu_hw_events *cpuc; struct pt_regs *regs; + u64 finish_clock; + u64 start_clock; int i;
if (!atomic_read(&active_events)) @@ -1625,6 +1628,8 @@ static int __kprobes perf_event_nmi_hand return NOTIFY_DONE; }
+ start_clock = sched_clock(); + regs = args->regs;
cpuc = this_cpu_ptr(&cpu_hw_events); @@ -1663,6 +1668,10 @@ static int __kprobes perf_event_nmi_hand sparc_pmu_stop(event, 0); }
+ finish_clock = sched_clock(); + + perf_sample_event_took(finish_clock - start_clock); + return NOTIFY_STOP; }
4.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nikolay Aleksandrov nikolay@cumulusnetworks.com
commit 0fe5119e267f3e3d8ac206895f5922195ec55a8a upstream.
Recently a check was added which prevents marking of routers with zero source address, but for IPv6 that cannot happen as the relevant RFCs actually forbid such packets: RFC 2710 (MLDv1): "To be valid, the Query message MUST come from a link-local IPv6 Source Address, be at least 24 octets long, and have a correct MLD checksum."
Same goes for RFC 3810.
And also it can be seen as a requirement in ipv6_mc_check_mld_query() which is used by the bridge to validate the message before processing it. Thus any queries with :: source address won't be processed anyway. So just remove the check for zero IPv6 source address from the query processing function.
Fixes: 5a2de63fd1a5 ("bridge: do not add port to router list when receives query with source 0.0.0.0") Signed-off-by: Nikolay Aleksandrov nikolay@cumulusnetworks.com Signed-off-by: David S. Miller davem@davemloft.net Cc: Hangbin Liu liuhangbin@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- net/bridge/br_multicast.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
--- a/net/bridge/br_multicast.c +++ b/net/bridge/br_multicast.c @@ -1426,8 +1426,7 @@ static void br_multicast_query_received( * is 0.0.0.0 should not be added to router port list. */ if ((saddr->proto == htons(ETH_P_IP) && saddr->u.ip4) || - (saddr->proto == htons(ETH_P_IPV6) && - !ipv6_addr_any(&saddr->u.ip6))) + saddr->proto == htons(ETH_P_IPV6)) br_multicast_mark_router(br, port); }
stable-rc/linux-4.18.y boot: 130 boots: 0 failed, 116 passed with 13 offline, 1 conflict (v4.18.16-151-g8a21f68d3977)
Full Boot Summary: https://kernelci.org/boot/all/job/stable-rc/branch/linux-4.18.y/kernel/v4.18... Full Build Summary: https://kernelci.org/build/stable-rc/branch/linux-4.18.y/kernel/v4.18.16-151...
Tree: stable-rc Branch: linux-4.18.y Git Describe: v4.18.16-151-g8a21f68d3977 Git Commit: 8a21f68d39776cc3ca539a12266806e0a483da57 Git URL: http://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git Tested: 66 unique boards, 24 SoC families, 15 builds out of 187
Boot Regressions Detected:
arm:
exynos_defconfig: exynos5422-odroidxu3: lab-baylibre-seattle: new failure (last pass: v4.18.16)
Offline Platforms:
arm:
tegra_defconfig: tegra124-jetson-tk1: 1 offline lab
bcm2835_defconfig: bcm2835-rpi-b: 1 offline lab
sama5_defconfig: at91-sama5d4_xplained: 1 offline lab
multi_v7_defconfig: alpine-db: 1 offline lab at91-sama5d4_xplained: 1 offline lab socfpga_cyclone5_de0_sockit: 1 offline lab sun5i-r8-chip: 1 offline lab tegra124-jetson-tk1: 1 offline lab
sunxi_defconfig: sun5i-r8-chip: 1 offline lab
arm64:
defconfig: apq8016-sbc: 1 offline lab juno-r2: 1 offline lab meson-gxl-s905d-p230: 1 offline lab mt7622-rfb1: 1 offline lab
Conflicting Boot Failure Detected: (These likely are not failures as other labs are reporting PASS. Needs review.)
arm:
exynos_defconfig: exynos5422-odroidxu3: lab-baylibre: PASS lab-baylibre-seattle: FAIL lab-collabora: PASS
--- For more info write to info@kernelci.org
On 11/2/18 11:32 AM, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 4.18.17 release. There are 150 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sun Nov 4 18:28:05 UTC 2018. Anything received after that time might be too late.
Build results: total: 136 pass: 136 fail: 0 Qemu test results: total: 321 pass: 321 fail: 0
Details are available at https://kerneltests.org/builders/.
Guenter
On Sat, 3 Nov 2018 at 00:08, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 4.18.17 release. There are 150 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sun Nov 4 18:28:05 UTC 2018. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.18.17-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.18.y and the diffstat can be found below.
thanks,
greg k-h
Results from Linaro’s test farm. No regressions on arm64, arm, x86_64 and i386.
Summary ------------------------------------------------------------------------
kernel: 4.18.17-rc1 git repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git git branch: linux-4.18.y git commit: 79190d1d2b752a03eced074e98b3aa2a03563370 git describe: v4.18.16-151-g79190d1d2b75 Test details: https://qa-reports.linaro.org/lkft/linux-stable-rc-4.18-oe/build/v4.18.16-15...
No regressions (compared to build v4.18.16-151-g8a21f68d3977)
linux-stable-mirror@lists.linaro.org