This is the start of the stable review cycle for the 4.14.77 release. There are 109 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Thu Oct 18 17:04:58 UTC 2018. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.14.77-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.14.y and the diffstat can be found below.
thanks,
greg k-h
------------- Pseudo-Shortlog of commits:
Greg Kroah-Hartman gregkh@linuxfoundation.org Linux 4.14.77-rc1
Jiri Olsa jolsa@kernel.org perf tools: Fix snprint warnings for gcc 8
Russell King rmk+kernel@armlinux.org.uk ARM: spectre-v1: mitigate user accesses
Russell King rmk+kernel@armlinux.org.uk ARM: spectre-v1: use get_user() for __get_user()
Russell King rmk+kernel@armlinux.org.uk ARM: use __inttype() in get_user()
Russell King rmk+kernel@armlinux.org.uk ARM: oabi-compat: copy semops using __copy_from_user()
Russell King rmk+kernel@armlinux.org.uk ARM: vfp: use __copy_from_user() when restoring VFP state
Russell King rmk+kernel@armlinux.org.uk ARM: signal: copy registers using __copy_from_user()
Russell King rmk+kernel@armlinux.org.uk ARM: spectre-v1: fix syscall entry
Russell King rmk+kernel@armlinux.org.uk ARM: spectre-v1: add array_index_mask_nospec() implementation
Russell King rmk+kernel@armlinux.org.uk ARM: spectre-v1: add speculation barrier (csdb) macros
Russell King rmk+kernel@armlinux.org.uk ARM: KVM: report support for SMCCC_ARCH_WORKAROUND_1
Russell King rmk+kernel@armlinux.org.uk ARM: KVM: Add SMCCC_ARCH_WORKAROUND_1 fast handling
Russell King rmk+kernel@armlinux.org.uk ARM: spectre-v2: KVM: invalidate icache on guest exit for Brahma B15
Marc Zyngier marc.zyngier@arm.com ARM: KVM: invalidate icache on guest exit for Cortex-A15
Marc Zyngier marc.zyngier@arm.com ARM: KVM: invalidate BTB on guest exit for Cortex-A12/A17
Russell King rmk+kernel@armlinux.org.uk ARM: spectre-v2: warn about incorrect context switching functions
Russell King rmk+kernel@armlinux.org.uk ARM: spectre-v2: add firmware based hardening
Russell King rmk+kernel@armlinux.org.uk ARM: spectre-v2: harden user aborts in kernel space
Russell King rmk+kernel@armlinux.org.uk ARM: spectre-v2: add Cortex A8 and A15 validation of the IBE bit
Russell King rmk+kernel@armlinux.org.uk ARM: spectre-v2: harden branch predictor on context switches
Russell King rmk+kernel@armlinux.org.uk ARM: spectre: add Kconfig symbol for CPUs vulnerable to Spectre
Russell King rmk+kernel@armlinux.org.uk ARM: bugs: add support for per-processor bug checking
Russell King rmk+kernel@armlinux.org.uk ARM: bugs: hook processor bug checking into SMP and suspend paths
Russell King rmk+kernel@armlinux.org.uk ARM: bugs: prepare processor bug infrastructure
Russell King rmk+kernel@armlinux.org.uk ARM: add more CPU part numbers for Cortex and Brahma B15 CPUs
Roman Gushchin guro@fb.com mm: don't show nr_indirectly_reclaimable in /proc/vmstat
Roman Gushchin guro@fb.com mm: treat indirectly reclaimable memory as free in overcommit logic
Roman Gushchin guro@fb.com dcache: account external names as indirectly reclaimable memory
Roman Gushchin guro@fb.com mm: treat indirectly reclaimable memory as available in MemAvailable
Roman Gushchin guro@fb.com mm: introduce NR_INDIRECTLY_RECLAIMABLE_BYTES
Mathias Nyman mathias.nyman@linux.intel.com xhci: Don't print a warning when setting link state for disabled ports
Edgar Cherkasov echerkasov@dev.rtsoft.ru i2c: i2c-scmi: fix for i2c_smbus_write_block_data
Jan Kara jack@suse.cz mm: Preserve _PAGE_DEVMAP across mprotect() calls
Jérôme Glisse jglisse@redhat.com mm/thp: fix call to mmu_notifier in set_pmd_migration_entry() v2
Will Deacon will.deacon@arm.com arm64: perf: Reject stand-alone CHAIN events for PMUv3
Marco Felsch m.felsch@pengutronix.de pinctrl: mcp23s08: fix irq and irqchip setup order
Chris Boot bootc@bootc.net mmc: block: avoid multiblock reads for the last sector in SPI mode
Tejun Heo tj@kernel.org cgroup: Fix dom_cgrp propagation when enabling threaded mode
Damien Le Moal damien.lemoal@wdc.com dm linear: fix linear_end_io conditional definition
Mike Snitzer snitzer@redhat.com dm linear: eliminate linear_end_io call if CONFIG_DM_ZONED disabled
Damien Le Moal damien.lemoal@wdc.com dm: fix report zone remapping to account for partition offset
Shenghui Wang shhuiw@foxmail.com dm cache: destroy migration_cache if cache target registration failed
Eric Farman farman@linux.ibm.com s390/cio: Fix how vfio-ccw checks pinned pages
Adrian Hunter adrian.hunter@intel.com perf script python: Fix export-to-sqlite.py sample columns
Adrian Hunter adrian.hunter@intel.com perf script python: Fix export-to-postgresql.py occasional failure
Mike Rapoport rppt@linux.vnet.ibm.com percpu: stop leaking bitmap metadata blocks
Mikulas Patocka mpatocka@redhat.com mach64: detect the dot clock divider correctly on sparc
Paul Burton paul.burton@mips.com MIPS: VDSO: Always map near top of user memory
Jann Horn jannh@google.com mm/vmstat.c: fix outdated vmstat_text
Amber Lin Amber.Lin@amd.com drm/amdgpu: Fix SDMA HQD destroy error on gfx_v7
Vitaly Kuznetsov vkuznets@redhat.com x86/kvm/lapic: always disable MMIO interface in x2APIC mode
Hans de Goede hdegoede@redhat.com clk: x86: Stop marking clocks as CLK_IS_CRITICAL
Hans de Goede hdegoede@redhat.com clk: x86: add "ether_clk" alias for Bay Trail / Cherry Trail
Stephen Hemminger stephen@networkplumber.org PCI: hv: support reporting serial number as slot information
Nicolas Ferre nicolas.ferre@microchip.com ARM: dts: at91: add new compatibility string for macb on sama5d3
Nicolas Ferre nicolas.ferre@microchip.com net: macb: disable scatter-gather for macb on sama5d3
Jongsung Kim neidhard.kim@lge.com stmmac: fix valid numbers of unicast filter entries
Stephen Hemminger stephen@networkplumber.org hv_netvsc: fix schedule in RCU context
Yu Zhao yuzhao@google.com sound: don't call skl_init_chip() to reset intel skl soc
Yu Zhao yuzhao@google.com sound: enable interrupt after dma buffer initialization
Dan Carpenter dan.carpenter@oracle.com scsi: qla2xxx: Fix an endian bug in fcpcmd_is_corrupted()
Laura Abbott labbott@redhat.com scsi: iscsi: target: Don't use stack buffer for scatterlist
Tony Lindgren tony@atomide.com mfd: omap-usb-host: Fix dts probe of children
Hermes Zhang chenhuiz@axis.com Bluetooth: hci_ldisc: Free rw_semaphore on close
Kuninori Morimoto kuninori.morimoto.gx@renesas.com ASoC: rsnd: don't fallback to PIO mode when -EPROBE_DEFER
Kuninori Morimoto kuninori.morimoto.gx@renesas.com ASoC: rsnd: adg: care clock-frequency size
Lei Yang Lei.Yang@windriver.com selftests: memory-hotplug: add required configs
Lei Yang Lei.Yang@windriver.com selftests/efivarfs: add required kernel configs
Danny Smith danny.smith@axis.com ASoC: sigmadsp: safeload should not have lower byte limit
Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com ASoC: wm8804: Add ACPI support
Oder Chiou oder_chiou@realtek.com ASoC: rt5514: Fix the issue of the delay volume applied again
Eric Dumazet edumazet@google.com inet: make sure to grab rcu_read_lock before using ireq->ireq_opt
Eric Dumazet edumazet@google.com tcp/dccp: fix lockdep issue when SYN is backlogged
Maciej Żenczykowski maze@google.com net-ethtool: ETHTOOL_GUFO did not and should not require CAP_NET_ADMIN
Davide Caratti dcaratti@redhat.com bnxt_en: don't try to offload VLAN 'modify' action
Jakub Kicinski jakub.kicinski@netronome.com nfp: avoid soft lockups under control message storm
Mahesh Bandewar maheshb@google.com bonding: fix warning message
Mahesh Bandewar maheshb@google.com bonding: pass link-local packets to bonding master also.
Eran Ben Elisha eranbe@mellanox.com net/mlx5: E-Switch, Fix out of bound access when setting vport rate
Friedemann Gerold f.gerold@b-c-s.de net: aquantia: memory corruption on jumbo frames
Jianbo Liu jianbol@mellanox.com net/mlx5e: Set vlan masks for all offloaded TC rules
Florian Fainelli f.fainelli@gmail.com net: dsa: bcm_sf2: Fix unbind ordering
Jianfeng Tan jianfeng.tan@linux.alibaba.com net/packet: fix packet drop as of virtio gso
Jose Abreu Jose.Abreu@synopsys.com net: stmmac: Fixup the tail addr setting in xmit path
Jiri Kosina jkosina@suse.cz udp: Unbreak modules that rely on external __skb_recv_udp() availability
Parthasarathy Bhuvaragan parthasarathy.bhuvaragan@ericsson.com tipc: fix flow control accounting for implicit connect
Ido Schimmel idosch@mellanox.com team: Forbid enslaving team device to itself
Xin Long lucien.xin@gmail.com sctp: update dst pmtu with the correct daddr
Eric Dumazet edumazet@google.com rtnl: limit IFLA_NUM_TX_QUEUES and IFLA_NUM_RX_QUEUES to 4096
Mauricio Faria de Oliveira mfo@canonical.com rtnetlink: fix rtnl_fdb_dump() for ndmsg header
Giacinto Cifelli gciofono@gmail.com qmi_wwan: Added support for Gemalto's Cinterion ALASxx WWAN interface
Shahed Shaikh shahed.shaikh@cavium.com qlcnic: fix Tx descriptor corruption on 82xx devices
Yu Zhao yuzhao@google.com net/usb: cancel pending work when unbinding smsc75xx
Florian Fainelli f.fainelli@gmail.com net: systemport: Fix wake-up interrupt race during resume
David Ahern dsahern@gmail.com net: sched: Add policy validation for tc attributes
Antoine Tenart antoine.tenart@bootlin.com net: mvpp2: fix a txq_done race condition
Maxime Chevallier maxime.chevallier@bootlin.com net: mvpp2: Extract the correct ethtype from the skb for tx csum offload
Sean Tranchetti stranche@codeaurora.org netlabel: check for IPV4MASK in addrinfo_get
Jeff Barnhill 0xeffeff@gmail.com net/ipv6: Display all addresses in output of /proc/net/if_inet6
Sabrina Dubroca sd@queasysnail.net net: ipv4: update fnhe_pmtu when first hop's MTU changes
Yunsheng Lin linyunsheng@huawei.com net: hns: fix for unmapping problem when SMMU is on
Florian Fainelli f.fainelli@gmail.com net: dsa: bcm_sf2: Call setup during switch resume
Wei Wang weiwan@google.com ipv6: take rcu lock in rawv6_send_hdrinc()
Eric Dumazet edumazet@google.com ipv4: fix use-after-free in ip_cmsg_recv_dstaddr()
Paolo Abeni pabeni@redhat.com ip_tunnel: be careful when accessing the inner header
Paolo Abeni pabeni@redhat.com ip6_tunnel: be careful when accessing the inner header
Mahesh Bandewar maheshb@google.com bonding: avoid possible dead-lock
Venkat Duvvuru venkatkumar.duvvuru@broadcom.com bnxt_en: free hwrm resources, if driver probe fails.
Michael Chan michael.chan@broadcom.com bnxt_en: Fix TX timeout during netpoll.
-------------
Diffstat:
Documentation/devicetree/bindings/net/macb.txt | 1 + Makefile | 4 +- arch/arm/boot/dts/sama5d3_emac.dtsi | 2 +- arch/arm/include/asm/assembler.h | 12 ++ arch/arm/include/asm/barrier.h | 32 ++++ arch/arm/include/asm/bugs.h | 6 +- arch/arm/include/asm/cp15.h | 3 + arch/arm/include/asm/cputype.h | 8 + arch/arm/include/asm/kvm_asm.h | 2 - arch/arm/include/asm/kvm_host.h | 14 +- arch/arm/include/asm/kvm_mmu.h | 23 ++- arch/arm/include/asm/proc-fns.h | 4 + arch/arm/include/asm/system_misc.h | 15 ++ arch/arm/include/asm/thread_info.h | 4 +- arch/arm/include/asm/uaccess.h | 26 ++- arch/arm/kernel/Makefile | 1 + arch/arm/kernel/bugs.c | 18 +++ arch/arm/kernel/entry-common.S | 18 +-- arch/arm/kernel/entry-header.S | 25 +++ arch/arm/kernel/signal.c | 58 +++---- arch/arm/kernel/smp.c | 4 + arch/arm/kernel/suspend.c | 2 + arch/arm/kernel/sys_oabi-compat.c | 8 +- arch/arm/kvm/hyp/hyp-entry.S | 112 ++++++++++++- arch/arm/lib/copy_from_user.S | 9 ++ arch/arm/mm/Kconfig | 23 +++ arch/arm/mm/Makefile | 2 +- arch/arm/mm/fault.c | 3 + arch/arm/mm/proc-macros.S | 3 +- arch/arm/mm/proc-v7-2level.S | 6 - arch/arm/mm/proc-v7-bugs.c | 174 +++++++++++++++++++++ arch/arm/mm/proc-v7.S | 154 ++++++++++++++---- arch/arm/vfp/vfpmodule.c | 17 +- arch/arm64/kernel/perf_event.c | 7 + arch/mips/include/asm/processor.h | 10 +- arch/mips/kernel/process.c | 25 +++ arch/mips/kernel/vdso.c | 18 ++- arch/powerpc/include/asm/book3s/64/pgtable.h | 4 +- arch/x86/include/asm/pgtable_types.h | 2 +- arch/x86/include/uapi/asm/kvm.h | 1 + arch/x86/kvm/lapic.c | 22 ++- drivers/bluetooth/hci_ldisc.c | 2 + drivers/clk/x86/clk-pmc-atom.c | 18 ++- drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c | 2 +- drivers/i2c/busses/i2c-scmi.c | 1 + drivers/md/dm-cache-target.c | 5 +- drivers/md/dm-flakey.c | 2 + drivers/md/dm-linear.c | 8 +- drivers/md/dm.c | 27 +++- drivers/mfd/omap-usb-host.c | 11 +- drivers/mmc/core/block.c | 10 ++ drivers/net/bonding/bond_main.c | 65 ++++---- drivers/net/dsa/bcm_sf2.c | 12 +- drivers/net/ethernet/aquantia/atlantic/aq_ring.c | 32 ++-- drivers/net/ethernet/broadcom/bcmsysport.c | 22 +-- drivers/net/ethernet/broadcom/bnxt/bnxt.c | 23 ++- drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c | 20 ++- drivers/net/ethernet/cadence/macb_main.c | 8 + drivers/net/ethernet/hisilicon/hns/hnae.c | 2 +- drivers/net/ethernet/hisilicon/hns/hns_enet.c | 30 ++-- drivers/net/ethernet/marvell/mvpp2.c | 20 ++- drivers/net/ethernet/mellanox/mlx5/core/en_tc.c | 3 + drivers/net/ethernet/mellanox/mlx5/core/eswitch.c | 4 +- .../net/ethernet/netronome/nfp/nfp_net_common.c | 17 +- drivers/net/ethernet/qlogic/qlcnic/qlcnic.h | 8 +- .../net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.c | 3 +- .../net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.h | 3 +- drivers/net/ethernet/qlogic/qlcnic/qlcnic_hw.h | 3 +- drivers/net/ethernet/qlogic/qlcnic/qlcnic_io.c | 12 +- drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 8 +- .../net/ethernet/stmicro/stmmac/stmmac_platform.c | 5 +- drivers/net/hyperv/netvsc_drv.c | 9 +- drivers/net/team/team.c | 5 + drivers/net/usb/qmi_wwan.c | 1 + drivers/net/usb/smsc75xx.c | 1 + drivers/pci/host/pci-hyperv.c | 37 +++++ drivers/perf/arm_pmu.c | 8 +- drivers/pinctrl/pinctrl-mcp23s08.c | 13 +- drivers/s390/cio/vfio_ccw_cp.c | 2 +- drivers/scsi/qla2xxx/qla_target.h | 4 +- drivers/target/iscsi/iscsi_target.c | 22 ++- drivers/usb/host/xhci-hub.c | 18 +-- drivers/video/fbdev/aty/atyfb.h | 3 +- drivers/video/fbdev/aty/atyfb_base.c | 7 +- drivers/video/fbdev/aty/mach64_ct.c | 10 +- fs/dcache.c | 38 +++-- include/linux/cgroup-defs.h | 1 + include/linux/mmzone.h | 1 + include/linux/netdevice.h | 7 + include/linux/perf/arm_pmu.h | 1 + include/linux/virtio_net.h | 18 +++ include/net/bonding.h | 7 +- include/net/inet_sock.h | 6 - include/net/ip_fib.h | 1 + include/sound/hdaudio.h | 1 + kernel/cgroup/cgroup.c | 25 +-- mm/huge_memory.c | 6 - mm/page_alloc.c | 7 + mm/percpu.c | 1 + mm/util.c | 7 + mm/vmstat.c | 6 +- net/core/dev.c | 28 +++- net/core/ethtool.c | 1 + net/core/rtnetlink.c | 35 +++-- net/dccp/input.c | 4 +- net/dccp/ipv4.c | 4 +- net/ipv4/fib_frontend.c | 12 +- net/ipv4/fib_semantics.c | 50 ++++++ net/ipv4/inet_connection_sock.c | 5 +- net/ipv4/ip_sockglue.c | 3 +- net/ipv4/ip_tunnel.c | 9 ++ net/ipv4/tcp_input.c | 4 +- net/ipv4/tcp_ipv4.c | 4 +- net/ipv4/udp.c | 2 +- net/ipv6/addrconf.c | 4 +- net/ipv6/ip6_tunnel.c | 13 +- net/ipv6/raw.c | 29 ++-- net/netlabel/netlabel_unlabeled.c | 3 +- net/packet/af_packet.c | 11 +- net/sched/sch_api.c | 22 ++- net/sctp/transport.c | 12 +- net/tipc/socket.c | 4 +- sound/hda/hdac_controller.c | 15 +- sound/soc/codecs/rt5514.c | 8 +- sound/soc/codecs/sigmadsp.c | 3 +- sound/soc/codecs/wm8804-i2c.c | 15 +- sound/soc/intel/skylake/skl.c | 2 +- sound/soc/sh/rcar/adg.c | 5 + sound/soc/sh/rcar/core.c | 10 +- sound/soc/sh/rcar/dma.c | 4 + tools/perf/builtin-script.c | 22 +-- tools/perf/scripts/python/export-to-postgresql.py | 9 ++ tools/perf/scripts/python/export-to-sqlite.py | 6 +- tools/perf/tests/attr.c | 4 +- tools/perf/tests/mem.c | 2 +- tools/perf/tests/pmu.c | 2 +- tools/perf/util/cgroup.c | 2 +- tools/perf/util/parse-events.c | 4 +- tools/perf/util/pmu.c | 2 +- tools/testing/selftests/efivarfs/config | 1 + tools/testing/selftests/memory-hotplug/config | 1 + 141 files changed, 1489 insertions(+), 428 deletions(-)
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Michael Chan michael.chan@broadcom.com
[ Upstream commit 73f21c653f930f438d53eed29b5e4c65c8a0f906 ]
The current netpoll implementation in the bnxt_en driver has problems that may miss TX completion events. bnxt_poll_work() in effect is only handling at most 1 TX packet before exiting. In addition, there may be in flight TX completions that ->poll() may miss even after we fix bnxt_poll_work() to handle all visible TX completions. netpoll may not call ->poll() again and HW may not generate IRQ because the driver does not ARM the IRQ when the budget (0 for netpoll) is reached.
We fix it by handling all TX completions and to always ARM the IRQ when we exit ->poll() with 0 budget.
Also, the logic to ACK the completion ring in case it is almost filled with TX completions need to be adjusted to take care of the 0 budget case, as discussed with Eric Dumazet edumazet@google.com
Reported-by: Song Liu songliubraving@fb.com Reviewed-by: Song Liu songliubraving@fb.com Tested-by: Song Liu songliubraving@fb.com Signed-off-by: Michael Chan michael.chan@broadcom.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/broadcom/bnxt/bnxt.c | 13 ++++++++++--- 1 file changed, 10 insertions(+), 3 deletions(-)
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c @@ -1864,8 +1864,11 @@ static int bnxt_poll_work(struct bnxt *b if (TX_CMP_TYPE(txcmp) == CMP_TYPE_TX_L2_CMP) { tx_pkts++; /* return full budget so NAPI will complete. */ - if (unlikely(tx_pkts > bp->tx_wake_thresh)) + if (unlikely(tx_pkts > bp->tx_wake_thresh)) { rx_pkts = budget; + raw_cons = NEXT_RAW_CMP(raw_cons); + break; + } } else if ((TX_CMP_TYPE(txcmp) & 0x30) == 0x10) { if (likely(budget)) rc = bnxt_rx_pkt(bp, bnapi, &raw_cons, &event); @@ -1893,7 +1896,7 @@ static int bnxt_poll_work(struct bnxt *b } raw_cons = NEXT_RAW_CMP(raw_cons);
- if (rx_pkts == budget) + if (rx_pkts && rx_pkts == budget) break; }
@@ -2007,8 +2010,12 @@ static int bnxt_poll(struct napi_struct while (1) { work_done += bnxt_poll_work(bp, bnapi, budget - work_done);
- if (work_done >= budget) + if (work_done >= budget) { + if (!budget) + BNXT_CP_DB_REARM(cpr->cp_doorbell, + cpr->cp_raw_cons); break; + }
if (!bnxt_has_work(bp, cpr)) { if (napi_complete_done(napi, work_done))
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Venkat Duvvuru venkatkumar.duvvuru@broadcom.com
[ Upstream commit a2bf74f4e1b82395dad2b08d2a911d9151db71c1 ]
When the driver probe fails, all the resources that were allocated prior to the failure must be freed. However, hwrm dma response memory is not getting freed.
This patch fixes the problem described above.
Fixes: c0c050c58d84 ("bnxt_en: New Broadcom ethernet driver.") Signed-off-by: Venkat Duvvuru venkatkumar.duvvuru@broadcom.com Signed-off-by: Michael Chan michael.chan@broadcom.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/broadcom/bnxt/bnxt.c | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-)
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c @@ -2964,10 +2964,11 @@ static void bnxt_free_hwrm_resources(str { struct pci_dev *pdev = bp->pdev;
- dma_free_coherent(&pdev->dev, PAGE_SIZE, bp->hwrm_cmd_resp_addr, - bp->hwrm_cmd_resp_dma_addr); - - bp->hwrm_cmd_resp_addr = NULL; + if (bp->hwrm_cmd_resp_addr) { + dma_free_coherent(&pdev->dev, PAGE_SIZE, bp->hwrm_cmd_resp_addr, + bp->hwrm_cmd_resp_dma_addr); + bp->hwrm_cmd_resp_addr = NULL; + } if (bp->hwrm_dbg_resp_addr) { dma_free_coherent(&pdev->dev, HWRM_DBG_REG_BUF_SIZE, bp->hwrm_dbg_resp_addr, @@ -8217,6 +8218,7 @@ init_err_cleanup_tc: bnxt_clear_int_mode(bp);
init_err_pci_clean: + bnxt_free_hwrm_resources(bp); bnxt_cleanup_pci(bp);
init_err_free:
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mahesh Bandewar maheshb@google.com
[ Upstream commit d4859d749aa7090ffb743d15648adb962a1baeae ]
Syzkaller reported this on a slightly older kernel but it's still applicable to the current kernel -
====================================================== WARNING: possible circular locking dependency detected 4.18.0-next-20180823+ #46 Not tainted ------------------------------------------------------ syz-executor4/26841 is trying to acquire lock: 00000000dd41ef48 ((wq_completion)bond_dev->name){+.+.}, at: flush_workqueue+0x2db/0x1e10 kernel/workqueue.c:2652
but task is already holding lock: 00000000768ab431 (rtnl_mutex){+.+.}, at: rtnl_lock net/core/rtnetlink.c:77 [inline] 00000000768ab431 (rtnl_mutex){+.+.}, at: rtnetlink_rcv_msg+0x412/0xc30 net/core/rtnetlink.c:4708
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #2 (rtnl_mutex){+.+.}: __mutex_lock_common kernel/locking/mutex.c:925 [inline] __mutex_lock+0x171/0x1700 kernel/locking/mutex.c:1073 mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:1088 rtnl_lock+0x17/0x20 net/core/rtnetlink.c:77 bond_netdev_notify drivers/net/bonding/bond_main.c:1310 [inline] bond_netdev_notify_work+0x44/0xd0 drivers/net/bonding/bond_main.c:1320 process_one_work+0xc73/0x1aa0 kernel/workqueue.c:2153 worker_thread+0x189/0x13c0 kernel/workqueue.c:2296 kthread+0x35a/0x420 kernel/kthread.c:246 ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:415
-> #1 ((work_completion)(&(&nnw->work)->work)){+.+.}: process_one_work+0xc0b/0x1aa0 kernel/workqueue.c:2129 worker_thread+0x189/0x13c0 kernel/workqueue.c:2296 kthread+0x35a/0x420 kernel/kthread.c:246 ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:415
-> #0 ((wq_completion)bond_dev->name){+.+.}: lock_acquire+0x1e4/0x4f0 kernel/locking/lockdep.c:3901 flush_workqueue+0x30a/0x1e10 kernel/workqueue.c:2655 drain_workqueue+0x2a9/0x640 kernel/workqueue.c:2820 destroy_workqueue+0xc6/0x9d0 kernel/workqueue.c:4155 __alloc_workqueue_key+0xef9/0x1190 kernel/workqueue.c:4138 bond_init+0x269/0x940 drivers/net/bonding/bond_main.c:4734 register_netdevice+0x337/0x1100 net/core/dev.c:8410 bond_newlink+0x49/0xa0 drivers/net/bonding/bond_netlink.c:453 rtnl_newlink+0xef4/0x1d50 net/core/rtnetlink.c:3099 rtnetlink_rcv_msg+0x46e/0xc30 net/core/rtnetlink.c:4711 netlink_rcv_skb+0x172/0x440 net/netlink/af_netlink.c:2454 rtnetlink_rcv+0x1c/0x20 net/core/rtnetlink.c:4729 netlink_unicast_kernel net/netlink/af_netlink.c:1317 [inline] netlink_unicast+0x5a0/0x760 net/netlink/af_netlink.c:1343 netlink_sendmsg+0xa18/0xfc0 net/netlink/af_netlink.c:1908 sock_sendmsg_nosec net/socket.c:622 [inline] sock_sendmsg+0xd5/0x120 net/socket.c:632 ___sys_sendmsg+0x7fd/0x930 net/socket.c:2115 __sys_sendmsg+0x11d/0x290 net/socket.c:2153 __do_sys_sendmsg net/socket.c:2162 [inline] __se_sys_sendmsg net/socket.c:2160 [inline] __x64_sys_sendmsg+0x78/0xb0 net/socket.c:2160 do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe
other info that might help us debug this:
Chain exists of: (wq_completion)bond_dev->name --> (work_completion)(&(&nnw->work)->work) --> rtnl_mutex
Possible unsafe locking scenario:
CPU0 CPU1 ---- ---- lock(rtnl_mutex); lock((work_completion)(&(&nnw->work)->work)); lock(rtnl_mutex); lock((wq_completion)bond_dev->name);
*** DEADLOCK ***
1 lock held by syz-executor4/26841:
stack backtrace: CPU: 1 PID: 26841 Comm: syz-executor4 Not tainted 4.18.0-next-20180823+ #46 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x1c9/0x2b4 lib/dump_stack.c:113 print_circular_bug.isra.34.cold.55+0x1bd/0x27d kernel/locking/lockdep.c:1222 check_prev_add kernel/locking/lockdep.c:1862 [inline] check_prevs_add kernel/locking/lockdep.c:1975 [inline] validate_chain kernel/locking/lockdep.c:2416 [inline] __lock_acquire+0x3449/0x5020 kernel/locking/lockdep.c:3412 lock_acquire+0x1e4/0x4f0 kernel/locking/lockdep.c:3901 flush_workqueue+0x30a/0x1e10 kernel/workqueue.c:2655 drain_workqueue+0x2a9/0x640 kernel/workqueue.c:2820 destroy_workqueue+0xc6/0x9d0 kernel/workqueue.c:4155 __alloc_workqueue_key+0xef9/0x1190 kernel/workqueue.c:4138 bond_init+0x269/0x940 drivers/net/bonding/bond_main.c:4734 register_netdevice+0x337/0x1100 net/core/dev.c:8410 bond_newlink+0x49/0xa0 drivers/net/bonding/bond_netlink.c:453 rtnl_newlink+0xef4/0x1d50 net/core/rtnetlink.c:3099 rtnetlink_rcv_msg+0x46e/0xc30 net/core/rtnetlink.c:4711 netlink_rcv_skb+0x172/0x440 net/netlink/af_netlink.c:2454 rtnetlink_rcv+0x1c/0x20 net/core/rtnetlink.c:4729 netlink_unicast_kernel net/netlink/af_netlink.c:1317 [inline] netlink_unicast+0x5a0/0x760 net/netlink/af_netlink.c:1343 netlink_sendmsg+0xa18/0xfc0 net/netlink/af_netlink.c:1908 sock_sendmsg_nosec net/socket.c:622 [inline] sock_sendmsg+0xd5/0x120 net/socket.c:632 ___sys_sendmsg+0x7fd/0x930 net/socket.c:2115 __sys_sendmsg+0x11d/0x290 net/socket.c:2153 __do_sys_sendmsg net/socket.c:2162 [inline] __se_sys_sendmsg net/socket.c:2160 [inline] __x64_sys_sendmsg+0x78/0xb0 net/socket.c:2160 do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x457089 Code: fd b4 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 cb b4 fb ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007f2df20a5c78 EFLAGS: 00000246 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 00007f2df20a66d4 RCX: 0000000000457089 RDX: 0000000000000000 RSI: 0000000020000180 RDI: 0000000000000003 RBP: 0000000000930140 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff R13: 00000000004d40b8 R14: 00000000004c8ad8 R15: 0000000000000001
Signed-off-by: Mahesh Bandewar maheshb@google.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/bonding/bond_main.c | 43 +++++++++++++++------------------------- include/net/bonding.h | 7 ------ 2 files changed, 18 insertions(+), 32 deletions(-)
--- a/drivers/net/bonding/bond_main.c +++ b/drivers/net/bonding/bond_main.c @@ -210,6 +210,7 @@ static void bond_get_stats(struct net_de static void bond_slave_arr_handler(struct work_struct *work); static bool bond_time_in_interval(struct bonding *bond, unsigned long last_act, int mod); +static void bond_netdev_notify_work(struct work_struct *work);
/*---------------------------- General routines -----------------------------*/
@@ -1254,6 +1255,8 @@ static struct slave *bond_alloc_slave(st return NULL; } } + INIT_DELAYED_WORK(&slave->notify_work, bond_netdev_notify_work); + return slave; }
@@ -1261,6 +1264,7 @@ static void bond_free_slave(struct slave { struct bonding *bond = bond_get_bond_by_slave(slave);
+ cancel_delayed_work_sync(&slave->notify_work); if (BOND_MODE(bond) == BOND_MODE_8023AD) kfree(SLAVE_AD_INFO(slave));
@@ -1282,39 +1286,26 @@ static void bond_fill_ifslave(struct sla info->link_failure_count = slave->link_failure_count; }
-static void bond_netdev_notify(struct net_device *dev, - struct netdev_bonding_info *info) -{ - rtnl_lock(); - netdev_bonding_info_change(dev, info); - rtnl_unlock(); -} - static void bond_netdev_notify_work(struct work_struct *_work) { - struct netdev_notify_work *w = - container_of(_work, struct netdev_notify_work, work.work); + struct slave *slave = container_of(_work, struct slave, + notify_work.work); + + if (rtnl_trylock()) { + struct netdev_bonding_info binfo;
- bond_netdev_notify(w->dev, &w->bonding_info); - dev_put(w->dev); - kfree(w); + bond_fill_ifslave(slave, &binfo.slave); + bond_fill_ifbond(slave->bond, &binfo.master); + netdev_bonding_info_change(slave->dev, &binfo); + rtnl_unlock(); + } else { + queue_delayed_work(slave->bond->wq, &slave->notify_work, 1); + } }
void bond_queue_slave_event(struct slave *slave) { - struct bonding *bond = slave->bond; - struct netdev_notify_work *nnw = kzalloc(sizeof(*nnw), GFP_ATOMIC); - - if (!nnw) - return; - - dev_hold(slave->dev); - nnw->dev = slave->dev; - bond_fill_ifslave(slave, &nnw->bonding_info.slave); - bond_fill_ifbond(bond, &nnw->bonding_info.master); - INIT_DELAYED_WORK(&nnw->work, bond_netdev_notify_work); - - queue_delayed_work(slave->bond->wq, &nnw->work, 0); + queue_delayed_work(slave->bond->wq, &slave->notify_work, 0); }
void bond_lower_state_changed(struct slave *slave) --- a/include/net/bonding.h +++ b/include/net/bonding.h @@ -139,12 +139,6 @@ struct bond_parm_tbl { int mode; };
-struct netdev_notify_work { - struct delayed_work work; - struct net_device *dev; - struct netdev_bonding_info bonding_info; -}; - struct slave { struct net_device *dev; /* first - useful for panic debug */ struct bonding *bond; /* our master */ @@ -172,6 +166,7 @@ struct slave { #ifdef CONFIG_NET_POLL_CONTROLLER struct netpoll *np; #endif + struct delayed_work notify_work; struct kobject kobj; struct rtnl_link_stats64 slave_stats; };
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Paolo Abeni pabeni@redhat.com
[ Upstream commit 76c0ddd8c3a683f6e2c6e60e11dc1a1558caf4bc ]
the ip6 tunnel xmit ndo assumes that the processed skb always contains an ip[v6] header, but syzbot has found a way to send frames that fall short of this assumption, leading to the following splat:
BUG: KMSAN: uninit-value in ip6ip6_tnl_xmit net/ipv6/ip6_tunnel.c:1307 [inline] BUG: KMSAN: uninit-value in ip6_tnl_start_xmit+0x7d2/0x1ef0 net/ipv6/ip6_tunnel.c:1390 CPU: 0 PID: 4504 Comm: syz-executor558 Not tainted 4.16.0+ #87 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:17 [inline] dump_stack+0x185/0x1d0 lib/dump_stack.c:53 kmsan_report+0x142/0x240 mm/kmsan/kmsan.c:1067 __msan_warning_32+0x6c/0xb0 mm/kmsan/kmsan_instr.c:683 ip6ip6_tnl_xmit net/ipv6/ip6_tunnel.c:1307 [inline] ip6_tnl_start_xmit+0x7d2/0x1ef0 net/ipv6/ip6_tunnel.c:1390 __netdev_start_xmit include/linux/netdevice.h:4066 [inline] netdev_start_xmit include/linux/netdevice.h:4075 [inline] xmit_one net/core/dev.c:3026 [inline] dev_hard_start_xmit+0x5f1/0xc70 net/core/dev.c:3042 __dev_queue_xmit+0x27ee/0x3520 net/core/dev.c:3557 dev_queue_xmit+0x4b/0x60 net/core/dev.c:3590 packet_snd net/packet/af_packet.c:2944 [inline] packet_sendmsg+0x7c70/0x8a30 net/packet/af_packet.c:2969 sock_sendmsg_nosec net/socket.c:630 [inline] sock_sendmsg net/socket.c:640 [inline] ___sys_sendmsg+0xec0/0x1310 net/socket.c:2046 __sys_sendmmsg+0x42d/0x800 net/socket.c:2136 SYSC_sendmmsg+0xc4/0x110 net/socket.c:2167 SyS_sendmmsg+0x63/0x90 net/socket.c:2162 do_syscall_64+0x309/0x430 arch/x86/entry/common.c:287 entry_SYSCALL_64_after_hwframe+0x3d/0xa2 RIP: 0033:0x441819 RSP: 002b:00007ffe58ee8268 EFLAGS: 00000213 ORIG_RAX: 0000000000000133 RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000441819 RDX: 0000000000000002 RSI: 0000000020000100 RDI: 0000000000000003 RBP: 00000000006cd018 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000213 R12: 0000000000402510 R13: 00000000004025a0 R14: 0000000000000000 R15: 0000000000000000
Uninit was created at: kmsan_save_stack_with_flags mm/kmsan/kmsan.c:278 [inline] kmsan_internal_poison_shadow+0xb8/0x1b0 mm/kmsan/kmsan.c:188 kmsan_kmalloc+0x94/0x100 mm/kmsan/kmsan.c:314 kmsan_slab_alloc+0x11/0x20 mm/kmsan/kmsan.c:321 slab_post_alloc_hook mm/slab.h:445 [inline] slab_alloc_node mm/slub.c:2737 [inline] __kmalloc_node_track_caller+0xaed/0x11c0 mm/slub.c:4369 __kmalloc_reserve net/core/skbuff.c:138 [inline] __alloc_skb+0x2cf/0x9f0 net/core/skbuff.c:206 alloc_skb include/linux/skbuff.h:984 [inline] alloc_skb_with_frags+0x1d4/0xb20 net/core/skbuff.c:5234 sock_alloc_send_pskb+0xb56/0x1190 net/core/sock.c:2085 packet_alloc_skb net/packet/af_packet.c:2803 [inline] packet_snd net/packet/af_packet.c:2894 [inline] packet_sendmsg+0x6454/0x8a30 net/packet/af_packet.c:2969 sock_sendmsg_nosec net/socket.c:630 [inline] sock_sendmsg net/socket.c:640 [inline] ___sys_sendmsg+0xec0/0x1310 net/socket.c:2046 __sys_sendmmsg+0x42d/0x800 net/socket.c:2136 SYSC_sendmmsg+0xc4/0x110 net/socket.c:2167 SyS_sendmmsg+0x63/0x90 net/socket.c:2162 do_syscall_64+0x309/0x430 arch/x86/entry/common.c:287 entry_SYSCALL_64_after_hwframe+0x3d/0xa2
This change addresses the issue adding the needed check before accessing the inner header.
The ipv4 side of the issue is apparently there since the ipv4 over ipv6 initial support, and the ipv6 side predates git history.
Fixes: c4d3efafcc93 ("[IPV6] IP6TUNNEL: Add support to IPv4 over IPv6 tunnel.") Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Reported-by: syzbot+3fde91d4d394747d6db4@syzkaller.appspotmail.com Tested-by: Alexander Potapenko glider@google.com Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv6/ip6_tunnel.c | 13 +++++++++++-- 1 file changed, 11 insertions(+), 2 deletions(-)
--- a/net/ipv6/ip6_tunnel.c +++ b/net/ipv6/ip6_tunnel.c @@ -1227,7 +1227,7 @@ static inline int ip4ip6_tnl_xmit(struct sk_buff *skb, struct net_device *dev) { struct ip6_tnl *t = netdev_priv(dev); - const struct iphdr *iph = ip_hdr(skb); + const struct iphdr *iph; int encap_limit = -1; struct flowi6 fl6; __u8 dsfield; @@ -1235,6 +1235,11 @@ ip4ip6_tnl_xmit(struct sk_buff *skb, str u8 tproto; int err;
+ /* ensure we can access the full inner ip header */ + if (!pskb_may_pull(skb, sizeof(struct iphdr))) + return -1; + + iph = ip_hdr(skb); memset(&(IPCB(skb)->opt), 0, sizeof(IPCB(skb)->opt));
tproto = ACCESS_ONCE(t->parms.proto); @@ -1298,7 +1303,7 @@ static inline int ip6ip6_tnl_xmit(struct sk_buff *skb, struct net_device *dev) { struct ip6_tnl *t = netdev_priv(dev); - struct ipv6hdr *ipv6h = ipv6_hdr(skb); + struct ipv6hdr *ipv6h; int encap_limit = -1; __u16 offset; struct flowi6 fl6; @@ -1307,6 +1312,10 @@ ip6ip6_tnl_xmit(struct sk_buff *skb, str u8 tproto; int err;
+ if (unlikely(!pskb_may_pull(skb, sizeof(*ipv6h)))) + return -1; + + ipv6h = ipv6_hdr(skb); tproto = ACCESS_ONCE(t->parms.proto); if ((tproto != IPPROTO_IPV6 && tproto != 0) || ip6_tnl_addr_conflict(t, ipv6h))
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Paolo Abeni pabeni@redhat.com
[ Upstream commit ccfec9e5cb2d48df5a955b7bf47f7782157d3bc2]
Cong noted that we need the same checks introduced by commit 76c0ddd8c3a6 ("ip6_tunnel: be careful when accessing the inner header") even for ipv4 tunnels.
Fixes: c54419321455 ("GRE: Refactor GRE tunneling code.") Suggested-by: Cong Wang xiyou.wangcong@gmail.com Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv4/ip_tunnel.c | 9 +++++++++ 1 file changed, 9 insertions(+)
--- a/net/ipv4/ip_tunnel.c +++ b/net/ipv4/ip_tunnel.c @@ -635,6 +635,7 @@ void ip_tunnel_xmit(struct sk_buff *skb, const struct iphdr *tnl_params, u8 protocol) { struct ip_tunnel *tunnel = netdev_priv(dev); + unsigned int inner_nhdr_len = 0; const struct iphdr *inner_iph; struct flowi4 fl4; u8 tos, ttl; @@ -644,6 +645,14 @@ void ip_tunnel_xmit(struct sk_buff *skb, __be32 dst; bool connected;
+ /* ensure we can access the inner net header, for several users below */ + if (skb->protocol == htons(ETH_P_IP)) + inner_nhdr_len = sizeof(struct iphdr); + else if (skb->protocol == htons(ETH_P_IPV6)) + inner_nhdr_len = sizeof(struct ipv6hdr); + if (unlikely(!pskb_may_pull(skb, inner_nhdr_len))) + goto tx_error; + inner_iph = (const struct iphdr *)skb_inner_network_header(skb); connected = (tunnel->parms.iph.daddr != 0);
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Dumazet edumazet@google.com
[ Upstream commit 64199fc0a46ba211362472f7f942f900af9492fd ]
Caching ip_hdr(skb) before a call to pskb_may_pull() is buggy, do not do it.
Fixes: 2efd4fca703a ("ip: in cmsg IP(V6)_ORIGDSTADDR call pskb_may_pull") Signed-off-by: Eric Dumazet edumazet@google.com Cc: Willem de Bruijn willemb@google.com Reported-by: syzbot syzkaller@googlegroups.com Acked-by: Willem de Bruijn willemb@google.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv4/ip_sockglue.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
--- a/net/ipv4/ip_sockglue.c +++ b/net/ipv4/ip_sockglue.c @@ -147,7 +147,6 @@ static void ip_cmsg_recv_security(struct static void ip_cmsg_recv_dstaddr(struct msghdr *msg, struct sk_buff *skb) { struct sockaddr_in sin; - const struct iphdr *iph = ip_hdr(skb); __be16 *ports; int end;
@@ -162,7 +161,7 @@ static void ip_cmsg_recv_dstaddr(struct ports = (__be16 *)skb_transport_header(skb);
sin.sin_family = AF_INET; - sin.sin_addr.s_addr = iph->daddr; + sin.sin_addr.s_addr = ip_hdr(skb)->daddr; sin.sin_port = ports[1]; memset(sin.sin_zero, 0, sizeof(sin.sin_zero));
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Wei Wang weiwan@google.com
[ Upstream commit a688caa34beb2fd2a92f1b6d33e40cde433ba160 ]
In rawv6_send_hdrinc(), in order to avoid an extra dst_hold(), we directly assign the dst to skb and set passed in dst to NULL to avoid double free. However, in error case, we free skb and then do stats update with the dst pointer passed in. This causes use-after-free on the dst. Fix it by taking rcu read lock right before dst could get released to make sure dst does not get freed until the stats update is done. Note: we don't have this issue in ipv4 cause dst is not used for stats update in v4.
Syzkaller reported following crash: BUG: KASAN: use-after-free in rawv6_send_hdrinc net/ipv6/raw.c:692 [inline] BUG: KASAN: use-after-free in rawv6_sendmsg+0x4421/0x4630 net/ipv6/raw.c:921 Read of size 8 at addr ffff8801d95ba730 by task syz-executor0/32088
CPU: 1 PID: 32088 Comm: syz-executor0 Not tainted 4.19.0-rc2+ #93 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x1c4/0x2b4 lib/dump_stack.c:113 print_address_description.cold.8+0x9/0x1ff mm/kasan/report.c:256 kasan_report_error mm/kasan/report.c:354 [inline] kasan_report.cold.9+0x242/0x309 mm/kasan/report.c:412 __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:433 rawv6_send_hdrinc net/ipv6/raw.c:692 [inline] rawv6_sendmsg+0x4421/0x4630 net/ipv6/raw.c:921 inet_sendmsg+0x1a1/0x690 net/ipv4/af_inet.c:798 sock_sendmsg_nosec net/socket.c:621 [inline] sock_sendmsg+0xd5/0x120 net/socket.c:631 ___sys_sendmsg+0x7fd/0x930 net/socket.c:2114 __sys_sendmsg+0x11d/0x280 net/socket.c:2152 __do_sys_sendmsg net/socket.c:2161 [inline] __se_sys_sendmsg net/socket.c:2159 [inline] __x64_sys_sendmsg+0x78/0xb0 net/socket.c:2159 do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x457099 Code: fd b4 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 cb b4 fb ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007f83756edc78 EFLAGS: 00000246 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 00007f83756ee6d4 RCX: 0000000000457099 RDX: 0000000000000000 RSI: 0000000020003840 RDI: 0000000000000004 RBP: 00000000009300a0 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff R13: 00000000004d4b30 R14: 00000000004c90b1 R15: 0000000000000000
Allocated by task 32088: save_stack+0x43/0xd0 mm/kasan/kasan.c:448 set_track mm/kasan/kasan.c:460 [inline] kasan_kmalloc+0xc7/0xe0 mm/kasan/kasan.c:553 kasan_slab_alloc+0x12/0x20 mm/kasan/kasan.c:490 kmem_cache_alloc+0x12e/0x730 mm/slab.c:3554 dst_alloc+0xbb/0x1d0 net/core/dst.c:105 ip6_dst_alloc+0x35/0xa0 net/ipv6/route.c:353 ip6_rt_cache_alloc+0x247/0x7b0 net/ipv6/route.c:1186 ip6_pol_route+0x8f8/0xd90 net/ipv6/route.c:1895 ip6_pol_route_output+0x54/0x70 net/ipv6/route.c:2093 fib6_rule_lookup+0x277/0x860 net/ipv6/fib6_rules.c:122 ip6_route_output_flags+0x2c5/0x350 net/ipv6/route.c:2121 ip6_route_output include/net/ip6_route.h:88 [inline] ip6_dst_lookup_tail+0xe27/0x1d60 net/ipv6/ip6_output.c:951 ip6_dst_lookup_flow+0xc8/0x270 net/ipv6/ip6_output.c:1079 rawv6_sendmsg+0x12d9/0x4630 net/ipv6/raw.c:905 inet_sendmsg+0x1a1/0x690 net/ipv4/af_inet.c:798 sock_sendmsg_nosec net/socket.c:621 [inline] sock_sendmsg+0xd5/0x120 net/socket.c:631 ___sys_sendmsg+0x7fd/0x930 net/socket.c:2114 __sys_sendmsg+0x11d/0x280 net/socket.c:2152 __do_sys_sendmsg net/socket.c:2161 [inline] __se_sys_sendmsg net/socket.c:2159 [inline] __x64_sys_sendmsg+0x78/0xb0 net/socket.c:2159 do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe
Freed by task 5356: save_stack+0x43/0xd0 mm/kasan/kasan.c:448 set_track mm/kasan/kasan.c:460 [inline] __kasan_slab_free+0x102/0x150 mm/kasan/kasan.c:521 kasan_slab_free+0xe/0x10 mm/kasan/kasan.c:528 __cache_free mm/slab.c:3498 [inline] kmem_cache_free+0x83/0x290 mm/slab.c:3756 dst_destroy+0x267/0x3c0 net/core/dst.c:141 dst_destroy_rcu+0x16/0x19 net/core/dst.c:154 __rcu_reclaim kernel/rcu/rcu.h:236 [inline] rcu_do_batch kernel/rcu/tree.c:2576 [inline] invoke_rcu_callbacks kernel/rcu/tree.c:2880 [inline] __rcu_process_callbacks kernel/rcu/tree.c:2847 [inline] rcu_process_callbacks+0xf23/0x2670 kernel/rcu/tree.c:2864 __do_softirq+0x30b/0xad8 kernel/softirq.c:292
Fixes: 1789a640f556 ("raw: avoid two atomics in xmit") Signed-off-by: Wei Wang weiwan@google.com Signed-off-by: Eric Dumazet edumazet@google.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv6/raw.c | 29 ++++++++++++++++++++--------- 1 file changed, 20 insertions(+), 9 deletions(-)
--- a/net/ipv6/raw.c +++ b/net/ipv6/raw.c @@ -650,8 +650,6 @@ static int rawv6_send_hdrinc(struct sock skb->protocol = htons(ETH_P_IPV6); skb->priority = sk->sk_priority; skb->mark = sk->sk_mark; - skb_dst_set(skb, &rt->dst); - *dstp = NULL;
skb_put(skb, length); skb_reset_network_header(skb); @@ -664,8 +662,14 @@ static int rawv6_send_hdrinc(struct sock
skb->transport_header = skb->network_header; err = memcpy_from_msg(iph, msg, length); - if (err) - goto error_fault; + if (err) { + err = -EFAULT; + kfree_skb(skb); + goto error; + } + + skb_dst_set(skb, &rt->dst); + *dstp = NULL;
/* if egress device is enslaved to an L3 master device pass the * skb to its handler for processing @@ -674,21 +678,28 @@ static int rawv6_send_hdrinc(struct sock if (unlikely(!skb)) return 0;
+ /* Acquire rcu_read_lock() in case we need to use rt->rt6i_idev + * in the error path. Since skb has been freed, the dst could + * have been queued for deletion. + */ + rcu_read_lock(); IP6_UPD_PO_STATS(net, rt->rt6i_idev, IPSTATS_MIB_OUT, skb->len); err = NF_HOOK(NFPROTO_IPV6, NF_INET_LOCAL_OUT, net, sk, skb, NULL, rt->dst.dev, dst_output); if (err > 0) err = net_xmit_errno(err); - if (err) - goto error; + if (err) { + IP6_INC_STATS(net, rt->rt6i_idev, IPSTATS_MIB_OUTDISCARDS); + rcu_read_unlock(); + goto error_check; + } + rcu_read_unlock(); out: return 0;
-error_fault: - err = -EFAULT; - kfree_skb(skb); error: IP6_INC_STATS(net, rt->rt6i_idev, IPSTATS_MIB_OUTDISCARDS); +error_check: if (err == -ENOBUFS && !np->recverr) err = 0; return err;
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Florian Fainelli f.fainelli@gmail.com
[ Upstream commit 54baca096386d862d19c10f58f34bf787c6b3cbe ]
There is no reason to open code what the switch setup function does, in fact, because we just issued a switch reset, we would make all the register get their default values, including for instance, having unused port be enabled again and wasting power and leading to an inappropriate switch core clock being selected.
Fixes: 8cfa94984c9c ("net: dsa: bcm_sf2: add suspend/resume callbacks") Signed-off-by: Florian Fainelli f.fainelli@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/dsa/bcm_sf2.c | 8 +------- 1 file changed, 1 insertion(+), 7 deletions(-)
--- a/drivers/net/dsa/bcm_sf2.c +++ b/drivers/net/dsa/bcm_sf2.c @@ -772,7 +772,6 @@ static int bcm_sf2_sw_suspend(struct dsa static int bcm_sf2_sw_resume(struct dsa_switch *ds) { struct bcm_sf2_priv *priv = bcm_sf2_to_priv(ds); - unsigned int port; int ret;
ret = bcm_sf2_sw_rst(priv); @@ -784,12 +783,7 @@ static int bcm_sf2_sw_resume(struct dsa_ if (priv->hw_params.num_gphy == 1) bcm_sf2_gphy_enable_set(ds, true);
- for (port = 0; port < DSA_MAX_PORTS; port++) { - if ((1 << port) & ds->enabled_port_mask) - bcm_sf2_port_setup(ds, port, NULL); - else if (dsa_is_cpu_port(ds, port)) - bcm_sf2_imp_setup(ds, port); - } + ds->ops->setup(ds);
return 0; }
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yunsheng Lin linyunsheng@huawei.com
[ Upstream commit 2e9361efa707e186d91b938e44f9e326725259f7 ]
If SMMU is on, there is more likely that skb_shinfo(skb)->frags[i] can not send by a single BD. when this happen, the hns_nic_net_xmit_hw function map the whole data in a frags using skb_frag_dma_map, but unmap each BD' data individually when tx is done, which causes problem when SMMU is on.
This patch fixes this problem by ummapping the whole data in a frags when tx is done.
Signed-off-by: Yunsheng Lin linyunsheng@huawei.com Signed-off-by: Peng Li lipeng321@huawei.com Reviewed-by: Yisen Zhuang yisen.zhuang@huawei.com Signed-off-by: Salil Mehta salil.mehta@huawei.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/hisilicon/hns/hnae.c | 2 - drivers/net/ethernet/hisilicon/hns/hns_enet.c | 30 ++++++++++++++++---------- 2 files changed, 20 insertions(+), 12 deletions(-)
--- a/drivers/net/ethernet/hisilicon/hns/hnae.c +++ b/drivers/net/ethernet/hisilicon/hns/hnae.c @@ -84,7 +84,7 @@ static void hnae_unmap_buffer(struct hna if (cb->type == DESC_TYPE_SKB) dma_unmap_single(ring_to_dev(ring), cb->dma, cb->length, ring_to_dma_dir(ring)); - else + else if (cb->length) dma_unmap_page(ring_to_dev(ring), cb->dma, cb->length, ring_to_dma_dir(ring)); } --- a/drivers/net/ethernet/hisilicon/hns/hns_enet.c +++ b/drivers/net/ethernet/hisilicon/hns/hns_enet.c @@ -40,9 +40,9 @@ #define SKB_TMP_LEN(SKB) \ (((SKB)->transport_header - (SKB)->mac_header) + tcp_hdrlen(SKB))
-static void fill_v2_desc(struct hnae_ring *ring, void *priv, - int size, dma_addr_t dma, int frag_end, - int buf_num, enum hns_desc_type type, int mtu) +static void fill_v2_desc_hw(struct hnae_ring *ring, void *priv, int size, + int send_sz, dma_addr_t dma, int frag_end, + int buf_num, enum hns_desc_type type, int mtu) { struct hnae_desc *desc = &ring->desc[ring->next_to_use]; struct hnae_desc_cb *desc_cb = &ring->desc_cb[ring->next_to_use]; @@ -64,7 +64,7 @@ static void fill_v2_desc(struct hnae_rin desc_cb->type = type;
desc->addr = cpu_to_le64(dma); - desc->tx.send_size = cpu_to_le16((u16)size); + desc->tx.send_size = cpu_to_le16((u16)send_sz);
/* config bd buffer end */ hnae_set_bit(rrcfv, HNSV2_TXD_VLD_B, 1); @@ -133,6 +133,14 @@ static void fill_v2_desc(struct hnae_rin ring_ptr_move_fw(ring, next_to_use); }
+static void fill_v2_desc(struct hnae_ring *ring, void *priv, + int size, dma_addr_t dma, int frag_end, + int buf_num, enum hns_desc_type type, int mtu) +{ + fill_v2_desc_hw(ring, priv, size, size, dma, frag_end, + buf_num, type, mtu); +} + static const struct acpi_device_id hns_enet_acpi_match[] = { { "HISI00C1", 0 }, { "HISI00C2", 0 }, @@ -289,15 +297,15 @@ static void fill_tso_desc(struct hnae_ri
/* when the frag size is bigger than hardware, split this frag */ for (k = 0; k < frag_buf_num; k++) - fill_v2_desc(ring, priv, - (k == frag_buf_num - 1) ? + fill_v2_desc_hw(ring, priv, k == 0 ? size : 0, + (k == frag_buf_num - 1) ? sizeoflast : BD_MAX_SEND_SIZE, - dma + BD_MAX_SEND_SIZE * k, - frag_end && (k == frag_buf_num - 1) ? 1 : 0, - buf_num, - (type == DESC_TYPE_SKB && !k) ? + dma + BD_MAX_SEND_SIZE * k, + frag_end && (k == frag_buf_num - 1) ? 1 : 0, + buf_num, + (type == DESC_TYPE_SKB && !k) ? DESC_TYPE_SKB : DESC_TYPE_PAGE, - mtu); + mtu); }
netdev_tx_t hns_nic_net_xmit_hw(struct net_device *ndev,
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sabrina Dubroca sd@queasysnail.net
[ Upstream commit af7d6cce53694a88d6a1bb60c9a239a6a5144459 ]
Since commit 5aad1de5ea2c ("ipv4: use separate genid for next hop exceptions"), exceptions get deprecated separately from cached routes. In particular, administrative changes don't clear PMTU anymore.
As Stefano described in commit e9fa1495d738 ("ipv6: Reflect MTU changes on PMTU of exceptions for MTU-less routes"), the PMTU discovered before the local MTU change can become stale: - if the local MTU is now lower than the PMTU, that PMTU is now incorrect - if the local MTU was the lowest value in the path, and is increased, we might discover a higher PMTU
Similarly to what commit e9fa1495d738 did for IPv6, update PMTU in those cases.
If the exception was locked, the discovered PMTU was smaller than the minimal accepted PMTU. In that case, if the new local MTU is smaller than the current PMTU, let PMTU discovery figure out if locking of the exception is still needed.
To do this, we need to know the old link MTU in the NETDEV_CHANGEMTU notifier. By the time the notifier is called, dev->mtu has been changed. This patch adds the old MTU as additional information in the notifier structure, and a new call_netdevice_notifiers_u32() function.
Fixes: 5aad1de5ea2c ("ipv4: use separate genid for next hop exceptions") Signed-off-by: Sabrina Dubroca sd@queasysnail.net Reviewed-by: Stefano Brivio sbrivio@redhat.com Reviewed-by: David Ahern dsahern@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/linux/netdevice.h | 7 ++++++ include/net/ip_fib.h | 1 net/core/dev.c | 28 +++++++++++++++++++++++-- net/ipv4/fib_frontend.c | 12 +++++++---- net/ipv4/fib_semantics.c | 50 ++++++++++++++++++++++++++++++++++++++++++++++ 5 files changed, 92 insertions(+), 6 deletions(-)
--- a/include/linux/netdevice.h +++ b/include/linux/netdevice.h @@ -2307,6 +2307,13 @@ struct netdev_notifier_info { struct net_device *dev; };
+struct netdev_notifier_info_ext { + struct netdev_notifier_info info; /* must be first */ + union { + u32 mtu; + } ext; +}; + struct netdev_notifier_change_info { struct netdev_notifier_info info; /* must be first */ unsigned int flags_changed; --- a/include/net/ip_fib.h +++ b/include/net/ip_fib.h @@ -372,6 +372,7 @@ int ip_fib_check_default(__be32 gw, stru int fib_sync_down_dev(struct net_device *dev, unsigned long event, bool force); int fib_sync_down_addr(struct net_device *dev, __be32 local); int fib_sync_up(struct net_device *dev, unsigned int nh_flags); +void fib_sync_mtu(struct net_device *dev, u32 orig_mtu);
#ifdef CONFIG_IP_ROUTE_MULTIPATH int fib_multipath_hash(const struct fib_info *fi, const struct flowi4 *fl4, --- a/net/core/dev.c +++ b/net/core/dev.c @@ -1688,6 +1688,28 @@ int call_netdevice_notifiers(unsigned lo } EXPORT_SYMBOL(call_netdevice_notifiers);
+/** + * call_netdevice_notifiers_mtu - call all network notifier blocks + * @val: value passed unmodified to notifier function + * @dev: net_device pointer passed unmodified to notifier function + * @arg: additional u32 argument passed to the notifier function + * + * Call all network notifier blocks. Parameters and return value + * are as for raw_notifier_call_chain(). + */ +static int call_netdevice_notifiers_mtu(unsigned long val, + struct net_device *dev, u32 arg) +{ + struct netdev_notifier_info_ext info = { + .info.dev = dev, + .ext.mtu = arg, + }; + + BUILD_BUG_ON(offsetof(struct netdev_notifier_info_ext, info) != 0); + + return call_netdevice_notifiers_info(val, dev, &info.info); +} + #ifdef CONFIG_NET_INGRESS static struct static_key ingress_needed __read_mostly;
@@ -6891,14 +6913,16 @@ int dev_set_mtu(struct net_device *dev, err = __dev_set_mtu(dev, new_mtu);
if (!err) { - err = call_netdevice_notifiers(NETDEV_CHANGEMTU, dev); + err = call_netdevice_notifiers_mtu(NETDEV_CHANGEMTU, dev, + orig_mtu); err = notifier_to_errno(err); if (err) { /* setting mtu back and notifying everyone again, * so that they have a chance to revert changes. */ __dev_set_mtu(dev, orig_mtu); - call_netdevice_notifiers(NETDEV_CHANGEMTU, dev); + call_netdevice_notifiers_mtu(NETDEV_CHANGEMTU, dev, + new_mtu); } } return err; --- a/net/ipv4/fib_frontend.c +++ b/net/ipv4/fib_frontend.c @@ -1185,7 +1185,8 @@ static int fib_inetaddr_event(struct not static int fib_netdev_event(struct notifier_block *this, unsigned long event, void *ptr) { struct net_device *dev = netdev_notifier_info_to_dev(ptr); - struct netdev_notifier_changeupper_info *info; + struct netdev_notifier_changeupper_info *upper_info = ptr; + struct netdev_notifier_info_ext *info_ext = ptr; struct in_device *in_dev; struct net *net = dev_net(dev); unsigned int flags; @@ -1220,16 +1221,19 @@ static int fib_netdev_event(struct notif fib_sync_up(dev, RTNH_F_LINKDOWN); else fib_sync_down_dev(dev, event, false); - /* fall through */ + rt_cache_flush(net); + break; case NETDEV_CHANGEMTU: + fib_sync_mtu(dev, info_ext->ext.mtu); rt_cache_flush(net); break; case NETDEV_CHANGEUPPER: - info = ptr; + upper_info = ptr; /* flush all routes if dev is linked to or unlinked from * an L3 master device (e.g., VRF) */ - if (info->upper_dev && netif_is_l3_master(info->upper_dev)) + if (upper_info->upper_dev && + netif_is_l3_master(upper_info->upper_dev)) fib_disable_ip(dev, NETDEV_DOWN, true); break; } --- a/net/ipv4/fib_semantics.c +++ b/net/ipv4/fib_semantics.c @@ -1520,6 +1520,56 @@ static int call_fib_nh_notifiers(struct return NOTIFY_DONE; }
+/* Update the PMTU of exceptions when: + * - the new MTU of the first hop becomes smaller than the PMTU + * - the old MTU was the same as the PMTU, and it limited discovery of + * larger MTUs on the path. With that limit raised, we can now + * discover larger MTUs + * A special case is locked exceptions, for which the PMTU is smaller + * than the minimal accepted PMTU: + * - if the new MTU is greater than the PMTU, don't make any change + * - otherwise, unlock and set PMTU + */ +static void nh_update_mtu(struct fib_nh *nh, u32 new, u32 orig) +{ + struct fnhe_hash_bucket *bucket; + int i; + + bucket = rcu_dereference_protected(nh->nh_exceptions, 1); + if (!bucket) + return; + + for (i = 0; i < FNHE_HASH_SIZE; i++) { + struct fib_nh_exception *fnhe; + + for (fnhe = rcu_dereference_protected(bucket[i].chain, 1); + fnhe; + fnhe = rcu_dereference_protected(fnhe->fnhe_next, 1)) { + if (fnhe->fnhe_mtu_locked) { + if (new <= fnhe->fnhe_pmtu) { + fnhe->fnhe_pmtu = new; + fnhe->fnhe_mtu_locked = false; + } + } else if (new < fnhe->fnhe_pmtu || + orig == fnhe->fnhe_pmtu) { + fnhe->fnhe_pmtu = new; + } + } + } +} + +void fib_sync_mtu(struct net_device *dev, u32 orig_mtu) +{ + unsigned int hash = fib_devindex_hashfn(dev->ifindex); + struct hlist_head *head = &fib_info_devhash[hash]; + struct fib_nh *nh; + + hlist_for_each_entry(nh, head, nh_hash) { + if (nh->nh_dev == dev) + nh_update_mtu(nh, dev->mtu, orig_mtu); + } +} + /* Event force Flags Description * NETDEV_CHANGE 0 LINKDOWN Carrier OFF, not for scope host * NETDEV_DOWN 0 LINKDOWN|DEAD Link down, not for scope host
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jeff Barnhill 0xeffeff@gmail.com
[ Upstream commit 86f9bd1ff61c413a2a251fa736463295e4e24733 ]
The backend handling for /proc/net/if_inet6 in addrconf.c doesn't properly handle starting/stopping the iteration. The problem is that at some point during the iteration, an overflow is detected and the process is subsequently stopped. The item being shown via seq_printf() when the overflow occurs is not actually shown, though. When start() is subsequently called to resume iterating, it returns the next item, and thus the item that was being processed when the overflow occurred never gets printed.
Alter the meaning of the private data member "offset". Currently, when it is not 0 (which only happens at the very beginning), "offset" represents the next hlist item to be printed. After this change, "offset" always represents the current item.
This is also consistent with the private data member "bucket", which represents the current bucket, and also the use of "pos" as defined in seq_file.txt: The pos passed to start() will always be either zero, or the most recent pos used in the previous session.
Signed-off-by: Jeff Barnhill 0xeffeff@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv6/addrconf.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-)
--- a/net/ipv6/addrconf.c +++ b/net/ipv6/addrconf.c @@ -4136,7 +4136,6 @@ static struct inet6_ifaddr *if6_get_firs p++; continue; } - state->offset++; return ifa; }
@@ -4160,13 +4159,12 @@ static struct inet6_ifaddr *if6_get_next return ifa; }
+ state->offset = 0; while (++state->bucket < IN6_ADDR_HSIZE) { - state->offset = 0; hlist_for_each_entry_rcu_bh(ifa, &inet6_addr_lst[state->bucket], addr_lst) { if (!net_eq(dev_net(ifa->idev->dev), net)) continue; - state->offset++; return ifa; } }
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sean Tranchetti stranche@codeaurora.org
[ Upstream commit f88b4c01b97e09535505cf3c327fdbce55c27f00 ]
netlbl_unlabel_addrinfo_get() assumes that if it finds the NLBL_UNLABEL_A_IPV4ADDR attribute, it must also have the NLBL_UNLABEL_A_IPV4MASK attribute as well. However, this is not necessarily the case as the current checks in netlbl_unlabel_staticadd() and friends are not sufficent to enforce this.
If passed a netlink message with NLBL_UNLABEL_A_IPV4ADDR, NLBL_UNLABEL_A_IPV6ADDR, and NLBL_UNLABEL_A_IPV6MASK attributes, these functions will all call netlbl_unlabel_addrinfo_get() which will then attempt dereference NULL when fetching the non-existent NLBL_UNLABEL_A_IPV4MASK attribute:
Unable to handle kernel NULL pointer dereference at virtual address 0 Process unlab (pid: 31762, stack limit = 0xffffff80502d8000) Call trace: netlbl_unlabel_addrinfo_get+0x44/0xd8 netlbl_unlabel_staticremovedef+0x98/0xe0 genl_rcv_msg+0x354/0x388 netlink_rcv_skb+0xac/0x118 genl_rcv+0x34/0x48 netlink_unicast+0x158/0x1f0 netlink_sendmsg+0x32c/0x338 sock_sendmsg+0x44/0x60 ___sys_sendmsg+0x1d0/0x2a8 __sys_sendmsg+0x64/0xb4 SyS_sendmsg+0x34/0x4c el0_svc_naked+0x34/0x38 Code: 51001149 7100113f 540000a0 f9401508 (79400108) ---[ end trace f6438a488e737143 ]--- Kernel panic - not syncing: Fatal exception
Signed-off-by: Sean Tranchetti stranche@codeaurora.org
Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/netlabel/netlabel_unlabeled.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/net/netlabel/netlabel_unlabeled.c +++ b/net/netlabel/netlabel_unlabeled.c @@ -781,7 +781,8 @@ static int netlbl_unlabel_addrinfo_get(s { u32 addr_len;
- if (info->attrs[NLBL_UNLABEL_A_IPV4ADDR]) { + if (info->attrs[NLBL_UNLABEL_A_IPV4ADDR] && + info->attrs[NLBL_UNLABEL_A_IPV4MASK]) { addr_len = nla_len(info->attrs[NLBL_UNLABEL_A_IPV4ADDR]); if (addr_len != sizeof(struct in_addr) && addr_len != nla_len(info->attrs[NLBL_UNLABEL_A_IPV4MASK]))
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Maxime Chevallier maxime.chevallier@bootlin.com
[ Upstream commit 35f3625c21852ad839f20c91c7d81c4c1101e207 ]
When offloading the L3 and L4 csum computation on TX, we need to extract the l3_proto from the ethtype, independently of the presence of a vlan tag.
The actual driver uses skb->protocol as-is, resulting in packets with the wrong L4 checksum being sent when there's a vlan tag in the packet header and checksum offloading is enabled.
This commit makes use of vlan_protocol_get() to get the correct ethtype regardless the presence of a vlan tag.
Fixes: 3f518509dedc ("ethernet: Add new driver for Marvell Armada 375 network unit") Signed-off-by: Maxime Chevallier maxime.chevallier@bootlin.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/marvell/mvpp2.c | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-)
--- a/drivers/net/ethernet/marvell/mvpp2.c +++ b/drivers/net/ethernet/marvell/mvpp2.c @@ -33,6 +33,7 @@ #include <linux/hrtimer.h> #include <linux/ktime.h> #include <linux/regmap.h> +#include <linux/if_vlan.h> #include <uapi/linux/ppp_defs.h> #include <net/ip.h> #include <net/ipv6.h> @@ -5101,7 +5102,7 @@ static void mvpp2_txq_desc_put(struct mv }
/* Set Tx descriptors fields relevant for CSUM calculation */ -static u32 mvpp2_txq_desc_csum(int l3_offs, int l3_proto, +static u32 mvpp2_txq_desc_csum(int l3_offs, __be16 l3_proto, int ip_hdr_len, int l4_proto) { u32 command; @@ -6065,14 +6066,15 @@ static u32 mvpp2_skb_tx_csum(struct mvpp if (skb->ip_summed == CHECKSUM_PARTIAL) { int ip_hdr_len = 0; u8 l4_proto; + __be16 l3_proto = vlan_get_protocol(skb);
- if (skb->protocol == htons(ETH_P_IP)) { + if (l3_proto == htons(ETH_P_IP)) { struct iphdr *ip4h = ip_hdr(skb);
/* Calculate IPv4 checksum and L4 checksum */ ip_hdr_len = ip4h->ihl; l4_proto = ip4h->protocol; - } else if (skb->protocol == htons(ETH_P_IPV6)) { + } else if (l3_proto == htons(ETH_P_IPV6)) { struct ipv6hdr *ip6h = ipv6_hdr(skb);
/* Read l4_protocol from one of IPv6 extra headers */ @@ -6084,7 +6086,7 @@ static u32 mvpp2_skb_tx_csum(struct mvpp }
return mvpp2_txq_desc_csum(skb_network_offset(skb), - skb->protocol, ip_hdr_len, l4_proto); + l3_proto, ip_hdr_len, l4_proto); }
return MVPP2_TXD_L4_CSUM_NOT | MVPP2_TXD_IP_CSUM_DISABLE;
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Antoine Tenart antoine.tenart@bootlin.com
[ Upstream commit 774268f3e51b53ed432a1ec516574fd5ba469398 ]
When no Tx IRQ is available, the txq_done() routine (called from tx_done()) shouldn't be called from the polling function, as in such case it is already called in the Tx path thanks to an hrtimer. This mostly occurred when using PPv2.1, as the engine then do not have Tx IRQs.
Fixes: edc660fa09e2 ("net: mvpp2: replace TX coalescing interrupts with hrtimer") Reported-by: Stefan Chulski stefanc@marvell.com Signed-off-by: Antoine Tenart antoine.tenart@bootlin.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/marvell/mvpp2.c | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-)
--- a/drivers/net/ethernet/marvell/mvpp2.c +++ b/drivers/net/ethernet/marvell/mvpp2.c @@ -6534,10 +6534,12 @@ static int mvpp2_poll(struct napi_struct cause_rx_tx & ~MVPP2_CAUSE_MISC_SUM_MASK); }
- cause_tx = cause_rx_tx & MVPP2_CAUSE_TXQ_OCCUP_DESC_ALL_MASK; - if (cause_tx) { - cause_tx >>= MVPP2_CAUSE_TXQ_OCCUP_DESC_ALL_OFFSET; - mvpp2_tx_done(port, cause_tx, qv->sw_thread_id); + if (port->has_tx_irqs) { + cause_tx = cause_rx_tx & MVPP2_CAUSE_TXQ_OCCUP_DESC_ALL_MASK; + if (cause_tx) { + cause_tx >>= MVPP2_CAUSE_TXQ_OCCUP_DESC_ALL_OFFSET; + mvpp2_tx_done(port, cause_tx, qv->sw_thread_id); + } }
/* Process RX packets */
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: David Ahern dsahern@gmail.com
[ Upstream commit 8b4c3cdd9dd8290343ce959a132d3b334062c5b9 ]
A number of TC attributes are processed without proper validation (e.g., length checks). Add a tca policy for all input attributes and use when invoking nlmsg_parse.
The 2 Fixes tags below cover the latest additions. The other attributes are a string (KIND), nested attribute (OPTIONS which does seem to have validation in most cases), for dumps only or a flag.
Fixes: 5bc1701881e39 ("net: sched: introduce multichain support for filters") Fixes: d47a6b0e7c492 ("net: sched: introduce ingress/egress block index attributes for qdisc") Signed-off-by: David Ahern dsahern@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/sched/sch_api.c | 22 ++++++++++++++++++---- 1 file changed, 18 insertions(+), 4 deletions(-)
--- a/net/sched/sch_api.c +++ b/net/sched/sch_api.c @@ -1216,6 +1216,16 @@ check_loop_fn(struct Qdisc *q, unsigned * Delete/get qdisc. */
+const struct nla_policy rtm_tca_policy[TCA_MAX + 1] = { + [TCA_KIND] = { .type = NLA_STRING }, + [TCA_OPTIONS] = { .type = NLA_NESTED }, + [TCA_RATE] = { .type = NLA_BINARY, + .len = sizeof(struct tc_estimator) }, + [TCA_STAB] = { .type = NLA_NESTED }, + [TCA_DUMP_INVISIBLE] = { .type = NLA_FLAG }, + [TCA_CHAIN] = { .type = NLA_U32 }, +}; + static int tc_get_qdisc(struct sk_buff *skb, struct nlmsghdr *n, struct netlink_ext_ack *extack) { @@ -1232,7 +1242,8 @@ static int tc_get_qdisc(struct sk_buff * !netlink_ns_capable(skb, net->user_ns, CAP_NET_ADMIN)) return -EPERM;
- err = nlmsg_parse(n, sizeof(*tcm), tca, TCA_MAX, NULL, extack); + err = nlmsg_parse(n, sizeof(*tcm), tca, TCA_MAX, rtm_tca_policy, + extack); if (err < 0) return err;
@@ -1302,7 +1313,8 @@ static int tc_modify_qdisc(struct sk_buf
replay: /* Reinit, just in case something touches this. */ - err = nlmsg_parse(n, sizeof(*tcm), tca, TCA_MAX, NULL, extack); + err = nlmsg_parse(n, sizeof(*tcm), tca, TCA_MAX, rtm_tca_policy, + extack); if (err < 0) return err;
@@ -1512,7 +1524,8 @@ static int tc_dump_qdisc(struct sk_buff idx = 0; ASSERT_RTNL();
- err = nlmsg_parse(nlh, sizeof(*tcm), tca, TCA_MAX, NULL, NULL); + err = nlmsg_parse(nlh, sizeof(*tcm), tca, TCA_MAX, + rtm_tca_policy, NULL); if (err < 0) return err;
@@ -1729,7 +1742,8 @@ static int tc_ctl_tclass(struct sk_buff !netlink_ns_capable(skb, net->user_ns, CAP_NET_ADMIN)) return -EPERM;
- err = nlmsg_parse(n, sizeof(*tcm), tca, TCA_MAX, NULL, extack); + err = nlmsg_parse(n, sizeof(*tcm), tca, TCA_MAX, rtm_tca_policy, + extack); if (err < 0) return err;
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Florian Fainelli f.fainelli@gmail.com
[ Upstream commit 45ec318578c0c22a11f5b9927d064418e1ab1905 ]
The AON_PM_L2 is normally used to trigger and identify the source of a wake-up event. Since the RX_SYS clock is no longer turned off, we also have an interrupt being sent to the SYSTEMPORT INTRL_2_0 controller, and that interrupt remains active up until the magic packet detector is disabled which happens much later during the driver resumption.
The race happens if we have a CPU that is entering the SYSTEMPORT INTRL2_0 handler during resume, and another CPU has managed to clear the wake-up interrupt during bcm_sysport_resume_from_wol(). In that case, we have the first CPU stuck in the interrupt handler with an interrupt cause that has been cleared under its feet, and so we keep returning IRQ_NONE and we never make any progress.
This was not a problem before because we would always turn off the RX_SYS clock during WoL, so the SYSTEMPORT INTRL2_0 would also be turned off as well, thus not latching the interrupt.
The fix is to make sure we do not enable either the MPD or BRCM_TAG_MATCH interrupts since those are redundant with what the AON_PM_L2 interrupt controller already processes and they would cause such a race to occur.
Fixes: bb9051a2b230 ("net: systemport: Add support for WAKE_FILTER") Fixes: 83e82f4c706b ("net: systemport: add Wake-on-LAN support") Signed-off-by: Florian Fainelli f.fainelli@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/broadcom/bcmsysport.c | 22 +++++++++++----------- 1 file changed, 11 insertions(+), 11 deletions(-)
--- a/drivers/net/ethernet/broadcom/bcmsysport.c +++ b/drivers/net/ethernet/broadcom/bcmsysport.c @@ -1001,14 +1001,22 @@ static void bcm_sysport_resume_from_wol( { u32 reg;
- /* Stop monitoring MPD interrupt */ - intrl2_0_mask_set(priv, INTRL2_0_MPD); - /* Clear the MagicPacket detection logic */ reg = umac_readl(priv, UMAC_MPD_CTRL); reg &= ~MPD_EN; umac_writel(priv, reg, UMAC_MPD_CTRL);
+ reg = intrl2_0_readl(priv, INTRL2_CPU_STATUS); + if (reg & INTRL2_0_MPD) + netdev_info(priv->netdev, "Wake-on-LAN (MPD) interrupt!\n"); + + if (reg & INTRL2_0_BRCM_MATCH_TAG) { + reg = rxchk_readl(priv, RXCHK_BRCM_TAG_MATCH_STATUS) & + RXCHK_BRCM_TAG_MATCH_MASK; + netdev_info(priv->netdev, + "Wake-on-LAN (filters 0x%02x) interrupt!\n", reg); + } + netif_dbg(priv, wol, priv->netdev, "resumed from WOL\n"); }
@@ -1043,11 +1051,6 @@ static irqreturn_t bcm_sysport_rx_isr(in if (priv->irq0_stat & INTRL2_0_TX_RING_FULL) bcm_sysport_tx_reclaim_all(priv);
- if (priv->irq0_stat & INTRL2_0_MPD) { - netdev_info(priv->netdev, "Wake-on-LAN interrupt!\n"); - bcm_sysport_resume_from_wol(priv); - } - if (!priv->is_lite) goto out;
@@ -2248,9 +2251,6 @@ static int bcm_sysport_suspend_to_wol(st /* UniMAC receive needs to be turned on */ umac_enable_set(priv, CMD_RX_EN, 1);
- /* Enable the interrupt wake-up source */ - intrl2_0_mask_clear(priv, INTRL2_0_MPD); - netif_dbg(priv, wol, ndev, "entered WOL mode\n");
return 0;
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yu Zhao yuzhao@google.com
[ Upstream commit f7b2a56e1f3dcbdb4cf09b2b63e859ffe0e09df8 ]
Cancel pending work before freeing smsc75xx private data structure during binding. This fixes the following crash in the driver:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000050 IP: mutex_lock+0x2b/0x3f <snipped> Workqueue: events smsc75xx_deferred_multicast_write [smsc75xx] task: ffff8caa83e85700 task.stack: ffff948b80518000 RIP: 0010:mutex_lock+0x2b/0x3f <snipped> Call Trace: smsc75xx_deferred_multicast_write+0x40/0x1af [smsc75xx] process_one_work+0x18d/0x2fc worker_thread+0x1a2/0x269 ? pr_cont_work+0x58/0x58 kthread+0xfa/0x10a ? pr_cont_work+0x58/0x58 ? rcu_read_unlock_sched_notrace+0x48/0x48 ret_from_fork+0x22/0x40
Signed-off-by: Yu Zhao yuzhao@google.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/usb/smsc75xx.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/net/usb/smsc75xx.c +++ b/drivers/net/usb/smsc75xx.c @@ -1517,6 +1517,7 @@ static void smsc75xx_unbind(struct usbne { struct smsc75xx_priv *pdata = (struct smsc75xx_priv *)(dev->data[0]); if (pdata) { + cancel_work_sync(&pdata->set_multicast); netif_dbg(dev, ifdown, dev->net, "free pdata\n"); kfree(pdata); pdata = NULL;
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Shahed Shaikh shahed.shaikh@cavium.com
[ Upstream commit c333fa0c4f220f8f7ea5acd6b0ebf3bf13fd684d ]
In regular NIC transmission flow, driver always configures MAC using Tx queue zero descriptor as a part of MAC learning flow. But with multi Tx queue supported NIC, regular transmission can occur on any non-zero Tx queue and from that context it uses Tx queue zero descriptor to configure MAC, at the same time TX queue zero could be used by another CPU for regular transmission which could lead to Tx queue zero descriptor corruption and cause FW abort.
This patch fixes this in such a way that driver always configures learned MAC address from the same Tx queue which is used for regular transmission.
Fixes: 7e2cf4feba05 ("qlcnic: change driver hardware interface mechanism") Signed-off-by: Shahed Shaikh shahed.shaikh@cavium.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/qlogic/qlcnic/qlcnic.h | 8 +++++--- drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.c | 3 ++- drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.h | 3 ++- drivers/net/ethernet/qlogic/qlcnic/qlcnic_hw.h | 3 ++- drivers/net/ethernet/qlogic/qlcnic/qlcnic_io.c | 12 ++++++------ 5 files changed, 17 insertions(+), 12 deletions(-)
--- a/drivers/net/ethernet/qlogic/qlcnic/qlcnic.h +++ b/drivers/net/ethernet/qlogic/qlcnic/qlcnic.h @@ -1800,7 +1800,8 @@ struct qlcnic_hardware_ops { int (*config_loopback) (struct qlcnic_adapter *, u8); int (*clear_loopback) (struct qlcnic_adapter *, u8); int (*config_promisc_mode) (struct qlcnic_adapter *, u32); - void (*change_l2_filter) (struct qlcnic_adapter *, u64 *, u16); + void (*change_l2_filter)(struct qlcnic_adapter *adapter, u64 *addr, + u16 vlan, struct qlcnic_host_tx_ring *tx_ring); int (*get_board_info) (struct qlcnic_adapter *); void (*set_mac_filter_count) (struct qlcnic_adapter *); void (*free_mac_list) (struct qlcnic_adapter *); @@ -2064,9 +2065,10 @@ static inline int qlcnic_nic_set_promisc }
static inline void qlcnic_change_filter(struct qlcnic_adapter *adapter, - u64 *addr, u16 id) + u64 *addr, u16 vlan, + struct qlcnic_host_tx_ring *tx_ring) { - adapter->ahw->hw_ops->change_l2_filter(adapter, addr, id); + adapter->ahw->hw_ops->change_l2_filter(adapter, addr, vlan, tx_ring); }
static inline int qlcnic_get_board_info(struct qlcnic_adapter *adapter) --- a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.c +++ b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.c @@ -2134,7 +2134,8 @@ out: }
void qlcnic_83xx_change_l2_filter(struct qlcnic_adapter *adapter, u64 *addr, - u16 vlan_id) + u16 vlan_id, + struct qlcnic_host_tx_ring *tx_ring) { u8 mac[ETH_ALEN]; memcpy(&mac, addr, ETH_ALEN); --- a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.h +++ b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.h @@ -550,7 +550,8 @@ int qlcnic_83xx_wrt_reg_indirect(struct int qlcnic_83xx_nic_set_promisc(struct qlcnic_adapter *, u32); int qlcnic_83xx_config_hw_lro(struct qlcnic_adapter *, int); int qlcnic_83xx_config_rss(struct qlcnic_adapter *, int); -void qlcnic_83xx_change_l2_filter(struct qlcnic_adapter *, u64 *, u16); +void qlcnic_83xx_change_l2_filter(struct qlcnic_adapter *adapter, u64 *addr, + u16 vlan, struct qlcnic_host_tx_ring *ring); int qlcnic_83xx_get_pci_info(struct qlcnic_adapter *, struct qlcnic_pci_info *); int qlcnic_83xx_set_nic_info(struct qlcnic_adapter *, struct qlcnic_info *); void qlcnic_83xx_initialize_nic(struct qlcnic_adapter *, int); --- a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_hw.h +++ b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_hw.h @@ -173,7 +173,8 @@ int qlcnic_82xx_napi_add(struct qlcnic_a struct net_device *netdev); void qlcnic_82xx_get_beacon_state(struct qlcnic_adapter *); void qlcnic_82xx_change_filter(struct qlcnic_adapter *adapter, - u64 *uaddr, u16 vlan_id); + u64 *uaddr, u16 vlan_id, + struct qlcnic_host_tx_ring *tx_ring); int qlcnic_82xx_config_intr_coalesce(struct qlcnic_adapter *, struct ethtool_coalesce *); int qlcnic_82xx_set_rx_coalesce(struct qlcnic_adapter *); --- a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_io.c +++ b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_io.c @@ -268,13 +268,12 @@ static void qlcnic_add_lb_filter(struct }
void qlcnic_82xx_change_filter(struct qlcnic_adapter *adapter, u64 *uaddr, - u16 vlan_id) + u16 vlan_id, struct qlcnic_host_tx_ring *tx_ring) { struct cmd_desc_type0 *hwdesc; struct qlcnic_nic_req *req; struct qlcnic_mac_req *mac_req; struct qlcnic_vlan_req *vlan_req; - struct qlcnic_host_tx_ring *tx_ring = adapter->tx_ring; u32 producer; u64 word;
@@ -301,7 +300,8 @@ void qlcnic_82xx_change_filter(struct ql
static void qlcnic_send_filter(struct qlcnic_adapter *adapter, struct cmd_desc_type0 *first_desc, - struct sk_buff *skb) + struct sk_buff *skb, + struct qlcnic_host_tx_ring *tx_ring) { struct vlan_ethhdr *vh = (struct vlan_ethhdr *)(skb->data); struct ethhdr *phdr = (struct ethhdr *)(skb->data); @@ -335,7 +335,7 @@ static void qlcnic_send_filter(struct ql tmp_fil->vlan_id == vlan_id) { if (jiffies > (QLCNIC_READD_AGE * HZ + tmp_fil->ftime)) qlcnic_change_filter(adapter, &src_addr, - vlan_id); + vlan_id, tx_ring); tmp_fil->ftime = jiffies; return; } @@ -350,7 +350,7 @@ static void qlcnic_send_filter(struct ql if (!fil) return;
- qlcnic_change_filter(adapter, &src_addr, vlan_id); + qlcnic_change_filter(adapter, &src_addr, vlan_id, tx_ring); fil->ftime = jiffies; fil->vlan_id = vlan_id; memcpy(fil->faddr, &src_addr, ETH_ALEN); @@ -766,7 +766,7 @@ netdev_tx_t qlcnic_xmit_frame(struct sk_ }
if (adapter->drv_mac_learn) - qlcnic_send_filter(adapter, first_desc, skb); + qlcnic_send_filter(adapter, first_desc, skb, tx_ring);
tx_ring->tx_stats.tx_bytes += skb->len; tx_ring->tx_stats.xmit_called++;
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Giacinto Cifelli gciofono@gmail.com
[ Upstream commit 4f7617705bfff84d756fe4401a1f4f032f374984 ]
Added support for Gemalto's Cinterion ALASxx WWAN interfaces by adding QMI_FIXED_INTF with Cinterion's VID and PID.
Signed-off-by: Giacinto Cifelli gciofono@gmail.com Acked-by: Bjørn Mork bjorn@mork.no Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/usb/qmi_wwan.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/net/usb/qmi_wwan.c +++ b/drivers/net/usb/qmi_wwan.c @@ -1233,6 +1233,7 @@ static const struct usb_device_id produc {QMI_FIXED_INTF(0x0b3c, 0xc00b, 4)}, /* Olivetti Olicard 500 */ {QMI_FIXED_INTF(0x1e2d, 0x0060, 4)}, /* Cinterion PLxx */ {QMI_FIXED_INTF(0x1e2d, 0x0053, 4)}, /* Cinterion PHxx,PXxx */ + {QMI_FIXED_INTF(0x1e2d, 0x0063, 10)}, /* Cinterion ALASxx (1 RmNet) */ {QMI_FIXED_INTF(0x1e2d, 0x0082, 4)}, /* Cinterion PHxx,PXxx (2 RmNet) */ {QMI_FIXED_INTF(0x1e2d, 0x0082, 5)}, /* Cinterion PHxx,PXxx (2 RmNet) */ {QMI_FIXED_INTF(0x1e2d, 0x0083, 4)}, /* Cinterion PHxx,PXxx (1 RmNet + USB Audio)*/
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mauricio Faria de Oliveira mfo@canonical.com
[ Upstream commit bd961c9bc66497f0c63f4ba1d02900bb85078366 ]
Currently, rtnl_fdb_dump() assumes the family header is 'struct ifinfomsg', which is not always true -- 'struct ndmsg' is used by iproute2 ('ip neigh').
The problem is, the function bails out early if nlmsg_parse() fails, which does occur for iproute2 usage of 'struct ndmsg' because the payload length is shorter than the family header alone (as 'struct ifinfomsg' is assumed).
This breaks backward compatibility with userspace -- nothing is sent back.
Some examples with iproute2 and netlink library for go [1]:
1) $ bridge fdb show 33:33:00:00:00:01 dev ens3 self permanent 01:00:5e:00:00:01 dev ens3 self permanent 33:33:ff:15:98:30 dev ens3 self permanent
This one works, as it uses 'struct ifinfomsg'.
fdb_show() @ iproute2/bridge/fdb.c """ .n.nlmsg_len = NLMSG_LENGTH(sizeof(struct ifinfomsg)), ... if (rtnl_dump_request(&rth, RTM_GETNEIGH, [...] """
2) $ ip --family bridge neigh RTNETLINK answers: Invalid argument Dump terminated
This one fails, as it uses 'struct ndmsg'.
do_show_or_flush() @ iproute2/ip/ipneigh.c """ .n.nlmsg_type = RTM_GETNEIGH, .n.nlmsg_len = NLMSG_LENGTH(sizeof(struct ndmsg)), """
3) $ ./neighlist < no output >
This one fails, as it uses 'struct ndmsg'-based.
neighList() @ netlink/neigh_linux.go """ req := h.newNetlinkRequest(unix.RTM_GETNEIGH, [...] msg := Ndmsg{ """
The actual breakage was introduced by commit 0ff50e83b512 ("net: rtnetlink: bail out from rtnl_fdb_dump() on parse error"), because nlmsg_parse() fails if the payload length (with the _actual_ family header) is less than the family header length alone (which is assumed, in parameter 'hdrlen'). This is true in the examples above with struct ndmsg, with size and payload length shorter than struct ifinfomsg.
However, that commit just intends to fix something under the assumption the family header is indeed an 'struct ifinfomsg' - by preventing access to the payload as such (via 'ifm' pointer) if the payload length is not sufficient to actually contain it.
The assumption was introduced by commit 5e6d24358799 ("bridge: netlink dump interface at par with brctl"), to support iproute2's 'bridge fdb' command (not 'ip neigh') which indeed uses 'struct ifinfomsg', thus is not broken.
So, in order to unbreak the 'struct ndmsg' family headers and still allow 'struct ifinfomsg' to continue to work, check for the known message sizes used with 'struct ndmsg' in iproute2 (with zero or one attribute which is not used in this function anyway) then do not parse the data as ifinfomsg.
Same examples with this patch applied (or revert/before the original fix):
$ bridge fdb show 33:33:00:00:00:01 dev ens3 self permanent 01:00:5e:00:00:01 dev ens3 self permanent 33:33:ff:15:98:30 dev ens3 self permanent
$ ip --family bridge neigh dev ens3 lladdr 33:33:00:00:00:01 PERMANENT dev ens3 lladdr 01:00:5e:00:00:01 PERMANENT dev ens3 lladdr 33:33:ff:15:98:30 PERMANENT
$ ./neighlist netlink.Neigh{LinkIndex:2, Family:7, State:128, Type:0, Flags:2, IP:net.IP(nil), HardwareAddr:net.HardwareAddr{0x33, 0x33, 0x0, 0x0, 0x0, 0x1}, LLIPAddr:net.IP(nil), Vlan:0, VNI:0} netlink.Neigh{LinkIndex:2, Family:7, State:128, Type:0, Flags:2, IP:net.IP(nil), HardwareAddr:net.HardwareAddr{0x1, 0x0, 0x5e, 0x0, 0x0, 0x1}, LLIPAddr:net.IP(nil), Vlan:0, VNI:0} netlink.Neigh{LinkIndex:2, Family:7, State:128, Type:0, Flags:2, IP:net.IP(nil), HardwareAddr:net.HardwareAddr{0x33, 0x33, 0xff, 0x15, 0x98, 0x30}, LLIPAddr:net.IP(nil), Vlan:0, VNI:0}
Tested on mainline (v4.19-rc6) and net-next (3bd09b05b068).
References:
[1] netlink library for go (test-case) https://github.com/vishvananda/netlink
$ cat ~/go/src/neighlist/main.go package main import ("fmt"; "syscall"; "github.com/vishvananda/netlink") func main() { neighs, _ := netlink.NeighList(0, syscall.AF_BRIDGE) for _, neigh := range neighs { fmt.Printf("%#v\n", neigh) } }
$ export GOPATH=~/go $ go get github.com/vishvananda/netlink $ go build neighlist $ ~/go/src/neighlist/neighlist
Thanks to David Ahern for suggestions to improve this patch.
Fixes: 0ff50e83b512 ("net: rtnetlink: bail out from rtnl_fdb_dump() on parse error") Fixes: 5e6d24358799 ("bridge: netlink dump interface at par with brctl") Reported-by: Aidan Obley aobley@pivotal.io Signed-off-by: Mauricio Faria de Oliveira mfo@canonical.com Reviewed-by: David Ahern dsahern@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/core/rtnetlink.c | 29 ++++++++++++++++++++--------- 1 file changed, 20 insertions(+), 9 deletions(-)
--- a/net/core/rtnetlink.c +++ b/net/core/rtnetlink.c @@ -3292,16 +3292,27 @@ static int rtnl_fdb_dump(struct sk_buff int err = 0; int fidx = 0;
- err = nlmsg_parse(cb->nlh, sizeof(struct ifinfomsg), tb, - IFLA_MAX, ifla_policy, NULL); - if (err < 0) { - return -EINVAL; - } else if (err == 0) { - if (tb[IFLA_MASTER]) - br_idx = nla_get_u32(tb[IFLA_MASTER]); - } + /* A hack to preserve kernel<->userspace interface. + * Before Linux v4.12 this code accepted ndmsg since iproute2 v3.3.0. + * However, ndmsg is shorter than ifinfomsg thus nlmsg_parse() bails. + * So, check for ndmsg with an optional u32 attribute (not used here). + * Fortunately these sizes don't conflict with the size of ifinfomsg + * with an optional attribute. + */ + if (nlmsg_len(cb->nlh) != sizeof(struct ndmsg) && + (nlmsg_len(cb->nlh) != sizeof(struct ndmsg) + + nla_attr_size(sizeof(u32)))) { + err = nlmsg_parse(cb->nlh, sizeof(struct ifinfomsg), tb, + IFLA_MAX, ifla_policy, NULL); + if (err < 0) { + return -EINVAL; + } else if (err == 0) { + if (tb[IFLA_MASTER]) + br_idx = nla_get_u32(tb[IFLA_MASTER]); + }
- brport_idx = ifm->ifi_index; + brport_idx = ifm->ifi_index; + }
if (br_idx) { br_dev = __dev_get_by_index(net, br_idx);
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Dumazet edumazet@google.com
[ Upstream commit 0e1d6eca5113858ed2caea61a5adc03c595f6096 ]
We have an impressive number of syzkaller bugs that are linked to the fact that syzbot was able to create a networking device with millions of TX (or RX) queues.
Let's limit the number of RX/TX queues to 4096, this really should cover all known cases.
A separate patch will add various cond_resched() in the loops handling sysfs entries at device creation and dismantle.
Tested:
lpaa6:~# ip link add gre-4097 numtxqueues 4097 numrxqueues 4097 type ip6gretap RTNETLINK answers: Invalid argument
lpaa6:~# time ip link add gre-4096 numtxqueues 4096 numrxqueues 4096 type ip6gretap
real 0m0.180s user 0m0.000s sys 0m0.107s
Fixes: 76ff5cc91935 ("rtnl: allow to specify number of rx and tx queues on device creation") Signed-off-by: Eric Dumazet edumazet@google.com Reported-by: syzbot syzkaller@googlegroups.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/core/rtnetlink.c | 6 ++++++ 1 file changed, 6 insertions(+)
--- a/net/core/rtnetlink.c +++ b/net/core/rtnetlink.c @@ -2430,6 +2430,12 @@ struct net_device *rtnl_create_link(stru else if (ops->get_num_rx_queues) num_rx_queues = ops->get_num_rx_queues();
+ if (num_tx_queues < 1 || num_tx_queues > 4096) + return ERR_PTR(-EINVAL); + + if (num_rx_queues < 1 || num_rx_queues > 4096) + return ERR_PTR(-EINVAL); + dev = alloc_netdev_mqs(ops->priv_size, ifname, name_assign_type, ops->setup, num_tx_queues, num_rx_queues); if (!dev)
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Xin Long lucien.xin@gmail.com
[ Upstream commit d7ab5cdce54da631f0c8c11e506c974536a3581e ]
When processing pmtu update from an icmp packet, it calls .update_pmtu with sk instead of skb in sctp_transport_update_pmtu.
However for sctp, the daddr in the transport might be different from inet_sock->inet_daddr or sk->sk_v6_daddr, which is used to update or create the route cache. The incorrect daddr will cause a different route cache created for the path.
So before calling .update_pmtu, inet_sock->inet_daddr/sk->sk_v6_daddr should be updated with the daddr in the transport, and update it back after it's done.
The issue has existed since route exceptions introduction.
Fixes: 4895c771c7f0 ("ipv4: Add FIB nexthop exceptions.") Reported-by: ian.periam@dialogic.com Signed-off-by: Xin Long lucien.xin@gmail.com Acked-by: Marcelo Ricardo Leitner marcelo.leitner@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/sctp/transport.c | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-)
--- a/net/sctp/transport.c +++ b/net/sctp/transport.c @@ -254,6 +254,7 @@ void sctp_transport_pmtu(struct sctp_tra bool sctp_transport_update_pmtu(struct sctp_transport *t, u32 pmtu) { struct dst_entry *dst = sctp_transport_dst_check(t); + struct sock *sk = t->asoc->base.sk; bool change = true;
if (unlikely(pmtu < SCTP_DEFAULT_MINSEGMENT)) { @@ -265,12 +266,19 @@ bool sctp_transport_update_pmtu(struct s pmtu = SCTP_TRUNC4(pmtu);
if (dst) { - dst->ops->update_pmtu(dst, t->asoc->base.sk, NULL, pmtu); + struct sctp_pf *pf = sctp_get_pf_specific(dst->ops->family); + union sctp_addr addr; + + pf->af->from_sk(&addr, sk); + pf->to_sk_daddr(&t->ipaddr, sk); + dst->ops->update_pmtu(dst, sk, NULL, pmtu); + pf->to_sk_daddr(&addr, sk); + dst = sctp_transport_dst_check(t); }
if (!dst) { - t->af_specific->get_dst(t, &t->saddr, &t->fl, t->asoc->base.sk); + t->af_specific->get_dst(t, &t->saddr, &t->fl, sk); dst = t->dst; }
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ido Schimmel idosch@mellanox.com
[ Upstream commit 471b83bd8bbe4e89743683ef8ecb78f7029d8288 ]
team's ndo_add_slave() acquires 'team->lock' and later tries to open the newly enslaved device via dev_open(). This emits a 'NETDEV_UP' event that causes the VLAN driver to add VLAN 0 on the team device. team's ndo_vlan_rx_add_vid() will also try to acquire 'team->lock' and deadlock.
Fix this by checking early at the enslavement function that a team device is not being enslaved to itself.
A similar check was added to the bond driver in commit 09a89c219baf ("bonding: disallow enslaving a bond to itself").
WARNING: possible recursive locking detected 4.18.0-rc7+ #176 Not tainted -------------------------------------------- syz-executor4/6391 is trying to acquire lock: (____ptrval____) (&team->lock){+.+.}, at: team_vlan_rx_add_vid+0x3b/0x1e0 drivers/net/team/team.c:1868
but task is already holding lock: (____ptrval____) (&team->lock){+.+.}, at: team_add_slave+0xdb/0x1c30 drivers/net/team/team.c:1947
other info that might help us debug this: Possible unsafe locking scenario:
CPU0 ---- lock(&team->lock); lock(&team->lock);
*** DEADLOCK ***
May be due to missing lock nesting notation
2 locks held by syz-executor4/6391: #0: (____ptrval____) (rtnl_mutex){+.+.}, at: rtnl_lock net/core/rtnetlink.c:77 [inline] #0: (____ptrval____) (rtnl_mutex){+.+.}, at: rtnetlink_rcv_msg+0x412/0xc30 net/core/rtnetlink.c:4662 #1: (____ptrval____) (&team->lock){+.+.}, at: team_add_slave+0xdb/0x1c30 drivers/net/team/team.c:1947
stack backtrace: CPU: 1 PID: 6391 Comm: syz-executor4 Not tainted 4.18.0-rc7+ #176 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x1c9/0x2b4 lib/dump_stack.c:113 print_deadlock_bug kernel/locking/lockdep.c:1765 [inline] check_deadlock kernel/locking/lockdep.c:1809 [inline] validate_chain kernel/locking/lockdep.c:2405 [inline] __lock_acquire.cold.64+0x1fb/0x486 kernel/locking/lockdep.c:3435 lock_acquire+0x1e4/0x540 kernel/locking/lockdep.c:3924 __mutex_lock_common kernel/locking/mutex.c:757 [inline] __mutex_lock+0x176/0x1820 kernel/locking/mutex.c:894 mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:909 team_vlan_rx_add_vid+0x3b/0x1e0 drivers/net/team/team.c:1868 vlan_add_rx_filter_info+0x14a/0x1d0 net/8021q/vlan_core.c:210 __vlan_vid_add net/8021q/vlan_core.c:278 [inline] vlan_vid_add+0x63e/0x9d0 net/8021q/vlan_core.c:308 vlan_device_event.cold.12+0x2a/0x2f net/8021q/vlan.c:381 notifier_call_chain+0x180/0x390 kernel/notifier.c:93 __raw_notifier_call_chain kernel/notifier.c:394 [inline] raw_notifier_call_chain+0x2d/0x40 kernel/notifier.c:401 call_netdevice_notifiers_info+0x3f/0x90 net/core/dev.c:1735 call_netdevice_notifiers net/core/dev.c:1753 [inline] dev_open+0x173/0x1b0 net/core/dev.c:1433 team_port_add drivers/net/team/team.c:1219 [inline] team_add_slave+0xa8b/0x1c30 drivers/net/team/team.c:1948 do_set_master+0x1c9/0x220 net/core/rtnetlink.c:2248 do_setlink+0xba4/0x3e10 net/core/rtnetlink.c:2382 rtnl_setlink+0x2a9/0x400 net/core/rtnetlink.c:2636 rtnetlink_rcv_msg+0x46e/0xc30 net/core/rtnetlink.c:4665 netlink_rcv_skb+0x172/0x440 net/netlink/af_netlink.c:2455 rtnetlink_rcv+0x1c/0x20 net/core/rtnetlink.c:4683 netlink_unicast_kernel net/netlink/af_netlink.c:1317 [inline] netlink_unicast+0x5a0/0x760 net/netlink/af_netlink.c:1343 netlink_sendmsg+0xa18/0xfd0 net/netlink/af_netlink.c:1908 sock_sendmsg_nosec net/socket.c:642 [inline] sock_sendmsg+0xd5/0x120 net/socket.c:652 ___sys_sendmsg+0x7fd/0x930 net/socket.c:2126 __sys_sendmsg+0x11d/0x290 net/socket.c:2164 __do_sys_sendmsg net/socket.c:2173 [inline] __se_sys_sendmsg net/socket.c:2171 [inline] __x64_sys_sendmsg+0x78/0xb0 net/socket.c:2171 do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x456b29 Code: fd b4 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 cb b4 fb ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007f9706bf8c78 EFLAGS: 00000246 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 00007f9706bf96d4 RCX: 0000000000456b29 RDX: 0000000000000000 RSI: 0000000020000240 RDI: 0000000000000004 RBP: 00000000009300a0 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff R13: 00000000004d3548 R14: 00000000004c8227 R15: 0000000000000000
Fixes: 87002b03baab ("net: introduce vlan_vid_[add/del] and use them instead of direct [add/kill]_vid ndo calls") Signed-off-by: Ido Schimmel idosch@mellanox.com Reported-and-tested-by: syzbot+bd051aba086537515cdb@syzkaller.appspotmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/team/team.c | 5 +++++ 1 file changed, 5 insertions(+)
--- a/drivers/net/team/team.c +++ b/drivers/net/team/team.c @@ -1165,6 +1165,11 @@ static int team_port_add(struct team *te return -EBUSY; }
+ if (dev == port_dev) { + netdev_err(dev, "Cannot enslave team device to itself\n"); + return -EINVAL; + } + if (port_dev->features & NETIF_F_VLAN_CHALLENGED && vlan_uses_dev(dev)) { netdev_err(dev, "Device %s is VLAN challenged and team device has VLAN set up\n",
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Parthasarathy Bhuvaragan parthasarathy.bhuvaragan@ericsson.com
[ Upstream commit 92ef12b32feab8f277b69e9fb89ede2796777f4d ]
In the case of implicit connect message with data > 1K, the flow control accounting is incorrect. At this state, the socket does not know the peer nodes capability and falls back to legacy flow control by return 1, however the receiver of this message will perform the new block accounting. This leads to a slack and eventually traffic disturbance.
In this commit, we perform tipc_node_get_capabilities() at implicit connect and perform accounting based on the peer's capability.
Signed-off-by: Parthasarathy Bhuvaragan parthasarathy.bhuvaragan@ericsson.com Signed-off-by: Jon Maloy jon.maloy@ericsson.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/tipc/socket.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
--- a/net/tipc/socket.c +++ b/net/tipc/socket.c @@ -1063,8 +1063,10 @@ static int __tipc_sendstream(struct sock /* Handle implicit connection setup */ if (unlikely(dest)) { rc = __tipc_sendmsg(sock, m, dlen); - if (dlen && (dlen == rc)) + if (dlen && dlen == rc) { + tsk->peer_caps = tipc_node_get_capabilities(net, dnode); tsk->snt_unacked = tsk_inc(tsk, dlen + msg_hdr_sz(hdr)); + } return rc; }
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jiri Kosina jkosina@suse.cz
[ Upstream commit 7e823644b60555f70f241274b8d0120dd919269a ]
Commit 2276f58ac589 ("udp: use a separate rx queue for packet reception") turned static inline __skb_recv_udp() from being a trivial helper around __skb_recv_datagram() into a UDP specific implementaion, making it EXPORT_SYMBOL_GPL() at the same time.
There are external modules that got broken by __skb_recv_udp() not being visible to them. Let's unbreak them by making __skb_recv_udp EXPORT_SYMBOL().
Rationale (one of those) why this is actually "technically correct" thing to do: __skb_recv_udp() used to be an inline wrapper around __skb_recv_datagram(), which itself (still, and correctly so, I believe) is EXPORT_SYMBOL().
Cc: Paolo Abeni pabeni@redhat.com Cc: Eric Dumazet edumazet@google.com Fixes: 2276f58ac589 ("udp: use a separate rx queue for packet reception") Signed-off-by: Jiri Kosina jkosina@suse.cz Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv4/udp.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/net/ipv4/udp.c +++ b/net/ipv4/udp.c @@ -1565,7 +1565,7 @@ busy_check: *err = error; return NULL; } -EXPORT_SYMBOL_GPL(__skb_recv_udp); +EXPORT_SYMBOL(__skb_recv_udp);
/* * This should be easy, if there is something there we
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jose Abreu Jose.Abreu@synopsys.com
[ Upstream commit 0431100b3d82c509729ece1ab22ada2484e209c1 ]
Currently we are always setting the tail address of descriptor list to the end of the pre-allocated list.
According to databook this is not correct. Tail address should point to the last available descriptor + 1, which means we have to update the tail address everytime we call the xmit function.
This should make no impact in older versions of MAC but in newer versions there are some DMA features which allows the IP to fetch descriptors in advance and in a non sequential order so its critical that we set the tail address correctly.
Signed-off-by: Jose Abreu joabreu@synopsys.com Fixes: f748be531d70 ("stmmac: support new GMAC4") Cc: David S. Miller davem@davemloft.net Cc: Joao Pinto jpinto@synopsys.com Cc: Giuseppe Cavallaro peppe.cavallaro@st.com Cc: Alexandre Torgue alexandre.torgue@st.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-)
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c @@ -2190,8 +2190,7 @@ static int stmmac_init_dma_engine(struct priv->plat->dma_cfg, tx_q->dma_tx_phy, chan);
- tx_q->tx_tail_addr = tx_q->dma_tx_phy + - (DMA_TX_SIZE * sizeof(struct dma_desc)); + tx_q->tx_tail_addr = tx_q->dma_tx_phy; priv->hw->dma->set_tx_tail_ptr(priv->ioaddr, tx_q->tx_tail_addr, chan); @@ -2963,6 +2962,7 @@ static netdev_tx_t stmmac_tso_xmit(struc
netdev_tx_sent_queue(netdev_get_tx_queue(dev, queue), skb->len);
+ tx_q->tx_tail_addr = tx_q->dma_tx_phy + (tx_q->cur_tx * sizeof(*desc)); priv->hw->dma->set_tx_tail_ptr(priv->ioaddr, tx_q->tx_tail_addr, queue);
@@ -3178,9 +3178,11 @@ static netdev_tx_t stmmac_xmit(struct sk
if (priv->synopsys_id < DWMAC_CORE_4_00) priv->hw->dma->enable_dma_transmission(priv->ioaddr); - else + else { + tx_q->tx_tail_addr = tx_q->dma_tx_phy + (tx_q->cur_tx * sizeof(*desc)); priv->hw->dma->set_tx_tail_ptr(priv->ioaddr, tx_q->tx_tail_addr, queue); + }
return NETDEV_TX_OK;
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jianfeng Tan jianfeng.tan@linux.alibaba.com
[ Upstream commit 9d2f67e43b73e8af7438be219b66a5de0cfa8bd9 ]
When we use raw socket as the vhost backend, a packet from virito with gso offloading information, cannot be sent out in later validaton at xmit path, as we did not set correct skb->protocol which is further used for looking up the gso function.
To fix this, we set this field according to virito hdr information.
Fixes: e858fae2b0b8f4 ("virtio_net: use common code for virtio_net_hdr and skb GSO conversion") Signed-off-by: Jianfeng Tan jianfeng.tan@linux.alibaba.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/linux/virtio_net.h | 18 ++++++++++++++++++ net/packet/af_packet.c | 11 +++++++---- 2 files changed, 25 insertions(+), 4 deletions(-)
--- a/include/linux/virtio_net.h +++ b/include/linux/virtio_net.h @@ -5,6 +5,24 @@ #include <linux/if_vlan.h> #include <uapi/linux/virtio_net.h>
+static inline int virtio_net_hdr_set_proto(struct sk_buff *skb, + const struct virtio_net_hdr *hdr) +{ + switch (hdr->gso_type & ~VIRTIO_NET_HDR_GSO_ECN) { + case VIRTIO_NET_HDR_GSO_TCPV4: + case VIRTIO_NET_HDR_GSO_UDP: + skb->protocol = cpu_to_be16(ETH_P_IP); + break; + case VIRTIO_NET_HDR_GSO_TCPV6: + skb->protocol = cpu_to_be16(ETH_P_IPV6); + break; + default: + return -EINVAL; + } + + return 0; +} + static inline int virtio_net_hdr_to_skb(struct sk_buff *skb, const struct virtio_net_hdr *hdr, bool little_endian) --- a/net/packet/af_packet.c +++ b/net/packet/af_packet.c @@ -2753,10 +2753,12 @@ tpacket_error: } }
- if (po->has_vnet_hdr && virtio_net_hdr_to_skb(skb, vnet_hdr, - vio_le())) { - tp_len = -EINVAL; - goto tpacket_error; + if (po->has_vnet_hdr) { + if (virtio_net_hdr_to_skb(skb, vnet_hdr, vio_le())) { + tp_len = -EINVAL; + goto tpacket_error; + } + virtio_net_hdr_set_proto(skb, vnet_hdr); }
skb->destructor = tpacket_destruct_skb; @@ -2952,6 +2954,7 @@ static int packet_snd(struct socket *soc if (err) goto out_free; len += sizeof(vnet_hdr); + virtio_net_hdr_set_proto(skb, &vnet_hdr); }
skb_probe_transport_header(skb, reserve);
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Florian Fainelli f.fainelli@gmail.com
[ Upstream commit bf3b452b7af787b8bf27de6490dc4eedf6f97599 ]
The order in which we release resources is unfortunately leading to bus errors while dismantling the port. This is because we set priv->wol_ports_mask to 0 to tell bcm_sf2_sw_suspend() that it is now permissible to clock gate the switch. Later on, when dsa_slave_destroy() comes in from dsa_unregister_switch() and calls dsa_switch_ops::port_disable, we perform the same dismantling again, and this time we hit registers that are clock gated.
Make sure that dsa_unregister_switch() is the first thing that happens, which takes care of releasing all user visible resources, then proceed with clock gating hardware. We still need to set priv->wol_ports_mask to 0 to make sure that an enabled port properly gets disabled in case it was previously used as part of Wake-on-LAN.
Fixes: d9338023fb8e ("net: dsa: bcm_sf2: Make it a real platform device driver") Signed-off-by: Florian Fainelli f.fainelli@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/dsa/bcm_sf2.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/net/dsa/bcm_sf2.c +++ b/drivers/net/dsa/bcm_sf2.c @@ -1264,10 +1264,10 @@ static int bcm_sf2_sw_remove(struct plat { struct bcm_sf2_priv *priv = platform_get_drvdata(pdev);
- /* Disable all ports and interrupts */ priv->wol_ports_mask = 0; - bcm_sf2_sw_suspend(priv->dev->ds); dsa_unregister_switch(priv->dev->ds); + /* Disable all ports and interrupts */ + bcm_sf2_sw_suspend(priv->dev->ds); bcm_sf2_mdio_unregister(priv);
return 0;
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jianbo Liu jianbol@mellanox.com
[ Upstream commit cee26487620bc9bc3c7db21b6984d91f7bae12ae ]
In flow steering, if asked to, the hardware matches on the first ethertype which is not vlan. It's possible to set a rule as follows, which is meant to match on untagged packet, but will match on a vlan packet: tc filter add dev eth0 parent ffff: protocol ip flower ...
To avoid this for packets with single tag, we set vlan masks to tell hardware to check the tags for every matched packet.
Fixes: 095b6cfd69ce ('net/mlx5e: Add TC vlan match parsing') Signed-off-by: Jianbo Liu jianbol@mellanox.com Reviewed-by: Or Gerlitz ogerlitz@mellanox.com Signed-off-by: Saeed Mahameed saeedm@mellanox.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/mellanox/mlx5/core/en_tc.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c @@ -864,6 +864,9 @@ static int __parse_cls_flower(struct mlx MLX5_SET(fte_match_set_lyr_2_4, headers_c, first_prio, mask->vlan_priority); MLX5_SET(fte_match_set_lyr_2_4, headers_v, first_prio, key->vlan_priority); } + } else { + MLX5_SET(fte_match_set_lyr_2_4, headers_c, svlan_tag, 1); + MLX5_SET(fte_match_set_lyr_2_4, headers_c, cvlan_tag, 1); }
if (addr_type == FLOW_DISSECTOR_KEY_IPV4_ADDRS) {
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Friedemann Gerold f.gerold@b-c-s.de
[ Upstream commit d26ed6b0e5e23190d43ab34bc69cbecdc464a2cf ]
This patch fixes skb_shared area, which will be corrupted upon reception of 4K jumbo packets.
Originally build_skb usage purpose was to reuse page for skb to eliminate needs of extra fragments. But that logic does not take into account that skb_shared_info should be reserved at the end of skb data area.
In case packet data consumes all the page (4K), skb_shinfo location overflows the page. As a consequence, __build_skb zeroed shinfo data above the allocated page, corrupting next page.
The issue is rarely seen in real life because jumbo are normally larger than 4K and that causes another code path to trigger. But it 100% reproducible with simple scapy packet, like:
sendp(IP(dst="192.168.100.3") / TCP(dport=443) \ / Raw(RandString(size=(4096-40))), iface="enp1s0")
Fixes: 018423e90bee ("net: ethernet: aquantia: Add ring support code")
Reported-by: Friedemann Gerold f.gerold@b-c-s.de Reported-by: Michael Rauch michael@rauch.be Signed-off-by: Friedemann Gerold f.gerold@b-c-s.de Tested-by: Nikita Danilov nikita.danilov@aquantia.com Signed-off-by: Igor Russkikh igor.russkikh@aquantia.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/aquantia/atlantic/aq_ring.c | 32 ++++++++++++----------- 1 file changed, 18 insertions(+), 14 deletions(-)
--- a/drivers/net/ethernet/aquantia/atlantic/aq_ring.c +++ b/drivers/net/ethernet/aquantia/atlantic/aq_ring.c @@ -222,9 +222,10 @@ int aq_ring_rx_clean(struct aq_ring_s *s }
/* for single fragment packets use build_skb() */ - if (buff->is_eop) { + if (buff->is_eop && + buff->len <= AQ_CFG_RX_FRAME_MAX - AQ_SKB_ALIGN) { skb = build_skb(page_address(buff->page), - buff->len + AQ_SKB_ALIGN); + AQ_CFG_RX_FRAME_MAX); if (unlikely(!skb)) { err = -ENOMEM; goto err_exit; @@ -244,18 +245,21 @@ int aq_ring_rx_clean(struct aq_ring_s *s buff->len - ETH_HLEN, SKB_TRUESIZE(buff->len - ETH_HLEN));
- for (i = 1U, next_ = buff->next, - buff_ = &self->buff_ring[next_]; true; - next_ = buff_->next, - buff_ = &self->buff_ring[next_], ++i) { - skb_add_rx_frag(skb, i, buff_->page, 0, - buff_->len, - SKB_TRUESIZE(buff->len - - ETH_HLEN)); - buff_->is_cleaned = 1; - - if (buff_->is_eop) - break; + if (!buff->is_eop) { + for (i = 1U, next_ = buff->next, + buff_ = &self->buff_ring[next_]; + true; next_ = buff_->next, + buff_ = &self->buff_ring[next_], ++i) { + skb_add_rx_frag(skb, i, + buff_->page, 0, + buff_->len, + SKB_TRUESIZE(buff->len - + ETH_HLEN)); + buff_->is_cleaned = 1; + + if (buff_->is_eop) + break; + } } }
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eran Ben Elisha eranbe@mellanox.com
[ Upstream commit 11aa5800ed66ed0415b7509f02881c76417d212a ]
The code that deals with eswitch vport bw guarantee was going beyond the eswitch vport array limit, fix that. This was pointed out by the kernel address sanitizer (KASAN).
The error from KASAN log: [2018-09-15 15:04:45] BUG: KASAN: slab-out-of-bounds in mlx5_eswitch_set_vport_rate+0x8c1/0xae0 [mlx5_core]
Fixes: c9497c98901c ("net/mlx5: Add support for setting VF min rate") Signed-off-by: Eran Ben Elisha eranbe@mellanox.com Reviewed-by: Or Gerlitz ogerlitz@mellanox.com Signed-off-by: Saeed Mahameed saeedm@mellanox.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/mellanox/mlx5/core/eswitch.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c @@ -1922,7 +1922,7 @@ static u32 calculate_vports_min_rate_div u32 max_guarantee = 0; int i;
- for (i = 0; i <= esw->total_vports; i++) { + for (i = 0; i < esw->total_vports; i++) { evport = &esw->vports[i]; if (!evport->enabled || evport->info.min_rate < max_guarantee) continue; @@ -1942,7 +1942,7 @@ static int normalize_vports_min_rate(str int err; int i;
- for (i = 0; i <= esw->total_vports; i++) { + for (i = 0; i < esw->total_vports; i++) { evport = &esw->vports[i]; if (!evport->enabled) continue;
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mahesh Bandewar maheshb@google.com
[ Upstream commit 6a9e461f6fe4434e6172304b69774daff9a3ac4c ]
Commit b89f04c61efe ("bonding: deliver link-local packets with skb->dev set to link that packets arrived on") changed the behavior of how link-local-multicast packets are processed. The change in the behavior broke some legacy use cases where these packets are expected to arrive on bonding master device also.
This patch passes the packet to the stack with the link it arrived on as well as passes to the bonding-master device to preserve the legacy use case.
Fixes: b89f04c61efe ("bonding: deliver link-local packets with skb->dev set to link that packets arrived on") Reported-by: Michal Soltys soltys@ziu.info Signed-off-by: Mahesh Bandewar maheshb@google.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/bonding/bond_main.c | 21 +++++++++++++++++++-- 1 file changed, 19 insertions(+), 2 deletions(-)
--- a/drivers/net/bonding/bond_main.c +++ b/drivers/net/bonding/bond_main.c @@ -1177,9 +1177,26 @@ static rx_handler_result_t bond_handle_f } }
- /* don't change skb->dev for link-local packets */ - if (is_link_local_ether_addr(eth_hdr(skb)->h_dest)) + /* Link-local multicast packets should be passed to the + * stack on the link they arrive as well as pass them to the + * bond-master device. These packets are mostly usable when + * stack receives it with the link on which they arrive + * (e.g. LLDP) they also must be available on master. Some of + * the use cases include (but are not limited to): LLDP agents + * that must be able to operate both on enslaved interfaces as + * well as on bonds themselves; linux bridges that must be able + * to process/pass BPDUs from attached bonds when any kind of + * STP version is enabled on the network. + */ + if (is_link_local_ether_addr(eth_hdr(skb)->h_dest)) { + struct sk_buff *nskb = skb_clone(skb, GFP_ATOMIC); + + if (nskb) { + nskb->dev = bond->dev; + netif_rx(nskb); + } return RX_HANDLER_PASS; + } if (bond_should_deliver_exact_match(skb, slave, bond)) return RX_HANDLER_EXACT;
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mahesh Bandewar maheshb@google.com
[ Upstream commit 0f3b914c9cfcd7bbedd445dc4ac5dd999fa213c2 ]
RX queue config for bonding master could be different from its slave device(s). With the commit 6a9e461f6fe4 ("bonding: pass link-local packets to bonding master also."), the packet is reinjected into stack with skb->dev as bonding master. This potentially triggers the message:
"bondX received packet on queue Y, but number of RX queues is Z"
whenever the queue that packet is received on is higher than the numrxqueues on bonding master (Y > Z).
Fixes: 6a9e461f6fe4 ("bonding: pass link-local packets to bonding master also.") Reported-by: John Sperbeck jsperbeck@google.com Signed-off-by: Eric Dumazet edumazet@google.com Signed-off-by: Mahesh Bandewar maheshb@google.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/bonding/bond_main.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/net/bonding/bond_main.c +++ b/drivers/net/bonding/bond_main.c @@ -1193,6 +1193,7 @@ static rx_handler_result_t bond_handle_f
if (nskb) { nskb->dev = bond->dev; + nskb->queue_mapping = 0; netif_rx(nskb); } return RX_HANDLER_PASS;
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jakub Kicinski jakub.kicinski@netronome.com
[ Upstream commit ff58e2df62ce29d0552278c290ae494b30fe0c6f ]
When FW floods the driver with control messages try to exit the cmsg processing loop every now and then to avoid soft lockups. Cmsg processing is generally very lightweight so 512 seems like a reasonable budget, which should not be exceeded under normal conditions.
Fixes: 77ece8d5f196 ("nfp: add control vNIC datapath") Signed-off-by: Jakub Kicinski jakub.kicinski@netronome.com Reviewed-by: Simon Horman simon.horman@netronome.com Tested-by: Pieter Jansen van Vuuren pieter.jansenvanvuuren@netronome.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/netronome/nfp/nfp_net_common.c | 17 ++++++++++++----- 1 file changed, 12 insertions(+), 5 deletions(-)
--- a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c +++ b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c @@ -2058,14 +2058,17 @@ nfp_ctrl_rx_one(struct nfp_net *nn, stru return true; }
-static void nfp_ctrl_rx(struct nfp_net_r_vector *r_vec) +static bool nfp_ctrl_rx(struct nfp_net_r_vector *r_vec) { struct nfp_net_rx_ring *rx_ring = r_vec->rx_ring; struct nfp_net *nn = r_vec->nfp_net; struct nfp_net_dp *dp = &nn->dp; + unsigned int budget = 512;
- while (nfp_ctrl_rx_one(nn, dp, r_vec, rx_ring)) + while (nfp_ctrl_rx_one(nn, dp, r_vec, rx_ring) && budget--) continue; + + return budget; }
static void nfp_ctrl_poll(unsigned long arg) @@ -2077,9 +2080,13 @@ static void nfp_ctrl_poll(unsigned long __nfp_ctrl_tx_queued(r_vec); spin_unlock_bh(&r_vec->lock);
- nfp_ctrl_rx(r_vec); - - nfp_net_irq_unmask(r_vec->nfp_net, r_vec->irq_entry); + if (nfp_ctrl_rx(r_vec)) { + nfp_net_irq_unmask(r_vec->nfp_net, r_vec->irq_entry); + } else { + tasklet_schedule(&r_vec->tasklet); + nn_dp_warn(&r_vec->nfp_net->dp, + "control message budget exceeded!\n"); + } }
/* Setup and Configuration
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Davide Caratti dcaratti@redhat.com
[ Upstream commit 8c6ec3613e7b0aade20a3196169c0bab32ed3e3f ]
bnxt offload code currently supports only 'push' and 'pop' operation: let .ndo_setup_tc() return -EOPNOTSUPP if VLAN 'modify' action is configured.
Fixes: 2ae7408fedfe ("bnxt_en: bnxt: add TC flower filter offload support") Signed-off-by: Davide Caratti dcaratti@redhat.com Acked-by: Sathya Perla sathya.perla@broadcom.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c | 20 ++++++++++++++------ 1 file changed, 14 insertions(+), 6 deletions(-)
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c @@ -78,17 +78,23 @@ static int bnxt_tc_parse_redir(struct bn return 0; }
-static void bnxt_tc_parse_vlan(struct bnxt *bp, - struct bnxt_tc_actions *actions, - const struct tc_action *tc_act) +static int bnxt_tc_parse_vlan(struct bnxt *bp, + struct bnxt_tc_actions *actions, + const struct tc_action *tc_act) { - if (tcf_vlan_action(tc_act) == TCA_VLAN_ACT_POP) { + switch (tcf_vlan_action(tc_act)) { + case TCA_VLAN_ACT_POP: actions->flags |= BNXT_TC_ACTION_FLAG_POP_VLAN; - } else if (tcf_vlan_action(tc_act) == TCA_VLAN_ACT_PUSH) { + break; + case TCA_VLAN_ACT_PUSH: actions->flags |= BNXT_TC_ACTION_FLAG_PUSH_VLAN; actions->push_vlan_tci = htons(tcf_vlan_push_vid(tc_act)); actions->push_vlan_tpid = tcf_vlan_push_proto(tc_act); + break; + default: + return -EOPNOTSUPP; } + return 0; }
static int bnxt_tc_parse_actions(struct bnxt *bp, @@ -122,7 +128,9 @@ static int bnxt_tc_parse_actions(struct
/* Push/pop VLAN */ if (is_tcf_vlan(tc_act)) { - bnxt_tc_parse_vlan(bp, actions, tc_act); + rc = bnxt_tc_parse_vlan(bp, actions, tc_act); + if (rc) + return rc; continue; } }
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: "Maciej Żenczykowski" maze@google.com
[ Upstream commit 474ff2600889e16280dbc6ada8bfecb216169a70 ]
So it should not fail with EPERM even though it is no longer implemented...
This is a fix for: (userns)$ egrep ^Cap /proc/self/status CapInh: 0000003fffffffff CapPrm: 0000003fffffffff CapEff: 0000003fffffffff CapBnd: 0000003fffffffff CapAmb: 0000003fffffffff
(userns)$ tcpdump -i usb_rndis0 tcpdump: WARNING: usb_rndis0: SIOCETHTOOL(ETHTOOL_GUFO) ioctl failed: Operation not permitted Warning: Kernel filter failed: Bad file descriptor tcpdump: can't remove kernel filter: Bad file descriptor
With this change it returns EOPNOTSUPP instead of EPERM.
See also https://github.com/the-tcpdump-group/libpcap/issues/689
Fixes: 08a00fea6de2 "net: Remove references to NETIF_F_UFO from ethtool." Cc: David S. Miller davem@davemloft.net Signed-off-by: Maciej Żenczykowski maze@google.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/core/ethtool.c | 1 + 1 file changed, 1 insertion(+)
--- a/net/core/ethtool.c +++ b/net/core/ethtool.c @@ -2572,6 +2572,7 @@ int dev_ethtool(struct net *net, struct case ETHTOOL_GPHYSTATS: case ETHTOOL_GTSO: case ETHTOOL_GPERMADDR: + case ETHTOOL_GUFO: case ETHTOOL_GGSO: case ETHTOOL_GGRO: case ETHTOOL_GFLAGS:
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Dumazet edumazet@google.com
[ Upstream commit 1ad98e9d1bdf4724c0a8532fabd84bf3c457c2bc ]
In normal SYN processing, packets are handled without listener lock and in RCU protected ingress path.
But syzkaller is known to be able to trick us and SYN packets might be processed in process context, after being queued into socket backlog.
In commit 06f877d613be ("tcp/dccp: fix other lockdep splats accessing ireq_opt") I made a very stupid fix, that happened to work mostly because of the regular path being RCU protected.
Really the thing protecting ireq->ireq_opt is RCU read lock, and the pseudo request refcnt is not relevant.
This patch extends what I did in commit 449809a66c1d ("tcp/dccp: block BH for SYN processing") by adding an extra rcu_read_{lock|unlock} pair in the paths that might be taken when processing SYN from socket backlog (thus possibly in process context)
Fixes: 06f877d613be ("tcp/dccp: fix other lockdep splats accessing ireq_opt") Signed-off-by: Eric Dumazet edumazet@google.com Reported-by: syzbot syzkaller@googlegroups.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/net/inet_sock.h | 3 +-- net/dccp/input.c | 4 +++- net/ipv4/tcp_input.c | 4 +++- 3 files changed, 7 insertions(+), 4 deletions(-)
--- a/include/net/inet_sock.h +++ b/include/net/inet_sock.h @@ -131,8 +131,7 @@ static inline int inet_request_bound_dev
static inline struct ip_options_rcu *ireq_opt_deref(const struct inet_request_sock *ireq) { - return rcu_dereference_check(ireq->ireq_opt, - refcount_read(&ireq->req.rsk_refcnt) > 0); + return rcu_dereference(ireq->ireq_opt); }
struct inet_cork { --- a/net/dccp/input.c +++ b/net/dccp/input.c @@ -605,11 +605,13 @@ int dccp_rcv_state_process(struct sock * if (sk->sk_state == DCCP_LISTEN) { if (dh->dccph_type == DCCP_PKT_REQUEST) { /* It is possible that we process SYN packets from backlog, - * so we need to make sure to disable BH right there. + * so we need to make sure to disable BH and RCU right there. */ + rcu_read_lock(); local_bh_disable(); acceptable = inet_csk(sk)->icsk_af_ops->conn_request(sk, skb) >= 0; local_bh_enable(); + rcu_read_unlock(); if (!acceptable) return 1; consume_skb(skb); --- a/net/ipv4/tcp_input.c +++ b/net/ipv4/tcp_input.c @@ -5913,11 +5913,13 @@ int tcp_rcv_state_process(struct sock *s if (th->fin) goto discard; /* It is possible that we process SYN packets from backlog, - * so we need to make sure to disable BH right there. + * so we need to make sure to disable BH and RCU right there. */ + rcu_read_lock(); local_bh_disable(); acceptable = icsk->icsk_af_ops->conn_request(sk, skb) >= 0; local_bh_enable(); + rcu_read_unlock();
if (!acceptable) return 1;
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Dumazet edumazet@google.com
[ Upstream commit 2ab2ddd301a22ca3c5f0b743593e4ad2953dfa53 ]
Timer handlers do not imply rcu_read_lock(), so my recent fix triggered a LOCKDEP warning when SYNACK is retransmit.
Lets add rcu_read_lock()/rcu_read_unlock() pairs around ireq->ireq_opt usages instead of guessing what is done by callers, since it is not worth the pain.
Get rid of ireq_opt_deref() helper since it hides the logic without real benefit, since it is now a standard rcu_dereference().
Fixes: 1ad98e9d1bdf ("tcp/dccp: fix lockdep issue when SYN is backlogged") Signed-off-by: Eric Dumazet edumazet@google.com Reported-by: Willem de Bruijn willemb@google.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/net/inet_sock.h | 5 ----- net/dccp/ipv4.c | 4 +++- net/ipv4/inet_connection_sock.c | 5 ++++- net/ipv4/tcp_ipv4.c | 4 +++- 4 files changed, 10 insertions(+), 8 deletions(-)
--- a/include/net/inet_sock.h +++ b/include/net/inet_sock.h @@ -129,11 +129,6 @@ static inline int inet_request_bound_dev return sk->sk_bound_dev_if; }
-static inline struct ip_options_rcu *ireq_opt_deref(const struct inet_request_sock *ireq) -{ - return rcu_dereference(ireq->ireq_opt); -} - struct inet_cork { unsigned int flags; __be32 addr; --- a/net/dccp/ipv4.c +++ b/net/dccp/ipv4.c @@ -493,9 +493,11 @@ static int dccp_v4_send_response(const s
dh->dccph_checksum = dccp_v4_csum_finish(skb, ireq->ir_loc_addr, ireq->ir_rmt_addr); + rcu_read_lock(); err = ip_build_and_send_pkt(skb, sk, ireq->ir_loc_addr, ireq->ir_rmt_addr, - ireq_opt_deref(ireq)); + rcu_dereference(ireq->ireq_opt)); + rcu_read_unlock(); err = net_xmit_eval(err); }
--- a/net/ipv4/inet_connection_sock.c +++ b/net/ipv4/inet_connection_sock.c @@ -542,7 +542,8 @@ struct dst_entry *inet_csk_route_req(con struct ip_options_rcu *opt; struct rtable *rt;
- opt = ireq_opt_deref(ireq); + rcu_read_lock(); + opt = rcu_dereference(ireq->ireq_opt);
flowi4_init_output(fl4, ireq->ir_iif, ireq->ir_mark, RT_CONN_FLAGS(sk), RT_SCOPE_UNIVERSE, @@ -556,11 +557,13 @@ struct dst_entry *inet_csk_route_req(con goto no_route; if (opt && opt->opt.is_strictroute && rt->rt_uses_gateway) goto route_err; + rcu_read_unlock(); return &rt->dst;
route_err: ip_rt_put(rt); no_route: + rcu_read_unlock(); __IP_INC_STATS(net, IPSTATS_MIB_OUTNOROUTES); return NULL; } --- a/net/ipv4/tcp_ipv4.c +++ b/net/ipv4/tcp_ipv4.c @@ -875,9 +875,11 @@ static int tcp_v4_send_synack(const stru if (skb) { __tcp_v4_send_check(skb, ireq->ir_loc_addr, ireq->ir_rmt_addr);
+ rcu_read_lock(); err = ip_build_and_send_pkt(skb, sk, ireq->ir_loc_addr, ireq->ir_rmt_addr, - ireq_opt_deref(ireq)); + rcu_dereference(ireq->ireq_opt)); + rcu_read_unlock(); err = net_xmit_eval(err); }
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Oder Chiou oder_chiou@realtek.com
[ Upstream commit 6f0a256253f48095ba2e5bcdfbed41f21643c105 ]
After our evaluation, we need to modify the default values to make sure the volume applied immediately.
Signed-off-by: Oder Chiou oder_chiou@realtek.com Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin alexander.levin@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/soc/codecs/rt5514.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
--- a/sound/soc/codecs/rt5514.c +++ b/sound/soc/codecs/rt5514.c @@ -64,8 +64,8 @@ static const struct reg_sequence rt5514_ {RT5514_ANA_CTRL_LDO10, 0x00028604}, {RT5514_ANA_CTRL_ADCFED, 0x00000800}, {RT5514_ASRC_IN_CTRL1, 0x00000003}, - {RT5514_DOWNFILTER0_CTRL3, 0x10000352}, - {RT5514_DOWNFILTER1_CTRL3, 0x10000352}, + {RT5514_DOWNFILTER0_CTRL3, 0x10000342}, + {RT5514_DOWNFILTER1_CTRL3, 0x10000342}, };
static const struct reg_default rt5514_reg[] = { @@ -92,10 +92,10 @@ static const struct reg_default rt5514_r {RT5514_ASRC_IN_CTRL1, 0x00000003}, {RT5514_DOWNFILTER0_CTRL1, 0x00020c2f}, {RT5514_DOWNFILTER0_CTRL2, 0x00020c2f}, - {RT5514_DOWNFILTER0_CTRL3, 0x10000352}, + {RT5514_DOWNFILTER0_CTRL3, 0x10000342}, {RT5514_DOWNFILTER1_CTRL1, 0x00020c2f}, {RT5514_DOWNFILTER1_CTRL2, 0x00020c2f}, - {RT5514_DOWNFILTER1_CTRL3, 0x10000352}, + {RT5514_DOWNFILTER1_CTRL3, 0x10000342}, {RT5514_ANA_CTRL_LDO10, 0x00028604}, {RT5514_ANA_CTRL_LDO18_16, 0x02000345}, {RT5514_ANA_CTRL_ADC12, 0x0000a2a8},
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com
[ Upstream commit 960cdd50ca9fdfeb82c2757107bcb7f93c8d7d41 ]
HID made of either Wolfson/CirrusLogic PCI ID + 8804 identifier.
This helps enumerate the HifiBerry Digi+ HAT boards on the Up2 platform.
The scripts at https://github.com/thesofproject/acpi-scripts can be used to add the ACPI initrd overlays.
Signed-off-by: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com Acked-by: Charles Keepax ckeepax@opensource.cirrus.com Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin alexander.levin@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/soc/codecs/wm8804-i2c.c | 15 ++++++++++++++- 1 file changed, 14 insertions(+), 1 deletion(-)
--- a/sound/soc/codecs/wm8804-i2c.c +++ b/sound/soc/codecs/wm8804-i2c.c @@ -13,6 +13,7 @@ #include <linux/init.h> #include <linux/module.h> #include <linux/i2c.h> +#include <linux/acpi.h>
#include "wm8804.h"
@@ -40,17 +41,29 @@ static const struct i2c_device_id wm8804 }; MODULE_DEVICE_TABLE(i2c, wm8804_i2c_id);
+#if defined(CONFIG_OF) static const struct of_device_id wm8804_of_match[] = { { .compatible = "wlf,wm8804", }, { } }; MODULE_DEVICE_TABLE(of, wm8804_of_match); +#endif + +#ifdef CONFIG_ACPI +static const struct acpi_device_id wm8804_acpi_match[] = { + { "1AEC8804", 0 }, /* Wolfson PCI ID + part ID */ + { "10138804", 0 }, /* Cirrus Logic PCI ID + part ID */ + { }, +}; +MODULE_DEVICE_TABLE(acpi, wm8804_acpi_match); +#endif
static struct i2c_driver wm8804_i2c_driver = { .driver = { .name = "wm8804", .pm = &wm8804_pm, - .of_match_table = wm8804_of_match, + .of_match_table = of_match_ptr(wm8804_of_match), + .acpi_match_table = ACPI_PTR(wm8804_acpi_match), }, .probe = wm8804_i2c_probe, .remove = wm8804_i2c_remove,
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Danny Smith danny.smith@axis.com
[ Upstream commit 5ea752c6efdf5aa8a57aed816d453a8f479f1b0a ]
Fixed range in safeload conditional to allow safeload to up to 20 bytes, without a lower limit.
Signed-off-by: Danny Smith dannys@axis.com Acked-by: Lars-Peter Clausen lars@metafoo.de Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin alexander.levin@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/soc/codecs/sigmadsp.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
--- a/sound/soc/codecs/sigmadsp.c +++ b/sound/soc/codecs/sigmadsp.c @@ -117,8 +117,7 @@ static int sigmadsp_ctrl_write(struct si struct sigmadsp_control *ctrl, void *data) { /* safeload loads up to 20 bytes in a atomic operation */ - if (ctrl->num_bytes > 4 && ctrl->num_bytes <= 20 && sigmadsp->ops && - sigmadsp->ops->safeload) + if (ctrl->num_bytes <= 20 && sigmadsp->ops && sigmadsp->ops->safeload) return sigmadsp->ops->safeload(sigmadsp, ctrl->addr, data, ctrl->num_bytes); else
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Lei Yang Lei.Yang@windriver.com
[ Upstream commit 53cf59d6c0ad3edc4f4449098706a8f8986258b6 ]
add config file
Signed-off-by: Lei Yang Lei.Yang@windriver.com Signed-off-by: Shuah Khan (Samsung OSG) shuah@kernel.org Signed-off-by: Sasha Levin alexander.levin@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- tools/testing/selftests/efivarfs/config | 1 + 1 file changed, 1 insertion(+) create mode 100644 tools/testing/selftests/efivarfs/config
--- /dev/null +++ b/tools/testing/selftests/efivarfs/config @@ -0,0 +1 @@ +CONFIG_EFIVAR_FS=y
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Lei Yang Lei.Yang@windriver.com
[ Upstream commit 4d85af102a66ee6aeefa596f273169e77fb2b48e ]
add CONFIG_MEMORY_HOTREMOVE=y in config without this config, /sys/devices/system/memory/memory*/removable always return 0, I endup getting an early skip during test
Signed-off-by: Lei Yang Lei.Yang@windriver.com Signed-off-by: Shuah Khan (Samsung OSG) shuah@kernel.org Signed-off-by: Sasha Levin alexander.levin@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- tools/testing/selftests/memory-hotplug/config | 1 + 1 file changed, 1 insertion(+)
--- a/tools/testing/selftests/memory-hotplug/config +++ b/tools/testing/selftests/memory-hotplug/config @@ -2,3 +2,4 @@ CONFIG_MEMORY_HOTPLUG=y CONFIG_MEMORY_HOTPLUG_SPARSE=y CONFIG_NOTIFIER_ERROR_INJECTION=y CONFIG_MEMORY_NOTIFIER_ERROR_INJECT=m +CONFIG_MEMORY_HOTREMOVE=y
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kuninori Morimoto kuninori.morimoto.gx@renesas.com
[ Upstream commit 69235ccf491d2e26aefd465c0d3ccd1e3b2a9a9c ]
ADG has buffer over flow bug if DT has more than 3 clock-frequency. This patch fixup this issue, and uses first 2 values.
clock-frequency = <x y>; /* this is OK */ clock-frequency = <x y z>; /* this is NG */
Signed-off-by: Kuninori Morimoto kuninori.morimoto.gx@renesas.com Tested-by: Hiroyuki Yokoyama hiroyuki.yokoyama.vx@renesas.com Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin alexander.levin@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/soc/sh/rcar/adg.c | 5 +++++ 1 file changed, 5 insertions(+)
--- a/sound/soc/sh/rcar/adg.c +++ b/sound/soc/sh/rcar/adg.c @@ -467,6 +467,11 @@ static void rsnd_adg_get_clkout(struct r goto rsnd_adg_get_clkout_end;
req_size = prop->length / sizeof(u32); + if (req_size > REQ_SIZE) { + dev_err(dev, + "too many clock-frequency, use top %d\n", REQ_SIZE); + req_size = REQ_SIZE; + }
of_property_read_u32_array(np, "clock-frequency", req_rate, req_size); req_48kHz_rate = 0;
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kuninori Morimoto kuninori.morimoto.gx@renesas.com
[ Upstream commit 6c92d5a2744e27619a8fcc9d74b91ee9f1cdebd1 ]
Current rsnd driver will fallback to PIO mode if it can't get DMA handler. But, DMA might return -EPROBE_DEFER when probe timing. This driver always fallback to PIO mode especially from commit ac6bbf0cdf4206c ("iommu: Remove IOMMU_OF_DECLARE") because of this reason.
The DMA driver will be probed later, but sound driver might be probed as PIO mode in such case. This patch fixup this issue. Then, -EPROBE_DEFER is not error. Thus, let's don't indicate error message in such case. And it needs to call rsnd_adg_remove() individually if probe failed, because it registers clk which should be unregister.
Maybe PIO fallback feature itself is not needed, but let's keep it so far.
Signed-off-by: Kuninori Morimoto kuninori.morimoto.gx@renesas.com Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin alexander.levin@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/soc/sh/rcar/core.c | 10 +++++++++- sound/soc/sh/rcar/dma.c | 4 ++++ 2 files changed, 13 insertions(+), 1 deletion(-)
--- a/sound/soc/sh/rcar/core.c +++ b/sound/soc/sh/rcar/core.c @@ -486,7 +486,7 @@ static int rsnd_status_update(u32 *statu (func_call && (mod)->ops->fn) ? #fn : ""); \ if (func_call && (mod)->ops->fn) \ tmp = (mod)->ops->fn(mod, io, param); \ - if (tmp) \ + if (tmp && (tmp != -EPROBE_DEFER)) \ dev_err(dev, "%s[%d] : %s error %d\n", \ rsnd_mod_name(mod), rsnd_mod_id(mod), \ #fn, tmp); \ @@ -1469,6 +1469,14 @@ exit_snd_probe: rsnd_dai_call(remove, &rdai->capture, priv); }
+ /* + * adg is very special mod which can't use rsnd_dai_call(remove), + * and it registers ADG clock on probe. + * It should be unregister if probe failed. + * Mainly it is assuming -EPROBE_DEFER case + */ + rsnd_adg_remove(priv); + return ret; }
--- a/sound/soc/sh/rcar/dma.c +++ b/sound/soc/sh/rcar/dma.c @@ -330,6 +330,10 @@ static int rsnd_dmaen_attach(struct rsnd /* try to get DMAEngine channel */ chan = rsnd_dmaen_request_channel(io, mod_from, mod_to); if (IS_ERR_OR_NULL(chan)) { + /* Let's follow when -EPROBE_DEFER case */ + if (PTR_ERR(chan) == -EPROBE_DEFER) + return PTR_ERR(chan); + /* * DMA failed. try to PIO mode * see
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Hermes Zhang chenhuiz@axis.com
[ Upstream commit e6a57d22f787e73635ce0d29eef0abb77928b3e9 ]
The percpu_rw_semaphore is not currently freed, and this leads to a crash when the stale rcu callback is invoked. DEBUG_OBJECTS detects this.
ODEBUG: free active (active state 1) object type: rcu_head hint: (null) ------------[ cut here ]------------ WARNING: CPU: 1 PID: 2024 at debug_print_object+0xac/0xc8 PC is at debug_print_object+0xac/0xc8 LR is at debug_print_object+0xac/0xc8 Call trace: [<ffffff80082e2c2c>] debug_print_object+0xac/0xc8 [<ffffff80082e40b0>] debug_check_no_obj_freed+0x1e8/0x228 [<ffffff8008191254>] kfree+0x1cc/0x250 [<ffffff80083cc03c>] hci_uart_tty_close+0x54/0x108 [<ffffff800832e118>] tty_ldisc_close.isra.1+0x40/0x58 [<ffffff800832e14c>] tty_ldisc_kill+0x1c/0x40 [<ffffff800832e3dc>] tty_ldisc_release+0x94/0x170 [<ffffff8008325554>] tty_release_struct+0x1c/0x58 [<ffffff8008326400>] tty_release+0x3b0/0x490 [<ffffff80081a3fe8>] __fput+0x88/0x1d0 [<ffffff80081a418c>] ____fput+0xc/0x18 [<ffffff80080c0624>] task_work_run+0x9c/0xc0 [<ffffff80080a9e24>] do_exit+0x24c/0x8a0 [<ffffff80080aa4e0>] do_group_exit+0x38/0xa0 [<ffffff80080aa558>] __wake_up_parent+0x0/0x28 [<ffffff8008082c00>] el0_svc_naked+0x34/0x38 ---[ end trace bfe08cbd89098cdf ]---
Signed-off-by: Hermes Zhang chenhuiz@axis.com Signed-off-by: Marcel Holtmann marcel@holtmann.org Signed-off-by: Sasha Levin alexander.levin@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/bluetooth/hci_ldisc.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/drivers/bluetooth/hci_ldisc.c +++ b/drivers/bluetooth/hci_ldisc.c @@ -539,6 +539,8 @@ static void hci_uart_tty_close(struct tt } clear_bit(HCI_UART_PROTO_SET, &hu->flags);
+ percpu_free_rwsem(&hu->proto_lock); + kfree(hu); }
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tony Lindgren tony@atomide.com
[ Upstream commit 10492ee8ed9188d6d420e1f79b2b9bdbc0624e65 ]
It currently only works if the parent bus uses "simple-bus". We currently try to probe children with non-existing compatible values. And we're missing .probe.
I noticed this while testing devices configured to probe using ti-sysc interconnect target module driver. For that we also may want to rebind the driver, so let's remove __init and __exit.
Signed-off-by: Tony Lindgren tony@atomide.com Acked-by: Roger Quadros rogerq@ti.com Signed-off-by: Lee Jones lee.jones@linaro.org Signed-off-by: Sasha Levin alexander.levin@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/mfd/omap-usb-host.c | 11 ++++++----- 1 file changed, 6 insertions(+), 5 deletions(-)
--- a/drivers/mfd/omap-usb-host.c +++ b/drivers/mfd/omap-usb-host.c @@ -548,8 +548,8 @@ static int usbhs_omap_get_dt_pdata(struc }
static const struct of_device_id usbhs_child_match_table[] = { - { .compatible = "ti,omap-ehci", }, - { .compatible = "ti,omap-ohci", }, + { .compatible = "ti,ehci-omap", }, + { .compatible = "ti,ohci-omap3", }, { } };
@@ -875,6 +875,7 @@ static struct platform_driver usbhs_omap .pm = &usbhsomap_dev_pm_ops, .of_match_table = usbhs_omap_dt_ids, }, + .probe = usbhs_omap_probe, .remove = usbhs_omap_remove, };
@@ -884,9 +885,9 @@ MODULE_ALIAS("platform:" USBHS_DRIVER_NA MODULE_LICENSE("GPL v2"); MODULE_DESCRIPTION("usb host common core driver for omap EHCI and OHCI");
-static int __init omap_usbhs_drvinit(void) +static int omap_usbhs_drvinit(void) { - return platform_driver_probe(&usbhs_omap_driver, usbhs_omap_probe); + return platform_driver_register(&usbhs_omap_driver); }
/* @@ -898,7 +899,7 @@ static int __init omap_usbhs_drvinit(voi */ fs_initcall_sync(omap_usbhs_drvinit);
-static void __exit omap_usbhs_drvexit(void) +static void omap_usbhs_drvexit(void) { platform_driver_unregister(&usbhs_omap_driver); }
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Laura Abbott labbott@redhat.com
[ Upstream commit 679fcae46c8b2352bba3485d521da070cfbe68e6 ]
Fedora got a bug report of a crash with iSCSI:
kernel BUG at include/linux/scatterlist.h:143! ... RIP: 0010:iscsit_do_crypto_hash_buf+0x154/0x180 [iscsi_target_mod] ... Call Trace: ? iscsi_target_tx_thread+0x200/0x200 [iscsi_target_mod] iscsit_get_rx_pdu+0x4cd/0xa90 [iscsi_target_mod] ? native_sched_clock+0x3e/0xa0 ? iscsi_target_tx_thread+0x200/0x200 [iscsi_target_mod] iscsi_target_rx_thread+0x81/0xf0 [iscsi_target_mod] kthread+0x120/0x140 ? kthread_create_worker_on_cpu+0x70/0x70 ret_from_fork+0x3a/0x50
This is a BUG_ON for using a stack buffer with a scatterlist. There are two cases that trigger this bug. Switch to using a dynamically allocated buffer for one case and do not assign a NULL buffer in another case.
Signed-off-by: Laura Abbott labbott@redhat.com Reviewed-by: Mike Christie mchristi@redhat.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin alexander.levin@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/target/iscsi/iscsi_target.c | 22 ++++++++++++++-------- 1 file changed, 14 insertions(+), 8 deletions(-)
--- a/drivers/target/iscsi/iscsi_target.c +++ b/drivers/target/iscsi/iscsi_target.c @@ -1421,7 +1421,8 @@ static void iscsit_do_crypto_hash_buf(
sg_init_table(sg, ARRAY_SIZE(sg)); sg_set_buf(sg, buf, payload_length); - sg_set_buf(sg + 1, pad_bytes, padding); + if (padding) + sg_set_buf(sg + 1, pad_bytes, padding);
ahash_request_set_crypt(hash, sg, data_crc, payload_length + padding);
@@ -3942,10 +3943,14 @@ static bool iscsi_target_check_conn_stat static void iscsit_get_rx_pdu(struct iscsi_conn *conn) { int ret; - u8 buffer[ISCSI_HDR_LEN], opcode; + u8 *buffer, opcode; u32 checksum = 0, digest = 0; struct kvec iov;
+ buffer = kcalloc(ISCSI_HDR_LEN, sizeof(*buffer), GFP_KERNEL); + if (!buffer) + return; + while (!kthread_should_stop()) { /* * Ensure that both TX and RX per connection kthreads @@ -3953,7 +3958,6 @@ static void iscsit_get_rx_pdu(struct isc */ iscsit_thread_check_cpumask(conn, current, 0);
- memset(buffer, 0, ISCSI_HDR_LEN); memset(&iov, 0, sizeof(struct kvec));
iov.iov_base = buffer; @@ -3962,7 +3966,7 @@ static void iscsit_get_rx_pdu(struct isc ret = rx_data(conn, &iov, 1, ISCSI_HDR_LEN); if (ret != ISCSI_HDR_LEN) { iscsit_rx_thread_wait_for_tcp(conn); - return; + break; }
if (conn->conn_ops->HeaderDigest) { @@ -3972,7 +3976,7 @@ static void iscsit_get_rx_pdu(struct isc ret = rx_data(conn, &iov, 1, ISCSI_CRC_LEN); if (ret != ISCSI_CRC_LEN) { iscsit_rx_thread_wait_for_tcp(conn); - return; + break; }
iscsit_do_crypto_hash_buf(conn->conn_rx_hash, @@ -3996,7 +4000,7 @@ static void iscsit_get_rx_pdu(struct isc }
if (conn->conn_state == TARG_CONN_STATE_IN_LOGOUT) - return; + break;
opcode = buffer[0] & ISCSI_OPCODE_MASK;
@@ -4007,13 +4011,15 @@ static void iscsit_get_rx_pdu(struct isc " while in Discovery Session, rejecting.\n", opcode); iscsit_add_reject(conn, ISCSI_REASON_PROTOCOL_ERROR, buffer); - return; + break; }
ret = iscsi_target_rx_opcode(conn, buffer); if (ret < 0) - return; + break; } + + kfree(buffer); }
int iscsi_target_rx_thread(void *arg)
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dan Carpenter dan.carpenter@oracle.com
[ Upstream commit cbe3fd39d223f14b1c60c80fe9347a3dd08c2edb ]
We should first do the le16_to_cpu endian conversion and then apply the FCP_CMD_LENGTH_MASK mask.
Fixes: 5f35509db179 ("qla2xxx: Terminate exchange if corrupted") Signed-off-by: Dan Carpenter dan.carpenter@oracle.com Acked-by: Quinn Tran Quinn.Tran@cavium.com Acked-by: Himanshu Madhani himanshu.madhani@cavium.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin alexander.levin@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/scsi/qla2xxx/qla_target.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/scsi/qla2xxx/qla_target.h +++ b/drivers/scsi/qla2xxx/qla_target.h @@ -374,8 +374,8 @@ struct atio_from_isp { static inline int fcpcmd_is_corrupted(struct atio *atio) { if (atio->entry_type == ATIO_TYPE7 && - (le16_to_cpu(atio->attr_n_length & FCP_CMD_LENGTH_MASK) < - FCP_CMD_LENGTH_MIN)) + ((le16_to_cpu(atio->attr_n_length) & FCP_CMD_LENGTH_MASK) < + FCP_CMD_LENGTH_MIN)) return 1; else return 0;
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yu Zhao yuzhao@google.com
[ Upstream commit b61749a89f826eb61fc59794d9e4697bd246eb61 ]
In snd_hdac_bus_init_chip(), we enable interrupt before snd_hdac_bus_init_cmd_io() initializing dma buffers. If irq has been acquired and irq handler uses the dma buffer, kernel may crash when interrupt comes in.
Fix the problem by postponing enabling irq after dma buffer initialization. And warn once on null dma buffer pointer during the initialization.
Reviewed-by: Takashi Iwai tiwai@suse.de Signed-off-by: Yu Zhao yuzhao@google.com Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin alexander.levin@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/hda/hdac_controller.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
--- a/sound/hda/hdac_controller.c +++ b/sound/hda/hdac_controller.c @@ -40,6 +40,8 @@ static void azx_clear_corbrp(struct hdac */ void snd_hdac_bus_init_cmd_io(struct hdac_bus *bus) { + WARN_ON_ONCE(!bus->rb.area); + spin_lock_irq(&bus->reg_lock); /* CORB set up */ bus->corb.addr = bus->rb.addr; @@ -478,13 +480,15 @@ bool snd_hdac_bus_init_chip(struct hdac_ /* reset controller */ azx_reset(bus, full_reset);
- /* initialize interrupts */ + /* clear interrupts */ azx_int_clear(bus); - azx_int_enable(bus);
/* initialize the codec command I/O */ snd_hdac_bus_init_cmd_io(bus);
+ /* enable interrupts after CORB/RIRB buffers are initialized above */ + azx_int_enable(bus); + /* program the position buffer */ if (bus->use_posbuf && bus->posbuf.addr) { snd_hdac_chip_writel(bus, DPLBASE, (u32)bus->posbuf.addr);
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yu Zhao yuzhao@google.com
[ Upstream commit 75383f8d39d4c0fb96083dd460b7b139fbdac492 ]
Internally, skl_init_chip() calls snd_hdac_bus_init_chip() which 1) sets bus->chip_init to prevent multiple entrances before device is stopped; 2) enables interrupt.
We shouldn't use it for the purpose of resetting device only because 1) when we really want to initialize device, we won't be able to do so; 2) we are ready to handle interrupt yet, and kernel crashes when interrupt comes in.
Rename azx_reset() to snd_hdac_bus_reset_link(), and use it to reset device properly.
Fixes: 60767abcea3d ("ASoC: Intel: Skylake: Reset the controller in probe") Reviewed-by: Takashi Iwai tiwai@suse.de Signed-off-by: Yu Zhao yuzhao@google.com Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin alexander.levin@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/sound/hdaudio.h | 1 + sound/hda/hdac_controller.c | 7 ++++--- sound/soc/intel/skylake/skl.c | 2 +- 3 files changed, 6 insertions(+), 4 deletions(-)
--- a/include/sound/hdaudio.h +++ b/include/sound/hdaudio.h @@ -357,6 +357,7 @@ void snd_hdac_bus_init_cmd_io(struct hda void snd_hdac_bus_stop_cmd_io(struct hdac_bus *bus); void snd_hdac_bus_enter_link_reset(struct hdac_bus *bus); void snd_hdac_bus_exit_link_reset(struct hdac_bus *bus); +int snd_hdac_bus_reset_link(struct hdac_bus *bus, bool full_reset);
void snd_hdac_bus_update_rirb(struct hdac_bus *bus); int snd_hdac_bus_handle_stream_irq(struct hdac_bus *bus, unsigned int status, --- a/sound/hda/hdac_controller.c +++ b/sound/hda/hdac_controller.c @@ -384,7 +384,7 @@ void snd_hdac_bus_exit_link_reset(struct EXPORT_SYMBOL_GPL(snd_hdac_bus_exit_link_reset);
/* reset codec link */ -static int azx_reset(struct hdac_bus *bus, bool full_reset) +int snd_hdac_bus_reset_link(struct hdac_bus *bus, bool full_reset) { if (!full_reset) goto skip_reset; @@ -409,7 +409,7 @@ static int azx_reset(struct hdac_bus *bu skip_reset: /* check to see if controller is ready */ if (!snd_hdac_chip_readb(bus, GCTL)) { - dev_dbg(bus->dev, "azx_reset: controller not ready!\n"); + dev_dbg(bus->dev, "controller not ready!\n"); return -EBUSY; }
@@ -424,6 +424,7 @@ static int azx_reset(struct hdac_bus *bu
return 0; } +EXPORT_SYMBOL_GPL(snd_hdac_bus_reset_link);
/* enable interrupts */ static void azx_int_enable(struct hdac_bus *bus) @@ -478,7 +479,7 @@ bool snd_hdac_bus_init_chip(struct hdac_ return false;
/* reset controller */ - azx_reset(bus, full_reset); + snd_hdac_bus_reset_link(bus, full_reset);
/* clear interrupts */ azx_int_clear(bus); --- a/sound/soc/intel/skylake/skl.c +++ b/sound/soc/intel/skylake/skl.c @@ -698,7 +698,7 @@ static int skl_first_init(struct hdac_ex return -ENXIO; }
- skl_init_chip(bus, true); + snd_hdac_bus_reset_link(bus, true);
snd_hdac_bus_parse_capabilities(bus);
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Stephen Hemminger stephen@networkplumber.org
[ Upstream commit 018349d70f28a78d5343b3660cb66e1667005f8a ]
When netvsc device is removed it can call reschedule in RCU context. This happens because canceling the subchannel setup work could (in theory) cause a reschedule when manipulating the timer.
To reproduce, run with lockdep enabled kernel and unbind a network device from hv_netvsc (via sysfs).
[ 160.682011] WARNING: suspicious RCU usage [ 160.707466] 4.19.0-rc3-uio+ #2 Not tainted [ 160.709937] ----------------------------- [ 160.712352] ./include/linux/rcupdate.h:302 Illegal context switch in RCU read-side critical section! [ 160.723691] [ 160.723691] other info that might help us debug this: [ 160.723691] [ 160.730955] [ 160.730955] rcu_scheduler_active = 2, debug_locks = 1 [ 160.762813] 5 locks held by rebind-eth.sh/1812: [ 160.766851] #0: 000000008befa37a (sb_writers#6){.+.+}, at: vfs_write+0x184/0x1b0 [ 160.773416] #1: 00000000b097f236 (&of->mutex){+.+.}, at: kernfs_fop_write+0xe2/0x1a0 [ 160.783766] #2: 0000000041ee6889 (kn->count#3){++++}, at: kernfs_fop_write+0xeb/0x1a0 [ 160.787465] #3: 0000000056d92a74 (&dev->mutex){....}, at: device_release_driver_internal+0x39/0x250 [ 160.816987] #4: 0000000030f6031e (rcu_read_lock){....}, at: netvsc_remove+0x1e/0x250 [hv_netvsc] [ 160.828629] [ 160.828629] stack backtrace: [ 160.831966] CPU: 1 PID: 1812 Comm: rebind-eth.sh Not tainted 4.19.0-rc3-uio+ #2 [ 160.832952] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS Hyper-V UEFI Release v1.0 11/26/2012 [ 160.832952] Call Trace: [ 160.832952] dump_stack+0x85/0xcb [ 160.832952] ___might_sleep+0x1a3/0x240 [ 160.832952] __flush_work+0x57/0x2e0 [ 160.832952] ? __mutex_lock+0x83/0x990 [ 160.832952] ? __kernfs_remove+0x24f/0x2e0 [ 160.832952] ? __kernfs_remove+0x1b2/0x2e0 [ 160.832952] ? mark_held_locks+0x50/0x80 [ 160.832952] ? get_work_pool+0x90/0x90 [ 160.832952] __cancel_work_timer+0x13c/0x1e0 [ 160.832952] ? netvsc_remove+0x1e/0x250 [hv_netvsc] [ 160.832952] ? __lock_is_held+0x55/0x90 [ 160.832952] netvsc_remove+0x9a/0x250 [hv_netvsc] [ 160.832952] vmbus_remove+0x26/0x30 [ 160.832952] device_release_driver_internal+0x18a/0x250 [ 160.832952] unbind_store+0xb4/0x180 [ 160.832952] kernfs_fop_write+0x113/0x1a0 [ 160.832952] __vfs_write+0x36/0x1a0 [ 160.832952] ? rcu_read_lock_sched_held+0x6b/0x80 [ 160.832952] ? rcu_sync_lockdep_assert+0x2e/0x60 [ 160.832952] ? __sb_start_write+0x141/0x1a0 [ 160.832952] ? vfs_write+0x184/0x1b0 [ 160.832952] vfs_write+0xbe/0x1b0 [ 160.832952] ksys_write+0x55/0xc0 [ 160.832952] do_syscall_64+0x60/0x1b0 [ 160.832952] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 160.832952] RIP: 0033:0x7fe48f4c8154
Resolve this by getting RTNL earlier. This is safe because the subchannel work queue does trylock on RTNL and will detect the race.
Fixes: 7b2ee50c0cd5 ("hv_netvsc: common detach logic") Signed-off-by: Stephen Hemminger sthemmin@microsoft.com Reviewed-by: Haiyang Zhang haiyangz@microsoft.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin alexander.levin@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/hyperv/netvsc_drv.c | 9 +++------ 1 file changed, 3 insertions(+), 6 deletions(-)
--- a/drivers/net/hyperv/netvsc_drv.c +++ b/drivers/net/hyperv/netvsc_drv.c @@ -2110,17 +2110,15 @@ static int netvsc_remove(struct hv_devic
cancel_delayed_work_sync(&ndev_ctx->dwork);
- rcu_read_lock(); - nvdev = rcu_dereference(ndev_ctx->nvdev); - - if (nvdev) + rtnl_lock(); + nvdev = rtnl_dereference(ndev_ctx->nvdev); + if (nvdev) cancel_work_sync(&nvdev->subchan_work);
/* * Call to the vsc driver to let it know that the device is being * removed. Also blocks mtu and channel changes. */ - rtnl_lock(); vf_netdev = rtnl_dereference(ndev_ctx->vf_netdev); if (vf_netdev) netvsc_unregister_vf(vf_netdev); @@ -2132,7 +2130,6 @@ static int netvsc_remove(struct hv_devic list_del(&ndev_ctx->list);
rtnl_unlock(); - rcu_read_unlock();
hv_set_drvdata(dev, NULL);
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jongsung Kim neidhard.kim@lge.com
[ Upstream commit edf2ef7242805e53ec2e0841db26e06d8bc7da70 ]
Synopsys DWC Ethernet MAC can be configured to have 1..32, 64, or 128 unicast filter entries. (Table 7-8 MAC Address Registers from databook) Fix dwmac1000_validate_ucast_entries() to accept values between 1 and 32 in addition.
Signed-off-by: Jongsung Kim neidhard.kim@lge.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin alexander.levin@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-)
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c @@ -67,7 +67,7 @@ static int dwmac1000_validate_mcast_bins * Description: * This function validates the number of Unicast address entries supported * by a particular Synopsys 10/100/1000 controller. The Synopsys controller - * supports 1, 32, 64, or 128 Unicast filter entries for it's Unicast filter + * supports 1..32, 64, or 128 Unicast filter entries for it's Unicast filter * logic. This function validates a valid, supported configuration is * selected, and defaults to 1 Unicast address if an unsupported * configuration is selected. @@ -77,8 +77,7 @@ static int dwmac1000_validate_ucast_entr int x = ucast_entries;
switch (x) { - case 1: - case 32: + case 1 ... 32: case 64: case 128: break;
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nicolas Ferre nicolas.ferre@microchip.com
[ Upstream commit eb4ed8e2d7fecb5f40db38e4498b9ee23cddf196 ]
Create a new configuration for the sama5d3-macb new compatibility string. This configuration disables scatter-gather because we experienced lock down of the macb interface of this particular SoC under very high load.
Signed-off-by: Nicolas Ferre nicolas.ferre@microchip.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin alexander.levin@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/cadence/macb_main.c | 8 ++++++++ 1 file changed, 8 insertions(+)
--- a/drivers/net/ethernet/cadence/macb_main.c +++ b/drivers/net/ethernet/cadence/macb_main.c @@ -3301,6 +3301,13 @@ static const struct macb_config at91sam9 .init = macb_init, };
+static const struct macb_config sama5d3macb_config = { + .caps = MACB_CAPS_SG_DISABLED + | MACB_CAPS_USRIO_HAS_CLKEN | MACB_CAPS_USRIO_DEFAULT_IS_MII_GMII, + .clk_init = macb_clk_init, + .init = macb_init, +}; + static const struct macb_config pc302gem_config = { .caps = MACB_CAPS_SG_DISABLED | MACB_CAPS_GIGABIT_MODE_AVAILABLE, .dma_burst_length = 16, @@ -3368,6 +3375,7 @@ static const struct of_device_id macb_dt { .compatible = "cdns,gem", .data = &pc302gem_config }, { .compatible = "atmel,sama5d2-gem", .data = &sama5d2_config }, { .compatible = "atmel,sama5d3-gem", .data = &sama5d3_config }, + { .compatible = "atmel,sama5d3-macb", .data = &sama5d3macb_config }, { .compatible = "atmel,sama5d4-gem", .data = &sama5d4_config }, { .compatible = "cdns,at91rm9200-emac", .data = &emac_config }, { .compatible = "cdns,emac", .data = &emac_config },
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nicolas Ferre nicolas.ferre@microchip.com
[ Upstream commit 321cc359d899a8e988f3725d87c18a628e1cc624 ]
We need this new compatibility string as we experienced different behavior for this 10/100Mbits/s macb interface on this particular SoC. Backward compatibility is preserved as we keep the alternative strings.
Signed-off-by: Nicolas Ferre nicolas.ferre@microchip.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin alexander.levin@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- Documentation/devicetree/bindings/net/macb.txt | 1 + arch/arm/boot/dts/sama5d3_emac.dtsi | 2 +- 2 files changed, 2 insertions(+), 1 deletion(-)
--- a/Documentation/devicetree/bindings/net/macb.txt +++ b/Documentation/devicetree/bindings/net/macb.txt @@ -10,6 +10,7 @@ Required properties: Use "cdns,pc302-gem" for Picochip picoXcell pc302 and later devices based on the Cadence GEM, or the generic form: "cdns,gem". Use "atmel,sama5d2-gem" for the GEM IP (10/100) available on Atmel sama5d2 SoCs. + Use "atmel,sama5d3-macb" for the 10/100Mbit IP available on Atmel sama5d3 SoCs. Use "atmel,sama5d3-gem" for the Gigabit IP available on Atmel sama5d3 SoCs. Use "atmel,sama5d4-gem" for the GEM IP (10/100) available on Atmel sama5d4 SoCs. Use "cdns,zynq-gem" Xilinx Zynq-7xxx SoC. --- a/arch/arm/boot/dts/sama5d3_emac.dtsi +++ b/arch/arm/boot/dts/sama5d3_emac.dtsi @@ -41,7 +41,7 @@ };
macb1: ethernet@f802c000 { - compatible = "cdns,at91sam9260-macb", "cdns,macb"; + compatible = "atmel,sama5d3-macb", "cdns,at91sam9260-macb", "cdns,macb"; reg = <0xf802c000 0x100>; interrupts = <35 IRQ_TYPE_LEVEL_HIGH 3>; pinctrl-names = "default";
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Stephen Hemminger stephen@networkplumber.org
[ Upstream commit a15f2c08c70811f120d99288d81f70d7f3d104f1 ]
The Hyper-V host API for PCI provides a unique "serial number" which can be used as basis for sysfs PCI slot table. This can be useful for cases where userspace wants to find the PCI device based on serial number.
When an SR-IOV NIC is added, the host sends an attach message with serial number. The kernel doesn't use the serial number, but it is useful when doing the same thing in a userspace driver such as the DPDK. By having /sys/bus/pci/slots/N it provides a direct way to find the matching PCI device.
There maybe some cases where serial number is not unique such as when using GPU's. But the PCI slot infrastructure will handle that.
This has a side effect which may also be useful. The common udev network device naming policy uses the slot information (rather than PCI address).
Signed-off-by: Stephen Hemminger sthemmin@microsoft.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin alexander.levin@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pci/host/pci-hyperv.c | 37 +++++++++++++++++++++++++++++++++++++ 1 file changed, 37 insertions(+)
--- a/drivers/pci/host/pci-hyperv.c +++ b/drivers/pci/host/pci-hyperv.c @@ -100,6 +100,9 @@ static enum pci_protocol_version_t pci_p
#define STATUS_REVISION_MISMATCH 0xC0000059
+/* space for 32bit serial number as string */ +#define SLOT_NAME_SIZE 11 + /* * Message Types */ @@ -516,6 +519,7 @@ struct hv_pci_dev { struct list_head list_entry; refcount_t refs; enum hv_pcichild_state state; + struct pci_slot *pci_slot; struct pci_function_description desc; bool reported_missing; struct hv_pcibus_device *hbus; @@ -1481,6 +1485,34 @@ static void prepopulate_bars(struct hv_p spin_unlock_irqrestore(&hbus->device_list_lock, flags); }
+/* + * Assign entries in sysfs pci slot directory. + * + * Note that this function does not need to lock the children list + * because it is called from pci_devices_present_work which + * is serialized with hv_eject_device_work because they are on the + * same ordered workqueue. Therefore hbus->children list will not change + * even when pci_create_slot sleeps. + */ +static void hv_pci_assign_slots(struct hv_pcibus_device *hbus) +{ + struct hv_pci_dev *hpdev; + char name[SLOT_NAME_SIZE]; + int slot_nr; + + list_for_each_entry(hpdev, &hbus->children, list_entry) { + if (hpdev->pci_slot) + continue; + + slot_nr = PCI_SLOT(wslot_to_devfn(hpdev->desc.win_slot.slot)); + snprintf(name, SLOT_NAME_SIZE, "%u", hpdev->desc.ser); + hpdev->pci_slot = pci_create_slot(hbus->pci_bus, slot_nr, + name, NULL); + if (!hpdev->pci_slot) + pr_warn("pci_create slot %s failed\n", name); + } +} + /** * create_root_hv_pci_bus() - Expose a new root PCI bus * @hbus: Root PCI bus, as understood by this driver @@ -1504,6 +1536,7 @@ static int create_root_hv_pci_bus(struct pci_lock_rescan_remove(); pci_scan_child_bus(hbus->pci_bus); pci_bus_assign_resources(hbus->pci_bus); + hv_pci_assign_slots(hbus); pci_bus_add_devices(hbus->pci_bus); pci_unlock_rescan_remove(); hbus->state = hv_pcibus_installed; @@ -1787,6 +1820,7 @@ static void pci_devices_present_work(str */ pci_lock_rescan_remove(); pci_scan_child_bus(hbus->pci_bus); + hv_pci_assign_slots(hbus); pci_unlock_rescan_remove(); break;
@@ -1895,6 +1929,9 @@ static void hv_eject_device_work(struct list_del(&hpdev->list_entry); spin_unlock_irqrestore(&hpdev->hbus->device_list_lock, flags);
+ if (hpdev->pci_slot) + pci_destroy_slot(hpdev->pci_slot); + memset(&ctxt, 0, sizeof(ctxt)); ejct_pkt = (struct pci_eject_response *)&ctxt.pkt.message; ejct_pkt->message_type.type = PCI_EJECTION_COMPLETE;
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Hans de Goede hdegoede@redhat.com
[ Upstream commit b1e3454d39f992e5409cd19f97782185950df6e7 ]
Commit d31fd43c0f9a ("clk: x86: Do not gate clocks enabled by the firmware") causes all unclaimed PMC clocks on Cherry Trail devices to be on all the time, resulting on the device not being able to reach S0i2 or S0i3 when suspended.
The reason for this commit is that on some Bay Trail / Cherry Trail devices the ethernet controller uses pmc_plt_clk_4. This commit adds an "ether_clk" alias, so that the relevant ethernet drivers can try to (optionally) use this, without needing X86 specific code / hacks, thus fixing ethernet on these devices without breaking S0i3 support.
This commit uses clkdev_hw_create() to create the alias, mirroring the code for the already existing "mclk" alias for pmc_plt_clk_3.
Buglink: https://bugzilla.kernel.org/show_bug.cgi?id=193891#c102 Buglink: https://bugzilla.kernel.org/show_bug.cgi?id=196861 Cc: Johannes Stezenbach js@sig21.net Cc: Carlo Caione carlo@endlessm.com Reported-by: Johannes Stezenbach js@sig21.net Acked-by: Stephen Boyd sboyd@kernel.org Reviewed-by: Andy Shevchenko andriy.shevchenko@linux.intel.com Signed-off-by: Hans de Goede hdegoede@redhat.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin alexander.levin@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/clk/x86/clk-pmc-atom.c | 11 +++++++++++ 1 file changed, 11 insertions(+)
--- a/drivers/clk/x86/clk-pmc-atom.c +++ b/drivers/clk/x86/clk-pmc-atom.c @@ -55,6 +55,7 @@ struct clk_plt_data { u8 nparents; struct clk_plt *clks[PMC_CLK_NUM]; struct clk_lookup *mclk_lookup; + struct clk_lookup *ether_clk_lookup; };
/* Return an index in parent table */ @@ -351,11 +352,20 @@ static int plt_clk_probe(struct platform goto err_unreg_clk_plt; }
+ data->ether_clk_lookup = clkdev_hw_create(&data->clks[4]->hw, + "ether_clk", NULL); + if (!data->ether_clk_lookup) { + err = -ENOMEM; + goto err_drop_mclk; + } + plt_clk_free_parent_names_loop(parent_names, data->nparents);
platform_set_drvdata(pdev, data); return 0;
+err_drop_mclk: + clkdev_drop(data->mclk_lookup); err_unreg_clk_plt: plt_clk_unregister_loop(data, i); plt_clk_unregister_parents(data); @@ -369,6 +379,7 @@ static int plt_clk_remove(struct platfor
data = platform_get_drvdata(pdev);
+ clkdev_drop(data->ether_clk_lookup); clkdev_drop(data->mclk_lookup); plt_clk_unregister_loop(data, PMC_CLK_NUM); plt_clk_unregister_parents(data);
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Hans de Goede hdegoede@redhat.com
[ Upstream commit 648e921888ad96ea3dc922739e96716ad3225d7f ]
Commit d31fd43c0f9a ("clk: x86: Do not gate clocks enabled by the firmware"), which added the code to mark clocks as CLK_IS_CRITICAL, causes all unclaimed PMC clocks on Cherry Trail devices to be on all the time, resulting on the device not being able to reach S0i3 when suspended.
The reason for this commit is that on some Bay Trail / Cherry Trail devices the r8169 ethernet controller uses pmc_plt_clk_4. Now that the clk-pmc-atom driver exports an "ether_clk" alias for pmc_plt_clk_4 and the r8169 driver has been modified to get and enable this clock (if present) the marking of the clocks as CLK_IS_CRITICAL is no longer necessary.
This commit removes the CLK_IS_CRITICAL marking, fixing Cherry Trail devices not being able to reach S0i3 greatly decreasing their battery drain when suspended.
Buglink: https://bugzilla.kernel.org/show_bug.cgi?id=193891#c102 Buglink: https://bugzilla.kernel.org/show_bug.cgi?id=196861 Cc: Johannes Stezenbach js@sig21.net Cc: Carlo Caione carlo@endlessm.com Reported-by: Johannes Stezenbach js@sig21.net Reviewed-by: Andy Shevchenko andriy.shevchenko@linux.intel.com Acked-by: Stephen Boyd sboyd@kernel.org Signed-off-by: Hans de Goede hdegoede@redhat.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin alexander.levin@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/clk/x86/clk-pmc-atom.c | 7 ------- 1 file changed, 7 deletions(-)
--- a/drivers/clk/x86/clk-pmc-atom.c +++ b/drivers/clk/x86/clk-pmc-atom.c @@ -187,13 +187,6 @@ static struct clk_plt *plt_clk_register( pclk->reg = base + PMC_CLK_CTL_OFFSET + id * PMC_CLK_CTL_SIZE; spin_lock_init(&pclk->lock);
- /* - * If the clock was already enabled by the firmware mark it as critical - * to avoid it being gated by the clock framework if no driver owns it. - */ - if (plt_clk_is_enabled(&pclk->hw)) - init.flags |= CLK_IS_CRITICAL; - ret = devm_clk_hw_register(&pdev->dev, &pclk->hw); if (ret) { pclk = ERR_PTR(ret);
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Vitaly Kuznetsov vkuznets@redhat.com
[ Upstream commit d1766202779e81d0f2a94c4650a6ba31497d369d ]
When VMX is used with flexpriority disabled (because of no support or if disabled with module parameter) MMIO interface to lAPIC is still available in x2APIC mode while it shouldn't be (kvm-unit-tests):
PASS: apic_disable: Local apic enabled in x2APIC mode PASS: apic_disable: CPUID.1H:EDX.APIC[bit 9] is set FAIL: apic_disable: *0xfee00030: 50014
The issue appears because we basically do nothing while switching to x2APIC mode when APIC access page is not used. apic_mmio_{read,write} only check if lAPIC is disabled before proceeding to actual write.
When APIC access is virtualized we correctly manipulate with VMX controls in vmx_set_virtual_apic_mode() and we don't get vmexits from memory writes in x2APIC mode so there's no issue.
Disabling MMIO interface seems to be easy. The question is: what do we do with these reads and writes? If we add apic_x2apic_mode() check to apic_mmio_in_range() and return -EOPNOTSUPP these reads and writes will go to userspace. When lAPIC is in kernel, Qemu uses this interface to inject MSIs only (see kvm_apic_mem_write() in hw/i386/kvm/apic.c). This somehow works with disabled lAPIC but when we're in xAPIC mode we will get a real injected MSI from every write to lAPIC. Not good.
The simplest solution seems to be to just ignore writes to the region and return ~0 for all reads when we're in x2APIC mode. This is what this patch does. However, this approach is inconsistent with what currently happens when flexpriority is enabled: we allocate APIC access page and create KVM memory region so in x2APIC modes all reads and writes go to this pre-allocated page which is, btw, the same for all vCPUs.
Signed-off-by: Vitaly Kuznetsov vkuznets@redhat.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Sasha Levin alexander.levin@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/include/uapi/asm/kvm.h | 1 + arch/x86/kvm/lapic.c | 22 +++++++++++++++++++--- 2 files changed, 20 insertions(+), 3 deletions(-)
--- a/arch/x86/include/uapi/asm/kvm.h +++ b/arch/x86/include/uapi/asm/kvm.h @@ -360,5 +360,6 @@ struct kvm_sync_regs {
#define KVM_X86_QUIRK_LINT0_REENABLED (1 << 0) #define KVM_X86_QUIRK_CD_NW_CLEARED (1 << 1) +#define KVM_X86_QUIRK_LAPIC_MMIO_HOLE (1 << 2)
#endif /* _ASM_X86_KVM_H */ --- a/arch/x86/kvm/lapic.c +++ b/arch/x86/kvm/lapic.c @@ -1282,9 +1282,8 @@ EXPORT_SYMBOL_GPL(kvm_lapic_reg_read);
static int apic_mmio_in_range(struct kvm_lapic *apic, gpa_t addr) { - return kvm_apic_hw_enabled(apic) && - addr >= apic->base_address && - addr < apic->base_address + LAPIC_MMIO_LENGTH; + return addr >= apic->base_address && + addr < apic->base_address + LAPIC_MMIO_LENGTH; }
static int apic_mmio_read(struct kvm_vcpu *vcpu, struct kvm_io_device *this, @@ -1296,6 +1295,15 @@ static int apic_mmio_read(struct kvm_vcp if (!apic_mmio_in_range(apic, address)) return -EOPNOTSUPP;
+ if (!kvm_apic_hw_enabled(apic) || apic_x2apic_mode(apic)) { + if (!kvm_check_has_quirk(vcpu->kvm, + KVM_X86_QUIRK_LAPIC_MMIO_HOLE)) + return -EOPNOTSUPP; + + memset(data, 0xff, len); + return 0; + } + kvm_lapic_reg_read(apic, offset, len, data);
return 0; @@ -1806,6 +1814,14 @@ static int apic_mmio_write(struct kvm_vc if (!apic_mmio_in_range(apic, address)) return -EOPNOTSUPP;
+ if (!kvm_apic_hw_enabled(apic) || apic_x2apic_mode(apic)) { + if (!kvm_check_has_quirk(vcpu->kvm, + KVM_X86_QUIRK_LAPIC_MMIO_HOLE)) + return -EOPNOTSUPP; + + return 0; + } + /* * APIC register must be aligned on 128-bits boundary. * 32/64/128 bits registers must be accessed thru 32 bits.
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Amber Lin Amber.Lin@amd.com
[ Upstream commit caaa4c8a6be2a275bd14f2369ee364978ff74704 ]
A wrong register bit was examinated for checking SDMA status so it reports false failures. This typo only appears on gfx_v7. gfx_v8 checks the correct bit.
Acked-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Amber Lin Amber.Lin@amd.com Reviewed-by: Felix Kuehling Felix.Kuehling@amd.com Signed-off-by: Felix Kuehling Felix.Kuehling@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin alexander.levin@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c @@ -576,7 +576,7 @@ static int kgd_hqd_sdma_destroy(struct k
while (true) { temp = RREG32(sdma_base_addr + mmSDMA0_RLC0_CONTEXT_STATUS); - if (temp & SDMA0_STATUS_REG__RB_CMD_IDLE__SHIFT) + if (temp & SDMA0_RLC0_CONTEXT_STATUS__IDLE_MASK) break; if (timeout <= 0) return -ETIME;
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jann Horn jannh@google.com
commit 28e2c4bb99aa40f9d5f07ac130cbc4da0ea93079 upstream.
7a9cdebdcc17 ("mm: get rid of vmacache_flush_all() entirely") removed the VMACACHE_FULL_FLUSHES statistics, but didn't remove the corresponding entry in vmstat_text. This causes an out-of-bounds access in vmstat_show().
Luckily this only affects kernels with CONFIG_DEBUG_VM_VMACACHE=y, which is probably very rare.
Link: http://lkml.kernel.org/r/20181001143138.95119-1-jannh@google.com Fixes: 7a9cdebdcc17 ("mm: get rid of vmacache_flush_all() entirely") Signed-off-by: Jann Horn jannh@google.com Reviewed-by: Kees Cook keescook@chromium.org Reviewed-by: Andrew Morton akpm@linux-foundation.org Acked-by: Michal Hocko mhocko@suse.com Acked-by: Roman Gushchin guro@fb.com Cc: Davidlohr Bueso dave@stgolabs.net Cc: Oleg Nesterov oleg@redhat.com Cc: Christoph Lameter clameter@sgi.com Cc: Kemi Wang kemi.wang@intel.com Cc: Andy Lutomirski luto@kernel.org Cc: Ingo Molnar mingo@kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- mm/vmstat.c | 1 - 1 file changed, 1 deletion(-)
--- a/mm/vmstat.c +++ b/mm/vmstat.c @@ -1214,7 +1214,6 @@ const char * const vmstat_text[] = { #ifdef CONFIG_DEBUG_VM_VMACACHE "vmacache_find_calls", "vmacache_find_hits", - "vmacache_full_flushes", #endif #ifdef CONFIG_SWAP "swap_ra",
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Paul Burton paul.burton@mips.com
commit ea7e0480a4b695d0aa6b3fa99bd658a003122113 upstream.
When using the legacy mmap layout, for example triggered using ulimit -s unlimited, get_unmapped_area() fills memory from bottom to top starting from a fairly low address near TASK_UNMAPPED_BASE.
This placement is suboptimal if the user application wishes to allocate large amounts of heap memory using the brk syscall. With the VDSO being located low in the user's virtual address space, the amount of space available for access using brk is limited much more than it was prior to the introduction of the VDSO.
For example:
# ulimit -s unlimited; cat /proc/self/maps 00400000-004ec000 r-xp 00000000 08:00 71436 /usr/bin/coreutils 004fc000-004fd000 rwxp 000ec000 08:00 71436 /usr/bin/coreutils 004fd000-0050f000 rwxp 00000000 00:00 0 00cc3000-00ce4000 rwxp 00000000 00:00 0 [heap] 2ab96000-2ab98000 r--p 00000000 00:00 0 [vvar] 2ab98000-2ab99000 r-xp 00000000 00:00 0 [vdso] 2ab99000-2ab9d000 rwxp 00000000 00:00 0 ...
Resolve this by adjusting STACK_TOP to reserve space for the VDSO & providing an address hint to get_unmapped_area() causing it to use this space even when using the legacy mmap layout.
We reserve enough space for the VDSO, plus 1MB or 256MB for 32 bit & 64 bit systems respectively within which we randomize the VDSO base address. Previously this randomization was taken care of by the mmap base address randomization performed by arch_mmap_rnd(). The 1MB & 256MB sizes are somewhat arbitrary but chosen such that we have some randomization without taking up too much of the user's virtual address space, which is often in short supply for 32 bit systems.
With this the VDSO is always mapped at a high address, leaving lots of space for statically linked programs to make use of brk:
# ulimit -s unlimited; cat /proc/self/maps 00400000-004ec000 r-xp 00000000 08:00 71436 /usr/bin/coreutils 004fc000-004fd000 rwxp 000ec000 08:00 71436 /usr/bin/coreutils 004fd000-0050f000 rwxp 00000000 00:00 0 00c28000-00c49000 rwxp 00000000 00:00 0 [heap] ... 7f67c000-7f69d000 rwxp 00000000 00:00 0 [stack] 7f7fc000-7f7fd000 rwxp 00000000 00:00 0 7fcf1000-7fcf3000 r--p 00000000 00:00 0 [vvar] 7fcf3000-7fcf4000 r-xp 00000000 00:00 0 [vdso]
Signed-off-by: Paul Burton paul.burton@mips.com Reported-by: Huacai Chen chenhc@lemote.com Fixes: ebb5e78cc634 ("MIPS: Initial implementation of a VDSO") Cc: Huacai Chen chenhc@lemote.com Cc: linux-mips@linux-mips.org Cc: stable@vger.kernel.org # v4.4+ Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/mips/include/asm/processor.h | 10 +++++----- arch/mips/kernel/process.c | 25 +++++++++++++++++++++++++ arch/mips/kernel/vdso.c | 18 +++++++++++++++++- 3 files changed, 47 insertions(+), 6 deletions(-)
--- a/arch/mips/include/asm/processor.h +++ b/arch/mips/include/asm/processor.h @@ -13,6 +13,7 @@
#include <linux/atomic.h> #include <linux/cpumask.h> +#include <linux/sizes.h> #include <linux/threads.h>
#include <asm/cachectl.h> @@ -80,11 +81,10 @@ extern unsigned int vced_count, vcei_cou
#endif
-/* - * One page above the stack is used for branch delay slot "emulation". - * See dsemul.c for details. - */ -#define STACK_TOP ((TASK_SIZE & PAGE_MASK) - PAGE_SIZE) +#define VDSO_RANDOMIZE_SIZE (TASK_IS_32BIT_ADDR ? SZ_1M : SZ_256M) + +extern unsigned long mips_stack_top(void); +#define STACK_TOP mips_stack_top()
/* * This decides where the kernel will search for a free chunk of vm --- a/arch/mips/kernel/process.c +++ b/arch/mips/kernel/process.c @@ -31,6 +31,7 @@ #include <linux/prctl.h> #include <linux/nmi.h>
+#include <asm/abi.h> #include <asm/asm.h> #include <asm/bootinfo.h> #include <asm/cpu.h> @@ -38,6 +39,7 @@ #include <asm/dsp.h> #include <asm/fpu.h> #include <asm/irq.h> +#include <asm/mips-cps.h> #include <asm/msa.h> #include <asm/pgtable.h> #include <asm/mipsregs.h> @@ -644,6 +646,29 @@ out: return pc; }
+unsigned long mips_stack_top(void) +{ + unsigned long top = TASK_SIZE & PAGE_MASK; + + /* One page for branch delay slot "emulation" */ + top -= PAGE_SIZE; + + /* Space for the VDSO, data page & GIC user page */ + top -= PAGE_ALIGN(current->thread.abi->vdso->size); + top -= PAGE_SIZE; + top -= mips_gic_present() ? PAGE_SIZE : 0; + + /* Space for cache colour alignment */ + if (cpu_has_dc_aliases) + top -= shm_align_mask + 1; + + /* Space to randomize the VDSO base */ + if (current->flags & PF_RANDOMIZE) + top -= VDSO_RANDOMIZE_SIZE; + + return top; +} + /* * Don't forget that the stack pointer must be aligned on a 8 bytes * boundary for 32-bits ABI and 16 bytes for 64-bits ABI. --- a/arch/mips/kernel/vdso.c +++ b/arch/mips/kernel/vdso.c @@ -15,6 +15,7 @@ #include <linux/ioport.h> #include <linux/kernel.h> #include <linux/mm.h> +#include <linux/random.h> #include <linux/sched.h> #include <linux/slab.h> #include <linux/timekeeper_internal.h> @@ -97,6 +98,21 @@ void update_vsyscall_tz(void) } }
+static unsigned long vdso_base(void) +{ + unsigned long base; + + /* Skip the delay slot emulation page */ + base = STACK_TOP + PAGE_SIZE; + + if (current->flags & PF_RANDOMIZE) { + base += get_random_int() & (VDSO_RANDOMIZE_SIZE - 1); + base = PAGE_ALIGN(base); + } + + return base; +} + int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp) { struct mips_vdso_image *image = current->thread.abi->vdso; @@ -137,7 +153,7 @@ int arch_setup_additional_pages(struct l if (cpu_has_dc_aliases) size += shm_align_mask + 1;
- base = get_unmapped_area(NULL, 0, size, 0, 0); + base = get_unmapped_area(NULL, vdso_base(), size, 0, 0); if (IS_ERR_VALUE(base)) { ret = base; goto out;
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mikulas Patocka mpatocka@redhat.com
commit 76ebebd2464c5c8a4453c98b6dbf9c95a599e810 upstream.
On Sun Ultra 5, it happens that the dot clock is not set up properly for some videomodes. For example, if we set the videomode "r1024x768x60" in the firmware, Linux would incorrectly set a videomode with refresh rate 180Hz when booting (suprisingly, my LCD monitor can display it, although display quality is very low).
The reason is this: Older mach64 cards set the divider in the register VCLK_POST_DIV. The register has four 2-bit fields (the field that is actually used is specified in the lowest two bits of the register CLOCK_CNTL). The 2 bits select divider "1, 2, 4, 8". On newer mach64 cards, there's another bit added - the top four bits of PLL_EXT_CNTL extend the divider selection, so we have possible dividers "1, 2, 4, 8, 3, 5, 6, 12". The Linux driver clears the top four bits of PLL_EXT_CNTL and never sets them, so it can work regardless if the card supports them. However, the sparc64 firmware may set these extended dividers during boot - and the mach64 driver detects incorrect dot clock in this case.
This patch makes the driver read the additional divider bit from PLL_EXT_CNTL and calculate the initial refresh rate properly.
Signed-off-by: Mikulas Patocka mpatocka@redhat.com Cc: stable@vger.kernel.org Acked-by: David S. Miller davem@davemloft.net Reviewed-by: Ville Syrjälä syrjala@sci.fi Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/video/fbdev/aty/atyfb.h | 3 ++- drivers/video/fbdev/aty/atyfb_base.c | 7 ++++--- drivers/video/fbdev/aty/mach64_ct.c | 10 +++++----- 3 files changed, 11 insertions(+), 9 deletions(-)
--- a/drivers/video/fbdev/aty/atyfb.h +++ b/drivers/video/fbdev/aty/atyfb.h @@ -333,6 +333,8 @@ extern const struct aty_pll_ops aty_pll_ extern void aty_set_pll_ct(const struct fb_info *info, const union aty_pll *pll); extern u8 aty_ld_pll_ct(int offset, const struct atyfb_par *par);
+extern const u8 aty_postdividers[8]; +
/* * Hardware cursor support @@ -359,7 +361,6 @@ static inline void wait_for_idle(struct
extern void aty_reset_engine(const struct atyfb_par *par); extern void aty_init_engine(struct atyfb_par *par, struct fb_info *info); -extern u8 aty_ld_pll_ct(int offset, const struct atyfb_par *par);
void atyfb_copyarea(struct fb_info *info, const struct fb_copyarea *area); void atyfb_fillrect(struct fb_info *info, const struct fb_fillrect *rect); --- a/drivers/video/fbdev/aty/atyfb_base.c +++ b/drivers/video/fbdev/aty/atyfb_base.c @@ -3087,17 +3087,18 @@ static int atyfb_setup_sparc(struct pci_ /* * PLL Reference Divider M: */ - M = pll_regs[2]; + M = pll_regs[PLL_REF_DIV];
/* * PLL Feedback Divider N (Dependent on CLOCK_CNTL): */ - N = pll_regs[7 + (clock_cntl & 3)]; + N = pll_regs[VCLK0_FB_DIV + (clock_cntl & 3)];
/* * PLL Post Divider P (Dependent on CLOCK_CNTL): */ - P = 1 << (pll_regs[6] >> ((clock_cntl & 3) << 1)); + P = aty_postdividers[((pll_regs[VCLK_POST_DIV] >> ((clock_cntl & 3) << 1)) & 3) | + ((pll_regs[PLL_EXT_CNTL] >> (2 + (clock_cntl & 3))) & 4)];
/* * PLL Divider Q: --- a/drivers/video/fbdev/aty/mach64_ct.c +++ b/drivers/video/fbdev/aty/mach64_ct.c @@ -115,7 +115,7 @@ static void aty_st_pll_ct(int offset, u8 */
#define Maximum_DSP_PRECISION 7 -static u8 postdividers[] = {1,2,4,8,3}; +const u8 aty_postdividers[8] = {1,2,4,8,3,5,6,12};
static int aty_dsp_gt(const struct fb_info *info, u32 bpp, struct pll_ct *pll) { @@ -222,7 +222,7 @@ static int aty_valid_pll_ct(const struct pll->vclk_post_div += (q < 64*8); pll->vclk_post_div += (q < 32*8); } - pll->vclk_post_div_real = postdividers[pll->vclk_post_div]; + pll->vclk_post_div_real = aty_postdividers[pll->vclk_post_div]; // pll->vclk_post_div <<= 6; pll->vclk_fb_div = q * pll->vclk_post_div_real / 8; pllvclk = (1000000 * 2 * pll->vclk_fb_div) / @@ -513,7 +513,7 @@ static int aty_init_pll_ct(const struct u8 mclk_fb_div, pll_ext_cntl; pll->ct.pll_ref_div = aty_ld_pll_ct(PLL_REF_DIV, par); pll_ext_cntl = aty_ld_pll_ct(PLL_EXT_CNTL, par); - pll->ct.xclk_post_div_real = postdividers[pll_ext_cntl & 0x07]; + pll->ct.xclk_post_div_real = aty_postdividers[pll_ext_cntl & 0x07]; mclk_fb_div = aty_ld_pll_ct(MCLK_FB_DIV, par); if (pll_ext_cntl & PLL_MFB_TIMES_4_2B) mclk_fb_div <<= 1; @@ -535,7 +535,7 @@ static int aty_init_pll_ct(const struct xpost_div += (q < 64*8); xpost_div += (q < 32*8); } - pll->ct.xclk_post_div_real = postdividers[xpost_div]; + pll->ct.xclk_post_div_real = aty_postdividers[xpost_div]; pll->ct.mclk_fb_div = q * pll->ct.xclk_post_div_real / 8;
#ifdef CONFIG_PPC @@ -584,7 +584,7 @@ static int aty_init_pll_ct(const struct mpost_div += (q < 64*8); mpost_div += (q < 32*8); } - sclk_post_div_real = postdividers[mpost_div]; + sclk_post_div_real = aty_postdividers[mpost_div]; pll->ct.sclk_fb_div = q * sclk_post_div_real / 8; pll->ct.spll_cntl2 = mpost_div << 4; #ifdef DEBUG
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mike Rapoport rppt@linux.vnet.ibm.com
commit 6685b357363bfe295e3ae73665014db4aed62c58 upstream.
The commit ca460b3c9627 ("percpu: introduce bitmap metadata blocks") introduced bitmap metadata blocks. These metadata blocks are allocated whenever a new chunk is created, but they are never freed. Fix it.
Fixes: ca460b3c9627 ("percpu: introduce bitmap metadata blocks") Signed-off-by: Mike Rapoport rppt@linux.vnet.ibm.com Cc: stable@vger.kernel.org Signed-off-by: Dennis Zhou dennis@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- mm/percpu.c | 1 + 1 file changed, 1 insertion(+)
--- a/mm/percpu.c +++ b/mm/percpu.c @@ -1208,6 +1208,7 @@ static void pcpu_free_chunk(struct pcpu_ { if (!chunk) return; + pcpu_mem_free(chunk->md_blocks); pcpu_mem_free(chunk->bound_map); pcpu_mem_free(chunk->alloc_map); pcpu_mem_free(chunk);
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Adrian Hunter adrian.hunter@intel.com
commit 25e11700b54c7b6b5ebfc4361981dae12299557b upstream.
Occasional export failures were found to be caused by truncating 64-bit pointers to 32-bits. Fix by explicitly setting types for all ctype arguments and results.
Signed-off-by: Adrian Hunter adrian.hunter@intel.com Cc: Jiri Olsa jolsa@redhat.com Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/20180911114504.28516-2-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- tools/perf/scripts/python/export-to-postgresql.py | 9 +++++++++ 1 file changed, 9 insertions(+)
--- a/tools/perf/scripts/python/export-to-postgresql.py +++ b/tools/perf/scripts/python/export-to-postgresql.py @@ -204,14 +204,23 @@ from ctypes import * libpq = CDLL("libpq.so.5") PQconnectdb = libpq.PQconnectdb PQconnectdb.restype = c_void_p +PQconnectdb.argtypes = [ c_char_p ] PQfinish = libpq.PQfinish +PQfinish.argtypes = [ c_void_p ] PQstatus = libpq.PQstatus +PQstatus.restype = c_int +PQstatus.argtypes = [ c_void_p ] PQexec = libpq.PQexec PQexec.restype = c_void_p +PQexec.argtypes = [ c_void_p, c_char_p ] PQresultStatus = libpq.PQresultStatus +PQresultStatus.restype = c_int +PQresultStatus.argtypes = [ c_void_p ] PQputCopyData = libpq.PQputCopyData +PQputCopyData.restype = c_int PQputCopyData.argtypes = [ c_void_p, c_void_p, c_int ] PQputCopyEnd = libpq.PQputCopyEnd +PQputCopyEnd.restype = c_int PQputCopyEnd.argtypes = [ c_void_p, c_void_p ]
sys.path.append(os.environ['PERF_EXEC_PATH'] + \
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Adrian Hunter adrian.hunter@intel.com
commit d005efe18db0b4a123dd92ea8e77e27aee8f99fd upstream.
With the "branches" export option, not all sample columns are exported. However the unwanted columns are not at the end of the tuple, as assumed by the code. Fix by taking the first 15 and last 3 values, instead of the first 18.
Signed-off-by: Adrian Hunter adrian.hunter@intel.com Cc: Jiri Olsa jolsa@redhat.com Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/20180911114504.28516-3-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- tools/perf/scripts/python/export-to-sqlite.py | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
--- a/tools/perf/scripts/python/export-to-sqlite.py +++ b/tools/perf/scripts/python/export-to-sqlite.py @@ -440,7 +440,11 @@ def branch_type_table(*x):
def sample_table(*x): if branches: - bind_exec(sample_query, 18, x) + for xx in x[0:15]: + sample_query.addBindValue(str(xx)) + for xx in x[19:22]: + sample_query.addBindValue(str(xx)) + do_query_(sample_query) else: bind_exec(sample_query, 22, x)
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Farman farman@linux.ibm.com
commit 24abf2901b18bf941b9f21ea2ce5791f61097ae4 upstream.
We have two nested loops to check the entries within the pfn_array_table arrays. But we mistakenly use the outer array as an index in our check, and completely ignore the indexing performed by the inner loop.
Cc: stable@vger.kernel.org Signed-off-by: Eric Farman farman@linux.ibm.com Message-Id: 20181002010235.42483-1-farman@linux.ibm.com Signed-off-by: Cornelia Huck cohuck@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/s390/cio/vfio_ccw_cp.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/s390/cio/vfio_ccw_cp.c +++ b/drivers/s390/cio/vfio_ccw_cp.c @@ -172,7 +172,7 @@ static bool pfn_array_table_iova_pinned(
for (i = 0; i < pat->pat_nr; i++, pa++) for (j = 0; j < pa->pa_nr; j++) - if (pa->pa_iova_pfn[i] == iova_pfn) + if (pa->pa_iova_pfn[j] == iova_pfn) return true;
return false;
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Shenghui Wang shhuiw@foxmail.com
commit c7cd55504a5b0fc826a2cd9540845979d24ae542 upstream.
Commit 7e6358d244e47 ("dm: fix various targets to dm_register_target after module __init resources created") inadvertently introduced this bug when it moved dm_register_target() after the call to KMEM_CACHE().
Fixes: 7e6358d244e47 ("dm: fix various targets to dm_register_target after module __init resources created") Cc: stable@vger.kernel.org Signed-off-by: Shenghui Wang shhuiw@foxmail.com Signed-off-by: Mike Snitzer snitzer@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/md/dm-cache-target.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-)
--- a/drivers/md/dm-cache-target.c +++ b/drivers/md/dm-cache-target.c @@ -3571,14 +3571,13 @@ static int __init dm_cache_init(void) int r;
migration_cache = KMEM_CACHE(dm_cache_migration, 0); - if (!migration_cache) { - dm_unregister_target(&cache_target); + if (!migration_cache) return -ENOMEM; - }
r = dm_register_target(&cache_target); if (r) { DMERR("cache target registration failed: %d", r); + kmem_cache_destroy(migration_cache); return r; }
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Damien Le Moal damien.lemoal@wdc.com
commit 9864cd5dc54cade89fd4b0954c2e522841aa247c upstream.
If dm-linear or dm-flakey are layered on top of a partition of a zoned block device, remapping of the start sector and write pointer position of the zones reported by a report zones BIO must be modified to account for the target table entry mapping (start offset within the device and entry mapping with the dm device). If the target's backing device is a partition of a whole disk, the start sector on the physical device of the partition must also be accounted for when modifying the zone information. However, dm_remap_zone_report() was not considering this last case, resulting in incorrect zone information remapping with targets using disk partitions.
Fix this by calculating the target backing device start sector using the position of the completed report zones BIO and the unchanged position and size of the original report zone BIO. With this value calculated, the start sector and write pointer position of the target zones can be correctly remapped.
Fixes: 10999307c14e ("dm: introduce dm_remap_zone_report()") Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal damien.lemoal@wdc.com Signed-off-by: Mike Snitzer snitzer@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/md/dm.c | 27 ++++++++++++++++++++------- 1 file changed, 20 insertions(+), 7 deletions(-)
--- a/drivers/md/dm.c +++ b/drivers/md/dm.c @@ -1034,12 +1034,14 @@ void dm_accept_partial_bio(struct bio *b EXPORT_SYMBOL_GPL(dm_accept_partial_bio);
/* - * The zone descriptors obtained with a zone report indicate - * zone positions within the target device. The zone descriptors - * must be remapped to match their position within the dm device. - * A target may call dm_remap_zone_report after completion of a - * REQ_OP_ZONE_REPORT bio to remap the zone descriptors obtained - * from the target device mapping to the dm device. + * The zone descriptors obtained with a zone report indicate zone positions + * within the target backing device, regardless of that device is a partition + * and regardless of the target mapping start sector on the device or partition. + * The zone descriptors start sector and write pointer position must be adjusted + * to match their relative position within the dm device. + * A target may call dm_remap_zone_report() after completion of a + * REQ_OP_ZONE_REPORT bio to remap the zone descriptors obtained from the + * backing device. */ void dm_remap_zone_report(struct dm_target *ti, struct bio *bio, sector_t start) { @@ -1050,6 +1052,7 @@ void dm_remap_zone_report(struct dm_targ struct blk_zone *zone; unsigned int nr_rep = 0; unsigned int ofst; + sector_t part_offset; struct bio_vec bvec; struct bvec_iter iter; void *addr; @@ -1058,6 +1061,15 @@ void dm_remap_zone_report(struct dm_targ return;
/* + * bio sector was incremented by the request size on completion. Taking + * into account the original request sector, the target start offset on + * the backing device and the target mapping offset (ti->begin), the + * start sector of the backing device. The partition offset is always 0 + * if the target uses a whole device. + */ + part_offset = bio->bi_iter.bi_sector + ti->begin - (start + bio_end_sector(report_bio)); + + /* * Remap the start sector of the reported zones. For sequential zones, * also remap the write pointer position. */ @@ -1074,6 +1086,7 @@ void dm_remap_zone_report(struct dm_targ /* Set zones start sector */ while (hdr->nr_zones && ofst < bvec.bv_len) { zone = addr + ofst; + zone->start -= part_offset; if (zone->start >= start + ti->len) { hdr->nr_zones = 0; break; @@ -1085,7 +1098,7 @@ void dm_remap_zone_report(struct dm_targ else if (zone->cond == BLK_ZONE_COND_EMPTY) zone->wp = zone->start; else - zone->wp = zone->wp + ti->begin - start; + zone->wp = zone->wp + ti->begin - start - part_offset; } ofst += sizeof(struct blk_zone); hdr->nr_zones--;
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mike Snitzer snitzer@redhat.com
commit beb9caac211c1be1bc118bb62d5cf09c4107e6a5 upstream.
It is best to avoid any extra overhead associated with bio completion. DM core will indirectly call a DM target's .end_io if it is defined. In the case of DM linear, there is no need to do so (for every bio that completes) if CONFIG_DM_ZONED is not enabled.
Avoiding an extra indirect call for every bio completion is very important for ensuring DM linear doesn't incur more overhead that further widens the performance gap between dm-linear and raw block devices.
Fixes: 0be12c1c7fce7 ("dm linear: add support for zoned block devices") Cc: stable@vger.kernel.org Signed-off-by: Mike Snitzer snitzer@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/md/dm-linear.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-)
--- a/drivers/md/dm-linear.c +++ b/drivers/md/dm-linear.c @@ -101,6 +101,7 @@ static int linear_map(struct dm_target * return DM_MAPIO_REMAPPED; }
+#ifdef CONFIG_DM_ZONED static int linear_end_io(struct dm_target *ti, struct bio *bio, blk_status_t *error) { @@ -111,6 +112,7 @@ static int linear_end_io(struct dm_targe
return DM_ENDIO_DONE; } +#endif
static void linear_status(struct dm_target *ti, status_type_t type, unsigned status_flags, char *result, unsigned maxlen) @@ -187,12 +189,16 @@ static size_t linear_dax_copy_from_iter( static struct target_type linear_target = { .name = "linear", .version = {1, 4, 0}, +#ifdef CONFIG_DM_ZONED + .end_io = linear_end_io, .features = DM_TARGET_PASSES_INTEGRITY | DM_TARGET_ZONED_HM, +#else + .features = DM_TARGET_PASSES_INTEGRITY, +#endif .module = THIS_MODULE, .ctr = linear_ctr, .dtr = linear_dtr, .map = linear_map, - .end_io = linear_end_io, .status = linear_status, .prepare_ioctl = linear_prepare_ioctl, .iterate_devices = linear_iterate_devices,
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Damien Le Moal damien.lemoal@wdc.com
commit 118aa47c7072bce05fc39bd40a1c0a90caed72ab upstream.
The dm-linear target is independent of the dm-zoned target. For code requiring support for zoned block devices, use CONFIG_BLK_DEV_ZONED instead of CONFIG_DM_ZONED.
While at it, similarly to dm linear, also enable the DM_TARGET_ZONED_HM feature in dm-flakey only if CONFIG_BLK_DEV_ZONED is defined.
Fixes: beb9caac211c1 ("dm linear: eliminate linear_end_io call if CONFIG_DM_ZONED disabled") Fixes: 0be12c1c7fce7 ("dm linear: add support for zoned block devices") Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal damien.lemoal@wdc.com Signed-off-by: Mike Snitzer snitzer@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/md/dm-flakey.c | 2 ++ drivers/md/dm-linear.c | 4 ++-- 2 files changed, 4 insertions(+), 2 deletions(-)
--- a/drivers/md/dm-flakey.c +++ b/drivers/md/dm-flakey.c @@ -463,7 +463,9 @@ static int flakey_iterate_devices(struct static struct target_type flakey_target = { .name = "flakey", .version = {1, 5, 0}, +#ifdef CONFIG_BLK_DEV_ZONED .features = DM_TARGET_ZONED_HM, +#endif .module = THIS_MODULE, .ctr = flakey_ctr, .dtr = flakey_dtr, --- a/drivers/md/dm-linear.c +++ b/drivers/md/dm-linear.c @@ -101,7 +101,7 @@ static int linear_map(struct dm_target * return DM_MAPIO_REMAPPED; }
-#ifdef CONFIG_DM_ZONED +#ifdef CONFIG_BLK_DEV_ZONED static int linear_end_io(struct dm_target *ti, struct bio *bio, blk_status_t *error) { @@ -189,7 +189,7 @@ static size_t linear_dax_copy_from_iter( static struct target_type linear_target = { .name = "linear", .version = {1, 4, 0}, -#ifdef CONFIG_DM_ZONED +#ifdef CONFIG_BLK_DEV_ZONED .end_io = linear_end_io, .features = DM_TARGET_PASSES_INTEGRITY | DM_TARGET_ZONED_HM, #else
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tejun Heo tj@kernel.org
commit 479adb89a97b0a33e5a9d702119872cc82ca21aa upstream.
A cgroup which is already a threaded domain may be converted into a threaded cgroup if the prerequisite conditions are met. When this happens, all threaded descendant should also have their ->dom_cgrp updated to the new threaded domain cgroup. Unfortunately, this propagation was missing leading to the following failure.
# cd /sys/fs/cgroup/unified # cat cgroup.subtree_control # show that no controllers are enabled
# mkdir -p mycgrp/a/b/c # echo threaded > mycgrp/a/b/cgroup.type
At this point, the hierarchy looks as follows:
mycgrp [d] a [dt] b [t] c [inv]
Now let's make node "a" threaded (and thus "mycgrp" s made "domain threaded"):
# echo threaded > mycgrp/a/cgroup.type
By this point, we now have a hierarchy that looks as follows:
mycgrp [dt] a [t] b [t] c [inv]
But, when we try to convert the node "c" from "domain invalid" to "threaded", we get ENOTSUP on the write():
# echo threaded > mycgrp/a/b/c/cgroup.type sh: echo: write error: Operation not supported
This patch fixes the problem by
* Moving the opencoded ->dom_cgrp save and restoration in cgroup_enable_threaded() into cgroup_{save|restore}_control() so that mulitple cgroups can be handled.
* Updating all threaded descendants' ->dom_cgrp to point to the new dom_cgrp when enabling threaded mode.
Signed-off-by: Tejun Heo tj@kernel.org Reported-and-tested-by: "Michael Kerrisk (man-pages)" mtk.manpages@gmail.com Reported-by: Amin Jamali ajamali@pivotal.io Reported-by: Joao De Almeida Pereira jpereira@pivotal.io Link: https://lore.kernel.org/r/CAKgNAkhHYCMn74TCNiMJ=ccLd7DcmXSbvw3CbZ1YREeG7iJM5... Fixes: 454000adaa2a ("cgroup: introduce cgroup->dom_cgrp and threaded css_set handling") Cc: stable@vger.kernel.org # v4.14+ Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- include/linux/cgroup-defs.h | 1 + kernel/cgroup/cgroup.c | 25 ++++++++++++++++--------- 2 files changed, 17 insertions(+), 9 deletions(-)
--- a/include/linux/cgroup-defs.h +++ b/include/linux/cgroup-defs.h @@ -353,6 +353,7 @@ struct cgroup { * specific task are charged to the dom_cgrp. */ struct cgroup *dom_cgrp; + struct cgroup *old_dom_cgrp; /* used while enabling threaded */
/* * list of pidlists, up to two for each namespace (one for procs, one --- a/kernel/cgroup/cgroup.c +++ b/kernel/cgroup/cgroup.c @@ -2780,11 +2780,12 @@ restart: }
/** - * cgroup_save_control - save control masks of a subtree + * cgroup_save_control - save control masks and dom_cgrp of a subtree * @cgrp: root of the target subtree * - * Save ->subtree_control and ->subtree_ss_mask to the respective old_ - * prefixed fields for @cgrp's subtree including @cgrp itself. + * Save ->subtree_control, ->subtree_ss_mask and ->dom_cgrp to the + * respective old_ prefixed fields for @cgrp's subtree including @cgrp + * itself. */ static void cgroup_save_control(struct cgroup *cgrp) { @@ -2794,6 +2795,7 @@ static void cgroup_save_control(struct c cgroup_for_each_live_descendant_pre(dsct, d_css, cgrp) { dsct->old_subtree_control = dsct->subtree_control; dsct->old_subtree_ss_mask = dsct->subtree_ss_mask; + dsct->old_dom_cgrp = dsct->dom_cgrp; } }
@@ -2819,11 +2821,12 @@ static void cgroup_propagate_control(str }
/** - * cgroup_restore_control - restore control masks of a subtree + * cgroup_restore_control - restore control masks and dom_cgrp of a subtree * @cgrp: root of the target subtree * - * Restore ->subtree_control and ->subtree_ss_mask from the respective old_ - * prefixed fields for @cgrp's subtree including @cgrp itself. + * Restore ->subtree_control, ->subtree_ss_mask and ->dom_cgrp from the + * respective old_ prefixed fields for @cgrp's subtree including @cgrp + * itself. */ static void cgroup_restore_control(struct cgroup *cgrp) { @@ -2833,6 +2836,7 @@ static void cgroup_restore_control(struc cgroup_for_each_live_descendant_post(dsct, d_css, cgrp) { dsct->subtree_control = dsct->old_subtree_control; dsct->subtree_ss_mask = dsct->old_subtree_ss_mask; + dsct->dom_cgrp = dsct->old_dom_cgrp; } }
@@ -3140,6 +3144,8 @@ static int cgroup_enable_threaded(struct { struct cgroup *parent = cgroup_parent(cgrp); struct cgroup *dom_cgrp = parent->dom_cgrp; + struct cgroup *dsct; + struct cgroup_subsys_state *d_css; int ret;
lockdep_assert_held(&cgroup_mutex); @@ -3169,12 +3175,13 @@ static int cgroup_enable_threaded(struct */ cgroup_save_control(cgrp);
- cgrp->dom_cgrp = dom_cgrp; + cgroup_for_each_live_descendant_pre(dsct, d_css, cgrp) + if (dsct == cgrp || cgroup_is_threaded(dsct)) + dsct->dom_cgrp = dom_cgrp; + ret = cgroup_apply_control(cgrp); if (!ret) parent->nr_threaded_children++; - else - cgrp->dom_cgrp = cgrp;
cgroup_finalize_control(cgrp, ret); return ret;
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Chris Boot bootc@bootc.net
commit 41591b38f5f8f78344954b68582b5f00e56ffe61 upstream.
On some SD cards over SPI, reading with the multiblock read command the last sector will leave the card in a bad state.
Remove last sectors from the multiblock reading cmd.
Signed-off-by: Chris Boot bootc@bootc.net Signed-off-by: Clément Péron peron.clem@gmail.com Cc: stable@vger.kernel.org # v4.10+ Signed-off-by: Ulf Hansson ulf.hansson@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/mmc/core/block.c | 10 ++++++++++ 1 file changed, 10 insertions(+)
--- a/drivers/mmc/core/block.c +++ b/drivers/mmc/core/block.c @@ -1614,6 +1614,16 @@ static void mmc_blk_data_prep(struct mmc
if (brq->data.blocks > 1) { /* + * Some SD cards in SPI mode return a CRC error or even lock up + * completely when trying to read the last block using a + * multiblock read command. + */ + if (mmc_host_is_spi(card->host) && (rq_data_dir(req) == READ) && + (blk_rq_pos(req) + blk_rq_sectors(req) == + get_capacity(md->disk))) + brq->data.blocks--; + + /* * After a read error, we redo the request one sector * at a time in order to accurately determine which * sectors can be read successfully.
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Marco Felsch m.felsch@pengutronix.de
commit f259f896f2348f0302f6f88d4382378cf9d23a7e upstream.
Since 'commit 02e389e63e35 ("pinctrl: mcp23s08: fix irq setup order")' the irq request isn't the last devm_* allocation. Without a deeper look at the irq and testing this isn't a good solution. Since this driver relies on the devm mechanism, requesting a interrupt should be the last thing to avoid memory corruptions during unbinding.
'Commit 02e389e63e35 ("pinctrl: mcp23s08: fix irq setup order")' fixed the order for the interrupt-controller use case only. The mcp23s08_irq_setup() must be split into two to fix it for the interrupt-controller use case and to register the irq at last. So the irq will be freed first during unbind.
Cc: stable@vger.kernel.org Cc: Jan Kundrát jan.kundrat@cesnet.cz Cc: Dmitry Mastykin mastichi@gmail.com Cc: Sebastian Reichel sebastian.reichel@collabora.co.uk Fixes: 82039d244f87 ("pinctrl: mcp23s08: add pinconf support") Fixes: 02e389e63e35 ("pinctrl: mcp23s08: fix irq setup order") Signed-off-by: Marco Felsch m.felsch@pengutronix.de Tested-by: Phil Reid preid@electromag.com.au Signed-off-by: Linus Walleij linus.walleij@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/pinctrl/pinctrl-mcp23s08.c | 13 ++++++++++++- 1 file changed, 12 insertions(+), 1 deletion(-)
--- a/drivers/pinctrl/pinctrl-mcp23s08.c +++ b/drivers/pinctrl/pinctrl-mcp23s08.c @@ -643,6 +643,14 @@ static int mcp23s08_irq_setup(struct mcp return err; }
+ return 0; +} + +static int mcp23s08_irqchip_setup(struct mcp23s08 *mcp) +{ + struct gpio_chip *chip = &mcp->chip; + int err; + err = gpiochip_irqchip_add_nested(chip, &mcp23s08_irq_chip, 0, @@ -907,7 +915,7 @@ static int mcp23s08_probe_one(struct mcp }
if (mcp->irq && mcp->irq_controller) { - ret = mcp23s08_irq_setup(mcp); + ret = mcp23s08_irqchip_setup(mcp); if (ret) goto fail; } @@ -932,6 +940,9 @@ static int mcp23s08_probe_one(struct mcp goto fail; }
+ if (mcp->irq) + ret = mcp23s08_irq_setup(mcp); + fail: if (ret < 0) dev_dbg(dev, "can't setup chip %d, --> %d\n", addr, ret);
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Will Deacon will.deacon@arm.com
commit ca2b497253ad01c80061a1f3ee9eb91b5d54a849 upstream.
It doesn't make sense for a perf event to be configured as a CHAIN event in isolation, so extend the arm_pmu structure with a ->filter_match() function to allow the backend PMU implementation to reject CHAIN events early.
Cc: stable@vger.kernel.org Reviewed-by: Suzuki K Poulose suzuki.poulose@arm.com Signed-off-by: Will Deacon will.deacon@arm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/arm64/kernel/perf_event.c | 7 +++++++ drivers/perf/arm_pmu.c | 8 +++++++- include/linux/perf/arm_pmu.h | 1 + 3 files changed, 15 insertions(+), 1 deletion(-)
--- a/arch/arm64/kernel/perf_event.c +++ b/arch/arm64/kernel/perf_event.c @@ -824,6 +824,12 @@ static int armv8pmu_set_event_filter(str return 0; }
+static int armv8pmu_filter_match(struct perf_event *event) +{ + unsigned long evtype = event->hw.config_base & ARMV8_PMU_EVTYPE_EVENT; + return evtype != ARMV8_PMUV3_PERFCTR_CHAIN; +} + static void armv8pmu_reset(void *info) { struct arm_pmu *cpu_pmu = (struct arm_pmu *)info; @@ -970,6 +976,7 @@ static int armv8_pmu_init(struct arm_pmu cpu_pmu->reset = armv8pmu_reset, cpu_pmu->max_period = (1LLU << 32) - 1, cpu_pmu->set_event_filter = armv8pmu_set_event_filter; + cpu_pmu->filter_match = armv8pmu_filter_match;
return 0; } --- a/drivers/perf/arm_pmu.c +++ b/drivers/perf/arm_pmu.c @@ -483,7 +483,13 @@ static int armpmu_filter_match(struct pe { struct arm_pmu *armpmu = to_arm_pmu(event->pmu); unsigned int cpu = smp_processor_id(); - return cpumask_test_cpu(cpu, &armpmu->supported_cpus); + int ret; + + ret = cpumask_test_cpu(cpu, &armpmu->supported_cpus); + if (ret && armpmu->filter_match) + return armpmu->filter_match(event); + + return ret; }
static ssize_t armpmu_cpumask_show(struct device *dev, --- a/include/linux/perf/arm_pmu.h +++ b/include/linux/perf/arm_pmu.h @@ -110,6 +110,7 @@ struct arm_pmu { void (*stop)(struct arm_pmu *); void (*reset)(void *); int (*map_event)(struct perf_event *event); + int (*filter_match)(struct perf_event *event); int num_events; u64 max_period; bool secure_access; /* 32-bit ARM only */
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jérôme Glisse jglisse@redhat.com
commit bfba8e5cf28f413aa05571af493871d74438979f upstream.
Inside set_pmd_migration_entry() we are holding page table locks and thus we can not sleep so we can not call invalidate_range_start/end()
So remove call to mmu_notifier_invalidate_range_start/end() because they are call inside the function calling set_pmd_migration_entry() (see try_to_unmap_one()).
Link: http://lkml.kernel.org/r/20181012181056.7864-1-jglisse@redhat.com Signed-off-by: Jérôme Glisse jglisse@redhat.com Reported-by: Andrea Arcangeli aarcange@redhat.com Reviewed-by: Zi Yan zi.yan@cs.rutgers.edu Acked-by: Michal Hocko mhocko@kernel.org Cc: Greg Kroah-Hartman gregkh@linuxfoundation.org Cc: Kirill A. Shutemov kirill.shutemov@linux.intel.com Cc: "H. Peter Anvin" hpa@zytor.com Cc: Anshuman Khandual khandual@linux.vnet.ibm.com Cc: Dave Hansen dave.hansen@intel.com Cc: David Nellans dnellans@nvidia.com Cc: Ingo Molnar mingo@elte.hu Cc: Mel Gorman mgorman@techsingularity.net Cc: Minchan Kim minchan@kernel.org Cc: Naoya Horiguchi n-horiguchi@ah.jp.nec.com Cc: Thomas Gleixner tglx@linutronix.de Cc: Vlastimil Babka vbabka@suse.cz Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- mm/huge_memory.c | 6 ------ 1 file changed, 6 deletions(-)
--- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -2843,9 +2843,6 @@ void set_pmd_migration_entry(struct page if (!(pvmw->pmd && !pvmw->pte)) return;
- mmu_notifier_invalidate_range_start(mm, address, - address + HPAGE_PMD_SIZE); - flush_cache_range(vma, address, address + HPAGE_PMD_SIZE); pmdval = *pvmw->pmd; pmdp_invalidate(vma, address, pvmw->pmd); @@ -2858,9 +2855,6 @@ void set_pmd_migration_entry(struct page set_pmd_at(mm, address, pvmw->pmd, pmdswp); page_remove_rmap(page, true); put_page(page); - - mmu_notifier_invalidate_range_end(mm, address, - address + HPAGE_PMD_SIZE); }
void remove_migration_pmd(struct page_vma_mapped_walk *pvmw, struct page *new)
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jan Kara jack@suse.cz
commit 4628a64591e6cee181237060961e98c615c33966 upstream.
Currently _PAGE_DEVMAP bit is not preserved in mprotect(2) calls. As a result we will see warnings such as:
BUG: Bad page map in process JobWrk0013 pte:800001803875ea25 pmd:7624381067 addr:00007f0930720000 vm_flags:280000f9 anon_vma: (null) mapping:ffff97f2384056f0 index:0 file:457-000000fe00000030-00000009-000000ca-00000001_2001.fileblock fault:xfs_filemap_fault [xfs] mmap:xfs_file_mmap [xfs] readpage: (null) CPU: 3 PID: 15848 Comm: JobWrk0013 Tainted: G W 4.12.14-2.g7573215-default #1 SLE12-SP4 (unreleased) Hardware name: Intel Corporation S2600WFD/S2600WFD, BIOS SE5C620.86B.01.00.0833.051120182255 05/11/2018 Call Trace: dump_stack+0x5a/0x75 print_bad_pte+0x217/0x2c0 ? enqueue_task_fair+0x76/0x9f0 _vm_normal_page+0xe5/0x100 zap_pte_range+0x148/0x740 unmap_page_range+0x39a/0x4b0 unmap_vmas+0x42/0x90 unmap_region+0x99/0xf0 ? vma_gap_callbacks_rotate+0x1a/0x20 do_munmap+0x255/0x3a0 vm_munmap+0x54/0x80 SyS_munmap+0x1d/0x30 do_syscall_64+0x74/0x150 entry_SYSCALL_64_after_hwframe+0x3d/0xa2 ...
when mprotect(2) gets used on DAX mappings. Also there is a wide variety of other failures that can result from the missing _PAGE_DEVMAP flag when the area gets used by get_user_pages() later.
Fix the problem by including _PAGE_DEVMAP in a set of flags that get preserved by mprotect(2).
Fixes: 69660fd797c3 ("x86, mm: introduce _PAGE_DEVMAP") Fixes: ebd31197931d ("powerpc/mm: Add devmap support for ppc64") Cc: stable@vger.kernel.org Signed-off-by: Jan Kara jack@suse.cz Acked-by: Michal Hocko mhocko@suse.com Reviewed-by: Johannes Thumshirn jthumshirn@suse.de Signed-off-by: Dan Williams dan.j.williams@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/powerpc/include/asm/book3s/64/pgtable.h | 4 ++-- arch/x86/include/asm/pgtable_types.h | 2 +- 2 files changed, 3 insertions(+), 3 deletions(-)
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h +++ b/arch/powerpc/include/asm/book3s/64/pgtable.h @@ -102,7 +102,7 @@ */ #define _HPAGE_CHG_MASK (PTE_RPN_MASK | _PAGE_HPTEFLAGS | _PAGE_DIRTY | \ _PAGE_ACCESSED | H_PAGE_THP_HUGE | _PAGE_PTE | \ - _PAGE_SOFT_DIRTY) + _PAGE_SOFT_DIRTY | _PAGE_DEVMAP) /* * user access blocked by key */ @@ -120,7 +120,7 @@ */ #define _PAGE_CHG_MASK (PTE_RPN_MASK | _PAGE_HPTEFLAGS | _PAGE_DIRTY | \ _PAGE_ACCESSED | _PAGE_SPECIAL | _PAGE_PTE | \ - _PAGE_SOFT_DIRTY) + _PAGE_SOFT_DIRTY | _PAGE_DEVMAP) /* * Mask of bits returned by pte_pgprot() */ --- a/arch/x86/include/asm/pgtable_types.h +++ b/arch/x86/include/asm/pgtable_types.h @@ -124,7 +124,7 @@ */ #define _PAGE_CHG_MASK (PTE_PFN_MASK | _PAGE_PCD | _PAGE_PWT | \ _PAGE_SPECIAL | _PAGE_ACCESSED | _PAGE_DIRTY | \ - _PAGE_SOFT_DIRTY) + _PAGE_SOFT_DIRTY | _PAGE_DEVMAP) #define _HPAGE_CHG_MASK (_PAGE_CHG_MASK | _PAGE_PSE)
/*
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Edgar Cherkasov echerkasov@dev.rtsoft.ru
commit 08d9db00fe0e300d6df976e6c294f974988226dd upstream.
The i2c-scmi driver crashes when the SMBus Write Block transaction is executed:
WARNING: CPU: 9 PID: 2194 at mm/page_alloc.c:3931 __alloc_pages_slowpath+0x9db/0xec0 Call Trace: ? get_page_from_freelist+0x49d/0x11f0 ? alloc_pages_current+0x6a/0xe0 ? new_slab+0x499/0x690 __alloc_pages_nodemask+0x265/0x280 alloc_pages_current+0x6a/0xe0 kmalloc_order+0x18/0x40 kmalloc_order_trace+0x24/0xb0 ? acpi_ut_allocate_object_desc_dbg+0x62/0x10c __kmalloc+0x203/0x220 acpi_os_allocate_zeroed+0x34/0x36 acpi_ut_copy_eobject_to_iobject+0x266/0x31e acpi_evaluate_object+0x166/0x3b2 acpi_smbus_cmi_access+0x144/0x530 [i2c_scmi] i2c_smbus_xfer+0xda/0x370 i2cdev_ioctl_smbus+0x1bd/0x270 i2cdev_ioctl+0xaa/0x250 do_vfs_ioctl+0xa4/0x600 SyS_ioctl+0x79/0x90 do_syscall_64+0x73/0x130 entry_SYSCALL_64_after_hwframe+0x3d/0xa2 ACPI Error: Evaluating _SBW: 4 (20170831/smbus_cmi-185)
This problem occurs because the length of ACPI Buffer object is not defined/initialized in the code before a corresponding ACPI method is called. The obvious patch below fixes this issue.
Signed-off-by: Edgar Cherkasov echerkasov@dev.rtsoft.ru Acked-by: Viktor Krasnov vkrasnov@dev.rtsoft.ru Acked-by: Michael Brunner Michael.Brunner@kontron.com Signed-off-by: Wolfram Sang wsa@the-dreams.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/i2c/busses/i2c-scmi.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/i2c/busses/i2c-scmi.c +++ b/drivers/i2c/busses/i2c-scmi.c @@ -152,6 +152,7 @@ acpi_smbus_cmi_access(struct i2c_adapter mt_params[3].type = ACPI_TYPE_INTEGER; mt_params[3].integer.value = len; mt_params[4].type = ACPI_TYPE_BUFFER; + mt_params[4].buffer.length = len; mt_params[4].buffer.pointer = data->block + 1; } break;
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mathias Nyman mathias.nyman@linux.intel.com
commit 1208d8a84fdcae6b395c57911cdf907450d30e70 upstream.
When disabling a USB3 port the hub driver will set the port link state to U3 to prevent "ejected" or "safely removed" devices that are still physically connected from immediately re-enumerating.
If the device was really unplugged, then error messages were printed as the hub tries to set the U3 link state for a port that is no longer enabled.
xhci-hcd ee000000.usb: Cannot set link state. usb usb8-port1: cannot disable (err = -32)
Don't print error message in xhci-hub if hub tries to set port link state for a disabled port. Return -ENODEV instead which also silences hub driver.
Signed-off-by: Mathias Nyman mathias.nyman@linux.intel.com Tested-by: Yoshihiro Shimoda yoshihiro.shimoda.uh@renesas.com Signed-off-by: Ross Zwisler zwisler@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/usb/host/xhci-hub.c | 18 +++++++++--------- 1 file changed, 9 insertions(+), 9 deletions(-)
--- a/drivers/usb/host/xhci-hub.c +++ b/drivers/usb/host/xhci-hub.c @@ -1236,17 +1236,17 @@ int xhci_hub_control(struct usb_hcd *hcd temp = readl(port_array[wIndex]); break; } - - /* Software should not attempt to set - * port link state above '3' (U3) and the port - * must be enabled. - */ - if ((temp & PORT_PE) == 0 || - (link_state > USB_SS_PORT_LS_U3)) { - xhci_warn(xhci, "Cannot set link state.\n"); + /* Port must be enabled */ + if (!(temp & PORT_PE)) { + retval = -ENODEV; + break; + } + /* Can't set port link state above '3' (U3) */ + if (link_state > USB_SS_PORT_LS_U3) { + xhci_warn(xhci, "Cannot set port %d link state %d\n", + wIndex, link_state); goto error; } - if (link_state == USB_SS_PORT_LS_U3) { slot_id = xhci_find_slot_id_by_port(hcd, xhci, wIndex + 1);
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Roman Gushchin guro@fb.com
commit eb59254608bc1d42c4c6afdcdce9c0d3ce02b318 upstream.
Patch series "indirectly reclaimable memory", v2.
This patchset introduces the concept of indirectly reclaimable memory and applies it to fix the issue of when a big number of dentries with external names can significantly affect the MemAvailable value.
This patch (of 3):
Introduce a concept of indirectly reclaimable memory and adds the corresponding memory counter and /proc/vmstat item.
Indirectly reclaimable memory is any sort of memory, used by the kernel (except of reclaimable slabs), which is actually reclaimable, i.e. will be released under memory pressure.
The counter is in bytes, as it's not always possible to count such objects in pages. The name contains BYTES by analogy to NR_KERNEL_STACK_KB.
Link: http://lkml.kernel.org/r/20180305133743.12746-2-guro@fb.com Signed-off-by: Roman Gushchin guro@fb.com Reviewed-by: Andrew Morton akpm@linux-foundation.org Cc: Alexander Viro viro@zeniv.linux.org.uk Cc: Michal Hocko mhocko@suse.com Cc: Johannes Weiner hannes@cmpxchg.org Cc: Mel Gorman mgorman@techsingularity.net Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- include/linux/mmzone.h | 1 + mm/vmstat.c | 1 + 2 files changed, 2 insertions(+)
--- a/include/linux/mmzone.h +++ b/include/linux/mmzone.h @@ -180,6 +180,7 @@ enum node_stat_item { NR_VMSCAN_IMMEDIATE, /* Prioritise for reclaim when writeback ends */ NR_DIRTIED, /* page dirtyings since bootup */ NR_WRITTEN, /* page writings since bootup */ + NR_INDIRECTLY_RECLAIMABLE_BYTES, /* measured in bytes */ NR_VM_NODE_STAT_ITEMS };
--- a/mm/vmstat.c +++ b/mm/vmstat.c @@ -1090,6 +1090,7 @@ const char * const vmstat_text[] = { "nr_vmscan_immediate_reclaim", "nr_dirtied", "nr_written", + "nr_indirectly_reclaimable",
/* enum writeback_stat_item counters */ "nr_dirty_threshold",
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Roman Gushchin guro@fb.com
commit 034ebf65c3c21d85b963d39f992258a64a85e3a9 upstream.
Adjust /proc/meminfo MemAvailable calculation by adding the amount of indirectly reclaimable memory (rounded to the PAGE_SIZE).
Link: http://lkml.kernel.org/r/20180305133743.12746-4-guro@fb.com Signed-off-by: Roman Gushchin guro@fb.com Reviewed-by: Andrew Morton akpm@linux-foundation.org Cc: Alexander Viro viro@zeniv.linux.org.uk Cc: Michal Hocko mhocko@suse.com Cc: Johannes Weiner hannes@cmpxchg.org Cc: Mel Gorman mgorman@techsingularity.net Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- mm/page_alloc.c | 7 +++++++ 1 file changed, 7 insertions(+)
--- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -4557,6 +4557,13 @@ long si_mem_available(void) min(global_node_page_state(NR_SLAB_RECLAIMABLE) / 2, wmark_low);
+ /* + * Part of the kernel memory, which can be released under memory + * pressure. + */ + available += global_node_page_state(NR_INDIRECTLY_RECLAIMABLE_BYTES) >> + PAGE_SHIFT; + if (available < 0) available = 0; return available;
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Roman Gushchin guro@fb.com
commit f1782c9bc547754f4bd3043fe8cfda53db85f13f upstream.
I received a report about suspicious growth of unreclaimable slabs on some machines. I've found that it happens on machines with low memory pressure, and these unreclaimable slabs are external names attached to dentries.
External names are allocated using generic kmalloc() function, so they are accounted as unreclaimable. But they are held by dentries, which are reclaimable, and they will be reclaimed under the memory pressure.
In particular, this breaks MemAvailable calculation, as it doesn't take unreclaimable slabs into account. This leads to a silly situation, when a machine is almost idle, has no memory pressure and therefore has a big dentry cache. And the resulting MemAvailable is too low to start a new workload.
To address the issue, the NR_INDIRECTLY_RECLAIMABLE_BYTES counter is used to track the amount of memory, consumed by external names. The counter is increased in the dentry allocation path, if an external name structure is allocated; and it's decreased in the dentry freeing path.
To reproduce the problem I've used the following Python script:
import os
for iter in range (0, 10000000): try: name = ("/some_long_name_%d" % iter) + "_" * 220 os.stat(name) except Exception: pass
Without this patch: $ cat /proc/meminfo | grep MemAvailable MemAvailable: 7811688 kB $ python indirect.py $ cat /proc/meminfo | grep MemAvailable MemAvailable: 2753052 kB
With the patch: $ cat /proc/meminfo | grep MemAvailable MemAvailable: 7809516 kB $ python indirect.py $ cat /proc/meminfo | grep MemAvailable MemAvailable: 7749144 kB
[guro@fb.com: fix indirectly reclaimable memory accounting for CONFIG_SLOB] Link: http://lkml.kernel.org/r/20180312194140.19517-1-guro@fb.com [guro@fb.com: fix indirectly reclaimable memory accounting] Link: http://lkml.kernel.org/r/20180313125701.7955-1-guro@fb.com Link: http://lkml.kernel.org/r/20180305133743.12746-5-guro@fb.com Signed-off-by: Roman Gushchin guro@fb.com Reviewed-by: Andrew Morton akpm@linux-foundation.org Cc: Alexander Viro viro@zeniv.linux.org.uk Cc: Michal Hocko mhocko@suse.com Cc: Johannes Weiner hannes@cmpxchg.org Cc: Mel Gorman mgorman@techsingularity.net Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/dcache.c | 38 +++++++++++++++++++++++++++++--------- 1 file changed, 29 insertions(+), 9 deletions(-)
--- a/fs/dcache.c +++ b/fs/dcache.c @@ -270,11 +270,25 @@ static void __d_free(struct rcu_head *he kmem_cache_free(dentry_cache, dentry); }
+static void __d_free_external_name(struct rcu_head *head) +{ + struct external_name *name = container_of(head, struct external_name, + u.head); + + mod_node_page_state(page_pgdat(virt_to_page(name)), + NR_INDIRECTLY_RECLAIMABLE_BYTES, + -ksize(name)); + + kfree(name); +} + static void __d_free_external(struct rcu_head *head) { struct dentry *dentry = container_of(head, struct dentry, d_u.d_rcu); - kfree(external_name(dentry)); - kmem_cache_free(dentry_cache, dentry); + + __d_free_external_name(&external_name(dentry)->u.head); + + kmem_cache_free(dentry_cache, dentry); }
static inline int dname_external(const struct dentry *dentry) @@ -305,7 +319,7 @@ void release_dentry_name_snapshot(struct struct external_name *p; p = container_of(name->name, struct external_name, name[0]); if (unlikely(atomic_dec_and_test(&p->u.count))) - kfree_rcu(p, u.head); + call_rcu(&p->u.head, __d_free_external_name); } } EXPORT_SYMBOL(release_dentry_name_snapshot); @@ -1605,6 +1619,7 @@ EXPORT_SYMBOL(d_invalidate);
struct dentry *__d_alloc(struct super_block *sb, const struct qstr *name) { + struct external_name *ext = NULL; struct dentry *dentry; char *dname; int err; @@ -1625,14 +1640,13 @@ struct dentry *__d_alloc(struct super_bl dname = dentry->d_iname; } else if (name->len > DNAME_INLINE_LEN-1) { size_t size = offsetof(struct external_name, name[1]); - struct external_name *p = kmalloc(size + name->len, - GFP_KERNEL_ACCOUNT); - if (!p) { + ext = kmalloc(size + name->len, GFP_KERNEL_ACCOUNT); + if (!ext) { kmem_cache_free(dentry_cache, dentry); return NULL; } - atomic_set(&p->u.count, 1); - dname = p->name; + atomic_set(&ext->u.count, 1); + dname = ext->name; if (IS_ENABLED(CONFIG_DCACHE_WORD_ACCESS)) kasan_unpoison_shadow(dname, round_up(name->len + 1, sizeof(unsigned long))); @@ -1675,6 +1689,12 @@ struct dentry *__d_alloc(struct super_bl } }
+ if (unlikely(ext)) { + pg_data_t *pgdat = page_pgdat(virt_to_page(ext)); + mod_node_page_state(pgdat, NR_INDIRECTLY_RECLAIMABLE_BYTES, + ksize(ext)); + } + this_cpu_inc(nr_dentry);
return dentry; @@ -2769,7 +2789,7 @@ static void copy_name(struct dentry *den dentry->d_name.hash_len = target->d_name.hash_len; } if (old_name && likely(atomic_dec_and_test(&old_name->u.count))) - kfree_rcu(old_name, u.head); + call_rcu(&old_name->u.head, __d_free_external_name); }
static void dentry_lock_for_move(struct dentry *dentry, struct dentry *target)
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Roman Gushchin guro@fb.com
commit d79f7aa496fc94d763f67b833a1f36f4c171176f upstream.
Indirectly reclaimable memory can consume a significant part of total memory and it's actually reclaimable (it will be released under actual memory pressure).
So, the overcommit logic should treat it as free.
Otherwise, it's possible to cause random system-wide memory allocation failures by consuming a significant amount of memory by indirectly reclaimable memory, e.g. dentry external names.
If overcommit policy GUESS is used, it might be used for denial of service attack under some conditions.
The following program illustrates the approach. It causes the kernel to allocate an unreclaimable kmalloc-256 chunk for each stat() call, so that at some point the overcommit logic may start blocking large allocation system-wide.
int main() { char buf[256]; unsigned long i; struct stat statbuf;
buf[0] = '/'; for (i = 1; i < sizeof(buf); i++) buf[i] = '_';
for (i = 0; 1; i++) { sprintf(&buf[248], "%8lu", i); stat(buf, &statbuf); }
return 0; }
This patch in combination with related indirectly reclaimable memory patches closes this issue.
Link: http://lkml.kernel.org/r/20180313130041.8078-1-guro@fb.com Signed-off-by: Roman Gushchin guro@fb.com Reviewed-by: Andrew Morton akpm@linux-foundation.org Cc: Alexander Viro viro@zeniv.linux.org.uk Cc: Michal Hocko mhocko@suse.com Cc: Johannes Weiner hannes@cmpxchg.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- mm/util.c | 7 +++++++ 1 file changed, 7 insertions(+)
--- a/mm/util.c +++ b/mm/util.c @@ -636,6 +636,13 @@ int __vm_enough_memory(struct mm_struct free += global_node_page_state(NR_SLAB_RECLAIMABLE);
/* + * Part of the kernel memory, which can be released + * under memory pressure. + */ + free += global_node_page_state( + NR_INDIRECTLY_RECLAIMABLE_BYTES) >> PAGE_SHIFT; + + /* * Leave reserved pages. The pages are not for anonymous pages. */ if (free <= totalreserve_pages)
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Roman Gushchin guro@fb.com
commit 7aaf7727235870f497eb928f728f7773d6df3b40 upstream.
Don't show nr_indirectly_reclaimable in /proc/vmstat, because there is no need to export this vm counter to userspace, and some changes are expected in reclaimable object accounting, which can alter this counter.
Link: http://lkml.kernel.org/r/20180425191422.9159-1-guro@fb.com Signed-off-by: Roman Gushchin guro@fb.com Acked-by: Vlastimil Babka vbabka@suse.cz Reviewed-by: Andrew Morton akpm@linux-foundation.org Cc: Matthew Wilcox willy@infradead.org Cc: Alexander Viro viro@zeniv.linux.org.uk Cc: Michal Hocko mhocko@suse.com Cc: Johannes Weiner hannes@cmpxchg.org Cc: David Rientjes rientjes@google.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- mm/vmstat.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
--- a/mm/vmstat.c +++ b/mm/vmstat.c @@ -1090,7 +1090,7 @@ const char * const vmstat_text[] = { "nr_vmscan_immediate_reclaim", "nr_dirtied", "nr_written", - "nr_indirectly_reclaimable", + "", /* nr_indirectly_reclaimable */
/* enum writeback_stat_item counters */ "nr_dirty_threshold", @@ -1673,6 +1673,10 @@ static int vmstat_show(struct seq_file * unsigned long *l = arg; unsigned long off = l - (unsigned long *)m->private;
+ /* Skip hidden vmstat items. */ + if (*vmstat_text[off] == '\0') + return 0; + seq_puts(m, vmstat_text[off]); seq_put_decimal_ull(m, " ", *l); seq_putc(m, '\n');
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Russell King rmk+kernel@armlinux.org.uk
Commit f5683e76f35b4ec5891031b6a29036efe0a1ff84 upstream.
Add CPU part numbers for Cortex A53, A57, A72, A73, A75 and the Broadcom Brahma B15 CPU.
Signed-off-by: Russell King rmk+kernel@armlinux.org.uk Acked-by: Florian Fainelli f.fainelli@gmail.com Boot-tested-by: Tony Lindgren tony@atomide.com Reviewed-by: Tony Lindgren tony@atomide.com Acked-by: Marc Zyngier marc.zyngier@arm.com Signed-off-by: David A. Long dave.long@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm/include/asm/cputype.h | 8 ++++++++ 1 file changed, 8 insertions(+)
--- a/arch/arm/include/asm/cputype.h +++ b/arch/arm/include/asm/cputype.h @@ -77,8 +77,16 @@ #define ARM_CPU_PART_CORTEX_A12 0x4100c0d0 #define ARM_CPU_PART_CORTEX_A17 0x4100c0e0 #define ARM_CPU_PART_CORTEX_A15 0x4100c0f0 +#define ARM_CPU_PART_CORTEX_A53 0x4100d030 +#define ARM_CPU_PART_CORTEX_A57 0x4100d070 +#define ARM_CPU_PART_CORTEX_A72 0x4100d080 +#define ARM_CPU_PART_CORTEX_A73 0x4100d090 +#define ARM_CPU_PART_CORTEX_A75 0x4100d0a0 #define ARM_CPU_PART_MASK 0xff00fff0
+/* Broadcom cores */ +#define ARM_CPU_PART_BRAHMA_B15 0x420000f0 + /* DEC implemented cores */ #define ARM_CPU_PART_SA1100 0x4400a110
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Russell King rmk+kernel@armlinux.org.uk
Commit a5b9177f69329314721aa7022b7e69dab23fa1f0 upstream.
Prepare the processor bug infrastructure so that it can be expanded to check for per-processor bugs.
Signed-off-by: Russell King rmk+kernel@armlinux.org.uk Reviewed-by: Florian Fainelli f.fainelli@gmail.com Boot-tested-by: Tony Lindgren tony@atomide.com Reviewed-by: Tony Lindgren tony@atomide.com Acked-by: Marc Zyngier marc.zyngier@arm.com Signed-off-by: David A. Long dave.long@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm/include/asm/bugs.h | 4 ++-- arch/arm/kernel/Makefile | 1 + arch/arm/kernel/bugs.c | 9 +++++++++ 3 files changed, 12 insertions(+), 2 deletions(-) create mode 100644 arch/arm/kernel/bugs.c
--- a/arch/arm/include/asm/bugs.h +++ b/arch/arm/include/asm/bugs.h @@ -10,10 +10,10 @@ #ifndef __ASM_BUGS_H #define __ASM_BUGS_H
-#ifdef CONFIG_MMU extern void check_writebuffer_bugs(void);
-#define check_bugs() check_writebuffer_bugs() +#ifdef CONFIG_MMU +extern void check_bugs(void); #else #define check_bugs() do { } while (0) #endif --- a/arch/arm/kernel/Makefile +++ b/arch/arm/kernel/Makefile @@ -31,6 +31,7 @@ else obj-y += entry-armv.o endif
+obj-$(CONFIG_MMU) += bugs.o obj-$(CONFIG_CPU_IDLE) += cpuidle.o obj-$(CONFIG_ISA_DMA_API) += dma.o obj-$(CONFIG_FIQ) += fiq.o fiqasm.o --- /dev/null +++ b/arch/arm/kernel/bugs.c @@ -0,0 +1,9 @@ +// SPDX-Identifier: GPL-2.0 +#include <linux/init.h> +#include <asm/bugs.h> +#include <asm/proc-fns.h> + +void __init check_bugs(void) +{ + check_writebuffer_bugs(); +}
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Russell King rmk+kernel@armlinux.org.uk
Commit 26602161b5ba795928a5a719fe1d5d9f2ab5c3ef upstream.
Check for CPU bugs when secondary processors are being brought online, and also when CPUs are resuming from a low power mode. This gives an opportunity to check that processor specific bug workarounds are correctly enabled for all paths that a CPU re-enters the kernel.
Signed-off-by: Russell King rmk+kernel@armlinux.org.uk Reviewed-by: Florian Fainelli f.fainelli@gmail.com Boot-tested-by: Tony Lindgren tony@atomide.com Reviewed-by: Tony Lindgren tony@atomide.com Acked-by: Marc Zyngier marc.zyngier@arm.com Signed-off-by: David A. Long dave.long@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm/include/asm/bugs.h | 2 ++ arch/arm/kernel/bugs.c | 5 +++++ arch/arm/kernel/smp.c | 4 ++++ arch/arm/kernel/suspend.c | 2 ++ 4 files changed, 13 insertions(+)
--- a/arch/arm/include/asm/bugs.h +++ b/arch/arm/include/asm/bugs.h @@ -14,8 +14,10 @@ extern void check_writebuffer_bugs(void)
#ifdef CONFIG_MMU extern void check_bugs(void); +extern void check_other_bugs(void); #else #define check_bugs() do { } while (0) +#define check_other_bugs() do { } while (0) #endif
#endif --- a/arch/arm/kernel/bugs.c +++ b/arch/arm/kernel/bugs.c @@ -3,7 +3,12 @@ #include <asm/bugs.h> #include <asm/proc-fns.h>
+void check_other_bugs(void) +{ +} + void __init check_bugs(void) { check_writebuffer_bugs(); + check_other_bugs(); } --- a/arch/arm/kernel/smp.c +++ b/arch/arm/kernel/smp.c @@ -31,6 +31,7 @@ #include <linux/irq_work.h>
#include <linux/atomic.h> +#include <asm/bugs.h> #include <asm/smp.h> #include <asm/cacheflush.h> #include <asm/cpu.h> @@ -402,6 +403,9 @@ asmlinkage void secondary_start_kernel(v * before we continue - which happens after __cpu_up returns. */ set_cpu_online(cpu, true); + + check_other_bugs(); + complete(&cpu_running);
local_irq_enable(); --- a/arch/arm/kernel/suspend.c +++ b/arch/arm/kernel/suspend.c @@ -3,6 +3,7 @@ #include <linux/slab.h> #include <linux/mm_types.h>
+#include <asm/bugs.h> #include <asm/cacheflush.h> #include <asm/idmap.h> #include <asm/pgalloc.h> @@ -36,6 +37,7 @@ int cpu_suspend(unsigned long arg, int ( cpu_switch_mm(mm->pgd, mm); local_flush_bp_all(); local_flush_tlb_all(); + check_other_bugs(); }
return ret;
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Russell King rmk+kernel@armlinux.org.uk
Commit 9d3a04925deeabb97c8e26d940b501a2873e8af3 upstream.
Add support for per-processor bug checking - each processor function descriptor gains a function pointer for this check, which must not be an __init function. If non-NULL, this will be called whenever a CPU enters the kernel via which ever path (boot CPU, secondary CPU startup, CPU resuming, etc.)
This allows processor specific bug checks to validate that workaround bits are properly enabled by firmware via all entry paths to the kernel.
Signed-off-by: Russell King rmk+kernel@armlinux.org.uk Reviewed-by: Florian Fainelli f.fainelli@gmail.com Boot-tested-by: Tony Lindgren tony@atomide.com Reviewed-by: Tony Lindgren tony@atomide.com Acked-by: Marc Zyngier marc.zyngier@arm.com Signed-off-by: David A. Long dave.long@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm/include/asm/proc-fns.h | 4 ++++ arch/arm/kernel/bugs.c | 4 ++++ arch/arm/mm/proc-macros.S | 3 ++- 3 files changed, 10 insertions(+), 1 deletion(-)
--- a/arch/arm/include/asm/proc-fns.h +++ b/arch/arm/include/asm/proc-fns.h @@ -37,6 +37,10 @@ extern struct processor { */ void (*_proc_init)(void); /* + * Check for processor bugs + */ + void (*check_bugs)(void); + /* * Disable any processor specifics */ void (*_proc_fin)(void); --- a/arch/arm/kernel/bugs.c +++ b/arch/arm/kernel/bugs.c @@ -5,6 +5,10 @@
void check_other_bugs(void) { +#ifdef MULTI_CPU + if (processor.check_bugs) + processor.check_bugs(); +#endif }
void __init check_bugs(void) --- a/arch/arm/mm/proc-macros.S +++ b/arch/arm/mm/proc-macros.S @@ -273,13 +273,14 @@ mcr p15, 0, ip, c7, c10, 4 @ data write barrier .endm
-.macro define_processor_functions name:req, dabort:req, pabort:req, nommu=0, suspend=0 +.macro define_processor_functions name:req, dabort:req, pabort:req, nommu=0, suspend=0, bugs=0 .type \name()_processor_functions, #object .align 2 ENTRY(\name()_processor_functions) .word \dabort .word \pabort .word cpu_\name()_proc_init + .word \bugs .word cpu_\name()_proc_fin .word cpu_\name()_reset .word cpu_\name()_do_idle
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Russell King rmk+kernel@armlinux.org.uk
Commit c58d237d0852a57fde9bc2c310972e8f4e3d155d upstream.
Add a Kconfig symbol for CPUs which are vulnerable to the Spectre attacks.
Signed-off-by: Russell King rmk+kernel@armlinux.org.uk Reviewed-by: Florian Fainelli f.fainelli@gmail.com Boot-tested-by: Tony Lindgren tony@atomide.com Reviewed-by: Tony Lindgren tony@atomide.com Acked-by: Marc Zyngier marc.zyngier@arm.com Signed-off-by: David A. Long dave.long@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm/mm/Kconfig | 4 ++++ 1 file changed, 4 insertions(+)
--- a/arch/arm/mm/Kconfig +++ b/arch/arm/mm/Kconfig @@ -415,6 +415,7 @@ config CPU_V7 select CPU_CP15_MPU if !MMU select CPU_HAS_ASID if MMU select CPU_PABRT_V7 + select CPU_SPECTRE if MMU select CPU_THUMB_CAPABLE select CPU_TLB_V7 if MMU
@@ -826,6 +827,9 @@ config CPU_BPREDICT_DISABLE help Say Y here to disable branch prediction. If unsure, say N.
+config CPU_SPECTRE + bool + config TLS_REG_EMUL bool select NEED_KUSER_HELPERS
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Russell King rmk+kernel@armlinux.org.uk
Commit 06c23f5ffe7ad45b908d0fff604dae08a7e334b9 upstream.
Required manual merge of arch/arm/mm/proc-v7.S.
Harden the branch predictor against Spectre v2 attacks on context switches for ARMv7 and later CPUs. We do this by:
Cortex A9, A12, A17, A73, A75: invalidating the BTB. Cortex A15, Brahma B15: invalidating the instruction cache.
Cortex A57 and Cortex A72 are not addressed in this patch.
Cortex R7 and Cortex R8 are also not addressed as we do not enforce memory protection on these cores.
Signed-off-by: Russell King rmk+kernel@armlinux.org.uk Boot-tested-by: Tony Lindgren tony@atomide.com Reviewed-by: Tony Lindgren tony@atomide.com Acked-by: Marc Zyngier marc.zyngier@arm.com Signed-off-by: David A. Long dave.long@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm/mm/Kconfig | 19 ++++++ arch/arm/mm/proc-v7-2level.S | 6 -- arch/arm/mm/proc-v7.S | 125 +++++++++++++++++++++++++++++++++---------- 3 files changed, 115 insertions(+), 35 deletions(-)
--- a/arch/arm/mm/Kconfig +++ b/arch/arm/mm/Kconfig @@ -830,6 +830,25 @@ config CPU_BPREDICT_DISABLE config CPU_SPECTRE bool
+config HARDEN_BRANCH_PREDICTOR + bool "Harden the branch predictor against aliasing attacks" if EXPERT + depends on CPU_SPECTRE + default y + help + Speculation attacks against some high-performance processors rely + on being able to manipulate the branch predictor for a victim + context by executing aliasing branches in the attacker context. + Such attacks can be partially mitigated against by clearing + internal branch predictor state and limiting the prediction + logic in some situations. + + This config option will take CPU-specific actions to harden + the branch predictor against aliasing attacks and may rely on + specific instruction sequences or control bits being set by + the system firmware. + + If unsure, say Y. + config TLS_REG_EMUL bool select NEED_KUSER_HELPERS --- a/arch/arm/mm/proc-v7-2level.S +++ b/arch/arm/mm/proc-v7-2level.S @@ -41,11 +41,6 @@ * even on Cortex-A8 revisions not affected by 430973. * If IBE is not set, the flush BTAC/BTB won't do anything. */ -ENTRY(cpu_ca8_switch_mm) -#ifdef CONFIG_MMU - mov r2, #0 - mcr p15, 0, r2, c7, c5, 6 @ flush BTAC/BTB -#endif ENTRY(cpu_v7_switch_mm) #ifdef CONFIG_MMU mmid r1, r1 @ get mm->context.id @@ -66,7 +61,6 @@ ENTRY(cpu_v7_switch_mm) #endif bx lr ENDPROC(cpu_v7_switch_mm) -ENDPROC(cpu_ca8_switch_mm)
/* * cpu_v7_set_pte_ext(ptep, pte) --- a/arch/arm/mm/proc-v7.S +++ b/arch/arm/mm/proc-v7.S @@ -93,6 +93,17 @@ ENTRY(cpu_v7_dcache_clean_area) ret lr ENDPROC(cpu_v7_dcache_clean_area)
+ENTRY(cpu_v7_iciallu_switch_mm) + mov r3, #0 + mcr p15, 0, r3, c7, c5, 0 @ ICIALLU + b cpu_v7_switch_mm +ENDPROC(cpu_v7_iciallu_switch_mm) +ENTRY(cpu_v7_bpiall_switch_mm) + mov r3, #0 + mcr p15, 0, r3, c7, c5, 6 @ flush BTAC/BTB + b cpu_v7_switch_mm +ENDPROC(cpu_v7_bpiall_switch_mm) + string cpu_v7_name, "ARMv7 Processor" .align
@@ -158,31 +169,6 @@ ENTRY(cpu_v7_do_resume) ENDPROC(cpu_v7_do_resume) #endif
-/* - * Cortex-A8 - */ - globl_equ cpu_ca8_proc_init, cpu_v7_proc_init - globl_equ cpu_ca8_proc_fin, cpu_v7_proc_fin - globl_equ cpu_ca8_reset, cpu_v7_reset - globl_equ cpu_ca8_do_idle, cpu_v7_do_idle - globl_equ cpu_ca8_dcache_clean_area, cpu_v7_dcache_clean_area - globl_equ cpu_ca8_set_pte_ext, cpu_v7_set_pte_ext - globl_equ cpu_ca8_suspend_size, cpu_v7_suspend_size -#ifdef CONFIG_ARM_CPU_SUSPEND - globl_equ cpu_ca8_do_suspend, cpu_v7_do_suspend - globl_equ cpu_ca8_do_resume, cpu_v7_do_resume -#endif - -/* - * Cortex-A9 processor functions - */ - globl_equ cpu_ca9mp_proc_init, cpu_v7_proc_init - globl_equ cpu_ca9mp_proc_fin, cpu_v7_proc_fin - globl_equ cpu_ca9mp_reset, cpu_v7_reset - globl_equ cpu_ca9mp_do_idle, cpu_v7_do_idle - globl_equ cpu_ca9mp_dcache_clean_area, cpu_v7_dcache_clean_area - globl_equ cpu_ca9mp_switch_mm, cpu_v7_switch_mm - globl_equ cpu_ca9mp_set_pte_ext, cpu_v7_set_pte_ext .globl cpu_ca9mp_suspend_size .equ cpu_ca9mp_suspend_size, cpu_v7_suspend_size + 4 * 2 #ifdef CONFIG_ARM_CPU_SUSPEND @@ -548,10 +534,75 @@ __v7_setup_stack:
@ define struct processor (see <asm/proc-fns.h> and proc-macros.S) define_processor_functions v7, dabort=v7_early_abort, pabort=v7_pabort, suspend=1 + +#ifdef CONFIG_HARDEN_BRANCH_PREDICTOR + @ generic v7 bpiall on context switch + globl_equ cpu_v7_bpiall_proc_init, cpu_v7_proc_init + globl_equ cpu_v7_bpiall_proc_fin, cpu_v7_proc_fin + globl_equ cpu_v7_bpiall_reset, cpu_v7_reset + globl_equ cpu_v7_bpiall_do_idle, cpu_v7_do_idle + globl_equ cpu_v7_bpiall_dcache_clean_area, cpu_v7_dcache_clean_area + globl_equ cpu_v7_bpiall_set_pte_ext, cpu_v7_set_pte_ext + globl_equ cpu_v7_bpiall_suspend_size, cpu_v7_suspend_size +#ifdef CONFIG_ARM_CPU_SUSPEND + globl_equ cpu_v7_bpiall_do_suspend, cpu_v7_do_suspend + globl_equ cpu_v7_bpiall_do_resume, cpu_v7_do_resume +#endif + define_processor_functions v7_bpiall, dabort=v7_early_abort, pabort=v7_pabort, suspend=1 + +#define HARDENED_BPIALL_PROCESSOR_FUNCTIONS v7_bpiall_processor_functions +#else +#define HARDENED_BPIALL_PROCESSOR_FUNCTIONS v7_processor_functions +#endif + #ifndef CONFIG_ARM_LPAE + @ Cortex-A8 - always needs bpiall switch_mm implementation + globl_equ cpu_ca8_proc_init, cpu_v7_proc_init + globl_equ cpu_ca8_proc_fin, cpu_v7_proc_fin + globl_equ cpu_ca8_reset, cpu_v7_reset + globl_equ cpu_ca8_do_idle, cpu_v7_do_idle + globl_equ cpu_ca8_dcache_clean_area, cpu_v7_dcache_clean_area + globl_equ cpu_ca8_set_pte_ext, cpu_v7_set_pte_ext + globl_equ cpu_ca8_switch_mm, cpu_v7_bpiall_switch_mm + globl_equ cpu_ca8_suspend_size, cpu_v7_suspend_size +#ifdef CONFIG_ARM_CPU_SUSPEND + globl_equ cpu_ca8_do_suspend, cpu_v7_do_suspend + globl_equ cpu_ca8_do_resume, cpu_v7_do_resume +#endif define_processor_functions ca8, dabort=v7_early_abort, pabort=v7_pabort, suspend=1 + + @ Cortex-A9 - needs more registers preserved across suspend/resume + @ and bpiall switch_mm for hardening + globl_equ cpu_ca9mp_proc_init, cpu_v7_proc_init + globl_equ cpu_ca9mp_proc_fin, cpu_v7_proc_fin + globl_equ cpu_ca9mp_reset, cpu_v7_reset + globl_equ cpu_ca9mp_do_idle, cpu_v7_do_idle + globl_equ cpu_ca9mp_dcache_clean_area, cpu_v7_dcache_clean_area +#ifdef CONFIG_HARDEN_BRANCH_PREDICTOR + globl_equ cpu_ca9mp_switch_mm, cpu_v7_bpiall_switch_mm +#else + globl_equ cpu_ca9mp_switch_mm, cpu_v7_switch_mm +#endif + globl_equ cpu_ca9mp_set_pte_ext, cpu_v7_set_pte_ext define_processor_functions ca9mp, dabort=v7_early_abort, pabort=v7_pabort, suspend=1 #endif + + @ Cortex-A15 - needs iciallu switch_mm for hardening + globl_equ cpu_ca15_proc_init, cpu_v7_proc_init + globl_equ cpu_ca15_proc_fin, cpu_v7_proc_fin + globl_equ cpu_ca15_reset, cpu_v7_reset + globl_equ cpu_ca15_do_idle, cpu_v7_do_idle + globl_equ cpu_ca15_dcache_clean_area, cpu_v7_dcache_clean_area +#ifdef CONFIG_HARDEN_BRANCH_PREDICTOR + globl_equ cpu_ca15_switch_mm, cpu_v7_iciallu_switch_mm +#else + globl_equ cpu_ca15_switch_mm, cpu_v7_switch_mm +#endif + globl_equ cpu_ca15_set_pte_ext, cpu_v7_set_pte_ext + globl_equ cpu_ca15_suspend_size, cpu_v7_suspend_size + globl_equ cpu_ca15_do_suspend, cpu_v7_do_suspend + globl_equ cpu_ca15_do_resume, cpu_v7_do_resume + define_processor_functions ca15, dabort=v7_early_abort, pabort=v7_pabort, suspend=1 #ifdef CONFIG_CPU_PJ4B define_processor_functions pj4b, dabort=v7_early_abort, pabort=v7_pabort, suspend=1 #endif @@ -658,7 +709,7 @@ __v7_ca7mp_proc_info: __v7_ca12mp_proc_info: .long 0x410fc0d0 .long 0xff0ffff0 - __v7_proc __v7_ca12mp_proc_info, __v7_ca12mp_setup + __v7_proc __v7_ca12mp_proc_info, __v7_ca12mp_setup, proc_fns = HARDENED_BPIALL_PROCESSOR_FUNCTIONS .size __v7_ca12mp_proc_info, . - __v7_ca12mp_proc_info
/* @@ -668,7 +719,7 @@ __v7_ca12mp_proc_info: __v7_ca15mp_proc_info: .long 0x410fc0f0 .long 0xff0ffff0 - __v7_proc __v7_ca15mp_proc_info, __v7_ca15mp_setup + __v7_proc __v7_ca15mp_proc_info, __v7_ca15mp_setup, proc_fns = ca15_processor_functions .size __v7_ca15mp_proc_info, . - __v7_ca15mp_proc_info
/* @@ -678,7 +729,7 @@ __v7_ca15mp_proc_info: __v7_b15mp_proc_info: .long 0x420f00f0 .long 0xff0ffff0 - __v7_proc __v7_b15mp_proc_info, __v7_b15mp_setup + __v7_proc __v7_b15mp_proc_info, __v7_b15mp_setup, proc_fns = ca15_processor_functions .size __v7_b15mp_proc_info, . - __v7_b15mp_proc_info
/* @@ -688,9 +739,25 @@ __v7_b15mp_proc_info: __v7_ca17mp_proc_info: .long 0x410fc0e0 .long 0xff0ffff0 - __v7_proc __v7_ca17mp_proc_info, __v7_ca17mp_setup + __v7_proc __v7_ca17mp_proc_info, __v7_ca17mp_setup, proc_fns = HARDENED_BPIALL_PROCESSOR_FUNCTIONS .size __v7_ca17mp_proc_info, . - __v7_ca17mp_proc_info
+ /* ARM Ltd. Cortex A73 processor */ + .type __v7_ca73_proc_info, #object +__v7_ca73_proc_info: + .long 0x410fd090 + .long 0xff0ffff0 + __v7_proc __v7_ca73_proc_info, __v7_setup, proc_fns = HARDENED_BPIALL_PROCESSOR_FUNCTIONS + .size __v7_ca73_proc_info, . - __v7_ca73_proc_info + + /* ARM Ltd. Cortex A75 processor */ + .type __v7_ca75_proc_info, #object +__v7_ca75_proc_info: + .long 0x410fd0a0 + .long 0xff0ffff0 + __v7_proc __v7_ca75_proc_info, __v7_setup, proc_fns = HARDENED_BPIALL_PROCESSOR_FUNCTIONS + .size __v7_ca75_proc_info, . - __v7_ca75_proc_info + /* * Qualcomm Inc. Krait processors. */
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Russell King rmk+kernel@armlinux.org.uk
Commit e388b80288aade31135aca23d32eee93dd106795 upstream.
When the branch predictor hardening is enabled, firmware must have set the IBE bit in the auxiliary control register. If this bit has not been set, the Spectre workarounds will not be functional.
Add validation that this bit is set, and print a warning at alert level if this is not the case.
Signed-off-by: Russell King rmk+kernel@armlinux.org.uk Reviewed-by: Florian Fainelli f.fainelli@gmail.com Boot-tested-by: Tony Lindgren tony@atomide.com Reviewed-by: Tony Lindgren tony@atomide.com Signed-off-by: David A. Long dave.long@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm/mm/Makefile | 2 +- arch/arm/mm/proc-v7-bugs.c | 36 ++++++++++++++++++++++++++++++++++++ arch/arm/mm/proc-v7.S | 4 ++-- 3 files changed, 39 insertions(+), 3 deletions(-) create mode 100644 arch/arm/mm/proc-v7-bugs.c
--- a/arch/arm/mm/Makefile +++ b/arch/arm/mm/Makefile @@ -95,7 +95,7 @@ obj-$(CONFIG_CPU_MOHAWK) += proc-mohawk. obj-$(CONFIG_CPU_FEROCEON) += proc-feroceon.o obj-$(CONFIG_CPU_V6) += proc-v6.o obj-$(CONFIG_CPU_V6K) += proc-v6.o -obj-$(CONFIG_CPU_V7) += proc-v7.o +obj-$(CONFIG_CPU_V7) += proc-v7.o proc-v7-bugs.o obj-$(CONFIG_CPU_V7M) += proc-v7m.o
AFLAGS_proc-v6.o :=-Wa,-march=armv6 --- /dev/null +++ b/arch/arm/mm/proc-v7-bugs.c @@ -0,0 +1,36 @@ +// SPDX-License-Identifier: GPL-2.0 +#include <linux/kernel.h> +#include <linux/smp.h> + +static __maybe_unused void cpu_v7_check_auxcr_set(bool *warned, + u32 mask, const char *msg) +{ + u32 aux_cr; + + asm("mrc p15, 0, %0, c1, c0, 1" : "=r" (aux_cr)); + + if ((aux_cr & mask) != mask) { + if (!*warned) + pr_err("CPU%u: %s", smp_processor_id(), msg); + *warned = true; + } +} + +static DEFINE_PER_CPU(bool, spectre_warned); + +static void check_spectre_auxcr(bool *warned, u32 bit) +{ + if (IS_ENABLED(CONFIG_HARDEN_BRANCH_PREDICTOR) && + cpu_v7_check_auxcr_set(warned, bit, + "Spectre v2: firmware did not set auxiliary control register IBE bit, system vulnerable\n"); +} + +void cpu_v7_ca8_ibe(void) +{ + check_spectre_auxcr(this_cpu_ptr(&spectre_warned), BIT(6)); +} + +void cpu_v7_ca15_ibe(void) +{ + check_spectre_auxcr(this_cpu_ptr(&spectre_warned), BIT(0)); +} --- a/arch/arm/mm/proc-v7.S +++ b/arch/arm/mm/proc-v7.S @@ -569,7 +569,7 @@ __v7_setup_stack: globl_equ cpu_ca8_do_suspend, cpu_v7_do_suspend globl_equ cpu_ca8_do_resume, cpu_v7_do_resume #endif - define_processor_functions ca8, dabort=v7_early_abort, pabort=v7_pabort, suspend=1 + define_processor_functions ca8, dabort=v7_early_abort, pabort=v7_pabort, suspend=1, bugs=cpu_v7_ca8_ibe
@ Cortex-A9 - needs more registers preserved across suspend/resume @ and bpiall switch_mm for hardening @@ -602,7 +602,7 @@ __v7_setup_stack: globl_equ cpu_ca15_suspend_size, cpu_v7_suspend_size globl_equ cpu_ca15_do_suspend, cpu_v7_do_suspend globl_equ cpu_ca15_do_resume, cpu_v7_do_resume - define_processor_functions ca15, dabort=v7_early_abort, pabort=v7_pabort, suspend=1 + define_processor_functions ca15, dabort=v7_early_abort, pabort=v7_pabort, suspend=1, bugs=cpu_v7_ca15_ibe #ifdef CONFIG_CPU_PJ4B define_processor_functions pj4b, dabort=v7_early_abort, pabort=v7_pabort, suspend=1 #endif
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Russell King rmk+kernel@armlinux.org.uk
Commit f5fe12b1eaee220ce62ff9afb8b90929c396595f upstream.
In order to prevent aliasing attacks on the branch predictor, invalidate the BTB or instruction cache on CPUs that are known to be affected when taking an abort on a address that is outside of a user task limit:
Cortex A8, A9, A12, A17, A73, A75: flush BTB. Cortex A15, Brahma B15: invalidate icache.
If the IBE bit is not set, then there is little point to enabling the workaround.
Signed-off-by: Russell King rmk+kernel@armlinux.org.uk Boot-tested-by: Tony Lindgren tony@atomide.com Reviewed-by: Tony Lindgren tony@atomide.com Signed-off-by: David A. Long dave.long@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm/include/asm/cp15.h | 3 + arch/arm/include/asm/system_misc.h | 15 +++++++ arch/arm/mm/fault.c | 3 + arch/arm/mm/proc-v7-bugs.c | 73 ++++++++++++++++++++++++++++++++++--- arch/arm/mm/proc-v7.S | 8 ++-- 5 files changed, 94 insertions(+), 8 deletions(-)
--- a/arch/arm/include/asm/cp15.h +++ b/arch/arm/include/asm/cp15.h @@ -65,6 +65,9 @@ #define __write_sysreg(v, r, w, c, t) asm volatile(w " " c : : "r" ((t)(v))) #define write_sysreg(v, ...) __write_sysreg(v, __VA_ARGS__)
+#define BPIALL __ACCESS_CP15(c7, 0, c5, 6) +#define ICIALLU __ACCESS_CP15(c7, 0, c5, 0) + extern unsigned long cr_alignment; /* defined in entry-armv.S */
static inline unsigned long get_cr(void) --- a/arch/arm/include/asm/system_misc.h +++ b/arch/arm/include/asm/system_misc.h @@ -8,6 +8,7 @@ #include <linux/linkage.h> #include <linux/irqflags.h> #include <linux/reboot.h> +#include <linux/percpu.h>
extern void cpu_init(void);
@@ -15,6 +16,20 @@ void soft_restart(unsigned long); extern void (*arm_pm_restart)(enum reboot_mode reboot_mode, const char *cmd); extern void (*arm_pm_idle)(void);
+#ifdef CONFIG_HARDEN_BRANCH_PREDICTOR +typedef void (*harden_branch_predictor_fn_t)(void); +DECLARE_PER_CPU(harden_branch_predictor_fn_t, harden_branch_predictor_fn); +static inline void harden_branch_predictor(void) +{ + harden_branch_predictor_fn_t fn = per_cpu(harden_branch_predictor_fn, + smp_processor_id()); + if (fn) + fn(); +} +#else +#define harden_branch_predictor() do { } while (0) +#endif + #define UDBG_UNDEFINED (1 << 0) #define UDBG_SYSCALL (1 << 1) #define UDBG_BADABORT (1 << 2) --- a/arch/arm/mm/fault.c +++ b/arch/arm/mm/fault.c @@ -164,6 +164,9 @@ __do_user_fault(struct task_struct *tsk, { struct siginfo si;
+ if (addr > TASK_SIZE) + harden_branch_predictor(); + #ifdef CONFIG_DEBUG_USER if (((user_debug & UDBG_SEGV) && (sig == SIGSEGV)) || ((user_debug & UDBG_BUS) && (sig == SIGBUS))) { --- a/arch/arm/mm/proc-v7-bugs.c +++ b/arch/arm/mm/proc-v7-bugs.c @@ -2,7 +2,61 @@ #include <linux/kernel.h> #include <linux/smp.h>
-static __maybe_unused void cpu_v7_check_auxcr_set(bool *warned, +#include <asm/cp15.h> +#include <asm/cputype.h> +#include <asm/system_misc.h> + +#ifdef CONFIG_HARDEN_BRANCH_PREDICTOR +DEFINE_PER_CPU(harden_branch_predictor_fn_t, harden_branch_predictor_fn); + +static void harden_branch_predictor_bpiall(void) +{ + write_sysreg(0, BPIALL); +} + +static void harden_branch_predictor_iciallu(void) +{ + write_sysreg(0, ICIALLU); +} + +static void cpu_v7_spectre_init(void) +{ + const char *spectre_v2_method = NULL; + int cpu = smp_processor_id(); + + if (per_cpu(harden_branch_predictor_fn, cpu)) + return; + + switch (read_cpuid_part()) { + case ARM_CPU_PART_CORTEX_A8: + case ARM_CPU_PART_CORTEX_A9: + case ARM_CPU_PART_CORTEX_A12: + case ARM_CPU_PART_CORTEX_A17: + case ARM_CPU_PART_CORTEX_A73: + case ARM_CPU_PART_CORTEX_A75: + per_cpu(harden_branch_predictor_fn, cpu) = + harden_branch_predictor_bpiall; + spectre_v2_method = "BPIALL"; + break; + + case ARM_CPU_PART_CORTEX_A15: + case ARM_CPU_PART_BRAHMA_B15: + per_cpu(harden_branch_predictor_fn, cpu) = + harden_branch_predictor_iciallu; + spectre_v2_method = "ICIALLU"; + break; + } + if (spectre_v2_method) + pr_info("CPU%u: Spectre v2: using %s workaround\n", + smp_processor_id(), spectre_v2_method); +} +#else +static void cpu_v7_spectre_init(void) +{ +} +#endif + +static __maybe_unused bool cpu_v7_check_auxcr_set(bool *warned, u32 mask, const char *msg) { u32 aux_cr; @@ -13,24 +67,33 @@ static __maybe_unused void cpu_v7_check_ if (!*warned) pr_err("CPU%u: %s", smp_processor_id(), msg); *warned = true; + return false; } + return true; }
static DEFINE_PER_CPU(bool, spectre_warned);
-static void check_spectre_auxcr(bool *warned, u32 bit) +static bool check_spectre_auxcr(bool *warned, u32 bit) { - if (IS_ENABLED(CONFIG_HARDEN_BRANCH_PREDICTOR) && + return IS_ENABLED(CONFIG_HARDEN_BRANCH_PREDICTOR) && cpu_v7_check_auxcr_set(warned, bit, "Spectre v2: firmware did not set auxiliary control register IBE bit, system vulnerable\n"); }
void cpu_v7_ca8_ibe(void) { - check_spectre_auxcr(this_cpu_ptr(&spectre_warned), BIT(6)); + if (check_spectre_auxcr(this_cpu_ptr(&spectre_warned), BIT(6))) + cpu_v7_spectre_init(); }
void cpu_v7_ca15_ibe(void) { - check_spectre_auxcr(this_cpu_ptr(&spectre_warned), BIT(0)); + if (check_spectre_auxcr(this_cpu_ptr(&spectre_warned), BIT(0))) + cpu_v7_spectre_init(); +} + +void cpu_v7_bugs_init(void) +{ + cpu_v7_spectre_init(); } --- a/arch/arm/mm/proc-v7.S +++ b/arch/arm/mm/proc-v7.S @@ -532,8 +532,10 @@ __v7_setup_stack:
__INITDATA
+ .weak cpu_v7_bugs_init + @ define struct processor (see <asm/proc-fns.h> and proc-macros.S) - define_processor_functions v7, dabort=v7_early_abort, pabort=v7_pabort, suspend=1 + define_processor_functions v7, dabort=v7_early_abort, pabort=v7_pabort, suspend=1, bugs=cpu_v7_bugs_init
#ifdef CONFIG_HARDEN_BRANCH_PREDICTOR @ generic v7 bpiall on context switch @@ -548,7 +550,7 @@ __v7_setup_stack: globl_equ cpu_v7_bpiall_do_suspend, cpu_v7_do_suspend globl_equ cpu_v7_bpiall_do_resume, cpu_v7_do_resume #endif - define_processor_functions v7_bpiall, dabort=v7_early_abort, pabort=v7_pabort, suspend=1 + define_processor_functions v7_bpiall, dabort=v7_early_abort, pabort=v7_pabort, suspend=1, bugs=cpu_v7_bugs_init
#define HARDENED_BPIALL_PROCESSOR_FUNCTIONS v7_bpiall_processor_functions #else @@ -584,7 +586,7 @@ __v7_setup_stack: globl_equ cpu_ca9mp_switch_mm, cpu_v7_switch_mm #endif globl_equ cpu_ca9mp_set_pte_ext, cpu_v7_set_pte_ext - define_processor_functions ca9mp, dabort=v7_early_abort, pabort=v7_pabort, suspend=1 + define_processor_functions ca9mp, dabort=v7_early_abort, pabort=v7_pabort, suspend=1, bugs=cpu_v7_bugs_init #endif
@ Cortex-A15 - needs iciallu switch_mm for hardening
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Russell King rmk+kernel@armlinux.org.uk
Commit 10115105cb3aa17b5da1cb726ae8dd5f6854bd93 upstream.
Add firmware based hardening for cores that require more complex handling in firmware.
Signed-off-by: Russell King rmk+kernel@armlinux.org.uk Boot-tested-by: Tony Lindgren tony@atomide.com Reviewed-by: Tony Lindgren tony@atomide.com Reviewed-by: Marc Zyngier marc.zyngier@arm.com Signed-off-by: David A. Long dave.long@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm/mm/proc-v7-bugs.c | 60 +++++++++++++++++++++++++++++++++++++++++++++ arch/arm/mm/proc-v7.S | 21 +++++++++++++++ 2 files changed, 81 insertions(+)
--- a/arch/arm/mm/proc-v7-bugs.c +++ b/arch/arm/mm/proc-v7-bugs.c @@ -1,14 +1,20 @@ // SPDX-License-Identifier: GPL-2.0 +#include <linux/arm-smccc.h> #include <linux/kernel.h> +#include <linux/psci.h> #include <linux/smp.h>
#include <asm/cp15.h> #include <asm/cputype.h> +#include <asm/proc-fns.h> #include <asm/system_misc.h>
#ifdef CONFIG_HARDEN_BRANCH_PREDICTOR DEFINE_PER_CPU(harden_branch_predictor_fn_t, harden_branch_predictor_fn);
+extern void cpu_v7_smc_switch_mm(phys_addr_t pgd_phys, struct mm_struct *mm); +extern void cpu_v7_hvc_switch_mm(phys_addr_t pgd_phys, struct mm_struct *mm); + static void harden_branch_predictor_bpiall(void) { write_sysreg(0, BPIALL); @@ -19,6 +25,16 @@ static void harden_branch_predictor_icia write_sysreg(0, ICIALLU); }
+static void __maybe_unused call_smc_arch_workaround_1(void) +{ + arm_smccc_1_1_smc(ARM_SMCCC_ARCH_WORKAROUND_1, NULL); +} + +static void __maybe_unused call_hvc_arch_workaround_1(void) +{ + arm_smccc_1_1_hvc(ARM_SMCCC_ARCH_WORKAROUND_1, NULL); +} + static void cpu_v7_spectre_init(void) { const char *spectre_v2_method = NULL; @@ -45,7 +61,51 @@ static void cpu_v7_spectre_init(void) harden_branch_predictor_iciallu; spectre_v2_method = "ICIALLU"; break; + +#ifdef CONFIG_ARM_PSCI + default: + /* Other ARM CPUs require no workaround */ + if (read_cpuid_implementor() == ARM_CPU_IMP_ARM) + break; + /* fallthrough */ + /* Cortex A57/A72 require firmware workaround */ + case ARM_CPU_PART_CORTEX_A57: + case ARM_CPU_PART_CORTEX_A72: { + struct arm_smccc_res res; + + if (psci_ops.smccc_version == SMCCC_VERSION_1_0) + break; + + switch (psci_ops.conduit) { + case PSCI_CONDUIT_HVC: + arm_smccc_1_1_hvc(ARM_SMCCC_ARCH_FEATURES_FUNC_ID, + ARM_SMCCC_ARCH_WORKAROUND_1, &res); + if ((int)res.a0 != 0) + break; + per_cpu(harden_branch_predictor_fn, cpu) = + call_hvc_arch_workaround_1; + processor.switch_mm = cpu_v7_hvc_switch_mm; + spectre_v2_method = "hypervisor"; + break; + + case PSCI_CONDUIT_SMC: + arm_smccc_1_1_smc(ARM_SMCCC_ARCH_FEATURES_FUNC_ID, + ARM_SMCCC_ARCH_WORKAROUND_1, &res); + if ((int)res.a0 != 0) + break; + per_cpu(harden_branch_predictor_fn, cpu) = + call_smc_arch_workaround_1; + processor.switch_mm = cpu_v7_smc_switch_mm; + spectre_v2_method = "firmware"; + break; + + default: + break; + } } +#endif + } + if (spectre_v2_method) pr_info("CPU%u: Spectre v2: using %s workaround\n", smp_processor_id(), spectre_v2_method); --- a/arch/arm/mm/proc-v7.S +++ b/arch/arm/mm/proc-v7.S @@ -9,6 +9,7 @@ * * This is the "shell" of the ARMv7 processor support. */ +#include <linux/arm-smccc.h> #include <linux/init.h> #include <linux/linkage.h> #include <asm/assembler.h> @@ -93,6 +94,26 @@ ENTRY(cpu_v7_dcache_clean_area) ret lr ENDPROC(cpu_v7_dcache_clean_area)
+#ifdef CONFIG_ARM_PSCI + .arch_extension sec +ENTRY(cpu_v7_smc_switch_mm) + stmfd sp!, {r0 - r3} + movw r0, #:lower16:ARM_SMCCC_ARCH_WORKAROUND_1 + movt r0, #:upper16:ARM_SMCCC_ARCH_WORKAROUND_1 + smc #0 + ldmfd sp!, {r0 - r3} + b cpu_v7_switch_mm +ENDPROC(cpu_v7_smc_switch_mm) + .arch_extension virt +ENTRY(cpu_v7_hvc_switch_mm) + stmfd sp!, {r0 - r3} + movw r0, #:lower16:ARM_SMCCC_ARCH_WORKAROUND_1 + movt r0, #:upper16:ARM_SMCCC_ARCH_WORKAROUND_1 + hvc #0 + ldmfd sp!, {r0 - r3} + b cpu_v7_switch_mm +ENDPROC(cpu_v7_smc_switch_mm) +#endif ENTRY(cpu_v7_iciallu_switch_mm) mov r3, #0 mcr p15, 0, r3, c7, c5, 0 @ ICIALLU
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Russell King rmk+kernel@armlinux.org.uk
Commit c44f366ea7c85e1be27d08f2f0880f4120698125 upstream.
Warn at error level if the context switching function is not what we are expecting. This can happen with big.Little systems, which we currently do not support.
Signed-off-by: Russell King rmk+kernel@armlinux.org.uk Boot-tested-by: Tony Lindgren tony@atomide.com Reviewed-by: Tony Lindgren tony@atomide.com Acked-by: Marc Zyngier marc.zyngier@arm.com Signed-off-by: David A. Long dave.long@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm/mm/proc-v7-bugs.c | 15 +++++++++++++++ 1 file changed, 15 insertions(+)
--- a/arch/arm/mm/proc-v7-bugs.c +++ b/arch/arm/mm/proc-v7-bugs.c @@ -12,6 +12,8 @@ #ifdef CONFIG_HARDEN_BRANCH_PREDICTOR DEFINE_PER_CPU(harden_branch_predictor_fn_t, harden_branch_predictor_fn);
+extern void cpu_v7_iciallu_switch_mm(phys_addr_t pgd_phys, struct mm_struct *mm); +extern void cpu_v7_bpiall_switch_mm(phys_addr_t pgd_phys, struct mm_struct *mm); extern void cpu_v7_smc_switch_mm(phys_addr_t pgd_phys, struct mm_struct *mm); extern void cpu_v7_hvc_switch_mm(phys_addr_t pgd_phys, struct mm_struct *mm);
@@ -50,6 +52,8 @@ static void cpu_v7_spectre_init(void) case ARM_CPU_PART_CORTEX_A17: case ARM_CPU_PART_CORTEX_A73: case ARM_CPU_PART_CORTEX_A75: + if (processor.switch_mm != cpu_v7_bpiall_switch_mm) + goto bl_error; per_cpu(harden_branch_predictor_fn, cpu) = harden_branch_predictor_bpiall; spectre_v2_method = "BPIALL"; @@ -57,6 +61,8 @@ static void cpu_v7_spectre_init(void)
case ARM_CPU_PART_CORTEX_A15: case ARM_CPU_PART_BRAHMA_B15: + if (processor.switch_mm != cpu_v7_iciallu_switch_mm) + goto bl_error; per_cpu(harden_branch_predictor_fn, cpu) = harden_branch_predictor_iciallu; spectre_v2_method = "ICIALLU"; @@ -82,6 +88,8 @@ static void cpu_v7_spectre_init(void) ARM_SMCCC_ARCH_WORKAROUND_1, &res); if ((int)res.a0 != 0) break; + if (processor.switch_mm != cpu_v7_hvc_switch_mm && cpu) + goto bl_error; per_cpu(harden_branch_predictor_fn, cpu) = call_hvc_arch_workaround_1; processor.switch_mm = cpu_v7_hvc_switch_mm; @@ -93,6 +101,8 @@ static void cpu_v7_spectre_init(void) ARM_SMCCC_ARCH_WORKAROUND_1, &res); if ((int)res.a0 != 0) break; + if (processor.switch_mm != cpu_v7_smc_switch_mm && cpu) + goto bl_error; per_cpu(harden_branch_predictor_fn, cpu) = call_smc_arch_workaround_1; processor.switch_mm = cpu_v7_smc_switch_mm; @@ -109,6 +119,11 @@ static void cpu_v7_spectre_init(void) if (spectre_v2_method) pr_info("CPU%u: Spectre v2: using %s workaround\n", smp_processor_id(), spectre_v2_method); + return; + +bl_error: + pr_err("CPU%u: Spectre v2: incorrect context switching function, system vulnerable\n", + cpu); } #else static void cpu_v7_spectre_init(void)
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Marc Zyngier marc.zyngier@arm.com
Commit 3f7e8e2e1ebda787f156ce46e3f0a9ce2833fa4f upstream.
In order to avoid aliasing attacks against the branch predictor, let's invalidate the BTB on guest exit. This is made complicated by the fact that we cannot take a branch before invalidating the BTB.
We only apply this to A12 and A17, which are the only two ARM cores on which this useful.
Signed-off-by: Marc Zyngier marc.zyngier@arm.com Signed-off-by: Russell King rmk+kernel@armlinux.org.uk Boot-tested-by: Tony Lindgren tony@atomide.com Reviewed-by: Tony Lindgren tony@atomide.com Signed-off-by: David A. Long dave.long@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm/include/asm/kvm_asm.h | 2 - arch/arm/include/asm/kvm_mmu.h | 17 +++++++++ arch/arm/kvm/hyp/hyp-entry.S | 71 +++++++++++++++++++++++++++++++++++++++-- 3 files changed, 85 insertions(+), 5 deletions(-)
--- a/arch/arm/include/asm/kvm_asm.h +++ b/arch/arm/include/asm/kvm_asm.h @@ -61,8 +61,6 @@ struct kvm_vcpu; extern char __kvm_hyp_init[]; extern char __kvm_hyp_init_end[];
-extern char __kvm_hyp_vector[]; - extern void __kvm_flush_vm_context(void); extern void __kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa); extern void __kvm_tlb_flush_vmid(struct kvm *kvm); --- a/arch/arm/include/asm/kvm_mmu.h +++ b/arch/arm/include/asm/kvm_mmu.h @@ -246,7 +246,22 @@ static inline int kvm_read_guest_lock(st
static inline void *kvm_get_hyp_vector(void) { - return kvm_ksym_ref(__kvm_hyp_vector); + switch(read_cpuid_part()) { +#ifdef CONFIG_HARDEN_BRANCH_PREDICTOR + case ARM_CPU_PART_CORTEX_A12: + case ARM_CPU_PART_CORTEX_A17: + { + extern char __kvm_hyp_vector_bp_inv[]; + return kvm_ksym_ref(__kvm_hyp_vector_bp_inv); + } + +#endif + default: + { + extern char __kvm_hyp_vector[]; + return kvm_ksym_ref(__kvm_hyp_vector); + } + } }
static inline int kvm_map_vectors(void) --- a/arch/arm/kvm/hyp/hyp-entry.S +++ b/arch/arm/kvm/hyp/hyp-entry.S @@ -71,6 +71,66 @@ __kvm_hyp_vector: W(b) hyp_irq W(b) hyp_fiq
+#ifdef CONFIG_HARDEN_BRANCH_PREDICTOR + .align 5 +__kvm_hyp_vector_bp_inv: + .global __kvm_hyp_vector_bp_inv + + /* + * We encode the exception entry in the bottom 3 bits of + * SP, and we have to guarantee to be 8 bytes aligned. + */ + W(add) sp, sp, #1 /* Reset 7 */ + W(add) sp, sp, #1 /* Undef 6 */ + W(add) sp, sp, #1 /* Syscall 5 */ + W(add) sp, sp, #1 /* Prefetch abort 4 */ + W(add) sp, sp, #1 /* Data abort 3 */ + W(add) sp, sp, #1 /* HVC 2 */ + W(add) sp, sp, #1 /* IRQ 1 */ + W(nop) /* FIQ 0 */ + + mcr p15, 0, r0, c7, c5, 6 /* BPIALL */ + isb + +#ifdef CONFIG_THUMB2_KERNEL + /* + * Yet another silly hack: Use VPIDR as a temp register. + * Thumb2 is really a pain, as SP cannot be used with most + * of the bitwise instructions. The vect_br macro ensures + * things gets cleaned-up. + */ + mcr p15, 4, r0, c0, c0, 0 /* VPIDR */ + mov r0, sp + and r0, r0, #7 + sub sp, sp, r0 + push {r1, r2} + mov r1, r0 + mrc p15, 4, r0, c0, c0, 0 /* VPIDR */ + mrc p15, 0, r2, c0, c0, 0 /* MIDR */ + mcr p15, 4, r2, c0, c0, 0 /* VPIDR */ +#endif + +.macro vect_br val, targ +ARM( eor sp, sp, #\val ) +ARM( tst sp, #7 ) +ARM( eorne sp, sp, #\val ) + +THUMB( cmp r1, #\val ) +THUMB( popeq {r1, r2} ) + + beq \targ +.endm + + vect_br 0, hyp_fiq + vect_br 1, hyp_irq + vect_br 2, hyp_hvc + vect_br 3, hyp_dabt + vect_br 4, hyp_pabt + vect_br 5, hyp_svc + vect_br 6, hyp_undef + vect_br 7, hyp_reset +#endif + .macro invalid_vector label, cause .align \label: mov r0, #\cause @@ -149,7 +209,14 @@ hyp_hvc: bx ip
1: - push {lr} + /* + * Pushing r2 here is just a way of keeping the stack aligned to + * 8 bytes on any path that can trigger a HYP exception. Here, + * we may well be about to jump into the guest, and the guest + * exit would otherwise be badly decoded by our fancy + * "decode-exception-without-a-branch" code... + */ + push {r2, lr}
mov lr, r0 mov r0, r1 @@ -159,7 +226,7 @@ hyp_hvc: THUMB( orr lr, #1) blx lr @ Call the HYP function
- pop {lr} + pop {r2, lr} eret
guest_trap:
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Marc Zyngier marc.zyngier@arm.com
Commit 0c47ac8cd157727e7a532d665d6fb1b5fd333977 upstream.
In order to avoid aliasing attacks against the branch predictor on Cortex-A15, let's invalidate the BTB on guest exit, which can only be done by invalidating the icache (with ACTLR[0] being set).
We use the same hack as for A12/A17 to perform the vector decoding.
Signed-off-by: Marc Zyngier marc.zyngier@arm.com Signed-off-by: Russell King rmk+kernel@armlinux.org.uk Boot-tested-by: Tony Lindgren tony@atomide.com Reviewed-by: Tony Lindgren tony@atomide.com Signed-off-by: David A. Long dave.long@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm/include/asm/kvm_mmu.h | 5 +++++ arch/arm/kvm/hyp/hyp-entry.S | 24 ++++++++++++++++++++++++ 2 files changed, 29 insertions(+)
--- a/arch/arm/include/asm/kvm_mmu.h +++ b/arch/arm/include/asm/kvm_mmu.h @@ -255,6 +255,11 @@ static inline void *kvm_get_hyp_vector(v return kvm_ksym_ref(__kvm_hyp_vector_bp_inv); }
+ case ARM_CPU_PART_CORTEX_A15: + { + extern char __kvm_hyp_vector_ic_inv[]; + return kvm_ksym_ref(__kvm_hyp_vector_ic_inv); + } #endif default: { --- a/arch/arm/kvm/hyp/hyp-entry.S +++ b/arch/arm/kvm/hyp/hyp-entry.S @@ -73,6 +73,28 @@ __kvm_hyp_vector:
#ifdef CONFIG_HARDEN_BRANCH_PREDICTOR .align 5 +__kvm_hyp_vector_ic_inv: + .global __kvm_hyp_vector_ic_inv + + /* + * We encode the exception entry in the bottom 3 bits of + * SP, and we have to guarantee to be 8 bytes aligned. + */ + W(add) sp, sp, #1 /* Reset 7 */ + W(add) sp, sp, #1 /* Undef 6 */ + W(add) sp, sp, #1 /* Syscall 5 */ + W(add) sp, sp, #1 /* Prefetch abort 4 */ + W(add) sp, sp, #1 /* Data abort 3 */ + W(add) sp, sp, #1 /* HVC 2 */ + W(add) sp, sp, #1 /* IRQ 1 */ + W(nop) /* FIQ 0 */ + + mcr p15, 0, r0, c7, c5, 0 /* ICIALLU */ + isb + + b decode_vectors + + .align 5 __kvm_hyp_vector_bp_inv: .global __kvm_hyp_vector_bp_inv
@@ -92,6 +114,8 @@ __kvm_hyp_vector_bp_inv: mcr p15, 0, r0, c7, c5, 6 /* BPIALL */ isb
+decode_vectors: + #ifdef CONFIG_THUMB2_KERNEL /* * Yet another silly hack: Use VPIDR as a temp register.
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Russell King rmk+kernel@armlinux.org.uk
Commit 3c908e16396d130608e831b7fac4b167a2ede6ba upstream.
Include Brahma B15 in the Spectre v2 KVM workarounds.
Signed-off-by: Russell King rmk+kernel@armlinux.org.uk Acked-by: Florian Fainelli f.fainelli@gmail.com Boot-tested-by: Tony Lindgren tony@atomide.com Reviewed-by: Tony Lindgren tony@atomide.com Acked-by: Marc Zyngier marc.zyngier@arm.com Signed-off-by: David A. Long dave.long@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm/include/asm/kvm_mmu.h | 1 + 1 file changed, 1 insertion(+)
--- a/arch/arm/include/asm/kvm_mmu.h +++ b/arch/arm/include/asm/kvm_mmu.h @@ -255,6 +255,7 @@ static inline void *kvm_get_hyp_vector(v return kvm_ksym_ref(__kvm_hyp_vector_bp_inv); }
+ case ARM_CPU_PART_BRAHMA_B15: case ARM_CPU_PART_CORTEX_A15: { extern char __kvm_hyp_vector_ic_inv[];
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Russell King rmk+kernel@armlinux.org.uk
Commit b800acfc70d9fb81fbd6df70f2cf5e20f70023d0 upstream.
We want SMCCC_ARCH_WORKAROUND_1 to be fast. As fast as possible. So let's intercept it as early as we can by testing for the function call number as soon as we've identified a HVC call coming from the guest.
Signed-off-by: Russell King rmk+kernel@armlinux.org.uk Boot-tested-by: Tony Lindgren tony@atomide.com Reviewed-by: Tony Lindgren tony@atomide.com Reviewed-by: Marc Zyngier marc.zyngier@arm.com Signed-off-by: David A. Long dave.long@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm/kvm/hyp/hyp-entry.S | 17 ++++++++++++++++- 1 file changed, 16 insertions(+), 1 deletion(-)
--- a/arch/arm/kvm/hyp/hyp-entry.S +++ b/arch/arm/kvm/hyp/hyp-entry.S @@ -16,6 +16,7 @@ * Foundation, 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA. */
+#include <linux/arm-smccc.h> #include <linux/linkage.h> #include <asm/kvm_arm.h> #include <asm/kvm_asm.h> @@ -202,7 +203,7 @@ hyp_hvc: lsr r2, r2, #16 and r2, r2, #0xff cmp r2, #0 - bne guest_trap @ Guest called HVC + bne guest_hvc_trap @ Guest called HVC
/* * Getting here means host called HVC, we shift parameters and branch @@ -253,6 +254,20 @@ THUMB( orr lr, #1) pop {r2, lr} eret
+guest_hvc_trap: + movw r2, #:lower16:ARM_SMCCC_ARCH_WORKAROUND_1 + movt r2, #:upper16:ARM_SMCCC_ARCH_WORKAROUND_1 + ldr r0, [sp] @ Guest's r0 + teq r0, r2 + bne guest_trap + add sp, sp, #12 + @ Returns: + @ r0 = 0 + @ r1 = HSR value (perfectly predictable) + @ r2 = ARM_SMCCC_ARCH_WORKAROUND_1 + mov r0, #0 + eret + guest_trap: load_vcpu r0 @ Load VCPU pointer to r0
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Russell King rmk+kernel@armlinux.org.uk
Commit add5609877c6785cc002c6ed7e008b1d61064439 upstream.
Report support for SMCCC_ARCH_WORKAROUND_1 to KVM guests for affected CPUs.
Signed-off-by: Russell King rmk+kernel@armlinux.org.uk Boot-tested-by: Tony Lindgren tony@atomide.com Reviewed-by: Tony Lindgren tony@atomide.com Reviewed-by: Marc Zyngier marc.zyngier@arm.com Signed-off-by: David A. Long dave.long@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm/include/asm/kvm_host.h | 14 ++++++++++++-- 1 file changed, 12 insertions(+), 2 deletions(-)
--- a/arch/arm/include/asm/kvm_host.h +++ b/arch/arm/include/asm/kvm_host.h @@ -21,6 +21,7 @@
#include <linux/types.h> #include <linux/kvm_types.h> +#include <asm/cputype.h> #include <asm/kvm.h> #include <asm/kvm_asm.h> #include <asm/kvm_mmio.h> @@ -298,8 +299,17 @@ int kvm_arm_vcpu_arch_has_attr(struct kv
static inline bool kvm_arm_harden_branch_predictor(void) { - /* No way to detect it yet, pretend it is not there. */ - return false; + switch(read_cpuid_part()) { +#ifdef CONFIG_HARDEN_BRANCH_PREDICTOR + case ARM_CPU_PART_BRAHMA_B15: + case ARM_CPU_PART_CORTEX_A12: + case ARM_CPU_PART_CORTEX_A15: + case ARM_CPU_PART_CORTEX_A17: + return true; +#endif + default: + return false; + } }
#define KVM_SSBD_UNKNOWN -1
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Russell King rmk+kernel@armlinux.org.uk
Commit a78d156587931a2c3b354534aa772febf6c9e855 upstream.
Add assembly and C macros for the new CSDB instruction.
Signed-off-by: Russell King rmk+kernel@armlinux.org.uk Acked-by: Mark Rutland mark.rutland@arm.com Boot-tested-by: Tony Lindgren tony@atomide.com Reviewed-by: Tony Lindgren tony@atomide.com Signed-off-by: David A. Long dave.long@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm/include/asm/assembler.h | 8 ++++++++ arch/arm/include/asm/barrier.h | 13 +++++++++++++ 2 files changed, 21 insertions(+)
--- a/arch/arm/include/asm/assembler.h +++ b/arch/arm/include/asm/assembler.h @@ -447,6 +447,14 @@ THUMB( orr \reg , \reg , #PSR_T_BIT ) .size \name , . - \name .endm
+ .macro csdb +#ifdef CONFIG_THUMB2_KERNEL + .inst.w 0xf3af8014 +#else + .inst 0xe320f014 +#endif + .endm + .macro check_uaccess, addr:req, size:req, limit:req, tmp:req, bad:req #ifndef CONFIG_CPU_USE_DOMAINS adds \tmp, \addr, #\size - 1 --- a/arch/arm/include/asm/barrier.h +++ b/arch/arm/include/asm/barrier.h @@ -17,6 +17,12 @@ #define isb(option) __asm__ __volatile__ ("isb " #option : : : "memory") #define dsb(option) __asm__ __volatile__ ("dsb " #option : : : "memory") #define dmb(option) __asm__ __volatile__ ("dmb " #option : : : "memory") +#ifdef CONFIG_THUMB2_KERNEL +#define CSDB ".inst.w 0xf3af8014" +#else +#define CSDB ".inst 0xe320f014" +#endif +#define csdb() __asm__ __volatile__(CSDB : : : "memory") #elif defined(CONFIG_CPU_XSC3) || __LINUX_ARM_ARCH__ == 6 #define isb(x) __asm__ __volatile__ ("mcr p15, 0, %0, c7, c5, 4" \ : : "r" (0) : "memory") @@ -37,6 +43,13 @@ #define dmb(x) __asm__ __volatile__ ("" : : : "memory") #endif
+#ifndef CSDB +#define CSDB +#endif +#ifndef csdb +#define csdb() +#endif + #ifdef CONFIG_ARM_HEAVY_MB extern void (*soc_mb)(void); extern void arm_heavy_mb(void);
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Russell King rmk+kernel@armlinux.org.uk
Commit 1d4238c56f9816ce0f9c8dbe42d7f2ad81cb6613 upstream.
Add an implementation of the array_index_mask_nospec() function for mitigating Spectre variant 1 throughout the kernel.
Signed-off-by: Russell King rmk+kernel@armlinux.org.uk Acked-by: Mark Rutland mark.rutland@arm.com Boot-tested-by: Tony Lindgren tony@atomide.com Reviewed-by: Tony Lindgren tony@atomide.com Signed-off-by: David A. Long dave.long@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm/include/asm/barrier.h | 19 +++++++++++++++++++ 1 file changed, 19 insertions(+)
--- a/arch/arm/include/asm/barrier.h +++ b/arch/arm/include/asm/barrier.h @@ -76,6 +76,25 @@ extern void arm_heavy_mb(void); #define __smp_rmb() __smp_mb() #define __smp_wmb() dmb(ishst)
+#ifdef CONFIG_CPU_SPECTRE +static inline unsigned long array_index_mask_nospec(unsigned long idx, + unsigned long sz) +{ + unsigned long mask; + + asm volatile( + "cmp %1, %2\n" + " sbc %0, %1, %1\n" + CSDB + : "=r" (mask) + : "r" (idx), "Ir" (sz) + : "cc"); + + return mask; +} +#define array_index_mask_nospec array_index_mask_nospec +#endif + #include <asm-generic/barrier.h>
#endif /* !__ASSEMBLY__ */
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Russell King rmk+kernel@armlinux.org.uk
Commit 10573ae547c85b2c61417ff1a106cffbfceada35 upstream.
Prevent speculation at the syscall table decoding by clamping the index used to zero on invalid system call numbers, and using the csdb speculative barrier.
Signed-off-by: Russell King rmk+kernel@armlinux.org.uk Acked-by: Mark Rutland mark.rutland@arm.com Boot-tested-by: Tony Lindgren tony@atomide.com Reviewed-by: Tony Lindgren tony@atomide.com Signed-off-by: David A. Long dave.long@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm/kernel/entry-common.S | 18 +++++++----------- arch/arm/kernel/entry-header.S | 25 +++++++++++++++++++++++++ 2 files changed, 32 insertions(+), 11 deletions(-)
--- a/arch/arm/kernel/entry-common.S +++ b/arch/arm/kernel/entry-common.S @@ -241,9 +241,7 @@ local_restart: tst r10, #_TIF_SYSCALL_WORK @ are we tracing syscalls? bne __sys_trace
- cmp scno, #NR_syscalls @ check upper syscall limit - badr lr, ret_fast_syscall @ return address - ldrcc pc, [tbl, scno, lsl #2] @ call sys_* routine + invoke_syscall tbl, scno, r10, ret_fast_syscall
add r1, sp, #S_OFF 2: cmp scno, #(__ARM_NR_BASE - __NR_SYSCALL_BASE) @@ -277,14 +275,8 @@ __sys_trace: mov r1, scno add r0, sp, #S_OFF bl syscall_trace_enter - - badr lr, __sys_trace_return @ return address - mov scno, r0 @ syscall number (possibly new) - add r1, sp, #S_R0 + S_OFF @ pointer to regs - cmp scno, #NR_syscalls @ check upper syscall limit - ldmccia r1, {r0 - r6} @ have to reload r0 - r6 - stmccia sp, {r4, r5} @ and update the stack args - ldrcc pc, [tbl, scno, lsl #2] @ call sys_* routine + mov scno, r0 + invoke_syscall tbl, scno, r10, __sys_trace_return, reload=1 cmp scno, #-1 @ skip the syscall? bne 2b add sp, sp, #S_OFF @ restore stack @@ -362,6 +354,10 @@ sys_syscall: bic scno, r0, #__NR_OABI_SYSCALL_BASE cmp scno, #__NR_syscall - __NR_SYSCALL_BASE cmpne scno, #NR_syscalls @ check range +#ifdef CONFIG_CPU_SPECTRE + movhs scno, #0 + csdb +#endif stmloia sp, {r5, r6} @ shuffle args movlo r0, r1 movlo r1, r2 --- a/arch/arm/kernel/entry-header.S +++ b/arch/arm/kernel/entry-header.S @@ -378,6 +378,31 @@ #endif .endm
+ .macro invoke_syscall, table, nr, tmp, ret, reload=0 +#ifdef CONFIG_CPU_SPECTRE + mov \tmp, \nr + cmp \tmp, #NR_syscalls @ check upper syscall limit + movcs \tmp, #0 + csdb + badr lr, \ret @ return address + .if \reload + add r1, sp, #S_R0 + S_OFF @ pointer to regs + ldmccia r1, {r0 - r6} @ reload r0-r6 + stmccia sp, {r4, r5} @ update stack arguments + .endif + ldrcc pc, [\table, \tmp, lsl #2] @ call sys_* routine +#else + cmp \nr, #NR_syscalls @ check upper syscall limit + badr lr, \ret @ return address + .if \reload + add r1, sp, #S_R0 + S_OFF @ pointer to regs + ldmccia r1, {r0 - r6} @ reload r0-r6 + stmccia sp, {r4, r5} @ update stack arguments + .endif + ldrcc pc, [\table, \nr, lsl #2] @ call sys_* routine +#endif + .endm + /* * These are the registers used in the syscall handler, and allow us to * have in theory up to 7 arguments to a function - r0 to r6.
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Russell King rmk+kernel@armlinux.org.uk
Commit c32cd419d6650e42b9cdebb83c672ec945e6bd7e upstream.
__get_user_error() is used as a fast accessor to make copying structure members in the signal handling path as efficient as possible. However, with software PAN and the recent Spectre variant 1, the efficiency is reduced as these are no longer fast accessors.
In the case of software PAN, it has to switch the domain register around each access, and with Spectre variant 1, it would have to repeat the access_ok() check for each access.
It becomes much more efficient to use __copy_from_user() instead, so let's use this for the ARM integer registers.
Acked-by: Mark Rutland mark.rutland@arm.com Signed-off-by: Russell King rmk+kernel@armlinux.org.uk Signed-off-by: David A. Long dave.long@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm/kernel/signal.c | 38 +++++++++++++++++++++----------------- 1 file changed, 21 insertions(+), 17 deletions(-)
--- a/arch/arm/kernel/signal.c +++ b/arch/arm/kernel/signal.c @@ -184,6 +184,7 @@ struct rt_sigframe {
static int restore_sigframe(struct pt_regs *regs, struct sigframe __user *sf) { + struct sigcontext context; char __user *aux; sigset_t set; int err; @@ -192,23 +193,26 @@ static int restore_sigframe(struct pt_re if (err == 0) set_current_blocked(&set);
- __get_user_error(regs->ARM_r0, &sf->uc.uc_mcontext.arm_r0, err); - __get_user_error(regs->ARM_r1, &sf->uc.uc_mcontext.arm_r1, err); - __get_user_error(regs->ARM_r2, &sf->uc.uc_mcontext.arm_r2, err); - __get_user_error(regs->ARM_r3, &sf->uc.uc_mcontext.arm_r3, err); - __get_user_error(regs->ARM_r4, &sf->uc.uc_mcontext.arm_r4, err); - __get_user_error(regs->ARM_r5, &sf->uc.uc_mcontext.arm_r5, err); - __get_user_error(regs->ARM_r6, &sf->uc.uc_mcontext.arm_r6, err); - __get_user_error(regs->ARM_r7, &sf->uc.uc_mcontext.arm_r7, err); - __get_user_error(regs->ARM_r8, &sf->uc.uc_mcontext.arm_r8, err); - __get_user_error(regs->ARM_r9, &sf->uc.uc_mcontext.arm_r9, err); - __get_user_error(regs->ARM_r10, &sf->uc.uc_mcontext.arm_r10, err); - __get_user_error(regs->ARM_fp, &sf->uc.uc_mcontext.arm_fp, err); - __get_user_error(regs->ARM_ip, &sf->uc.uc_mcontext.arm_ip, err); - __get_user_error(regs->ARM_sp, &sf->uc.uc_mcontext.arm_sp, err); - __get_user_error(regs->ARM_lr, &sf->uc.uc_mcontext.arm_lr, err); - __get_user_error(regs->ARM_pc, &sf->uc.uc_mcontext.arm_pc, err); - __get_user_error(regs->ARM_cpsr, &sf->uc.uc_mcontext.arm_cpsr, err); + err |= __copy_from_user(&context, &sf->uc.uc_mcontext, sizeof(context)); + if (err == 0) { + regs->ARM_r0 = context.arm_r0; + regs->ARM_r1 = context.arm_r1; + regs->ARM_r2 = context.arm_r2; + regs->ARM_r3 = context.arm_r3; + regs->ARM_r4 = context.arm_r4; + regs->ARM_r5 = context.arm_r5; + regs->ARM_r6 = context.arm_r6; + regs->ARM_r7 = context.arm_r7; + regs->ARM_r8 = context.arm_r8; + regs->ARM_r9 = context.arm_r9; + regs->ARM_r10 = context.arm_r10; + regs->ARM_fp = context.arm_fp; + regs->ARM_ip = context.arm_ip; + regs->ARM_sp = context.arm_sp; + regs->ARM_lr = context.arm_lr; + regs->ARM_pc = context.arm_pc; + regs->ARM_cpsr = context.arm_cpsr; + }
err |= !valid_user_regs(regs);
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Russell King rmk+kernel@armlinux.org.uk
Commit 42019fc50dfadb219f9e6ddf4c354f3837057d80 upstream.
__get_user_error() is used as a fast accessor to make copying structure members in the signal handling path as efficient as possible. However, with software PAN and the recent Spectre variant 1, the efficiency is reduced as these are no longer fast accessors.
In the case of software PAN, it has to switch the domain register around each access, and with Spectre variant 1, it would have to repeat the access_ok() check for each access.
Use __copy_from_user() rather than __get_user_err() for individual members when restoring VFP state.
Acked-by: Mark Rutland mark.rutland@arm.com Signed-off-by: Russell King rmk+kernel@armlinux.org.uk Signed-off-by: David A. Long dave.long@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm/include/asm/thread_info.h | 4 ++-- arch/arm/kernel/signal.c | 20 ++++++++------------ arch/arm/vfp/vfpmodule.c | 17 +++++++---------- 3 files changed, 17 insertions(+), 24 deletions(-)
--- a/arch/arm/include/asm/thread_info.h +++ b/arch/arm/include/asm/thread_info.h @@ -126,8 +126,8 @@ struct user_vfp_exc;
extern int vfp_preserve_user_clear_hwstate(struct user_vfp __user *, struct user_vfp_exc __user *); -extern int vfp_restore_user_hwstate(struct user_vfp __user *, - struct user_vfp_exc __user *); +extern int vfp_restore_user_hwstate(struct user_vfp *, + struct user_vfp_exc *); #endif
/* --- a/arch/arm/kernel/signal.c +++ b/arch/arm/kernel/signal.c @@ -149,22 +149,18 @@ static int preserve_vfp_context(struct v
static int restore_vfp_context(char __user **auxp) { - struct vfp_sigframe __user *frame = - (struct vfp_sigframe __user *)*auxp; - unsigned long magic; - unsigned long size; - int err = 0; - - __get_user_error(magic, &frame->magic, err); - __get_user_error(size, &frame->size, err); + struct vfp_sigframe frame; + int err;
+ err = __copy_from_user(&frame, *auxp, sizeof(frame)); if (err) - return -EFAULT; - if (magic != VFP_MAGIC || size != VFP_STORAGE_SIZE) + return err; + + if (frame.magic != VFP_MAGIC || frame.size != VFP_STORAGE_SIZE) return -EINVAL;
- *auxp += size; - return vfp_restore_user_hwstate(&frame->ufp, &frame->ufp_exc); + *auxp += sizeof(frame); + return vfp_restore_user_hwstate(&frame.ufp, &frame.ufp_exc); }
#endif --- a/arch/arm/vfp/vfpmodule.c +++ b/arch/arm/vfp/vfpmodule.c @@ -597,13 +597,11 @@ int vfp_preserve_user_clear_hwstate(stru }
/* Sanitise and restore the current VFP state from the provided structures. */ -int vfp_restore_user_hwstate(struct user_vfp __user *ufp, - struct user_vfp_exc __user *ufp_exc) +int vfp_restore_user_hwstate(struct user_vfp *ufp, struct user_vfp_exc *ufp_exc) { struct thread_info *thread = current_thread_info(); struct vfp_hard_struct *hwstate = &thread->vfpstate.hard; unsigned long fpexc; - int err = 0;
/* Disable VFP to avoid corrupting the new thread state. */ vfp_flush_hwstate(thread); @@ -612,17 +610,16 @@ int vfp_restore_user_hwstate(struct user * Copy the floating point registers. There can be unused * registers see asm/hwcap.h for details. */ - err |= __copy_from_user(&hwstate->fpregs, &ufp->fpregs, - sizeof(hwstate->fpregs)); + memcpy(&hwstate->fpregs, &ufp->fpregs, sizeof(hwstate->fpregs)); /* * Copy the status and control register. */ - __get_user_error(hwstate->fpscr, &ufp->fpscr, err); + hwstate->fpscr = ufp->fpscr;
/* * Sanitise and restore the exception registers. */ - __get_user_error(fpexc, &ufp_exc->fpexc, err); + fpexc = ufp_exc->fpexc;
/* Ensure the VFP is enabled. */ fpexc |= FPEXC_EN; @@ -631,10 +628,10 @@ int vfp_restore_user_hwstate(struct user fpexc &= ~(FPEXC_EX | FPEXC_FP2V); hwstate->fpexc = fpexc;
- __get_user_error(hwstate->fpinst, &ufp_exc->fpinst, err); - __get_user_error(hwstate->fpinst2, &ufp_exc->fpinst2, err); + hwstate->fpinst = ufp_exc->fpinst; + hwstate->fpinst2 = ufp_exc->fpinst2;
- return err ? -EFAULT : 0; + return 0; }
/*
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Russell King rmk+kernel@armlinux.org.uk
Commit 8c8484a1c18e3231648f5ba7cc5ffb7fd70b3ca4 upstream.
__get_user_error() is used as a fast accessor to make copying structure members as efficient as possible. However, with software PAN and the recent Spectre variant 1, the efficiency is reduced as these are no longer fast accessors.
In the case of software PAN, it has to switch the domain register around each access, and with Spectre variant 1, it would have to repeat the access_ok() check for each access.
Rather than using __get_user_error() to copy each semops element member, copy each semops element in full using __copy_from_user().
Acked-by: Mark Rutland mark.rutland@arm.com Signed-off-by: Russell King rmk+kernel@armlinux.org.uk Signed-off-by: David A. Long dave.long@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm/kernel/sys_oabi-compat.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-)
--- a/arch/arm/kernel/sys_oabi-compat.c +++ b/arch/arm/kernel/sys_oabi-compat.c @@ -329,9 +329,11 @@ asmlinkage long sys_oabi_semtimedop(int return -ENOMEM; err = 0; for (i = 0; i < nsops; i++) { - __get_user_error(sops[i].sem_num, &tsops->sem_num, err); - __get_user_error(sops[i].sem_op, &tsops->sem_op, err); - __get_user_error(sops[i].sem_flg, &tsops->sem_flg, err); + struct oabi_sembuf osb; + err |= __copy_from_user(&osb, tsops, sizeof(osb)); + sops[i].sem_num = osb.sem_num; + sops[i].sem_op = osb.sem_op; + sops[i].sem_flg = osb.sem_flg; tsops++; } if (timeout) {
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Russell King rmk+kernel@armlinux.org.uk
Commit d09fbb327d670737ab40fd8bbb0765ae06b8b739 upstream.
Borrow the x86 implementation of __inttype() to use in get_user() to select an integer type suitable to temporarily hold the result value. This is necessary to avoid propagating the volatile nature of the result argument, which can cause the following warning:
lib/iov_iter.c:413:5: warning: optimization may eliminate reads and/or writes to register variables [-Wvolatile-register-var]
Acked-by: Mark Rutland mark.rutland@arm.com Signed-off-by: Russell King rmk+kernel@armlinux.org.uk Signed-off-by: David A. Long dave.long@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm/include/asm/uaccess.h | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-)
--- a/arch/arm/include/asm/uaccess.h +++ b/arch/arm/include/asm/uaccess.h @@ -85,6 +85,13 @@ static inline void set_fs(mm_segment_t f flag; })
/* + * This is a type: either unsigned long, if the argument fits into + * that type, or otherwise unsigned long long. + */ +#define __inttype(x) \ + __typeof__(__builtin_choose_expr(sizeof(x) > sizeof(0UL), 0ULL, 0UL)) + +/* * Single-value transfer routines. They automatically use the right * size if we just have the right pointer type. Note that the functions * which read from user space (*get_*) need to take care not to leak @@ -153,7 +160,7 @@ extern int __get_user_64t_4(void *); ({ \ unsigned long __limit = current_thread_info()->addr_limit - 1; \ register const typeof(*(p)) __user *__p asm("r0") = (p);\ - register typeof(x) __r2 asm("r2"); \ + register __inttype(x) __r2 asm("r2"); \ register unsigned long __l asm("r1") = __limit; \ register int __e asm("r0"); \ unsigned int __ua_flags = uaccess_save_and_enable(); \
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Russell King rmk+kernel@armlinux.org.uk
Commit b1cd0a14806321721aae45f5446ed83a3647c914 upstream.
Fixing __get_user() for spectre variant 1 is not sane: we would have to add address space bounds checking in order to validate that the location should be accessed, and then zero the address if found to be invalid.
Since __get_user() is supposed to avoid the bounds check, and this is exactly what get_user() does, there's no point having two different implementations that are doing the same thing. So, when the Spectre workarounds are required, make __get_user() an alias of get_user().
Acked-by: Mark Rutland mark.rutland@arm.com Signed-off-by: Russell King rmk+kernel@armlinux.org.uk Signed-off-by: David A. Long dave.long@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm/include/asm/uaccess.h | 17 +++++++++++------ 1 file changed, 11 insertions(+), 6 deletions(-)
--- a/arch/arm/include/asm/uaccess.h +++ b/arch/arm/include/asm/uaccess.h @@ -250,6 +250,16 @@ static inline void set_fs(mm_segment_t f #define user_addr_max() \ (uaccess_kernel() ? ~0UL : get_fs())
+#ifdef CONFIG_CPU_SPECTRE +/* + * When mitigating Spectre variant 1, it is not worth fixing the non- + * verifying accessors, because we need to add verification of the + * address space there. Force these to use the standard get_user() + * version instead. + */ +#define __get_user(x, ptr) get_user(x, ptr) +#else + /* * The "__xxx" versions of the user access functions do not verify the * address space - it must have been done previously with a separate @@ -266,12 +276,6 @@ static inline void set_fs(mm_segment_t f __gu_err; \ })
-#define __get_user_error(x, ptr, err) \ -({ \ - __get_user_err((x), (ptr), err); \ - (void) 0; \ -}) - #define __get_user_err(x, ptr, err) \ do { \ unsigned long __gu_addr = (unsigned long)(ptr); \ @@ -331,6 +335,7 @@ do { \
#define __get_user_asm_word(x, addr, err) \ __get_user_asm(x, addr, err, ldr) +#endif
#define __put_user_switch(x, ptr, __err, __fn) \
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Russell King rmk+kernel@armlinux.org.uk
Commit a3c0f84765bb429ba0fd23de1c57b5e1591c9389 upstream.
Spectre variant 1 attacks are about this sequence of pseudo-code:
index = load(user-manipulated pointer); access(base + index * stride);
In order for the cache side-channel to work, the access() must me made to memory which userspace can detect whether cache lines have been loaded. On 32-bit ARM, this must be either user accessible memory, or a kernel mapping of that same user accessible memory.
The problem occurs when the load() speculatively loads privileged data, and the subsequent access() is made to user accessible memory.
Any load() which makes use of a user-maniplated pointer is a potential problem if the data it has loaded is used in a subsequent access. This also applies for the access() if the data loaded by that access is used by a subsequent access.
Harden the get_user() accessors against Spectre attacks by forcing out of bounds addresses to a NULL pointer. This prevents get_user() being used as the load() step above. As a side effect, put_user() will also be affected even though it isn't implicated.
Also harden copy_from_user() by redoing the bounds check within the arm_copy_from_user() code, and NULLing the pointer if out of bounds.
Acked-by: Mark Rutland mark.rutland@arm.com Signed-off-by: Russell King rmk+kernel@armlinux.org.uk Signed-off-by: David A. Long dave.long@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm/include/asm/assembler.h | 4 ++++ arch/arm/lib/copy_from_user.S | 9 +++++++++ 2 files changed, 13 insertions(+)
--- a/arch/arm/include/asm/assembler.h +++ b/arch/arm/include/asm/assembler.h @@ -460,6 +460,10 @@ THUMB( orr \reg , \reg , #PSR_T_BIT ) adds \tmp, \addr, #\size - 1 sbcccs \tmp, \tmp, \limit bcs \bad +#ifdef CONFIG_CPU_SPECTRE + movcs \addr, #0 + csdb +#endif #endif .endm
--- a/arch/arm/lib/copy_from_user.S +++ b/arch/arm/lib/copy_from_user.S @@ -90,6 +90,15 @@ .text
ENTRY(arm_copy_from_user) +#ifdef CONFIG_CPU_SPECTRE + get_thread_info r3 + ldr r3, [r3, #TI_ADDR_LIMIT] + adds ip, r1, r2 @ ip=addr+size + sub r3, r3, #1 @ addr_limit - 1 + cmpcc ip, r3 @ if (addr+size > addr_limit - 1) + movcs r1, #0 @ addr = NULL + csdb +#endif
#include "copy_template.S"
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jiri Olsa jolsa@kernel.org
commit 77f18153c080855e1c3fb520ca31a4e61530121d upstream.
With gcc 8 we get new set of snprintf() warnings that breaks the compilation, one example:
tests/mem.c: In function ‘check’: tests/mem.c:19:48: error: ‘%s’ directive output may be truncated writing \ up to 99 bytes into a region of size 89 [-Werror=format-truncation=] snprintf(failure, sizeof failure, "unexpected %s", out);
The gcc docs says:
To avoid the warning either use a bigger buffer or handle the function's return value which indicates whether or not its output has been truncated.
Given that all these warnings are harmless, because the code either properly fails due to uncomplete file path or we don't care for truncated output at all, I'm changing all those snprintf() calls to scnprintf(), which actually 'checks' for the snprint return value so the gcc stays silent.
Signed-off-by: Jiri Olsa jolsa@kernel.org Cc: Alexander Shishkin alexander.shishkin@linux.intel.com Cc: David Ahern dsahern@gmail.com Cc: Josh Poimboeuf jpoimboe@redhat.com Cc: Namhyung Kim namhyung@kernel.org Cc: Peter Zijlstra peterz@infradead.org Cc: Sergey Senozhatsky sergey.senozhatsky.work@gmail.com Link: http://lkml.kernel.org/r/20180319082902.4518-1-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Cc: Ignat Korchagin ignat@cloudflare.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- tools/perf/builtin-script.c | 22 +++++++++++----------- tools/perf/tests/attr.c | 4 ++-- tools/perf/tests/mem.c | 2 +- tools/perf/tests/pmu.c | 2 +- tools/perf/util/cgroup.c | 2 +- tools/perf/util/parse-events.c | 4 ++-- tools/perf/util/pmu.c | 2 +- 7 files changed, 19 insertions(+), 19 deletions(-)
--- a/tools/perf/builtin-script.c +++ b/tools/perf/builtin-script.c @@ -2304,8 +2304,8 @@ static int list_available_scripts(const }
for_each_lang(scripts_path, scripts_dir, lang_dirent) { - snprintf(lang_path, MAXPATHLEN, "%s/%s/bin", scripts_path, - lang_dirent->d_name); + scnprintf(lang_path, MAXPATHLEN, "%s/%s/bin", scripts_path, + lang_dirent->d_name); lang_dir = opendir(lang_path); if (!lang_dir) continue; @@ -2314,8 +2314,8 @@ static int list_available_scripts(const script_root = get_script_root(script_dirent, REPORT_SUFFIX); if (script_root) { desc = script_desc__findnew(script_root); - snprintf(script_path, MAXPATHLEN, "%s/%s", - lang_path, script_dirent->d_name); + scnprintf(script_path, MAXPATHLEN, "%s/%s", + lang_path, script_dirent->d_name); read_script_info(desc, script_path); free(script_root); } @@ -2351,7 +2351,7 @@ static int check_ev_match(char *dir_name int match, len; FILE *fp;
- sprintf(filename, "%s/bin/%s-record", dir_name, scriptname); + scnprintf(filename, MAXPATHLEN, "%s/bin/%s-record", dir_name, scriptname);
fp = fopen(filename, "r"); if (!fp) @@ -2427,8 +2427,8 @@ int find_scripts(char **scripts_array, c }
for_each_lang(scripts_path, scripts_dir, lang_dirent) { - snprintf(lang_path, MAXPATHLEN, "%s/%s", scripts_path, - lang_dirent->d_name); + scnprintf(lang_path, MAXPATHLEN, "%s/%s", scripts_path, + lang_dirent->d_name); #ifdef NO_LIBPERL if (strstr(lang_path, "perl")) continue; @@ -2483,8 +2483,8 @@ static char *get_script_path(const char return NULL;
for_each_lang(scripts_path, scripts_dir, lang_dirent) { - snprintf(lang_path, MAXPATHLEN, "%s/%s/bin", scripts_path, - lang_dirent->d_name); + scnprintf(lang_path, MAXPATHLEN, "%s/%s/bin", scripts_path, + lang_dirent->d_name); lang_dir = opendir(lang_path); if (!lang_dir) continue; @@ -2495,8 +2495,8 @@ static char *get_script_path(const char free(__script_root); closedir(lang_dir); closedir(scripts_dir); - snprintf(script_path, MAXPATHLEN, "%s/%s", - lang_path, script_dirent->d_name); + scnprintf(script_path, MAXPATHLEN, "%s/%s", + lang_path, script_dirent->d_name); return strdup(script_path); } free(__script_root); --- a/tools/perf/tests/attr.c +++ b/tools/perf/tests/attr.c @@ -164,8 +164,8 @@ static int run_dir(const char *d, const if (verbose > 0) vcnt++;
- snprintf(cmd, 3*PATH_MAX, PYTHON " %s/attr.py -d %s/attr/ -p %s %.*s", - d, d, perf, vcnt, v); + scnprintf(cmd, 3*PATH_MAX, PYTHON " %s/attr.py -d %s/attr/ -p %s %.*s", + d, d, perf, vcnt, v);
return system(cmd) ? TEST_FAIL : TEST_OK; } --- a/tools/perf/tests/mem.c +++ b/tools/perf/tests/mem.c @@ -16,7 +16,7 @@ static int check(union perf_mem_data_src
n = perf_mem__snp_scnprintf(out, sizeof out, &mi); n += perf_mem__lvl_scnprintf(out + n, sizeof out - n, &mi); - snprintf(failure, sizeof failure, "unexpected %s", out); + scnprintf(failure, sizeof failure, "unexpected %s", out); TEST_ASSERT_VAL(failure, !strcmp(string, out)); return 0; } --- a/tools/perf/tests/pmu.c +++ b/tools/perf/tests/pmu.c @@ -98,7 +98,7 @@ static char *test_format_dir_get(void) struct test_format *format = &test_formats[i]; FILE *file;
- snprintf(name, PATH_MAX, "%s/%s", dir, format->name); + scnprintf(name, PATH_MAX, "%s/%s", dir, format->name);
file = fopen(name, "w"); if (!file) --- a/tools/perf/util/cgroup.c +++ b/tools/perf/util/cgroup.c @@ -78,7 +78,7 @@ static int open_cgroup(char *name) if (cgroupfs_find_mountpoint(mnt, PATH_MAX + 1)) return -1;
- snprintf(path, PATH_MAX, "%s/%s", mnt, name); + scnprintf(path, PATH_MAX, "%s/%s", mnt, name);
fd = open(path, O_RDONLY); if (fd == -1) --- a/tools/perf/util/parse-events.c +++ b/tools/perf/util/parse-events.c @@ -202,8 +202,8 @@ struct tracepoint_path *tracepoint_id_to
for_each_event(sys_dirent, evt_dir, evt_dirent) {
- snprintf(evt_path, MAXPATHLEN, "%s/%s/id", dir_path, - evt_dirent->d_name); + scnprintf(evt_path, MAXPATHLEN, "%s/%s/id", dir_path, + evt_dirent->d_name); fd = open(evt_path, O_RDONLY); if (fd < 0) continue; --- a/tools/perf/util/pmu.c +++ b/tools/perf/util/pmu.c @@ -349,7 +349,7 @@ static int pmu_aliases_parse(char *dir, if (pmu_alias_info_file(name)) continue;
- snprintf(path, PATH_MAX, "%s/%s", dir, name); + scnprintf(path, PATH_MAX, "%s/%s", dir, name);
file = fopen(path, "r"); if (!file) {
On Tue, Oct 16, 2018 at 07:04:28PM +0200, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 4.14.77 release. There are 109 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Thu Oct 18 17:04:58 UTC 2018. Anything received after that time might be too late.
Results from Linaro’s test farm. No regressions on arm64, arm, x86_64, and i386.
Summary ------------------------------------------------------------------------
kernel: 4.14.77-rc1 git repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git git branch: linux-4.14.y git commit: 3dbba66c8671a97270f35e072c54f74ddca6954e git describe: v4.14.76-110-g3dbba66c8671 Test details: https://qa-reports.linaro.org/lkft/linux-stable-rc-4.14-oe/build/v4.14.76-11...
No regressions (compared to build v4.14.76)
No fixes (compared to build v4.14.76)
Ran 21021 total tests in the following environments and test suites.
Environments -------------- - dragonboard-410c - arm64 - hi6220-hikey - arm64 - i386 - juno-r2 - arm64 - qemu_arm - qemu_arm64 - qemu_i386 - qemu_x86_64 - x15 - arm - x86_64
Test Suites ----------- * boot * libhugetlbfs * ltp-containers-tests * ltp-cve-tests * ltp-fcntl-locktests-tests * ltp-filecaps-tests * ltp-fs-tests * ltp-fs_bind-tests * ltp-fs_perms_simple-tests * ltp-fsx-tests * ltp-hugetlb-tests * ltp-io-tests * ltp-ipc-tests * ltp-math-tests * ltp-nptl-tests * ltp-pty-tests * ltp-sched-tests * ltp-securebits-tests * ltp-syscalls-tests * ltp-timers-tests * kselftest * ltp-cap_bounds-tests * ltp-open-posix-tests * kselftest-vsyscall-mode-native * kselftest-vsyscall-mode-none
On Tue, Oct 16, 2018 at 10:56:42PM -0500, Dan Rue wrote:
On Tue, Oct 16, 2018 at 07:04:28PM +0200, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 4.14.77 release. There are 109 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Thu Oct 18 17:04:58 UTC 2018. Anything received after that time might be too late.
Results from Linaro’s test farm. No regressions on arm64, arm, x86_64, and i386.
Wonderful, thanks for testing,
greg k-h
On 10/16/2018 11:04 AM, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 4.14.77 release. There are 109 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Thu Oct 18 17:04:58 UTC 2018. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.14.77-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.14.y and the diffstat can be found below.
thanks,
greg k-h
Compiled and booted on my test system. No dmesg regressions.
thanks, -- Shuah
On Tue, Oct 16, 2018 at 07:04:28PM +0200, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 4.14.77 release. There are 109 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Thu Oct 18 17:04:58 UTC 2018. Anything received after that time might be too late.
For v4.14.76-110-g3dbba66c8671:
Build results: total: 150 pass: 150 fail: 0 Qemu test results: total: 318 pass: 318 fail: 0
Details are available at https://kerneltests.org/builders/.
Guenter
On 16/10/2018 18:04, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 4.14.77 release. There are 109 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Thu Oct 18 17:04:58 UTC 2018. Anything received after that time might be too late.
All tests are passing for Tegra ...
Test results for stable-v4.14: 8 builds: 8 pass, 0 fail 16 boots: 16 pass, 0 fail 14 tests: 14 pass, 0 fail
Linux version: 4.14.77-rc1-g3dbba66 Boards tested: tegra124-jetson-tk1, tegra20-ventana, tegra210-p2371-2180, tegra30-cardhu-a04
Cheers Jon
On Thu, Oct 18, 2018 at 07:43:02AM +0100, Jon Hunter wrote:
On 16/10/2018 18:04, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 4.14.77 release. There are 109 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Thu Oct 18 17:04:58 UTC 2018. Anything received after that time might be too late.
All tests are passing for Tegra ...
Test results for stable-v4.14: 8 builds: 8 pass, 0 fail 16 boots: 16 pass, 0 fail 14 tests: 14 pass, 0 fail
Linux version: 4.14.77-rc1-g3dbba66 Boards tested: tegra124-jetson-tk1, tegra20-ventana, tegra210-p2371-2180, tegra30-cardhu-a04
Great, thanks for testing and letting me know.
greg k-h
linux-stable-mirror@lists.linaro.org