This is the start of the stable review cycle for the 5.14.10 release. There are 172 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 06 Oct 2021 12:50:17 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.14.10-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.14.y and the diffstat can be found below.
thanks,
greg k-h
------------- Pseudo-Shortlog of commits:
Greg Kroah-Hartman gregkh@linuxfoundation.org Linux 5.14.10-rc1
Daniele Palmas dnlplm@gmail.com drivers: net: mhi: fix error path in mhi_net_newlink
Pablo Neira Ayuso pablo@netfilter.org netfilter: nf_tables: Fix oversized kvmalloc() calls
Eric Dumazet edumazet@google.com netfilter: conntrack: serialize hash resizes and cleanups
Haimin Zhang tcs_kernel@tencent.com KVM: x86: Handle SRCU initialization failure during page track init
Yanfei Xu yanfei.xu@windriver.com net: mdiobus: Fix memory leak in __mdiobus_register
Shreyansh Chouhan chouhan.shreyansh630@gmail.com crypto: aesni - xts_crypt() return if walk.nbytes is 0
Anirudh Rayabharam mail@anirudhrb.com HID: usbhid: free raw_report buffers in usbhid_stop
Linus Torvalds torvalds@linux-foundation.org mm: don't allow oversized kvmalloc() calls
Jozsef Kadlecsik kadlec@netfilter.org netfilter: ipset: Fix oversized kvmalloc() calls
F.A.Sulaiman asha.16@itfac.mrt.ac.lk HID: betop: fix slab-out-of-bounds Write in betop_probe
Dongliang Mu mudongliangabcd@gmail.com usb: hso: remove the bailout parameter
Randy Dunlap rdunlap@infradead.org NIOS2: setup.c: drop unused variable 'dram_start'
Eric Dumazet edumazet@google.com net: udp: annotate data race around udp_sk(sk)->corkflag
Andrej Shadura andrew.shadura@collabora.co.uk HID: u2fzero: ignore incomplete packets without data
yangerkun yangerkun@huawei.com ext4: flush s_error_work before journal destroy in ext4_fill_super
yangerkun yangerkun@huawei.com ext4: fix potential infinite loop in ext4_dx_readdir()
Theodore Ts'o tytso@mit.edu ext4: add error checking to ext4_ext_replay_set_iblocks()
Jeffle Xu jefflexu@linux.alibaba.com ext4: fix reserved space counter leakage
Hou Tao houtao1@huawei.com ext4: limit the number of blocks in one ADD_RANGE TLV
Ritesh Harjani riteshh@linux.ibm.com ext4: fix loff_t overflow in ext4_max_bitmap_size()
Johan Hovold johan@kernel.org ipack: ipoctal: fix module reference leak
Johan Hovold johan@kernel.org ipack: ipoctal: fix missing allocation-failure check
Johan Hovold johan@kernel.org ipack: ipoctal: fix tty-registration error handling
Johan Hovold johan@kernel.org ipack: ipoctal: fix tty registration race
Johan Hovold johan@kernel.org ipack: ipoctal: fix stack information leak
Nirmoy Das nirmoy.das@amd.com debugfs: debugfs_create_file_size(): use IS_ERR to check for error
Saravana Kannan saravanak@google.com driver core: fw_devlink: Improve handling of cyclic dependencies
Chen Jingwen chenjingwen6@huawei.com elf: don't use MAP_FIXED_NOREPLACE for elf interpreter mappings
Keith Busch kbusch@kernel.org nvme: add command id quirk for apple controllers
Linus Torvalds torvalds@linux-foundation.org kvm: fix objtool relocation warning
Vadim Pasternak vadimp@nvidia.com hwmon: (pmbus/mp2975) Add missed POUT attribute for page 1 mp2975 controller
Eddie James eajames@linux.ibm.com hwmon: (occ) Fix P10 VRM temp sensors
Mel Gorman mgorman@techsingularity.net sched/fair: Null terminate buffer when updating tunable_scaling
Michal Koutný mkoutny@suse.com sched/fair: Add ancestors of unthrottled undecayed cfs_rq
Kan Liang kan.liang@linux.intel.com perf/x86/intel: Update event constraints for ICX
Peter Zijlstra peterz@infradead.org objtool: Teach get_alt_entry() about more relocation types
Eric Dumazet edumazet@google.com af_unix: fix races in sk_peer_pid and sk_peer_cred accesses
Wong Vee Khee vee.khee.wong@linux.intel.com net: stmmac: fix EEE init issue when paired with EEE capable PHYs
Vlad Buslov vladbu@nvidia.com net: sched: flower: protect fl_walk() with rcu
Paolo Abeni pabeni@redhat.com net: introduce and use lock_sock_fast_nested()
Florian Fainelli f.fainelli@gmail.com net: phy: bcm7xxx: Fixed indirect MMD operations
Guangbin Huang huangguangbin2@huawei.com net: hns3: disable firmware compatible features when uninstall PF
Guangbin Huang huangguangbin2@huawei.com net: hns3: fix always enable rx vlan filter problem after selftest
Peng Li lipeng321@huawei.com net: hns3: reconstruct function hns3_self_test
Jian Shen shenjian15@huawei.com net: hns3: fix show wrong state when add existing uc mac address
Jian Shen shenjian15@huawei.com net: hns3: fix mixed flag HCLGE_FLAG_MQPRIO_ENABLE and HCLGE_FLAG_DCB_ENABLE
Jian Shen shenjian15@huawei.com net: hns3: don't rollback when destroy mqprio fail
Jian Shen shenjian15@huawei.com net: hns3: remove tc enable checking
Jian Shen shenjian15@huawei.com net: hns3: do not allow call hns3_nic_net_open repeatedly
Feng Zhou zhoufeng.zf@bytedance.com ixgbe: Fix NULL pointer dereference in ixgbe_xdp_setup
Rahul Lakkireddy rahul.lakkireddy@chelsio.com scsi: csiostor: Add module softdep on cxgb4
Jens Axboe axboe@kernel.dk Revert "block, bfq: honor already-setup queue merges"
Shannon Nelson snelson@pensando.io ionic: fix gathering of debug stats
Arnd Bergmann arnd@arndb.de net: ks8851: fix link error
Johan Almbladh johan.almbladh@anyfinetworks.com bpf, x86: Fix bpf mapping of atomic fetch implementation
Jiri Benc jbenc@redhat.com selftests, bpf: test_lwt_ip_encap: Really disable rp_filter
Jiri Benc jbenc@redhat.com selftests, bpf: Fix makefile dependencies on libbpf
Kumar Kartikeya Dwivedi memxor@gmail.com libbpf: Fix segfault in static linker for objects without BTF
Lorenz Bauer lmb@cloudflare.com bpf: Exempt CAP_BPF from checks against bpf_jit_limit
Wenpeng Liang liangwenpeng@huawei.com RDMA/hns: Add the check of the CQE size of the user space
Wenpeng Liang liangwenpeng@huawei.com RDMA/hns: Fix the size setting error when copying CQE in clean_cq()
Guo Zhi qtxuning1999@sjtu.edu.cn RDMA/hfi1: Fix kernel pointer leak
Jacob Keller jacob.e.keller@intel.com e100: fix buffer overrun in e100_get_regs
Jacob Keller jacob.e.keller@intel.com e100: fix length calculation in e100_get_regs_len
Andrew Lunn andrew@lunn.ch dsa: mv88e6xxx: Include tagger overhead when setting MTU for DSA and CPU ports
Andrew Lunn andrew@lunn.ch dsa: mv88e6xxx: Fix MTU definition
Andrew Lunn andrew@lunn.ch dsa: mv88e6xxx: 6161: Use chip wide MAX MTU
Tejas Upadhyay tejaskumarx.surendrakumar.upadhyay@intel.com drm/i915: Remove warning from the rps worker
Matthew Auld matthew.auld@intel.com drm/i915/request: fix early tracepoints
Aaro Koskinen aaro.koskinen@iki.fi smsc95xx: fix stalled rx after link change
Xiao Liang shaw.leon@gmail.com net: ipv4: Fix rtnexthop len when RTA_FLOW is present
Vladimir Oltean vladimir.oltean@nxp.com net: enetc: fix the incorrect clearing of IF_MODE bits
Paul Fertser fercerpav@gmail.com hwmon: (tmp421) fix rounding for negative values
Paul Fertser fercerpav@gmail.com hwmon: (tmp421) report /PVLD condition as fault
Jason Gunthorpe jgg@ziepe.ca RDMA/hns: Work around broken constant propagation in gcc 8
Davide Caratti dcaratti@redhat.com mptcp: allow changing the 'backup' bit when no sockets are open
Florian Westphal fw@strlen.de mptcp: don't return sockets in foreign netns
Xin Long lucien.xin@gmail.com sctp: break out if skb_header_pointer returns NULL in sctp_rcv_ootb
Saravana Kannan saravanak@google.com net: mdiobus: Set FWNODE_FLAG_NEEDS_CHILD_BOUND_ON_ADD for mdiobus parents
Saravana Kannan saravanak@google.com driver core: fw_devlink: Add support for FWNODE_FLAG_NEEDS_CHILD_BOUND_ON_ADD
Johannes Berg johannes.berg@intel.com mac80211-hwsim: fix late beacon hrtimer handling
Johannes Berg johannes.berg@intel.com mac80211: mesh: fix potentially unaligned access
Lorenzo Bianconi lorenzo@kernel.org mac80211: limit injected vht mcs/nss in ieee80211_parse_tx_radiotap
Chih-Kang Chang gary.chang@realtek.com mac80211: Fix ieee80211_amsdu_aggregate frag_tail bug
Felix Fietkau nbd@nbd.name Revert "mac80211: do not use low data rates for data frames with no ack flag"
Florian Westphal fw@strlen.de netfilter: log: work around missing softdep backend module
Florian Westphal fw@strlen.de netfilter: nf_tables: unlink table before deleting it
Sindhu Devale sindhu.devale@intel.com RDMA/irdma: Report correct WC error when there are MW bind errors
Sindhu Devale sindhu.devale@intel.com RDMA/irdma: Report correct WC error when transport retry counter is exceeded
Sindhu Devale sindhu.devale@intel.com RDMA/irdma: Validate number of CQ entries on create CQ
Sindhu Devale sindhu.devale@intel.com RDMA/irdma: Skip CQP ring during a reset
Vadim Pasternak vadimp@nvidia.com hwmon: (mlxreg-fan) Return non-zero value when fan current state is enforced from sysfs
Piotr Krysiuk piotras@gmail.com bpf, mips: Validate conditional branch offsets
Tao Liu thomas.liu@ucloud.cn RDMA/cma: Fix listener leak in rdma_cma_listen_on_all() failure
Christoph Lameter cl@gentwo.de IB/cma: Do not send IGMP leaves for sendonly Multicast groups
Hou Tao houtao1@huawei.com bpf: Handle return value of BPF_PROG_TYPE_STRUCT_OPS prog
Andrea Claudi aclaudi@redhat.com ipvs: check that ip_vs_conn_tab_bits is between 8 and 20
Zhi A Wang zhi.wang.linux2@gmail.com drm/i915/gvt: fix the usage of ww lock in gvt scheduler.
Shawn Guo shawn.guo@linaro.org interconnect: qcom: sdm660: Correct NOC_QOS_PRIORITY shift and mask
Shawn Guo shawn.guo@linaro.org interconnect: qcom: sdm660: Fix id of slv_cnoc_mnoc_cfg
Hawking Zhang Hawking.Zhang@amd.com drm/amdgpu: correct initial cp_hqd_quantum for gfx9
Simon Ser contact@emersion.fr drm/amdgpu: check tiling flags when creating FB on GFX8-
Prike Liang Prike.Liang@amd.com drm/amdgpu: force exit gfxoff on sdma resume for rmb s0ix
Praful Swarnakar Praful.Swarnakar@amd.com drm/amd/display: Fix Display Flicker on embedded panels
Charlene Liu Charlene.Liu@amd.com drm/amd/display: Pass PCI deviceid into DC
Josip Pavic Josip.Pavic@amd.com drm/amd/display: initialize backlight_ramping_override to false
Nick Desaulniers ndesaulniers@google.com nbd: use shifts rather than multiplies
Jason Gunthorpe jgg@ziepe.ca RDMA/cma: Ensure rdma_addr_cancel() happens before issuing more requests
Jason Gunthorpe jgg@ziepe.ca RDMA/cma: Do not change route.addr.src_addr.ss_family
Sean Young sean@mess.org media: ir_toy: prevent device from hanging during transmit
Wolfram Sang wsa+renesas@sang-engineering.com mmc: renesas_sdhi: fix regression with hard reset on old SDHIs
Zhenzhong Duan zhenzhong.duan@intel.com KVM: VMX: Fix a TSX_CTRL_CPUID_CLEAR field mask issue
Chenyi Qiang chenyi.qiang@intel.com KVM: nVMX: Fix nested bus lock VM exit
Mingwei Zhang mizhang@google.com KVM: SVM: fix missing sev_decommission in sev_receive_start
Peter Gonda pgonda@google.com KVM: SEV: Allow some commands for mirror VM
Peter Gonda pgonda@google.com KVM: SEV: Acquire vcpu mutex when updating VMSA
Sean Christopherson seanjc@google.com KVM: SEV: Pin guest memory for write for RECEIVE_UPDATE_DATA
Peter Gonda pgonda@google.com KVM: SEV: Update svm_vm_copy_asid_from for SEV-ES
Vitaly Kuznetsov vkuznets@redhat.com KVM: nVMX: Filter out all unsupported controls when eVMCS was activated
Sean Christopherson seanjc@google.com KVM: x86: Swap order of CPUID entry "index" vs. "significant flag" checks
Sean Christopherson seanjc@google.com KVM: x86: Clear KVM's cached guest CR3 at RESET/INIT
Maxim Levitsky mlevitsk@redhat.com KVM: x86: nSVM: don't copy virt_ext from vmcb12
Vitaly Kuznetsov vkuznets@redhat.com KVM: x86: Fix stack-out-of-bounds memory access from ioapic_write_indirect()
Zelin Deng zelin.deng@linux.alibaba.com ptp: Fix ptp_kvm_getcrosststamp issue for x86 ptp_kvm
Zelin Deng zelin.deng@linux.alibaba.com x86/kvmclock: Move this_cpu_pvti into kvmclock.h
José Expósito jose.exposito89@gmail.com platform/x86/intel: hid: Add DMI switches allow list
Johannes Berg johannes.berg@intel.com mac80211: fix use-after-free in CCMP/GCMP RX
Jonathan Hsu jonathan.hsu@mediatek.com scsi: ufs: Fix illegal offset in UPIU event trace
Andrey Gusakov andrey.gusakov@cogentembedded.com gpio: pca953x: do not ignore i2c errors
Nadezda Lutovinova lutovinova@ispras.ru hwmon: (w83791d) Fix NULL pointer dereference by removing unnecessary structure field
Nadezda Lutovinova lutovinova@ispras.ru hwmon: (w83792d) Fix NULL pointer dereference by removing unnecessary structure field
Nadezda Lutovinova lutovinova@ispras.ru hwmon: (w83793) Fix NULL pointer dereference by removing unnecessary structure field
Paul Fertser fercerpav@gmail.com hwmon: (tmp421) handle I2C errors
Eric Biggers ebiggers@google.com fs-verity: fix signed integer overflow with i_size near S64_MAX
Jia He justin.he@arm.com ACPI: NFIT: Use fallback node id when numa info in NFIT table is incorrect
Cameron Berkenpas cam@neo-zeon.de ALSA: hda/realtek: Quirks to enable speaker output for Lenovo Legion 7i 15IMHG05, Yoga 7i 14ITL5/15ITL5, and 13s Gen2 laptops.
Takashi Sakamoto o-takashi@sakamocchi.jp ALSA: firewire-motu: fix truncated bytes in message tracepoints
Jaroslav Kysela perex@perex.cz ALSA: rawmidi: introduce SNDRV_RAWMIDI_IOCTL_USER_PVERSION
Adrian Hunter adrian.hunter@intel.com scsi: ufs: ufs-pci: Fix Intel LKF link stability
James Morse james.morse@arm.com cpufreq: schedutil: Destroy mutex before kobject_put() frees the memory
Guchun Chen guchun.chen@amd.com drm/amdgpu: stop scheduler when calling hw_fini (v2)
Guchun Chen guchun.chen@amd.com drm/amdgpu: avoid over-handle of fence driver fini in s3 test (v2)
Likun Gao Likun.Gao@amd.com drm/amdgpu: adjust fence driver enable sequence
Saurav Kashyap skashyap@marvell.com scsi: qla2xxx: Changes to support kdump kernel for NVMe BFS
Kevin Hao haokexin@gmail.com cpufreq: schedutil: Use kobject release() method to free sugov_tunables
Igor Matheus Andrade Torrente igormtorrente@gmail.com tty: Fix out-of-bound vmalloc access in imageblit
Jackie Liu liuyun01@kylinos.cn watchdog/sb_watchdog: fix compilation problem due to COMPILE_TEST
Like Xu likexu@tencent.com perf iostat: Fix Segmentation fault from NULL 'struct perf_counts_values *'
Like Xu likexu@tencent.com perf iostat: Use system-wide mode if the target cpu_list is unspecified
Ian Rogers irogers@google.com perf test: Fix DWARF unwind for optimized builds.
Evgeny Novikov novikov@ispras.ru HID: amd_sfh: Fix potential NULL pointer dereference
Marco Elver elver@google.com kasan: fix Kconfig check of CC_HAS_WORKING_NOSANITIZE_ADDRESS
Randy Dunlap rdunlap@infradead.org NIOS2: fix kconfig unmet dependency warning for SERIAL_CORE_CONSOLE
Al Viro viro@zeniv.linux.org.uk m68k: Update ->thread.esp0 before calling syscall_trace() in ret_from_signal
Dan Carpenter dan.carpenter@oracle.com crypto: ccp - fix resource leaks in ccp_run_aes_gcm_cmd()
Alexandra Winter wintera@linux.ibm.com s390/qeth: fix deadlock during failing recovery
Alexandra Winter wintera@linux.ibm.com s390/qeth: Fix deadlock in remove_discipline
Lama Kayal lkayal@nvidia.com net/mlx4_en: Resolve bad operstate value
David Collins collinsd@codeaurora.org pinctrl: qcom: spmi-gpio: correct parent irqspec translation
Peter Ujfalusi peter.ujfalusi@linux.intel.com ASoC: SOF: imx: imx8m: Bar index is only valid for IRAM and SRAM types
Peter Ujfalusi peter.ujfalusi@linux.intel.com ASoC: SOF: imx: imx8: Bar index is only valid for IRAM and SRAM types
Yong Zhi yong.zhi@intel.com ASoC: SOF: Fix DSP oops stack dump output contents
James Smart jsmart2021@gmail.com scsi: elx: efct: Fix void-pointer-to-enum-cast warning for efc_nport_topology
Trevor Wu trevor.wu@mediatek.com ASoC: mediatek: common: handle NULL case in suspend/resume function
Shengjiu Wang shengjiu.wang@nxp.com ASoC: fsl_xcvr: register platform component before registering cpu dai
Shengjiu Wang shengjiu.wang@nxp.com ASoC: fsl_spdif: register platform component before registering cpu dai
Shengjiu Wang shengjiu.wang@nxp.com ASoC: fsl_micfil: register platform component before registering cpu dai
Shengjiu Wang shengjiu.wang@nxp.com ASoC: fsl_esai: register platform component before registering cpu dai
Shengjiu Wang shengjiu.wang@nxp.com ASoC: fsl_sai: register platform component before registering cpu dai
Randy Dunlap rdunlap@infradead.org media: s5p-jpeg: rename JPEG marker constants to prevent build warnings
Nicolas Dufresne nicolas.dufresne@collabora.com media: cedrus: Fix SUNXI tile size calculation
Jernej Skrabec jernej.skrabec@gmail.com media: hantro: Fix check for single irq
-------------
Diffstat:
Makefile | 4 +- arch/m68k/kernel/entry.S | 2 + arch/mips/net/bpf_jit.c | 57 ++++++--- arch/nios2/Kconfig.debug | 3 +- arch/nios2/kernel/setup.c | 2 - arch/s390/include/asm/ccwgroup.h | 2 +- arch/x86/crypto/aesni-intel_glue.c | 2 +- arch/x86/events/intel/core.c | 1 + arch/x86/include/asm/kvm_page_track.h | 2 +- arch/x86/include/asm/kvmclock.h | 14 +++ arch/x86/kernel/kvmclock.c | 13 +-- arch/x86/kvm/cpuid.c | 4 +- arch/x86/kvm/emulate.c | 1 - arch/x86/kvm/ioapic.c | 10 +- arch/x86/kvm/mmu/page_track.c | 4 +- arch/x86/kvm/svm/nested.c | 1 - arch/x86/kvm/svm/sev.c | 92 ++++++++++----- arch/x86/kvm/vmx/evmcs.c | 12 +- arch/x86/kvm/vmx/nested.c | 6 + arch/x86/kvm/vmx/vmx.c | 11 +- arch/x86/kvm/x86.c | 10 +- arch/x86/net/bpf_jit_comp.c | 66 ++++++++--- block/bfq-iosched.c | 16 +-- drivers/acpi/nfit/core.c | 12 ++ drivers/base/core.c | 36 +++++- drivers/block/nbd.c | 29 +++-- drivers/cpufreq/cpufreq_governor_attr_set.c | 2 +- drivers/crypto/ccp/ccp-ops.c | 14 ++- drivers/gpio/gpio-pca953x.c | 11 +- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 15 +-- drivers/gpu/drm/amd/amdgpu/amdgpu_display.c | 31 +++++ drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 62 +++------- drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h | 9 +- drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 2 +- drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c | 8 ++ drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 2 + drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c | 15 ++- drivers/gpu/drm/i915/gt/intel_rps.c | 2 - drivers/gpu/drm/i915/gvt/scheduler.c | 4 +- drivers/gpu/drm/i915/i915_request.c | 11 +- drivers/hid/amd-sfh-hid/amd_sfh_pcie.c | 6 +- drivers/hid/hid-betopff.c | 13 ++- drivers/hid/hid-u2fzero.c | 4 +- drivers/hid/usbhid/hid-core.c | 13 ++- drivers/hwmon/mlxreg-fan.c | 12 +- drivers/hwmon/occ/common.c | 17 +-- drivers/hwmon/pmbus/mp2975.c | 2 +- drivers/hwmon/tmp421.c | 71 +++++++----- drivers/hwmon/w83791d.c | 29 ++--- drivers/hwmon/w83792d.c | 28 ++--- drivers/hwmon/w83793.c | 26 ++--- drivers/infiniband/core/cma.c | 51 +++++++- drivers/infiniband/core/cma_priv.h | 1 + drivers/infiniband/hw/hfi1/ipoib_tx.c | 8 +- drivers/infiniband/hw/hns/hns_roce_cq.c | 31 +++-- drivers/infiniband/hw/hns/hns_roce_hw_v2.c | 13 ++- drivers/infiniband/hw/irdma/cm.c | 4 +- drivers/infiniband/hw/irdma/hw.c | 14 ++- drivers/infiniband/hw/irdma/i40iw_if.c | 2 +- drivers/infiniband/hw/irdma/main.h | 1 - drivers/infiniband/hw/irdma/user.h | 2 + drivers/infiniband/hw/irdma/utils.c | 2 +- drivers/infiniband/hw/irdma/verbs.c | 9 +- drivers/interconnect/qcom/sdm660.c | 11 +- drivers/ipack/devices/ipoctal.c | 63 +++++++--- drivers/media/platform/s5p-jpeg/jpeg-core.c | 18 +-- drivers/media/platform/s5p-jpeg/jpeg-core.h | 28 ++--- drivers/media/rc/ir_toy.c | 21 +++- drivers/mmc/host/renesas_sdhi_core.c | 2 + drivers/net/dsa/mv88e6xxx/chip.c | 17 +-- drivers/net/dsa/mv88e6xxx/chip.h | 1 + drivers/net/dsa/mv88e6xxx/global1.c | 2 + drivers/net/dsa/mv88e6xxx/port.c | 2 + drivers/net/ethernet/freescale/enetc/enetc_pf.c | 3 +- drivers/net/ethernet/hisilicon/hns3/hnae3.h | 1 - drivers/net/ethernet/hisilicon/hns3/hns3_enet.c | 16 ++- drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c | 105 +++++++++++------ .../net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.c | 21 ++-- .../net/ethernet/hisilicon/hns3/hns3pf/hclge_dcb.c | 29 ++--- .../ethernet/hisilicon/hns3/hns3pf/hclge_main.c | 19 ++- .../net/ethernet/hisilicon/hns3/hns3pf/hclge_tm.c | 33 +----- drivers/net/ethernet/intel/e100.c | 22 ++-- drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c | 2 +- drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 8 +- drivers/net/ethernet/mellanox/mlx4/en_netdev.c | 47 +++++--- drivers/net/ethernet/mellanox/mlx4/mlx4_en.h | 1 - drivers/net/ethernet/micrel/Makefile | 6 +- drivers/net/ethernet/micrel/ks8851_common.c | 8 ++ drivers/net/ethernet/pensando/ionic/ionic_stats.c | 9 -- drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 4 + drivers/net/mhi/net.c | 6 +- drivers/net/phy/bcm7xxx.c | 114 +++++++++++++++++- drivers/net/phy/mdio_bus.c | 5 + drivers/net/usb/hso.c | 6 +- drivers/net/usb/smsc95xx.c | 3 + drivers/net/wireless/mac80211_hwsim.c | 4 +- drivers/nvme/host/core.c | 4 +- drivers/nvme/host/nvme.h | 6 + drivers/nvme/host/pci.c | 3 +- drivers/pinctrl/qcom/pinctrl-spmi-gpio.c | 37 +++++- drivers/platform/x86/intel-hid.c | 27 ++++- drivers/ptp/ptp_kvm_x86.c | 9 +- drivers/s390/cio/ccwgroup.c | 10 +- drivers/s390/net/qeth_core.h | 1 - drivers/s390/net/qeth_core_main.c | 19 +-- drivers/s390/net/qeth_l2_main.c | 1 - drivers/s390/net/qeth_l3_main.c | 1 - drivers/scsi/csiostor/csio_init.c | 1 + drivers/scsi/elx/libefc/efc_device.c | 7 +- drivers/scsi/elx/libefc/efc_fabric.c | 3 +- drivers/scsi/qla2xxx/qla_def.h | 1 - drivers/scsi/qla2xxx/qla_isr.c | 2 + drivers/scsi/qla2xxx/qla_nvme.c | 40 +++---- drivers/scsi/ufs/ufshcd-pci.c | 78 +++++++++++++ drivers/scsi/ufs/ufshcd.c | 6 +- drivers/scsi/ufs/ufshcd.h | 1 + drivers/staging/media/hantro/hantro_drv.c | 2 +- drivers/staging/media/sunxi/cedrus/cedrus_video.c | 2 +- drivers/tty/vt/vt.c | 21 +++- drivers/watchdog/Kconfig | 2 +- fs/binfmt_elf.c | 2 +- fs/debugfs/inode.c | 2 +- fs/ext4/dir.c | 6 +- fs/ext4/extents.c | 19 ++- fs/ext4/fast_commit.c | 6 + fs/ext4/inode.c | 5 + fs/ext4/super.c | 21 +++- fs/verity/enable.c | 2 +- fs/verity/open.c | 2 +- include/linux/bpf.h | 2 + include/linux/fwnode.h | 11 +- include/net/ip_fib.h | 2 +- include/net/nexthop.h | 2 +- include/net/sock.h | 33 +++++- include/sound/rawmidi.h | 1 + include/uapi/sound/asound.h | 1 + kernel/bpf/bpf_struct_ops.c | 7 +- kernel/bpf/core.c | 2 +- kernel/sched/cpufreq_schedutil.c | 16 ++- kernel/sched/debug.c | 8 +- kernel/sched/fair.c | 6 +- lib/Kconfig.kasan | 2 + mm/util.c | 4 + net/core/sock.c | 49 ++++---- net/ipv4/fib_semantics.c | 16 +-- net/ipv4/udp.c | 10 +- net/ipv6/route.c | 5 +- net/ipv6/udp.c | 2 +- net/mac80211/mesh_ps.c | 3 +- net/mac80211/rate.c | 4 - net/mac80211/tx.c | 12 ++ net/mac80211/wpa.c | 6 + net/mptcp/mptcp_diag.c | 2 +- net/mptcp/pm_netlink.c | 4 +- net/mptcp/protocol.c | 2 +- net/mptcp/protocol.h | 2 +- net/mptcp/subflow.c | 2 +- net/mptcp/syncookies.c | 13 +-- net/mptcp/token.c | 11 +- net/mptcp/token_test.c | 14 ++- net/netfilter/ipset/ip_set_hash_gen.h | 4 +- net/netfilter/ipvs/ip_vs_conn.c | 4 + net/netfilter/nf_conntrack_core.c | 70 +++++------ net/netfilter/nf_tables_api.c | 30 +++-- net/netfilter/nft_compat.c | 17 ++- net/netfilter/xt_LOG.c | 10 +- net/netfilter/xt_NFLOG.c | 10 +- net/sched/cls_flower.c | 6 + net/sctp/input.c | 2 +- net/unix/af_unix.c | 34 +++++- sound/core/rawmidi.c | 9 ++ sound/firewire/motu/amdtp-motu.c | 7 +- sound/pci/hda/patch_realtek.c | 129 +++++++++++++++++++++ sound/soc/fsl/fsl_esai.c | 16 ++- sound/soc/fsl/fsl_micfil.c | 15 ++- sound/soc/fsl/fsl_sai.c | 14 ++- sound/soc/fsl/fsl_spdif.c | 14 ++- sound/soc/fsl/fsl_xcvr.c | 15 ++- sound/soc/mediatek/common/mtk-afe-fe-dai.c | 19 +-- sound/soc/sof/imx/imx8.c | 9 +- sound/soc/sof/imx/imx8m.c | 9 +- sound/soc/sof/xtensa/core.c | 4 +- tools/lib/bpf/linker.c | 8 +- tools/objtool/special.c | 32 +++-- tools/perf/arch/x86/util/iostat.c | 2 +- tools/perf/builtin-stat.c | 2 + tools/perf/tests/dwarf-unwind.c | 39 +++++-- tools/testing/selftests/bpf/Makefile | 3 +- tools/testing/selftests/bpf/test_lwt_ip_encap.sh | 13 ++- 189 files changed, 1870 insertions(+), 875 deletions(-)
From: Jernej Skrabec jernej.skrabec@gmail.com
[ Upstream commit 31692ab9a9ef0119959f66838de74eeb37490c8d ]
Some cores use only one interrupt and in such case interrupt name in DT is not needed. Driver supposedly accounted that, but due to the wrong field check it never worked. Fix that.
Fixes: 18d6c8b7b4c9 ("media: hantro: add fallback handling for single irq/clk") Signed-off-by: Jernej Skrabec jernej.skrabec@gmail.com Reviewed-by: Ezequiel Garcia ezequiel@collabora.com Reviewed-by: Emil Velikov emil.velikov@collabora.com Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Mauro Carvalho Chehab mchehab+huawei@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/staging/media/hantro/hantro_drv.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/staging/media/hantro/hantro_drv.c b/drivers/staging/media/hantro/hantro_drv.c index 31d8449ca1d2..fc769c52c6d3 100644 --- a/drivers/staging/media/hantro/hantro_drv.c +++ b/drivers/staging/media/hantro/hantro_drv.c @@ -918,7 +918,7 @@ static int hantro_probe(struct platform_device *pdev) if (!vpu->variant->irqs[i].handler) continue;
- if (vpu->variant->num_clocks > 1) { + if (vpu->variant->num_irqs > 1) { irq_name = vpu->variant->irqs[i].name; irq = platform_get_irq_byname(vpu->pdev, irq_name); } else {
From: Nicolas Dufresne nicolas.dufresne@collabora.com
[ Upstream commit 132c88614f2b3548cd3c8979a434609019db4151 ]
Tiled formats requires full rows being allocated (even for Chroma planes). When the number of Luma tiles is odd, we need to round up to twice the tile width in order to roundup the number of Chroma tiles.
This was notice with a crash running BA1_FT_C compliance test using sunxi tiles using GStreamer. Cedrus driver would allocate 9 rows for Luma, but only 4.5 rows for Chroma, causing userspace to crash.
Signed-off-by: Nicolas Dufresne nicolas.dufresne@collabora.com Fixes: 50e761516f2b8 ("media: platform: Add Cedrus VPU decoder driver") Reviewed-by: Jernej Skrabec jernej.skrabec@gmail.com Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Mauro Carvalho Chehab mchehab+huawei@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/staging/media/sunxi/cedrus/cedrus_video.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/staging/media/sunxi/cedrus/cedrus_video.c b/drivers/staging/media/sunxi/cedrus/cedrus_video.c index 32c13ecb22d8..a8168ac2fbd0 100644 --- a/drivers/staging/media/sunxi/cedrus/cedrus_video.c +++ b/drivers/staging/media/sunxi/cedrus/cedrus_video.c @@ -135,7 +135,7 @@ void cedrus_prepare_format(struct v4l2_pix_format *pix_fmt) sizeimage = bytesperline * height;
/* Chroma plane size. */ - sizeimage += bytesperline * height / 2; + sizeimage += bytesperline * ALIGN(height, 64) / 2;
break;
From: Randy Dunlap rdunlap@infradead.org
[ Upstream commit 3ad02c27d89d72b3b49ac51899144b7d0942f05f ]
The use of a macro named 'RST' conflicts with one of the same name in arch/mips/include/asm/mach-rc32434/rb.h. This causes build warnings on some MIPS builds.
Change the names of the JPEG marker constants to be in their own namespace to fix these build warnings and to prevent other similar problems in the future.
Fixes these build warnings:
In file included from ../drivers/media/platform/s5p-jpeg/jpeg-hw-exynos3250.c:14: ../drivers/media/platform/s5p-jpeg/jpeg-core.h:43: warning: "RST" redefined 43 | #define RST 0xd0 | ../arch/mips/include/asm/mach-rc32434/rb.h:13: note: this is the location of the previous definition 13 | #define RST (1 << 15)
In file included from ../drivers/media/platform/s5p-jpeg/jpeg-hw-s5p.c:13: ../drivers/media/platform/s5p-jpeg/jpeg-core.h:43: warning: "RST" redefined 43 | #define RST 0xd0 ../arch/mips/include/asm/mach-rc32434/rb.h:13: note: this is the location of the previous definition 13 | #define RST (1 << 15)
In file included from ../drivers/media/platform/s5p-jpeg/jpeg-hw-exynos4.c:12: ../drivers/media/platform/s5p-jpeg/jpeg-core.h:43: warning: "RST" redefined 43 | #define RST 0xd0 ../arch/mips/include/asm/mach-rc32434/rb.h:13: note: this is the location of the previous definition 13 | #define RST (1 << 15)
In file included from ../drivers/media/platform/s5p-jpeg/jpeg-core.c:31: ../drivers/media/platform/s5p-jpeg/jpeg-core.h:43: warning: "RST" redefined 43 | #define RST 0xd0 ../arch/mips/include/asm/mach-rc32434/rb.h:13: note: this is the location of the previous definition 13 | #define RST (1 << 15)
Also update the kernel-doc so that the word "marker" is not repeated.
Link: https://lore.kernel.org/linux-media/20210907044022.30602-1-rdunlap@infradead... Fixes: bb677f3ac434 ("[media] Exynos4 JPEG codec v4l2 driver") Signed-off-by: Randy Dunlap rdunlap@infradead.org Reported-by: kernel test robot lkp@intel.com Cc: Andrzej Pietrasiewicz andrzejtp2010@gmail.com Cc: Jacek Anaszewski jacek.anaszewski@gmail.com Cc: Sylwester Nawrocki s.nawrocki@samsung.com Cc: linux-arm-kernel@lists.infradead.org Signed-off-by: Mauro Carvalho Chehab mchehab+huawei@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/media/platform/s5p-jpeg/jpeg-core.c | 18 ++++++------- drivers/media/platform/s5p-jpeg/jpeg-core.h | 28 ++++++++++----------- 2 files changed, 23 insertions(+), 23 deletions(-)
diff --git a/drivers/media/platform/s5p-jpeg/jpeg-core.c b/drivers/media/platform/s5p-jpeg/jpeg-core.c index d402e456f27d..7d0ab19c38bb 100644 --- a/drivers/media/platform/s5p-jpeg/jpeg-core.c +++ b/drivers/media/platform/s5p-jpeg/jpeg-core.c @@ -1140,8 +1140,8 @@ static bool s5p_jpeg_parse_hdr(struct s5p_jpeg_q_data *result, continue; length = 0; switch (c) { - /* SOF0: baseline JPEG */ - case SOF0: + /* JPEG_MARKER_SOF0: baseline JPEG */ + case JPEG_MARKER_SOF0: if (get_word_be(&jpeg_buffer, &word)) break; length = (long)word - 2; @@ -1172,7 +1172,7 @@ static bool s5p_jpeg_parse_hdr(struct s5p_jpeg_q_data *result, notfound = 0; break;
- case DQT: + case JPEG_MARKER_DQT: if (get_word_be(&jpeg_buffer, &word)) break; length = (long)word - 2; @@ -1185,7 +1185,7 @@ static bool s5p_jpeg_parse_hdr(struct s5p_jpeg_q_data *result, skip(&jpeg_buffer, length); break;
- case DHT: + case JPEG_MARKER_DHT: if (get_word_be(&jpeg_buffer, &word)) break; length = (long)word - 2; @@ -1198,15 +1198,15 @@ static bool s5p_jpeg_parse_hdr(struct s5p_jpeg_q_data *result, skip(&jpeg_buffer, length); break;
- case SOS: + case JPEG_MARKER_SOS: sos = jpeg_buffer.curr - 2; /* 0xffda */ break;
/* skip payload-less markers */ - case RST ... RST + 7: - case SOI: - case EOI: - case TEM: + case JPEG_MARKER_RST ... JPEG_MARKER_RST + 7: + case JPEG_MARKER_SOI: + case JPEG_MARKER_EOI: + case JPEG_MARKER_TEM: break;
/* skip uninteresting payload markers */ diff --git a/drivers/media/platform/s5p-jpeg/jpeg-core.h b/drivers/media/platform/s5p-jpeg/jpeg-core.h index a77d93c098ce..8473a019bb5f 100644 --- a/drivers/media/platform/s5p-jpeg/jpeg-core.h +++ b/drivers/media/platform/s5p-jpeg/jpeg-core.h @@ -37,15 +37,15 @@ #define EXYNOS3250_IRQ_TIMEOUT 0x10000000
/* a selection of JPEG markers */ -#define TEM 0x01 -#define SOF0 0xc0 -#define DHT 0xc4 -#define RST 0xd0 -#define SOI 0xd8 -#define EOI 0xd9 -#define SOS 0xda -#define DQT 0xdb -#define DHP 0xde +#define JPEG_MARKER_TEM 0x01 +#define JPEG_MARKER_SOF0 0xc0 +#define JPEG_MARKER_DHT 0xc4 +#define JPEG_MARKER_RST 0xd0 +#define JPEG_MARKER_SOI 0xd8 +#define JPEG_MARKER_EOI 0xd9 +#define JPEG_MARKER_SOS 0xda +#define JPEG_MARKER_DQT 0xdb +#define JPEG_MARKER_DHP 0xde
/* Flags that indicate a format can be used for capture/output */ #define SJPEG_FMT_FLAG_ENC_CAPTURE (1 << 0) @@ -187,11 +187,11 @@ struct s5p_jpeg_marker { * @fmt: driver-specific format of this queue * @w: image width * @h: image height - * @sos: SOS marker's position relative to the buffer beginning - * @dht: DHT markers' positions relative to the buffer beginning - * @dqt: DQT markers' positions relative to the buffer beginning - * @sof: SOF0 marker's position relative to the buffer beginning - * @sof_len: SOF0 marker's payload length (without length field itself) + * @sos: JPEG_MARKER_SOS's position relative to the buffer beginning + * @dht: JPEG_MARKER_DHT' positions relative to the buffer beginning + * @dqt: JPEG_MARKER_DQT' positions relative to the buffer beginning + * @sof: JPEG_MARKER_SOF0's position relative to the buffer beginning + * @sof_len: JPEG_MARKER_SOF0's payload length (without length field itself) * @size: image buffer size in bytes */ struct s5p_jpeg_q_data {
From: Shengjiu Wang shengjiu.wang@nxp.com
[ Upstream commit 9c3ad33b5a412d8bc0a377e7cd9baa53ed52f22d ]
There is no defer probe when adding platform component to snd_soc_pcm_runtime(rtd), the code is in snd_soc_add_pcm_runtime()
snd_soc_register_card() -> snd_soc_bind_card() -> snd_soc_add_pcm_runtime() -> adding cpu dai -> adding codec dai -> adding platform component.
So if the platform component is not ready at that time, then the sound card still registered successfully, but platform component is empty, the sound card can't be used.
As there is defer probe checking for cpu dai component, then register platform component before cpu dai to avoid such issue.
Fixes: 435508214942 ("ASoC: Add SAI SoC Digital Audio Interface driver") Signed-off-by: Shengjiu Wang shengjiu.wang@nxp.com Link: https://lore.kernel.org/r/1630665006-31437-2-git-send-email-shengjiu.wang@nx... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/fsl/fsl_sai.c | 14 +++++++++----- 1 file changed, 9 insertions(+), 5 deletions(-)
diff --git a/sound/soc/fsl/fsl_sai.c b/sound/soc/fsl/fsl_sai.c index 223fcd15bfcc..38f6362099d5 100644 --- a/sound/soc/fsl/fsl_sai.c +++ b/sound/soc/fsl/fsl_sai.c @@ -1152,11 +1152,10 @@ static int fsl_sai_probe(struct platform_device *pdev) if (ret < 0) goto err_pm_get_sync;
- ret = devm_snd_soc_register_component(&pdev->dev, &fsl_component, - &sai->cpu_dai_drv, 1); - if (ret) - goto err_pm_get_sync; - + /* + * Register platform component before registering cpu dai for there + * is not defer probe for platform component in snd_soc_add_pcm_runtime(). + */ if (sai->soc_data->use_imx_pcm) { ret = imx_pcm_dma_init(pdev, IMX_SAI_DMABUF_SIZE); if (ret) @@ -1167,6 +1166,11 @@ static int fsl_sai_probe(struct platform_device *pdev) goto err_pm_get_sync; }
+ ret = devm_snd_soc_register_component(&pdev->dev, &fsl_component, + &sai->cpu_dai_drv, 1); + if (ret) + goto err_pm_get_sync; + return ret;
err_pm_get_sync:
From: Shengjiu Wang shengjiu.wang@nxp.com
[ Upstream commit f12ce92e98b21c1fc669cd74e12c54a0fe3bc2eb ]
There is no defer probe when adding platform component to snd_soc_pcm_runtime(rtd), the code is in snd_soc_add_pcm_runtime()
snd_soc_register_card() -> snd_soc_bind_card() -> snd_soc_add_pcm_runtime() -> adding cpu dai -> adding codec dai -> adding platform component.
So if the platform component is not ready at that time, then the sound card still registered successfully, but platform component is empty, the sound card can't be used.
As there is defer probe checking for cpu dai component, then register platform component before cpu dai to avoid such issue.
Fixes: 43d24e76b698 ("ASoC: fsl_esai: Add ESAI CPU DAI driver") Signed-off-by: Shengjiu Wang shengjiu.wang@nxp.com Link: https://lore.kernel.org/r/1630665006-31437-3-git-send-email-shengjiu.wang@nx... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/fsl/fsl_esai.c | 16 ++++++++++------ 1 file changed, 10 insertions(+), 6 deletions(-)
diff --git a/sound/soc/fsl/fsl_esai.c b/sound/soc/fsl/fsl_esai.c index a961f837cd09..bda66b30e063 100644 --- a/sound/soc/fsl/fsl_esai.c +++ b/sound/soc/fsl/fsl_esai.c @@ -1073,6 +1073,16 @@ static int fsl_esai_probe(struct platform_device *pdev) if (ret < 0) goto err_pm_get_sync;
+ /* + * Register platform component before registering cpu dai for there + * is not defer probe for platform component in snd_soc_add_pcm_runtime(). + */ + ret = imx_pcm_dma_init(pdev, IMX_ESAI_DMABUF_SIZE); + if (ret) { + dev_err(&pdev->dev, "failed to init imx pcm dma: %d\n", ret); + goto err_pm_get_sync; + } + ret = devm_snd_soc_register_component(&pdev->dev, &fsl_esai_component, &fsl_esai_dai, 1); if (ret) { @@ -1082,12 +1092,6 @@ static int fsl_esai_probe(struct platform_device *pdev)
INIT_WORK(&esai_priv->work, fsl_esai_hw_reset);
- ret = imx_pcm_dma_init(pdev, IMX_ESAI_DMABUF_SIZE); - if (ret) { - dev_err(&pdev->dev, "failed to init imx pcm dma: %d\n", ret); - goto err_pm_get_sync; - } - return ret;
err_pm_get_sync:
From: Shengjiu Wang shengjiu.wang@nxp.com
[ Upstream commit 0adf292069dcca8bab76a603251fcaabf77468ca ]
There is no defer probe when adding platform component to snd_soc_pcm_runtime(rtd), the code is in snd_soc_add_pcm_runtime()
snd_soc_register_card() -> snd_soc_bind_card() -> snd_soc_add_pcm_runtime() -> adding cpu dai -> adding codec dai -> adding platform component.
So if the platform component is not ready at that time, then the sound card still registered successfully, but platform component is empty, the sound card can't be used.
As there is defer probe checking for cpu dai component, then register platform component before cpu dai to avoid such issue.
Fixes: 47a70e6fc9a8 ("ASoC: Add MICFIL SoC Digital Audio Interface driver.") Signed-off-by: Shengjiu Wang shengjiu.wang@nxp.com Link: https://lore.kernel.org/r/1630665006-31437-4-git-send-email-shengjiu.wang@nx... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/fsl/fsl_micfil.c | 15 ++++++++++----- 1 file changed, 10 insertions(+), 5 deletions(-)
diff --git a/sound/soc/fsl/fsl_micfil.c b/sound/soc/fsl/fsl_micfil.c index 8c0c75ce9490..9f90989ac59a 100644 --- a/sound/soc/fsl/fsl_micfil.c +++ b/sound/soc/fsl/fsl_micfil.c @@ -737,18 +737,23 @@ static int fsl_micfil_probe(struct platform_device *pdev) pm_runtime_enable(&pdev->dev); regcache_cache_only(micfil->regmap, true);
+ /* + * Register platform component before registering cpu dai for there + * is not defer probe for platform component in snd_soc_add_pcm_runtime(). + */ + ret = devm_snd_dmaengine_pcm_register(&pdev->dev, NULL, 0); + if (ret) { + dev_err(&pdev->dev, "failed to pcm register\n"); + return ret; + } + ret = devm_snd_soc_register_component(&pdev->dev, &fsl_micfil_component, &fsl_micfil_dai, 1); if (ret) { dev_err(&pdev->dev, "failed to register component %s\n", fsl_micfil_component.name); - return ret; }
- ret = devm_snd_dmaengine_pcm_register(&pdev->dev, NULL, 0); - if (ret) - dev_err(&pdev->dev, "failed to pcm register\n"); - return ret; }
From: Shengjiu Wang shengjiu.wang@nxp.com
[ Upstream commit ee8ccc2eb5840e34fce088bdb174fd5329153ef0 ]
There is no defer probe when adding platform component to snd_soc_pcm_runtime(rtd), the code is in snd_soc_add_pcm_runtime()
snd_soc_register_card() -> snd_soc_bind_card() -> snd_soc_add_pcm_runtime() -> adding cpu dai -> adding codec dai -> adding platform component.
So if the platform component is not ready at that time, then the sound card still registered successfully, but platform component is empty, the sound card can't be used.
As there is defer probe checking for cpu dai component, then register platform component before cpu dai to avoid such issue.
Fixes: a2388a498ad2 ("ASoC: fsl: Add S/PDIF CPU DAI driver") Signed-off-by: Shengjiu Wang shengjiu.wang@nxp.com Link: https://lore.kernel.org/r/1630665006-31437-5-git-send-email-shengjiu.wang@nx... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/fsl/fsl_spdif.c | 14 +++++++++----- 1 file changed, 9 insertions(+), 5 deletions(-)
diff --git a/sound/soc/fsl/fsl_spdif.c b/sound/soc/fsl/fsl_spdif.c index 8ffb1a6048d6..1c53719bb61e 100644 --- a/sound/soc/fsl/fsl_spdif.c +++ b/sound/soc/fsl/fsl_spdif.c @@ -1434,16 +1434,20 @@ static int fsl_spdif_probe(struct platform_device *pdev) pm_runtime_enable(&pdev->dev); regcache_cache_only(spdif_priv->regmap, true);
- ret = devm_snd_soc_register_component(&pdev->dev, &fsl_spdif_component, - &spdif_priv->cpu_dai_drv, 1); + /* + * Register platform component before registering cpu dai for there + * is not defer probe for platform component in snd_soc_add_pcm_runtime(). + */ + ret = imx_pcm_dma_init(pdev, IMX_SPDIF_DMABUF_SIZE); if (ret) { - dev_err(&pdev->dev, "failed to register DAI: %d\n", ret); + dev_err_probe(&pdev->dev, ret, "imx_pcm_dma_init failed\n"); goto err_pm_disable; }
- ret = imx_pcm_dma_init(pdev, IMX_SPDIF_DMABUF_SIZE); + ret = devm_snd_soc_register_component(&pdev->dev, &fsl_spdif_component, + &spdif_priv->cpu_dai_drv, 1); if (ret) { - dev_err_probe(&pdev->dev, ret, "imx_pcm_dma_init failed\n"); + dev_err(&pdev->dev, "failed to register DAI: %d\n", ret); goto err_pm_disable; }
From: Shengjiu Wang shengjiu.wang@nxp.com
[ Upstream commit c590fa80b39287a91abeb487829f3190e7ae775f ]
There is no defer probe when adding platform component to snd_soc_pcm_runtime(rtd), the code is in snd_soc_add_pcm_runtime()
snd_soc_register_card() -> snd_soc_bind_card() -> snd_soc_add_pcm_runtime() -> adding cpu dai -> adding codec dai -> adding platform component.
So if the platform component is not ready at that time, then the sound card still registered successfully, but platform component is empty, the sound card can't be used.
As there is defer probe checking for cpu dai component, then register platform component before cpu dai to avoid such issue.
Fixes: 28564486866f ("ASoC: fsl_xcvr: Add XCVR ASoC CPU DAI driver") Signed-off-by: Shengjiu Wang shengjiu.wang@nxp.com Link: https://lore.kernel.org/r/1630665006-31437-6-git-send-email-shengjiu.wang@nx... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/fsl/fsl_xcvr.c | 15 ++++++++++----- 1 file changed, 10 insertions(+), 5 deletions(-)
diff --git a/sound/soc/fsl/fsl_xcvr.c b/sound/soc/fsl/fsl_xcvr.c index fb7c29fc39d7..477d16713e72 100644 --- a/sound/soc/fsl/fsl_xcvr.c +++ b/sound/soc/fsl/fsl_xcvr.c @@ -1217,18 +1217,23 @@ static int fsl_xcvr_probe(struct platform_device *pdev) pm_runtime_enable(dev); regcache_cache_only(xcvr->regmap, true);
+ /* + * Register platform component before registering cpu dai for there + * is not defer probe for platform component in snd_soc_add_pcm_runtime(). + */ + ret = devm_snd_dmaengine_pcm_register(dev, NULL, 0); + if (ret) { + dev_err(dev, "failed to pcm register\n"); + return ret; + } + ret = devm_snd_soc_register_component(dev, &fsl_xcvr_comp, &fsl_xcvr_dai, 1); if (ret) { dev_err(dev, "failed to register component %s\n", fsl_xcvr_comp.name); - return ret; }
- ret = devm_snd_dmaengine_pcm_register(dev, NULL, 0); - if (ret) - dev_err(dev, "failed to pcm register\n"); - return ret; }
From: Trevor Wu trevor.wu@mediatek.com
[ Upstream commit 1dd038522615b70f5f8945c5631e9e2fa5bd58b1 ]
When memory allocation for afe->reg_back_up fails, reg_back_up can't be used. Keep the suspend/resume flow but skip register backup when afe->reg_back_up is NULL, in case illegal memory access happens.
Fixes: 283b612429a2 ("ASoC: mediatek: implement mediatek common structure") Signed-off-by: Trevor Wu trevor.wu@mediatek.com Reported-by: Dan Carpenter dan.carpenter@oracle.com Link: https://lore.kernel.org/r/20210910092613.30188-1-trevor.wu@mediatek.com Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/mediatek/common/mtk-afe-fe-dai.c | 19 +++++++++++-------- 1 file changed, 11 insertions(+), 8 deletions(-)
diff --git a/sound/soc/mediatek/common/mtk-afe-fe-dai.c b/sound/soc/mediatek/common/mtk-afe-fe-dai.c index 3cb2adf420bb..ab7bbd53bb01 100644 --- a/sound/soc/mediatek/common/mtk-afe-fe-dai.c +++ b/sound/soc/mediatek/common/mtk-afe-fe-dai.c @@ -334,9 +334,11 @@ int mtk_afe_suspend(struct snd_soc_component *component) devm_kcalloc(dev, afe->reg_back_up_list_num, sizeof(unsigned int), GFP_KERNEL);
- for (i = 0; i < afe->reg_back_up_list_num; i++) - regmap_read(regmap, afe->reg_back_up_list[i], - &afe->reg_back_up[i]); + if (afe->reg_back_up) { + for (i = 0; i < afe->reg_back_up_list_num; i++) + regmap_read(regmap, afe->reg_back_up_list[i], + &afe->reg_back_up[i]); + }
afe->suspended = true; afe->runtime_suspend(dev); @@ -356,12 +358,13 @@ int mtk_afe_resume(struct snd_soc_component *component)
afe->runtime_resume(dev);
- if (!afe->reg_back_up) + if (!afe->reg_back_up) { dev_dbg(dev, "%s no reg_backup\n", __func__); - - for (i = 0; i < afe->reg_back_up_list_num; i++) - mtk_regmap_write(regmap, afe->reg_back_up_list[i], - afe->reg_back_up[i]); + } else { + for (i = 0; i < afe->reg_back_up_list_num; i++) + mtk_regmap_write(regmap, afe->reg_back_up_list[i], + afe->reg_back_up[i]); + }
afe->suspended = false; return 0;
From: James Smart jsmart2021@gmail.com
[ Upstream commit 96fafe7c6523886308605d30ec92c7936abe7c2c ]
The kernel test robot flagged an warning for ".../efc_device.c:932:6: warning: cast to smaller integer type 'enum efc_nport_topology' from 'void *'"
For the topology events, the "arg" field is generically defined as a void * and is used to pass different arguments. Most of the arguments are pointers to data structures. But for the EFC_EVT_NPORT_TOPOLOGY_NOTIFY event, the argument is an enum value, and the code is typecasting the void * to an enum generating the warning.
Fix by converting the EFC_EVT_NPORT_TOPOLOGY_NOTIFY event to pass a pointer to the enum, thus it's a straight-forward pointer dereference in the event handler.
Link: https://lore.kernel.org/r/20210830231050.5951-1-jsmart2021@gmail.com Fixes: 202bfdffae27 ("scsi: elx: libefc: FC node ELS and state handling") Reported-by: kernel test robot lkp@intel.com Co-developed-by: Ram Vegesna ram.vegesna@broadcom.com Signed-off-by: Ram Vegesna ram.vegesna@broadcom.com Signed-off-by: James Smart jsmart2021@gmail.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/elx/libefc/efc_device.c | 7 +++---- drivers/scsi/elx/libefc/efc_fabric.c | 3 +-- 2 files changed, 4 insertions(+), 6 deletions(-)
diff --git a/drivers/scsi/elx/libefc/efc_device.c b/drivers/scsi/elx/libefc/efc_device.c index 725ca2a23fb2..52be01333c6e 100644 --- a/drivers/scsi/elx/libefc/efc_device.c +++ b/drivers/scsi/elx/libefc/efc_device.c @@ -928,22 +928,21 @@ __efc_d_wait_topology_notify(struct efc_sm_ctx *ctx, break;
case EFC_EVT_NPORT_TOPOLOGY_NOTIFY: { - enum efc_nport_topology topology = - (enum efc_nport_topology)arg; + enum efc_nport_topology *topology = arg;
WARN_ON(node->nport->domain->attached);
WARN_ON(node->send_ls_acc != EFC_NODE_SEND_LS_ACC_PLOGI);
node_printf(node, "topology notification, topology=%d\n", - topology); + *topology);
/* At the time the PLOGI was received, the topology was unknown, * so we didn't know which node would perform the domain attach: * 1. The node from which the PLOGI was sent (p2p) or * 2. The node to which the FLOGI was sent (fabric). */ - if (topology == EFC_NPORT_TOPO_P2P) { + if (*topology == EFC_NPORT_TOPO_P2P) { /* if this is p2p, need to attach to the domain using * the d_id from the PLOGI received */ diff --git a/drivers/scsi/elx/libefc/efc_fabric.c b/drivers/scsi/elx/libefc/efc_fabric.c index d397220d9e54..3270ce40196c 100644 --- a/drivers/scsi/elx/libefc/efc_fabric.c +++ b/drivers/scsi/elx/libefc/efc_fabric.c @@ -107,7 +107,6 @@ void efc_fabric_notify_topology(struct efc_node *node) { struct efc_node *tmp_node; - enum efc_nport_topology topology = node->nport->topology; unsigned long index;
/* @@ -118,7 +117,7 @@ efc_fabric_notify_topology(struct efc_node *node) if (tmp_node != node) { efc_node_post_event(tmp_node, EFC_EVT_NPORT_TOPOLOGY_NOTIFY, - (void *)topology); + &node->nport->topology); } } }
From: Yong Zhi yong.zhi@intel.com
[ Upstream commit ac4dfccb96571ca03af7cac64b7a0b2952c97f3a ]
Fix @buf arg given to hex_dump_to_buffer() and stack address used in dump error output.
Fixes: e657c18a01c8 ('ASoC: SOF: Add xtensa support') Signed-off-by: Yong Zhi yong.zhi@intel.com Reviewed-by: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com Reviewed-by: Daniel Baluta daniel.baluta@gmail.com Signed-off-by: Peter Ujfalusi peter.ujfalusi@linux.intel.com Link: https://lore.kernel.org/r/20210915063230.29711-1-peter.ujfalusi@linux.intel.... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/sof/xtensa/core.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/sound/soc/sof/xtensa/core.c b/sound/soc/sof/xtensa/core.c index bbb9a2282ed9..f6e3411b33cf 100644 --- a/sound/soc/sof/xtensa/core.c +++ b/sound/soc/sof/xtensa/core.c @@ -122,9 +122,9 @@ static void xtensa_stack(struct snd_sof_dev *sdev, void *oops, u32 *stack, * 0x0049fbb0: 8000f2d0 0049fc00 6f6c6c61 00632e63 */ for (i = 0; i < stack_words; i += 4) { - hex_dump_to_buffer(stack + i * 4, 16, 16, 4, + hex_dump_to_buffer(stack + i, 16, 16, 4, buf, sizeof(buf), false); - dev_err(sdev->dev, "0x%08x: %s\n", stack_ptr + i, buf); + dev_err(sdev->dev, "0x%08x: %s\n", stack_ptr + i * 4, buf); } }
From: Peter Ujfalusi peter.ujfalusi@linux.intel.com
[ Upstream commit 10d93a98190aec2c3ff98d9472ab1bf0543aa02c ]
i.MX8 only uses SOF_FW_BLK_TYPE_IRAM (1) and SOF_FW_BLK_TYPE_SRAM (3) bars, everything else is left as 0 in sdev->bar[] array.
If a broken or purposefully crafted firmware image is loaded with other types of FW_BLK_TYPE then a kernel crash can be triggered.
Make sure that only IRAM/SRAM type is converted to bar index.
Fixes: 202acc565a1f0 ("ASoC: SOF: imx: Add i.MX8 HW support") Signed-off-by: Peter Ujfalusi peter.ujfalusi@linux.intel.com Reviewed-by: Daniel Baluta daniel.baluta@gmail.com Reviewed-by: Ranjani Sridharan ranjani.sridharan@linux.intel.com Reviewed-by: Guennadi Liakhovetski guennadi.liakhovetski@linux.intel.com Link: https://lore.kernel.org/r/20210915122116.18317-5-peter.ujfalusi@linux.intel.... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/sof/imx/imx8.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/sound/soc/sof/imx/imx8.c b/sound/soc/sof/imx/imx8.c index 12fedf0984bd..7e9723a10d02 100644 --- a/sound/soc/sof/imx/imx8.c +++ b/sound/soc/sof/imx/imx8.c @@ -365,7 +365,14 @@ static int imx8_remove(struct snd_sof_dev *sdev) /* on i.MX8 there is 1 to 1 match between type and BAR idx */ static int imx8_get_bar_index(struct snd_sof_dev *sdev, u32 type) { - return type; + /* Only IRAM and SRAM bars are valid */ + switch (type) { + case SOF_FW_BLK_TYPE_IRAM: + case SOF_FW_BLK_TYPE_SRAM: + return type; + default: + return -EINVAL; + } }
static void imx8_ipc_msg_data(struct snd_sof_dev *sdev,
From: Peter Ujfalusi peter.ujfalusi@linux.intel.com
[ Upstream commit d9be4a88c3627c270bbe032b623dc43f3b764565 ]
i.MX8 only uses SOF_FW_BLK_TYPE_IRAM (1) and SOF_FW_BLK_TYPE_SRAM (3) bars, everything else is left as 0 in sdev->bar[] array.
If a broken or purposefully crafted firmware image is loaded with other types of FW_BLK_TYPE then a kernel crash can be triggered.
Make sure that only IRAM/SRAM type is converted to bar index.
Fixes: afb93d716533d ("ASoC: SOF: imx: Add i.MX8M HW support") Signed-off-by: Peter Ujfalusi peter.ujfalusi@linux.intel.com Reviewed-by: Daniel Baluta daniel.baluta@gmail.com Reviewed-by: Ranjani Sridharan ranjani.sridharan@linux.intel.com Reviewed-by: Guennadi Liakhovetski guennadi.liakhovetski@linux.intel.com Link: https://lore.kernel.org/r/20210915122116.18317-6-peter.ujfalusi@linux.intel.... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/sof/imx/imx8m.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/sound/soc/sof/imx/imx8m.c b/sound/soc/sof/imx/imx8m.c index cb822d953767..892e1482f97f 100644 --- a/sound/soc/sof/imx/imx8m.c +++ b/sound/soc/sof/imx/imx8m.c @@ -228,7 +228,14 @@ static int imx8m_remove(struct snd_sof_dev *sdev) /* on i.MX8 there is 1 to 1 match between type and BAR idx */ static int imx8m_get_bar_index(struct snd_sof_dev *sdev, u32 type) { - return type; + /* Only IRAM and SRAM bars are valid */ + switch (type) { + case SOF_FW_BLK_TYPE_IRAM: + case SOF_FW_BLK_TYPE_SRAM: + return type; + default: + return -EINVAL; + } }
static void imx8m_ipc_msg_data(struct snd_sof_dev *sdev,
From: David Collins collinsd@codeaurora.org
[ Upstream commit d36a97736b2cc9b13db0dfdf6f32b115ec193614 ]
pmic_gpio_child_to_parent_hwirq() and gpiochip_populate_parent_fwspec_fourcell() translate a pinctrl- spmi-gpio irqspec to an SPMI controller irqspec. When they do this, they use a fixed SPMI slave ID of 0 and a fixed GPIO peripheral offset of 0xC0 (corresponding to SPMI address 0xC000). This translation results in an incorrect irqspec for secondary PMICs that don't have a slave ID of 0 as well as for PMIC chips which have GPIO peripherals located at a base address other than 0xC000.
Correct this issue by passing the slave ID of the pinctrl-spmi- gpio device's parent in the SPMI controller irqspec and by calculating the peripheral ID base from the device tree 'reg' property of the pinctrl-spmi-gpio device.
Signed-off-by: David Collins collinsd@codeaurora.org Signed-off-by: satya priya skakit@codeaurora.org Fixes: ca69e2d165eb ("qcom: spmi-gpio: add support for hierarchical IRQ chip") Reviewed-by: Stephen Boyd swboyd@chromium.org Link: https://lore.kernel.org/r/1631798498-10864-2-git-send-email-skakit@codeauror... Signed-off-by: Linus Walleij linus.walleij@linaro.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pinctrl/qcom/pinctrl-spmi-gpio.c | 37 ++++++++++++++++++++++-- 1 file changed, 34 insertions(+), 3 deletions(-)
diff --git a/drivers/pinctrl/qcom/pinctrl-spmi-gpio.c b/drivers/pinctrl/qcom/pinctrl-spmi-gpio.c index a89d24a040af..9b524969eff7 100644 --- a/drivers/pinctrl/qcom/pinctrl-spmi-gpio.c +++ b/drivers/pinctrl/qcom/pinctrl-spmi-gpio.c @@ -1,6 +1,6 @@ // SPDX-License-Identifier: GPL-2.0-only /* - * Copyright (c) 2012-2014, The Linux Foundation. All rights reserved. + * Copyright (c) 2012-2014, 2016-2021 The Linux Foundation. All rights reserved. */
#include <linux/gpio/driver.h> @@ -14,6 +14,7 @@ #include <linux/platform_device.h> #include <linux/regmap.h> #include <linux/slab.h> +#include <linux/spmi.h> #include <linux/types.h>
#include <dt-bindings/pinctrl/qcom,pmic-gpio.h> @@ -171,6 +172,8 @@ struct pmic_gpio_state { struct pinctrl_dev *ctrl; struct gpio_chip chip; struct irq_chip irq; + u8 usid; + u8 pid_base; };
static const struct pinconf_generic_params pmic_gpio_bindings[] = { @@ -949,12 +952,36 @@ static int pmic_gpio_child_to_parent_hwirq(struct gpio_chip *chip, unsigned int *parent_hwirq, unsigned int *parent_type) { - *parent_hwirq = child_hwirq + 0xc0; + struct pmic_gpio_state *state = gpiochip_get_data(chip); + + *parent_hwirq = child_hwirq + state->pid_base; *parent_type = child_type;
return 0; }
+static void *pmic_gpio_populate_parent_fwspec(struct gpio_chip *chip, + unsigned int parent_hwirq, + unsigned int parent_type) +{ + struct pmic_gpio_state *state = gpiochip_get_data(chip); + struct irq_fwspec *fwspec; + + fwspec = kzalloc(sizeof(*fwspec), GFP_KERNEL); + if (!fwspec) + return NULL; + + fwspec->fwnode = chip->irq.parent_domain->fwnode; + + fwspec->param_count = 4; + fwspec->param[0] = state->usid; + fwspec->param[1] = parent_hwirq; + /* param[2] must be left as 0 */ + fwspec->param[3] = parent_type; + + return fwspec; +} + static int pmic_gpio_probe(struct platform_device *pdev) { struct irq_domain *parent_domain; @@ -965,6 +992,7 @@ static int pmic_gpio_probe(struct platform_device *pdev) struct pmic_gpio_pad *pad, *pads; struct pmic_gpio_state *state; struct gpio_irq_chip *girq; + const struct spmi_device *parent_spmi_dev; int ret, npins, i; u32 reg;
@@ -984,6 +1012,9 @@ static int pmic_gpio_probe(struct platform_device *pdev)
state->dev = &pdev->dev; state->map = dev_get_regmap(dev->parent, NULL); + parent_spmi_dev = to_spmi_device(dev->parent); + state->usid = parent_spmi_dev->usid; + state->pid_base = reg >> 8;
pindesc = devm_kcalloc(dev, npins, sizeof(*pindesc), GFP_KERNEL); if (!pindesc) @@ -1059,7 +1090,7 @@ static int pmic_gpio_probe(struct platform_device *pdev) girq->fwnode = of_node_to_fwnode(state->dev->of_node); girq->parent_domain = parent_domain; girq->child_to_parent_hwirq = pmic_gpio_child_to_parent_hwirq; - girq->populate_parent_alloc_arg = gpiochip_populate_parent_fwspec_fourcell; + girq->populate_parent_alloc_arg = pmic_gpio_populate_parent_fwspec; girq->child_offset_to_irq = pmic_gpio_child_offset_to_irq; girq->child_irq_domain_ops.translate = pmic_gpio_domain_translate;
From: Lama Kayal lkayal@nvidia.com
[ Upstream commit 72a3c58d18fd780eecd80178bb2132ce741a0a74 ]
Any link state change that's done prior to net device registration isn't reflected on the state, thus the operational state is left obsolete, with 'UNKNOWN' status.
To resolve the issue, query link state from FW upon open operations to ensure operational state is updated.
Fixes: c27a02cd94d6 ("mlx4_en: Add driver for Mellanox ConnectX 10GbE NIC") Signed-off-by: Lama Kayal lkayal@nvidia.com Signed-off-by: Tariq Toukan tariqt@nvidia.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- .../net/ethernet/mellanox/mlx4/en_netdev.c | 47 ++++++++++++------- drivers/net/ethernet/mellanox/mlx4/mlx4_en.h | 1 - 2 files changed, 29 insertions(+), 19 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c index 1e672bc36c4d..a6878e5f922a 100644 --- a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c +++ b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c @@ -1272,7 +1272,6 @@ static void mlx4_en_do_set_rx_mode(struct work_struct *work) if (!netif_carrier_ok(dev)) { if (!mlx4_en_QUERY_PORT(mdev, priv->port)) { if (priv->port_state.link_state) { - priv->last_link_state = MLX4_DEV_EVENT_PORT_UP; netif_carrier_on(dev); en_dbg(LINK, priv, "Link Up\n"); } @@ -1560,26 +1559,36 @@ static void mlx4_en_service_task(struct work_struct *work) mutex_unlock(&mdev->state_lock); }
-static void mlx4_en_linkstate(struct work_struct *work) +static void mlx4_en_linkstate(struct mlx4_en_priv *priv) +{ + struct mlx4_en_port_state *port_state = &priv->port_state; + struct mlx4_en_dev *mdev = priv->mdev; + struct net_device *dev = priv->dev; + bool up; + + if (mlx4_en_QUERY_PORT(mdev, priv->port)) + port_state->link_state = MLX4_PORT_STATE_DEV_EVENT_PORT_DOWN; + + up = port_state->link_state == MLX4_PORT_STATE_DEV_EVENT_PORT_UP; + if (up == netif_carrier_ok(dev)) + netif_carrier_event(dev); + if (!up) { + en_info(priv, "Link Down\n"); + netif_carrier_off(dev); + } else { + en_info(priv, "Link Up\n"); + netif_carrier_on(dev); + } +} + +static void mlx4_en_linkstate_work(struct work_struct *work) { struct mlx4_en_priv *priv = container_of(work, struct mlx4_en_priv, linkstate_task); struct mlx4_en_dev *mdev = priv->mdev; - int linkstate = priv->link_state;
mutex_lock(&mdev->state_lock); - /* If observable port state changed set carrier state and - * report to system log */ - if (priv->last_link_state != linkstate) { - if (linkstate == MLX4_DEV_EVENT_PORT_DOWN) { - en_info(priv, "Link Down\n"); - netif_carrier_off(priv->dev); - } else { - en_info(priv, "Link Up\n"); - netif_carrier_on(priv->dev); - } - } - priv->last_link_state = linkstate; + mlx4_en_linkstate(priv); mutex_unlock(&mdev->state_lock); }
@@ -2082,9 +2091,11 @@ static int mlx4_en_open(struct net_device *dev) mlx4_en_clear_stats(dev);
err = mlx4_en_start_port(dev); - if (err) + if (err) { en_err(priv, "Failed starting port:%d\n", priv->port); - + goto out; + } + mlx4_en_linkstate(priv); out: mutex_unlock(&mdev->state_lock); return err; @@ -3171,7 +3182,7 @@ int mlx4_en_init_netdev(struct mlx4_en_dev *mdev, int port, spin_lock_init(&priv->stats_lock); INIT_WORK(&priv->rx_mode_task, mlx4_en_do_set_rx_mode); INIT_WORK(&priv->restart_task, mlx4_en_restart); - INIT_WORK(&priv->linkstate_task, mlx4_en_linkstate); + INIT_WORK(&priv->linkstate_task, mlx4_en_linkstate_work); INIT_DELAYED_WORK(&priv->stats_task, mlx4_en_do_get_stats); INIT_DELAYED_WORK(&priv->service_task, mlx4_en_service_task); #ifdef CONFIG_RFS_ACCEL diff --git a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h index f3d1a20201ef..6bf558c5ec10 100644 --- a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h +++ b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h @@ -552,7 +552,6 @@ struct mlx4_en_priv {
struct mlx4_hwq_resources res; int link_state; - int last_link_state; bool port_up; int port; int registered;
From: Alexandra Winter wintera@linux.ibm.com
[ Upstream commit ee909d0b1dac8632eeb78cbf17661d6c7674bbd0 ]
Problem: qeth_close_dev_handler is a worker that tries to acquire card->discipline_mutex via drv->set_offline() in ccwgroup_set_offline(). Since commit b41b554c1ee7 ("s390/qeth: fix locking for discipline setup / removal") qeth_remove_discipline() is called under card->discipline_mutex and cancels the work and waits for it to finish.
STOPLAN reception with reason code IPA_RC_VEPA_TO_VEB_TRANSITION is the only situation that schedules close_dev_work. In that situation scheduling qeth recovery will also result in an offline interface, when resetting the isolation mode fails, if the external switch is still set to VEB. And since commit 0b9902c1fcc5 ("s390/qeth: fix deadlock during recovery") qeth recovery does not aquire card->discipline_mutex anymore.
So we accept the longer pathlength of qeth_schedule_recovery in this error situation and re-use the existing function.
As a side-benefit this changes the hwtrap to behave like during recovery instead of like during a user-triggered set_offline.
Fixes: b41b554c1ee7 ("s390/qeth: fix locking for discipline setup / removal") Signed-off-by: Alexandra Winter wintera@linux.ibm.com Acked-by: Julian Wiedmann jwi@linux.ibm.com Signed-off-by: Julian Wiedmann jwi@linux.ibm.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/s390/net/qeth_core.h | 1 - drivers/s390/net/qeth_core_main.c | 16 ++++------------ drivers/s390/net/qeth_l2_main.c | 1 - drivers/s390/net/qeth_l3_main.c | 1 - 4 files changed, 4 insertions(+), 15 deletions(-)
diff --git a/drivers/s390/net/qeth_core.h b/drivers/s390/net/qeth_core.h index f4d554ea0c93..52bdb2c8c085 100644 --- a/drivers/s390/net/qeth_core.h +++ b/drivers/s390/net/qeth_core.h @@ -877,7 +877,6 @@ struct qeth_card { struct napi_struct napi; struct qeth_rx rx; struct delayed_work buffer_reclaim_work; - struct work_struct close_dev_work; };
static inline bool qeth_card_hw_is_reachable(struct qeth_card *card) diff --git a/drivers/s390/net/qeth_core_main.c b/drivers/s390/net/qeth_core_main.c index 51f7f4e680c3..dba3b345218f 100644 --- a/drivers/s390/net/qeth_core_main.c +++ b/drivers/s390/net/qeth_core_main.c @@ -71,15 +71,6 @@ static void qeth_issue_next_read_cb(struct qeth_card *card, static int qeth_qdio_establish(struct qeth_card *); static void qeth_free_qdio_queues(struct qeth_card *card);
-static void qeth_close_dev_handler(struct work_struct *work) -{ - struct qeth_card *card; - - card = container_of(work, struct qeth_card, close_dev_work); - QETH_CARD_TEXT(card, 2, "cldevhdl"); - ccwgroup_set_offline(card->gdev); -} - static const char *qeth_get_cardname(struct qeth_card *card) { if (IS_VM_NIC(card)) { @@ -797,10 +788,12 @@ static struct qeth_ipa_cmd *qeth_check_ipa_data(struct qeth_card *card, case IPA_CMD_STOPLAN: if (cmd->hdr.return_code == IPA_RC_VEPA_TO_VEB_TRANSITION) { dev_err(&card->gdev->dev, - "Interface %s is down because the adjacent port is no longer in reflective relay mode\n", + "Adjacent port of interface %s is no longer in reflective relay mode, trigger recovery\n", netdev_name(card->dev)); - schedule_work(&card->close_dev_work); + /* Set offline, then probably fail to set online: */ + qeth_schedule_recovery(card); } else { + /* stay online for subsequent STARTLAN */ dev_warn(&card->gdev->dev, "The link for interface %s on CHPID 0x%X failed\n", netdev_name(card->dev), card->info.chpid); @@ -1559,7 +1552,6 @@ static void qeth_setup_card(struct qeth_card *card) INIT_LIST_HEAD(&card->ipato.entries); qeth_init_qdio_info(card); INIT_DELAYED_WORK(&card->buffer_reclaim_work, qeth_buffer_reclaim_work); - INIT_WORK(&card->close_dev_work, qeth_close_dev_handler); hash_init(card->rx_mode_addrs); hash_init(card->local_addrs4); hash_init(card->local_addrs6); diff --git a/drivers/s390/net/qeth_l2_main.c b/drivers/s390/net/qeth_l2_main.c index d7cdd9cfe485..3dbe592ca97a 100644 --- a/drivers/s390/net/qeth_l2_main.c +++ b/drivers/s390/net/qeth_l2_main.c @@ -2218,7 +2218,6 @@ static void qeth_l2_remove_device(struct ccwgroup_device *gdev) if (gdev->state == CCWGROUP_ONLINE) qeth_set_offline(card, card->discipline, false);
- cancel_work_sync(&card->close_dev_work); if (card->dev->reg_state == NETREG_REGISTERED) unregister_netdev(card->dev); } diff --git a/drivers/s390/net/qeth_l3_main.c b/drivers/s390/net/qeth_l3_main.c index f0d6f205c53c..5ba38499e3e2 100644 --- a/drivers/s390/net/qeth_l3_main.c +++ b/drivers/s390/net/qeth_l3_main.c @@ -1965,7 +1965,6 @@ static void qeth_l3_remove_device(struct ccwgroup_device *cgdev) if (cgdev->state == CCWGROUP_ONLINE) qeth_set_offline(card, card->discipline, false);
- cancel_work_sync(&card->close_dev_work); if (card->dev->reg_state == NETREG_REGISTERED) unregister_netdev(card->dev);
From: Alexandra Winter wintera@linux.ibm.com
[ Upstream commit d2b59bd4b06d84a4eadb520b0f71c62fe8ec0a62 ]
Commit 0b9902c1fcc5 ("s390/qeth: fix deadlock during recovery") removed taking discipline_mutex inside qeth_do_reset(), fixing potential deadlocks. An error path was missed though, that still takes discipline_mutex and thus has the original deadlock potential.
Intermittent deadlocks were seen when a qeth channel path is configured offline, causing a race between qeth_do_reset and ccwgroup_remove. Call qeth_set_offline() directly in the qeth_do_reset() error case and then a new variant of ccwgroup_set_offline(), without taking discipline_mutex.
Fixes: b41b554c1ee7 ("s390/qeth: fix locking for discipline setup / removal") Signed-off-by: Alexandra Winter wintera@linux.ibm.com Reviewed-by: Julian Wiedmann jwi@linux.ibm.com Signed-off-by: Julian Wiedmann jwi@linux.ibm.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/s390/include/asm/ccwgroup.h | 2 +- drivers/s390/cio/ccwgroup.c | 10 ++++++++-- drivers/s390/net/qeth_core_main.c | 3 ++- 3 files changed, 11 insertions(+), 4 deletions(-)
diff --git a/arch/s390/include/asm/ccwgroup.h b/arch/s390/include/asm/ccwgroup.h index 20f169b6db4e..d97301d9d0b8 100644 --- a/arch/s390/include/asm/ccwgroup.h +++ b/arch/s390/include/asm/ccwgroup.h @@ -57,7 +57,7 @@ struct ccwgroup_device *get_ccwgroupdev_by_busid(struct ccwgroup_driver *gdrv, char *bus_id);
extern int ccwgroup_set_online(struct ccwgroup_device *gdev); -extern int ccwgroup_set_offline(struct ccwgroup_device *gdev); +int ccwgroup_set_offline(struct ccwgroup_device *gdev, bool call_gdrv);
extern int ccwgroup_probe_ccwdev(struct ccw_device *cdev); extern void ccwgroup_remove_ccwdev(struct ccw_device *cdev); diff --git a/drivers/s390/cio/ccwgroup.c b/drivers/s390/cio/ccwgroup.c index 9748165e08e9..f19f02e75115 100644 --- a/drivers/s390/cio/ccwgroup.c +++ b/drivers/s390/cio/ccwgroup.c @@ -77,12 +77,13 @@ EXPORT_SYMBOL(ccwgroup_set_online); /** * ccwgroup_set_offline() - disable a ccwgroup device * @gdev: target ccwgroup device + * @call_gdrv: Call the registered gdrv set_offline function * * This function attempts to put the ccwgroup device into the offline state. * Returns: * %0 on success and a negative error value on failure. */ -int ccwgroup_set_offline(struct ccwgroup_device *gdev) +int ccwgroup_set_offline(struct ccwgroup_device *gdev, bool call_gdrv) { struct ccwgroup_driver *gdrv = to_ccwgroupdrv(gdev->dev.driver); int ret = -EINVAL; @@ -91,11 +92,16 @@ int ccwgroup_set_offline(struct ccwgroup_device *gdev) return -EAGAIN; if (gdev->state == CCWGROUP_OFFLINE) goto out; + if (!call_gdrv) { + ret = 0; + goto offline; + } if (gdrv->set_offline) ret = gdrv->set_offline(gdev); if (ret) goto out;
+offline: gdev->state = CCWGROUP_OFFLINE; out: atomic_set(&gdev->onoff, 0); @@ -124,7 +130,7 @@ static ssize_t ccwgroup_online_store(struct device *dev, if (value == 1) ret = ccwgroup_set_online(gdev); else if (value == 0) - ret = ccwgroup_set_offline(gdev); + ret = ccwgroup_set_offline(gdev, true); else ret = -EINVAL; out: diff --git a/drivers/s390/net/qeth_core_main.c b/drivers/s390/net/qeth_core_main.c index dba3b345218f..f5bad10f3f44 100644 --- a/drivers/s390/net/qeth_core_main.c +++ b/drivers/s390/net/qeth_core_main.c @@ -5548,7 +5548,8 @@ static int qeth_do_reset(void *data) dev_info(&card->gdev->dev, "Device successfully recovered!\n"); } else { - ccwgroup_set_offline(card->gdev); + qeth_set_offline(card, disc, true); + ccwgroup_set_offline(card->gdev, false); dev_warn(&card->gdev->dev, "The qeth device driver failed to recover an error on the device\n"); }
From: Dan Carpenter dan.carpenter@oracle.com
[ Upstream commit 505d9dcb0f7ddf9d075e729523a33d38642ae680 ]
There are three bugs in this code:
1) If we ccp_init_data() fails for &src then we need to free aad. Use goto e_aad instead of goto e_ctx. 2) The label to free the &final_wa was named incorrectly as "e_tag" but it should have been "e_final_wa". One error path leaked &final_wa. 3) The &tag was leaked on one error path. In that case, I added a free before the goto because the resource was local to that block.
Fixes: 36cf515b9bbe ("crypto: ccp - Enable support for AES GCM on v5 CCPs") Reported-by: "minihanshen(沈明航)" minihanshen@tencent.com Signed-off-by: Dan Carpenter dan.carpenter@oracle.com Reviewed-by: John Allen john.allen@amd.com Tested-by: John Allen john.allen@amd.com Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/crypto/ccp/ccp-ops.c | 14 ++++++++------ 1 file changed, 8 insertions(+), 6 deletions(-)
diff --git a/drivers/crypto/ccp/ccp-ops.c b/drivers/crypto/ccp/ccp-ops.c index bb88198c874e..aa4e1a500691 100644 --- a/drivers/crypto/ccp/ccp-ops.c +++ b/drivers/crypto/ccp/ccp-ops.c @@ -778,7 +778,7 @@ ccp_run_aes_gcm_cmd(struct ccp_cmd_queue *cmd_q, struct ccp_cmd *cmd) in_place ? DMA_BIDIRECTIONAL : DMA_TO_DEVICE); if (ret) - goto e_ctx; + goto e_aad;
if (in_place) { dst = src; @@ -863,7 +863,7 @@ ccp_run_aes_gcm_cmd(struct ccp_cmd_queue *cmd_q, struct ccp_cmd *cmd) op.u.aes.size = 0; ret = cmd_q->ccp->vdata->perform->aes(&op); if (ret) - goto e_dst; + goto e_final_wa;
if (aes->action == CCP_AES_ACTION_ENCRYPT) { /* Put the ciphered tag after the ciphertext. */ @@ -873,17 +873,19 @@ ccp_run_aes_gcm_cmd(struct ccp_cmd_queue *cmd_q, struct ccp_cmd *cmd) ret = ccp_init_dm_workarea(&tag, cmd_q, authsize, DMA_BIDIRECTIONAL); if (ret) - goto e_tag; + goto e_final_wa; ret = ccp_set_dm_area(&tag, 0, p_tag, 0, authsize); - if (ret) - goto e_tag; + if (ret) { + ccp_dm_free(&tag); + goto e_final_wa; + }
ret = crypto_memneq(tag.address, final_wa.address, authsize) ? -EBADMSG : 0; ccp_dm_free(&tag); }
-e_tag: +e_final_wa: ccp_dm_free(&final_wa);
e_dst:
From: Al Viro viro@zeniv.linux.org.uk
[ Upstream commit 50e43a57334400668952f8e551c9d87d3ed2dfef ]
We get there when sigreturn has performed obscene acts on kernel stack; in particular, the location of pt_regs has shifted. We are about to call syscall_trace(), which might stop for tracer. If that happens, we'd better have task_pt_regs() returning correct result...
Fucked-up-by: Al Viro viro@zeniv.linux.org.uk Fixes: bd6f56a75bb2 ("m68k: Missing syscall_trace() on sigreturn") Signed-off-by: Al Viro viro@zeniv.linux.org.uk Tested-by: Michael Schmitz schmitzmic@gmail.com Reviewed-by: Michael Schmitz schmitzmic@gmail.com Tested-by: Finn Thain fthain@linux-m68k.org Link: https://lore.kernel.org/r/YP2dMWeV1LkHiOpr@zeniv-ca.linux.org.uk Signed-off-by: Geert Uytterhoeven geert@linux-m68k.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/m68k/kernel/entry.S | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/arch/m68k/kernel/entry.S b/arch/m68k/kernel/entry.S index 9dd76fbb7c6b..ff9e842cec0f 100644 --- a/arch/m68k/kernel/entry.S +++ b/arch/m68k/kernel/entry.S @@ -186,6 +186,8 @@ ENTRY(ret_from_signal) movel %curptr@(TASK_STACK),%a1 tstb %a1@(TINFO_FLAGS+2) jge 1f + lea %sp@(SWITCH_STACK_SIZE),%a1 + movel %a1,%curptr@(TASK_THREAD+THREAD_ESP0) jbsr syscall_trace 1: RESTORE_SWITCH_STACK addql #4,%sp
From: Randy Dunlap rdunlap@infradead.org
[ Upstream commit adfc8f9d2f9fefd880abc82cfbf62cbfe6539c97 ]
SERIAL_CORE_CONSOLE depends on TTY so EARLY_PRINTK should also depend on TTY so that it does not select SERIAL_CORE_CONSOLE inadvertently.
WARNING: unmet direct dependencies detected for SERIAL_CORE_CONSOLE Depends on [n]: TTY [=n] && HAS_IOMEM [=y] Selected by [y]: - EARLY_PRINTK [=y]
Fixes: e8bf5bc776ed ("nios2: add early printk support") Signed-off-by: Randy Dunlap rdunlap@infradead.org Cc: Dinh Nguyen dinguyen@kernel.org Signed-off-by: Dinh Nguyen dinguyen@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/nios2/Kconfig.debug | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/arch/nios2/Kconfig.debug b/arch/nios2/Kconfig.debug index a8bc06e96ef5..ca1beb87f987 100644 --- a/arch/nios2/Kconfig.debug +++ b/arch/nios2/Kconfig.debug @@ -3,9 +3,10 @@ config EARLY_PRINTK bool "Activate early kernel debugging" default y + depends on TTY select SERIAL_CORE_CONSOLE help - Enable early printk on console + Enable early printk on console. This is useful for kernel debugging when your machine crashes very early before the console code is initialized. You should normally say N here, unless you want to debug such a crash.
From: Marco Elver elver@google.com
[ Upstream commit fa360beac4b62d54879a88b182afef4b369c9700 ]
In the main KASAN config option CC_HAS_WORKING_NOSANITIZE_ADDRESS is checked for instrumentation-based modes. However, if HAVE_ARCH_KASAN_HW_TAGS is true all modes may still be selected.
To fix, also make the software modes depend on CC_HAS_WORKING_NOSANITIZE_ADDRESS.
Link: https://lkml.kernel.org/r/20210910084240.1215803-1-elver@google.com Fixes: 6a63a63ff1ac ("kasan: introduce CONFIG_KASAN_HW_TAGS") Signed-off-by: Marco Elver elver@google.com Cc: Andrey Ryabinin ryabinin.a.a@gmail.com Cc: Alexander Potapenko glider@google.com Cc: Andrey Konovalov andreyknvl@gmail.com Cc: Dmitry Vyukov dvyukov@google.com Cc: Aleksandr Nogikh nogikh@google.com Cc: Taras Madan tarasmadan@google.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- lib/Kconfig.kasan | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/lib/Kconfig.kasan b/lib/Kconfig.kasan index 1e2d10f86011..cdc842d090db 100644 --- a/lib/Kconfig.kasan +++ b/lib/Kconfig.kasan @@ -66,6 +66,7 @@ choice config KASAN_GENERIC bool "Generic mode" depends on HAVE_ARCH_KASAN && CC_HAS_KASAN_GENERIC + depends on CC_HAS_WORKING_NOSANITIZE_ADDRESS select SLUB_DEBUG if SLUB select CONSTRUCTORS help @@ -86,6 +87,7 @@ config KASAN_GENERIC config KASAN_SW_TAGS bool "Software tag-based mode" depends on HAVE_ARCH_KASAN_SW_TAGS && CC_HAS_KASAN_SW_TAGS + depends on CC_HAS_WORKING_NOSANITIZE_ADDRESS select SLUB_DEBUG if SLUB select CONSTRUCTORS help
From: Evgeny Novikov novikov@ispras.ru
[ Upstream commit d46ef750ed58cbeeba2d9a55c99231c30a172764 ]
devm_add_action_or_reset() can suddenly invoke amd_mp2_pci_remove() at registration that will cause NULL pointer dereference since corresponding data is not initialized yet. The patch moves initialization of data before devm_add_action_or_reset().
Found by Linux Driver Verification project (linuxtesting.org).
[jkosina@suse.cz: rebase] Signed-off-by: Evgeny Novikov novikov@ispras.ru Acked-by: Basavaraj Natikar Basavaraj.Natikar@amd.com Signed-off-by: Jiri Kosina jkosina@suse.cz Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/hid/amd-sfh-hid/amd_sfh_pcie.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/drivers/hid/amd-sfh-hid/amd_sfh_pcie.c b/drivers/hid/amd-sfh-hid/amd_sfh_pcie.c index 8d68796aa905..4069b813c6c3 100644 --- a/drivers/hid/amd-sfh-hid/amd_sfh_pcie.c +++ b/drivers/hid/amd-sfh-hid/amd_sfh_pcie.c @@ -235,6 +235,10 @@ static int amd_mp2_pci_probe(struct pci_dev *pdev, const struct pci_device_id *i return rc; }
+ rc = amd_sfh_hid_client_init(privdata); + if (rc) + return rc; + privdata->cl_data = devm_kzalloc(&pdev->dev, sizeof(struct amdtp_cl_data), GFP_KERNEL); if (!privdata->cl_data) return -ENOMEM; @@ -245,7 +249,7 @@ static int amd_mp2_pci_probe(struct pci_dev *pdev, const struct pci_device_id *i
mp2_select_ops(privdata);
- return amd_sfh_hid_client_init(privdata); + return 0; }
static const struct pci_device_id amd_mp2_pci_tbl[] = {
From: Ian Rogers irogers@google.com
[ Upstream commit 5c34aea341b16e29fde6e6c8d4b18866cd99754d ]
To ensure the stack frames are on the stack tail calls optimizations need to be inhibited. If your compiler supports an attribute use it, otherwise use an asm volatile barrier.
The barrier fix was suggested here: https://lore.kernel.org/lkml/20201028081123.GT2628@hirez.programming.kicks-a... Tested with an optimized clang build and by forcing the asm barrier route with an optimized clang build.
A GCC bug tracking a proper disable_tail_calls is: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97831
Fixes: 9ae1e990f1ab ("perf tools: Remove broken __no_tail_call attribute")
v2. is a rebase. The original fix patch generated quite a lot of discussion over the right place for the fix: https://lore.kernel.org/lkml/20201114000803.909530-1-irogers@google.com/ The patch reflects my preference of it being near the use, so that future code cleanups don't break this somewhat special usage.
Signed-off-by: Ian Rogers irogers@google.com Acked-by: Jiri Olsa jolsa@redhat.com Cc: Alexander Shishkin alexander.shishkin@linux.intel.com Cc: Ard Biesheuvel ardb@kernel.org Cc: Mark Rutland mark.rutland@arm.com Cc: Miguel Ojeda ojeda@kernel.org Cc: Namhyung Kim namhyung@kernel.org Cc: Peter Zijlstra peterz@infradead.org Cc: Stephane Eranian eranian@google.com Cc: clang-built-linux@googlegroups.com Link: http://lore.kernel.org/lkml/20210922173812.456348-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/perf/tests/dwarf-unwind.c | 39 +++++++++++++++++++++++++++------ 1 file changed, 32 insertions(+), 7 deletions(-)
diff --git a/tools/perf/tests/dwarf-unwind.c b/tools/perf/tests/dwarf-unwind.c index a288035eb362..c756284b3b13 100644 --- a/tools/perf/tests/dwarf-unwind.c +++ b/tools/perf/tests/dwarf-unwind.c @@ -20,6 +20,23 @@ /* For bsearch. We try to unwind functions in shared object. */ #include <stdlib.h>
+/* + * The test will assert frames are on the stack but tail call optimizations lose + * the frame of the caller. Clang can disable this optimization on a called + * function but GCC currently (11/2020) lacks this attribute. The barrier is + * used to inhibit tail calls in these cases. + */ +#ifdef __has_attribute +#if __has_attribute(disable_tail_calls) +#define NO_TAIL_CALL_ATTRIBUTE __attribute__((disable_tail_calls)) +#define NO_TAIL_CALL_BARRIER +#endif +#endif +#ifndef NO_TAIL_CALL_ATTRIBUTE +#define NO_TAIL_CALL_ATTRIBUTE +#define NO_TAIL_CALL_BARRIER __asm__ __volatile__("" : : : "memory"); +#endif + static int mmap_handler(struct perf_tool *tool __maybe_unused, union perf_event *event, struct perf_sample *sample, @@ -91,7 +108,7 @@ static int unwind_entry(struct unwind_entry *entry, void *arg) return strcmp((const char *) symbol, funcs[idx]); }
-noinline int test_dwarf_unwind__thread(struct thread *thread) +NO_TAIL_CALL_ATTRIBUTE noinline int test_dwarf_unwind__thread(struct thread *thread) { struct perf_sample sample; unsigned long cnt = 0; @@ -122,7 +139,7 @@ noinline int test_dwarf_unwind__thread(struct thread *thread)
static int global_unwind_retval = -INT_MAX;
-noinline int test_dwarf_unwind__compare(void *p1, void *p2) +NO_TAIL_CALL_ATTRIBUTE noinline int test_dwarf_unwind__compare(void *p1, void *p2) { /* Any possible value should be 'thread' */ struct thread *thread = *(struct thread **)p1; @@ -141,7 +158,7 @@ noinline int test_dwarf_unwind__compare(void *p1, void *p2) return p1 - p2; }
-noinline int test_dwarf_unwind__krava_3(struct thread *thread) +NO_TAIL_CALL_ATTRIBUTE noinline int test_dwarf_unwind__krava_3(struct thread *thread) { struct thread *array[2] = {thread, thread}; void *fp = &bsearch; @@ -160,14 +177,22 @@ noinline int test_dwarf_unwind__krava_3(struct thread *thread) return global_unwind_retval; }
-noinline int test_dwarf_unwind__krava_2(struct thread *thread) +NO_TAIL_CALL_ATTRIBUTE noinline int test_dwarf_unwind__krava_2(struct thread *thread) { - return test_dwarf_unwind__krava_3(thread); + int ret; + + ret = test_dwarf_unwind__krava_3(thread); + NO_TAIL_CALL_BARRIER; + return ret; }
-noinline int test_dwarf_unwind__krava_1(struct thread *thread) +NO_TAIL_CALL_ATTRIBUTE noinline int test_dwarf_unwind__krava_1(struct thread *thread) { - return test_dwarf_unwind__krava_2(thread); + int ret; + + ret = test_dwarf_unwind__krava_2(thread); + NO_TAIL_CALL_BARRIER; + return ret; }
int test__dwarf_unwind(struct test *test __maybe_unused, int subtest __maybe_unused)
From: Like Xu likexu@tencent.com
[ Upstream commit e4fe5d7349e0b1c0d3da5b6b3e1efce591e85bd2 ]
An iostate use case like "perf iostat 0000:16,0000:97 -- ls" should be implemented to work in system-wide mode to ensure that the output from print_header() is consistent with the user documentation perf-iostat.txt, rather than incorrectly assuming that the kernel does not support it:
Error: The sys_perf_event_open() syscall returned with 22 (Invalid argument) \ for event (uncore_iio_0/event=0x83,umask=0x04,ch_mask=0xF,fc_mask=0x07/). /bin/dmesg | grep -i perf may provide additional information.
This error is easily fixed by assigning system-wide mode by default for IOSTAT_RUN only when the target cpu_list is unspecified.
Fixes: f07952b179697771 ("perf stat: Basic support for iostat in perf") Signed-off-by: Like Xu likexu@tencent.com Cc: Alexander Antonov alexander.antonov@linux.intel.com Cc: Alexander Shishkin alexander.shishkin@linux.intel.com Cc: Ian Rogers irogers@google.com Cc: Jiri Olsa jolsa@redhat.com Cc: Mark Rutland mark.rutland@arm.com Cc: Namhyung Kim namhyung@kernel.org Cc: Peter Zijlstra peterz@infradead.org Cc: Stephane Eranian eranian@google.com Link: http://lore.kernel.org/lkml/20210927081115.39568-1-likexu@tencent.com Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/perf/builtin-stat.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c index 634375937db9..36033a7372f9 100644 --- a/tools/perf/builtin-stat.c +++ b/tools/perf/builtin-stat.c @@ -2406,6 +2406,8 @@ int cmd_stat(int argc, const char **argv) goto out; } else if (verbose) iostat_list(evsel_list, &stat_config); + if (iostat_mode == IOSTAT_RUN && !target__has_cpu(&target)) + target.system_wide = true; }
if (add_default_attributes())
From: Like Xu likexu@tencent.com
[ Upstream commit 4da8b121884d84476f3d50d46a471471af1aa9df ]
If the 'perf iostat' user specifies two or more iio_root_ports and also specifies the cpu(s) by -C which is not *connected to all* the above iio ports, the iostat_print_metric() will run into trouble:
For example:
$ perf iostat list S0-uncore_iio_0<0000:16> S1-uncore_iio_0<0000:97> # <--- CPU 1 is located in the socket S0
$ perf iostat 0000:16,0000:97 -C 1 -- ls port Inbound Read(MB) Inbound Write(MB) Outbound Read(MB) Outbound Write(MB) ../perf-iostat: line 12: 104418 Segmentation fault (core dumped) perf stat --iostat$DELIMITER$*
The core-dump stack says, in the above corner case, the returned (struct perf_counts_values *) count will be NULL, and the caller iostat_print_metric() apparently doesn't not handle this case.
433 struct perf_counts_values *count = perf_counts(evsel->counts, die, 0); 434 435 if (count->run && count->ena) { (gdb) p count $1 = (struct perf_counts_values *) 0x0
The deeper reason is that there are actually no statistics from the user specified pair "iostat 0000:X, -C (disconnected) Y ", but let's fix it with minimum cost by adding a NULL check in the user space.
Fixes: f9ed693e8bc0e7de ("perf stat: Enable iostat mode for x86 platforms") Signed-off-by: Like Xu likexu@tencent.com Cc: Alexander Antonov alexander.antonov@linux.intel.com Cc: Alexander Shishkin alexander.shishkin@linux.intel.com Cc: Ian Rogers irogers@google.com Cc: Jiri Olsa jolsa@redhat.com Cc: Mark Rutland mark.rutland@arm.com Cc: Namhyung Kim namhyung@kernel.org Cc: Peter Zijlstra peterz@infradead.org Cc: Stephane Eranian eranian@google.com Link: http://lore.kernel.org/lkml/20210927081115.39568-2-likexu@tencent.com Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/perf/arch/x86/util/iostat.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/perf/arch/x86/util/iostat.c b/tools/perf/arch/x86/util/iostat.c index eeafe97b8105..792cd75ade33 100644 --- a/tools/perf/arch/x86/util/iostat.c +++ b/tools/perf/arch/x86/util/iostat.c @@ -432,7 +432,7 @@ void iostat_print_metric(struct perf_stat_config *config, struct evsel *evsel, u8 die = ((struct iio_root_port *)evsel->priv)->die; struct perf_counts_values *count = perf_counts(evsel->counts, die, 0);
- if (count->run && count->ena) { + if (count && count->run && count->ena) { if (evsel->prev_raw_counts && !out->force_header) { struct perf_counts_values *prev_count = perf_counts(evsel->prev_raw_counts, die, 0);
From: Jackie Liu liuyun01@kylinos.cn
[ Upstream commit c388a18957efdf31db8e97ec4d2d4b7dc1ca9a44 ]
Compiling sb_watchdog needs to clearly define SIBYTE_HDR_FEATURES. In arch/mips/sibyte/Platform like:
cflags-$(CONFIG_SIBYTE_BCM112X) += \ -I$(srctree)/arch/mips/include/asm/mach-sibyte \ -DSIBYTE_HDR_FEATURES=SIBYTE_HDR_FMASK_1250_112x_ALL
Otherwise, SIBYTE_HDR_FEATURES is SIBYTE_HDR_FMASK_ALL. SIBYTE_HDR_FMASK_ALL is mean:
#define SIBYTE_HDR_FMASK_ALL SIBYTE_HDR_FMASK_1250_ALL | SIBYTE_HDR_FMASK_112x_ALL \ | SIBYTE_HDR_FMASK_1480_ALL)
So, If not limited to CPU_SB1, we will get such an error:
arch/mips/include/asm/sibyte/bcm1480_scd.h:261: error: "M_SPC_CFG_CLEAR" redefined [-Werror] arch/mips/include/asm/sibyte/bcm1480_scd.h:262: error: "M_SPC_CFG_ENABLE" redefined [-Werror]
Fixes: da2a68b3eb47 ("watchdog: Enable COMPILE_TEST where possible") Signed-off-by: Jackie Liu liuyun01@kylinos.cn Reviewed-by: Guenter Roeck linux@roeck-us.net Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/watchdog/Kconfig | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/watchdog/Kconfig b/drivers/watchdog/Kconfig index 546dfc1e2349..71cf3f503f16 100644 --- a/drivers/watchdog/Kconfig +++ b/drivers/watchdog/Kconfig @@ -1677,7 +1677,7 @@ config WDT_MTX1
config SIBYTE_WDOG tristate "Sibyte SoC hardware watchdog" - depends on CPU_SB1 || (MIPS && COMPILE_TEST) + depends on CPU_SB1 help Watchdog driver for the built in watchdog hardware in Sibyte SoC processors. There are apparently two watchdog timers
From: Igor Matheus Andrade Torrente igormtorrente@gmail.com
[ Upstream commit 3b0c406124719b625b1aba431659f5cdc24a982c ]
This issue happens when a userspace program does an ioctl FBIOPUT_VSCREENINFO passing the fb_var_screeninfo struct containing only the fields xres, yres, and bits_per_pixel with values.
If this struct is the same as the previous ioctl, the vc_resize() detects it and doesn't call the resize_screen(), leaving the fb_var_screeninfo incomplete. And this leads to the updatescrollmode() calculates a wrong value to fbcon_display->vrows, which makes the real_y() return a wrong value of y, and that value, eventually, causes the imageblit to access an out-of-bound address value.
To solve this issue I made the resize_screen() be called even if the screen does not need any resizing, so it will "fix and fill" the fb_var_screeninfo independently.
Cc: stable stable@vger.kernel.org # after 5.15-rc2 is out, give it time to bake Reported-and-tested-by: syzbot+858dc7a2f7ef07c2c219@syzkaller.appspotmail.com Signed-off-by: Igor Matheus Andrade Torrente igormtorrente@gmail.com Link: https://lore.kernel.org/r/20210628134509.15895-1-igormtorrente@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/tty/vt/vt.c | 21 +++++++++++++++++++-- 1 file changed, 19 insertions(+), 2 deletions(-)
diff --git a/drivers/tty/vt/vt.c b/drivers/tty/vt/vt.c index cb72393f92d3..153d4a88ec9a 100644 --- a/drivers/tty/vt/vt.c +++ b/drivers/tty/vt/vt.c @@ -1219,8 +1219,25 @@ static int vc_do_resize(struct tty_struct *tty, struct vc_data *vc, new_row_size = new_cols << 1; new_screen_size = new_row_size * new_rows;
- if (new_cols == vc->vc_cols && new_rows == vc->vc_rows) - return 0; + if (new_cols == vc->vc_cols && new_rows == vc->vc_rows) { + /* + * This function is being called here to cover the case + * where the userspace calls the FBIOPUT_VSCREENINFO twice, + * passing the same fb_var_screeninfo containing the fields + * yres/xres equal to a number non-multiple of vc_font.height + * and yres_virtual/xres_virtual equal to number lesser than the + * vc_font.height and yres/xres. + * In the second call, the struct fb_var_screeninfo isn't + * being modified by the underlying driver because of the + * if above, and this causes the fbcon_display->vrows to become + * negative and it eventually leads to out-of-bound + * access by the imageblit function. + * To give the correct values to the struct and to not have + * to deal with possible errors from the code below, we call + * the resize_screen here as well. + */ + return resize_screen(vc, new_cols, new_rows, user); + }
if (new_screen_size > KMALLOC_MAX_SIZE || !new_screen_size) return -EINVAL;
From: Kevin Hao haokexin@gmail.com
[ Upstream commit e5c6b312ce3cc97e90ea159446e6bfa06645364d ]
The struct sugov_tunables is protected by the kobject, so we can't free it directly. Otherwise we would get a call trace like this: ODEBUG: free active (active state 0) object type: timer_list hint: delayed_work_timer_fn+0x0/0x30 WARNING: CPU: 3 PID: 720 at lib/debugobjects.c:505 debug_print_object+0xb8/0x100 Modules linked in: CPU: 3 PID: 720 Comm: a.sh Tainted: G W 5.14.0-rc1-next-20210715-yocto-standard+ #507 Hardware name: Marvell OcteonTX CN96XX board (DT) pstate: 40400009 (nZcv daif +PAN -UAO -TCO BTYPE=--) pc : debug_print_object+0xb8/0x100 lr : debug_print_object+0xb8/0x100 sp : ffff80001ecaf910 x29: ffff80001ecaf910 x28: ffff00011b10b8d0 x27: ffff800011043d80 x26: ffff00011a8f0000 x25: ffff800013cb3ff0 x24: 0000000000000000 x23: ffff80001142aa68 x22: ffff800011043d80 x21: ffff00010de46f20 x20: ffff800013c0c520 x19: ffff800011d8f5b0 x18: 0000000000000010 x17: 6e6968207473696c x16: 5f72656d6974203a x15: 6570797420746365 x14: 6a626f2029302065 x13: 303378302f307830 x12: 2b6e665f72656d69 x11: ffff8000124b1560 x10: ffff800012331520 x9 : ffff8000100ca6b0 x8 : 000000000017ffe8 x7 : c0000000fffeffff x6 : 0000000000000001 x5 : ffff800011d8c000 x4 : ffff800011d8c740 x3 : 0000000000000000 x2 : ffff0001108301c0 x1 : ab3c90eedf9c0f00 x0 : 0000000000000000 Call trace: debug_print_object+0xb8/0x100 __debug_check_no_obj_freed+0x1c0/0x230 debug_check_no_obj_freed+0x20/0x88 slab_free_freelist_hook+0x154/0x1c8 kfree+0x114/0x5d0 sugov_exit+0xbc/0xc0 cpufreq_exit_governor+0x44/0x90 cpufreq_set_policy+0x268/0x4a8 store_scaling_governor+0xe0/0x128 store+0xc0/0xf0 sysfs_kf_write+0x54/0x80 kernfs_fop_write_iter+0x128/0x1c0 new_sync_write+0xf0/0x190 vfs_write+0x2d4/0x478 ksys_write+0x74/0x100 __arm64_sys_write+0x24/0x30 invoke_syscall.constprop.0+0x54/0xe0 do_el0_svc+0x64/0x158 el0_svc+0x2c/0xb0 el0t_64_sync_handler+0xb0/0xb8 el0t_64_sync+0x198/0x19c irq event stamp: 5518 hardirqs last enabled at (5517): [<ffff8000100cbd7c>] console_unlock+0x554/0x6c8 hardirqs last disabled at (5518): [<ffff800010fc0638>] el1_dbg+0x28/0xa0 softirqs last enabled at (5504): [<ffff8000100106e0>] __do_softirq+0x4d0/0x6c0 softirqs last disabled at (5483): [<ffff800010049548>] irq_exit+0x1b0/0x1b8
So split the original sugov_tunables_free() into two functions, sugov_clear_global_tunables() is just used to clear the global_tunables and the new sugov_tunables_free() is used as kobj_type::release to release the sugov_tunables safely.
Fixes: 9bdcb44e391d ("cpufreq: schedutil: New governor based on scheduler utilization data") Cc: 4.7+ stable@vger.kernel.org # 4.7+ Signed-off-by: Kevin Hao haokexin@gmail.com Acked-by: Viresh Kumar viresh.kumar@linaro.org Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/sched/cpufreq_schedutil.c | 16 +++++++++++----- 1 file changed, 11 insertions(+), 5 deletions(-)
diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c index 57124614363d..e7af18857371 100644 --- a/kernel/sched/cpufreq_schedutil.c +++ b/kernel/sched/cpufreq_schedutil.c @@ -537,9 +537,17 @@ static struct attribute *sugov_attrs[] = { }; ATTRIBUTE_GROUPS(sugov);
+static void sugov_tunables_free(struct kobject *kobj) +{ + struct gov_attr_set *attr_set = container_of(kobj, struct gov_attr_set, kobj); + + kfree(to_sugov_tunables(attr_set)); +} + static struct kobj_type sugov_tunables_ktype = { .default_groups = sugov_groups, .sysfs_ops = &governor_sysfs_ops, + .release = &sugov_tunables_free, };
/********************** cpufreq governor interface *********************/ @@ -639,12 +647,10 @@ static struct sugov_tunables *sugov_tunables_alloc(struct sugov_policy *sg_polic return tunables; }
-static void sugov_tunables_free(struct sugov_tunables *tunables) +static void sugov_clear_global_tunables(void) { if (!have_governor_per_policy()) global_tunables = NULL; - - kfree(tunables); }
static int sugov_init(struct cpufreq_policy *policy) @@ -707,7 +713,7 @@ out: fail: kobject_put(&tunables->attr_set.kobj); policy->governor_data = NULL; - sugov_tunables_free(tunables); + sugov_clear_global_tunables();
stop_kthread: sugov_kthread_stop(sg_policy); @@ -734,7 +740,7 @@ static void sugov_exit(struct cpufreq_policy *policy) count = gov_attr_set_put(&tunables->attr_set, &sg_policy->tunables_hook); policy->governor_data = NULL; if (!count) - sugov_tunables_free(tunables); + sugov_clear_global_tunables();
mutex_unlock(&global_tunables_lock);
From: Saurav Kashyap skashyap@marvell.com
[ Upstream commit 4a0a542fe5e4273baf9228459ef3f75c29490cba ]
The MSI-X and MSI calls fails in kdump kernel. Because of this qla2xxx_create_qpair() fails leading to .create_queue callback failure. The fix is to return existing qpair instead of allocating new one and allocate a single hw queue.
[ 19.975838] qla2xxx [0000:d8:00.1]-00c7:11: MSI-X: Failed to enable support, giving up -- 16/-28. [ 19.984885] qla2xxx [0000:d8:00.1]-0037:11: Falling back-to MSI mode -- ret=-28. [ 19.992278] qla2xxx [0000:d8:00.1]-0039:11: Falling back-to INTa mode -- ret=-28. .. .. .. [ 21.141518] qla2xxx [0000:d8:00.0]-2104:2: qla_nvme_alloc_queue: handle 00000000e7ee499d, idx =1, qsize 32 [ 21.151166] qla2xxx [0000:d8:00.0]-0181:2: FW/Driver is not multi-queue capable. [ 21.158558] qla2xxx [0000:d8:00.0]-2122:2: Failed to allocate qpair [ 21.164824] nvme nvme0: NVME-FC{0}: reset: Reconnect attempt failed (-22) [ 21.171612] nvme nvme0: NVME-FC{0}: Reconnect attempt in 2 seconds
Link: https://lore.kernel.org/r/20210810043720.1137-13-njavali@marvell.com Cc: stable@vger.kernel.org Reviewed-by: Himanshu Madhani himanshu.madhani@oracle.com Signed-off-by: Saurav Kashyap skashyap@marvell.com Signed-off-by: Nilesh Javali njavali@marvell.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/qla2xxx/qla_def.h | 1 - drivers/scsi/qla2xxx/qla_isr.c | 2 ++ drivers/scsi/qla2xxx/qla_nvme.c | 40 +++++++++++++++------------------ 3 files changed, 20 insertions(+), 23 deletions(-)
diff --git a/drivers/scsi/qla2xxx/qla_def.h b/drivers/scsi/qla2xxx/qla_def.h index 2f67ec1df3e6..82b6f4c2eb4a 100644 --- a/drivers/scsi/qla2xxx/qla_def.h +++ b/drivers/scsi/qla2xxx/qla_def.h @@ -3935,7 +3935,6 @@ struct qla_hw_data { uint32_t scm_supported_f:1; /* Enabled in Driver */ uint32_t scm_enabled:1; - uint32_t max_req_queue_warned:1; uint32_t plogi_template_valid:1; uint32_t port_isolated:1; } flags; diff --git a/drivers/scsi/qla2xxx/qla_isr.c b/drivers/scsi/qla2xxx/qla_isr.c index d9fb093a60a1..2aa8f519aae6 100644 --- a/drivers/scsi/qla2xxx/qla_isr.c +++ b/drivers/scsi/qla2xxx/qla_isr.c @@ -4201,6 +4201,8 @@ skip_msi: ql_dbg(ql_dbg_init, vha, 0x0125, "INTa mode: Enabled.\n"); ha->flags.mr_intr_valid = 1; + /* Set max_qpair to 0, as MSI-X and MSI in not enabled */ + ha->max_qpairs = 0; }
clear_risc_ints: diff --git a/drivers/scsi/qla2xxx/qla_nvme.c b/drivers/scsi/qla2xxx/qla_nvme.c index a7259733e470..9316d7d91e2a 100644 --- a/drivers/scsi/qla2xxx/qla_nvme.c +++ b/drivers/scsi/qla2xxx/qla_nvme.c @@ -109,19 +109,24 @@ static int qla_nvme_alloc_queue(struct nvme_fc_local_port *lport, return -EINVAL; }
- if (ha->queue_pair_map[qidx]) { - *handle = ha->queue_pair_map[qidx]; - ql_log(ql_log_info, vha, 0x2121, - "Returning existing qpair of %p for idx=%x\n", - *handle, qidx); - return 0; - } + /* Use base qpair if max_qpairs is 0 */ + if (!ha->max_qpairs) { + qpair = ha->base_qpair; + } else { + if (ha->queue_pair_map[qidx]) { + *handle = ha->queue_pair_map[qidx]; + ql_log(ql_log_info, vha, 0x2121, + "Returning existing qpair of %p for idx=%x\n", + *handle, qidx); + return 0; + }
- qpair = qla2xxx_create_qpair(vha, 5, vha->vp_idx, true); - if (qpair == NULL) { - ql_log(ql_log_warn, vha, 0x2122, - "Failed to allocate qpair\n"); - return -EINVAL; + qpair = qla2xxx_create_qpair(vha, 5, vha->vp_idx, true); + if (!qpair) { + ql_log(ql_log_warn, vha, 0x2122, + "Failed to allocate qpair\n"); + return -EINVAL; + } } *handle = qpair;
@@ -728,18 +733,9 @@ int qla_nvme_register_hba(struct scsi_qla_host *vha)
WARN_ON(vha->nvme_local_port);
- if (ha->max_req_queues < 3) { - if (!ha->flags.max_req_queue_warned) - ql_log(ql_log_info, vha, 0x2120, - "%s: Disabling FC-NVME due to lack of free queue pairs (%d).\n", - __func__, ha->max_req_queues); - ha->flags.max_req_queue_warned = 1; - return ret; - } - qla_nvme_fc_transport.max_hw_queues = min((uint8_t)(qla_nvme_fc_transport.max_hw_queues), - (uint8_t)(ha->max_req_queues - 2)); + (uint8_t)(ha->max_qpairs ? ha->max_qpairs : 1));
pinfo.node_name = wwn_to_u64(vha->node_name); pinfo.port_name = wwn_to_u64(vha->port_name);
From: Likun Gao Likun.Gao@amd.com
[ Upstream commit 8d35a2596164c1c9d34d4656fd42b445cd1e247f ]
Fence driver was enabled per ring when sw init on per IP block before. Change to enable all the fence driver at the same time after amdgpu_device_ip_init finished. Rename some function related to fence to make it reasonable for read.
Signed-off-by: Likun Gao Likun.Gao@amd.com Reviewed-by: Hawking Zhang Hawking.Zhang@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 10 +++-- drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 44 +++------------------- drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h | 7 ++-- 3 files changed, 14 insertions(+), 47 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index 7b42636fc7dc..112add12707d 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -3631,6 +3631,8 @@ fence_driver_init: goto release_ras_con; }
+ amdgpu_fence_driver_hw_init(adev); + dev_info(adev->dev, "SE %d, SH per SE %d, CU per SH %d, active_cu_number %d\n", adev->gfx.config.max_shader_engines, @@ -3798,7 +3800,7 @@ void amdgpu_device_fini_hw(struct amdgpu_device *adev) else drm_atomic_helper_shutdown(adev_to_drm(adev)); } - amdgpu_fence_driver_fini_hw(adev); + amdgpu_fence_driver_hw_fini(adev);
if (adev->pm_sysfs_en) amdgpu_pm_sysfs_fini(adev); @@ -3820,7 +3822,7 @@ void amdgpu_device_fini_hw(struct amdgpu_device *adev) void amdgpu_device_fini_sw(struct amdgpu_device *adev) { amdgpu_device_ip_fini(adev); - amdgpu_fence_driver_fini_sw(adev); + amdgpu_fence_driver_sw_fini(adev); release_firmware(adev->firmware.gpu_info_fw); adev->firmware.gpu_info_fw = NULL; adev->accel_working = false; @@ -3895,7 +3897,7 @@ int amdgpu_device_suspend(struct drm_device *dev, bool fbcon) /* evict vram memory */ amdgpu_bo_evict_vram(adev);
- amdgpu_fence_driver_suspend(adev); + amdgpu_fence_driver_hw_fini(adev);
amdgpu_device_ip_suspend_phase2(adev); /* evict remaining vram memory @@ -3940,7 +3942,7 @@ int amdgpu_device_resume(struct drm_device *dev, bool fbcon) dev_err(adev->dev, "amdgpu_device_ip_resume failed (%d).\n", r); return r; } - amdgpu_fence_driver_resume(adev); + amdgpu_fence_driver_hw_init(adev);
r = amdgpu_device_ip_late_init(adev); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c index 72d9b92b1754..49c5c7331c53 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c @@ -417,9 +417,6 @@ int amdgpu_fence_driver_start_ring(struct amdgpu_ring *ring, } amdgpu_fence_write(ring, atomic_read(&ring->fence_drv.last_seq));
- if (irq_src) - amdgpu_irq_get(adev, irq_src, irq_type); - ring->fence_drv.irq_src = irq_src; ring->fence_drv.irq_type = irq_type; ring->fence_drv.initialized = true; @@ -525,7 +522,7 @@ int amdgpu_fence_driver_init(struct amdgpu_device *adev) * * Tear down the fence driver for all possible rings (all asics). */ -void amdgpu_fence_driver_fini_hw(struct amdgpu_device *adev) +void amdgpu_fence_driver_hw_fini(struct amdgpu_device *adev) { int i, r;
@@ -553,7 +550,7 @@ void amdgpu_fence_driver_fini_hw(struct amdgpu_device *adev) } }
-void amdgpu_fence_driver_fini_sw(struct amdgpu_device *adev) +void amdgpu_fence_driver_sw_fini(struct amdgpu_device *adev) { unsigned int i, j;
@@ -572,49 +569,18 @@ void amdgpu_fence_driver_fini_sw(struct amdgpu_device *adev) }
/** - * amdgpu_fence_driver_suspend - suspend the fence driver - * for all possible rings. - * - * @adev: amdgpu device pointer - * - * Suspend the fence driver for all possible rings (all asics). - */ -void amdgpu_fence_driver_suspend(struct amdgpu_device *adev) -{ - int i, r; - - for (i = 0; i < AMDGPU_MAX_RINGS; i++) { - struct amdgpu_ring *ring = adev->rings[i]; - if (!ring || !ring->fence_drv.initialized) - continue; - - /* wait for gpu to finish processing current batch */ - r = amdgpu_fence_wait_empty(ring); - if (r) { - /* delay GPU reset to resume */ - amdgpu_fence_driver_force_completion(ring); - } - - /* disable the interrupt */ - if (ring->fence_drv.irq_src) - amdgpu_irq_put(adev, ring->fence_drv.irq_src, - ring->fence_drv.irq_type); - } -} - -/** - * amdgpu_fence_driver_resume - resume the fence driver + * amdgpu_fence_driver_hw_init - enable the fence driver * for all possible rings. * * @adev: amdgpu device pointer * - * Resume the fence driver for all possible rings (all asics). + * Enable the fence driver for all possible rings (all asics). * Not all asics have all rings, so each asic will only * start the fence driver on the rings it has using * amdgpu_fence_driver_start_ring(). * Returns 0 for success. */ -void amdgpu_fence_driver_resume(struct amdgpu_device *adev) +void amdgpu_fence_driver_hw_init(struct amdgpu_device *adev) { int i;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h index e7d3d0dbdd96..27adffa7658d 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h @@ -107,8 +107,6 @@ struct amdgpu_fence_driver { };
int amdgpu_fence_driver_init(struct amdgpu_device *adev); -void amdgpu_fence_driver_fini_hw(struct amdgpu_device *adev); -void amdgpu_fence_driver_fini_sw(struct amdgpu_device *adev); void amdgpu_fence_driver_force_completion(struct amdgpu_ring *ring);
int amdgpu_fence_driver_init_ring(struct amdgpu_ring *ring, @@ -117,8 +115,9 @@ int amdgpu_fence_driver_init_ring(struct amdgpu_ring *ring, int amdgpu_fence_driver_start_ring(struct amdgpu_ring *ring, struct amdgpu_irq_src *irq_src, unsigned irq_type); -void amdgpu_fence_driver_suspend(struct amdgpu_device *adev); -void amdgpu_fence_driver_resume(struct amdgpu_device *adev); +void amdgpu_fence_driver_hw_fini(struct amdgpu_device *adev); +void amdgpu_fence_driver_sw_fini(struct amdgpu_device *adev); +void amdgpu_fence_driver_hw_init(struct amdgpu_device *adev); int amdgpu_fence_emit(struct amdgpu_ring *ring, struct dma_fence **fence, unsigned flags); int amdgpu_fence_emit_polling(struct amdgpu_ring *ring, uint32_t *s,
From: Guchun Chen guchun.chen@amd.com
[ Upstream commit 067f44c8b4590c3f24d21a037578a478590f2175 ]
In amdgpu_fence_driver_hw_fini, no need to call drm_sched_fini to stop scheduler in s3 test, otherwise, fence related failure will arrive after resume. To fix this and for a better clean up, move drm_sched_fini from fence_hw_fini to fence_sw_fini, as it's part of driver shutdown, and should never be called in hw_fini.
v2: rename amdgpu_fence_driver_init to amdgpu_fence_driver_sw_init, to keep sw_init and sw_fini paired.
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1668 Fixes: 8d35a2596164c1 ("drm/amdgpu: adjust fence driver enable sequence") Suggested-by: Christian König christian.koenig@amd.com Tested-by: Mike Lothian mike@fireburn.co.uk Signed-off-by: Guchun Chen guchun.chen@amd.com Reviewed-by: Christian König christian.koenig@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 5 ++--- drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 12 +++++++----- drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h | 4 ++-- 3 files changed, 11 insertions(+), 10 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index 112add12707d..d3247a5cceb4 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -3602,9 +3602,9 @@ int amdgpu_device_init(struct amdgpu_device *adev,
fence_driver_init: /* Fence driver */ - r = amdgpu_fence_driver_init(adev); + r = amdgpu_fence_driver_sw_init(adev); if (r) { - dev_err(adev->dev, "amdgpu_fence_driver_init failed\n"); + dev_err(adev->dev, "amdgpu_fence_driver_sw_init failed\n"); amdgpu_vf_error_put(adev, AMDGIM_ERROR_VF_FENCE_INIT_FAIL, 0, 0); goto failed; } @@ -3944,7 +3944,6 @@ int amdgpu_device_resume(struct drm_device *dev, bool fbcon) } amdgpu_fence_driver_hw_init(adev);
- r = amdgpu_device_ip_late_init(adev); if (r) return r; diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c index 49c5c7331c53..7495911516c2 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c @@ -498,7 +498,7 @@ int amdgpu_fence_driver_init_ring(struct amdgpu_ring *ring, }
/** - * amdgpu_fence_driver_init - init the fence driver + * amdgpu_fence_driver_sw_init - init the fence driver * for all possible rings. * * @adev: amdgpu device pointer @@ -509,13 +509,13 @@ int amdgpu_fence_driver_init_ring(struct amdgpu_ring *ring, * amdgpu_fence_driver_start_ring(). * Returns 0 for success. */ -int amdgpu_fence_driver_init(struct amdgpu_device *adev) +int amdgpu_fence_driver_sw_init(struct amdgpu_device *adev) { return 0; }
/** - * amdgpu_fence_driver_fini - tear down the fence driver + * amdgpu_fence_driver_hw_fini - tear down the fence driver * for all possible rings. * * @adev: amdgpu device pointer @@ -531,8 +531,7 @@ void amdgpu_fence_driver_hw_fini(struct amdgpu_device *adev)
if (!ring || !ring->fence_drv.initialized) continue; - if (!ring->no_scheduler) - drm_sched_fini(&ring->sched); + /* You can't wait for HW to signal if it's gone */ if (!drm_dev_is_unplugged(&adev->ddev)) r = amdgpu_fence_wait_empty(ring); @@ -560,6 +559,9 @@ void amdgpu_fence_driver_sw_fini(struct amdgpu_device *adev) if (!ring || !ring->fence_drv.initialized) continue;
+ if (!ring->no_scheduler) + drm_sched_fini(&ring->sched); + for (j = 0; j <= ring->fence_drv.num_fences_mask; ++j) dma_fence_put(ring->fence_drv.fences[j]); kfree(ring->fence_drv.fences); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h index 27adffa7658d..9c11ced4312c 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h @@ -106,7 +106,6 @@ struct amdgpu_fence_driver { struct dma_fence **fences; };
-int amdgpu_fence_driver_init(struct amdgpu_device *adev); void amdgpu_fence_driver_force_completion(struct amdgpu_ring *ring);
int amdgpu_fence_driver_init_ring(struct amdgpu_ring *ring, @@ -115,9 +114,10 @@ int amdgpu_fence_driver_init_ring(struct amdgpu_ring *ring, int amdgpu_fence_driver_start_ring(struct amdgpu_ring *ring, struct amdgpu_irq_src *irq_src, unsigned irq_type); +void amdgpu_fence_driver_hw_init(struct amdgpu_device *adev); void amdgpu_fence_driver_hw_fini(struct amdgpu_device *adev); +int amdgpu_fence_driver_sw_init(struct amdgpu_device *adev); void amdgpu_fence_driver_sw_fini(struct amdgpu_device *adev); -void amdgpu_fence_driver_hw_init(struct amdgpu_device *adev); int amdgpu_fence_emit(struct amdgpu_ring *ring, struct dma_fence **fence, unsigned flags); int amdgpu_fence_emit_polling(struct amdgpu_ring *ring, uint32_t *s,
From: Guchun Chen guchun.chen@amd.com
[ Upstream commit f7d6779df642720e22bffd449e683bb8690bd3bf ]
This gurantees no more work on the ring can be submitted to hardware in suspend/resume case, otherwise a potential race will occur and the ring will get no chance to stay empty before suspend.
v2: Call drm_sched_resubmit_job before drm_sched_start to restart jobs from the pending list.
Suggested-by: Andrey Grodzovsky andrey.grodzovsky@amd.com Suggested-by: Christian König christian.koenig@amd.com Signed-off-by: Guchun Chen guchun.chen@amd.com Reviewed-by: Christian König christian.koenig@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 8 ++++++++ 1 file changed, 8 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c index 7495911516c2..49884069226a 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c @@ -532,6 +532,9 @@ void amdgpu_fence_driver_hw_fini(struct amdgpu_device *adev) if (!ring || !ring->fence_drv.initialized) continue;
+ if (!ring->no_scheduler) + drm_sched_stop(&ring->sched, NULL); + /* You can't wait for HW to signal if it's gone */ if (!drm_dev_is_unplugged(&adev->ddev)) r = amdgpu_fence_wait_empty(ring); @@ -591,6 +594,11 @@ void amdgpu_fence_driver_hw_init(struct amdgpu_device *adev) if (!ring || !ring->fence_drv.initialized) continue;
+ if (!ring->no_scheduler) { + drm_sched_resubmit_jobs(&ring->sched); + drm_sched_start(&ring->sched, true); + } + /* enable the interrupt */ if (ring->fence_drv.irq_src) amdgpu_irq_get(adev, ring->fence_drv.irq_src,
From: James Morse james.morse@arm.com
[ Upstream commit cdef1196608892b9a46caa5f2b64095a7f0be60c ]
Since commit e5c6b312ce3c ("cpufreq: schedutil: Use kobject release() method to free sugov_tunables") kobject_put() has kfree()d the attr_set before gov_attr_set_put() returns.
kobject_put() isn't the last user of attr_set in gov_attr_set_put(), the subsequent mutex_destroy() triggers a use-after-free: | BUG: KASAN: use-after-free in mutex_is_locked+0x20/0x60 | Read of size 8 at addr ffff000800ca4250 by task cpuhp/2/20 | | CPU: 2 PID: 20 Comm: cpuhp/2 Not tainted 5.15.0-rc1 #12369 | Hardware name: ARM LTD ARM Juno Development Platform/ARM Juno Development | Platform, BIOS EDK II Jul 30 2018 | Call trace: | dump_backtrace+0x0/0x380 | show_stack+0x1c/0x30 | dump_stack_lvl+0x8c/0xb8 | print_address_description.constprop.0+0x74/0x2b8 | kasan_report+0x1f4/0x210 | kasan_check_range+0xfc/0x1a4 | __kasan_check_read+0x38/0x60 | mutex_is_locked+0x20/0x60 | mutex_destroy+0x80/0x100 | gov_attr_set_put+0xfc/0x150 | sugov_exit+0x78/0x190 | cpufreq_offline.isra.0+0x2c0/0x660 | cpuhp_cpufreq_offline+0x14/0x24 | cpuhp_invoke_callback+0x430/0x6d0 | cpuhp_thread_fun+0x1b0/0x624 | smpboot_thread_fn+0x5e0/0xa6c | kthread+0x3a0/0x450 | ret_from_fork+0x10/0x20
Swap the order of the calls.
Fixes: e5c6b312ce3c ("cpufreq: schedutil: Use kobject release() method to free sugov_tunables") Cc: 4.7+ stable@vger.kernel.org # 4.7+ Signed-off-by: James Morse james.morse@arm.com Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/cpufreq/cpufreq_governor_attr_set.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/cpufreq/cpufreq_governor_attr_set.c b/drivers/cpufreq/cpufreq_governor_attr_set.c index 66b05a326910..a6f365b9cc1a 100644 --- a/drivers/cpufreq/cpufreq_governor_attr_set.c +++ b/drivers/cpufreq/cpufreq_governor_attr_set.c @@ -74,8 +74,8 @@ unsigned int gov_attr_set_put(struct gov_attr_set *attr_set, struct list_head *l if (count) return count;
- kobject_put(&attr_set->kobj); mutex_destroy(&attr_set->update_lock); + kobject_put(&attr_set->kobj); return 0; } EXPORT_SYMBOL_GPL(gov_attr_set_put);
From: Adrian Hunter adrian.hunter@intel.com
commit 1cbc9ad3eecd492be33b727b4606ae75bc880676 upstream.
Intel LKF can experience link errors. Make fixes to increase link stability, especially when switching to high speed modes.
Link: https://lore.kernel.org/r/20210831145317.26306-1-adrian.hunter@intel.com Fixes: b2c57925df1f ("scsi: ufs: ufs-pci: Add support for Intel LKF") Cc: stable@vger.kernel.org Signed-off-by: Adrian Hunter adrian.hunter@intel.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/ufs/ufshcd-pci.c | 78 +++++++++++++++++++++++++++++++++++ drivers/scsi/ufs/ufshcd.c | 3 +- drivers/scsi/ufs/ufshcd.h | 1 + 3 files changed, 81 insertions(+), 1 deletion(-)
diff --git a/drivers/scsi/ufs/ufshcd-pci.c b/drivers/scsi/ufs/ufshcd-pci.c index e6c334bfb4c2..40acca04d03b 100644 --- a/drivers/scsi/ufs/ufshcd-pci.c +++ b/drivers/scsi/ufs/ufshcd-pci.c @@ -128,6 +128,81 @@ static int ufs_intel_link_startup_notify(struct ufs_hba *hba, return err; }
+static int ufs_intel_set_lanes(struct ufs_hba *hba, u32 lanes) +{ + struct ufs_pa_layer_attr pwr_info = hba->pwr_info; + int ret; + + pwr_info.lane_rx = lanes; + pwr_info.lane_tx = lanes; + ret = ufshcd_config_pwr_mode(hba, &pwr_info); + if (ret) + dev_err(hba->dev, "%s: Setting %u lanes, err = %d\n", + __func__, lanes, ret); + return ret; +} + +static int ufs_intel_lkf_pwr_change_notify(struct ufs_hba *hba, + enum ufs_notify_change_status status, + struct ufs_pa_layer_attr *dev_max_params, + struct ufs_pa_layer_attr *dev_req_params) +{ + int err = 0; + + switch (status) { + case PRE_CHANGE: + if (ufshcd_is_hs_mode(dev_max_params) && + (hba->pwr_info.lane_rx != 2 || hba->pwr_info.lane_tx != 2)) + ufs_intel_set_lanes(hba, 2); + memcpy(dev_req_params, dev_max_params, sizeof(*dev_req_params)); + break; + case POST_CHANGE: + if (ufshcd_is_hs_mode(dev_req_params)) { + u32 peer_granularity; + + usleep_range(1000, 1250); + err = ufshcd_dme_peer_get(hba, UIC_ARG_MIB(PA_GRANULARITY), + &peer_granularity); + } + break; + default: + break; + } + + return err; +} + +static int ufs_intel_lkf_apply_dev_quirks(struct ufs_hba *hba) +{ + u32 granularity, peer_granularity; + u32 pa_tactivate, peer_pa_tactivate; + int ret; + + ret = ufshcd_dme_get(hba, UIC_ARG_MIB(PA_GRANULARITY), &granularity); + if (ret) + goto out; + + ret = ufshcd_dme_peer_get(hba, UIC_ARG_MIB(PA_GRANULARITY), &peer_granularity); + if (ret) + goto out; + + ret = ufshcd_dme_get(hba, UIC_ARG_MIB(PA_TACTIVATE), &pa_tactivate); + if (ret) + goto out; + + ret = ufshcd_dme_peer_get(hba, UIC_ARG_MIB(PA_TACTIVATE), &peer_pa_tactivate); + if (ret) + goto out; + + if (granularity == peer_granularity) { + u32 new_peer_pa_tactivate = pa_tactivate + 2; + + ret = ufshcd_dme_peer_set(hba, UIC_ARG_MIB(PA_TACTIVATE), new_peer_pa_tactivate); + } +out: + return ret; +} + #define INTEL_ACTIVELTR 0x804 #define INTEL_IDLELTR 0x808
@@ -351,6 +426,7 @@ static int ufs_intel_lkf_init(struct ufs_hba *hba) struct ufs_host *ufs_host; int err;
+ hba->nop_out_timeout = 200; hba->quirks |= UFSHCD_QUIRK_BROKEN_AUTO_HIBERN8; hba->caps |= UFSHCD_CAP_CRYPTO; err = ufs_intel_common_init(hba); @@ -381,6 +457,8 @@ static struct ufs_hba_variant_ops ufs_intel_lkf_hba_vops = { .exit = ufs_intel_common_exit, .hce_enable_notify = ufs_intel_hce_enable_notify, .link_startup_notify = ufs_intel_link_startup_notify, + .pwr_change_notify = ufs_intel_lkf_pwr_change_notify, + .apply_dev_quirks = ufs_intel_lkf_apply_dev_quirks, .resume = ufs_intel_resume, .device_reset = ufs_intel_device_reset, }; diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c index 3a204324151a..bfc13f646d7b 100644 --- a/drivers/scsi/ufs/ufshcd.c +++ b/drivers/scsi/ufs/ufshcd.c @@ -4767,7 +4767,7 @@ static int ufshcd_verify_dev_init(struct ufs_hba *hba) mutex_lock(&hba->dev_cmd.lock); for (retries = NOP_OUT_RETRIES; retries > 0; retries--) { err = ufshcd_exec_dev_cmd(hba, DEV_CMD_TYPE_NOP, - NOP_OUT_TIMEOUT); + hba->nop_out_timeout);
if (!err || err == -ETIMEDOUT) break; @@ -9403,6 +9403,7 @@ int ufshcd_alloc_host(struct device *dev, struct ufs_hba **hba_handle) hba->dev = dev; *hba_handle = hba; hba->dev_ref_clk_freq = REF_CLK_FREQ_INVAL; + hba->nop_out_timeout = NOP_OUT_TIMEOUT;
INIT_LIST_HEAD(&hba->clk_list_head);
diff --git a/drivers/scsi/ufs/ufshcd.h b/drivers/scsi/ufs/ufshcd.h index 86d4765a17b8..aa95deffb873 100644 --- a/drivers/scsi/ufs/ufshcd.h +++ b/drivers/scsi/ufs/ufshcd.h @@ -814,6 +814,7 @@ struct ufs_hba { /* Device management request data */ struct ufs_dev_cmd dev_cmd; ktime_t last_dme_cmd_tstamp; + int nop_out_timeout;
/* Keeps information of the UFS device connected to this host */ struct ufs_dev_info dev_info;
From: Jaroslav Kysela perex@perex.cz
commit 09d23174402da0f10e98da2c61bb5ac8e7d79fdd upstream.
The new framing mode causes the user space regression, because the alsa-lib code does not initialize the reserved space in the params structure when the device is opened.
This change adds SNDRV_RAWMIDI_IOCTL_USER_PVERSION like we do for the PCM interface for the protocol acknowledgment.
Cc: David Henningsson coding@diwic.se Cc: stable@vger.kernel.org Fixes: 08fdced60ca0 ("ALSA: rawmidi: Add framing mode") BugLink: https://github.com/alsa-project/alsa-lib/issues/178 Signed-off-by: Jaroslav Kysela perex@perex.cz Link: https://lore.kernel.org/r/20210920171850.154186-1-perex@perex.cz Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/sound/rawmidi.h | 1 + include/uapi/sound/asound.h | 1 + sound/core/rawmidi.c | 9 +++++++++ 3 files changed, 11 insertions(+)
--- a/include/sound/rawmidi.h +++ b/include/sound/rawmidi.h @@ -98,6 +98,7 @@ struct snd_rawmidi_file { struct snd_rawmidi *rmidi; struct snd_rawmidi_substream *input; struct snd_rawmidi_substream *output; + unsigned int user_pversion; /* supported protocol version */ };
struct snd_rawmidi_str { --- a/include/uapi/sound/asound.h +++ b/include/uapi/sound/asound.h @@ -783,6 +783,7 @@ struct snd_rawmidi_status {
#define SNDRV_RAWMIDI_IOCTL_PVERSION _IOR('W', 0x00, int) #define SNDRV_RAWMIDI_IOCTL_INFO _IOR('W', 0x01, struct snd_rawmidi_info) +#define SNDRV_RAWMIDI_IOCTL_USER_PVERSION _IOW('W', 0x02, int) #define SNDRV_RAWMIDI_IOCTL_PARAMS _IOWR('W', 0x10, struct snd_rawmidi_params) #define SNDRV_RAWMIDI_IOCTL_STATUS _IOWR('W', 0x20, struct snd_rawmidi_status) #define SNDRV_RAWMIDI_IOCTL_DROP _IOW('W', 0x30, int) --- a/sound/core/rawmidi.c +++ b/sound/core/rawmidi.c @@ -873,12 +873,21 @@ static long snd_rawmidi_ioctl(struct fil return -EINVAL; } } + case SNDRV_RAWMIDI_IOCTL_USER_PVERSION: + if (get_user(rfile->user_pversion, (unsigned int __user *)arg)) + return -EFAULT; + return 0; + case SNDRV_RAWMIDI_IOCTL_PARAMS: { struct snd_rawmidi_params params;
if (copy_from_user(¶ms, argp, sizeof(struct snd_rawmidi_params))) return -EFAULT; + if (rfile->user_pversion < SNDRV_PROTOCOL_VERSION(2, 0, 2)) { + params.mode = 0; + memset(params.reserved, 0, sizeof(params.reserved)); + } switch (params.stream) { case SNDRV_RAWMIDI_STREAM_OUTPUT: if (rfile->output == NULL)
From: Takashi Sakamoto o-takashi@sakamocchi.jp
commit cb1bcf5ed536747013fe2b3f9bd56ce3242c295a upstream.
In MOTU protocol v2/v3, first two data chunks across 2nd and 3rd data channels includes message bytes from device. The total size of message is 48 bits per data block.
The 'data_block_message' tracepoints event produced by ALSA firewire-motu driver exposes the sequence of messages to userspace in 64 bit storage, however lower 32 bits are actually available since current implementation truncates 16 bits in upper of the message as a result of bit shift operation within 32 bit storage.
This commit fixes the bug by perform the bit shift in 64 bit storage.
Fixes: c6b0b9e65f09 ("ALSA: firewire-motu: add tracepoints for messages for unique protocol") Cc: stable@vger.kernel.org Signed-off-by: Takashi Sakamoto o-takashi@sakamocchi.jp Link: https://lore.kernel.org/r/20210920110734.27161-1-o-takashi@sakamocchi.jp Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/firewire/motu/amdtp-motu.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)
--- a/sound/firewire/motu/amdtp-motu.c +++ b/sound/firewire/motu/amdtp-motu.c @@ -276,10 +276,11 @@ static void __maybe_unused copy_message(
/* This is just for v2/v3 protocol. */ for (i = 0; i < data_blocks; ++i) { - *frames = (be32_to_cpu(buffer[1]) << 16) | - (be32_to_cpu(buffer[2]) >> 16); + *frames = be32_to_cpu(buffer[1]); + *frames <<= 16; + *frames |= be32_to_cpu(buffer[2]) >> 16; + ++frames; buffer += data_block_quadlets; - frames++; } }
From: Cameron Berkenpas cam@neo-zeon.de
commit ad7cc2d41b7a8d0c5c5ecff37c3de7a4e137b3a6 upstream.
This patch initializes and enables speaker output on the Lenovo Legion 7i 15IMHG05, Yoga 7i 14ITL5/15ITL5, and 13s Gen2 series of laptops using the HDA verb sequence specific to each model.
Speaker automute is suppressed for the Lenovo Legion 7i 15IMHG05 to avoid breaking speaker output on resume and when devices are unplugged from its headphone jack.
Thanks to: Andreas Holzer, Vincent Morel, sycxyc, Max Christian Pohle and all others that helped.
[ minor coding style fixes by tiwai ]
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=208555 Signed-off-by: Cameron Berkenpas cam@neo-zeon.de Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20210913212627.339362-1-cam@neo-zeon.de Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/pci/hda/patch_realtek.c | 129 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 129 insertions(+)
--- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -6442,6 +6442,20 @@ static void alc_fixup_thinkpad_acpi(stru hda_fixup_thinkpad_acpi(codec, fix, action); }
+/* Fixup for Lenovo Legion 15IMHg05 speaker output on headset removal. */ +static void alc287_fixup_legion_15imhg05_speakers(struct hda_codec *codec, + const struct hda_fixup *fix, + int action) +{ + struct alc_spec *spec = codec->spec; + + switch (action) { + case HDA_FIXUP_ACT_PRE_PROBE: + spec->gen.suppress_auto_mute = 1; + break; + } +} + /* for alc295_fixup_hp_top_speakers */ #include "hp_x360_helper.c"
@@ -6659,6 +6673,10 @@ enum { ALC623_FIXUP_LENOVO_THINKSTATION_P340, ALC255_FIXUP_ACER_HEADPHONE_AND_MIC, ALC236_FIXUP_HP_LIMIT_INT_MIC_BOOST, + ALC287_FIXUP_LEGION_15IMHG05_SPEAKERS, + ALC287_FIXUP_LEGION_15IMHG05_AUTOMUTE, + ALC287_FIXUP_YOGA7_14ITL_SPEAKERS, + ALC287_FIXUP_13S_GEN2_SPEAKERS };
static const struct hda_fixup alc269_fixups[] = { @@ -8249,6 +8267,113 @@ static const struct hda_fixup alc269_fix .chained = true, .chain_id = ALC236_FIXUP_HP_MUTE_LED_MICMUTE_VREF, }, + [ALC287_FIXUP_LEGION_15IMHG05_SPEAKERS] = { + .type = HDA_FIXUP_VERBS, + //.v.verbs = legion_15imhg05_coefs, + .v.verbs = (const struct hda_verb[]) { + // set left speaker Legion 7i. + { 0x20, AC_VERB_SET_COEF_INDEX, 0x24 }, + { 0x20, AC_VERB_SET_PROC_COEF, 0x41 }, + + { 0x20, AC_VERB_SET_COEF_INDEX, 0x26 }, + { 0x20, AC_VERB_SET_PROC_COEF, 0xc }, + { 0x20, AC_VERB_SET_PROC_COEF, 0x0 }, + { 0x20, AC_VERB_SET_PROC_COEF, 0x1a }, + { 0x20, AC_VERB_SET_PROC_COEF, 0xb020 }, + + { 0x20, AC_VERB_SET_COEF_INDEX, 0x26 }, + { 0x20, AC_VERB_SET_PROC_COEF, 0x2 }, + { 0x20, AC_VERB_SET_PROC_COEF, 0x0 }, + { 0x20, AC_VERB_SET_PROC_COEF, 0x0 }, + { 0x20, AC_VERB_SET_PROC_COEF, 0xb020 }, + + // set right speaker Legion 7i. + { 0x20, AC_VERB_SET_COEF_INDEX, 0x24 }, + { 0x20, AC_VERB_SET_PROC_COEF, 0x42 }, + + { 0x20, AC_VERB_SET_COEF_INDEX, 0x26 }, + { 0x20, AC_VERB_SET_PROC_COEF, 0xc }, + { 0x20, AC_VERB_SET_PROC_COEF, 0x0 }, + { 0x20, AC_VERB_SET_PROC_COEF, 0x2a }, + { 0x20, AC_VERB_SET_PROC_COEF, 0xb020 }, + + { 0x20, AC_VERB_SET_COEF_INDEX, 0x26 }, + { 0x20, AC_VERB_SET_PROC_COEF, 0x2 }, + { 0x20, AC_VERB_SET_PROC_COEF, 0x0 }, + { 0x20, AC_VERB_SET_PROC_COEF, 0x0 }, + { 0x20, AC_VERB_SET_PROC_COEF, 0xb020 }, + {} + }, + .chained = true, + .chain_id = ALC287_FIXUP_LEGION_15IMHG05_AUTOMUTE, + }, + [ALC287_FIXUP_LEGION_15IMHG05_AUTOMUTE] = { + .type = HDA_FIXUP_FUNC, + .v.func = alc287_fixup_legion_15imhg05_speakers, + .chained = true, + .chain_id = ALC269_FIXUP_HEADSET_MODE, + }, + [ALC287_FIXUP_YOGA7_14ITL_SPEAKERS] = { + .type = HDA_FIXUP_VERBS, + .v.verbs = (const struct hda_verb[]) { + // set left speaker Yoga 7i. + { 0x20, AC_VERB_SET_COEF_INDEX, 0x24 }, + { 0x20, AC_VERB_SET_PROC_COEF, 0x41 }, + + { 0x20, AC_VERB_SET_COEF_INDEX, 0x26 }, + { 0x20, AC_VERB_SET_PROC_COEF, 0xc }, + { 0x20, AC_VERB_SET_PROC_COEF, 0x0 }, + { 0x20, AC_VERB_SET_PROC_COEF, 0x1a }, + { 0x20, AC_VERB_SET_PROC_COEF, 0xb020 }, + + { 0x20, AC_VERB_SET_COEF_INDEX, 0x26 }, + { 0x20, AC_VERB_SET_PROC_COEF, 0x2 }, + { 0x20, AC_VERB_SET_PROC_COEF, 0x0 }, + { 0x20, AC_VERB_SET_PROC_COEF, 0x0 }, + { 0x20, AC_VERB_SET_PROC_COEF, 0xb020 }, + + // set right speaker Yoga 7i. + { 0x20, AC_VERB_SET_COEF_INDEX, 0x24 }, + { 0x20, AC_VERB_SET_PROC_COEF, 0x46 }, + + { 0x20, AC_VERB_SET_COEF_INDEX, 0x26 }, + { 0x20, AC_VERB_SET_PROC_COEF, 0xc }, + { 0x20, AC_VERB_SET_PROC_COEF, 0x0 }, + { 0x20, AC_VERB_SET_PROC_COEF, 0x2a }, + { 0x20, AC_VERB_SET_PROC_COEF, 0xb020 }, + + { 0x20, AC_VERB_SET_COEF_INDEX, 0x26 }, + { 0x20, AC_VERB_SET_PROC_COEF, 0x2 }, + { 0x20, AC_VERB_SET_PROC_COEF, 0x0 }, + { 0x20, AC_VERB_SET_PROC_COEF, 0x0 }, + { 0x20, AC_VERB_SET_PROC_COEF, 0xb020 }, + {} + }, + .chained = true, + .chain_id = ALC269_FIXUP_HEADSET_MODE, + }, + [ALC287_FIXUP_13S_GEN2_SPEAKERS] = { + .type = HDA_FIXUP_VERBS, + .v.verbs = (const struct hda_verb[]) { + { 0x20, AC_VERB_SET_COEF_INDEX, 0x24 }, + { 0x20, AC_VERB_SET_PROC_COEF, 0x41 }, + { 0x20, AC_VERB_SET_PROC_COEF, 0xb020 }, + { 0x20, AC_VERB_SET_PROC_COEF, 0x2 }, + { 0x20, AC_VERB_SET_PROC_COEF, 0x0 }, + { 0x20, AC_VERB_SET_PROC_COEF, 0x0 }, + { 0x20, AC_VERB_SET_PROC_COEF, 0xb020 }, + { 0x20, AC_VERB_SET_COEF_INDEX, 0x24 }, + { 0x20, AC_VERB_SET_PROC_COEF, 0x42 }, + { 0x20, AC_VERB_SET_COEF_INDEX, 0x26 }, + { 0x20, AC_VERB_SET_PROC_COEF, 0x2 }, + { 0x20, AC_VERB_SET_PROC_COEF, 0x0 }, + { 0x20, AC_VERB_SET_PROC_COEF, 0x0 }, + { 0x20, AC_VERB_SET_PROC_COEF, 0xb020 }, + {} + }, + .chained = true, + .chain_id = ALC269_FIXUP_HEADSET_MODE, + }, };
static const struct snd_pci_quirk alc269_fixup_tbl[] = { @@ -8643,6 +8768,10 @@ static const struct snd_pci_quirk alc269 SND_PCI_QUIRK(0x17aa, 0x3818, "Lenovo C940", ALC298_FIXUP_LENOVO_SPK_VOLUME), SND_PCI_QUIRK(0x17aa, 0x3827, "Ideapad S740", ALC285_FIXUP_IDEAPAD_S740_COEF), SND_PCI_QUIRK(0x17aa, 0x3843, "Yoga 9i", ALC287_FIXUP_IDEAPAD_BASS_SPK_AMP), + SND_PCI_QUIRK(0x17aa, 0x3813, "Legion 7i 15IMHG05", ALC287_FIXUP_LEGION_15IMHG05_SPEAKERS), + SND_PCI_QUIRK(0x17aa, 0x3852, "Lenovo Yoga 7 14ITL5", ALC287_FIXUP_YOGA7_14ITL_SPEAKERS), + SND_PCI_QUIRK(0x17aa, 0x3853, "Lenovo Yoga 7 15ITL5", ALC287_FIXUP_YOGA7_14ITL_SPEAKERS), + SND_PCI_QUIRK(0x17aa, 0x3819, "Lenovo 13s Gen2 ITL", ALC287_FIXUP_13S_GEN2_SPEAKERS), SND_PCI_QUIRK(0x17aa, 0x3902, "Lenovo E50-80", ALC269_FIXUP_DMIC_THINKPAD_ACPI), SND_PCI_QUIRK(0x17aa, 0x3977, "IdeaPad S210", ALC283_FIXUP_INT_MIC), SND_PCI_QUIRK(0x17aa, 0x3978, "Lenovo B50-70", ALC269_FIXUP_DMIC_THINKPAD_ACPI),
From: Jia He justin.he@arm.com
commit f060db99374e80e853ac4916b49f0a903f65e9dc upstream.
When ACPI NFIT table is failing to populate correct numa information on arm64, dax_kmem will get NUMA_NO_NODE from the NFIT driver.
Without this patch, pmem can't be probed as RAM devices on arm64 guest: $ndctl create-namespace -fe namespace0.0 --mode=devdax --map=dev -s 1g -a 128M kmem dax0.0: rejecting DAX region [mem 0x240400000-0x2bfffffff] with invalid node: -1 kmem: probe of dax0.0 failed with error -22
Suggested-by: Dan Williams dan.j.williams@intel.com Signed-off-by: Jia He justin.he@arm.com Cc: stable@vger.kernel.org Fixes: c221c0b0308f ("device-dax: "Hotplug" persistent memory for use like normal RAM") Link: https://lore.kernel.org/r/20210922152919.6940-1-justin.he@arm.com Signed-off-by: Dan Williams dan.j.williams@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/acpi/nfit/core.c | 12 ++++++++++++ 1 file changed, 12 insertions(+)
--- a/drivers/acpi/nfit/core.c +++ b/drivers/acpi/nfit/core.c @@ -3007,6 +3007,18 @@ static int acpi_nfit_register_region(str ndr_desc->target_node = NUMA_NO_NODE; }
+ /* Fallback to address based numa information if node lookup failed */ + if (ndr_desc->numa_node == NUMA_NO_NODE) { + ndr_desc->numa_node = memory_add_physaddr_to_nid(spa->address); + dev_info(acpi_desc->dev, "changing numa node from %d to %d for nfit region [%pa-%pa]", + NUMA_NO_NODE, ndr_desc->numa_node, &res.start, &res.end); + } + if (ndr_desc->target_node == NUMA_NO_NODE) { + ndr_desc->target_node = phys_to_target_node(spa->address); + dev_info(acpi_desc->dev, "changing target node from %d to %d for nfit region [%pa-%pa]", + NUMA_NO_NODE, ndr_desc->numa_node, &res.start, &res.end); + } + /* * Persistence domain bits are hierarchical, if * ACPI_NFIT_CAPABILITY_CACHE_FLUSH is set then
From: Eric Biggers ebiggers@google.com
commit 80f6e3080bfcf865062a926817b3ca6c4a137a57 upstream.
If the file size is almost S64_MAX, the calculated number of Merkle tree levels exceeds FS_VERITY_MAX_LEVELS, causing FS_IOC_ENABLE_VERITY to fail. This is unintentional, since as the comment above the definition of FS_VERITY_MAX_LEVELS states, it is enough for over U64_MAX bytes of data using SHA-256 and 4K blocks. (Specifically, 4096*128**8 >= 2**64.)
The bug is actually that when the number of blocks in the first level is calculated from i_size, there is a signed integer overflow due to i_size being signed. Fix this by treating i_size as unsigned.
This was found by the new test "generic: test fs-verity EFBIG scenarios" (https://lkml.kernel.org/r/b1d116cd4d0ea74b9cd86f349c672021e005a75c.163155849...).
This didn't affect ext4 or f2fs since those have a smaller maximum file size, but it did affect btrfs which allows files up to S64_MAX bytes.
Reported-by: Boris Burkov boris@bur.io Fixes: 3fda4c617e84 ("fs-verity: implement FS_IOC_ENABLE_VERITY ioctl") Fixes: fd2d1acfcadf ("fs-verity: add the hook for file ->open()") Cc: stable@vger.kernel.org # v5.4+ Reviewed-by: Boris Burkov boris@bur.io Link: https://lore.kernel.org/r/20210916203424.113376-1-ebiggers@kernel.org Signed-off-by: Eric Biggers ebiggers@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/verity/enable.c | 2 +- fs/verity/open.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-)
--- a/fs/verity/enable.c +++ b/fs/verity/enable.c @@ -177,7 +177,7 @@ static int build_merkle_tree(struct file * (level 0) and ascending to the root node (level 'num_levels - 1'). * Then at the end (level 'num_levels'), calculate the root hash. */ - blocks = (inode->i_size + params->block_size - 1) >> + blocks = ((u64)inode->i_size + params->block_size - 1) >> params->log_blocksize; for (level = 0; level <= params->num_levels; level++) { err = build_merkle_tree_level(filp, level, blocks, params, --- a/fs/verity/open.c +++ b/fs/verity/open.c @@ -89,7 +89,7 @@ int fsverity_init_merkle_tree_params(str */
/* Compute number of levels and the number of blocks in each level */ - blocks = (inode->i_size + params->block_size - 1) >> log_blocksize; + blocks = ((u64)inode->i_size + params->block_size - 1) >> log_blocksize; pr_debug("Data is %lld bytes (%llu blocks)\n", inode->i_size, blocks); while (blocks > 1) { if (params->num_levels >= FS_VERITY_MAX_LEVELS) {
From: Paul Fertser fercerpav@gmail.com
commit 2938b2978a70d4cc10777ee71c9e512ffe4e0f4b upstream.
Function i2c_smbus_read_byte_data() can return a negative error number instead of the data read if I2C transaction failed for whatever reason.
Lack of error checking can lead to serious issues on production hardware, e.g. errors treated as temperatures produce spurious critical temperature-crossed-threshold errors in BMC logs for OCP server hardware. The patch was tested with Mellanox OCP Mezzanine card emulating TMP421 protocol for temperature sensing which sometimes leads to I2C protocol error during early boot up stage.
Fixes: 9410700b881f ("hwmon: Add driver for Texas Instruments TMP421/422/423 sensor chips") Cc: stable@vger.kernel.org Signed-off-by: Paul Fertser fercerpav@gmail.com Link: https://lore.kernel.org/r/20210924093011.26083-1-fercerpav@gmail.com [groeck: dropped unnecessary line breaks] Signed-off-by: Guenter Roeck linux@roeck-us.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/hwmon/tmp421.c | 38 ++++++++++++++++++++++++++++---------- 1 file changed, 28 insertions(+), 10 deletions(-)
--- a/drivers/hwmon/tmp421.c +++ b/drivers/hwmon/tmp421.c @@ -119,38 +119,56 @@ static int temp_from_u16(u16 reg) return (temp * 1000 + 128) / 256; }
-static struct tmp421_data *tmp421_update_device(struct device *dev) +static int tmp421_update_device(struct tmp421_data *data) { - struct tmp421_data *data = dev_get_drvdata(dev); struct i2c_client *client = data->client; + int ret = 0; int i;
mutex_lock(&data->update_lock);
if (time_after(jiffies, data->last_updated + (HZ / 2)) || !data->valid) { - data->config = i2c_smbus_read_byte_data(client, - TMP421_CONFIG_REG_1); + ret = i2c_smbus_read_byte_data(client, TMP421_CONFIG_REG_1); + if (ret < 0) + goto exit; + data->config = ret;
for (i = 0; i < data->channels; i++) { - data->temp[i] = i2c_smbus_read_byte_data(client, - TMP421_TEMP_MSB[i]) << 8; - data->temp[i] |= i2c_smbus_read_byte_data(client, - TMP421_TEMP_LSB[i]); + ret = i2c_smbus_read_byte_data(client, TMP421_TEMP_MSB[i]); + if (ret < 0) + goto exit; + data->temp[i] = ret << 8; + + ret = i2c_smbus_read_byte_data(client, TMP421_TEMP_LSB[i]); + if (ret < 0) + goto exit; + data->temp[i] |= ret; } data->last_updated = jiffies; data->valid = 1; }
+exit: mutex_unlock(&data->update_lock);
- return data; + if (ret < 0) { + data->valid = 0; + return ret; + } + + return 0; }
static int tmp421_read(struct device *dev, enum hwmon_sensor_types type, u32 attr, int channel, long *val) { - struct tmp421_data *tmp421 = tmp421_update_device(dev); + struct tmp421_data *tmp421 = dev_get_drvdata(dev); + int ret = 0; + + ret = tmp421_update_device(tmp421); + if (ret) + return ret;
switch (attr) { case hwmon_temp_input:
From: Nadezda Lutovinova lutovinova@ispras.ru
commit dd4d747ef05addab887dc8ff0d6ab9860bbcd783 upstream.
If driver read tmp value sufficient for (tmp & 0x08) && (!(tmp & 0x80)) && ((tmp & 0x7) == ((tmp >> 4) & 0x7)) from device then Null pointer dereference occurs. (It is possible if tmp = 0b0xyz1xyz, where same literals mean same numbers) Also lm75[] does not serve a purpose anymore after switching to devm_i2c_new_dummy_device() in w83791d_detect_subclients().
The patch fixes possible NULL pointer dereference by removing lm75[].
Found by Linux Driver Verification project (linuxtesting.org).
Cc: stable@vger.kernel.org Signed-off-by: Nadezda Lutovinova lutovinova@ispras.ru Link: https://lore.kernel.org/r/20210921155153.28098-3-lutovinova@ispras.ru [groeck: Dropped unnecessary continuation lines, fixed multi-line alignments] Signed-off-by: Guenter Roeck linux@roeck-us.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/hwmon/w83793.c | 26 +++++++++++--------------- 1 file changed, 11 insertions(+), 15 deletions(-)
--- a/drivers/hwmon/w83793.c +++ b/drivers/hwmon/w83793.c @@ -202,7 +202,6 @@ static inline s8 TEMP_TO_REG(long val, s }
struct w83793_data { - struct i2c_client *lm75[2]; struct device *hwmon_dev; struct mutex update_lock; char valid; /* !=0 if following fields are valid */ @@ -1566,7 +1565,6 @@ w83793_detect_subclients(struct i2c_clie int address = client->addr; u8 tmp; struct i2c_adapter *adapter = client->adapter; - struct w83793_data *data = i2c_get_clientdata(client);
id = i2c_adapter_id(adapter); if (force_subclients[0] == id && force_subclients[1] == address) { @@ -1586,21 +1584,19 @@ w83793_detect_subclients(struct i2c_clie }
tmp = w83793_read_value(client, W83793_REG_I2C_SUBADDR); - if (!(tmp & 0x08)) - data->lm75[0] = devm_i2c_new_dummy_device(&client->dev, adapter, - 0x48 + (tmp & 0x7)); - if (!(tmp & 0x80)) { - if (!IS_ERR(data->lm75[0]) - && ((tmp & 0x7) == ((tmp >> 4) & 0x7))) { - dev_err(&client->dev, - "duplicate addresses 0x%x, " - "use force_subclients\n", data->lm75[0]->addr); - return -ENODEV; - } - data->lm75[1] = devm_i2c_new_dummy_device(&client->dev, adapter, - 0x48 + ((tmp >> 4) & 0x7)); + + if (!(tmp & 0x88) && (tmp & 0x7) == ((tmp >> 4) & 0x7)) { + dev_err(&client->dev, + "duplicate addresses 0x%x, use force_subclient\n", 0x48 + (tmp & 0x7)); + return -ENODEV; }
+ if (!(tmp & 0x08)) + devm_i2c_new_dummy_device(&client->dev, adapter, 0x48 + (tmp & 0x7)); + + if (!(tmp & 0x80)) + devm_i2c_new_dummy_device(&client->dev, adapter, 0x48 + ((tmp >> 4) & 0x7)); + return 0; }
From: Nadezda Lutovinova lutovinova@ispras.ru
commit 0f36b88173f028e372668ae040ab1a496834d278 upstream.
If driver read val value sufficient for (val & 0x08) && (!(val & 0x80)) && ((val & 0x7) == ((val >> 4) & 0x7)) from device then Null pointer dereference occurs. (It is possible if tmp = 0b0xyz1xyz, where same literals mean same numbers) Also lm75[] does not serve a purpose anymore after switching to devm_i2c_new_dummy_device() in w83791d_detect_subclients().
The patch fixes possible NULL pointer dereference by removing lm75[].
Found by Linux Driver Verification project (linuxtesting.org).
Cc: stable@vger.kernel.org Signed-off-by: Nadezda Lutovinova lutovinova@ispras.ru Link: https://lore.kernel.org/r/20210921155153.28098-2-lutovinova@ispras.ru [groeck: Dropped unnecessary continuation lines, fixed multipline alignment] Signed-off-by: Guenter Roeck linux@roeck-us.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/hwmon/w83792d.c | 28 +++++++++++----------------- 1 file changed, 11 insertions(+), 17 deletions(-)
--- a/drivers/hwmon/w83792d.c +++ b/drivers/hwmon/w83792d.c @@ -264,9 +264,6 @@ struct w83792d_data { char valid; /* !=0 if following fields are valid */ unsigned long last_updated; /* In jiffies */
- /* array of 2 pointers to subclients */ - struct i2c_client *lm75[2]; - u8 in[9]; /* Register value */ u8 in_max[9]; /* Register value */ u8 in_min[9]; /* Register value */ @@ -927,7 +924,6 @@ w83792d_detect_subclients(struct i2c_cli int address = new_client->addr; u8 val; struct i2c_adapter *adapter = new_client->adapter; - struct w83792d_data *data = i2c_get_clientdata(new_client);
id = i2c_adapter_id(adapter); if (force_subclients[0] == id && force_subclients[1] == address) { @@ -946,21 +942,19 @@ w83792d_detect_subclients(struct i2c_cli }
val = w83792d_read_value(new_client, W83792D_REG_I2C_SUBADDR); - if (!(val & 0x08)) - data->lm75[0] = devm_i2c_new_dummy_device(&new_client->dev, adapter, - 0x48 + (val & 0x7)); - if (!(val & 0x80)) { - if (!IS_ERR(data->lm75[0]) && - ((val & 0x7) == ((val >> 4) & 0x7))) { - dev_err(&new_client->dev, - "duplicate addresses 0x%x, use force_subclient\n", - data->lm75[0]->addr); - return -ENODEV; - } - data->lm75[1] = devm_i2c_new_dummy_device(&new_client->dev, adapter, - 0x48 + ((val >> 4) & 0x7)); + + if (!(val & 0x88) && (val & 0x7) == ((val >> 4) & 0x7)) { + dev_err(&new_client->dev, + "duplicate addresses 0x%x, use force_subclient\n", 0x48 + (val & 0x7)); + return -ENODEV; }
+ if (!(val & 0x08)) + devm_i2c_new_dummy_device(&new_client->dev, adapter, 0x48 + (val & 0x7)); + + if (!(val & 0x80)) + devm_i2c_new_dummy_device(&new_client->dev, adapter, 0x48 + ((val >> 4) & 0x7)); + return 0; }
From: Nadezda Lutovinova lutovinova@ispras.ru
commit 943c15ac1b84d378da26bba41c83c67e16499ac4 upstream.
If driver read val value sufficient for (val & 0x08) && (!(val & 0x80)) && ((val & 0x7) == ((val >> 4) & 0x7)) from device then Null pointer dereference occurs. (It is possible if tmp = 0b0xyz1xyz, where same literals mean same numbers) Also lm75[] does not serve a purpose anymore after switching to devm_i2c_new_dummy_device() in w83791d_detect_subclients().
The patch fixes possible NULL pointer dereference by removing lm75[].
Found by Linux Driver Verification project (linuxtesting.org).
Cc: stable@vger.kernel.org Signed-off-by: Nadezda Lutovinova lutovinova@ispras.ru Link: https://lore.kernel.org/r/20210921155153.28098-1-lutovinova@ispras.ru [groeck: Dropped unnecessary continuation lines, fixed multi-line alignment] Signed-off-by: Guenter Roeck linux@roeck-us.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/hwmon/w83791d.c | 29 +++++++++++------------------ 1 file changed, 11 insertions(+), 18 deletions(-)
--- a/drivers/hwmon/w83791d.c +++ b/drivers/hwmon/w83791d.c @@ -273,9 +273,6 @@ struct w83791d_data { char valid; /* !=0 if following fields are valid */ unsigned long last_updated; /* In jiffies */
- /* array of 2 pointers to subclients */ - struct i2c_client *lm75[2]; - /* volts */ u8 in[NUMBER_OF_VIN]; /* Register value */ u8 in_max[NUMBER_OF_VIN]; /* Register value */ @@ -1257,7 +1254,6 @@ static const struct attribute_group w837 static int w83791d_detect_subclients(struct i2c_client *client) { struct i2c_adapter *adapter = client->adapter; - struct w83791d_data *data = i2c_get_clientdata(client); int address = client->addr; int i, id; u8 val; @@ -1280,22 +1276,19 @@ static int w83791d_detect_subclients(str }
val = w83791d_read(client, W83791D_REG_I2C_SUBADDR); - if (!(val & 0x08)) - data->lm75[0] = devm_i2c_new_dummy_device(&client->dev, adapter, - 0x48 + (val & 0x7)); - if (!(val & 0x80)) { - if (!IS_ERR(data->lm75[0]) && - ((val & 0x7) == ((val >> 4) & 0x7))) { - dev_err(&client->dev, - "duplicate addresses 0x%x, " - "use force_subclient\n", - data->lm75[0]->addr); - return -ENODEV; - } - data->lm75[1] = devm_i2c_new_dummy_device(&client->dev, adapter, - 0x48 + ((val >> 4) & 0x7)); + + if (!(val & 0x88) && (val & 0x7) == ((val >> 4) & 0x7)) { + dev_err(&client->dev, + "duplicate addresses 0x%x, use force_subclient\n", 0x48 + (val & 0x7)); + return -ENODEV; }
+ if (!(val & 0x08)) + devm_i2c_new_dummy_device(&client->dev, adapter, 0x48 + (val & 0x7)); + + if (!(val & 0x80)) + devm_i2c_new_dummy_device(&client->dev, adapter, 0x48 + ((val >> 4) & 0x7)); + return 0; }
From: Andrey Gusakov andrey.gusakov@cogentembedded.com
commit 540cffbab8b8e6c52a4121666ca18d6e94586ed2 upstream.
Per gpio_chip interface, error shall be proparated to the caller.
Attempt to silent diagnostics by returning zero (as written in the comment) is plain wrong, because the zero return can be interpreted by the caller as the gpio value.
Cc: stable@vger.kernel.org Signed-off-by: Andrey Gusakov andrey.gusakov@cogentembedded.com Signed-off-by: Nikita Yushchenko nikita.yoush@cogentembedded.com Signed-off-by: Bartosz Golaszewski brgl@bgdev.pl Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpio/gpio-pca953x.c | 11 ++--------- 1 file changed, 2 insertions(+), 9 deletions(-)
--- a/drivers/gpio/gpio-pca953x.c +++ b/drivers/gpio/gpio-pca953x.c @@ -468,15 +468,8 @@ static int pca953x_gpio_get_value(struct mutex_lock(&chip->i2c_lock); ret = regmap_read(chip->regmap, inreg, ®_val); mutex_unlock(&chip->i2c_lock); - if (ret < 0) { - /* - * NOTE: - * diagnostic already emitted; that's all we should - * do unless gpio_*_value_cansleep() calls become different - * from their nonsleeping siblings (and report faults). - */ - return 0; - } + if (ret < 0) + return ret;
return !!(reg_val & bit); }
From: Jonathan Hsu jonathan.hsu@mediatek.com
commit e8c2da7e329ce004fee748b921e4c765dc2fa338 upstream.
Fix incorrect index for UTMRD reference in ufshcd_add_tm_upiu_trace().
Link: https://lore.kernel.org/r/20210924085848.25500-1-jonathan.hsu@mediatek.com Fixes: 4b42d557a8ad ("scsi: ufs: core: Fix wrong Task Tag used in task management request UPIUs") Cc: stable@vger.kernel.org Reviewed-by: Stanley Chu stanley.chu@mediatek.com Reviewed-by: Bart Van Assche bvanassche@acm.org Signed-off-by: Jonathan Hsu jonathan.hsu@mediatek.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/scsi/ufs/ufshcd.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
--- a/drivers/scsi/ufs/ufshcd.c +++ b/drivers/scsi/ufs/ufshcd.c @@ -330,8 +330,7 @@ static void ufshcd_add_query_upiu_trace( static void ufshcd_add_tm_upiu_trace(struct ufs_hba *hba, unsigned int tag, enum ufs_trace_str_t str_t) { - int off = (int)tag - hba->nutrs; - struct utp_task_req_desc *descp = &hba->utmrdl_base_addr[off]; + struct utp_task_req_desc *descp = &hba->utmrdl_base_addr[tag];
if (!trace_ufshcd_upiu_enabled()) return;
From: Johannes Berg johannes.berg@intel.com
commit 94513069eb549737bcfc3d988d6ed4da948a2de8 upstream.
When PN checking is done in mac80211, for fragmentation we need to copy the PN to the RX struct so we can later use it to do a comparison, since commit bf30ca922a0c ("mac80211: check defrag PN against current frame").
Unfortunately, in that commit I used the 'hdr' variable without it being necessarily valid, so use-after-free could occur if it was necessary to reallocate (parts of) the frame.
Fix this by reloading the variable after the code that results in the reallocations, if any.
This fixes https://bugzilla.kernel.org/show_bug.cgi?id=214401.
Cc: stable@vger.kernel.org Fixes: bf30ca922a0c ("mac80211: check defrag PN against current frame") Link: https://lore.kernel.org/r/20210927115838.12b9ac6bb233.I1d066acd5408a662c3b6e... Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/mac80211/wpa.c | 6 ++++++ 1 file changed, 6 insertions(+)
--- a/net/mac80211/wpa.c +++ b/net/mac80211/wpa.c @@ -520,6 +520,9 @@ ieee80211_crypto_ccmp_decrypt(struct iee return RX_DROP_UNUSABLE; }
+ /* reload hdr - skb might have been reallocated */ + hdr = (void *)rx->skb->data; + data_len = skb->len - hdrlen - IEEE80211_CCMP_HDR_LEN - mic_len; if (!rx->sta || data_len < 0) return RX_DROP_UNUSABLE; @@ -749,6 +752,9 @@ ieee80211_crypto_gcmp_decrypt(struct iee return RX_DROP_UNUSABLE; }
+ /* reload hdr - skb might have been reallocated */ + hdr = (void *)rx->skb->data; + data_len = skb->len - hdrlen - IEEE80211_GCMP_HDR_LEN - mic_len; if (!rx->sta || data_len < 0) return RX_DROP_UNUSABLE;
From: José Expósito jose.exposito89@gmail.com
commit b201cb0ebe87b209e252d85668e517ac1929e250 upstream.
Some devices, even non convertible ones, can send incorrect SW_TABLET_MODE reports.
Add an allow list and accept such reports only from devices in it.
Bug reported for Dell XPS 17 9710 on: https://gitlab.freedesktop.org/libinput/libinput/-/issues/662
Reported-by: Tobias Gurtzick magic@wizardtales.com Suggested-by: Hans de Goede hdegoede@redhat.com Tested-by: Tobias Gurtzick magic@wizardtales.com Signed-off-by: José Expósito jose.exposito89@gmail.com Link: https://lore.kernel.org/r/20210920160312.9787-1-jose.exposito89@gmail.com [hdegoede@redhat.com: Check dmi_switches_auto_add_allow_list only once] Signed-off-by: Hans de Goede hdegoede@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/platform/x86/intel-hid.c | 27 ++++++++++++++++++++++----- 1 file changed, 22 insertions(+), 5 deletions(-)
--- a/drivers/platform/x86/intel-hid.c +++ b/drivers/platform/x86/intel-hid.c @@ -118,12 +118,30 @@ static const struct dmi_system_id dmi_vg { } };
+/* + * Some devices, even non convertible ones, can send incorrect SW_TABLET_MODE + * reports. Accept such reports only from devices in this list. + */ +static const struct dmi_system_id dmi_auto_add_switch[] = { + { + .matches = { + DMI_EXACT_MATCH(DMI_CHASSIS_TYPE, "31" /* Convertible */), + }, + }, + { + .matches = { + DMI_EXACT_MATCH(DMI_CHASSIS_TYPE, "32" /* Detachable */), + }, + }, + {} /* Array terminator */ +}; + struct intel_hid_priv { struct input_dev *input_dev; struct input_dev *array; struct input_dev *switches; bool wakeup_mode; - bool dual_accel; + bool auto_add_switch; };
#define HID_EVENT_FILTER_UUID "eeec56b3-4442-408f-a792-4edd4d758054" @@ -452,10 +470,8 @@ static void notify_handler(acpi_handle h * Some convertible have unreliable VGBS return which could cause incorrect * SW_TABLET_MODE report, in these cases we enable support when receiving * the first event instead of during driver setup. - * - * See dual_accel_detect.h for more info on the dual_accel check. */ - if (!priv->switches && !priv->dual_accel && (event == 0xcc || event == 0xcd)) { + if (!priv->switches && priv->auto_add_switch && (event == 0xcc || event == 0xcd)) { dev_info(&device->dev, "switch event received, enable switches supports\n"); err = intel_hid_switches_setup(device); if (err) @@ -596,7 +612,8 @@ static int intel_hid_probe(struct platfo return -ENOMEM; dev_set_drvdata(&device->dev, priv);
- priv->dual_accel = dual_accel_detect(); + /* See dual_accel_detect.h for more info on the dual_accel check. */ + priv->auto_add_switch = dmi_check_system(dmi_auto_add_switch) && !dual_accel_detect();
err = intel_hid_input_setup(device); if (err) {
From: Zelin Deng zelin.deng@linux.alibaba.com
commit ad9af930680bb396c87582edc172b3a7cf2a3fbf upstream.
There're other modules might use hv_clock_per_cpu variable like ptp_kvm, so move it into kvmclock.h and export the symbol to make it visiable to other modules.
Signed-off-by: Zelin Deng zelin.deng@linux.alibaba.com Cc: stable@vger.kernel.org Message-Id: 1632892429-101194-2-git-send-email-zelin.deng@linux.alibaba.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/include/asm/kvmclock.h | 14 ++++++++++++++ arch/x86/kernel/kvmclock.c | 13 ++----------- 2 files changed, 16 insertions(+), 11 deletions(-)
--- a/arch/x86/include/asm/kvmclock.h +++ b/arch/x86/include/asm/kvmclock.h @@ -2,6 +2,20 @@ #ifndef _ASM_X86_KVM_CLOCK_H #define _ASM_X86_KVM_CLOCK_H
+#include <linux/percpu.h> + extern struct clocksource kvm_clock;
+DECLARE_PER_CPU(struct pvclock_vsyscall_time_info *, hv_clock_per_cpu); + +static inline struct pvclock_vcpu_time_info *this_cpu_pvti(void) +{ + return &this_cpu_read(hv_clock_per_cpu)->pvti; +} + +static inline struct pvclock_vsyscall_time_info *this_cpu_hvclock(void) +{ + return this_cpu_read(hv_clock_per_cpu); +} + #endif /* _ASM_X86_KVM_CLOCK_H */ --- a/arch/x86/kernel/kvmclock.c +++ b/arch/x86/kernel/kvmclock.c @@ -49,18 +49,9 @@ early_param("no-kvmclock-vsyscall", pars static struct pvclock_vsyscall_time_info hv_clock_boot[HVC_BOOT_ARRAY_SIZE] __bss_decrypted __aligned(PAGE_SIZE); static struct pvclock_wall_clock wall_clock __bss_decrypted; -static DEFINE_PER_CPU(struct pvclock_vsyscall_time_info *, hv_clock_per_cpu); static struct pvclock_vsyscall_time_info *hvclock_mem; - -static inline struct pvclock_vcpu_time_info *this_cpu_pvti(void) -{ - return &this_cpu_read(hv_clock_per_cpu)->pvti; -} - -static inline struct pvclock_vsyscall_time_info *this_cpu_hvclock(void) -{ - return this_cpu_read(hv_clock_per_cpu); -} +DEFINE_PER_CPU(struct pvclock_vsyscall_time_info *, hv_clock_per_cpu); +EXPORT_PER_CPU_SYMBOL_GPL(hv_clock_per_cpu);
/* * The wallclock is the time of day when we booted. Since then, some time may
From: Zelin Deng zelin.deng@linux.alibaba.com
commit 773e89ab0056aaa2baa1ffd9f044551654410104 upstream.
hv_clock is preallocated to have only HVC_BOOT_ARRAY_SIZE (64) elements; if the PTP_SYS_OFFSET_PRECISE ioctl is executed on vCPUs whose index is 64 of higher, retrieving the struct pvclock_vcpu_time_info pointer with "src = &hv_clock[cpu].pvti" will result in an out-of-bounds access and a wild pointer. Change it to "this_cpu_pvti()" which is guaranteed to be valid.
Fixes: 95a3d4454bb1 ("Switch kvmclock data to a PER_CPU variable") Signed-off-by: Zelin Deng zelin.deng@linux.alibaba.com Cc: stable@vger.kernel.org Message-Id: 1632892429-101194-3-git-send-email-zelin.deng@linux.alibaba.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/ptp/ptp_kvm_x86.c | 9 ++------- 1 file changed, 2 insertions(+), 7 deletions(-)
--- a/drivers/ptp/ptp_kvm_x86.c +++ b/drivers/ptp/ptp_kvm_x86.c @@ -15,8 +15,6 @@ #include <linux/ptp_clock_kernel.h> #include <linux/ptp_kvm.h>
-struct pvclock_vsyscall_time_info *hv_clock; - static phys_addr_t clock_pair_gpa; static struct kvm_clock_pairing clock_pair;
@@ -28,8 +26,7 @@ int kvm_arch_ptp_init(void) return -ENODEV;
clock_pair_gpa = slow_virt_to_phys(&clock_pair); - hv_clock = pvclock_get_pvti_cpu0_va(); - if (!hv_clock) + if (!pvclock_get_pvti_cpu0_va()) return -ENODEV;
ret = kvm_hypercall2(KVM_HC_CLOCK_PAIRING, clock_pair_gpa, @@ -64,10 +61,8 @@ int kvm_arch_ptp_get_crosststamp(u64 *cy struct pvclock_vcpu_time_info *src; unsigned int version; long ret; - int cpu;
- cpu = smp_processor_id(); - src = &hv_clock[cpu].pvti; + src = this_cpu_pvti();
do { /*
From: Vitaly Kuznetsov vkuznets@redhat.com
commit 2f9b68f57c6278c322793a06063181deded0ad69 upstream.
KASAN reports the following issue:
BUG: KASAN: stack-out-of-bounds in kvm_make_vcpus_request_mask+0x174/0x440 [kvm] Read of size 8 at addr ffffc9001364f638 by task qemu-kvm/4798
CPU: 0 PID: 4798 Comm: qemu-kvm Tainted: G X --------- --- Hardware name: AMD Corporation DAYTONA_X/DAYTONA_X, BIOS RYM0081C 07/13/2020 Call Trace: dump_stack+0xa5/0xe6 print_address_description.constprop.0+0x18/0x130 ? kvm_make_vcpus_request_mask+0x174/0x440 [kvm] __kasan_report.cold+0x7f/0x114 ? kvm_make_vcpus_request_mask+0x174/0x440 [kvm] kasan_report+0x38/0x50 kasan_check_range+0xf5/0x1d0 kvm_make_vcpus_request_mask+0x174/0x440 [kvm] kvm_make_scan_ioapic_request_mask+0x84/0xc0 [kvm] ? kvm_arch_exit+0x110/0x110 [kvm] ? sched_clock+0x5/0x10 ioapic_write_indirect+0x59f/0x9e0 [kvm] ? static_obj+0xc0/0xc0 ? __lock_acquired+0x1d2/0x8c0 ? kvm_ioapic_eoi_inject_work+0x120/0x120 [kvm]
The problem appears to be that 'vcpu_bitmap' is allocated as a single long on stack and it should really be KVM_MAX_VCPUS long. We also seem to clear the lower 16 bits of it with bitmap_zero() for no particular reason (my guess would be that 'bitmap' and 'vcpu_bitmap' variables in kvm_bitmap_or_dest_vcpus() caused the confusion: while the later is indeed 16-bit long, the later should accommodate all possible vCPUs).
Fixes: 7ee30bc132c6 ("KVM: x86: deliver KVM IOAPIC scan request to target vCPUs") Fixes: 9a2ae9f6b6bb ("KVM: x86: Zero the IOAPIC scan request dest vCPUs bitmap") Reported-by: Dr. David Alan Gilbert dgilbert@redhat.com Signed-off-by: Vitaly Kuznetsov vkuznets@redhat.com Reviewed-by: Maxim Levitsky mlevitsk@redhat.com Reviewed-by: Sean Christopherson seanjc@google.com Message-Id: 20210827092516.1027264-7-vkuznets@redhat.com Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kvm/ioapic.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-)
--- a/arch/x86/kvm/ioapic.c +++ b/arch/x86/kvm/ioapic.c @@ -319,8 +319,8 @@ static void ioapic_write_indirect(struct unsigned index; bool mask_before, mask_after; union kvm_ioapic_redirect_entry *e; - unsigned long vcpu_bitmap; int old_remote_irr, old_delivery_status, old_dest_id, old_dest_mode; + DECLARE_BITMAP(vcpu_bitmap, KVM_MAX_VCPUS);
switch (ioapic->ioregsel) { case IOAPIC_REG_VERSION: @@ -384,9 +384,9 @@ static void ioapic_write_indirect(struct irq.shorthand = APIC_DEST_NOSHORT; irq.dest_id = e->fields.dest_id; irq.msi_redir_hint = false; - bitmap_zero(&vcpu_bitmap, 16); + bitmap_zero(vcpu_bitmap, KVM_MAX_VCPUS); kvm_bitmap_or_dest_vcpus(ioapic->kvm, &irq, - &vcpu_bitmap); + vcpu_bitmap); if (old_dest_mode != e->fields.dest_mode || old_dest_id != e->fields.dest_id) { /* @@ -399,10 +399,10 @@ static void ioapic_write_indirect(struct kvm_lapic_irq_dest_mode( !!e->fields.dest_mode); kvm_bitmap_or_dest_vcpus(ioapic->kvm, &irq, - &vcpu_bitmap); + vcpu_bitmap); } kvm_make_scan_ioapic_request_mask(ioapic->kvm, - &vcpu_bitmap); + vcpu_bitmap); } else { kvm_make_scan_ioapic_request(ioapic->kvm); }
From: Maxim Levitsky mlevitsk@redhat.com
commit faf6b755629627f19feafa75b32e81cd7738f12d upstream.
These field correspond to features that we don't expose yet to L2
While currently there are no CVE worthy features in this field, if AMD adds more features to this field, that could allow guest escapes similar to CVE-2021-3653 and CVE-2021-3656.
Signed-off-by: Maxim Levitsky mlevitsk@redhat.com Message-Id: 20210914154825.104886-6-mlevitsk@redhat.com Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kvm/svm/nested.c | 1 - 1 file changed, 1 deletion(-)
--- a/arch/x86/kvm/svm/nested.c +++ b/arch/x86/kvm/svm/nested.c @@ -545,7 +545,6 @@ static void nested_vmcb02_prepare_contro (svm->nested.ctl.int_ctl & int_ctl_vmcb12_bits) | (svm->vmcb01.ptr->control.int_ctl & int_ctl_vmcb01_bits);
- svm->vmcb->control.virt_ext = svm->nested.ctl.virt_ext; svm->vmcb->control.int_vector = svm->nested.ctl.int_vector; svm->vmcb->control.int_state = svm->nested.ctl.int_state; svm->vmcb->control.event_inj = svm->nested.ctl.event_inj;
From: Sean Christopherson seanjc@google.com
commit 03a6e84069d1870f5b3d360e64cb330b66f76dee upstream.
Explicitly zero the guest's CR3 and mark it available+dirty at RESET/INIT. Per Intel's SDM and AMD's APM, CR3 is zeroed at both RESET and INIT. For RESET, this is a nop as vcpu is zero-allocated. For INIT, the bug has likely escaped notice because no firmware/kernel puts its page tables root at PA=0, let alone relies on INIT to get the desired CR3 for such page tables.
Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson seanjc@google.com Message-Id: 20210921000303.400537-3-seanjc@google.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kvm/x86.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -10873,6 +10873,9 @@ void kvm_vcpu_reset(struct kvm_vcpu *vcp
static_call(kvm_x86_vcpu_reset)(vcpu, init_event);
+ vcpu->arch.cr3 = 0; + kvm_register_mark_dirty(vcpu, VCPU_EXREG_CR3); + /* * Reset the MMU context if paging was enabled prior to INIT (which is * implied if CR0.PG=1 as CR0 will be '0' prior to RESET). Unlike the
From: Sean Christopherson seanjc@google.com
commit e8a747d0884e554a8c1872da6c8f680a4f893c6d upstream.
Check whether a CPUID entry's index is significant before checking for a matching index to hack-a-fix an undefined behavior bug due to consuming uninitialized data. RESET/INIT emulation uses kvm_cpuid() to retrieve CPUID.0x1, which does _not_ have a significant index, and fails to initialize the dummy variable that doubles as EBX/ECX/EDX output _and_ ECX, a.k.a. index, input.
Practically speaking, it's _extremely_ unlikely any compiler will yield code that causes problems, as the compiler would need to inline the kvm_cpuid() call to detect the uninitialized data, and intentionally hose the kernel, e.g. insert ud2, instead of simply ignoring the result of the index comparison.
Although the sketchy "dummy" pattern was introduced in SVM by commit 66f7b72e1171 ("KVM: x86: Make register state after reset conform to specification"), it wasn't actually broken until commit 7ff6c0350315 ("KVM: x86: Remove stateful CPUID handling") arbitrarily swapped the order of operations such that "index" was checked before the significant flag.
Avoid consuming uninitialized data by reverting to checking the flag before the index purely so that the fix can be easily backported; the offending RESET/INIT code has been refactored, moved, and consolidated from vendor code to common x86 since the bug was introduced. A future patch will directly address the bad RESET/INIT behavior.
The undefined behavior was detected by syzbot + KernelMemorySanitizer.
BUG: KMSAN: uninit-value in cpuid_entry2_find arch/x86/kvm/cpuid.c:68 BUG: KMSAN: uninit-value in kvm_find_cpuid_entry arch/x86/kvm/cpuid.c:1103 BUG: KMSAN: uninit-value in kvm_cpuid+0x456/0x28f0 arch/x86/kvm/cpuid.c:1183 cpuid_entry2_find arch/x86/kvm/cpuid.c:68 [inline] kvm_find_cpuid_entry arch/x86/kvm/cpuid.c:1103 [inline] kvm_cpuid+0x456/0x28f0 arch/x86/kvm/cpuid.c:1183 kvm_vcpu_reset+0x13fb/0x1c20 arch/x86/kvm/x86.c:10885 kvm_apic_accept_events+0x58f/0x8c0 arch/x86/kvm/lapic.c:2923 vcpu_enter_guest+0xfd2/0x6d80 arch/x86/kvm/x86.c:9534 vcpu_run+0x7f5/0x18d0 arch/x86/kvm/x86.c:9788 kvm_arch_vcpu_ioctl_run+0x245b/0x2d10 arch/x86/kvm/x86.c:10020
Local variable ----dummy@kvm_vcpu_reset created at: kvm_vcpu_reset+0x1fb/0x1c20 arch/x86/kvm/x86.c:10812 kvm_apic_accept_events+0x58f/0x8c0 arch/x86/kvm/lapic.c:2923
Reported-by: syzbot+f3985126b746b3d59c9d@syzkaller.appspotmail.com Reported-by: Alexander Potapenko glider@google.com Fixes: 2a24be79b6b7 ("KVM: VMX: Set EDX at INIT with CPUID.0x1, Family-Model-Stepping") Fixes: 7ff6c0350315 ("KVM: x86: Remove stateful CPUID handling") Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson seanjc@google.com Reviewed-by: Jim Mattson jmattson@google.com Message-Id: 20210929222426.1855730-2-seanjc@google.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kvm/cpuid.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/arch/x86/kvm/cpuid.c +++ b/arch/x86/kvm/cpuid.c @@ -65,8 +65,8 @@ static inline struct kvm_cpuid_entry2 *c for (i = 0; i < nent; i++) { e = &entries[i];
- if (e->function == function && (e->index == index || - !(e->flags & KVM_CPUID_FLAG_SIGNIFCANT_INDEX))) + if (e->function == function && + (!(e->flags & KVM_CPUID_FLAG_SIGNIFCANT_INDEX) || e->index == index)) return e; }
From: Vitaly Kuznetsov vkuznets@redhat.com
commit 8d68bad6d869fae8f4d50ab6423538dec7da72d1 upstream.
Windows Server 2022 with Hyper-V role enabled failed to boot on KVM when enlightened VMCS is advertised. Debugging revealed there are two exposed secondary controls it is not happy with: SECONDARY_EXEC_ENABLE_VMFUNC and SECONDARY_EXEC_SHADOW_VMCS. These controls are known to be unsupported, as there are no corresponding fields in eVMCSv1 (see the comment above EVMCS1_UNSUPPORTED_2NDEXEC definition).
Previously, commit 31de3d2500e4 ("x86/kvm/hyper-v: move VMX controls sanitization out of nested_enable_evmcs()") introduced the required filtering mechanism for VMX MSRs but for some reason put only known to be problematic (and not full EVMCS1_UNSUPPORTED_* lists) controls there.
Note, Windows Server 2022 seems to have gained some sanity check for VMX MSRs: it doesn't even try to launch a guest when there's something it doesn't like, nested_evmcs_check_controls() mechanism can't catch the problem.
Let's be bold this time and instead of playing whack-a-mole just filter out all unsupported controls from VMX MSRs.
Fixes: 31de3d2500e4 ("x86/kvm/hyper-v: move VMX controls sanitization out of nested_enable_evmcs()") Signed-off-by: Vitaly Kuznetsov vkuznets@redhat.com Message-Id: 20210907163530.110066-1-vkuznets@redhat.com Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kvm/vmx/evmcs.c | 12 +++++++++--- arch/x86/kvm/vmx/vmx.c | 9 +++++---- 2 files changed, 14 insertions(+), 7 deletions(-)
--- a/arch/x86/kvm/vmx/evmcs.c +++ b/arch/x86/kvm/vmx/evmcs.c @@ -354,14 +354,20 @@ void nested_evmcs_filter_control_msr(u32 switch (msr_index) { case MSR_IA32_VMX_EXIT_CTLS: case MSR_IA32_VMX_TRUE_EXIT_CTLS: - ctl_high &= ~VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL; + ctl_high &= ~EVMCS1_UNSUPPORTED_VMEXIT_CTRL; break; case MSR_IA32_VMX_ENTRY_CTLS: case MSR_IA32_VMX_TRUE_ENTRY_CTLS: - ctl_high &= ~VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL; + ctl_high &= ~EVMCS1_UNSUPPORTED_VMENTRY_CTRL; break; case MSR_IA32_VMX_PROCBASED_CTLS2: - ctl_high &= ~SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES; + ctl_high &= ~EVMCS1_UNSUPPORTED_2NDEXEC; + break; + case MSR_IA32_VMX_PINBASED_CTLS: + ctl_high &= ~EVMCS1_UNSUPPORTED_PINCTRL; + break; + case MSR_IA32_VMX_VMFUNC: + ctl_low &= ~EVMCS1_UNSUPPORTED_VMFUNC; break; }
--- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -1840,10 +1840,11 @@ static int vmx_get_msr(struct kvm_vcpu * &msr_info->data)) return 1; /* - * Enlightened VMCS v1 doesn't have certain fields, but buggy - * Hyper-V versions are still trying to use corresponding - * features when they are exposed. Filter out the essential - * minimum. + * Enlightened VMCS v1 doesn't have certain VMCS fields but + * instead of just ignoring the features, different Hyper-V + * versions are either trying to use them and fail or do some + * sanity checking and refuse to boot. Filter all unsupported + * features out. */ if (!msr_info->host_initiated && vmx->nested.enlightened_vmcs_enabled)
From: Peter Gonda pgonda@google.com
commit f43c887cb7cb5b66c4167d40a4209027f5fdb5ce upstream.
For mirroring SEV-ES the mirror VM will need more then just the ASID. The FD and the handle are required to all the mirror to call psp commands. The mirror VM will need to call KVM_SEV_LAUNCH_UPDATE_VMSA to setup its vCPUs' VMSAs for SEV-ES.
Signed-off-by: Peter Gonda pgonda@google.com Cc: Marc Orr marcorr@google.com Cc: Nathan Tempelman natet@google.com Cc: Paolo Bonzini pbonzini@redhat.com Cc: Sean Christopherson seanjc@google.com Cc: Steve Rutherford srutherford@google.com Cc: Brijesh Singh brijesh.singh@amd.com Cc: kvm@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: stable@vger.kernel.org Fixes: 54526d1fd593 ("KVM: x86: Support KVM VMs sharing SEV context", 2021-04-21) Message-Id: 20210921150345.2221634-2-pgonda@google.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kvm/svm/sev.c | 16 ++++++++++++---- 1 file changed, 12 insertions(+), 4 deletions(-)
--- a/arch/x86/kvm/svm/sev.c +++ b/arch/x86/kvm/svm/sev.c @@ -1716,8 +1716,7 @@ int svm_vm_copy_asid_from(struct kvm *kv { struct file *source_kvm_file; struct kvm *source_kvm; - struct kvm_sev_info *mirror_sev; - unsigned int asid; + struct kvm_sev_info source_sev, *mirror_sev; int ret;
source_kvm_file = fget(source_fd); @@ -1740,7 +1739,8 @@ int svm_vm_copy_asid_from(struct kvm *kv goto e_source_unlock; }
- asid = to_kvm_svm(source_kvm)->sev_info.asid; + memcpy(&source_sev, &to_kvm_svm(source_kvm)->sev_info, + sizeof(source_sev));
/* * The mirror kvm holds an enc_context_owner ref so its asid can't @@ -1760,8 +1760,16 @@ int svm_vm_copy_asid_from(struct kvm *kv /* Set enc_context_owner and copy its encryption context over */ mirror_sev = &to_kvm_svm(kvm)->sev_info; mirror_sev->enc_context_owner = source_kvm; - mirror_sev->asid = asid; mirror_sev->active = true; + mirror_sev->asid = source_sev.asid; + mirror_sev->fd = source_sev.fd; + mirror_sev->es_active = source_sev.es_active; + mirror_sev->handle = source_sev.handle; + /* + * Do not copy ap_jump_table. Since the mirror does not share the same + * KVM contexts as the original, and they may have different + * memory-views. + */
mutex_unlock(&kvm->lock); return 0;
From: Sean Christopherson seanjc@google.com
commit 50c038018d6be20361e8a2890262746a4ac5b11f upstream.
Require the target guest page to be writable when pinning memory for RECEIVE_UPDATE_DATA. Per the SEV API, the PSP writes to guest memory:
The result is then encrypted with GCTX.VEK and written to the memory pointed to by GUEST_PADDR field.
Fixes: 15fb7de1a7f5 ("KVM: SVM: Add KVM_SEV_RECEIVE_UPDATE_DATA command") Cc: stable@vger.kernel.org Cc: Peter Gonda pgonda@google.com Cc: Marc Orr marcorr@google.com Cc: Tom Lendacky thomas.lendacky@amd.com Cc: Brijesh Singh brijesh.singh@amd.com Signed-off-by: Sean Christopherson seanjc@google.com Message-Id: 20210914210951.2994260-2-seanjc@google.com Reviewed-by: Brijesh Singh brijesh.singh@amd.com Reviewed-by: Peter Gonda pgonda@google.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kvm/svm/sev.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/x86/kvm/svm/sev.c +++ b/arch/x86/kvm/svm/sev.c @@ -1465,7 +1465,7 @@ static int sev_receive_update_data(struc
/* Pin guest memory */ guest_page = sev_pin_memory(kvm, params.guest_uaddr & PAGE_MASK, - PAGE_SIZE, &n, 0); + PAGE_SIZE, &n, 1); if (IS_ERR(guest_page)) { ret = PTR_ERR(guest_page); goto e_free_trans;
From: Peter Gonda pgonda@google.com
commit bb18a677746543e7f5eeb478129c92cedb0f9658 upstream.
The update-VMSA ioctl touches data stored in struct kvm_vcpu, and therefore should not be performed concurrently with any VCPU ioctl that might cause KVM or the processor to use the same data.
Adds vcpu mutex guard to the VMSA updating code. Refactors out __sev_launch_update_vmsa() function to deal with per vCPU parts of sev_launch_update_vmsa().
Fixes: ad73109ae7ec ("KVM: SVM: Provide support to launch and run an SEV-ES guest") Signed-off-by: Peter Gonda pgonda@google.com Cc: Marc Orr marcorr@google.com Cc: Paolo Bonzini pbonzini@redhat.com Cc: Sean Christopherson seanjc@google.com Cc: Brijesh Singh brijesh.singh@amd.com Cc: kvm@vger.kernel.org Cc: stable@vger.kernel.org Cc: linux-kernel@vger.kernel.org Message-Id: 20210915171755.3773766-1-pgonda@google.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kvm/svm/sev.c | 53 +++++++++++++++++++++++++++---------------------- 1 file changed, 30 insertions(+), 23 deletions(-)
--- a/arch/x86/kvm/svm/sev.c +++ b/arch/x86/kvm/svm/sev.c @@ -596,43 +596,50 @@ static int sev_es_sync_vmsa(struct vcpu_ return 0; }
-static int sev_launch_update_vmsa(struct kvm *kvm, struct kvm_sev_cmd *argp) +static int __sev_launch_update_vmsa(struct kvm *kvm, struct kvm_vcpu *vcpu, + int *error) { - struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info; struct sev_data_launch_update_vmsa vmsa; + struct vcpu_svm *svm = to_svm(vcpu); + int ret; + + /* Perform some pre-encryption checks against the VMSA */ + ret = sev_es_sync_vmsa(svm); + if (ret) + return ret; + + /* + * The LAUNCH_UPDATE_VMSA command will perform in-place encryption of + * the VMSA memory content (i.e it will write the same memory region + * with the guest's key), so invalidate it first. + */ + clflush_cache_range(svm->vmsa, PAGE_SIZE); + + vmsa.reserved = 0; + vmsa.handle = to_kvm_svm(kvm)->sev_info.handle; + vmsa.address = __sme_pa(svm->vmsa); + vmsa.len = PAGE_SIZE; + return sev_issue_cmd(kvm, SEV_CMD_LAUNCH_UPDATE_VMSA, &vmsa, error); +} + +static int sev_launch_update_vmsa(struct kvm *kvm, struct kvm_sev_cmd *argp) +{ struct kvm_vcpu *vcpu; int i, ret;
if (!sev_es_guest(kvm)) return -ENOTTY;
- vmsa.reserved = 0; - kvm_for_each_vcpu(i, vcpu, kvm) { - struct vcpu_svm *svm = to_svm(vcpu); - - /* Perform some pre-encryption checks against the VMSA */ - ret = sev_es_sync_vmsa(svm); + ret = mutex_lock_killable(&vcpu->mutex); if (ret) return ret;
- /* - * The LAUNCH_UPDATE_VMSA command will perform in-place - * encryption of the VMSA memory content (i.e it will write - * the same memory region with the guest's key), so invalidate - * it first. - */ - clflush_cache_range(svm->vmsa, PAGE_SIZE); - - vmsa.handle = sev->handle; - vmsa.address = __sme_pa(svm->vmsa); - vmsa.len = PAGE_SIZE; - ret = sev_issue_cmd(kvm, SEV_CMD_LAUNCH_UPDATE_VMSA, &vmsa, - &argp->error); + ret = __sev_launch_update_vmsa(kvm, vcpu, &argp->error); + + mutex_unlock(&vcpu->mutex); if (ret) return ret; - - svm->vcpu.arch.guest_state_protected = true; }
return 0;
From: Peter Gonda pgonda@google.com
commit 5b92b6ca92b65bef811048c481e4446f4828500a upstream.
A mirrored SEV-ES VM will need to call KVM_SEV_LAUNCH_UPDATE_VMSA to setup its vCPUs and have them measured, and their VMSAs encrypted. Without this change, it is impossible to have mirror VMs as part of SEV-ES VMs.
Also allow the guest status check and debugging commands since they do not change any guest state.
Signed-off-by: Peter Gonda pgonda@google.com Cc: Marc Orr marcorr@google.com Cc: Nathan Tempelman natet@google.com Cc: Paolo Bonzini pbonzini@redhat.com Cc: Sean Christopherson seanjc@google.com Cc: Steve Rutherford srutherford@google.com Cc: Brijesh Singh brijesh.singh@amd.com Cc: kvm@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: stable@vger.kernel.org Fixes: 54526d1fd593 ("KVM: x86: Support KVM VMs sharing SEV context", 2021-04-21) Message-Id: 20210921150345.2221634-3-pgonda@google.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kvm/svm/sev.c | 19 +++++++++++++++++-- 1 file changed, 17 insertions(+), 2 deletions(-)
--- a/arch/x86/kvm/svm/sev.c +++ b/arch/x86/kvm/svm/sev.c @@ -1509,6 +1509,20 @@ static int sev_receive_finish(struct kvm return sev_issue_cmd(kvm, SEV_CMD_RECEIVE_FINISH, &data, &argp->error); }
+static bool cmd_allowed_from_miror(u32 cmd_id) +{ + /* + * Allow mirrors VM to call KVM_SEV_LAUNCH_UPDATE_VMSA to enable SEV-ES + * active mirror VMs. Also allow the debugging and status commands. + */ + if (cmd_id == KVM_SEV_LAUNCH_UPDATE_VMSA || + cmd_id == KVM_SEV_GUEST_STATUS || cmd_id == KVM_SEV_DBG_DECRYPT || + cmd_id == KVM_SEV_DBG_ENCRYPT) + return true; + + return false; +} + int svm_mem_enc_op(struct kvm *kvm, void __user *argp) { struct kvm_sev_cmd sev_cmd; @@ -1525,8 +1539,9 @@ int svm_mem_enc_op(struct kvm *kvm, void
mutex_lock(&kvm->lock);
- /* enc_context_owner handles all memory enc operations */ - if (is_mirroring_enc_context(kvm)) { + /* Only the enc_context_owner handles some memory enc operations. */ + if (is_mirroring_enc_context(kvm) && + !cmd_allowed_from_miror(sev_cmd.id)) { r = -EINVAL; goto out; }
From: Mingwei Zhang mizhang@google.com
commit f1815e0aa770f2127c5df31eb5c2f0e37b60fa77 upstream.
DECOMMISSION the current SEV context if binding an ASID fails after RECEIVE_START. Per AMD's SEV API, RECEIVE_START generates a new guest context and thus needs to be paired with DECOMMISSION:
The RECEIVE_START command is the only command other than the LAUNCH_START command that generates a new guest context and guest handle.
The missing DECOMMISSION can result in subsequent SEV launch failures, as the firmware leaks memory and might not able to allocate more SEV guest contexts in the future.
Note, LAUNCH_START suffered the same bug, but was previously fixed by commit 934002cd660b ("KVM: SVM: Call SEV Guest Decommission if ASID binding fails").
Cc: Alper Gun alpergun@google.com Cc: Borislav Petkov bp@alien8.de Cc: Brijesh Singh brijesh.singh@amd.com Cc: David Rienjes rientjes@google.com Cc: Marc Orr marcorr@google.com Cc: John Allen john.allen@amd.com Cc: Peter Gonda pgonda@google.com Cc: Sean Christopherson seanjc@google.com Cc: Tom Lendacky thomas.lendacky@amd.com Cc: Vipin Sharma vipinsh@google.com Cc: stable@vger.kernel.org Reviewed-by: Marc Orr marcorr@google.com Acked-by: Brijesh Singh brijesh.singh@amd.com Fixes: af43cbbf954b ("KVM: SVM: Add support for KVM_SEV_RECEIVE_START command") Signed-off-by: Mingwei Zhang mizhang@google.com Reviewed-by: Sean Christopherson seanjc@google.com Message-Id: 20210912181815.3899316-1-mizhang@google.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kvm/svm/sev.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
--- a/arch/x86/kvm/svm/sev.c +++ b/arch/x86/kvm/svm/sev.c @@ -1405,8 +1405,10 @@ static int sev_receive_start(struct kvm
/* Bind ASID to this guest */ ret = sev_bind_asid(kvm, start.handle, error); - if (ret) + if (ret) { + sev_decommission(start.handle); goto e_free_session; + }
params.handle = start.handle; if (copy_to_user((void __user *)(uintptr_t)argp->data,
From: Chenyi Qiang chenyi.qiang@intel.com
commit 24a996ade34d00deef5dee2c33aacd8fda91ec31 upstream.
Nested bus lock VM exits are not supported yet. If L2 triggers bus lock VM exit, it will be directed to L1 VMM, which would cause unexpected behavior. Therefore, handle L2's bus lock VM exits in L0 directly.
Fixes: fe6b6bc802b4 ("KVM: VMX: Enable bus lock VM exit") Signed-off-by: Chenyi Qiang chenyi.qiang@intel.com Reviewed-by: Sean Christopherson seanjc@google.com Reviewed-by: Xiaoyao Li xiaoyao.li@intel.com Message-Id: 20210914095041.29764-1-chenyi.qiang@intel.com Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kvm/vmx/nested.c | 6 ++++++ 1 file changed, 6 insertions(+)
--- a/arch/x86/kvm/vmx/nested.c +++ b/arch/x86/kvm/vmx/nested.c @@ -5898,6 +5898,12 @@ static bool nested_vmx_l0_wants_exit(str case EXIT_REASON_VMFUNC: /* VM functions are emulated through L2->L0 vmexits. */ return true; + case EXIT_REASON_BUS_LOCK: + /* + * At present, bus lock VM exit is never exposed to L1. + * Handle L2's bus locks in L0 directly. + */ + return true; default: break; }
From: Zhenzhong Duan zhenzhong.duan@intel.com
commit 5c49d1850ddd3240d20dc40b01f593e35a184f38 upstream.
When updating the host's mask for its MSR_IA32_TSX_CTRL user return entry, clear the mask in the found uret MSR instead of vmx->guest_uret_msrs[i]. Modifying guest_uret_msrs directly is completely broken as 'i' does not point at the MSR_IA32_TSX_CTRL entry. In fact, it's guaranteed to be an out-of-bounds accesses as is always set to kvm_nr_uret_msrs in a prior loop. By sheer dumb luck, the fallout is limited to "only" failing to preserve the host's TSX_CTRL_CPUID_CLEAR. The out-of-bounds access is benign as it's guaranteed to clear a bit in a guest MSR value, which are always zero at vCPU creation on both x86-64 and i386.
Cc: stable@vger.kernel.org Fixes: 8ea8b8d6f869 ("KVM: VMX: Use common x86's uret MSR list as the one true list") Signed-off-by: Zhenzhong Duan zhenzhong.duan@intel.com Reviewed-by: Sean Christopherson seanjc@google.com Message-Id: 20210926015545.281083-1-zhenzhong.duan@intel.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kvm/vmx/vmx.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -6816,7 +6816,7 @@ static int vmx_create_vcpu(struct kvm_vc */ tsx_ctrl = vmx_find_uret_msr(vmx, MSR_IA32_TSX_CTRL); if (tsx_ctrl) - vmx->guest_uret_msrs[i].mask = ~(u64)TSX_CTRL_CPUID_CLEAR; + tsx_ctrl->mask = ~(u64)TSX_CTRL_CPUID_CLEAR; }
err = alloc_loaded_vmcs(&vmx->vmcs01);
From: Wolfram Sang wsa+renesas@sang-engineering.com
commit b81bede4d138ce62f7342e27bf55ac93c8071818 upstream.
Old SDHI instances have a default value for the reset register which keeps it in reset state by default. So, when applying a hard reset we need to manually leave the soft reset state as well. Later SDHI instances have a different default value, the one we write manually now.
Fixes: b4d86f37eacb ("mmc: renesas_sdhi: do hard reset if possible") Signed-off-by: Wolfram Sang wsa+renesas@sang-engineering.com Tested-by: Geert Uytterhoeven geert+renesas@glider.be Reported-by: Geert Uytterhoeven geert+renesas@glider.be Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20210826082107.47299-1-wsa+renesas@sang-engineerin... Signed-off-by: Ulf Hansson ulf.hansson@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/mmc/host/renesas_sdhi_core.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/drivers/mmc/host/renesas_sdhi_core.c +++ b/drivers/mmc/host/renesas_sdhi_core.c @@ -582,6 +582,8 @@ static void renesas_sdhi_reset(struct tm /* Unknown why but without polling reset status, it will hang */ read_poll_timeout(reset_control_status, ret, ret == 0, 1, 100, false, priv->rstc); + /* At least SDHI_VER_GEN2_SDR50 needs manual release of reset */ + sd_ctrl_write16(host, CTL_RESET_SD, 0x0001); priv->needs_adjust_hs400 = false; renesas_sdhi_set_clock(host, host->clk_cache); } else if (priv->scc_ctl) {
From: Sean Young sean@mess.org
commit f0c15b360fb65ee39849afe987c16eb3d0175d0d upstream.
If the IR Toy is receiving IR while a transmit is done, it may end up hanging. We can prevent this from happening by re-entering sample mode just before issuing the transmit command.
Link: https://github.com/bengtmartensson/HarcHardware/discussions/25
Cc: stable@vger.kernel.org Signed-off-by: Sean Young sean@mess.org Signed-off-by: Mauro Carvalho Chehab mchehab+huawei@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/media/rc/ir_toy.c | 21 ++++++++++++++++++++- 1 file changed, 20 insertions(+), 1 deletion(-)
--- a/drivers/media/rc/ir_toy.c +++ b/drivers/media/rc/ir_toy.c @@ -24,6 +24,7 @@ static const u8 COMMAND_VERSION[] = { 'v // End transmit and repeat reset command so we exit sump mode static const u8 COMMAND_RESET[] = { 0xff, 0xff, 0, 0, 0, 0, 0 }; static const u8 COMMAND_SMODE_ENTER[] = { 's' }; +static const u8 COMMAND_SMODE_EXIT[] = { 0 }; static const u8 COMMAND_TXSTART[] = { 0x26, 0x24, 0x25, 0x03 };
#define REPLY_XMITCOUNT 't' @@ -309,12 +310,30 @@ static int irtoy_tx(struct rc_dev *rc, u buf[i] = cpu_to_be16(v); }
- buf[count] = cpu_to_be16(0xffff); + buf[count] = 0xffff;
irtoy->tx_buf = buf; irtoy->tx_len = size; irtoy->emitted = 0;
+ // There is an issue where if the unit is receiving IR while the + // first TXSTART command is sent, the device might end up hanging + // with its led on. It does not respond to any command when this + // happens. To work around this, re-enter sample mode. + err = irtoy_command(irtoy, COMMAND_SMODE_EXIT, + sizeof(COMMAND_SMODE_EXIT), STATE_RESET); + if (err) { + dev_err(irtoy->dev, "exit sample mode: %d\n", err); + return err; + } + + err = irtoy_command(irtoy, COMMAND_SMODE_ENTER, + sizeof(COMMAND_SMODE_ENTER), STATE_COMMAND); + if (err) { + dev_err(irtoy->dev, "enter sample mode: %d\n", err); + return err; + } + err = irtoy_command(irtoy, COMMAND_TXSTART, sizeof(COMMAND_TXSTART), STATE_TX); kfree(buf);
From: Jason Gunthorpe jgg@nvidia.com
commit bc0bdc5afaa740d782fbf936aaeebd65e5c2921d upstream.
If the state is not idle then rdma_bind_addr() will immediately fail and no change to global state should happen.
For instance if the state is already RDMA_CM_LISTEN then this will corrupt the src_addr and would cause the test in cma_cancel_operation():
if (cma_any_addr(cma_src_addr(id_priv)) && !id_priv->cma_dev)
To view a mangled src_addr, eg with a IPv6 loopback address but an IPv4 family, failing the test.
This would manifest as this trace from syzkaller:
BUG: KASAN: use-after-free in __list_add_valid+0x93/0xa0 lib/list_debug.c:26 Read of size 8 at addr ffff8881546491e0 by task syz-executor.1/32204
CPU: 1 PID: 32204 Comm: syz-executor.1 Not tainted 5.12.0-rc8-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:79 [inline] dump_stack+0x141/0x1d7 lib/dump_stack.c:120 print_address_description.constprop.0.cold+0x5b/0x2f8 mm/kasan/report.c:232 __kasan_report mm/kasan/report.c:399 [inline] kasan_report.cold+0x7c/0xd8 mm/kasan/report.c:416 __list_add_valid+0x93/0xa0 lib/list_debug.c:26 __list_add include/linux/list.h:67 [inline] list_add_tail include/linux/list.h:100 [inline] cma_listen_on_all drivers/infiniband/core/cma.c:2557 [inline] rdma_listen+0x787/0xe00 drivers/infiniband/core/cma.c:3751 ucma_listen+0x16a/0x210 drivers/infiniband/core/ucma.c:1102 ucma_write+0x259/0x350 drivers/infiniband/core/ucma.c:1732 vfs_write+0x28e/0xa30 fs/read_write.c:603 ksys_write+0x1ee/0x250 fs/read_write.c:658 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x44/0xae
Which is indicating that an rdma_id_private was destroyed without doing cma_cancel_listens().
Instead of trying to re-use the src_addr memory to indirectly create an any address build one explicitly on the stack and bind to that as any other normal flow would do.
Link: https://lore.kernel.org/r/0-v1-9fbb33f5e201+2a-cma_listen_jgg@nvidia.com Cc: stable@vger.kernel.org Fixes: 732d41c545bb ("RDMA/cma: Make the locking for automatic state transition more clear") Reported-by: syzbot+6bb0528b13611047209c@syzkaller.appspotmail.com Tested-by: Hao Sun sunhao.th@gmail.com Reviewed-by: Leon Romanovsky leonro@nvidia.com Signed-off-by: Jason Gunthorpe jgg@nvidia.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/infiniband/core/cma.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
--- a/drivers/infiniband/core/cma.c +++ b/drivers/infiniband/core/cma.c @@ -3768,9 +3768,13 @@ int rdma_listen(struct rdma_cm_id *id, i int ret;
if (!cma_comp_exch(id_priv, RDMA_CM_ADDR_BOUND, RDMA_CM_LISTEN)) { + struct sockaddr_in any_in = { + .sin_family = AF_INET, + .sin_addr.s_addr = htonl(INADDR_ANY), + }; + /* For a well behaved ULP state will be RDMA_CM_IDLE */ - id->route.addr.src_addr.ss_family = AF_INET; - ret = rdma_bind_addr(id, cma_src_addr(id_priv)); + ret = rdma_bind_addr(id, (struct sockaddr *)&any_in); if (ret) return ret; if (WARN_ON(!cma_comp_exch(id_priv, RDMA_CM_ADDR_BOUND,
From: Jason Gunthorpe jgg@nvidia.com
commit 305d568b72f17f674155a2a8275f865f207b3808 upstream.
The FSM can run in a circle allowing rdma_resolve_ip() to be called twice on the same id_priv. While this cannot happen without going through the work, it violates the invariant that the same address resolution background request cannot be active twice.
CPU 1 CPU 2
rdma_resolve_addr(): RDMA_CM_IDLE -> RDMA_CM_ADDR_QUERY rdma_resolve_ip(addr_handler) #1
process_one_req(): for #1 addr_handler(): RDMA_CM_ADDR_QUERY -> RDMA_CM_ADDR_BOUND mutex_unlock(&id_priv->handler_mutex); [.. handler still running ..]
rdma_resolve_addr(): RDMA_CM_ADDR_BOUND -> RDMA_CM_ADDR_QUERY rdma_resolve_ip(addr_handler) !! two requests are now on the req_list
rdma_destroy_id(): destroy_id_handler_unlock(): _destroy_id(): cma_cancel_operation(): rdma_addr_cancel()
// process_one_req() self removes it spin_lock_bh(&lock); cancel_delayed_work(&req->work); if (!list_empty(&req->list)) == true
! rdma_addr_cancel() returns after process_on_req #1 is done
kfree(id_priv)
process_one_req(): for #2 addr_handler(): mutex_lock(&id_priv->handler_mutex); !! Use after free on id_priv
rdma_addr_cancel() expects there to be one req on the list and only cancels the first one. The self-removal behavior of the work only happens after the handler has returned. This yields a situations where the req_list can have two reqs for the same "handle" but rdma_addr_cancel() only cancels the first one.
The second req remains active beyond rdma_destroy_id() and will use-after-free id_priv once it inevitably triggers.
Fix this by remembering if the id_priv has called rdma_resolve_ip() and always cancel before calling it again. This ensures the req_list never gets more than one item in it and doesn't cost anything in the normal flow that never uses this strange error path.
Link: https://lore.kernel.org/r/0-v1-3bc675b8006d+22-syz_cancel_uaf_jgg@nvidia.com Cc: stable@vger.kernel.org Fixes: e51060f08a61 ("IB: IP address based RDMA connection manager") Reported-by: syzbot+dc3dfba010d7671e05f5@syzkaller.appspotmail.com Signed-off-by: Jason Gunthorpe jgg@nvidia.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/infiniband/core/cma.c | 23 +++++++++++++++++++++++ drivers/infiniband/core/cma_priv.h | 1 + 2 files changed, 24 insertions(+)
--- a/drivers/infiniband/core/cma.c +++ b/drivers/infiniband/core/cma.c @@ -1776,6 +1776,14 @@ static void cma_cancel_operation(struct { switch (state) { case RDMA_CM_ADDR_QUERY: + /* + * We can avoid doing the rdma_addr_cancel() based on state, + * only RDMA_CM_ADDR_QUERY has a work that could still execute. + * Notice that the addr_handler work could still be exiting + * outside this state, however due to the interaction with the + * handler_mutex the work is guaranteed not to touch id_priv + * during exit. + */ rdma_addr_cancel(&id_priv->id.route.addr.dev_addr); break; case RDMA_CM_ROUTE_QUERY: @@ -3410,6 +3418,21 @@ int rdma_resolve_addr(struct rdma_cm_id if (dst_addr->sa_family == AF_IB) { ret = cma_resolve_ib_addr(id_priv); } else { + /* + * The FSM can return back to RDMA_CM_ADDR_BOUND after + * rdma_resolve_ip() is called, eg through the error + * path in addr_handler(). If this happens the existing + * request must be canceled before issuing a new one. + * Since canceling a request is a bit slow and this + * oddball path is rare, keep track once a request has + * been issued. The track turns out to be a permanent + * state since this is the only cancel as it is + * immediately before rdma_resolve_ip(). + */ + if (id_priv->used_resolve_ip) + rdma_addr_cancel(&id->route.addr.dev_addr); + else + id_priv->used_resolve_ip = 1; ret = rdma_resolve_ip(cma_src_addr(id_priv), dst_addr, &id->route.addr.dev_addr, timeout_ms, addr_handler, --- a/drivers/infiniband/core/cma_priv.h +++ b/drivers/infiniband/core/cma_priv.h @@ -91,6 +91,7 @@ struct rdma_id_private { u8 afonly; u8 timeout; u8 min_rnr_timer; + u8 used_resolve_ip; enum ib_gid_type gid_type;
/*
From: Nick Desaulniers ndesaulniers@google.com
commit 41e76c6a3c83c85e849f10754b8632ea763d9be4 upstream.
commit fad7cd3310db ("nbd: add the check to prevent overflow in __nbd_ioctl()") raised an issue from the fallback helpers added in commit f0907827a8a9 ("compiler.h: enable builtin overflow checkers and add fallback code")
ERROR: modpost: "__divdi3" [drivers/block/nbd.ko] undefined!
As Stephen Rothwell notes: The added check_mul_overflow() call is being passed 64 bit values. COMPILER_HAS_GENERIC_BUILTIN_OVERFLOW is not set for this build (see include/linux/overflow.h).
Specifically, the helpers for checking whether the results of a multiplication overflowed (__unsigned_mul_overflow, __signed_add_overflow) use the division operator when !COMPILER_HAS_GENERIC_BUILTIN_OVERFLOW. This is problematic for 64b operands on 32b hosts.
This was fixed upstream by commit 76ae847497bc ("Documentation: raise minimum supported version of GCC to 5.1") which is not suitable to be backported to stable.
Further, __builtin_mul_overflow() would emit a libcall to a compiler-rt-only symbol when compiling with clang < 14 for 32b targets.
ld.lld: error: undefined symbol: __mulodi4
In order to keep stable buildable with GCC 4.9 and clang < 14, modify struct nbd_config to instead track the number of bits of the block size; reconstructing the block size using runtime checked shifts that are not problematic for those compilers and in a ways that can be backported to stable.
In nbd_set_size, we do validate that the value of blksize must be a power of two (POT) and is in the range of [512, PAGE_SIZE] (both inclusive).
This does modify the debugfs interface.
Cc: stable@vger.kernel.org Cc: Arnd Bergmann arnd@kernel.org Cc: Rasmus Villemoes linux@rasmusvillemoes.dk Link: https://github.com/ClangBuiltLinux/linux/issues/1438 Link: https://lore.kernel.org/all/20210909182525.372ee687@canb.auug.org.au/ Link: https://lore.kernel.org/stable/CAHk-=whiQBofgis_rkniz8GBP9wZtSZdcDEffgSLO62B... Reported-by: Naresh Kamboju naresh.kamboju@linaro.org Reported-by: Nathan Chancellor nathan@kernel.org Reported-by: Stephen Rothwell sfr@canb.auug.org.au Suggested-by: Kees Cook keescook@chromium.org Suggested-by: Linus Torvalds torvalds@linux-foundation.org Suggested-by: Pavel Machek pavel@ucw.cz Signed-off-by: Nick Desaulniers ndesaulniers@google.com Reviewed-by: Josef Bacik josef@toxicpanda.com Link: https://lore.kernel.org/r/20210920232533.4092046-1-ndesaulniers@google.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/block/nbd.c | 29 +++++++++++++++++------------ 1 file changed, 17 insertions(+), 12 deletions(-)
--- a/drivers/block/nbd.c +++ b/drivers/block/nbd.c @@ -97,13 +97,18 @@ struct nbd_config {
atomic_t recv_threads; wait_queue_head_t recv_wq; - loff_t blksize; + unsigned int blksize_bits; loff_t bytesize; #if IS_ENABLED(CONFIG_DEBUG_FS) struct dentry *dbg_dir; #endif };
+static inline unsigned int nbd_blksize(struct nbd_config *config) +{ + return 1u << config->blksize_bits; +} + struct nbd_device { struct blk_mq_tag_set tag_set;
@@ -147,7 +152,7 @@ static struct dentry *nbd_dbg_dir;
#define NBD_MAGIC 0x68797548
-#define NBD_DEF_BLKSIZE 1024 +#define NBD_DEF_BLKSIZE_BITS 10
static unsigned int nbds_max = 16; static int max_part = 16; @@ -350,12 +355,12 @@ static int nbd_set_size(struct nbd_devic loff_t blksize) { if (!blksize) - blksize = NBD_DEF_BLKSIZE; + blksize = 1u << NBD_DEF_BLKSIZE_BITS; if (blksize < 512 || blksize > PAGE_SIZE || !is_power_of_2(blksize)) return -EINVAL;
nbd->config->bytesize = bytesize; - nbd->config->blksize = blksize; + nbd->config->blksize_bits = __ffs(blksize);
if (!nbd->task_recv) return 0; @@ -1370,7 +1375,7 @@ static int nbd_start_device(struct nbd_d args->index = i; queue_work(nbd->recv_workq, &args->work); } - return nbd_set_size(nbd, config->bytesize, config->blksize); + return nbd_set_size(nbd, config->bytesize, nbd_blksize(config)); }
static int nbd_start_device_ioctl(struct nbd_device *nbd, struct block_device *bdev) @@ -1439,11 +1444,11 @@ static int __nbd_ioctl(struct block_devi case NBD_SET_BLKSIZE: return nbd_set_size(nbd, config->bytesize, arg); case NBD_SET_SIZE: - return nbd_set_size(nbd, arg, config->blksize); + return nbd_set_size(nbd, arg, nbd_blksize(config)); case NBD_SET_SIZE_BLOCKS: - if (check_mul_overflow((loff_t)arg, config->blksize, &bytesize)) + if (check_shl_overflow(arg, config->blksize_bits, &bytesize)) return -EINVAL; - return nbd_set_size(nbd, bytesize, config->blksize); + return nbd_set_size(nbd, bytesize, nbd_blksize(config)); case NBD_SET_TIMEOUT: nbd_set_cmd_timeout(nbd, arg); return 0; @@ -1509,7 +1514,7 @@ static struct nbd_config *nbd_alloc_conf atomic_set(&config->recv_threads, 0); init_waitqueue_head(&config->recv_wq); init_waitqueue_head(&config->conn_wait); - config->blksize = NBD_DEF_BLKSIZE; + config->blksize_bits = NBD_DEF_BLKSIZE_BITS; atomic_set(&config->live_connections, 0); try_module_get(THIS_MODULE); return config; @@ -1637,7 +1642,7 @@ static int nbd_dev_dbg_init(struct nbd_d debugfs_create_file("tasks", 0444, dir, nbd, &nbd_dbg_tasks_fops); debugfs_create_u64("size_bytes", 0444, dir, &config->bytesize); debugfs_create_u32("timeout", 0444, dir, &nbd->tag_set.timeout); - debugfs_create_u64("blocksize", 0444, dir, &config->blksize); + debugfs_create_u32("blocksize_bits", 0444, dir, &config->blksize_bits); debugfs_create_file("flags", 0444, dir, nbd, &nbd_dbg_flags_fops);
return 0; @@ -1841,7 +1846,7 @@ nbd_device_policy[NBD_DEVICE_ATTR_MAX + static int nbd_genl_size_set(struct genl_info *info, struct nbd_device *nbd) { struct nbd_config *config = nbd->config; - u64 bsize = config->blksize; + u64 bsize = nbd_blksize(config); u64 bytes = config->bytesize;
if (info->attrs[NBD_ATTR_SIZE_BYTES]) @@ -1850,7 +1855,7 @@ static int nbd_genl_size_set(struct genl if (info->attrs[NBD_ATTR_BLOCK_SIZE_BYTES]) bsize = nla_get_u64(info->attrs[NBD_ATTR_BLOCK_SIZE_BYTES]);
- if (bytes != config->bytesize || bsize != config->blksize) + if (bytes != config->bytesize || bsize != nbd_blksize(config)) return nbd_set_size(nbd, bytes, bsize); return 0; }
From: Josip Pavic Josip.Pavic@amd.com
commit 467a51b69d0828887fb1b6719159a6b16da688f8 upstream.
[Why] Stack variable params.backlight_ramping_override is uninitialized, so it contains junk data
[How] Initialize the variable to false
Reviewed-by: Roman Li Roman.Li@amd.com Acked-by: Anson Jacob Anson.Jacob@amd.com Signed-off-by: Josip Pavic Josip.Pavic@amd.com Tested-by: Daniel Wheeler daniel.wheeler@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c @@ -1724,6 +1724,7 @@ static int dm_late_init(void *handle) linear_lut[i] = 0xFFFF * i / 15;
params.set = 0; + params.backlight_ramping_override = false; params.backlight_ramping_start = 0xCCCC; params.backlight_ramping_reduction = 0xCCCCCCCC; params.backlight_lut_array_size = 16;
From: Charlene Liu Charlene.Liu@amd.com
commit d942856865c733ff60450de9691af796ad71d7bc upstream.
[why] pci deviceid not passed to dal dc, without proper break, dcn2.x falls into dcn3.x code path
[how] pass in pci deviceid, and break once dal_version initialized.
Reviewed-by: Zhan Liu Zhan.Liu@amd.com Acked-by: Anson Jacob Anson.Jacob@amd.com Signed-off-by: Charlene Liu Charlene.Liu@amd.com Tested-by: Daniel Wheeler daniel.wheeler@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c @@ -1117,6 +1117,7 @@ static int amdgpu_dm_init(struct amdgpu_
init_data.asic_id.pci_revision_id = adev->pdev->revision; init_data.asic_id.hw_internal_rev = adev->external_rev_id; + init_data.asic_id.chip_id = adev->pdev->device;
init_data.asic_id.vram_width = adev->gmc.vram_width; /* TODO: initialize init_data.asic_id.vram_type here!!!! */
From: Praful Swarnakar Praful.Swarnakar@amd.com
commit 083fa05bbaf65a01866b5440031c822e32ad7510 upstream.
[Why] ASSR is dependent on Signed PSP Verstage to enable Content Protection for eDP panels. Unsigned PSP verstage is used during development phase causing ASSR to FAIL. As a result, link training is performed with DP_PANEL_MODE_DEFAULT instead of DP_PANEL_MODE_EDP for eDP panels that causes display flicker on some panels.
[How] - Do not change panel mode, if ASSR is disabled - Just report and continue to perform eDP link training with right settings further.
Signed-off-by: Praful Swarnakar Praful.Swarnakar@amd.com Reviewed-by: Harry Wentland harry.wentland@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c | 15 +++++++-------- 1 file changed, 7 insertions(+), 8 deletions(-)
--- a/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c +++ b/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c @@ -1813,14 +1813,13 @@ bool perform_link_training_with_retries( if (panel_mode == DP_PANEL_MODE_EDP) { struct cp_psp *cp_psp = &stream->ctx->cp_psp;
- if (cp_psp && cp_psp->funcs.enable_assr) { - if (!cp_psp->funcs.enable_assr(cp_psp->handle, link)) { - /* since eDP implies ASSR on, change panel - * mode to disable ASSR - */ - panel_mode = DP_PANEL_MODE_DEFAULT; - } - } + if (cp_psp && cp_psp->funcs.enable_assr) + /* ASSR is bound to fail with unsigned PSP + * verstage used during devlopment phase. + * Report and continue with eDP panel mode to + * perform eDP link training with right settings + */ + cp_psp->funcs.enable_assr(cp_psp->handle, link); } #endif
From: Prike Liang Prike.Liang@amd.com
commit 26db706a6d77b9e184feb11725e97e53b7a89519 upstream.
In the s2idle stress test sdma resume fail occasionally,in the failed case GPU is in the gfxoff state.This issue may introduce by firmware miss handle doorbell S/R and now temporary fix the issue by forcing exit gfxoff for sdma resume.
Signed-off-by: Prike Liang Prike.Liang@amd.com Reviewed-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c | 8 ++++++++ 1 file changed, 8 insertions(+)
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c @@ -883,6 +883,12 @@ static int sdma_v5_2_start(struct amdgpu msleep(1000); }
+ /* TODO: check whether can submit a doorbell request to raise + * a doorbell fence to exit gfxoff. + */ + if (adev->in_s0ix) + amdgpu_gfx_off_ctrl(adev, false); + sdma_v5_2_soft_reset(adev); /* unhalt the MEs */ sdma_v5_2_enable(adev, true); @@ -891,6 +897,8 @@ static int sdma_v5_2_start(struct amdgpu
/* start the gfx rings and rlc compute queues */ r = sdma_v5_2_gfx_resume(adev); + if (adev->in_s0ix) + amdgpu_gfx_off_ctrl(adev, true); if (r) return r; r = sdma_v5_2_rlc_resume(adev);
From: Simon Ser contact@emersion.fr
commit 98122e63a7ecc08c4172a17d97a06ef5536eb268 upstream.
On GFX9+, format modifiers are always enabled and ensure the frame-buffers can be scanned out at ADDFB2 time.
On GFX8-, format modifiers are not supported and no other check is performed. This means ADDFB2 IOCTLs will succeed even if the tiling isn't supported for scan-out, and will result in garbage displayed on screen [1].
Fix this by adding a check for tiling flags for GFX8 and older. The check is taken from radeonsi in Mesa (see how is_displayable is populated in gfx6_compute_surface).
Changes in v2: use drm_WARN_ONCE instead of drm_WARN (Michel)
[1]: https://github.com/swaywm/wlroots/issues/3185
Signed-off-by: Simon Ser contact@emersion.fr Acked-by: Michel Dänzer mdaenzer@redhat.com Cc: Alex Deucher alexander.deucher@amd.com Cc: Harry Wentland hwentlan@amd.com Cc: Nicholas Kazlauskas Nicholas.Kazlauskas@amd.com Cc: Bas Nieuwenhuizen bas@basnieuwenhuizen.nl Signed-off-by: Alex Deucher alexander.deucher@amd.com Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/amdgpu/amdgpu_display.c | 31 ++++++++++++++++++++++++++++ 1 file changed, 31 insertions(+)
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c @@ -837,6 +837,28 @@ static int convert_tiling_flags_to_modif return 0; }
+/* Mirrors the is_displayable check in radeonsi's gfx6_compute_surface */ +static int check_tiling_flags_gfx6(struct amdgpu_framebuffer *afb) +{ + u64 micro_tile_mode; + + /* Zero swizzle mode means linear */ + if (AMDGPU_TILING_GET(afb->tiling_flags, SWIZZLE_MODE) == 0) + return 0; + + micro_tile_mode = AMDGPU_TILING_GET(afb->tiling_flags, MICRO_TILE_MODE); + switch (micro_tile_mode) { + case 0: /* DISPLAY */ + case 3: /* RENDER */ + return 0; + default: + drm_dbg_kms(afb->base.dev, + "Micro tile mode %llu not supported for scanout\n", + micro_tile_mode); + return -EINVAL; + } +} + static void get_block_dimensions(unsigned int block_log2, unsigned int cpp, unsigned int *width, unsigned int *height) { @@ -1103,6 +1125,7 @@ int amdgpu_display_framebuffer_init(stru const struct drm_mode_fb_cmd2 *mode_cmd, struct drm_gem_object *obj) { + struct amdgpu_device *adev = drm_to_adev(dev); int ret, i;
/* @@ -1122,6 +1145,14 @@ int amdgpu_display_framebuffer_init(stru if (ret) return ret;
+ if (!dev->mode_config.allow_fb_modifiers) { + drm_WARN_ONCE(dev, adev->family >= AMDGPU_FAMILY_AI, + "GFX9+ requires FB check based on format modifier\n"); + ret = check_tiling_flags_gfx6(rfb); + if (ret) + return ret; + } + if (dev->mode_config.allow_fb_modifiers && !(rfb->base.flags & DRM_MODE_FB_MODIFIERS)) { ret = convert_tiling_flags_to_modifier(rfb);
From: Hawking Zhang Hawking.Zhang@amd.com
commit 9f52c25f59b504a29dda42d83ac1e24d2af535d4 upstream.
didn't read the value of mmCP_HQD_QUANTUM from correct register offset
Signed-off-by: Hawking Zhang Hawking.Zhang@amd.com Reviewed-by: Le Ma Le.Ma@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c @@ -3598,7 +3598,7 @@ static int gfx_v9_0_mqd_init(struct amdg
/* set static priority for a queue/ring */ gfx_v9_0_mqd_set_priority(ring, mqd); - mqd->cp_hqd_quantum = RREG32(mmCP_HQD_QUANTUM); + mqd->cp_hqd_quantum = RREG32_SOC15(GC, 0, mmCP_HQD_QUANTUM);
/* map_queues packet doesn't need activate the queue, * so only kiq need set this field.
From: Shawn Guo shawn.guo@linaro.org
[ Upstream commit a06c2e5c048e5e07fac9daf3073bd0b6582913c7 ]
The id of slv_cnoc_mnoc_cfg node is mistakenly coded as id of slv_blsp_1. It causes the following warning on slv_blsp_1 node adding. Correct the id of slv_cnoc_mnoc_cfg node.
[ 1.948180] ------------[ cut here ]------------ [ 1.954122] WARNING: CPU: 2 PID: 7 at drivers/interconnect/core.c:962 icc_node_add+0xe4/0xf8 [ 1.958994] Modules linked in: [ 1.967399] CPU: 2 PID: 7 Comm: kworker/u16:0 Not tainted 5.14.0-rc6-next-20210818 #21 [ 1.970275] Hardware name: Xiaomi Redmi Note 7 (DT) [ 1.978169] Workqueue: events_unbound deferred_probe_work_func [ 1.982945] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 1.988849] pc : icc_node_add+0xe4/0xf8 [ 1.995699] lr : qnoc_probe+0x350/0x438 [ 1.999519] sp : ffff80001008bb10 [ 2.003337] x29: ffff80001008bb10 x28: 000000000000001a x27: ffffb83ddc61ee28 [ 2.006818] x26: ffff2fe341d44080 x25: ffff2fe340f3aa80 x24: ffffb83ddc98f0e8 [ 2.013938] x23: 0000000000000024 x22: ffff2fe3408b7400 x21: 0000000000000000 [ 2.021054] x20: ffff2fe3408b7410 x19: ffff2fe341d44080 x18: 0000000000000010 [ 2.028173] x17: ffff2fe3bdd0aac0 x16: 0000000000000281 x15: ffff2fe3400f5528 [ 2.035290] x14: 000000000000013f x13: ffff2fe3400f5528 x12: 00000000ffffffea [ 2.042410] x11: ffffb83ddc9109d0 x10: ffffb83ddc8f8990 x9 : ffffb83ddc8f89e8 [ 2.049527] x8 : 0000000000017fe8 x7 : c0000000ffffefff x6 : 0000000000000001 [ 2.056645] x5 : 0000000000057fa8 x4 : 0000000000000000 x3 : ffffb83ddc9903b0 [ 2.063764] x2 : 1a1f6fde34d45500 x1 : ffff2fe340f3a880 x0 : ffff2fe340f3a880 [ 2.070882] Call trace: [ 2.077989] icc_node_add+0xe4/0xf8 [ 2.080247] qnoc_probe+0x350/0x438 [ 2.083718] platform_probe+0x68/0xd8 [ 2.087191] really_probe+0xb8/0x300 [ 2.091011] __driver_probe_device+0x78/0xe0 [ 2.094659] driver_probe_device+0x80/0x110 [ 2.098911] __device_attach_driver+0x90/0xe0 [ 2.102818] bus_for_each_drv+0x78/0xc8 [ 2.107331] __device_attach+0xf0/0x150 [ 2.110977] device_initial_probe+0x14/0x20 [ 2.114796] bus_probe_device+0x9c/0xa8 [ 2.118963] deferred_probe_work_func+0x88/0xc0 [ 2.122784] process_one_work+0x1a4/0x338 [ 2.127296] worker_thread+0x1f8/0x420 [ 2.131464] kthread+0x150/0x160 [ 2.135107] ret_from_fork+0x10/0x20 [ 2.138495] ---[ end trace 5eea8768cb620e87 ]---
Signed-off-by: Shawn Guo shawn.guo@linaro.org Reviewed-by: Bjorn Andersson bjorn.andersson@linaro.org Fixes: f80a1d414328 ("interconnect: qcom: Add SDM660 interconnect provider driver") Link: https://lore.kernel.org/r/20210823014003.31391-1-shawn.guo@linaro.org Signed-off-by: Georgi Djakov djakov@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/interconnect/qcom/sdm660.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/interconnect/qcom/sdm660.c b/drivers/interconnect/qcom/sdm660.c index 632dbdd21915..ac13046537e8 100644 --- a/drivers/interconnect/qcom/sdm660.c +++ b/drivers/interconnect/qcom/sdm660.c @@ -307,7 +307,7 @@ DEFINE_QNODE(slv_bimc_cfg, SDM660_SLAVE_BIMC_CFG, 4, -1, 56, true, -1, 0, -1, 0) DEFINE_QNODE(slv_prng, SDM660_SLAVE_PRNG, 4, -1, 44, true, -1, 0, -1, 0); DEFINE_QNODE(slv_spdm, SDM660_SLAVE_SPDM, 4, -1, 60, true, -1, 0, -1, 0); DEFINE_QNODE(slv_qdss_cfg, SDM660_SLAVE_QDSS_CFG, 4, -1, 63, true, -1, 0, -1, 0); -DEFINE_QNODE(slv_cnoc_mnoc_cfg, SDM660_SLAVE_BLSP_1, 4, -1, 66, true, -1, 0, -1, SDM660_MASTER_CNOC_MNOC_CFG); +DEFINE_QNODE(slv_cnoc_mnoc_cfg, SDM660_SLAVE_CNOC_MNOC_CFG, 4, -1, 66, true, -1, 0, -1, SDM660_MASTER_CNOC_MNOC_CFG); DEFINE_QNODE(slv_snoc_cfg, SDM660_SLAVE_SNOC_CFG, 4, -1, 70, true, -1, 0, -1, 0); DEFINE_QNODE(slv_qm_cfg, SDM660_SLAVE_QM_CFG, 4, -1, 212, true, -1, 0, -1, 0); DEFINE_QNODE(slv_clk_ctl, SDM660_SLAVE_CLK_CTL, 4, -1, 47, true, -1, 0, -1, 0);
From: Shawn Guo shawn.guo@linaro.org
[ Upstream commit 5833c9b8766298e73c11766f9585d4ea4fa785ff ]
The NOC_QOS_PRIORITY shift and mask do not match what vendor kernel defines [1]. Correct them per vendor kernel. As the result of NOC_QOS_PRIORITY_P0_SHIFT being 0, the definition can be dropped and regmap_update_bits() call on P0 can be simplified a bit.
[1] https://source.codeaurora.org/quic/la/kernel/msm-4.4/tree/drivers/soc/qcom/m...
Fixes: f80a1d414328 ("interconnect: qcom: Add SDM660 interconnect provider driver") Signed-off-by: Shawn Guo shawn.guo@linaro.org Reviewed-by: Dmitry Baryshkov dmitry.baryshkov@linaro.org Reviewed-by: AngeloGioacchino Del Regno angelogioacchino.delregno@somainline.org Link: https://lore.kernel.org/r/20210902054915.28689-1-shawn.guo@linaro.org Signed-off-by: Georgi Djakov djakov@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/interconnect/qcom/sdm660.c | 9 ++++----- 1 file changed, 4 insertions(+), 5 deletions(-)
diff --git a/drivers/interconnect/qcom/sdm660.c b/drivers/interconnect/qcom/sdm660.c index ac13046537e8..99eef7e2d326 100644 --- a/drivers/interconnect/qcom/sdm660.c +++ b/drivers/interconnect/qcom/sdm660.c @@ -44,9 +44,9 @@ #define NOC_PERM_MODE_BYPASS (1 << NOC_QOS_MODE_BYPASS)
#define NOC_QOS_PRIORITYn_ADDR(n) (0x8 + (n * 0x1000)) -#define NOC_QOS_PRIORITY_MASK 0xf +#define NOC_QOS_PRIORITY_P1_MASK 0xc +#define NOC_QOS_PRIORITY_P0_MASK 0x3 #define NOC_QOS_PRIORITY_P1_SHIFT 0x2 -#define NOC_QOS_PRIORITY_P0_SHIFT 0x3
#define NOC_QOS_MODEn_ADDR(n) (0xc + (n * 0x1000)) #define NOC_QOS_MODEn_MASK 0x3 @@ -624,13 +624,12 @@ static int qcom_icc_noc_set_qos_priority(struct regmap *rmap, /* Must be updated one at a time, P1 first, P0 last */ val = qos->areq_prio << NOC_QOS_PRIORITY_P1_SHIFT; rc = regmap_update_bits(rmap, NOC_QOS_PRIORITYn_ADDR(qos->qos_port), - NOC_QOS_PRIORITY_MASK, val); + NOC_QOS_PRIORITY_P1_MASK, val); if (rc) return rc;
- val = qos->prio_level << NOC_QOS_PRIORITY_P0_SHIFT; return regmap_update_bits(rmap, NOC_QOS_PRIORITYn_ADDR(qos->qos_port), - NOC_QOS_PRIORITY_MASK, val); + NOC_QOS_PRIORITY_P0_MASK, qos->prio_level); }
static int qcom_icc_set_noc_qos(struct icc_node *src, u64 max_bw)
From: Zhi A Wang zhi.wang.linux2@gmail.com
[ Upstream commit d168cd797982db9db617113644c87b8f5f3cf27e ]
As the APIs related to ww lock in i915 was changed recently, the usage of ww lock in GVT-g scheduler needs to be changed accrodingly. We noticed a deadlock when GVT-g scheduler submits the workload to i915. After some investigation, it seems the way of how to use ww lock APIs has been changed. Releasing a ww now requires a explicit i915_gem_ww_ctx_fini().
Fixes: 67f1120381df ("drm/i915/gvt: Introduce per object locking in GVT scheduler.") Cc: Zhenyu Wang zhenyuw@linux.intel.com Signed-off-by: Zhi A Wang zhi.a.wang@intel.com Signed-off-by: Zhenyu Wang zhenyuw@linux.intel.com Link: http://patchwork.freedesktop.org/patch/msgid/20210826143834.25410-1-zhi.a.wa... Acked-by: Zhenyu Wang zhenyuw@linux.intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/i915/gvt/scheduler.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/i915/gvt/scheduler.c b/drivers/gpu/drm/i915/gvt/scheduler.c index 734c37c5e347..527b59b86312 100644 --- a/drivers/gpu/drm/i915/gvt/scheduler.c +++ b/drivers/gpu/drm/i915/gvt/scheduler.c @@ -576,7 +576,7 @@ static int prepare_shadow_batch_buffer(struct intel_vgpu_workload *workload)
/* No one is going to touch shadow bb from now on. */ i915_gem_object_flush_map(bb->obj); - i915_gem_object_unlock(bb->obj); + i915_gem_ww_ctx_fini(&ww); } } return 0; @@ -630,7 +630,7 @@ static int prepare_shadow_wa_ctx(struct intel_shadow_wa_ctx *wa_ctx) return ret; }
- i915_gem_object_unlock(wa_ctx->indirect_ctx.obj); + i915_gem_ww_ctx_fini(&ww);
/* FIXME: we are not tracking our pinned VMA leaving it * up to the core to fix up the stray pin_count upon
From: Andrea Claudi aclaudi@redhat.com
[ Upstream commit 69e73dbfda14fbfe748d3812da1244cce2928dcb ]
ip_vs_conn_tab_bits may be provided by the user through the conn_tab_bits module parameter. If this value is greater than 31, or less than 0, the shift operator used to derive tab_size causes undefined behaviour.
Fix this checking ip_vs_conn_tab_bits value to be in the range specified in ipvs Kconfig. If not, simply use default value.
Fixes: 6f7edb4881bf ("IPVS: Allow boot time change of hash size") Reported-by: Yi Chen yiche@redhat.com Signed-off-by: Andrea Claudi aclaudi@redhat.com Acked-by: Julian Anastasov ja@ssi.bg Acked-by: Simon Horman horms@verge.net.au Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/netfilter/ipvs/ip_vs_conn.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/net/netfilter/ipvs/ip_vs_conn.c b/net/netfilter/ipvs/ip_vs_conn.c index c100c6b112c8..2c467c422dc6 100644 --- a/net/netfilter/ipvs/ip_vs_conn.c +++ b/net/netfilter/ipvs/ip_vs_conn.c @@ -1468,6 +1468,10 @@ int __init ip_vs_conn_init(void) int idx;
/* Compute size and mask */ + if (ip_vs_conn_tab_bits < 8 || ip_vs_conn_tab_bits > 20) { + pr_info("conn_tab_bits not in [8, 20]. Using default value\n"); + ip_vs_conn_tab_bits = CONFIG_IP_VS_TAB_BITS; + } ip_vs_conn_tab_size = 1 << ip_vs_conn_tab_bits; ip_vs_conn_tab_mask = ip_vs_conn_tab_size - 1;
From: Hou Tao houtao1@huawei.com
[ Upstream commit 356ed64991c6847a0c4f2e8fa3b1133f7a14f1fc ]
Currently if a function ptr in struct_ops has a return value, its caller will get a random return value from it, because the return value of related BPF_PROG_TYPE_STRUCT_OPS prog is just dropped.
So adding a new flag BPF_TRAMP_F_RET_FENTRY_RET to tell bpf trampoline to save and return the return value of struct_ops prog if ret_size of the function ptr is greater than 0. Also restricting the flag to be used alone.
Fixes: 85d33df357b6 ("bpf: Introduce BPF_MAP_TYPE_STRUCT_OPS") Signed-off-by: Hou Tao houtao1@huawei.com Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Martin KaFai Lau kafai@fb.com Link: https://lore.kernel.org/bpf/20210914023351.3664499-1-houtao1@huawei.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/net/bpf_jit_comp.c | 53 ++++++++++++++++++++++++++++--------- include/linux/bpf.h | 2 ++ kernel/bpf/bpf_struct_ops.c | 7 +++-- 3 files changed, 47 insertions(+), 15 deletions(-)
diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c index 16d76f814e9b..47780844598a 100644 --- a/arch/x86/net/bpf_jit_comp.c +++ b/arch/x86/net/bpf_jit_comp.c @@ -1744,7 +1744,7 @@ static void restore_regs(const struct btf_func_model *m, u8 **prog, int nr_args, }
static int invoke_bpf_prog(const struct btf_func_model *m, u8 **pprog, - struct bpf_prog *p, int stack_size, bool mod_ret) + struct bpf_prog *p, int stack_size, bool save_ret) { u8 *prog = *pprog; u8 *jmp_insn; @@ -1777,11 +1777,15 @@ static int invoke_bpf_prog(const struct btf_func_model *m, u8 **pprog, if (emit_call(&prog, p->bpf_func, prog)) return -EINVAL;
- /* BPF_TRAMP_MODIFY_RETURN trampolines can modify the return + /* + * BPF_TRAMP_MODIFY_RETURN trampolines can modify the return * of the previous call which is then passed on the stack to * the next BPF program. + * + * BPF_TRAMP_FENTRY trampoline may need to return the return + * value of BPF_PROG_TYPE_STRUCT_OPS prog. */ - if (mod_ret) + if (save_ret) emit_stx(&prog, BPF_DW, BPF_REG_FP, BPF_REG_0, -8);
/* replace 2 nops with JE insn, since jmp target is known */ @@ -1828,13 +1832,15 @@ static int emit_cond_near_jump(u8 **pprog, void *func, void *ip, u8 jmp_cond) }
static int invoke_bpf(const struct btf_func_model *m, u8 **pprog, - struct bpf_tramp_progs *tp, int stack_size) + struct bpf_tramp_progs *tp, int stack_size, + bool save_ret) { int i; u8 *prog = *pprog;
for (i = 0; i < tp->nr_progs; i++) { - if (invoke_bpf_prog(m, &prog, tp->progs[i], stack_size, false)) + if (invoke_bpf_prog(m, &prog, tp->progs[i], stack_size, + save_ret)) return -EINVAL; } *pprog = prog; @@ -1877,6 +1883,23 @@ static int invoke_bpf_mod_ret(const struct btf_func_model *m, u8 **pprog, return 0; }
+static bool is_valid_bpf_tramp_flags(unsigned int flags) +{ + if ((flags & BPF_TRAMP_F_RESTORE_REGS) && + (flags & BPF_TRAMP_F_SKIP_FRAME)) + return false; + + /* + * BPF_TRAMP_F_RET_FENTRY_RET is only used by bpf_struct_ops, + * and it must be used alone. + */ + if ((flags & BPF_TRAMP_F_RET_FENTRY_RET) && + (flags & ~BPF_TRAMP_F_RET_FENTRY_RET)) + return false; + + return true; +} + /* Example: * __be16 eth_type_trans(struct sk_buff *skb, struct net_device *dev); * its 'struct btf_func_model' will be nr_args=2 @@ -1949,17 +1972,19 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i struct bpf_tramp_progs *fmod_ret = &tprogs[BPF_TRAMP_MODIFY_RETURN]; u8 **branches = NULL; u8 *prog; + bool save_ret;
/* x86-64 supports up to 6 arguments. 7+ can be added in the future */ if (nr_args > 6) return -ENOTSUPP;
- if ((flags & BPF_TRAMP_F_RESTORE_REGS) && - (flags & BPF_TRAMP_F_SKIP_FRAME)) + if (!is_valid_bpf_tramp_flags(flags)) return -EINVAL;
- if (flags & BPF_TRAMP_F_CALL_ORIG) - stack_size += 8; /* room for return value of orig_call */ + /* room for return value of orig_call or fentry prog */ + save_ret = flags & (BPF_TRAMP_F_CALL_ORIG | BPF_TRAMP_F_RET_FENTRY_RET); + if (save_ret) + stack_size += 8;
if (flags & BPF_TRAMP_F_SKIP_FRAME) /* skip patched call instruction and point orig_call to actual @@ -1986,7 +2011,8 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i }
if (fentry->nr_progs) - if (invoke_bpf(m, &prog, fentry, stack_size)) + if (invoke_bpf(m, &prog, fentry, stack_size, + flags & BPF_TRAMP_F_RET_FENTRY_RET)) return -EINVAL;
if (fmod_ret->nr_progs) { @@ -2033,7 +2059,7 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i }
if (fexit->nr_progs) - if (invoke_bpf(m, &prog, fexit, stack_size)) { + if (invoke_bpf(m, &prog, fexit, stack_size, false)) { ret = -EINVAL; goto cleanup; } @@ -2053,9 +2079,10 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i ret = -EINVAL; goto cleanup; } - /* restore original return value back into RAX */ - emit_ldx(&prog, BPF_DW, BPF_REG_0, BPF_REG_FP, -8); } + /* restore return value of orig_call or fentry prog back into RAX */ + if (save_ret) + emit_ldx(&prog, BPF_DW, BPF_REG_0, BPF_REG_FP, -8);
EMIT1(0x5B); /* pop rbx */ EMIT1(0xC9); /* leave */ diff --git a/include/linux/bpf.h b/include/linux/bpf.h index e8e2b0393ca9..11da5671d4f0 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -553,6 +553,8 @@ struct btf_func_model { * programs only. Should not be used with normal calls and indirect calls. */ #define BPF_TRAMP_F_SKIP_FRAME BIT(2) +/* Return the return value of fentry prog. Only used by bpf_struct_ops. */ +#define BPF_TRAMP_F_RET_FENTRY_RET BIT(4)
/* Each call __bpf_prog_enter + call bpf_func + call __bpf_prog_exit is ~50 * bytes on x86. Pick a number to fit into BPF_IMAGE_SIZE / 2 diff --git a/kernel/bpf/bpf_struct_ops.c b/kernel/bpf/bpf_struct_ops.c index 70f6fd4fa305..2ce17447fb76 100644 --- a/kernel/bpf/bpf_struct_ops.c +++ b/kernel/bpf/bpf_struct_ops.c @@ -367,6 +367,7 @@ static int bpf_struct_ops_map_update_elem(struct bpf_map *map, void *key, const struct btf_type *mtype, *ptype; struct bpf_prog *prog; u32 moff; + u32 flags;
moff = btf_member_bit_offset(t, member) / 8; ptype = btf_type_resolve_ptr(btf_vmlinux, member->type, NULL); @@ -430,10 +431,12 @@ static int bpf_struct_ops_map_update_elem(struct bpf_map *map, void *key,
tprogs[BPF_TRAMP_FENTRY].progs[0] = prog; tprogs[BPF_TRAMP_FENTRY].nr_progs = 1; + flags = st_ops->func_models[i].ret_size > 0 ? + BPF_TRAMP_F_RET_FENTRY_RET : 0; err = arch_prepare_bpf_trampoline(NULL, image, st_map->image + PAGE_SIZE, - &st_ops->func_models[i], 0, - tprogs, NULL); + &st_ops->func_models[i], + flags, tprogs, NULL); if (err < 0) goto reset_unlock;
From: Christoph Lameter cl@gentwo.de
[ Upstream commit 2cc74e1ee31d00393b6698ec80b322fd26523da4 ]
ROCE uses IGMP for Multicast instead of the native Infiniband system where joins are required in order to post messages on the Multicast group. On Ethernet one can send Multicast messages to arbitrary addresses without the need to subscribe to a group.
So ROCE correctly does not send IGMP joins during rdma_join_multicast().
F.e. in cma_iboe_join_multicast() we see:
if (addr->sa_family == AF_INET) { if (gid_type == IB_GID_TYPE_ROCE_UDP_ENCAP) { ib.rec.hop_limit = IPV6_DEFAULT_HOPLIMIT; if (!send_only) { err = cma_igmp_send(ndev, &ib.rec.mgid, true); } } } else {
So the IGMP join is suppressed as it is unnecessary.
However no such check is done in destroy_mc(). And therefore leaving a sendonly multicast group will send an IGMP leave.
This means that the following scenario can lead to a multicast receiver unexpectedly being unsubscribed from a MC group:
1. Sender thread does a sendonly join on MC group X. No IGMP join is sent.
2. Receiver thread does a regular join on the same MC Group x. IGMP join is sent and the receiver begins to get messages.
3. Sender thread terminates and destroys MC group X. IGMP leave is sent and the receiver no longer receives data.
This patch adds the same logic for sendonly joins to destroy_mc() that is also used in cma_iboe_join_multicast().
Fixes: ab15c95a17b3 ("IB/core: Support for CMA multicast join flags") Link: https://lore.kernel.org/r/alpine.DEB.2.22.394.2109081340540.668072@gentwo.de Signed-off-by: Christoph Lameter cl@linux.com Signed-off-by: Jason Gunthorpe jgg@nvidia.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/infiniband/core/cma.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c index 36ab9da70932..107462905b21 100644 --- a/drivers/infiniband/core/cma.c +++ b/drivers/infiniband/core/cma.c @@ -1818,6 +1818,8 @@ static void cma_release_port(struct rdma_id_private *id_priv) static void destroy_mc(struct rdma_id_private *id_priv, struct cma_multicast *mc) { + bool send_only = mc->join_state == BIT(SENDONLY_FULLMEMBER_JOIN); + if (rdma_cap_ib_mcast(id_priv->id.device, id_priv->id.port_num)) ib_sa_free_multicast(mc->sa_mc);
@@ -1834,7 +1836,10 @@ static void destroy_mc(struct rdma_id_private *id_priv,
cma_set_mgid(id_priv, (struct sockaddr *)&mc->addr, &mgid); - cma_igmp_send(ndev, &mgid, false); + + if (!send_only) + cma_igmp_send(ndev, &mgid, false); + dev_put(ndev); }
From: Tao Liu thomas.liu@ucloud.cn
[ Upstream commit ca465e1f1f9b38fe916a36f7d80c5d25f2337c81 ]
If cma_listen_on_all() fails it leaves the per-device ID still on the listen_list but the state is not set to RDMA_CM_ADDR_BOUND.
When the cmid is eventually destroyed cma_cancel_listens() is not called due to the wrong state, however the per-device IDs are still holding the refcount preventing the ID from being destroyed, thus deadlocking:
task:rping state:D stack: 0 pid:19605 ppid: 47036 flags:0x00000084 Call Trace: __schedule+0x29a/0x780 ? free_unref_page_commit+0x9b/0x110 schedule+0x3c/0xa0 schedule_timeout+0x215/0x2b0 ? __flush_work+0x19e/0x1e0 wait_for_completion+0x8d/0xf0 _destroy_id+0x144/0x210 [rdma_cm] ucma_close_id+0x2b/0x40 [rdma_ucm] __destroy_id+0x93/0x2c0 [rdma_ucm] ? __xa_erase+0x4a/0xa0 ucma_destroy_id+0x9a/0x120 [rdma_ucm] ucma_write+0xb8/0x130 [rdma_ucm] vfs_write+0xb4/0x250 ksys_write+0xb5/0xd0 ? syscall_trace_enter.isra.19+0x123/0x190 do_syscall_64+0x33/0x40 entry_SYSCALL_64_after_hwframe+0x44/0xa9
Ensure that cma_listen_on_all() atomically unwinds its action under the lock during error.
Fixes: c80a0c52d85c ("RDMA/cma: Add missing error handling of listen_id") Link: https://lore.kernel.org/r/20210913093344.17230-1-thomas.liu@ucloud.cn Signed-off-by: Tao Liu thomas.liu@ucloud.cn Reviewed-by: Leon Romanovsky leonro@nvidia.com Signed-off-by: Jason Gunthorpe jgg@nvidia.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/infiniband/core/cma.c | 13 ++++++++++--- 1 file changed, 10 insertions(+), 3 deletions(-)
diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c index 107462905b21..dbbacc8e9273 100644 --- a/drivers/infiniband/core/cma.c +++ b/drivers/infiniband/core/cma.c @@ -1746,15 +1746,16 @@ static void cma_cancel_route(struct rdma_id_private *id_priv) } }
-static void cma_cancel_listens(struct rdma_id_private *id_priv) +static void _cma_cancel_listens(struct rdma_id_private *id_priv) { struct rdma_id_private *dev_id_priv;
+ lockdep_assert_held(&lock); + /* * Remove from listen_any_list to prevent added devices from spawning * additional listen requests. */ - mutex_lock(&lock); list_del(&id_priv->list);
while (!list_empty(&id_priv->listen_list)) { @@ -1768,6 +1769,12 @@ static void cma_cancel_listens(struct rdma_id_private *id_priv) rdma_destroy_id(&dev_id_priv->id); mutex_lock(&lock); } +} + +static void cma_cancel_listens(struct rdma_id_private *id_priv) +{ + mutex_lock(&lock); + _cma_cancel_listens(id_priv); mutex_unlock(&lock); }
@@ -2587,7 +2594,7 @@ static int cma_listen_on_all(struct rdma_id_private *id_priv) return 0;
err_listen: - list_del(&id_priv->list); + _cma_cancel_listens(id_priv); mutex_unlock(&lock); if (to_destroy) rdma_destroy_id(&to_destroy->id);
From: Piotr Krysiuk piotras@gmail.com
[ Upstream commit 37cb28ec7d3a36a5bace7063a3dba633ab110f8b ]
The conditional branch instructions on MIPS use 18-bit signed offsets allowing for a branch range of 128 KBytes (backward and forward). However, this limit is not observed by the cBPF JIT compiler, and so the JIT compiler emits out-of-range branches when translating certain cBPF programs. A specific example of such a cBPF program is included in the "BPF_MAXINSNS: exec all MSH" test from lib/test_bpf.c that executes anomalous machine code containing incorrect branch offsets under JIT.
Furthermore, this issue can be abused to craft undesirable machine code, where the control flow is hijacked to execute arbitrary Kernel code.
The following steps can be used to reproduce the issue:
# echo 1 > /proc/sys/net/core/bpf_jit_enable # modprobe test_bpf test_name="BPF_MAXINSNS: exec all MSH"
This should produce multiple warnings from build_bimm() similar to:
------------[ cut here ]------------ WARNING: CPU: 0 PID: 209 at arch/mips/mm/uasm-mips.c:210 build_insn+0x558/0x590 Micro-assembler field overflow Modules linked in: test_bpf(+) CPU: 0 PID: 209 Comm: modprobe Not tainted 5.14.3 #1 Stack : 00000000 807bb824 82b33c9c 801843c0 00000000 00000004 00000000 63c9b5ee 82b33af4 80999898 80910000 80900000 82fd6030 00000001 82b33a98 82087180 00000000 00000000 80873b28 00000000 000000fc 82b3394c 00000000 2e34312e 6d6d6f43 809a180f 809a1836 6f6d203a 80900000 00000001 82b33bac 80900000 00027f80 00000000 00000000 807bb824 00000000 804ed790 001cc317 00000001 [...] Call Trace: [<80108f44>] show_stack+0x38/0x118 [<807a7aac>] dump_stack_lvl+0x5c/0x7c [<807a4b3c>] __warn+0xcc/0x140 [<807a4c3c>] warn_slowpath_fmt+0x8c/0xb8 [<8011e198>] build_insn+0x558/0x590 [<8011e358>] uasm_i_bne+0x20/0x2c [<80127b48>] build_body+0xa58/0x2a94 [<80129c98>] bpf_jit_compile+0x114/0x1e4 [<80613fc4>] bpf_prepare_filter+0x2ec/0x4e4 [<8061423c>] bpf_prog_create+0x80/0xc4 [<c0a006e4>] test_bpf_init+0x300/0xba8 [test_bpf] [<8010051c>] do_one_initcall+0x50/0x1d4 [<801c5e54>] do_init_module+0x60/0x220 [<801c8b20>] sys_finit_module+0xc4/0xfc [<801144d0>] syscall_common+0x34/0x58 [...] ---[ end trace a287d9742503c645 ]---
Then the anomalous machine code executes:
=> 0xc0a18000: addiu sp,sp,-16 0xc0a18004: sw s3,0(sp) 0xc0a18008: sw s4,4(sp) 0xc0a1800c: sw s5,8(sp) 0xc0a18010: sw ra,12(sp) 0xc0a18014: move s5,a0 0xc0a18018: move s4,zero 0xc0a1801c: move s3,zero
# __BPF_STMT(BPF_LDX | BPF_B | BPF_MSH, 0) 0xc0a18020: lui t6,0x8012 0xc0a18024: ori t4,t6,0x9e14 0xc0a18028: li a1,0 0xc0a1802c: jalr t4 0xc0a18030: move a0,s5 0xc0a18034: bnez v0,0xc0a1ffb8 # incorrect branch offset 0xc0a18038: move v0,zero 0xc0a1803c: andi s4,s3,0xf 0xc0a18040: b 0xc0a18048 0xc0a18044: sll s4,s4,0x2 [...]
# __BPF_STMT(BPF_LDX | BPF_B | BPF_MSH, 0) 0xc0a1ffa0: lui t6,0x8012 0xc0a1ffa4: ori t4,t6,0x9e14 0xc0a1ffa8: li a1,0 0xc0a1ffac: jalr t4 0xc0a1ffb0: move a0,s5 0xc0a1ffb4: bnez v0,0xc0a1ffb8 # incorrect branch offset 0xc0a1ffb8: move v0,zero 0xc0a1ffbc: andi s4,s3,0xf 0xc0a1ffc0: b 0xc0a1ffc8 0xc0a1ffc4: sll s4,s4,0x2
# __BPF_STMT(BPF_LDX | BPF_B | BPF_MSH, 0) 0xc0a1ffc8: lui t6,0x8012 0xc0a1ffcc: ori t4,t6,0x9e14 0xc0a1ffd0: li a1,0 0xc0a1ffd4: jalr t4 0xc0a1ffd8: move a0,s5 0xc0a1ffdc: bnez v0,0xc0a3ffb8 # correct branch offset 0xc0a1ffe0: move v0,zero 0xc0a1ffe4: andi s4,s3,0xf 0xc0a1ffe8: b 0xc0a1fff0 0xc0a1ffec: sll s4,s4,0x2 [...]
# epilogue 0xc0a3ffb8: lw s3,0(sp) 0xc0a3ffbc: lw s4,4(sp) 0xc0a3ffc0: lw s5,8(sp) 0xc0a3ffc4: lw ra,12(sp) 0xc0a3ffc8: addiu sp,sp,16 0xc0a3ffcc: jr ra 0xc0a3ffd0: nop
To mitigate this issue, we assert the branch ranges for each emit call that could generate an out-of-range branch.
Fixes: 36366e367ee9 ("MIPS: BPF: Restore MIPS32 cBPF JIT") Fixes: c6610de353da ("MIPS: net: Add BPF JIT") Signed-off-by: Piotr Krysiuk piotras@gmail.com Signed-off-by: Daniel Borkmann daniel@iogearbox.net Tested-by: Johan Almbladh johan.almbladh@anyfinetworks.com Acked-by: Johan Almbladh johan.almbladh@anyfinetworks.com Cc: Paul Burton paulburton@kernel.org Cc: Thomas Bogendoerfer tsbogend@alpha.franken.de Link: https://lore.kernel.org/bpf/20210915160437.4080-1-piotras@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/mips/net/bpf_jit.c | 57 +++++++++++++++++++++++++++++++---------- 1 file changed, 43 insertions(+), 14 deletions(-)
diff --git a/arch/mips/net/bpf_jit.c b/arch/mips/net/bpf_jit.c index 0af88622c619..cb6d22439f71 100644 --- a/arch/mips/net/bpf_jit.c +++ b/arch/mips/net/bpf_jit.c @@ -662,6 +662,11 @@ static void build_epilogue(struct jit_ctx *ctx) ((int)K < 0 ? ((int)K >= SKF_LL_OFF ? func##_negative : func) : \ func##_positive)
+static bool is_bad_offset(int b_off) +{ + return b_off > 0x1ffff || b_off < -0x20000; +} + static int build_body(struct jit_ctx *ctx) { const struct bpf_prog *prog = ctx->skf; @@ -728,7 +733,10 @@ static int build_body(struct jit_ctx *ctx) /* Load return register on DS for failures */ emit_reg_move(r_ret, r_zero, ctx); /* Return with error */ - emit_b(b_imm(prog->len, ctx), ctx); + b_off = b_imm(prog->len, ctx); + if (is_bad_offset(b_off)) + return -E2BIG; + emit_b(b_off, ctx); emit_nop(ctx); break; case BPF_LD | BPF_W | BPF_IND: @@ -775,8 +783,10 @@ static int build_body(struct jit_ctx *ctx) emit_jalr(MIPS_R_RA, r_s0, ctx); emit_reg_move(MIPS_R_A0, r_skb, ctx); /* delay slot */ /* Check the error value */ - emit_bcond(MIPS_COND_NE, r_ret, 0, - b_imm(prog->len, ctx), ctx); + b_off = b_imm(prog->len, ctx); + if (is_bad_offset(b_off)) + return -E2BIG; + emit_bcond(MIPS_COND_NE, r_ret, 0, b_off, ctx); emit_reg_move(r_ret, r_zero, ctx); /* We are good */ /* X <- P[1:K] & 0xf */ @@ -855,8 +865,10 @@ static int build_body(struct jit_ctx *ctx) /* A /= X */ ctx->flags |= SEEN_X | SEEN_A; /* Check if r_X is zero */ - emit_bcond(MIPS_COND_EQ, r_X, r_zero, - b_imm(prog->len, ctx), ctx); + b_off = b_imm(prog->len, ctx); + if (is_bad_offset(b_off)) + return -E2BIG; + emit_bcond(MIPS_COND_EQ, r_X, r_zero, b_off, ctx); emit_load_imm(r_ret, 0, ctx); /* delay slot */ emit_div(r_A, r_X, ctx); break; @@ -864,8 +876,10 @@ static int build_body(struct jit_ctx *ctx) /* A %= X */ ctx->flags |= SEEN_X | SEEN_A; /* Check if r_X is zero */ - emit_bcond(MIPS_COND_EQ, r_X, r_zero, - b_imm(prog->len, ctx), ctx); + b_off = b_imm(prog->len, ctx); + if (is_bad_offset(b_off)) + return -E2BIG; + emit_bcond(MIPS_COND_EQ, r_X, r_zero, b_off, ctx); emit_load_imm(r_ret, 0, ctx); /* delay slot */ emit_mod(r_A, r_X, ctx); break; @@ -926,7 +940,10 @@ static int build_body(struct jit_ctx *ctx) break; case BPF_JMP | BPF_JA: /* pc += K */ - emit_b(b_imm(i + k + 1, ctx), ctx); + b_off = b_imm(i + k + 1, ctx); + if (is_bad_offset(b_off)) + return -E2BIG; + emit_b(b_off, ctx); emit_nop(ctx); break; case BPF_JMP | BPF_JEQ | BPF_K: @@ -1056,12 +1073,16 @@ static int build_body(struct jit_ctx *ctx) break; case BPF_RET | BPF_A: ctx->flags |= SEEN_A; - if (i != prog->len - 1) + if (i != prog->len - 1) { /* * If this is not the last instruction * then jump to the epilogue */ - emit_b(b_imm(prog->len, ctx), ctx); + b_off = b_imm(prog->len, ctx); + if (is_bad_offset(b_off)) + return -E2BIG; + emit_b(b_off, ctx); + } emit_reg_move(r_ret, r_A, ctx); /* delay slot */ break; case BPF_RET | BPF_K: @@ -1075,7 +1096,10 @@ static int build_body(struct jit_ctx *ctx) * If this is not the last instruction * then jump to the epilogue */ - emit_b(b_imm(prog->len, ctx), ctx); + b_off = b_imm(prog->len, ctx); + if (is_bad_offset(b_off)) + return -E2BIG; + emit_b(b_off, ctx); emit_nop(ctx); } break; @@ -1133,8 +1157,10 @@ static int build_body(struct jit_ctx *ctx) /* Load *dev pointer */ emit_load_ptr(r_s0, r_skb, off, ctx); /* error (0) in the delay slot */ - emit_bcond(MIPS_COND_EQ, r_s0, r_zero, - b_imm(prog->len, ctx), ctx); + b_off = b_imm(prog->len, ctx); + if (is_bad_offset(b_off)) + return -E2BIG; + emit_bcond(MIPS_COND_EQ, r_s0, r_zero, b_off, ctx); emit_reg_move(r_ret, r_zero, ctx); if (code == (BPF_ANC | SKF_AD_IFINDEX)) { BUILD_BUG_ON(sizeof_field(struct net_device, ifindex) != 4); @@ -1244,7 +1270,10 @@ void bpf_jit_compile(struct bpf_prog *fp)
/* Generate the actual JIT code */ build_prologue(&ctx); - build_body(&ctx); + if (build_body(&ctx)) { + module_memfree(ctx.target); + goto out; + } build_epilogue(&ctx);
/* Update the icache */
From: Vadim Pasternak vadimp@nvidia.com
[ Upstream commit e6fab7af6ba1bc77c78713a83876f60ca7a4a064 ]
Fan speed minimum can be enforced from sysfs. For example, setting current fan speed to 20 is used to enforce fan speed to be at 100% speed, 19 - to be not below 90% speed, etcetera. This feature provides ability to limit fan speed according to some system wise considerations, like absence of some replaceable units or high system ambient temperature.
Request for changing fan minimum speed is configuration request and can be set only through 'sysfs' write procedure. In this situation value of argument 'state' is above nominal fan speed maximum.
Return non-zero code in this case to avoid thermal_cooling_device_stats_update() call, because in this case statistics update violates thermal statistics table range. The issues is observed in case kernel is configured with option CONFIG_THERMAL_STATISTICS.
Here is the trace from KASAN: [ 159.506659] BUG: KASAN: slab-out-of-bounds in thermal_cooling_device_stats_update+0x7d/0xb0 [ 159.516016] Read of size 4 at addr ffff888116163840 by task hw-management.s/7444 [ 159.545625] Call Trace: [ 159.548366] dump_stack+0x92/0xc1 [ 159.552084] ? thermal_cooling_device_stats_update+0x7d/0xb0 [ 159.635869] thermal_zone_device_update+0x345/0x780 [ 159.688711] thermal_zone_device_set_mode+0x7d/0xc0 [ 159.694174] mlxsw_thermal_modules_init+0x48f/0x590 [mlxsw_core] [ 159.700972] ? mlxsw_thermal_set_cur_state+0x5a0/0x5a0 [mlxsw_core] [ 159.731827] mlxsw_thermal_init+0x763/0x880 [mlxsw_core] [ 160.070233] RIP: 0033:0x7fd995909970 [ 160.074239] Code: 73 01 c3 48 8b 0d 28 d5 2b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 83 3d 99 2d 2c 00 00 75 10 b8 01 00 00 00 0f 05 <48> 3d 01 f0 ff .. [ 160.095242] RSP: 002b:00007fff54f5d938 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 [ 160.103722] RAX: ffffffffffffffda RBX: 0000000000000013 RCX: 00007fd995909970 [ 160.111710] RDX: 0000000000000013 RSI: 0000000001906008 RDI: 0000000000000001 [ 160.119699] RBP: 0000000001906008 R08: 00007fd995bc9760 R09: 00007fd996210700 [ 160.127687] R10: 0000000000000073 R11: 0000000000000246 R12: 0000000000000013 [ 160.135673] R13: 0000000000000001 R14: 00007fd995bc8600 R15: 0000000000000013 [ 160.143671] [ 160.145338] Allocated by task 2924: [ 160.149242] kasan_save_stack+0x19/0x40 [ 160.153541] __kasan_kmalloc+0x7f/0xa0 [ 160.157743] __kmalloc+0x1a2/0x2b0 [ 160.161552] thermal_cooling_device_setup_sysfs+0xf9/0x1a0 [ 160.167687] __thermal_cooling_device_register+0x1b5/0x500 [ 160.173833] devm_thermal_of_cooling_device_register+0x60/0xa0 [ 160.180356] mlxreg_fan_probe+0x474/0x5e0 [mlxreg_fan] [ 160.248140] [ 160.249807] The buggy address belongs to the object at ffff888116163400 [ 160.249807] which belongs to the cache kmalloc-1k of size 1024 [ 160.263814] The buggy address is located 64 bytes to the right of [ 160.263814] 1024-byte region [ffff888116163400, ffff888116163800) [ 160.277536] The buggy address belongs to the page: [ 160.282898] page:0000000012275840 refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff888116167000 pfn:0x116160 [ 160.294872] head:0000000012275840 order:3 compound_mapcount:0 compound_pincount:0 [ 160.303251] flags: 0x200000000010200(slab|head|node=0|zone=2) [ 160.309694] raw: 0200000000010200 ffffea00046f7208 ffffea0004928208 ffff88810004dbc0 [ 160.318367] raw: ffff888116167000 00000000000a0006 00000001ffffffff 0000000000000000 [ 160.327033] page dumped because: kasan: bad access detected [ 160.333270] [ 160.334937] Memory state around the buggy address: [ 160.356469] >ffff888116163800: fc ..
Fixes: 65afb4c8e7e4 ("hwmon: (mlxreg-fan) Add support for Mellanox FAN driver") Signed-off-by: Vadim Pasternak vadimp@nvidia.com Link: https://lore.kernel.org/r/20210916183151.869427-1-vadimp@nvidia.com Signed-off-by: Guenter Roeck linux@roeck-us.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/hwmon/mlxreg-fan.c | 12 +++++++++--- 1 file changed, 9 insertions(+), 3 deletions(-)
diff --git a/drivers/hwmon/mlxreg-fan.c b/drivers/hwmon/mlxreg-fan.c index 116681fde33d..89fe7b9fe26b 100644 --- a/drivers/hwmon/mlxreg-fan.c +++ b/drivers/hwmon/mlxreg-fan.c @@ -315,8 +315,8 @@ static int mlxreg_fan_set_cur_state(struct thermal_cooling_device *cdev, { struct mlxreg_fan *fan = cdev->devdata; unsigned long cur_state; + int i, config = 0; u32 regval; - int i; int err;
/* @@ -329,6 +329,12 @@ static int mlxreg_fan_set_cur_state(struct thermal_cooling_device *cdev, * overwritten. */ if (state >= MLXREG_FAN_SPEED_MIN && state <= MLXREG_FAN_SPEED_MAX) { + /* + * This is configuration change, which is only supported through sysfs. + * For configuration non-zero value is to be returned to avoid thermal + * statistics update. + */ + config = 1; state -= MLXREG_FAN_MAX_STATE; for (i = 0; i < state; i++) fan->cooling_levels[i] = state; @@ -343,7 +349,7 @@ static int mlxreg_fan_set_cur_state(struct thermal_cooling_device *cdev,
cur_state = MLXREG_FAN_PWM_DUTY2STATE(regval); if (state < cur_state) - return 0; + return config;
state = cur_state; } @@ -359,7 +365,7 @@ static int mlxreg_fan_set_cur_state(struct thermal_cooling_device *cdev, dev_err(fan->dev, "Failed to write PWM duty\n"); return err; } - return 0; + return config; }
static const struct thermal_cooling_device_ops mlxreg_fan_cooling_ops = {
From: Sindhu Devale sindhu.devale@intel.com
[ Upstream commit 5b1e985f7626307c451f98883f5e2665ee208e1c ]
Due to duplicate reset flags, CQP commands are processed during reset.
This leads CQP failures such as below:
irdma0: [Delete Local MAC Entry Cmd Error][op_code=49] status=-27 waiting=1 completion_err=0 maj=0x0 min=0x0
Remove the redundant flag and set the correct reset flag so CPQ is paused during reset
Fixes: 8498a30e1b94 ("RDMA/irdma: Register auxiliary driver and implement private channel OPs") Link: https://lore.kernel.org/r/20210916191222.824-2-shiraz.saleem@intel.com Reported-by: LiLiang liali@redhat.com Signed-off-by: Sindhu Devale sindhu.devale@intel.com Signed-off-by: Shiraz Saleem shiraz.saleem@intel.com Signed-off-by: Jason Gunthorpe jgg@nvidia.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/infiniband/hw/irdma/cm.c | 4 ++-- drivers/infiniband/hw/irdma/hw.c | 6 +++--- drivers/infiniband/hw/irdma/i40iw_if.c | 2 +- drivers/infiniband/hw/irdma/main.h | 1 - drivers/infiniband/hw/irdma/utils.c | 2 +- drivers/infiniband/hw/irdma/verbs.c | 3 +-- 6 files changed, 8 insertions(+), 10 deletions(-)
diff --git a/drivers/infiniband/hw/irdma/cm.c b/drivers/infiniband/hw/irdma/cm.c index 6b62299abfbb..6dea0a49d171 100644 --- a/drivers/infiniband/hw/irdma/cm.c +++ b/drivers/infiniband/hw/irdma/cm.c @@ -3496,7 +3496,7 @@ static void irdma_cm_disconn_true(struct irdma_qp *iwqp) original_hw_tcp_state == IRDMA_TCP_STATE_TIME_WAIT || last_ae == IRDMA_AE_RDMAP_ROE_BAD_LLP_CLOSE || last_ae == IRDMA_AE_BAD_CLOSE || - last_ae == IRDMA_AE_LLP_CONNECTION_RESET || iwdev->reset)) { + last_ae == IRDMA_AE_LLP_CONNECTION_RESET || iwdev->rf->reset)) { issue_close = 1; iwqp->cm_id = NULL; qp->term_flags = 0; @@ -4250,7 +4250,7 @@ void irdma_cm_teardown_connections(struct irdma_device *iwdev, u32 *ipaddr, teardown_entry); attr.qp_state = IB_QPS_ERR; irdma_modify_qp(&cm_node->iwqp->ibqp, &attr, IB_QP_STATE, NULL); - if (iwdev->reset) + if (iwdev->rf->reset) irdma_cm_disconn(cm_node->iwqp); irdma_rem_ref_cm_node(cm_node); } diff --git a/drivers/infiniband/hw/irdma/hw.c b/drivers/infiniband/hw/irdma/hw.c index 00de5ee9a260..33c06a3a4f63 100644 --- a/drivers/infiniband/hw/irdma/hw.c +++ b/drivers/infiniband/hw/irdma/hw.c @@ -1489,7 +1489,7 @@ void irdma_reinitialize_ieq(struct irdma_sc_vsi *vsi)
irdma_puda_dele_rsrc(vsi, IRDMA_PUDA_RSRC_TYPE_IEQ, false); if (irdma_initialize_ieq(iwdev)) { - iwdev->reset = true; + iwdev->rf->reset = true; rf->gen_ops.request_reset(rf); } } @@ -1632,13 +1632,13 @@ void irdma_rt_deinit_hw(struct irdma_device *iwdev) case IEQ_CREATED: if (!iwdev->roce_mode) irdma_puda_dele_rsrc(&iwdev->vsi, IRDMA_PUDA_RSRC_TYPE_IEQ, - iwdev->reset); + iwdev->rf->reset); fallthrough; case ILQ_CREATED: if (!iwdev->roce_mode) irdma_puda_dele_rsrc(&iwdev->vsi, IRDMA_PUDA_RSRC_TYPE_ILQ, - iwdev->reset); + iwdev->rf->reset); break; default: ibdev_warn(&iwdev->ibdev, "bad init_state = %d\n", iwdev->init_state); diff --git a/drivers/infiniband/hw/irdma/i40iw_if.c b/drivers/infiniband/hw/irdma/i40iw_if.c index bddf88194d09..d219f64b2c3d 100644 --- a/drivers/infiniband/hw/irdma/i40iw_if.c +++ b/drivers/infiniband/hw/irdma/i40iw_if.c @@ -55,7 +55,7 @@ static void i40iw_close(struct i40e_info *cdev_info, struct i40e_client *client,
iwdev = to_iwdev(ibdev); if (reset) - iwdev->reset = true; + iwdev->rf->reset = true;
iwdev->iw_status = 0; irdma_port_ibevent(iwdev); diff --git a/drivers/infiniband/hw/irdma/main.h b/drivers/infiniband/hw/irdma/main.h index 743d9e143a99..b678fe712447 100644 --- a/drivers/infiniband/hw/irdma/main.h +++ b/drivers/infiniband/hw/irdma/main.h @@ -346,7 +346,6 @@ struct irdma_device { bool roce_mode:1; bool roce_dcqcn_en:1; bool dcb:1; - bool reset:1; bool iw_ooo:1; enum init_completion_state init_state;
diff --git a/drivers/infiniband/hw/irdma/utils.c b/drivers/infiniband/hw/irdma/utils.c index 5bbe44e54f9a..832e9604766b 100644 --- a/drivers/infiniband/hw/irdma/utils.c +++ b/drivers/infiniband/hw/irdma/utils.c @@ -2510,7 +2510,7 @@ void irdma_modify_qp_to_err(struct irdma_sc_qp *sc_qp) struct irdma_qp *qp = sc_qp->qp_uk.back_qp; struct ib_qp_attr attr;
- if (qp->iwdev->reset) + if (qp->iwdev->rf->reset) return; attr.qp_state = IB_QPS_ERR;
diff --git a/drivers/infiniband/hw/irdma/verbs.c b/drivers/infiniband/hw/irdma/verbs.c index 717147ed0519..6107f37321d2 100644 --- a/drivers/infiniband/hw/irdma/verbs.c +++ b/drivers/infiniband/hw/irdma/verbs.c @@ -535,8 +535,7 @@ static int irdma_destroy_qp(struct ib_qp *ibqp, struct ib_udata *udata) irdma_qp_rem_ref(&iwqp->ibqp); wait_for_completion(&iwqp->free_qp); irdma_free_lsmm_rsrc(iwqp); - if (!iwdev->reset) - irdma_cqp_qp_destroy_cmd(&iwdev->rf->sc_dev, &iwqp->sc_qp); + irdma_cqp_qp_destroy_cmd(&iwdev->rf->sc_dev, &iwqp->sc_qp);
if (!iwqp->user_mode) { if (iwqp->iwscq) {
From: Sindhu Devale sindhu.devale@intel.com
[ Upstream commit f4475f249445b3c1fb99919b0514a075b6d6b3d4 ]
Add lower bound check for CQ entries at creation time.
Fixes: b48c24c2d710 ("RDMA/irdma: Implement device supported verb APIs") Link: https://lore.kernel.org/r/20210916191222.824-3-shiraz.saleem@intel.com Signed-off-by: Sindhu Devale sindhu.devale@intel.com Signed-off-by: Shiraz Saleem shiraz.saleem@intel.com Signed-off-by: Jason Gunthorpe jgg@nvidia.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/infiniband/hw/irdma/verbs.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/infiniband/hw/irdma/verbs.c b/drivers/infiniband/hw/irdma/verbs.c index 6107f37321d2..6c3d28f744cb 100644 --- a/drivers/infiniband/hw/irdma/verbs.c +++ b/drivers/infiniband/hw/irdma/verbs.c @@ -2040,7 +2040,7 @@ static int irdma_create_cq(struct ib_cq *ibcq, /* Kmode allocations */ int rsize;
- if (entries > rf->max_cqe) { + if (entries < 1 || entries > rf->max_cqe) { err_code = -EINVAL; goto cq_free_rsrc; }
From: Sindhu Devale sindhu.devale@intel.com
[ Upstream commit d3bdcd59633907ee306057b6bb70f06dce47dddc ]
When the retry counter exceeds, as the remote QP didn't send any Ack or Nack an asynchronous event (AE) for too many retries is generated. Add code to handle the AE and set the correct IB WC error code IB_WC_RETRY_EXC_ERR.
Fixes: b48c24c2d710 ("RDMA/irdma: Implement device supported verb APIs") Link: https://lore.kernel.org/r/20210916191222.824-4-shiraz.saleem@intel.com Signed-off-by: Sindhu Devale sindhu.devale@intel.com Signed-off-by: Shiraz Saleem shiraz.saleem@intel.com Signed-off-by: Jason Gunthorpe jgg@nvidia.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/infiniband/hw/irdma/hw.c | 3 +++ drivers/infiniband/hw/irdma/user.h | 1 + drivers/infiniband/hw/irdma/verbs.c | 2 ++ 3 files changed, 6 insertions(+)
diff --git a/drivers/infiniband/hw/irdma/hw.c b/drivers/infiniband/hw/irdma/hw.c index 33c06a3a4f63..cb9a8e24e3b7 100644 --- a/drivers/infiniband/hw/irdma/hw.c +++ b/drivers/infiniband/hw/irdma/hw.c @@ -176,6 +176,9 @@ static void irdma_set_flush_fields(struct irdma_sc_qp *qp, case IRDMA_AE_LLP_RECEIVED_MPA_CRC_ERROR: qp->flush_code = FLUSH_GENERAL_ERR; break; + case IRDMA_AE_LLP_TOO_MANY_RETRIES: + qp->flush_code = FLUSH_RETRY_EXC_ERR; + break; default: qp->flush_code = FLUSH_FATAL_ERR; break; diff --git a/drivers/infiniband/hw/irdma/user.h b/drivers/infiniband/hw/irdma/user.h index ff705f323233..267102d1049d 100644 --- a/drivers/infiniband/hw/irdma/user.h +++ b/drivers/infiniband/hw/irdma/user.h @@ -102,6 +102,7 @@ enum irdma_flush_opcode { FLUSH_REM_OP_ERR, FLUSH_LOC_LEN_ERR, FLUSH_FATAL_ERR, + FLUSH_RETRY_EXC_ERR, };
enum irdma_cmpl_status { diff --git a/drivers/infiniband/hw/irdma/verbs.c b/drivers/infiniband/hw/irdma/verbs.c index 6c3d28f744cb..3960c872ff76 100644 --- a/drivers/infiniband/hw/irdma/verbs.c +++ b/drivers/infiniband/hw/irdma/verbs.c @@ -3358,6 +3358,8 @@ static enum ib_wc_status irdma_flush_err_to_ib_wc_status(enum irdma_flush_opcode return IB_WC_LOC_LEN_ERR; case FLUSH_GENERAL_ERR: return IB_WC_WR_FLUSH_ERR; + case FLUSH_RETRY_EXC_ERR: + return IB_WC_RETRY_EXC_ERR; case FLUSH_FATAL_ERR: default: return IB_WC_FATAL_ERR;
From: Sindhu Devale sindhu.devale@intel.com
[ Upstream commit 9f7fa37a6bd90f2749c67f8524334c387d972eb9 ]
Report the correct WC error when MW bind error related asynchronous events are generated by HW.
Fixes: b48c24c2d710 ("RDMA/irdma: Implement device supported verb APIs") Link: https://lore.kernel.org/r/20210916191222.824-5-shiraz.saleem@intel.com Signed-off-by: Sindhu Devale sindhu.devale@intel.com Signed-off-by: Shiraz Saleem shiraz.saleem@intel.com Signed-off-by: Jason Gunthorpe jgg@nvidia.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/infiniband/hw/irdma/hw.c | 5 +++++ drivers/infiniband/hw/irdma/user.h | 1 + drivers/infiniband/hw/irdma/verbs.c | 2 ++ 3 files changed, 8 insertions(+)
diff --git a/drivers/infiniband/hw/irdma/hw.c b/drivers/infiniband/hw/irdma/hw.c index cb9a8e24e3b7..7de525a5ccf8 100644 --- a/drivers/infiniband/hw/irdma/hw.c +++ b/drivers/infiniband/hw/irdma/hw.c @@ -179,6 +179,11 @@ static void irdma_set_flush_fields(struct irdma_sc_qp *qp, case IRDMA_AE_LLP_TOO_MANY_RETRIES: qp->flush_code = FLUSH_RETRY_EXC_ERR; break; + case IRDMA_AE_AMP_MWBIND_INVALID_RIGHTS: + case IRDMA_AE_AMP_MWBIND_BIND_DISABLED: + case IRDMA_AE_AMP_MWBIND_INVALID_BOUNDS: + qp->flush_code = FLUSH_MW_BIND_ERR; + break; default: qp->flush_code = FLUSH_FATAL_ERR; break; diff --git a/drivers/infiniband/hw/irdma/user.h b/drivers/infiniband/hw/irdma/user.h index 267102d1049d..3dcbb1fbf2c6 100644 --- a/drivers/infiniband/hw/irdma/user.h +++ b/drivers/infiniband/hw/irdma/user.h @@ -103,6 +103,7 @@ enum irdma_flush_opcode { FLUSH_LOC_LEN_ERR, FLUSH_FATAL_ERR, FLUSH_RETRY_EXC_ERR, + FLUSH_MW_BIND_ERR, };
enum irdma_cmpl_status { diff --git a/drivers/infiniband/hw/irdma/verbs.c b/drivers/infiniband/hw/irdma/verbs.c index 3960c872ff76..fa393c5ea397 100644 --- a/drivers/infiniband/hw/irdma/verbs.c +++ b/drivers/infiniband/hw/irdma/verbs.c @@ -3360,6 +3360,8 @@ static enum ib_wc_status irdma_flush_err_to_ib_wc_status(enum irdma_flush_opcode return IB_WC_WR_FLUSH_ERR; case FLUSH_RETRY_EXC_ERR: return IB_WC_RETRY_EXC_ERR; + case FLUSH_MW_BIND_ERR: + return IB_WC_MW_BIND_ERR; case FLUSH_FATAL_ERR: default: return IB_WC_FATAL_ERR;
From: Florian Westphal fw@strlen.de
[ Upstream commit a499b03bf36b0c2e3b958a381d828678ab0ffc5e ]
syzbot reports following UAF: BUG: KASAN: use-after-free in memcmp+0x18f/0x1c0 lib/string.c:955 nla_strcmp+0xf2/0x130 lib/nlattr.c:836 nft_table_lookup.part.0+0x1a2/0x460 net/netfilter/nf_tables_api.c:570 nft_table_lookup net/netfilter/nf_tables_api.c:4064 [inline] nf_tables_getset+0x1b3/0x860 net/netfilter/nf_tables_api.c:4064 nfnetlink_rcv_msg+0x659/0x13f0 net/netfilter/nfnetlink.c:285 netlink_rcv_skb+0x153/0x420 net/netlink/af_netlink.c:2504
Problem is that all get operations are lockless, so the commit_mutex held by nft_rcv_nl_event() isn't enough to stop a parallel GET request from doing read-accesses to the table object even after synchronize_rcu().
To avoid this, unlink the table first and store the table objects in on-stack scratch space.
Fixes: 6001a930ce03 ("netfilter: nftables: introduce table ownership") Reported-and-tested-by: syzbot+f31660cf279b0557160c@syzkaller.appspotmail.com Signed-off-by: Florian Westphal fw@strlen.de Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/netfilter/nf_tables_api.c | 28 ++++++++++++++++++---------- 1 file changed, 18 insertions(+), 10 deletions(-)
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c index 081437dd75b7..33e771cd847c 100644 --- a/net/netfilter/nf_tables_api.c +++ b/net/netfilter/nf_tables_api.c @@ -9599,7 +9599,6 @@ static void __nft_release_table(struct net *net, struct nft_table *table) table->use--; nf_tables_chain_destroy(&ctx); } - list_del(&table->list); nf_tables_table_destroy(&ctx); }
@@ -9612,6 +9611,8 @@ static void __nft_release_tables(struct net *net) if (nft_table_has_owner(table)) continue;
+ list_del(&table->list); + __nft_release_table(net, table); } } @@ -9619,31 +9620,38 @@ static void __nft_release_tables(struct net *net) static int nft_rcv_nl_event(struct notifier_block *this, unsigned long event, void *ptr) { + struct nft_table *table, *to_delete[8]; struct nftables_pernet *nft_net; struct netlink_notify *n = ptr; - struct nft_table *table, *nt; struct net *net = n->net; - bool release = false; + unsigned int deleted; + bool restart = false;
if (event != NETLINK_URELEASE || n->protocol != NETLINK_NETFILTER) return NOTIFY_DONE;
nft_net = nft_pernet(net); + deleted = 0; mutex_lock(&nft_net->commit_mutex); +again: list_for_each_entry(table, &nft_net->tables, list) { if (nft_table_has_owner(table) && n->portid == table->nlpid) { __nft_release_hook(net, table); - release = true; + list_del_rcu(&table->list); + to_delete[deleted++] = table; + if (deleted >= ARRAY_SIZE(to_delete)) + break; } } - if (release) { + if (deleted) { + restart = deleted >= ARRAY_SIZE(to_delete); synchronize_rcu(); - list_for_each_entry_safe(table, nt, &nft_net->tables, list) { - if (nft_table_has_owner(table) && - n->portid == table->nlpid) - __nft_release_table(net, table); - } + while (deleted) + __nft_release_table(net, to_delete[--deleted]); + + if (restart) + goto again; } mutex_unlock(&nft_net->commit_mutex);
From: Florian Westphal fw@strlen.de
[ Upstream commit b53deef054e58fe4f37c66211b8ece9f8fc1aa13 ]
iptables/nftables has two types of log modules:
1. backend, e.g. nf_log_syslog, which implement the functionality 2. frontend, e.g. xt_LOG or nft_log, which call the functionality provided by backend based on nf_tables or xtables rule set.
Problem is that the request_module() call to load the backed in nf_logger_find_get() might happen with nftables transaction mutex held in case the call path is via nf_tables/nft_compat.
This can cause deadlocks (see 'Fixes' tags for details).
The chosen solution as to let modprobe deal with this by adding 'pre: ' soft dep tag to xt_LOG (to load the syslog backend) and xt_NFLOG (to load nflog backend).
Eric reports that this breaks on systems with older modprobe that doesn't support softdeps.
Another, similar issue occurs when someone either insmods xt_(NF)LOG directly or unloads the backend module (possible if no log frontend is in use): because the frontend module is already loaded, modprobe is not invoked again so the softdep isn't evaluated.
Add a workaround: If nf_logger_find_get() returns -ENOENT and call is not via nft_compat, load the backend explicitly and try again.
Else, let nft_compat ask for deferred request_module via nf_tables infra.
Softdeps are kept in-place, so with newer modprobe the dependencies are resolved from userspace.
Fixes: cefa31a9d461 ("netfilter: nft_log: perform module load from nf_tables") Fixes: a38b5b56d6f4 ("netfilter: nf_log: add module softdeps") Reported-and-tested-by: Eric Dumazet edumazet@google.com Signed-off-by: Florian Westphal fw@strlen.de Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/netfilter/nft_compat.c | 17 ++++++++++++++++- net/netfilter/xt_LOG.c | 10 +++++++++- net/netfilter/xt_NFLOG.c | 10 +++++++++- 3 files changed, 34 insertions(+), 3 deletions(-)
diff --git a/net/netfilter/nft_compat.c b/net/netfilter/nft_compat.c index 272bcdb1392d..f69cc73c5813 100644 --- a/net/netfilter/nft_compat.c +++ b/net/netfilter/nft_compat.c @@ -19,6 +19,7 @@ #include <linux/netfilter_bridge/ebtables.h> #include <linux/netfilter_arp/arp_tables.h> #include <net/netfilter/nf_tables.h> +#include <net/netfilter/nf_log.h>
/* Used for matches where *info is larger than X byte */ #define NFT_MATCH_LARGE_THRESH 192 @@ -257,8 +258,22 @@ nft_target_init(const struct nft_ctx *ctx, const struct nft_expr *expr, nft_compat_wait_for_destructors();
ret = xt_check_target(&par, size, proto, inv); - if (ret < 0) + if (ret < 0) { + if (ret == -ENOENT) { + const char *modname = NULL; + + if (strcmp(target->name, "LOG") == 0) + modname = "nf_log_syslog"; + else if (strcmp(target->name, "NFLOG") == 0) + modname = "nfnetlink_log"; + + if (modname && + nft_request_module(ctx->net, "%s", modname) == -EAGAIN) + return -EAGAIN; + } + return ret; + }
/* The standard target cannot be used */ if (!target->target) diff --git a/net/netfilter/xt_LOG.c b/net/netfilter/xt_LOG.c index 2ff75f7637b0..f39244f9c0ed 100644 --- a/net/netfilter/xt_LOG.c +++ b/net/netfilter/xt_LOG.c @@ -44,6 +44,7 @@ log_tg(struct sk_buff *skb, const struct xt_action_param *par) static int log_tg_check(const struct xt_tgchk_param *par) { const struct xt_log_info *loginfo = par->targinfo; + int ret;
if (par->family != NFPROTO_IPV4 && par->family != NFPROTO_IPV6) return -EINVAL; @@ -58,7 +59,14 @@ static int log_tg_check(const struct xt_tgchk_param *par) return -EINVAL; }
- return nf_logger_find_get(par->family, NF_LOG_TYPE_LOG); + ret = nf_logger_find_get(par->family, NF_LOG_TYPE_LOG); + if (ret != 0 && !par->nft_compat) { + request_module("%s", "nf_log_syslog"); + + ret = nf_logger_find_get(par->family, NF_LOG_TYPE_LOG); + } + + return ret; }
static void log_tg_destroy(const struct xt_tgdtor_param *par) diff --git a/net/netfilter/xt_NFLOG.c b/net/netfilter/xt_NFLOG.c index fb5793208059..e660c3710a10 100644 --- a/net/netfilter/xt_NFLOG.c +++ b/net/netfilter/xt_NFLOG.c @@ -42,13 +42,21 @@ nflog_tg(struct sk_buff *skb, const struct xt_action_param *par) static int nflog_tg_check(const struct xt_tgchk_param *par) { const struct xt_nflog_info *info = par->targinfo; + int ret;
if (info->flags & ~XT_NFLOG_MASK) return -EINVAL; if (info->prefix[sizeof(info->prefix) - 1] != '\0') return -EINVAL;
- return nf_logger_find_get(par->family, NF_LOG_TYPE_ULOG); + ret = nf_logger_find_get(par->family, NF_LOG_TYPE_ULOG); + if (ret != 0 && !par->nft_compat) { + request_module("%s", "nfnetlink_log"); + + ret = nf_logger_find_get(par->family, NF_LOG_TYPE_ULOG); + } + + return ret; }
static void nflog_tg_destroy(const struct xt_tgdtor_param *par)
From: Felix Fietkau nbd@nbd.name
[ Upstream commit 98d46b021f6ee246c7a73f9d490d4cddb4511a3b ]
This reverts commit d333322361e7 ("mac80211: do not use low data rates for data frames with no ack flag").
Returning false early in rate_control_send_low breaks sending broadcast packets, since rate control will not select a rate for it.
Before re-introducing a fixed version of this patch, we should probably also make some changes to rate control to be more conservative in selecting rates for no-ack packets and also prevent using probing rates on them, since we won't get any feedback.
Fixes: d333322361e7 ("mac80211: do not use low data rates for data frames with no ack flag") Signed-off-by: Felix Fietkau nbd@nbd.name Link: https://lore.kernel.org/r/20210906083559.9109-1-nbd@nbd.name Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/mac80211/rate.c | 4 ---- 1 file changed, 4 deletions(-)
diff --git a/net/mac80211/rate.c b/net/mac80211/rate.c index e5935e3d7a07..8c6416129d5b 100644 --- a/net/mac80211/rate.c +++ b/net/mac80211/rate.c @@ -392,10 +392,6 @@ static bool rate_control_send_low(struct ieee80211_sta *pubsta, int mcast_rate; bool use_basicrate = false;
- if (ieee80211_is_tx_data(txrc->skb) && - info->flags & IEEE80211_TX_CTL_NO_ACK) - return false; - if (!pubsta || rc_no_data_or_no_ack_use_min(txrc)) { __rate_control_send_low(txrc->hw, sband, pubsta, info, txrc->rate_idx_mask);
From: Chih-Kang Chang gary.chang@realtek.com
[ Upstream commit fe94bac626d9c1c5bc98ab32707be8a9d7f8adba ]
In ieee80211_amsdu_aggregate() set a pointer frag_tail point to the end of skb_shinfo(head)->frag_list, and use it to bind other skb in the end of this function. But when execute ieee80211_amsdu_aggregate() ->ieee80211_amsdu_realloc_pad()->pskb_expand_head(), the address of skb_shinfo(head)->frag_list will be changed. However, the ieee80211_amsdu_aggregate() not update frag_tail after call pskb_expand_head(). That will cause the second skb can't bind to the head skb appropriately.So we update the address of frag_tail to fix it.
Fixes: 6e0456b54545 ("mac80211: add A-MSDU tx support") Signed-off-by: Chih-Kang Chang gary.chang@realtek.com Signed-off-by: Zong-Zhe Yang kevin_yang@realtek.com Signed-off-by: Ping-Ke Shih pkshih@realtek.com Link: https://lore.kernel.org/r/20210830073240.12736-1-pkshih@realtek.com [reword comment] Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/mac80211/tx.c | 8 ++++++++ 1 file changed, 8 insertions(+)
diff --git a/net/mac80211/tx.c b/net/mac80211/tx.c index fa09a369214d..0208f68af394 100644 --- a/net/mac80211/tx.c +++ b/net/mac80211/tx.c @@ -3380,6 +3380,14 @@ static bool ieee80211_amsdu_aggregate(struct ieee80211_sub_if_data *sdata, if (!ieee80211_amsdu_prepare_head(sdata, fast_tx, head)) goto out;
+ /* If n == 2, the "while (*frag_tail)" loop above didn't execute + * and frag_tail should be &skb_shinfo(head)->frag_list. + * However, ieee80211_amsdu_prepare_head() can reallocate it. + * Reload frag_tail to have it pointing to the correct place. + */ + if (n == 2) + frag_tail = &skb_shinfo(head)->frag_list; + /* * Pad out the previous subframe to a multiple of 4 by adding the * padding to the next one, that's being added. Note that head->len
From: Lorenzo Bianconi lorenzo@kernel.org
[ Upstream commit 13cb6d826e0ac0d144b0d48191ff1a111d32f0c6 ]
Limit max values for vht mcs and nss in ieee80211_parse_tx_radiotap routine in order to fix the following warning reported by syzbot:
WARNING: CPU: 0 PID: 10717 at include/net/mac80211.h:989 ieee80211_rate_set_vht include/net/mac80211.h:989 [inline] WARNING: CPU: 0 PID: 10717 at include/net/mac80211.h:989 ieee80211_parse_tx_radiotap+0x101e/0x12d0 net/mac80211/tx.c:2244 Modules linked in: CPU: 0 PID: 10717 Comm: syz-executor.5 Not tainted 5.14.0-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:ieee80211_rate_set_vht include/net/mac80211.h:989 [inline] RIP: 0010:ieee80211_parse_tx_radiotap+0x101e/0x12d0 net/mac80211/tx.c:2244 RSP: 0018:ffffc9000186f3e8 EFLAGS: 00010216 RAX: 0000000000000618 RBX: ffff88804ef76500 RCX: ffffc900143a5000 RDX: 0000000000040000 RSI: ffffffff888f478e RDI: 0000000000000003 RBP: 00000000ffffffff R08: 0000000000000000 R09: 0000000000000100 R10: ffffffff888f46f9 R11: 0000000000000000 R12: 00000000fffffff8 R13: ffff88804ef7653c R14: 0000000000000001 R15: 0000000000000004 FS: 00007fbf5718f700(0000) GS:ffff8880b9c00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000001b2de23000 CR3: 000000006a671000 CR4: 00000000001506f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600 Call Trace: ieee80211_monitor_select_queue+0xa6/0x250 net/mac80211/iface.c:740 netdev_core_pick_tx+0x169/0x2e0 net/core/dev.c:4089 __dev_queue_xmit+0x6f9/0x3710 net/core/dev.c:4165 __bpf_tx_skb net/core/filter.c:2114 [inline] __bpf_redirect_no_mac net/core/filter.c:2139 [inline] __bpf_redirect+0x5ba/0xd20 net/core/filter.c:2162 ____bpf_clone_redirect net/core/filter.c:2429 [inline] bpf_clone_redirect+0x2ae/0x420 net/core/filter.c:2401 bpf_prog_eeb6f53a69e5c6a2+0x59/0x234 bpf_dispatcher_nop_func include/linux/bpf.h:717 [inline] __bpf_prog_run include/linux/filter.h:624 [inline] bpf_prog_run include/linux/filter.h:631 [inline] bpf_test_run+0x381/0xa30 net/bpf/test_run.c:119 bpf_prog_test_run_skb+0xb84/0x1ee0 net/bpf/test_run.c:663 bpf_prog_test_run kernel/bpf/syscall.c:3307 [inline] __sys_bpf+0x2137/0x5df0 kernel/bpf/syscall.c:4605 __do_sys_bpf kernel/bpf/syscall.c:4691 [inline] __se_sys_bpf kernel/bpf/syscall.c:4689 [inline] __x64_sys_bpf+0x75/0xb0 kernel/bpf/syscall.c:4689 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x4665f9
Reported-by: syzbot+0196ac871673f0c20f68@syzkaller.appspotmail.com Fixes: 646e76bb5daf4 ("mac80211: parse VHT info in injected frames") Signed-off-by: Lorenzo Bianconi lorenzo@kernel.org Link: https://lore.kernel.org/r/c26c3f02dcb38ab63b2f2534cb463d95ee81bb13.163214176... Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/mac80211/tx.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/net/mac80211/tx.c b/net/mac80211/tx.c index 0208f68af394..751e601c4623 100644 --- a/net/mac80211/tx.c +++ b/net/mac80211/tx.c @@ -2209,7 +2209,11 @@ bool ieee80211_parse_tx_radiotap(struct sk_buff *skb, }
vht_mcs = iterator.this_arg[4] >> 4; + if (vht_mcs > 11) + vht_mcs = 0; vht_nss = iterator.this_arg[4] & 0xF; + if (!vht_nss || vht_nss > 8) + vht_nss = 1; break;
/*
From: Johannes Berg johannes.berg@intel.com
[ Upstream commit b9731062ce8afd35cf723bf3a8ad55d208f915a5 ]
The pointer here points directly into the frame, so the access is potentially unaligned. Use get_unaligned_le16 to avoid that.
Fixes: 3f52b7e328c5 ("mac80211: mesh power save basics") Link: https://lore.kernel.org/r/20210920154009.3110ff75be0c.Ib6a2ff9e9cc9bc6fca50f... Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/mac80211/mesh_ps.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/net/mac80211/mesh_ps.c b/net/mac80211/mesh_ps.c index 204830a55240..3fbd0b9ff913 100644 --- a/net/mac80211/mesh_ps.c +++ b/net/mac80211/mesh_ps.c @@ -2,6 +2,7 @@ /* * Copyright 2012-2013, Marco Porsch marco.porsch@s2005.tu-chemnitz.de * Copyright 2012-2013, cozybit Inc. + * Copyright (C) 2021 Intel Corporation */
#include "mesh.h" @@ -588,7 +589,7 @@ void ieee80211_mps_frame_release(struct sta_info *sta,
/* only transmit to PS STA with announced, non-zero awake window */ if (test_sta_flag(sta, WLAN_STA_PS_STA) && - (!elems->awake_window || !le16_to_cpu(*elems->awake_window))) + (!elems->awake_window || !get_unaligned_le16(elems->awake_window))) return;
if (!test_sta_flag(sta, WLAN_STA_MPSP_OWNER))
From: Johannes Berg johannes.berg@intel.com
[ Upstream commit 313bbd1990b6ddfdaa7da098d0c56b098a833572 ]
Thomas explained in https://lore.kernel.org/r/87mtoeb4hb.ffs@tglx that our handling of the hrtimer here is wrong: If the timer fires late (e.g. due to vCPU scheduling, as reported by Dmitry/syzbot) then it tries to actually rearm the timer at the next deadline, which might be in the past already:
1 2 3 N N+1 | | | ... | |
^ intended to fire here (1) ^ next deadline here (2) ^ actually fired here
The next time it fires, it's later, but will still try to schedule for the next deadline (now 3), etc. until it catches up with N, but that might take a long time, causing stalls etc.
Now, all of this is simulation, so we just have to fix it, but note that the behaviour is wrong even per spec, since there's no value then in sending all those beacons unaligned - they should be aligned to the TBTT (1, 2, 3, ... in the picture), and if we're a bit (or a lot) late, then just resume at that point.
Therefore, change the code to use hrtimer_forward_now() which will ensure that the next firing of the timer would be at N+1 (in the picture), i.e. the next interval point after the current time.
Suggested-by: Thomas Gleixner tglx@linutronix.de Reported-by: Dmitry Vyukov dvyukov@google.com Reported-by: syzbot+0e964fad69a9c462bc1e@syzkaller.appspotmail.com Fixes: 01e59e467ecf ("mac80211_hwsim: hrtimer beacon") Reviewed-by: Thomas Gleixner tglx@linutronix.de Link: https://lore.kernel.org/r/20210915112936.544f383472eb.I3f9712009027aa09244b6... Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/mac80211_hwsim.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/net/wireless/mac80211_hwsim.c b/drivers/net/wireless/mac80211_hwsim.c index ffa894f7312a..0adae76eb8df 100644 --- a/drivers/net/wireless/mac80211_hwsim.c +++ b/drivers/net/wireless/mac80211_hwsim.c @@ -1867,8 +1867,8 @@ mac80211_hwsim_beacon(struct hrtimer *timer) bcn_int -= data->bcn_delta; data->bcn_delta = 0; } - hrtimer_forward(&data->beacon_timer, hrtimer_get_expires(timer), - ns_to_ktime(bcn_int * NSEC_PER_USEC)); + hrtimer_forward_now(&data->beacon_timer, + ns_to_ktime(bcn_int * NSEC_PER_USEC)); return HRTIMER_RESTART; }
From: Saravana Kannan saravanak@google.com
[ Upstream commit 5501765a02a6c324f78581e6bb8209d054fe13ae ]
If a parent device is also a supplier to a child device, fw_devlink=on by design delays the probe() of the child device until the probe() of the parent finishes successfully.
However, some drivers of such parent devices (where parent is also a supplier) expect the child device to finish probing successfully as soon as they are added using device_add() and before the probe() of the parent device has completed successfully. One example of such a case is discussed in the link mentioned below.
Add a flag to make fw_devlink=on not enforce these supplier-consumer relationships, so these drivers can continue working.
Link: https://lore.kernel.org/netdev/CAGETcx_uj0V4DChME-gy5HGKTYnxLBX=TH2rag29f_p=... Fixes: ea718c699055 ("Revert "Revert "driver core: Set fw_devlink=on by default""") Signed-off-by: Saravana Kannan saravanak@google.com Link: https://lore.kernel.org/r/20210915170940.617415-3-saravanak@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/base/core.c | 19 +++++++++++++++++++ include/linux/fwnode.h | 11 ++++++++--- 2 files changed, 27 insertions(+), 3 deletions(-)
diff --git a/drivers/base/core.c b/drivers/base/core.c index 8c77e14987d4..3d2fc70b9951 100644 --- a/drivers/base/core.c +++ b/drivers/base/core.c @@ -1721,6 +1721,25 @@ static int fw_devlink_create_devlink(struct device *con, struct device *sup_dev; int ret = 0;
+ /* + * In some cases, a device P might also be a supplier to its child node + * C. However, this would defer the probe of C until the probe of P + * completes successfully. This is perfectly fine in the device driver + * model. device_add() doesn't guarantee probe completion of the device + * by the time it returns. + * + * However, there are a few drivers that assume C will finish probing + * as soon as it's added and before P finishes probing. So, we provide + * a flag to let fw_devlink know not to delay the probe of C until the + * probe of P completes successfully. + * + * When such a flag is set, we can't create device links where P is the + * supplier of C as that would delay the probe of C. + */ + if (sup_handle->flags & FWNODE_FLAG_NEEDS_CHILD_BOUND_ON_ADD && + fwnode_is_ancestor_of(sup_handle, con->fwnode)) + return -EINVAL; + sup_dev = get_dev_from_fwnode(sup_handle); if (sup_dev) { /* diff --git a/include/linux/fwnode.h b/include/linux/fwnode.h index 59828516ebaf..9f4ad719bfe3 100644 --- a/include/linux/fwnode.h +++ b/include/linux/fwnode.h @@ -22,10 +22,15 @@ struct device; * LINKS_ADDED: The fwnode has already be parsed to add fwnode links. * NOT_DEVICE: The fwnode will never be populated as a struct device. * INITIALIZED: The hardware corresponding to fwnode has been initialized. + * NEEDS_CHILD_BOUND_ON_ADD: For this fwnode/device to probe successfully, its + * driver needs its child devices to be bound with + * their respective drivers as soon as they are + * added. */ -#define FWNODE_FLAG_LINKS_ADDED BIT(0) -#define FWNODE_FLAG_NOT_DEVICE BIT(1) -#define FWNODE_FLAG_INITIALIZED BIT(2) +#define FWNODE_FLAG_LINKS_ADDED BIT(0) +#define FWNODE_FLAG_NOT_DEVICE BIT(1) +#define FWNODE_FLAG_INITIALIZED BIT(2) +#define FWNODE_FLAG_NEEDS_CHILD_BOUND_ON_ADD BIT(3)
struct fwnode_handle { struct fwnode_handle *secondary;
From: Saravana Kannan saravanak@google.com
[ Upstream commit 04f41c68f18886aea5afc68be945e7195ea1d598 ]
There are many instances of PHYs that depend on a switch to supply a resource (Eg: interrupts). Switches also expects the PHYs to be probed by their specific drivers as soon as they are added. If that doesn't happen, then the switch would force the use of generic PHY drivers for the PHY even if the PHY might have specific driver available.
fw_devlink=on by design can cause delayed probes of PHY. To avoid, this we need to set the FWNODE_FLAG_NEEDS_CHILD_BOUND_ON_ADD for the switch's fwnode before the PHYs are added. The most generic way to do this is to set this flag for the parent of MDIO busses which is typically the switch.
For more context: https://lore.kernel.org/lkml/YTll0i6Rz3WAAYzs@lunn.ch/#t
Fixes: ea718c699055 ("Revert "Revert "driver core: Set fw_devlink=on by default""") Suggested-by: Andrew Lunn andrew@lunn.ch Signed-off-by: Saravana Kannan saravanak@google.com Link: https://lore.kernel.org/r/20210915170940.617415-4-saravanak@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/phy/mdio_bus.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/drivers/net/phy/mdio_bus.c b/drivers/net/phy/mdio_bus.c index 53f034fc2ef7..ee8313a4ac71 100644 --- a/drivers/net/phy/mdio_bus.c +++ b/drivers/net/phy/mdio_bus.c @@ -525,6 +525,10 @@ int __mdiobus_register(struct mii_bus *bus, struct module *owner) NULL == bus->read || NULL == bus->write) return -EINVAL;
+ if (bus->parent && bus->parent->of_node) + bus->parent->of_node->fwnode.flags |= + FWNODE_FLAG_NEEDS_CHILD_BOUND_ON_ADD; + BUG_ON(bus->state != MDIOBUS_ALLOCATED && bus->state != MDIOBUS_UNREGISTERED);
From: Xin Long lucien.xin@gmail.com
[ Upstream commit f7e745f8e94492a8ac0b0a26e25f2b19d342918f ]
We should always check if skb_header_pointer's return is NULL before using it, otherwise it may cause null-ptr-deref, as syzbot reported:
KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007] RIP: 0010:sctp_rcv_ootb net/sctp/input.c:705 [inline] RIP: 0010:sctp_rcv+0x1d84/0x3220 net/sctp/input.c:196 Call Trace: <IRQ> sctp6_rcv+0x38/0x60 net/sctp/ipv6.c:1109 ip6_protocol_deliver_rcu+0x2e9/0x1ca0 net/ipv6/ip6_input.c:422 ip6_input_finish+0x62/0x170 net/ipv6/ip6_input.c:463 NF_HOOK include/linux/netfilter.h:307 [inline] NF_HOOK include/linux/netfilter.h:301 [inline] ip6_input+0x9c/0xd0 net/ipv6/ip6_input.c:472 dst_input include/net/dst.h:460 [inline] ip6_rcv_finish net/ipv6/ip6_input.c:76 [inline] NF_HOOK include/linux/netfilter.h:307 [inline] NF_HOOK include/linux/netfilter.h:301 [inline] ipv6_rcv+0x28c/0x3c0 net/ipv6/ip6_input.c:297
Fixes: 3acb50c18d8d ("sctp: delay as much as possible skb_linearize") Reported-by: syzbot+581aff2ae6b860625116@syzkaller.appspotmail.com Signed-off-by: Xin Long lucien.xin@gmail.com Acked-by: Marcelo Ricardo Leitner marcelo.leitner@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/sctp/input.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/sctp/input.c b/net/sctp/input.c index 5ef86fdb1176..1f1786021d9c 100644 --- a/net/sctp/input.c +++ b/net/sctp/input.c @@ -702,7 +702,7 @@ static int sctp_rcv_ootb(struct sk_buff *skb) ch = skb_header_pointer(skb, offset, sizeof(*ch), &_ch);
/* Break out if chunk length is less then minimal. */ - if (ntohs(ch->length) < sizeof(_ch)) + if (!ch || ntohs(ch->length) < sizeof(_ch)) break;
ch_end = offset + SCTP_PAD4(ntohs(ch->length));
From: Florian Westphal fw@strlen.de
[ Upstream commit ea1300b9df7c8e8b65695a08b8f6aaf4b25fec9c ]
mptcp_token_get_sock() may return a mptcp socket that is in a different net namespace than the socket that received the token value.
The mptcp syncookie code path had an explicit check for this, this moves the test into mptcp_token_get_sock() function.
Eventually token.c should be converted to pernet storage, but such change is not suitable for net tree.
Fixes: 2c5ebd001d4f0 ("mptcp: refactor token container") Signed-off-by: Florian Westphal fw@strlen.de Signed-off-by: Mat Martineau mathew.j.martineau@linux.intel.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/mptcp/mptcp_diag.c | 2 +- net/mptcp/protocol.h | 2 +- net/mptcp/subflow.c | 2 +- net/mptcp/syncookies.c | 13 +------------ net/mptcp/token.c | 11 ++++++++--- net/mptcp/token_test.c | 14 ++++++++------ 6 files changed, 20 insertions(+), 24 deletions(-)
diff --git a/net/mptcp/mptcp_diag.c b/net/mptcp/mptcp_diag.c index f48eb6315bbb..292374fb0779 100644 --- a/net/mptcp/mptcp_diag.c +++ b/net/mptcp/mptcp_diag.c @@ -36,7 +36,7 @@ static int mptcp_diag_dump_one(struct netlink_callback *cb, struct sock *sk;
net = sock_net(in_skb->sk); - msk = mptcp_token_get_sock(req->id.idiag_cookie[0]); + msk = mptcp_token_get_sock(net, req->id.idiag_cookie[0]); if (!msk) goto out_nosk;
diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h index 6ac564d584c1..c8a49e92e66f 100644 --- a/net/mptcp/protocol.h +++ b/net/mptcp/protocol.h @@ -680,7 +680,7 @@ int mptcp_token_new_connect(struct sock *sk); void mptcp_token_accept(struct mptcp_subflow_request_sock *r, struct mptcp_sock *msk); bool mptcp_token_exists(u32 token); -struct mptcp_sock *mptcp_token_get_sock(u32 token); +struct mptcp_sock *mptcp_token_get_sock(struct net *net, u32 token); struct mptcp_sock *mptcp_token_iter_next(const struct net *net, long *s_slot, long *s_num); void mptcp_token_destroy(struct mptcp_sock *msk); diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c index 966f777d35ce..1f3039b829a7 100644 --- a/net/mptcp/subflow.c +++ b/net/mptcp/subflow.c @@ -86,7 +86,7 @@ static struct mptcp_sock *subflow_token_join_request(struct request_sock *req) struct mptcp_sock *msk; int local_id;
- msk = mptcp_token_get_sock(subflow_req->token); + msk = mptcp_token_get_sock(sock_net(req_to_sk(req)), subflow_req->token); if (!msk) { SUBFLOW_REQ_INC_STATS(req, MPTCP_MIB_JOINNOTOKEN); return NULL; diff --git a/net/mptcp/syncookies.c b/net/mptcp/syncookies.c index 37127781aee9..7f22526346a7 100644 --- a/net/mptcp/syncookies.c +++ b/net/mptcp/syncookies.c @@ -108,18 +108,12 @@ bool mptcp_token_join_cookie_init_state(struct mptcp_subflow_request_sock *subfl
e->valid = 0;
- msk = mptcp_token_get_sock(e->token); + msk = mptcp_token_get_sock(net, e->token); if (!msk) { spin_unlock_bh(&join_entry_locks[i]); return false; }
- /* If this fails, the token got re-used in the mean time by another - * mptcp socket in a different netns, i.e. entry is outdated. - */ - if (!net_eq(sock_net((struct sock *)msk), net)) - goto err_put; - subflow_req->remote_nonce = e->remote_nonce; subflow_req->local_nonce = e->local_nonce; subflow_req->backup = e->backup; @@ -128,11 +122,6 @@ bool mptcp_token_join_cookie_init_state(struct mptcp_subflow_request_sock *subfl subflow_req->msk = msk; spin_unlock_bh(&join_entry_locks[i]); return true; - -err_put: - spin_unlock_bh(&join_entry_locks[i]); - sock_put((struct sock *)msk); - return false; }
void __init mptcp_join_cookie_init(void) diff --git a/net/mptcp/token.c b/net/mptcp/token.c index a98e554b034f..e581b341c5be 100644 --- a/net/mptcp/token.c +++ b/net/mptcp/token.c @@ -231,6 +231,7 @@ bool mptcp_token_exists(u32 token)
/** * mptcp_token_get_sock - retrieve mptcp connection sock using its token + * @net: restrict to this namespace * @token: token of the mptcp connection to retrieve * * This function returns the mptcp connection structure with the given token. @@ -238,7 +239,7 @@ bool mptcp_token_exists(u32 token) * * returns NULL if no connection with the given token value exists. */ -struct mptcp_sock *mptcp_token_get_sock(u32 token) +struct mptcp_sock *mptcp_token_get_sock(struct net *net, u32 token) { struct hlist_nulls_node *pos; struct token_bucket *bucket; @@ -251,11 +252,15 @@ struct mptcp_sock *mptcp_token_get_sock(u32 token) again: sk_nulls_for_each_rcu(sk, pos, &bucket->msk_chain) { msk = mptcp_sk(sk); - if (READ_ONCE(msk->token) != token) + if (READ_ONCE(msk->token) != token || + !net_eq(sock_net(sk), net)) continue; + if (!refcount_inc_not_zero(&sk->sk_refcnt)) goto not_found; - if (READ_ONCE(msk->token) != token) { + + if (READ_ONCE(msk->token) != token || + !net_eq(sock_net(sk), net)) { sock_put(sk); goto again; } diff --git a/net/mptcp/token_test.c b/net/mptcp/token_test.c index e1bd6f0a0676..5d984bec1cd8 100644 --- a/net/mptcp/token_test.c +++ b/net/mptcp/token_test.c @@ -11,6 +11,7 @@ static struct mptcp_subflow_request_sock *build_req_sock(struct kunit *test) GFP_USER); KUNIT_EXPECT_NOT_ERR_OR_NULL(test, req); mptcp_token_init_request((struct request_sock *)req); + sock_net_set((struct sock *)req, &init_net); return req; }
@@ -22,7 +23,7 @@ static void mptcp_token_test_req_basic(struct kunit *test) KUNIT_ASSERT_EQ(test, 0, mptcp_token_new_request((struct request_sock *)req)); KUNIT_EXPECT_NE(test, 0, (int)req->token); - KUNIT_EXPECT_PTR_EQ(test, null_msk, mptcp_token_get_sock(req->token)); + KUNIT_EXPECT_PTR_EQ(test, null_msk, mptcp_token_get_sock(&init_net, req->token));
/* cleanup */ mptcp_token_destroy_request((struct request_sock *)req); @@ -55,6 +56,7 @@ static struct mptcp_sock *build_msk(struct kunit *test) msk = kunit_kzalloc(test, sizeof(struct mptcp_sock), GFP_USER); KUNIT_EXPECT_NOT_ERR_OR_NULL(test, msk); refcount_set(&((struct sock *)msk)->sk_refcnt, 1); + sock_net_set((struct sock *)msk, &init_net); return msk; }
@@ -74,11 +76,11 @@ static void mptcp_token_test_msk_basic(struct kunit *test) mptcp_token_new_connect((struct sock *)icsk)); KUNIT_EXPECT_NE(test, 0, (int)ctx->token); KUNIT_EXPECT_EQ(test, ctx->token, msk->token); - KUNIT_EXPECT_PTR_EQ(test, msk, mptcp_token_get_sock(ctx->token)); + KUNIT_EXPECT_PTR_EQ(test, msk, mptcp_token_get_sock(&init_net, ctx->token)); KUNIT_EXPECT_EQ(test, 2, (int)refcount_read(&sk->sk_refcnt));
mptcp_token_destroy(msk); - KUNIT_EXPECT_PTR_EQ(test, null_msk, mptcp_token_get_sock(ctx->token)); + KUNIT_EXPECT_PTR_EQ(test, null_msk, mptcp_token_get_sock(&init_net, ctx->token)); }
static void mptcp_token_test_accept(struct kunit *test) @@ -90,11 +92,11 @@ static void mptcp_token_test_accept(struct kunit *test) mptcp_token_new_request((struct request_sock *)req)); msk->token = req->token; mptcp_token_accept(req, msk); - KUNIT_EXPECT_PTR_EQ(test, msk, mptcp_token_get_sock(msk->token)); + KUNIT_EXPECT_PTR_EQ(test, msk, mptcp_token_get_sock(&init_net, msk->token));
/* this is now a no-op */ mptcp_token_destroy_request((struct request_sock *)req); - KUNIT_EXPECT_PTR_EQ(test, msk, mptcp_token_get_sock(msk->token)); + KUNIT_EXPECT_PTR_EQ(test, msk, mptcp_token_get_sock(&init_net, msk->token));
/* cleanup */ mptcp_token_destroy(msk); @@ -116,7 +118,7 @@ static void mptcp_token_test_destroyed(struct kunit *test)
/* simulate race on removal */ refcount_set(&sk->sk_refcnt, 0); - KUNIT_EXPECT_PTR_EQ(test, null_msk, mptcp_token_get_sock(msk->token)); + KUNIT_EXPECT_PTR_EQ(test, null_msk, mptcp_token_get_sock(&init_net, msk->token));
/* cleanup */ mptcp_token_destroy(msk);
From: Davide Caratti dcaratti@redhat.com
[ Upstream commit 3f4a08909e2c740f8045efc74c4cf82eeaae3e36 ]
current Linux refuses to change the 'backup' bit of MPTCP endpoints, i.e. using MPTCP_PM_CMD_SET_FLAGS, unless it finds (at least) one subflow that matches the endpoint address. There is no reason for that, so we can just ignore the return value of mptcp_nl_addr_backup(). In this way, endpoints can reconfigure their 'backup' flag even if no MPTCP sockets are open (or more generally, in case the MP_PRIO message is not sent out).
Fixes: 0f9f696a502e ("mptcp: add set_flags command in PM netlink") Signed-off-by: Davide Caratti dcaratti@redhat.com Signed-off-by: Mat Martineau mathew.j.martineau@linux.intel.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/mptcp/pm_netlink.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/net/mptcp/pm_netlink.c b/net/mptcp/pm_netlink.c index 89251cbe9f1a..81103b29c0af 100644 --- a/net/mptcp/pm_netlink.c +++ b/net/mptcp/pm_netlink.c @@ -1558,9 +1558,7 @@ static int mptcp_nl_cmd_set_flags(struct sk_buff *skb, struct genl_info *info)
list_for_each_entry(entry, &pernet->local_addr_list, list) { if (addresses_equal(&entry->addr, &addr.addr, true)) { - ret = mptcp_nl_addr_backup(net, &entry->addr, bkup); - if (ret) - return ret; + mptcp_nl_addr_backup(net, &entry->addr, bkup);
if (bkup) entry->flags |= MPTCP_PM_ADDR_FLAG_BACKUP;
From: Jason Gunthorpe jgg@nvidia.com
[ Upstream commit 14351f08ed5c8b888cdd95651152db7e096ee27f ]
gcc 8.3 and 5.4 throw this:
In function 'modify_qp_init_to_rtr', ././include/linux/compiler_types.h:322:38: error: call to '__compiletime_assert_1859' declared with attribute error: FIELD_PREP: value too large for the field _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__) [..] drivers/infiniband/hw/hns/hns_roce_common.h:91:52: note: in expansion of macro 'FIELD_PREP' *((__le32 *)ptr + (field_h) / 32) |= cpu_to_le32(FIELD_PREP( \ ^~~~~~~~~~ drivers/infiniband/hw/hns/hns_roce_common.h:95:39: note: in expansion of macro '_hr_reg_write' #define hr_reg_write(ptr, field, val) _hr_reg_write(ptr, field, val) ^~~~~~~~~~~~~ drivers/infiniband/hw/hns/hns_roce_hw_v2.c:4412:2: note: in expansion of macro 'hr_reg_write' hr_reg_write(context, QPC_LP_PKTN_INI, lp_pktn_ini);
Because gcc has miscalculated the constantness of lp_pktn_ini:
mtu = ib_mtu_enum_to_int(ib_mtu); if (WARN_ON(mtu < 0)) [..] lp_pktn_ini = ilog2(MAX_LP_MSG_LEN / mtu);
Since mtu is limited to {256,512,1024,2048,4096} lp_pktn_ini is between 4 and 8 which is compatible with the 4 bit field in the FIELD_PREP.
Work around this broken compiler by adding a 'can never be true' constraint on lp_pktn_ini's value which clears out the problem.
Fixes: f0cb411aad23 ("RDMA/hns: Use new interface to modify QP context") Link: https://lore.kernel.org/r/0-v1-c773ecb137bc+11f-hns_gcc8_jgg@nvidia.com Reported-by: Geert Uytterhoeven geert@linux-m68k.org Signed-off-by: Jason Gunthorpe jgg@nvidia.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/infiniband/hw/hns/hns_roce_hw_v2.c | 11 ++++++----- 1 file changed, 6 insertions(+), 5 deletions(-)
diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c index c320891c8763..6cb4a4e10837 100644 --- a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c +++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c @@ -4411,7 +4411,12 @@ static int modify_qp_init_to_rtr(struct ib_qp *ibqp, hr_qp->path_mtu = ib_mtu;
mtu = ib_mtu_enum_to_int(ib_mtu); - if (WARN_ON(mtu < 0)) + if (WARN_ON(mtu <= 0)) + return -EINVAL; +#define MAX_LP_MSG_LEN 65536 + /* MTU * (2 ^ LP_PKTN_INI) shouldn't be bigger than 64KB */ + lp_pktn_ini = ilog2(MAX_LP_MSG_LEN / mtu); + if (WARN_ON(lp_pktn_ini >= 0xF)) return -EINVAL;
if (attr_mask & IB_QP_PATH_MTU) { @@ -4419,10 +4424,6 @@ static int modify_qp_init_to_rtr(struct ib_qp *ibqp, hr_reg_clear(qpc_mask, QPC_MTU); }
-#define MAX_LP_MSG_LEN 65536 - /* MTU * (2 ^ LP_PKTN_INI) shouldn't be bigger than 64KB */ - lp_pktn_ini = ilog2(MAX_LP_MSG_LEN / mtu); - hr_reg_write(context, QPC_LP_PKTN_INI, lp_pktn_ini); hr_reg_clear(qpc_mask, QPC_LP_PKTN_INI);
From: Paul Fertser fercerpav@gmail.com
[ Upstream commit 540effa7f283d25bcc13c0940d808002fee340b8 ]
For both local and remote sensors all the supported ICs can report an "undervoltage lockout" condition which means the conversion wasn't properly performed due to insufficient power supply voltage and so the measurement results can't be trusted.
Fixes: 9410700b881f ("hwmon: Add driver for Texas Instruments TMP421/422/423 sensor chips") Signed-off-by: Paul Fertser fercerpav@gmail.com Link: https://lore.kernel.org/r/20210924093011.26083-2-fercerpav@gmail.com Signed-off-by: Guenter Roeck linux@roeck-us.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/hwmon/tmp421.c | 9 +++------ 1 file changed, 3 insertions(+), 6 deletions(-)
diff --git a/drivers/hwmon/tmp421.c b/drivers/hwmon/tmp421.c index 8fd8c3a94dfe..c9ef83627bb7 100644 --- a/drivers/hwmon/tmp421.c +++ b/drivers/hwmon/tmp421.c @@ -179,10 +179,10 @@ static int tmp421_read(struct device *dev, enum hwmon_sensor_types type, return 0; case hwmon_temp_fault: /* - * The OPEN bit signals a fault. This is bit 0 of the temperature - * register (low byte). + * Any of OPEN or /PVLD bits indicate a hardware mulfunction + * and the conversion result may be incorrect */ - *val = tmp421->temp[channel] & 0x01; + *val = !!(tmp421->temp[channel] & 0x03); return 0; default: return -EOPNOTSUPP; @@ -195,9 +195,6 @@ static umode_t tmp421_is_visible(const void *data, enum hwmon_sensor_types type, { switch (attr) { case hwmon_temp_fault: - if (channel == 0) - return 0; - return 0444; case hwmon_temp_input: return 0444; default:
From: Paul Fertser fercerpav@gmail.com
[ Upstream commit 724e8af85854c4d3401313b6dd7d79cf792d8990 ]
Old code produces -24999 for 0b1110011100000000 input in standard format due to always rounding up rather than "away from zero".
Use the common macro for division, unify and simplify the conversion code along the way.
Fixes: 9410700b881f ("hwmon: Add driver for Texas Instruments TMP421/422/423 sensor chips") Signed-off-by: Paul Fertser fercerpav@gmail.com Link: https://lore.kernel.org/r/20210924093011.26083-3-fercerpav@gmail.com Signed-off-by: Guenter Roeck linux@roeck-us.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/hwmon/tmp421.c | 24 ++++++++---------------- 1 file changed, 8 insertions(+), 16 deletions(-)
diff --git a/drivers/hwmon/tmp421.c b/drivers/hwmon/tmp421.c index c9ef83627bb7..b963a369c5ab 100644 --- a/drivers/hwmon/tmp421.c +++ b/drivers/hwmon/tmp421.c @@ -100,23 +100,17 @@ struct tmp421_data { s16 temp[4]; };
-static int temp_from_s16(s16 reg) +static int temp_from_raw(u16 reg, bool extended) { /* Mask out status bits */ int temp = reg & ~0xf;
- return (temp * 1000 + 128) / 256; -} - -static int temp_from_u16(u16 reg) -{ - /* Mask out status bits */ - int temp = reg & ~0xf; - - /* Add offset for extended temperature range. */ - temp -= 64 * 256; + if (extended) + temp = temp - 64 * 256; + else + temp = (s16)temp;
- return (temp * 1000 + 128) / 256; + return DIV_ROUND_CLOSEST(temp * 1000, 256); }
static int tmp421_update_device(struct tmp421_data *data) @@ -172,10 +166,8 @@ static int tmp421_read(struct device *dev, enum hwmon_sensor_types type,
switch (attr) { case hwmon_temp_input: - if (tmp421->config & TMP421_CONFIG_RANGE) - *val = temp_from_u16(tmp421->temp[channel]); - else - *val = temp_from_s16(tmp421->temp[channel]); + *val = temp_from_raw(tmp421->temp[channel], + tmp421->config & TMP421_CONFIG_RANGE); return 0; case hwmon_temp_fault: /*
From: Vladimir Oltean vladimir.oltean@nxp.com
[ Upstream commit 325fd36ae76a6d089983b2d2eccb41237d35b221 ]
The enetc phylink .mac_config handler intends to clear the IFMODE field (bits 1:0) of the PM0_IF_MODE register, but incorrectly clears all the other fields instead.
For normal operation, the bug was inconsequential, due to the fact that we write the PM0_IF_MODE register in two stages, first in phylink .mac_config (which incorrectly cleared out a bunch of stuff), then we update the speed and duplex to the correct values in phylink .mac_link_up.
Judging by the code (not tested), it looks like maybe loopback mode was broken, since this is one of the settings in PM0_IF_MODE which is incorrectly cleared.
Fixes: c76a97218dcb ("net: enetc: force the RGMII speed and duplex instead of operating in inband mode") Reported-by: Pavel Machek (CIP) pavel@denx.de Signed-off-by: Vladimir Oltean vladimir.oltean@nxp.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/freescale/enetc/enetc_pf.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/freescale/enetc/enetc_pf.c b/drivers/net/ethernet/freescale/enetc/enetc_pf.c index c84f6c226743..cf00709caea4 100644 --- a/drivers/net/ethernet/freescale/enetc/enetc_pf.c +++ b/drivers/net/ethernet/freescale/enetc/enetc_pf.c @@ -541,8 +541,7 @@ static void enetc_mac_config(struct enetc_hw *hw, phy_interface_t phy_mode)
if (phy_interface_mode_is_rgmii(phy_mode)) { val = enetc_port_rd(hw, ENETC_PM0_IF_MODE); - val &= ~ENETC_PM0_IFM_EN_AUTO; - val &= ENETC_PM0_IFM_IFMODE_MASK; + val &= ~(ENETC_PM0_IFM_EN_AUTO | ENETC_PM0_IFM_IFMODE_MASK); val |= ENETC_PM0_IFM_IFMODE_GMII | ENETC_PM0_IFM_RG; enetc_port_wr(hw, ENETC_PM0_IF_MODE, val); }
From: Xiao Liang shaw.leon@gmail.com
[ Upstream commit 597aa16c782496bf74c5dc3b45ff472ade6cee64 ]
Multipath RTA_FLOW is embedded in nexthop. Dump it in fib_add_nexthop() to get the length of rtnexthop correct.
Fixes: b0f60193632e ("ipv4: Refactor nexthop attributes in fib_dump_info") Signed-off-by: Xiao Liang shaw.leon@gmail.com Reviewed-by: David Ahern dsahern@kernel.org Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- include/net/ip_fib.h | 2 +- include/net/nexthop.h | 2 +- net/ipv4/fib_semantics.c | 16 +++++++++------- net/ipv6/route.c | 5 +++-- 4 files changed, 14 insertions(+), 11 deletions(-)
diff --git a/include/net/ip_fib.h b/include/net/ip_fib.h index 3ab2563b1a23..7fd7f6093612 100644 --- a/include/net/ip_fib.h +++ b/include/net/ip_fib.h @@ -597,5 +597,5 @@ int ip_valid_fib_dump_req(struct net *net, const struct nlmsghdr *nlh, int fib_nexthop_info(struct sk_buff *skb, const struct fib_nh_common *nh, u8 rt_family, unsigned char *flags, bool skip_oif); int fib_add_nexthop(struct sk_buff *skb, const struct fib_nh_common *nh, - int nh_weight, u8 rt_family); + int nh_weight, u8 rt_family, u32 nh_tclassid); #endif /* _NET_FIB_H */ diff --git a/include/net/nexthop.h b/include/net/nexthop.h index 10e1777877e6..28085b995ddc 100644 --- a/include/net/nexthop.h +++ b/include/net/nexthop.h @@ -325,7 +325,7 @@ int nexthop_mpath_fill_node(struct sk_buff *skb, struct nexthop *nh, struct fib_nh_common *nhc = &nhi->fib_nhc; int weight = nhg->nh_entries[i].weight;
- if (fib_add_nexthop(skb, nhc, weight, rt_family) < 0) + if (fib_add_nexthop(skb, nhc, weight, rt_family, 0) < 0) return -EMSGSIZE; }
diff --git a/net/ipv4/fib_semantics.c b/net/ipv4/fib_semantics.c index 4c0c33e4710d..27fdd86b9cee 100644 --- a/net/ipv4/fib_semantics.c +++ b/net/ipv4/fib_semantics.c @@ -1663,7 +1663,7 @@ EXPORT_SYMBOL_GPL(fib_nexthop_info);
#if IS_ENABLED(CONFIG_IP_ROUTE_MULTIPATH) || IS_ENABLED(CONFIG_IPV6) int fib_add_nexthop(struct sk_buff *skb, const struct fib_nh_common *nhc, - int nh_weight, u8 rt_family) + int nh_weight, u8 rt_family, u32 nh_tclassid) { const struct net_device *dev = nhc->nhc_dev; struct rtnexthop *rtnh; @@ -1681,6 +1681,9 @@ int fib_add_nexthop(struct sk_buff *skb, const struct fib_nh_common *nhc,
rtnh->rtnh_flags = flags;
+ if (nh_tclassid && nla_put_u32(skb, RTA_FLOW, nh_tclassid)) + goto nla_put_failure; + /* length of rtnetlink header + attributes */ rtnh->rtnh_len = nlmsg_get_pos(skb) - (void *)rtnh;
@@ -1708,14 +1711,13 @@ static int fib_add_multipath(struct sk_buff *skb, struct fib_info *fi) }
for_nexthops(fi) { - if (fib_add_nexthop(skb, &nh->nh_common, nh->fib_nh_weight, - AF_INET) < 0) - goto nla_put_failure; + u32 nh_tclassid = 0; #ifdef CONFIG_IP_ROUTE_CLASSID - if (nh->nh_tclassid && - nla_put_u32(skb, RTA_FLOW, nh->nh_tclassid)) - goto nla_put_failure; + nh_tclassid = nh->nh_tclassid; #endif + if (fib_add_nexthop(skb, &nh->nh_common, nh->fib_nh_weight, + AF_INET, nh_tclassid) < 0) + goto nla_put_failure; } endfor_nexthops(fi);
mp_end: diff --git a/net/ipv6/route.c b/net/ipv6/route.c index 603340302101..0aeff2ce17b9 100644 --- a/net/ipv6/route.c +++ b/net/ipv6/route.c @@ -5700,14 +5700,15 @@ static int rt6_fill_node(struct net *net, struct sk_buff *skb, goto nla_put_failure;
if (fib_add_nexthop(skb, &rt->fib6_nh->nh_common, - rt->fib6_nh->fib_nh_weight, AF_INET6) < 0) + rt->fib6_nh->fib_nh_weight, AF_INET6, + 0) < 0) goto nla_put_failure;
list_for_each_entry_safe(sibling, next_sibling, &rt->fib6_siblings, fib6_siblings) { if (fib_add_nexthop(skb, &sibling->fib6_nh->nh_common, sibling->fib6_nh->fib_nh_weight, - AF_INET6) < 0) + AF_INET6, 0) < 0) goto nla_put_failure; }
From: Aaro Koskinen aaro.koskinen@iki.fi
[ Upstream commit 5ab8a447bcfee1ded709e7ff5dc7608ca9f66ae2 ]
After commit 05b35e7eb9a1 ("smsc95xx: add phylib support"), link changes are no longer propagated to usbnet. As a result, rx URB allocation won't happen until there is a packet sent out first (this might never happen, e.g. running just ssh server with a static IP). Fix by triggering usbnet EVENT_LINK_CHANGE.
Fixes: 05b35e7eb9a1 ("smsc95xx: add phylib support") Signed-off-by: Aaro Koskinen aaro.koskinen@iki.fi Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/usb/smsc95xx.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/net/usb/smsc95xx.c b/drivers/net/usb/smsc95xx.c index 4c8ee1cff4d4..4cb71dd1998c 100644 --- a/drivers/net/usb/smsc95xx.c +++ b/drivers/net/usb/smsc95xx.c @@ -1178,7 +1178,10 @@ static void smsc95xx_unbind(struct usbnet *dev, struct usb_interface *intf)
static void smsc95xx_handle_link_change(struct net_device *net) { + struct usbnet *dev = netdev_priv(net); + phy_print_status(net->phydev); + usbnet_defer_kevent(dev, EVENT_LINK_CHANGE); }
static int smsc95xx_start_phy(struct usbnet *dev)
From: Matthew Auld matthew.auld@intel.com
[ Upstream commit c83ff0186401169eb27ce5057d820b7a863455c3 ]
Currently we blow up in trace_dma_fence_init, when calling into get_driver_name or get_timeline_name, since both the engine and context might be NULL(or contain some garbage address) in the case of newly allocated slab objects via the request ctor. Note that we also use SLAB_TYPESAFE_BY_RCU here, which allows requests to be immediately freed, but delay freeing the underlying page by an RCU grace period. With this scheme requests can be re-allocated, at the same time as they are also being read by some lockless RCU lookup mechanism.
In the ctor case, which is only called for new slab objects(i.e allocate new page and call the ctor for each object) it's safe to reset the context/engine prior to calling into dma_fence_init, since we can be certain that no one is doing an RCU lookup which might depend on peeking at the engine/context, like in active_engine(), since the object can't yet be externally visible.
In the recycled case(which might also be externally visible) the request refcount always transitions from 0->1 after we set the context/engine etc, which should ensure it's valid to dereference the engine for example, when doing an RCU list-walk, so long as we can also increment the refcount first. If the refcount is already zero, then the request is considered complete/released. If it's non-zero, then the request might be in the process of being re-allocated, or potentially still in flight, however after successfully incrementing the refcount, it's possible to carefully inspect the request state, to determine if the request is still what we were looking for. Note that all externally visible requests returned to the cache must have zero refcount.
One possible fix then is to move dma_fence_init out from the request ctor. Originally this was how it was done, but it was moved in:
commit 855e39e65cfc33a73724f1cc644ffc5754864a20 Author: Chris Wilson chris@chris-wilson.co.uk Date: Mon Feb 3 09:41:48 2020 +0000
drm/i915: Initialise basic fence before acquiring seqno
where it looks like intel_timeline_get_seqno() relied on some of the rq->fence state, but that is no longer the case since:
commit 12ca695d2c1ed26b2dcbb528b42813bd0f216cfc Author: Maarten Lankhorst maarten.lankhorst@linux.intel.com Date: Tue Mar 23 16:49:50 2021 +0100
drm/i915: Do not share hwsp across contexts any more, v8.
intel_timeline_get_seqno() could also be cleaned up slightly by dropping the request argument.
Moving dma_fence_init back out of the ctor, should ensure we have enough of the request initialised in case of trace_dma_fence_init. Functionally this should be the same, and is effectively what we were already open coding before, except now we also assign the fence->lock and fence->ops, but since these are invariant for recycled requests(which might be externally visible), and will therefore already hold the same value, it shouldn't matter.
An alternative fix, since we don't yet have a fully initialised request when in the ctor, is just setting the context/engine as NULL, but this does require adding some extra handling in get_driver_name etc.
v2(Daniel): - Try to make the commit message less confusing
Fixes: 855e39e65cfc ("drm/i915: Initialise basic fence before acquiring seqno") Signed-off-by: Matthew Auld matthew.auld@intel.com Cc: Michael Mason michael.w.mason@intel.com Cc: Daniel Vetter daniel@ffwll.ch Reviewed-by: Daniel Vetter daniel.vetter@ffwll.ch Link: https://patchwork.freedesktop.org/patch/msgid/20210921134202.3803151-1-matth... (cherry picked from commit be988eaee1cb208c4445db46bc3ceaf75f586f0b) Signed-off-by: Jani Nikula jani.nikula@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/i915/i915_request.c | 11 ++--------- 1 file changed, 2 insertions(+), 9 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c index 37aef1308573..7db972fa7024 100644 --- a/drivers/gpu/drm/i915/i915_request.c +++ b/drivers/gpu/drm/i915/i915_request.c @@ -914,8 +914,6 @@ static void __i915_request_ctor(void *arg) i915_sw_fence_init(&rq->submit, submit_notify); i915_sw_fence_init(&rq->semaphore, semaphore_notify);
- dma_fence_init(&rq->fence, &i915_fence_ops, &rq->lock, 0, 0); - rq->capture_list = NULL;
init_llist_head(&rq->execute_cb); @@ -978,17 +976,12 @@ __i915_request_create(struct intel_context *ce, gfp_t gfp) rq->ring = ce->ring; rq->execution_mask = ce->engine->mask;
- kref_init(&rq->fence.refcount); - rq->fence.flags = 0; - rq->fence.error = 0; - INIT_LIST_HEAD(&rq->fence.cb_list); - ret = intel_timeline_get_seqno(tl, rq, &seqno); if (ret) goto err_free;
- rq->fence.context = tl->fence_context; - rq->fence.seqno = seqno; + dma_fence_init(&rq->fence, &i915_fence_ops, &rq->lock, + tl->fence_context, seqno);
RCU_INIT_POINTER(rq->timeline, tl); rq->hwsp_seqno = tl->hwsp_seqno;
From: Tejas Upadhyay tejaskumarx.surendrakumar.upadhyay@intel.com
[ Upstream commit 4b8bcaf8a6d6ab5db51e30865def5cb694eb2966 ]
In commit 4e5c8a99e1cb ("drm/i915: Drop i915_request.lock requirement for intel_rps_boost()"), we decoupled the rps worker from the pm so that we could avoid the synchronization penalty which makes the assertion liable to run too early. Which makes warning invalid hence removed.
Fixes: 4e5c8a99e1cb ("drm/i915: Drop i915_request.lock requirement for intel_rps_boost()")
Reviewed-by: Chris Wilson chris@chris-wilson.co.uk Signed-off-by: Tejas Upadhyay tejaskumarx.surendrakumar.upadhyay@intel.com Signed-off-by: Matt Roper matthew.d.roper@intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20210914090412.1393498-1-tejas... (cherry picked from commit a837a0686308d95ad9c48d32b4dfe86a17dc98c2) Signed-off-by: Jani Nikula jani.nikula@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/i915/gt/intel_rps.c | 2 -- 1 file changed, 2 deletions(-)
diff --git a/drivers/gpu/drm/i915/gt/intel_rps.c b/drivers/gpu/drm/i915/gt/intel_rps.c index 06e9a8ed4e03..db9c212a240e 100644 --- a/drivers/gpu/drm/i915/gt/intel_rps.c +++ b/drivers/gpu/drm/i915/gt/intel_rps.c @@ -861,8 +861,6 @@ void intel_rps_park(struct intel_rps *rps) { int adj;
- GEM_BUG_ON(atomic_read(&rps->num_waiters)); - if (!intel_rps_clear_active(rps)) return;
From: Andrew Lunn andrew@lunn.ch
[ Upstream commit fe23036192c95b66e60d019d2ec1814d0d561ffd ]
The datasheets suggests the 6161 uses a per port setting for jumbo frames. Testing has however shown this is not correct, it uses the old style chip wide MTU control. Change the ops in the 6161 structure to reflect this.
Fixes: 1baf0fac10fb ("net: dsa: mv88e6xxx: Use chip-wide max frame size for MTU") Reported by: 曹煜 cao88yu@gmail.com Signed-off-by: Andrew Lunn andrew@lunn.ch Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/dsa/mv88e6xxx/chip.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c index 1c122a1f2f97..f99f09c50722 100644 --- a/drivers/net/dsa/mv88e6xxx/chip.c +++ b/drivers/net/dsa/mv88e6xxx/chip.c @@ -3657,7 +3657,6 @@ static const struct mv88e6xxx_ops mv88e6161_ops = { .port_set_ucast_flood = mv88e6352_port_set_ucast_flood, .port_set_mcast_flood = mv88e6352_port_set_mcast_flood, .port_set_ether_type = mv88e6351_port_set_ether_type, - .port_set_jumbo_size = mv88e6165_port_set_jumbo_size, .port_egress_rate_limiting = mv88e6097_port_egress_rate_limiting, .port_pause_limit = mv88e6097_port_pause_limit, .port_disable_learn_limit = mv88e6xxx_port_disable_learn_limit, @@ -3682,6 +3681,7 @@ static const struct mv88e6xxx_ops mv88e6161_ops = { .avb_ops = &mv88e6165_avb_ops, .ptp_ops = &mv88e6165_ptp_ops, .phylink_validate = mv88e6185_phylink_validate, + .set_max_frame_size = mv88e6185_g1_set_max_frame_size, };
static const struct mv88e6xxx_ops mv88e6165_ops = {
From: Andrew Lunn andrew@lunn.ch
[ Upstream commit b92ce2f54c0f0ff781e914ec189c25f7bf1b1ec2 ]
The MTU passed to the DSA driver is the payload size, typically 1500. However, the switch uses the frame size when applying restrictions. Adjust the MTU with the size of the Ethernet header and the frame checksum. The VLAN header also needs to be included when the frame size it per port, but not when it is global.
Fixes: 1baf0fac10fb ("net: dsa: mv88e6xxx: Use chip-wide max frame size for MTU") Reported by: 曹煜 cao88yu@gmail.com Signed-off-by: Andrew Lunn andrew@lunn.ch Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/dsa/mv88e6xxx/chip.c | 12 ++++++------ drivers/net/dsa/mv88e6xxx/global1.c | 2 ++ drivers/net/dsa/mv88e6xxx/port.c | 2 ++ 3 files changed, 10 insertions(+), 6 deletions(-)
diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c index f99f09c50722..014950a343f4 100644 --- a/drivers/net/dsa/mv88e6xxx/chip.c +++ b/drivers/net/dsa/mv88e6xxx/chip.c @@ -2775,8 +2775,8 @@ static int mv88e6xxx_setup_port(struct mv88e6xxx_chip *chip, int port) if (err) return err;
- /* Port Control 2: don't force a good FCS, set the maximum frame size to - * 10240 bytes, disable 802.1q tags checking, don't discard tagged or + /* Port Control 2: don't force a good FCS, set the MTU size to + * 10222 bytes, disable 802.1q tags checking, don't discard tagged or * untagged frames on this port, do a destination address lookup on all * received packets as usual, disable ARP mirroring and don't send a * copy of all transmitted/received frames on this port to the CPU. @@ -2795,7 +2795,7 @@ static int mv88e6xxx_setup_port(struct mv88e6xxx_chip *chip, int port) return err;
if (chip->info->ops->port_set_jumbo_size) { - err = chip->info->ops->port_set_jumbo_size(chip, port, 10240); + err = chip->info->ops->port_set_jumbo_size(chip, port, 10218); if (err) return err; } @@ -2885,10 +2885,10 @@ static int mv88e6xxx_get_max_mtu(struct dsa_switch *ds, int port) struct mv88e6xxx_chip *chip = ds->priv;
if (chip->info->ops->port_set_jumbo_size) - return 10240; + return 10240 - VLAN_ETH_HLEN - ETH_FCS_LEN; else if (chip->info->ops->set_max_frame_size) - return 1632; - return 1522; + return 1632 - VLAN_ETH_HLEN - ETH_FCS_LEN; + return 1522 - VLAN_ETH_HLEN - ETH_FCS_LEN; }
static int mv88e6xxx_change_mtu(struct dsa_switch *ds, int port, int new_mtu) diff --git a/drivers/net/dsa/mv88e6xxx/global1.c b/drivers/net/dsa/mv88e6xxx/global1.c index 815b0f681d69..5848112036b0 100644 --- a/drivers/net/dsa/mv88e6xxx/global1.c +++ b/drivers/net/dsa/mv88e6xxx/global1.c @@ -232,6 +232,8 @@ int mv88e6185_g1_set_max_frame_size(struct mv88e6xxx_chip *chip, int mtu) u16 val; int err;
+ mtu += ETH_HLEN + ETH_FCS_LEN; + err = mv88e6xxx_g1_read(chip, MV88E6XXX_G1_CTL1, &val); if (err) return err; diff --git a/drivers/net/dsa/mv88e6xxx/port.c b/drivers/net/dsa/mv88e6xxx/port.c index f77e2ee64a60..451028c57af8 100644 --- a/drivers/net/dsa/mv88e6xxx/port.c +++ b/drivers/net/dsa/mv88e6xxx/port.c @@ -1277,6 +1277,8 @@ int mv88e6165_port_set_jumbo_size(struct mv88e6xxx_chip *chip, int port, u16 reg; int err;
+ size += VLAN_ETH_HLEN + ETH_FCS_LEN; + err = mv88e6xxx_port_read(chip, port, MV88E6XXX_PORT_CTL2, ®); if (err) return err;
From: Andrew Lunn andrew@lunn.ch
[ Upstream commit b9c587fed61cf88bd45822c3159644445f6d5aa6 ]
Same members of the Marvell Ethernet switches impose MTU restrictions on ports used for connecting to the CPU or another switch for DSA. If the MTU is set too low, tagged frames will be discarded. Ensure the worst case tagger overhead is included in setting the MTU for DSA and CPU ports.
Fixes: 1baf0fac10fb ("net: dsa: mv88e6xxx: Use chip-wide max frame size for MTU") Reported by: 曹煜 cao88yu@gmail.com Signed-off-by: Andrew Lunn andrew@lunn.ch Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/dsa/mv88e6xxx/chip.c | 9 ++++++--- drivers/net/dsa/mv88e6xxx/chip.h | 1 + 2 files changed, 7 insertions(+), 3 deletions(-)
diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c index 014950a343f4..66b4f4a9832a 100644 --- a/drivers/net/dsa/mv88e6xxx/chip.c +++ b/drivers/net/dsa/mv88e6xxx/chip.c @@ -2885,10 +2885,10 @@ static int mv88e6xxx_get_max_mtu(struct dsa_switch *ds, int port) struct mv88e6xxx_chip *chip = ds->priv;
if (chip->info->ops->port_set_jumbo_size) - return 10240 - VLAN_ETH_HLEN - ETH_FCS_LEN; + return 10240 - VLAN_ETH_HLEN - EDSA_HLEN - ETH_FCS_LEN; else if (chip->info->ops->set_max_frame_size) - return 1632 - VLAN_ETH_HLEN - ETH_FCS_LEN; - return 1522 - VLAN_ETH_HLEN - ETH_FCS_LEN; + return 1632 - VLAN_ETH_HLEN - EDSA_HLEN - ETH_FCS_LEN; + return 1522 - VLAN_ETH_HLEN - EDSA_HLEN - ETH_FCS_LEN; }
static int mv88e6xxx_change_mtu(struct dsa_switch *ds, int port, int new_mtu) @@ -2896,6 +2896,9 @@ static int mv88e6xxx_change_mtu(struct dsa_switch *ds, int port, int new_mtu) struct mv88e6xxx_chip *chip = ds->priv; int ret = 0;
+ if (dsa_is_dsa_port(ds, port) || dsa_is_cpu_port(ds, port)) + new_mtu += EDSA_HLEN; + mv88e6xxx_reg_lock(chip); if (chip->info->ops->port_set_jumbo_size) ret = chip->info->ops->port_set_jumbo_size(chip, port, new_mtu); diff --git a/drivers/net/dsa/mv88e6xxx/chip.h b/drivers/net/dsa/mv88e6xxx/chip.h index 675b1f3e43b7..59f316cc8583 100644 --- a/drivers/net/dsa/mv88e6xxx/chip.h +++ b/drivers/net/dsa/mv88e6xxx/chip.h @@ -18,6 +18,7 @@ #include <linux/timecounter.h> #include <net/dsa.h>
+#define EDSA_HLEN 8 #define MV88E6XXX_N_FID 4096
/* PVT limits for 4-bit port and 5-bit switch */
From: Jacob Keller jacob.e.keller@intel.com
[ Upstream commit 4329c8dc110b25d5f04ed20c6821bb60deff279f ]
commit abf9b902059f ("e100: cleanup unneeded math") tried to simplify e100_get_regs_len and remove a double 'divide and then multiply' calculation that the e100_reg_regs_len function did.
This change broke the size calculation entirely as it failed to account for the fact that the numbered registers are actually 4 bytes wide and not 1 byte. This resulted in a significant under allocation of the register buffer used by e100_get_regs.
Fix this by properly multiplying the register count by u32 first before adding the size of the dump buffer.
Fixes: abf9b902059f ("e100: cleanup unneeded math") Reported-by: Felicitas Hetzelt felicitashetzelt@gmail.com Signed-off-by: Jacob Keller jacob.e.keller@intel.com Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/e100.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/intel/e100.c b/drivers/net/ethernet/intel/e100.c index 1b0958bd24f6..71b9f0563b32 100644 --- a/drivers/net/ethernet/intel/e100.c +++ b/drivers/net/ethernet/intel/e100.c @@ -2441,7 +2441,11 @@ static void e100_get_drvinfo(struct net_device *netdev, static int e100_get_regs_len(struct net_device *netdev) { struct nic *nic = netdev_priv(netdev); - return 1 + E100_PHY_REGS + sizeof(nic->mem->dump_buf); + + /* We know the number of registers, and the size of the dump buffer. + * Calculate the total size in bytes. + */ + return (1 + E100_PHY_REGS) * sizeof(u32) + sizeof(nic->mem->dump_buf); }
static void e100_get_regs(struct net_device *netdev,
From: Jacob Keller jacob.e.keller@intel.com
[ Upstream commit 51032e6f17ce990d06123ad7307f258c50d25aa7 ]
The e100_get_regs function is used to implement a simple register dump for the e100 device. The data is broken into a couple of MAC control registers, and then a series of PHY registers, followed by a memory dump buffer.
The total length of the register dump is defined as (1 + E100_PHY_REGS) * sizeof(u32) + sizeof(nic->mem->dump_buf).
The logic for filling in the PHY registers uses a convoluted inverted count for loop which counts from E100_PHY_REGS (0x1C) down to 0, and assigns the slots 1 + E100_PHY_REGS - i. The first loop iteration will fill in [1] and the final loop iteration will fill in [1 + 0x1C]. This is actually one more than the supposed number of PHY registers.
The memory dump buffer is then filled into the space at [2 + E100_PHY_REGS] which will cause that memcpy to assign 4 bytes past the total size.
The end result is that we overrun the total buffer size allocated by the kernel, which could lead to a panic or other issues due to memory corruption.
It is difficult to determine the actual total number of registers here. The only 8255x datasheet I could find indicates there are 28 total MDI registers. However, we're reading 29 here, and reading them in reverse!
In addition, the ethtool e100 register dump interface appears to read the first PHY register to determine if the device is in MDI or MDIx mode. This doesn't appear to be documented anywhere within the 8255x datasheet. I can only assume it must be in register 28 (the extra register we're reading here).
Lets not change any of the intended meaning of what we copy here. Just extend the space by 4 bytes to account for the extra register and continue copying the data out in the same order.
Change the E100_PHY_REGS value to be the correct total (29) so that the total register dump size is calculated properly. Fix the offset for where we copy the dump buffer so that it doesn't overrun the total size.
Re-write the for loop to use counting up instead of the convoluted down-counting. Correct the mdio_read offset to use the 0-based register offsets, but maintain the bizarre reverse ordering so that we have the ABI expected by applications like ethtool. This requires and additional subtraction of 1. It seems a bit odd but it makes the flow of assignment into the register buffer easier to follow.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Reported-by: Felicitas Hetzelt felicitashetzelt@gmail.com Signed-off-by: Jacob Keller jacob.e.keller@intel.com Tested-by: Jacob Keller jacob.e.keller@intel.com Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/e100.c | 16 ++++++++++------ 1 file changed, 10 insertions(+), 6 deletions(-)
diff --git a/drivers/net/ethernet/intel/e100.c b/drivers/net/ethernet/intel/e100.c index 71b9f0563b32..1fa68ebe9432 100644 --- a/drivers/net/ethernet/intel/e100.c +++ b/drivers/net/ethernet/intel/e100.c @@ -2437,7 +2437,7 @@ static void e100_get_drvinfo(struct net_device *netdev, sizeof(info->bus_info)); }
-#define E100_PHY_REGS 0x1C +#define E100_PHY_REGS 0x1D static int e100_get_regs_len(struct net_device *netdev) { struct nic *nic = netdev_priv(netdev); @@ -2459,14 +2459,18 @@ static void e100_get_regs(struct net_device *netdev, buff[0] = ioread8(&nic->csr->scb.cmd_hi) << 24 | ioread8(&nic->csr->scb.cmd_lo) << 16 | ioread16(&nic->csr->scb.status); - for (i = E100_PHY_REGS; i >= 0; i--) - buff[1 + E100_PHY_REGS - i] = - mdio_read(netdev, nic->mii.phy_id, i); + for (i = 0; i < E100_PHY_REGS; i++) + /* Note that we read the registers in reverse order. This + * ordering is the ABI apparently used by ethtool and other + * applications. + */ + buff[1 + i] = mdio_read(netdev, nic->mii.phy_id, + E100_PHY_REGS - 1 - i); memset(nic->mem->dump_buf, 0, sizeof(nic->mem->dump_buf)); e100_exec_cb(nic, NULL, e100_dump); msleep(10); - memcpy(&buff[2 + E100_PHY_REGS], nic->mem->dump_buf, - sizeof(nic->mem->dump_buf)); + memcpy(&buff[1 + E100_PHY_REGS], nic->mem->dump_buf, + sizeof(nic->mem->dump_buf)); }
static void e100_get_wol(struct net_device *netdev, struct ethtool_wolinfo *wol)
From: Guo Zhi qtxuning1999@sjtu.edu.cn
[ Upstream commit 7d5cfafe8b4006a75b55c2f1fdfdb363f9a5cc98 ]
Pointers should be printed with %p or %px rather than cast to 'unsigned long long' and printed with %llx. Change %llx to %p to print the secured pointer.
Fixes: 042a00f93aad ("IB/{ipoib,hfi1}: Add a timeout handler for rdma_netdev") Link: https://lore.kernel.org/r/20210922134857.619602-1-qtxuning1999@sjtu.edu.cn Signed-off-by: Guo Zhi qtxuning1999@sjtu.edu.cn Acked-by: Mike Marciniszyn mike.marciniszyn@cornelisnetworks.com Signed-off-by: Jason Gunthorpe jgg@nvidia.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/infiniband/hw/hfi1/ipoib_tx.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/infiniband/hw/hfi1/ipoib_tx.c b/drivers/infiniband/hw/hfi1/ipoib_tx.c index 993f9838b6c8..e1fdeadda437 100644 --- a/drivers/infiniband/hw/hfi1/ipoib_tx.c +++ b/drivers/infiniband/hw/hfi1/ipoib_tx.c @@ -873,14 +873,14 @@ void hfi1_ipoib_tx_timeout(struct net_device *dev, unsigned int q) struct hfi1_ipoib_txq *txq = &priv->txqs[q]; u64 completed = atomic64_read(&txq->complete_txreqs);
- dd_dev_info(priv->dd, "timeout txq %llx q %u stopped %u stops %d no_desc %d ring_full %d\n", - (unsigned long long)txq, q, + dd_dev_info(priv->dd, "timeout txq %p q %u stopped %u stops %d no_desc %d ring_full %d\n", + txq, q, __netif_subqueue_stopped(dev, txq->q_idx), atomic_read(&txq->stops), atomic_read(&txq->no_desc), atomic_read(&txq->ring_full)); - dd_dev_info(priv->dd, "sde %llx engine %u\n", - (unsigned long long)txq->sde, + dd_dev_info(priv->dd, "sde %p engine %u\n", + txq->sde, txq->sde ? txq->sde->this_idx : 0); dd_dev_info(priv->dd, "flow %x\n", txq->flow.as_int); dd_dev_info(priv->dd, "sent %llu completed %llu used %llu\n",
From: Wenpeng Liang liangwenpeng@huawei.com
[ Upstream commit cc26aee100588a3f293921342a307b6309ace193 ]
The size of CQE is different for different versions of hardware, so the driver needs to specify the size of CQE explicitly.
Fixes: 09a5f210f67e ("RDMA/hns: Add support for CQE in size of 64 Bytes") Link: https://lore.kernel.org/r/20210927125557.15031-2-liangwenpeng@huawei.com Signed-off-by: Wenpeng Liang liangwenpeng@huawei.com Signed-off-by: Jason Gunthorpe jgg@nvidia.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/infiniband/hw/hns/hns_roce_hw_v2.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c index 6cb4a4e10837..0ccb0c453f6a 100644 --- a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c +++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c @@ -3306,7 +3306,7 @@ static void __hns_roce_v2_cq_clean(struct hns_roce_cq *hr_cq, u32 qpn, dest = get_cqe_v2(hr_cq, (prod_index + nfreed) & hr_cq->ib_cq.cqe); owner_bit = hr_reg_read(dest, CQE_OWNER); - memcpy(dest, cqe, sizeof(*cqe)); + memcpy(dest, cqe, hr_cq->cqe_size); hr_reg_write(dest, CQE_OWNER, owner_bit); } }
From: Wenpeng Liang liangwenpeng@huawei.com
[ Upstream commit e671f0ecfece14940a9bb81981098910ea278cf7 ]
If the CQE size of the user space is not the size supported by the hardware, the creation of CQ should be stopped.
Fixes: 09a5f210f67e ("RDMA/hns: Add support for CQE in size of 64 Bytes") Link: https://lore.kernel.org/r/20210927125557.15031-3-liangwenpeng@huawei.com Signed-off-by: Wenpeng Liang liangwenpeng@huawei.com Signed-off-by: Jason Gunthorpe jgg@nvidia.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/infiniband/hw/hns/hns_roce_cq.c | 31 ++++++++++++++++++------- 1 file changed, 22 insertions(+), 9 deletions(-)
diff --git a/drivers/infiniband/hw/hns/hns_roce_cq.c b/drivers/infiniband/hw/hns/hns_roce_cq.c index 1e9c3c5bee68..d763f097599f 100644 --- a/drivers/infiniband/hw/hns/hns_roce_cq.c +++ b/drivers/infiniband/hw/hns/hns_roce_cq.c @@ -326,19 +326,30 @@ static void set_cq_param(struct hns_roce_cq *hr_cq, u32 cq_entries, int vector, INIT_LIST_HEAD(&hr_cq->rq_list); }
-static void set_cqe_size(struct hns_roce_cq *hr_cq, struct ib_udata *udata, - struct hns_roce_ib_create_cq *ucmd) +static int set_cqe_size(struct hns_roce_cq *hr_cq, struct ib_udata *udata, + struct hns_roce_ib_create_cq *ucmd) { struct hns_roce_dev *hr_dev = to_hr_dev(hr_cq->ib_cq.device);
- if (udata) { - if (udata->inlen >= offsetofend(typeof(*ucmd), cqe_size)) - hr_cq->cqe_size = ucmd->cqe_size; - else - hr_cq->cqe_size = HNS_ROCE_V2_CQE_SIZE; - } else { + if (!udata) { hr_cq->cqe_size = hr_dev->caps.cqe_sz; + return 0; + } + + if (udata->inlen >= offsetofend(typeof(*ucmd), cqe_size)) { + if (ucmd->cqe_size != HNS_ROCE_V2_CQE_SIZE && + ucmd->cqe_size != HNS_ROCE_V3_CQE_SIZE) { + ibdev_err(&hr_dev->ib_dev, + "invalid cqe size %u.\n", ucmd->cqe_size); + return -EINVAL; + } + + hr_cq->cqe_size = ucmd->cqe_size; + } else { + hr_cq->cqe_size = HNS_ROCE_V2_CQE_SIZE; } + + return 0; }
int hns_roce_create_cq(struct ib_cq *ib_cq, const struct ib_cq_init_attr *attr, @@ -366,7 +377,9 @@ int hns_roce_create_cq(struct ib_cq *ib_cq, const struct ib_cq_init_attr *attr,
set_cq_param(hr_cq, attr->cqe, attr->comp_vector, &ucmd);
- set_cqe_size(hr_cq, udata, &ucmd); + ret = set_cqe_size(hr_cq, udata, &ucmd); + if (ret) + return ret;
ret = alloc_cq_buf(hr_dev, hr_cq, udata, ucmd.buf_addr); if (ret) {
From: Lorenz Bauer lmb@cloudflare.com
[ Upstream commit 8a98ae12fbefdb583a7696de719a1d57e5e940a2 ]
When introducing CAP_BPF, bpf_jit_charge_modmem() was not changed to treat programs with CAP_BPF as privileged for the purpose of JIT memory allocation. This means that a program without CAP_BPF can block a program with CAP_BPF from loading a program.
Fix this by checking bpf_capable() in bpf_jit_charge_modmem().
Fixes: 2c78ee898d8f ("bpf: Implement CAP_BPF") Signed-off-by: Lorenz Bauer lmb@cloudflare.com Signed-off-by: Daniel Borkmann daniel@iogearbox.net Link: https://lore.kernel.org/bpf/20210922111153.19843-1-lmb@cloudflare.com Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/bpf/core.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c index 0a28a8095d3e..c019611fbc8f 100644 --- a/kernel/bpf/core.c +++ b/kernel/bpf/core.c @@ -827,7 +827,7 @@ int bpf_jit_charge_modmem(u32 pages) { if (atomic_long_add_return(pages, &bpf_jit_current) > (bpf_jit_limit >> PAGE_SHIFT)) { - if (!capable(CAP_SYS_ADMIN)) { + if (!bpf_capable()) { atomic_long_sub(pages, &bpf_jit_current); return -EPERM; }
From: Kumar Kartikeya Dwivedi memxor@gmail.com
[ Upstream commit bcfd367c2839f2126c048fe59700ec1b538e2b06 ]
When a BPF object is compiled without BTF info (without -g), trying to link such objects using bpftool causes a SIGSEGV due to btf__get_nr_types accessing obj->btf which is NULL. Fix this by checking for the NULL pointer, and return error.
Reproducer: $ cat a.bpf.c extern int foo(void); int bar(void) { return foo(); } $ cat b.bpf.c int foo(void) { return 0; } $ clang -O2 -target bpf -c a.bpf.c $ clang -O2 -target bpf -c b.bpf.c $ bpftool gen obj out a.bpf.o b.bpf.o Segmentation fault (core dumped)
After fix: $ bpftool gen obj out a.bpf.o b.bpf.o libbpf: failed to find BTF info for object 'a.bpf.o' Error: failed to link 'a.bpf.o': Unknown error -22 (-22)
Fixes: a46349227cd8 (libbpf: Add linker extern resolution support for functions and global variables) Signed-off-by: Kumar Kartikeya Dwivedi memxor@gmail.com Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Daniel Borkmann daniel@iogearbox.net Link: https://lore.kernel.org/bpf/20210924023725.70228-1-memxor@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/lib/bpf/linker.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/tools/lib/bpf/linker.c b/tools/lib/bpf/linker.c index 10911a8cad0f..2df880cefdae 100644 --- a/tools/lib/bpf/linker.c +++ b/tools/lib/bpf/linker.c @@ -1649,11 +1649,17 @@ static bool btf_is_non_static(const struct btf_type *t) static int find_glob_sym_btf(struct src_obj *obj, Elf64_Sym *sym, const char *sym_name, int *out_btf_sec_id, int *out_btf_id) { - int i, j, n = btf__get_nr_types(obj->btf), m, btf_id = 0; + int i, j, n, m, btf_id = 0; const struct btf_type *t; const struct btf_var_secinfo *vi; const char *name;
+ if (!obj->btf) { + pr_warn("failed to find BTF info for object '%s'\n", obj->filename); + return -EINVAL; + } + + n = btf__get_nr_types(obj->btf); for (i = 1; i <= n; i++) { t = btf__type_by_id(obj->btf, i);
From: Jiri Benc jbenc@redhat.com
[ Upstream commit d888eaac4fb1df30320bb1305a8f78efe86524c6 ]
When building bpf selftest with make -j, I'm randomly getting build failures such as this one:
In file included from progs/bpf_flow.c:19: [...]/tools/testing/selftests/bpf/tools/include/bpf/bpf_helpers.h:11:10: fatal error: 'bpf_helper_defs.h' file not found #include "bpf_helper_defs.h" ^~~~~~~~~~~~~~~~~~~
The file that fails the build varies between runs but it's always in the progs/ subdir.
The reason is a missing make dependency on libbpf for the .o files in progs/. There was a dependency before commit 3ac2e20fba07e but that commit removed it to prevent unneeded rebuilds. However, that only works if libbpf has been built already; the 'wildcard' prerequisite does not trigger when there's no bpf_helper_defs.h generated yet.
Keep the libbpf as an order-only prerequisite to satisfy both goals. It is always built before the progs/ objects but it does not trigger unnecessary rebuilds by itself.
Fixes: 3ac2e20fba07e ("selftests/bpf: BPF object files should depend only on libbpf headers") Signed-off-by: Jiri Benc jbenc@redhat.com Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Daniel Borkmann daniel@iogearbox.net Link: https://lore.kernel.org/bpf/ee84ab66436fba05a197f952af23c98d90eb6243.1632758... Signed-off-by: Sasha Levin sashal@kernel.org --- tools/testing/selftests/bpf/Makefile | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile index f405b20c1e6c..93f1f124ef89 100644 --- a/tools/testing/selftests/bpf/Makefile +++ b/tools/testing/selftests/bpf/Makefile @@ -374,7 +374,8 @@ $(TRUNNER_BPF_OBJS): $(TRUNNER_OUTPUT)/%.o: \ $(TRUNNER_BPF_PROGS_DIR)/%.c \ $(TRUNNER_BPF_PROGS_DIR)/*.h \ $$(INCLUDE_DIR)/vmlinux.h \ - $(wildcard $(BPFDIR)/bpf_*.h) | $(TRUNNER_OUTPUT) + $(wildcard $(BPFDIR)/bpf_*.h) \ + | $(TRUNNER_OUTPUT) $$(BPFOBJ) $$(call $(TRUNNER_BPF_BUILD_RULE),$$<,$$@, \ $(TRUNNER_BPF_CFLAGS))
From: Jiri Benc jbenc@redhat.com
[ Upstream commit 79e2c306667542b8ee2d9a9d947eadc7039f0a3c ]
It's not enough to set net.ipv4.conf.all.rp_filter=0, that does not override a greater rp_filter value on the individual interfaces. We also need to set net.ipv4.conf.default.rp_filter=0 before creating the interfaces. That way, they'll also get their own rp_filter value of zero.
Fixes: 0fde56e4385b0 ("selftests: bpf: add test_lwt_ip_encap selftest") Signed-off-by: Jiri Benc jbenc@redhat.com Signed-off-by: Daniel Borkmann daniel@iogearbox.net Link: https://lore.kernel.org/bpf/b1cdd9d469f09ea6e01e9c89a6071c79b7380f89.1632386... Signed-off-by: Sasha Levin sashal@kernel.org --- tools/testing/selftests/bpf/test_lwt_ip_encap.sh | 13 ++++++++----- 1 file changed, 8 insertions(+), 5 deletions(-)
diff --git a/tools/testing/selftests/bpf/test_lwt_ip_encap.sh b/tools/testing/selftests/bpf/test_lwt_ip_encap.sh index 59ea56945e6c..b497bb85b667 100755 --- a/tools/testing/selftests/bpf/test_lwt_ip_encap.sh +++ b/tools/testing/selftests/bpf/test_lwt_ip_encap.sh @@ -112,6 +112,14 @@ setup() ip netns add "${NS2}" ip netns add "${NS3}"
+ # rp_filter gets confused by what these tests are doing, so disable it + ip netns exec ${NS1} sysctl -wq net.ipv4.conf.all.rp_filter=0 + ip netns exec ${NS2} sysctl -wq net.ipv4.conf.all.rp_filter=0 + ip netns exec ${NS3} sysctl -wq net.ipv4.conf.all.rp_filter=0 + ip netns exec ${NS1} sysctl -wq net.ipv4.conf.default.rp_filter=0 + ip netns exec ${NS2} sysctl -wq net.ipv4.conf.default.rp_filter=0 + ip netns exec ${NS3} sysctl -wq net.ipv4.conf.default.rp_filter=0 + ip link add veth1 type veth peer name veth2 ip link add veth3 type veth peer name veth4 ip link add veth5 type veth peer name veth6 @@ -236,11 +244,6 @@ setup() ip -netns ${NS1} -6 route add ${IPv6_GRE}/128 dev veth5 via ${IPv6_6} ${VRF} ip -netns ${NS2} -6 route add ${IPv6_GRE}/128 dev veth7 via ${IPv6_8} ${VRF}
- # rp_filter gets confused by what these tests are doing, so disable it - ip netns exec ${NS1} sysctl -wq net.ipv4.conf.all.rp_filter=0 - ip netns exec ${NS2} sysctl -wq net.ipv4.conf.all.rp_filter=0 - ip netns exec ${NS3} sysctl -wq net.ipv4.conf.all.rp_filter=0 - TMPFILE=$(mktemp /tmp/test_lwt_ip_encap.XXXXXX)
sleep 1 # reduce flakiness
From: Johan Almbladh johan.almbladh@anyfinetworks.com
[ Upstream commit ced185824c89b60e65b5a2606954c098320cdfb8 ]
Fix the case where the dst register maps to %rax as otherwise this produces an incorrect mapping with the implementation in 981f94c3e921 ("bpf: Add bitwise atomic instructions") as %rax is clobbered given it's part of the cmpxchg as operand.
The issue is similar to b29dd96b905f ("bpf, x86: Fix BPF_FETCH atomic and/or/ xor with r0 as src") just that the case of dst register was missed.
Before, dst=r0 (%rax) src=r2 (%rsi):
[...] c5: mov %rax,%r10 c8: mov 0x0(%rax),%rax <---+ (broken) cc: mov %rax,%r11 | cf: and %rsi,%r11 | d2: lock cmpxchg %r11,0x0(%rax) <---+ d8: jne 0x00000000000000c8 | da: mov %rax,%rsi | dd: mov %r10,%rax | [...] | | After, dst=r0 (%rax) src=r2 (%rsi): | | [...] | da: mov %rax,%r10 | dd: mov 0x0(%r10),%rax <---+ (fixed) e1: mov %rax,%r11 | e4: and %rsi,%r11 | e7: lock cmpxchg %r11,0x0(%r10) <---+ ed: jne 0x00000000000000dd ef: mov %rax,%rsi f2: mov %r10,%rax [...]
The remaining combinations were fine as-is though:
After, dst=r9 (%r15) src=r0 (%rax):
[...] dc: mov %rax,%r10 df: mov 0x0(%r15),%rax e3: mov %rax,%r11 e6: and %r10,%r11 e9: lock cmpxchg %r11,0x0(%r15) ef: jne 0x00000000000000df _ f1: mov %rax,%r10 | (unneeded, but f4: mov %r10,%rax _| not a problem) [...]
After, dst=r9 (%r15) src=r4 (%rcx):
[...] de: mov %rax,%r10 e1: mov 0x0(%r15),%rax e5: mov %rax,%r11 e8: and %rcx,%r11 eb: lock cmpxchg %r11,0x0(%r15) f1: jne 0x00000000000000e1 f3: mov %rax,%rcx f6: mov %r10,%rax [...]
The case of dst == src register is rejected by the verifier and therefore not supported, but x86 JIT also handles this case just fine.
After, dst=r0 (%rax) src=r0 (%rax):
[...] eb: mov %rax,%r10 ee: mov 0x0(%r10),%rax f2: mov %rax,%r11 f5: and %r10,%r11 f8: lock cmpxchg %r11,0x0(%r10) fe: jne 0x00000000000000ee 100: mov %rax,%r10 103: mov %r10,%rax [...]
Fixes: 981f94c3e921 ("bpf: Add bitwise atomic instructions") Reported-by: Johan Almbladh johan.almbladh@anyfinetworks.com Signed-off-by: Johan Almbladh johan.almbladh@anyfinetworks.com Co-developed-by: Daniel Borkmann daniel@iogearbox.net Signed-off-by: Daniel Borkmann daniel@iogearbox.net Reviewed-by: Brendan Jackman jackmanb@google.com Acked-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/net/bpf_jit_comp.c | 13 ++++++++----- 1 file changed, 8 insertions(+), 5 deletions(-)
diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c index 47780844598a..ffcc4d29ad50 100644 --- a/arch/x86/net/bpf_jit_comp.c +++ b/arch/x86/net/bpf_jit_comp.c @@ -1341,9 +1341,10 @@ st: if (is_imm8(insn->off)) if (insn->imm == (BPF_AND | BPF_FETCH) || insn->imm == (BPF_OR | BPF_FETCH) || insn->imm == (BPF_XOR | BPF_FETCH)) { - u8 *branch_target; bool is64 = BPF_SIZE(insn->code) == BPF_DW; u32 real_src_reg = src_reg; + u32 real_dst_reg = dst_reg; + u8 *branch_target;
/* * Can't be implemented with a single x86 insn. @@ -1354,11 +1355,13 @@ st: if (is_imm8(insn->off)) emit_mov_reg(&prog, true, BPF_REG_AX, BPF_REG_0); if (src_reg == BPF_REG_0) real_src_reg = BPF_REG_AX; + if (dst_reg == BPF_REG_0) + real_dst_reg = BPF_REG_AX;
branch_target = prog; /* Load old value */ emit_ldx(&prog, BPF_SIZE(insn->code), - BPF_REG_0, dst_reg, insn->off); + BPF_REG_0, real_dst_reg, insn->off); /* * Perform the (commutative) operation locally, * put the result in the AUX_REG. @@ -1369,7 +1372,8 @@ st: if (is_imm8(insn->off)) add_2reg(0xC0, AUX_REG, real_src_reg)); /* Attempt to swap in new value */ err = emit_atomic(&prog, BPF_CMPXCHG, - dst_reg, AUX_REG, insn->off, + real_dst_reg, AUX_REG, + insn->off, BPF_SIZE(insn->code)); if (WARN_ON(err)) return err; @@ -1383,11 +1387,10 @@ st: if (is_imm8(insn->off)) /* Restore R0 after clobbering RAX */ emit_mov_reg(&prog, true, BPF_REG_0, BPF_REG_AX); break; - }
err = emit_atomic(&prog, insn->imm, dst_reg, src_reg, - insn->off, BPF_SIZE(insn->code)); + insn->off, BPF_SIZE(insn->code)); if (err) return err; break;
From: Arnd Bergmann arnd@arndb.de
[ Upstream commit 51bb08dd04a05035a64504faa47651d36b0f3125 ]
An object file cannot be built for both loadable module and built-in use at the same time:
arm-linux-gnueabi-ld: drivers/net/ethernet/micrel/ks8851_common.o: in function `ks8851_probe_common': ks8851_common.c:(.text+0xf80): undefined reference to `__this_module'
Change the ks8851_common code to be a standalone module instead, and use Makefile logic to ensure this is built-in if at least one of its two users is.
Fixes: 797047f875b5 ("net: ks8851: Implement Parallel bus operations") Link: https://lore.kernel.org/netdev/20210125121937.3900988-1-arnd@kernel.org/ Reviewed-by: Andrew Lunn andrew@lunn.ch Acked-by: Marek Vasut marex@denx.de Signed-off-by: Arnd Bergmann arnd@arndb.de Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/micrel/Makefile | 6 ++---- drivers/net/ethernet/micrel/ks8851_common.c | 8 ++++++++ 2 files changed, 10 insertions(+), 4 deletions(-)
diff --git a/drivers/net/ethernet/micrel/Makefile b/drivers/net/ethernet/micrel/Makefile index 5cc00d22c708..6ecc4eb30e74 100644 --- a/drivers/net/ethernet/micrel/Makefile +++ b/drivers/net/ethernet/micrel/Makefile @@ -4,8 +4,6 @@ #
obj-$(CONFIG_KS8842) += ks8842.o -obj-$(CONFIG_KS8851) += ks8851.o -ks8851-objs = ks8851_common.o ks8851_spi.o -obj-$(CONFIG_KS8851_MLL) += ks8851_mll.o -ks8851_mll-objs = ks8851_common.o ks8851_par.o +obj-$(CONFIG_KS8851) += ks8851_common.o ks8851_spi.o +obj-$(CONFIG_KS8851_MLL) += ks8851_common.o ks8851_par.o obj-$(CONFIG_KSZ884X_PCI) += ksz884x.o diff --git a/drivers/net/ethernet/micrel/ks8851_common.c b/drivers/net/ethernet/micrel/ks8851_common.c index 831518466de2..0f9c5457b93e 100644 --- a/drivers/net/ethernet/micrel/ks8851_common.c +++ b/drivers/net/ethernet/micrel/ks8851_common.c @@ -1057,6 +1057,7 @@ int ks8851_suspend(struct device *dev)
return 0; } +EXPORT_SYMBOL_GPL(ks8851_suspend);
int ks8851_resume(struct device *dev) { @@ -1070,6 +1071,7 @@ int ks8851_resume(struct device *dev)
return 0; } +EXPORT_SYMBOL_GPL(ks8851_resume); #endif
static int ks8851_register_mdiobus(struct ks8851_net *ks, struct device *dev) @@ -1243,6 +1245,7 @@ int ks8851_probe_common(struct net_device *netdev, struct device *dev, err_reg_io: return ret; } +EXPORT_SYMBOL_GPL(ks8851_probe_common);
int ks8851_remove_common(struct device *dev) { @@ -1261,3 +1264,8 @@ int ks8851_remove_common(struct device *dev)
return 0; } +EXPORT_SYMBOL_GPL(ks8851_remove_common); + +MODULE_DESCRIPTION("KS8851 Network driver"); +MODULE_AUTHOR("Ben Dooks ben@simtec.co.uk"); +MODULE_LICENSE("GPL");
From: Shannon Nelson snelson@pensando.io
[ Upstream commit c23bb54f28d61a48008428e8cd320c947993919b ]
Don't print stats for which we haven't reserved space as it can cause nasty memory bashing and related bad behaviors.
Fixes: aa620993b1e5 ("ionic: pull per-q stats work out of queue loops") Signed-off-by: Shannon Nelson snelson@pensando.io Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/pensando/ionic/ionic_stats.c | 9 --------- 1 file changed, 9 deletions(-)
diff --git a/drivers/net/ethernet/pensando/ionic/ionic_stats.c b/drivers/net/ethernet/pensando/ionic/ionic_stats.c index 58a854666c62..c14de5fcedea 100644 --- a/drivers/net/ethernet/pensando/ionic/ionic_stats.c +++ b/drivers/net/ethernet/pensando/ionic/ionic_stats.c @@ -380,15 +380,6 @@ static void ionic_sw_stats_get_txq_values(struct ionic_lif *lif, u64 **buf, &ionic_dbg_intr_stats_desc[i]); (*buf)++; } - for (i = 0; i < IONIC_NUM_DBG_NAPI_STATS; i++) { - **buf = IONIC_READ_STAT64(&txqcq->napi_stats, - &ionic_dbg_napi_stats_desc[i]); - (*buf)++; - } - for (i = 0; i < IONIC_MAX_NUM_NAPI_CNTR; i++) { - **buf = txqcq->napi_stats.work_done_cntr[i]; - (*buf)++; - } for (i = 0; i < IONIC_MAX_NUM_SG_CNTR; i++) { **buf = txstats->sg_cntr[i]; (*buf)++;
From: Jens Axboe axboe@kernel.dk
[ Upstream commit ebc69e897e17373fbe1daaff1debaa77583a5284 ]
This reverts commit 2d52c58b9c9bdae0ca3df6a1eab5745ab3f7d80b.
We have had several folks complain that this causes hangs for them, which is especially problematic as the commit has also hit stable already.
As no resolution seems to be forthcoming right now, revert the patch.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=214503 Fixes: 2d52c58b9c9b ("block, bfq: honor already-setup queue merges") Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- block/bfq-iosched.c | 16 +++------------- 1 file changed, 3 insertions(+), 13 deletions(-)
diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c index 3a1038b6eeb3..9360c65169ff 100644 --- a/block/bfq-iosched.c +++ b/block/bfq-iosched.c @@ -2662,15 +2662,6 @@ bfq_setup_merge(struct bfq_queue *bfqq, struct bfq_queue *new_bfqq) * are likely to increase the throughput. */ bfqq->new_bfqq = new_bfqq; - /* - * The above assignment schedules the following redirections: - * each time some I/O for bfqq arrives, the process that - * generated that I/O is disassociated from bfqq and - * associated with new_bfqq. Here we increases new_bfqq->ref - * in advance, adding the number of processes that are - * expected to be associated with new_bfqq as they happen to - * issue I/O. - */ new_bfqq->ref += process_refs; return new_bfqq; } @@ -2733,10 +2724,6 @@ bfq_setup_cooperator(struct bfq_data *bfqd, struct bfq_queue *bfqq, { struct bfq_queue *in_service_bfqq, *new_bfqq;
- /* if a merge has already been setup, then proceed with that first */ - if (bfqq->new_bfqq) - return bfqq->new_bfqq; - /* * Check delayed stable merge for rotational or non-queueing * devs. For this branch to be executed, bfqq must not be @@ -2838,6 +2825,9 @@ bfq_setup_cooperator(struct bfq_data *bfqd, struct bfq_queue *bfqq, if (bfq_too_late_for_merging(bfqq)) return NULL;
+ if (bfqq->new_bfqq) + return bfqq->new_bfqq; + if (!io_struct || unlikely(bfqq == &bfqd->oom_bfqq)) return NULL;
From: Rahul Lakkireddy rahul.lakkireddy@chelsio.com
[ Upstream commit 79a7482249a7353bc86aff8127954d5febf02472 ]
Both cxgb4 and csiostor drivers run on their own independent Physical Function. But when cxgb4 and csiostor are both being loaded in parallel via modprobe, there is a race when firmware upgrade is attempted by both the drivers.
When the cxgb4 driver initiates the firmware upgrade, it halts the firmware and the chip until upgrade is complete. When the csiostor driver is coming up in parallel, the firmware mailbox communication fails with timeouts and the csiostor driver probe fails.
Add a module soft dependency on cxgb4 driver to ensure loading csiostor triggers cxgb4 to load first when available to avoid the firmware upgrade race.
Link: https://lore.kernel.org/r/1632759248-15382-1-git-send-email-rahul.lakkireddy... Fixes: a3667aaed569 ("[SCSI] csiostor: Chelsio FCoE offload driver") Signed-off-by: Rahul Lakkireddy rahul.lakkireddy@chelsio.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/csiostor/csio_init.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/scsi/csiostor/csio_init.c b/drivers/scsi/csiostor/csio_init.c index 390b07bf92b9..ccbded3353bd 100644 --- a/drivers/scsi/csiostor/csio_init.c +++ b/drivers/scsi/csiostor/csio_init.c @@ -1254,3 +1254,4 @@ MODULE_DEVICE_TABLE(pci, csio_pci_tbl); MODULE_VERSION(CSIO_DRV_VERSION); MODULE_FIRMWARE(FW_FNAME_T5); MODULE_FIRMWARE(FW_FNAME_T6); +MODULE_SOFTDEP("pre: cxgb4");
From: Feng Zhou zhoufeng.zf@bytedance.com
[ Upstream commit 513e605d7a9ce136886cb42ebb2c40e9a6eb6333 ]
The ixgbe driver currently generates a NULL pointer dereference with some machine (online cpus < 63). This is due to the fact that the maximum value of num_xdp_queues is nr_cpu_ids. Code is in "ixgbe_set_rss_queues"".
Here's how the problem repeats itself: Some machine (online cpus < 63), And user set num_queues to 63 through ethtool. Code is in the "ixgbe_set_channels", adapter->ring_feature[RING_F_FDIR].limit = count;
It becomes 63.
When user use xdp, "ixgbe_set_rss_queues" will set queues num. adapter->num_rx_queues = rss_i; adapter->num_tx_queues = rss_i; adapter->num_xdp_queues = ixgbe_xdp_queues(adapter);
And rss_i's value is from f = &adapter->ring_feature[RING_F_FDIR]; rss_i = f->indices = f->limit;
So "num_rx_queues" > "num_xdp_queues", when run to "ixgbe_xdp_setup", for (i = 0; i < adapter->num_rx_queues; i++) if (adapter->xdp_ring[i]->xsk_umem)
It leads to panic.
Call trace: [exception RIP: ixgbe_xdp+368] RIP: ffffffffc02a76a0 RSP: ffff9fe16202f8d0 RFLAGS: 00010297 RAX: 0000000000000000 RBX: 0000000000000020 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 000000000000001c RDI: ffffffffa94ead90 RBP: ffff92f8f24c0c18 R8: 0000000000000000 R9: 0000000000000000 R10: ffff9fe16202f830 R11: 0000000000000000 R12: ffff92f8f24c0000 R13: ffff9fe16202fc01 R14: 000000000000000a R15: ffffffffc02a7530 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 7 [ffff9fe16202f8f0] dev_xdp_install at ffffffffa89fbbcc 8 [ffff9fe16202f920] dev_change_xdp_fd at ffffffffa8a08808 9 [ffff9fe16202f960] do_setlink at ffffffffa8a20235 10 [ffff9fe16202fa88] rtnl_setlink at ffffffffa8a20384 11 [ffff9fe16202fc78] rtnetlink_rcv_msg at ffffffffa8a1a8dd 12 [ffff9fe16202fcf0] netlink_rcv_skb at ffffffffa8a717eb 13 [ffff9fe16202fd40] netlink_unicast at ffffffffa8a70f88 14 [ffff9fe16202fd80] netlink_sendmsg at ffffffffa8a71319 15 [ffff9fe16202fdf0] sock_sendmsg at ffffffffa89df290 16 [ffff9fe16202fe08] __sys_sendto at ffffffffa89e19c8 17 [ffff9fe16202ff30] __x64_sys_sendto at ffffffffa89e1a64 18 [ffff9fe16202ff38] do_syscall_64 at ffffffffa84042b9 19 [ffff9fe16202ff50] entry_SYSCALL_64_after_hwframe at ffffffffa8c0008c
So I fix ixgbe_max_channels so that it will not allow a setting of queues to be higher than the num_online_cpus(). And when run to ixgbe_xdp_setup, take the smaller value of num_rx_queues and num_xdp_queues.
Fixes: 4a9b32f30f80 ("ixgbe: fix potential RX buffer starvation for AF_XDP") Signed-off-by: Feng Zhou zhoufeng.zf@bytedance.com Tested-by: Sandeep Penigalapati sandeep.penigalapati@intel.com Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c | 2 +- drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 8 ++++++-- 2 files changed, 7 insertions(+), 3 deletions(-)
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c index 4ceaca0f6ce3..21321d164708 100644 --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c @@ -3204,7 +3204,7 @@ static unsigned int ixgbe_max_channels(struct ixgbe_adapter *adapter) max_combined = ixgbe_max_rss_indices(adapter); }
- return max_combined; + return min_t(int, max_combined, num_online_cpus()); }
static void ixgbe_get_channels(struct net_device *dev, diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c index 14aea40da50f..77350e5fdf97 100644 --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c @@ -10112,6 +10112,7 @@ static int ixgbe_xdp_setup(struct net_device *dev, struct bpf_prog *prog) struct ixgbe_adapter *adapter = netdev_priv(dev); struct bpf_prog *old_prog; bool need_reset; + int num_queues;
if (adapter->flags & IXGBE_FLAG_SRIOV_ENABLED) return -EINVAL; @@ -10161,11 +10162,14 @@ static int ixgbe_xdp_setup(struct net_device *dev, struct bpf_prog *prog) /* Kick start the NAPI context if there is an AF_XDP socket open * on that queue id. This so that receiving will start. */ - if (need_reset && prog) - for (i = 0; i < adapter->num_rx_queues; i++) + if (need_reset && prog) { + num_queues = min_t(int, adapter->num_rx_queues, + adapter->num_xdp_queues); + for (i = 0; i < num_queues; i++) if (adapter->xdp_ring[i]->xsk_pool) (void)ixgbe_xsk_wakeup(adapter->netdev, i, XDP_WAKEUP_RX); + }
return 0; }
From: Jian Shen shenjian15@huawei.com
[ Upstream commit 5b09e88e1bf7fe86540fab4b5f3eece8abead39e ]
hns3_nic_net_open() is not allowed to called repeatly, but there is no checking for this. When doing device reset and setup tc concurrently, there is a small oppotunity to call hns3_nic_net_open repeatedly, and cause kernel bug by calling napi_enable twice.
The calltrace information is like below: [ 3078.222780] ------------[ cut here ]------------ [ 3078.230255] kernel BUG at net/core/dev.c:6991! [ 3078.236224] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP [ 3078.243431] Modules linked in: hns3 hclgevf hclge hnae3 vfio_iommu_type1 vfio_pci vfio_virqfd vfio pv680_mii(O) [ 3078.258880] CPU: 0 PID: 295 Comm: kworker/u8:5 Tainted: G O 5.14.0-rc4+ #1 [ 3078.269102] Hardware name: , BIOS KpxxxFPGA 1P B600 V181 08/12/2021 [ 3078.276801] Workqueue: hclge hclge_service_task [hclge] [ 3078.288774] pstate: 60400009 (nZCv daif +PAN -UAO -TCO BTYPE=--) [ 3078.296168] pc : napi_enable+0x80/0x84 tc qdisc sho[w 3d0e7v8 .e3t0h218 79] lr : hns3_nic_net_open+0x138/0x510 [hns3]
[ 3078.314771] sp : ffff8000108abb20 [ 3078.319099] x29: ffff8000108abb20 x28: 0000000000000000 x27: ffff0820a8490300 [ 3078.329121] x26: 0000000000000001 x25: ffff08209cfc6200 x24: 0000000000000000 [ 3078.339044] x23: ffff0820a8490300 x22: ffff08209cd76000 x21: ffff0820abfe3880 [ 3078.349018] x20: 0000000000000000 x19: ffff08209cd76900 x18: 0000000000000000 [ 3078.358620] x17: 0000000000000000 x16: ffffc816e1727a50 x15: 0000ffff8f4ff930 [ 3078.368895] x14: 0000000000000000 x13: 0000000000000000 x12: 0000259e9dbeb6b4 [ 3078.377987] x11: 0096a8f7e764eb40 x10: 634615ad28d3eab5 x9 : ffffc816ad8885b8 [ 3078.387091] x8 : ffff08209cfc6fb8 x7 : ffff0820ac0da058 x6 : ffff0820a8490344 [ 3078.396356] x5 : 0000000000000140 x4 : 0000000000000003 x3 : ffff08209cd76938 [ 3078.405365] x2 : 0000000000000000 x1 : 0000000000000010 x0 : ffff0820abfe38a0 [ 3078.414657] Call trace: [ 3078.418517] napi_enable+0x80/0x84 [ 3078.424626] hns3_reset_notify_up_enet+0x78/0xd0 [hns3] [ 3078.433469] hns3_reset_notify+0x64/0x80 [hns3] [ 3078.441430] hclge_notify_client+0x68/0xb0 [hclge] [ 3078.450511] hclge_reset_rebuild+0x524/0x884 [hclge] [ 3078.458879] hclge_reset_service_task+0x3c4/0x680 [hclge] [ 3078.467470] hclge_service_task+0xb0/0xb54 [hclge] [ 3078.475675] process_one_work+0x1dc/0x48c [ 3078.481888] worker_thread+0x15c/0x464 [ 3078.487104] kthread+0x160/0x170 [ 3078.492479] ret_from_fork+0x10/0x18 [ 3078.498785] Code: c8027c81 35ffffa2 d50323bf d65f03c0 (d4210000) [ 3078.506889] ---[ end trace 8ebe0340a1b0fb44 ]---
Once hns3_nic_net_open() is excute success, the flag HNS3_NIC_STATE_DOWN will be cleared. So add checking for this flag, directly return when HNS3_NIC_STATE_DOWN is no set.
Fixes: e888402789b9 ("net: hns3: call hns3_nic_net_open() while doing HNAE3_UP_CLIENT") Signed-off-by: Jian Shen shenjian15@huawei.com Signed-off-by: Guangbin Huang huangguangbin2@huawei.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/hisilicon/hns3/hns3_enet.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c index 9faa3712ea5b..b24ad9bc8e1b 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c +++ b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c @@ -776,6 +776,11 @@ static int hns3_nic_net_open(struct net_device *netdev) if (hns3_nic_resetting(netdev)) return -EBUSY;
+ if (!test_bit(HNS3_NIC_STATE_DOWN, &priv->state)) { + netdev_warn(netdev, "net open repeatedly!\n"); + return 0; + } + netif_carrier_off(netdev);
ret = hns3_nic_set_real_num_queue(netdev);
From: Jian Shen shenjian15@huawei.com
[ Upstream commit a8e76fefe3de9b8e609cf192af75e7878d21fa3a ]
Currently, in function hns3_nic_set_real_num_queue(), the driver doesn't report the queue count and offset for disabled tc. If user enables multiple TCs, but only maps user priorities to partial of them, it may cause the queue range of the unmapped TC being displayed abnormally.
Fix it by removing the tc enable checking, ensure the queue count is not zero.
With this change, the tc_en is useless now, so remove it.
Fixes: a75a8efa00c5 ("net: hns3: Fix tc setup when netdev is first up") Signed-off-by: Jian Shen shenjian15@huawei.com Signed-off-by: Guangbin Huang huangguangbin2@huawei.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/hisilicon/hns3/hnae3.h | 1 - drivers/net/ethernet/hisilicon/hns3/hns3_enet.c | 11 ++--------- .../net/ethernet/hisilicon/hns3/hns3pf/hclge_dcb.c | 5 ----- drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_tm.c | 2 -- 4 files changed, 2 insertions(+), 17 deletions(-)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hnae3.h b/drivers/net/ethernet/hisilicon/hns3/hnae3.h index e0b7c3c44e7b..32987bd134a1 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hnae3.h +++ b/drivers/net/ethernet/hisilicon/hns3/hnae3.h @@ -750,7 +750,6 @@ struct hnae3_tc_info { u8 prio_tc[HNAE3_MAX_USER_PRIO]; /* TC indexed by prio */ u16 tqp_count[HNAE3_MAX_TC]; u16 tqp_offset[HNAE3_MAX_TC]; - unsigned long tc_en; /* bitmap of TC enabled */ u8 num_tc; /* Total number of enabled TCs */ bool mqprio_active; }; diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c index b24ad9bc8e1b..114692c4f797 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c +++ b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c @@ -620,13 +620,9 @@ static int hns3_nic_set_real_num_queue(struct net_device *netdev) return ret; }
- for (i = 0; i < HNAE3_MAX_TC; i++) { - if (!test_bit(i, &tc_info->tc_en)) - continue; - + for (i = 0; i < tc_info->num_tc; i++) netdev_set_tc_queue(netdev, i, tc_info->tqp_count[i], tc_info->tqp_offset[i]); - } }
ret = netif_set_real_num_tx_queues(netdev, queue_size); @@ -4830,12 +4826,9 @@ static void hns3_init_tx_ring_tc(struct hns3_nic_priv *priv) struct hnae3_tc_info *tc_info = &kinfo->tc_info; int i;
- for (i = 0; i < HNAE3_MAX_TC; i++) { + for (i = 0; i < tc_info->num_tc; i++) { int j;
- if (!test_bit(i, &tc_info->tc_en)) - continue; - for (j = 0; j < tc_info->tqp_count[i]; j++) { struct hnae3_queue *q;
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_dcb.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_dcb.c index 39f56f245d84..5fadfdbc4858 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_dcb.c +++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_dcb.c @@ -420,8 +420,6 @@ static int hclge_mqprio_qopt_check(struct hclge_dev *hdev, static void hclge_sync_mqprio_qopt(struct hnae3_tc_info *tc_info, struct tc_mqprio_qopt_offload *mqprio_qopt) { - int i; - memset(tc_info, 0, sizeof(*tc_info)); tc_info->num_tc = mqprio_qopt->qopt.num_tc; memcpy(tc_info->prio_tc, mqprio_qopt->qopt.prio_tc_map, @@ -430,9 +428,6 @@ static void hclge_sync_mqprio_qopt(struct hnae3_tc_info *tc_info, sizeof_field(struct hnae3_tc_info, tqp_count)); memcpy(tc_info->tqp_offset, mqprio_qopt->qopt.offset, sizeof_field(struct hnae3_tc_info, tqp_offset)); - - for (i = 0; i < HNAE3_MAX_USER_PRIO; i++) - set_bit(tc_info->prio_tc[i], &tc_info->tc_en); }
static int hclge_config_tc(struct hclge_dev *hdev, diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_tm.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_tm.c index 44618cc4cca1..6f5035a788c0 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_tm.c +++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_tm.c @@ -687,12 +687,10 @@ static void hclge_tm_vport_tc_info_update(struct hclge_vport *vport)
for (i = 0; i < HNAE3_MAX_TC; i++) { if (hdev->hw_tc_map & BIT(i) && i < kinfo->tc_info.num_tc) { - set_bit(i, &kinfo->tc_info.tc_en); kinfo->tc_info.tqp_offset[i] = i * kinfo->rss_size; kinfo->tc_info.tqp_count[i] = kinfo->rss_size; } else { /* Set to default queue if TC is disable */ - clear_bit(i, &kinfo->tc_info.tc_en); kinfo->tc_info.tqp_offset[i] = 0; kinfo->tc_info.tqp_count[i] = 1; }
From: Jian Shen shenjian15@huawei.com
[ Upstream commit d82650be60ee92e7486f755f5387023278aa933f ]
For destroy mqprio is irreversible in stack, so it's unnecessary to rollback the tc configuration when destroy mqprio failed. Otherwise, it may cause the configuration being inconsistent between driver and netstack.
As the failure is usually caused by reset, and the driver will restore the configuration after reset, so it can keep the configuration being consistent between driver and hardware.
Fixes: 5a5c90917467 ("net: hns3: add support for tc mqprio offload") Signed-off-by: Jian Shen shenjian15@huawei.com Signed-off-by: Guangbin Huang huangguangbin2@huawei.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- .../ethernet/hisilicon/hns3/hns3pf/hclge_dcb.c | 17 +++++++++++------ 1 file changed, 11 insertions(+), 6 deletions(-)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_dcb.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_dcb.c index 5fadfdbc4858..e4f87ffd41ac 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_dcb.c +++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_dcb.c @@ -493,12 +493,17 @@ static int hclge_setup_tc(struct hnae3_handle *h, return hclge_notify_init_up(hdev);
err_out: - /* roll-back */ - memcpy(&kinfo->tc_info, &old_tc_info, sizeof(old_tc_info)); - if (hclge_config_tc(hdev, &kinfo->tc_info)) - dev_err(&hdev->pdev->dev, - "failed to roll back tc configuration\n"); - + if (!tc) { + dev_warn(&hdev->pdev->dev, + "failed to destroy mqprio, will active after reset, ret = %d\n", + ret); + } else { + /* roll-back */ + memcpy(&kinfo->tc_info, &old_tc_info, sizeof(old_tc_info)); + if (hclge_config_tc(hdev, &kinfo->tc_info)) + dev_err(&hdev->pdev->dev, + "failed to roll back tc configuration\n"); + } hclge_notify_init_up(hdev);
return ret;
From: Jian Shen shenjian15@huawei.com
[ Upstream commit 0472e95ffeac8e61259eec17ab61608c6b35599d ]
HCLGE_FLAG_MQPRIO_ENABLE is supposed to set when enable multiple TCs with tc mqprio, and HCLGE_FLAG_DCB_ENABLE is supposed to set when enable multiple TCs with ets. But the driver mixed the flags when updating the tm configuration.
Furtherly, PFC should be available when HCLGE_FLAG_MQPRIO_ENABLE too, so remove the unnecessary limitation.
Fixes: 5a5c90917467 ("net: hns3: add support for tc mqprio offload") Signed-off-by: Jian Shen shenjian15@huawei.com Signed-off-by: Guangbin Huang huangguangbin2@huawei.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- .../hisilicon/hns3/hns3pf/hclge_dcb.c | 7 +++-- .../ethernet/hisilicon/hns3/hns3pf/hclge_tm.c | 31 +++---------------- 2 files changed, 10 insertions(+), 28 deletions(-)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_dcb.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_dcb.c index e4f87ffd41ac..c90bfde2aecf 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_dcb.c +++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_dcb.c @@ -224,6 +224,10 @@ static int hclge_ieee_setets(struct hnae3_handle *h, struct ieee_ets *ets) }
hclge_tm_schd_info_update(hdev, num_tc); + if (num_tc > 1) + hdev->flag |= HCLGE_FLAG_DCB_ENABLE; + else + hdev->flag &= ~HCLGE_FLAG_DCB_ENABLE;
ret = hclge_ieee_ets_to_tm_info(hdev, ets); if (ret) @@ -285,8 +289,7 @@ static int hclge_ieee_setpfc(struct hnae3_handle *h, struct ieee_pfc *pfc) u8 i, j, pfc_map, *prio_tc; int ret;
- if (!(hdev->dcbx_cap & DCB_CAP_DCBX_VER_IEEE) || - hdev->flag & HCLGE_FLAG_MQPRIO_ENABLE) + if (!(hdev->dcbx_cap & DCB_CAP_DCBX_VER_IEEE)) return -EINVAL;
if (pfc->pfc_en == hdev->tm_info.pfc_en) diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_tm.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_tm.c index 6f5035a788c0..f314dbd3ce11 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_tm.c +++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_tm.c @@ -727,14 +727,6 @@ static void hclge_tm_tc_info_init(struct hclge_dev *hdev) for (i = 0; i < HNAE3_MAX_USER_PRIO; i++) hdev->tm_info.prio_tc[i] = (i >= hdev->tm_info.num_tc) ? 0 : i; - - /* DCB is enabled if we have more than 1 TC or pfc_en is - * non-zero. - */ - if (hdev->tm_info.num_tc > 1 || hdev->tm_info.pfc_en) - hdev->flag |= HCLGE_FLAG_DCB_ENABLE; - else - hdev->flag &= ~HCLGE_FLAG_DCB_ENABLE; }
static void hclge_tm_pg_info_init(struct hclge_dev *hdev) @@ -765,10 +757,10 @@ static void hclge_tm_pg_info_init(struct hclge_dev *hdev)
static void hclge_update_fc_mode_by_dcb_flag(struct hclge_dev *hdev) { - if (!(hdev->flag & HCLGE_FLAG_DCB_ENABLE)) { + if (hdev->tm_info.num_tc == 1 && !hdev->tm_info.pfc_en) { if (hdev->fc_mode_last_time == HCLGE_FC_PFC) dev_warn(&hdev->pdev->dev, - "DCB is disable, but last mode is FC_PFC\n"); + "Only 1 tc used, but last mode is FC_PFC\n");
hdev->tm_info.fc_mode = hdev->fc_mode_last_time; } else if (hdev->tm_info.fc_mode != HCLGE_FC_PFC) { @@ -794,7 +786,7 @@ static void hclge_update_fc_mode(struct hclge_dev *hdev) } }
-static void hclge_pfc_info_init(struct hclge_dev *hdev) +void hclge_tm_pfc_info_update(struct hclge_dev *hdev) { if (hdev->ae_dev->dev_version >= HNAE3_DEVICE_VERSION_V3) hclge_update_fc_mode(hdev); @@ -810,7 +802,7 @@ static void hclge_tm_schd_info_init(struct hclge_dev *hdev)
hclge_tm_vport_info_update(hdev);
- hclge_pfc_info_init(hdev); + hclge_tm_pfc_info_update(hdev); }
static int hclge_tm_pg_to_pri_map(struct hclge_dev *hdev) @@ -1556,19 +1548,6 @@ void hclge_tm_schd_info_update(struct hclge_dev *hdev, u8 num_tc) hclge_tm_schd_info_init(hdev); }
-void hclge_tm_pfc_info_update(struct hclge_dev *hdev) -{ - /* DCB is enabled if we have more than 1 TC or pfc_en is - * non-zero. - */ - if (hdev->tm_info.num_tc > 1 || hdev->tm_info.pfc_en) - hdev->flag |= HCLGE_FLAG_DCB_ENABLE; - else - hdev->flag &= ~HCLGE_FLAG_DCB_ENABLE; - - hclge_pfc_info_init(hdev); -} - int hclge_tm_init_hw(struct hclge_dev *hdev, bool init) { int ret; @@ -1614,7 +1593,7 @@ int hclge_tm_vport_map_update(struct hclge_dev *hdev) if (ret) return ret;
- if (!(hdev->flag & HCLGE_FLAG_DCB_ENABLE)) + if (hdev->tm_info.num_tc == 1 && !hdev->tm_info.pfc_en) return 0;
return hclge_tm_bp_setup(hdev);
From: Jian Shen shenjian15@huawei.com
[ Upstream commit 108b3c7810e14892c4a1819b1d268a2c785c087c ]
Currently, if function adds an existing unicast mac address, eventhough driver will not add this address into hardware, but it will return 0 in function hclge_add_uc_addr_common(). It will cause the state of this unicast mac address is ACTIVE in driver, but it should be in TO-ADD state.
To fix this problem, function hclge_add_uc_addr_common() returns -EEXIST if mac address is existing, and delete two error log to avoid printing them all the time after this modification.
Fixes: 72110b567479 ("net: hns3: return 0 and print warning when hit duplicate MAC") Signed-off-by: Jian Shen shenjian15@huawei.com Signed-off-by: Guangbin Huang huangguangbin2@huawei.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- .../hisilicon/hns3/hns3pf/hclge_main.c | 19 +++++++++---------- 1 file changed, 9 insertions(+), 10 deletions(-)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c index 90a72c79fec9..9920e76b4f41 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c +++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c @@ -8701,15 +8701,8 @@ int hclge_add_uc_addr_common(struct hclge_vport *vport, }
/* check if we just hit the duplicate */ - if (!ret) { - dev_warn(&hdev->pdev->dev, "VF %u mac(%pM) exists\n", - vport->vport_id, addr); - return 0; - } - - dev_err(&hdev->pdev->dev, - "PF failed to add unicast entry(%pM) in the MAC table\n", - addr); + if (!ret) + return -EEXIST;
return ret; } @@ -8861,7 +8854,13 @@ static void hclge_sync_vport_mac_list(struct hclge_vport *vport, } else { set_bit(HCLGE_VPORT_STATE_MAC_TBL_CHANGE, &vport->state); - break; + + /* If one unicast mac address is existing in hardware, + * we need to try whether other unicast mac addresses + * are new addresses that can be added. + */ + if (ret != -EEXIST) + break; } } }
From: Peng Li lipeng321@huawei.com
[ Upstream commit 4c8dab1c709c5a715bce14efdb8f4e889d86aa04 ]
This patch reconstructs function hns3_self_test to reduce the code cycle complexity and make code more concise.
Signed-off-by: Peng Li lipeng321@huawei.com Signed-off-by: Guangbin Huang huangguangbin2@huawei.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- .../ethernet/hisilicon/hns3/hns3_ethtool.c | 101 +++++++++++------- 1 file changed, 64 insertions(+), 37 deletions(-)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c b/drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c index 82061ab6930f..69b253424da8 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c +++ b/drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c @@ -312,33 +312,8 @@ static int hns3_lp_run_test(struct net_device *ndev, enum hnae3_loop mode) return ret_val; }
-/** - * hns3_self_test - self test - * @ndev: net device - * @eth_test: test cmd - * @data: test result - */ -static void hns3_self_test(struct net_device *ndev, - struct ethtool_test *eth_test, u64 *data) +static void hns3_set_selftest_param(struct hnae3_handle *h, int (*st_param)[2]) { - struct hns3_nic_priv *priv = netdev_priv(ndev); - struct hnae3_handle *h = priv->ae_handle; - int st_param[HNS3_SELF_TEST_TYPE_NUM][2]; - bool if_running = netif_running(ndev); - int test_index = 0; - u32 i; - - if (hns3_nic_resetting(ndev)) { - netdev_err(ndev, "dev resetting!"); - return; - } - - /* Only do offline selftest, or pass by default */ - if (eth_test->flags != ETH_TEST_FL_OFFLINE) - return; - - netif_dbg(h, drv, ndev, "self test start"); - st_param[HNAE3_LOOP_APP][0] = HNAE3_LOOP_APP; st_param[HNAE3_LOOP_APP][1] = h->flags & HNAE3_SUPPORT_APP_LOOPBACK; @@ -355,6 +330,18 @@ static void hns3_self_test(struct net_device *ndev, st_param[HNAE3_LOOP_PHY][0] = HNAE3_LOOP_PHY; st_param[HNAE3_LOOP_PHY][1] = h->flags & HNAE3_SUPPORT_PHY_LOOPBACK; +} + +static void hns3_selftest_prepare(struct net_device *ndev, + bool if_running, int (*st_param)[2]) +{ + struct hns3_nic_priv *priv = netdev_priv(ndev); + struct hnae3_handle *h = priv->ae_handle; + + if (netif_msg_ifdown(h)) + netdev_info(ndev, "self test start\n"); + + hns3_set_selftest_param(h, st_param);
if (if_running) ndev->netdev_ops->ndo_stop(ndev); @@ -373,6 +360,35 @@ static void hns3_self_test(struct net_device *ndev, h->ae_algo->ops->halt_autoneg(h, true);
set_bit(HNS3_NIC_STATE_TESTING, &priv->state); +} + +static void hns3_selftest_restore(struct net_device *ndev, bool if_running) +{ + struct hns3_nic_priv *priv = netdev_priv(ndev); + struct hnae3_handle *h = priv->ae_handle; + + clear_bit(HNS3_NIC_STATE_TESTING, &priv->state); + + if (h->ae_algo->ops->halt_autoneg) + h->ae_algo->ops->halt_autoneg(h, false); + +#if IS_ENABLED(CONFIG_VLAN_8021Q) + if (h->ae_algo->ops->enable_vlan_filter) + h->ae_algo->ops->enable_vlan_filter(h, true); +#endif + + if (if_running) + ndev->netdev_ops->ndo_open(ndev); + + if (netif_msg_ifdown(h)) + netdev_info(ndev, "self test end\n"); +} + +static void hns3_do_selftest(struct net_device *ndev, int (*st_param)[2], + struct ethtool_test *eth_test, u64 *data) +{ + int test_index = 0; + u32 i;
for (i = 0; i < HNS3_SELF_TEST_TYPE_NUM; i++) { enum hnae3_loop loop_type = (enum hnae3_loop)st_param[i][0]; @@ -391,21 +407,32 @@ static void hns3_self_test(struct net_device *ndev,
test_index++; } +}
- clear_bit(HNS3_NIC_STATE_TESTING, &priv->state); - - if (h->ae_algo->ops->halt_autoneg) - h->ae_algo->ops->halt_autoneg(h, false); +/** + * hns3_nic_self_test - self test + * @ndev: net device + * @eth_test: test cmd + * @data: test result + */ +static void hns3_self_test(struct net_device *ndev, + struct ethtool_test *eth_test, u64 *data) +{ + int st_param[HNS3_SELF_TEST_TYPE_NUM][2]; + bool if_running = netif_running(ndev);
-#if IS_ENABLED(CONFIG_VLAN_8021Q) - if (h->ae_algo->ops->enable_vlan_filter) - h->ae_algo->ops->enable_vlan_filter(h, true); -#endif + if (hns3_nic_resetting(ndev)) { + netdev_err(ndev, "dev resetting!"); + return; + }
- if (if_running) - ndev->netdev_ops->ndo_open(ndev); + /* Only do offline selftest, or pass by default */ + if (eth_test->flags != ETH_TEST_FL_OFFLINE) + return;
- netif_dbg(h, drv, ndev, "self test end\n"); + hns3_selftest_prepare(ndev, if_running, st_param); + hns3_do_selftest(ndev, st_param, eth_test, data); + hns3_selftest_restore(ndev, if_running); }
static void hns3_update_limit_promisc_mode(struct net_device *netdev,
From: Guangbin Huang huangguangbin2@huawei.com
[ Upstream commit 27bf4af69fcb9845fb2f0076db5d562ec072e70f ]
Currently, the rx vlan filter will always be disabled before selftest and be enabled after selftest as the rx vlan filter feature is fixed on in old device earlier than V3.
However, this feature is not fixed in some new devices and it can be disabled by user. In this case, it is wrong if rx vlan filter is enabled after selftest. So fix it.
Fixes: bcc26e8dc432 ("net: hns3: remove unused code in hns3_self_test()") Signed-off-by: Guangbin Huang huangguangbin2@huawei.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c b/drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c index 69b253424da8..83ee0f41322c 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c +++ b/drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c @@ -348,7 +348,8 @@ static void hns3_selftest_prepare(struct net_device *ndev,
#if IS_ENABLED(CONFIG_VLAN_8021Q) /* Disable the vlan filter for selftest does not support it */ - if (h->ae_algo->ops->enable_vlan_filter) + if (h->ae_algo->ops->enable_vlan_filter && + ndev->features & NETIF_F_HW_VLAN_CTAG_FILTER) h->ae_algo->ops->enable_vlan_filter(h, false); #endif
@@ -373,7 +374,8 @@ static void hns3_selftest_restore(struct net_device *ndev, bool if_running) h->ae_algo->ops->halt_autoneg(h, false);
#if IS_ENABLED(CONFIG_VLAN_8021Q) - if (h->ae_algo->ops->enable_vlan_filter) + if (h->ae_algo->ops->enable_vlan_filter && + ndev->features & NETIF_F_HW_VLAN_CTAG_FILTER) h->ae_algo->ops->enable_vlan_filter(h, true); #endif
From: Guangbin Huang huangguangbin2@huawei.com
[ Upstream commit 0178839ccca36dee238a57e7f4c3c252f5dbbba6 ]
Currently, the firmware compatible features are enabled in PF driver initialization process, but they are not disabled in PF driver deinitialization process and firmware keeps these features in enabled status.
In this case, if load an old PF driver (for example, in VM) which not support the firmware compatible features, firmware will still send mailbox message to PF when link status changed and PF will print "un-supported mailbox message, code = 201".
To fix this problem, disable these firmware compatible features in PF driver deinitialization process.
Fixes: ed8fb4b262ae ("net: hns3: add link change event report") Signed-off-by: Guangbin Huang huangguangbin2@huawei.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- .../hisilicon/hns3/hns3pf/hclge_cmd.c | 21 ++++++++++++------- 1 file changed, 13 insertions(+), 8 deletions(-)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.c index eb748aa35952..0f0bf3d503bf 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.c +++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.c @@ -472,7 +472,7 @@ int hclge_cmd_queue_init(struct hclge_dev *hdev) return ret; }
-static int hclge_firmware_compat_config(struct hclge_dev *hdev) +static int hclge_firmware_compat_config(struct hclge_dev *hdev, bool en) { struct hclge_firmware_compat_cmd *req; struct hclge_desc desc; @@ -480,13 +480,16 @@ static int hclge_firmware_compat_config(struct hclge_dev *hdev)
hclge_cmd_setup_basic_desc(&desc, HCLGE_OPC_IMP_COMPAT_CFG, false);
- req = (struct hclge_firmware_compat_cmd *)desc.data; + if (en) { + req = (struct hclge_firmware_compat_cmd *)desc.data;
- hnae3_set_bit(compat, HCLGE_LINK_EVENT_REPORT_EN_B, 1); - hnae3_set_bit(compat, HCLGE_NCSI_ERROR_REPORT_EN_B, 1); - if (hnae3_dev_phy_imp_supported(hdev)) - hnae3_set_bit(compat, HCLGE_PHY_IMP_EN_B, 1); - req->compat = cpu_to_le32(compat); + hnae3_set_bit(compat, HCLGE_LINK_EVENT_REPORT_EN_B, 1); + hnae3_set_bit(compat, HCLGE_NCSI_ERROR_REPORT_EN_B, 1); + if (hnae3_dev_phy_imp_supported(hdev)) + hnae3_set_bit(compat, HCLGE_PHY_IMP_EN_B, 1); + + req->compat = cpu_to_le32(compat); + }
return hclge_cmd_send(&hdev->hw, &desc, 1); } @@ -543,7 +546,7 @@ int hclge_cmd_init(struct hclge_dev *hdev) /* ask the firmware to enable some features, driver can work without * it. */ - ret = hclge_firmware_compat_config(hdev); + ret = hclge_firmware_compat_config(hdev, true); if (ret) dev_warn(&hdev->pdev->dev, "Firmware compatible features not enabled(%d).\n", @@ -573,6 +576,8 @@ static void hclge_cmd_uninit_regs(struct hclge_hw *hw)
void hclge_cmd_uninit(struct hclge_dev *hdev) { + hclge_firmware_compat_config(hdev, false); + set_bit(HCLGE_STATE_CMD_DISABLE, &hdev->state); /* wait to ensure that the firmware completes the possible left * over commands.
From: Florian Fainelli f.fainelli@gmail.com
[ Upstream commit d88fd1b546ff19c8040cfaea76bf16aed1c5a0bb ]
When EEE support was added to the 28nm EPHY it was assumed that it would be able to support the standard clause 45 over clause 22 register access method. It turns out that the PHY does not support that, which is the very reason for using the indirect shadow mode 2 bank 3 access method.
Implement {read,write}_mmd to allow the standard PHY library routines pertaining to EEE querying and configuration to work correctly on these PHYs. This forces us to implement a __phy_set_clr_bits() function that does not grab the MDIO bus lock since the PHY driver's {read,write}_mmd functions are always called with that lock held.
Fixes: 83ee102a6998 ("net: phy: bcm7xxx: add support for 28nm EPHY") Signed-off-by: Florian Fainelli f.fainelli@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/phy/bcm7xxx.c | 114 ++++++++++++++++++++++++++++++++++++-- 1 file changed, 110 insertions(+), 4 deletions(-)
diff --git a/drivers/net/phy/bcm7xxx.c b/drivers/net/phy/bcm7xxx.c index e79297a4bae8..27b6a3f507ae 100644 --- a/drivers/net/phy/bcm7xxx.c +++ b/drivers/net/phy/bcm7xxx.c @@ -27,7 +27,12 @@ #define MII_BCM7XXX_SHD_2_ADDR_CTRL 0xe #define MII_BCM7XXX_SHD_2_CTRL_STAT 0xf #define MII_BCM7XXX_SHD_2_BIAS_TRIM 0x1a +#define MII_BCM7XXX_SHD_3_PCS_CTRL 0x0 +#define MII_BCM7XXX_SHD_3_PCS_STATUS 0x1 +#define MII_BCM7XXX_SHD_3_EEE_CAP 0x2 #define MII_BCM7XXX_SHD_3_AN_EEE_ADV 0x3 +#define MII_BCM7XXX_SHD_3_EEE_LP 0x4 +#define MII_BCM7XXX_SHD_3_EEE_WK_ERR 0x5 #define MII_BCM7XXX_SHD_3_PCS_CTRL_2 0x6 #define MII_BCM7XXX_PCS_CTRL_2_DEF 0x4400 #define MII_BCM7XXX_SHD_3_AN_STAT 0xb @@ -216,25 +221,37 @@ static int bcm7xxx_28nm_resume(struct phy_device *phydev) return genphy_config_aneg(phydev); }
-static int phy_set_clr_bits(struct phy_device *dev, int location, - int set_mask, int clr_mask) +static int __phy_set_clr_bits(struct phy_device *dev, int location, + int set_mask, int clr_mask) { int v, ret;
- v = phy_read(dev, location); + v = __phy_read(dev, location); if (v < 0) return v;
v &= ~clr_mask; v |= set_mask;
- ret = phy_write(dev, location, v); + ret = __phy_write(dev, location, v); if (ret < 0) return ret;
return v; }
+static int phy_set_clr_bits(struct phy_device *dev, int location, + int set_mask, int clr_mask) +{ + int ret; + + mutex_lock(&dev->mdio.bus->mdio_lock); + ret = __phy_set_clr_bits(dev, location, set_mask, clr_mask); + mutex_unlock(&dev->mdio.bus->mdio_lock); + + return ret; +} + static int bcm7xxx_28nm_ephy_01_afe_config_init(struct phy_device *phydev) { int ret; @@ -398,6 +415,93 @@ static int bcm7xxx_28nm_ephy_config_init(struct phy_device *phydev) return bcm7xxx_28nm_ephy_apd_enable(phydev); }
+#define MII_BCM7XXX_REG_INVALID 0xff + +static u8 bcm7xxx_28nm_ephy_regnum_to_shd(u16 regnum) +{ + switch (regnum) { + case MDIO_CTRL1: + return MII_BCM7XXX_SHD_3_PCS_CTRL; + case MDIO_STAT1: + return MII_BCM7XXX_SHD_3_PCS_STATUS; + case MDIO_PCS_EEE_ABLE: + return MII_BCM7XXX_SHD_3_EEE_CAP; + case MDIO_AN_EEE_ADV: + return MII_BCM7XXX_SHD_3_AN_EEE_ADV; + case MDIO_AN_EEE_LPABLE: + return MII_BCM7XXX_SHD_3_EEE_LP; + case MDIO_PCS_EEE_WK_ERR: + return MII_BCM7XXX_SHD_3_EEE_WK_ERR; + default: + return MII_BCM7XXX_REG_INVALID; + } +} + +static bool bcm7xxx_28nm_ephy_dev_valid(int devnum) +{ + return devnum == MDIO_MMD_AN || devnum == MDIO_MMD_PCS; +} + +static int bcm7xxx_28nm_ephy_read_mmd(struct phy_device *phydev, + int devnum, u16 regnum) +{ + u8 shd = bcm7xxx_28nm_ephy_regnum_to_shd(regnum); + int ret; + + if (!bcm7xxx_28nm_ephy_dev_valid(devnum) || + shd == MII_BCM7XXX_REG_INVALID) + return -EOPNOTSUPP; + + /* set shadow mode 2 */ + ret = __phy_set_clr_bits(phydev, MII_BCM7XXX_TEST, + MII_BCM7XXX_SHD_MODE_2, 0); + if (ret < 0) + return ret; + + /* Access the desired shadow register address */ + ret = __phy_write(phydev, MII_BCM7XXX_SHD_2_ADDR_CTRL, shd); + if (ret < 0) + goto reset_shadow_mode; + + ret = __phy_read(phydev, MII_BCM7XXX_SHD_2_CTRL_STAT); + +reset_shadow_mode: + /* reset shadow mode 2 */ + __phy_set_clr_bits(phydev, MII_BCM7XXX_TEST, 0, + MII_BCM7XXX_SHD_MODE_2); + return ret; +} + +static int bcm7xxx_28nm_ephy_write_mmd(struct phy_device *phydev, + int devnum, u16 regnum, u16 val) +{ + u8 shd = bcm7xxx_28nm_ephy_regnum_to_shd(regnum); + int ret; + + if (!bcm7xxx_28nm_ephy_dev_valid(devnum) || + shd == MII_BCM7XXX_REG_INVALID) + return -EOPNOTSUPP; + + /* set shadow mode 2 */ + ret = __phy_set_clr_bits(phydev, MII_BCM7XXX_TEST, + MII_BCM7XXX_SHD_MODE_2, 0); + if (ret < 0) + return ret; + + /* Access the desired shadow register address */ + ret = __phy_write(phydev, MII_BCM7XXX_SHD_2_ADDR_CTRL, shd); + if (ret < 0) + goto reset_shadow_mode; + + /* Write the desired value in the shadow register */ + __phy_write(phydev, MII_BCM7XXX_SHD_2_CTRL_STAT, val); + +reset_shadow_mode: + /* reset shadow mode 2 */ + return __phy_set_clr_bits(phydev, MII_BCM7XXX_TEST, 0, + MII_BCM7XXX_SHD_MODE_2); +} + static int bcm7xxx_28nm_ephy_resume(struct phy_device *phydev) { int ret; @@ -595,6 +699,8 @@ static void bcm7xxx_28nm_remove(struct phy_device *phydev) .get_stats = bcm7xxx_28nm_get_phy_stats, \ .probe = bcm7xxx_28nm_probe, \ .remove = bcm7xxx_28nm_remove, \ + .read_mmd = bcm7xxx_28nm_ephy_read_mmd, \ + .write_mmd = bcm7xxx_28nm_ephy_write_mmd, \ }
#define BCM7XXX_40NM_EPHY(_oui, _name) \
From: Paolo Abeni pabeni@redhat.com
[ Upstream commit 49054556289e8787501630b7c7a9d407da02e296 ]
Syzkaller reported a false positive deadlock involving the nl socket lock and the subflow socket lock:
MPTCP: kernel_bind error, err=-98 ============================================ WARNING: possible recursive locking detected 5.15.0-rc1-syzkaller #0 Not tainted -------------------------------------------- syz-executor998/6520 is trying to acquire lock: ffff8880795718a0 (k-sk_lock-AF_INET){+.+.}-{0:0}, at: mptcp_close+0x267/0x7b0 net/mptcp/protocol.c:2738
but task is already holding lock: ffff8880787c8c60 (k-sk_lock-AF_INET){+.+.}-{0:0}, at: lock_sock include/net/sock.h:1612 [inline] ffff8880787c8c60 (k-sk_lock-AF_INET){+.+.}-{0:0}, at: mptcp_close+0x23/0x7b0 net/mptcp/protocol.c:2720
other info that might help us debug this: Possible unsafe locking scenario:
CPU0 ---- lock(k-sk_lock-AF_INET); lock(k-sk_lock-AF_INET);
*** DEADLOCK ***
May be due to missing lock nesting notation
3 locks held by syz-executor998/6520: #0: ffffffff8d176c50 (cb_lock){++++}-{3:3}, at: genl_rcv+0x15/0x40 net/netlink/genetlink.c:802 #1: ffffffff8d176d08 (genl_mutex){+.+.}-{3:3}, at: genl_lock net/netlink/genetlink.c:33 [inline] #1: ffffffff8d176d08 (genl_mutex){+.+.}-{3:3}, at: genl_rcv_msg+0x3e0/0x580 net/netlink/genetlink.c:790 #2: ffff8880787c8c60 (k-sk_lock-AF_INET){+.+.}-{0:0}, at: lock_sock include/net/sock.h:1612 [inline] #2: ffff8880787c8c60 (k-sk_lock-AF_INET){+.+.}-{0:0}, at: mptcp_close+0x23/0x7b0 net/mptcp/protocol.c:2720
stack backtrace: CPU: 1 PID: 6520 Comm: syz-executor998 Not tainted 5.15.0-rc1-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106 print_deadlock_bug kernel/locking/lockdep.c:2944 [inline] check_deadlock kernel/locking/lockdep.c:2987 [inline] validate_chain kernel/locking/lockdep.c:3776 [inline] __lock_acquire.cold+0x149/0x3ab kernel/locking/lockdep.c:5015 lock_acquire kernel/locking/lockdep.c:5625 [inline] lock_acquire+0x1ab/0x510 kernel/locking/lockdep.c:5590 lock_sock_fast+0x36/0x100 net/core/sock.c:3229 mptcp_close+0x267/0x7b0 net/mptcp/protocol.c:2738 inet_release+0x12e/0x280 net/ipv4/af_inet.c:431 __sock_release net/socket.c:649 [inline] sock_release+0x87/0x1b0 net/socket.c:677 mptcp_pm_nl_create_listen_socket+0x238/0x2c0 net/mptcp/pm_netlink.c:900 mptcp_nl_cmd_add_addr+0x359/0x930 net/mptcp/pm_netlink.c:1170 genl_family_rcv_msg_doit+0x228/0x320 net/netlink/genetlink.c:731 genl_family_rcv_msg net/netlink/genetlink.c:775 [inline] genl_rcv_msg+0x328/0x580 net/netlink/genetlink.c:792 netlink_rcv_skb+0x153/0x420 net/netlink/af_netlink.c:2504 genl_rcv+0x24/0x40 net/netlink/genetlink.c:803 netlink_unicast_kernel net/netlink/af_netlink.c:1314 [inline] netlink_unicast+0x533/0x7d0 net/netlink/af_netlink.c:1340 netlink_sendmsg+0x86d/0xdb0 net/netlink/af_netlink.c:1929 sock_sendmsg_nosec net/socket.c:704 [inline] sock_sendmsg+0xcf/0x120 net/socket.c:724 sock_no_sendpage+0x101/0x150 net/core/sock.c:2980 kernel_sendpage.part.0+0x1a0/0x340 net/socket.c:3504 kernel_sendpage net/socket.c:3501 [inline] sock_sendpage+0xe5/0x140 net/socket.c:1003 pipe_to_sendpage+0x2ad/0x380 fs/splice.c:364 splice_from_pipe_feed fs/splice.c:418 [inline] __splice_from_pipe+0x43e/0x8a0 fs/splice.c:562 splice_from_pipe fs/splice.c:597 [inline] generic_splice_sendpage+0xd4/0x140 fs/splice.c:746 do_splice_from fs/splice.c:767 [inline] direct_splice_actor+0x110/0x180 fs/splice.c:936 splice_direct_to_actor+0x34b/0x8c0 fs/splice.c:891 do_splice_direct+0x1b3/0x280 fs/splice.c:979 do_sendfile+0xae9/0x1240 fs/read_write.c:1249 __do_sys_sendfile64 fs/read_write.c:1314 [inline] __se_sys_sendfile64 fs/read_write.c:1300 [inline] __x64_sys_sendfile64+0x1cc/0x210 fs/read_write.c:1300 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x7f215cb69969 Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 e1 14 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 c0 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007ffc96bb3868 EFLAGS: 00000246 ORIG_RAX: 0000000000000028 RAX: ffffffffffffffda RBX: 00007f215cbad072 RCX: 00007f215cb69969 RDX: 0000000000000000 RSI: 0000000000000004 RDI: 0000000000000005 RBP: 0000000000000000 R08: 00007ffc96bb3a08 R09: 00007ffc96bb3a08 R10: 0000000100000002 R11: 0000000000000246 R12: 00007ffc96bb387c R13: 431bde82d7b634db R14: 0000000000000000 R15: 0000000000000000
the problem originates from uncorrect lock annotation in the mptcp code and is only visible since commit 2dcb96bacce3 ("net: core: Correct the sock::sk_lock.owned lockdep annotations"), but is present since the port-based endpoint support initial implementation.
This patch addresses the issue introducing a nested variant of lock_sock_fast() and using it in the relevant code path.
Fixes: 1729cf186d8a ("mptcp: create the listening socket for new port") Fixes: 2dcb96bacce3 ("net: core: Correct the sock::sk_lock.owned lockdep annotations") Suggested-by: Thomas Gleixner tglx@linutronix.de Reported-and-tested-by: syzbot+1dd53f7a89b299d59eaf@syzkaller.appspotmail.com Signed-off-by: Paolo Abeni pabeni@redhat.com Reviewed-by: Thomas Gleixner tglx@linutronix.de Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- include/net/sock.h | 31 ++++++++++++++++++++++++++++++- net/core/sock.c | 17 ++--------------- net/mptcp/protocol.c | 2 +- 3 files changed, 33 insertions(+), 17 deletions(-)
diff --git a/include/net/sock.h b/include/net/sock.h index f23cb259b0e2..980b471b569d 100644 --- a/include/net/sock.h +++ b/include/net/sock.h @@ -1624,7 +1624,36 @@ void release_sock(struct sock *sk); SINGLE_DEPTH_NESTING) #define bh_unlock_sock(__sk) spin_unlock(&((__sk)->sk_lock.slock))
-bool lock_sock_fast(struct sock *sk) __acquires(&sk->sk_lock.slock); +bool __lock_sock_fast(struct sock *sk) __acquires(&sk->sk_lock.slock); + +/** + * lock_sock_fast - fast version of lock_sock + * @sk: socket + * + * This version should be used for very small section, where process wont block + * return false if fast path is taken: + * + * sk_lock.slock locked, owned = 0, BH disabled + * + * return true if slow path is taken: + * + * sk_lock.slock unlocked, owned = 1, BH enabled + */ +static inline bool lock_sock_fast(struct sock *sk) +{ + /* The sk_lock has mutex_lock() semantics here. */ + mutex_acquire(&sk->sk_lock.dep_map, 0, 0, _RET_IP_); + + return __lock_sock_fast(sk); +} + +/* fast socket lock variant for caller already holding a [different] socket lock */ +static inline bool lock_sock_fast_nested(struct sock *sk) +{ + mutex_acquire(&sk->sk_lock.dep_map, SINGLE_DEPTH_NESTING, 0, _RET_IP_); + + return __lock_sock_fast(sk); +}
/** * unlock_sock_fast - complement of lock_sock_fast diff --git a/net/core/sock.c b/net/core/sock.c index a3eea6e0b30a..1cf0edc79f37 100644 --- a/net/core/sock.c +++ b/net/core/sock.c @@ -3191,20 +3191,7 @@ void release_sock(struct sock *sk) } EXPORT_SYMBOL(release_sock);
-/** - * lock_sock_fast - fast version of lock_sock - * @sk: socket - * - * This version should be used for very small section, where process wont block - * return false if fast path is taken: - * - * sk_lock.slock locked, owned = 0, BH disabled - * - * return true if slow path is taken: - * - * sk_lock.slock unlocked, owned = 1, BH enabled - */ -bool lock_sock_fast(struct sock *sk) __acquires(&sk->sk_lock.slock) +bool __lock_sock_fast(struct sock *sk) __acquires(&sk->sk_lock.slock) { might_sleep(); spin_lock_bh(&sk->sk_lock.slock); @@ -3226,7 +3213,7 @@ bool lock_sock_fast(struct sock *sk) __acquires(&sk->sk_lock.slock) local_bh_enable(); return true; } -EXPORT_SYMBOL(lock_sock_fast); +EXPORT_SYMBOL(__lock_sock_fast);
int sock_gettstamp(struct socket *sock, void __user *userstamp, bool timeval, bool time32) diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c index 4d2abdd3cd3b..7d4d40360f77 100644 --- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -2622,7 +2622,7 @@ static void mptcp_close(struct sock *sk, long timeout) inet_csk(sk)->icsk_mtup.probe_timestamp = tcp_jiffies32; mptcp_for_each_subflow(mptcp_sk(sk), subflow) { struct sock *ssk = mptcp_subflow_tcp_sock(subflow); - bool slow = lock_sock_fast(ssk); + bool slow = lock_sock_fast_nested(ssk);
sock_orphan(ssk); unlock_sock_fast(ssk, slow);
From: Vlad Buslov vladbu@nvidia.com
[ Upstream commit d5ef190693a7d76c5c192d108e8dec48307b46ee ]
Patch that refactored fl_walk() to use idr_for_each_entry_continue_ul() also removed rcu protection of individual filters which causes following use-after-free when filter is deleted concurrently. Fix fl_walk() to obtain rcu read lock while iterating and taking the filter reference and temporary release the lock while calling arg->fn() callback that can sleep.
KASAN trace:
[ 352.773640] ================================================================== [ 352.775041] BUG: KASAN: use-after-free in fl_walk+0x159/0x240 [cls_flower] [ 352.776304] Read of size 4 at addr ffff8881c8251480 by task tc/2987
[ 352.777862] CPU: 3 PID: 2987 Comm: tc Not tainted 5.15.0-rc2+ #2 [ 352.778980] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 [ 352.781022] Call Trace: [ 352.781573] dump_stack_lvl+0x46/0x5a [ 352.782332] print_address_description.constprop.0+0x1f/0x140 [ 352.783400] ? fl_walk+0x159/0x240 [cls_flower] [ 352.784292] ? fl_walk+0x159/0x240 [cls_flower] [ 352.785138] kasan_report.cold+0x83/0xdf [ 352.785851] ? fl_walk+0x159/0x240 [cls_flower] [ 352.786587] kasan_check_range+0x145/0x1a0 [ 352.787337] fl_walk+0x159/0x240 [cls_flower] [ 352.788163] ? fl_put+0x10/0x10 [cls_flower] [ 352.789007] ? __mutex_unlock_slowpath.constprop.0+0x220/0x220 [ 352.790102] tcf_chain_dump+0x231/0x450 [ 352.790878] ? tcf_chain_tp_delete_empty+0x170/0x170 [ 352.791833] ? __might_sleep+0x2e/0xc0 [ 352.792594] ? tfilter_notify+0x170/0x170 [ 352.793400] ? __mutex_unlock_slowpath.constprop.0+0x220/0x220 [ 352.794477] tc_dump_tfilter+0x385/0x4b0 [ 352.795262] ? tc_new_tfilter+0x1180/0x1180 [ 352.796103] ? __mod_node_page_state+0x1f/0xc0 [ 352.796974] ? __build_skb_around+0x10e/0x130 [ 352.797826] netlink_dump+0x2c0/0x560 [ 352.798563] ? netlink_getsockopt+0x430/0x430 [ 352.799433] ? __mutex_unlock_slowpath.constprop.0+0x220/0x220 [ 352.800542] __netlink_dump_start+0x356/0x440 [ 352.801397] rtnetlink_rcv_msg+0x3ff/0x550 [ 352.802190] ? tc_new_tfilter+0x1180/0x1180 [ 352.802872] ? rtnl_calcit.isra.0+0x1f0/0x1f0 [ 352.803668] ? tc_new_tfilter+0x1180/0x1180 [ 352.804344] ? _copy_from_iter_nocache+0x800/0x800 [ 352.805202] ? kasan_set_track+0x1c/0x30 [ 352.805900] netlink_rcv_skb+0xc6/0x1f0 [ 352.806587] ? rht_deferred_worker+0x6b0/0x6b0 [ 352.807455] ? rtnl_calcit.isra.0+0x1f0/0x1f0 [ 352.808324] ? netlink_ack+0x4d0/0x4d0 [ 352.809086] ? netlink_deliver_tap+0x62/0x3d0 [ 352.809951] netlink_unicast+0x353/0x480 [ 352.810744] ? netlink_attachskb+0x430/0x430 [ 352.811586] ? __alloc_skb+0xd7/0x200 [ 352.812349] netlink_sendmsg+0x396/0x680 [ 352.813132] ? netlink_unicast+0x480/0x480 [ 352.813952] ? __import_iovec+0x192/0x210 [ 352.814759] ? netlink_unicast+0x480/0x480 [ 352.815580] sock_sendmsg+0x6c/0x80 [ 352.816299] ____sys_sendmsg+0x3a5/0x3c0 [ 352.817096] ? kernel_sendmsg+0x30/0x30 [ 352.817873] ? __ia32_sys_recvmmsg+0x150/0x150 [ 352.818753] ___sys_sendmsg+0xd8/0x140 [ 352.819518] ? sendmsg_copy_msghdr+0x110/0x110 [ 352.820402] ? ___sys_recvmsg+0xf4/0x1a0 [ 352.821110] ? __copy_msghdr_from_user+0x260/0x260 [ 352.821934] ? _raw_spin_lock+0x81/0xd0 [ 352.822680] ? __handle_mm_fault+0xef3/0x1b20 [ 352.823549] ? rb_insert_color+0x2a/0x270 [ 352.824373] ? copy_page_range+0x16b0/0x16b0 [ 352.825209] ? perf_event_update_userpage+0x2d0/0x2d0 [ 352.826190] ? __fget_light+0xd9/0xf0 [ 352.826941] __sys_sendmsg+0xb3/0x130 [ 352.827613] ? __sys_sendmsg_sock+0x20/0x20 [ 352.828377] ? do_user_addr_fault+0x2c5/0x8a0 [ 352.829184] ? fpregs_assert_state_consistent+0x52/0x60 [ 352.830001] ? exit_to_user_mode_prepare+0x32/0x160 [ 352.830845] do_syscall_64+0x35/0x80 [ 352.831445] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 352.832331] RIP: 0033:0x7f7bee973c17 [ 352.833078] Code: 0c 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 89 54 24 1c 48 89 74 24 10 [ 352.836202] RSP: 002b:00007ffcbb368e28 EFLAGS: 00000246 ORIG_RAX: 000000000000002e [ 352.837524] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f7bee973c17 [ 352.838715] RDX: 0000000000000000 RSI: 00007ffcbb368e50 RDI: 0000000000000003 [ 352.839838] RBP: 00007ffcbb36d090 R08: 00000000cea96d79 R09: 00007f7beea34a40 [ 352.841021] R10: 00000000004059bb R11: 0000000000000246 R12: 000000000046563f [ 352.842208] R13: 0000000000000000 R14: 0000000000000000 R15: 00007ffcbb36d088
[ 352.843784] Allocated by task 2960: [ 352.844451] kasan_save_stack+0x1b/0x40 [ 352.845173] __kasan_kmalloc+0x7c/0x90 [ 352.845873] fl_change+0x282/0x22db [cls_flower] [ 352.846696] tc_new_tfilter+0x6cf/0x1180 [ 352.847493] rtnetlink_rcv_msg+0x471/0x550 [ 352.848323] netlink_rcv_skb+0xc6/0x1f0 [ 352.849097] netlink_unicast+0x353/0x480 [ 352.849886] netlink_sendmsg+0x396/0x680 [ 352.850678] sock_sendmsg+0x6c/0x80 [ 352.851398] ____sys_sendmsg+0x3a5/0x3c0 [ 352.852202] ___sys_sendmsg+0xd8/0x140 [ 352.852967] __sys_sendmsg+0xb3/0x130 [ 352.853718] do_syscall_64+0x35/0x80 [ 352.854457] entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 352.855830] Freed by task 7: [ 352.856421] kasan_save_stack+0x1b/0x40 [ 352.857139] kasan_set_track+0x1c/0x30 [ 352.857854] kasan_set_free_info+0x20/0x30 [ 352.858609] __kasan_slab_free+0xed/0x130 [ 352.859348] kfree+0xa7/0x3c0 [ 352.859951] process_one_work+0x44d/0x780 [ 352.860685] worker_thread+0x2e2/0x7e0 [ 352.861390] kthread+0x1f4/0x220 [ 352.862022] ret_from_fork+0x1f/0x30
[ 352.862955] Last potentially related work creation: [ 352.863758] kasan_save_stack+0x1b/0x40 [ 352.864378] kasan_record_aux_stack+0xab/0xc0 [ 352.865028] insert_work+0x30/0x160 [ 352.865617] __queue_work+0x351/0x670 [ 352.866261] rcu_work_rcufn+0x30/0x40 [ 352.866917] rcu_core+0x3b2/0xdb0 [ 352.867561] __do_softirq+0xf6/0x386
[ 352.868708] Second to last potentially related work creation: [ 352.869779] kasan_save_stack+0x1b/0x40 [ 352.870560] kasan_record_aux_stack+0xab/0xc0 [ 352.871426] call_rcu+0x5f/0x5c0 [ 352.872108] queue_rcu_work+0x44/0x50 [ 352.872855] __fl_put+0x17c/0x240 [cls_flower] [ 352.873733] fl_delete+0xc7/0x100 [cls_flower] [ 352.874607] tc_del_tfilter+0x510/0xb30 [ 352.886085] rtnetlink_rcv_msg+0x471/0x550 [ 352.886875] netlink_rcv_skb+0xc6/0x1f0 [ 352.887636] netlink_unicast+0x353/0x480 [ 352.888285] netlink_sendmsg+0x396/0x680 [ 352.888942] sock_sendmsg+0x6c/0x80 [ 352.889583] ____sys_sendmsg+0x3a5/0x3c0 [ 352.890311] ___sys_sendmsg+0xd8/0x140 [ 352.891019] __sys_sendmsg+0xb3/0x130 [ 352.891716] do_syscall_64+0x35/0x80 [ 352.892395] entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 352.893666] The buggy address belongs to the object at ffff8881c8251000 which belongs to the cache kmalloc-2k of size 2048 [ 352.895696] The buggy address is located 1152 bytes inside of 2048-byte region [ffff8881c8251000, ffff8881c8251800) [ 352.897640] The buggy address belongs to the page: [ 352.898492] page:00000000213bac35 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x1c8250 [ 352.900110] head:00000000213bac35 order:3 compound_mapcount:0 compound_pincount:0 [ 352.901541] flags: 0x2ffff800010200(slab|head|node=0|zone=2|lastcpupid=0x1ffff) [ 352.902908] raw: 002ffff800010200 0000000000000000 dead000000000122 ffff888100042f00 [ 352.904391] raw: 0000000000000000 0000000000080008 00000001ffffffff 0000000000000000 [ 352.905861] page dumped because: kasan: bad access detected
[ 352.907323] Memory state around the buggy address: [ 352.908218] ffff8881c8251380: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 352.909471] ffff8881c8251400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 352.910735] >ffff8881c8251480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 352.912012] ^ [ 352.912642] ffff8881c8251500: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 352.913919] ffff8881c8251580: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 352.915185] ==================================================================
Fixes: d39d714969cd ("idr: introduce idr_for_each_entry_continue_ul()") Signed-off-by: Vlad Buslov vladbu@nvidia.com Acked-by: Cong Wang cong.wang@bytedance.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/sched/cls_flower.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c index d7869a984881..d2a4e31d963d 100644 --- a/net/sched/cls_flower.c +++ b/net/sched/cls_flower.c @@ -2188,18 +2188,24 @@ static void fl_walk(struct tcf_proto *tp, struct tcf_walker *arg,
arg->count = arg->skip;
+ rcu_read_lock(); idr_for_each_entry_continue_ul(&head->handle_idr, f, tmp, id) { /* don't return filters that are being deleted */ if (!refcount_inc_not_zero(&f->refcnt)) continue; + rcu_read_unlock(); + if (arg->fn(tp, f, arg) < 0) { __fl_put(f); arg->stop = 1; + rcu_read_lock(); break; } __fl_put(f); arg->count++; + rcu_read_lock(); } + rcu_read_unlock(); arg->cookie = id; }
From: Wong Vee Khee vee.khee.wong@linux.intel.com
[ Upstream commit 656ed8b015f19bf3f6e6b3ddd9a4bb4aa5ca73e1 ]
When STMMAC is paired with Energy-Efficient Ethernet(EEE) capable PHY, and the PHY is advertising EEE by default, we need to enable EEE on the xPCS side too, instead of having user to manually trigger the enabling config via ethtool.
Fixed this by adding xpcs_config_eee() call in stmmac_eee_init().
Fixes: 7617af3d1a5e ("net: pcs: Introducing support for DWC xpcs Energy Efficient Ethernet") Cc: Michael Sit Wei Hong michael.wei.hong.sit@intel.com Signed-off-by: Wong Vee Khee vee.khee.wong@linux.intel.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c index 2218bc3a624b..86151a817b79 100644 --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c @@ -486,6 +486,10 @@ bool stmmac_eee_init(struct stmmac_priv *priv) timer_setup(&priv->eee_ctrl_timer, stmmac_eee_ctrl_timer, 0); stmmac_set_eee_timer(priv, priv->hw, STMMAC_DEFAULT_LIT_LS, eee_tw_timer); + if (priv->hw->xpcs) + xpcs_config_eee(priv->hw->xpcs, + priv->plat->mult_fact_100ns, + true); }
if (priv->plat->has_gmac4 && priv->tx_lpi_timer <= STMMAC_ET_MAX) {
From: Eric Dumazet edumazet@google.com
[ Upstream commit 35306eb23814444bd4021f8a1c3047d3cb0c8b2b ]
Jann Horn reported that SO_PEERCRED and SO_PEERGROUPS implementations are racy, as af_unix can concurrently change sk_peer_pid and sk_peer_cred.
In order to fix this issue, this patch adds a new spinlock that needs to be used whenever these fields are read or written.
Jann also pointed out that l2cap_sock_get_peer_pid_cb() is currently reading sk->sk_peer_pid which makes no sense, as this field is only possibly set by AF_UNIX sockets. We will have to clean this in a separate patch. This could be done by reverting b48596d1dc25 "Bluetooth: L2CAP: Add get_peer_pid callback" or implementing what was truly expected.
Fixes: 109f6e39fa07 ("af_unix: Allow SO_PEERCRED to work across namespaces.") Signed-off-by: Eric Dumazet edumazet@google.com Reported-by: Jann Horn jannh@google.com Cc: Eric W. Biederman ebiederm@xmission.com Cc: Luiz Augusto von Dentz luiz.von.dentz@intel.com Cc: Marcel Holtmann marcel@holtmann.org Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- include/net/sock.h | 2 ++ net/core/sock.c | 32 ++++++++++++++++++++++++++------ net/unix/af_unix.c | 34 ++++++++++++++++++++++++++++------ 3 files changed, 56 insertions(+), 12 deletions(-)
diff --git a/include/net/sock.h b/include/net/sock.h index 980b471b569d..db0cb8aa591f 100644 --- a/include/net/sock.h +++ b/include/net/sock.h @@ -487,8 +487,10 @@ struct sock { u8 sk_prefer_busy_poll; u16 sk_busy_poll_budget; #endif + spinlock_t sk_peer_lock; struct pid *sk_peer_pid; const struct cred *sk_peer_cred; + long sk_rcvtimeo; ktime_t sk_stamp; #if BITS_PER_LONG==32 diff --git a/net/core/sock.c b/net/core/sock.c index 1cf0edc79f37..bd1b34b3b778 100644 --- a/net/core/sock.c +++ b/net/core/sock.c @@ -1366,6 +1366,16 @@ int sock_setsockopt(struct socket *sock, int level, int optname, } EXPORT_SYMBOL(sock_setsockopt);
+static const struct cred *sk_get_peer_cred(struct sock *sk) +{ + const struct cred *cred; + + spin_lock(&sk->sk_peer_lock); + cred = get_cred(sk->sk_peer_cred); + spin_unlock(&sk->sk_peer_lock); + + return cred; +}
static void cred_to_ucred(struct pid *pid, const struct cred *cred, struct ucred *ucred) @@ -1542,7 +1552,11 @@ int sock_getsockopt(struct socket *sock, int level, int optname, struct ucred peercred; if (len > sizeof(peercred)) len = sizeof(peercred); + + spin_lock(&sk->sk_peer_lock); cred_to_ucred(sk->sk_peer_pid, sk->sk_peer_cred, &peercred); + spin_unlock(&sk->sk_peer_lock); + if (copy_to_user(optval, &peercred, len)) return -EFAULT; goto lenout; @@ -1550,20 +1564,23 @@ int sock_getsockopt(struct socket *sock, int level, int optname,
case SO_PEERGROUPS: { + const struct cred *cred; int ret, n;
- if (!sk->sk_peer_cred) + cred = sk_get_peer_cred(sk); + if (!cred) return -ENODATA;
- n = sk->sk_peer_cred->group_info->ngroups; + n = cred->group_info->ngroups; if (len < n * sizeof(gid_t)) { len = n * sizeof(gid_t); + put_cred(cred); return put_user(len, optlen) ? -EFAULT : -ERANGE; } len = n * sizeof(gid_t);
- ret = groups_to_user((gid_t __user *)optval, - sk->sk_peer_cred->group_info); + ret = groups_to_user((gid_t __user *)optval, cred->group_info); + put_cred(cred); if (ret) return ret; goto lenout; @@ -1921,9 +1938,10 @@ static void __sk_destruct(struct rcu_head *head) sk->sk_frag.page = NULL; }
- if (sk->sk_peer_cred) - put_cred(sk->sk_peer_cred); + /* We do not need to acquire sk->sk_peer_lock, we are the last user. */ + put_cred(sk->sk_peer_cred); put_pid(sk->sk_peer_pid); + if (likely(sk->sk_net_refcnt)) put_net(sock_net(sk)); sk_prot_free(sk->sk_prot_creator, sk); @@ -3124,6 +3142,8 @@ void sock_init_data(struct socket *sock, struct sock *sk)
sk->sk_peer_pid = NULL; sk->sk_peer_cred = NULL; + spin_lock_init(&sk->sk_peer_lock); + sk->sk_write_pending = 0; sk->sk_rcvlowat = 1; sk->sk_rcvtimeo = MAX_SCHEDULE_TIMEOUT; diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c index 91ff09d833e8..f96ee27d9ff2 100644 --- a/net/unix/af_unix.c +++ b/net/unix/af_unix.c @@ -600,20 +600,42 @@ static void unix_release_sock(struct sock *sk, int embrion)
static void init_peercred(struct sock *sk) { - put_pid(sk->sk_peer_pid); - if (sk->sk_peer_cred) - put_cred(sk->sk_peer_cred); + const struct cred *old_cred; + struct pid *old_pid; + + spin_lock(&sk->sk_peer_lock); + old_pid = sk->sk_peer_pid; + old_cred = sk->sk_peer_cred; sk->sk_peer_pid = get_pid(task_tgid(current)); sk->sk_peer_cred = get_current_cred(); + spin_unlock(&sk->sk_peer_lock); + + put_pid(old_pid); + put_cred(old_cred); }
static void copy_peercred(struct sock *sk, struct sock *peersk) { - put_pid(sk->sk_peer_pid); - if (sk->sk_peer_cred) - put_cred(sk->sk_peer_cred); + const struct cred *old_cred; + struct pid *old_pid; + + if (sk < peersk) { + spin_lock(&sk->sk_peer_lock); + spin_lock_nested(&peersk->sk_peer_lock, SINGLE_DEPTH_NESTING); + } else { + spin_lock(&peersk->sk_peer_lock); + spin_lock_nested(&sk->sk_peer_lock, SINGLE_DEPTH_NESTING); + } + old_pid = sk->sk_peer_pid; + old_cred = sk->sk_peer_cred; sk->sk_peer_pid = get_pid(peersk->sk_peer_pid); sk->sk_peer_cred = get_cred(peersk->sk_peer_cred); + + spin_unlock(&sk->sk_peer_lock); + spin_unlock(&peersk->sk_peer_lock); + + put_pid(old_pid); + put_cred(old_cred); }
static int unix_listen(struct socket *sock, int backlog)
From: Peter Zijlstra peterz@infradead.org
[ Upstream commit 24ff652573754fe4c03213ebd26b17e86842feb3 ]
Occasionally objtool encounters symbol (as opposed to section) relocations in .altinstructions. Typically they are the alternatives written by elf_add_alternative() as encountered on a noinstr validation run on vmlinux after having already ran objtool on the individual .o files.
Basically this is the counterpart of commit 44f6a7c0755d ("objtool: Fix seg fault with Clang non-section symbols"), because when these new assemblers (binutils now also does this) strip the section symbols, elf_add_reloc_to_insn() is forced to emit symbol based relocations.
As such, teach get_alt_entry() about different relocation types.
Fixes: 9bc0bb50727c ("objtool/x86: Rewrite retpoline thunk calls") Reported-by: Stephen Rothwell sfr@canb.auug.org.au Reported-by: Borislav Petkov bp@alien8.de Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Acked-by: Josh Poimboeuf jpoimboe@redhat.com Tested-by: Nathan Chancellor nathan@kernel.org Link: https://lore.kernel.org/r/YVWUvknIEVNkPvnP@hirez.programming.kicks-ass.net Signed-off-by: Sasha Levin sashal@kernel.org --- tools/objtool/special.c | 32 +++++++++++++++++++++++++------- 1 file changed, 25 insertions(+), 7 deletions(-)
diff --git a/tools/objtool/special.c b/tools/objtool/special.c index bc925cf19e2d..f58ecc50fb10 100644 --- a/tools/objtool/special.c +++ b/tools/objtool/special.c @@ -58,6 +58,24 @@ void __weak arch_handle_alternative(unsigned short feature, struct special_alt * { }
+static bool reloc2sec_off(struct reloc *reloc, struct section **sec, unsigned long *off) +{ + switch (reloc->sym->type) { + case STT_FUNC: + *sec = reloc->sym->sec; + *off = reloc->sym->offset + reloc->addend; + return true; + + case STT_SECTION: + *sec = reloc->sym->sec; + *off = reloc->addend; + return true; + + default: + return false; + } +} + static int get_alt_entry(struct elf *elf, struct special_entry *entry, struct section *sec, int idx, struct special_alt *alt) @@ -91,15 +109,12 @@ static int get_alt_entry(struct elf *elf, struct special_entry *entry, WARN_FUNC("can't find orig reloc", sec, offset + entry->orig); return -1; } - if (orig_reloc->sym->type != STT_SECTION) { - WARN_FUNC("don't know how to handle non-section reloc symbol %s", + if (!reloc2sec_off(orig_reloc, &alt->orig_sec, &alt->orig_off)) { + WARN_FUNC("don't know how to handle reloc symbol type: %s", sec, offset + entry->orig, orig_reloc->sym->name); return -1; }
- alt->orig_sec = orig_reloc->sym->sec; - alt->orig_off = orig_reloc->addend; - if (!entry->group || alt->new_len) { new_reloc = find_reloc_by_dest(elf, sec, offset + entry->new); if (!new_reloc) { @@ -116,8 +131,11 @@ static int get_alt_entry(struct elf *elf, struct special_entry *entry, if (arch_is_retpoline(new_reloc->sym)) return 1;
- alt->new_sec = new_reloc->sym->sec; - alt->new_off = (unsigned int)new_reloc->addend; + if (!reloc2sec_off(new_reloc, &alt->new_sec, &alt->new_off)) { + WARN_FUNC("don't know how to handle reloc symbol type: %s", + sec, offset + entry->new, new_reloc->sym->name); + return -1; + }
/* _ASM_EXTABLE_EX hack */ if (alt->new_off >= 0x7ffffff0)
From: Kan Liang kan.liang@linux.intel.com
[ Upstream commit ecc2123e09f9e71ddc6c53d71e283b8ada685fe2 ]
According to the latest event list, the event encoding 0xEF is only available on the first 4 counters. Add it into the event constraints table.
Fixes: 6017608936c1 ("perf/x86/intel: Add Icelake support") Signed-off-by: Kan Liang kan.liang@linux.intel.com Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Link: https://lkml.kernel.org/r/1632842343-25862-1-git-send-email-kan.liang@linux.... Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/events/intel/core.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c index ac6fd2dabf6a..482224444a1e 100644 --- a/arch/x86/events/intel/core.c +++ b/arch/x86/events/intel/core.c @@ -263,6 +263,7 @@ static struct event_constraint intel_icl_event_constraints[] = { INTEL_EVENT_CONSTRAINT_RANGE(0xa8, 0xb0, 0xf), INTEL_EVENT_CONSTRAINT_RANGE(0xb7, 0xbd, 0xf), INTEL_EVENT_CONSTRAINT_RANGE(0xd0, 0xe6, 0xf), + INTEL_EVENT_CONSTRAINT(0xef, 0xf), INTEL_EVENT_CONSTRAINT_RANGE(0xf0, 0xf4, 0xf), EVENT_CONSTRAINT_END };
From: Michal Koutný mkoutny@suse.com
[ Upstream commit 2630cde26711dab0d0b56a8be1616475be646d13 ]
Since commit a7b359fc6a37 ("sched/fair: Correctly insert cfs_rq's to list on unthrottle") we add cfs_rqs with no runnable tasks but not fully decayed into the load (leaf) list. We may ignore adding some ancestors and therefore breaking tmp_alone_branch invariant. This broke LTP test cfs_bandwidth01 and it was partially fixed in commit fdaba61ef8a2 ("sched/fair: Ensure that the CFS parent is added after unthrottling").
I noticed the named test still fails even with the fix (but with low probability, 1 in ~1000 executions of the test). The reason is when bailing out of unthrottle_cfs_rq early, we may miss adding ancestors of the unthrottled cfs_rq, thus, not joining tmp_alone_branch properly.
Fix this by adding ancestors if we notice the unthrottled cfs_rq was added to the load list.
Fixes: a7b359fc6a37 ("sched/fair: Correctly insert cfs_rq's to list on unthrottle") Signed-off-by: Michal Koutný mkoutny@suse.com Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Reviewed-by: Vincent Guittot vincent.guittot@linaro.org Reviewed-by: Odin Ugedal odin@uged.al Link: https://lore.kernel.org/r/20210917153037.11176-1-mkoutny@suse.com Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/sched/fair.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index 30a6984a58f7..423ec671a306 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -4898,8 +4898,12 @@ void unthrottle_cfs_rq(struct cfs_rq *cfs_rq) /* update hierarchical throttle state */ walk_tg_tree_from(cfs_rq->tg, tg_nop, tg_unthrottle_up, (void *)rq);
- if (!cfs_rq->load.weight) + /* Nothing to run but something to decay (on_list)? Complete the branch */ + if (!cfs_rq->load.weight) { + if (cfs_rq->on_list) + goto unthrottle_throttle; return; + }
task_delta = cfs_rq->h_nr_running; idle_task_delta = cfs_rq->idle_h_nr_running;
From: Mel Gorman mgorman@techsingularity.net
[ Upstream commit 703066188f63d66cc6b9d678e5b5ef1213c5938e ]
This patch null-terminates the temporary buffer in sched_scaling_write() so kstrtouint() does not return failure and checks the value is valid.
Before: $ cat /sys/kernel/debug/sched/tunable_scaling 1 $ echo 0 > /sys/kernel/debug/sched/tunable_scaling -bash: echo: write error: Invalid argument $ cat /sys/kernel/debug/sched/tunable_scaling 1
After: $ cat /sys/kernel/debug/sched/tunable_scaling 1 $ echo 0 > /sys/kernel/debug/sched/tunable_scaling $ cat /sys/kernel/debug/sched/tunable_scaling 0 $ echo 3 > /sys/kernel/debug/sched/tunable_scaling -bash: echo: write error: Invalid argument
Fixes: 8a99b6833c88 ("sched: Move SCHED_DEBUG sysctl to debugfs") Signed-off-by: Mel Gorman mgorman@techsingularity.net Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Acked-by: Vincent Guittot vincent.guittot@linaro.org Link: https://lore.kernel.org/r/20210927114635.GH3959@techsingularity.net Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/sched/debug.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/kernel/sched/debug.c b/kernel/sched/debug.c index 7e08e3d947c2..2c879cd02a5f 100644 --- a/kernel/sched/debug.c +++ b/kernel/sched/debug.c @@ -173,16 +173,22 @@ static ssize_t sched_scaling_write(struct file *filp, const char __user *ubuf, size_t cnt, loff_t *ppos) { char buf[16]; + unsigned int scaling;
if (cnt > 15) cnt = 15;
if (copy_from_user(&buf, ubuf, cnt)) return -EFAULT; + buf[cnt] = '\0';
- if (kstrtouint(buf, 10, &sysctl_sched_tunable_scaling)) + if (kstrtouint(buf, 10, &scaling)) return -EINVAL;
+ if (scaling >= SCHED_TUNABLESCALING_END) + return -EINVAL; + + sysctl_sched_tunable_scaling = scaling; if (sched_update_scaling()) return -EINVAL;
From: Eddie James eajames@linux.ibm.com
[ Upstream commit ffa2600044979aff4bd6238edb9af815a47d7c32 ]
The P10 (temp sensor version 0x10) doesn't do the same VRM status reporting that was used on P9. It just reports the temperature, so drop the check for VRM fru type in the sysfs show function, and don't set the name to "alarm".
Fixes: db4919ec86 ("hwmon: (occ) Add new temperature sensor type") Signed-off-by: Eddie James eajames@linux.ibm.com Link: https://lore.kernel.org/r/20210929153604.14968-1-eajames@linux.ibm.com Signed-off-by: Guenter Roeck linux@roeck-us.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/hwmon/occ/common.c | 17 +++++------------ 1 file changed, 5 insertions(+), 12 deletions(-)
diff --git a/drivers/hwmon/occ/common.c b/drivers/hwmon/occ/common.c index 0d68a78be980..ae664613289c 100644 --- a/drivers/hwmon/occ/common.c +++ b/drivers/hwmon/occ/common.c @@ -340,18 +340,11 @@ static ssize_t occ_show_temp_10(struct device *dev, if (val == OCC_TEMP_SENSOR_FAULT) return -EREMOTEIO;
- /* - * VRM doesn't return temperature, only alarm bit. This - * attribute maps to tempX_alarm instead of tempX_input for - * VRM - */ - if (temp->fru_type != OCC_FRU_TYPE_VRM) { - /* sensor not ready */ - if (val == 0) - return -EAGAIN; + /* sensor not ready */ + if (val == 0) + return -EAGAIN;
- val *= 1000; - } + val *= 1000; break; case 2: val = temp->fru_type; @@ -886,7 +879,7 @@ static int occ_setup_sensor_attrs(struct occ *occ) 0, i); attr++;
- if (sensors->temp.version > 1 && + if (sensors->temp.version == 2 && temp->fru_type == OCC_FRU_TYPE_VRM) { snprintf(attr->name, sizeof(attr->name), "temp%d_alarm", s);
From: Vadim Pasternak vadimp@nvidia.com
[ Upstream commit 2292e2f685cd5c65e3f47bbcf9f469513acc3195 ]
Add missed attribute for reading POUT from page 1. It is supported by device, but has been missed in initial commit.
Fixes: 2c6fcbb21149 ("hwmon: (pmbus) Add support for MPS Multi-phase mp2975 controller") Signed-off-by: Vadim Pasternak vadimp@nvidia.com Link: https://lore.kernel.org/r/20210927070740.2149290-1-vadimp@nvidia.com Signed-off-by: Guenter Roeck linux@roeck-us.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/hwmon/pmbus/mp2975.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/hwmon/pmbus/mp2975.c b/drivers/hwmon/pmbus/mp2975.c index eb94bd5f4e2a..51986adfbf47 100644 --- a/drivers/hwmon/pmbus/mp2975.c +++ b/drivers/hwmon/pmbus/mp2975.c @@ -54,7 +54,7 @@
#define MP2975_RAIL2_FUNC (PMBUS_HAVE_VOUT | PMBUS_HAVE_STATUS_VOUT | \ PMBUS_HAVE_IOUT | PMBUS_HAVE_STATUS_IOUT | \ - PMBUS_PHASE_VIRTUAL) + PMBUS_HAVE_POUT | PMBUS_PHASE_VIRTUAL)
struct mp2975_data { struct pmbus_driver_info info;
From: Linus Torvalds torvalds@linux-foundation.org
[ Upstream commit 291073a566b2094c7192872cc0f17ce73d83cb76 ]
The recent change to make objtool aware of more symbol relocation types (commit 24ff65257375: "objtool: Teach get_alt_entry() about more relocation types") also added another check, and resulted in this objtool warning when building kvm on x86:
arch/x86/kvm/emulate.o: warning: objtool: __ex_table+0x4: don't know how to handle reloc symbol type: kvm_fastop_exception
The reason seems to be that kvm_fastop_exception() is marked as a global symbol, which causes the relocation to ke kept around for objtool. And at the same time, the kvm_fastop_exception definition (which is done as an inline asm statement) doesn't actually set the type of the global, which then makes objtool unhappy.
The minimal fix is to just not mark kvm_fastop_exception as being a global symbol. It's only used in that one compilation unit anyway, so it was always pointless. That's how all the other local exception table labels are done.
I'm not entirely happy about the kinds of games that the kvm code plays with doing its own exception handling, and the fact that it confused objtool is most definitely a symptom of the code being a bit too subtle and ad-hoc. But at least this trivial one-liner makes objtool no longer upset about what is going on.
Fixes: 24ff65257375 ("objtool: Teach get_alt_entry() about more relocation types") Link: https://lore.kernel.org/lkml/CAHk-=wiZwq-0LknKhXN4M+T8jbxn_2i9mcKpO+OaBSSq_E... Cc: Borislav Petkov bp@suse.de Cc: Paolo Bonzini pbonzini@redhat.com Cc: Sean Christopherson seanjc@google.com Cc: Vitaly Kuznetsov vkuznets@redhat.com Cc: Wanpeng Li wanpengli@tencent.com Cc: Jim Mattson jmattson@google.com Cc: Joerg Roedel joro@8bytes.org Cc: Peter Zijlstra peterz@infradead.org Cc: Josh Poimboeuf jpoimboe@redhat.com Cc: Nathan Chancellor nathan@kernel.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/kvm/emulate.c | 1 - 1 file changed, 1 deletion(-)
diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c index 2837110e66ed..50050d06672b 100644 --- a/arch/x86/kvm/emulate.c +++ b/arch/x86/kvm/emulate.c @@ -435,7 +435,6 @@ static int fastop(struct x86_emulate_ctxt *ctxt, fastop_t fop); __FOP_RET(#op)
asm(".pushsection .fixup, "ax"\n" - ".global kvm_fastop_exception \n" "kvm_fastop_exception: xor %esi, %esi; ret\n" ".popsection");
From: Keith Busch kbusch@kernel.org
commit a2941f6aa71a72be2c82c0a168523a492d093530 upstream.
Some apple controllers use the command id as an index to implementation specific data structures and will fail if the value is out of bounds. The nvme driver's recently introduced command sequence number breaks this controller.
Provide a quirk so these spec incompliant controllers can function as before. The driver will not have the ability to detect bad completions when this quirk is used, but we weren't previously checking this anyway.
The quirk bit was selected so that it can readily apply to stable.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=214509 Cc: Sven Peter sven@svenpeter.dev Reported-by: Orlando Chamberlain redecorating@protonmail.com Reported-by: Aditya Garg gargaditya08@live.com Signed-off-by: Keith Busch kbusch@kernel.org Reviewed-by: Christoph Hellwig hch@lst.de Tested-by: Sven Peter sven@svenpeter.dev Link: https://lore.kernel.org/r/20210927154306.387437-1-kbusch@kernel.org Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/nvme/host/core.c | 4 +++- drivers/nvme/host/nvme.h | 6 ++++++ drivers/nvme/host/pci.c | 3 ++- 3 files changed, 11 insertions(+), 2 deletions(-)
--- a/drivers/nvme/host/core.c +++ b/drivers/nvme/host/core.c @@ -980,6 +980,7 @@ EXPORT_SYMBOL_GPL(nvme_cleanup_cmd); blk_status_t nvme_setup_cmd(struct nvme_ns *ns, struct request *req) { struct nvme_command *cmd = nvme_req(req)->cmd; + struct nvme_ctrl *ctrl = nvme_req(req)->ctrl; blk_status_t ret = BLK_STS_OK;
if (!(req->rq_flags & RQF_DONTPREP)) { @@ -1028,7 +1029,8 @@ blk_status_t nvme_setup_cmd(struct nvme_ return BLK_STS_IOERR; }
- nvme_req(req)->genctr++; + if (!(ctrl->quirks & NVME_QUIRK_SKIP_CID_GEN)) + nvme_req(req)->genctr++; cmd->common.command_id = nvme_cid(req); trace_nvme_setup_cmd(req, cmd); return ret; --- a/drivers/nvme/host/nvme.h +++ b/drivers/nvme/host/nvme.h @@ -149,6 +149,12 @@ enum nvme_quirks { * 48 bits. */ NVME_QUIRK_DMA_ADDRESS_BITS_48 = (1 << 16), + + /* + * The controller requires the command_id value be be limited, so skip + * encoding the generation sequence number. + */ + NVME_QUIRK_SKIP_CID_GEN = (1 << 17), };
/* --- a/drivers/nvme/host/pci.c +++ b/drivers/nvme/host/pci.c @@ -3282,7 +3282,8 @@ static const struct pci_device_id nvme_i { PCI_DEVICE(PCI_VENDOR_ID_APPLE, 0x2005), .driver_data = NVME_QUIRK_SINGLE_VECTOR | NVME_QUIRK_128_BYTES_SQES | - NVME_QUIRK_SHARED_TAGS }, + NVME_QUIRK_SHARED_TAGS | + NVME_QUIRK_SKIP_CID_GEN },
{ PCI_DEVICE_CLASS(PCI_CLASS_STORAGE_EXPRESS, 0xffffff) }, { 0, }
From: Chen Jingwen chenjingwen6@huawei.com
commit 9b2f72cc0aa4bb444541bb87581c35b7508b37d3 upstream.
In commit b212921b13bd ("elf: don't use MAP_FIXED_NOREPLACE for elf executable mappings") we still leave MAP_FIXED_NOREPLACE in place for load_elf_interp.
Unfortunately, this will cause kernel to fail to start with:
1 (init): Uhuuh, elf segment at 00003ffff7ffd000 requested but the memory is mapped already Failed to execute /init (error -17)
The reason is that the elf interpreter (ld.so) has overlapping segments.
readelf -l ld-2.31.so Program Headers: Type Offset VirtAddr PhysAddr FileSiz MemSiz Flags Align LOAD 0x0000000000000000 0x0000000000000000 0x0000000000000000 0x000000000002c94c 0x000000000002c94c R E 0x10000 LOAD 0x000000000002dae0 0x000000000003dae0 0x000000000003dae0 0x00000000000021e8 0x0000000000002320 RW 0x10000 LOAD 0x000000000002fe00 0x000000000003fe00 0x000000000003fe00 0x00000000000011ac 0x0000000000001328 RW 0x10000
The reason for this problem is the same as described in commit ad55eac74f20 ("elf: enforce MAP_FIXED on overlaying elf segments").
Not only executable binaries, elf interpreters (e.g. ld.so) can have overlapping elf segments, so we better drop MAP_FIXED_NOREPLACE and go back to MAP_FIXED in load_elf_interp.
Fixes: 4ed28639519c ("fs, elf: drop MAP_FIXED usage from elf_map") Cc: stable@vger.kernel.org # v4.19 Cc: Andrew Morton akpm@linux-foundation.org Cc: Michal Hocko mhocko@suse.com Signed-off-by: Chen Jingwen chenjingwen6@huawei.com Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/binfmt_elf.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/fs/binfmt_elf.c +++ b/fs/binfmt_elf.c @@ -630,7 +630,7 @@ static unsigned long load_elf_interp(str
vaddr = eppnt->p_vaddr; if (interp_elf_ex->e_type == ET_EXEC || load_addr_set) - elf_type |= MAP_FIXED_NOREPLACE; + elf_type |= MAP_FIXED; else if (no_base && interp_elf_ex->e_type == ET_DYN) load_addr = -vaddr;
From: Saravana Kannan saravanak@google.com
commit 2de9d8e0d2fe3a1eb632def2245529067cb35db5 upstream.
When we have a dependency of the form:
Device-A -> Device-C Device-B
Device-C -> Device-B
Where, * Indentation denotes "child of" parent in previous line. * X -> Y denotes X is consumer of Y based on firmware (Eg: DT).
We have cyclic dependency: device-A -> device-C -> device-B -> device-A
fw_devlink current treats device-C -> device-B dependency as an invalid dependency and doesn't enforce it but leaves the rest of the dependencies as is.
While the current behavior is necessary, it is not sufficient if the false dependency in this example is actually device-A -> device-C. When this is the case, device-C will correctly probe defer waiting for device-B to be added, but device-A will be incorrectly probe deferred by fw_devlink waiting on device-C to probe successfully. Due to this, none of the devices in the cycle will end up probing.
To fix this, we need to go relax all the dependencies in the cycle like we already do in the other instances where fw_devlink detects cycles. A real world example of this was reported[1] and analyzed[2].
[1] - https://lore.kernel.org/lkml/0a2c4106-7f48-2bb5-048e-8c001a7c3fda@samsung.co... [2] - https://lore.kernel.org/lkml/CAGETcx8peaew90SWiux=TyvuGgvTQOmO4BFALz7aj0Za5Q...
Fixes: f9aa460672c9 ("driver core: Refactor fw_devlink feature") Cc: stable stable@vger.kernel.org Reported-by: Marek Szyprowski m.szyprowski@samsung.com Tested-by: Marek Szyprowski m.szyprowski@samsung.com Signed-off-by: Saravana Kannan saravanak@google.com Link: https://lore.kernel.org/r/20210915170940.617415-2-saravanak@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/base/core.c | 17 ++++++++++++----- 1 file changed, 12 insertions(+), 5 deletions(-)
--- a/drivers/base/core.c +++ b/drivers/base/core.c @@ -1790,14 +1790,21 @@ static int fw_devlink_create_devlink(str * be broken by applying logic. Check for these types of cycles and * break them so that devices in the cycle probe properly. * - * If the supplier's parent is dependent on the consumer, then - * the consumer-supplier dependency is a false dependency. So, - * treat it as an invalid link. + * If the supplier's parent is dependent on the consumer, then the + * consumer and supplier have a cyclic dependency. Since fw_devlink + * can't tell which of the inferred dependencies are incorrect, don't + * enforce probe ordering between any of the devices in this cyclic + * dependency. Do this by relaxing all the fw_devlink device links in + * this cycle and by treating the fwnode link between the consumer and + * the supplier as an invalid dependency. */ sup_dev = fwnode_get_next_parent_dev(sup_handle); if (sup_dev && device_is_dependent(con, sup_dev)) { - dev_dbg(con, "Not linking to %pfwP - False link\n", - sup_handle); + dev_info(con, "Fixing up cyclic dependency with %pfwP (%s)\n", + sup_handle, dev_name(sup_dev)); + device_links_write_lock(); + fw_devlink_relax_cycle(con, sup_dev); + device_links_write_unlock(); ret = -EINVAL; } else { /*
From: Nirmoy Das nirmoy.das@amd.com
commit af505cad9567f7a500d34bf183696d570d7f6810 upstream.
debugfs_create_file() returns encoded error so use IS_ERR for checking return value.
Reviewed-by: Christian König christian.koenig@amd.com Signed-off-by: Nirmoy Das nirmoy.das@amd.com Fixes: ff9fb72bc077 ("debugfs: return error values, not NULL") Cc: stable stable@vger.kernel.org Link: https://lore.kernel.org/r/20210902102917.2233-1-nirmoy.das@amd.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/debugfs/inode.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/fs/debugfs/inode.c +++ b/fs/debugfs/inode.c @@ -528,7 +528,7 @@ void debugfs_create_file_size(const char { struct dentry *de = debugfs_create_file(name, mode, parent, data, fops);
- if (de) + if (!IS_ERR(de)) d_inode(de)->i_size = file_size; } EXPORT_SYMBOL_GPL(debugfs_create_file_size);
From: Johan Hovold johan@kernel.org
commit a89936cce87d60766a75732a9e7e25c51164f47c upstream.
The tty driver name is used also after registering the driver and must specifically not be allocated on the stack to avoid leaking information to user space (or triggering an oops).
Drivers should not try to encode topology information in the tty device name but this one snuck in through staging without anyone noticing and another driver has since copied this malpractice.
Fixing the ABI is a separate issue, but this at least plugs the security hole.
Fixes: ba4dc61fe8c5 ("Staging: ipack: add support for IP-OCTAL mezzanine board") Cc: stable@vger.kernel.org # 3.5 Acked-by: Samuel Iglesias Gonsalvez siglesias@igalia.com Signed-off-by: Johan Hovold johan@kernel.org Link: https://lore.kernel.org/r/20210917114622.5412-2-johan@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/ipack/devices/ipoctal.c | 19 ++++++++++++++----- 1 file changed, 14 insertions(+), 5 deletions(-)
--- a/drivers/ipack/devices/ipoctal.c +++ b/drivers/ipack/devices/ipoctal.c @@ -264,7 +264,6 @@ static int ipoctal_inst_slot(struct ipoc int res; int i; struct tty_driver *tty; - char name[20]; struct ipoctal_channel *channel; struct ipack_region *region; void __iomem *addr; @@ -355,8 +354,11 @@ static int ipoctal_inst_slot(struct ipoc /* Fill struct tty_driver with ipoctal data */ tty->owner = THIS_MODULE; tty->driver_name = KBUILD_MODNAME; - sprintf(name, KBUILD_MODNAME ".%d.%d.", bus_nr, slot); - tty->name = name; + tty->name = kasprintf(GFP_KERNEL, KBUILD_MODNAME ".%d.%d.", bus_nr, slot); + if (!tty->name) { + res = -ENOMEM; + goto err_put_driver; + } tty->major = 0;
tty->minor_start = 0; @@ -372,8 +374,7 @@ static int ipoctal_inst_slot(struct ipoc res = tty_register_driver(tty); if (res) { dev_err(&ipoctal->dev->dev, "Can't register tty driver.\n"); - put_tty_driver(tty); - return res; + goto err_free_name; }
/* Save struct tty_driver for use it when uninstalling the device */ @@ -410,6 +411,13 @@ static int ipoctal_inst_slot(struct ipoc ipoctal_irq_handler, ipoctal);
return 0; + +err_free_name: + kfree(tty->name); +err_put_driver: + put_tty_driver(tty); + + return res; }
static inline int ipoctal_copy_write_buffer(struct ipoctal_channel *channel, @@ -697,6 +705,7 @@ static void __ipoctal_remove(struct ipoc }
tty_unregister_driver(ipoctal->tty_drv); + kfree(ipoctal->tty_drv->name); put_tty_driver(ipoctal->tty_drv); kfree(ipoctal); }
From: Johan Hovold johan@kernel.org
commit 65c001df517a7bf9be8621b53d43c89f426ce8d6 upstream.
Make sure to set the tty class-device driver data before registering the tty to avoid having a racing open() dereference a NULL pointer.
Fixes: 9c1d784afc6f ("Staging: ipack/devices/ipoctal: Get rid of ipoctal_list.") Cc: stable@vger.kernel.org # 3.7 Acked-by: Samuel Iglesias Gonsalvez siglesias@igalia.com Signed-off-by: Johan Hovold johan@kernel.org Link: https://lore.kernel.org/r/20210917114622.5412-3-johan@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/ipack/devices/ipoctal.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/ipack/devices/ipoctal.c +++ b/drivers/ipack/devices/ipoctal.c @@ -393,13 +393,13 @@ static int ipoctal_inst_slot(struct ipoc spin_lock_init(&channel->lock); channel->pointer_read = 0; channel->pointer_write = 0; - tty_dev = tty_port_register_device(&channel->tty_port, tty, i, NULL); + tty_dev = tty_port_register_device_attr(&channel->tty_port, tty, + i, NULL, channel, NULL); if (IS_ERR(tty_dev)) { dev_err(&ipoctal->dev->dev, "Failed to register tty device.\n"); tty_port_destroy(&channel->tty_port); continue; } - dev_set_drvdata(tty_dev, channel); }
/*
From: Johan Hovold johan@kernel.org
commit cd20d59291d1790dc74248476e928f57fc455189 upstream.
Registration of the ipoctal tty devices is unlikely to fail, but if it ever does, make sure not to deregister a never registered tty device (and dereference a NULL pointer) when the driver is later unbound.
Fixes: 2afb41d9d30d ("Staging: ipack/devices/ipoctal: Check tty_register_device return value.") Cc: stable@vger.kernel.org # 3.7 Acked-by: Samuel Iglesias Gonsalvez siglesias@igalia.com Signed-off-by: Johan Hovold johan@kernel.org Link: https://lore.kernel.org/r/20210917114622.5412-4-johan@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/ipack/devices/ipoctal.c | 7 +++++++ 1 file changed, 7 insertions(+)
--- a/drivers/ipack/devices/ipoctal.c +++ b/drivers/ipack/devices/ipoctal.c @@ -33,6 +33,7 @@ struct ipoctal_channel { unsigned int pointer_read; unsigned int pointer_write; struct tty_port tty_port; + bool tty_registered; union scc2698_channel __iomem *regs; union scc2698_block __iomem *block_regs; unsigned int board_id; @@ -397,9 +398,11 @@ static int ipoctal_inst_slot(struct ipoc i, NULL, channel, NULL); if (IS_ERR(tty_dev)) { dev_err(&ipoctal->dev->dev, "Failed to register tty device.\n"); + tty_port_free_xmit_buf(&channel->tty_port); tty_port_destroy(&channel->tty_port); continue; } + channel->tty_registered = true; }
/* @@ -699,6 +702,10 @@ static void __ipoctal_remove(struct ipoc
for (i = 0; i < NR_CHANNELS; i++) { struct ipoctal_channel *channel = &ipoctal->channel[i]; + + if (!channel->tty_registered) + continue; + tty_unregister_device(ipoctal->tty_drv, i); tty_port_free_xmit_buf(&channel->tty_port); tty_port_destroy(&channel->tty_port);
From: Johan Hovold johan@kernel.org
commit 445c8132727728dc297492a7d9fc074af3e94ba3 upstream.
Add the missing error handling when allocating the transmit buffer to avoid dereferencing a NULL pointer in write() should the allocation ever fail.
Fixes: ba4dc61fe8c5 ("Staging: ipack: add support for IP-OCTAL mezzanine board") Cc: stable@vger.kernel.org # 3.5 Acked-by: Samuel Iglesias Gonsalvez siglesias@igalia.com Signed-off-by: Johan Hovold johan@kernel.org Link: https://lore.kernel.org/r/20210917114622.5412-5-johan@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/ipack/devices/ipoctal.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
--- a/drivers/ipack/devices/ipoctal.c +++ b/drivers/ipack/devices/ipoctal.c @@ -386,7 +386,9 @@ static int ipoctal_inst_slot(struct ipoc
channel = &ipoctal->channel[i]; tty_port_init(&channel->tty_port); - tty_port_alloc_xmit_buf(&channel->tty_port); + res = tty_port_alloc_xmit_buf(&channel->tty_port); + if (res) + continue; channel->tty_port.ops = &ipoctal_tty_port_ops;
ipoctal_reset_stats(&channel->stats);
From: Johan Hovold johan@kernel.org
commit bb8a4fcb2136508224c596a7e665bdba1d7c3c27 upstream.
A reference to the carrier module was taken on every open but was only released once when the final reference to the tty struct was dropped.
Fix this by taking the module reference and initialising the tty driver data when installing the tty.
Fixes: 82a82340bab6 ("ipoctal: get carrier driver to avoid rmmod") Cc: stable@vger.kernel.org # 3.18 Cc: Federico Vaga federico.vaga@cern.ch Acked-by: Samuel Iglesias Gonsalvez siglesias@igalia.com Signed-off-by: Johan Hovold johan@kernel.org Link: https://lore.kernel.org/r/20210917114622.5412-6-johan@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/ipack/devices/ipoctal.c | 29 +++++++++++++++++++++-------- 1 file changed, 21 insertions(+), 8 deletions(-)
--- a/drivers/ipack/devices/ipoctal.c +++ b/drivers/ipack/devices/ipoctal.c @@ -82,22 +82,34 @@ static int ipoctal_port_activate(struct return 0; }
-static int ipoctal_open(struct tty_struct *tty, struct file *file) +static int ipoctal_install(struct tty_driver *driver, struct tty_struct *tty) { struct ipoctal_channel *channel = dev_get_drvdata(tty->dev); struct ipoctal *ipoctal = chan_to_ipoctal(channel, tty->index); - int err; - - tty->driver_data = channel; + int res;
if (!ipack_get_carrier(ipoctal->dev)) return -EBUSY;
- err = tty_port_open(&channel->tty_port, tty, file); - if (err) - ipack_put_carrier(ipoctal->dev); + res = tty_standard_install(driver, tty); + if (res) + goto err_put_carrier; + + tty->driver_data = channel; + + return 0; + +err_put_carrier: + ipack_put_carrier(ipoctal->dev); + + return res; +} + +static int ipoctal_open(struct tty_struct *tty, struct file *file) +{ + struct ipoctal_channel *channel = tty->driver_data;
- return err; + return tty_port_open(&channel->tty_port, tty, file); }
static void ipoctal_reset_stats(struct ipoctal_stats *stats) @@ -662,6 +674,7 @@ static void ipoctal_cleanup(struct tty_s
static const struct tty_operations ipoctal_fops = { .ioctl = NULL, + .install = ipoctal_install, .open = ipoctal_open, .close = ipoctal_close, .write = ipoctal_write_tty,
From: Ritesh Harjani riteshh@linux.ibm.com
commit 75ca6ad408f459f00b09a64f04c774559848c097 upstream.
We should use unsigned long long rather than loff_t to avoid overflow in ext4_max_bitmap_size() for comparison before returning. w/o this patch sbi->s_bitmap_maxbytes was becoming a negative value due to overflow of upper_limit (with has_huge_files as true)
Below is a quick test to trigger it on a 64KB pagesize system.
sudo mkfs.ext4 -b 65536 -O ^has_extents,^64bit /dev/loop2 sudo mount /dev/loop2 /mnt sudo echo "hello" > /mnt/hello -> This will error out with "echo: write error: File too large"
Signed-off-by: Ritesh Harjani riteshh@linux.ibm.com Reviewed-by: Jan Kara jack@suse.cz Signed-off-by: Theodore Ts'o tytso@mit.edu Cc: stable@kernel.org Link: https://lore.kernel.org/r/594f409e2c543e90fd836b78188dfa5c575065ba.162286759... Signed-off-by: Theodore Ts'o tytso@mit.edu Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ext4/super.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-)
--- a/fs/ext4/super.c +++ b/fs/ext4/super.c @@ -3185,17 +3185,17 @@ static loff_t ext4_max_size(int blkbits, */ static loff_t ext4_max_bitmap_size(int bits, int has_huge_files) { - loff_t res = EXT4_NDIR_BLOCKS; + unsigned long long upper_limit, res = EXT4_NDIR_BLOCKS; int meta_blocks; - loff_t upper_limit; - /* This is calculated to be the largest file size for a dense, block + + /* + * This is calculated to be the largest file size for a dense, block * mapped file such that the file's total number of 512-byte sectors, * including data and all indirect blocks, does not exceed (2^48 - 1). * * __u32 i_blocks_lo and _u16 i_blocks_high represent the total * number of 512-byte sectors of the file. */ - if (!has_huge_files) { /* * !has_huge_files or implies that the inode i_block field @@ -3238,7 +3238,7 @@ static loff_t ext4_max_bitmap_size(int b if (res > MAX_LFS_FILESIZE) res = MAX_LFS_FILESIZE;
- return res; + return (loff_t)res; }
static ext4_fsblk_t descriptor_loc(struct super_block *sb,
From: Hou Tao houtao1@huawei.com
commit a2c2f0826e2b75560b31daf1cd9a755ab93cf4c6 upstream.
Now EXT4_FC_TAG_ADD_RANGE uses ext4_extent to track the newly-added blocks, but the limit on the max value of ee_len field is ignored, and it can lead to BUG_ON as shown below when running command "fallocate -l 128M file" on a fast_commit-enabled fs:
kernel BUG at fs/ext4/ext4_extents.h:199! invalid opcode: 0000 [#1] SMP PTI CPU: 3 PID: 624 Comm: fallocate Not tainted 5.14.0-rc6+ #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996) RIP: 0010:ext4_fc_write_inode_data+0x1f3/0x200 Call Trace: ? ext4_fc_write_inode+0xf2/0x150 ext4_fc_commit+0x93b/0xa00 ? ext4_fallocate+0x1ad/0x10d0 ext4_sync_file+0x157/0x340 ? ext4_sync_file+0x157/0x340 vfs_fsync_range+0x49/0x80 do_fsync+0x3d/0x70 __x64_sys_fsync+0x14/0x20 do_syscall_64+0x3b/0xc0 entry_SYSCALL_64_after_hwframe+0x44/0xae
Simply fixing it by limiting the number of blocks in one EXT4_FC_TAG_ADD_RANGE TLV.
Fixes: aa75f4d3daae ("ext4: main fast-commit commit path") Cc: stable@kernel.org Signed-off-by: Hou Tao houtao1@huawei.com Signed-off-by: Theodore Ts'o tytso@mit.edu Link: https://lore.kernel.org/r/20210820044505.474318-1-houtao1@huawei.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ext4/fast_commit.c | 6 ++++++ 1 file changed, 6 insertions(+)
--- a/fs/ext4/fast_commit.c +++ b/fs/ext4/fast_commit.c @@ -893,6 +893,12 @@ static int ext4_fc_write_inode_data(stru sizeof(lrange), (u8 *)&lrange, crc)) return -ENOSPC; } else { + unsigned int max = (map.m_flags & EXT4_MAP_UNWRITTEN) ? + EXT_UNWRITTEN_MAX_LEN : EXT_INIT_MAX_LEN; + + /* Limit the number of blocks in one extent */ + map.m_len = min(max, map.m_len); + fc_ext.fc_ino = cpu_to_le32(inode->i_ino); ex = (struct ext4_extent *)&fc_ext.fc_ex; ex->ee_block = cpu_to_le32(map.m_lblk);
From: Jeffle Xu jefflexu@linux.alibaba.com
commit 6fed83957f21eff11c8496e9f24253b03d2bc1dc upstream.
When ext4_insert_delayed block receives and recovers from an error from ext4_es_insert_delayed_block(), e.g., ENOMEM, it does not release the space it has reserved for that block insertion as it should. One effect of this bug is that s_dirtyclusters_counter is not decremented and remains incorrectly elevated until the file system has been unmounted. This can result in premature ENOSPC returns and apparent loss of free space.
Another effect of this bug is that /sys/fs/ext4/<dev>/delayed_allocation_blocks can remain non-zero even after syncfs has been executed on the filesystem.
Besides, add check for s_dirtyclusters_counter when inode is going to be evicted and freed. s_dirtyclusters_counter can still keep non-zero until inode is written back in .evict_inode(), and thus the check is delayed to .destroy_inode().
Fixes: 51865fda28e5 ("ext4: let ext4 maintain extent status tree") Cc: stable@kernel.org Suggested-by: Gao Xiang hsiangkao@linux.alibaba.com Signed-off-by: Jeffle Xu jefflexu@linux.alibaba.com Reviewed-by: Eric Whitney enwlinux@gmail.com Signed-off-by: Theodore Ts'o tytso@mit.edu Link: https://lore.kernel.org/r/20210823061358.84473-1-jefflexu@linux.alibaba.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ext4/inode.c | 5 +++++ fs/ext4/super.c | 6 ++++++ 2 files changed, 11 insertions(+)
--- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -1640,6 +1640,7 @@ static int ext4_insert_delayed_block(str struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb); int ret; bool allocated = false; + bool reserved = false;
/* * If the cluster containing lblk is shared with a delayed, @@ -1656,6 +1657,7 @@ static int ext4_insert_delayed_block(str ret = ext4_da_reserve_space(inode); if (ret != 0) /* ENOSPC */ goto errout; + reserved = true; } else { /* bigalloc */ if (!ext4_es_scan_clu(inode, &ext4_es_is_delonly, lblk)) { if (!ext4_es_scan_clu(inode, @@ -1668,6 +1670,7 @@ static int ext4_insert_delayed_block(str ret = ext4_da_reserve_space(inode); if (ret != 0) /* ENOSPC */ goto errout; + reserved = true; } else { allocated = true; } @@ -1678,6 +1681,8 @@ static int ext4_insert_delayed_block(str }
ret = ext4_es_insert_delayed_block(inode, lblk, allocated); + if (ret && reserved) + ext4_da_release_space(inode, 1);
errout: return ret; --- a/fs/ext4/super.c +++ b/fs/ext4/super.c @@ -1351,6 +1351,12 @@ static void ext4_destroy_inode(struct in true); dump_stack(); } + + if (EXT4_I(inode)->i_reserved_data_blocks) + ext4_msg(inode->i_sb, KERN_ERR, + "Inode %lu (%p): i_reserved_data_blocks (%u) not cleared!", + inode->i_ino, EXT4_I(inode), + EXT4_I(inode)->i_reserved_data_blocks); }
static void init_once(void *foo)
From: Theodore Ts'o tytso@mit.edu
commit 1fd95c05d8f742abfe906620780aee4dbe1a2db0 upstream.
If the call to ext4_map_blocks() fails due to an corrupted file system, ext4_ext_replay_set_iblocks() can get stuck in an infinite loop. This could be reproduced by running generic/526 with a file system that has inline_data and fast_commit enabled. The system will repeatedly log to the console:
EXT4-fs warning (device dm-3): ext4_block_to_path:105: block 1074800922 > max in inode 131076
and the stack that it gets stuck in is:
ext4_block_to_path+0xe3/0x130 ext4_ind_map_blocks+0x93/0x690 ext4_map_blocks+0x100/0x660 skip_hole+0x47/0x70 ext4_ext_replay_set_iblocks+0x223/0x440 ext4_fc_replay_inode+0x29e/0x3b0 ext4_fc_replay+0x278/0x550 do_one_pass+0x646/0xc10 jbd2_journal_recover+0x14a/0x270 jbd2_journal_load+0xc4/0x150 ext4_load_journal+0x1f3/0x490 ext4_fill_super+0x22d4/0x2c00
With this patch, generic/526 still fails, but system is no longer locking up in a tight loop. It's likely the root casue is that fast_commit replay is corrupting file systems with inline_data, and we probably need to add better error handling in the fast commit replay code path beyond what is done here, which essentially just breaks the infinite loop without reporting the to the higher levels of the code.
Fixes: 8016E29F4362 ("ext4: fast commit recovery path") Cc: stable@kernel.org Cc: Harshad Shirwadkar harshadshirwadkar@gmail.com Signed-off-by: Theodore Ts'o tytso@mit.edu Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ext4/extents.c | 19 ++++++++++++++----- 1 file changed, 14 insertions(+), 5 deletions(-)
--- a/fs/ext4/extents.c +++ b/fs/ext4/extents.c @@ -5908,7 +5908,7 @@ void ext4_ext_replay_shrink_inode(struct }
/* Check if *cur is a hole and if it is, skip it */ -static void skip_hole(struct inode *inode, ext4_lblk_t *cur) +static int skip_hole(struct inode *inode, ext4_lblk_t *cur) { int ret; struct ext4_map_blocks map; @@ -5917,9 +5917,12 @@ static void skip_hole(struct inode *inod map.m_len = ((inode->i_size) >> inode->i_sb->s_blocksize_bits) - *cur;
ret = ext4_map_blocks(NULL, inode, &map, 0); + if (ret < 0) + return ret; if (ret != 0) - return; + return 0; *cur = *cur + map.m_len; + return 0; }
/* Count number of blocks used by this inode and update i_blocks */ @@ -5968,7 +5971,9 @@ int ext4_ext_replay_set_iblocks(struct i * iblocks by total number of differences found. */ cur = 0; - skip_hole(inode, &cur); + ret = skip_hole(inode, &cur); + if (ret < 0) + goto out; path = ext4_find_extent(inode, cur, NULL, 0); if (IS_ERR(path)) goto out; @@ -5987,8 +5992,12 @@ int ext4_ext_replay_set_iblocks(struct i } cur = max(cur + 1, le32_to_cpu(ex->ee_block) + ext4_ext_get_actual_len(ex)); - skip_hole(inode, &cur); - + ret = skip_hole(inode, &cur); + if (ret < 0) { + ext4_ext_drop_refs(path); + kfree(path); + break; + } path2 = ext4_find_extent(inode, cur, NULL, 0); if (IS_ERR(path2)) { ext4_ext_drop_refs(path);
From: yangerkun yangerkun@huawei.com
commit 42cb447410d024e9d54139ae9c21ea132a8c384c upstream.
When ext4_htree_fill_tree() fails, ext4_dx_readdir() can run into an infinite loop since if info->last_pos != ctx->pos this will reset the directory scan and reread the failing entry. For example:
1. a dx_dir which has 3 block, block 0 as dx_root block, block 1/2 as leaf block which own the ext4_dir_entry_2 2. block 1 read ok and call_filldir which will fill the dirent and update the ctx->pos 3. block 2 read fail, but we has already fill some dirent, so we will return back to userspace will a positive return val(see ksys_getdents64) 4. the second ext4_dx_readdir will reset the world since info->last_pos != ctx->pos, and will also init the curr_hash which pos to block 1 5. So we will read block1 too, and once block2 still read fail, we can only fill one dirent because the hash of the entry in block1(besides the last one) won't greater than curr_hash 6. this time, we forget update last_pos too since the read for block2 will fail, and since we has got the one entry, ksys_getdents64 can return success 7. Latter we will trapped in a loop with step 4~6
Cc: stable@kernel.org Signed-off-by: yangerkun yangerkun@huawei.com Reviewed-by: Jan Kara jack@suse.cz Signed-off-by: Theodore Ts'o tytso@mit.edu Link: https://lore.kernel.org/r/20210914111415.3921954-1-yangerkun@huawei.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ext4/dir.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
--- a/fs/ext4/dir.c +++ b/fs/ext4/dir.c @@ -551,7 +551,7 @@ static int ext4_dx_readdir(struct file * struct dir_private_info *info = file->private_data; struct inode *inode = file_inode(file); struct fname *fname; - int ret; + int ret = 0;
if (!info) { info = ext4_htree_create_dir_info(file, ctx->pos); @@ -599,7 +599,7 @@ static int ext4_dx_readdir(struct file * info->curr_minor_hash, &info->next_hash); if (ret < 0) - return ret; + goto finished; if (ret == 0) { ctx->pos = ext4_get_htree_eof(file); break; @@ -630,7 +630,7 @@ static int ext4_dx_readdir(struct file * } finished: info->last_pos = ctx->pos; - return 0; + return ret < 0 ? ret : 0; }
static int ext4_release_dir(struct inode *inode, struct file *filp)
From: yangerkun yangerkun@huawei.com
commit bb9464e08309f6befe80866f5be51778ca355ee9 upstream.
The error path in ext4_fill_super forget to flush s_error_work before journal destroy, and it may trigger the follow bug since flush_stashed_error_work can run concurrently with journal destroy without any protection for sbi->s_journal.
[32031.740193] EXT4-fs (loop66): get root inode failed [32031.740484] EXT4-fs (loop66): mount failed [32031.759805] ------------[ cut here ]------------ [32031.759807] kernel BUG at fs/jbd2/transaction.c:373! [32031.760075] invalid opcode: 0000 [#1] SMP PTI [32031.760336] CPU: 5 PID: 1029268 Comm: kworker/5:1 Kdump: loaded 4.18.0 [32031.765112] Call Trace: [32031.765375] ? __switch_to_asm+0x35/0x70 [32031.765635] ? __switch_to_asm+0x41/0x70 [32031.765893] ? __switch_to_asm+0x35/0x70 [32031.766148] ? __switch_to_asm+0x41/0x70 [32031.766405] ? _cond_resched+0x15/0x40 [32031.766665] jbd2__journal_start+0xf1/0x1f0 [jbd2] [32031.766934] jbd2_journal_start+0x19/0x20 [jbd2] [32031.767218] flush_stashed_error_work+0x30/0x90 [ext4] [32031.767487] process_one_work+0x195/0x390 [32031.767747] worker_thread+0x30/0x390 [32031.768007] ? process_one_work+0x390/0x390 [32031.768265] kthread+0x10d/0x130 [32031.768521] ? kthread_flush_work_fn+0x10/0x10 [32031.768778] ret_from_fork+0x35/0x40
static int start_this_handle(...) BUG_ON(journal->j_flags & JBD2_UNMOUNT); <---- Trigger this
Besides, after we enable fast commit, ext4_fc_replay can add work to s_error_work but return success, so the latter journal destroy in ext4_load_journal can trigger this problem too.
Fix this problem with two steps: 1. Call ext4_commit_super directly in ext4_handle_error for the case that called from ext4_fc_replay 2. Since it's hard to pair the init and flush for s_error_work, we'd better add a extras flush_work before journal destroy in ext4_fill_super
Besides, this patch will call ext4_commit_super in ext4_handle_error for any nojournal case too. But it seems safe since the reason we call schedule_work was that we should save error info to sb through journal if available. Conversely, for the nojournal case, it seems useless delay commit superblock to s_error_work.
Fixes: c92dc856848f ("ext4: defer saving error info from atomic context") Fixes: 2d01ddc86606 ("ext4: save error info to sb through journal if available") Cc: stable@kernel.org Signed-off-by: yangerkun yangerkun@huawei.com Reviewed-by: Jan Kara jack@suse.cz Signed-off-by: Theodore Ts'o tytso@mit.edu Link: https://lore.kernel.org/r/20210924093917.1953239-1-yangerkun@huawei.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ext4/super.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
--- a/fs/ext4/super.c +++ b/fs/ext4/super.c @@ -661,7 +661,7 @@ static void ext4_handle_error(struct sup * constraints, it may not be safe to do it right here so we * defer superblock flushing to a workqueue. */ - if (continue_fs) + if (continue_fs && journal) schedule_work(&EXT4_SB(sb)->s_error_work); else ext4_commit_super(sb); @@ -5189,12 +5189,15 @@ failed_mount_wq: sbi->s_ea_block_cache = NULL;
if (sbi->s_journal) { + /* flush s_error_work before journal destroy. */ + flush_work(&sbi->s_error_work); jbd2_journal_destroy(sbi->s_journal); sbi->s_journal = NULL; } failed_mount3a: ext4_es_unregister_shrinker(sbi); failed_mount3: + /* flush s_error_work before sbi destroy */ flush_work(&sbi->s_error_work); del_timer_sync(&sbi->s_err_report); ext4_stop_mmpd(sbi);
From: Andrej Shadura andrew.shadura@collabora.co.uk
commit 22d65765f211cc83186fd8b87521159f354c0da9 upstream.
Since the actual_length calculation is performed unsigned, packets shorter than 7 bytes (e.g. packets without data or otherwise truncated) or non-received packets ("zero" bytes) can cause buffer overflow.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=214437 Fixes: 42337b9d4d958("HID: add driver for U2F Zero built-in LED and RNG") Signed-off-by: Andrej Shadura andrew.shadura@collabora.co.uk Signed-off-by: Jiri Kosina jkosina@suse.cz Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/hid/hid-u2fzero.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
--- a/drivers/hid/hid-u2fzero.c +++ b/drivers/hid/hid-u2fzero.c @@ -198,7 +198,9 @@ static int u2fzero_rng_read(struct hwrng }
ret = u2fzero_recv(dev, &req, &resp); - if (ret < 0) + + /* ignore errors or packets without data */ + if (ret < offsetof(struct u2f_hid_msg, init.data)) return 0;
/* only take the minimum amount of data it is safe to take */
From: Eric Dumazet edumazet@google.com
commit a9f5970767d11eadc805d5283f202612c7ba1f59 upstream.
up->corkflag field can be read or written without any lock. Annotate accesses to avoid possible syzbot/KCSAN reports.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Eric Dumazet edumazet@google.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv4/udp.c | 10 +++++----- net/ipv6/udp.c | 2 +- 2 files changed, 6 insertions(+), 6 deletions(-)
--- a/net/ipv4/udp.c +++ b/net/ipv4/udp.c @@ -1053,7 +1053,7 @@ int udp_sendmsg(struct sock *sk, struct __be16 dport; u8 tos; int err, is_udplite = IS_UDPLITE(sk); - int corkreq = up->corkflag || msg->msg_flags&MSG_MORE; + int corkreq = READ_ONCE(up->corkflag) || msg->msg_flags&MSG_MORE; int (*getfrag)(void *, char *, int, int, int, struct sk_buff *); struct sk_buff *skb; struct ip_options_data opt_copy; @@ -1361,7 +1361,7 @@ int udp_sendpage(struct sock *sk, struct }
up->len += size; - if (!(up->corkflag || (flags&MSG_MORE))) + if (!(READ_ONCE(up->corkflag) || (flags&MSG_MORE))) ret = udp_push_pending_frames(sk); if (!ret) ret = size; @@ -2662,9 +2662,9 @@ int udp_lib_setsockopt(struct sock *sk, switch (optname) { case UDP_CORK: if (val != 0) { - up->corkflag = 1; + WRITE_ONCE(up->corkflag, 1); } else { - up->corkflag = 0; + WRITE_ONCE(up->corkflag, 0); lock_sock(sk); push_pending_frames(sk); release_sock(sk); @@ -2787,7 +2787,7 @@ int udp_lib_getsockopt(struct sock *sk,
switch (optname) { case UDP_CORK: - val = up->corkflag; + val = READ_ONCE(up->corkflag); break;
case UDP_ENCAP: --- a/net/ipv6/udp.c +++ b/net/ipv6/udp.c @@ -1303,7 +1303,7 @@ int udpv6_sendmsg(struct sock *sk, struc int addr_len = msg->msg_namelen; bool connected = false; int ulen = len; - int corkreq = up->corkflag || msg->msg_flags&MSG_MORE; + int corkreq = READ_ONCE(up->corkflag) || msg->msg_flags&MSG_MORE; int err; int is_udplite = IS_UDPLITE(sk); int (*getfrag)(void *, char *, int, int, int, struct sk_buff *);
From: Randy Dunlap rdunlap@infradead.org
commit 9523b33cc31cf8ce703f8facee9fd16cba36d5ad upstream.
This is a nuisance when CONFIG_WERROR is set, so drop the variable declaration since the code that used it was removed.
../arch/nios2/kernel/setup.c: In function 'setup_arch': ../arch/nios2/kernel/setup.c:152:13: warning: unused variable 'dram_start' [-Wunused-variable] 152 | int dram_start;
Fixes: 7f7bc20bc41a ("nios2: Don't use _end for calculating min_low_pfn") Signed-off-by: Randy Dunlap rdunlap@infradead.org Reported-by: kernel test robot lkp@intel.com Reviewed-by: Mike Rapoport rppt@linux.ibm.com Cc: Andreas Oetken andreas.oetken@siemens.com Signed-off-by: Dinh Nguyen dinguyen@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/nios2/kernel/setup.c | 2 -- 1 file changed, 2 deletions(-)
--- a/arch/nios2/kernel/setup.c +++ b/arch/nios2/kernel/setup.c @@ -149,8 +149,6 @@ static void __init find_limits(unsigned
void __init setup_arch(char **cmdline_p) { - int dram_start; - console_verbose();
memory_start = memblock_start_of_DRAM();
From: Dongliang Mu mudongliangabcd@gmail.com
commit dcb713d53e2eadf42b878c12a471e74dc6ed3145 upstream.
There are two invocation sites of hso_free_net_device. After refactoring hso_create_net_device, this parameter is useless. Remove the bailout in the hso_free_net_device and change the invocation sites of this function.
Signed-off-by: Dongliang Mu mudongliangabcd@gmail.com Signed-off-by: David S. Miller davem@davemloft.net [Backport this cleanup patch to 5.10 and 5.14 in order to keep the codebase consistent with the 4.14/4.19/5.4 patchseries] Signed-off-by: Ovidiu Panait ovidiu.panait@windriver.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/usb/hso.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
--- a/drivers/net/usb/hso.c +++ b/drivers/net/usb/hso.c @@ -2353,7 +2353,7 @@ static int remove_net_device(struct hso_ }
/* Frees our network device */ -static void hso_free_net_device(struct hso_device *hso_dev, bool bailout) +static void hso_free_net_device(struct hso_device *hso_dev) { int i; struct hso_net *hso_net = dev2net(hso_dev); @@ -2376,7 +2376,7 @@ static void hso_free_net_device(struct h kfree(hso_net->mux_bulk_tx_buf); hso_net->mux_bulk_tx_buf = NULL;
- if (hso_net->net && !bailout) + if (hso_net->net) free_netdev(hso_net->net);
kfree(hso_dev); @@ -3136,7 +3136,7 @@ static void hso_free_interface(struct us rfkill_unregister(rfk); rfkill_destroy(rfk); } - hso_free_net_device(network_table[i], false); + hso_free_net_device(network_table[i]); } } }
From: F.A.Sulaiman asha.16@itfac.mrt.ac.lk
commit 1e4ce418b1cb1a810256b5fb3fd33d22d1325993 upstream.
Syzbot reported slab-out-of-bounds Write bug in hid-betopff driver. The problem is the driver assumes the device must have an input report but some malicious devices violate this assumption.
So this patch checks hid_device's input is non empty before it's been used.
Reported-by: syzbot+07efed3bc5a1407bd742@syzkaller.appspotmail.com Signed-off-by: F.A. SULAIMAN asha.16@itfac.mrt.ac.lk Reviewed-by: Pavel Skripkin paskripkin@gmail.com Signed-off-by: Jiri Kosina jkosina@suse.cz Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/hid/hid-betopff.c | 13 ++++++++++--- 1 file changed, 10 insertions(+), 3 deletions(-)
--- a/drivers/hid/hid-betopff.c +++ b/drivers/hid/hid-betopff.c @@ -56,15 +56,22 @@ static int betopff_init(struct hid_devic { struct betopff_device *betopff; struct hid_report *report; - struct hid_input *hidinput = - list_first_entry(&hid->inputs, struct hid_input, list); + struct hid_input *hidinput; struct list_head *report_list = &hid->report_enum[HID_OUTPUT_REPORT].report_list; - struct input_dev *dev = hidinput->input; + struct input_dev *dev; int field_count = 0; int error; int i, j;
+ if (list_empty(&hid->inputs)) { + hid_err(hid, "no inputs found\n"); + return -ENODEV; + } + + hidinput = list_first_entry(&hid->inputs, struct hid_input, list); + dev = hidinput->input; + if (list_empty(report_list)) { hid_err(hid, "no output reports found\n"); return -ENODEV;
From: Jozsef Kadlecsik kadlec@netfilter.org
commit 7bbc3d385bd813077acaf0e6fdb2a86a901f5382 upstream.
The commit
commit 7661809d493b426e979f39ab512e3adf41fbcc69 Author: Linus Torvalds torvalds@linux-foundation.org Date: Wed Jul 14 09:45:49 2021 -0700
mm: don't allow oversized kvmalloc() calls
limits the max allocatable memory via kvmalloc() to MAX_INT. Apply the same limit in ipset.
Reported-by: syzbot+3493b1873fb3ea827986@syzkaller.appspotmail.com Reported-by: syzbot+2b8443c35458a617c904@syzkaller.appspotmail.com Reported-by: syzbot+ee5cb15f4a0e85e0d54e@syzkaller.appspotmail.com Signed-off-by: Jozsef Kadlecsik kadlec@netfilter.org Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/netfilter/ipset/ip_set_hash_gen.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/net/netfilter/ipset/ip_set_hash_gen.h +++ b/net/netfilter/ipset/ip_set_hash_gen.h @@ -130,11 +130,11 @@ htable_size(u8 hbits) { size_t hsize;
- /* We must fit both into u32 in jhash and size_t */ + /* We must fit both into u32 in jhash and INT_MAX in kvmalloc_node() */ if (hbits > 31) return 0; hsize = jhash_size(hbits); - if ((((size_t)-1) - sizeof(struct htable)) / sizeof(struct hbucket *) + if ((INT_MAX - sizeof(struct htable)) / sizeof(struct hbucket *) < hsize) return 0;
From: Linus Torvalds torvalds@linux-foundation.org
commit 7661809d493b426e979f39ab512e3adf41fbcc69 upstream.
'kvmalloc()' is a convenience function for people who want to do a kmalloc() but fall back on vmalloc() if there aren't enough physically contiguous pages, or if the allocation is larger than what kmalloc() supports.
However, let's make sure it doesn't get _too_ easy to do crazy things with it. In particular, don't allow big allocations that could be due to integer overflow or underflow. So make sure the allocation size fits in an 'int', to protect against trivial integer conversion issues.
Acked-by: Willy Tarreau w@1wt.eu Cc: Kees Cook keescook@chromium.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/util.c | 4 ++++ 1 file changed, 4 insertions(+)
--- a/mm/util.c +++ b/mm/util.c @@ -593,6 +593,10 @@ void *kvmalloc_node(size_t size, gfp_t f if (ret || size <= PAGE_SIZE) return ret;
+ /* Don't even allow crazy sizes */ + if (WARN_ON_ONCE(size > INT_MAX)) + return NULL; + return __vmalloc_node(size, 1, flags, node, __builtin_return_address(0)); }
From: Anirudh Rayabharam mail@anirudhrb.com
commit f7744fa16b96da57187dc8e5634152d3b63d72de upstream.
Free the unsent raw_report buffers when the device is removed.
Fixes a memory leak reported by syzbot at: https://syzkaller.appspot.com/bug?id=7b4fa7cb1a7c2d3342a2a8a6c53371c8c418ab4...
Reported-by: syzbot+47b26cd837ececfc666d@syzkaller.appspotmail.com Tested-by: syzbot+47b26cd837ececfc666d@syzkaller.appspotmail.com Signed-off-by: Anirudh Rayabharam mail@anirudhrb.com Signed-off-by: Jiri Kosina jkosina@suse.cz Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/hid/usbhid/hid-core.c | 13 ++++++++++++- 1 file changed, 12 insertions(+), 1 deletion(-)
--- a/drivers/hid/usbhid/hid-core.c +++ b/drivers/hid/usbhid/hid-core.c @@ -505,7 +505,7 @@ static void hid_ctrl(struct urb *urb)
if (unplug) { usbhid->ctrltail = usbhid->ctrlhead; - } else { + } else if (usbhid->ctrlhead != usbhid->ctrltail) { usbhid->ctrltail = (usbhid->ctrltail + 1) & (HID_CONTROL_FIFO_SIZE - 1);
if (usbhid->ctrlhead != usbhid->ctrltail && @@ -1223,9 +1223,20 @@ static void usbhid_stop(struct hid_devic mutex_lock(&usbhid->mutex);
clear_bit(HID_STARTED, &usbhid->iofl); + spin_lock_irq(&usbhid->lock); /* Sync with error and led handlers */ set_bit(HID_DISCONNECTED, &usbhid->iofl); + while (usbhid->ctrltail != usbhid->ctrlhead) { + if (usbhid->ctrl[usbhid->ctrltail].dir == USB_DIR_OUT) { + kfree(usbhid->ctrl[usbhid->ctrltail].raw_report); + usbhid->ctrl[usbhid->ctrltail].raw_report = NULL; + } + + usbhid->ctrltail = (usbhid->ctrltail + 1) & + (HID_CONTROL_FIFO_SIZE - 1); + } spin_unlock_irq(&usbhid->lock); + usb_kill_urb(usbhid->urbin); usb_kill_urb(usbhid->urbout); usb_kill_urb(usbhid->urbctrl);
From: Shreyansh Chouhan chouhan.shreyansh630@gmail.com
commit 72ff2bf04db2a48840df93a461b7115900f46c05 upstream.
xts_crypt() code doesn't call kernel_fpu_end() after calling kernel_fpu_begin() if walk.nbytes is 0. The correct behavior should be not calling kernel_fpu_begin() if walk.nbytes is 0.
Reported-by: syzbot+20191dc583eff8602d2d@syzkaller.appspotmail.com Signed-off-by: Shreyansh Chouhan chouhan.shreyansh630@gmail.com Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/crypto/aesni-intel_glue.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/x86/crypto/aesni-intel_glue.c +++ b/arch/x86/crypto/aesni-intel_glue.c @@ -849,7 +849,7 @@ static int xts_crypt(struct skcipher_req return -EINVAL;
err = skcipher_walk_virt(&walk, req, false); - if (err) + if (!walk.nbytes) return err;
if (unlikely(tail > 0 && walk.nbytes < walk.total)) {
From: Yanfei Xu yanfei.xu@windriver.com
commit ab609f25d19858513919369ff3d9a63c02cd9e2e upstream.
Once device_register() failed, we should call put_device() to decrement reference count for cleanup. Or it will cause memory leak.
BUG: memory leak unreferenced object 0xffff888114032e00 (size 256): comm "kworker/1:3", pid 2960, jiffies 4294943572 (age 15.920s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 08 2e 03 14 81 88 ff ff ................ 08 2e 03 14 81 88 ff ff 90 76 65 82 ff ff ff ff .........ve..... backtrace: [<ffffffff8265cfab>] kmalloc include/linux/slab.h:591 [inline] [<ffffffff8265cfab>] kzalloc include/linux/slab.h:721 [inline] [<ffffffff8265cfab>] device_private_init drivers/base/core.c:3203 [inline] [<ffffffff8265cfab>] device_add+0x89b/0xdf0 drivers/base/core.c:3253 [<ffffffff828dd643>] __mdiobus_register+0xc3/0x450 drivers/net/phy/mdio_bus.c:537 [<ffffffff828cb835>] __devm_mdiobus_register+0x75/0xf0 drivers/net/phy/mdio_devres.c:87 [<ffffffff82b92a00>] ax88772_init_mdio drivers/net/usb/asix_devices.c:676 [inline] [<ffffffff82b92a00>] ax88772_bind+0x330/0x480 drivers/net/usb/asix_devices.c:786 [<ffffffff82baa33f>] usbnet_probe+0x3ff/0xdf0 drivers/net/usb/usbnet.c:1745 [<ffffffff82c36e17>] usb_probe_interface+0x177/0x370 drivers/usb/core/driver.c:396 [<ffffffff82661d17>] call_driver_probe drivers/base/dd.c:517 [inline] [<ffffffff82661d17>] really_probe.part.0+0xe7/0x380 drivers/base/dd.c:596 [<ffffffff826620bc>] really_probe drivers/base/dd.c:558 [inline] [<ffffffff826620bc>] __driver_probe_device+0x10c/0x1e0 drivers/base/dd.c:751 [<ffffffff826621ba>] driver_probe_device+0x2a/0x120 drivers/base/dd.c:781 [<ffffffff82662a26>] __device_attach_driver+0xf6/0x140 drivers/base/dd.c:898 [<ffffffff8265eca7>] bus_for_each_drv+0xb7/0x100 drivers/base/bus.c:427 [<ffffffff826625a2>] __device_attach+0x122/0x260 drivers/base/dd.c:969 [<ffffffff82660916>] bus_probe_device+0xc6/0xe0 drivers/base/bus.c:487 [<ffffffff8265cd0b>] device_add+0x5fb/0xdf0 drivers/base/core.c:3359 [<ffffffff82c343b9>] usb_set_configuration+0x9d9/0xb90 drivers/usb/core/message.c:2170 [<ffffffff82c4473c>] usb_generic_driver_probe+0x8c/0xc0 drivers/usb/core/generic.c:238
BUG: memory leak unreferenced object 0xffff888116f06900 (size 32): comm "kworker/0:2", pid 2670, jiffies 4294944448 (age 7.160s) hex dump (first 32 bytes): 75 73 62 2d 30 30 31 3a 30 30 33 00 00 00 00 00 usb-001:003..... 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<ffffffff81484516>] kstrdup+0x36/0x70 mm/util.c:60 [<ffffffff814845a3>] kstrdup_const+0x53/0x80 mm/util.c:83 [<ffffffff82296ba2>] kvasprintf_const+0xc2/0x110 lib/kasprintf.c:48 [<ffffffff82358d4b>] kobject_set_name_vargs+0x3b/0xe0 lib/kobject.c:289 [<ffffffff826575f3>] dev_set_name+0x63/0x90 drivers/base/core.c:3147 [<ffffffff828dd63b>] __mdiobus_register+0xbb/0x450 drivers/net/phy/mdio_bus.c:535 [<ffffffff828cb835>] __devm_mdiobus_register+0x75/0xf0 drivers/net/phy/mdio_devres.c:87 [<ffffffff82b92a00>] ax88772_init_mdio drivers/net/usb/asix_devices.c:676 [inline] [<ffffffff82b92a00>] ax88772_bind+0x330/0x480 drivers/net/usb/asix_devices.c:786 [<ffffffff82baa33f>] usbnet_probe+0x3ff/0xdf0 drivers/net/usb/usbnet.c:1745 [<ffffffff82c36e17>] usb_probe_interface+0x177/0x370 drivers/usb/core/driver.c:396 [<ffffffff82661d17>] call_driver_probe drivers/base/dd.c:517 [inline] [<ffffffff82661d17>] really_probe.part.0+0xe7/0x380 drivers/base/dd.c:596 [<ffffffff826620bc>] really_probe drivers/base/dd.c:558 [inline] [<ffffffff826620bc>] __driver_probe_device+0x10c/0x1e0 drivers/base/dd.c:751 [<ffffffff826621ba>] driver_probe_device+0x2a/0x120 drivers/base/dd.c:781 [<ffffffff82662a26>] __device_attach_driver+0xf6/0x140 drivers/base/dd.c:898 [<ffffffff8265eca7>] bus_for_each_drv+0xb7/0x100 drivers/base/bus.c:427 [<ffffffff826625a2>] __device_attach+0x122/0x260 drivers/base/dd.c:969
Reported-by: syzbot+398e7dc692ddbbb4cfec@syzkaller.appspotmail.com Signed-off-by: Yanfei Xu yanfei.xu@windriver.com Reviewed-by: Andrew Lunn andrew@lunn.ch Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/phy/mdio_bus.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/net/phy/mdio_bus.c +++ b/drivers/net/phy/mdio_bus.c @@ -541,6 +541,7 @@ int __mdiobus_register(struct mii_bus *b err = device_register(&bus->dev); if (err) { pr_err("mii_bus %s failed to register\n", bus->id); + put_device(&bus->dev); return -EINVAL; }
From: Haimin Zhang tcs_kernel@tencent.com
commit eb7511bf9182292ef1df1082d23039e856d1ddfb upstream.
Check the return of init_srcu_struct(), which can fail due to OOM, when initializing the page track mechanism. Lack of checking leads to a NULL pointer deref found by a modified syzkaller.
Reported-by: TCS Robot tcs_robot@tencent.com Signed-off-by: Haimin Zhang tcs_kernel@tencent.com Message-Id: 1630636626-12262-1-git-send-email-tcs_kernel@tencent.com [Move the call towards the beginning of kvm_arch_init_vm. - Paolo] Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/include/asm/kvm_page_track.h | 2 +- arch/x86/kvm/mmu/page_track.c | 4 ++-- arch/x86/kvm/x86.c | 7 ++++++- 3 files changed, 9 insertions(+), 4 deletions(-)
--- a/arch/x86/include/asm/kvm_page_track.h +++ b/arch/x86/include/asm/kvm_page_track.h @@ -46,7 +46,7 @@ struct kvm_page_track_notifier_node { struct kvm_page_track_notifier_node *node); };
-void kvm_page_track_init(struct kvm *kvm); +int kvm_page_track_init(struct kvm *kvm); void kvm_page_track_cleanup(struct kvm *kvm);
void kvm_page_track_free_memslot(struct kvm_memory_slot *slot); --- a/arch/x86/kvm/mmu/page_track.c +++ b/arch/x86/kvm/mmu/page_track.c @@ -163,13 +163,13 @@ void kvm_page_track_cleanup(struct kvm * cleanup_srcu_struct(&head->track_srcu); }
-void kvm_page_track_init(struct kvm *kvm) +int kvm_page_track_init(struct kvm *kvm) { struct kvm_page_track_notifier_head *head;
head = &kvm->arch.track_notifier_head; - init_srcu_struct(&head->track_srcu); INIT_HLIST_HEAD(&head->track_notifier_list); + return init_srcu_struct(&head->track_srcu); }
/* --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -11093,9 +11093,15 @@ void kvm_arch_free_vm(struct kvm *kvm)
int kvm_arch_init_vm(struct kvm *kvm, unsigned long type) { + int ret; + if (type) return -EINVAL;
+ ret = kvm_page_track_init(kvm); + if (ret) + return ret; + INIT_HLIST_HEAD(&kvm->arch.mask_notifier_list); INIT_LIST_HEAD(&kvm->arch.active_mmu_pages); INIT_LIST_HEAD(&kvm->arch.zapped_obsolete_pages); @@ -11128,7 +11134,6 @@ int kvm_arch_init_vm(struct kvm *kvm, un
kvm_apicv_init(kvm); kvm_hv_init_vm(kvm); - kvm_page_track_init(kvm); kvm_mmu_init_vm(kvm);
return static_call(kvm_x86_vm_init)(kvm);
From: Eric Dumazet edumazet@google.com
commit e9edc188fc76499b0b9bd60364084037f6d03773 upstream.
Syzbot was able to trigger the following warning [1]
No repro found by syzbot yet but I was able to trigger similar issue by having 2 scripts running in parallel, changing conntrack hash sizes, and:
for j in `seq 1 1000` ; do unshare -n /bin/true >/dev/null ; done
It would take more than 5 minutes for net_namespace structures to be cleaned up.
This is because nf_ct_iterate_cleanup() has to restart everytime a resize happened.
By adding a mutex, we can serialize hash resizes and cleanups and also make get_next_corpse() faster by skipping over empty buckets.
Even without resizes in the picture, this patch considerably speeds up network namespace dismantles.
[1] INFO: task syz-executor.0:8312 can't die for more than 144 seconds. task:syz-executor.0 state:R running task stack:25672 pid: 8312 ppid: 6573 flags:0x00004006 Call Trace: context_switch kernel/sched/core.c:4955 [inline] __schedule+0x940/0x26f0 kernel/sched/core.c:6236 preempt_schedule_common+0x45/0xc0 kernel/sched/core.c:6408 preempt_schedule_thunk+0x16/0x18 arch/x86/entry/thunk_64.S:35 __local_bh_enable_ip+0x109/0x120 kernel/softirq.c:390 local_bh_enable include/linux/bottom_half.h:32 [inline] get_next_corpse net/netfilter/nf_conntrack_core.c:2252 [inline] nf_ct_iterate_cleanup+0x15a/0x450 net/netfilter/nf_conntrack_core.c:2275 nf_conntrack_cleanup_net_list+0x14c/0x4f0 net/netfilter/nf_conntrack_core.c:2469 ops_exit_list+0x10d/0x160 net/core/net_namespace.c:171 setup_net+0x639/0xa30 net/core/net_namespace.c:349 copy_net_ns+0x319/0x760 net/core/net_namespace.c:470 create_new_namespaces+0x3f6/0xb20 kernel/nsproxy.c:110 unshare_nsproxy_namespaces+0xc1/0x1f0 kernel/nsproxy.c:226 ksys_unshare+0x445/0x920 kernel/fork.c:3128 __do_sys_unshare kernel/fork.c:3202 [inline] __se_sys_unshare kernel/fork.c:3200 [inline] __x64_sys_unshare+0x2d/0x40 kernel/fork.c:3200 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x7f63da68e739 RSP: 002b:00007f63d7c05188 EFLAGS: 00000246 ORIG_RAX: 0000000000000110 RAX: ffffffffffffffda RBX: 00007f63da792f80 RCX: 00007f63da68e739 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000040000000 RBP: 00007f63da6e8cc4 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 00007f63da792f80 R13: 00007fff50b75d3f R14: 00007f63d7c05300 R15: 0000000000022000
Showing all locks held in the system: 1 lock held by khungtaskd/27: #0: ffffffff8b980020 (rcu_read_lock){....}-{1:2}, at: debug_show_all_locks+0x53/0x260 kernel/locking/lockdep.c:6446 2 locks held by kworker/u4:2/153: #0: ffff888010c69138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: arch_atomic64_set arch/x86/include/asm/atomic64_64.h:34 [inline] #0: ffff888010c69138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: arch_atomic_long_set include/linux/atomic/atomic-long.h:41 [inline] #0: ffff888010c69138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: atomic_long_set include/linux/atomic/atomic-instrumented.h:1198 [inline] #0: ffff888010c69138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: set_work_data kernel/workqueue.c:634 [inline] #0: ffff888010c69138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: set_work_pool_and_clear_pending kernel/workqueue.c:661 [inline] #0: ffff888010c69138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x896/0x1690 kernel/workqueue.c:2268 #1: ffffc9000140fdb0 ((kfence_timer).work){+.+.}-{0:0}, at: process_one_work+0x8ca/0x1690 kernel/workqueue.c:2272 1 lock held by systemd-udevd/2970: 1 lock held by in:imklog/6258: #0: ffff88807f970ff0 (&f->f_pos_lock){+.+.}-{3:3}, at: __fdget_pos+0xe9/0x100 fs/file.c:990 3 locks held by kworker/1:6/8158: 1 lock held by syz-executor.0/8312: 2 locks held by kworker/u4:13/9320: 1 lock held by syz-executor.5/10178: 1 lock held by syz-executor.4/10217:
Signed-off-by: Eric Dumazet edumazet@google.com Reported-by: syzbot syzkaller@googlegroups.com Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/netfilter/nf_conntrack_core.c | 70 ++++++++++++++++++++------------------ 1 file changed, 37 insertions(+), 33 deletions(-)
--- a/net/netfilter/nf_conntrack_core.c +++ b/net/netfilter/nf_conntrack_core.c @@ -75,6 +75,9 @@ static __read_mostly struct kmem_cache * static DEFINE_SPINLOCK(nf_conntrack_locks_all_lock); static __read_mostly bool nf_conntrack_locks_all;
+/* serialize hash resizes and nf_ct_iterate_cleanup */ +static DEFINE_MUTEX(nf_conntrack_mutex); + #define GC_SCAN_INTERVAL (120u * HZ) #define GC_SCAN_MAX_DURATION msecs_to_jiffies(10)
@@ -2192,28 +2195,31 @@ get_next_corpse(int (*iter)(struct nf_co spinlock_t *lockp;
for (; *bucket < nf_conntrack_htable_size; (*bucket)++) { + struct hlist_nulls_head *hslot = &nf_conntrack_hash[*bucket]; + + if (hlist_nulls_empty(hslot)) + continue; + lockp = &nf_conntrack_locks[*bucket % CONNTRACK_LOCKS]; local_bh_disable(); nf_conntrack_lock(lockp); - if (*bucket < nf_conntrack_htable_size) { - hlist_nulls_for_each_entry(h, n, &nf_conntrack_hash[*bucket], hnnode) { - if (NF_CT_DIRECTION(h) != IP_CT_DIR_REPLY) - continue; - /* All nf_conn objects are added to hash table twice, one - * for original direction tuple, once for the reply tuple. - * - * Exception: In the IPS_NAT_CLASH case, only the reply - * tuple is added (the original tuple already existed for - * a different object). - * - * We only need to call the iterator once for each - * conntrack, so we just use the 'reply' direction - * tuple while iterating. - */ - ct = nf_ct_tuplehash_to_ctrack(h); - if (iter(ct, data)) - goto found; - } + hlist_nulls_for_each_entry(h, n, hslot, hnnode) { + if (NF_CT_DIRECTION(h) != IP_CT_DIR_REPLY) + continue; + /* All nf_conn objects are added to hash table twice, one + * for original direction tuple, once for the reply tuple. + * + * Exception: In the IPS_NAT_CLASH case, only the reply + * tuple is added (the original tuple already existed for + * a different object). + * + * We only need to call the iterator once for each + * conntrack, so we just use the 'reply' direction + * tuple while iterating. + */ + ct = nf_ct_tuplehash_to_ctrack(h); + if (iter(ct, data)) + goto found; } spin_unlock(lockp); local_bh_enable(); @@ -2231,26 +2237,20 @@ found: static void nf_ct_iterate_cleanup(int (*iter)(struct nf_conn *i, void *data), void *data, u32 portid, int report) { - unsigned int bucket = 0, sequence; + unsigned int bucket = 0; struct nf_conn *ct;
might_sleep();
- for (;;) { - sequence = read_seqcount_begin(&nf_conntrack_generation); - - while ((ct = get_next_corpse(iter, data, &bucket)) != NULL) { - /* Time to push up daises... */ + mutex_lock(&nf_conntrack_mutex); + while ((ct = get_next_corpse(iter, data, &bucket)) != NULL) { + /* Time to push up daises... */
- nf_ct_delete(ct, portid, report); - nf_ct_put(ct); - cond_resched(); - } - - if (!read_seqcount_retry(&nf_conntrack_generation, sequence)) - break; - bucket = 0; + nf_ct_delete(ct, portid, report); + nf_ct_put(ct); + cond_resched(); } + mutex_unlock(&nf_conntrack_mutex); }
struct iter_data { @@ -2486,8 +2486,10 @@ int nf_conntrack_hash_resize(unsigned in if (!hash) return -ENOMEM;
+ mutex_lock(&nf_conntrack_mutex); old_size = nf_conntrack_htable_size; if (old_size == hashsize) { + mutex_unlock(&nf_conntrack_mutex); kvfree(hash); return 0; } @@ -2523,6 +2525,8 @@ int nf_conntrack_hash_resize(unsigned in nf_conntrack_all_unlock(); local_bh_enable();
+ mutex_unlock(&nf_conntrack_mutex); + synchronize_net(); kvfree(old_hash); return 0;
From: Pablo Neira Ayuso pablo@netfilter.org
commit 45928afe94a094bcda9af858b96673d59bc4a0e9 upstream.
The commit 7661809d493b ("mm: don't allow oversized kvmalloc() calls") limits the max allocatable memory via kvmalloc() to MAX_INT.
Reported-by: syzbot+cd43695a64bcd21b8596@syzkaller.appspotmail.com Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/netfilter/nf_tables_api.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/net/netfilter/nf_tables_api.c +++ b/net/netfilter/nf_tables_api.c @@ -4336,7 +4336,7 @@ static int nf_tables_newset(struct sk_bu if (ops->privsize != NULL) size = ops->privsize(nla, &desc); alloc_size = sizeof(*set) + size + udlen; - if (alloc_size < size) + if (alloc_size < size || alloc_size > INT_MAX) return -ENOMEM; set = kvzalloc(alloc_size, GFP_KERNEL); if (!set)
From: Daniele Palmas dnlplm@gmail.com
commit 4526fe74c3c5095cc55931a3a6fb4932f9e06002 upstream.
Fix double free_netdev when mhi_prepare_for_transfer fails.
Fixes: 3ffec6a14f24 ("net: Add mhi-net driver") Signed-off-by: Daniele Palmas dnlplm@gmail.com Reviewed-by: Manivannan Sadhasivam mani@kernel.org Reviewed-by: Loic Poulain loic.poulain@linaro.org Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/mhi/net.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-)
--- a/drivers/net/mhi/net.c +++ b/drivers/net/mhi/net.c @@ -337,7 +337,7 @@ static int mhi_net_newlink(void *ctxt, s /* Start MHI channels */ err = mhi_prepare_for_transfer(mhi_dev); if (err) - goto out_err; + return err;
/* Number of transfer descriptors determines size of the queue */ mhi_netdev->rx_queue_sz = mhi_get_free_desc_count(mhi_dev, DMA_FROM_DEVICE); @@ -347,7 +347,7 @@ static int mhi_net_newlink(void *ctxt, s else err = register_netdev(ndev); if (err) - goto out_err; + return err;
if (mhi_netdev->proto) { err = mhi_netdev->proto->init(mhi_netdev); @@ -359,8 +359,6 @@ static int mhi_net_newlink(void *ctxt, s
out_err_proto: unregister_netdevice(ndev); -out_err: - free_netdev(ndev); return err; }
Hi!
This is the start of the stable review cycle for the 5.14.10 release. There are 172 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
CIP testing did not find any problems here:
https://gitlab.com/cip-project/cip-testing/linux-stable-rc-ci/-/tree/linux-5...
Tested-by: Pavel Machek (CIP) pavel@denx.de
Best regards, Pavel
On Mon 2021-10-04 20:06:51, Pavel Machek wrote:
Hi!
This is the start of the stable review cycle for the 5.14.10 release. There are 172 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
CIP testing did not find any problems here:
https://gitlab.com/cip-project/cip-testing/linux-stable-rc-ci/-/tree/linux-5...
Sorry, I replied to wrong email. We are not actually testing 5.14.
Best regards, Pavel
On Mon, 4 Oct 2021 14:50:50 +0200, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 5.14.10 release. There are 172 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 06 Oct 2021 12:50:17 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.14.10-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.14.y and the diffstat can be found below.
thanks,
greg k-h
5.14.10-rc1 Successfully Compiled and booted on my Raspberry PI 4b (8g) (bcm2711)
Tested-by: Fox Chen foxhlchen@gmail.com
On 10/4/21 6:50 AM, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.14.10 release. There are 172 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 06 Oct 2021 12:50:17 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.14.10-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.14.y and the diffstat can be found below.
thanks,
greg k-h
Compiled and booted on my test system. No dmesg regressions.
Tested-by: Shuah Khan skhan@linuxfoundation.org
thanks, -- Shuah
On 10/4/21 5:50 AM, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.14.10 release. There are 172 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 06 Oct 2021 12:50:17 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.14.10-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.14.y and the diffstat can be found below.
thanks,
greg k-h
On ARCH_BRCMSTB using 32-bit and 64-bit ARM kernels:
Tested-by: Florian Fainelli f.fainelli@gmail.com
Similar to other comments., this commit:
net: mdiobus: Fix memory leak in __mdiobus_register
has since been reverted upstream. Thanks!
On Mon, Oct 04, 2021 at 02:50:50PM +0200, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.14.10 release. There are 172 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 06 Oct 2021 12:50:17 +0000. Anything received after that time might be too late.
Build results: total: 154 pass: 154 fail: 0 Qemu test results: total: 480 pass: 480 fail: 0
Tested-by: Guenter Roeck linux@roeck-us.net
Guenter
On Mon, Oct 04, 2021 at 07:20:13PM -0700, Guenter Roeck wrote:
On Mon, Oct 04, 2021 at 02:50:50PM +0200, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.14.10 release. There are 172 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 06 Oct 2021 12:50:17 +0000. Anything received after that time might be too late.
Build results: total: 154 pass: 154 fail: 0 Qemu test results: total: 480 pass: 480 fail: 0
I spoke too early. A large number of boot tests report the following warning tracebacks in this rc.
Guenter
==================================== WARNING: ip/134 still has locks held! 5.14.10-rc1-00173-gcda15f9c69e0 #1 Not tainted ------------------------------------ 1 lock held by ip/134: #0: c17acf0c (sk_lock-AF_INET){+.+.}-{0:0}, at: 0xc17c561c
stack backtrace: CPU: 0 PID: 134 Comm: ip Not tainted 5.14.10-rc1-00173-gcda15f9c69e0 #1 Call trace: [<(ptrval)>] dump_stack+0x34/0x48 [<(ptrval)>] debug_check_no_locks_held+0xc8/0xd0 [<(ptrval)>] do_exit+0x6a8/0xc2c [<(ptrval)>] do_group_exit+0x68/0x114 [<(ptrval)>] __wake_up_parent+0x0/0x34 [<(ptrval)>] _syscall_return+0x0/0x4
========================= WARNING: held lock freed! 5.14.10-rc1-00173-gcda15f9c69e0 #1 Not tainted ------------------------- ip/182 is freeing memory 9000000002400040-90000000024006bf, with a lock still held there! 9000000002400160 (sk_lock-AF_INET){+.+.}-{0:0}, at: sk_common_release+0x28/0x118 2 locks held by ip/182: #0: 90000000036a0280 (&sb->s_type->i_mutex_key#4){+.+.}-{3:3}, at: __sock_release+0x34/0xd8 #1: 9000000002400160 (sk_lock-AF_INET){+.+.}-{0:0}, at: sk_common_release+0x28/0x118
stack backtrace: CPU: 0 PID: 182 Comm: ip Not tainted 5.14.10-rc1-00173-gcda15f9c69e0 #1 Stack : 0000000000000000 0000000000000000 0000000000000008 627881c1d8d1d123 90000000021453c0 0000000000000000 90000000026cbb58 ffffffff80f3b950 ffffffff81088f80 0000000000000001 90000000026cb9a8 0000000000000000 0000000000000000 0000000000000010 ffffffff8071ee00 0000000000000000 0000000000000000 ffffffff81060000 0000000000000001 ffffffff80f3b950 9000000002ab2940 000000001000a4e0 000000007fe10f68 ffffffffffffffff 0000000000000000 0000000000000000 ffffffff807a51a0 0000000000052798 ffffffff81220000 90000000026c8000 90000000026cbb50 0000000000000000 ffffffff80c7b6f0 0000000000000000 90000000026cbc88 ffffffff8121e440 00000000000000b6 0000000000000000 ffffffff8010b304 627881c1d8d1d123 ... Call Trace: [<ffffffff8010b304>] show_stack+0x3c/0x120 [<ffffffff80c7b6f0>] dump_stack_lvl+0xa0/0xec [<ffffffff801a8228>] debug_check_no_locks_freed+0x168/0x1a0 [<ffffffff802d679c>] kmem_cache_free.part.0+0xf4/0x2b8 [<ffffffff809eb5d0>] __sk_destruct+0x150/0x230 [<ffffffff80b22094>] inet_release+0x4c/0x88 [<ffffffff809e2744>] __sock_release+0x44/0xd8 [<ffffffff809e27ec>] sock_close+0x14/0x28 [<ffffffff802e4d84>] __fput+0xac/0x2d8 [<ffffffff80163500>] task_work_run+0x90/0xf0 [<ffffffff8010a6e8>] do_notify_resume+0x348/0x360 [<ffffffff801039e8>] work_notifysig+0x10/0x18
On Mon, 4 Oct 2021 at 18:43, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 5.14.10 release. There are 172 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 06 Oct 2021 12:50:17 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.14.10-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.14.y and the diffstat can be found below.
thanks,
greg k-h
Regression found on arm. following kernel BUG reported on stable-rc linux-5.14.y while booting BeagleBoard X15 device.
metadata: git branch: linux-5.14.y git repo: https://gitlab.com/Linaro/lkft/mirrors/stable/linux-stable-rc git commit: cda15f9c69e08480d4308d0e5c62bd44324a9ff0 git describe: v5.14.9-173-gcda15f9c69e0 make_kernelversion: 5.14.10-rc1 kernel-config: https://builds.tuxbuild.com/1z2ozt5ntwqausJUkOMJ8zBgRqi/config
Kernel crash: -------------- [ 6.457366] kernel BUG at kernel/cpu.c:1065! [ 6.461669] Internal error: Oops - BUG: 0 [#1] SMP ARM [ 6.466827] Modules linked in: [ 6.469909] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.14.10-rc1 #1 [ 6.476318] Hardware name: Generic DRA74X (Flattened Device Tree) [ 6.482452] PC is at cpuhp_report_idle_dead+0x74/0x78 [ 6.487518] LR is at do_idle+0x108/0x310 [ 6.491485] pc : [<c035a700>] lr : [<c03a3cc0>] psr: 20000093 [ 6.497772] sp : c2101ee0 ip : c2101f00 fp : c2101efc [ 6.503051] r10: 00000000 r9 : c2100000 r8 : 00000000 [ 6.508300] r7 : c2108fe4 r6 : eeb04414 r5 : 2ca71000 r4 : c2093414 [ 6.514862] r3 : c2101ee0 r2 : 00000000 r1 : 000000e0 r0 : 00000000 [ 6.521423] Flags: nzCv IRQs off FIQs on Mode SVC_32 ISA ARM Segment none [ 6.528686] Control: 10c5387d Table: 8020406a DAC: 00000051 [ 6.534454] Register r0 information: NULL pointer [ 6.539215] Register r1 information: non-paged memory [ 6.544281] Register r2 information: NULL pointer [ 6.549011] Register r3 information: non-slab/vmalloc memory [ 6.554718] Register r4 information: non-slab/vmalloc memory [ 6.560424] Register r5 information: non-paged memory [ 6.565490] Register r6 information: non-slab/vmalloc memory [ 6.571197] Register r7 information: non-slab/vmalloc memory [ 6.576904] Register r8 information: NULL pointer [ 6.581634] Register r9 information: non-slab/vmalloc memory [ 6.587310] Register r10 information: NULL pointer [ 6.592132] Register r11 information: non-slab/vmalloc memory [ 6.597930] Register r12 information: non-slab/vmalloc memory [ 6.603698] Process swapper/0 (pid: 0, stack limit = 0x(ptrval)) [ 6.609741] Stack: (0xc2101ee0 to 0xc2102000) [ 6.614135] 1ee0: 00000000 c2100000 c2108f90 c2108fe4 c2101f44 c2101f00 c03a3cc0 c035a698 [ 6.622375] 1f00: c210e0c0 00000002 c2101f00 c209f0f0 c2100000 5a76ee47 c2101f3c 000000e0 [ 6.630584] 1f20: c2101f58 00000002 c2108f40 00000000 c2100000 10c5387d c2101f54 c2101f48 [ 6.638824] 1f40: c03a4334 c03a3bc4 c2101f84 c2101f58 c1556b84 c03a4318 00000000 00000000 [ 6.647033] 1f60: c1556a88 c2101f70 c0322004 c239e068 c1fd7a80 ffffffff c2101f94 c2101f88 [ 6.655273] 1f80: c1f00af0 c15569c0 c2101ff4 c2101f98 c1f012a0 c1f00ae4 ffffffff ffffffff [ 6.663482] 1fa0: 00000000 c1f00640 00000000 00000000 00000000 00000000 00000000 c1fd7a80 [ 6.671722] 1fc0: 5a73e04d 00000000 00000000 c1f00330 00000051 10c0387d 00000000 8ffda000 [ 6.679931] 1fe0: 412fc0f2 10c5387d 00000000 c2101ff8 00000000 c1f00b78 00000000 00000000 [ 6.688140] Backtrace: [ 6.690612] [<c035a68c>] (cpuhp_report_idle_dead) from [<c03a3cc0>] (do_idle+0x108/0x310) [ 6.698852] r7:c2108fe4 r6:c2108f90 r5:c2100000 r4:00000000 [ 6.704559] [<c03a3bb8>] (do_idle) from [<c03a4334>] (cpu_startup_entry+0x28/0x2c) [ 6.712188] r10:10c5387d r9:c2100000 r8:00000000 r7:c2108f40 r6:00000002 r5:c2101f58 [ 6.720031] r4:000000e0 [ 6.722595] [<c03a430c>] (cpu_startup_entry) from [<c1556b84>] (rest_init+0x1d0/0x2fc) [ 6.730560] [<c15569b4>] (rest_init) from [<c1f00af0>] (arch_call_rest_init+0x18/0x1c) [ 6.738555] r6:ffffffff r5:c1fd7a80 r4:c239e068 [ 6.743194] [<c1f00ad8>] (arch_call_rest_init) from [<c1f012a0>] (start_kernel+0x734/0x780) [ 6.751586] [<c1f00b6c>] (start_kernel) from [<00000000>] (0x0) [ 6.757568] r10:10c5387d r9:412fc0f2 r8:8ffda000 r7:00000000 r6:10c0387d r5:00000051 [ 6.765441] r4:c1f00330 [ 6.767974] Code: e34c1035 e3a03000 eb039b62 e89da8f0 (e7f001f2) [ 6.774108] ---[ end trace 8e1c46209559ca84 ]---
Reported-by: Linux Kernel Functional Testing lkft@linaro.org
ref: https://lkft.validation.linaro.org/scheduler/job/3659621#L2841
On Tue, Oct 05, 2021 at 09:30:27AM +0530, Naresh Kamboju wrote:
On Mon, 4 Oct 2021 at 18:43, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 5.14.10 release. There are 172 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 06 Oct 2021 12:50:17 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.14.10-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.14.y and the diffstat can be found below.
thanks,
greg k-h
Regression found on arm. following kernel BUG reported on stable-rc linux-5.14.y while booting BeagleBoard X15 device.
metadata: git branch: linux-5.14.y git repo: https://gitlab.com/Linaro/lkft/mirrors/stable/linux-stable-rc git commit: cda15f9c69e08480d4308d0e5c62bd44324a9ff0 git describe: v5.14.9-173-gcda15f9c69e0 make_kernelversion: 5.14.10-rc1 kernel-config: https://builds.tuxbuild.com/1z2ozt5ntwqausJUkOMJ8zBgRqi/config
Kernel crash:
[ 6.457366] kernel BUG at kernel/cpu.c:1065! [ 6.461669] Internal error: Oops - BUG: 0 [#1] SMP ARM [ 6.466827] Modules linked in: [ 6.469909] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.14.10-rc1 #1 [ 6.476318] Hardware name: Generic DRA74X (Flattened Device Tree) [ 6.482452] PC is at cpuhp_report_idle_dead+0x74/0x78 [ 6.487518] LR is at do_idle+0x108/0x310 [ 6.491485] pc : [<c035a700>] lr : [<c03a3cc0>] psr: 20000093 [ 6.497772] sp : c2101ee0 ip : c2101f00 fp : c2101efc [ 6.503051] r10: 00000000 r9 : c2100000 r8 : 00000000 [ 6.508300] r7 : c2108fe4 r6 : eeb04414 r5 : 2ca71000 r4 : c2093414 [ 6.514862] r3 : c2101ee0 r2 : 00000000 r1 : 000000e0 r0 : 00000000 [ 6.521423] Flags: nzCv IRQs off FIQs on Mode SVC_32 ISA ARM Segment none [ 6.528686] Control: 10c5387d Table: 8020406a DAC: 00000051 [ 6.534454] Register r0 information: NULL pointer [ 6.539215] Register r1 information: non-paged memory [ 6.544281] Register r2 information: NULL pointer [ 6.549011] Register r3 information: non-slab/vmalloc memory [ 6.554718] Register r4 information: non-slab/vmalloc memory [ 6.560424] Register r5 information: non-paged memory [ 6.565490] Register r6 information: non-slab/vmalloc memory [ 6.571197] Register r7 information: non-slab/vmalloc memory [ 6.576904] Register r8 information: NULL pointer [ 6.581634] Register r9 information: non-slab/vmalloc memory [ 6.587310] Register r10 information: NULL pointer [ 6.592132] Register r11 information: non-slab/vmalloc memory [ 6.597930] Register r12 information: non-slab/vmalloc memory [ 6.603698] Process swapper/0 (pid: 0, stack limit = 0x(ptrval)) [ 6.609741] Stack: (0xc2101ee0 to 0xc2102000) [ 6.614135] 1ee0: 00000000 c2100000 c2108f90 c2108fe4 c2101f44 c2101f00 c03a3cc0 c035a698 [ 6.622375] 1f00: c210e0c0 00000002 c2101f00 c209f0f0 c2100000 5a76ee47 c2101f3c 000000e0 [ 6.630584] 1f20: c2101f58 00000002 c2108f40 00000000 c2100000 10c5387d c2101f54 c2101f48 [ 6.638824] 1f40: c03a4334 c03a3bc4 c2101f84 c2101f58 c1556b84 c03a4318 00000000 00000000 [ 6.647033] 1f60: c1556a88 c2101f70 c0322004 c239e068 c1fd7a80 ffffffff c2101f94 c2101f88 [ 6.655273] 1f80: c1f00af0 c15569c0 c2101ff4 c2101f98 c1f012a0 c1f00ae4 ffffffff ffffffff [ 6.663482] 1fa0: 00000000 c1f00640 00000000 00000000 00000000 00000000 00000000 c1fd7a80 [ 6.671722] 1fc0: 5a73e04d 00000000 00000000 c1f00330 00000051 10c0387d 00000000 8ffda000 [ 6.679931] 1fe0: 412fc0f2 10c5387d 00000000 c2101ff8 00000000 c1f00b78 00000000 00000000 [ 6.688140] Backtrace: [ 6.690612] [<c035a68c>] (cpuhp_report_idle_dead) from [<c03a3cc0>] (do_idle+0x108/0x310) [ 6.698852] r7:c2108fe4 r6:c2108f90 r5:c2100000 r4:00000000 [ 6.704559] [<c03a3bb8>] (do_idle) from [<c03a4334>] (cpu_startup_entry+0x28/0x2c) [ 6.712188] r10:10c5387d r9:c2100000 r8:00000000 r7:c2108f40 r6:00000002 r5:c2101f58 [ 6.720031] r4:000000e0 [ 6.722595] [<c03a430c>] (cpu_startup_entry) from [<c1556b84>] (rest_init+0x1d0/0x2fc) [ 6.730560] [<c15569b4>] (rest_init) from [<c1f00af0>] (arch_call_rest_init+0x18/0x1c) [ 6.738555] r6:ffffffff r5:c1fd7a80 r4:c239e068 [ 6.743194] [<c1f00ad8>] (arch_call_rest_init) from [<c1f012a0>] (start_kernel+0x734/0x780) [ 6.751586] [<c1f00b6c>] (start_kernel) from [<00000000>] (0x0) [ 6.757568] r10:10c5387d r9:412fc0f2 r8:8ffda000 r7:00000000 r6:10c0387d r5:00000051 [ 6.765441] r4:c1f00330 [ 6.767974] Code: e34c1035 e3a03000 eb039b62 e89da8f0 (e7f001f2) [ 6.774108] ---[ end trace 8e1c46209559ca84 ]---
Reported-by: Linux Kernel Functional Testing lkft@linaro.org
ref: https://lkft.validation.linaro.org/scheduler/job/3659621#L2841
-- Linaro LKFT https://lkft.linaro.org
Any chance at bisection?
On Mon, 4 Oct 2021 at 18:43, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 5.14.10 release. There are 172 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 06 Oct 2021 12:50:17 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.14.10-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.14.y and the diffstat can be found below.
thanks,
greg k-h
Regression found on i386 and x86. following kernel warning reported on stable-rc linux-5.14.y while booting x86.
metadata: git branch: linux-5.14.y git repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git git commit: cda15f9c69e08480d4308d0e5c62bd44324a9ff0 git describe: v5.14.9-173-gcda15f9c69e0 make_kernelversion: 5.14.10-rc1 kernel-config: http://snapshots.linaro.org/openembedded/lkft/lkft/sumo/intel-corei7-64/lkft...
Kernel crash: -------------- [ 22.356755] ========================= [ 22.360414] WARNING: held lock freed! [ 22.364071] 5.14.10-rc1 #1 Not tainted [ 22.367824] ------------------------- [ 22.371489] systemd-network/341 is freeing memory ffff9d21cbea0000-ffff9d21cbea06bf, with a lock still held there! [ 22.381828] ffff9d21cbea0120 (sk_lock-AF_INET){+.+.}-{0:0}, at: sk_common_release+0x21/0x100 [ 22.384624] igb 0000:02:00.0 eno2: renamed from eth1 [ 22.390260] 2 locks held by systemd-network/341: [ 22.390261] #0: ffff9d21c406eb10 (&sb->s_type->i_mutex_key#6){+.+.}-{3:3}, at: __sock_release+0x32/0xc0 [ 22.390267] #1: ffff9d21cbea0120 (sk_lock-AF_INET){+.+.}-{0:0}, at: sk_common_release+0x21/0x100 [ 22.390272] [ 22.390272] stack backtrace: [ 22.390273] CPU: 2 PID: 341 Comm: systemd-network Not tainted 5.14.10-rc1 #1 [ 22.390275] Hardware name: Supermicro SYS-5019S-ML/X11SSH-F, BIOS 2.0b 07/27/2017 [ 22.390276] Call Trace: [ 22.390278] dump_stack_lvl+0x49/0x5e [ 22.390281] dump_stack+0x10/0x12 [ 22.390283] debug_check_no_locks_freed+0x111/0x120 [ 22.390286] slab_free_freelist_hook+0x119/0x1d0 [ 22.390289] kmem_cache_free+0x102/0x540 [ 22.390291] ? __sk_destruct+0x145/0x210 [ 22.390294] __sk_destruct+0x145/0x210 [ 22.390296] sk_destruct+0x48/0x50 [ 22.409489] ata_id (348) used greatest stack depth: 11952 bytes left [ 22.418194] __sk_free+0x2f/0xc0 [ 22.418198] sk_free+0x26/0x40 [ 22.418200] sk_common_release+0xa9/0x100 [ 22.487567] udp_lib_close+0x9/0x10 [ 22.491057] inet_release+0x44/0x80 [ 22.494542] __sock_release+0x42/0xc0 [ 22.498209] sock_close+0x18/0x20 [ 22.501525] __fput+0xb5/0x260 [ 22.504578] ____fput+0xe/0x10 [ 22.507636] task_work_run+0x6f/0xc0 [ 22.511216] exit_to_user_mode_prepare+0x1f6/0x200 [ 22.515999] syscall_exit_to_user_mode+0x1d/0x50 [ 22.520610] do_syscall_64+0x67/0x80 [ 22.524190] ? irqentry_exit+0x75/0x80 [ 22.527940] ? exc_page_fault+0x6c/0x200 [ 22.531859] ? asm_exc_page_fault+0x8/0x30 [ 22.535950] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 22.541002] RIP: 0033:0x7f52fe67e641 [ 22.544580] Code: f7 d8 64 89 02 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 8b 05 aa cd 20 00 85 c0 75 16 b8 03 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 3f c3 66 0f 1f 44 00 00 53 89 fb 48 83 ec 10 [ 22.563318] RSP: 002b:00007ffe31761878 EFLAGS: 00000246 ORIG_RAX: 0000000000000003 [ 22.570884] RAX: 0000000000000000 RBX: 0000000000000010 RCX: 00007f52fe67e641 [ 22.578007] RDX: 00000000000073b0 RSI: 0000000000000000 RDI: 0000000000000010 [ 22.585130] RBP: 00007f52feed8338 R08: 000055893cf103c3 R09: 0000000000000078 [ 22.592253] R10: 0000000000000004 R11: 0000000000000246 R12: 0000000000000000 [ 22.599379] R13: 00007ffe317618c0 R14: 0000000000000000 R15: 00007ffe31762a60
Reported-by: Linux Kernel Functional Testing lkft@linaro.org
ref: https://lkft.validation.linaro.org/scheduler/job/3658919#L1477
On Tue, Oct 05, 2021 at 09:34:22AM +0530, Naresh Kamboju wrote:
On Mon, 4 Oct 2021 at 18:43, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 5.14.10 release. There are 172 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 06 Oct 2021 12:50:17 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.14.10-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.14.y and the diffstat can be found below.
thanks,
greg k-h
Regression found on i386 and x86. following kernel warning reported on stable-rc linux-5.14.y while booting x86.
metadata: git branch: linux-5.14.y git repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git git commit: cda15f9c69e08480d4308d0e5c62bd44324a9ff0 git describe: v5.14.9-173-gcda15f9c69e0 make_kernelversion: 5.14.10-rc1 kernel-config: http://snapshots.linaro.org/openembedded/lkft/lkft/sumo/intel-corei7-64/lkft...
Kernel crash:
[ 22.356755] ========================= [ 22.360414] WARNING: held lock freed! [ 22.364071] 5.14.10-rc1 #1 Not tainted [ 22.367824] ------------------------- [ 22.371489] systemd-network/341 is freeing memory ffff9d21cbea0000-ffff9d21cbea06bf, with a lock still held there! [ 22.381828] ffff9d21cbea0120 (sk_lock-AF_INET){+.+.}-{0:0}, at: sk_common_release+0x21/0x100 [ 22.384624] igb 0000:02:00.0 eno2: renamed from eth1 [ 22.390260] 2 locks held by systemd-network/341: [ 22.390261] #0: ffff9d21c406eb10 (&sb->s_type->i_mutex_key#6){+.+.}-{3:3}, at: __sock_release+0x32/0xc0 [ 22.390267] #1: ffff9d21cbea0120 (sk_lock-AF_INET){+.+.}-{0:0}, at: sk_common_release+0x21/0x100 [ 22.390272] [ 22.390272] stack backtrace: [ 22.390273] CPU: 2 PID: 341 Comm: systemd-network Not tainted 5.14.10-rc1 #1 [ 22.390275] Hardware name: Supermicro SYS-5019S-ML/X11SSH-F, BIOS 2.0b 07/27/2017 [ 22.390276] Call Trace: [ 22.390278] dump_stack_lvl+0x49/0x5e [ 22.390281] dump_stack+0x10/0x12 [ 22.390283] debug_check_no_locks_freed+0x111/0x120 [ 22.390286] slab_free_freelist_hook+0x119/0x1d0 [ 22.390289] kmem_cache_free+0x102/0x540 [ 22.390291] ? __sk_destruct+0x145/0x210 [ 22.390294] __sk_destruct+0x145/0x210 [ 22.390296] sk_destruct+0x48/0x50 [ 22.409489] ata_id (348) used greatest stack depth: 11952 bytes left [ 22.418194] __sk_free+0x2f/0xc0 [ 22.418198] sk_free+0x26/0x40 [ 22.418200] sk_common_release+0xa9/0x100 [ 22.487567] udp_lib_close+0x9/0x10 [ 22.491057] inet_release+0x44/0x80 [ 22.494542] __sock_release+0x42/0xc0 [ 22.498209] sock_close+0x18/0x20 [ 22.501525] __fput+0xb5/0x260 [ 22.504578] ____fput+0xe/0x10 [ 22.507636] task_work_run+0x6f/0xc0 [ 22.511216] exit_to_user_mode_prepare+0x1f6/0x200 [ 22.515999] syscall_exit_to_user_mode+0x1d/0x50 [ 22.520610] do_syscall_64+0x67/0x80 [ 22.524190] ? irqentry_exit+0x75/0x80 [ 22.527940] ? exc_page_fault+0x6c/0x200 [ 22.531859] ? asm_exc_page_fault+0x8/0x30 [ 22.535950] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 22.541002] RIP: 0033:0x7f52fe67e641 [ 22.544580] Code: f7 d8 64 89 02 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 8b 05 aa cd 20 00 85 c0 75 16 b8 03 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 3f c3 66 0f 1f 44 00 00 53 89 fb 48 83 ec 10 [ 22.563318] RSP: 002b:00007ffe31761878 EFLAGS: 00000246 ORIG_RAX: 0000000000000003 [ 22.570884] RAX: 0000000000000000 RBX: 0000000000000010 RCX: 00007f52fe67e641 [ 22.578007] RDX: 00000000000073b0 RSI: 0000000000000000 RDI: 0000000000000010 [ 22.585130] RBP: 00007f52feed8338 R08: 000055893cf103c3 R09: 0000000000000078 [ 22.592253] R10: 0000000000000004 R11: 0000000000000246 R12: 0000000000000000 [ 22.599379] R13: 00007ffe317618c0 R14: 0000000000000000 R15: 00007ffe31762a60
Reported-by: Linux Kernel Functional Testing lkft@linaro.org
ref: https://lkft.validation.linaro.org/scheduler/job/3658919#L1477
-- Linaro LKFT https://lkft.linaro.org
Any chance at bisection?
linux-stable-mirror@lists.linaro.org