This is the start of the stable review cycle for the 6.5.7 release. There are 163 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 11 Oct 2023 13:00:55 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.5.7-rc1.g... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.5.y and the diffstat can be found below.
thanks,
greg k-h
------------- Pseudo-Shortlog of commits:
Greg Kroah-Hartman gregkh@linuxfoundation.org Linux 6.5.7-rc1
Leon Romanovsky leon@kernel.org RDMA/mlx5: Remove not-used cache disable flag
Namjae Jeon linkinjeon@kernel.org ksmbd: fix race condition from parallel smb2 lock requests
luosili rootlab@huawei.com ksmbd: fix uaf in smb20_oplock_break_ack
Namjae Jeon linkinjeon@kernel.org ksmbd: fix race condition between session lookup and expire
Tom Lendacky thomas.lendacky@amd.com x86/sev: Use the GHCB protocol when available for SNP CPUID requests
Tom Lendacky thomas.lendacky@amd.com x86/sev: Change npages to unsigned long in snp_accept_memory()
Kailang Yang kailang@realtek.com ALSA: hda/realtek - Fixed two speaker platform
Colin Ian King colin.i.king@gmail.com ALSA: hda/realtek: Fix spelling mistake "powe" -> "power"
Shay Drory shayd@nvidia.com RDMA/mlx5: Fix NULL string error
Hamdan Igbaria hamdani@nvidia.com RDMA/mlx5: Fix mutex unlocking on error flow for steering anchor creation
Michael Guralnik michaelgur@nvidia.com RDMA/mlx5: Fix assigning access flags to cache mkeys
Shay Drory shayd@nvidia.com RDMA/mlx5: Fix mkey cache possible deadlock on cleanup
Bernard Metzler bmt@zurich.ibm.com RDMA/siw: Fix connection failure handling
Bart Van Assche bvanassche@acm.org RDMA/srp: Do not call scsi_done() from srp_abort()
Konstantin Meskhidze konstantin.meskhidze@huawei.com RDMA/uverbs: Fix typo of sizeof argument
Selvin Xavier selvin.xavier@broadcom.com RDMA/bnxt_re: Fix the handling of control path response data
Leon Romanovsky leon@kernel.org RDMA/cma: Fix truncation compilation warning in make_cma_ports
Mark Zhang markzhang@nvidia.com RDMA/cma: Initialize ib_sa_multicast structure to 0 when join
Duje Mihanović duje.mihanovic@skole.hr gpio: pxa: disable pinctrl calls for MMP_GPIO
Bartosz Golaszewski bartosz.golaszewski@linaro.org gpio: aspeed: fix the GPIO number passed to pinctrl_gpio_set_config()
Christophe JAILLET christophe.jaillet@wanadoo.fr IB/mlx4: Fix the size of a buffer in add_port_entries()
Dan Carpenter dan.carpenter@linaro.org of: dynamic: Fix potential memory leak in of_changeset_action()
Leon Romanovsky leon@kernel.org RDMA/core: Require admin capabilities to set system parameters
Fedor Pchelkin pchelkin@ispras.ru dm zoned: free dmz->ddev array in dmz_put_zoned_devices
Helge Deller deller@gmx.de parisc: Fix crash with nr_cpus=1 option
Jordan Rife jrife@google.com smb: use kernel_connect() and kernel_bind()
John David Anglin dave@parisc-linux.org parisc: Restore __ldcw_align for PA-RISC 2.0 processors
Randy Dunlap rdunlap@infradead.org net: lan743x: also select PHYLIB
Srinivas Pandruvada srinivas.pandruvada@linux.intel.com HID: intel-ish-hid: ipc: Disable and reenable ACPI GPE bit
Jiri Kosina jkosina@suse.cz HID: sony: remove duplicate NULL check before calling usb_free_urb()
Christophe JAILLET christophe.jaillet@wanadoo.fr HID: nvidia-shield: Fix a missing led_classdev_unregister() in the probe error handling path
Haiyang Zhang haiyangz@microsoft.com net: mana: Fix oversized sge0 for GSO packets
Haiyang Zhang haiyangz@microsoft.com net: mana: Fix the tso_bytes calculation
Eric Dumazet edumazet@google.com netlink: annotate data-races around sk->sk_err
Xin Long lucien.xin@gmail.com sctp: update hb timer immediately after users change hb_interval
Xin Long lucien.xin@gmail.com sctp: update transport state when processing a dupcook packet
Neal Cardwell ncardwell@google.com tcp: fix delayed ACKs for MSS boundary condition
Neal Cardwell ncardwell@google.com tcp: fix quick-ack counting to count actual ACKs of new data
Chengfeng Ye dg573847474@gmail.com tipc: fix a potential deadlock on &tx->lock
Ben Wolsieffer ben.wolsieffer@hefring.com net: stmmac: dwmac-stm32: fix resume on STM32 MCU
Benjamin Poirier bpoirier@nvidia.com ipv4: Set offload_failed flag in fibmatch results
Florian Westphal fw@strlen.de netfilter: nf_tables: nft_set_rbtree: fix spurious insertion failure
Phil Sutter phil@nwl.cc netfilter: nf_tables: Deduplicate nft_register_obj audit logs
Phil Sutter phil@nwl.cc selftests: netfilter: Extend nft_audit.sh
Phil Sutter phil@nwl.cc selftests: netfilter: Test nf_tables audit logging
Xin Long lucien.xin@gmail.com netfilter: handle the connecting collision properly in nf_conntrack_proto_sctp
Florian Westphal fw@strlen.de netfilter: nft_payload: rebuild vlan header on h_proto access
David Wilder dwilder@us.ibm.com ibmveth: Remove condition to recompute TCP header checksum.
Dan Carpenter dan.carpenter@linaro.org net: ethernet: ti: am65-cpsw: Fix error code in am65_cpsw_nuss_init_tx_chns()
Yoshihiro Shimoda yoshihiro.shimoda.uh@renesas.com rswitch: Fix PHY station management clock setting
Yoshihiro Shimoda yoshihiro.shimoda.uh@renesas.com net: renesas: rswitch: Add spin lock protection for irq {un}mask
Jeremy Cline jeremy@jcline.org net: nfc: llcp: Add lock when modifying device list
Parthiban Veerasooran Parthiban.Veerasooran@microchip.com ethtool: plca: fix plca enable data type while parsing the value
Shigeru Yoshida syoshida@redhat.com net: usb: smsc75xx: Fix uninit-value access in __smsc75xx_read_reg
Ilya Maximets i.maximets@ovn.org ipv6: tcp: add a missing nf_reset_ct() in 3WHS handling
Al Viro viro@zeniv.linux.org.uk ovl: fetch inode once in ovl_dentry_revalidate_common()
Al Viro viro@zeniv.linux.org.uk ovl: move freeing ovl_entry past rcu delay
Fabio Estevam festevam@denx.de net: dsa: mv88e6xxx: Avoid EEPROM timeout when EEPROM is absent
Dinghao Liu dinghao.liu@zju.edu.cn ptp: ocp: Fix error handling in ptp_ocp_device_init
David Howells dhowells@redhat.com ipv4, ipv6: Fix handling of transhdrlen in __ip{,6}_append_data()
Eric Dumazet edumazet@google.com neighbour: fix data-races around n->output
Eric Dumazet edumazet@google.com net: fix possible store tearing in neigh_periodic_work()
Clark Wang xiaoning.wang@nxp.com net: stmmac: platform: fix the incorrect parameter
Mauricio Faria de Oliveira mfo@canonical.com modpost: add missing else to the "of" check
Jakub Sitnicki jakub@cloudflare.com bpf, sockmap: Reject sk_msg egress redirects to non-TCP sockets
John Fastabend john.fastabend@gmail.com bpf, sockmap: Do not inc copied_seq when PEEK flag set
John Fastabend john.fastabend@gmail.com bpf: tcp_read_skb needs to pop skb regardless of seq
Michal Schmidt mschmidt@redhat.com ice: always add legacy 32byte RXDID in supported_rxdids
Trond Myklebust trond.myklebust@hammerspace.com NFSv4: Fix a nfs4_state_manager() race
Arnd Bergmann arnd@arndb.de ima: rework CONFIG_IMA dependency block
Junxiao Bi junxiao.bi@oracle.com scsi: target: core: Fix deadlock due to recursive locking
Ilan Peer ilan.peer@intel.com wifi: iwlwifi: mvm: Fix incorrect usage of scan API
Oleksandr Tymoshenko ovt@google.com ima: Finish deprecation of IMA_TRUSTED_KEYRING Kconfig
Michał Mirosław mirq-linux@rere.qmqm.pl regulator/core: regulator_register: set device->class earlier
Benjamin Berg benjamin.berg@intel.com wifi: mac80211: Create resources for disabled links
Yong Wu yong.wu@mediatek.com iommu/mediatek: Fix share pgtable for iova over 4GB
Breno Leitao leitao@debian.org perf/x86/amd: Do not WARN() on every IRQ
Johannes Berg johannes.berg@intel.com wifi: mac80211: fix potential key use-after-free
Richard Fitzgerald rf@opensource.cirrus.com regmap: rbtree: Fix wrong register marked as in-cache when creating new node
Daniel Bristot de Oliveira bristot@kernel.org rtla/timerlat: Do not stop user-space if a cpu is offline
Sandipan Das sandipan.das@amd.com perf/x86/amd/core: Fix overflow reset on hotplug
Felix Fietkau nbd@nbd.name wifi: mt76: mt76x02: fix MT76x0 external LNA gain handling
Alexandra Diupina adiupina@astralinux.ru drivers/net: process the result of hdlc_open() and add call of hdlc_close() in uhdlc_close()
Luiz Augusto von Dentz luiz.von.dentz@intel.com Bluetooth: ISO: Fix handling of listen for unicast
Ying Hsu yinghsu@chromium.org Bluetooth: Fix hci_link_tx_to RCU lock usage
Yao Xiao xiaoyao@rock-chips.com Bluetooth: Delete unused hci_req_prepare_suspend() declaration
Chen-Yu Tsai wenst@chromium.org regulator: mt6358: split ops for buck and linear range LDO regulators
Andrii Nakryiko andrii@kernel.org bpf: unconditionally reset backtrack_state masks on global func exit
Leon Hwang hffilwlqm@gmail.com bpf: Fix tr dereferencing
Marek Behún kabel@kernel.org leds: Drop BUG_ON check for LED_COLOR_ID_MULTI
Song Liu song@kernel.org s390/bpf: Let arch_prepare_bpf_trampoline return program size
Jingbo Xu jefflexu@linux.alibaba.com erofs: allow empty device tags in flatdev mode
Randy Dunlap rdunlap@infradead.org HID: nvidia-shield: add LEDS_CLASS dependency
Pin-yen Lin treapking@chromium.org wifi: mwifiex: Fix oob check condition in mwifiex_process_rx_packet
Felix Fietkau nbd@nbd.name wifi: mac80211: fix mesh id corruption on 32 bit systems
Johannes Berg johannes.berg@intel.com wifi: cfg80211: add missing kernel-doc for cqm_rssi_work
Daniel Bristot de Oliveira bristot@kernel.org rtla/timerlat_aa: Fix previous IRQ delay for IRQs that happens after thread sample
Daniel Bristot de Oliveira bristot@kernel.org rtla/timerlat_aa: Fix negative IRQ delay
Daniel Bristot de Oliveira bristot@kernel.org rtla/timerlat_aa: Zero thread sum after every sample analysis
Johannes Berg johannes.berg@intel.com wifi: cfg80211: fix cqm_config access race
Christophe JAILLET christophe.jaillet@wanadoo.fr wifi: iwlwifi: mvm: Fix a memory corruption issue
Arnd Bergmann arnd@arndb.de wifi: iwlwifi: dbg_ini: fix structure packing
Gregory Greenman gregory.greenman@intel.com iwlwifi: mvm: handle PS changes in vif_cfg_changed
Wen Gong quic_wgong@quicinc.com wifi: cfg80211/mac80211: hold link BSSes when assoc fails for MLO connection
Gao Xiang xiang@kernel.org erofs: fix memory leak of LZMA global compressed deduplication
Zhihao Cheng chengzhihao1@huawei.com ubi: Refuse attaching if mtd's erasesize is 0
Lorenzo Bianconi lorenzo@kernel.org wifi: mt76: fix lock dependency problem for wed_lock
Christophe JAILLET christophe.jaillet@wanadoo.fr HID: sony: Fix a potential memory leak in sony_probe()
Rob Herring robh@kernel.org arm64: errata: Add Cortex-A520 speculative unprivileged load workaround
Rob Herring robh@kernel.org arm64: Add Cortex-A520 CPU part definition
Mario Limonciello mario.limonciello@amd.com drm/amd: Fix logic error in sienna_cichlid_update_pcie_parameters()
Mario Limonciello mario.limonciello@amd.com drm/amd: Fix detection of _PR3 on the PCIe root port
Nirmoy Das nirmoy.das@intel.com drm/i915: Don't set PIPE_CONTROL_FLUSH_L3 for aux inval
Jordan Rife jrife@google.com net: prevent rewrite of msg_name in sock_sendmsg()
Qu Wenruo wqu@suse.com btrfs: reject unknown mount options early
Filipe Manana fdmanana@suse.com btrfs: always print transaction aborted messages with an error level
Jens Axboe axboe@kernel.dk io_uring: ensure io_lockdep_assert_cq_locked() handles disabled rings
Jens Axboe axboe@kernel.dk io_uring/kbuf: don't allow registered buffer rings on highmem pages
Jordan Rife jrife@google.com net: replace calls to sock->ops->connect() with kernel_connect()
Jithu Joseph jithu.joseph@intel.com platform/x86/intel/ifs: release cpus_read_lock()
Sricharan Ramabadhran quic_srichara@quicinc.com PCI: qcom: Fix IPQ8074 enumeration
Mika Westerberg mika.westerberg@linux.intel.com PCI/PM: Mark devices disconnected if upstream PCIe link is down on resume
David Jeffery djeffery@redhat.com md/raid5: release batch_last before waiting for another stripe_head
Jens Axboe axboe@kernel.dk io_uring: don't allow IORING_SETUP_NO_MMAP rings on highmem pages
Gustavo A. R. Silva gustavoars@kernel.org wifi: mwifiex: Fix tlv_buf_left calculation
Sascha Hauer s.hauer@pengutronix.de wifi: rtw88: rtw8723d: Fix MAC address offset in EEPROM
Luiz Augusto von Dentz luiz.von.dentz@intel.com Bluetooth: hci_sync: Fix handling of HCI_QUIRK_STRICT_DUPLICATE_FILTER
Juerg Haefliger juerg.haefliger@canonical.com wifi: brcmfmac: Replace 1-element arrays with flexible arrays
Luiz Augusto von Dentz luiz.von.dentz@intel.com Bluetooth: hci_codec: Fix leaking content of local_codecs
Gustavo A. R. Silva gustavoars@kernel.org qed/red_ll2: Fix undefined behavior bug in struct qed_ll2_info
Geliang Tang geliang.tang@suse.com mptcp: userspace pm allow creating id 0 subflow
Paolo Abeni pabeni@redhat.com mptcp: fix delegated action races
Christian Marangi ansuelsmth@gmail.com net: ethernet: mediatek: disable irq before schedule napi
Stefano Garzarella sgarzare@redhat.com vringh: don't use vringh_kiov_advance() in vringh_iov_xfer()
Haiyang Zhang haiyangz@microsoft.com net: mana: Fix TX CQE error handling
Zhang Rui rui.zhang@intel.com iommu/vt-d: Avoid memory allocation in iommu_suspend()
Dinghao Liu dinghao.liu@zju.edu.cn scsi: zfcp: Fix a double put in zfcp_port_enqueue()
Hector Martin marcan@marcan.st iommu/apple-dart: Handle DMA_FQ domains in attach_dev()
Liam R. Howlett Liam.Howlett@oracle.com maple_tree: add MAS_UNDERFLOW and MAS_OVERFLOW states
Liam R. Howlett Liam.Howlett@oracle.com maple_tree: reduce resets during store setup
Robin Murphy robin.murphy@arm.com iommu/arm-smmu-v3: Avoid constructing invalid range commands
Patrick Rohr prohr@google.com net: release reference to inet6_dev pointer
Patrick Rohr prohr@google.com net: change accept_ra_min_rtr_lft to affect all RA lifetimes
Patrick Rohr prohr@google.com net: add sysctl accept_ra_min_rtr_lft
Kristina Martsenko kristina.martsenko@arm.com arm64: cpufeature: Fix CLRBHB and BC detection
Joey Gouly joey.gouly@arm.com arm64: add HWCAP for FEAT_HBC (hinted conditional branches)
Josef Bacik josef@toxicpanda.com btrfs: don't clear uptodate on write errors
Christoph Hellwig hch@lst.de btrfs: remove end_extent_writepage
Christoph Hellwig hch@lst.de btrfs: remove btrfs_writepage_endio_finish_ordered
Damien Le Moal dlemoal@kernel.org ata: libata-scsi: Fix delayed scsi_rescan_device() execution
Damien Le Moal dlemoal@kernel.org scsi: Do not attempt to rescan suspended devices
Bart Van Assche bvanassche@acm.org scsi: core: Improve type safety of scsi_rescan_device()
Paolo Abeni pabeni@redhat.com mptcp: fix dangling connection hang-up
Paolo Abeni pabeni@redhat.com mptcp: rename timer related helper to less confusing names
Kuniyuki Iwashima kuniyu@amazon.com mptcp: Remove unnecessary test for __mptcp_init_sock()
Liam R. Howlett Liam.Howlett@oracle.com maple_tree: add mas_is_active() to detect in-tree walks
Sameer Pujar spujar@nvidia.com ASoC: tegra: Fix redundant PLLA and PLLA_OUT0 updates
Sameer Pujar spujar@nvidia.com ASoC: soc-utils: Export snd_soc_dai_is_dummy() symbol
Kailang Yang kailang@realtek.com ALSA: hda/realtek - ALC287 Realtek I2S speaker platform support
Kailang Yang kailang@realtek.com ALSA: hda/realtek - ALC287 I2S speaker platform support
Fabian Vogt fabian@ritter-vogt.de ALSA: hda/realtek: Add quirk for mute LEDs on HP ENVY x360 15-eu0xxx
SungHwan Jung onenowy@gmail.com ALSA: hda/realtek: Add quirk for HP Victus 16-d1xxx to enable mute LED
Shenghao Ding shenghao-ding@ti.com ALSA: hda/tas2781: Add tas2781 HDA driver
-------------
Diffstat:
Documentation/arch/arm64/silicon-errata.rst | 2 + Documentation/networking/ip-sysctl.rst | 8 + Makefile | 4 +- arch/arm64/Kconfig | 13 ++ arch/arm64/include/asm/cpufeature.h | 2 +- arch/arm64/include/asm/cputype.h | 2 + arch/arm64/include/asm/hwcap.h | 1 + arch/arm64/include/uapi/asm/hwcap.h | 1 + arch/arm64/kernel/cpu_errata.c | 8 + arch/arm64/kernel/cpufeature.c | 4 +- arch/arm64/kernel/cpuinfo.c | 1 + arch/arm64/kernel/entry.S | 4 + arch/arm64/tools/cpucaps | 1 + arch/arm64/tools/sysreg | 6 +- arch/parisc/include/asm/ldcw.h | 37 ++-- arch/parisc/include/asm/spinlock_types.h | 5 - arch/parisc/kernel/smp.c | 4 +- arch/s390/net/bpf_jit_comp.c | 2 +- arch/x86/events/amd/core.c | 24 +- arch/x86/kernel/sev-shared.c | 69 ++++-- arch/x86/kernel/sev.c | 3 +- drivers/ata/libata-core.c | 16 ++ drivers/ata/libata-scsi.c | 33 ++- drivers/base/regmap/regcache-rbtree.c | 3 +- drivers/gpio/gpio-aspeed.c | 2 +- drivers/gpio/gpio-pxa.c | 1 + drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 +- .../drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c | 41 ++-- drivers/gpu/drm/i915/gt/gen8_engine_cs.c | 11 +- drivers/hid/Kconfig | 1 + drivers/hid/hid-nvidia-shield.c | 3 +- drivers/hid/hid-sony.c | 2 + drivers/hid/intel-ish-hid/ipc/pci-ish.c | 8 + drivers/infiniband/core/cma.c | 2 +- drivers/infiniband/core/cma_configfs.c | 2 +- drivers/infiniband/core/nldev.c | 1 + drivers/infiniband/core/uverbs_main.c | 2 +- drivers/infiniband/hw/bnxt_re/qplib_rcfw.c | 11 +- drivers/infiniband/hw/mlx4/sysfs.c | 2 +- drivers/infiniband/hw/mlx5/fs.c | 2 +- drivers/infiniband/hw/mlx5/main.c | 2 +- drivers/infiniband/hw/mlx5/mr.c | 14 +- drivers/infiniband/sw/siw/siw_cm.c | 16 +- drivers/infiniband/ulp/srp/ib_srp.c | 16 +- drivers/iommu/apple-dart.c | 3 +- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 15 +- drivers/iommu/intel/iommu.c | 16 -- drivers/iommu/intel/iommu.h | 2 +- drivers/iommu/mtk_iommu.c | 9 +- drivers/leds/led-core.c | 4 - drivers/md/dm-zoned-target.c | 15 +- drivers/md/raid5.c | 7 + drivers/mtd/ubi/build.c | 7 + drivers/net/dsa/mv88e6xxx/chip.c | 6 +- drivers/net/dsa/mv88e6xxx/global1.c | 31 --- drivers/net/dsa/mv88e6xxx/global1.h | 1 - drivers/net/dsa/mv88e6xxx/global2.c | 2 +- drivers/net/dsa/mv88e6xxx/global2.h | 1 + drivers/net/ethernet/ibm/ibmveth.c | 25 +-- drivers/net/ethernet/intel/ice/ice_virtchnl.c | 12 +- drivers/net/ethernet/mediatek/mtk_eth_soc.c | 4 +- drivers/net/ethernet/microchip/Kconfig | 1 + drivers/net/ethernet/microsoft/mana/mana_en.c | 211 ++++++++++++------ drivers/net/ethernet/qlogic/qed/qed_ll2.h | 2 +- drivers/net/ethernet/renesas/rswitch.c | 25 ++- drivers/net/ethernet/renesas/rswitch.h | 4 + drivers/net/ethernet/stmicro/stmmac/dwmac-stm32.c | 7 +- .../net/ethernet/stmicro/stmmac/stmmac_platform.c | 2 +- drivers/net/ethernet/ti/am65-cpsw-nuss.c | 1 + drivers/net/usb/smsc75xx.c | 4 +- drivers/net/wan/fsl_ucc_hdlc.c | 12 +- .../broadcom/brcm80211/brcmfmac/fwil_types.h | 9 +- drivers/net/wireless/intel/iwlwifi/fw/error-dump.h | 6 +- drivers/net/wireless/intel/iwlwifi/mvm/fw.c | 2 +- .../net/wireless/intel/iwlwifi/mvm/mld-mac80211.c | 121 +++++----- drivers/net/wireless/intel/iwlwifi/mvm/scan.c | 2 +- .../net/wireless/marvell/mwifiex/11n_rxreorder.c | 4 +- drivers/net/wireless/marvell/mwifiex/sta_rx.c | 16 +- drivers/net/wireless/mediatek/mt76/dma.c | 8 +- .../net/wireless/mediatek/mt76/mt76x02_eeprom.c | 7 - drivers/net/wireless/mediatek/mt76/mt76x2/eeprom.c | 13 +- drivers/net/wireless/realtek/rtw88/rtw8723d.h | 1 + drivers/of/dynamic.c | 6 +- drivers/pci/controller/dwc/pcie-qcom.c | 4 +- drivers/pci/pci-driver.c | 14 +- drivers/platform/x86/intel/ifs/runtest.c | 7 +- drivers/ptp/ptp_ocp.c | 1 - drivers/regulator/core.c | 4 +- drivers/regulator/mt6358-regulator.c | 18 +- drivers/s390/scsi/zfcp_aux.c | 9 +- drivers/scsi/aacraid/commsup.c | 2 +- drivers/scsi/mvumi.c | 2 +- drivers/scsi/scsi_lib.c | 2 +- drivers/scsi/scsi_priv.h | 1 - drivers/scsi/scsi_scan.c | 20 +- drivers/scsi/scsi_sysfs.c | 4 +- drivers/scsi/smartpqi/smartpqi_init.c | 2 +- drivers/scsi/storvsc_drv.c | 2 +- drivers/scsi/virtio_scsi.c | 2 +- drivers/target/target_core_device.c | 11 +- drivers/vhost/vringh.c | 12 +- fs/btrfs/btrfs_inode.h | 3 - fs/btrfs/extent_io.c | 58 ++--- fs/btrfs/extent_io.h | 2 - fs/btrfs/inode.c | 47 ++-- fs/btrfs/ordered-data.c | 4 + fs/btrfs/super.c | 4 + fs/btrfs/transaction.h | 4 +- fs/erofs/decompressor_lzma.c | 5 +- fs/erofs/super.c | 2 +- fs/nfs/nfs4state.c | 7 + fs/overlayfs/super.c | 9 +- fs/smb/client/connect.c | 10 +- fs/smb/server/connection.c | 2 + fs/smb/server/connection.h | 1 + fs/smb/server/mgmt/user_session.c | 10 +- fs/smb/server/smb2pdu.c | 16 +- include/linux/bpf.h | 2 +- include/linux/ipv6.h | 1 + include/linux/maple_tree.h | 11 + include/linux/netfilter/nf_conntrack_sctp.h | 1 + include/net/cfg80211.h | 6 +- include/net/mana/mana.h | 5 +- include/net/neighbour.h | 2 +- include/net/tcp.h | 6 +- include/scsi/scsi_host.h | 2 +- include/uapi/linux/ipv6.h | 1 + io_uring/io_uring.c | 16 +- io_uring/io_uring.h | 41 ++-- io_uring/kbuf.c | 27 ++- kernel/bpf/verifier.c | 8 +- lib/maple_tree.c | 246 +++++++++++++++------ lib/test_maple_tree.c | 87 ++++++-- net/bluetooth/hci_core.c | 6 + net/bluetooth/hci_event.c | 1 + net/bluetooth/hci_request.h | 2 - net/bluetooth/hci_sync.c | 14 +- net/bluetooth/iso.c | 9 +- net/bridge/br_netfilter_hooks.c | 2 +- net/core/neighbour.c | 14 +- net/core/sock_map.c | 4 + net/ethtool/plca.c | 45 ++-- net/ipv4/route.c | 2 + net/ipv4/tcp.c | 10 +- net/ipv4/tcp_bpf.c | 4 +- net/ipv4/tcp_input.c | 13 ++ net/ipv4/tcp_output.c | 7 +- net/ipv6/addrconf.c | 13 ++ net/ipv6/ndisc.c | 13 +- net/ipv6/tcp_ipv6.c | 10 +- net/l2tp/l2tp_ip6.c | 2 +- net/mac80211/cfg.c | 3 + net/mac80211/ieee80211_i.h | 2 +- net/mac80211/key.c | 2 +- net/mac80211/mesh.c | 8 +- net/mac80211/mlme.c | 18 +- net/mptcp/pm_userspace.c | 6 - net/mptcp/protocol.c | 161 +++++++------- net/mptcp/protocol.h | 59 +++-- net/mptcp/subflow.c | 13 +- net/netfilter/ipvs/ip_vs_sync.c | 4 +- net/netfilter/nf_conntrack_proto_sctp.c | 43 +++- net/netfilter/nf_tables_api.c | 44 ++-- net/netfilter/nft_payload.c | 13 +- net/netfilter/nft_set_rbtree.c | 46 ++-- net/netlink/af_netlink.c | 8 +- net/nfc/llcp_core.c | 2 + net/rds/tcp_connect.c | 2 +- net/sctp/associola.c | 3 +- net/sctp/socket.c | 1 + net/socket.c | 29 ++- net/tipc/crypto.c | 4 +- net/wireless/core.c | 14 +- net/wireless/core.h | 7 +- net/wireless/mlme.c | 3 +- net/wireless/nl80211.c | 93 +++++--- scripts/mod/file2alias.c | 2 +- security/integrity/ima/Kconfig | 22 +- sound/pci/hda/patch_realtek.c | 154 ++++++++++++- sound/soc/soc-utils.c | 1 + sound/soc/tegra/tegra_audio_graph_card.c | 30 +-- tools/testing/selftests/netfilter/.gitignore | 1 + tools/testing/selftests/netfilter/Makefile | 4 +- tools/testing/selftests/netfilter/audit_logread.c | 165 ++++++++++++++ tools/testing/selftests/netfilter/config | 1 + tools/testing/selftests/netfilter/nft_audit.sh | 193 ++++++++++++++++ tools/tracing/rtla/src/timerlat_aa.c | 32 ++- tools/tracing/rtla/src/timerlat_u.c | 6 +- 188 files changed, 2189 insertions(+), 962 deletions(-)
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Shenghao Ding shenghao-ding@ti.com
[ Upstream commit 3babae915f4c15d76a5134e55806a1c1588e2865 ]
Integrate tas2781 configs for Lenovo Laptops. All of the tas2781s in the laptop will be aggregated as one audio device. The code support realtek as the primary codec. Rename "struct cs35l41_dev_name" to "struct scodec_dev_name" for all other side codecs instead of the certain one.
Signed-off-by: Shenghao Ding shenghao-ding@ti.com Link: https://lore.kernel.org/r/20230818085836.1442-1-shenghao-ding@ti.com Signed-off-by: Takashi Iwai tiwai@suse.de Stable-dep-of: 41b07476da38 ("ALSA: hda/realtek - ALC287 Realtek I2S speaker platform support") Signed-off-by: Sasha Levin sashal@kernel.org --- sound/pci/hda/patch_realtek.c | 88 +++++++++++++++++++++++++++++++++-- 1 file changed, 85 insertions(+), 3 deletions(-)
diff --git a/sound/pci/hda/patch_realtek.c b/sound/pci/hda/patch_realtek.c index 4a13747b2b0f3..ccf99f881cc4f 100644 --- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -6721,7 +6721,7 @@ static void comp_generic_playback_hook(struct hda_pcm_stream *hinfo, struct hda_ } }
-struct cs35l41_dev_name { +struct scodec_dev_name { const char *bus; const char *hid; int index; @@ -6730,7 +6730,7 @@ struct cs35l41_dev_name { /* match the device name in a slightly relaxed manner */ static int comp_match_cs35l41_dev_name(struct device *dev, void *data) { - struct cs35l41_dev_name *p = data; + struct scodec_dev_name *p = data; const char *d = dev_name(dev); int n = strlen(p->bus); char tmp[32]; @@ -6746,12 +6746,32 @@ static int comp_match_cs35l41_dev_name(struct device *dev, void *data) return !strcmp(d + n, tmp); }
+static int comp_match_tas2781_dev_name(struct device *dev, + void *data) +{ + struct scodec_dev_name *p = data; + const char *d = dev_name(dev); + int n = strlen(p->bus); + char tmp[32]; + + /* check the bus name */ + if (strncmp(d, p->bus, n)) + return 0; + /* skip the bus number */ + if (isdigit(d[n])) + n++; + /* the rest must be exact matching */ + snprintf(tmp, sizeof(tmp), "-%s:00", p->hid); + + return !strcmp(d + n, tmp); +} + static void cs35l41_generic_fixup(struct hda_codec *cdc, int action, const char *bus, const char *hid, int count) { struct device *dev = hda_codec_dev(cdc); struct alc_spec *spec = cdc->spec; - struct cs35l41_dev_name *rec; + struct scodec_dev_name *rec; int ret, i;
switch (action) { @@ -6779,6 +6799,41 @@ static void cs35l41_generic_fixup(struct hda_codec *cdc, int action, const char } }
+static void tas2781_generic_fixup(struct hda_codec *cdc, int action, + const char *bus, const char *hid) +{ + struct device *dev = hda_codec_dev(cdc); + struct alc_spec *spec = cdc->spec; + struct scodec_dev_name *rec; + int ret; + + switch (action) { + case HDA_FIXUP_ACT_PRE_PROBE: + rec = devm_kmalloc(dev, sizeof(*rec), GFP_KERNEL); + if (!rec) + return; + rec->bus = bus; + rec->hid = hid; + rec->index = 0; + spec->comps[0].codec = cdc; + component_match_add(dev, &spec->match, + comp_match_tas2781_dev_name, rec); + ret = component_master_add_with_match(dev, &comp_master_ops, + spec->match); + if (ret) + codec_err(cdc, + "Fail to register component aggregator %d\n", + ret); + else + spec->gen.pcm_playback_hook = + comp_generic_playback_hook; + break; + case HDA_FIXUP_ACT_FREE: + component_master_del(dev, &comp_master_ops); + break; + } +} + static void cs35l41_fixup_i2c_two(struct hda_codec *cdc, const struct hda_fixup *fix, int action) { cs35l41_generic_fixup(cdc, action, "i2c", "CSC3551", 2); @@ -6806,6 +6861,12 @@ static void alc287_fixup_legion_16ithg6_speakers(struct hda_codec *cdc, const st cs35l41_generic_fixup(cdc, action, "i2c", "CLSA0101", 2); }
+static void tas2781_fixup_i2c(struct hda_codec *cdc, + const struct hda_fixup *fix, int action) +{ + tas2781_generic_fixup(cdc, action, "i2c", "TIAS2781"); +} + /* for alc295_fixup_hp_top_speakers */ #include "hp_x360_helper.c"
@@ -7231,6 +7292,7 @@ enum { ALC295_FIXUP_DELL_INSPIRON_TOP_SPEAKERS, ALC236_FIXUP_DELL_DUAL_CODECS, ALC287_FIXUP_CS35L41_I2C_2_THINKPAD_ACPI, + ALC287_FIXUP_TAS2781_I2C, };
/* A special fixup for Lenovo C940 and Yoga Duet 7; @@ -9309,6 +9371,12 @@ static const struct hda_fixup alc269_fixups[] = { .chained = true, .chain_id = ALC269_FIXUP_THINKPAD_ACPI, }, + [ALC287_FIXUP_TAS2781_I2C] = { + .type = HDA_FIXUP_FUNC, + .v.func = tas2781_fixup_i2c, + .chained = true, + .chain_id = ALC269_FIXUP_THINKPAD_ACPI, + }, };
static const struct snd_pci_quirk alc269_fixup_tbl[] = { @@ -9890,6 +9958,20 @@ static const struct snd_pci_quirk alc269_fixup_tbl[] = { SND_PCI_QUIRK(0x17aa, 0x3853, "Lenovo Yoga 7 15ITL5", ALC287_FIXUP_YOGA7_14ITL_SPEAKERS), SND_PCI_QUIRK(0x17aa, 0x3855, "Legion 7 16ITHG6", ALC287_FIXUP_LEGION_16ITHG6), SND_PCI_QUIRK(0x17aa, 0x3869, "Lenovo Yoga7 14IAL7", ALC287_FIXUP_YOGA9_14IAP7_BASS_SPK_PIN), + SND_PCI_QUIRK(0x17aa, 0x387d, "Yoga S780-16 pro Quad AAC", ALC287_FIXUP_TAS2781_I2C), + SND_PCI_QUIRK(0x17aa, 0x387e, "Yoga S780-16 pro Quad YC", ALC287_FIXUP_TAS2781_I2C), + SND_PCI_QUIRK(0x17aa, 0x3881, "YB9 dual powe mode2 YC", ALC287_FIXUP_TAS2781_I2C), + SND_PCI_QUIRK(0x17aa, 0x3884, "Y780 YG DUAL", ALC287_FIXUP_TAS2781_I2C), + SND_PCI_QUIRK(0x17aa, 0x3886, "Y780 VECO DUAL", ALC287_FIXUP_TAS2781_I2C), + SND_PCI_QUIRK(0x17aa, 0x38a7, "Y780P AMD YG dual", ALC287_FIXUP_TAS2781_I2C), + SND_PCI_QUIRK(0x17aa, 0x38a8, "Y780P AMD VECO dual", ALC287_FIXUP_TAS2781_I2C), + SND_PCI_QUIRK(0x17aa, 0x38ba, "Yoga S780-14.5 Air AMD quad YC", ALC287_FIXUP_TAS2781_I2C), + SND_PCI_QUIRK(0x17aa, 0x38bb, "Yoga S780-14.5 Air AMD quad AAC", ALC287_FIXUP_TAS2781_I2C), + SND_PCI_QUIRK(0x17aa, 0x38be, "Yoga S980-14.5 proX YC Dual", ALC287_FIXUP_TAS2781_I2C), + SND_PCI_QUIRK(0x17aa, 0x38bf, "Yoga S980-14.5 proX LX Dual", ALC287_FIXUP_TAS2781_I2C), + SND_PCI_QUIRK(0x17aa, 0x38c3, "Y980 DUAL", ALC287_FIXUP_TAS2781_I2C), + SND_PCI_QUIRK(0x17aa, 0x38cb, "Y790 YG DUAL", ALC287_FIXUP_TAS2781_I2C), + SND_PCI_QUIRK(0x17aa, 0x38cd, "Y790 VECO DUAL", ALC287_FIXUP_TAS2781_I2C), SND_PCI_QUIRK(0x17aa, 0x3902, "Lenovo E50-80", ALC269_FIXUP_DMIC_THINKPAD_ACPI), SND_PCI_QUIRK(0x17aa, 0x3977, "IdeaPad S210", ALC283_FIXUP_INT_MIC), SND_PCI_QUIRK(0x17aa, 0x3978, "Lenovo B50-70", ALC269_FIXUP_DMIC_THINKPAD_ACPI),
On Mon, 09 Oct 2023 14:59:25 +0200, Greg Kroah-Hartman wrote:
6.5-stable review patch. If anyone has any objections, please let me know.
From: Shenghao Ding shenghao-ding@ti.com
[ Upstream commit 3babae915f4c15d76a5134e55806a1c1588e2865 ]
Integrate tas2781 configs for Lenovo Laptops. All of the tas2781s in the laptop will be aggregated as one audio device. The code support realtek as the primary codec. Rename "struct cs35l41_dev_name" to "struct scodec_dev_name" for all other side codecs instead of the certain one.
Signed-off-by: Shenghao Ding shenghao-ding@ti.com Link: https://lore.kernel.org/r/20230818085836.1442-1-shenghao-ding@ti.com Signed-off-by: Takashi Iwai tiwai@suse.de Stable-dep-of: 41b07476da38 ("ALSA: hda/realtek - ALC287 Realtek I2S speaker platform support") Signed-off-by: Sasha Levin sashal@kernel.org
This makes little sense without the backport of commit 5be27f1e3ec9 ("ALSA: hda/tas2781: Add tas2781 HDA driver"). Confusingly, the patch subject is very same as this commit...
And the above commit has more follow-up fixes, too. 1c80cc055b3f 17a1eab7b70d ed81cb9e0517
thanks,
Takashi
sound/pci/hda/patch_realtek.c | 88 +++++++++++++++++++++++++++++++++-- 1 file changed, 85 insertions(+), 3 deletions(-)
diff --git a/sound/pci/hda/patch_realtek.c b/sound/pci/hda/patch_realtek.c index 4a13747b2b0f3..ccf99f881cc4f 100644 --- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -6721,7 +6721,7 @@ static void comp_generic_playback_hook(struct hda_pcm_stream *hinfo, struct hda_ } } -struct cs35l41_dev_name { +struct scodec_dev_name { const char *bus; const char *hid; int index; @@ -6730,7 +6730,7 @@ struct cs35l41_dev_name { /* match the device name in a slightly relaxed manner */ static int comp_match_cs35l41_dev_name(struct device *dev, void *data) {
- struct cs35l41_dev_name *p = data;
- struct scodec_dev_name *p = data; const char *d = dev_name(dev); int n = strlen(p->bus); char tmp[32];
@@ -6746,12 +6746,32 @@ static int comp_match_cs35l41_dev_name(struct device *dev, void *data) return !strcmp(d + n, tmp); } +static int comp_match_tas2781_dev_name(struct device *dev,
- void *data)
+{
- struct scodec_dev_name *p = data;
- const char *d = dev_name(dev);
- int n = strlen(p->bus);
- char tmp[32];
- /* check the bus name */
- if (strncmp(d, p->bus, n))
return 0;
- /* skip the bus number */
- if (isdigit(d[n]))
n++;
- /* the rest must be exact matching */
- snprintf(tmp, sizeof(tmp), "-%s:00", p->hid);
- return !strcmp(d + n, tmp);
+}
static void cs35l41_generic_fixup(struct hda_codec *cdc, int action, const char *bus, const char *hid, int count) { struct device *dev = hda_codec_dev(cdc); struct alc_spec *spec = cdc->spec;
- struct cs35l41_dev_name *rec;
- struct scodec_dev_name *rec; int ret, i;
switch (action) { @@ -6779,6 +6799,41 @@ static void cs35l41_generic_fixup(struct hda_codec *cdc, int action, const char } } +static void tas2781_generic_fixup(struct hda_codec *cdc, int action,
- const char *bus, const char *hid)
+{
- struct device *dev = hda_codec_dev(cdc);
- struct alc_spec *spec = cdc->spec;
- struct scodec_dev_name *rec;
- int ret;
- switch (action) {
- case HDA_FIXUP_ACT_PRE_PROBE:
rec = devm_kmalloc(dev, sizeof(*rec), GFP_KERNEL);
if (!rec)
return;
rec->bus = bus;
rec->hid = hid;
rec->index = 0;
spec->comps[0].codec = cdc;
component_match_add(dev, &spec->match,
comp_match_tas2781_dev_name, rec);
ret = component_master_add_with_match(dev, &comp_master_ops,
spec->match);
if (ret)
codec_err(cdc,
"Fail to register component aggregator %d\n",
ret);
else
spec->gen.pcm_playback_hook =
comp_generic_playback_hook;
break;
- case HDA_FIXUP_ACT_FREE:
component_master_del(dev, &comp_master_ops);
break;
- }
+}
static void cs35l41_fixup_i2c_two(struct hda_codec *cdc, const struct hda_fixup *fix, int action) { cs35l41_generic_fixup(cdc, action, "i2c", "CSC3551", 2); @@ -6806,6 +6861,12 @@ static void alc287_fixup_legion_16ithg6_speakers(struct hda_codec *cdc, const st cs35l41_generic_fixup(cdc, action, "i2c", "CLSA0101", 2); } +static void tas2781_fixup_i2c(struct hda_codec *cdc,
- const struct hda_fixup *fix, int action)
+{
tas2781_generic_fixup(cdc, action, "i2c", "TIAS2781");
+}
/* for alc295_fixup_hp_top_speakers */ #include "hp_x360_helper.c" @@ -7231,6 +7292,7 @@ enum { ALC295_FIXUP_DELL_INSPIRON_TOP_SPEAKERS, ALC236_FIXUP_DELL_DUAL_CODECS, ALC287_FIXUP_CS35L41_I2C_2_THINKPAD_ACPI,
- ALC287_FIXUP_TAS2781_I2C,
}; /* A special fixup for Lenovo C940 and Yoga Duet 7; @@ -9309,6 +9371,12 @@ static const struct hda_fixup alc269_fixups[] = { .chained = true, .chain_id = ALC269_FIXUP_THINKPAD_ACPI, },
- [ALC287_FIXUP_TAS2781_I2C] = {
.type = HDA_FIXUP_FUNC,
.v.func = tas2781_fixup_i2c,
.chained = true,
.chain_id = ALC269_FIXUP_THINKPAD_ACPI,
- },
}; static const struct snd_pci_quirk alc269_fixup_tbl[] = { @@ -9890,6 +9958,20 @@ static const struct snd_pci_quirk alc269_fixup_tbl[] = { SND_PCI_QUIRK(0x17aa, 0x3853, "Lenovo Yoga 7 15ITL5", ALC287_FIXUP_YOGA7_14ITL_SPEAKERS), SND_PCI_QUIRK(0x17aa, 0x3855, "Legion 7 16ITHG6", ALC287_FIXUP_LEGION_16ITHG6), SND_PCI_QUIRK(0x17aa, 0x3869, "Lenovo Yoga7 14IAL7", ALC287_FIXUP_YOGA9_14IAP7_BASS_SPK_PIN),
- SND_PCI_QUIRK(0x17aa, 0x387d, "Yoga S780-16 pro Quad AAC", ALC287_FIXUP_TAS2781_I2C),
- SND_PCI_QUIRK(0x17aa, 0x387e, "Yoga S780-16 pro Quad YC", ALC287_FIXUP_TAS2781_I2C),
- SND_PCI_QUIRK(0x17aa, 0x3881, "YB9 dual powe mode2 YC", ALC287_FIXUP_TAS2781_I2C),
- SND_PCI_QUIRK(0x17aa, 0x3884, "Y780 YG DUAL", ALC287_FIXUP_TAS2781_I2C),
- SND_PCI_QUIRK(0x17aa, 0x3886, "Y780 VECO DUAL", ALC287_FIXUP_TAS2781_I2C),
- SND_PCI_QUIRK(0x17aa, 0x38a7, "Y780P AMD YG dual", ALC287_FIXUP_TAS2781_I2C),
- SND_PCI_QUIRK(0x17aa, 0x38a8, "Y780P AMD VECO dual", ALC287_FIXUP_TAS2781_I2C),
- SND_PCI_QUIRK(0x17aa, 0x38ba, "Yoga S780-14.5 Air AMD quad YC", ALC287_FIXUP_TAS2781_I2C),
- SND_PCI_QUIRK(0x17aa, 0x38bb, "Yoga S780-14.5 Air AMD quad AAC", ALC287_FIXUP_TAS2781_I2C),
- SND_PCI_QUIRK(0x17aa, 0x38be, "Yoga S980-14.5 proX YC Dual", ALC287_FIXUP_TAS2781_I2C),
- SND_PCI_QUIRK(0x17aa, 0x38bf, "Yoga S980-14.5 proX LX Dual", ALC287_FIXUP_TAS2781_I2C),
- SND_PCI_QUIRK(0x17aa, 0x38c3, "Y980 DUAL", ALC287_FIXUP_TAS2781_I2C),
- SND_PCI_QUIRK(0x17aa, 0x38cb, "Y790 YG DUAL", ALC287_FIXUP_TAS2781_I2C),
- SND_PCI_QUIRK(0x17aa, 0x38cd, "Y790 VECO DUAL", ALC287_FIXUP_TAS2781_I2C), SND_PCI_QUIRK(0x17aa, 0x3902, "Lenovo E50-80", ALC269_FIXUP_DMIC_THINKPAD_ACPI), SND_PCI_QUIRK(0x17aa, 0x3977, "IdeaPad S210", ALC283_FIXUP_INT_MIC), SND_PCI_QUIRK(0x17aa, 0x3978, "Lenovo B50-70", ALC269_FIXUP_DMIC_THINKPAD_ACPI),
-- 2.40.1
On Mon, Oct 09, 2023 at 03:17:59PM +0200, Takashi Iwai wrote:
On Mon, 09 Oct 2023 14:59:25 +0200, Greg Kroah-Hartman wrote:
6.5-stable review patch. If anyone has any objections, please let me know.
From: Shenghao Ding shenghao-ding@ti.com
[ Upstream commit 3babae915f4c15d76a5134e55806a1c1588e2865 ]
Integrate tas2781 configs for Lenovo Laptops. All of the tas2781s in the laptop will be aggregated as one audio device. The code support realtek as the primary codec. Rename "struct cs35l41_dev_name" to "struct scodec_dev_name" for all other side codecs instead of the certain one.
Signed-off-by: Shenghao Ding shenghao-ding@ti.com Link: https://lore.kernel.org/r/20230818085836.1442-1-shenghao-ding@ti.com Signed-off-by: Takashi Iwai tiwai@suse.de Stable-dep-of: 41b07476da38 ("ALSA: hda/realtek - ALC287 Realtek I2S speaker platform support") Signed-off-by: Sasha Levin sashal@kernel.org
This makes little sense without the backport of commit 5be27f1e3ec9 ("ALSA: hda/tas2781: Add tas2781 HDA driver"). Confusingly, the patch subject is very same as this commit...
This is a tricky one: 3babae915f4 really doesn't add a new driver but rather just refactors some of the quirk handling which is needed for later patches to apply. It's the following 5be27f1e3ec9 which actually adds the driver (which we don't need here).
We don't actually want to bring in a new driver, so 5be27f1e3ec9 is unnecessary.
And the above commit has more follow-up fixes, too. 1c80cc055b3f 17a1eab7b70d ed81cb9e0517
If we don't take 5be27f1e3ec9 then we don't need any of the follow-up fixes.
On Mon, 09 Oct 2023 16:25:13 +0200, Sasha Levin wrote:
On Mon, Oct 09, 2023 at 03:17:59PM +0200, Takashi Iwai wrote:
On Mon, 09 Oct 2023 14:59:25 +0200, Greg Kroah-Hartman wrote:
6.5-stable review patch. If anyone has any objections, please let me know.
From: Shenghao Ding shenghao-ding@ti.com
[ Upstream commit 3babae915f4c15d76a5134e55806a1c1588e2865 ]
Integrate tas2781 configs for Lenovo Laptops. All of the tas2781s in the laptop will be aggregated as one audio device. The code support realtek as the primary codec. Rename "struct cs35l41_dev_name" to "struct scodec_dev_name" for all other side codecs instead of the certain one.
Signed-off-by: Shenghao Ding shenghao-ding@ti.com Link: https://lore.kernel.org/r/20230818085836.1442-1-shenghao-ding@ti.com Signed-off-by: Takashi Iwai tiwai@suse.de Stable-dep-of: 41b07476da38 ("ALSA: hda/realtek - ALC287 Realtek I2S speaker platform support") Signed-off-by: Sasha Levin sashal@kernel.org
This makes little sense without the backport of commit 5be27f1e3ec9 ("ALSA: hda/tas2781: Add tas2781 HDA driver"). Confusingly, the patch subject is very same as this commit...
This is a tricky one: 3babae915f4 really doesn't add a new driver but rather just refactors some of the quirk handling which is needed for later patches to apply. It's the following 5be27f1e3ec9 which actually adds the driver (which we don't need here).
We don't actually want to bring in a new driver, so 5be27f1e3ec9 is unnecessary.
If we don't want the backport of 5be27f1e3ec9, this commit (3babae915f4c) and other relevant ones must be dropped, too.
Takashi
On Mon, Oct 09, 2023 at 04:29:38PM +0200, Takashi Iwai wrote:
On Mon, 09 Oct 2023 16:25:13 +0200, Sasha Levin wrote:
On Mon, Oct 09, 2023 at 03:17:59PM +0200, Takashi Iwai wrote:
On Mon, 09 Oct 2023 14:59:25 +0200, Greg Kroah-Hartman wrote:
6.5-stable review patch. If anyone has any objections, please let me know.
From: Shenghao Ding shenghao-ding@ti.com
[ Upstream commit 3babae915f4c15d76a5134e55806a1c1588e2865 ]
Integrate tas2781 configs for Lenovo Laptops. All of the tas2781s in the laptop will be aggregated as one audio device. The code support realtek as the primary codec. Rename "struct cs35l41_dev_name" to "struct scodec_dev_name" for all other side codecs instead of the certain one.
Signed-off-by: Shenghao Ding shenghao-ding@ti.com Link: https://lore.kernel.org/r/20230818085836.1442-1-shenghao-ding@ti.com Signed-off-by: Takashi Iwai tiwai@suse.de Stable-dep-of: 41b07476da38 ("ALSA: hda/realtek - ALC287 Realtek I2S speaker platform support") Signed-off-by: Sasha Levin sashal@kernel.org
This makes little sense without the backport of commit 5be27f1e3ec9 ("ALSA: hda/tas2781: Add tas2781 HDA driver"). Confusingly, the patch subject is very same as this commit...
This is a tricky one: 3babae915f4 really doesn't add a new driver but rather just refactors some of the quirk handling which is needed for later patches to apply. It's the following 5be27f1e3ec9 which actually adds the driver (which we don't need here).
We don't actually want to bring in a new driver, so 5be27f1e3ec9 is unnecessary.
If we don't want the backport of 5be27f1e3ec9, this commit (3babae915f4c) and other relevant ones must be dropped, too.
Sure, I've dropped:
3babae915f4c ("ALSA: hda/tas2781: Add tas2781 HDA driver") 93dc18e11b1a ("ALSA: hda/realtek: Add quirk for HP Victus 16-d1xxx to enable mute LED") c99c26b16c15 ("ALSA: hda/realtek: Add quirk for mute LEDs on HP ENVY x360 15-eu0xxx") e43252db7e20 ("ALSA: hda/realtek - ALC287 I2S speaker platform support") 41b07476da38 ("ALSA: hda/realtek - ALC287 Realtek I2S speaker platform support")
On Mon, Oct 09, 2023 at 04:29:38PM +0200, Takashi Iwai wrote:
On Mon, 09 Oct 2023 16:25:13 +0200, Sasha Levin wrote:
On Mon, Oct 09, 2023 at 03:17:59PM +0200, Takashi Iwai wrote:
On Mon, 09 Oct 2023 14:59:25 +0200, Greg Kroah-Hartman wrote:
6.5-stable review patch. If anyone has any objections, please let me know.
From: Shenghao Ding shenghao-ding@ti.com
[ Upstream commit 3babae915f4c15d76a5134e55806a1c1588e2865 ]
Integrate tas2781 configs for Lenovo Laptops. All of the tas2781s in the laptop will be aggregated as one audio device. The code support realtek as the primary codec. Rename "struct cs35l41_dev_name" to "struct scodec_dev_name" for all other side codecs instead of the certain one.
Signed-off-by: Shenghao Ding shenghao-ding@ti.com Link: https://lore.kernel.org/r/20230818085836.1442-1-shenghao-ding@ti.com Signed-off-by: Takashi Iwai tiwai@suse.de Stable-dep-of: 41b07476da38 ("ALSA: hda/realtek - ALC287 Realtek I2S speaker platform support") Signed-off-by: Sasha Levin sashal@kernel.org
This makes little sense without the backport of commit 5be27f1e3ec9 ("ALSA: hda/tas2781: Add tas2781 HDA driver"). Confusingly, the patch subject is very same as this commit...
This is a tricky one: 3babae915f4 really doesn't add a new driver but rather just refactors some of the quirk handling which is needed for later patches to apply. It's the following 5be27f1e3ec9 which actually adds the driver (which we don't need here).
We don't actually want to bring in a new driver, so 5be27f1e3ec9 is unnecessary.
If we don't want the backport of 5be27f1e3ec9, this commit (3babae915f4c) and other relevant ones must be dropped, too.
All are now dropped, thanks.
greg k-h
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: SungHwan Jung onenowy@gmail.com
[ Upstream commit 93dc18e11b1ab2d485b69f91c973e6b83e47ebd0 ]
This quirk enables mute LED on HP Victus 16-d1xxx (8A25) laptops, which use ALC245 codec.
Signed-off-by: SungHwan Jung onenowy@gmail.com Link: https://lore.kernel.org/r/20230823114051.3921-1-onenowy@gmail.com Signed-off-by: Takashi Iwai tiwai@suse.de Stable-dep-of: 41b07476da38 ("ALSA: hda/realtek - ALC287 Realtek I2S speaker platform support") Signed-off-by: Sasha Levin sashal@kernel.org --- sound/pci/hda/patch_realtek.c | 22 ++++++++++++++++++++++ 1 file changed, 22 insertions(+)
diff --git a/sound/pci/hda/patch_realtek.c b/sound/pci/hda/patch_realtek.c index ccf99f881cc4f..00051c14e263a 100644 --- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -4639,6 +4639,22 @@ static void alc236_fixup_hp_mute_led_coefbit2(struct hda_codec *codec, } }
+static void alc245_fixup_hp_mute_led_coefbit(struct hda_codec *codec, + const struct hda_fixup *fix, + int action) +{ + struct alc_spec *spec = codec->spec; + + if (action == HDA_FIXUP_ACT_PRE_PROBE) { + spec->mute_led_polarity = 0; + spec->mute_led_coef.idx = 0x0b; + spec->mute_led_coef.mask = 3 << 2; + spec->mute_led_coef.on = 2 << 2; + spec->mute_led_coef.off = 1 << 2; + snd_hda_gen_add_mute_led_cdev(codec, coef_mute_led_set); + } +} + /* turn on/off mic-mute LED per capture hook by coef bit */ static int coef_micmute_led_set(struct led_classdev *led_cdev, enum led_brightness brightness) @@ -7293,6 +7309,7 @@ enum { ALC236_FIXUP_DELL_DUAL_CODECS, ALC287_FIXUP_CS35L41_I2C_2_THINKPAD_ACPI, ALC287_FIXUP_TAS2781_I2C, + ALC245_FIXUP_HP_MUTE_LED_COEFBIT, };
/* A special fixup for Lenovo C940 and Yoga Duet 7; @@ -9377,6 +9394,10 @@ static const struct hda_fixup alc269_fixups[] = { .chained = true, .chain_id = ALC269_FIXUP_THINKPAD_ACPI, }, + [ALC245_FIXUP_HP_MUTE_LED_COEFBIT] = { + .type = HDA_FIXUP_FUNC, + .v.func = alc245_fixup_hp_mute_led_coefbit, + }, };
static const struct snd_pci_quirk alc269_fixup_tbl[] = { @@ -9650,6 +9671,7 @@ static const struct snd_pci_quirk alc269_fixup_tbl[] = { SND_PCI_QUIRK(0x103c, 0x89c6, "Zbook Fury 17 G9", ALC245_FIXUP_CS35L41_SPI_2_HP_GPIO_LED), SND_PCI_QUIRK(0x103c, 0x89ca, "HP", ALC236_FIXUP_HP_MUTE_LED_MICMUTE_VREF), SND_PCI_QUIRK(0x103c, 0x89d3, "HP EliteBook 645 G9 (MB 89D2)", ALC236_FIXUP_HP_MUTE_LED_MICMUTE_VREF), + SND_PCI_QUIRK(0x103c, 0x8a25, "HP Victus 16-d1xxx (MB 8A25)", ALC245_FIXUP_HP_MUTE_LED_COEFBIT), SND_PCI_QUIRK(0x103c, 0x8a78, "HP Dev One", ALC285_FIXUP_HP_LIMIT_INT_MIC_BOOST), SND_PCI_QUIRK(0x103c, 0x8aa0, "HP ProBook 440 G9 (MB 8A9E)", ALC236_FIXUP_HP_GPIO_LED), SND_PCI_QUIRK(0x103c, 0x8aa3, "HP ProBook 450 G9 (MB 8AA1)", ALC236_FIXUP_HP_GPIO_LED),
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Fabian Vogt fabian@ritter-vogt.de
[ Upstream commit c99c26b16c1544534ebd6a5f27a034f3e44d2597 ]
The LED for the mic mute button is controlled by GPIO2. The mute button LED is slightly more complex, it's controlled by two bits in coeff 0x0b.
Signed-off-by: Fabian Vogt fabian@ritter-vogt.de Link: https://lore.kernel.org/r/2693091.mvXUDI8C0e@fabians-envy Signed-off-by: Takashi Iwai tiwai@suse.de Stable-dep-of: 41b07476da38 ("ALSA: hda/realtek - ALC287 Realtek I2S speaker platform support") Signed-off-by: Sasha Levin sashal@kernel.org --- sound/pci/hda/patch_realtek.c | 8 ++++++++ 1 file changed, 8 insertions(+)
diff --git a/sound/pci/hda/patch_realtek.c b/sound/pci/hda/patch_realtek.c index 00051c14e263a..57bd11c6057d5 100644 --- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -7310,6 +7310,7 @@ enum { ALC287_FIXUP_CS35L41_I2C_2_THINKPAD_ACPI, ALC287_FIXUP_TAS2781_I2C, ALC245_FIXUP_HP_MUTE_LED_COEFBIT, + ALC245_FIXUP_HP_X360_MUTE_LEDS, };
/* A special fixup for Lenovo C940 and Yoga Duet 7; @@ -9398,6 +9399,12 @@ static const struct hda_fixup alc269_fixups[] = { .type = HDA_FIXUP_FUNC, .v.func = alc245_fixup_hp_mute_led_coefbit, }, + [ALC245_FIXUP_HP_X360_MUTE_LEDS] = { + .type = HDA_FIXUP_FUNC, + .v.func = alc245_fixup_hp_mute_led_coefbit, + .chained = true, + .chain_id = ALC245_FIXUP_HP_GPIO_LED + }, };
static const struct snd_pci_quirk alc269_fixup_tbl[] = { @@ -9640,6 +9647,7 @@ static const struct snd_pci_quirk alc269_fixup_tbl[] = { SND_PCI_QUIRK(0x103c, 0x8870, "HP ZBook Fury 15.6 Inch G8 Mobile Workstation PC", ALC285_FIXUP_HP_GPIO_AMP_INIT), SND_PCI_QUIRK(0x103c, 0x8873, "HP ZBook Studio 15.6 Inch G8 Mobile Workstation PC", ALC285_FIXUP_HP_GPIO_AMP_INIT), SND_PCI_QUIRK(0x103c, 0x887a, "HP Laptop 15s-eq2xxx", ALC236_FIXUP_HP_MUTE_LED_COEFBIT2), + SND_PCI_QUIRK(0x103c, 0x888a, "HP ENVY x360 Convertible 15-eu0xxx", ALC245_FIXUP_HP_X360_MUTE_LEDS), SND_PCI_QUIRK(0x103c, 0x888d, "HP ZBook Power 15.6 inch G8 Mobile Workstation PC", ALC236_FIXUP_HP_GPIO_LED), SND_PCI_QUIRK(0x103c, 0x8895, "HP EliteBook 855 G8 Notebook PC", ALC285_FIXUP_HP_SPEAKERS_MICMUTE_LED), SND_PCI_QUIRK(0x103c, 0x8896, "HP EliteBook 855 G8 Notebook PC", ALC285_FIXUP_HP_MUTE_LED),
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kailang Yang kailang@realtek.com
[ Upstream commit e43252db7e207a2e194e6a4883a43a31a776a968 ]
0x17 was only speaker pin, DAC assigned will be 0x03. Headphone assigned to 0x02. Playback via headphone will get EQ filter processing. So,it needs to swap DAC.
Tested-by: Mark Pearson mpearson@lenovo.com Signed-off-by: Kailang Yang kailang@realtek.com Link: https://lore.kernel.org/r/4e4cfa1b3b4c46838aecafc6e8b6f876@realtek.com Signed-off-by: Takashi Iwai tiwai@suse.de Stable-dep-of: 41b07476da38 ("ALSA: hda/realtek - ALC287 Realtek I2S speaker platform support") Signed-off-by: Sasha Levin sashal@kernel.org --- sound/pci/hda/patch_realtek.c | 30 ++++++++++++++++++++++++++++++ 1 file changed, 30 insertions(+)
diff --git a/sound/pci/hda/patch_realtek.c b/sound/pci/hda/patch_realtek.c index 57bd11c6057d5..b040889b22880 100644 --- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -7049,6 +7049,27 @@ static void alc295_fixup_dell_inspiron_top_speakers(struct hda_codec *codec, } }
+/* Forcibly assign NID 0x03 to HP while NID 0x02 to SPK */ +static void alc287_fixup_bind_dacs(struct hda_codec *codec, + const struct hda_fixup *fix, int action) +{ + struct alc_spec *spec = codec->spec; + static const hda_nid_t conn[] = { 0x02, 0x03 }; /* exclude 0x06 */ + static const hda_nid_t preferred_pairs[] = { + 0x17, 0x02, 0x21, 0x03, 0 + }; + + if (action != HDA_FIXUP_ACT_PRE_PROBE) + return; + + snd_hda_override_conn_list(codec, 0x17, ARRAY_SIZE(conn), conn); + spec->gen.preferred_dacs = preferred_pairs; + spec->gen.auto_mute_via_amp = 1; + snd_hda_codec_write_cache(codec, 0x14, 0, AC_VERB_SET_PIN_WIDGET_CONTROL, + 0x0); /* Make sure 0x14 was disable */ +} + + enum { ALC269_FIXUP_GPIO2, ALC269_FIXUP_SONY_VAIO, @@ -7311,6 +7332,7 @@ enum { ALC287_FIXUP_TAS2781_I2C, ALC245_FIXUP_HP_MUTE_LED_COEFBIT, ALC245_FIXUP_HP_X360_MUTE_LEDS, + ALC287_FIXUP_THINKPAD_I2S_SPK, };
/* A special fixup for Lenovo C940 and Yoga Duet 7; @@ -9405,6 +9427,10 @@ static const struct hda_fixup alc269_fixups[] = { .chained = true, .chain_id = ALC245_FIXUP_HP_GPIO_LED }, + [ALC287_FIXUP_THINKPAD_I2S_SPK] = { + .type = HDA_FIXUP_FUNC, + .v.func = alc287_fixup_bind_dacs, + }, };
static const struct snd_pci_quirk alc269_fixup_tbl[] = { @@ -10537,6 +10563,10 @@ static const struct snd_hda_pin_quirk alc269_pin_fixup_tbl[] = { {0x17, 0x90170111}, {0x19, 0x03a11030}, {0x21, 0x03211020}), + SND_HDA_PIN_QUIRK(0x10ec0287, 0x17aa, "Lenovo", ALC287_FIXUP_THINKPAD_I2S_SPK, + {0x17, 0x90170110}, + {0x19, 0x03a11030}, + {0x21, 0x03211020}), SND_HDA_PIN_QUIRK(0x10ec0286, 0x1025, "Acer", ALC286_FIXUP_ACER_AIO_MIC_NO_PRESENCE, {0x12, 0x90a60130}, {0x17, 0x90170110},
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kailang Yang kailang@realtek.com
[ Upstream commit 41b07476da38ac2878a14e5b8fe0312c41ea36e3 ]
New platform SSID:0x231f.
0x17 was only speaker pin, DAC assigned will be 0x03. Headphone assigned to 0x02. Playback via headphone will get EQ filter processing. So, it needs to swap DAC.
Signed-off-by: Kailang Yang kailang@realtek.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/8d63c6e360124e3ea2523753050e6f05@realtek.com Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Sasha Levin sashal@kernel.org --- sound/pci/hda/patch_realtek.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/sound/pci/hda/patch_realtek.c b/sound/pci/hda/patch_realtek.c index b040889b22880..45fa102060cef 100644 --- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -10567,6 +10567,10 @@ static const struct snd_hda_pin_quirk alc269_pin_fixup_tbl[] = { {0x17, 0x90170110}, {0x19, 0x03a11030}, {0x21, 0x03211020}), + SND_HDA_PIN_QUIRK(0x10ec0287, 0x17aa, "Lenovo", ALC287_FIXUP_THINKPAD_I2S_SPK, + {0x17, 0x90170110}, /* 0x231f with RTK I2S AMP */ + {0x19, 0x04a11040}, + {0x21, 0x04211020}), SND_HDA_PIN_QUIRK(0x10ec0286, 0x1025, "Acer", ALC286_FIXUP_ACER_AIO_MIC_NO_PRESENCE, {0x12, 0x90a60130}, {0x17, 0x90170110},
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sameer Pujar spujar@nvidia.com
[ Upstream commit f101583fa9f8c3f372d4feb61d67da0ccbf4d9a5 ]
Export symbol snd_soc_dai_is_dummy() for usage outside core driver modules. This is required by Tegra ASoC machine driver.
Signed-off-by: Sameer Pujar spujar@nvidia.com Link: https://lore.kernel.org/r/1694098945-32760-2-git-send-email-spujar@nvidia.co... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/soc-utils.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/sound/soc/soc-utils.c b/sound/soc/soc-utils.c index 11607c5f5d5a8..9c746e4edef71 100644 --- a/sound/soc/soc-utils.c +++ b/sound/soc/soc-utils.c @@ -217,6 +217,7 @@ int snd_soc_dai_is_dummy(struct snd_soc_dai *dai) return 1; return 0; } +EXPORT_SYMBOL_GPL(snd_soc_dai_is_dummy);
int snd_soc_component_is_dummy(struct snd_soc_component *component) {
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sameer Pujar spujar@nvidia.com
[ Upstream commit e765886249c533e1bb5cbc3cd741bad677417312 ]
Tegra audio graph card has many DAI links which connects internal AHUB modules and external audio codecs. Since these are DPCM links, hw_params() call in the machine driver happens for each connected BE link and PLLA is updated every time. This is not really needed for all links as only I/O link DAIs derive respective clocks from PLLA_OUT0 and thus from PLLA. Hence add checks to limit the clock updates to DAIs over I/O links.
This found to be fixing a DMIC clock discrepancy which is suspected to happen because of back to back quick PLLA and PLLA_OUT0 rate updates. This was observed on Jetson TX2 platform where DMIC clock ended up with unexpected value.
Fixes: 202e2f774543 ("ASoC: tegra: Add audio graph based card driver") Cc: stable@vger.kernel.org Signed-off-by: Sameer Pujar spujar@nvidia.com Link: https://lore.kernel.org/r/1694098945-32760-3-git-send-email-spujar@nvidia.co... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/tegra/tegra_audio_graph_card.c | 30 ++++++++++++++---------- 1 file changed, 17 insertions(+), 13 deletions(-)
diff --git a/sound/soc/tegra/tegra_audio_graph_card.c b/sound/soc/tegra/tegra_audio_graph_card.c index 1f2c5018bf5ac..4737e776d3837 100644 --- a/sound/soc/tegra/tegra_audio_graph_card.c +++ b/sound/soc/tegra/tegra_audio_graph_card.c @@ -10,6 +10,7 @@ #include <linux/platform_device.h> #include <sound/graph_card.h> #include <sound/pcm_params.h> +#include <sound/soc-dai.h>
#define MAX_PLLA_OUT0_DIV 128
@@ -44,6 +45,21 @@ struct tegra_audio_cdata { unsigned int plla_out0_rates[NUM_RATE_TYPE]; };
+static bool need_clk_update(struct snd_soc_dai *dai) +{ + if (snd_soc_dai_is_dummy(dai) || + !dai->driver->ops || + !dai->driver->name) + return false; + + if (strstr(dai->driver->name, "I2S") || + strstr(dai->driver->name, "DMIC") || + strstr(dai->driver->name, "DSPK")) + return true; + + return false; +} + /* Setup PLL clock as per the given sample rate */ static int tegra_audio_graph_update_pll(struct snd_pcm_substream *substream, struct snd_pcm_hw_params *params) @@ -140,19 +156,7 @@ static int tegra_audio_graph_hw_params(struct snd_pcm_substream *substream, struct snd_soc_dai *cpu_dai = asoc_rtd_to_cpu(rtd, 0); int err;
- /* - * This gets called for each DAI link (FE or BE) when DPCM is used. - * We may not want to update PLLA rate for each call. So PLLA update - * must be restricted to external I/O links (I2S, DMIC or DSPK) since - * they actually depend on it. I/O modules update their clocks in - * hw_param() of their respective component driver and PLLA rate - * update here helps them to derive appropriate rates. - * - * TODO: When more HW accelerators get added (like sample rate - * converter, volume gain controller etc., which don't really - * depend on PLLA) we need a better way to filter here. - */ - if (cpu_dai->driver->ops && rtd->dai_link->no_pcm) { + if (need_clk_update(cpu_dai)) { err = tegra_audio_graph_update_pll(substream, params); if (err) return err;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Liam R. Howlett Liam.Howlett@oracle.com
[ Upstream commit 5c590804b6b0ff933ed4e5cee5d76de3a5048d9f ]
Patch series "maple_tree: Fix mas_prev() state regression".
Pedro Falcato retported an mprotect regression [1] which was bisected back to the iterator changes for maple tree. Root cause analysis showed the mas_prev() running off the end of the VMA space (previous from 0) followed by mas_find(), would skip the first value.
This patchset introduces maple state underflow/overflow so the sequence of calls on the maple state will return what the user expects.
Users who encounter this bug may see mprotect(), userfaultfd_register(), and mlock() fail on VMAs mapped with address 0.
This patch (of 2):
Instead of constantly checking each possibility of the maple state, create a fast path that will skip over checking unlikely states.
Link: https://lkml.kernel.org/r/20230921181236.509072-1-Liam.Howlett@oracle.com Link: https://lkml.kernel.org/r/20230921181236.509072-2-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett Liam.Howlett@oracle.com Cc: Pedro Falcato pedro.falcato@gmail.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/maple_tree.h | 9 +++++++++ 1 file changed, 9 insertions(+)
diff --git a/include/linux/maple_tree.h b/include/linux/maple_tree.h index 295548cca8b36..e1e5f38384b20 100644 --- a/include/linux/maple_tree.h +++ b/include/linux/maple_tree.h @@ -503,6 +503,15 @@ static inline bool mas_is_paused(const struct ma_state *mas) return mas->node == MAS_PAUSE; }
+/* Check if the mas is pointing to a node or not */ +static inline bool mas_is_active(struct ma_state *mas) +{ + if ((unsigned long)mas->node >= MAPLE_RESERVED_RANGE) + return true; + + return false; +} + /** * mas_reset() - Reset a Maple Tree operation state. * @mas: Maple Tree operation state.
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kuniyuki Iwashima kuniyu@amazon.com
[ Upstream commit e263691773cd67d7c824eeee8b802f50c6e0c118 ]
__mptcp_init_sock() always returns 0 because mptcp_init_sock() used to return the value directly.
But after commit 18b683bff89d ("mptcp: queue data for mptcp level retransmission"), __mptcp_init_sock() need not return value anymore.
Let's remove the unnecessary test for __mptcp_init_sock() and make it return void.
Signed-off-by: Kuniyuki Iwashima kuniyu@amazon.com Reviewed-by: Matthieu Baerts matthieu.baerts@tessares.net Signed-off-by: Matthieu Baerts matthieu.baerts@tessares.net Signed-off-by: David S. Miller davem@davemloft.net Stable-dep-of: 27e5ccc2d5a5 ("mptcp: fix dangling connection hang-up") Signed-off-by: Sasha Levin sashal@kernel.org --- net/mptcp/protocol.c | 9 ++------- 1 file changed, 2 insertions(+), 7 deletions(-)
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c index 6947b4b2519c9..0aae76f1465b8 100644 --- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -2710,7 +2710,7 @@ static void mptcp_worker(struct work_struct *work) sock_put(sk); }
-static int __mptcp_init_sock(struct sock *sk) +static void __mptcp_init_sock(struct sock *sk) { struct mptcp_sock *msk = mptcp_sk(sk);
@@ -2737,8 +2737,6 @@ static int __mptcp_init_sock(struct sock *sk) /* re-use the csk retrans timer for MPTCP-level retrans */ timer_setup(&msk->sk.icsk_retransmit_timer, mptcp_retransmit_timer, 0); timer_setup(&sk->sk_timer, mptcp_timeout_timer, 0); - - return 0; }
static void mptcp_ca_reset(struct sock *sk) @@ -2756,11 +2754,8 @@ static void mptcp_ca_reset(struct sock *sk) static int mptcp_init_sock(struct sock *sk) { struct net *net = sock_net(sk); - int ret;
- ret = __mptcp_init_sock(sk); - if (ret) - return ret; + __mptcp_init_sock(sk);
if (!mptcp_is_enabled(net)) return -ENOPROTOOPT;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Paolo Abeni pabeni@redhat.com
[ Upstream commit f6909dc1c1f4452879278128012da6c76bc186a5 ]
The msk socket uses to different timeout to track close related events and retransmissions. The existing helpers do not indicate clearly which timer they actually touch, making the related code quite confusing.
Change the existing helpers name to avoid such confusion. No functional change intended.
This patch is linked to the next one ("mptcp: fix dangling connection hang-up"). The two patches are supposed to be backported together.
Cc: stable@vger.kernel.org # v5.11+ Signed-off-by: Paolo Abeni pabeni@redhat.com Reviewed-by: Matthieu Baerts matthieu.baerts@tessares.net Reviewed-by: Mat Martineau martineau@kernel.org Signed-off-by: Matthieu Baerts matthieu.baerts@tessares.net Signed-off-by: David S. Miller davem@davemloft.net Stable-dep-of: 27e5ccc2d5a5 ("mptcp: fix dangling connection hang-up") Signed-off-by: Sasha Levin sashal@kernel.org --- net/mptcp/protocol.c | 42 +++++++++++++++++++++--------------------- net/mptcp/protocol.h | 2 +- net/mptcp/subflow.c | 2 +- 3 files changed, 23 insertions(+), 23 deletions(-)
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c index 0aae76f1465b8..3c85b4c107b2a 100644 --- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -407,7 +407,7 @@ static bool __mptcp_move_skb(struct mptcp_sock *msk, struct sock *ssk, return false; }
-static void mptcp_stop_timer(struct sock *sk) +static void mptcp_stop_rtx_timer(struct sock *sk) { struct inet_connection_sock *icsk = inet_csk(sk);
@@ -913,12 +913,12 @@ static void __mptcp_flush_join_list(struct sock *sk, struct list_head *join_list } }
-static bool mptcp_timer_pending(struct sock *sk) +static bool mptcp_rtx_timer_pending(struct sock *sk) { return timer_pending(&inet_csk(sk)->icsk_retransmit_timer); }
-static void mptcp_reset_timer(struct sock *sk) +static void mptcp_reset_rtx_timer(struct sock *sk) { struct inet_connection_sock *icsk = inet_csk(sk); unsigned long tout; @@ -1052,10 +1052,10 @@ static void __mptcp_clean_una(struct sock *sk) out: if (snd_una == READ_ONCE(msk->snd_nxt) && snd_una == READ_ONCE(msk->write_seq)) { - if (mptcp_timer_pending(sk) && !mptcp_data_fin_enabled(msk)) - mptcp_stop_timer(sk); + if (mptcp_rtx_timer_pending(sk) && !mptcp_data_fin_enabled(msk)) + mptcp_stop_rtx_timer(sk); } else { - mptcp_reset_timer(sk); + mptcp_reset_rtx_timer(sk); } }
@@ -1606,8 +1606,8 @@ void __mptcp_push_pending(struct sock *sk, unsigned int flags)
out: /* ensure the rtx timer is running */ - if (!mptcp_timer_pending(sk)) - mptcp_reset_timer(sk); + if (!mptcp_rtx_timer_pending(sk)) + mptcp_reset_rtx_timer(sk); if (do_check_data_fin) mptcp_check_send_data_fin(sk); } @@ -1663,8 +1663,8 @@ static void __mptcp_subflow_push_pending(struct sock *sk, struct sock *ssk, bool if (copied) { tcp_push(ssk, 0, info.mss_now, tcp_sk(ssk)->nonagle, info.size_goal); - if (!mptcp_timer_pending(sk)) - mptcp_reset_timer(sk); + if (!mptcp_rtx_timer_pending(sk)) + mptcp_reset_rtx_timer(sk);
if (msk->snd_data_fin_enable && msk->snd_nxt + 1 == msk->write_seq) @@ -2235,7 +2235,7 @@ static void mptcp_retransmit_timer(struct timer_list *t) sock_put(sk); }
-static void mptcp_timeout_timer(struct timer_list *t) +static void mptcp_tout_timer(struct timer_list *t) { struct sock *sk = from_timer(sk, t, sk_timer);
@@ -2607,14 +2607,14 @@ static void __mptcp_retrans(struct sock *sk) reset_timer: mptcp_check_and_set_pending(sk);
- if (!mptcp_timer_pending(sk)) - mptcp_reset_timer(sk); + if (!mptcp_rtx_timer_pending(sk)) + mptcp_reset_rtx_timer(sk); }
/* schedule the timeout timer for the relevant event: either close timeout * or mp_fail timeout. The close timeout takes precedence on the mp_fail one */ -void mptcp_reset_timeout(struct mptcp_sock *msk, unsigned long fail_tout) +void mptcp_reset_tout_timer(struct mptcp_sock *msk, unsigned long fail_tout) { struct sock *sk = (struct sock *)msk; unsigned long timeout, close_timeout; @@ -2647,7 +2647,7 @@ static void mptcp_mp_fail_no_response(struct mptcp_sock *msk) WRITE_ONCE(mptcp_subflow_ctx(ssk)->fail_tout, 0); unlock_sock_fast(ssk, slow);
- mptcp_reset_timeout(msk, 0); + mptcp_reset_tout_timer(msk, 0); }
static void mptcp_do_fastclose(struct sock *sk) @@ -2736,7 +2736,7 @@ static void __mptcp_init_sock(struct sock *sk)
/* re-use the csk retrans timer for MPTCP-level retrans */ timer_setup(&msk->sk.icsk_retransmit_timer, mptcp_retransmit_timer, 0); - timer_setup(&sk->sk_timer, mptcp_timeout_timer, 0); + timer_setup(&sk->sk_timer, mptcp_tout_timer, 0); }
static void mptcp_ca_reset(struct sock *sk) @@ -2821,8 +2821,8 @@ void mptcp_subflow_shutdown(struct sock *sk, struct sock *ssk, int how) } else { pr_debug("Sending DATA_FIN on subflow %p", ssk); tcp_send_ack(ssk); - if (!mptcp_timer_pending(sk)) - mptcp_reset_timer(sk); + if (!mptcp_rtx_timer_pending(sk)) + mptcp_reset_rtx_timer(sk); } break; } @@ -2905,7 +2905,7 @@ static void __mptcp_destroy_sock(struct sock *sk)
might_sleep();
- mptcp_stop_timer(sk); + mptcp_stop_rtx_timer(sk); sk_stop_timer(sk, &sk->sk_timer); msk->pm.status = 0;
@@ -3024,7 +3024,7 @@ bool __mptcp_close(struct sock *sk, long timeout) __mptcp_destroy_sock(sk); do_cancel_work = true; } else { - mptcp_reset_timeout(msk, 0); + mptcp_reset_tout_timer(msk, 0); }
return do_cancel_work; @@ -3087,7 +3087,7 @@ static int mptcp_disconnect(struct sock *sk, int flags) mptcp_check_listen_stop(sk); inet_sk_state_store(sk, TCP_CLOSE);
- mptcp_stop_timer(sk); + mptcp_stop_rtx_timer(sk); sk_stop_timer(sk, &sk->sk_timer);
if (msk->token) diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h index ba2a873a4d2e6..4e31b5cf48299 100644 --- a/net/mptcp/protocol.h +++ b/net/mptcp/protocol.h @@ -699,7 +699,7 @@ void mptcp_get_options(const struct sk_buff *skb,
void mptcp_finish_connect(struct sock *sk); void __mptcp_set_connected(struct sock *sk); -void mptcp_reset_timeout(struct mptcp_sock *msk, unsigned long fail_tout); +void mptcp_reset_tout_timer(struct mptcp_sock *msk, unsigned long fail_tout); static inline bool mptcp_is_fully_established(struct sock *sk) { return inet_sk_state_load(sk) == TCP_ESTABLISHED && diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c index c7bd99b8e7b7a..0506d33177f3d 100644 --- a/net/mptcp/subflow.c +++ b/net/mptcp/subflow.c @@ -1226,7 +1226,7 @@ static void mptcp_subflow_fail(struct mptcp_sock *msk, struct sock *ssk) WRITE_ONCE(subflow->fail_tout, fail_tout); tcp_send_ack(ssk);
- mptcp_reset_timeout(msk, subflow->fail_tout); + mptcp_reset_tout_timer(msk, subflow->fail_tout); }
static bool subflow_check_data_avail(struct sock *ssk)
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Paolo Abeni pabeni@redhat.com
[ Upstream commit 27e5ccc2d5a50ed61bb73153edb1066104b108b3 ]
According to RFC 8684 section 3.3:
A connection is not closed unless [...] or an implementation-specific connection-level send timeout.
Currently the MPTCP protocol does not implement such timeout, and connection timing-out at the TCP-level never move to close state.
Introduces a catch-up condition at subflow close time to move the MPTCP socket to close, too.
That additionally allows removing similar existing inside the worker.
Finally, allow some additional timeout for plain ESTABLISHED mptcp sockets, as the protocol allows creating new subflows even at that point and making the connection functional again.
This issue is actually present since the beginning, but it is basically impossible to solve without a long chain of functional pre-requisites topped by commit bbd49d114d57 ("mptcp: consolidate transition to TCP_CLOSE in mptcp_do_fastclose()"). When backporting this current patch, please also backport this other commit as well.
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/430 Fixes: e16163b6e2b7 ("mptcp: refactor shutdown and close") Cc: stable@vger.kernel.org Signed-off-by: Paolo Abeni pabeni@redhat.com Reviewed-by: Matthieu Baerts matthieu.baerts@tessares.net Reviewed-by: Mat Martineau martineau@kernel.org Signed-off-by: Matthieu Baerts matthieu.baerts@tessares.net Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/mptcp/protocol.c | 86 ++++++++++++++++++++++---------------------- net/mptcp/protocol.h | 22 ++++++++++++ net/mptcp/subflow.c | 1 + 3 files changed, 65 insertions(+), 44 deletions(-)
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c index 3c85b4c107b2a..3a08ad4ec5d2c 100644 --- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -894,6 +894,7 @@ static bool __mptcp_finish_join(struct mptcp_sock *msk, struct sock *ssk) mptcp_subflow_ctx(ssk)->subflow_id = msk->subflow_id++; mptcp_sockopt_sync_locked(msk, ssk); mptcp_subflow_joined(msk, ssk); + mptcp_stop_tout_timer(sk); return true; }
@@ -2357,18 +2358,14 @@ static void __mptcp_close_ssk(struct sock *sk, struct sock *ssk, bool dispose_it, need_push = false;
/* If the first subflow moved to a close state before accept, e.g. due - * to an incoming reset, mptcp either: - * - if either the subflow or the msk are dead, destroy the context - * (the subflow socket is deleted by inet_child_forget) and the msk - * - otherwise do nothing at the moment and take action at accept and/or - * listener shutdown - user-space must be able to accept() the closed - * socket. + * to an incoming reset or listener shutdown, the subflow socket is + * already deleted by inet_child_forget() and the mptcp socket can't + * survive too. */ - if (msk->in_accept_queue && msk->first == ssk) { - if (!sock_flag(sk, SOCK_DEAD) && !sock_flag(ssk, SOCK_DEAD)) - return; - + if (msk->in_accept_queue && msk->first == ssk && + (sock_flag(sk, SOCK_DEAD) || sock_flag(ssk, SOCK_DEAD))) { /* ensure later check in mptcp_worker() will dispose the msk */ + mptcp_set_close_tout(sk, tcp_jiffies32 - (TCP_TIMEWAIT_LEN + 1)); sock_set_flag(sk, SOCK_DEAD); lock_sock_nested(ssk, SINGLE_DEPTH_NESTING); mptcp_subflow_drop_ctx(ssk); @@ -2435,6 +2432,22 @@ static void __mptcp_close_ssk(struct sock *sk, struct sock *ssk,
if (need_push) __mptcp_push_pending(sk, 0); + + /* Catch every 'all subflows closed' scenario, including peers silently + * closing them, e.g. due to timeout. + * For established sockets, allow an additional timeout before closing, + * as the protocol can still create more subflows. + */ + if (list_is_singular(&msk->conn_list) && msk->first && + inet_sk_state_load(msk->first) == TCP_CLOSE) { + if (sk->sk_state != TCP_ESTABLISHED || + msk->in_accept_queue || sock_flag(sk, SOCK_DEAD)) { + inet_sk_state_store(sk, TCP_CLOSE); + mptcp_close_wake_up(sk); + } else { + mptcp_start_tout_timer(sk); + } + } }
void mptcp_close_ssk(struct sock *sk, struct sock *ssk, @@ -2478,23 +2491,14 @@ static void __mptcp_close_subflow(struct sock *sk)
}
-static bool mptcp_should_close(const struct sock *sk) +static bool mptcp_close_tout_expired(const struct sock *sk) { - s32 delta = tcp_jiffies32 - inet_csk(sk)->icsk_mtup.probe_timestamp; - struct mptcp_subflow_context *subflow; - - if (delta >= TCP_TIMEWAIT_LEN || mptcp_sk(sk)->in_accept_queue) - return true; + if (!inet_csk(sk)->icsk_mtup.probe_timestamp || + sk->sk_state == TCP_CLOSE) + return false;
- /* if all subflows are in closed status don't bother with additional - * timeout - */ - mptcp_for_each_subflow(mptcp_sk(sk), subflow) { - if (inet_sk_state_load(mptcp_subflow_tcp_sock(subflow)) != - TCP_CLOSE) - return false; - } - return true; + return time_after32(tcp_jiffies32, + inet_csk(sk)->icsk_mtup.probe_timestamp + TCP_TIMEWAIT_LEN); }
static void mptcp_check_fastclose(struct mptcp_sock *msk) @@ -2619,15 +2623,16 @@ void mptcp_reset_tout_timer(struct mptcp_sock *msk, unsigned long fail_tout) struct sock *sk = (struct sock *)msk; unsigned long timeout, close_timeout;
- if (!fail_tout && !sock_flag(sk, SOCK_DEAD)) + if (!fail_tout && !inet_csk(sk)->icsk_mtup.probe_timestamp) return;
- close_timeout = inet_csk(sk)->icsk_mtup.probe_timestamp - tcp_jiffies32 + jiffies + TCP_TIMEWAIT_LEN; + close_timeout = inet_csk(sk)->icsk_mtup.probe_timestamp - tcp_jiffies32 + jiffies + + TCP_TIMEWAIT_LEN;
/* the close timeout takes precedence on the fail one, and here at least one of * them is active */ - timeout = sock_flag(sk, SOCK_DEAD) ? close_timeout : fail_tout; + timeout = inet_csk(sk)->icsk_mtup.probe_timestamp ? close_timeout : fail_tout;
sk_reset_timer(sk, &sk->sk_timer, timeout); } @@ -2646,8 +2651,6 @@ static void mptcp_mp_fail_no_response(struct mptcp_sock *msk) mptcp_subflow_reset(ssk); WRITE_ONCE(mptcp_subflow_ctx(ssk)->fail_tout, 0); unlock_sock_fast(ssk, slow); - - mptcp_reset_tout_timer(msk, 0); }
static void mptcp_do_fastclose(struct sock *sk) @@ -2684,18 +2687,14 @@ static void mptcp_worker(struct work_struct *work) if (test_and_clear_bit(MPTCP_WORK_CLOSE_SUBFLOW, &msk->flags)) __mptcp_close_subflow(sk);
- /* There is no point in keeping around an orphaned sk timedout or - * closed, but we need the msk around to reply to incoming DATA_FIN, - * even if it is orphaned and in FIN_WAIT2 state - */ - if (sock_flag(sk, SOCK_DEAD)) { - if (mptcp_should_close(sk)) - mptcp_do_fastclose(sk); + if (mptcp_close_tout_expired(sk)) { + mptcp_do_fastclose(sk); + mptcp_close_wake_up(sk); + }
- if (sk->sk_state == TCP_CLOSE) { - __mptcp_destroy_sock(sk); - goto unlock; - } + if (sock_flag(sk, SOCK_DEAD) && sk->sk_state == TCP_CLOSE) { + __mptcp_destroy_sock(sk); + goto unlock; }
if (test_and_clear_bit(MPTCP_WORK_RTX, &msk->flags)) @@ -2987,7 +2986,6 @@ bool __mptcp_close(struct sock *sk, long timeout)
cleanup: /* orphan all the subflows */ - inet_csk(sk)->icsk_mtup.probe_timestamp = tcp_jiffies32; mptcp_for_each_subflow(msk, subflow) { struct sock *ssk = mptcp_subflow_tcp_sock(subflow); bool slow = lock_sock_fast_nested(ssk); @@ -3024,7 +3022,7 @@ bool __mptcp_close(struct sock *sk, long timeout) __mptcp_destroy_sock(sk); do_cancel_work = true; } else { - mptcp_reset_tout_timer(msk, 0); + mptcp_start_tout_timer(sk); }
return do_cancel_work; @@ -3088,7 +3086,7 @@ static int mptcp_disconnect(struct sock *sk, int flags) inet_sk_state_store(sk, TCP_CLOSE);
mptcp_stop_rtx_timer(sk); - sk_stop_timer(sk, &sk->sk_timer); + mptcp_stop_tout_timer(sk);
if (msk->token) mptcp_event(MPTCP_EVENT_CLOSED, msk, NULL, GFP_KERNEL); diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h index 4e31b5cf48299..0a2d9d3df522e 100644 --- a/net/mptcp/protocol.h +++ b/net/mptcp/protocol.h @@ -700,6 +700,28 @@ void mptcp_get_options(const struct sk_buff *skb, void mptcp_finish_connect(struct sock *sk); void __mptcp_set_connected(struct sock *sk); void mptcp_reset_tout_timer(struct mptcp_sock *msk, unsigned long fail_tout); + +static inline void mptcp_stop_tout_timer(struct sock *sk) +{ + if (!inet_csk(sk)->icsk_mtup.probe_timestamp) + return; + + sk_stop_timer(sk, &sk->sk_timer); + inet_csk(sk)->icsk_mtup.probe_timestamp = 0; +} + +static inline void mptcp_set_close_tout(struct sock *sk, unsigned long tout) +{ + /* avoid 0 timestamp, as that means no close timeout */ + inet_csk(sk)->icsk_mtup.probe_timestamp = tout ? : 1; +} + +static inline void mptcp_start_tout_timer(struct sock *sk) +{ + mptcp_set_close_tout(sk, tcp_jiffies32); + mptcp_reset_tout_timer(mptcp_sk(sk), 0); +} + static inline bool mptcp_is_fully_established(struct sock *sk) { return inet_sk_state_load(sk) == TCP_ESTABLISHED && diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c index 0506d33177f3d..40ac67b854074 100644 --- a/net/mptcp/subflow.c +++ b/net/mptcp/subflow.c @@ -1552,6 +1552,7 @@ int __mptcp_subflow_connect(struct sock *sk, const struct mptcp_addr_info *loc, mptcp_sock_graft(ssk, sk->sk_socket); iput(SOCK_INODE(sf)); WRITE_ONCE(msk->allow_infinite_fallback, false); + mptcp_stop_tout_timer(sk); return 0;
failed_unlink:
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Bart Van Assche bvanassche@acm.org
[ Upstream commit 79519528a180c64a90863db2ce70887de6c49d16 ]
Most callers of scsi_rescan_device() have the scsi_device pointer readily available. Pass a struct scsi_device pointer to scsi_rescan_device() instead of a struct device pointer. This change prevents that a pointer to another struct device would be passed accidentally to scsi_rescan_device().
Remove the scsi_rescan_device() declaration from the scsi_priv.h header file since it duplicates the declaration in <scsi/scsi_host.h>.
Reviewed-by: Hannes Reinecke hare@suse.de Reviewed-by: Damien Le Moal damien.lemoal@opensource.wdc.com Reviewed-by: John Garry john.g.garry@oracle.com Cc: Mike Christie michael.christie@oracle.com Cc: Ming Lei ming.lei@redhat.com Signed-off-by: Bart Van Assche bvanassche@acm.org Link: https://lore.kernel.org/r/20230822153043.4046244-1-bvanassche@acm.org Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Stable-dep-of: 8b4d9469d0b0 ("ata: libata-scsi: Fix delayed scsi_rescan_device() execution") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/ata/libata-scsi.c | 2 +- drivers/scsi/aacraid/commsup.c | 2 +- drivers/scsi/mvumi.c | 2 +- drivers/scsi/scsi_lib.c | 2 +- drivers/scsi/scsi_priv.h | 1 - drivers/scsi/scsi_scan.c | 4 ++-- drivers/scsi/scsi_sysfs.c | 4 ++-- drivers/scsi/smartpqi/smartpqi_init.c | 2 +- drivers/scsi/storvsc_drv.c | 2 +- drivers/scsi/virtio_scsi.c | 2 +- include/scsi/scsi_host.h | 2 +- 11 files changed, 12 insertions(+), 13 deletions(-)
diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c index 702812285d8f0..22d7c26297889 100644 --- a/drivers/ata/libata-scsi.c +++ b/drivers/ata/libata-scsi.c @@ -4930,7 +4930,7 @@ void ata_scsi_dev_rescan(struct work_struct *work) }
spin_unlock_irqrestore(ap->lock, flags); - scsi_rescan_device(&(sdev->sdev_gendev)); + scsi_rescan_device(sdev); scsi_device_put(sdev); spin_lock_irqsave(ap->lock, flags); } diff --git a/drivers/scsi/aacraid/commsup.c b/drivers/scsi/aacraid/commsup.c index 3f062e4013ab6..013a9a334972e 100644 --- a/drivers/scsi/aacraid/commsup.c +++ b/drivers/scsi/aacraid/commsup.c @@ -1451,7 +1451,7 @@ static void aac_handle_aif(struct aac_dev * dev, struct fib * fibptr) #endif break; } - scsi_rescan_device(&device->sdev_gendev); + scsi_rescan_device(device); break;
default: diff --git a/drivers/scsi/mvumi.c b/drivers/scsi/mvumi.c index 73aa7059b5569..6cfbac518085d 100644 --- a/drivers/scsi/mvumi.c +++ b/drivers/scsi/mvumi.c @@ -1500,7 +1500,7 @@ static void mvumi_rescan_devices(struct mvumi_hba *mhba, int id)
sdev = scsi_device_lookup(mhba->shost, 0, id, 0); if (sdev) { - scsi_rescan_device(&sdev->sdev_gendev); + scsi_rescan_device(sdev); scsi_device_put(sdev); } } diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c index ad9afae49544a..ca5eb058d5c7e 100644 --- a/drivers/scsi/scsi_lib.c +++ b/drivers/scsi/scsi_lib.c @@ -2458,7 +2458,7 @@ static void scsi_evt_emit(struct scsi_device *sdev, struct scsi_event *evt) envp[idx++] = "SDEV_MEDIA_CHANGE=1"; break; case SDEV_EVT_INQUIRY_CHANGE_REPORTED: - scsi_rescan_device(&sdev->sdev_gendev); + scsi_rescan_device(sdev); envp[idx++] = "SDEV_UA=INQUIRY_DATA_HAS_CHANGED"; break; case SDEV_EVT_CAPACITY_CHANGE_REPORTED: diff --git a/drivers/scsi/scsi_priv.h b/drivers/scsi/scsi_priv.h index f42388ecb0248..65c993c979095 100644 --- a/drivers/scsi/scsi_priv.h +++ b/drivers/scsi/scsi_priv.h @@ -138,7 +138,6 @@ extern int scsi_complete_async_scans(void); extern int scsi_scan_host_selected(struct Scsi_Host *, unsigned int, unsigned int, u64, enum scsi_scan_mode); extern void scsi_forget_host(struct Scsi_Host *); -extern void scsi_rescan_device(struct device *);
/* scsi_sysctl.c */ #ifdef CONFIG_SYSCTL diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c index 97669657a9976..eaa972bee6c00 100644 --- a/drivers/scsi/scsi_scan.c +++ b/drivers/scsi/scsi_scan.c @@ -1619,9 +1619,9 @@ int scsi_add_device(struct Scsi_Host *host, uint channel, } EXPORT_SYMBOL(scsi_add_device);
-void scsi_rescan_device(struct device *dev) +void scsi_rescan_device(struct scsi_device *sdev) { - struct scsi_device *sdev = to_scsi_device(dev); + struct device *dev = &sdev->sdev_gendev;
device_lock(dev);
diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c index 60317676e45f1..24f6eefb68030 100644 --- a/drivers/scsi/scsi_sysfs.c +++ b/drivers/scsi/scsi_sysfs.c @@ -747,7 +747,7 @@ static ssize_t store_rescan_field (struct device *dev, struct device_attribute *attr, const char *buf, size_t count) { - scsi_rescan_device(dev); + scsi_rescan_device(to_scsi_device(dev)); return count; } static DEVICE_ATTR(rescan, S_IWUSR, NULL, store_rescan_field); @@ -840,7 +840,7 @@ store_state_field(struct device *dev, struct device_attribute *attr, * waiting for pending I/O to finish. */ blk_mq_run_hw_queues(sdev->request_queue, true); - scsi_rescan_device(dev); + scsi_rescan_device(sdev); }
return ret == 0 ? count : -EINVAL; diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c index 6aaaa7ebca377..ed694d9399648 100644 --- a/drivers/scsi/smartpqi/smartpqi_init.c +++ b/drivers/scsi/smartpqi/smartpqi_init.c @@ -2257,7 +2257,7 @@ static void pqi_update_device_list(struct pqi_ctrl_info *ctrl_info, device->advertised_queue_depth = device->queue_depth; scsi_change_queue_depth(device->sdev, device->advertised_queue_depth); if (device->rescan) { - scsi_rescan_device(&device->sdev->sdev_gendev); + scsi_rescan_device(device->sdev); device->rescan = false; } } diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c index 047ffaf7d42a9..a80a9e27ff9ee 100644 --- a/drivers/scsi/storvsc_drv.c +++ b/drivers/scsi/storvsc_drv.c @@ -472,7 +472,7 @@ static void storvsc_device_scan(struct work_struct *work) sdev = scsi_device_lookup(wrk->host, 0, wrk->tgt_id, wrk->lun); if (!sdev) goto done; - scsi_rescan_device(&sdev->sdev_gendev); + scsi_rescan_device(sdev); scsi_device_put(sdev);
done: diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c index bd5633667d015..9d1bdcdc13312 100644 --- a/drivers/scsi/virtio_scsi.c +++ b/drivers/scsi/virtio_scsi.c @@ -325,7 +325,7 @@ static void virtscsi_handle_param_change(struct virtio_scsi *vscsi, /* Handle "Parameters changed", "Mode parameters changed", and "Capacity data has changed". */ if (asc == 0x2a && (ascq == 0x00 || ascq == 0x01 || ascq == 0x09)) - scsi_rescan_device(&sdev->sdev_gendev); + scsi_rescan_device(sdev);
scsi_device_put(sdev); } diff --git a/include/scsi/scsi_host.h b/include/scsi/scsi_host.h index a2b8d30c4c803..49f768d0ff370 100644 --- a/include/scsi/scsi_host.h +++ b/include/scsi/scsi_host.h @@ -764,7 +764,7 @@ scsi_template_proc_dir(const struct scsi_host_template *sht); #define scsi_template_proc_dir(sht) NULL #endif extern void scsi_scan_host(struct Scsi_Host *); -extern void scsi_rescan_device(struct device *); +extern void scsi_rescan_device(struct scsi_device *); extern void scsi_remove_host(struct Scsi_Host *); extern struct Scsi_Host *scsi_host_get(struct Scsi_Host *); extern int scsi_host_busy(struct Scsi_Host *shost);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Damien Le Moal dlemoal@kernel.org
[ Upstream commit ff48b37802e5c134e2dfc4d091f10b2eb5065a72 ]
scsi_rescan_device() takes a scsi device lock before executing a device handler and device driver rescan methods. Waiting for the completion of any command issued to the device by these methods will thus be done with the device lock held. As a result, there is a risk of deadlocking within the power management code if scsi_rescan_device() is called to handle a device resume with the associated scsi device not yet resumed.
Avoid such situation by checking that the target scsi device is in the running state, that is, fully capable of executing commands, before proceeding with the rescan and bailout returning -EWOULDBLOCK otherwise. With this error return, the caller can retry rescaning the device after a delay.
The state check is done with the device lock held and is thus safe against incoming suspend power management operations.
Fixes: 6aa0365a3c85 ("ata: libata-scsi: Avoid deadlock on rescan after device resume") Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal dlemoal@kernel.org Reviewed-by: Hannes Reinecke hare@suse.de Reviewed-by: Niklas Cassel niklas.cassel@wdc.com Tested-by: Geert Uytterhoeven geert+renesas@glider.be Reviewed-by: Martin K. Petersen martin.petersen@oracle.com Reviewed-by: Bart Van Assche bvanassche@acm.org Stable-dep-of: 8b4d9469d0b0 ("ata: libata-scsi: Fix delayed scsi_rescan_device() execution") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/scsi_scan.c | 18 +++++++++++++++++- include/scsi/scsi_host.h | 2 +- 2 files changed, 18 insertions(+), 2 deletions(-)
diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c index eaa972bee6c00..902655d759476 100644 --- a/drivers/scsi/scsi_scan.c +++ b/drivers/scsi/scsi_scan.c @@ -1619,12 +1619,24 @@ int scsi_add_device(struct Scsi_Host *host, uint channel, } EXPORT_SYMBOL(scsi_add_device);
-void scsi_rescan_device(struct scsi_device *sdev) +int scsi_rescan_device(struct scsi_device *sdev) { struct device *dev = &sdev->sdev_gendev; + int ret = 0;
device_lock(dev);
+ /* + * Bail out if the device is not running. Otherwise, the rescan may + * block waiting for commands to be executed, with us holding the + * device lock. This can result in a potential deadlock in the power + * management core code when system resume is on-going. + */ + if (sdev->sdev_state != SDEV_RUNNING) { + ret = -EWOULDBLOCK; + goto unlock; + } + scsi_attach_vpd(sdev); scsi_cdl_check(sdev);
@@ -1638,7 +1650,11 @@ void scsi_rescan_device(struct scsi_device *sdev) drv->rescan(dev); module_put(dev->driver->owner); } + +unlock: device_unlock(dev); + + return ret; } EXPORT_SYMBOL(scsi_rescan_device);
diff --git a/include/scsi/scsi_host.h b/include/scsi/scsi_host.h index 49f768d0ff370..4c2dc8150c6d7 100644 --- a/include/scsi/scsi_host.h +++ b/include/scsi/scsi_host.h @@ -764,7 +764,7 @@ scsi_template_proc_dir(const struct scsi_host_template *sht); #define scsi_template_proc_dir(sht) NULL #endif extern void scsi_scan_host(struct Scsi_Host *); -extern void scsi_rescan_device(struct scsi_device *); +extern int scsi_rescan_device(struct scsi_device *sdev); extern void scsi_remove_host(struct Scsi_Host *); extern struct Scsi_Host *scsi_host_get(struct Scsi_Host *); extern int scsi_host_busy(struct Scsi_Host *shost);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Damien Le Moal dlemoal@kernel.org
[ Upstream commit 8b4d9469d0b0e553208ee6f62f2807111fde18b9 ]
Commit 6aa0365a3c85 ("ata: libata-scsi: Avoid deadlock on rescan after device resume") modified ata_scsi_dev_rescan() to check the scsi device "is_suspended" power field to ensure that the scsi device associated with an ATA device is fully resumed when scsi_rescan_device() is executed. However, this fix is problematic as: 1) It relies on a PM internal field that should not be used without PM device locking protection. 2) The check for is_suspended and the call to scsi_rescan_device() are not atomic and a suspend PM event may be triggered between them, casuing scsi_rescan_device() to be called on a suspended device and in that function blocking while holding the scsi device lock. This would deadlock a following resume operation. These problems can trigger PM deadlocks on resume, especially with resume operations triggered quickly after or during suspend operations. E.g., a simple bash script like:
for (( i=0; i<10; i++ )); do echo "+2 > /sys/class/rtc/rtc0/wakealarm echo mem > /sys/power/state done
that triggers a resume 2 seconds after starting suspending a system can quickly lead to a PM deadlock preventing the system from correctly resuming.
Fix this by replacing the check on is_suspended with a check on the return value given by scsi_rescan_device() as that function will fail if called against a suspended device. Also make sure rescan tasks already scheduled are first cancelled before suspending an ata port.
Fixes: 6aa0365a3c85 ("ata: libata-scsi: Avoid deadlock on rescan after device resume") Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal dlemoal@kernel.org Reviewed-by: Hannes Reinecke hare@suse.de Reviewed-by: Niklas Cassel niklas.cassel@wdc.com Tested-by: Geert Uytterhoeven geert+renesas@glider.be Reviewed-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/ata/libata-core.c | 16 ++++++++++++++++ drivers/ata/libata-scsi.c | 33 +++++++++++++++------------------ 2 files changed, 31 insertions(+), 18 deletions(-)
diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c index 76bf185a73c65..6ae9cff6b50c5 100644 --- a/drivers/ata/libata-core.c +++ b/drivers/ata/libata-core.c @@ -5245,11 +5245,27 @@ static const unsigned int ata_port_suspend_ehi = ATA_EHI_QUIET
static void ata_port_suspend(struct ata_port *ap, pm_message_t mesg) { + /* + * We are about to suspend the port, so we do not care about + * scsi_rescan_device() calls scheduled by previous resume operations. + * The next resume will schedule the rescan again. So cancel any rescan + * that is not done yet. + */ + cancel_delayed_work_sync(&ap->scsi_rescan_task); + ata_port_request_pm(ap, mesg, 0, ata_port_suspend_ehi, false); }
static void ata_port_suspend_async(struct ata_port *ap, pm_message_t mesg) { + /* + * We are about to suspend the port, so we do not care about + * scsi_rescan_device() calls scheduled by previous resume operations. + * The next resume will schedule the rescan again. So cancel any rescan + * that is not done yet. + */ + cancel_delayed_work_sync(&ap->scsi_rescan_task); + ata_port_request_pm(ap, mesg, 0, ata_port_suspend_ehi, true); }
diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c index 22d7c26297889..ed3146c460910 100644 --- a/drivers/ata/libata-scsi.c +++ b/drivers/ata/libata-scsi.c @@ -4900,7 +4900,7 @@ void ata_scsi_dev_rescan(struct work_struct *work) struct ata_link *link; struct ata_device *dev; unsigned long flags; - bool delay_rescan = false; + int ret = 0;
mutex_lock(&ap->scsi_scan_mutex); spin_lock_irqsave(ap->lock, flags); @@ -4909,37 +4909,34 @@ void ata_scsi_dev_rescan(struct work_struct *work) ata_for_each_dev(dev, link, ENABLED) { struct scsi_device *sdev = dev->sdev;
+ /* + * If the port was suspended before this was scheduled, + * bail out. + */ + if (ap->pflags & ATA_PFLAG_SUSPENDED) + goto unlock; + if (!sdev) continue; if (scsi_device_get(sdev)) continue;
- /* - * If the rescan work was scheduled because of a resume - * event, the port is already fully resumed, but the - * SCSI device may not yet be fully resumed. In such - * case, executing scsi_rescan_device() may cause a - * deadlock with the PM code on device_lock(). Prevent - * this by giving up and retrying rescan after a short - * delay. - */ - delay_rescan = sdev->sdev_gendev.power.is_suspended; - if (delay_rescan) { - scsi_device_put(sdev); - break; - } - spin_unlock_irqrestore(ap->lock, flags); - scsi_rescan_device(sdev); + ret = scsi_rescan_device(sdev); scsi_device_put(sdev); spin_lock_irqsave(ap->lock, flags); + + if (ret) + goto unlock; } }
+unlock: spin_unlock_irqrestore(ap->lock, flags); mutex_unlock(&ap->scsi_scan_mutex);
- if (delay_rescan) + /* Reschedule with a delay if scsi_rescan_device() returned an error */ + if (ret) schedule_delayed_work(&ap->scsi_rescan_task, msecs_to_jiffies(5)); }
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Christoph Hellwig hch@lst.de
[ Upstream commit 6648cedd86135db197410e56b5372b2945f2b311 ]
btrfs_writepage_endio_finish_ordered is a small wrapper around btrfs_mark_ordered_io_finished that just changs the argument passing slightly, and adds a tracepoint.
Move the tracpoint to btrfs_mark_ordered_io_finished, which means it now also covers the error handling in btrfs_cleanup_ordered_extent and switch all callers to just call btrfs_mark_ordered_io_finished directly.
Reviewed-by: Josef Bacik josef@toxicpanda.com Signed-off-by: Christoph Hellwig hch@lst.de Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Stable-dep-of: b595d2599632 ("btrfs: don't clear uptodate on write errors") Signed-off-by: Sasha Levin sashal@kernel.org --- fs/btrfs/btrfs_inode.h | 3 --- fs/btrfs/extent_io.c | 17 ++++++++--------- fs/btrfs/inode.c | 9 --------- fs/btrfs/ordered-data.c | 4 ++++ 4 files changed, 12 insertions(+), 21 deletions(-)
diff --git a/fs/btrfs/btrfs_inode.h b/fs/btrfs/btrfs_inode.h index d47a927b3504d..90e60ad9db620 100644 --- a/fs/btrfs/btrfs_inode.h +++ b/fs/btrfs/btrfs_inode.h @@ -501,9 +501,6 @@ int btrfs_run_delalloc_range(struct btrfs_inode *inode, struct page *locked_page u64 start, u64 end, int *page_started, unsigned long *nr_written, struct writeback_control *wbc); int btrfs_writepage_cow_fixup(struct page *page); -void btrfs_writepage_endio_finish_ordered(struct btrfs_inode *inode, - struct page *page, u64 start, - u64 end, bool uptodate); int btrfs_encoded_io_compression_from_extent(struct btrfs_fs_info *fs_info, int compress_type); int btrfs_encoded_read_regular_fill_pages(struct btrfs_inode *inode, diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index 7cc0ed7532793..b051b6e52022c 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -504,17 +504,15 @@ void end_extent_writepage(struct page *page, int err, u64 start, u64 end) struct btrfs_inode *inode; const bool uptodate = (err == 0); int ret = 0; + u32 len = end + 1 - start;
+ ASSERT(end + 1 - start <= U32_MAX); ASSERT(page && page->mapping); inode = BTRFS_I(page->mapping->host); - btrfs_writepage_endio_finish_ordered(inode, page, start, end, uptodate); + btrfs_mark_ordered_io_finished(inode, page, start, len, uptodate);
if (!uptodate) { const struct btrfs_fs_info *fs_info = inode->root->fs_info; - u32 len; - - ASSERT(end + 1 - start <= U32_MAX); - len = end + 1 - start;
btrfs_page_clear_uptodate(fs_info, page, start, len); ret = err < 0 ? err : -EIO; @@ -1382,6 +1380,7 @@ static noinline_for_stack int __extent_writepage_io(struct btrfs_inode *inode,
bio_ctrl->end_io_func = end_bio_extent_writepage; while (cur <= end) { + u32 len = end - cur + 1; u64 disk_bytenr; u64 em_end; u64 dirty_range_start = cur; @@ -1389,8 +1388,8 @@ static noinline_for_stack int __extent_writepage_io(struct btrfs_inode *inode, u32 iosize;
if (cur >= i_size) { - btrfs_writepage_endio_finish_ordered(inode, page, cur, - end, true); + btrfs_mark_ordered_io_finished(inode, page, cur, len, + true); /* * This range is beyond i_size, thus we don't need to * bother writing back. @@ -1399,7 +1398,7 @@ static noinline_for_stack int __extent_writepage_io(struct btrfs_inode *inode, * writeback the sectors with subpage dirty bits, * causing writeback without ordered extent. */ - btrfs_page_clear_dirty(fs_info, page, cur, end + 1 - cur); + btrfs_page_clear_dirty(fs_info, page, cur, len); break; }
@@ -1410,7 +1409,7 @@ static noinline_for_stack int __extent_writepage_io(struct btrfs_inode *inode, continue; }
- em = btrfs_get_extent(inode, NULL, 0, cur, end - cur + 1); + em = btrfs_get_extent(inode, NULL, 0, cur, len); if (IS_ERR(em)) { ret = PTR_ERR_OR_ZERO(em); goto out_error; diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index d5c112f6091b1..e0bb4018ddb28 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -3391,15 +3391,6 @@ int btrfs_finish_ordered_io(struct btrfs_ordered_extent *ordered) return btrfs_finish_one_ordered(ordered); }
-void btrfs_writepage_endio_finish_ordered(struct btrfs_inode *inode, - struct page *page, u64 start, - u64 end, bool uptodate) -{ - trace_btrfs_writepage_end_io_hook(inode, start, end, uptodate); - - btrfs_mark_ordered_io_finished(inode, page, start, end + 1 - start, uptodate); -} - /* * Verify the checksum for a single sector without any extra action that depend * on the type of I/O. diff --git a/fs/btrfs/ordered-data.c b/fs/btrfs/ordered-data.c index 5b1aac3fc8e4a..eea5215280dfe 100644 --- a/fs/btrfs/ordered-data.c +++ b/fs/btrfs/ordered-data.c @@ -410,6 +410,10 @@ void btrfs_mark_ordered_io_finished(struct btrfs_inode *inode, unsigned long flags; u64 cur = file_offset;
+ trace_btrfs_writepage_end_io_hook(inode, file_offset, + file_offset + num_bytes - 1, + uptodate); + spin_lock_irqsave(&tree->lock, flags); while (cur < file_offset + num_bytes) { u64 entry_end;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Christoph Hellwig hch@lst.de
[ Upstream commit 9783e4deed7291996459858a1a16f41a8988dd60 ]
end_extent_writepage is a small helper that combines a call to btrfs_mark_ordered_io_finished with conditional error-only calls to btrfs_page_clear_uptodate and mapping_set_error with a somewhat unfortunate calling convention that passes and inclusive end instead of the len expected by the underlying functions.
Remove end_extent_writepage and open code it in the 4 callers. Out of those two already are error-only and thus don't need the extra conditional, and one already has the mapping_set_error, so a duplicate call can be avoided.
Reviewed-by: Josef Bacik josef@toxicpanda.com Signed-off-by: Christoph Hellwig hch@lst.de Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Stable-dep-of: b595d2599632 ("btrfs: don't clear uptodate on write errors") Signed-off-by: Sasha Levin sashal@kernel.org --- fs/btrfs/extent_io.c | 44 +++++++++++++++----------------------------- fs/btrfs/extent_io.h | 2 -- fs/btrfs/inode.c | 42 ++++++++++++++++++++++-------------------- 3 files changed, 37 insertions(+), 51 deletions(-)
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index b051b6e52022c..009c322c5418d 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -497,29 +497,6 @@ static void end_page_read(struct page *page, bool uptodate, u64 start, u32 len) btrfs_subpage_end_reader(fs_info, page, start, len); }
-/* lots and lots of room for performance fixes in the end_bio funcs */ - -void end_extent_writepage(struct page *page, int err, u64 start, u64 end) -{ - struct btrfs_inode *inode; - const bool uptodate = (err == 0); - int ret = 0; - u32 len = end + 1 - start; - - ASSERT(end + 1 - start <= U32_MAX); - ASSERT(page && page->mapping); - inode = BTRFS_I(page->mapping->host); - btrfs_mark_ordered_io_finished(inode, page, start, len, uptodate); - - if (!uptodate) { - const struct btrfs_fs_info *fs_info = inode->root->fs_info; - - btrfs_page_clear_uptodate(fs_info, page, start, len); - ret = err < 0 ? err : -EIO; - mapping_set_error(page->mapping, ret); - } -} - /* * after a writepage IO is done, we need to: * clear the uptodate bits on error @@ -1485,7 +1462,6 @@ static int __extent_writepage(struct page *page, struct btrfs_bio_ctrl *bio_ctrl struct folio *folio = page_folio(page); struct inode *inode = page->mapping->host; const u64 page_start = page_offset(page); - const u64 page_end = page_start + PAGE_SIZE - 1; int ret; int nr = 0; size_t pg_offset; @@ -1529,8 +1505,13 @@ static int __extent_writepage(struct page *page, struct btrfs_bio_ctrl *bio_ctrl set_page_writeback(page); end_page_writeback(page); } - if (ret) - end_extent_writepage(page, ret, page_start, page_end); + if (ret) { + btrfs_mark_ordered_io_finished(BTRFS_I(inode), page, page_start, + PAGE_SIZE, !ret); + btrfs_page_clear_uptodate(btrfs_sb(inode->i_sb), page, + page_start, PAGE_SIZE); + mapping_set_error(page->mapping, ret); + } unlock_page(page); ASSERT(ret <= 0); return ret; @@ -2248,6 +2229,7 @@ int extent_write_locked_range(struct inode *inode, u64 start, u64 end,
while (cur <= end) { u64 cur_end = min(round_down(cur, PAGE_SIZE) + PAGE_SIZE - 1, end); + u32 cur_len = cur_end + 1 - cur; struct page *page; int nr = 0;
@@ -2271,9 +2253,13 @@ int extent_write_locked_range(struct inode *inode, u64 start, u64 end, set_page_writeback(page); end_page_writeback(page); } - if (ret) - end_extent_writepage(page, ret, cur, cur_end); - btrfs_page_unlock_writer(fs_info, page, cur, cur_end + 1 - cur); + if (ret) { + btrfs_mark_ordered_io_finished(BTRFS_I(inode), page, + cur, cur_len, !ret); + btrfs_page_clear_uptodate(fs_info, page, cur, cur_len); + mapping_set_error(page->mapping, ret); + } + btrfs_page_unlock_writer(fs_info, page, cur, cur_len); if (ret < 0) { found_error = true; first_error = ret; diff --git a/fs/btrfs/extent_io.h b/fs/btrfs/extent_io.h index f61b7896320a1..e7b293717dc14 100644 --- a/fs/btrfs/extent_io.h +++ b/fs/btrfs/extent_io.h @@ -284,8 +284,6 @@ void btrfs_clear_buffer_dirty(struct btrfs_trans_handle *trans,
int btrfs_alloc_page_array(unsigned int nr_pages, struct page **page_array);
-void end_extent_writepage(struct page *page, int err, u64 start, u64 end); - #ifdef CONFIG_BTRFS_FS_RUN_SANITY_TESTS bool find_lock_delalloc_range(struct inode *inode, struct page *locked_page, u64 *start, diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index e0bb4018ddb28..b126394ca3dde 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -423,11 +423,10 @@ static inline void btrfs_cleanup_ordered_extents(struct btrfs_inode *inode,
while (index <= end_index) { /* - * For locked page, we will call end_extent_writepage() on it - * in run_delalloc_range() for the error handling. That - * end_extent_writepage() function will call - * btrfs_mark_ordered_io_finished() to clear page Ordered and - * run the ordered extent accounting. + * For locked page, we will call btrfs_mark_ordered_io_finished + * through btrfs_mark_ordered_io_finished() on it + * in run_delalloc_range() for the error handling, which will + * clear page Ordered and run the ordered extent accounting. * * Here we can't just clear the Ordered bit, or * btrfs_mark_ordered_io_finished() would skip the accounting @@ -1157,11 +1156,16 @@ static int submit_uncompressed_range(struct btrfs_inode *inode, btrfs_cleanup_ordered_extents(inode, locked_page, start, end - start + 1); if (locked_page) { const u64 page_start = page_offset(locked_page); - const u64 page_end = page_start + PAGE_SIZE - 1;
set_page_writeback(locked_page); end_page_writeback(locked_page); - end_extent_writepage(locked_page, ret, page_start, page_end); + btrfs_mark_ordered_io_finished(inode, locked_page, + page_start, PAGE_SIZE, + !ret); + btrfs_page_clear_uptodate(inode->root->fs_info, + locked_page, page_start, + PAGE_SIZE); + mapping_set_error(locked_page->mapping, ret); unlock_page(locked_page); } return ret; @@ -2840,23 +2844,19 @@ struct btrfs_writepage_fixup {
static void btrfs_writepage_fixup_worker(struct btrfs_work *work) { - struct btrfs_writepage_fixup *fixup; + struct btrfs_writepage_fixup *fixup = + container_of(work, struct btrfs_writepage_fixup, work); struct btrfs_ordered_extent *ordered; struct extent_state *cached_state = NULL; struct extent_changeset *data_reserved = NULL; - struct page *page; - struct btrfs_inode *inode; - u64 page_start; - u64 page_end; + struct page *page = fixup->page; + struct btrfs_inode *inode = fixup->inode; + struct btrfs_fs_info *fs_info = inode->root->fs_info; + u64 page_start = page_offset(page); + u64 page_end = page_offset(page) + PAGE_SIZE - 1; int ret = 0; bool free_delalloc_space = true;
- fixup = container_of(work, struct btrfs_writepage_fixup, work); - page = fixup->page; - inode = fixup->inode; - page_start = page_offset(page); - page_end = page_offset(page) + PAGE_SIZE - 1; - /* * This is similar to page_mkwrite, we need to reserve the space before * we take the page lock. @@ -2949,10 +2949,12 @@ static void btrfs_writepage_fixup_worker(struct btrfs_work *work) * to reflect the errors and clean the page. */ mapping_set_error(page->mapping, ret); - end_extent_writepage(page, ret, page_start, page_end); + btrfs_mark_ordered_io_finished(inode, page, page_start, + PAGE_SIZE, !ret); + btrfs_page_clear_uptodate(fs_info, page, page_start, PAGE_SIZE); clear_page_dirty_for_io(page); } - btrfs_page_clear_checked(inode->root->fs_info, page, page_start, PAGE_SIZE); + btrfs_page_clear_checked(fs_info, page, page_start, PAGE_SIZE); unlock_page(page); put_page(page); kfree(fixup);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Josef Bacik josef@toxicpanda.com
[ Upstream commit b595d25996329427b2c09d4b90395a165fb3ef8e ]
We have been consistently seeing hangs with generic/648 in our subpage GitHub CI setup. This is a classic deadlock, we are calling btrfs_read_folio() on a folio, which requires holding the folio lock on the folio, and then finding a ordered extent that overlaps that range and calling btrfs_start_ordered_extent(), which then tries to write out the dirty page, which requires taking the folio lock and then we deadlock.
The hang happens because we're writing to range [1271750656, 1271767040), page index [77621, 77622], and page 77621 is !Uptodate. It is also Dirty, so we call btrfs_read_folio() for 77621 and which does btrfs_lock_and_flush_ordered_range() for that range, and we find an ordered extent which is [1271644160, 1271746560), page index [77615, 77621]. The page indexes overlap, but the actual bytes don't overlap. We're holding the page lock for 77621, then call btrfs_lock_and_flush_ordered_range() which tries to flush the dirty page, and tries to lock 77621 again and then we deadlock.
The byte ranges do not overlap, but with subpage support if we clear uptodate on any portion of the page we mark the entire thing as not uptodate.
We have been clearing page uptodate on write errors, but no other file system does this, and is in fact incorrect. This doesn't hurt us in the !subpage case because we can't end up with overlapped ranges that don't also overlap on the page.
Fix this by not clearing uptodate when we have a write error. The only thing we should be doing in this case is setting the mapping error and carrying on. This makes it so we would no longer call btrfs_read_folio() on the page as it's uptodate and eliminates the deadlock.
With this patch we're now able to make it through a full fstests run on our subpage blocksize VMs.
Note for stable backports: this probably goes beyond 6.1 but the code has been cleaned up and clearing the uptodate bit must be verified on each version independently.
CC: stable@vger.kernel.org # 6.1+ Reviewed-by: Qu Wenruo wqu@suse.com Signed-off-by: Josef Bacik josef@toxicpanda.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/btrfs/extent_io.c | 9 +-------- fs/btrfs/inode.c | 4 ---- 2 files changed, 1 insertion(+), 12 deletions(-)
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index 009c322c5418d..d8461c9aa2445 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -533,10 +533,8 @@ static void end_bio_extent_writepage(struct btrfs_bio *bbio) bvec->bv_offset, bvec->bv_len);
btrfs_finish_ordered_extent(bbio->ordered, page, start, len, !error); - if (error) { - btrfs_page_clear_uptodate(fs_info, page, start, len); + if (error) mapping_set_error(page->mapping, error); - } btrfs_page_clear_writeback(fs_info, page, start, len); }
@@ -1508,8 +1506,6 @@ static int __extent_writepage(struct page *page, struct btrfs_bio_ctrl *bio_ctrl if (ret) { btrfs_mark_ordered_io_finished(BTRFS_I(inode), page, page_start, PAGE_SIZE, !ret); - btrfs_page_clear_uptodate(btrfs_sb(inode->i_sb), page, - page_start, PAGE_SIZE); mapping_set_error(page->mapping, ret); } unlock_page(page); @@ -1676,8 +1672,6 @@ static void extent_buffer_write_end_io(struct btrfs_bio *bbio) struct page *page = bvec->bv_page; u32 len = bvec->bv_len;
- if (!uptodate) - btrfs_page_clear_uptodate(fs_info, page, start, len); btrfs_page_clear_writeback(fs_info, page, start, len); bio_offset += len; } @@ -2256,7 +2250,6 @@ int extent_write_locked_range(struct inode *inode, u64 start, u64 end, if (ret) { btrfs_mark_ordered_io_finished(BTRFS_I(inode), page, cur, cur_len, !ret); - btrfs_page_clear_uptodate(fs_info, page, cur, cur_len); mapping_set_error(page->mapping, ret); } btrfs_page_unlock_writer(fs_info, page, cur, cur_len); diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index b126394ca3dde..0f4498dfa30c9 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -1162,9 +1162,6 @@ static int submit_uncompressed_range(struct btrfs_inode *inode, btrfs_mark_ordered_io_finished(inode, locked_page, page_start, PAGE_SIZE, !ret); - btrfs_page_clear_uptodate(inode->root->fs_info, - locked_page, page_start, - PAGE_SIZE); mapping_set_error(locked_page->mapping, ret); unlock_page(locked_page); } @@ -2951,7 +2948,6 @@ static void btrfs_writepage_fixup_worker(struct btrfs_work *work) mapping_set_error(page->mapping, ret); btrfs_mark_ordered_io_finished(inode, page, page_start, PAGE_SIZE, !ret); - btrfs_page_clear_uptodate(fs_info, page, page_start, PAGE_SIZE); clear_page_dirty_for_io(page); } btrfs_page_clear_checked(fs_info, page, page_start, PAGE_SIZE);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Joey Gouly joey.gouly@arm.com
[ Upstream commit 7f86d128e437990fd08d9e66ae7c1571666cff8a ]
Add a HWCAP for FEAT_HBC, so that userspace can make a decision on using this feature.
Signed-off-by: Joey Gouly joey.gouly@arm.com Cc: Catalin Marinas catalin.marinas@arm.com Cc: Will Deacon will@kernel.org Link: https://lore.kernel.org/r/20230804143746.3900803-2-joey.gouly@arm.com Signed-off-by: Will Deacon will@kernel.org Stable-dep-of: 479965a2b7ec ("arm64: cpufeature: Fix CLRBHB and BC detection") Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm64/include/asm/hwcap.h | 1 + arch/arm64/include/uapi/asm/hwcap.h | 1 + arch/arm64/kernel/cpufeature.c | 3 ++- arch/arm64/kernel/cpuinfo.c | 1 + 4 files changed, 5 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/include/asm/hwcap.h b/arch/arm64/include/asm/hwcap.h index 692b1ec663b27..521267478d187 100644 --- a/arch/arm64/include/asm/hwcap.h +++ b/arch/arm64/include/asm/hwcap.h @@ -138,6 +138,7 @@ #define KERNEL_HWCAP_SME_B16B16 __khwcap2_feature(SME_B16B16) #define KERNEL_HWCAP_SME_F16F16 __khwcap2_feature(SME_F16F16) #define KERNEL_HWCAP_MOPS __khwcap2_feature(MOPS) +#define KERNEL_HWCAP_HBC __khwcap2_feature(HBC)
/* * This yields a mask that user programs can use to figure out what diff --git a/arch/arm64/include/uapi/asm/hwcap.h b/arch/arm64/include/uapi/asm/hwcap.h index a2cac4305b1e0..53026f45a5092 100644 --- a/arch/arm64/include/uapi/asm/hwcap.h +++ b/arch/arm64/include/uapi/asm/hwcap.h @@ -103,5 +103,6 @@ #define HWCAP2_SME_B16B16 (1UL << 41) #define HWCAP2_SME_F16F16 (1UL << 42) #define HWCAP2_MOPS (1UL << 43) +#define HWCAP2_HBC (1UL << 44)
#endif /* _UAPI__ASM_HWCAP_H */ diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c index f9d456fe132d8..ac764c1dac363 100644 --- a/arch/arm64/kernel/cpufeature.c +++ b/arch/arm64/kernel/cpufeature.c @@ -222,7 +222,7 @@ static const struct arm64_ftr_bits ftr_id_aa64isar1[] = { static const struct arm64_ftr_bits ftr_id_aa64isar2[] = { ARM64_FTR_BITS(FTR_VISIBLE, FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64ISAR2_EL1_CSSC_SHIFT, 4, 0), ARM64_FTR_BITS(FTR_VISIBLE, FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64ISAR2_EL1_RPRFM_SHIFT, 4, 0), - ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_HIGHER_SAFE, ID_AA64ISAR2_EL1_BC_SHIFT, 4, 0), + ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_HIGHER_SAFE, ID_AA64ISAR2_EL1_BC_SHIFT, 4, 0), ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR2_EL1_MOPS_SHIFT, 4, 0), ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_PTR_AUTH), FTR_STRICT, FTR_EXACT, ID_AA64ISAR2_EL1_APA3_SHIFT, 4, 0), @@ -2844,6 +2844,7 @@ static const struct arm64_cpu_capabilities arm64_elf_hwcaps[] = { HWCAP_CAP(ID_AA64ISAR2_EL1, RPRES, IMP, CAP_HWCAP, KERNEL_HWCAP_RPRES), HWCAP_CAP(ID_AA64ISAR2_EL1, WFxT, IMP, CAP_HWCAP, KERNEL_HWCAP_WFXT), HWCAP_CAP(ID_AA64ISAR2_EL1, MOPS, IMP, CAP_HWCAP, KERNEL_HWCAP_MOPS), + HWCAP_CAP(ID_AA64ISAR2_EL1, BC, IMP, CAP_HWCAP, KERNEL_HWCAP_HBC), #ifdef CONFIG_ARM64_SME HWCAP_CAP(ID_AA64PFR1_EL1, SME, IMP, CAP_HWCAP, KERNEL_HWCAP_SME), HWCAP_CAP(ID_AA64SMFR0_EL1, FA64, IMP, CAP_HWCAP, KERNEL_HWCAP_SME_FA64), diff --git a/arch/arm64/kernel/cpuinfo.c b/arch/arm64/kernel/cpuinfo.c index 58622dc859177..98fda85005353 100644 --- a/arch/arm64/kernel/cpuinfo.c +++ b/arch/arm64/kernel/cpuinfo.c @@ -126,6 +126,7 @@ static const char *const hwcap_str[] = { [KERNEL_HWCAP_SME_B16B16] = "smeb16b16", [KERNEL_HWCAP_SME_F16F16] = "smef16f16", [KERNEL_HWCAP_MOPS] = "mops", + [KERNEL_HWCAP_HBC] = "hbc", };
#ifdef CONFIG_COMPAT
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kristina Martsenko kristina.martsenko@arm.com
[ Upstream commit 479965a2b7ec481737df0cadf553331063b9c343 ]
ClearBHB support is indicated by the CLRBHB field in ID_AA64ISAR2_EL1. Following some refactoring the kernel incorrectly checks the BC field instead. Fix the detection to use the right field.
(Note: The original ClearBHB support had it as FTR_HIGHER_SAFE, but this patch uses FTR_LOWER_SAFE, which seems more correct.)
Also fix the detection of BC (hinted conditional branches) to use FTR_LOWER_SAFE, so that it is not reported on mismatched systems.
Fixes: 356137e68a9f ("arm64/sysreg: Make BHB clear feature defines match the architecture") Fixes: 8fcc8285c0e3 ("arm64/sysreg: Convert ID_AA64ISAR2_EL1 to automatic generation") Cc: stable@vger.kernel.org Signed-off-by: Kristina Martsenko kristina.martsenko@arm.com Reviewed-by: Mark Brown broonie@kernel.org Link: https://lore.kernel.org/r/20230912133429.2606875-1-kristina.martsenko@arm.co... Signed-off-by: Will Deacon will@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm64/include/asm/cpufeature.h | 2 +- arch/arm64/kernel/cpufeature.c | 3 ++- arch/arm64/tools/sysreg | 6 +++++- 3 files changed, 8 insertions(+), 3 deletions(-)
diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h index 96e50227f940e..5bba393760557 100644 --- a/arch/arm64/include/asm/cpufeature.h +++ b/arch/arm64/include/asm/cpufeature.h @@ -663,7 +663,7 @@ static inline bool supports_clearbhb(int scope) isar2 = read_sanitised_ftr_reg(SYS_ID_AA64ISAR2_EL1);
return cpuid_feature_extract_unsigned_field(isar2, - ID_AA64ISAR2_EL1_BC_SHIFT); + ID_AA64ISAR2_EL1_CLRBHB_SHIFT); }
const struct cpumask *system_32bit_el0_cpumask(void); diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c index ac764c1dac363..2c0b8444fea67 100644 --- a/arch/arm64/kernel/cpufeature.c +++ b/arch/arm64/kernel/cpufeature.c @@ -222,7 +222,8 @@ static const struct arm64_ftr_bits ftr_id_aa64isar1[] = { static const struct arm64_ftr_bits ftr_id_aa64isar2[] = { ARM64_FTR_BITS(FTR_VISIBLE, FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64ISAR2_EL1_CSSC_SHIFT, 4, 0), ARM64_FTR_BITS(FTR_VISIBLE, FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64ISAR2_EL1_RPRFM_SHIFT, 4, 0), - ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_HIGHER_SAFE, ID_AA64ISAR2_EL1_BC_SHIFT, 4, 0), + ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR2_EL1_CLRBHB_SHIFT, 4, 0), + ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR2_EL1_BC_SHIFT, 4, 0), ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR2_EL1_MOPS_SHIFT, 4, 0), ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_PTR_AUTH), FTR_STRICT, FTR_EXACT, ID_AA64ISAR2_EL1_APA3_SHIFT, 4, 0), diff --git a/arch/arm64/tools/sysreg b/arch/arm64/tools/sysreg index 65866bf819c33..ffc81afa6caca 100644 --- a/arch/arm64/tools/sysreg +++ b/arch/arm64/tools/sysreg @@ -1347,7 +1347,11 @@ UnsignedEnum 51:48 RPRFM 0b0000 NI 0b0001 IMP EndEnum -Res0 47:28 +Res0 47:32 +UnsignedEnum 31:28 CLRBHB + 0b0000 NI + 0b0001 IMP +EndEnum UnsignedEnum 27:24 PAC_frac 0b0000 NI 0b0001 IMP
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Patrick Rohr prohr@google.com
commit 1671bcfd76fdc0b9e65153cf759153083755fe4c upstream.
This change adds a new sysctl accept_ra_min_rtr_lft to specify the minimum acceptable router lifetime in an RA. If the received RA router lifetime is less than the configured value (and not 0), the RA is ignored. This is useful for mobile devices, whose battery life can be impacted by networks that configure RAs with a short lifetime. On such networks, the device should never gain IPv6 provisioning and should attempt to drop RAs via hardware offload, if available.
Signed-off-by: Patrick Rohr prohr@google.com Cc: Maciej Żenczykowski maze@google.com Cc: Lorenzo Colitti lorenzo@google.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- Documentation/networking/ip-sysctl.rst | 8 ++++++++ include/linux/ipv6.h | 1 + include/uapi/linux/ipv6.h | 1 + net/ipv6/addrconf.c | 10 ++++++++++ net/ipv6/ndisc.c | 18 ++++++++++++++++-- 5 files changed, 36 insertions(+), 2 deletions(-)
--- a/Documentation/networking/ip-sysctl.rst +++ b/Documentation/networking/ip-sysctl.rst @@ -2287,6 +2287,14 @@ accept_ra_min_hop_limit - INTEGER
Default: 1
+accept_ra_min_rtr_lft - INTEGER + Minimum acceptable router lifetime in Router Advertisement. + + RAs with a router lifetime less than this value shall be + ignored. RAs with a router lifetime of 0 are unaffected. + + Default: 0 + accept_ra_pinfo - BOOLEAN Learn Prefix Information in Router Advertisement.
--- a/include/linux/ipv6.h +++ b/include/linux/ipv6.h @@ -33,6 +33,7 @@ struct ipv6_devconf { __s32 accept_ra_defrtr; __u32 ra_defrtr_metric; __s32 accept_ra_min_hop_limit; + __s32 accept_ra_min_rtr_lft; __s32 accept_ra_pinfo; __s32 ignore_routes_with_linkdown; #ifdef CONFIG_IPV6_ROUTER_PREF --- a/include/uapi/linux/ipv6.h +++ b/include/uapi/linux/ipv6.h @@ -198,6 +198,7 @@ enum { DEVCONF_IOAM6_ID_WIDE, DEVCONF_NDISC_EVICT_NOCARRIER, DEVCONF_ACCEPT_UNTRACKED_NA, + DEVCONF_ACCEPT_RA_MIN_RTR_LFT, DEVCONF_MAX };
--- a/net/ipv6/addrconf.c +++ b/net/ipv6/addrconf.c @@ -202,6 +202,7 @@ static struct ipv6_devconf ipv6_devconf .ra_defrtr_metric = IP6_RT_PRIO_USER, .accept_ra_from_local = 0, .accept_ra_min_hop_limit= 1, + .accept_ra_min_rtr_lft = 0, .accept_ra_pinfo = 1, #ifdef CONFIG_IPV6_ROUTER_PREF .accept_ra_rtr_pref = 1, @@ -262,6 +263,7 @@ static struct ipv6_devconf ipv6_devconf_ .ra_defrtr_metric = IP6_RT_PRIO_USER, .accept_ra_from_local = 0, .accept_ra_min_hop_limit= 1, + .accept_ra_min_rtr_lft = 0, .accept_ra_pinfo = 1, #ifdef CONFIG_IPV6_ROUTER_PREF .accept_ra_rtr_pref = 1, @@ -5602,6 +5604,7 @@ static inline void ipv6_store_devconf(st array[DEVCONF_IOAM6_ID_WIDE] = cnf->ioam6_id_wide; array[DEVCONF_NDISC_EVICT_NOCARRIER] = cnf->ndisc_evict_nocarrier; array[DEVCONF_ACCEPT_UNTRACKED_NA] = cnf->accept_untracked_na; + array[DEVCONF_ACCEPT_RA_MIN_RTR_LFT] = cnf->accept_ra_min_rtr_lft; }
static inline size_t inet6_ifla6_size(void) @@ -6794,6 +6797,13 @@ static const struct ctl_table addrconf_s .maxlen = sizeof(int), .mode = 0644, .proc_handler = proc_dointvec, + }, + { + .procname = "accept_ra_min_rtr_lft", + .data = &ipv6_devconf.accept_ra_min_rtr_lft, + .maxlen = sizeof(int), + .mode = 0644, + .proc_handler = proc_dointvec, }, { .procname = "accept_ra_pinfo", --- a/net/ipv6/ndisc.c +++ b/net/ipv6/ndisc.c @@ -1281,6 +1281,8 @@ static enum skb_drop_reason ndisc_router if (!ndisc_parse_options(skb->dev, opt, optlen, &ndopts)) return SKB_DROP_REASON_IPV6_NDISC_BAD_OPTIONS;
+ lifetime = ntohs(ra_msg->icmph.icmp6_rt_lifetime); + if (!ipv6_accept_ra(in6_dev)) { ND_PRINTK(2, info, "RA: %s, did not accept ra for dev: %s\n", @@ -1288,6 +1290,13 @@ static enum skb_drop_reason ndisc_router goto skip_linkparms; }
+ if (lifetime != 0 && lifetime < in6_dev->cnf.accept_ra_min_rtr_lft) { + ND_PRINTK(2, info, + "RA: router lifetime (%ds) is too short: %s\n", + lifetime, skb->dev->name); + goto skip_linkparms; + } + #ifdef CONFIG_IPV6_NDISC_NODETYPE /* skip link-specific parameters from interior routers */ if (skb->ndisc_nodetype == NDISC_NODETYPE_NODEFAULT) { @@ -1340,8 +1349,6 @@ static enum skb_drop_reason ndisc_router goto skip_defrtr; }
- lifetime = ntohs(ra_msg->icmph.icmp6_rt_lifetime); - #ifdef CONFIG_IPV6_ROUTER_PREF pref = ra_msg->icmph.icmp6_router_pref; /* 10b is handled as if it were 00b (medium) */ @@ -1493,6 +1500,13 @@ skip_linkparms: goto out; }
+ if (lifetime != 0 && lifetime < in6_dev->cnf.accept_ra_min_rtr_lft) { + ND_PRINTK(2, info, + "RA: router lifetime (%ds) is too short: %s\n", + lifetime, skb->dev->name); + goto out; + } + #ifdef CONFIG_IPV6_ROUTE_INFO if (!in6_dev->cnf.accept_ra_from_local && ipv6_chk_addr(dev_net(in6_dev->dev), &ipv6_hdr(skb)->saddr,
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Patrick Rohr prohr@google.com
commit 5027d54a9c30bc7ec808360378e2b4753f053f25 upstream.
accept_ra_min_rtr_lft only considered the lifetime of the default route and discarded entire RAs accordingly.
This change renames accept_ra_min_rtr_lft to accept_ra_min_lft, and applies the value to individual RA sections; in particular, router lifetime, PIO preferred lifetime, and RIO lifetime. If any of those lifetimes are lower than the configured value, the specific RA section is ignored.
In order for the sysctl to be useful to Android, it should really apply to all lifetimes in the RA, since that is what determines the minimum frequency at which RAs must be processed by the kernel. Android uses hardware offloads to drop RAs for a fraction of the minimum of all lifetimes present in the RA (some networks have very frequent RAs (5s) with high lifetimes (2h)). Despite this, we have encountered networks that set the router lifetime to 30s which results in very frequent CPU wakeups. Instead of disabling IPv6 (and dropping IPv6 ethertype in the WiFi firmware) entirely on such networks, it seems better to ignore the misconfigured routers while still processing RAs from other IPv6 routers on the same network (i.e. to support IoT applications).
The previous implementation dropped the entire RA based on router lifetime. This turned out to be hard to expand to the other lifetimes present in the RA in a consistent manner; dropping the entire RA based on RIO/PIO lifetimes would essentially require parsing the whole thing twice.
Fixes: 1671bcfd76fd ("net: add sysctl accept_ra_min_rtr_lft") Cc: Lorenzo Colitti lorenzo@google.com Signed-off-by: Patrick Rohr prohr@google.com Reviewed-by: Maciej Żenczykowski maze@google.com Reviewed-by: David Ahern dsahern@kernel.org Link: https://lore.kernel.org/r/20230726230701.919212-1-prohr@google.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- Documentation/networking/ip-sysctl.rst | 8 ++++---- include/linux/ipv6.h | 2 +- include/uapi/linux/ipv6.h | 2 +- net/ipv6/addrconf.c | 13 ++++++++----- net/ipv6/ndisc.c | 27 +++++++++++---------------- 5 files changed, 25 insertions(+), 27 deletions(-)
--- a/Documentation/networking/ip-sysctl.rst +++ b/Documentation/networking/ip-sysctl.rst @@ -2287,11 +2287,11 @@ accept_ra_min_hop_limit - INTEGER
Default: 1
-accept_ra_min_rtr_lft - INTEGER - Minimum acceptable router lifetime in Router Advertisement. +accept_ra_min_lft - INTEGER + Minimum acceptable lifetime value in Router Advertisement.
- RAs with a router lifetime less than this value shall be - ignored. RAs with a router lifetime of 0 are unaffected. + RA sections with a lifetime less than this value shall be + ignored. Zero lifetimes stay unaffected.
Default: 0
--- a/include/linux/ipv6.h +++ b/include/linux/ipv6.h @@ -33,7 +33,7 @@ struct ipv6_devconf { __s32 accept_ra_defrtr; __u32 ra_defrtr_metric; __s32 accept_ra_min_hop_limit; - __s32 accept_ra_min_rtr_lft; + __s32 accept_ra_min_lft; __s32 accept_ra_pinfo; __s32 ignore_routes_with_linkdown; #ifdef CONFIG_IPV6_ROUTER_PREF --- a/include/uapi/linux/ipv6.h +++ b/include/uapi/linux/ipv6.h @@ -198,7 +198,7 @@ enum { DEVCONF_IOAM6_ID_WIDE, DEVCONF_NDISC_EVICT_NOCARRIER, DEVCONF_ACCEPT_UNTRACKED_NA, - DEVCONF_ACCEPT_RA_MIN_RTR_LFT, + DEVCONF_ACCEPT_RA_MIN_LFT, DEVCONF_MAX };
--- a/net/ipv6/addrconf.c +++ b/net/ipv6/addrconf.c @@ -202,7 +202,7 @@ static struct ipv6_devconf ipv6_devconf .ra_defrtr_metric = IP6_RT_PRIO_USER, .accept_ra_from_local = 0, .accept_ra_min_hop_limit= 1, - .accept_ra_min_rtr_lft = 0, + .accept_ra_min_lft = 0, .accept_ra_pinfo = 1, #ifdef CONFIG_IPV6_ROUTER_PREF .accept_ra_rtr_pref = 1, @@ -263,7 +263,7 @@ static struct ipv6_devconf ipv6_devconf_ .ra_defrtr_metric = IP6_RT_PRIO_USER, .accept_ra_from_local = 0, .accept_ra_min_hop_limit= 1, - .accept_ra_min_rtr_lft = 0, + .accept_ra_min_lft = 0, .accept_ra_pinfo = 1, #ifdef CONFIG_IPV6_ROUTER_PREF .accept_ra_rtr_pref = 1, @@ -2733,6 +2733,9 @@ void addrconf_prefix_rcv(struct net_devi return; }
+ if (valid_lft != 0 && valid_lft < in6_dev->cnf.accept_ra_min_lft) + return; + /* * Two things going on here: * 1) Add routes for on-link prefixes @@ -5604,7 +5607,7 @@ static inline void ipv6_store_devconf(st array[DEVCONF_IOAM6_ID_WIDE] = cnf->ioam6_id_wide; array[DEVCONF_NDISC_EVICT_NOCARRIER] = cnf->ndisc_evict_nocarrier; array[DEVCONF_ACCEPT_UNTRACKED_NA] = cnf->accept_untracked_na; - array[DEVCONF_ACCEPT_RA_MIN_RTR_LFT] = cnf->accept_ra_min_rtr_lft; + array[DEVCONF_ACCEPT_RA_MIN_LFT] = cnf->accept_ra_min_lft; }
static inline size_t inet6_ifla6_size(void) @@ -6799,8 +6802,8 @@ static const struct ctl_table addrconf_s .proc_handler = proc_dointvec, }, { - .procname = "accept_ra_min_rtr_lft", - .data = &ipv6_devconf.accept_ra_min_rtr_lft, + .procname = "accept_ra_min_lft", + .data = &ipv6_devconf.accept_ra_min_lft, .maxlen = sizeof(int), .mode = 0644, .proc_handler = proc_dointvec, --- a/net/ipv6/ndisc.c +++ b/net/ipv6/ndisc.c @@ -1281,8 +1281,6 @@ static enum skb_drop_reason ndisc_router if (!ndisc_parse_options(skb->dev, opt, optlen, &ndopts)) return SKB_DROP_REASON_IPV6_NDISC_BAD_OPTIONS;
- lifetime = ntohs(ra_msg->icmph.icmp6_rt_lifetime); - if (!ipv6_accept_ra(in6_dev)) { ND_PRINTK(2, info, "RA: %s, did not accept ra for dev: %s\n", @@ -1290,13 +1288,6 @@ static enum skb_drop_reason ndisc_router goto skip_linkparms; }
- if (lifetime != 0 && lifetime < in6_dev->cnf.accept_ra_min_rtr_lft) { - ND_PRINTK(2, info, - "RA: router lifetime (%ds) is too short: %s\n", - lifetime, skb->dev->name); - goto skip_linkparms; - } - #ifdef CONFIG_IPV6_NDISC_NODETYPE /* skip link-specific parameters from interior routers */ if (skb->ndisc_nodetype == NDISC_NODETYPE_NODEFAULT) { @@ -1337,6 +1328,14 @@ static enum skb_drop_reason ndisc_router goto skip_defrtr; }
+ lifetime = ntohs(ra_msg->icmph.icmp6_rt_lifetime); + if (lifetime != 0 && lifetime < in6_dev->cnf.accept_ra_min_lft) { + ND_PRINTK(2, info, + "RA: router lifetime (%ds) is too short: %s\n", + lifetime, skb->dev->name); + goto skip_defrtr; + } + /* Do not accept RA with source-addr found on local machine unless * accept_ra_from_local is set to true. */ @@ -1500,13 +1499,6 @@ skip_linkparms: goto out; }
- if (lifetime != 0 && lifetime < in6_dev->cnf.accept_ra_min_rtr_lft) { - ND_PRINTK(2, info, - "RA: router lifetime (%ds) is too short: %s\n", - lifetime, skb->dev->name); - goto out; - } - #ifdef CONFIG_IPV6_ROUTE_INFO if (!in6_dev->cnf.accept_ra_from_local && ipv6_chk_addr(dev_net(in6_dev->dev), &ipv6_hdr(skb)->saddr, @@ -1531,6 +1523,9 @@ skip_linkparms: if (ri->prefix_len == 0 && !in6_dev->cnf.accept_ra_defrtr) continue; + if (ri->lifetime != 0 && + ntohl(ri->lifetime) < in6_dev->cnf.accept_ra_min_lft) + continue; if (ri->prefix_len < in6_dev->cnf.accept_ra_rt_info_min_plen) continue; if (ri->prefix_len > in6_dev->cnf.accept_ra_rt_info_max_plen)
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Patrick Rohr prohr@google.com
commit 5cb249686e67dbef3ffe53887fa725eefc5a7144 upstream.
addrconf_prefix_rcv returned early without releasing the inet6_dev pointer when the PIO lifetime is less than accept_ra_min_lft.
Fixes: 5027d54a9c30 ("net: change accept_ra_min_rtr_lft to affect all RA lifetimes") Cc: Maciej Żenczykowski maze@google.com Cc: Lorenzo Colitti lorenzo@google.com Cc: David Ahern dsahern@kernel.org Cc: Simon Horman horms@kernel.org Reviewed-by: Simon Horman horms@kernel.org Reviewed-by: Maciej Żenczykowski maze@google.com Signed-off-by: Patrick Rohr prohr@google.com Reviewed-by: Leon Romanovsky leonro@nvidia.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv6/addrconf.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/net/ipv6/addrconf.c +++ b/net/ipv6/addrconf.c @@ -2734,7 +2734,7 @@ void addrconf_prefix_rcv(struct net_devi }
if (valid_lft != 0 && valid_lft < in6_dev->cnf.accept_ra_min_lft) - return; + goto put;
/* * Two things going on here:
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Robin Murphy robin.murphy@arm.com
[ Upstream commit eb6c97647be227822c7ce23655482b05e348fba5 ]
Although io-pgtable's non-leaf invalidations are always for full tables, I missed that SVA also uses non-leaf invalidations, while being at the mercy of whatever range the MMU notifier throws at it. This means it definitely wants the previous TTL fix as well, since it also doesn't know exactly which leaf level(s) may need invalidating, but it can also give us less-aligned ranges wherein certain corners may lead to building an invalid command where TTL, Num and Scale are all 0. It should be fine to handle this by over-invalidating an extra page, since falling back to a non-range command opens up a whole can of errata-flavoured worms.
Fixes: 6833b8f2e199 ("iommu/arm-smmu-v3: Set TTL invalidation hint better") Reported-by: Rui Zhu zhurui3@huawei.com Signed-off-by: Robin Murphy robin.murphy@arm.com Link: https://lore.kernel.org/r/b99cfe71af2bd93a8a2930f20967fb2a4f7748dd.169443273... Signed-off-by: Will Deacon will@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 15 ++++++++++----- 1 file changed, 10 insertions(+), 5 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index 9b0dc35056019..6ccbae9b93a14 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -1895,18 +1895,23 @@ static void __arm_smmu_tlb_inv_range(struct arm_smmu_cmdq_ent *cmd, /* Get the leaf page size */ tg = __ffs(smmu_domain->domain.pgsize_bitmap);
+ num_pages = size >> tg; + /* Convert page size of 12,14,16 (log2) to 1,2,3 */ cmd->tlbi.tg = (tg - 10) / 2;
/* - * Determine what level the granule is at. For non-leaf, io-pgtable - * assumes .tlb_flush_walk can invalidate multiple levels at once, - * so ignore the nominal last-level granule and leave TTL=0. + * Determine what level the granule is at. For non-leaf, both + * io-pgtable and SVA pass a nominal last-level granule because + * they don't know what level(s) actually apply, so ignore that + * and leave TTL=0. However for various errata reasons we still + * want to use a range command, so avoid the SVA corner case + * where both scale and num could be 0 as well. */ if (cmd->tlbi.leaf) cmd->tlbi.ttl = 4 - ((ilog2(granule) - 3) / (tg - 3)); - - num_pages = size >> tg; + else if ((num_pages & CMDQ_TLBI_RANGE_NUM_MAX) == 1) + num_pages++; }
cmds.num = 0;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Liam R. Howlett Liam.Howlett@oracle.com
commit fec29364348fec535c55708b1f4025b321aba572 upstream.
mas_prealloc() may walk partially down the tree before finding that a split or spanning store is needed. When the write occurs, relax the logic on resetting the walk so that partial walks will not restart, but walks that have gone too far (a store that affects beyond the current node) should be restarted.
Link: https://lkml.kernel.org/r/20230724183157.3939892-15-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett Liam.Howlett@oracle.com Cc: Peng Zhang zhangpeng.00@bytedance.com Cc: Suren Baghdasaryan surenb@google.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- lib/maple_tree.c | 37 ++++++++++++++++++++++++++----------- 1 file changed, 26 insertions(+), 11 deletions(-)
--- a/lib/maple_tree.c +++ b/lib/maple_tree.c @@ -5439,19 +5439,34 @@ static inline void mte_destroy_walk(stru
static void mas_wr_store_setup(struct ma_wr_state *wr_mas) { + if (mas_is_start(wr_mas->mas)) + return; + if (unlikely(mas_is_paused(wr_mas->mas))) - mas_reset(wr_mas->mas); + goto reset; + + if (unlikely(mas_is_none(wr_mas->mas))) + goto reset; + + /* + * A less strict version of mas_is_span_wr() where we allow spanning + * writes within this node. This is to stop partial walks in + * mas_prealloc() from being reset. + */ + if (wr_mas->mas->last > wr_mas->mas->max) + goto reset; + + if (wr_mas->entry) + return; + + if (mte_is_leaf(wr_mas->mas->node) && + wr_mas->mas->last == wr_mas->mas->max) + goto reset; + + return;
- if (!mas_is_start(wr_mas->mas)) { - if (mas_is_none(wr_mas->mas)) { - mas_reset(wr_mas->mas); - } else { - wr_mas->r_max = wr_mas->mas->max; - wr_mas->type = mte_node_type(wr_mas->mas->node); - if (mas_is_span_wr(wr_mas)) - mas_reset(wr_mas->mas); - } - } +reset: + mas_reset(wr_mas->mas); }
/* Interface */
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Liam R. Howlett Liam.Howlett@oracle.com
commit a8091f039c1ebf5cb0d5261e3613f18eb2a5d8b7 upstream.
When updating the maple tree iterator to avoid rewalks, an issue was introduced when shifting beyond the limits. This can be seen by trying to go to the previous address of 0, which would set the maple node to MAS_NONE and keep the range as the last entry.
Subsequent calls to mas_find() would then search upwards from mas->last and skip the value at mas->index/mas->last. This showed up as a bug in mprotect which skips the actual VMA at the current range after attempting to go to the previous VMA from 0.
Since MAS_NONE may already be set when searching for a value that isn't contained within a node, changing the handling of MAS_NONE in mas_find() would make the code more complicated and error prone. Furthermore, there was no way to tell which limit was hit, and thus which action to take (next or the entry at the current range).
This solution is to add two states to track what happened with the previous iterator action. This allows for the expected behaviour of the next command to return the correct item (either the item at the range requested, or the next/previous).
Tests are also added and updated accordingly.
Link: https://lkml.kernel.org/r/20230921181236.509072-3-Liam.Howlett@oracle.com Link: https://gist.github.com/heatd/85d2971fae1501b55b6ea401fbbe485b Link: https://lore.kernel.org/linux-mm/20230921181236.509072-1-Liam.Howlett@oracle... Fixes: 39193685d585 ("maple_tree: try harder to keep active node with mas_prev()") Signed-off-by: Liam R. Howlett Liam.Howlett@oracle.com Reported-by: Pedro Falcato pedro.falcato@gmail.com Closes: https://gist.github.com/heatd/85d2971fae1501b55b6ea401fbbe485b Closes: https://bugs.archlinux.org/task/79656 Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/linux/maple_tree.h | 2 lib/maple_tree.c | 221 +++++++++++++++++++++++++++++++++------------ lib/test_maple_tree.c | 87 ++++++++++++++--- 3 files changed, 237 insertions(+), 73 deletions(-)
--- a/include/linux/maple_tree.h +++ b/include/linux/maple_tree.h @@ -420,6 +420,8 @@ struct ma_wr_state { #define MAS_ROOT ((struct maple_enode *)5UL) #define MAS_NONE ((struct maple_enode *)9UL) #define MAS_PAUSE ((struct maple_enode *)17UL) +#define MAS_OVERFLOW ((struct maple_enode *)33UL) +#define MAS_UNDERFLOW ((struct maple_enode *)65UL) #define MA_ERROR(err) \ ((struct maple_enode *)(((unsigned long)err << 2) | 2UL))
--- a/lib/maple_tree.c +++ b/lib/maple_tree.c @@ -255,6 +255,22 @@ bool mas_is_err(struct ma_state *mas) return xa_is_err(mas->node); }
+static __always_inline bool mas_is_overflow(struct ma_state *mas) +{ + if (unlikely(mas->node == MAS_OVERFLOW)) + return true; + + return false; +} + +static __always_inline bool mas_is_underflow(struct ma_state *mas) +{ + if (unlikely(mas->node == MAS_UNDERFLOW)) + return true; + + return false; +} + static inline bool mas_searchable(struct ma_state *mas) { if (mas_is_none(mas)) @@ -4560,10 +4576,13 @@ no_entry: * * @mas: The maple state * @max: The minimum starting range + * @empty: Can be empty + * @set_underflow: Set the @mas->node to underflow state on limit. * * Return: The entry in the previous slot which is possibly NULL */ -static void *mas_prev_slot(struct ma_state *mas, unsigned long min, bool empty) +static void *mas_prev_slot(struct ma_state *mas, unsigned long min, bool empty, + bool set_underflow) { void *entry; void __rcu **slots; @@ -4580,7 +4599,6 @@ retry: if (unlikely(mas_rewalk_if_dead(mas, node, save_point))) goto retry;
-again: if (mas->min <= min) { pivot = mas_safe_min(mas, pivots, mas->offset);
@@ -4588,9 +4606,10 @@ again: goto retry;
if (pivot <= min) - return NULL; + goto underflow; }
+again: if (likely(mas->offset)) { mas->offset--; mas->last = mas->index - 1; @@ -4602,7 +4621,7 @@ again: }
if (mas_is_none(mas)) - return NULL; + goto underflow;
mas->last = mas->max; node = mas_mn(mas); @@ -4619,10 +4638,19 @@ again: if (likely(entry)) return entry;
- if (!empty) + if (!empty) { + if (mas->index <= min) + goto underflow; + goto again; + }
return entry; + +underflow: + if (set_underflow) + mas->node = MAS_UNDERFLOW; + return NULL; }
/* @@ -4712,10 +4740,13 @@ no_entry: * @mas: The maple state * @max: The maximum starting range * @empty: Can be empty + * @set_overflow: Should @mas->node be set to overflow when the limit is + * reached. * * Return: The entry in the next slot which is possibly NULL */ -static void *mas_next_slot(struct ma_state *mas, unsigned long max, bool empty) +static void *mas_next_slot(struct ma_state *mas, unsigned long max, bool empty, + bool set_overflow) { void __rcu **slots; unsigned long *pivots; @@ -4734,22 +4765,22 @@ retry: if (unlikely(mas_rewalk_if_dead(mas, node, save_point))) goto retry;
-again: if (mas->max >= max) { if (likely(mas->offset < data_end)) pivot = pivots[mas->offset]; else - return NULL; /* must be mas->max */ + goto overflow;
if (unlikely(mas_rewalk_if_dead(mas, node, save_point))) goto retry;
if (pivot >= max) - return NULL; + goto overflow; }
if (likely(mas->offset < data_end)) { mas->index = pivots[mas->offset] + 1; +again: mas->offset++; if (likely(mas->offset < data_end)) mas->last = pivots[mas->offset]; @@ -4761,8 +4792,11 @@ again: goto retry; }
- if (mas_is_none(mas)) + if (WARN_ON_ONCE(mas_is_none(mas))) { + mas->node = MAS_OVERFLOW; return NULL; + goto overflow; + }
mas->offset = 0; mas->index = mas->min; @@ -4781,12 +4815,20 @@ again: return entry;
if (!empty) { - if (!mas->offset) - data_end = 2; + if (mas->last >= max) + goto overflow; + + mas->index = mas->last + 1; + /* Node cannot end on NULL, so it's safe to short-cut here */ goto again; }
return entry; + +overflow: + if (set_overflow) + mas->node = MAS_OVERFLOW; + return NULL; }
/* @@ -4796,17 +4838,20 @@ again: * * Set the @mas->node to the next entry and the range_start to * the beginning value for the entry. Does not check beyond @limit. - * Sets @mas->index and @mas->last to the limit if it is hit. + * Sets @mas->index and @mas->last to the range, Does not update @mas->index and + * @mas->last on overflow. * Restarts on dead nodes. * * Return: the next entry or %NULL. */ static inline void *mas_next_entry(struct ma_state *mas, unsigned long limit) { - if (mas->last >= limit) + if (mas->last >= limit) { + mas->node = MAS_OVERFLOW; return NULL; + }
- return mas_next_slot(mas, limit, false); + return mas_next_slot(mas, limit, false, true); }
/* @@ -4982,7 +5027,7 @@ void *mas_walk(struct ma_state *mas) { void *entry;
- if (mas_is_none(mas) || mas_is_paused(mas) || mas_is_ptr(mas)) + if (!mas_is_active(mas) || !mas_is_start(mas)) mas->node = MAS_START; retry: entry = mas_state_walk(mas); @@ -5439,14 +5484,22 @@ static inline void mte_destroy_walk(stru
static void mas_wr_store_setup(struct ma_wr_state *wr_mas) { - if (mas_is_start(wr_mas->mas)) - return; + if (!mas_is_active(wr_mas->mas)) { + if (mas_is_start(wr_mas->mas)) + return;
- if (unlikely(mas_is_paused(wr_mas->mas))) - goto reset; + if (unlikely(mas_is_paused(wr_mas->mas))) + goto reset;
- if (unlikely(mas_is_none(wr_mas->mas))) - goto reset; + if (unlikely(mas_is_none(wr_mas->mas))) + goto reset; + + if (unlikely(mas_is_overflow(wr_mas->mas))) + goto reset; + + if (unlikely(mas_is_underflow(wr_mas->mas))) + goto reset; + }
/* * A less strict version of mas_is_span_wr() where we allow spanning @@ -5697,8 +5750,25 @@ static inline bool mas_next_setup(struct { bool was_none = mas_is_none(mas);
- if (mas_is_none(mas) || mas_is_paused(mas)) + if (unlikely(mas->last >= max)) { + mas->node = MAS_OVERFLOW; + return true; + } + + if (mas_is_active(mas)) + return false; + + if (mas_is_none(mas) || mas_is_paused(mas)) { + mas->node = MAS_START; + } else if (mas_is_overflow(mas)) { + /* Overflowed before, but the max changed */ mas->node = MAS_START; + } else if (mas_is_underflow(mas)) { + mas->node = MAS_START; + *entry = mas_walk(mas); + if (*entry) + return true; + }
if (mas_is_start(mas)) *entry = mas_walk(mas); /* Retries on dead nodes handled by mas_walk */ @@ -5717,6 +5787,7 @@ static inline bool mas_next_setup(struct
if (mas_is_none(mas)) return true; + return false; }
@@ -5739,7 +5810,7 @@ void *mas_next(struct ma_state *mas, uns return entry;
/* Retries on dead nodes handled by mas_next_slot */ - return mas_next_slot(mas, max, false); + return mas_next_slot(mas, max, false, true); } EXPORT_SYMBOL_GPL(mas_next);
@@ -5762,7 +5833,7 @@ void *mas_next_range(struct ma_state *ma return entry;
/* Retries on dead nodes handled by mas_next_slot */ - return mas_next_slot(mas, max, true); + return mas_next_slot(mas, max, true, true); } EXPORT_SYMBOL_GPL(mas_next_range);
@@ -5789,18 +5860,31 @@ EXPORT_SYMBOL_GPL(mt_next); static inline bool mas_prev_setup(struct ma_state *mas, unsigned long min, void **entry) { - if (mas->index <= min) - goto none; + if (unlikely(mas->index <= min)) { + mas->node = MAS_UNDERFLOW; + return true; + }
- if (mas_is_none(mas) || mas_is_paused(mas)) + if (mas_is_active(mas)) + return false; + + if (mas_is_overflow(mas)) { mas->node = MAS_START; + *entry = mas_walk(mas); + if (*entry) + return true; + }
- if (mas_is_start(mas)) { - mas_walk(mas); - if (!mas->index) - goto none; + if (mas_is_none(mas) || mas_is_paused(mas)) { + mas->node = MAS_START; + } else if (mas_is_underflow(mas)) { + /* underflowed before but the min changed */ + mas->node = MAS_START; }
+ if (mas_is_start(mas)) + mas_walk(mas); + if (unlikely(mas_is_ptr(mas))) { if (!mas->index) goto none; @@ -5845,7 +5929,7 @@ void *mas_prev(struct ma_state *mas, uns if (mas_prev_setup(mas, min, &entry)) return entry;
- return mas_prev_slot(mas, min, false); + return mas_prev_slot(mas, min, false, true); } EXPORT_SYMBOL_GPL(mas_prev);
@@ -5868,7 +5952,7 @@ void *mas_prev_range(struct ma_state *ma if (mas_prev_setup(mas, min, &entry)) return entry;
- return mas_prev_slot(mas, min, true); + return mas_prev_slot(mas, min, true, true); } EXPORT_SYMBOL_GPL(mas_prev_range);
@@ -5922,24 +6006,35 @@ EXPORT_SYMBOL_GPL(mas_pause); static inline bool mas_find_setup(struct ma_state *mas, unsigned long max, void **entry) { - *entry = NULL; + if (mas_is_active(mas)) { + if (mas->last < max) + return false; + + return true; + }
- if (unlikely(mas_is_none(mas))) { + if (mas_is_paused(mas)) { if (unlikely(mas->last >= max)) return true;
- mas->index = mas->last; + mas->index = ++mas->last; mas->node = MAS_START; - } else if (unlikely(mas_is_paused(mas))) { + } else if (mas_is_none(mas)) { if (unlikely(mas->last >= max)) return true;
+ mas->index = mas->last; mas->node = MAS_START; - mas->index = ++mas->last; - } else if (unlikely(mas_is_ptr(mas))) - goto ptr_out_of_range; + } else if (mas_is_overflow(mas) || mas_is_underflow(mas)) { + if (mas->index > max) { + mas->node = MAS_OVERFLOW; + return true; + }
- if (unlikely(mas_is_start(mas))) { + mas->node = MAS_START; + } + + if (mas_is_start(mas)) { /* First run or continue */ if (mas->index > max) return true; @@ -5989,7 +6084,7 @@ void *mas_find(struct ma_state *mas, uns return entry;
/* Retries on dead nodes handled by mas_next_slot */ - return mas_next_slot(mas, max, false); + return mas_next_slot(mas, max, false, false); } EXPORT_SYMBOL_GPL(mas_find);
@@ -6007,13 +6102,13 @@ EXPORT_SYMBOL_GPL(mas_find); */ void *mas_find_range(struct ma_state *mas, unsigned long max) { - void *entry; + void *entry = NULL;
if (mas_find_setup(mas, max, &entry)) return entry;
/* Retries on dead nodes handled by mas_next_slot */ - return mas_next_slot(mas, max, true); + return mas_next_slot(mas, max, true, false); } EXPORT_SYMBOL_GPL(mas_find_range);
@@ -6028,26 +6123,36 @@ EXPORT_SYMBOL_GPL(mas_find_range); static inline bool mas_find_rev_setup(struct ma_state *mas, unsigned long min, void **entry) { - *entry = NULL; - - if (unlikely(mas_is_none(mas))) { - if (mas->index <= min) - goto none; + if (mas_is_active(mas)) { + if (mas->index > min) + return false;
- mas->last = mas->index; - mas->node = MAS_START; + return true; }
- if (unlikely(mas_is_paused(mas))) { + if (mas_is_paused(mas)) { if (unlikely(mas->index <= min)) { mas->node = MAS_NONE; return true; } mas->node = MAS_START; mas->last = --mas->index; + } else if (mas_is_none(mas)) { + if (mas->index <= min) + goto none; + + mas->last = mas->index; + mas->node = MAS_START; + } else if (mas_is_underflow(mas) || mas_is_overflow(mas)) { + if (mas->last <= min) { + mas->node = MAS_UNDERFLOW; + return true; + } + + mas->node = MAS_START; }
- if (unlikely(mas_is_start(mas))) { + if (mas_is_start(mas)) { /* First run or continue */ if (mas->index < min) return true; @@ -6098,13 +6203,13 @@ none: */ void *mas_find_rev(struct ma_state *mas, unsigned long min) { - void *entry; + void *entry = NULL;
if (mas_find_rev_setup(mas, min, &entry)) return entry;
/* Retries on dead nodes handled by mas_prev_slot */ - return mas_prev_slot(mas, min, false); + return mas_prev_slot(mas, min, false, false);
} EXPORT_SYMBOL_GPL(mas_find_rev); @@ -6124,13 +6229,13 @@ EXPORT_SYMBOL_GPL(mas_find_rev); */ void *mas_find_range_rev(struct ma_state *mas, unsigned long min) { - void *entry; + void *entry = NULL;
if (mas_find_rev_setup(mas, min, &entry)) return entry;
/* Retries on dead nodes handled by mas_prev_slot */ - return mas_prev_slot(mas, min, true); + return mas_prev_slot(mas, min, true, false); } EXPORT_SYMBOL_GPL(mas_find_range_rev);
--- a/lib/test_maple_tree.c +++ b/lib/test_maple_tree.c @@ -2039,7 +2039,7 @@ static noinline void __init next_prev_te MT_BUG_ON(mt, val != NULL); MT_BUG_ON(mt, mas.index != 0); MT_BUG_ON(mt, mas.last != 5); - MT_BUG_ON(mt, mas.node != MAS_NONE); + MT_BUG_ON(mt, mas.node != MAS_UNDERFLOW);
mas.index = 0; mas.last = 5; @@ -2790,6 +2790,7 @@ static noinline void __init check_empty_ * exists MAS_NONE active range * exists active active range * DNE active active set to last range + * ERANGE active MAS_OVERFLOW last range * * Function ENTRY Start Result index & last * mas_prev() @@ -2818,6 +2819,7 @@ static noinline void __init check_empty_ * any MAS_ROOT MAS_NONE 0 * exists active active range * DNE active active last range + * ERANGE active MAS_UNDERFLOW last range * * Function ENTRY Start Result index & last * mas_find() @@ -2828,7 +2830,7 @@ static noinline void __init check_empty_ * DNE MAS_START MAS_NONE 0 * DNE MAS_PAUSE MAS_NONE 0 * DNE MAS_ROOT MAS_NONE 0 - * DNE MAS_NONE MAS_NONE 0 + * DNE MAS_NONE MAS_NONE 1 * if index == 0 * exists MAS_START MAS_ROOT 0 * exists MAS_PAUSE MAS_ROOT 0 @@ -2840,7 +2842,7 @@ static noinline void __init check_empty_ * DNE MAS_START active set to max * exists MAS_PAUSE active range * DNE MAS_PAUSE active set to max - * exists MAS_NONE active range + * exists MAS_NONE active range (start at last) * exists active active range * DNE active active last range (max < last) * @@ -2865,7 +2867,7 @@ static noinline void __init check_empty_ * DNE MAS_START active set to min * exists MAS_PAUSE active range * DNE MAS_PAUSE active set to min - * exists MAS_NONE active range + * exists MAS_NONE active range (start at index) * exists active active range * DNE active active last range (min > index) * @@ -2912,10 +2914,10 @@ static noinline void __init check_state_ mtree_store_range(mt, 0, 0, ptr, GFP_KERNEL);
mas_lock(&mas); - /* prev: Start -> none */ + /* prev: Start -> underflow*/ entry = mas_prev(&mas, 0); MT_BUG_ON(mt, entry != NULL); - MT_BUG_ON(mt, mas.node != MAS_NONE); + MT_BUG_ON(mt, mas.node != MAS_UNDERFLOW);
/* prev: Start -> root */ mas_set(&mas, 10); @@ -2942,7 +2944,7 @@ static noinline void __init check_state_ MT_BUG_ON(mt, entry != NULL); MT_BUG_ON(mt, mas.node != MAS_NONE);
- /* next: start -> none */ + /* next: start -> none*/ mas_set(&mas, 10); entry = mas_next(&mas, ULONG_MAX); MT_BUG_ON(mt, mas.index != 1); @@ -3141,25 +3143,46 @@ static noinline void __init check_state_ MT_BUG_ON(mt, mas.last != 0x2500); MT_BUG_ON(mt, !mas_active(mas));
- /* next:active -> active out of range*/ + /* next:active -> active beyond data */ entry = mas_next(&mas, 0x2999); MT_BUG_ON(mt, entry != NULL); MT_BUG_ON(mt, mas.index != 0x2501); MT_BUG_ON(mt, mas.last != 0x2fff); MT_BUG_ON(mt, !mas_active(mas));
- /* Continue after out of range*/ + /* Continue after last range ends after max */ entry = mas_next(&mas, ULONG_MAX); MT_BUG_ON(mt, entry != ptr3); MT_BUG_ON(mt, mas.index != 0x3000); MT_BUG_ON(mt, mas.last != 0x3500); MT_BUG_ON(mt, !mas_active(mas));
- /* next:active -> active out of range*/ + /* next:active -> active continued */ + entry = mas_next(&mas, ULONG_MAX); + MT_BUG_ON(mt, entry != NULL); + MT_BUG_ON(mt, mas.index != 0x3501); + MT_BUG_ON(mt, mas.last != ULONG_MAX); + MT_BUG_ON(mt, !mas_active(mas)); + + /* next:active -> overflow */ entry = mas_next(&mas, ULONG_MAX); MT_BUG_ON(mt, entry != NULL); MT_BUG_ON(mt, mas.index != 0x3501); MT_BUG_ON(mt, mas.last != ULONG_MAX); + MT_BUG_ON(mt, mas.node != MAS_OVERFLOW); + + /* next:overflow -> overflow */ + entry = mas_next(&mas, ULONG_MAX); + MT_BUG_ON(mt, entry != NULL); + MT_BUG_ON(mt, mas.index != 0x3501); + MT_BUG_ON(mt, mas.last != ULONG_MAX); + MT_BUG_ON(mt, mas.node != MAS_OVERFLOW); + + /* prev:overflow -> active */ + entry = mas_prev(&mas, 0); + MT_BUG_ON(mt, entry != ptr3); + MT_BUG_ON(mt, mas.index != 0x3000); + MT_BUG_ON(mt, mas.last != 0x3500); MT_BUG_ON(mt, !mas_active(mas));
/* next: none -> active, skip value at location */ @@ -3180,11 +3203,46 @@ static noinline void __init check_state_ MT_BUG_ON(mt, mas.last != 0x1500); MT_BUG_ON(mt, !mas_active(mas));
- /* prev:active -> active out of range*/ + /* prev:active -> active spanning end range */ + entry = mas_prev(&mas, 0x0100); + MT_BUG_ON(mt, entry != NULL); + MT_BUG_ON(mt, mas.index != 0); + MT_BUG_ON(mt, mas.last != 0x0FFF); + MT_BUG_ON(mt, !mas_active(mas)); + + /* prev:active -> underflow */ + entry = mas_prev(&mas, 0); + MT_BUG_ON(mt, entry != NULL); + MT_BUG_ON(mt, mas.index != 0); + MT_BUG_ON(mt, mas.last != 0x0FFF); + MT_BUG_ON(mt, mas.node != MAS_UNDERFLOW); + + /* prev:underflow -> underflow */ entry = mas_prev(&mas, 0); MT_BUG_ON(mt, entry != NULL); MT_BUG_ON(mt, mas.index != 0); MT_BUG_ON(mt, mas.last != 0x0FFF); + MT_BUG_ON(mt, mas.node != MAS_UNDERFLOW); + + /* next:underflow -> active */ + entry = mas_next(&mas, ULONG_MAX); + MT_BUG_ON(mt, entry != ptr); + MT_BUG_ON(mt, mas.index != 0x1000); + MT_BUG_ON(mt, mas.last != 0x1500); + MT_BUG_ON(mt, !mas_active(mas)); + + /* prev:first value -> underflow */ + entry = mas_prev(&mas, 0x1000); + MT_BUG_ON(mt, entry != NULL); + MT_BUG_ON(mt, mas.index != 0x1000); + MT_BUG_ON(mt, mas.last != 0x1500); + MT_BUG_ON(mt, mas.node != MAS_UNDERFLOW); + + /* find:underflow -> first value */ + entry = mas_find(&mas, ULONG_MAX); + MT_BUG_ON(mt, entry != ptr); + MT_BUG_ON(mt, mas.index != 0x1000); + MT_BUG_ON(mt, mas.last != 0x1500); MT_BUG_ON(mt, !mas_active(mas));
/* prev: pause ->active */ @@ -3198,14 +3256,14 @@ static noinline void __init check_state_ MT_BUG_ON(mt, mas.last != 0x2500); MT_BUG_ON(mt, !mas_active(mas));
- /* prev:active -> active out of range*/ + /* prev:active -> active spanning min */ entry = mas_prev(&mas, 0x1600); MT_BUG_ON(mt, entry != NULL); MT_BUG_ON(mt, mas.index != 0x1501); MT_BUG_ON(mt, mas.last != 0x1FFF); MT_BUG_ON(mt, !mas_active(mas));
- /* prev: active ->active, continue*/ + /* prev: active ->active, continue */ entry = mas_prev(&mas, 0); MT_BUG_ON(mt, entry != ptr); MT_BUG_ON(mt, mas.index != 0x1000); @@ -3252,7 +3310,7 @@ static noinline void __init check_state_ MT_BUG_ON(mt, mas.last != 0x2FFF); MT_BUG_ON(mt, !mas_active(mas));
- /* find: none ->active */ + /* find: overflow ->active */ entry = mas_find(&mas, 0x5000); MT_BUG_ON(mt, entry != ptr3); MT_BUG_ON(mt, mas.index != 0x3000); @@ -3637,7 +3695,6 @@ static int __init maple_tree_seed(void) check_empty_area_fill(&tree); mtree_destroy(&tree);
- mt_init_flags(&tree, MT_FLAGS_ALLOC_RANGE); check_state_handling(&tree); mtree_destroy(&tree);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Hector Martin marcan@marcan.st
commit c7bd8a1f45bada7725d11266df7fd5cb549b3098 upstream.
Commit a4fdd9762272 ("iommu: Use flush queue capability") hid the IOMMU_DOMAIN_DMA_FQ domain type from domain allocation. A check was introduced in iommu_dma_init_domain() to fall back if not supported, but this check runs too late: by that point, devices have been attached to the IOMMU, and apple-dart's attach_dev() callback does not expect IOMMU_DOMAIN_DMA_FQ domains.
Change the logic so the IOMMU_DOMAIN_DMA codepath is the default, instead of explicitly enumerating all types.
Fixes an apple-dart regression in v6.5.
Cc: regressions@lists.linux.dev Cc: stable@vger.kernel.org Suggested-by: Robin Murphy robin.murphy@arm.com Fixes: a4fdd9762272 ("iommu: Use flush queue capability") Signed-off-by: Hector Martin marcan@marcan.st Reviewed-by: Neal Gompa neal@gompa.dev Reviewed-by: Jason Gunthorpe jgg@nvidia.com Link: https://lore.kernel.org/r/20230922-iommu-type-regression-v2-1-689b2ba9b673@m... Signed-off-by: Joerg Roedel jroedel@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/iommu/apple-dart.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/iommu/apple-dart.c b/drivers/iommu/apple-dart.c index 2082081402d3..0b8927508427 100644 --- a/drivers/iommu/apple-dart.c +++ b/drivers/iommu/apple-dart.c @@ -671,8 +671,7 @@ static int apple_dart_attach_dev(struct iommu_domain *domain, return ret;
switch (domain->type) { - case IOMMU_DOMAIN_DMA: - case IOMMU_DOMAIN_UNMANAGED: + default: ret = apple_dart_domain_add_streams(dart_domain, cfg); if (ret) return ret;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dinghao Liu dinghao.liu@zju.edu.cn
commit b481f644d9174670b385c3a699617052cd2a79d3 upstream.
When device_register() fails, zfcp_port_release() will be called after put_device(). As a result, zfcp_ccw_adapter_put() will be called twice: one in zfcp_port_release() and one in the error path after device_register(). So the reference on the adapter object is doubly put, which may lead to a premature free. Fix this by adjusting the error tag after device_register().
Fixes: f3450c7b9172 ("[SCSI] zfcp: Replace local reference counting with common kref") Signed-off-by: Dinghao Liu dinghao.liu@zju.edu.cn Link: https://lore.kernel.org/r/20230923103723.10320-1-dinghao.liu@zju.edu.cn Acked-by: Benjamin Block bblock@linux.ibm.com Cc: stable@vger.kernel.org # v2.6.33+ Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/s390/scsi/zfcp_aux.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-)
--- a/drivers/s390/scsi/zfcp_aux.c +++ b/drivers/s390/scsi/zfcp_aux.c @@ -518,12 +518,12 @@ struct zfcp_port *zfcp_port_enqueue(stru if (port) { put_device(&port->dev); retval = -EEXIST; - goto err_out; + goto err_put; }
port = kzalloc(sizeof(struct zfcp_port), GFP_KERNEL); if (!port) - goto err_out; + goto err_put;
rwlock_init(&port->unit_list_lock); INIT_LIST_HEAD(&port->unit_list); @@ -546,7 +546,7 @@ struct zfcp_port *zfcp_port_enqueue(stru
if (dev_set_name(&port->dev, "0x%016llx", (unsigned long long)wwpn)) { kfree(port); - goto err_out; + goto err_put; } retval = -EINVAL;
@@ -563,7 +563,8 @@ struct zfcp_port *zfcp_port_enqueue(stru
return port;
-err_out: +err_put: zfcp_ccw_adapter_put(adapter); +err_out: return ERR_PTR(retval); }
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Zhang Rui rui.zhang@intel.com
commit 59df44bfb0ca4c3ee1f1c3c5d0ee8e314844799e upstream.
The iommu_suspend() syscore suspend callback is invoked with IRQ disabled. Allocating memory with the GFP_KERNEL flag may re-enable IRQs during the suspend callback, which can cause intermittent suspend/hibernation problems with the following kernel traces:
Calling iommu_suspend+0x0/0x1d0 ------------[ cut here ]------------ WARNING: CPU: 0 PID: 15 at kernel/time/timekeeping.c:868 ktime_get+0x9b/0xb0 ... CPU: 0 PID: 15 Comm: rcu_preempt Tainted: G U E 6.3-intel #r1 RIP: 0010:ktime_get+0x9b/0xb0 ... Call Trace: <IRQ> tick_sched_timer+0x22/0x90 ? __pfx_tick_sched_timer+0x10/0x10 __hrtimer_run_queues+0x111/0x2b0 hrtimer_interrupt+0xfa/0x230 __sysvec_apic_timer_interrupt+0x63/0x140 sysvec_apic_timer_interrupt+0x7b/0xa0 </IRQ> <TASK> asm_sysvec_apic_timer_interrupt+0x1f/0x30 ... ------------[ cut here ]------------ Interrupts enabled after iommu_suspend+0x0/0x1d0 WARNING: CPU: 0 PID: 27420 at drivers/base/syscore.c:68 syscore_suspend+0x147/0x270 CPU: 0 PID: 27420 Comm: rtcwake Tainted: G U W E 6.3-intel #r1 RIP: 0010:syscore_suspend+0x147/0x270 ... Call Trace: <TASK> hibernation_snapshot+0x25b/0x670 hibernate+0xcd/0x390 state_store+0xcf/0xe0 kobj_attr_store+0x13/0x30 sysfs_kf_write+0x3f/0x50 kernfs_fop_write_iter+0x128/0x200 vfs_write+0x1fd/0x3c0 ksys_write+0x6f/0xf0 __x64_sys_write+0x1d/0x30 do_syscall_64+0x3b/0x90 entry_SYSCALL_64_after_hwframe+0x72/0xdc
Given that only 4 words memory is needed, avoid the memory allocation in iommu_suspend().
CC: stable@kernel.org Fixes: 33e07157105e ("iommu/vt-d: Avoid GFP_ATOMIC where it is not needed") Signed-off-by: Zhang Rui rui.zhang@intel.com Tested-by: Ooi, Chin Hao chin.hao.ooi@intel.com Link: https://lore.kernel.org/r/20230921093956.234692-1-rui.zhang@intel.com Signed-off-by: Lu Baolu baolu.lu@linux.intel.com Link: https://lore.kernel.org/r/20230925120417.55977-2-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel jroedel@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/iommu/intel/iommu.c | 16 ---------------- drivers/iommu/intel/iommu.h | 2 +- 2 files changed, 1 insertion(+), 17 deletions(-)
--- a/drivers/iommu/intel/iommu.c +++ b/drivers/iommu/intel/iommu.c @@ -3004,13 +3004,6 @@ static int iommu_suspend(void) struct intel_iommu *iommu = NULL; unsigned long flag;
- for_each_active_iommu(iommu, drhd) { - iommu->iommu_state = kcalloc(MAX_SR_DMAR_REGS, sizeof(u32), - GFP_KERNEL); - if (!iommu->iommu_state) - goto nomem; - } - iommu_flush_all();
for_each_active_iommu(iommu, drhd) { @@ -3030,12 +3023,6 @@ static int iommu_suspend(void) raw_spin_unlock_irqrestore(&iommu->register_lock, flag); } return 0; - -nomem: - for_each_active_iommu(iommu, drhd) - kfree(iommu->iommu_state); - - return -ENOMEM; }
static void iommu_resume(void) @@ -3067,9 +3054,6 @@ static void iommu_resume(void)
raw_spin_unlock_irqrestore(&iommu->register_lock, flag); } - - for_each_active_iommu(iommu, drhd) - kfree(iommu->iommu_state); }
static struct syscore_ops iommu_syscore_ops = { --- a/drivers/iommu/intel/iommu.h +++ b/drivers/iommu/intel/iommu.h @@ -680,7 +680,7 @@ struct intel_iommu { struct iopf_queue *iopf_queue; unsigned char iopfq_name[16]; struct q_inval *qi; /* Queued invalidation info */ - u32 *iommu_state; /* Store iommu states between suspend and resume.*/ + u32 iommu_state[MAX_SR_DMAR_REGS]; /* Store iommu states between suspend and resume.*/
#ifdef CONFIG_IRQ_REMAP struct ir_table *ir_table; /* Interrupt remapping info */
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Haiyang Zhang haiyangz@microsoft.com
commit b2b000069a4c307b09548dc2243f31f3ca0eac9c upstream.
For an unknown TX CQE error type (probably from a newer hardware), still free the SKB, update the queue tail, etc., otherwise the accounting will be wrong.
Also, TX errors can be triggered by injecting corrupted packets, so replace the WARN_ONCE to ratelimited error logging.
Cc: stable@vger.kernel.org Fixes: ca9c54d2d6a5 ("net: mana: Add a driver for Microsoft Azure Network Adapter (MANA)") Signed-off-by: Haiyang Zhang haiyangz@microsoft.com Reviewed-by: Simon Horman horms@kernel.org Reviewed-by: Shradha Gupta shradhagupta@linux.microsoft.com Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/microsoft/mana/mana_en.c | 18 +++++++++++------- 1 file changed, 11 insertions(+), 7 deletions(-)
--- a/drivers/net/ethernet/microsoft/mana/mana_en.c +++ b/drivers/net/ethernet/microsoft/mana/mana_en.c @@ -1315,19 +1315,23 @@ static void mana_poll_tx_cq(struct mana_ case CQE_TX_VPORT_IDX_OUT_OF_RANGE: case CQE_TX_VPORT_DISABLED: case CQE_TX_VLAN_TAGGING_VIOLATION: - WARN_ONCE(1, "TX: CQE error %d: ignored.\n", - cqe_oob->cqe_hdr.cqe_type); + if (net_ratelimit()) + netdev_err(ndev, "TX: CQE error %d\n", + cqe_oob->cqe_hdr.cqe_type); + apc->eth_stats.tx_cqe_err++; break;
default: - /* If the CQE type is unexpected, log an error, assert, - * and go through the error path. + /* If the CQE type is unknown, log an error, + * and still free the SKB, update tail, etc. */ - WARN_ONCE(1, "TX: Unexpected CQE type %d: HW BUG?\n", - cqe_oob->cqe_hdr.cqe_type); + if (net_ratelimit()) + netdev_err(ndev, "TX: unknown CQE type %d\n", + cqe_oob->cqe_hdr.cqe_type); + apc->eth_stats.tx_cqe_unknown_type++; - return; + break; }
if (WARN_ON_ONCE(txq->gdma_txq_id != completions[i].wq_num))
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Stefano Garzarella sgarzare@redhat.com
commit 7aed44babc7f97e82b38e9a68515e699692cc100 upstream.
In the while loop of vringh_iov_xfer(), `partlen` could be 0 if one of the `iov` has 0 lenght. In this case, we should skip the iov and go to the next one. But calling vringh_kiov_advance() with 0 lenght does not cause the advancement, since it returns immediately if asked to advance by 0 bytes.
Let's restore the code that was there before commit b8c06ad4d67d ("vringh: implement vringh_kiov_advance()"), avoiding using vringh_kiov_advance().
Fixes: b8c06ad4d67d ("vringh: implement vringh_kiov_advance()") Cc: stable@vger.kernel.org Reported-by: Jason Wang jasowang@redhat.com Signed-off-by: Stefano Garzarella sgarzare@redhat.com Acked-by: Jason Wang jasowang@redhat.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/vhost/vringh.c | 12 +++++++++++- 1 file changed, 11 insertions(+), 1 deletion(-)
--- a/drivers/vhost/vringh.c +++ b/drivers/vhost/vringh.c @@ -123,8 +123,18 @@ static inline ssize_t vringh_iov_xfer(st done += partlen; len -= partlen; ptr += partlen; + iov->consumed += partlen; + iov->iov[iov->i].iov_len -= partlen; + iov->iov[iov->i].iov_base += partlen;
- vringh_kiov_advance(iov, partlen); + if (!iov->iov[iov->i].iov_len) { + /* Fix up old iov element then increment. */ + iov->iov[iov->i].iov_len = iov->consumed; + iov->iov[iov->i].iov_base -= iov->consumed; + + iov->consumed = 0; + iov->i++; + } } return done; }
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Christian Marangi ansuelsmth@gmail.com
commit fcdfc462881d8acf9db77f483b2c821e286ca97b upstream.
While searching for possible refactor of napi_schedule_prep and __napi_schedule it was notice that the mtk eth driver disable the interrupt for rx and tx AFTER napi is scheduled.
While this is a very hard to repro case it might happen to have situation where the interrupt is disabled and never enabled again as the napi completes and the interrupt is enabled before.
This is caused by the fact that a napi driven by interrupt expect a logic with: 1. interrupt received. napi prepared -> interrupt disabled -> napi scheduled 2. napi triggered. ring cleared -> interrupt enabled -> wait for new interrupt
To prevent this case, disable the interrupt BEFORE the napi is scheduled.
Fixes: 656e705243fd ("net-next: mediatek: add support for MT7623 ethernet") Cc: stable@vger.kernel.org Signed-off-by: Christian Marangi ansuelsmth@gmail.com Link: https://lore.kernel.org/r/20231002140805.568-1-ansuelsmth@gmail.com Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/mediatek/mtk_eth_soc.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c +++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c @@ -3036,8 +3036,8 @@ static irqreturn_t mtk_handle_irq_rx(int
eth->rx_events++; if (likely(napi_schedule_prep(ð->rx_napi))) { - __napi_schedule(ð->rx_napi); mtk_rx_irq_disable(eth, eth->soc->txrx.rx_irq_done_mask); + __napi_schedule(ð->rx_napi); }
return IRQ_HANDLED; @@ -3049,8 +3049,8 @@ static irqreturn_t mtk_handle_irq_tx(int
eth->tx_events++; if (likely(napi_schedule_prep(ð->tx_napi))) { - __napi_schedule(ð->tx_napi); mtk_tx_irq_disable(eth, MTK_TX_DONE_INT); + __napi_schedule(ð->tx_napi); }
return IRQ_HANDLED;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Paolo Abeni pabeni@redhat.com
commit a5efdbcece83af94180e8d7c0a6e22947318499d upstream.
The delegated action infrastructure is prone to the following race: different CPUs can try to schedule different delegated actions on the same subflow at the same time.
Each of them will check different bits via mptcp_subflow_delegate(), and will try to schedule the action on the related per-cpu napi instance.
Depending on the timing, both can observe an empty delegated list node, causing the same entry to be added simultaneously on two different lists.
The root cause is that the delegated actions infra does not provide a single synchronization point. Address the issue reserving an additional bit to mark the subflow as scheduled for delegation. Acquiring such bit guarantee the caller to own the delegated list node, and being able to safely schedule the subflow.
Clear such bit only when the subflow scheduling is completed, ensuring proper barrier in place.
Additionally swap the meaning of the delegated_action bitmask, to allow the usage of the existing helper to set multiple bit at once.
Fixes: bcd97734318d ("mptcp: use delegate action to schedule 3rd ack retrans") Cc: stable@vger.kernel.org Reviewed-by: Mat Martineau martineau@kernel.org Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Mat Martineau martineau@kernel.org Link: https://lore.kernel.org/r/20231004-send-net-20231004-v1-1-28de4ac663ae@kerne... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/mptcp/protocol.c | 28 ++++++++++++++-------------- net/mptcp/protocol.h | 35 ++++++++++++----------------------- net/mptcp/subflow.c | 10 ++++++++-- 3 files changed, 34 insertions(+), 39 deletions(-)
--- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -3409,24 +3409,21 @@ static void schedule_3rdack_retransmissi sk_reset_timer(ssk, &icsk->icsk_delack_timer, timeout); }
-void mptcp_subflow_process_delegated(struct sock *ssk) +void mptcp_subflow_process_delegated(struct sock *ssk, long status) { struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(ssk); struct sock *sk = subflow->conn;
- if (test_bit(MPTCP_DELEGATE_SEND, &subflow->delegated_status)) { + if (status & BIT(MPTCP_DELEGATE_SEND)) { mptcp_data_lock(sk); if (!sock_owned_by_user(sk)) __mptcp_subflow_push_pending(sk, ssk, true); else __set_bit(MPTCP_PUSH_PENDING, &mptcp_sk(sk)->cb_flags); mptcp_data_unlock(sk); - mptcp_subflow_delegated_done(subflow, MPTCP_DELEGATE_SEND); } - if (test_bit(MPTCP_DELEGATE_ACK, &subflow->delegated_status)) { + if (status & BIT(MPTCP_DELEGATE_ACK)) schedule_3rdack_retransmission(ssk); - mptcp_subflow_delegated_done(subflow, MPTCP_DELEGATE_ACK); - } }
static int mptcp_hash(struct sock *sk) @@ -3932,14 +3929,17 @@ static int mptcp_napi_poll(struct napi_s struct sock *ssk = mptcp_subflow_tcp_sock(subflow);
bh_lock_sock_nested(ssk); - if (!sock_owned_by_user(ssk) && - mptcp_subflow_has_delegated_action(subflow)) - mptcp_subflow_process_delegated(ssk); - /* ... elsewhere tcp_release_cb_override already processed - * the action or will do at next release_sock(). - * In both case must dequeue the subflow here - on the same - * CPU that scheduled it. - */ + if (!sock_owned_by_user(ssk)) { + mptcp_subflow_process_delegated(ssk, xchg(&subflow->delegated_status, 0)); + } else { + /* tcp_release_cb_override already processed + * the action or will do at next release_sock(). + * In both case must dequeue the subflow here - on the same + * CPU that scheduled it. + */ + smp_wmb(); + clear_bit(MPTCP_DELEGATE_SCHEDULED, &subflow->delegated_status); + } bh_unlock_sock(ssk); sock_put(ssk);
--- a/net/mptcp/protocol.h +++ b/net/mptcp/protocol.h @@ -440,9 +440,11 @@ struct mptcp_delegated_action {
DECLARE_PER_CPU(struct mptcp_delegated_action, mptcp_delegated_actions);
-#define MPTCP_DELEGATE_SEND 0 -#define MPTCP_DELEGATE_ACK 1 +#define MPTCP_DELEGATE_SCHEDULED 0 +#define MPTCP_DELEGATE_SEND 1 +#define MPTCP_DELEGATE_ACK 2
+#define MPTCP_DELEGATE_ACTIONS_MASK (~BIT(MPTCP_DELEGATE_SCHEDULED)) /* MPTCP subflow context */ struct mptcp_subflow_context { struct list_head node;/* conn_list of subflows */ @@ -559,23 +561,24 @@ mptcp_subflow_get_mapped_dsn(const struc return subflow->map_seq + mptcp_subflow_get_map_offset(subflow); }
-void mptcp_subflow_process_delegated(struct sock *ssk); +void mptcp_subflow_process_delegated(struct sock *ssk, long actions);
static inline void mptcp_subflow_delegate(struct mptcp_subflow_context *subflow, int action) { + long old, set_bits = BIT(MPTCP_DELEGATE_SCHEDULED) | BIT(action); struct mptcp_delegated_action *delegated; bool schedule;
/* the caller held the subflow bh socket lock */ lockdep_assert_in_softirq();
- /* The implied barrier pairs with mptcp_subflow_delegated_done(), and - * ensures the below list check sees list updates done prior to status - * bit changes + /* The implied barrier pairs with tcp_release_cb_override() + * mptcp_napi_poll(), and ensures the below list check sees list + * updates done prior to delegated status bits changes */ - if (!test_and_set_bit(action, &subflow->delegated_status)) { - /* still on delegated list from previous scheduling */ - if (!list_empty(&subflow->delegated_node)) + old = set_mask_bits(&subflow->delegated_status, 0, set_bits); + if (!(old & BIT(MPTCP_DELEGATE_SCHEDULED))) { + if (WARN_ON_ONCE(!list_empty(&subflow->delegated_node))) return;
delegated = this_cpu_ptr(&mptcp_delegated_actions); @@ -600,20 +603,6 @@ mptcp_subflow_delegated_next(struct mptc return ret; }
-static inline bool mptcp_subflow_has_delegated_action(const struct mptcp_subflow_context *subflow) -{ - return !!READ_ONCE(subflow->delegated_status); -} - -static inline void mptcp_subflow_delegated_done(struct mptcp_subflow_context *subflow, int action) -{ - /* pairs with mptcp_subflow_delegate, ensures delegate_node is updated before - * touching the status bit - */ - smp_wmb(); - clear_bit(action, &subflow->delegated_status); -} - int mptcp_is_enabled(const struct net *net); unsigned int mptcp_get_add_addr_timeout(const struct net *net); int mptcp_is_checksum_enabled(const struct net *net); --- a/net/mptcp/subflow.c +++ b/net/mptcp/subflow.c @@ -1956,9 +1956,15 @@ static void subflow_ulp_clone(const stru static void tcp_release_cb_override(struct sock *ssk) { struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(ssk); + long status;
- if (mptcp_subflow_has_delegated_action(subflow)) - mptcp_subflow_process_delegated(ssk); + /* process and clear all the pending actions, but leave the subflow into + * the napi queue. To respect locking, only the same CPU that originated + * the action can touch the list. mptcp_napi_poll will take care of it. + */ + status = set_mask_bits(&subflow->delegated_status, MPTCP_DELEGATE_ACTIONS_MASK, 0); + if (status) + mptcp_subflow_process_delegated(ssk, status);
tcp_release_cb(ssk); }
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Geliang Tang geliang.tang@suse.com
commit e5ed101a602873d65d2d64edaba93e8c73ec1b0f upstream.
This patch drops id 0 limitation in mptcp_nl_cmd_sf_create() to allow creating additional subflows with the local addr ID 0.
There is no reason not to allow additional subflows from this local address: we should be able to create new subflows from the initial endpoint. This limitation was breaking fullmesh support from userspace.
Fixes: 702c2f646d42 ("mptcp: netlink: allow userspace-driven subflow establishment") Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/391 Cc: stable@vger.kernel.org Suggested-by: Matthieu Baerts matthieu.baerts@tessares.net Reviewed-by: Matthieu Baerts matthieu.baerts@tessares.net Signed-off-by: Geliang Tang geliang.tang@suse.com Signed-off-by: Mat Martineau martineau@kernel.org Link: https://lore.kernel.org/r/20231004-send-net-20231004-v1-2-28de4ac663ae@kerne... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/mptcp/pm_userspace.c | 6 ------ 1 file changed, 6 deletions(-)
--- a/net/mptcp/pm_userspace.c +++ b/net/mptcp/pm_userspace.c @@ -307,12 +307,6 @@ int mptcp_nl_cmd_sf_create(struct sk_buf goto create_err; }
- if (addr_l.id == 0) { - NL_SET_ERR_MSG_ATTR(info->extack, laddr, "missing local addr id"); - err = -EINVAL; - goto create_err; - } - err = mptcp_pm_parse_addr(raddr, info, &addr_r); if (err < 0) { NL_SET_ERR_MSG_ATTR(info->extack, raddr, "error parsing remote addr");
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Gustavo A. R. Silva gustavoars@kernel.org
commit eea03d18af9c44235865a4bc9bec4d780ef6cf21 upstream.
The flexible structure (a structure that contains a flexible-array member at the end) `qed_ll2_tx_packet` is nested within the second layer of `struct qed_ll2_info`:
struct qed_ll2_tx_packet { ... /* Flexible Array of bds_set determined by max_bds_per_packet */ struct { struct core_tx_bd *txq_bd; dma_addr_t tx_frag; u16 frag_len; } bds_set[]; };
struct qed_ll2_tx_queue { ... struct qed_ll2_tx_packet cur_completing_packet; };
struct qed_ll2_info { ... struct qed_ll2_tx_queue tx_queue; struct qed_ll2_cbs cbs; };
The problem is that member `cbs` in `struct qed_ll2_info` is placed just after an object of type `struct qed_ll2_tx_queue`, which is in itself an implicit flexible structure, which by definition ends in a flexible array member, in this case `bds_set`. This causes an undefined behavior bug at run-time when dynamic memory is allocated for `bds_set`, which could lead to a serious issue if `cbs` in `struct qed_ll2_info` is overwritten by the contents of `bds_set`. Notice that the type of `cbs` is a structure full of function pointers (and a cookie :) ):
include/linux/qed/qed_ll2_if.h: 107 typedef 108 void (*qed_ll2_complete_rx_packet_cb)(void *cxt, 109 struct qed_ll2_comp_rx_data *data); 110 111 typedef 112 void (*qed_ll2_release_rx_packet_cb)(void *cxt, 113 u8 connection_handle, 114 void *cookie, 115 dma_addr_t rx_buf_addr, 116 bool b_last_packet); 117 118 typedef 119 void (*qed_ll2_complete_tx_packet_cb)(void *cxt, 120 u8 connection_handle, 121 void *cookie, 122 dma_addr_t first_frag_addr, 123 bool b_last_fragment, 124 bool b_last_packet); 125 126 typedef 127 void (*qed_ll2_release_tx_packet_cb)(void *cxt, 128 u8 connection_handle, 129 void *cookie, 130 dma_addr_t first_frag_addr, 131 bool b_last_fragment, bool b_last_packet); 132 133 typedef 134 void (*qed_ll2_slowpath_cb)(void *cxt, u8 connection_handle, 135 u32 opaque_data_0, u32 opaque_data_1); 136 137 struct qed_ll2_cbs { 138 qed_ll2_complete_rx_packet_cb rx_comp_cb; 139 qed_ll2_release_rx_packet_cb rx_release_cb; 140 qed_ll2_complete_tx_packet_cb tx_comp_cb; 141 qed_ll2_release_tx_packet_cb tx_release_cb; 142 qed_ll2_slowpath_cb slowpath_cb; 143 void *cookie; 144 };
Fix this by moving the declaration of `cbs` to the middle of its containing structure `qed_ll2_info`, preventing it from being overwritten by the contents of `bds_set` at run-time.
This bug was introduced in 2017, when `bds_set` was converted to a one-element array, and started to be used as a Variable Length Object (VLO) at run-time.
Fixes: f5823fe6897c ("qed: Add ll2 option to limit the number of bds per packet") Cc: stable@vger.kernel.org Signed-off-by: Gustavo A. R. Silva gustavoars@kernel.org Reviewed-by: Kees Cook keescook@chromium.org Reviewed-by: Simon Horman horms@kernel.org Link: https://lore.kernel.org/r/ZQ+Nz8DfPg56pIzr@work Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/qlogic/qed/qed_ll2.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/net/ethernet/qlogic/qed/qed_ll2.h +++ b/drivers/net/ethernet/qlogic/qed/qed_ll2.h @@ -110,9 +110,9 @@ struct qed_ll2_info { enum core_tx_dest tx_dest; u8 tx_stats_en; bool main_func_queue; + struct qed_ll2_cbs cbs; struct qed_ll2_rx_queue rx_queue; struct qed_ll2_tx_queue tx_queue; - struct qed_ll2_cbs cbs; };
extern const struct qed_ll2_ops qed_ll2_ops_pass;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Luiz Augusto von Dentz luiz.von.dentz@intel.com
commit b938790e70540bf4f2e653dcd74b232494d06c8f upstream.
The following memory leak can be observed when the controller supports codecs which are stored in local_codecs list but the elements are never freed:
unreferenced object 0xffff88800221d840 (size 32): comm "kworker/u3:0", pid 36, jiffies 4294898739 (age 127.060s) hex dump (first 32 bytes): f8 d3 02 03 80 88 ff ff 80 d8 21 02 80 88 ff ff ..........!..... 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<ffffffffb324f557>] __kmalloc+0x47/0x120 [<ffffffffb39ef37d>] hci_codec_list_add.isra.0+0x2d/0x160 [<ffffffffb39ef643>] hci_read_codec_capabilities+0x183/0x270 [<ffffffffb39ef9ab>] hci_read_supported_codecs+0x1bb/0x2d0 [<ffffffffb39f162e>] hci_read_local_codecs_sync+0x3e/0x60 [<ffffffffb39ff1b3>] hci_dev_open_sync+0x943/0x11e0 [<ffffffffb396d55d>] hci_power_on+0x10d/0x3f0 [<ffffffffb30c99b4>] process_one_work+0x404/0x800 [<ffffffffb30ca134>] worker_thread+0x374/0x670 [<ffffffffb30d9108>] kthread+0x188/0x1c0 [<ffffffffb304db6b>] ret_from_fork+0x2b/0x50 [<ffffffffb300206a>] ret_from_fork_asm+0x1a/0x30
Cc: stable@vger.kernel.org Fixes: 8961987f3f5f ("Bluetooth: Enumerate local supported codec and cache details") Signed-off-by: Luiz Augusto von Dentz luiz.von.dentz@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/bluetooth/hci_core.c | 1 + net/bluetooth/hci_event.c | 1 + net/bluetooth/hci_sync.c | 1 + 3 files changed, 3 insertions(+)
--- a/net/bluetooth/hci_core.c +++ b/net/bluetooth/hci_core.c @@ -2784,6 +2784,7 @@ void hci_release_dev(struct hci_dev *hde hci_conn_params_clear_all(hdev); hci_discovery_filter_clear(hdev); hci_blocked_keys_clear(hdev); + hci_codec_list_clear(&hdev->local_codecs); hci_dev_unlock(hdev);
ida_simple_remove(&hci_index_ida, hdev->id); --- a/net/bluetooth/hci_event.c +++ b/net/bluetooth/hci_event.c @@ -33,6 +33,7 @@
#include "hci_request.h" #include "hci_debugfs.h" +#include "hci_codec.h" #include "a2mp.h" #include "amp.h" #include "smp.h" --- a/net/bluetooth/hci_sync.c +++ b/net/bluetooth/hci_sync.c @@ -5095,6 +5095,7 @@ int hci_dev_close_sync(struct hci_dev *h memset(hdev->eir, 0, sizeof(hdev->eir)); memset(hdev->dev_class, 0, sizeof(hdev->dev_class)); bacpy(&hdev->random_addr, BDADDR_ANY); + hci_codec_list_clear(&hdev->local_codecs);
hci_dev_put(hdev); return err;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Juerg Haefliger juerg.haefliger@canonical.com
commit 4fed494abcd4fde5c24de19160e93814f912fdb3 upstream.
Since commit 2d47c6956ab3 ("ubsan: Tighten UBSAN_BOUNDS on GCC"), UBSAN_BOUNDS no longer pretends 1-element arrays are unbounded. Walking 'element' and 'channel_list' will trigger warnings, so make them proper flexible arrays.
False positive warnings were:
UBSAN: array-index-out-of-bounds in drivers/net/wireless/broadcom/brcm80211/brcmfmac/cfg80211.c:6984:20 index 1 is out of range for type '__le32 [1]'
UBSAN: array-index-out-of-bounds in drivers/net/wireless/broadcom/brcm80211/brcmfmac/cfg80211.c:1126:27 index 1 is out of range for type '__le16 [1]'
for these lines of code:
6884 ch.chspec = (u16)le32_to_cpu(list->element[i]);
1126 params_le->channel_list[i] = cpu_to_le16(chanspec);
Cc: stable@vger.kernel.org # 6.5+ Signed-off-by: Juerg Haefliger juerg.haefliger@canonical.com Reviewed-by: Kees Cook keescook@chromium.org Reviewed-by: Gustavo A. R. Silva gustavoars@kernel.org Signed-off-by: Kalle Valo kvalo@kernel.org Link: https://lore.kernel.org/r/20230914070227.12028-1-juerg.haefliger@canonical.c... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- .../wireless/broadcom/brcm80211/brcmfmac/fwil_types.h | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/fwil_types.h b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/fwil_types.h index bece26741d3a..611d1a6aabb9 100644 --- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/fwil_types.h +++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/fwil_types.h @@ -442,7 +442,12 @@ struct brcmf_scan_params_v2_le { * fixed parameter portion is assumed, otherwise * ssid in the fixed portion is ignored */ - __le16 channel_list[1]; /* list of chanspecs */ + union { + __le16 padding; /* Reserve space for at least 1 entry for abort + * which uses an on stack brcmf_scan_params_v2_le + */ + DECLARE_FLEX_ARRAY(__le16, channel_list); /* chanspecs */ + }; };
struct brcmf_scan_results { @@ -702,7 +707,7 @@ struct brcmf_sta_info_le {
struct brcmf_chanspec_list { __le32 count; /* # of entries */ - __le32 element[1]; /* variable length uint32 list */ + __le32 element[]; /* variable length uint32 list */ };
/*
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Luiz Augusto von Dentz luiz.von.dentz@intel.com
commit 941c998b42f5c90384f49da89a6e11233de567cf upstream.
When HCI_QUIRK_STRICT_DUPLICATE_FILTER is set LE scanning requires periodic restarts of the scanning procedure as the controller would consider device previously found as duplicated despite of RSSI changes, but in order to set the scan timeout properly set le_scan_restart needs to be synchronous so it shall not use hci_cmd_sync_queue which defers the command processing to cmd_sync_work.
Cc: stable@vger.kernel.org Link: https://lore.kernel.org/linux-bluetooth/578e6d7afd676129decafba846a933f5@agn... Fixes: 27d54b778ad1 ("Bluetooth: Rework le_scan_restart for hci_sync") Signed-off-by: Luiz Augusto von Dentz luiz.von.dentz@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/bluetooth/hci_sync.c | 13 ++++--------- 1 file changed, 4 insertions(+), 9 deletions(-)
--- a/net/bluetooth/hci_sync.c +++ b/net/bluetooth/hci_sync.c @@ -412,11 +412,6 @@ static int hci_le_scan_restart_sync(stru LE_SCAN_FILTER_DUP_ENABLE); }
-static int le_scan_restart_sync(struct hci_dev *hdev, void *data) -{ - return hci_le_scan_restart_sync(hdev); -} - static void le_scan_restart(struct work_struct *work) { struct hci_dev *hdev = container_of(work, struct hci_dev, @@ -426,15 +421,15 @@ static void le_scan_restart(struct work_
bt_dev_dbg(hdev, "");
- hci_dev_lock(hdev); - - status = hci_cmd_sync_queue(hdev, le_scan_restart_sync, NULL, NULL); + status = hci_le_scan_restart_sync(hdev); if (status) { bt_dev_err(hdev, "failed to restart LE scan: status %d", status); - goto unlock; + return; }
+ hci_dev_lock(hdev); + if (!test_bit(HCI_QUIRK_STRICT_DUPLICATE_FILTER, &hdev->quirks) || !hdev->discovery.scan_start) goto unlock;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sascha Hauer s.hauer@pengutronix.de
commit 2e1b3ae3e1f2cf5a3c9c05d5f961d7d4257b489f upstream.
The MAC address is stored at offset 0x107 in the EEPROM, like correctly stated in the comment. Add a two bytes reserved field right before the MAC address to shift it from offset 0x105 to 0x107.
With this the MAC address returned from my RTL8723du wifi stick can be correctly decoded as "Shenzhen Four Seas Global Link Network Technology Co., Ltd."
Fixes: 87caeef032fc ("wifi: rtw88: Add rtw8723du chipset support") Signed-off-by: Sascha Hauer s.hauer@pengutronix.de Reported-by: Yanik Fuchs Yanik.fuchs@mbv.ch Cc: stable@vger.kernel.org Acked-by: Ping-Ke Shih pkshih@realtek.com Signed-off-by: Kalle Valo kvalo@kernel.org Link: https://lore.kernel.org/r/20230907071614.2032404-1-s.hauer@pengutronix.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/wireless/realtek/rtw88/rtw8723d.h | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/net/wireless/realtek/rtw88/rtw8723d.h +++ b/drivers/net/wireless/realtek/rtw88/rtw8723d.h @@ -46,6 +46,7 @@ struct rtw8723du_efuse { u8 vender_id[2]; /* 0x100 */ u8 product_id[2]; /* 0x102 */ u8 usb_option; /* 0x104 */ + u8 res5[2]; /* 0x105 */ u8 mac_addr[ETH_ALEN]; /* 0x107 */ };
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Gustavo A. R. Silva gustavoars@kernel.org
commit eec679e4ac5f47507774956fb3479c206e761af7 upstream.
In a TLV encoding scheme, the Length part represents the length after the header containing the values for type and length. In this case, `tlv_len` should be:
tlv_len == (sizeof(*tlv_rxba) - 1) - sizeof(tlv_rxba->header) + tlv_bitmap_len
Notice that the `- 1` accounts for the one-element array `bitmap`, which 1-byte size is already included in `sizeof(*tlv_rxba)`.
So, if the above is correct, there is a double-counting of some members in `struct mwifiex_ie_types_rxba_sync`, when `tlv_buf_left` and `tmp` are calculated:
968 tlv_buf_left -= (sizeof(*tlv_rxba) + tlv_len); 969 tmp = (u8 *)tlv_rxba + tlv_len + sizeof(*tlv_rxba);
in specific, members:
drivers/net/wireless/marvell/mwifiex/fw.h:777 777 u8 mac[ETH_ALEN]; 778 u8 tid; 779 u8 reserved; 780 __le16 seq_num; 781 __le16 bitmap_len;
This is clearly wrong, and affects the subsequent decoding of data in `event_buf` through `tlv_rxba`:
970 tlv_rxba = (struct mwifiex_ie_types_rxba_sync *)tmp;
Fix this by using `sizeof(tlv_rxba->header)` instead of `sizeof(*tlv_rxba)` in the calculation of `tlv_buf_left` and `tmp`.
This results in the following binary differences before/after changes:
| drivers/net/wireless/marvell/mwifiex/11n_rxreorder.o | @@ -4698,11 +4698,11 @@ | drivers/net/wireless/marvell/mwifiex/11n_rxreorder.c:968 | tlv_buf_left -= (sizeof(tlv_rxba->header) + tlv_len); | - 1da7: lea -0x11(%rbx),%edx | + 1da7: lea -0x4(%rbx),%edx | 1daa: movzwl %bp,%eax | drivers/net/wireless/marvell/mwifiex/11n_rxreorder.c:969 | tmp = (u8 *)tlv_rxba + sizeof(tlv_rxba->header) + tlv_len; | - 1dad: lea 0x11(%r15,%rbp,1),%r15 | + 1dad: lea 0x4(%r15,%rbp,1),%r15
The above reflects the desired change: avoid counting 13 too many bytes; which is the total size of the double-counted members in `struct mwifiex_ie_types_rxba_sync`:
$ pahole -C mwifiex_ie_types_rxba_sync drivers/net/wireless/marvell/mwifiex/11n_rxreorder.o struct mwifiex_ie_types_rxba_sync { struct mwifiex_ie_types_header header; /* 0 4 */
|----------------------------------------------------------------------- | u8 mac[6]; /* 4 6 */ | | u8 tid; /* 10 1 */ | | u8 reserved; /* 11 1 */ | | __le16 seq_num; /* 12 2 */ | | __le16 bitmap_len; /* 14 2 */ | | u8 bitmap[1]; /* 16 1 */ | |----------------------------------------------------------------------| | 13 bytes| -----------
/* size: 17, cachelines: 1, members: 7 */ /* last cacheline: 17 bytes */ } __attribute__((__packed__));
Fixes: 99ffe72cdae4 ("mwifiex: process rxba_sync event") Cc: stable@vger.kernel.org Signed-off-by: Gustavo A. R. Silva gustavoars@kernel.org Reviewed-by: Kees Cook keescook@chromium.org Signed-off-by: Kalle Valo kvalo@kernel.org Link: https://lore.kernel.org/r/06668edd68e7a26bbfeebd1201ae077a2a7a8bce.169293195... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/wireless/marvell/mwifiex/11n_rxreorder.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/net/wireless/marvell/mwifiex/11n_rxreorder.c +++ b/drivers/net/wireless/marvell/mwifiex/11n_rxreorder.c @@ -965,8 +965,8 @@ void mwifiex_11n_rxba_sync_event(struct } }
- tlv_buf_left -= (sizeof(*tlv_rxba) + tlv_len); - tmp = (u8 *)tlv_rxba + tlv_len + sizeof(*tlv_rxba); + tlv_buf_left -= (sizeof(tlv_rxba->header) + tlv_len); + tmp = (u8 *)tlv_rxba + sizeof(tlv_rxba->header) + tlv_len; tlv_rxba = (struct mwifiex_ie_types_rxba_sync *)tmp; } }
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jens Axboe axboe@kernel.dk
commit 223ef474316466e9f61f6e0064f3a6fe4923a2c5 upstream.
On at least arm32, but presumably any arch with highmem, if the application passes in memory that resides in highmem for the rings, then we should fail that ring creation. We fail it with -EINVAL, which is what kernels that don't support IORING_SETUP_NO_MMAP will do as well.
Cc: stable@vger.kernel.org Fixes: 03d89a2de25b ("io_uring: support for user allocated memory for rings/sqes") Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- io_uring/io_uring.c | 16 +++++++++++++++- 1 file changed, 15 insertions(+), 1 deletion(-)
--- a/io_uring/io_uring.c +++ b/io_uring/io_uring.c @@ -2678,7 +2678,7 @@ static void *__io_uaddr_map(struct page { struct page **page_array; unsigned int nr_pages; - int ret; + int ret, i;
*npages = 0;
@@ -2708,6 +2708,20 @@ err: */ if (page_array[0] != page_array[ret - 1]) goto err; + + /* + * Can't support mapping user allocated ring memory on 32-bit archs + * where it could potentially reside in highmem. Just fail those with + * -EINVAL, just like we did on kernels that didn't support this + * feature. + */ + for (i = 0; i < nr_pages; i++) { + if (PageHighMem(page_array[i])) { + ret = -EINVAL; + goto err; + } + } + *pages = page_array; *npages = nr_pages; return page_to_virt(page_array[0]);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: David Jeffery djeffery@redhat.com
commit 2fd7b0f6d5ad655b1d947d3acdd82f687c31465e upstream.
When raid5_get_active_stripe is called with a ctx containing a stripe_head in its batch_last pointer, it can cause a deadlock if the task sleeps waiting on another stripe_head to become available. The stripe_head held by batch_last can be blocking the advancement of other stripe_heads, leading to no stripe_heads being released so raid5_get_active_stripe waits forever.
Like with the quiesce state handling earlier in the function, batch_last needs to be released by raid5_get_active_stripe before it waits for another stripe_head.
Fixes: 3312e6c887fe ("md/raid5: Keep a reference to last stripe_head for batch") Cc: stable@vger.kernel.org # v6.0+ Signed-off-by: David Jeffery djeffery@redhat.com Reviewed-by: Logan Gunthorpe logang@deltatee.com Signed-off-by: Song Liu song@kernel.org Link: https://lore.kernel.org/r/20231002183422.13047-1-djeffery@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/md/raid5.c | 7 +++++++ 1 file changed, 7 insertions(+)
--- a/drivers/md/raid5.c +++ b/drivers/md/raid5.c @@ -854,6 +854,13 @@ struct stripe_head *raid5_get_active_str
set_bit(R5_INACTIVE_BLOCKED, &conf->cache_state); r5l_wake_reclaim(conf->log, 0); + + /* release batch_last before wait to avoid risk of deadlock */ + if (ctx && ctx->batch_last) { + raid5_release_stripe(ctx->batch_last); + ctx->batch_last = NULL; + } + wait_event_lock_irq(conf->wait_for_stripe, is_inactive_blocked(conf, hash), *(conf->hash_locks + hash));
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mika Westerberg mika.westerberg@linux.intel.com
commit c82458101d5490230d735caecce14c9c27b1010c upstream.
Mark Blakeney reported that when suspending system with a Thunderbolt dock connected and then unplugging the dock before resume (which is pretty normal flow with laptops), resuming takes long time.
What happens is that the PCIe link from the root port to the PCIe switch inside the Thunderbolt device does not train (as expected, the link is unplugged):
pcieport 0000:00:07.2: restoring config space at offset 0x24 (was 0x3bf12001, writing 0x3bf12001) pcieport 0000:00:07.0: waiting 100 ms for downstream link pcieport 0000:01:00.0: not ready 1023ms after resume; giving up
However, at this point we still try to resume the devices below that unplugged link:
pcieport 0000:01:00.0: Unable to change power state from D3cold to D0, device inaccessible ... pcieport 0000:01:00.0: restoring config space at offset 0x38 (was 0xffffffff, writing 0x0) ... pcieport 0000:02:02.0: waiting 100 ms for downstream link, after activation
And this is the link from PCIe switch downstream port to the xHCI on the dock:
xhci_hcd 0000:03:00.0: not ready 65535ms after resume; giving up xhci_hcd 0000:03:00.0: Unable to change power state from D3cold to D0, device inaccessible xhci_hcd 0000:03:00.0: restoring config space at offset 0x3c (was 0xffffffff, writing 0x1ff)
This ends up slowing down the resume time considerably. For this reason mark these devices as disconnected if the link above them did not train properly.
Fixes: e8b908146d44 ("PCI/PM: Increase wait time after resume") Link: https://lore.kernel.org/r/20230918053041.1018876-1-mika.westerberg@linux.int... Reported-by: Mark Blakeney mark.blakeney@bullet-systems.net Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217915 Signed-off-by: Mika Westerberg mika.westerberg@linux.intel.com Signed-off-by: Bjorn Helgaas bhelgaas@google.com Reviewed-by: Lukas Wunner lukas@wunner.de Cc: stable@vger.kernel.org # v6.4+ Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pci/pci-driver.c | 14 +++++++++++++- 1 file changed, 13 insertions(+), 1 deletion(-)
diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c index a79c110c7e51..51ec9e7e784f 100644 --- a/drivers/pci/pci-driver.c +++ b/drivers/pci/pci-driver.c @@ -572,7 +572,19 @@ static void pci_pm_default_resume_early(struct pci_dev *pci_dev)
static void pci_pm_bridge_power_up_actions(struct pci_dev *pci_dev) { - pci_bridge_wait_for_secondary_bus(pci_dev, "resume"); + int ret; + + ret = pci_bridge_wait_for_secondary_bus(pci_dev, "resume"); + if (ret) { + /* + * The downstream link failed to come up, so mark the + * devices below as disconnected to make sure we don't + * attempt to resume them. + */ + pci_walk_bus(pci_dev->subordinate, pci_dev_set_disconnected, + NULL); + return; + }
/* * When powering on a bridge from D3cold, the whole hierarchy may be
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sricharan Ramabadhran quic_srichara@quicinc.com
commit 6a878a54d0053ef21f3b829dc267487c2302b012 upstream.
PARF_SLV_ADDR_SPACE_SIZE_2_3_3 is used by qcom_pcie_post_init_2_3_3(). This PCIe slave address space size register offset is 0x358 but was incorrectly changed to 0x16c by 39171b33f652 ("PCI: qcom: Remove PCIE20_ prefix from register definitions").
This prevented access to slave address space registers like iATU, etc., so the IPQ8074 PCIe controller was not enumerated.
Revert back to the correct 0x358 offset and remove the unused PARF_SLV_ADDR_SPACE_SIZE_2_3_3.
Fixes: 39171b33f652 ("PCI: qcom: Remove PCIE20_ prefix from register definitions") Link: https://lore.kernel.org/r/20230919102948.1844909-1-quic_srichara@quicinc.com Tested-by: Robert Marko robimarko@gmail.com Signed-off-by: Sricharan Ramabadhran quic_srichara@quicinc.com [bhelgaas: commit log] Signed-off-by: Bjorn Helgaas bhelgaas@google.com Reviewed-by: Manivannan Sadhasivam mani@kernel.org Reviewed-by: Konrad Dybcio konrad.dybcio@linaro.org Cc: stable@vger.kernel.org # v6.4+ Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pci/controller/dwc/pcie-qcom.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-)
--- a/drivers/pci/controller/dwc/pcie-qcom.c +++ b/drivers/pci/controller/dwc/pcie-qcom.c @@ -43,7 +43,6 @@ #define PARF_PHY_REFCLK 0x4c #define PARF_CONFIG_BITS 0x50 #define PARF_DBI_BASE_ADDR 0x168 -#define PARF_SLV_ADDR_SPACE_SIZE_2_3_3 0x16c /* Register offset specific to IP ver 2.3.3 */ #define PARF_MHI_CLOCK_RESET_CTRL 0x174 #define PARF_AXI_MSTR_WR_ADDR_HALT 0x178 #define PARF_AXI_MSTR_WR_ADDR_HALT_V2 0x1a8 @@ -797,8 +796,7 @@ static int qcom_pcie_post_init_2_3_3(str u16 offset = dw_pcie_find_capability(pci, PCI_CAP_ID_EXP); u32 val;
- writel(SLV_ADDR_SPACE_SZ, - pcie->parf + PARF_SLV_ADDR_SPACE_SIZE_2_3_3); + writel(SLV_ADDR_SPACE_SZ, pcie->parf + PARF_SLV_ADDR_SPACE_SIZE);
val = readl(pcie->parf + PARF_PHY_CTRL); val &= ~PHY_TEST_PWR_DOWN;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jithu Joseph jithu.joseph@intel.com
commit 2545deba314eec91dc5ca1a954fe97f91ef1cf07 upstream.
Couple of error paths in do_core_test() was returning directly without doing a necessary cpus_read_unlock().
Following lockdep warning was observed when exercising these scenarios with PROVE_RAW_LOCK_NESTING enabled:
[ 139.304775] ================================================ [ 139.311185] WARNING: lock held when returning to user space! [ 139.317593] 6.6.0-rc2ifs01+ #11 Tainted: G S W I [ 139.324499] ------------------------------------------------ [ 139.330908] bash/11476 is leaving the kernel with locks still held! [ 139.338000] 1 lock held by bash/11476: [ 139.342262] #0: ffffffffaa26c930 (cpu_hotplug_lock){++++}-{0:0}, at: do_core_test+0x35/0x1c0 [intel_ifs]
Fix the flow so that all scenarios release the lock prior to returning from the function.
Fixes: 5210fb4e1880 ("platform/x86/intel/ifs: Sysfs interface for Array BIST") Cc: stable@vger.kernel.org Signed-off-by: Jithu Joseph jithu.joseph@intel.com Link: https://lore.kernel.org/r/20230927184824.2566086-1-jithu.joseph@intel.com Reviewed-by: Hans de Goede hdegoede@redhat.com Signed-off-by: Hans de Goede hdegoede@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/platform/x86/intel/ifs/runtest.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/drivers/platform/x86/intel/ifs/runtest.c b/drivers/platform/x86/intel/ifs/runtest.c index 1061eb7ec399..43c864add778 100644 --- a/drivers/platform/x86/intel/ifs/runtest.c +++ b/drivers/platform/x86/intel/ifs/runtest.c @@ -331,14 +331,15 @@ int do_core_test(int cpu, struct device *dev) switch (test->test_num) { case IFS_TYPE_SAF: if (!ifsd->loaded) - return -EPERM; - ifs_test_core(cpu, dev); + ret = -EPERM; + else + ifs_test_core(cpu, dev); break; case IFS_TYPE_ARRAY_BIST: ifs_array_test_core(cpu, dev); break; default: - return -EINVAL; + ret = -EINVAL; } out: cpus_read_unlock();
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jordan Rife jrife@google.com
commit 26297b4ce1ce4ea40bc9a48ec99f45da3f64d2e2 upstream.
commit 0bdf399342c5 ("net: Avoid address overwrite in kernel_connect") ensured that kernel_connect() will not overwrite the address parameter in cases where BPF connect hooks perform an address rewrite. This change replaces direct calls to sock->ops->connect() in net with kernel_connect() to make these call safe.
Link: https://lore.kernel.org/netdev/20230912013332.2048422-1-jrife@google.com/ Fixes: d74bad4e74ee ("bpf: Hooks for sys_connect") Cc: stable@vger.kernel.org Reviewed-by: Willem de Bruijn willemb@google.com Signed-off-by: Jordan Rife jrife@google.com Reviewed-by: Simon Horman horms@kernel.org Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/netfilter/ipvs/ip_vs_sync.c | 4 ++-- net/rds/tcp_connect.c | 2 +- 2 files changed, 3 insertions(+), 3 deletions(-)
--- a/net/netfilter/ipvs/ip_vs_sync.c +++ b/net/netfilter/ipvs/ip_vs_sync.c @@ -1507,8 +1507,8 @@ static int make_send_sock(struct netns_i }
get_mcast_sockaddr(&mcast_addr, &salen, &ipvs->mcfg, id); - result = sock->ops->connect(sock, (struct sockaddr *) &mcast_addr, - salen, 0); + result = kernel_connect(sock, (struct sockaddr *)&mcast_addr, + salen, 0); if (result < 0) { pr_err("Error connecting to the multicast addr\n"); goto error; --- a/net/rds/tcp_connect.c +++ b/net/rds/tcp_connect.c @@ -173,7 +173,7 @@ int rds_tcp_conn_path_connect(struct rds * own the socket */ rds_tcp_set_callbacks(sock, cp); - ret = sock->ops->connect(sock, addr, addrlen, O_NONBLOCK); + ret = kernel_connect(sock, addr, addrlen, O_NONBLOCK);
rdsdebug("connect to address %pI6c returned %d\n", &conn->c_faddr, ret); if (ret == -EINPROGRESS)
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jens Axboe axboe@kernel.dk
commit f8024f1f36a30a082b0457d5779c8847cea57f57 upstream.
syzbot reports that registering a mapped buffer ring on arm32 can trigger an OOPS. Registered buffer rings have two modes, one of them is the application passing in the memory that the buffer ring should reside in. Once those pages are mapped, we use page_address() to get a virtual address. This will obviously fail on highmem pages, which aren't mapped.
Add a check if we have any highmem pages after mapping, and fail the attempt to register a provided buffer ring if we do. This will return the same error as kernels that don't support provided buffer rings to begin with.
Link: https://lore.kernel.org/io-uring/000000000000af635c0606bcb889@google.com/ Fixes: c56e022c0a27 ("io_uring: add support for user mapped provided buffer ring") Cc: stable@vger.kernel.org Reported-by: syzbot+2113e61b8848fa7951d8@syzkaller.appspotmail.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- io_uring/kbuf.c | 27 +++++++++++++++++++-------- 1 file changed, 19 insertions(+), 8 deletions(-)
--- a/io_uring/kbuf.c +++ b/io_uring/kbuf.c @@ -481,7 +481,7 @@ static int io_pin_pbuf_ring(struct io_ur { struct io_uring_buf_ring *br; struct page **pages; - int nr_pages; + int i, nr_pages;
pages = io_pin_pages(reg->ring_addr, flex_array_size(br, bufs, reg->ring_entries), @@ -489,6 +489,17 @@ static int io_pin_pbuf_ring(struct io_ur if (IS_ERR(pages)) return PTR_ERR(pages);
+ /* + * Apparently some 32-bit boxes (ARM) will return highmem pages, + * which then need to be mapped. We could support that, but it'd + * complicate the code and slowdown the common cases quite a bit. + * So just error out, returning -EINVAL just like we did on kernels + * that didn't support mapped buffer rings. + */ + for (i = 0; i < nr_pages; i++) + if (PageHighMem(pages[i])) + goto error_unpin; + br = page_address(pages[0]); #ifdef SHM_COLOUR /* @@ -500,13 +511,8 @@ static int io_pin_pbuf_ring(struct io_ur * should use IOU_PBUF_RING_MMAP instead, and liburing will handle * this transparently. */ - if ((reg->ring_addr | (unsigned long) br) & (SHM_COLOUR - 1)) { - int i; - - for (i = 0; i < nr_pages; i++) - unpin_user_page(pages[i]); - return -EINVAL; - } + if ((reg->ring_addr | (unsigned long) br) & (SHM_COLOUR - 1)) + goto error_unpin; #endif bl->buf_pages = pages; bl->buf_nr_pages = nr_pages; @@ -514,6 +520,11 @@ static int io_pin_pbuf_ring(struct io_ur bl->is_mapped = 1; bl->is_mmap = 0; return 0; +error_unpin: + for (i = 0; i < nr_pages; i++) + unpin_user_page(pages[i]); + kvfree(pages); + return -EINVAL; }
static int io_alloc_pbuf_ring(struct io_uring_buf_reg *reg,
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jens Axboe axboe@kernel.dk
commit 1658633c04653578429ff5dfc62fdc159203a8f2 upstream.
io_lockdep_assert_cq_locked() checks that locking is correctly done when a CQE is posted. If the ring is setup in a disabled state with IORING_SETUP_R_DISABLED, then ctx->submitter_task isn't assigned until the ring is later enabled. We generally don't post CQEs in this state, as no SQEs can be submitted. However it is possible to generate a CQE if tagged resources are being updated. If this happens and PROVE_LOCKING is enabled, then the locking check helper will dereference ctx->submitter_task, which hasn't been set yet.
Fixup io_lockdep_assert_cq_locked() to handle this case correctly. While at it, convert it to a static inline as well, so that generated line offsets will actually reflect which condition failed, rather than just the line offset for io_lockdep_assert_cq_locked() itself.
Reported-and-tested-by: syzbot+efc45d4e7ba6ab4ef1eb@syzkaller.appspotmail.com Fixes: f26cc9593581 ("io_uring: lockdep annotate CQ locking") Cc: stable@vger.kernel.org Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- io_uring/io_uring.h | 41 +++++++++++++++++++++++++++-------------- 1 file changed, 27 insertions(+), 14 deletions(-)
--- a/io_uring/io_uring.h +++ b/io_uring/io_uring.h @@ -87,20 +87,33 @@ bool __io_alloc_req_refill(struct io_rin bool io_match_task_safe(struct io_kiocb *head, struct task_struct *task, bool cancel_all);
-#define io_lockdep_assert_cq_locked(ctx) \ - do { \ - lockdep_assert(in_task()); \ - \ - if (ctx->flags & IORING_SETUP_IOPOLL) { \ - lockdep_assert_held(&ctx->uring_lock); \ - } else if (!ctx->task_complete) { \ - lockdep_assert_held(&ctx->completion_lock); \ - } else if (ctx->submitter_task->flags & PF_EXITING) { \ - lockdep_assert(current_work()); \ - } else { \ - lockdep_assert(current == ctx->submitter_task); \ - } \ - } while (0) +#if defined(CONFIG_PROVE_LOCKING) +static inline void io_lockdep_assert_cq_locked(struct io_ring_ctx *ctx) +{ + lockdep_assert(in_task()); + + if (ctx->flags & IORING_SETUP_IOPOLL) { + lockdep_assert_held(&ctx->uring_lock); + } else if (!ctx->task_complete) { + lockdep_assert_held(&ctx->completion_lock); + } else if (ctx->submitter_task) { + /* + * ->submitter_task may be NULL and we can still post a CQE, + * if the ring has been setup with IORING_SETUP_R_DISABLED. + * Not from an SQE, as those cannot be submitted, but via + * updating tagged resources. + */ + if (ctx->submitter_task->flags & PF_EXITING) + lockdep_assert(current_work()); + else + lockdep_assert(current == ctx->submitter_task); + } +} +#else +static inline void io_lockdep_assert_cq_locked(struct io_ring_ctx *ctx) +{ +} +#endif
static inline void io_req_task_work_add(struct io_kiocb *req) {
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Filipe Manana fdmanana@suse.com
commit f8d1b011ca8c9d64ce32da431a8420635a96958a upstream.
Commit b7af0635c87f ("btrfs: print transaction aborted messages with an error level") changed the log level of transaction aborted messages from a debug level to an error level, so that such messages are always visible even on production systems where the log level is normally above the debug level (and also on some syzbot reports).
Later, commit fccf0c842ed4 ("btrfs: move btrfs_abort_transaction to transaction.c") changed the log level back to debug level when the error number for a transaction abort should not have a stack trace printed. This happened for absolutely no reason. It's always useful to print transaction abort messages with an error level, regardless of whether the error number should cause a stack trace or not.
So change back the log level to error level.
Fixes: fccf0c842ed4 ("btrfs: move btrfs_abort_transaction to transaction.c") CC: stable@vger.kernel.org # 6.5+ Reviewed-by: Qu Wenruo wqu@suse.com Signed-off-by: Filipe Manana fdmanana@suse.com Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/transaction.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/fs/btrfs/transaction.h +++ b/fs/btrfs/transaction.h @@ -218,8 +218,8 @@ do { \ (errno))) { \ /* Stack trace printed. */ \ } else { \ - btrfs_debug((trans)->fs_info, \ - "Transaction aborted (error %d)", \ + btrfs_err((trans)->fs_info, \ + "Transaction aborted (error %d)", \ (errno)); \ } \ } \
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Qu Wenruo wqu@suse.com
commit 5f521494cc73520ffac18ede0758883b9aedd018 upstream.
[BUG] The following script would allow invalid mount options to be specified (although such invalid options would just be ignored):
# mkfs.btrfs -f $dev # mount $dev $mnt1 <<< Successful mount expected # mount $dev $mnt2 -o junk <<< Failed mount expected # echo $? 0
[CAUSE] For the 2nd mount, since the fs is already mounted, we won't go through open_ctree() thus no btrfs_parse_options(), but only through btrfs_parse_subvol_options().
However we do not treat unrecognized options from valid but irrelevant options, thus those invalid options would just be ignored by btrfs_parse_subvol_options().
[FIX] Add the handling for Opt_err to handle invalid options and error out, while still ignore other valid options inside btrfs_parse_subvol_options().
Reported-by: Anand Jain anand.jain@oracle.com CC: stable@vger.kernel.org # 4.14+ Signed-off-by: Qu Wenruo wqu@suse.com Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/super.c | 4 ++++ 1 file changed, 4 insertions(+)
--- a/fs/btrfs/super.c +++ b/fs/btrfs/super.c @@ -948,6 +948,10 @@ static int btrfs_parse_subvol_options(co
*subvol_objectid = subvolid; break; + case Opt_err: + btrfs_err(NULL, "unrecognized mount option '%s'", p); + error = -EINVAL; + goto out; default: break; }
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jordan Rife jrife@google.com
commit 86a7e0b69bd5b812e48a20c66c2161744f3caa16 upstream.
Callers of sock_sendmsg(), and similarly kernel_sendmsg(), in kernel space may observe their value of msg_name change in cases where BPF sendmsg hooks rewrite the send address. This has been confirmed to break NFS mounts running in UDP mode and has the potential to break other systems.
This patch:
1) Creates a new function called __sock_sendmsg() with same logic as the old sock_sendmsg() function. 2) Replaces calls to sock_sendmsg() made by __sys_sendto() and __sys_sendmsg() with __sock_sendmsg() to avoid an unnecessary copy, as these system calls are already protected. 3) Modifies sock_sendmsg() so that it makes a copy of msg_name if present before passing it down the stack to insulate callers from changes to the send address.
Link: https://lore.kernel.org/netdev/20230912013332.2048422-1-jrife@google.com/ Fixes: 1cedee13d25a ("bpf: Hooks for sys_sendmsg") Cc: stable@vger.kernel.org Reviewed-by: Willem de Bruijn willemb@google.com Signed-off-by: Jordan Rife jrife@google.com Reviewed-by: Simon Horman horms@kernel.org Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/socket.c | 29 +++++++++++++++++++++++------ 1 file changed, 23 insertions(+), 6 deletions(-)
--- a/net/socket.c +++ b/net/socket.c @@ -732,6 +732,14 @@ static inline int sock_sendmsg_nosec(str return ret; }
+static int __sock_sendmsg(struct socket *sock, struct msghdr *msg) +{ + int err = security_socket_sendmsg(sock, msg, + msg_data_left(msg)); + + return err ?: sock_sendmsg_nosec(sock, msg); +} + /** * sock_sendmsg - send a message through @sock * @sock: socket @@ -742,10 +750,19 @@ static inline int sock_sendmsg_nosec(str */ int sock_sendmsg(struct socket *sock, struct msghdr *msg) { - int err = security_socket_sendmsg(sock, msg, - msg_data_left(msg)); + struct sockaddr_storage *save_addr = (struct sockaddr_storage *)msg->msg_name; + struct sockaddr_storage address; + int ret;
- return err ?: sock_sendmsg_nosec(sock, msg); + if (msg->msg_name) { + memcpy(&address, msg->msg_name, msg->msg_namelen); + msg->msg_name = &address; + } + + ret = __sock_sendmsg(sock, msg); + msg->msg_name = save_addr; + + return ret; } EXPORT_SYMBOL(sock_sendmsg);
@@ -1127,7 +1144,7 @@ static ssize_t sock_write_iter(struct ki if (sock->type == SOCK_SEQPACKET) msg.msg_flags |= MSG_EOR;
- res = sock_sendmsg(sock, &msg); + res = __sock_sendmsg(sock, &msg); *from = msg.msg_iter; return res; } @@ -2132,7 +2149,7 @@ int __sys_sendto(int fd, void __user *bu if (sock->file->f_flags & O_NONBLOCK) flags |= MSG_DONTWAIT; msg.msg_flags = flags; - err = sock_sendmsg(sock, &msg); + err = __sock_sendmsg(sock, &msg);
out_put: fput_light(sock->file, fput_needed); @@ -2492,7 +2509,7 @@ static int ____sys_sendmsg(struct socket err = sock_sendmsg_nosec(sock, msg_sys); goto out_freectl; } - err = sock_sendmsg(sock, msg_sys); + err = __sock_sendmsg(sock, msg_sys); /* * If this is sendmmsg() and sending to current destination address was * successful, remember it.
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nirmoy Das nirmoy.das@intel.com
commit 128c20eda73bd3e78505c574fb17adb46195c98b upstream.
PIPE_CONTROL_FLUSH_L3 is not needed for aux invalidation so don't set that.
Fixes: 78a6ccd65fa3 ("drm/i915/gt: Ensure memory quiesced before invalidation") Cc: Jonathan Cavitt jonathan.cavitt@intel.com Cc: Andi Shyti andi.shyti@linux.intel.com Cc: stable@vger.kernel.org # v5.8+ Cc: Andrzej Hajda andrzej.hajda@intel.com Cc: Tvrtko Ursulin tvrtko.ursulin@intel.com Cc: Matt Roper matthew.d.roper@intel.com Cc: Tejas Upadhyay tejas.upadhyay@intel.com Cc: Lucas De Marchi lucas.demarchi@intel.com Cc: Prathap Kumar Valsan prathap.kumar.valsan@intel.com Cc: Tapani Pälli tapani.palli@intel.com Cc: Mark Janes mark.janes@intel.com Cc: Rodrigo Vivi rodrigo.vivi@intel.com Signed-off-by: Nirmoy Das nirmoy.das@intel.com Acked-by: Matt Roper matthew.d.roper@intel.com Reviewed-by: Andi Shyti andi.shyti@linux.intel.com Tested-by: Tapani Pälli tapani.palli@intel.com Reviewed-by: Andrzej Hajda andrzej.hajda@intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20230926142401.25687-1-nirmoy.... (cherry picked from commit 03d681412b38558aefe4fb0f46e36efa94bb21ef) Signed-off-by: Rodrigo Vivi rodrigo.vivi@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/i915/gt/gen8_engine_cs.c | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-)
--- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c +++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c @@ -271,8 +271,17 @@ int gen12_emit_flush_rcs(struct i915_req if (GRAPHICS_VER_FULL(rq->i915) >= IP_VER(12, 70)) bit_group_0 |= PIPE_CONTROL_CCS_FLUSH;
+ /* + * L3 fabric flush is needed for AUX CCS invalidation + * which happens as part of pipe-control so we can + * ignore PIPE_CONTROL_FLUSH_L3. Also PIPE_CONTROL_FLUSH_L3 + * deals with Protected Memory which is not needed for + * AUX CCS invalidation and lead to unwanted side effects. + */ + if (mode & EMIT_FLUSH) + bit_group_1 |= PIPE_CONTROL_FLUSH_L3; + bit_group_1 |= PIPE_CONTROL_TILE_CACHE_FLUSH; - bit_group_1 |= PIPE_CONTROL_FLUSH_L3; bit_group_1 |= PIPE_CONTROL_RENDER_TARGET_CACHE_FLUSH; bit_group_1 |= PIPE_CONTROL_DEPTH_CACHE_FLUSH; /* Wa_1409600907:tgl,adl-p */
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mario Limonciello mario.limonciello@amd.com
commit 134b8c5d8674e7cde380f82e9aedfd46dcdd16f7 upstream.
On some systems with Navi3x dGPU will attempt to use BACO for runtime PM but fails to resume properly. This is because on these systems the root port goes into D3cold which is incompatible with BACO.
This happens because in this case dGPU is connected to a bridge between root port which causes BOCO detection logic to fail. Fix the intent of the logic by looking at root port, not the immediate upstream bridge for _PR3.
Cc: stable@vger.kernel.org Suggested-by: Jun Ma Jun.Ma2@amd.com Tested-by: David Perry David.Perry@amd.com Fixes: b10c1c5b3a4e ("drm/amdgpu: add check for ACPI power resources") Signed-off-by: Mario Limonciello mario.limonciello@amd.com Reviewed-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -2308,7 +2308,7 @@ static int amdgpu_device_ip_early_init(s adev->flags |= AMD_IS_PX;
if (!(adev->flags & AMD_IS_APU)) { - parent = pci_upstream_bridge(adev->pdev); + parent = pcie_find_root_port(adev->pdev); adev->has_pr3 = parent ? pci_pr3_present(parent) : false; }
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mario Limonciello mario.limonciello@amd.com
commit 2a1fe39a5be785e962e387146aed34fa9a829f3f upstream.
While aligning SMU11 with SMU13 implementation an assumption was made that `dpm_context->dpm_tables.pcie_table` was populated in dpm table initialization like in SMU13 but it isn't.
So restore some of the original logic and instead just check for amdgpu_device_pcie_dynamic_switching_supported() to decide whether to hardcode values; erring on the side of performance.
Cc: stable@vger.kernel.org # 6.1+ Reported-and-tested-by: Umio Yasuno coelacanth_dream@protonmail.com Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/1447#note_2101382 Fixes: e701156ccc6c ("drm/amd: Align SMU11 SMU_MSG_OverridePcieParameters implementation with SMU13") Signed-off-by: Mario Limonciello mario.limonciello@amd.com Reviewed-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c | 41 ++++++++-------- 1 file changed, 23 insertions(+), 18 deletions(-)
--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c +++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c @@ -2081,36 +2081,41 @@ static int sienna_cichlid_display_disabl return ret; }
+#define MAX(a, b) ((a) > (b) ? (a) : (b)) + static int sienna_cichlid_update_pcie_parameters(struct smu_context *smu, uint32_t pcie_gen_cap, uint32_t pcie_width_cap) { struct smu_11_0_dpm_context *dpm_context = smu->smu_dpm.dpm_context; struct smu_11_0_pcie_table *pcie_table = &dpm_context->dpm_tables.pcie_table; - u32 smu_pcie_arg; + uint8_t *table_member1, *table_member2; + uint32_t min_gen_speed, max_gen_speed; + uint32_t min_lane_width, max_lane_width; + uint32_t smu_pcie_arg; int ret, i;
- /* PCIE gen speed and lane width override */ - if (!amdgpu_device_pcie_dynamic_switching_supported()) { - if (pcie_table->pcie_gen[NUM_LINK_LEVELS - 1] < pcie_gen_cap) - pcie_gen_cap = pcie_table->pcie_gen[NUM_LINK_LEVELS - 1]; + GET_PPTABLE_MEMBER(PcieGenSpeed, &table_member1); + GET_PPTABLE_MEMBER(PcieLaneCount, &table_member2);
- if (pcie_table->pcie_lane[NUM_LINK_LEVELS - 1] < pcie_width_cap) - pcie_width_cap = pcie_table->pcie_lane[NUM_LINK_LEVELS - 1]; + min_gen_speed = MAX(0, table_member1[0]); + max_gen_speed = MIN(pcie_gen_cap, table_member1[1]); + min_gen_speed = min_gen_speed > max_gen_speed ? + max_gen_speed : min_gen_speed; + min_lane_width = MAX(1, table_member2[0]); + max_lane_width = MIN(pcie_width_cap, table_member2[1]); + min_lane_width = min_lane_width > max_lane_width ? + max_lane_width : min_lane_width;
- /* Force all levels to use the same settings */ - for (i = 0; i < NUM_LINK_LEVELS; i++) { - pcie_table->pcie_gen[i] = pcie_gen_cap; - pcie_table->pcie_lane[i] = pcie_width_cap; - } + if (!amdgpu_device_pcie_dynamic_switching_supported()) { + pcie_table->pcie_gen[0] = max_gen_speed; + pcie_table->pcie_lane[0] = max_lane_width; } else { - for (i = 0; i < NUM_LINK_LEVELS; i++) { - if (pcie_table->pcie_gen[i] > pcie_gen_cap) - pcie_table->pcie_gen[i] = pcie_gen_cap; - if (pcie_table->pcie_lane[i] > pcie_width_cap) - pcie_table->pcie_lane[i] = pcie_width_cap; - } + pcie_table->pcie_gen[0] = min_gen_speed; + pcie_table->pcie_lane[0] = min_lane_width; } + pcie_table->pcie_gen[1] = max_gen_speed; + pcie_table->pcie_lane[1] = max_lane_width;
for (i = 0; i < NUM_LINK_LEVELS; i++) { smu_pcie_arg = (i << 16 |
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Rob Herring robh@kernel.org
commit a654a69b9f9c06b2e56387d0b99f0e3e6b0ff4ef upstream.
Add the CPU Part number for the new Arm design.
Cc: stable@vger.kernel.org Signed-off-by: Rob Herring robh@kernel.org Link: https://lore.kernel.org/r/20230921194156.1050055-1-robh@kernel.org Signed-off-by: Will Deacon will@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm64/include/asm/cputype.h | 2 ++ 1 file changed, 2 insertions(+)
--- a/arch/arm64/include/asm/cputype.h +++ b/arch/arm64/include/asm/cputype.h @@ -79,6 +79,7 @@ #define ARM_CPU_PART_CORTEX_A78AE 0xD42 #define ARM_CPU_PART_CORTEX_X1 0xD44 #define ARM_CPU_PART_CORTEX_A510 0xD46 +#define ARM_CPU_PART_CORTEX_A520 0xD80 #define ARM_CPU_PART_CORTEX_A710 0xD47 #define ARM_CPU_PART_CORTEX_A715 0xD4D #define ARM_CPU_PART_CORTEX_X2 0xD48 @@ -148,6 +149,7 @@ #define MIDR_CORTEX_A78AE MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A78AE) #define MIDR_CORTEX_X1 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_X1) #define MIDR_CORTEX_A510 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A510) +#define MIDR_CORTEX_A520 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A520) #define MIDR_CORTEX_A710 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A710) #define MIDR_CORTEX_A715 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A715) #define MIDR_CORTEX_X2 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_X2)
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Rob Herring robh@kernel.org
commit 471470bc7052d28ce125901877dd10e4c048e513 upstream.
Implement the workaround for ARM Cortex-A520 erratum 2966298. On an affected Cortex-A520 core, a speculatively executed unprivileged load might leak data from a privileged load via a cache side channel. The issue only exists for loads within a translation regime with the same translation (e.g. same ASID and VMID). Therefore, the issue only affects the return to EL0.
The workaround is to execute a TLBI before returning to EL0 after all loads of privileged data. A non-shareable TLBI to any address is sufficient.
The workaround isn't necessary if page table isolation (KPTI) is enabled, but for simplicity it will be. Page table isolation should normally be disabled for Cortex-A520 as it supports the CSV3 feature and the E0PD feature (used when KASLR is enabled).
Cc: stable@vger.kernel.org Signed-off-by: Rob Herring robh@kernel.org Link: https://lore.kernel.org/r/20230921194156.1050055-2-robh@kernel.org Signed-off-by: Will Deacon will@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- Documentation/arch/arm64/silicon-errata.rst | 2 ++ arch/arm64/Kconfig | 13 +++++++++++++ arch/arm64/kernel/cpu_errata.c | 8 ++++++++ arch/arm64/kernel/entry.S | 4 ++++ arch/arm64/tools/cpucaps | 1 + 5 files changed, 28 insertions(+)
--- a/Documentation/arch/arm64/silicon-errata.rst +++ b/Documentation/arch/arm64/silicon-errata.rst @@ -63,6 +63,8 @@ stable kernels. +----------------+-----------------+-----------------+-----------------------------+ | ARM | Cortex-A510 | #1902691 | ARM64_ERRATUM_1902691 | +----------------+-----------------+-----------------+-----------------------------+ +| ARM | Cortex-A520 | #2966298 | ARM64_ERRATUM_2966298 | ++----------------+-----------------+-----------------+-----------------------------+ | ARM | Cortex-A53 | #826319 | ARM64_ERRATUM_826319 | +----------------+-----------------+-----------------+-----------------------------+ | ARM | Cortex-A53 | #827319 | ARM64_ERRATUM_827319 | --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -1038,6 +1038,19 @@ config ARM64_ERRATUM_2645198
If unsure, say Y.
+config ARM64_ERRATUM_2966298 + bool "Cortex-A520: 2966298: workaround for speculatively executed unprivileged load" + default y + help + This option adds the workaround for ARM Cortex-A520 erratum 2966298. + + On an affected Cortex-A520 core, a speculatively executed unprivileged + load might leak data from a privileged level via a cache side channel. + + Work around this problem by executing a TLBI before returning to EL0. + + If unsure, say Y. + config CAVIUM_ERRATUM_22375 bool "Cavium erratum 22375, 24313" default y --- a/arch/arm64/kernel/cpu_errata.c +++ b/arch/arm64/kernel/cpu_errata.c @@ -730,6 +730,14 @@ const struct arm64_cpu_capabilities arm6 .cpu_enable = cpu_clear_bf16_from_user_emulation, }, #endif +#ifdef CONFIG_ARM64_ERRATUM_2966298 + { + .desc = "ARM erratum 2966298", + .capability = ARM64_WORKAROUND_2966298, + /* Cortex-A520 r0p0 - r0p1 */ + ERRATA_MIDR_REV_RANGE(MIDR_CORTEX_A520, 0, 0, 1), + }, +#endif #ifdef CONFIG_AMPERE_ERRATUM_AC03_CPU_38 { .desc = "AmpereOne erratum AC03_CPU_38", --- a/arch/arm64/kernel/entry.S +++ b/arch/arm64/kernel/entry.S @@ -428,6 +428,10 @@ alternative_else_nop_endif ldp x28, x29, [sp, #16 * 14]
.if \el == 0 +alternative_if ARM64_WORKAROUND_2966298 + tlbi vale1, xzr + dsb nsh +alternative_else_nop_endif alternative_if_not ARM64_UNMAP_KERNEL_AT_EL0 ldr lr, [sp, #S_LR] add sp, sp, #PT_REGS_SIZE // restore sp --- a/arch/arm64/tools/cpucaps +++ b/arch/arm64/tools/cpucaps @@ -83,6 +83,7 @@ WORKAROUND_2077057 WORKAROUND_2457168 WORKAROUND_2645198 WORKAROUND_2658417 +WORKAROUND_2966298 WORKAROUND_AMPERE_AC03_CPU_38 WORKAROUND_TRBE_OVERWRITE_FILL_MODE WORKAROUND_TSB_FLUSH_FAILURE
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Christophe JAILLET christophe.jaillet@wanadoo.fr
[ Upstream commit e1cd4004cde7c9b694bbdd8def0e02288ee58c74 ]
If an error occurs after a successful usb_alloc_urb() call, usb_free_urb() should be called.
Fixes: fb1a79a6b6e1 ("HID: sony: fix freeze when inserting ghlive ps3/wii dongles") Signed-off-by: Christophe JAILLET christophe.jaillet@wanadoo.fr Signed-off-by: Jiri Kosina jkosina@suse.cz Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/hid/hid-sony.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/hid/hid-sony.c b/drivers/hid/hid-sony.c index dd942061fd775..a02046a78b2da 100644 --- a/drivers/hid/hid-sony.c +++ b/drivers/hid/hid-sony.c @@ -2155,6 +2155,9 @@ static int sony_probe(struct hid_device *hdev, const struct hid_device_id *id) return ret;
err: + if (sc->ghl_urb) + usb_free_urb(sc->ghl_urb); + hid_hw_stop(hdev); return ret; }
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Lorenzo Bianconi lorenzo@kernel.org
[ Upstream commit 195273147e520844c1aae9fbf85cb6eb0bc0fdd7 ]
Fix the following kernel depency lock holding wed_lock with BH disabled.
[ 40.579696] mt798x-wmac 18000000.wifi: attaching wed device 0 version 2 [ 40.604648] platform 15010000.wed: MTK WED WO Firmware Version: DEV_000000, Build Time: 20221208202138 [ 40.613972] platform 15010000.wed: MTK WED WO Chip ID 00 Region 3 [ 40.943617] [ 40.945118] ======================================================== [ 40.951457] WARNING: possible irq lock inversion dependency detected [ 40.957797] 5.15.127 #0 Not tainted [ 40.961276] -------------------------------------------------------- [ 40.967614] insmod/2329 just changed the state of lock: [ 40.972827] ffffff8004003b08 (&dev->wed_lock){+.+.}-{2:2}, at: mt76_get_rxwi+0x1c/0xac [mt76] [ 40.981387] but this lock was taken by another, SOFTIRQ-safe lock in the past: [ 40.988592] (&q->lock){+.-.}-{2:2} [ 40.988602] [ 40.988602] [ 40.988602] and interrupts could create inverse lock ordering between them. [ 40.988602] [ 41.003445] [ 41.003445] other info that might help us debug this: [ 41.009957] Possible interrupt unsafe locking scenario: [ 41.009957] [ 41.016729] CPU0 CPU1 [ 41.021245] ---- ---- [ 41.025761] lock(&dev->wed_lock); [ 41.029241] local_irq_disable(); [ 41.035145] lock(&q->lock); [ 41.040620] lock(&dev->wed_lock); [ 41.046616] <Interrupt> [ 41.049223] lock(&q->lock); [ 41.052356] [ 41.052356] *** DEADLOCK *** [ 41.052356] [ 41.058260] 1 lock held by insmod/2329: [ 41.062085] #0: ffffff80003b9988 (&dev->mutex){....}-{3:3}, at: __driver_attach+0x88/0x190 [ 41.070442] [ 41.070442] the shortest dependencies between 2nd lock and 1st lock: [ 41.078257] -> (&q->lock){+.-.}-{2:2} { [ 41.082177] HARDIRQ-ON-W at: [ 41.085396] lock_acquire+0xfc/0x2c0 [ 41.090787] _raw_spin_lock_bh+0x84/0xa0 [ 41.096525] mt76_dma_cleanup+0x24c/0x650 [mt76] [ 41.102977] mt76_dma_cleanup+0x614/0x650 [mt76] [ 41.109428] mt7915_eeprom_get_power_delta+0x1168/0x2464 [mt7915e] [ 41.117435] mt7915_eeprom_init+0x40/0x340 [mt7915e] [ 41.124222] cleanup_module+0x94/0xb28 [mt7915e] [ 41.130662] platform_probe+0x64/0xbc [ 41.136139] really_probe.part.0+0x98/0x2f4 [ 41.142134] __driver_probe_device+0x94/0x16c [ 41.148303] driver_probe_device+0x40/0x120 [ 41.154299] __driver_attach+0x94/0x190 [ 41.159947] bus_for_each_dev+0x5c/0x94 [ 41.165594] driver_attach+0x20/0x30 [ 41.170983] bus_add_driver+0x104/0x1f4 [ 41.176631] driver_register+0x74/0x120 [ 41.182280] __platform_driver_register+0x24/0x30 [ 41.188797] 0xffffffc000cb1074 [ 41.193754] do_one_initcall+0x70/0x2cc [ 41.199403] do_init_module+0x44/0x240 [ 41.204968] load_module+0x1f5c/0x2874 [ 41.210532] __do_sys_init_module+0x1d8/0x2ac [ 41.216702] __arm64_sys_init_module+0x18/0x20 [ 41.222958] invoke_syscall.constprop.0+0x4c/0xe0 [ 41.229474] do_el0_svc+0x50/0xf0 [ 41.234602] el0_svc+0x4c/0xcc [ 41.239471] el0t_64_sync_handler+0xe0/0x110 [ 41.245556] el0t_64_sync+0x15c/0x160 [ 41.251029] IN-SOFTIRQ-W at: [ 41.254249] lock_acquire+0xfc/0x2c0 [ 41.259638] _raw_spin_lock_bh+0x84/0xa0 [ 41.265372] mt76_queue_tx_complete+0x34/0x70 [mt76] [ 41.272170] mt76_free_pending_rxwi+0x36c/0x5d0 [mt76] [ 41.279140] mt76_free_pending_rxwi+0x5c0/0x5d0 [mt76] [ 41.286111] mt7915_eeprom_get_power_delta+0x620/0x2464 [mt7915e] [ 41.294026] __napi_poll.constprop.0+0x5c/0x230 [ 41.300372] net_rx_action+0xe4/0x294 [ 41.305847] _stext+0x154/0x4cc [ 41.310801] do_softirq+0xa4/0xbc [ 41.315930] __local_bh_enable_ip+0x168/0x174 [ 41.322097] napi_threaded_poll+0xbc/0x140 [ 41.328007] kthread+0x13c/0x150 [ 41.333049] ret_from_fork+0x10/0x20 [ 41.338437] INITIAL USE at: [ 41.341568] lock_acquire+0xfc/0x2c0 [ 41.346869] _raw_spin_lock_bh+0x84/0xa0 [ 41.352519] mt76_dma_cleanup+0x24c/0x650 [mt76] [ 41.358882] mt76_dma_cleanup+0x614/0x650 [mt76] [ 41.365245] mt7915_eeprom_get_power_delta+0x1168/0x2464 [mt7915e] [ 41.373160] mt7915_eeprom_init+0x40/0x340 [mt7915e] [ 41.379860] cleanup_module+0x94/0xb28 [mt7915e] [ 41.386213] platform_probe+0x64/0xbc [ 41.391602] really_probe.part.0+0x98/0x2f4 [ 41.397511] __driver_probe_device+0x94/0x16c [ 41.403594] driver_probe_device+0x40/0x120 [ 41.409502] __driver_attach+0x94/0x190 [ 41.415063] bus_for_each_dev+0x5c/0x94 [ 41.420625] driver_attach+0x20/0x30 [ 41.425926] bus_add_driver+0x104/0x1f4 [ 41.431487] driver_register+0x74/0x120 [ 41.437049] __platform_driver_register+0x24/0x30 [ 41.443479] 0xffffffc000cb1074 [ 41.448346] do_one_initcall+0x70/0x2cc [ 41.453907] do_init_module+0x44/0x240 [ 41.459383] load_module+0x1f5c/0x2874 [ 41.464860] __do_sys_init_module+0x1d8/0x2ac [ 41.470944] __arm64_sys_init_module+0x18/0x20 [ 41.477113] invoke_syscall.constprop.0+0x4c/0xe0 [ 41.483542] do_el0_svc+0x50/0xf0 [ 41.488582] el0_svc+0x4c/0xcc [ 41.493364] el0t_64_sync_handler+0xe0/0x110 [ 41.499361] el0t_64_sync+0x15c/0x160 [ 41.504748] } [ 41.506489] ... key at: [<ffffffc000c65ba0>] __this_module+0x3e0/0xffffffffffffa840 [mt76] [ 41.515371] ... acquired at: [ 41.518413] _raw_spin_lock+0x60/0x74 [ 41.522240] mt76_get_rxwi+0x1c/0xac [mt76] [ 41.526608] mt76_dma_cleanup+0x3e0/0x650 [mt76] [ 41.531410] mt76_dma_cleanup+0x614/0x650 [mt76] [ 41.536211] mt7915_dma_init+0x408/0x7b0 [mt7915e] [ 41.541177] mt7915_register_device+0x310/0x620 [mt7915e] [ 41.546749] mt7915_mmio_probe+0xcec/0x1d44 [mt7915e] [ 41.551973] platform_probe+0x64/0xbc [ 41.555802] really_probe.part.0+0x98/0x2f4 [ 41.560149] __driver_probe_device+0x94/0x16c [ 41.564670] driver_probe_device+0x40/0x120 [ 41.569017] __driver_attach+0x94/0x190 [ 41.573019] bus_for_each_dev+0x5c/0x94 [ 41.577018] driver_attach+0x20/0x30 [ 41.580758] bus_add_driver+0x104/0x1f4 [ 41.584758] driver_register+0x74/0x120 [ 41.588759] __platform_driver_register+0x24/0x30 [ 41.593628] init_module+0x74/0x1000 [mt7915e] [ 41.598248] do_one_initcall+0x70/0x2cc [ 41.602248] do_init_module+0x44/0x240 [ 41.606162] load_module+0x1f5c/0x2874 [ 41.610078] __do_sys_init_module+0x1d8/0x2ac [ 41.614600] __arm64_sys_init_module+0x18/0x20 [ 41.619209] invoke_syscall.constprop.0+0x4c/0xe0 [ 41.624076] do_el0_svc+0x50/0xf0 [ 41.627555] el0_svc+0x4c/0xcc [ 41.630776] el0t_64_sync_handler+0xe0/0x110 [ 41.635211] el0t_64_sync+0x15c/0x160 [ 41.639037] [ 41.640517] -> (&dev->wed_lock){+.+.}-{2:2} { [ 41.644872] HARDIRQ-ON-W at: [ 41.648003] lock_acquire+0xfc/0x2c0 [ 41.653219] _raw_spin_lock+0x60/0x74 [ 41.658520] mt76_free_pending_rxwi+0xc0/0x5d0 [mt76] [ 41.665232] mt76_dma_cleanup+0x1dc/0x650 [mt76] [ 41.671508] mt7915_eeprom_get_power_delta+0x1830/0x2464 [mt7915e] [ 41.679336] mt7915_unregister_device+0x5b4/0x910 [mt7915e] [ 41.686555] mt7915_eeprom_get_target_power+0xb8/0x230 [mt7915e] [ 41.694209] mt7986_wmac_enable+0xc30/0xcd0 [mt7915e] [ 41.700909] platform_remove+0x4c/0x64 [ 41.706298] __device_release_driver+0x194/0x240 [ 41.712554] driver_detach+0xc0/0x100 [ 41.717857] bus_remove_driver+0x54/0xac [ 41.723418] driver_unregister+0x2c/0x54 [ 41.728980] platform_driver_unregister+0x10/0x20 [ 41.735323] mt7915_ops+0x244/0xffffffffffffed58 [mt7915e] [ 41.742457] __arm64_sys_delete_module+0x170/0x23c [ 41.748887] invoke_syscall.constprop.0+0x4c/0xe0 [ 41.755229] do_el0_svc+0x50/0xf0 [ 41.760183] el0_svc+0x4c/0xcc [ 41.764878] el0t_64_sync_handler+0xe0/0x110 [ 41.770788] el0t_64_sync+0x15c/0x160 [ 41.776088] SOFTIRQ-ON-W at: [ 41.779220] lock_acquire+0xfc/0x2c0 [ 41.784435] _raw_spin_lock+0x60/0x74 [ 41.789737] mt76_get_rxwi+0x1c/0xac [mt76] [ 41.795580] mt7915_debugfs_rx_log+0x804/0xb74 [mt7915e] [ 41.802540] mtk_wed_start+0x970/0xaa0 [ 41.807929] mt7915_dma_start+0x26c/0x630 [mt7915e] [ 41.814455] mt7915_dma_start+0x5a4/0x630 [mt7915e] [ 41.820981] mt7915_dma_init+0x45c/0x7b0 [mt7915e] [ 41.827420] mt7915_register_device+0x310/0x620 [mt7915e] [ 41.834467] mt7915_mmio_probe+0xcec/0x1d44 [mt7915e] [ 41.841167] platform_probe+0x64/0xbc [ 41.846469] really_probe.part.0+0x98/0x2f4 [ 41.852291] __driver_probe_device+0x94/0x16c [ 41.858286] driver_probe_device+0x40/0x120 [ 41.864107] __driver_attach+0x94/0x190 [ 41.869582] bus_for_each_dev+0x5c/0x94 [ 41.875056] driver_attach+0x20/0x30 [ 41.880270] bus_add_driver+0x104/0x1f4 [ 41.885745] driver_register+0x74/0x120 [ 41.891221] __platform_driver_register+0x24/0x30 [ 41.897564] init_module+0x74/0x1000 [mt7915e] [ 41.903657] do_one_initcall+0x70/0x2cc [ 41.909130] do_init_module+0x44/0x240 [ 41.914520] load_module+0x1f5c/0x2874 [ 41.919909] __do_sys_init_module+0x1d8/0x2ac [ 41.925905] __arm64_sys_init_module+0x18/0x20 [ 41.931989] invoke_syscall.constprop.0+0x4c/0xe0 [ 41.938331] do_el0_svc+0x50/0xf0 [ 41.943285] el0_svc+0x4c/0xcc [ 41.947981] el0t_64_sync_handler+0xe0/0x110 [ 41.953892] el0t_64_sync+0x15c/0x160 [ 41.959192] INITIAL USE at: [ 41.962238] lock_acquire+0xfc/0x2c0 [ 41.967365] _raw_spin_lock+0x60/0x74 [ 41.972580] mt76_free_pending_rxwi+0xc0/0x5d0 [mt76] [ 41.979206] mt76_dma_cleanup+0x1dc/0x650 [mt76] [ 41.985395] mt7915_eeprom_get_power_delta+0x1830/0x2464 [mt7915e] [ 41.993137] mt7915_unregister_device+0x5b4/0x910 [mt7915e] [ 42.000270] mt7915_eeprom_get_target_power+0xb8/0x230 [mt7915e] [ 42.007837] mt7986_wmac_enable+0xc30/0xcd0 [mt7915e] [ 42.014450] platform_remove+0x4c/0x64 [ 42.019753] __device_release_driver+0x194/0x240 [ 42.025922] driver_detach+0xc0/0x100 [ 42.031137] bus_remove_driver+0x54/0xac [ 42.036612] driver_unregister+0x2c/0x54 [ 42.042087] platform_driver_unregister+0x10/0x20 [ 42.048344] mt7915_ops+0x244/0xffffffffffffed58 [mt7915e] [ 42.055391] __arm64_sys_delete_module+0x170/0x23c [ 42.061735] invoke_syscall.constprop.0+0x4c/0xe0 [ 42.067990] do_el0_svc+0x50/0xf0 [ 42.072857] el0_svc+0x4c/0xcc [ 42.077466] el0t_64_sync_handler+0xe0/0x110 [ 42.083289] el0t_64_sync+0x15c/0x160 [ 42.088503] } [ 42.090157] ... key at: [<ffffffc000c65c10>] __this_module+0x450/0xffffffffffffa840 [mt76] [ 42.098951] ... acquired at: [ 42.101907] __lock_acquire+0x718/0x1df0 [ 42.105994] lock_acquire+0xfc/0x2c0 [ 42.109734] _raw_spin_lock+0x60/0x74 [ 42.113561] mt76_get_rxwi+0x1c/0xac [mt76] [ 42.117929] mt7915_debugfs_rx_log+0x804/0xb74 [mt7915e] [ 42.123415] mtk_wed_start+0x970/0xaa0 [ 42.127328] mt7915_dma_start+0x26c/0x630 [mt7915e] [ 42.132379] mt7915_dma_start+0x5a4/0x630 [mt7915e] [ 42.137430] mt7915_dma_init+0x45c/0x7b0 [mt7915e] [ 42.142395] mt7915_register_device+0x310/0x620 [mt7915e] [ 42.147967] mt7915_mmio_probe+0xcec/0x1d44 [mt7915e] [ 42.153192] platform_probe+0x64/0xbc [ 42.157019] really_probe.part.0+0x98/0x2f4 [ 42.161367] __driver_probe_device+0x94/0x16c [ 42.165887] driver_probe_device+0x40/0x120 [ 42.170234] __driver_attach+0x94/0x190 [ 42.174235] bus_for_each_dev+0x5c/0x94 [ 42.178235] driver_attach+0x20/0x30 [ 42.181974] bus_add_driver+0x104/0x1f4 [ 42.185974] driver_register+0x74/0x120 [ 42.189974] __platform_driver_register+0x24/0x30 [ 42.194842] init_module+0x74/0x1000 [mt7915e] [ 42.199460] do_one_initcall+0x70/0x2cc [ 42.203460] do_init_module+0x44/0x240 [ 42.207376] load_module+0x1f5c/0x2874 [ 42.211290] __do_sys_init_module+0x1d8/0x2ac [ 42.215813] __arm64_sys_init_module+0x18/0x20 [ 42.220421] invoke_syscall.constprop.0+0x4c/0xe0 [ 42.225288] do_el0_svc+0x50/0xf0 [ 42.228768] el0_svc+0x4c/0xcc [ 42.231989] el0t_64_sync_handler+0xe0/0x110 [ 42.236424] el0t_64_sync+0x15c/0x160 [ 42.240249] [ 42.241730] [ 42.241730] stack backtrace: [ 42.246074] CPU: 1 PID: 2329 Comm: insmod Not tainted 5.15.127 #0 [ 42.252157] Hardware name: GainStrong Oolite-MT7981B V1 Dev Board (NAND boot) (DT) [ 42.259712] Call trace: [ 42.262147] dump_backtrace+0x0/0x174 [ 42.265802] show_stack+0x14/0x20 [ 42.269108] dump_stack_lvl+0x84/0xac [ 42.272761] dump_stack+0x14/0x2c [ 42.276066] print_irq_inversion_bug.part.0+0x1b0/0x1c4 [ 42.281285] mark_lock+0x8b8/0x8bc [ 42.284678] __lock_acquire+0x718/0x1df0 [ 42.288592] lock_acquire+0xfc/0x2c0 [ 42.292158] _raw_spin_lock+0x60/0x74 [ 42.295811] mt76_get_rxwi+0x1c/0xac [mt76] [ 42.300008] mt7915_debugfs_rx_log+0x804/0xb74 [mt7915e] [ 42.305320] mtk_wed_start+0x970/0xaa0 [ 42.309059] mt7915_dma_start+0x26c/0x630 [mt7915e] [ 42.313937] mt7915_dma_start+0x5a4/0x630 [mt7915e] [ 42.318815] mt7915_dma_init+0x45c/0x7b0 [mt7915e] [ 42.323606] mt7915_register_device+0x310/0x620 [mt7915e] [ 42.329005] mt7915_mmio_probe+0xcec/0x1d44 [mt7915e] [ 42.334056] platform_probe+0x64/0xbc [ 42.337711] really_probe.part.0+0x98/0x2f4 [ 42.341885] __driver_probe_device+0x94/0x16c [ 42.346232] driver_probe_device+0x40/0x120 [ 42.350407] __driver_attach+0x94/0x190 [ 42.354234] bus_for_each_dev+0x5c/0x94 [ 42.358061] driver_attach+0x20/0x30 [ 42.361627] bus_add_driver+0x104/0x1f4 [ 42.365454] driver_register+0x74/0x120 [ 42.369282] __platform_driver_register+0x24/0x30 [ 42.373977] init_module+0x74/0x1000 [mt7915e] [ 42.378423] do_one_initcall+0x70/0x2cc [ 42.382249] do_init_module+0x44/0x240 [ 42.385990] load_module+0x1f5c/0x2874 [ 42.389733] __do_sys_init_module+0x1d8/0x2ac [ 42.394082] __arm64_sys_init_module+0x18/0x20 [ 42.398518] invoke_syscall.constprop.0+0x4c/0xe0 [ 42.403211] do_el0_svc+0x50/0xf0 [ 42.406517] el0_svc+0x4c/0xcc [ 42.409565] el0t_64_sync_handler+0xe0/0x110 [ 42.413827] el0t_64_sync+0x15c/0x160 [ 42.674858] mt798x-wmac 18000000.wifi: HW/SW Version: 0x8a108a10, Build Time: 20221208201745a [ 42.674858] [ 42.692078] mt798x-wmac 18000000.wifi: WM Firmware Version: ____000000, Build Time: 20221208201806 [ 42.735606] mt798x-wmac 18000000.wifi: WA Firmware Version: DEV_000000, Build Time: 20221208202048
Tested-by: Daniel Golle daniel@makrotopia.org Fixes: 2666bece0905 ("wifi: mt76: introduce rxwi and rx token utility routines") Signed-off-by: Lorenzo Bianconi lorenzo@kernel.org Acked-by: Felix Fietkau nbd@nbd.name Signed-off-by: Kalle Valo kvalo@kernel.org Link: https://lore.kernel.org/r/ee80be41c2a8d8749d83c6950a272a5e77aadd45.169322833... Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/mediatek/mt76/dma.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/net/wireless/mediatek/mt76/dma.c b/drivers/net/wireless/mediatek/mt76/dma.c index 465190ebaf1c4..f539913aadf86 100644 --- a/drivers/net/wireless/mediatek/mt76/dma.c +++ b/drivers/net/wireless/mediatek/mt76/dma.c @@ -93,13 +93,13 @@ __mt76_get_rxwi(struct mt76_dev *dev) { struct mt76_txwi_cache *t = NULL;
- spin_lock(&dev->wed_lock); + spin_lock_bh(&dev->wed_lock); if (!list_empty(&dev->rxwi_cache)) { t = list_first_entry(&dev->rxwi_cache, struct mt76_txwi_cache, list); list_del(&t->list); } - spin_unlock(&dev->wed_lock); + spin_unlock_bh(&dev->wed_lock);
return t; } @@ -145,9 +145,9 @@ mt76_put_rxwi(struct mt76_dev *dev, struct mt76_txwi_cache *t) if (!t) return;
- spin_lock(&dev->wed_lock); + spin_lock_bh(&dev->wed_lock); list_add(&t->list, &dev->rxwi_cache); - spin_unlock(&dev->wed_lock); + spin_unlock_bh(&dev->wed_lock); } EXPORT_SYMBOL_GPL(mt76_put_rxwi);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Zhihao Cheng chengzhihao1@huawei.com
[ Upstream commit 017c73a34a661a861712f7cc1393a123e5b2208c ]
There exists mtd devices with zero erasesize, which will trigger a divide-by-zero exception while attaching ubi device. Fix it by refusing attaching if mtd's erasesize is 0.
Fixes: 801c135ce73d ("UBI: Unsorted Block Images") Reported-by: Yu Hao yhao016@ucr.edu Link: https://lore.kernel.org/lkml/977347543.226888.1682011999468.JavaMail.zimbra@... Signed-off-by: Zhihao Cheng chengzhihao1@huawei.com Reviewed-by: Miquel Raynal miquel.raynal@bootlin.com Signed-off-by: Richard Weinberger richard@nod.at Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/mtd/ubi/build.c | 7 +++++++ 1 file changed, 7 insertions(+)
diff --git a/drivers/mtd/ubi/build.c b/drivers/mtd/ubi/build.c index 8b91a55ec0d28..8ee51e49fced5 100644 --- a/drivers/mtd/ubi/build.c +++ b/drivers/mtd/ubi/build.c @@ -894,6 +894,13 @@ int ubi_attach_mtd_dev(struct mtd_info *mtd, int ubi_num, return -EINVAL; }
+ /* UBI cannot work on flashes with zero erasesize. */ + if (!mtd->erasesize) { + pr_err("ubi: refuse attaching mtd%d - zero erasesize flash is not supported\n", + mtd->index); + return -EINVAL; + } + if (ubi_num == UBI_DEV_NUM_AUTO) { /* Search for an empty slot in the @ubi_devices array */ for (ubi_num = 0; ubi_num < UBI_MAX_DEVICES; ubi_num++)
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Gao Xiang hsiangkao@linux.alibaba.com
[ Upstream commit 75a5221630fe5aa3fedba7a06be618db0f79ba1e ]
When stressing microLZMA EROFS images with the new global compressed deduplication feature enabled (`-Ededupe`), I found some short-lived temporary pages weren't properly released, which could slowly cause unexpected OOMs hours later.
Let's fix it now (LZ4 and DEFLATE don't have this issue.)
Fixes: 5c2a64252c5d ("erofs: introduce partial-referenced pclusters") Signed-off-by: Gao Xiang hsiangkao@linux.alibaba.com Link: https://lore.kernel.org/r/20230907050542.97152-1-hsiangkao@linux.alibaba.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/erofs/decompressor_lzma.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/fs/erofs/decompressor_lzma.c b/fs/erofs/decompressor_lzma.c index 73091fbe3ea45..dee10d22ada96 100644 --- a/fs/erofs/decompressor_lzma.c +++ b/fs/erofs/decompressor_lzma.c @@ -217,9 +217,12 @@ int z_erofs_lzma_decompress(struct z_erofs_decompress_req *rq, strm->buf.out_size = min_t(u32, outlen, PAGE_SIZE - pageofs); outlen -= strm->buf.out_size; - if (!rq->out[no] && rq->fillgaps) /* deduped */ + if (!rq->out[no] && rq->fillgaps) { /* deduped */ rq->out[no] = erofs_allocpage(pagepool, GFP_KERNEL | __GFP_NOFAIL); + set_page_private(rq->out[no], + Z_EROFS_SHORTLIVED_PAGE); + } if (rq->out[no]) strm->buf.out = kmap(rq->out[no]) + pageofs; pageofs = 0;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Wen Gong quic_wgong@quicinc.com
[ Upstream commit 234249d88b091d006b82f8d570343aae5f383736 ]
When connect to MLO AP with more than one link, and the assoc response of AP is not success, then cfg80211_unhold_bss() is not called for all the links' cfg80211_bss except the primary link which means the link used by the latest successful association request. Thus the hold value of the cfg80211_bss is not reset to 0 after the assoc fail, and then the __cfg80211_unlink_bss() will not be called for the cfg80211_bss by __cfg80211_bss_expire().
Then the AP always looks exist even the AP is shutdown or reconfigured to another type, then it will lead error while connecting it again.
The detail info are as below.
When connect with muti-links AP, cfg80211_hold_bss() is called by cfg80211_mlme_assoc() for each cfg80211_bss of all the links. When assoc response from AP is not success(such as status_code==1), the ieee80211_link_data of non-primary link(sdata->link[link_id]) is NULL because ieee80211_assoc_success()->ieee80211_vif_update_links() is not called for the links.
Then struct cfg80211_rx_assoc_resp resp in cfg80211_rx_assoc_resp() and struct cfg80211_connect_resp_params cr in __cfg80211_connect_result() will only have the data of the primary link, and finally function cfg80211_connect_result_release_bsses() only call cfg80211_unhold_bss() for the primary link. Then cfg80211_bss of the other links will never free because its hold is always > 0 now.
Hence assign value for the bss and status from assoc_data since it is valid for this case. Also assign value of addr from assoc_data when the link is NULL because the addrs of assoc_data and link both represent the local link addr and they are same value for success connection.
Fixes: 81151ce462e5 ("wifi: mac80211: support MLO authentication/association with one link") Signed-off-by: Wen Gong quic_wgong@quicinc.com Link: https://lore.kernel.org/r/20230825070055.28164-1-quic_wgong@quicinc.com Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- include/net/cfg80211.h | 2 +- net/mac80211/mlme.c | 11 ++++++----- net/wireless/mlme.c | 3 ++- 3 files changed, 9 insertions(+), 7 deletions(-)
diff --git a/include/net/cfg80211.h b/include/net/cfg80211.h index d6fa7c8767ad3..3f03f9b375e56 100644 --- a/include/net/cfg80211.h +++ b/include/net/cfg80211.h @@ -7232,7 +7232,7 @@ struct cfg80211_rx_assoc_resp { int uapsd_queues; const u8 *ap_mld_addr; struct { - const u8 *addr; + u8 addr[ETH_ALEN] __aligned(2); struct cfg80211_bss *bss; u16 status; } links[IEEE80211_MLD_MAX_NUM_LINKS]; diff --git a/net/mac80211/mlme.c b/net/mac80211/mlme.c index f93eb38ae0b8d..46d46cfab6c84 100644 --- a/net/mac80211/mlme.c +++ b/net/mac80211/mlme.c @@ -5429,17 +5429,18 @@ static void ieee80211_rx_mgmt_assoc_resp(struct ieee80211_sub_if_data *sdata, for (link_id = 0; link_id < IEEE80211_MLD_MAX_NUM_LINKS; link_id++) { struct ieee80211_link_data *link;
- link = sdata_dereference(sdata->link[link_id], sdata); - if (!link) - continue; - if (!assoc_data->link[link_id].bss) continue;
resp.links[link_id].bss = assoc_data->link[link_id].bss; - resp.links[link_id].addr = link->conf->addr; + ether_addr_copy(resp.links[link_id].addr, + assoc_data->link[link_id].addr); resp.links[link_id].status = assoc_data->link[link_id].status;
+ link = sdata_dereference(sdata->link[link_id], sdata); + if (!link) + continue; + /* get uapsd queues configuration - same for all links */ resp.uapsd_queues = 0; for (ac = 0; ac < IEEE80211_NUM_ACS; ac++) diff --git a/net/wireless/mlme.c b/net/wireless/mlme.c index 775cac4d61006..3e2c398abddcc 100644 --- a/net/wireless/mlme.c +++ b/net/wireless/mlme.c @@ -52,7 +52,8 @@ void cfg80211_rx_assoc_resp(struct net_device *dev, cr.links[link_id].bssid = data->links[link_id].bss->bssid; cr.links[link_id].addr = data->links[link_id].addr; /* need to have local link addresses for MLO connections */ - WARN_ON(cr.ap_mld_addr && !cr.links[link_id].addr); + WARN_ON(cr.ap_mld_addr && + !is_valid_ether_addr(cr.links[link_id].addr));
BUG_ON(!cr.links[link_id].bss->channel);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Gregory Greenman gregory.greenman@intel.com
[ Upstream commit 2d4caa1dbe915654d0e8845758d9c96e721377a8 ]
Handling of BSS_CHANGED_PS was missing in vif_cfg_changed callback. Fix it.
Fixes: 22c588343529 ("wifi: iwlwifi: mvm: replace bss_info_changed() with vif_cfg/link_info_changed()") Reported-by: Sultan Alsawaf sultan@kerneltoast.com Signed-off-by: Gregory Greenman gregory.greenman@intel.com Link: https://lore.kernel.org/r/20230905162939.5ef0c8230de6.Ieed265014988c50ec68fb... [note: patch looks bigger than it is due to reindentation] Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- .../wireless/intel/iwlwifi/mvm/mld-mac80211.c | 121 +++++++++--------- 1 file changed, 63 insertions(+), 58 deletions(-)
diff --git a/drivers/net/wireless/intel/iwlwifi/mvm/mld-mac80211.c b/drivers/net/wireless/intel/iwlwifi/mvm/mld-mac80211.c index 8b6c641772ee6..b719843e94576 100644 --- a/drivers/net/wireless/intel/iwlwifi/mvm/mld-mac80211.c +++ b/drivers/net/wireless/intel/iwlwifi/mvm/mld-mac80211.c @@ -731,73 +731,78 @@ static void iwl_mvm_mld_vif_cfg_changed_station(struct iwl_mvm *mvm,
mvmvif->associated = vif->cfg.assoc;
- if (!(changes & BSS_CHANGED_ASSOC)) - return; - - if (vif->cfg.assoc) { - /* clear statistics to get clean beacon counter */ - iwl_mvm_request_statistics(mvm, true); - iwl_mvm_sf_update(mvm, vif, false); - iwl_mvm_power_vif_assoc(mvm, vif); - - for_each_mvm_vif_valid_link(mvmvif, i) { - memset(&mvmvif->link[i]->beacon_stats, 0, - sizeof(mvmvif->link[i]->beacon_stats)); + if (changes & BSS_CHANGED_ASSOC) { + if (vif->cfg.assoc) { + /* clear statistics to get clean beacon counter */ + iwl_mvm_request_statistics(mvm, true); + iwl_mvm_sf_update(mvm, vif, false); + iwl_mvm_power_vif_assoc(mvm, vif); + + for_each_mvm_vif_valid_link(mvmvif, i) { + memset(&mvmvif->link[i]->beacon_stats, 0, + sizeof(mvmvif->link[i]->beacon_stats)); + + if (vif->p2p) { + iwl_mvm_update_smps(mvm, vif, + IWL_MVM_SMPS_REQ_PROT, + IEEE80211_SMPS_DYNAMIC, i); + } + + rcu_read_lock(); + link_conf = rcu_dereference(vif->link_conf[i]); + if (link_conf && !link_conf->dtim_period) + protect = true; + rcu_read_unlock(); + }
- if (vif->p2p) { - iwl_mvm_update_smps(mvm, vif, - IWL_MVM_SMPS_REQ_PROT, - IEEE80211_SMPS_DYNAMIC, i); + if (!test_bit(IWL_MVM_STATUS_IN_HW_RESTART, &mvm->status) && + protect) { + /* If we're not restarting and still haven't + * heard a beacon (dtim period unknown) then + * make sure we still have enough minimum time + * remaining in the time event, since the auth + * might actually have taken quite a while + * (especially for SAE) and so the remaining + * time could be small without us having heard + * a beacon yet. + */ + iwl_mvm_protect_assoc(mvm, vif, 0); }
- rcu_read_lock(); - link_conf = rcu_dereference(vif->link_conf[i]); - if (link_conf && !link_conf->dtim_period) - protect = true; - rcu_read_unlock(); - } + iwl_mvm_sf_update(mvm, vif, false); + + /* FIXME: need to decide about misbehaving AP handling */ + iwl_mvm_power_vif_assoc(mvm, vif); + } else if (iwl_mvm_mld_vif_have_valid_ap_sta(mvmvif)) { + iwl_mvm_mei_host_disassociated(mvm);
- if (!test_bit(IWL_MVM_STATUS_IN_HW_RESTART, &mvm->status) && - protect) { - /* If we're not restarting and still haven't - * heard a beacon (dtim period unknown) then - * make sure we still have enough minimum time - * remaining in the time event, since the auth - * might actually have taken quite a while - * (especially for SAE) and so the remaining - * time could be small without us having heard - * a beacon yet. + /* If update fails - SF might be running in associated + * mode while disassociated - which is forbidden. */ - iwl_mvm_protect_assoc(mvm, vif, 0); + ret = iwl_mvm_sf_update(mvm, vif, false); + WARN_ONCE(ret && + !test_bit(IWL_MVM_STATUS_HW_RESTART_REQUESTED, + &mvm->status), + "Failed to update SF upon disassociation\n"); + + /* If we get an assert during the connection (after the + * station has been added, but before the vif is set + * to associated), mac80211 will re-add the station and + * then configure the vif. Since the vif is not + * associated, we would remove the station here and + * this would fail the recovery. + */ + iwl_mvm_mld_vif_delete_all_stas(mvm, vif); }
- iwl_mvm_sf_update(mvm, vif, false); - - /* FIXME: need to decide about misbehaving AP handling */ - iwl_mvm_power_vif_assoc(mvm, vif); - } else if (iwl_mvm_mld_vif_have_valid_ap_sta(mvmvif)) { - iwl_mvm_mei_host_disassociated(mvm); - - /* If update fails - SF might be running in associated - * mode while disassociated - which is forbidden. - */ - ret = iwl_mvm_sf_update(mvm, vif, false); - WARN_ONCE(ret && - !test_bit(IWL_MVM_STATUS_HW_RESTART_REQUESTED, - &mvm->status), - "Failed to update SF upon disassociation\n"); - - /* If we get an assert during the connection (after the - * station has been added, but before the vif is set - * to associated), mac80211 will re-add the station and - * then configure the vif. Since the vif is not - * associated, we would remove the station here and - * this would fail the recovery. - */ - iwl_mvm_mld_vif_delete_all_stas(mvm, vif); + iwl_mvm_bss_info_changed_station_assoc(mvm, vif, changes); }
- iwl_mvm_bss_info_changed_station_assoc(mvm, vif, changes); + if (changes & BSS_CHANGED_PS) { + ret = iwl_mvm_power_update_mac(mvm); + if (ret) + IWL_ERR(mvm, "failed to update power mode\n"); + } }
static void
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Arnd Bergmann arnd@arndb.de
[ Upstream commit 424c82e8ad56756bb98b08268ffcf68d12d183eb ]
The iwl_fw_ini_error_dump_range structure has conflicting alignment requirements for the inner union and the outer struct:
In file included from drivers/net/wireless/intel/iwlwifi/fw/dbg.c:9: drivers/net/wireless/intel/iwlwifi/fw/error-dump.h:312:2: error: field within 'struct iwl_fw_ini_error_dump_range' is less aligned than 'union iwl_fw_ini_error_dump_range::(anonymous at drivers/net/wireless/intel/iwlwifi/fw/error-dump.h:312:2)' and is usually due to 'struct iwl_fw_ini_error_dump_range' being packed, which can lead to unaligned accesses [-Werror,-Wunaligned-access] union {
As the original intention was apparently to make the entire structure unaligned, mark the innermost members the same way so the union becomes packed as well.
Fixes: 973193554cae6 ("iwlwifi: dbg_ini: dump headers cleanup") Signed-off-by: Arnd Bergmann arnd@arndb.de Acked-by: Gregory Greenman gregory.greenman@intel.com Link: https://lore.kernel.org/r/20230616090343.2454061-1-arnd@kernel.org Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/intel/iwlwifi/fw/error-dump.h | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/net/wireless/intel/iwlwifi/fw/error-dump.h b/drivers/net/wireless/intel/iwlwifi/fw/error-dump.h index f5e08988dc7bf..06d6f7f664308 100644 --- a/drivers/net/wireless/intel/iwlwifi/fw/error-dump.h +++ b/drivers/net/wireless/intel/iwlwifi/fw/error-dump.h @@ -310,9 +310,9 @@ struct iwl_fw_ini_fifo_hdr { struct iwl_fw_ini_error_dump_range { __le32 range_data_size; union { - __le32 internal_base_addr; - __le64 dram_base_addr; - __le32 page_num; + __le32 internal_base_addr __packed; + __le64 dram_base_addr __packed; + __le32 page_num __packed; struct iwl_fw_ini_fifo_hdr fifo_hdr; struct iwl_cmd_header fw_pkt_hdr; };
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Christophe JAILLET christophe.jaillet@wanadoo.fr
[ Upstream commit 8ba438ef3cacc4808a63ed0ce24d4f0942cfe55d ]
A few lines above, space is kzalloc()'ed for: sizeof(struct iwl_nvm_data) + sizeof(struct ieee80211_channel) + sizeof(struct ieee80211_rate)
'mvm->nvm_data' is a 'struct iwl_nvm_data', so it is fine.
At the end of this structure, there is the 'channels' flex array. Each element is of type 'struct ieee80211_channel'. So only 1 element is allocated in this array.
When doing: mvm->nvm_data->bands[0].channels = mvm->nvm_data->channels; We point at the first element of the 'channels' flex array. So this is fine.
However, when doing: mvm->nvm_data->bands[0].bitrates = (void *)((u8 *)mvm->nvm_data->channels + 1); because of the "(u8 *)" cast, we add only 1 to the address of the beginning of the flex array.
It is likely that we want point at the 'struct ieee80211_rate' allocated just after.
Remove the spurious casting so that the pointer arithmetic works as expected.
Fixes: 8ca151b568b6 ("iwlwifi: add the MVM driver") Signed-off-by: Christophe JAILLET christophe.jaillet@wanadoo.fr Acked-by: Gregory Greenman gregory.greenman@intel.com Link: https://lore.kernel.org/r/23f0ec986ef1529055f4f93dcb3940a6cf8d9a94.169014375... Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/intel/iwlwifi/mvm/fw.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/wireless/intel/iwlwifi/mvm/fw.c b/drivers/net/wireless/intel/iwlwifi/mvm/fw.c index 1f5db65a088d3..1d5ee4330f29f 100644 --- a/drivers/net/wireless/intel/iwlwifi/mvm/fw.c +++ b/drivers/net/wireless/intel/iwlwifi/mvm/fw.c @@ -802,7 +802,7 @@ int iwl_run_init_mvm_ucode(struct iwl_mvm *mvm) mvm->nvm_data->bands[0].n_channels = 1; mvm->nvm_data->bands[0].n_bitrates = 1; mvm->nvm_data->bands[0].bitrates = - (void *)((u8 *)mvm->nvm_data->channels + 1); + (void *)(mvm->nvm_data->channels + 1); mvm->nvm_data->bands[0].bitrates->hw_value = 10; }
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Johannes Berg johannes.berg@intel.com
[ Upstream commit 37c20b2effe987b806c8de6d12978e4ffeff026f ]
Max Schulze reports crashes with brcmfmac. The reason seems to be a race between userspace removing the CQM config and the driver calling cfg80211_cqm_rssi_notify(), where if the data is freed while cfg80211_cqm_rssi_notify() runs it will crash since it assumes wdev->cqm_config is set. This can't be fixed with a simple non-NULL check since there's nothing we can do for locking easily, so use RCU instead to protect the pointer, but that requires pulling the updates out into an asynchronous worker so they can sleep and call back into the driver.
Since we need to change the free anyway, also change it to go back to the old settings if changing the settings fails.
Reported-and-tested-by: Max Schulze max.schulze@online.de Closes: https://lore.kernel.org/r/ac96309a-8d8d-4435-36e6-6d152eb31876@online.de Fixes: 4a4b8169501b ("cfg80211: Accept multiple RSSI thresholds for CQM") Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- include/net/cfg80211.h | 3 +- net/wireless/core.c | 14 +++---- net/wireless/core.h | 7 +++- net/wireless/nl80211.c | 93 +++++++++++++++++++++++++++--------------- 4 files changed, 75 insertions(+), 42 deletions(-)
diff --git a/include/net/cfg80211.h b/include/net/cfg80211.h index 3f03f9b375e56..0debc3c9364e8 100644 --- a/include/net/cfg80211.h +++ b/include/net/cfg80211.h @@ -6014,7 +6014,8 @@ struct wireless_dev { } wext; #endif
- struct cfg80211_cqm_config *cqm_config; + struct wiphy_work cqm_rssi_work; + struct cfg80211_cqm_config __rcu *cqm_config;
struct list_head pmsr_list; spinlock_t pmsr_lock; diff --git a/net/wireless/core.c b/net/wireless/core.c index 25bc2e50a0615..64e8616171104 100644 --- a/net/wireless/core.c +++ b/net/wireless/core.c @@ -1181,16 +1181,11 @@ void wiphy_rfkill_set_hw_state_reason(struct wiphy *wiphy, bool blocked, } EXPORT_SYMBOL(wiphy_rfkill_set_hw_state_reason);
-void cfg80211_cqm_config_free(struct wireless_dev *wdev) -{ - kfree(wdev->cqm_config); - wdev->cqm_config = NULL; -} - static void _cfg80211_unregister_wdev(struct wireless_dev *wdev, bool unregister_netdev) { struct cfg80211_registered_device *rdev = wiphy_to_rdev(wdev->wiphy); + struct cfg80211_cqm_config *cqm_config; unsigned int link_id;
ASSERT_RTNL(); @@ -1227,7 +1222,10 @@ static void _cfg80211_unregister_wdev(struct wireless_dev *wdev, kfree_sensitive(wdev->wext.keys); wdev->wext.keys = NULL; #endif - cfg80211_cqm_config_free(wdev); + wiphy_work_cancel(wdev->wiphy, &wdev->cqm_rssi_work); + /* deleted from the list, so can't be found from nl80211 any more */ + cqm_config = rcu_access_pointer(wdev->cqm_config); + kfree_rcu(cqm_config, rcu_head);
/* * Ensure that all events have been processed and @@ -1379,6 +1377,8 @@ void cfg80211_init_wdev(struct wireless_dev *wdev) wdev->wext.connect.auth_type = NL80211_AUTHTYPE_AUTOMATIC; #endif
+ wiphy_work_init(&wdev->cqm_rssi_work, cfg80211_cqm_rssi_notify_work); + if (wdev->wiphy->flags & WIPHY_FLAG_PS_ON_BY_DEFAULT) wdev->ps = true; else diff --git a/net/wireless/core.h b/net/wireless/core.h index 8a807b609ef73..86f209abc06ab 100644 --- a/net/wireless/core.h +++ b/net/wireless/core.h @@ -295,12 +295,17 @@ struct cfg80211_beacon_registration { };
struct cfg80211_cqm_config { + struct rcu_head rcu_head; u32 rssi_hyst; s32 last_rssi_event_value; + enum nl80211_cqm_rssi_threshold_event last_rssi_event_type; int n_rssi_thresholds; s32 rssi_thresholds[]; };
+void cfg80211_cqm_rssi_notify_work(struct wiphy *wiphy, + struct wiphy_work *work); + void cfg80211_destroy_ifaces(struct cfg80211_registered_device *rdev);
/* free object */ @@ -566,8 +571,6 @@ cfg80211_bss_update(struct cfg80211_registered_device *rdev, #define CFG80211_DEV_WARN_ON(cond) ({bool __r = (cond); __r; }) #endif
-void cfg80211_cqm_config_free(struct wireless_dev *wdev); - void cfg80211_release_pmsr(struct wireless_dev *wdev, u32 portid); void cfg80211_pmsr_wdev_down(struct wireless_dev *wdev); void cfg80211_pmsr_free_wk(struct work_struct *work); diff --git a/net/wireless/nl80211.c b/net/wireless/nl80211.c index 4dcbc40d07c85..705d1cf048309 100644 --- a/net/wireless/nl80211.c +++ b/net/wireless/nl80211.c @@ -12797,7 +12797,8 @@ static int nl80211_set_cqm_txe(struct genl_info *info, }
static int cfg80211_cqm_rssi_update(struct cfg80211_registered_device *rdev, - struct net_device *dev) + struct net_device *dev, + struct cfg80211_cqm_config *cqm_config) { struct wireless_dev *wdev = dev->ieee80211_ptr; s32 last, low, high; @@ -12806,7 +12807,7 @@ static int cfg80211_cqm_rssi_update(struct cfg80211_registered_device *rdev, int err;
/* RSSI reporting disabled? */ - if (!wdev->cqm_config) + if (!cqm_config) return rdev_set_cqm_rssi_range_config(rdev, dev, 0, 0);
/* @@ -12815,7 +12816,7 @@ static int cfg80211_cqm_rssi_update(struct cfg80211_registered_device *rdev, * connection is established and enough beacons received to calculate * the average. */ - if (!wdev->cqm_config->last_rssi_event_value && + if (!cqm_config->last_rssi_event_value && wdev->links[0].client.current_bss && rdev->ops->get_station) { struct station_info sinfo = {}; @@ -12829,30 +12830,30 @@ static int cfg80211_cqm_rssi_update(struct cfg80211_registered_device *rdev,
cfg80211_sinfo_release_content(&sinfo); if (sinfo.filled & BIT_ULL(NL80211_STA_INFO_BEACON_SIGNAL_AVG)) - wdev->cqm_config->last_rssi_event_value = + cqm_config->last_rssi_event_value = (s8) sinfo.rx_beacon_signal_avg; }
- last = wdev->cqm_config->last_rssi_event_value; - hyst = wdev->cqm_config->rssi_hyst; - n = wdev->cqm_config->n_rssi_thresholds; + last = cqm_config->last_rssi_event_value; + hyst = cqm_config->rssi_hyst; + n = cqm_config->n_rssi_thresholds;
for (i = 0; i < n; i++) { i = array_index_nospec(i, n); - if (last < wdev->cqm_config->rssi_thresholds[i]) + if (last < cqm_config->rssi_thresholds[i]) break; }
low_index = i - 1; if (low_index >= 0) { low_index = array_index_nospec(low_index, n); - low = wdev->cqm_config->rssi_thresholds[low_index] - hyst; + low = cqm_config->rssi_thresholds[low_index] - hyst; } else { low = S32_MIN; } if (i < n) { i = array_index_nospec(i, n); - high = wdev->cqm_config->rssi_thresholds[i] + hyst - 1; + high = cqm_config->rssi_thresholds[i] + hyst - 1; } else { high = S32_MAX; } @@ -12865,6 +12866,7 @@ static int nl80211_set_cqm_rssi(struct genl_info *info, u32 hysteresis) { struct cfg80211_registered_device *rdev = info->user_ptr[0]; + struct cfg80211_cqm_config *cqm_config = NULL, *old; struct net_device *dev = info->user_ptr[1]; struct wireless_dev *wdev = dev->ieee80211_ptr; int i, err; @@ -12882,10 +12884,6 @@ static int nl80211_set_cqm_rssi(struct genl_info *info, wdev->iftype != NL80211_IFTYPE_P2P_CLIENT) return -EOPNOTSUPP;
- wdev_lock(wdev); - cfg80211_cqm_config_free(wdev); - wdev_unlock(wdev); - if (n_thresholds <= 1 && rdev->ops->set_cqm_rssi_config) { if (n_thresholds == 0 || thresholds[0] == 0) /* Disabling */ return rdev_set_cqm_rssi_config(rdev, dev, 0, 0); @@ -12902,9 +12900,10 @@ static int nl80211_set_cqm_rssi(struct genl_info *info, n_thresholds = 0;
wdev_lock(wdev); - if (n_thresholds) { - struct cfg80211_cqm_config *cqm_config; + old = rcu_dereference_protected(wdev->cqm_config, + lockdep_is_held(&wdev->mtx));
+ if (n_thresholds) { cqm_config = kzalloc(struct_size(cqm_config, rssi_thresholds, n_thresholds), GFP_KERNEL); @@ -12919,11 +12918,18 @@ static int nl80211_set_cqm_rssi(struct genl_info *info, flex_array_size(cqm_config, rssi_thresholds, n_thresholds));
- wdev->cqm_config = cqm_config; + rcu_assign_pointer(wdev->cqm_config, cqm_config); + } else { + RCU_INIT_POINTER(wdev->cqm_config, NULL); }
- err = cfg80211_cqm_rssi_update(rdev, dev); - + err = cfg80211_cqm_rssi_update(rdev, dev, cqm_config); + if (err) { + rcu_assign_pointer(wdev->cqm_config, old); + kfree_rcu(cqm_config, rcu_head); + } else { + kfree_rcu(old, rcu_head); + } unlock: wdev_unlock(wdev);
@@ -19074,9 +19080,8 @@ void cfg80211_cqm_rssi_notify(struct net_device *dev, enum nl80211_cqm_rssi_threshold_event rssi_event, s32 rssi_level, gfp_t gfp) { - struct sk_buff *msg; struct wireless_dev *wdev = dev->ieee80211_ptr; - struct cfg80211_registered_device *rdev = wiphy_to_rdev(wdev->wiphy); + struct cfg80211_cqm_config *cqm_config;
trace_cfg80211_cqm_rssi_notify(dev, rssi_event, rssi_level);
@@ -19084,18 +19089,41 @@ void cfg80211_cqm_rssi_notify(struct net_device *dev, rssi_event != NL80211_CQM_RSSI_THRESHOLD_EVENT_HIGH)) return;
- if (wdev->cqm_config) { - wdev->cqm_config->last_rssi_event_value = rssi_level; + rcu_read_lock(); + cqm_config = rcu_dereference(wdev->cqm_config); + if (cqm_config) { + cqm_config->last_rssi_event_value = rssi_level; + cqm_config->last_rssi_event_type = rssi_event; + wiphy_work_queue(wdev->wiphy, &wdev->cqm_rssi_work); + } + rcu_read_unlock(); +} +EXPORT_SYMBOL(cfg80211_cqm_rssi_notify); + +void cfg80211_cqm_rssi_notify_work(struct wiphy *wiphy, struct wiphy_work *work) +{ + struct wireless_dev *wdev = container_of(work, struct wireless_dev, + cqm_rssi_work); + struct cfg80211_registered_device *rdev = wiphy_to_rdev(wiphy); + enum nl80211_cqm_rssi_threshold_event rssi_event; + struct cfg80211_cqm_config *cqm_config; + struct sk_buff *msg; + s32 rssi_level;
- cfg80211_cqm_rssi_update(rdev, dev); + wdev_lock(wdev); + cqm_config = rcu_dereference_protected(wdev->cqm_config, + lockdep_is_held(&wdev->mtx)); + if (!wdev->cqm_config) + goto unlock;
- if (rssi_level == 0) - rssi_level = wdev->cqm_config->last_rssi_event_value; - } + cfg80211_cqm_rssi_update(rdev, wdev->netdev, cqm_config);
- msg = cfg80211_prepare_cqm(dev, NULL, gfp); + rssi_level = cqm_config->last_rssi_event_value; + rssi_event = cqm_config->last_rssi_event_type; + + msg = cfg80211_prepare_cqm(wdev->netdev, NULL, GFP_KERNEL); if (!msg) - return; + goto unlock;
if (nla_put_u32(msg, NL80211_ATTR_CQM_RSSI_THRESHOLD_EVENT, rssi_event)) @@ -19105,14 +19133,15 @@ void cfg80211_cqm_rssi_notify(struct net_device *dev, rssi_level)) goto nla_put_failure;
- cfg80211_send_cqm(msg, gfp); + cfg80211_send_cqm(msg, GFP_KERNEL);
- return; + goto unlock;
nla_put_failure: nlmsg_free(msg); + unlock: + wdev_unlock(wdev); } -EXPORT_SYMBOL(cfg80211_cqm_rssi_notify);
void cfg80211_cqm_txe_notify(struct net_device *dev, const u8 *peer, u32 num_packets,
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Daniel Bristot de Oliveira bristot@kernel.org
[ Upstream commit 02d89917ef68acbe65c7cc2323f1db4429879878 ]
The thread thread_thread_sum accounts for thread interference during a single activation. It was not being zeroed, so it was accumulating thread interference over all activations.
It was not that visible when timerlat was the highest priority.
Link: https://lore.kernel.org/lkml/97bff55b0141f2d01b47d9450a5672fde147b89a.169116...
Fixes: 27e348b221f6 ("rtla/timerlat: Add auto-analysis core") Signed-off-by: Daniel Bristot de Oliveira bristot@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- tools/tracing/rtla/src/timerlat_aa.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/tools/tracing/rtla/src/timerlat_aa.c b/tools/tracing/rtla/src/timerlat_aa.c index e0ffe69c271c6..dec5b4c4511e1 100644 --- a/tools/tracing/rtla/src/timerlat_aa.c +++ b/tools/tracing/rtla/src/timerlat_aa.c @@ -159,6 +159,7 @@ static int timerlat_aa_irq_latency(struct timerlat_aa_data *taa_data, taa_data->thread_nmi_sum = 0; taa_data->thread_irq_sum = 0; taa_data->thread_softirq_sum = 0; + taa_data->thread_thread_sum = 0; taa_data->thread_blocking_duration = 0; taa_data->timer_irq_start_time = 0; taa_data->timer_irq_duration = 0;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Daniel Bristot de Oliveira bristot@kernel.org
[ Upstream commit 6c73daf26420b97fb8b4a620e4ffee5c1f9d44d1 ]
When estimating the IRQ timer delay, we are dealing with two different clock sources: the external clock source that timerlat uses as a reference and the clock used by the tracer. There are also two moments: the time reading the clock and the timer in which the event is placed in the buffer (the trace event timestamp).
If the processor is slow or there is some hardware noise, the difference between the timestamp and the external clock, read can be longer than the IRQ handler delay, resulting in a negative time.
If so, set IRQ to start delay as 0. In the end, it is less near-zero and relevant then the noise.
Link: https://lore.kernel.org/lkml/a066fb667c7136d86dcddb3c7ccd72587db3e7c7.169116...
Fixes: 27e348b221f6 ("rtla/timerlat: Add auto-analysis core") Signed-off-by: Daniel Bristot de Oliveira bristot@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- tools/tracing/rtla/src/timerlat_aa.c | 18 +++++++++++++++++- 1 file changed, 17 insertions(+), 1 deletion(-)
diff --git a/tools/tracing/rtla/src/timerlat_aa.c b/tools/tracing/rtla/src/timerlat_aa.c index dec5b4c4511e1..baf1efda0581d 100644 --- a/tools/tracing/rtla/src/timerlat_aa.c +++ b/tools/tracing/rtla/src/timerlat_aa.c @@ -338,7 +338,23 @@ static int timerlat_aa_irq_handler(struct trace_seq *s, struct tep_record *recor taa_data->timer_irq_start_time = start; taa_data->timer_irq_duration = duration;
- taa_data->timer_irq_start_delay = taa_data->timer_irq_start_time - expected_start; + /* + * We are dealing with two different clock sources: the + * external clock source that timerlat uses as a reference + * and the clock used by the tracer. There are also two + * moments: the time reading the clock and the timer in + * which the event is placed in the buffer (the trace + * event timestamp). If the processor is slow or there + * is some hardware noise, the difference between the + * timestamp and the external clock read can be longer + * than the IRQ handler delay, resulting in a negative + * time. If so, set IRQ start delay as 0. In the end, + * it is less relevant than the noise. + */ + if (expected_start < taa_data->timer_irq_start_time) + taa_data->timer_irq_start_delay = taa_data->timer_irq_start_time - expected_start; + else + taa_data->timer_irq_start_delay = 0;
/* * not exit from idle.
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Daniel Bristot de Oliveira bristot@kernel.org
[ Upstream commit 301deca09b254965661d3e971f1a60ac2ce41f5f ]
timerlat auto-analysis takes note of all IRQs, before or after the execution of the timerlat thread.
Because we cannot go backward in the trace (we will fix it when moving to trace-cmd lib?), timerlat aa take note of the last IRQ execution in the waiting for the IRQ state, and then print it if it is executed after the expected timer IRQ starting time.
After the thread sample, the timerlat starts recording the next IRQs as "previous" irq for the next occurrence.
However, if an IRQ happens after the thread measurement but before the tracing stops, it is classified as a previous IRQ. That is not wrong, as it can be "previous" for the subsequent activation. What is wrong is considering it as a potential source for the last activation.
Ignore the IRQ interference that happens after the IRQ starting time for now. A future improvement for timerlat can be either keeping a list of previous IRQ execution or using the trace-cmd library. Still, it requires further investigation - it is a new feature.
Link: https://lore.kernel.org/lkml/a44a3f5c801dcc697bacf7325b65d4a5b0460537.169116...
Fixes: 27e348b221f6 ("rtla/timerlat: Add auto-analysis core") Signed-off-by: Daniel Bristot de Oliveira bristot@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- tools/tracing/rtla/src/timerlat_aa.c | 13 ++++++++----- 1 file changed, 8 insertions(+), 5 deletions(-)
diff --git a/tools/tracing/rtla/src/timerlat_aa.c b/tools/tracing/rtla/src/timerlat_aa.c index baf1efda0581d..7093fd5333beb 100644 --- a/tools/tracing/rtla/src/timerlat_aa.c +++ b/tools/tracing/rtla/src/timerlat_aa.c @@ -545,7 +545,7 @@ static int timerlat_aa_kworker_start_handler(struct trace_seq *s, struct tep_rec static void timerlat_thread_analysis(struct timerlat_aa_data *taa_data, int cpu, int irq_thresh, int thread_thresh) { - unsigned long long exp_irq_ts; + long long exp_irq_ts; int total; int irq;
@@ -562,12 +562,15 @@ static void timerlat_thread_analysis(struct timerlat_aa_data *taa_data, int cpu,
/* * Expected IRQ arrival time using the trace clock as the base. + * + * TODO: Add a list of previous IRQ, and then run the list backwards. */ exp_irq_ts = taa_data->timer_irq_start_time - taa_data->timer_irq_start_delay; - - if (exp_irq_ts < taa_data->prev_irq_timstamp + taa_data->prev_irq_duration) - printf(" Previous IRQ interference: \t\t up to %9.2f us\n", - ns_to_usf(taa_data->prev_irq_duration)); + if (exp_irq_ts < taa_data->prev_irq_timstamp + taa_data->prev_irq_duration) { + if (taa_data->prev_irq_timstamp < taa_data->timer_irq_start_time) + printf(" Previous IRQ interference: \t\t up to %9.2f us\n", + ns_to_usf(taa_data->prev_irq_duration)); + }
/* * The delay that the IRQ suffered before starting.
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Johannes Berg johannes.berg@intel.com
[ Upstream commit d1383077c225ceb87ac7a3b56b2c505193f77ed7 ]
As reported by Stephen, I neglected to add the kernel-doc for the new struct member. Fix that.
Reported-by: Stephen Rothwell sfr@canb.auug.org.au Fixes: 37c20b2effe9 ("wifi: cfg80211: fix cqm_config access race") Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- include/net/cfg80211.h | 1 + 1 file changed, 1 insertion(+)
diff --git a/include/net/cfg80211.h b/include/net/cfg80211.h index 0debc3c9364e8..641c6edc9b81d 100644 --- a/include/net/cfg80211.h +++ b/include/net/cfg80211.h @@ -5942,6 +5942,7 @@ void wiphy_delayed_work_cancel(struct wiphy *wiphy, * @event_lock: (private) lock for event list * @owner_nlportid: (private) owner socket port ID * @nl_owner_dead: (private) owner socket went away + * @cqm_rssi_work: (private) CQM RSSI reporting work * @cqm_config: (private) nl80211 RSSI monitor state * @pmsr_list: (private) peer measurement requests * @pmsr_lock: (private) peer measurements requests/results lock
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Felix Fietkau nbd@nbd.name
[ Upstream commit 6e48ebffc2db5419b3a51cfc509bde442252b356 ]
Since the changed field size was increased to u64, mesh_bss_info_changed pulls invalid bits from the first 3 bytes of the mesh id, clears them, and passes them on to ieee80211_link_info_change_notify, because ifmsh->mbss_changed was not updated to match its size. Fix this by turning into ifmsh->mbss_changed into an unsigned long array with 64 bit size.
Fixes: 15ddba5f4311 ("wifi: mac80211: consistently use u64 for BSS changes") Reported-by: Thomas Hühn thomas.huehn@hs-nordhausen.de Signed-off-by: Felix Fietkau nbd@nbd.name Link: https://lore.kernel.org/r/20230913050134.53536-1-nbd@nbd.name Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/mac80211/ieee80211_i.h | 2 +- net/mac80211/mesh.c | 8 ++++---- 2 files changed, 5 insertions(+), 5 deletions(-)
diff --git a/net/mac80211/ieee80211_i.h b/net/mac80211/ieee80211_i.h index 91633a0b723e0..f8cd94ba55ccc 100644 --- a/net/mac80211/ieee80211_i.h +++ b/net/mac80211/ieee80211_i.h @@ -676,7 +676,7 @@ struct ieee80211_if_mesh { struct timer_list mesh_path_root_timer;
unsigned long wrkq_flags; - unsigned long mbss_changed; + unsigned long mbss_changed[64 / BITS_PER_LONG];
bool userspace_handles_dfs;
diff --git a/net/mac80211/mesh.c b/net/mac80211/mesh.c index af8c5fc2db149..e31c312c124a1 100644 --- a/net/mac80211/mesh.c +++ b/net/mac80211/mesh.c @@ -1175,7 +1175,7 @@ void ieee80211_mbss_info_change_notify(struct ieee80211_sub_if_data *sdata,
/* if we race with running work, worst case this work becomes a noop */ for_each_set_bit(bit, &bits, sizeof(changed) * BITS_PER_BYTE) - set_bit(bit, &ifmsh->mbss_changed); + set_bit(bit, ifmsh->mbss_changed); set_bit(MESH_WORK_MBSS_CHANGED, &ifmsh->wrkq_flags); wiphy_work_queue(sdata->local->hw.wiphy, &sdata->work); } @@ -1257,7 +1257,7 @@ void ieee80211_stop_mesh(struct ieee80211_sub_if_data *sdata)
/* clear any mesh work (for next join) we may have accrued */ ifmsh->wrkq_flags = 0; - ifmsh->mbss_changed = 0; + memset(ifmsh->mbss_changed, 0, sizeof(ifmsh->mbss_changed));
local->fif_other_bss--; atomic_dec(&local->iff_allmultis); @@ -1724,9 +1724,9 @@ static void mesh_bss_info_changed(struct ieee80211_sub_if_data *sdata) u32 bit; u64 changed = 0;
- for_each_set_bit(bit, &ifmsh->mbss_changed, + for_each_set_bit(bit, ifmsh->mbss_changed, sizeof(changed) * BITS_PER_BYTE) { - clear_bit(bit, &ifmsh->mbss_changed); + clear_bit(bit, ifmsh->mbss_changed); changed |= BIT(bit); }
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pin-yen Lin treapking@chromium.org
[ Upstream commit aef7a0300047e7b4707ea0411dc9597cba108fc8 ]
Only skip the code path trying to access the rfc1042 headers when the buffer is too small, so the driver can still process packets without rfc1042 headers.
Fixes: 119585281617 ("wifi: mwifiex: Fix OOB and integer underflow when rx packets") Signed-off-by: Pin-yen Lin treapking@chromium.org Acked-by: Brian Norris briannorris@chromium.org Reviewed-by: Matthew Wang matthewmwang@chromium.org Signed-off-by: Kalle Valo kvalo@kernel.org Link: https://lore.kernel.org/r/20230908104308.1546501-1-treapking@chromium.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/marvell/mwifiex/sta_rx.c | 16 +++++++++------- 1 file changed, 9 insertions(+), 7 deletions(-)
diff --git a/drivers/net/wireless/marvell/mwifiex/sta_rx.c b/drivers/net/wireless/marvell/mwifiex/sta_rx.c index 65420ad674167..257737137cd70 100644 --- a/drivers/net/wireless/marvell/mwifiex/sta_rx.c +++ b/drivers/net/wireless/marvell/mwifiex/sta_rx.c @@ -86,7 +86,8 @@ int mwifiex_process_rx_packet(struct mwifiex_private *priv, rx_pkt_len = le16_to_cpu(local_rx_pd->rx_pkt_length); rx_pkt_hdr = (void *)local_rx_pd + rx_pkt_off;
- if (sizeof(*rx_pkt_hdr) + rx_pkt_off > skb->len) { + if (sizeof(rx_pkt_hdr->eth803_hdr) + sizeof(rfc1042_header) + + rx_pkt_off > skb->len) { mwifiex_dbg(priv->adapter, ERROR, "wrong rx packet offset: len=%d, rx_pkt_off=%d\n", skb->len, rx_pkt_off); @@ -95,12 +96,13 @@ int mwifiex_process_rx_packet(struct mwifiex_private *priv, return -1; }
- if ((!memcmp(&rx_pkt_hdr->rfc1042_hdr, bridge_tunnel_header, - sizeof(bridge_tunnel_header))) || - (!memcmp(&rx_pkt_hdr->rfc1042_hdr, rfc1042_header, - sizeof(rfc1042_header)) && - ntohs(rx_pkt_hdr->rfc1042_hdr.snap_type) != ETH_P_AARP && - ntohs(rx_pkt_hdr->rfc1042_hdr.snap_type) != ETH_P_IPX)) { + if (sizeof(*rx_pkt_hdr) + rx_pkt_off <= skb->len && + ((!memcmp(&rx_pkt_hdr->rfc1042_hdr, bridge_tunnel_header, + sizeof(bridge_tunnel_header))) || + (!memcmp(&rx_pkt_hdr->rfc1042_hdr, rfc1042_header, + sizeof(rfc1042_header)) && + ntohs(rx_pkt_hdr->rfc1042_hdr.snap_type) != ETH_P_AARP && + ntohs(rx_pkt_hdr->rfc1042_hdr.snap_type) != ETH_P_IPX))) { /* * Replace the 803 header and rfc1042 header (llc/snap) with an * EthernetII header, keep the src/dst and snap_type
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Randy Dunlap rdunlap@infradead.org
[ Upstream commit 058574879853260a22bbec1f94221dfc5149d85c ]
The hid-nvidia-shield driver uses functions that are built only when LEDS_CLASS is set, so make the driver depend on that symbol to prevent build errors.
riscv32-linux-ld: drivers/hid/hid-nvidia-shield.o: in function `.L11': hid-nvidia-shield.c:(.text+0x192): undefined reference to `led_classdev_unregister' riscv32-linux-ld: drivers/hid/hid-nvidia-shield.o: in function `.L113': hid-nvidia-shield.c:(.text+0xfa4): undefined reference to `led_classdev_register_ext'
Fixes: 09308562d4af ("HID: nvidia-shield: Initial driver implementation with Thunderstrike support") Signed-off-by: Randy Dunlap rdunlap@infradead.org Cc: Rahul Rameshbabu rrameshbabu@nvidia.com Cc: Jiri Kosina jkosina@suse.cz Cc: Benjamin Tissoires benjamin.tissoires@redhat.com Cc: linux-input@vger.kernel.org Reviewed-by: Rahul Rameshbabu rrameshbabu@nvidia.com Signed-off-by: Jiri Kosina jkosina@suse.cz Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/hid/Kconfig | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/hid/Kconfig b/drivers/hid/Kconfig index e11c1c8036769..dc456c86e9569 100644 --- a/drivers/hid/Kconfig +++ b/drivers/hid/Kconfig @@ -792,6 +792,7 @@ config HID_NVIDIA_SHIELD tristate "NVIDIA SHIELD devices" depends on USB_HID depends on BT_HIDP + depends on LEDS_CLASS help Support for NVIDIA SHIELD accessories.
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jingbo Xu jefflexu@linux.alibaba.com
[ Upstream commit f939aeea7ab7d96cd321e7ac107f5a070836b66f ]
Device tags aren't actually required in flatdev mode, thus fix mount failure due to empty device tags in flatdev mode.
Signed-off-by: Jingbo Xu jefflexu@linux.alibaba.com Fixes: 8b465fecc35a ("erofs: support flattened block device for multi-blob images") Reviewed-by: Jia Zhu zhujia.zj@bytedance.com Reviewed-by: Gao Xiang hsiangkao@linux.alibaba.com Link: https://lore.kernel.org/r/20230915082728.56588-1-jefflexu@linux.alibaba.com Signed-off-by: Gao Xiang hsiangkao@linux.alibaba.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/erofs/super.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/erofs/super.c b/fs/erofs/super.c index 566f68ddfa36e..31a103399412e 100644 --- a/fs/erofs/super.c +++ b/fs/erofs/super.c @@ -238,7 +238,7 @@ static int erofs_init_device(struct erofs_buf *buf, struct super_block *sb, return PTR_ERR(ptr); dis = ptr + erofs_blkoff(sb, *pos);
- if (!dif->path) { + if (!sbi->devs->flatdev && !dif->path) { if (!dis->tag[0]) { erofs_err(sb, "empty device tag @ pos %llu", *pos); return -EINVAL;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Song Liu song@kernel.org
[ Upstream commit cf094baa3e0f19f1f80ceaf205c80402b024386c ]
arch_prepare_bpf_trampoline() for s390 currently returns 0 on success. This is not a problem for regular trampoline. However, struct_ops relies on the return value to advance "image" pointer:
bpf_struct_ops_map_update_elem() { ... for_each_member(i, t, member) { ... err = bpf_struct_ops_prepare_trampoline(); ... image += err; } }
When arch_prepare_bpf_trampoline returns 0 on success, all members of the struct_ops will point to the same trampoline (the last one).
Fix this by returning the program size in arch_prepare_bpf_trampoline (on success). This is the same behavior as other architectures.
Signed-off-by: Song Liu song@kernel.org Fixes: 528eb2cb87bc ("s390/bpf: Implement arch_prepare_bpf_trampoline()") Reviewed-by: Ilya Leoshkevich iii@linux.ibm.com Link: https://lore.kernel.org/r/20230919060258.3237176-2-song@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/s390/net/bpf_jit_comp.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/s390/net/bpf_jit_comp.c b/arch/s390/net/bpf_jit_comp.c index de2fb12120d2e..2861e3360affc 100644 --- a/arch/s390/net/bpf_jit_comp.c +++ b/arch/s390/net/bpf_jit_comp.c @@ -2513,7 +2513,7 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, return -E2BIG; }
- return ret; + return tjit.common.prg; }
bool bpf_jit_supports_subprog_tailcalls(void)
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Marek Behún kabel@kernel.org
[ Upstream commit 9dc1664fab2246bc2c3e9bf2cf21518a857f9b5b ]
Commit c3f853184bed ("leds: Fix BUG_ON check for LED_COLOR_ID_MULTI that is always false") fixed a no-op BUG_ON. This turned out to cause a regression, since some in-tree device-tree files already use LED_COLOR_ID_MULTI.
Drop the BUG_ON altogether.
Fixes: c3f853184bed ("leds: Fix BUG_ON check for LED_COLOR_ID_MULTI that is always false") Reported-by: Da Xue da@libre.computer Closes: https://lore.kernel.org/linux-leds/ZQLelWcNjjp2xndY@duo.ucw.cz/T/ Signed-off-by: Marek Behún kabel@kernel.org Link: https://lore.kernel.org/r/20230918140724.18634-1-kabel@kernel.org Signed-off-by: Lee Jones lee@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/leds/led-core.c | 4 ---- 1 file changed, 4 deletions(-)
diff --git a/drivers/leds/led-core.c b/drivers/leds/led-core.c index 04f9ea675f2ce..214ed81eb0e92 100644 --- a/drivers/leds/led-core.c +++ b/drivers/leds/led-core.c @@ -479,10 +479,6 @@ int led_compose_name(struct device *dev, struct led_init_data *init_data,
led_parse_fwnode_props(dev, fwnode, &props);
- /* We want to label LEDs that can produce full range of colors - * as RGB, not multicolor */ - BUG_ON(props.color == LED_COLOR_ID_MULTI); - if (props.label) { /* * If init_data.devicename is NULL, then it indicates that
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Leon Hwang hffilwlqm@gmail.com
[ Upstream commit b724a6418f1f853bcb39c8923bf14a50c7bdbd07 ]
Fix 'tr' dereferencing bug when CONFIG_BPF_JIT is turned off.
When CONFIG_BPF_JIT is turned off, 'bpf_trampoline_get()' returns NULL, which is same as the cases when CONFIG_BPF_JIT is turned on.
Closes: https://lore.kernel.org/r/202309131936.5Nc8eUD0-lkp@intel.com/ Fixes: f7b12b6fea00 ("bpf: verifier: refactor check_attach_btf_id()") Reported-by: kernel test robot lkp@intel.com Reported-by: Dan Carpenter dan.carpenter@linaro.org Signed-off-by: Leon Hwang hffilwlqm@gmail.com Signed-off-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20230917153846.88732-1-hffilwlqm@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/bpf.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/linux/bpf.h b/include/linux/bpf.h index 477d91b926b35..6ba9d3ed8f0b0 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -1294,7 +1294,7 @@ static inline int bpf_trampoline_unlink_prog(struct bpf_tramp_link *link, static inline struct bpf_trampoline *bpf_trampoline_get(u64 key, struct bpf_attach_target_info *tgt_info) { - return ERR_PTR(-EOPNOTSUPP); + return NULL; } static inline void bpf_trampoline_put(struct bpf_trampoline *tr) {} #define DEFINE_BPF_DISPATCHER(name)
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Andrii Nakryiko andrii@kernel.org
[ Upstream commit 81335f90e8a88b81932df011105c46e708744f44 ]
In mark_chain_precision() logic, when we reach the entry to a global func, it is expected that R1-R5 might be still requested to be marked precise. This would correspond to some integer input arguments being tracked as precise. This is all expected and handled as a special case.
What's not expected is that we'll leave backtrack_state structure with some register bits set. This is because for subsequent precision propagations backtrack_state is reused without clearing masks, as all code paths are carefully written in a way to leave empty backtrack_state with zeroed out masks, for speed.
The fix is trivial, we always clear register bit in the register mask, and then, optionally, set reg->precise if register is SCALAR_VALUE type.
Reported-by: Chris Mason clm@meta.com Fixes: be2ef8161572 ("bpf: allow precision tracking for programs with subprogs") Signed-off-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/r/20230918210110.2241458-1-andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/bpf/verifier.c | 8 +++----- 1 file changed, 3 insertions(+), 5 deletions(-)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 9cdba4ce23d2b..93fd32f2957b7 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -4039,11 +4039,9 @@ static int __mark_chain_precision(struct bpf_verifier_env *env, int regno) bitmap_from_u64(mask, bt_reg_mask(bt)); for_each_set_bit(i, mask, 32) { reg = &st->frame[0]->regs[i]; - if (reg->type != SCALAR_VALUE) { - bt_clear_reg(bt, i); - continue; - } - reg->precise = true; + bt_clear_reg(bt, i); + if (reg->type == SCALAR_VALUE) + reg->precise = true; } return 0; }
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Chen-Yu Tsai wenst@chromium.org
[ Upstream commit 7e37c851374eca2d1f6128de03195c9f7b4baaf2 ]
The buck and linear range LDO (VSRAM_*) regulators share one set of ops. This set includes support for get/set mode. However this only makes sense for buck regulators, not LDOs. The callbacks were not checking whether the register offset and/or mask for mode setting was valid or not. This ends up making the kernel report "normal" mode operation for the LDOs.
Create a new set of ops without the get/set mode callbacks for the linear range LDO regulators.
Fixes: f67ff1bd58f0 ("regulator: mt6358: Add support for MT6358 regulator") Signed-off-by: Chen-Yu Tsai wenst@chromium.org Link: https://lore.kernel.org/r/20230920085336.136238-1-wenst@chromium.org Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/regulator/mt6358-regulator.c | 18 +++++++++++++++--- 1 file changed, 15 insertions(+), 3 deletions(-)
diff --git a/drivers/regulator/mt6358-regulator.c b/drivers/regulator/mt6358-regulator.c index b9cda2210c330..65fbd95f1dbb0 100644 --- a/drivers/regulator/mt6358-regulator.c +++ b/drivers/regulator/mt6358-regulator.c @@ -43,7 +43,7 @@ struct mt6358_regulator_info { .desc = { \ .name = #vreg, \ .of_match = of_match_ptr(match), \ - .ops = &mt6358_volt_range_ops, \ + .ops = &mt6358_buck_ops, \ .type = REGULATOR_VOLTAGE, \ .id = MT6358_ID_##vreg, \ .owner = THIS_MODULE, \ @@ -139,7 +139,7 @@ struct mt6358_regulator_info { .desc = { \ .name = #vreg, \ .of_match = of_match_ptr(match), \ - .ops = &mt6358_volt_range_ops, \ + .ops = &mt6358_buck_ops, \ .type = REGULATOR_VOLTAGE, \ .id = MT6366_ID_##vreg, \ .owner = THIS_MODULE, \ @@ -450,7 +450,7 @@ static unsigned int mt6358_regulator_get_mode(struct regulator_dev *rdev) } }
-static const struct regulator_ops mt6358_volt_range_ops = { +static const struct regulator_ops mt6358_buck_ops = { .list_voltage = regulator_list_voltage_linear, .map_voltage = regulator_map_voltage_linear, .set_voltage_sel = regulator_set_voltage_sel_regmap, @@ -464,6 +464,18 @@ static const struct regulator_ops mt6358_volt_range_ops = { .get_mode = mt6358_regulator_get_mode, };
+static const struct regulator_ops mt6358_volt_range_ops = { + .list_voltage = regulator_list_voltage_linear, + .map_voltage = regulator_map_voltage_linear, + .set_voltage_sel = regulator_set_voltage_sel_regmap, + .get_voltage_sel = mt6358_get_buck_voltage_sel, + .set_voltage_time_sel = regulator_set_voltage_time_sel, + .enable = regulator_enable_regmap, + .disable = regulator_disable_regmap, + .is_enabled = regulator_is_enabled_regmap, + .get_status = mt6358_get_status, +}; + static const struct regulator_ops mt6358_volt_table_ops = { .list_voltage = regulator_list_voltage_table, .map_voltage = regulator_map_voltage_iterate,
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yao Xiao xiaoyao@rock-chips.com
[ Upstream commit cbaabbcdcbd355f0a1ccc09a925575c51c270750 ]
hci_req_prepare_suspend() has been deprecated in favor of hci_suspend_sync().
Fixes: 182ee45da083 ("Bluetooth: hci_sync: Rework hci_suspend_notifier") Signed-off-by: Yao Xiao xiaoyao@rock-chips.com Signed-off-by: Luiz Augusto von Dentz luiz.von.dentz@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/bluetooth/hci_request.h | 2 -- 1 file changed, 2 deletions(-)
diff --git a/net/bluetooth/hci_request.h b/net/bluetooth/hci_request.h index b9c5a98238374..0be75cf0efed8 100644 --- a/net/bluetooth/hci_request.h +++ b/net/bluetooth/hci_request.h @@ -71,7 +71,5 @@ struct sk_buff *hci_prepare_cmd(struct hci_dev *hdev, u16 opcode, u32 plen, void hci_req_add_le_scan_disable(struct hci_request *req, bool rpa_le_conn); void hci_req_add_le_passive_scan(struct hci_request *req);
-void hci_req_prepare_suspend(struct hci_dev *hdev, enum suspended_state next); - void hci_request_setup(struct hci_dev *hdev); void hci_request_cancel_all(struct hci_dev *hdev);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ying Hsu yinghsu@chromium.org
[ Upstream commit c7eaf80bfb0c8cef852cce9501b95dd5a6bddcb9 ]
Syzbot found a bug "BUG: sleeping function called from invalid context at kernel/locking/mutex.c:580". It is because hci_link_tx_to holds an RCU read lock and calls hci_disconnect which would hold a mutex lock since the commit a13f316e90fd ("Bluetooth: hci_conn: Consolidate code for aborting connections"). Here's an example call trace:
__dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0xfc/0x174 lib/dump_stack.c:106 ___might_sleep+0x4a9/0x4d3 kernel/sched/core.c:9663 __mutex_lock_common kernel/locking/mutex.c:576 [inline] __mutex_lock+0xc7/0x6e7 kernel/locking/mutex.c:732 hci_cmd_sync_queue+0x3a/0x287 net/bluetooth/hci_sync.c:388 hci_abort_conn+0x2cd/0x2e4 net/bluetooth/hci_conn.c:1812 hci_disconnect+0x207/0x237 net/bluetooth/hci_conn.c:244 hci_link_tx_to net/bluetooth/hci_core.c:3254 [inline] __check_timeout net/bluetooth/hci_core.c:3419 [inline] __check_timeout+0x310/0x361 net/bluetooth/hci_core.c:3399 hci_sched_le net/bluetooth/hci_core.c:3602 [inline] hci_tx_work+0xe8f/0x12d0 net/bluetooth/hci_core.c:3652 process_one_work+0x75c/0xba1 kernel/workqueue.c:2310 worker_thread+0x5b2/0x73a kernel/workqueue.c:2457 kthread+0x2f7/0x30b kernel/kthread.c:319 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:298
This patch releases RCU read lock before calling hci_disconnect and reacquires it afterward to fix the bug.
Fixes: a13f316e90fd ("Bluetooth: hci_conn: Consolidate code for aborting connections") Signed-off-by: Ying Hsu yinghsu@chromium.org Signed-off-by: Luiz Augusto von Dentz luiz.von.dentz@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/bluetooth/hci_core.c | 5 +++++ 1 file changed, 5 insertions(+)
--- a/net/bluetooth/hci_core.c +++ b/net/bluetooth/hci_core.c @@ -3419,7 +3419,12 @@ static void hci_link_tx_to(struct hci_de if (c->type == type && c->sent) { bt_dev_err(hdev, "killing stalled connection %pMR", &c->dst); + /* hci_disconnect might sleep, so, we have to release + * the RCU read lock before calling it. + */ + rcu_read_unlock(); hci_disconnect(c, HCI_ERROR_REMOTE_USER_TERM); + rcu_read_lock(); } }
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Luiz Augusto von Dentz luiz.von.dentz@intel.com
[ Upstream commit e0275ea52169412b8faccb4e2f4fed8a057844c6 ]
iso_listen_cis shall only return -EADDRINUSE if the listening socket has the destination set to BDADDR_ANY otherwise if the destination is set to a specific address it is for broadcast which shall be ignored.
Fixes: f764a6c2c1e4 ("Bluetooth: ISO: Add broadcast support") Signed-off-by: Luiz Augusto von Dentz luiz.von.dentz@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/bluetooth/iso.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/net/bluetooth/iso.c b/net/bluetooth/iso.c index 9b6a7eb2015f0..42f7b257bdfbc 100644 --- a/net/bluetooth/iso.c +++ b/net/bluetooth/iso.c @@ -499,7 +499,7 @@ static void iso_recv_frame(struct iso_conn *conn, struct sk_buff *skb) }
/* -------- Socket interface ---------- */ -static struct sock *__iso_get_sock_listen_by_addr(bdaddr_t *ba) +static struct sock *__iso_get_sock_listen_by_addr(bdaddr_t *src, bdaddr_t *dst) { struct sock *sk;
@@ -507,7 +507,10 @@ static struct sock *__iso_get_sock_listen_by_addr(bdaddr_t *ba) if (sk->sk_state != BT_LISTEN) continue;
- if (!bacmp(&iso_pi(sk)->src, ba)) + if (bacmp(&iso_pi(sk)->dst, dst)) + continue; + + if (!bacmp(&iso_pi(sk)->src, src)) return sk; }
@@ -965,7 +968,7 @@ static int iso_listen_cis(struct sock *sk)
write_lock(&iso_sk_list.lock);
- if (__iso_get_sock_listen_by_addr(&iso_pi(sk)->src)) + if (__iso_get_sock_listen_by_addr(&iso_pi(sk)->src, &iso_pi(sk)->dst)) err = -EADDRINUSE;
write_unlock(&iso_sk_list.lock);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alexandra Diupina adiupina@astralinux.ru
[ Upstream commit a59addacf899b1b21a7b7449a1c52c98704c2472 ]
Process the result of hdlc_open() and call uhdlc_close() in case of an error. It is necessary to pass the error code up the control flow, similar to a possible error in request_irq(). Also add a hdlc_close() call to the uhdlc_close() because the comment to hdlc_close() says it must be called by the hardware driver when the HDLC device is being closed
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Fixes: c19b6d246a35 ("drivers/net: support hdlc function for QE-UCC") Signed-off-by: Alexandra Diupina adiupina@astralinux.ru Reviewed-by: Christophe Leroy christophe.leroy@csgroup.eu Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wan/fsl_ucc_hdlc.c | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-)
diff --git a/drivers/net/wan/fsl_ucc_hdlc.c b/drivers/net/wan/fsl_ucc_hdlc.c index 47c2ad7a3e429..fd50bb313b924 100644 --- a/drivers/net/wan/fsl_ucc_hdlc.c +++ b/drivers/net/wan/fsl_ucc_hdlc.c @@ -34,6 +34,8 @@ #define TDM_PPPOHT_SLIC_MAXIN #define RX_BD_ERRORS (R_CD_S | R_OV_S | R_CR_S | R_AB_S | R_NO_S | R_LG_S)
+static int uhdlc_close(struct net_device *dev); + static struct ucc_tdm_info utdm_primary_info = { .uf_info = { .tsa = 0, @@ -708,6 +710,7 @@ static int uhdlc_open(struct net_device *dev) hdlc_device *hdlc = dev_to_hdlc(dev); struct ucc_hdlc_private *priv = hdlc->priv; struct ucc_tdm *utdm = priv->utdm; + int rc = 0;
if (priv->hdlc_busy != 1) { if (request_irq(priv->ut_info->uf_info.irq, @@ -731,10 +734,13 @@ static int uhdlc_open(struct net_device *dev) napi_enable(&priv->napi); netdev_reset_queue(dev); netif_start_queue(dev); - hdlc_open(dev); + + rc = hdlc_open(dev); + if (rc) + uhdlc_close(dev); }
- return 0; + return rc; }
static void uhdlc_memclean(struct ucc_hdlc_private *priv) @@ -824,6 +830,8 @@ static int uhdlc_close(struct net_device *dev) netdev_reset_queue(dev); priv->hdlc_busy = 0;
+ hdlc_close(dev); + return 0; }
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Felix Fietkau nbd@nbd.name
[ Upstream commit 684e45e120b82deccaf8b85633905304a3bbf56d ]
On MT76x0, LNA gain should be applied for both external and internal LNA. On MT76x2, LNA gain should be treated as 0 for external LNA. Move the LNA type based logic to mt76x2 in order to fix mt76x0.
Fixes: 2daa67588f34 ("mt76x0: unify lna_gain parsing") Reported-by: Shiji Yang yangshiji66@outlook.com Signed-off-by: Felix Fietkau nbd@nbd.name Signed-off-by: Kalle Valo kvalo@kernel.org Link: https://lore.kernel.org/r/20230919194747.31647-1-nbd@nbd.name Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/mediatek/mt76/mt76x02_eeprom.c | 7 ------- drivers/net/wireless/mediatek/mt76/mt76x2/eeprom.c | 13 +++++++++++-- 2 files changed, 11 insertions(+), 9 deletions(-)
diff --git a/drivers/net/wireless/mediatek/mt76/mt76x02_eeprom.c b/drivers/net/wireless/mediatek/mt76/mt76x02_eeprom.c index 0acabba2d1a50..5d402cf2951cb 100644 --- a/drivers/net/wireless/mediatek/mt76/mt76x02_eeprom.c +++ b/drivers/net/wireless/mediatek/mt76/mt76x02_eeprom.c @@ -131,15 +131,8 @@ u8 mt76x02_get_lna_gain(struct mt76x02_dev *dev, s8 *lna_2g, s8 *lna_5g, struct ieee80211_channel *chan) { - u16 val; u8 lna;
- val = mt76x02_eeprom_get(dev, MT_EE_NIC_CONF_1); - if (val & MT_EE_NIC_CONF_1_LNA_EXT_2G) - *lna_2g = 0; - if (val & MT_EE_NIC_CONF_1_LNA_EXT_5G) - memset(lna_5g, 0, sizeof(s8) * 3); - if (chan->band == NL80211_BAND_2GHZ) lna = *lna_2g; else if (chan->hw_value <= 64) diff --git a/drivers/net/wireless/mediatek/mt76/mt76x2/eeprom.c b/drivers/net/wireless/mediatek/mt76/mt76x2/eeprom.c index d5809408d1d37..8c01855885ce3 100644 --- a/drivers/net/wireless/mediatek/mt76/mt76x2/eeprom.c +++ b/drivers/net/wireless/mediatek/mt76/mt76x2/eeprom.c @@ -256,7 +256,8 @@ void mt76x2_read_rx_gain(struct mt76x02_dev *dev) struct ieee80211_channel *chan = dev->mphy.chandef.chan; int channel = chan->hw_value; s8 lna_5g[3], lna_2g; - u8 lna; + bool use_lna; + u8 lna = 0; u16 val;
if (chan->band == NL80211_BAND_2GHZ) @@ -275,7 +276,15 @@ void mt76x2_read_rx_gain(struct mt76x02_dev *dev) dev->cal.rx.mcu_gain |= (lna_5g[1] & 0xff) << 16; dev->cal.rx.mcu_gain |= (lna_5g[2] & 0xff) << 24;
- lna = mt76x02_get_lna_gain(dev, &lna_2g, lna_5g, chan); + val = mt76x02_eeprom_get(dev, MT_EE_NIC_CONF_1); + if (chan->band == NL80211_BAND_2GHZ) + use_lna = !(val & MT_EE_NIC_CONF_1_LNA_EXT_2G); + else + use_lna = !(val & MT_EE_NIC_CONF_1_LNA_EXT_5G); + + if (use_lna) + lna = mt76x02_get_lna_gain(dev, &lna_2g, lna_5g, chan); + dev->cal.rx.lna_gain = mt76x02_sign_extend(lna, 8); } EXPORT_SYMBOL_GPL(mt76x2_read_rx_gain);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sandipan Das sandipan.das@amd.com
[ Upstream commit 23d2626b841c2adccdeb477665313c02dff02dc3 ]
Kernels older than v5.19 do not support PerfMonV2 and the PMI handler does not clear the overflow bits of the PerfCntrGlobalStatus register. Because of this, loading a recent kernel using kexec from an older kernel can result in inconsistent register states on Zen 4 systems.
The PMI handler of the new kernel gets confused and shows a warning when an overflow occurs because some of the overflow bits are set even if the corresponding counters are inactive. These are remnants from overflows that were handled by the older kernel.
During CPU hotplug, the PerfCntrGlobalCtl and PerfCntrGlobalStatus registers should always be cleared for PerfMonV2-capable processors. However, a condition used for NB event constaints applicable only to older processors currently prevents this from happening. Move the reset sequence to an appropriate place and also clear the LBR Freeze bit.
Fixes: 21d59e3e2c40 ("perf/x86/amd/core: Detect PerfMonV2 support") Signed-off-by: Sandipan Das sandipan.das@amd.com Signed-off-by: Ingo Molnar mingo@kernel.org Link: https://lore.kernel.org/r/882a87511af40792ba69bb0e9026f19a2e71e8a3.169469688... Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/events/amd/core.c | 14 ++++++++------ 1 file changed, 8 insertions(+), 6 deletions(-)
diff --git a/arch/x86/events/amd/core.c b/arch/x86/events/amd/core.c index abadd5f234254..ed626bfa1eedb 100644 --- a/arch/x86/events/amd/core.c +++ b/arch/x86/events/amd/core.c @@ -534,8 +534,12 @@ static void amd_pmu_cpu_reset(int cpu) /* Clear enable bits i.e. PerfCntrGlobalCtl.PerfCntrEn */ wrmsrl(MSR_AMD64_PERF_CNTR_GLOBAL_CTL, 0);
- /* Clear overflow bits i.e. PerfCntrGLobalStatus.PerfCntrOvfl */ - wrmsrl(MSR_AMD64_PERF_CNTR_GLOBAL_STATUS_CLR, amd_pmu_global_cntr_mask); + /* + * Clear freeze and overflow bits i.e. PerfCntrGLobalStatus.LbrFreeze + * and PerfCntrGLobalStatus.PerfCntrOvfl + */ + wrmsrl(MSR_AMD64_PERF_CNTR_GLOBAL_STATUS_CLR, + GLOBAL_STATUS_LBRS_FROZEN | amd_pmu_global_cntr_mask); }
static int amd_pmu_cpu_prepare(int cpu) @@ -570,6 +574,7 @@ static void amd_pmu_cpu_starting(int cpu) int i, nb_id;
cpuc->perf_ctr_virt_mask = AMD64_EVENTSEL_HOSTONLY; + amd_pmu_cpu_reset(cpu);
if (!x86_pmu.amd_nb_constraints) return; @@ -591,8 +596,6 @@ static void amd_pmu_cpu_starting(int cpu)
cpuc->amd_nb->nb_id = nb_id; cpuc->amd_nb->refcnt++; - - amd_pmu_cpu_reset(cpu); }
static void amd_pmu_cpu_dead(int cpu) @@ -601,6 +604,7 @@ static void amd_pmu_cpu_dead(int cpu)
kfree(cpuhw->lbr_sel); cpuhw->lbr_sel = NULL; + amd_pmu_cpu_reset(cpu);
if (!x86_pmu.amd_nb_constraints) return; @@ -613,8 +617,6 @@ static void amd_pmu_cpu_dead(int cpu)
cpuhw->amd_nb = NULL; } - - amd_pmu_cpu_reset(cpu); }
static inline void amd_pmu_set_global_ctl(u64 ctl)
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Daniel Bristot de Oliveira bristot@kernel.org
[ Upstream commit e8c44d3b713b96cda055a23b21e8c4f931dd159f ]
If no CPU list is passed, timerlat in user-space will dispatch one thread per sysconf(_SC_NPROCESSORS_CONF). However, not all CPU might be available, for instance, if HT is disabled.
Currently, rtla timerlat is stopping the session if an user-space thread cannot set affinity to a CPU, or if a running user-space thread is killed. However, this is too restrictive.
So, reduce the error to a debug message, and rtla timerlat run as long as there is at least one user-space thread alive.
Link: https://lore.kernel.org/lkml/59cf2c882900ab7de91c6ee33b382ac7fa6b4ed0.169478...
Fixes: cdca4f4e5e8e ("rtla/timerlat_top: Add timerlat user-space support") Signed-off-by: Daniel Bristot de Oliveira bristot@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- tools/tracing/rtla/src/timerlat_u.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/tools/tracing/rtla/src/timerlat_u.c b/tools/tracing/rtla/src/timerlat_u.c index 05e310696dd5c..01dbf9a6b5a51 100644 --- a/tools/tracing/rtla/src/timerlat_u.c +++ b/tools/tracing/rtla/src/timerlat_u.c @@ -45,7 +45,7 @@ static int timerlat_u_main(int cpu, struct timerlat_u_params *params)
retval = sched_setaffinity(gettid(), sizeof(set), &set); if (retval == -1) { - err_msg("Error setting user thread affinity\n"); + debug_msg("Error setting user thread affinity %d, is the CPU online?\n", cpu); exit(1); }
@@ -193,7 +193,9 @@ void *timerlat_u_dispatcher(void *data) procs_count--; } } - break; + + if (!procs_count) + break; }
sleep(1);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Richard Fitzgerald rf@opensource.cirrus.com
[ Upstream commit 7a795ac8d49e2433e1b97caf5e99129daf8e1b08 ]
When regcache_rbtree_write() creates a new rbtree_node it was passing the wrong bit number to regcache_rbtree_set_register(). The bit number is the offset __in number of registers__, but in the case of creating a new block regcache_rbtree_write() was not dividing by the address stride to get the number of registers.
Fix this by dividing by map->reg_stride. Compare with regcache_rbtree_read() where the bit is checked.
This bug meant that the wrong register was marked as present. The register that was written to the cache could not be read from the cache because it was not marked as cached. But a nearby register could be marked as having a cached value even if it was never written to the cache.
Signed-off-by: Richard Fitzgerald rf@opensource.cirrus.com Fixes: 3f4ff561bc88 ("regmap: rbtree: Make cache_present bitmap per node") Link: https://lore.kernel.org/r/20230922153711.28103-1-rf@opensource.cirrus.com Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/base/regmap/regcache-rbtree.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/base/regmap/regcache-rbtree.c b/drivers/base/regmap/regcache-rbtree.c index 06788965aa293..31d7bc682910c 100644 --- a/drivers/base/regmap/regcache-rbtree.c +++ b/drivers/base/regmap/regcache-rbtree.c @@ -453,7 +453,8 @@ static int regcache_rbtree_write(struct regmap *map, unsigned int reg, if (!rbnode) return -ENOMEM; regcache_rbtree_set_register(map, rbnode, - reg - rbnode->base_reg, value); + (reg - rbnode->base_reg) / map->reg_stride, + value); regcache_rbtree_insert(map, &rbtree_ctx->root, rbnode); rbtree_ctx->cached_rbnode = rbnode; }
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Johannes Berg johannes.berg@intel.com
[ Upstream commit 31db78a4923ef5e2008f2eed321811ca79e7f71b ]
When ieee80211_key_link() is called by ieee80211_gtk_rekey_add() but returns 0 due to KRACK protection (identical key reinstall), ieee80211_gtk_rekey_add() will still return a pointer into the key, in a potential use-after-free. This normally doesn't happen since it's only called by iwlwifi in case of WoWLAN rekey offload which has its own KRACK protection, but still better to fix, do that by returning an error code and converting that to success on the cfg80211 boundary only, leaving the error for bad callers of ieee80211_gtk_rekey_add().
Reported-by: Dan Carpenter dan.carpenter@linaro.org Fixes: fdf7cb4185b6 ("mac80211: accept key reinstall without changing anything") Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/mac80211/cfg.c | 3 +++ net/mac80211/key.c | 2 +- 2 files changed, 4 insertions(+), 1 deletion(-)
diff --git a/net/mac80211/cfg.c b/net/mac80211/cfg.c index 45e7a5d9c7d94..e883c41a2163b 100644 --- a/net/mac80211/cfg.c +++ b/net/mac80211/cfg.c @@ -566,6 +566,9 @@ static int ieee80211_add_key(struct wiphy *wiphy, struct net_device *dev, }
err = ieee80211_key_link(key, link, sta); + /* KRACK protection, shouldn't happen but just silently accept key */ + if (err == -EALREADY) + err = 0;
out_unlock: mutex_unlock(&local->sta_mtx); diff --git a/net/mac80211/key.c b/net/mac80211/key.c index 21cf5a2089101..f719abe33a328 100644 --- a/net/mac80211/key.c +++ b/net/mac80211/key.c @@ -905,7 +905,7 @@ int ieee80211_key_link(struct ieee80211_key *key, */ if (ieee80211_key_identical(sdata, old_key, key)) { ieee80211_key_free_unused(key); - ret = 0; + ret = -EALREADY; goto out; }
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Breno Leitao leitao@debian.org
[ Upstream commit 599522d9d2e19d6240e4312577f1c5f3ffca22f6 ]
Zen 4 systems running buggy microcode can hit a WARN_ON() in the PMI handler, as shown below, several times while perf runs. A simple `perf top` run is enough to render the system unusable:
WARNING: CPU: 18 PID: 20608 at arch/x86/events/amd/core.c:944 amd_pmu_v2_handle_irq+0x1be/0x2b0
This happens because the Performance Counter Global Status Register (PerfCntGlobalStatus) has one or more bits set which are considered reserved according to the "AMD64 Architecture Programmer’s Manual, Volume 2: System Programming, 24593":
https://www.amd.com/system/files/TechDocs/24593.pdf
To make this less intrusive, warn just once if any reserved bit is set and prompt the user to update the microcode. Also sanitize the value to what the code is handling, so that the overflow events continue to be handled for the number of counters that are known to be sane.
Going forward, the following microcode patch levels are recommended for Zen 4 processors in order to avoid such issues with reserved bits:
Family=0x19 Model=0x11 Stepping=0x01: Patch=0x0a10113e Family=0x19 Model=0x11 Stepping=0x02: Patch=0x0a10123e Family=0x19 Model=0xa0 Stepping=0x01: Patch=0x0aa00116 Family=0x19 Model=0xa0 Stepping=0x02: Patch=0x0aa00212
Commit f2eb058afc57 ("linux-firmware: Update AMD cpu microcode") from the linux-firmware tree has binaries that meet the minimum required patch levels.
[ sandipan: - add message to prompt users to update microcode - rework commit message and call out required microcode levels ]
Fixes: 7685665c390d ("perf/x86/amd/core: Add PerfMonV2 overflow handling") Reported-by: Jirka Hladky jhladky@redhat.com Signed-off-by: Breno Leitao leitao@debian.org Signed-off-by: Sandipan Das sandipan.das@amd.com Signed-off-by: Ingo Molnar mingo@kernel.org Link: https://lore.kernel.org/all/3540f985652f41041e54ee82aa53e7dbd55739ae.1694696... Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/events/amd/core.c | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/arch/x86/events/amd/core.c b/arch/x86/events/amd/core.c index ed626bfa1eedb..e24976593a298 100644 --- a/arch/x86/events/amd/core.c +++ b/arch/x86/events/amd/core.c @@ -886,7 +886,7 @@ static int amd_pmu_v2_handle_irq(struct pt_regs *regs) struct hw_perf_event *hwc; struct perf_event *event; int handled = 0, idx; - u64 status, mask; + u64 reserved, status, mask; bool pmu_enabled;
/* @@ -911,6 +911,14 @@ static int amd_pmu_v2_handle_irq(struct pt_regs *regs) status &= ~GLOBAL_STATUS_LBRS_FROZEN; }
+ reserved = status & ~amd_pmu_global_cntr_mask; + if (reserved) + pr_warn_once("Reserved PerfCntrGlobalStatus bits are set (0x%llx), please consider updating microcode\n", + reserved); + + /* Clear any reserved bits set by buggy microcode */ + status &= amd_pmu_global_cntr_mask; + for (idx = 0; idx < x86_pmu.num_counters; idx++) { if (!test_bit(idx, cpuc->active_mask)) continue;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yong Wu yong.wu@mediatek.com
[ Upstream commit b07eba71a512eb196cbcc29765c29c8c29b11b59 ]
In mt8192/mt8186, there is only one MM IOMMU that supports 16GB iova space, which is shared by display, vcodec and camera. These two SoC use one pgtable and have not the flag SHARE_PGTABLE, we should also keep share pgtable for this case.
In mtk_iommu_domain_finalise, MM IOMMU always share pgtable, thus remove the flag SHARE_PGTABLE checking. Infra IOMMU always uses independent pgtable.
Fixes: cf69ef46dbd9 ("iommu/mediatek: Fix two IOMMU share pagetable issue") Reported-by: Laura Nao laura.nao@collabora.com Closes: https://lore.kernel.org/linux-iommu/20230818154156.314742-1-laura.nao@collab... Signed-off-by: Yong Wu yong.wu@mediatek.com Reviewed-by: AngeloGioacchino Del Regno angelogioacchino.delregno@collabora.com Tested-by: Laura Nao laura.nao@collabora.com Link: https://lore.kernel.org/r/20230819081443.8333-1-yong.wu@mediatek.com Signed-off-by: Joerg Roedel jroedel@suse.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/iommu/mtk_iommu.c | 9 ++++----- 1 file changed, 4 insertions(+), 5 deletions(-)
diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c index c2764891a779c..ef27f9f1e17ef 100644 --- a/drivers/iommu/mtk_iommu.c +++ b/drivers/iommu/mtk_iommu.c @@ -258,7 +258,7 @@ struct mtk_iommu_data { struct device *smicomm_dev;
struct mtk_iommu_bank_data *bank; - struct mtk_iommu_domain *share_dom; /* For 2 HWs share pgtable */ + struct mtk_iommu_domain *share_dom;
struct regmap *pericfg; struct mutex mutex; /* Protect m4u_group/m4u_dom above */ @@ -625,8 +625,8 @@ static int mtk_iommu_domain_finalise(struct mtk_iommu_domain *dom, struct mtk_iommu_domain *share_dom = data->share_dom; const struct mtk_iommu_iova_region *region;
- /* Always use share domain in sharing pgtable case */ - if (MTK_IOMMU_HAS_FLAG(data->plat_data, SHARE_PGTABLE) && share_dom) { + /* Share pgtable when 2 MM IOMMU share the pgtable or one IOMMU use multiple iova ranges */ + if (share_dom) { dom->iop = share_dom->iop; dom->cfg = share_dom->cfg; dom->domain.pgsize_bitmap = share_dom->cfg.pgsize_bitmap; @@ -659,8 +659,7 @@ static int mtk_iommu_domain_finalise(struct mtk_iommu_domain *dom, /* Update our support page sizes bitmap */ dom->domain.pgsize_bitmap = dom->cfg.pgsize_bitmap;
- if (MTK_IOMMU_HAS_FLAG(data->plat_data, SHARE_PGTABLE)) - data->share_dom = dom; + data->share_dom = dom;
update_iova_region: /* Update the iova region for this domain */
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Benjamin Berg benjamin.berg@intel.com
[ Upstream commit aaba3cd33fc9593a858beeee419c0e6671ee9551 ]
When associating to an MLD AP, links may be disabled. Create all resources associated with a disabled link so that we can later enable it without having to create these resources on the fly.
Fixes: 6d543b34dbcf ("wifi: mac80211: Support disabled links during association") Signed-off-by: Benjamin Berg benjamin.berg@intel.com Link: https://lore.kernel.org/r/20230925173028.f9afdb26f6c7.I4e6e199aaefc1bf017362... Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/mac80211/mlme.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/net/mac80211/mlme.c b/net/mac80211/mlme.c index 46d46cfab6c84..24b2833e0e475 100644 --- a/net/mac80211/mlme.c +++ b/net/mac80211/mlme.c @@ -5107,9 +5107,10 @@ static bool ieee80211_assoc_success(struct ieee80211_sub_if_data *sdata, continue;
valid_links |= BIT(link_id); - if (assoc_data->link[link_id].disabled) { + if (assoc_data->link[link_id].disabled) dormant_links |= BIT(link_id); - } else if (link_id != assoc_data->assoc_link_id) { + + if (link_id != assoc_data->assoc_link_id) { err = ieee80211_sta_allocate_link(sta, link_id); if (err) goto out_err; @@ -5124,7 +5125,7 @@ static bool ieee80211_assoc_success(struct ieee80211_sub_if_data *sdata, struct ieee80211_link_data *link; struct link_sta_info *link_sta;
- if (!cbss || assoc_data->link[link_id].disabled) + if (!cbss) continue;
link = sdata_dereference(sdata->link[link_id], sdata);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Michał Mirosław mirq-linux@rere.qmqm.pl
[ Upstream commit 8adb4e647a83cb5928c05dae95b010224aea0705 ]
When fixing a memory leak in commit d3c731564e09 ("regulator: plug of_node leak in regulator_register()'s error path") it moved the device_initialize() call earlier, but did not move the `dev->class` initialization. The bug was spotted and fixed by reverting part of the commit (in commit 5f4b204b6b81 "regulator: core: fix kobject release warning and memory leak in regulator_register()") but introducing a different bug: now early error paths use `kfree(dev)` instead of `put_device()` for an already initialized `struct device`.
Move the missing assignments to just after `device_initialize()`.
Fixes: d3c731564e09 ("regulator: plug of_node leak in regulator_register()'s error path") Signed-off-by: Michał Mirosław mirq-linux@rere.qmqm.pl Link: https://lore.kernel.org/r/b5b19cb458c40c9d02f3d5a7bd1ba7d97ba17279.169507730... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/regulator/core.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/regulator/core.c b/drivers/regulator/core.c index d8e1caaf207e1..2820badc7a126 100644 --- a/drivers/regulator/core.c +++ b/drivers/regulator/core.c @@ -5542,6 +5542,8 @@ regulator_register(struct device *dev, goto rinse; } device_initialize(&rdev->dev); + dev_set_drvdata(&rdev->dev, rdev); + rdev->dev.class = ®ulator_class; spin_lock_init(&rdev->err_lock);
/* @@ -5603,11 +5605,9 @@ regulator_register(struct device *dev, rdev->supply_name = regulator_desc->supply_name;
/* register with sysfs */ - rdev->dev.class = ®ulator_class; rdev->dev.parent = config->dev; dev_set_name(&rdev->dev, "regulator.%lu", (unsigned long) atomic_inc_return(®ulator_no)); - dev_set_drvdata(&rdev->dev, rdev);
/* set regulator constraints */ if (init_data)
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Oleksandr Tymoshenko ovt@google.com
[ Upstream commit be210c6d3597faf330cb9af33b9f1591d7b2a983 ]
The removal of IMA_TRUSTED_KEYRING made IMA_LOAD_X509 and IMA_BLACKLIST_KEYRING unavailable because the latter two depend on the former. Since IMA_TRUSTED_KEYRING was deprecated in favor of INTEGRITY_TRUSTED_KEYRING use it as a dependency for the two Kconfigs affected by the deprecation.
Fixes: 5087fd9e80e5 ("ima: Remove deprecated IMA_TRUSTED_KEYRING Kconfig") Signed-off-by: Oleksandr Tymoshenko ovt@google.com Reviewed-by: Nayna Jain nayna@linux.ibm.com Signed-off-by: Mimi Zohar zohar@linux.ibm.com Signed-off-by: Sasha Levin sashal@kernel.org --- security/integrity/ima/Kconfig | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/security/integrity/ima/Kconfig b/security/integrity/ima/Kconfig index c17660bf5f347..e6df7c930397c 100644 --- a/security/integrity/ima/Kconfig +++ b/security/integrity/ima/Kconfig @@ -268,7 +268,7 @@ config IMA_KEYRINGS_PERMIT_SIGNED_BY_BUILTIN_OR_SECONDARY config IMA_BLACKLIST_KEYRING bool "Create IMA machine owner blacklist keyrings (EXPERIMENTAL)" depends on SYSTEM_TRUSTED_KEYRING - depends on IMA_TRUSTED_KEYRING + depends on INTEGRITY_TRUSTED_KEYRING default n help This option creates an IMA blacklist keyring, which contains all @@ -278,7 +278,7 @@ config IMA_BLACKLIST_KEYRING
config IMA_LOAD_X509 bool "Load X509 certificate onto the '.ima' trusted keyring" - depends on IMA_TRUSTED_KEYRING + depends on INTEGRITY_TRUSTED_KEYRING default n help File signature verification is based on the public keys
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ilan Peer ilan.peer@intel.com
[ Upstream commit 22061bfc57fe08c77141dc876b4af75603c4d61d ]
The support for using link ID in the scan request API was only added in version 16. However, the code wrongly enabled this API usage also for older versions. Fix it.
Reported-by: Antoine Beaupré anarcat@debian.org Fixes: e98b23d0d7b8 ("wifi: iwlwifi: mvm: Add support for SCAN API version 16") Signed-off-by: Ilan Peer ilan.peer@intel.com Signed-off-by: Gregory Greenman gregory.greenman@intel.com Link: https://lore.kernel.org/r/20230926165546.086e635fbbe6.Ia660f35ca0b1079f2c2ea... Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/intel/iwlwifi/mvm/scan.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/wireless/intel/iwlwifi/mvm/scan.c b/drivers/net/wireless/intel/iwlwifi/mvm/scan.c index c1d9ce7534688..3cbe2c0b8d6bc 100644 --- a/drivers/net/wireless/intel/iwlwifi/mvm/scan.c +++ b/drivers/net/wireless/intel/iwlwifi/mvm/scan.c @@ -2342,7 +2342,7 @@ iwl_mvm_scan_umac_fill_general_p_v12(struct iwl_mvm *mvm, if (gen_flags & IWL_UMAC_SCAN_GEN_FLAGS_V2_FRAGMENTED_LMAC2) gp->num_of_fragments[SCAN_HB_LMAC_IDX] = IWL_SCAN_NUM_OF_FRAGS;
- if (version < 12) { + if (version < 16) { gp->scan_start_mac_or_link_id = scan_vif->id; } else { struct iwl_mvm_vif_link_info *link_info;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Junxiao Bi junxiao.bi@oracle.com
[ Upstream commit a154f5f643c6ecddd44847217a7a3845b4350003 ]
The following call trace shows a deadlock issue due to recursive locking of mutex "device_mutex". First lock acquire is in target_for_each_device() and second in target_free_device().
PID: 148266 TASK: ffff8be21ffb5d00 CPU: 10 COMMAND: "iscsi_ttx" #0 [ffffa2bfc9ec3b18] __schedule at ffffffffa8060e7f #1 [ffffa2bfc9ec3ba0] schedule at ffffffffa8061224 #2 [ffffa2bfc9ec3bb8] schedule_preempt_disabled at ffffffffa80615ee #3 [ffffa2bfc9ec3bc8] __mutex_lock at ffffffffa8062fd7 #4 [ffffa2bfc9ec3c40] __mutex_lock_slowpath at ffffffffa80631d3 #5 [ffffa2bfc9ec3c50] mutex_lock at ffffffffa806320c #6 [ffffa2bfc9ec3c68] target_free_device at ffffffffc0935998 [target_core_mod] #7 [ffffa2bfc9ec3c90] target_core_dev_release at ffffffffc092f975 [target_core_mod] #8 [ffffa2bfc9ec3ca0] config_item_put at ffffffffa79d250f #9 [ffffa2bfc9ec3cd0] config_item_put at ffffffffa79d2583 #10 [ffffa2bfc9ec3ce0] target_devices_idr_iter at ffffffffc0933f3a [target_core_mod] #11 [ffffa2bfc9ec3d00] idr_for_each at ffffffffa803f6fc #12 [ffffa2bfc9ec3d60] target_for_each_device at ffffffffc0935670 [target_core_mod] #13 [ffffa2bfc9ec3d98] transport_deregister_session at ffffffffc0946408 [target_core_mod] #14 [ffffa2bfc9ec3dc8] iscsit_close_session at ffffffffc09a44a6 [iscsi_target_mod] #15 [ffffa2bfc9ec3df0] iscsit_close_connection at ffffffffc09a4a88 [iscsi_target_mod] #16 [ffffa2bfc9ec3df8] finish_task_switch at ffffffffa76e5d07 #17 [ffffa2bfc9ec3e78] iscsit_take_action_for_connection_exit at ffffffffc0991c23 [iscsi_target_mod] #18 [ffffa2bfc9ec3ea0] iscsi_target_tx_thread at ffffffffc09a403b [iscsi_target_mod] #19 [ffffa2bfc9ec3f08] kthread at ffffffffa76d8080 #20 [ffffa2bfc9ec3f50] ret_from_fork at ffffffffa8200364
Fixes: 36d4cb460bcb ("scsi: target: Avoid that EXTENDED COPY commands trigger lock inversion") Signed-off-by: Junxiao Bi junxiao.bi@oracle.com Link: https://lore.kernel.org/r/20230918225848.66463-1-junxiao.bi@oracle.com Reviewed-by: Mike Christie michael.christie@oracle.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/target/target_core_device.c | 11 ++++------- 1 file changed, 4 insertions(+), 7 deletions(-)
diff --git a/drivers/target/target_core_device.c b/drivers/target/target_core_device.c index b7ac60f4a2194..b6523d4b9259e 100644 --- a/drivers/target/target_core_device.c +++ b/drivers/target/target_core_device.c @@ -843,7 +843,6 @@ sector_t target_to_linux_sector(struct se_device *dev, sector_t lb) EXPORT_SYMBOL(target_to_linux_sector);
struct devices_idr_iter { - struct config_item *prev_item; int (*fn)(struct se_device *dev, void *data); void *data; }; @@ -853,11 +852,9 @@ static int target_devices_idr_iter(int id, void *p, void *data) { struct devices_idr_iter *iter = data; struct se_device *dev = p; + struct config_item *item; int ret;
- config_item_put(iter->prev_item); - iter->prev_item = NULL; - /* * We add the device early to the idr, so it can be used * by backend modules during configuration. We do not want @@ -867,12 +864,13 @@ static int target_devices_idr_iter(int id, void *p, void *data) if (!target_dev_configured(dev)) return 0;
- iter->prev_item = config_item_get_unless_zero(&dev->dev_group.cg_item); - if (!iter->prev_item) + item = config_item_get_unless_zero(&dev->dev_group.cg_item); + if (!item) return 0; mutex_unlock(&device_mutex);
ret = iter->fn(dev, iter->data); + config_item_put(item);
mutex_lock(&device_mutex); return ret; @@ -895,7 +893,6 @@ int target_for_each_device(int (*fn)(struct se_device *dev, void *data), mutex_lock(&device_mutex); ret = idr_for_each(&devices_idr, target_devices_idr_iter, &iter); mutex_unlock(&device_mutex); - config_item_put(iter.prev_item); return ret; }
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Arnd Bergmann arnd@arndb.de
[ Upstream commit 91e326563ee34509c35267808a4b1b3ea3db62a8 ]
Changing the direct dependencies of IMA_BLACKLIST_KEYRING and IMA_LOAD_X509 caused them to no longer depend on IMA, but a a configuration without IMA results in link failures:
arm-linux-gnueabi-ld: security/integrity/iint.o: in function `integrity_load_keys': iint.c:(.init.text+0xd8): undefined reference to `ima_load_x509'
aarch64-linux-ld: security/integrity/digsig_asymmetric.o: in function `asymmetric_verify': digsig_asymmetric.c:(.text+0x104): undefined reference to `ima_blacklist_keyring'
Adding explicit dependencies on IMA would fix this, but a more reliable way to do this is to enclose the entire Kconfig file in an 'if IMA' block. This also allows removing the existing direct dependencies.
Fixes: be210c6d3597f ("ima: Finish deprecation of IMA_TRUSTED_KEYRING Kconfig") Signed-off-by: Arnd Bergmann arnd@arndb.de Signed-off-by: Mimi Zohar zohar@linux.ibm.com Signed-off-by: Sasha Levin sashal@kernel.org --- security/integrity/ima/Kconfig | 18 ++++++------------ 1 file changed, 6 insertions(+), 12 deletions(-)
diff --git a/security/integrity/ima/Kconfig b/security/integrity/ima/Kconfig index e6df7c930397c..6ef7bde551263 100644 --- a/security/integrity/ima/Kconfig +++ b/security/integrity/ima/Kconfig @@ -29,9 +29,11 @@ config IMA to learn more about IMA. If unsure, say N.
+if IMA + config IMA_KEXEC bool "Enable carrying the IMA measurement list across a soft boot" - depends on IMA && TCG_TPM && HAVE_IMA_KEXEC + depends on TCG_TPM && HAVE_IMA_KEXEC default n help TPM PCRs are only reset on a hard reboot. In order to validate @@ -43,7 +45,6 @@ config IMA_KEXEC
config IMA_MEASURE_PCR_IDX int - depends on IMA range 8 14 default 10 help @@ -53,7 +54,7 @@ config IMA_MEASURE_PCR_IDX
config IMA_LSM_RULES bool - depends on IMA && AUDIT && (SECURITY_SELINUX || SECURITY_SMACK || SECURITY_APPARMOR) + depends on AUDIT && (SECURITY_SELINUX || SECURITY_SMACK || SECURITY_APPARMOR) default y help Disabling this option will disregard LSM based policy rules. @@ -61,7 +62,6 @@ config IMA_LSM_RULES choice prompt "Default template" default IMA_NG_TEMPLATE - depends on IMA help Select the default IMA measurement template.
@@ -80,14 +80,12 @@ endchoice
config IMA_DEFAULT_TEMPLATE string - depends on IMA default "ima-ng" if IMA_NG_TEMPLATE default "ima-sig" if IMA_SIG_TEMPLATE
choice prompt "Default integrity hash algorithm" default IMA_DEFAULT_HASH_SHA1 - depends on IMA help Select the default hash algorithm used for the measurement list, integrity appraisal and audit log. The compiled default @@ -117,7 +115,6 @@ endchoice
config IMA_DEFAULT_HASH string - depends on IMA default "sha1" if IMA_DEFAULT_HASH_SHA1 default "sha256" if IMA_DEFAULT_HASH_SHA256 default "sha512" if IMA_DEFAULT_HASH_SHA512 @@ -126,7 +123,6 @@ config IMA_DEFAULT_HASH
config IMA_WRITE_POLICY bool "Enable multiple writes to the IMA policy" - depends on IMA default n help IMA policy can now be updated multiple times. The new rules get @@ -137,7 +133,6 @@ config IMA_WRITE_POLICY
config IMA_READ_POLICY bool "Enable reading back the current IMA policy" - depends on IMA default y if IMA_WRITE_POLICY default n if !IMA_WRITE_POLICY help @@ -147,7 +142,6 @@ config IMA_READ_POLICY
config IMA_APPRAISE bool "Appraise integrity measurements" - depends on IMA default n help This option enables local measurement integrity appraisal. @@ -303,7 +297,6 @@ config IMA_APPRAISE_SIGNED_INIT
config IMA_MEASURE_ASYMMETRIC_KEYS bool - depends on IMA depends on ASYMMETRIC_PUBLIC_KEY_SUBTYPE=y default y
@@ -322,7 +315,8 @@ config IMA_SECURE_AND_OR_TRUSTED_BOOT
config IMA_DISABLE_HTABLE bool "Disable htable to allow measurement of duplicate records" - depends on IMA default n help This option disables htable to allow measurement of duplicate records. + +endif
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Trond Myklebust trond.myklebust@hammerspace.com
[ Upstream commit ed1cc05aa1f7fe8197d300e914afc28ab9818f89 ]
If the NFS4CLNT_RUN_MANAGER flag got set just before we cleared NFS4CLNT_MANAGER_RUNNING, then we might have won the race against nfs4_schedule_state_manager(), and are responsible for handling the recovery situation.
Fixes: aeabb3c96186 ("NFSv4: Fix a NFSv4 state manager deadlock") Signed-off-by: Trond Myklebust trond.myklebust@hammerspace.com Signed-off-by: Anna Schumaker Anna.Schumaker@Netapp.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/nfs/nfs4state.c | 7 +++++++ 1 file changed, 7 insertions(+)
diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c index 597ae4535fe33..9a5d911a7edc7 100644 --- a/fs/nfs/nfs4state.c +++ b/fs/nfs/nfs4state.c @@ -2714,6 +2714,13 @@ static void nfs4_state_manager(struct nfs_client *clp) nfs4_end_drain_session(clp); nfs4_clear_state_manager_bit(clp);
+ if (test_bit(NFS4CLNT_RUN_MANAGER, &clp->cl_state) && + !test_and_set_bit(NFS4CLNT_MANAGER_RUNNING, + &clp->cl_state)) { + memflags = memalloc_nofs_save(); + continue; + } + if (!test_and_set_bit(NFS4CLNT_RECALL_RUNNING, &clp->cl_state)) { if (test_and_clear_bit(NFS4CLNT_DELEGRETURN, &clp->cl_state)) { nfs_client_return_marked_delegations(clp);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Michal Schmidt mschmidt@redhat.com
[ Upstream commit c070e51db5e2a98d3aef7c324b15209ba47f3dca ]
When the PF and VF drivers both support flexible rx descriptors and have negotiated the VIRTCHNL_VF_OFFLOAD_RX_FLEX_DESC capability, the VF driver queries the PF for the list of supported descriptor formats (VIRTCHNL_OP_GET_SUPPORTED_RXDIDS). The PF driver is supposed to set the supported_rxdids bits that correspond to the descriptor formats the firmware implements. The legacy 32-byte rx desc format is always supported, even though it is not expressed in GLFLXP_RXDID_FLAGS.
The ice driver does not advertise the legacy 32-byte rx desc support, which leads to this failure to bring up the VF using the Intel out-of-tree iavf driver: iavf 0000:41:01.0: PF does not list support for default Rx descriptor format ... iavf 0000:41:01.0: PF returned error -5 (VIRTCHNL_STATUS_ERR_PARAM) to our request 6
The in-tree iavf driver does not expose this bug, because it does not yet implement VIRTCHNL_VF_OFFLOAD_RX_FLEX_DESC.
The ice driver must always set the ICE_RXDID_LEGACY_1 bit in supported_rxdids. The Intel out-of-tree ice driver and the ice driver in DPDK both do this.
I copied this piece of the code and the comment text from the Intel out-of-tree driver.
Fixes: e753df8fbca5 ("ice: Add support Flex RXD") Signed-off-by: Michal Schmidt mschmidt@redhat.com Reviewed-by: Przemek Kitszel przemyslaw.kitszel@intel.com Link: https://lore.kernel.org/r/20230920115439.61172-1-mschmidt@redhat.com Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/ice/ice_virtchnl.c | 12 +++++++----- 1 file changed, 7 insertions(+), 5 deletions(-)
diff --git a/drivers/net/ethernet/intel/ice/ice_virtchnl.c b/drivers/net/ethernet/intel/ice/ice_virtchnl.c index dcf628b1fccd9..33ac6c4a8928f 100644 --- a/drivers/net/ethernet/intel/ice/ice_virtchnl.c +++ b/drivers/net/ethernet/intel/ice/ice_virtchnl.c @@ -2615,12 +2615,14 @@ static int ice_vc_query_rxdid(struct ice_vf *vf) goto err; }
- /* Read flexiflag registers to determine whether the - * corresponding RXDID is configured and supported or not. - * Since Legacy 16byte descriptor format is not supported, - * start from Legacy 32byte descriptor. + /* RXDIDs supported by DDP package can be read from the register + * to get the supported RXDID bitmap. But the legacy 32byte RXDID + * is not listed in DDP package, add it in the bitmap manually. + * Legacy 16byte descriptor is not supported. */ - for (i = ICE_RXDID_LEGACY_1; i < ICE_FLEX_DESC_RXDID_MAX_NUM; i++) { + rxdid->supported_rxdids |= BIT(ICE_RXDID_LEGACY_1); + + for (i = ICE_RXDID_FLEX_NIC; i < ICE_FLEX_DESC_RXDID_MAX_NUM; i++) { regval = rd32(hw, GLFLXP_RXDID_FLAGS(i, 0)); if ((regval >> GLFLXP_RXDID_FLAGS_FLEXIFLAG_4N_S) & GLFLXP_RXDID_FLAGS_FLEXIFLAG_4N_M)
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: John Fastabend john.fastabend@gmail.com
[ Upstream commit 9b7177b1df64b8d7f85700027c324aadd6aded00 ]
Before fix e5c6de5fa0258 tcp_read_skb() would increment the tp->copied-seq value. This (as described in the commit) would cause an error for apps because once that is incremented the application might believe there is no data to be read. Then some apps would stall or abort believing no data is available.
However, the fix is incomplete because it introduces another issue in the skb dequeue. The loop does tcp_recv_skb() in a while loop to consume as many skbs as possible. The problem is the call is ...
tcp_recv_skb(sk, seq, &offset)
... where 'seq' is:
u32 seq = tp->copied_seq;
Now we can hit a case where we've yet incremented copied_seq from BPF side, but then tcp_recv_skb() fails this test ...
if (offset < skb->len || (TCP_SKB_CB(skb)->tcp_flags & TCPHDR_FIN))
... so that instead of returning the skb we call tcp_eat_recv_skb() which frees the skb. This is because the routine believes the SKB has been collapsed per comment:
/* This looks weird, but this can happen if TCP collapsing * splitted a fat GRO packet, while we released socket lock * in skb_splice_bits() */
This can't happen here we've unlinked the full SKB and orphaned it. Anyways it would confuse any BPF programs if the data were suddenly moved underneath it.
To fix this situation do simpler operation and just skb_peek() the data of the queue followed by the unlink. It shouldn't need to check this condition and tcp_read_skb() reads entire skbs so there is no need to handle the 'offset!=0' case as we would see in tcp_read_sock().
Fixes: e5c6de5fa0258 ("bpf, sockmap: Incorrectly handling copied_seq") Fixes: 04919bed948dc ("tcp: Introduce tcp_read_skb()") Signed-off-by: John Fastabend john.fastabend@gmail.com Signed-off-by: Daniel Borkmann daniel@iogearbox.net Reviewed-by: Jakub Sitnicki jakub@cloudflare.com Link: https://lore.kernel.org/bpf/20230926035300.135096-2-john.fastabend@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/ipv4/tcp.c | 10 ++-------- 1 file changed, 2 insertions(+), 8 deletions(-)
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index 75f24b931a185..9cfc07d1e4252 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -1618,16 +1618,13 @@ EXPORT_SYMBOL(tcp_read_sock);
int tcp_read_skb(struct sock *sk, skb_read_actor_t recv_actor) { - struct tcp_sock *tp = tcp_sk(sk); - u32 seq = tp->copied_seq; struct sk_buff *skb; int copied = 0; - u32 offset;
if (sk->sk_state == TCP_LISTEN) return -ENOTCONN;
- while ((skb = tcp_recv_skb(sk, seq, &offset)) != NULL) { + while ((skb = skb_peek(&sk->sk_receive_queue)) != NULL) { u8 tcp_flags; int used;
@@ -1640,13 +1637,10 @@ int tcp_read_skb(struct sock *sk, skb_read_actor_t recv_actor) copied = used; break; } - seq += used; copied += used;
- if (tcp_flags & TCPHDR_FIN) { - ++seq; + if (tcp_flags & TCPHDR_FIN) break; - } } return copied; }
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: John Fastabend john.fastabend@gmail.com
[ Upstream commit da9e915eaf5dadb1963b7738cdfa42ed55212445 ]
When data is peek'd off the receive queue we shouldn't considered it copied from tcp_sock side. When we increment copied_seq this will confuse tcp_data_ready() because copied_seq can be arbitrarily increased. From application side it results in poll() operations not waking up when expected.
Notice tcp stack without BPF recvmsg programs also does not increment copied_seq.
We broke this when we moved copied_seq into recvmsg to only update when actual copy was happening. But, it wasn't working correctly either before because the tcp_data_ready() tried to use the copied_seq value to see if data was read by user yet. See fixes tags.
Fixes: e5c6de5fa0258 ("bpf, sockmap: Incorrectly handling copied_seq") Fixes: 04919bed948dc ("tcp: Introduce tcp_read_skb()") Signed-off-by: John Fastabend john.fastabend@gmail.com Signed-off-by: Daniel Borkmann daniel@iogearbox.net Reviewed-by: Jakub Sitnicki jakub@cloudflare.com Link: https://lore.kernel.org/bpf/20230926035300.135096-3-john.fastabend@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/ipv4/tcp_bpf.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c index 81f0dff69e0b6..3272682030015 100644 --- a/net/ipv4/tcp_bpf.c +++ b/net/ipv4/tcp_bpf.c @@ -222,6 +222,7 @@ static int tcp_bpf_recvmsg_parser(struct sock *sk, int *addr_len) { struct tcp_sock *tcp = tcp_sk(sk); + int peek = flags & MSG_PEEK; u32 seq = tcp->copied_seq; struct sk_psock *psock; int copied = 0; @@ -311,7 +312,8 @@ static int tcp_bpf_recvmsg_parser(struct sock *sk, copied = -EAGAIN; } out: - WRITE_ONCE(tcp->copied_seq, seq); + if (!peek) + WRITE_ONCE(tcp->copied_seq, seq); tcp_rcv_space_adjust(sk); if (copied > 0) __tcp_cleanup_rbuf(sk, copied);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jakub Sitnicki jakub@cloudflare.com
[ Upstream commit b80e31baa43614e086a9d29dc1151932b1bd7fc5 ]
With a SOCKMAP/SOCKHASH map and an sk_msg program user can steer messages sent from one TCP socket (s1) to actually egress from another TCP socket (s2):
tcp_bpf_sendmsg(s1) // = sk_prot->sendmsg tcp_bpf_send_verdict(s1) // __SK_REDIRECT case tcp_bpf_sendmsg_redir(s2) tcp_bpf_push_locked(s2) tcp_bpf_push(s2) tcp_rate_check_app_limited(s2) // expects tcp_sock tcp_sendmsg_locked(s2) // ditto
There is a hard-coded assumption in the call-chain, that the egress socket (s2) is a TCP socket.
However in commit 122e6c79efe1 ("sock_map: Update sock type checks for UDP") we have enabled redirects to non-TCP sockets. This was done for the sake of BPF sk_skb programs. There was no indention to support sk_msg send-to-egress use case.
As a result, attempts to send-to-egress through a non-TCP socket lead to a crash due to invalid downcast from sock to tcp_sock:
BUG: kernel NULL pointer dereference, address: 000000000000002f ... Call Trace: <TASK> ? show_regs+0x60/0x70 ? __die+0x1f/0x70 ? page_fault_oops+0x80/0x160 ? do_user_addr_fault+0x2d7/0x800 ? rcu_is_watching+0x11/0x50 ? exc_page_fault+0x70/0x1c0 ? asm_exc_page_fault+0x27/0x30 ? tcp_tso_segs+0x14/0xa0 tcp_write_xmit+0x67/0xce0 __tcp_push_pending_frames+0x32/0xf0 tcp_push+0x107/0x140 tcp_sendmsg_locked+0x99f/0xbb0 tcp_bpf_push+0x19d/0x3a0 tcp_bpf_sendmsg_redir+0x55/0xd0 tcp_bpf_send_verdict+0x407/0x550 tcp_bpf_sendmsg+0x1a1/0x390 inet_sendmsg+0x6a/0x70 sock_sendmsg+0x9d/0xc0 ? sockfd_lookup_light+0x12/0x80 __sys_sendto+0x10e/0x160 ? syscall_enter_from_user_mode+0x20/0x60 ? __this_cpu_preempt_check+0x13/0x20 ? lockdep_hardirqs_on+0x82/0x110 __x64_sys_sendto+0x1f/0x30 do_syscall_64+0x38/0x90 entry_SYSCALL_64_after_hwframe+0x63/0xcd
Reject selecting a non-TCP sockets as redirect target from a BPF sk_msg program to prevent the crash. When attempted, user will receive an EACCES error from send/sendto/sendmsg() syscall.
Fixes: 122e6c79efe1 ("sock_map: Update sock type checks for UDP") Signed-off-by: Jakub Sitnicki jakub@cloudflare.com Signed-off-by: Daniel Borkmann daniel@iogearbox.net Acked-by: John Fastabend john.fastabend@gmail.com Link: https://lore.kernel.org/bpf/20230920102055.42662-1-jakub@cloudflare.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/core/sock_map.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/net/core/sock_map.c b/net/core/sock_map.c index 8f07fea39d9ea..3fc4086a414ea 100644 --- a/net/core/sock_map.c +++ b/net/core/sock_map.c @@ -668,6 +668,8 @@ BPF_CALL_4(bpf_msg_redirect_map, struct sk_msg *, msg, sk = __sock_map_lookup_elem(map, key); if (unlikely(!sk || !sock_map_redirect_allowed(sk))) return SK_DROP; + if (!(flags & BPF_F_INGRESS) && !sk_is_tcp(sk)) + return SK_DROP;
msg->flags = flags; msg->sk_redir = sk; @@ -1267,6 +1269,8 @@ BPF_CALL_4(bpf_msg_redirect_hash, struct sk_msg *, msg, sk = __sock_hash_lookup_elem(map, key); if (unlikely(!sk || !sock_map_redirect_allowed(sk))) return SK_DROP; + if (!(flags & BPF_F_INGRESS) && !sk_is_tcp(sk)) + return SK_DROP;
msg->flags = flags; msg->sk_redir = sk;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mauricio Faria de Oliveira mfo@canonical.com
[ Upstream commit cbc3d00cf88fda95dbcafee3b38655b7a8f2650a ]
Without this 'else' statement, an "usb" name goes into two handlers: the first/previous 'if' statement _AND_ the for-loop over 'devtable', but the latter is useless as it has no 'usb' device_id entry anyway.
Tested with allmodconfig before/after patch; no changes to *.mod.c:
git checkout v6.6-rc3 make -j$(nproc) allmodconfig make -j$(nproc) olddefconfig
make -j$(nproc) find . -name '*.mod.c' | cpio -pd /tmp/before
# apply patch
make -j$(nproc) find . -name '*.mod.c' | cpio -pd /tmp/after
diff -r /tmp/before/ /tmp/after/ # no difference
Fixes: acbef7b76629 ("modpost: fix module autoloading for OF devices with generic compatible property") Signed-off-by: Mauricio Faria de Oliveira mfo@canonical.com Signed-off-by: Masahiro Yamada masahiroy@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- scripts/mod/file2alias.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/scripts/mod/file2alias.c b/scripts/mod/file2alias.c index 38120f932b0dc..7056751c29b1f 100644 --- a/scripts/mod/file2alias.c +++ b/scripts/mod/file2alias.c @@ -1604,7 +1604,7 @@ void handle_moddevtable(struct module *mod, struct elf_info *info, /* First handle the "special" cases */ if (sym_is(name, namelen, "usb")) do_usb_table(symval, sym->st_size, mod); - if (sym_is(name, namelen, "of")) + else if (sym_is(name, namelen, "of")) do_of_table(symval, sym->st_size, mod); else if (sym_is(name, namelen, "pnp")) do_pnp_device_entry(symval, sym->st_size, mod);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Clark Wang xiaoning.wang@nxp.com
[ Upstream commit 6b09edc1b31762af58d3d95754354ca6a92d39c0 ]
The second parameter of stmmac_pltfr_init() needs the pointer of "struct plat_stmmacenet_data". So, correct the parameter typo when calling the function.
Otherwise, it may cause this alignment exception when doing suspend/resume. [ 49.067201] CPU1 is up [ 49.135258] Internal error: SP/PC alignment exception: 000000008a000000 [#1] PREEMPT SMP [ 49.143346] Modules linked in: soc_imx9 crct10dif_ce polyval_ce nvmem_imx_ocotp_fsb_s400 polyval_generic layerscape_edac_mod snd_soc_fsl_asoc_card snd_soc_imx_audmux snd_soc_imx_card snd_soc_wm8962 el_enclave snd_soc_fsl_micfil rtc_pcf2127 rtc_pcf2131 flexcan can_dev snd_soc_fsl_xcvr snd_soc_fsl_sai imx8_media_dev(C) snd_soc_fsl_utils fuse [ 49.173393] CPU: 0 PID: 565 Comm: sh Tainted: G C 6.5.0-rc4-next-20230804-05047-g5781a6249dae #677 [ 49.183721] Hardware name: NXP i.MX93 11X11 EVK board (DT) [ 49.189190] pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 49.196140] pc : 0x80800052 [ 49.198931] lr : stmmac_pltfr_resume+0x34/0x50 [ 49.203368] sp : ffff800082f8bab0 [ 49.206670] x29: ffff800082f8bab0 x28: ffff0000047d0ec0 x27: ffff80008186c170 [ 49.213794] x26: 0000000b5e4ff1ba x25: ffff800081e5fa74 x24: 0000000000000010 [ 49.220918] x23: ffff800081fe0000 x22: 0000000000000000 x21: 0000000000000000 [ 49.228042] x20: ffff0000001b4010 x19: ffff0000001b4010 x18: 0000000000000006 [ 49.235166] x17: ffff7ffffe007000 x16: ffff800080000000 x15: 0000000000000000 [ 49.242290] x14: 00000000000000fc x13: 0000000000000000 x12: 0000000000000000 [ 49.249414] x11: 0000000000000001 x10: 0000000000000a60 x9 : ffff800082f8b8c0 [ 49.256538] x8 : 0000000000000008 x7 : 0000000000000001 x6 : 000000005f54a200 [ 49.263662] x5 : 0000000001000000 x4 : ffff800081b93680 x3 : ffff800081519be0 [ 49.270786] x2 : 0000000080800052 x1 : 0000000000000000 x0 : ffff0000001b4000 [ 49.277911] Call trace: [ 49.280346] 0x80800052 [ 49.282781] platform_pm_resume+0x2c/0x68 [ 49.286785] dpm_run_callback.constprop.0+0x74/0x134 [ 49.291742] device_resume+0x88/0x194 [ 49.295391] dpm_resume+0x10c/0x230 [ 49.298866] dpm_resume_end+0x18/0x30 [ 49.302515] suspend_devices_and_enter+0x2b8/0x624 [ 49.307299] pm_suspend+0x1fc/0x348 [ 49.310774] state_store+0x80/0x104 [ 49.314258] kobj_attr_store+0x18/0x2c [ 49.318002] sysfs_kf_write+0x44/0x54 [ 49.321659] kernfs_fop_write_iter+0x120/0x1ec [ 49.326088] vfs_write+0x1bc/0x300 [ 49.329485] ksys_write+0x70/0x104 [ 49.332874] __arm64_sys_write+0x1c/0x28 [ 49.336783] invoke_syscall+0x48/0x114 [ 49.340527] el0_svc_common.constprop.0+0xc4/0xe4 [ 49.345224] do_el0_svc+0x38/0x98 [ 49.348526] el0_svc+0x2c/0x84 [ 49.351568] el0t_64_sync_handler+0x100/0x12c [ 49.355910] el0t_64_sync+0x190/0x194 [ 49.359567] Code: ???????? ???????? ???????? ???????? (????????) [ 49.365644] ---[ end trace 0000000000000000 ]---
Fixes: 97117eb51ec8 ("net: stmmac: platform: provide stmmac_pltfr_init()") Signed-off-by: Clark Wang xiaoning.wang@nxp.com Reviewed-by: Jacob Keller jacob.e.keller@intel.com Reviewed-by: Serge Semin fancer.lancer@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c index 231152ee5a323..5a3bd30d6c220 100644 --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c @@ -901,7 +901,7 @@ static int __maybe_unused stmmac_pltfr_resume(struct device *dev) struct platform_device *pdev = to_platform_device(dev); int ret;
- ret = stmmac_pltfr_init(pdev, priv->plat->bsp_priv); + ret = stmmac_pltfr_init(pdev, priv->plat); if (ret) return ret;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Dumazet edumazet@google.com
[ Upstream commit 25563b581ba3a1f263a00e8c9a97f5e7363be6fd ]
While looking at a related syzbot report involving neigh_periodic_work(), I found that I forgot to add an annotation when deleting an RCU protected item from a list.
Readers use rcu_deference(*np), we need to use either rcu_assign_pointer() or WRITE_ONCE() on writer side to prevent store tearing.
I use rcu_assign_pointer() to have lockdep support, this was the choice made in neigh_flush_dev().
Fixes: 767e97e1e0db ("neigh: RCU conversion of struct neighbour") Signed-off-by: Eric Dumazet edumazet@google.com Reviewed-by: David Ahern dsahern@kernel.org Reviewed-by: Simon Horman horms@kernel.org Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/core/neighbour.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/net/core/neighbour.c b/net/core/neighbour.c index ddd0f32de20ef..b57d3ea3ccc9e 100644 --- a/net/core/neighbour.c +++ b/net/core/neighbour.c @@ -988,7 +988,9 @@ static void neigh_periodic_work(struct work_struct *work) (state == NUD_FAILED || !time_in_range_open(jiffies, n->used, n->used + NEIGH_VAR(n->parms, GC_STALETIME)))) { - *np = n->next; + rcu_assign_pointer(*np, + rcu_dereference_protected(n->next, + lockdep_is_held(&tbl->lock))); neigh_mark_dead(n); write_unlock(&n->lock); neigh_cleanup_and_release(n);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Dumazet edumazet@google.com
[ Upstream commit 5baa0433a15eadd729625004c37463acb982eca7 ]
n->output field can be read locklessly, while a writer might change the pointer concurrently.
Add missing annotations to prevent load-store tearing.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Eric Dumazet edumazet@google.com Reviewed-by: David Ahern dsahern@kernel.org Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- include/net/neighbour.h | 2 +- net/bridge/br_netfilter_hooks.c | 2 +- net/core/neighbour.c | 10 +++++----- 3 files changed, 7 insertions(+), 7 deletions(-)
diff --git a/include/net/neighbour.h b/include/net/neighbour.h index f6a8ecc6b1fa7..ccc4a0f8b4ad8 100644 --- a/include/net/neighbour.h +++ b/include/net/neighbour.h @@ -541,7 +541,7 @@ static inline int neigh_output(struct neighbour *n, struct sk_buff *skb, READ_ONCE(hh->hh_len)) return neigh_hh_output(hh, skb);
- return n->output(n, skb); + return READ_ONCE(n->output)(n, skb); }
static inline struct neighbour * diff --git a/net/bridge/br_netfilter_hooks.c b/net/bridge/br_netfilter_hooks.c index 1a801fab9543e..0be889905c2b6 100644 --- a/net/bridge/br_netfilter_hooks.c +++ b/net/bridge/br_netfilter_hooks.c @@ -294,7 +294,7 @@ int br_nf_pre_routing_finish_bridge(struct net *net, struct sock *sk, struct sk_ /* tell br_dev_xmit to continue with forwarding */ nf_bridge->bridged_dnat = 1; /* FIXME Need to refragment */ - ret = neigh->output(neigh, skb); + ret = READ_ONCE(neigh->output)(neigh, skb); } neigh_release(neigh); return ret; diff --git a/net/core/neighbour.c b/net/core/neighbour.c index b57d3ea3ccc9e..f16ec0e8a0348 100644 --- a/net/core/neighbour.c +++ b/net/core/neighbour.c @@ -410,7 +410,7 @@ static void neigh_flush_dev(struct neigh_table *tbl, struct net_device *dev, */ __skb_queue_purge(&n->arp_queue); n->arp_queue_len_bytes = 0; - n->output = neigh_blackhole; + WRITE_ONCE(n->output, neigh_blackhole); if (n->nud_state & NUD_VALID) n->nud_state = NUD_NOARP; else @@ -920,7 +920,7 @@ static void neigh_suspect(struct neighbour *neigh) { neigh_dbg(2, "neigh %p is suspected\n", neigh);
- neigh->output = neigh->ops->output; + WRITE_ONCE(neigh->output, neigh->ops->output); }
/* Neighbour state is OK; @@ -932,7 +932,7 @@ static void neigh_connect(struct neighbour *neigh) { neigh_dbg(2, "neigh %p is connected\n", neigh);
- neigh->output = neigh->ops->connected_output; + WRITE_ONCE(neigh->output, neigh->ops->connected_output); }
static void neigh_periodic_work(struct work_struct *work) @@ -1449,7 +1449,7 @@ static int __neigh_update(struct neighbour *neigh, const u8 *lladdr, if (n2) n1 = n2; } - n1->output(n1, skb); + READ_ONCE(n1->output)(n1, skb); if (n2) neigh_release(n2); rcu_read_unlock(); @@ -3155,7 +3155,7 @@ int neigh_xmit(int index, struct net_device *dev, rcu_read_unlock(); goto out_kfree_skb; } - err = neigh->output(neigh, skb); + err = READ_ONCE(neigh->output)(neigh, skb); rcu_read_unlock(); } else if (index == NEIGH_LINK_TABLE) {
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: David Howells dhowells@redhat.com
[ Upstream commit 9d4c75800f61e5d75c1659ba201b6c0c7ead3070 ]
Including the transhdrlen in length is a problem when the packet is partially filled (e.g. something like send(MSG_MORE) happened previously) when appending to an IPv4 or IPv6 packet as we don't want to repeat the transport header or account for it twice. This can happen under some circumstances, such as splicing into an L2TP socket.
The symptom observed is a warning in __ip6_append_data():
WARNING: CPU: 1 PID: 5042 at net/ipv6/ip6_output.c:1800 __ip6_append_data.isra.0+0x1be8/0x47f0 net/ipv6/ip6_output.c:1800
that occurs when MSG_SPLICE_PAGES is used to append more data to an already partially occupied skbuff. The warning occurs when 'copy' is larger than the amount of data in the message iterator. This is because the requested length includes the transport header length when it shouldn't. This can be triggered by, for example:
sfd = socket(AF_INET6, SOCK_DGRAM, IPPROTO_L2TP); bind(sfd, ...); // ::1 connect(sfd, ...); // ::1 port 7 send(sfd, buffer, 4100, MSG_MORE); sendfile(sfd, dfd, NULL, 1024);
Fix this by only adding transhdrlen into the length if the write queue is empty in l2tp_ip6_sendmsg(), analogously to how UDP does things.
l2tp_ip_sendmsg() looks like it won't suffer from this problem as it builds the UDP packet itself.
Fixes: a32e0eec7042 ("l2tp: introduce L2TPv3 IP encapsulation support for IPv6") Reported-by: syzbot+62cbf263225ae13ff153@syzkaller.appspotmail.com Link: https://lore.kernel.org/r/0000000000001c12b30605378ce8@google.com/ Suggested-by: Willem de Bruijn willemdebruijn.kernel@gmail.com Signed-off-by: David Howells dhowells@redhat.com cc: Eric Dumazet edumazet@google.com cc: Willem de Bruijn willemdebruijn.kernel@gmail.com cc: "David S. Miller" davem@davemloft.net cc: David Ahern dsahern@kernel.org cc: Paolo Abeni pabeni@redhat.com cc: Jakub Kicinski kuba@kernel.org cc: netdev@vger.kernel.org cc: bpf@vger.kernel.org cc: syzkaller-bugs@googlegroups.com Reviewed-by: Eric Dumazet edumazet@google.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/l2tp/l2tp_ip6.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/l2tp/l2tp_ip6.c b/net/l2tp/l2tp_ip6.c index ed8ebb6f59097..11f3d375cec00 100644 --- a/net/l2tp/l2tp_ip6.c +++ b/net/l2tp/l2tp_ip6.c @@ -507,7 +507,6 @@ static int l2tp_ip6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len) */ if (len > INT_MAX - transhdrlen) return -EMSGSIZE; - ulen = len + transhdrlen;
/* Mirror BSD error message compatibility */ if (msg->msg_flags & MSG_OOB) @@ -628,6 +627,7 @@ static int l2tp_ip6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
back_from_confirm: lock_sock(sk); + ulen = len + skb_queue_empty(&sk->sk_write_queue) ? transhdrlen : 0; err = ip6_append_data(sk, ip_generic_getfrag, msg, ulen, transhdrlen, &ipc6, &fl6, (struct rt6_info *)dst,
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dinghao Liu dinghao.liu@zju.edu.cn
[ Upstream commit caa0578c1d487d39e4bb947a1b4965417053b409 ]
When device_add() fails, ptp_ocp_dev_release() will be called after put_device(). Therefore, it seems that the ptp_ocp_dev_release() before put_device() is redundant.
Fixes: 773bda964921 ("ptp: ocp: Expose various resources on the timecard.") Signed-off-by: Dinghao Liu dinghao.liu@zju.edu.cn Reviewed-by: Vadim Feodrenko vadim.fedorenko@linux.dev Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/ptp/ptp_ocp.c | 1 - 1 file changed, 1 deletion(-)
diff --git a/drivers/ptp/ptp_ocp.c b/drivers/ptp/ptp_ocp.c index 20a974ced8d6c..a7a6947ab4bc5 100644 --- a/drivers/ptp/ptp_ocp.c +++ b/drivers/ptp/ptp_ocp.c @@ -3998,7 +3998,6 @@ ptp_ocp_device_init(struct ptp_ocp *bp, struct pci_dev *pdev) return 0;
out: - ptp_ocp_dev_release(&bp->dev); put_device(&bp->dev); return err; }
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Fabio Estevam festevam@denx.de
[ Upstream commit 6ccf50d4d4741e064ba35511a95402c63bbe21a8 ]
Since commit 23d775f12dcd ("net: dsa: mv88e6xxx: Wait for EEPROM done before HW reset") the following error is seen on a imx8mn board with a 88E6320 switch:
mv88e6085 30be0000.ethernet-1:00: Timeout waiting for EEPROM done
This board does not have an EEPROM attached to the switch though.
This problem is well explained by Andrew Lunn:
"If there is an EEPROM, and the EEPROM contains a lot of data, it could be that when we perform a hardware reset towards the end of probe, it interrupts an I2C bus transaction, leaving the I2C bus in a bad state, and future reads of the EEPROM do not work.
The work around for this was to poll the EEInt status and wait for it to go true before performing the hardware reset.
However, we have discovered that for some boards which do not have an EEPROM, EEInt never indicates complete. As a result, mv88e6xxx_g1_wait_eeprom_done() spins for a second and then prints a warning.
We probably need a different solution than calling mv88e6xxx_g1_wait_eeprom_done(). The datasheet for 6352 documents the EEPROM Command register:
bit 15 is:
EEPROM Unit Busy. This bit must be set to a one to start an EEPROM operation (see EEOp below). Only one EEPROM operation can be executing at one time so this bit must be zero before setting it to a one. When the requested EEPROM operation completes this bit will automatically be cleared to a zero. The transition of this bit from a one to a zero can be used to generate an interrupt (the EEInt in Global 1, offset 0x00).
and more interesting is bit 11:
Register Loader Running. This bit is set to one whenever the register loader is busy executing instructions contained in the EEPROM."
Change to using mv88e6xxx_g2_eeprom_wait() to fix the timeout error when the EEPROM chip is not present.
Fixes: 23d775f12dcd ("net: dsa: mv88e6xxx: Wait for EEPROM done before HW reset") Suggested-by: Andrew Lunn andrew@lunn.ch Signed-off-by: Fabio Estevam festevam@denx.de Reviewed-by: Florian Fainelli florian.fainelli@broadcom.com Reviewed-by: Andrew Lunn andrew@lunn.ch Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/dsa/mv88e6xxx/chip.c | 6 ++++-- drivers/net/dsa/mv88e6xxx/global1.c | 31 ----------------------------- drivers/net/dsa/mv88e6xxx/global1.h | 1 - drivers/net/dsa/mv88e6xxx/global2.c | 2 +- drivers/net/dsa/mv88e6xxx/global2.h | 1 + 5 files changed, 6 insertions(+), 35 deletions(-)
diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c index 7af2f08a62f14..0d4b236d1e344 100644 --- a/drivers/net/dsa/mv88e6xxx/chip.c +++ b/drivers/net/dsa/mv88e6xxx/chip.c @@ -3040,14 +3040,16 @@ static void mv88e6xxx_hardware_reset(struct mv88e6xxx_chip *chip) * from the wrong location resulting in the switch booting * to wrong mode and inoperable. */ - mv88e6xxx_g1_wait_eeprom_done(chip); + if (chip->info->ops->get_eeprom) + mv88e6xxx_g2_eeprom_wait(chip);
gpiod_set_value_cansleep(gpiod, 1); usleep_range(10000, 20000); gpiod_set_value_cansleep(gpiod, 0); usleep_range(10000, 20000);
- mv88e6xxx_g1_wait_eeprom_done(chip); + if (chip->info->ops->get_eeprom) + mv88e6xxx_g2_eeprom_wait(chip); } }
diff --git a/drivers/net/dsa/mv88e6xxx/global1.c b/drivers/net/dsa/mv88e6xxx/global1.c index 2fa55a6435910..174c773b38c2b 100644 --- a/drivers/net/dsa/mv88e6xxx/global1.c +++ b/drivers/net/dsa/mv88e6xxx/global1.c @@ -75,37 +75,6 @@ static int mv88e6xxx_g1_wait_init_ready(struct mv88e6xxx_chip *chip) return mv88e6xxx_g1_wait_bit(chip, MV88E6XXX_G1_STS, bit, 1); }
-void mv88e6xxx_g1_wait_eeprom_done(struct mv88e6xxx_chip *chip) -{ - const unsigned long timeout = jiffies + 1 * HZ; - u16 val; - int err; - - /* Wait up to 1 second for the switch to finish reading the - * EEPROM. - */ - while (time_before(jiffies, timeout)) { - err = mv88e6xxx_g1_read(chip, MV88E6XXX_G1_STS, &val); - if (err) { - dev_err(chip->dev, "Error reading status"); - return; - } - - /* If the switch is still resetting, it may not - * respond on the bus, and so MDIO read returns - * 0xffff. Differentiate between that, and waiting for - * the EEPROM to be done by bit 0 being set. - */ - if (val != 0xffff && - val & BIT(MV88E6XXX_G1_STS_IRQ_EEPROM_DONE)) - return; - - usleep_range(1000, 2000); - } - - dev_err(chip->dev, "Timeout waiting for EEPROM done"); -} - /* Offset 0x01: Switch MAC Address Register Bytes 0 & 1 * Offset 0x02: Switch MAC Address Register Bytes 2 & 3 * Offset 0x03: Switch MAC Address Register Bytes 4 & 5 diff --git a/drivers/net/dsa/mv88e6xxx/global1.h b/drivers/net/dsa/mv88e6xxx/global1.h index c99ddd117fe6e..1095261f5b490 100644 --- a/drivers/net/dsa/mv88e6xxx/global1.h +++ b/drivers/net/dsa/mv88e6xxx/global1.h @@ -282,7 +282,6 @@ int mv88e6xxx_g1_set_switch_mac(struct mv88e6xxx_chip *chip, u8 *addr); int mv88e6185_g1_reset(struct mv88e6xxx_chip *chip); int mv88e6352_g1_reset(struct mv88e6xxx_chip *chip); int mv88e6250_g1_reset(struct mv88e6xxx_chip *chip); -void mv88e6xxx_g1_wait_eeprom_done(struct mv88e6xxx_chip *chip);
int mv88e6185_g1_ppu_enable(struct mv88e6xxx_chip *chip); int mv88e6185_g1_ppu_disable(struct mv88e6xxx_chip *chip); diff --git a/drivers/net/dsa/mv88e6xxx/global2.c b/drivers/net/dsa/mv88e6xxx/global2.c index 937a01f2ba75e..b2b5f6ba438f4 100644 --- a/drivers/net/dsa/mv88e6xxx/global2.c +++ b/drivers/net/dsa/mv88e6xxx/global2.c @@ -340,7 +340,7 @@ int mv88e6xxx_g2_pot_clear(struct mv88e6xxx_chip *chip) * Offset 0x15: EEPROM Addr (for 8-bit data access) */
-static int mv88e6xxx_g2_eeprom_wait(struct mv88e6xxx_chip *chip) +int mv88e6xxx_g2_eeprom_wait(struct mv88e6xxx_chip *chip) { int bit = __bf_shf(MV88E6XXX_G2_EEPROM_CMD_BUSY); int err; diff --git a/drivers/net/dsa/mv88e6xxx/global2.h b/drivers/net/dsa/mv88e6xxx/global2.h index 7e091965582b7..d9434f7cae538 100644 --- a/drivers/net/dsa/mv88e6xxx/global2.h +++ b/drivers/net/dsa/mv88e6xxx/global2.h @@ -365,6 +365,7 @@ int mv88e6xxx_g2_trunk_clear(struct mv88e6xxx_chip *chip);
int mv88e6xxx_g2_device_mapping_write(struct mv88e6xxx_chip *chip, int target, int port); +int mv88e6xxx_g2_eeprom_wait(struct mv88e6xxx_chip *chip);
extern const struct mv88e6xxx_irq_ops mv88e6097_watchdog_ops; extern const struct mv88e6xxx_irq_ops mv88e6250_watchdog_ops;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Al Viro viro@zeniv.linux.org.uk
[ Upstream commit d9e8319a6e3538b430f692b5625a76ffa0758adc ]
... into ->free_inode(), that is.
Fixes: 0af950f57fef "ovl: move ovl_entry into ovl_inode" Signed-off-by: Al Viro viro@zeniv.linux.org.uk Signed-off-by: Amir Goldstein amir73il@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/overlayfs/super.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c index cc8977498c483..8e9c1cf83df24 100644 --- a/fs/overlayfs/super.c +++ b/fs/overlayfs/super.c @@ -164,6 +164,7 @@ static void ovl_free_inode(struct inode *inode) struct ovl_inode *oi = OVL_I(inode);
kfree(oi->redirect); + kfree(oi->oe); mutex_destroy(&oi->lock); kmem_cache_free(ovl_inode_cachep, oi); } @@ -173,7 +174,7 @@ static void ovl_destroy_inode(struct inode *inode) struct ovl_inode *oi = OVL_I(inode);
dput(oi->__upperdentry); - ovl_free_entry(oi->oe); + ovl_stack_put(ovl_lowerstack(oi->oe), ovl_numlower(oi->oe)); if (S_ISDIR(inode->i_mode)) ovl_dir_cache_free(inode); else
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Al Viro viro@zeniv.linux.org.uk
[ Upstream commit c54719c92aa3129f330cce81b88cf34f1627f756 ]
d_inode_rcu() is right - we might be in rcu pathwalk; however, OVL_E() hides plain d_inode() on the same dentry...
Fixes: a6ff2bc0be17 ("ovl: use OVL_E() and OVL_E_FLAGS() accessors") Signed-off-by: Al Viro viro@zeniv.linux.org.uk Signed-off-by: Amir Goldstein amir73il@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/overlayfs/super.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c index 8e9c1cf83df24..1090c68e5b051 100644 --- a/fs/overlayfs/super.c +++ b/fs/overlayfs/super.c @@ -101,8 +101,8 @@ static int ovl_revalidate_real(struct dentry *d, unsigned int flags, bool weak) static int ovl_dentry_revalidate_common(struct dentry *dentry, unsigned int flags, bool weak) { - struct ovl_entry *oe = OVL_E(dentry); - struct ovl_path *lowerstack = ovl_lowerstack(oe); + struct ovl_entry *oe; + struct ovl_path *lowerstack; struct inode *inode = d_inode_rcu(dentry); struct dentry *upper; unsigned int i; @@ -112,6 +112,8 @@ static int ovl_dentry_revalidate_common(struct dentry *dentry, if (!inode) return -ECHILD;
+ oe = OVL_I_E(inode); + lowerstack = ovl_lowerstack(oe); upper = ovl_i_dentry_upper(inode); if (upper) ret = ovl_revalidate_real(upper, flags, weak);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ilya Maximets i.maximets@ovn.org
[ Upstream commit 9593c7cb6cf670ef724d17f7f9affd7a8d2ad0c5 ]
Commit b0e214d21203 ("netfilter: keep conntrack reference until IPsecv6 policy checks are done") is a direct copy of the old commit b59c270104f0 ("[NETFILTER]: Keep conntrack reference until IPsec policy checks are done") but for IPv6. However, it also copies a bug that this old commit had. That is: when the third packet of 3WHS connection establishment contains payload, it is added into socket receive queue without the XFRM check and the drop of connection tracking context.
That leads to nf_conntrack module being impossible to unload as it waits for all the conntrack references to be dropped while the packet release is deferred in per-cpu cache indefinitely, if not consumed by the application.
The issue for IPv4 was fixed in commit 6f0012e35160 ("tcp: add a missing nf_reset_ct() in 3WHS handling") by adding a missing XFRM check and correctly dropping the conntrack context. However, the issue was introduced to IPv6 code afterwards. Fixing it the same way for IPv6 now.
Fixes: b0e214d21203 ("netfilter: keep conntrack reference until IPsecv6 policy checks are done") Link: https://lore.kernel.org/netdev/d589a999-d4dd-2768-b2d5-89dec64a4a42@ovn.org/ Signed-off-by: Ilya Maximets i.maximets@ovn.org Acked-by: Florian Westphal fw@strlen.de Reviewed-by: Eric Dumazet edumazet@google.com Link: https://lore.kernel.org/r/20230922210530.2045146-1-i.maximets@ovn.org Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/ipv6/tcp_ipv6.c | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-)
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c index 3a88545a265d6..44b6949d72b22 100644 --- a/net/ipv6/tcp_ipv6.c +++ b/net/ipv6/tcp_ipv6.c @@ -1640,9 +1640,12 @@ INDIRECT_CALLABLE_SCOPE int tcp_v6_rcv(struct sk_buff *skb) struct sock *nsk;
sk = req->rsk_listener; - drop_reason = tcp_inbound_md5_hash(sk, skb, - &hdr->saddr, &hdr->daddr, - AF_INET6, dif, sdif); + if (!xfrm6_policy_check(sk, XFRM_POLICY_IN, skb)) + drop_reason = SKB_DROP_REASON_XFRM_POLICY; + else + drop_reason = tcp_inbound_md5_hash(sk, skb, + &hdr->saddr, &hdr->daddr, + AF_INET6, dif, sdif); if (drop_reason) { sk_drops_add(sk, skb); reqsk_put(req); @@ -1689,6 +1692,7 @@ INDIRECT_CALLABLE_SCOPE int tcp_v6_rcv(struct sk_buff *skb) } goto discard_and_relse; } + nf_reset_ct(skb); if (nsk == sk) { reqsk_put(req); tcp_v6_restore_cb(skb);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Shigeru Yoshida syoshida@redhat.com
[ Upstream commit e9c65989920f7c28775ec4e0c11b483910fb67b8 ]
syzbot reported the following uninit-value access issue:
===================================================== BUG: KMSAN: uninit-value in smsc75xx_wait_ready drivers/net/usb/smsc75xx.c:975 [inline] BUG: KMSAN: uninit-value in smsc75xx_bind+0x5c9/0x11e0 drivers/net/usb/smsc75xx.c:1482 CPU: 0 PID: 8696 Comm: kworker/0:3 Not tainted 5.8.0-rc5-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Workqueue: usb_hub_wq hub_event Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x21c/0x280 lib/dump_stack.c:118 kmsan_report+0xf7/0x1e0 mm/kmsan/kmsan_report.c:121 __msan_warning+0x58/0xa0 mm/kmsan/kmsan_instr.c:215 smsc75xx_wait_ready drivers/net/usb/smsc75xx.c:975 [inline] smsc75xx_bind+0x5c9/0x11e0 drivers/net/usb/smsc75xx.c:1482 usbnet_probe+0x1152/0x3f90 drivers/net/usb/usbnet.c:1737 usb_probe_interface+0xece/0x1550 drivers/usb/core/driver.c:374 really_probe+0xf20/0x20b0 drivers/base/dd.c:529 driver_probe_device+0x293/0x390 drivers/base/dd.c:701 __device_attach_driver+0x63f/0x830 drivers/base/dd.c:807 bus_for_each_drv+0x2ca/0x3f0 drivers/base/bus.c:431 __device_attach+0x4e2/0x7f0 drivers/base/dd.c:873 device_initial_probe+0x4a/0x60 drivers/base/dd.c:920 bus_probe_device+0x177/0x3d0 drivers/base/bus.c:491 device_add+0x3b0e/0x40d0 drivers/base/core.c:2680 usb_set_configuration+0x380f/0x3f10 drivers/usb/core/message.c:2032 usb_generic_driver_probe+0x138/0x300 drivers/usb/core/generic.c:241 usb_probe_device+0x311/0x490 drivers/usb/core/driver.c:272 really_probe+0xf20/0x20b0 drivers/base/dd.c:529 driver_probe_device+0x293/0x390 drivers/base/dd.c:701 __device_attach_driver+0x63f/0x830 drivers/base/dd.c:807 bus_for_each_drv+0x2ca/0x3f0 drivers/base/bus.c:431 __device_attach+0x4e2/0x7f0 drivers/base/dd.c:873 device_initial_probe+0x4a/0x60 drivers/base/dd.c:920 bus_probe_device+0x177/0x3d0 drivers/base/bus.c:491 device_add+0x3b0e/0x40d0 drivers/base/core.c:2680 usb_new_device+0x1bd4/0x2a30 drivers/usb/core/hub.c:2554 hub_port_connect drivers/usb/core/hub.c:5208 [inline] hub_port_connect_change drivers/usb/core/hub.c:5348 [inline] port_event drivers/usb/core/hub.c:5494 [inline] hub_event+0x5e7b/0x8a70 drivers/usb/core/hub.c:5576 process_one_work+0x1688/0x2140 kernel/workqueue.c:2269 worker_thread+0x10bc/0x2730 kernel/workqueue.c:2415 kthread+0x551/0x590 kernel/kthread.c:292 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:293
Local variable ----buf.i87@smsc75xx_bind created at: __smsc75xx_read_reg drivers/net/usb/smsc75xx.c:83 [inline] smsc75xx_wait_ready drivers/net/usb/smsc75xx.c:968 [inline] smsc75xx_bind+0x485/0x11e0 drivers/net/usb/smsc75xx.c:1482 __smsc75xx_read_reg drivers/net/usb/smsc75xx.c:83 [inline] smsc75xx_wait_ready drivers/net/usb/smsc75xx.c:968 [inline] smsc75xx_bind+0x485/0x11e0 drivers/net/usb/smsc75xx.c:1482
This issue is caused because usbnet_read_cmd() reads less bytes than requested (zero byte in the reproducer). In this case, 'buf' is not properly filled.
This patch fixes the issue by returning -ENODATA if usbnet_read_cmd() reads less bytes than requested.
Fixes: d0cad871703b ("smsc75xx: SMSC LAN75xx USB gigabit ethernet adapter driver") Reported-and-tested-by: syzbot+6966546b78d050bb0b5d@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=6966546b78d050bb0b5d Signed-off-by: Shigeru Yoshida syoshida@redhat.com Reviewed-by: Simon Horman horms@kernel.org Link: https://lore.kernel.org/r/20230923173549.3284502-1-syoshida@redhat.com Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/usb/smsc75xx.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/net/usb/smsc75xx.c b/drivers/net/usb/smsc75xx.c index 5d6454fedb3f1..78ad2da3ee29b 100644 --- a/drivers/net/usb/smsc75xx.c +++ b/drivers/net/usb/smsc75xx.c @@ -90,7 +90,9 @@ static int __must_check __smsc75xx_read_reg(struct usbnet *dev, u32 index, ret = fn(dev, USB_VENDOR_REQUEST_READ_REGISTER, USB_DIR_IN | USB_TYPE_VENDOR | USB_RECIP_DEVICE, 0, index, &buf, 4); - if (unlikely(ret < 0)) { + if (unlikely(ret < 4)) { + ret = ret < 0 ? ret : -ENODATA; + netdev_warn(dev->net, "Failed to read reg index 0x%08x: %d\n", index, ret); return ret;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Parthiban Veerasooran Parthiban.Veerasooran@microchip.com
[ Upstream commit 8957261cd8149ed9d0738c01c0320bcbff989407 ]
The ETHTOOL_A_PLCA_ENABLED data type is u8. But while parsing the value from the attribute, nla_get_u32() is used in the plca_update_sint() function instead of nla_get_u8(). So plca_cfg.enabled variable is updated with some garbage value instead of 0 or 1 and always enables plca even though plca is disabled through ethtool application. This bug has been fixed by parsing the values based on the attributes type in the policy.
Fixes: 8580e16c28f3 ("net/ethtool: add netlink interface for the PLCA RS") Signed-off-by: Parthiban Veerasooran Parthiban.Veerasooran@microchip.com Reviewed-by: Andrew Lunn andrew@lunn.ch Link: https://lore.kernel.org/r/20230908044548.5878-1-Parthiban.Veerasooran@microc... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/ethtool/plca.c | 45 +++++++++++++++++++++++++++++---------------- 1 file changed, 29 insertions(+), 16 deletions(-)
diff --git a/net/ethtool/plca.c b/net/ethtool/plca.c index 5a8cab4df0c9c..a9334937ace26 100644 --- a/net/ethtool/plca.c +++ b/net/ethtool/plca.c @@ -21,16 +21,6 @@ struct plca_reply_data { #define PLCA_REPDATA(__reply_base) \ container_of(__reply_base, struct plca_reply_data, base)
-static void plca_update_sint(int *dst, const struct nlattr *attr, - bool *mod) -{ - if (!attr) - return; - - *dst = nla_get_u32(attr); - *mod = true; -} - // PLCA get configuration message ------------------------------------------- //
const struct nla_policy ethnl_plca_get_cfg_policy[] = { @@ -38,6 +28,29 @@ const struct nla_policy ethnl_plca_get_cfg_policy[] = { NLA_POLICY_NESTED(ethnl_header_policy), };
+static void plca_update_sint(int *dst, struct nlattr **tb, u32 attrid, + bool *mod) +{ + const struct nlattr *attr = tb[attrid]; + + if (!attr || + WARN_ON_ONCE(attrid >= ARRAY_SIZE(ethnl_plca_set_cfg_policy))) + return; + + switch (ethnl_plca_set_cfg_policy[attrid].type) { + case NLA_U8: + *dst = nla_get_u8(attr); + break; + case NLA_U32: + *dst = nla_get_u32(attr); + break; + default: + WARN_ON_ONCE(1); + } + + *mod = true; +} + static int plca_get_cfg_prepare_data(const struct ethnl_req_info *req_base, struct ethnl_reply_data *reply_base, struct genl_info *info) @@ -144,13 +157,13 @@ ethnl_set_plca(struct ethnl_req_info *req_info, struct genl_info *info) return -EOPNOTSUPP;
memset(&plca_cfg, 0xff, sizeof(plca_cfg)); - plca_update_sint(&plca_cfg.enabled, tb[ETHTOOL_A_PLCA_ENABLED], &mod); - plca_update_sint(&plca_cfg.node_id, tb[ETHTOOL_A_PLCA_NODE_ID], &mod); - plca_update_sint(&plca_cfg.node_cnt, tb[ETHTOOL_A_PLCA_NODE_CNT], &mod); - plca_update_sint(&plca_cfg.to_tmr, tb[ETHTOOL_A_PLCA_TO_TMR], &mod); - plca_update_sint(&plca_cfg.burst_cnt, tb[ETHTOOL_A_PLCA_BURST_CNT], + plca_update_sint(&plca_cfg.enabled, tb, ETHTOOL_A_PLCA_ENABLED, &mod); + plca_update_sint(&plca_cfg.node_id, tb, ETHTOOL_A_PLCA_NODE_ID, &mod); + plca_update_sint(&plca_cfg.node_cnt, tb, ETHTOOL_A_PLCA_NODE_CNT, &mod); + plca_update_sint(&plca_cfg.to_tmr, tb, ETHTOOL_A_PLCA_TO_TMR, &mod); + plca_update_sint(&plca_cfg.burst_cnt, tb, ETHTOOL_A_PLCA_BURST_CNT, &mod); - plca_update_sint(&plca_cfg.burst_tmr, tb[ETHTOOL_A_PLCA_BURST_TMR], + plca_update_sint(&plca_cfg.burst_tmr, tb, ETHTOOL_A_PLCA_BURST_TMR, &mod); if (!mod) return 0;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jeremy Cline jeremy@jcline.org
[ Upstream commit dfc7f7a988dad34c3bf4c053124fb26aa6c5f916 ]
The device list needs its associated lock held when modifying it, or the list could become corrupted, as syzbot discovered.
Reported-and-tested-by: syzbot+c1d0a03d305972dbbe14@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=c1d0a03d305972dbbe14 Signed-off-by: Jeremy Cline jeremy@jcline.org Reviewed-by: Simon Horman horms@kernel.org Fixes: 6709d4b7bc2e ("net: nfc: Fix use-after-free caused by nfc_llcp_find_local") Link: https://lore.kernel.org/r/20230908235853.1319596-1-jeremy@jcline.org Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/nfc/llcp_core.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/net/nfc/llcp_core.c b/net/nfc/llcp_core.c index f60e424e06076..6705bb895e239 100644 --- a/net/nfc/llcp_core.c +++ b/net/nfc/llcp_core.c @@ -1636,7 +1636,9 @@ int nfc_llcp_register_device(struct nfc_dev *ndev) timer_setup(&local->sdreq_timer, nfc_llcp_sdreq_timer, 0); INIT_WORK(&local->sdreq_timeout_work, nfc_llcp_sdreq_timeout_work);
+ spin_lock(&llcp_devices_lock); list_add(&local->list, &llcp_devices); + spin_unlock(&llcp_devices_lock);
return 0; }
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yoshihiro Shimoda yoshihiro.shimoda.uh@renesas.com
[ Upstream commit c4f922e86c8e0f7c5fe94e0547e9835fc9711f08 ]
Add spin lock protection for irq {un}mask registers' control.
After napi_complete_done() and this protection were applied, a lot of redundant interrupts no longer occur.
For example: when "iperf3 -c <ipaddr> -R" on R-Car S4-8 Spider Before the patches are applied: about 800,000 times happened After the patches were applied: about 100,000 times happened
Fixes: 3590918b5d07 ("net: ethernet: renesas: Add support for "Ethernet Switch"") Signed-off-by: Yoshihiro Shimoda yoshihiro.shimoda.uh@renesas.com Reviewed-by: Simon Horman horms@kernel.org Signed-off-by: Paolo Abeni pabeni@redhat.com Stable-dep-of: a0c55bba0d0d ("rswitch: Fix PHY station management clock setting") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/renesas/rswitch.c | 12 ++++++++++++ drivers/net/ethernet/renesas/rswitch.h | 2 ++ 2 files changed, 14 insertions(+)
diff --git a/drivers/net/ethernet/renesas/rswitch.c b/drivers/net/ethernet/renesas/rswitch.c index 449ed1f5624c9..215854812f80a 100644 --- a/drivers/net/ethernet/renesas/rswitch.c +++ b/drivers/net/ethernet/renesas/rswitch.c @@ -799,6 +799,7 @@ static int rswitch_poll(struct napi_struct *napi, int budget) struct net_device *ndev = napi->dev; struct rswitch_private *priv; struct rswitch_device *rdev; + unsigned long flags; int quota = budget;
rdev = netdev_priv(ndev); @@ -817,8 +818,10 @@ static int rswitch_poll(struct napi_struct *napi, int budget) netif_wake_subqueue(ndev, 0);
if (napi_complete_done(napi, budget - quota)) { + spin_lock_irqsave(&priv->lock, flags); rswitch_enadis_data_irq(priv, rdev->tx_queue->index, true); rswitch_enadis_data_irq(priv, rdev->rx_queue->index, true); + spin_unlock_irqrestore(&priv->lock, flags); }
out: @@ -835,8 +838,10 @@ static void rswitch_queue_interrupt(struct net_device *ndev) struct rswitch_device *rdev = netdev_priv(ndev);
if (napi_schedule_prep(&rdev->napi)) { + spin_lock(&rdev->priv->lock); rswitch_enadis_data_irq(rdev->priv, rdev->tx_queue->index, false); rswitch_enadis_data_irq(rdev->priv, rdev->rx_queue->index, false); + spin_unlock(&rdev->priv->lock); __napi_schedule(&rdev->napi); } } @@ -1430,14 +1435,17 @@ static void rswitch_ether_port_deinit_all(struct rswitch_private *priv) static int rswitch_open(struct net_device *ndev) { struct rswitch_device *rdev = netdev_priv(ndev); + unsigned long flags;
phy_start(ndev->phydev);
napi_enable(&rdev->napi); netif_start_queue(ndev);
+ spin_lock_irqsave(&rdev->priv->lock, flags); rswitch_enadis_data_irq(rdev->priv, rdev->tx_queue->index, true); rswitch_enadis_data_irq(rdev->priv, rdev->rx_queue->index, true); + spin_unlock_irqrestore(&rdev->priv->lock, flags);
if (bitmap_empty(rdev->priv->opened_ports, RSWITCH_NUM_PORTS)) iowrite32(GWCA_TS_IRQ_BIT, rdev->priv->addr + GWTSDIE); @@ -1451,6 +1459,7 @@ static int rswitch_stop(struct net_device *ndev) { struct rswitch_device *rdev = netdev_priv(ndev); struct rswitch_gwca_ts_info *ts_info, *ts_info2; + unsigned long flags;
netif_tx_stop_all_queues(ndev); bitmap_clear(rdev->priv->opened_ports, rdev->port, 1); @@ -1466,8 +1475,10 @@ static int rswitch_stop(struct net_device *ndev) kfree(ts_info); }
+ spin_lock_irqsave(&rdev->priv->lock, flags); rswitch_enadis_data_irq(rdev->priv, rdev->tx_queue->index, false); rswitch_enadis_data_irq(rdev->priv, rdev->rx_queue->index, false); + spin_unlock_irqrestore(&rdev->priv->lock, flags);
phy_stop(ndev->phydev); napi_disable(&rdev->napi); @@ -1869,6 +1880,7 @@ static int renesas_eth_sw_probe(struct platform_device *pdev) priv = devm_kzalloc(&pdev->dev, sizeof(*priv), GFP_KERNEL); if (!priv) return -ENOMEM; + spin_lock_init(&priv->lock);
priv->ptp_priv = rcar_gen4_ptp_alloc(pdev); if (!priv->ptp_priv) diff --git a/drivers/net/ethernet/renesas/rswitch.h b/drivers/net/ethernet/renesas/rswitch.h index bb9ed971a97ca..9740398067140 100644 --- a/drivers/net/ethernet/renesas/rswitch.h +++ b/drivers/net/ethernet/renesas/rswitch.h @@ -1011,6 +1011,8 @@ struct rswitch_private { struct rswitch_etha etha[RSWITCH_NUM_PORTS]; struct rswitch_mfwd mfwd;
+ spinlock_t lock; /* lock interrupt registers' control */ + bool gwca_halt; };
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yoshihiro Shimoda yoshihiro.shimoda.uh@renesas.com
[ Upstream commit a0c55bba0d0d0b5591083f65f830940d8ae63f31 ]
Fix the MPIC.PSMCS value following the programming example in the section 6.4.2 Management Data Clock (MDC) Setting, Ethernet MAC IP, S4 Hardware User Manual Rev.1.00.
The value is calculated by MPIC.PSMCS = clk[MHz] / (MDC frequency[MHz] * 2) - 1 with the input clock frequency from clk_get_rate() and MDC frequency of 2.5MHz. Otherwise, this driver cannot communicate PHYs on the R-Car S4 Starter Kit board.
Fixes: 3590918b5d07 ("net: ethernet: renesas: Add support for "Ethernet Switch"") Reported-by: Tam Nguyen tam.nguyen.xa@renesas.com Signed-off-by: Yoshihiro Shimoda yoshihiro.shimoda.uh@renesas.com Tested-by: Kuninori Morimoto kuninori.morimoto.gx@renesas.com Reviewed-by: Andrew Lunn andrew@lunn.ch Link: https://lore.kernel.org/r/20230926123054.3976752-1-yoshihiro.shimoda.uh@rene... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/renesas/rswitch.c | 13 ++++++++++++- drivers/net/ethernet/renesas/rswitch.h | 2 ++ 2 files changed, 14 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/renesas/rswitch.c b/drivers/net/ethernet/renesas/rswitch.c index 215854812f80a..660cbfe344d2c 100644 --- a/drivers/net/ethernet/renesas/rswitch.c +++ b/drivers/net/ethernet/renesas/rswitch.c @@ -4,6 +4,7 @@ * Copyright (C) 2022 Renesas Electronics Corporation */
+#include <linux/clk.h> #include <linux/dma-mapping.h> #include <linux/err.h> #include <linux/etherdevice.h> @@ -1049,7 +1050,7 @@ static void rswitch_rmac_setting(struct rswitch_etha *etha, const u8 *mac) static void rswitch_etha_enable_mii(struct rswitch_etha *etha) { rswitch_modify(etha->addr, MPIC, MPIC_PSMCS_MASK | MPIC_PSMHT_MASK, - MPIC_PSMCS(0x05) | MPIC_PSMHT(0x06)); + MPIC_PSMCS(etha->psmcs) | MPIC_PSMHT(0x06)); rswitch_modify(etha->addr, MPSM, 0, MPSM_MFF_C45); }
@@ -1681,6 +1682,12 @@ static void rswitch_etha_init(struct rswitch_private *priv, int index) etha->index = index; etha->addr = priv->addr + RSWITCH_ETHA_OFFSET + index * RSWITCH_ETHA_SIZE; etha->coma_addr = priv->addr; + + /* MPIC.PSMCS = (clk [MHz] / (MDC frequency [MHz] * 2) - 1. + * Calculating PSMCS value as MDC frequency = 2.5MHz. So, multiply + * both the numerator and the denominator by 10. + */ + etha->psmcs = clk_get_rate(priv->clk) / 100000 / (25 * 2) - 1; }
static int rswitch_device_alloc(struct rswitch_private *priv, int index) @@ -1882,6 +1889,10 @@ static int renesas_eth_sw_probe(struct platform_device *pdev) return -ENOMEM; spin_lock_init(&priv->lock);
+ priv->clk = devm_clk_get(&pdev->dev, NULL); + if (IS_ERR(priv->clk)) + return PTR_ERR(priv->clk); + priv->ptp_priv = rcar_gen4_ptp_alloc(pdev); if (!priv->ptp_priv) return -ENOMEM; diff --git a/drivers/net/ethernet/renesas/rswitch.h b/drivers/net/ethernet/renesas/rswitch.h index 9740398067140..13a401cebd8b7 100644 --- a/drivers/net/ethernet/renesas/rswitch.h +++ b/drivers/net/ethernet/renesas/rswitch.h @@ -915,6 +915,7 @@ struct rswitch_etha { bool external_phy; struct mii_bus *mii; phy_interface_t phy_interface; + u32 psmcs; u8 mac_addr[MAX_ADDR_LEN]; int link; int speed; @@ -1012,6 +1013,7 @@ struct rswitch_private { struct rswitch_mfwd mfwd;
spinlock_t lock; /* lock interrupt registers' control */ + struct clk *clk;
bool gwca_halt; };
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dan Carpenter dan.carpenter@linaro.org
[ Upstream commit 37d4f55567982e445f86dc0ff4ecfa72921abfe8 ]
This accidentally returns success, but it should return a negative error code.
Fixes: 93a76530316a ("net: ethernet: ti: introduce am65x/j721e gigabit eth subsystem driver") Signed-off-by: Dan Carpenter dan.carpenter@linaro.org Reviewed-by: Roger Quadros rogerq@kernel.org Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/ti/am65-cpsw-nuss.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/net/ethernet/ti/am65-cpsw-nuss.c b/drivers/net/ethernet/ti/am65-cpsw-nuss.c index bebcfd5e6b579..a3d952f67ae32 100644 --- a/drivers/net/ethernet/ti/am65-cpsw-nuss.c +++ b/drivers/net/ethernet/ti/am65-cpsw-nuss.c @@ -1749,6 +1749,7 @@ static int am65_cpsw_nuss_init_tx_chns(struct am65_cpsw_common *common) if (tx_chn->irq <= 0) { dev_err(dev, "Failed to get tx dma irq %d\n", tx_chn->irq); + ret = tx_chn->irq ?: -ENXIO; goto err; }
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: David Wilder dwilder@us.ibm.com
[ Upstream commit 51e7a66666e0ca9642c59464ef8359f0ac604d41 ]
In some OVS environments the TCP pseudo header checksum may need to be recomputed. Currently this is only done when the interface instance is configured for "Trunk Mode". We found the issue also occurs in some Kubernetes environments, these environments do not use "Trunk Mode", therefor the condition is removed.
Performance tests with this change show only a fractional decrease in throughput (< 0.2%).
Fixes: 7525de2516fb ("ibmveth: Set CHECKSUM_PARTIAL if NULL TCP CSUM.") Signed-off-by: David Wilder dwilder@us.ibm.com Reviewed-by: Nick Child nnac123@linux.ibm.com Reviewed-by: Jacob Keller jacob.e.keller@intel.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/ibm/ibmveth.c | 25 ++++++++++++------------- 1 file changed, 12 insertions(+), 13 deletions(-)
diff --git a/drivers/net/ethernet/ibm/ibmveth.c b/drivers/net/ethernet/ibm/ibmveth.c index 832a2ae019509..a8d79ee350f8d 100644 --- a/drivers/net/ethernet/ibm/ibmveth.c +++ b/drivers/net/ethernet/ibm/ibmveth.c @@ -1303,24 +1303,23 @@ static void ibmveth_rx_csum_helper(struct sk_buff *skb, * the user space for finding a flow. During this process, OVS computes * checksum on the first packet when CHECKSUM_PARTIAL flag is set. * - * So, re-compute TCP pseudo header checksum when configured for - * trunk mode. + * So, re-compute TCP pseudo header checksum. */ + if (iph_proto == IPPROTO_TCP) { struct tcphdr *tcph = (struct tcphdr *)(skb->data + iphlen); + if (tcph->check == 0x0000) { /* Recompute TCP pseudo header checksum */ - if (adapter->is_active_trunk) { - tcphdrlen = skb->len - iphlen; - if (skb_proto == ETH_P_IP) - tcph->check = - ~csum_tcpudp_magic(iph->saddr, - iph->daddr, tcphdrlen, iph_proto, 0); - else if (skb_proto == ETH_P_IPV6) - tcph->check = - ~csum_ipv6_magic(&iph6->saddr, - &iph6->daddr, tcphdrlen, iph_proto, 0); - } + tcphdrlen = skb->len - iphlen; + if (skb_proto == ETH_P_IP) + tcph->check = + ~csum_tcpudp_magic(iph->saddr, + iph->daddr, tcphdrlen, iph_proto, 0); + else if (skb_proto == ETH_P_IPV6) + tcph->check = + ~csum_ipv6_magic(&iph6->saddr, + &iph6->daddr, tcphdrlen, iph_proto, 0); /* Setup SKB fields for checksum offload */ skb_partial_csum_set(skb, iphlen, offsetof(struct tcphdr, check));
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Florian Westphal fw@strlen.de
[ Upstream commit af84f9e447a65b4b9f79e7e5d69e19039b431c56 ]
nft can perform merging of adjacent payload requests. This means that:
ether saddr 00:11 ... ether type 8021ad ...
is a single payload expression, for 8 bytes, starting at the ethernet source offset.
Check that offset+length is fully within the source/destination mac addersses.
This bug prevents 'ether type' from matching the correct h_proto in case vlan tag got stripped.
Fixes: de6843be3082 ("netfilter: nft_payload: rebuild vlan header when needed") Reported-by: David Ward david.ward@ll.mit.edu Signed-off-by: Florian Westphal fw@strlen.de Signed-off-by: Sasha Levin sashal@kernel.org --- net/netfilter/nft_payload.c | 13 ++++++++++++- 1 file changed, 12 insertions(+), 1 deletion(-)
diff --git a/net/netfilter/nft_payload.c b/net/netfilter/nft_payload.c index 8cb8009899479..120f6d395b98b 100644 --- a/net/netfilter/nft_payload.c +++ b/net/netfilter/nft_payload.c @@ -154,6 +154,17 @@ int nft_payload_inner_offset(const struct nft_pktinfo *pkt) return pkt->inneroff; }
+static bool nft_payload_need_vlan_copy(const struct nft_payload *priv) +{ + unsigned int len = priv->offset + priv->len; + + /* data past ether src/dst requested, copy needed */ + if (len > offsetof(struct ethhdr, h_proto)) + return true; + + return false; +} + void nft_payload_eval(const struct nft_expr *expr, struct nft_regs *regs, const struct nft_pktinfo *pkt) @@ -172,7 +183,7 @@ void nft_payload_eval(const struct nft_expr *expr, goto err;
if (skb_vlan_tag_present(skb) && - priv->offset >= offsetof(struct ethhdr, h_proto)) { + nft_payload_need_vlan_copy(priv)) { if (!nft_payload_copy_vlan(dest, skb, priv->offset, priv->len)) goto err;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Xin Long lucien.xin@gmail.com
[ Upstream commit 8e56b063c86569e51eed1c5681ce6361fa97fc7a ]
In Scenario A and B below, as the delayed INIT_ACK always changes the peer vtag, SCTP ct with the incorrect vtag may cause packet loss.
Scenario A: INIT_ACK is delayed until the peer receives its own INIT_ACK
192.168.1.2 > 192.168.1.1: [INIT] [init tag: 1328086772] 192.168.1.1 > 192.168.1.2: [INIT] [init tag: 1414468151] 192.168.1.2 > 192.168.1.1: [INIT ACK] [init tag: 1328086772] 192.168.1.1 > 192.168.1.2: [INIT ACK] [init tag: 1650211246] * 192.168.1.2 > 192.168.1.1: [COOKIE ECHO] 192.168.1.1 > 192.168.1.2: [COOKIE ECHO] 192.168.1.2 > 192.168.1.1: [COOKIE ACK]
Scenario B: INIT_ACK is delayed until the peer completes its own handshake
192.168.1.2 > 192.168.1.1: sctp (1) [INIT] [init tag: 3922216408] 192.168.1.1 > 192.168.1.2: sctp (1) [INIT] [init tag: 144230885] 192.168.1.2 > 192.168.1.1: sctp (1) [INIT ACK] [init tag: 3922216408] 192.168.1.1 > 192.168.1.2: sctp (1) [COOKIE ECHO] 192.168.1.2 > 192.168.1.1: sctp (1) [COOKIE ACK] 192.168.1.1 > 192.168.1.2: sctp (1) [INIT ACK] [init tag: 3914796021] *
This patch fixes it as below:
In SCTP_CID_INIT processing: - clear ct->proto.sctp.init[!dir] if ct->proto.sctp.init[dir] && ct->proto.sctp.init[!dir]. (Scenario E) - set ct->proto.sctp.init[dir].
In SCTP_CID_INIT_ACK processing: - drop it if !ct->proto.sctp.init[!dir] && ct->proto.sctp.vtag[!dir] && ct->proto.sctp.vtag[!dir] != ih->init_tag. (Scenario B, Scenario C) - drop it if ct->proto.sctp.init[dir] && ct->proto.sctp.init[!dir] && ct->proto.sctp.vtag[!dir] != ih->init_tag. (Scenario A)
In SCTP_CID_COOKIE_ACK processing: - clear ct->proto.sctp.init[dir] and ct->proto.sctp.init[!dir]. (Scenario D)
Also, it's important to allow the ct state to move forward with cookie_echo and cookie_ack from the opposite dir for the collision scenarios.
There are also other Scenarios where it should allow the packet through, addressed by the processing above:
Scenario C: new CT is created by INIT_ACK.
Scenario D: start INIT on the existing ESTABLISHED ct.
Scenario E: start INIT after the old collision on the existing ESTABLISHED ct.
192.168.1.2 > 192.168.1.1: sctp (1) [INIT] [init tag: 3922216408] 192.168.1.1 > 192.168.1.2: sctp (1) [INIT] [init tag: 144230885] (both side are stopped, then start new connection again in hours) 192.168.1.2 > 192.168.1.1: sctp (1) [INIT] [init tag: 242308742]
Fixes: 9fb9cbb1082d ("[NETFILTER]: Add nf_conntrack subsystem.") Signed-off-by: Xin Long lucien.xin@gmail.com Signed-off-by: Florian Westphal fw@strlen.de Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/netfilter/nf_conntrack_sctp.h | 1 + net/netfilter/nf_conntrack_proto_sctp.c | 43 ++++++++++++++++----- 2 files changed, 34 insertions(+), 10 deletions(-)
diff --git a/include/linux/netfilter/nf_conntrack_sctp.h b/include/linux/netfilter/nf_conntrack_sctp.h index 625f491b95de8..fb31312825ae5 100644 --- a/include/linux/netfilter/nf_conntrack_sctp.h +++ b/include/linux/netfilter/nf_conntrack_sctp.h @@ -9,6 +9,7 @@ struct ip_ct_sctp { enum sctp_conntrack state;
__be32 vtag[IP_CT_DIR_MAX]; + u8 init[IP_CT_DIR_MAX]; u8 last_dir; u8 flags; }; diff --git a/net/netfilter/nf_conntrack_proto_sctp.c b/net/netfilter/nf_conntrack_proto_sctp.c index b6bcc8f2f46b7..c6bd533983c1f 100644 --- a/net/netfilter/nf_conntrack_proto_sctp.c +++ b/net/netfilter/nf_conntrack_proto_sctp.c @@ -112,7 +112,7 @@ static const u8 sctp_conntracks[2][11][SCTP_CONNTRACK_MAX] = { /* shutdown_ack */ {sSA, sCL, sCW, sCE, sES, sSA, sSA, sSA, sSA}, /* error */ {sCL, sCL, sCW, sCE, sES, sSS, sSR, sSA, sCL},/* Can't have Stale cookie*/ /* cookie_echo */ {sCL, sCL, sCE, sCE, sES, sSS, sSR, sSA, sCL},/* 5.2.4 - Big TODO */ -/* cookie_ack */ {sCL, sCL, sCW, sCE, sES, sSS, sSR, sSA, sCL},/* Can't come in orig dir */ +/* cookie_ack */ {sCL, sCL, sCW, sES, sES, sSS, sSR, sSA, sCL},/* Can't come in orig dir */ /* shutdown_comp*/ {sCL, sCL, sCW, sCE, sES, sSS, sSR, sCL, sCL}, /* heartbeat */ {sHS, sCL, sCW, sCE, sES, sSS, sSR, sSA, sHS}, /* heartbeat_ack*/ {sCL, sCL, sCW, sCE, sES, sSS, sSR, sSA, sHS}, @@ -126,7 +126,7 @@ static const u8 sctp_conntracks[2][11][SCTP_CONNTRACK_MAX] = { /* shutdown */ {sIV, sCL, sCW, sCE, sSR, sSS, sSR, sSA, sIV}, /* shutdown_ack */ {sIV, sCL, sCW, sCE, sES, sSA, sSA, sSA, sIV}, /* error */ {sIV, sCL, sCW, sCL, sES, sSS, sSR, sSA, sIV}, -/* cookie_echo */ {sIV, sCL, sCW, sCE, sES, sSS, sSR, sSA, sIV},/* Can't come in reply dir */ +/* cookie_echo */ {sIV, sCL, sCE, sCE, sES, sSS, sSR, sSA, sIV},/* Can't come in reply dir */ /* cookie_ack */ {sIV, sCL, sCW, sES, sES, sSS, sSR, sSA, sIV}, /* shutdown_comp*/ {sIV, sCL, sCW, sCE, sES, sSS, sSR, sCL, sIV}, /* heartbeat */ {sIV, sCL, sCW, sCE, sES, sSS, sSR, sSA, sHS}, @@ -412,6 +412,9 @@ int nf_conntrack_sctp_packet(struct nf_conn *ct, /* (D) vtag must be same as init_vtag as found in INIT_ACK */ if (sh->vtag != ct->proto.sctp.vtag[dir]) goto out_unlock; + } else if (sch->type == SCTP_CID_COOKIE_ACK) { + ct->proto.sctp.init[dir] = 0; + ct->proto.sctp.init[!dir] = 0; } else if (sch->type == SCTP_CID_HEARTBEAT) { if (ct->proto.sctp.vtag[dir] == 0) { pr_debug("Setting %d vtag %x for dir %d\n", sch->type, sh->vtag, dir); @@ -461,16 +464,18 @@ int nf_conntrack_sctp_packet(struct nf_conn *ct, }
/* If it is an INIT or an INIT ACK note down the vtag */ - if (sch->type == SCTP_CID_INIT || - sch->type == SCTP_CID_INIT_ACK) { - struct sctp_inithdr _inithdr, *ih; + if (sch->type == SCTP_CID_INIT) { + struct sctp_inithdr _ih, *ih;
- ih = skb_header_pointer(skb, offset + sizeof(_sch), - sizeof(_inithdr), &_inithdr); - if (ih == NULL) + ih = skb_header_pointer(skb, offset + sizeof(_sch), sizeof(*ih), &_ih); + if (!ih) goto out_unlock; - pr_debug("Setting vtag %x for dir %d\n", - ih->init_tag, !dir); + + if (ct->proto.sctp.init[dir] && ct->proto.sctp.init[!dir]) + ct->proto.sctp.init[!dir] = 0; + ct->proto.sctp.init[dir] = 1; + + pr_debug("Setting vtag %x for dir %d\n", ih->init_tag, !dir); ct->proto.sctp.vtag[!dir] = ih->init_tag;
/* don't renew timeout on init retransmit so @@ -481,6 +486,24 @@ int nf_conntrack_sctp_packet(struct nf_conn *ct, old_state == SCTP_CONNTRACK_CLOSED && nf_ct_is_confirmed(ct)) ignore = true; + } else if (sch->type == SCTP_CID_INIT_ACK) { + struct sctp_inithdr _ih, *ih; + __be32 vtag; + + ih = skb_header_pointer(skb, offset + sizeof(_sch), sizeof(*ih), &_ih); + if (!ih) + goto out_unlock; + + vtag = ct->proto.sctp.vtag[!dir]; + if (!ct->proto.sctp.init[!dir] && vtag && vtag != ih->init_tag) + goto out_unlock; + /* collision */ + if (ct->proto.sctp.init[dir] && ct->proto.sctp.init[!dir] && + vtag != ih->init_tag) + goto out_unlock; + + pr_debug("Setting vtag %x for dir %d\n", ih->init_tag, !dir); + ct->proto.sctp.vtag[!dir] = ih->init_tag; }
ct->proto.sctp.state = new_state;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Phil Sutter phil@nwl.cc
[ Upstream commit e8dbde59ca3fe925d0105bfb380e8429928b16dd ]
Compare NETFILTER_CFG type audit logs emitted from kernel upon ruleset modifications against expected output.
Signed-off-by: Phil Sutter phil@nwl.cc Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Stable-dep-of: 0d880dc6f032 ("netfilter: nf_tables: Deduplicate nft_register_obj audit logs") Signed-off-by: Sasha Levin sashal@kernel.org --- tools/testing/selftests/netfilter/.gitignore | 1 + tools/testing/selftests/netfilter/Makefile | 4 +- .../selftests/netfilter/audit_logread.c | 165 ++++++++++++++++++ tools/testing/selftests/netfilter/config | 1 + .../testing/selftests/netfilter/nft_audit.sh | 108 ++++++++++++ 5 files changed, 277 insertions(+), 2 deletions(-) create mode 100644 tools/testing/selftests/netfilter/audit_logread.c create mode 100755 tools/testing/selftests/netfilter/nft_audit.sh
diff --git a/tools/testing/selftests/netfilter/.gitignore b/tools/testing/selftests/netfilter/.gitignore index 4cb887b574138..4b2928e1c19d8 100644 --- a/tools/testing/selftests/netfilter/.gitignore +++ b/tools/testing/selftests/netfilter/.gitignore @@ -1,3 +1,4 @@ # SPDX-License-Identifier: GPL-2.0-only nf-queue connect_close +audit_logread diff --git a/tools/testing/selftests/netfilter/Makefile b/tools/testing/selftests/netfilter/Makefile index 3686bfa6c58d7..321db8850da00 100644 --- a/tools/testing/selftests/netfilter/Makefile +++ b/tools/testing/selftests/netfilter/Makefile @@ -6,13 +6,13 @@ TEST_PROGS := nft_trans_stress.sh nft_fib.sh nft_nat.sh bridge_brouter.sh \ nft_concat_range.sh nft_conntrack_helper.sh \ nft_queue.sh nft_meta.sh nf_nat_edemux.sh \ ipip-conntrack-mtu.sh conntrack_tcp_unreplied.sh \ - conntrack_vrf.sh nft_synproxy.sh rpath.sh + conntrack_vrf.sh nft_synproxy.sh rpath.sh nft_audit.sh
HOSTPKG_CONFIG := pkg-config
CFLAGS += $(shell $(HOSTPKG_CONFIG) --cflags libmnl 2>/dev/null) LDLIBS += $(shell $(HOSTPKG_CONFIG) --libs libmnl 2>/dev/null || echo -lmnl)
-TEST_GEN_FILES = nf-queue connect_close +TEST_GEN_FILES = nf-queue connect_close audit_logread
include ../lib.mk diff --git a/tools/testing/selftests/netfilter/audit_logread.c b/tools/testing/selftests/netfilter/audit_logread.c new file mode 100644 index 0000000000000..a0a880fc2d9de --- /dev/null +++ b/tools/testing/selftests/netfilter/audit_logread.c @@ -0,0 +1,165 @@ +// SPDX-License-Identifier: GPL-2.0 + +#define _GNU_SOURCE +#include <errno.h> +#include <fcntl.h> +#include <poll.h> +#include <signal.h> +#include <stdint.h> +#include <stdio.h> +#include <stdlib.h> +#include <string.h> +#include <sys/socket.h> +#include <unistd.h> +#include <linux/audit.h> +#include <linux/netlink.h> + +static int fd; + +#define MAX_AUDIT_MESSAGE_LENGTH 8970 +struct audit_message { + struct nlmsghdr nlh; + union { + struct audit_status s; + char data[MAX_AUDIT_MESSAGE_LENGTH]; + } u; +}; + +int audit_recv(int fd, struct audit_message *rep) +{ + struct sockaddr_nl addr; + socklen_t addrlen = sizeof(addr); + int ret; + + do { + ret = recvfrom(fd, rep, sizeof(*rep), 0, + (struct sockaddr *)&addr, &addrlen); + } while (ret < 0 && errno == EINTR); + + if (ret < 0 || + addrlen != sizeof(addr) || + addr.nl_pid != 0 || + rep->nlh.nlmsg_type == NLMSG_ERROR) /* short-cut for now */ + return -1; + + return ret; +} + +int audit_send(int fd, uint16_t type, uint32_t key, uint32_t val) +{ + static int seq = 0; + struct audit_message msg = { + .nlh = { + .nlmsg_len = NLMSG_SPACE(sizeof(msg.u.s)), + .nlmsg_type = type, + .nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK, + .nlmsg_seq = ++seq, + }, + .u.s = { + .mask = key, + .enabled = key == AUDIT_STATUS_ENABLED ? val : 0, + .pid = key == AUDIT_STATUS_PID ? val : 0, + } + }; + struct sockaddr_nl addr = { + .nl_family = AF_NETLINK, + }; + int ret; + + do { + ret = sendto(fd, &msg, msg.nlh.nlmsg_len, 0, + (struct sockaddr *)&addr, sizeof(addr)); + } while (ret < 0 && errno == EINTR); + + if (ret != (int)msg.nlh.nlmsg_len) + return -1; + return 0; +} + +int audit_set(int fd, uint32_t key, uint32_t val) +{ + struct audit_message rep = { 0 }; + int ret; + + ret = audit_send(fd, AUDIT_SET, key, val); + if (ret) + return ret; + + ret = audit_recv(fd, &rep); + if (ret < 0) + return ret; + return 0; +} + +int readlog(int fd) +{ + struct audit_message rep = { 0 }; + int ret = audit_recv(fd, &rep); + const char *sep = ""; + char *k, *v; + + if (ret < 0) + return ret; + + if (rep.nlh.nlmsg_type != AUDIT_NETFILTER_CFG) + return 0; + + /* skip the initial "audit(...): " part */ + strtok(rep.u.data, " "); + + while ((k = strtok(NULL, "="))) { + v = strtok(NULL, " "); + + /* these vary and/or are uninteresting, ignore */ + if (!strcmp(k, "pid") || + !strcmp(k, "comm") || + !strcmp(k, "subj")) + continue; + + /* strip the varying sequence number */ + if (!strcmp(k, "table")) + *strchrnul(v, ':') = '\0'; + + printf("%s%s=%s", sep, k, v); + sep = " "; + } + if (*sep) { + printf("\n"); + fflush(stdout); + } + return 0; +} + +void cleanup(int sig) +{ + audit_set(fd, AUDIT_STATUS_ENABLED, 0); + close(fd); + if (sig) + exit(0); +} + +int main(int argc, char **argv) +{ + struct sigaction act = { + .sa_handler = cleanup, + }; + + fd = socket(PF_NETLINK, SOCK_RAW, NETLINK_AUDIT); + if (fd < 0) { + perror("Can't open netlink socket"); + return -1; + } + + if (sigaction(SIGTERM, &act, NULL) < 0 || + sigaction(SIGINT, &act, NULL) < 0) { + perror("Can't set signal handler"); + close(fd); + return -1; + } + + audit_set(fd, AUDIT_STATUS_ENABLED, 1); + audit_set(fd, AUDIT_STATUS_PID, getpid()); + + while (1) + readlog(fd); +} diff --git a/tools/testing/selftests/netfilter/config b/tools/testing/selftests/netfilter/config index 4faf2ce021d90..7c42b1b2c69b4 100644 --- a/tools/testing/selftests/netfilter/config +++ b/tools/testing/selftests/netfilter/config @@ -6,3 +6,4 @@ CONFIG_NFT_REDIR=m CONFIG_NFT_MASQ=m CONFIG_NFT_FLOW_OFFLOAD=m CONFIG_NF_CT_NETLINK=m +CONFIG_AUDIT=y diff --git a/tools/testing/selftests/netfilter/nft_audit.sh b/tools/testing/selftests/netfilter/nft_audit.sh new file mode 100755 index 0000000000000..83c271b1c7352 --- /dev/null +++ b/tools/testing/selftests/netfilter/nft_audit.sh @@ -0,0 +1,108 @@ +#!/bin/bash +# SPDX-License-Identifier: GPL-2.0 +# +# Check that audit logs generated for nft commands are as expected. + +SKIP_RC=4 +RC=0 + +nft --version >/dev/null 2>&1 || { + echo "SKIP: missing nft tool" + exit $SKIP_RC +} + +logfile=$(mktemp) +echo "logging into $logfile" +./audit_logread >"$logfile" & +logread_pid=$! +trap 'kill $logread_pid; rm -f $logfile' EXIT +exec 3<"$logfile" + +do_test() { # (cmd, log) + echo -n "testing for cmd: $1 ... " + cat <&3 >/dev/null + $1 >/dev/null || exit 1 + sleep 0.1 + res=$(diff -a -u <(echo "$2") - <&3) + [ $? -eq 0 ] && { echo "OK"; return; } + echo "FAIL" + echo "$res" + ((RC++)) +} + +nft flush ruleset + +for table in t1 t2; do + do_test "nft add table $table" \ + "table=$table family=2 entries=1 op=nft_register_table" + + do_test "nft add chain $table c1" \ + "table=$table family=2 entries=1 op=nft_register_chain" + + do_test "nft add chain $table c2; add chain $table c3" \ + "table=$table family=2 entries=2 op=nft_register_chain" + + cmd="add rule $table c1 counter" + + do_test "nft $cmd" \ + "table=$table family=2 entries=1 op=nft_register_rule" + + do_test "nft $cmd; $cmd" \ + "table=$table family=2 entries=2 op=nft_register_rule" + + cmd="" + sep="" + for chain in c2 c3; do + for i in {1..3}; do + cmd+="$sep add rule $table $chain counter" + sep=";" + done + done + do_test "nft $cmd" \ + "table=$table family=2 entries=6 op=nft_register_rule" +done + +do_test 'nft reset rules t1 c2' \ +'table=t1 family=2 entries=3 op=nft_reset_rule' + +do_test 'nft reset rules table t1' \ +'table=t1 family=2 entries=3 op=nft_reset_rule +table=t1 family=2 entries=3 op=nft_reset_rule +table=t1 family=2 entries=3 op=nft_reset_rule' + +do_test 'nft reset rules' \ +'table=t1 family=2 entries=3 op=nft_reset_rule +table=t1 family=2 entries=3 op=nft_reset_rule +table=t1 family=2 entries=3 op=nft_reset_rule +table=t2 family=2 entries=3 op=nft_reset_rule +table=t2 family=2 entries=3 op=nft_reset_rule +table=t2 family=2 entries=3 op=nft_reset_rule' + +for ((i = 0; i < 500; i++)); do + echo "add rule t2 c3 counter accept comment "rule $i"" +done | do_test 'nft -f -' \ +'table=t2 family=2 entries=500 op=nft_register_rule' + +do_test 'nft reset rules t2 c3' \ +'table=t2 family=2 entries=189 op=nft_reset_rule +table=t2 family=2 entries=188 op=nft_reset_rule +table=t2 family=2 entries=126 op=nft_reset_rule' + +do_test 'nft reset rules t2' \ +'table=t2 family=2 entries=3 op=nft_reset_rule +table=t2 family=2 entries=3 op=nft_reset_rule +table=t2 family=2 entries=186 op=nft_reset_rule +table=t2 family=2 entries=188 op=nft_reset_rule +table=t2 family=2 entries=129 op=nft_reset_rule' + +do_test 'nft reset rules' \ +'table=t1 family=2 entries=3 op=nft_reset_rule +table=t1 family=2 entries=3 op=nft_reset_rule +table=t1 family=2 entries=3 op=nft_reset_rule +table=t2 family=2 entries=3 op=nft_reset_rule +table=t2 family=2 entries=3 op=nft_reset_rule +table=t2 family=2 entries=180 op=nft_reset_rule +table=t2 family=2 entries=188 op=nft_reset_rule +table=t2 family=2 entries=135 op=nft_reset_rule' + +exit $RC
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Phil Sutter phil@nwl.cc
[ Upstream commit 203bb9d39866d3c5a8135433ce3742fe4f9d5741 ]
Add tests for sets and elements and deletion of all kinds. Also reorder rule reset tests: By moving the bulk rule add command up, the two 'reset rules' tests become identical.
While at it, fix for a failing bulk rule add test's error status getting lost due to its use in a pipe. Avoid this by using a temporary file.
Headings in diff output for failing tests contain no useful data, strip them.
Signed-off-by: Phil Sutter phil@nwl.cc Signed-off-by: Florian Westphal fw@strlen.de Stable-dep-of: 0d880dc6f032 ("netfilter: nf_tables: Deduplicate nft_register_obj audit logs") Signed-off-by: Sasha Levin sashal@kernel.org --- .../testing/selftests/netfilter/nft_audit.sh | 97 ++++++++++++++++--- 1 file changed, 81 insertions(+), 16 deletions(-)
diff --git a/tools/testing/selftests/netfilter/nft_audit.sh b/tools/testing/selftests/netfilter/nft_audit.sh index 83c271b1c7352..0b3255e7b3538 100755 --- a/tools/testing/selftests/netfilter/nft_audit.sh +++ b/tools/testing/selftests/netfilter/nft_audit.sh @@ -12,10 +12,11 @@ nft --version >/dev/null 2>&1 || { }
logfile=$(mktemp) +rulefile=$(mktemp) echo "logging into $logfile" ./audit_logread >"$logfile" & logread_pid=$! -trap 'kill $logread_pid; rm -f $logfile' EXIT +trap 'kill $logread_pid; rm -f $logfile $rulefile' EXIT exec 3<"$logfile"
do_test() { # (cmd, log) @@ -26,12 +27,14 @@ do_test() { # (cmd, log) res=$(diff -a -u <(echo "$2") - <&3) [ $? -eq 0 ] && { echo "OK"; return; } echo "FAIL" - echo "$res" - ((RC++)) + grep -v '^(---|+++|@@)' <<< "$res" + ((RC--)) }
nft flush ruleset
+# adding tables, chains and rules + for table in t1 t2; do do_test "nft add table $table" \ "table=$table family=2 entries=1 op=nft_register_table" @@ -62,6 +65,28 @@ for table in t1 t2; do "table=$table family=2 entries=6 op=nft_register_rule" done
+for ((i = 0; i < 500; i++)); do + echo "add rule t2 c3 counter accept comment "rule $i"" +done >$rulefile +do_test "nft -f $rulefile" \ +'table=t2 family=2 entries=500 op=nft_register_rule' + +# adding sets and elements + +settype='type inet_service; counter' +setelem='{ 22, 80, 443 }' +setblock="{ $settype; elements = $setelem; }" +do_test "nft add set t1 s $setblock" \ +"table=t1 family=2 entries=4 op=nft_register_set" + +do_test "nft add set t1 s2 $setblock; add set t1 s3 { $settype; }" \ +"table=t1 family=2 entries=5 op=nft_register_set" + +do_test "nft add element t1 s3 $setelem" \ +"table=t1 family=2 entries=3 op=nft_register_setelem" + +# resetting rules + do_test 'nft reset rules t1 c2' \ 'table=t1 family=2 entries=3 op=nft_reset_rule'
@@ -70,19 +95,6 @@ do_test 'nft reset rules table t1' \ table=t1 family=2 entries=3 op=nft_reset_rule table=t1 family=2 entries=3 op=nft_reset_rule'
-do_test 'nft reset rules' \ -'table=t1 family=2 entries=3 op=nft_reset_rule -table=t1 family=2 entries=3 op=nft_reset_rule -table=t1 family=2 entries=3 op=nft_reset_rule -table=t2 family=2 entries=3 op=nft_reset_rule -table=t2 family=2 entries=3 op=nft_reset_rule -table=t2 family=2 entries=3 op=nft_reset_rule' - -for ((i = 0; i < 500; i++)); do - echo "add rule t2 c3 counter accept comment "rule $i"" -done | do_test 'nft -f -' \ -'table=t2 family=2 entries=500 op=nft_register_rule' - do_test 'nft reset rules t2 c3' \ 'table=t2 family=2 entries=189 op=nft_reset_rule table=t2 family=2 entries=188 op=nft_reset_rule @@ -105,4 +117,57 @@ table=t2 family=2 entries=180 op=nft_reset_rule table=t2 family=2 entries=188 op=nft_reset_rule table=t2 family=2 entries=135 op=nft_reset_rule'
+# resetting sets and elements + +elem=(22 ,80 ,443) +relem="" +for i in {1..3}; do + relem+="${elem[((i - 1))]}" + do_test "nft reset element t1 s { $relem }" \ + "table=t1 family=2 entries=$i op=nft_reset_setelem" +done + +do_test 'nft reset set t1 s' \ +'table=t1 family=2 entries=3 op=nft_reset_setelem' + +# deleting rules + +readarray -t handles < <(nft -a list chain t1 c1 | \ + sed -n 's/.*counter.* handle (.*)$/\1/p') + +do_test "nft delete rule t1 c1 handle ${handles[0]}" \ +'table=t1 family=2 entries=1 op=nft_unregister_rule' + +cmd='delete rule t1 c1 handle' +do_test "nft $cmd ${handles[1]}; $cmd ${handles[2]}" \ +'table=t1 family=2 entries=2 op=nft_unregister_rule' + +do_test 'nft flush chain t1 c2' \ +'table=t1 family=2 entries=3 op=nft_unregister_rule' + +do_test 'nft flush table t2' \ +'table=t2 family=2 entries=509 op=nft_unregister_rule' + +# deleting chains + +do_test 'nft delete chain t2 c2' \ +'table=t2 family=2 entries=1 op=nft_unregister_chain' + +# deleting sets and elements + +do_test 'nft delete element t1 s { 22 }' \ +'table=t1 family=2 entries=1 op=nft_unregister_setelem' + +do_test 'nft delete element t1 s { 80, 443 }' \ +'table=t1 family=2 entries=2 op=nft_unregister_setelem' + +do_test 'nft flush set t1 s2' \ +'table=t1 family=2 entries=3 op=nft_unregister_setelem' + +do_test 'nft delete set t1 s2' \ +'table=t1 family=2 entries=1 op=nft_unregister_set' + +do_test 'nft delete set t1 s3' \ +'table=t1 family=2 entries=1 op=nft_unregister_set' + exit $RC
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Phil Sutter phil@nwl.cc
[ Upstream commit 0d880dc6f032e0b541520e9926f398a77d3d433c ]
When adding/updating an object, the transaction handler emits suitable audit log entries already, the one in nft_obj_notify() is redundant. To fix that (and retain the audit logging from objects' 'update' callback), Introduce an "audit log free" variant for internal use.
Fixes: c520292f29b8 ("audit: log nftables configuration change events once per table") Signed-off-by: Phil Sutter phil@nwl.cc Reviewed-by: Richard Guy Briggs rgb@redhat.com Acked-by: Paul Moore paul@paul-moore.com (Audit) Signed-off-by: Florian Westphal fw@strlen.de Signed-off-by: Sasha Levin sashal@kernel.org --- net/netfilter/nf_tables_api.c | 44 ++++++++++++------- .../testing/selftests/netfilter/nft_audit.sh | 20 +++++++++ 2 files changed, 48 insertions(+), 16 deletions(-)
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c index 976a9b763b9bb..be5869366c7d3 100644 --- a/net/netfilter/nf_tables_api.c +++ b/net/netfilter/nf_tables_api.c @@ -7868,24 +7868,14 @@ static int nf_tables_delobj(struct sk_buff *skb, const struct nfnl_info *info, return nft_delobj(&ctx, obj); }
-void nft_obj_notify(struct net *net, const struct nft_table *table, - struct nft_object *obj, u32 portid, u32 seq, int event, - u16 flags, int family, int report, gfp_t gfp) +static void +__nft_obj_notify(struct net *net, const struct nft_table *table, + struct nft_object *obj, u32 portid, u32 seq, int event, + u16 flags, int family, int report, gfp_t gfp) { struct nftables_pernet *nft_net = nft_pernet(net); struct sk_buff *skb; int err; - char *buf = kasprintf(gfp, "%s:%u", - table->name, nft_net->base_seq); - - audit_log_nfcfg(buf, - family, - obj->handle, - event == NFT_MSG_NEWOBJ ? - AUDIT_NFT_OP_OBJ_REGISTER : - AUDIT_NFT_OP_OBJ_UNREGISTER, - gfp); - kfree(buf);
if (!report && !nfnetlink_has_listeners(net, NFNLGRP_NFTABLES)) @@ -7908,13 +7898,35 @@ void nft_obj_notify(struct net *net, const struct nft_table *table, err: nfnetlink_set_err(net, portid, NFNLGRP_NFTABLES, -ENOBUFS); } + +void nft_obj_notify(struct net *net, const struct nft_table *table, + struct nft_object *obj, u32 portid, u32 seq, int event, + u16 flags, int family, int report, gfp_t gfp) +{ + struct nftables_pernet *nft_net = nft_pernet(net); + char *buf = kasprintf(gfp, "%s:%u", + table->name, nft_net->base_seq); + + audit_log_nfcfg(buf, + family, + obj->handle, + event == NFT_MSG_NEWOBJ ? + AUDIT_NFT_OP_OBJ_REGISTER : + AUDIT_NFT_OP_OBJ_UNREGISTER, + gfp); + kfree(buf); + + __nft_obj_notify(net, table, obj, portid, seq, event, + flags, family, report, gfp); +} EXPORT_SYMBOL_GPL(nft_obj_notify);
static void nf_tables_obj_notify(const struct nft_ctx *ctx, struct nft_object *obj, int event) { - nft_obj_notify(ctx->net, ctx->table, obj, ctx->portid, ctx->seq, event, - ctx->flags, ctx->family, ctx->report, GFP_KERNEL); + __nft_obj_notify(ctx->net, ctx->table, obj, ctx->portid, + ctx->seq, event, ctx->flags, ctx->family, + ctx->report, GFP_KERNEL); }
/* diff --git a/tools/testing/selftests/netfilter/nft_audit.sh b/tools/testing/selftests/netfilter/nft_audit.sh index 0b3255e7b3538..bb34329e02a7f 100755 --- a/tools/testing/selftests/netfilter/nft_audit.sh +++ b/tools/testing/selftests/netfilter/nft_audit.sh @@ -85,6 +85,26 @@ do_test "nft add set t1 s2 $setblock; add set t1 s3 { $settype; }" \ do_test "nft add element t1 s3 $setelem" \ "table=t1 family=2 entries=3 op=nft_register_setelem"
+# adding counters + +do_test 'nft add counter t1 c1' \ +'table=t1 family=2 entries=1 op=nft_register_obj' + +do_test 'nft add counter t2 c1; add counter t2 c2' \ +'table=t2 family=2 entries=2 op=nft_register_obj' + +# adding/updating quotas + +do_test 'nft add quota t1 q1 { 10 bytes }' \ +'table=t1 family=2 entries=1 op=nft_register_obj' + +do_test 'nft add quota t2 q1 { 10 bytes }; add quota t2 q2 { 10 bytes }' \ +'table=t2 family=2 entries=2 op=nft_register_obj' + +# changing the quota value triggers obj update path +do_test 'nft add quota t1 q1 { 20 bytes }' \ +'table=t1 family=2 entries=1 op=nft_register_obj' + # resetting rules
do_test 'nft reset rules t1 c2' \
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Florian Westphal fw@strlen.de
[ Upstream commit 087388278e0f301f4c61ddffb1911d3a180f84b8 ]
nft_rbtree_gc_elem() walks back and removes the end interval element that comes before the expired element.
There is a small chance that we've cached this element as 'rbe_ge'. If this happens, we hold and test a pointer that has been queued for freeing.
It also causes spurious insertion failures:
$ cat test-testcases-sets-0044interval_overlap_0.1/testout.log Error: Could not process rule: File exists add element t s { 0 - 2 } ^^^^^^ Failed to insert 0 - 2 given: table ip t { set s { type inet_service flags interval,timeout timeout 2s gc-interval 2s } }
The set (rbtree) is empty. The 'failure' doesn't happen on next attempt.
Reason is that when we try to insert, the tree may hold an expired element that collides with the range we're adding. While we do evict/erase this element, we can trip over this check:
if (rbe_ge && nft_rbtree_interval_end(rbe_ge) && nft_rbtree_interval_end(new)) return -ENOTEMPTY;
rbe_ge was erased by the synchronous gc, we should not have done this check. Next attempt won't find it, so retry results in successful insertion.
Restart in-kernel to avoid such spurious errors.
Such restart are rare, unless userspace intentionally adds very large numbers of elements with very short timeouts while setting a huge gc interval.
Even in this case, this cannot loop forever, on each retry an existing element has been removed.
As the caller is holding the transaction mutex, its impossible for a second entity to add more expiring elements to the tree.
After this it also becomes feasible to remove the async gc worker and perform all garbage collection from the commit path.
Fixes: c9e6978e2725 ("netfilter: nft_set_rbtree: Switch to node list walk for overlap detection") Signed-off-by: Florian Westphal fw@strlen.de Signed-off-by: Sasha Levin sashal@kernel.org --- net/netfilter/nft_set_rbtree.c | 46 +++++++++++++++++++++------------- 1 file changed, 29 insertions(+), 17 deletions(-)
diff --git a/net/netfilter/nft_set_rbtree.c b/net/netfilter/nft_set_rbtree.c index 487572dcd6144..2660ceab3759d 100644 --- a/net/netfilter/nft_set_rbtree.c +++ b/net/netfilter/nft_set_rbtree.c @@ -233,10 +233,9 @@ static void nft_rbtree_gc_remove(struct net *net, struct nft_set *set, rb_erase(&rbe->node, &priv->root); }
-static int nft_rbtree_gc_elem(const struct nft_set *__set, - struct nft_rbtree *priv, - struct nft_rbtree_elem *rbe, - u8 genmask) +static const struct nft_rbtree_elem * +nft_rbtree_gc_elem(const struct nft_set *__set, struct nft_rbtree *priv, + struct nft_rbtree_elem *rbe, u8 genmask) { struct nft_set *set = (struct nft_set *)__set; struct rb_node *prev = rb_prev(&rbe->node); @@ -246,7 +245,7 @@ static int nft_rbtree_gc_elem(const struct nft_set *__set,
gc = nft_trans_gc_alloc(set, 0, GFP_ATOMIC); if (!gc) - return -ENOMEM; + return ERR_PTR(-ENOMEM);
/* search for end interval coming before this element. * end intervals don't carry a timeout extension, they @@ -261,6 +260,7 @@ static int nft_rbtree_gc_elem(const struct nft_set *__set, prev = rb_prev(prev); }
+ rbe_prev = NULL; if (prev) { rbe_prev = rb_entry(prev, struct nft_rbtree_elem, node); nft_rbtree_gc_remove(net, set, priv, rbe_prev); @@ -272,7 +272,7 @@ static int nft_rbtree_gc_elem(const struct nft_set *__set, */ gc = nft_trans_gc_queue_sync(gc, GFP_ATOMIC); if (WARN_ON_ONCE(!gc)) - return -ENOMEM; + return ERR_PTR(-ENOMEM);
nft_trans_gc_elem_add(gc, rbe_prev); } @@ -280,13 +280,13 @@ static int nft_rbtree_gc_elem(const struct nft_set *__set, nft_rbtree_gc_remove(net, set, priv, rbe); gc = nft_trans_gc_queue_sync(gc, GFP_ATOMIC); if (WARN_ON_ONCE(!gc)) - return -ENOMEM; + return ERR_PTR(-ENOMEM);
nft_trans_gc_elem_add(gc, rbe);
nft_trans_gc_queue_sync_done(gc);
- return 0; + return rbe_prev; }
static bool nft_rbtree_update_first(const struct nft_set *set, @@ -314,7 +314,7 @@ static int __nft_rbtree_insert(const struct net *net, const struct nft_set *set, struct nft_rbtree *priv = nft_set_priv(set); u8 cur_genmask = nft_genmask_cur(net); u8 genmask = nft_genmask_next(net); - int d, err; + int d;
/* Descend the tree to search for an existing element greater than the * key value to insert that is greater than the new element. This is the @@ -363,9 +363,14 @@ static int __nft_rbtree_insert(const struct net *net, const struct nft_set *set, */ if (nft_set_elem_expired(&rbe->ext) && nft_set_elem_active(&rbe->ext, cur_genmask)) { - err = nft_rbtree_gc_elem(set, priv, rbe, genmask); - if (err < 0) - return err; + const struct nft_rbtree_elem *removed_end; + + removed_end = nft_rbtree_gc_elem(set, priv, rbe, genmask); + if (IS_ERR(removed_end)) + return PTR_ERR(removed_end); + + if (removed_end == rbe_le || removed_end == rbe_ge) + return -EAGAIN;
continue; } @@ -486,11 +491,18 @@ static int nft_rbtree_insert(const struct net *net, const struct nft_set *set, struct nft_rbtree_elem *rbe = elem->priv; int err;
- write_lock_bh(&priv->lock); - write_seqcount_begin(&priv->count); - err = __nft_rbtree_insert(net, set, rbe, ext); - write_seqcount_end(&priv->count); - write_unlock_bh(&priv->lock); + do { + if (fatal_signal_pending(current)) + return -EINTR; + + cond_resched(); + + write_lock_bh(&priv->lock); + write_seqcount_begin(&priv->count); + err = __nft_rbtree_insert(net, set, rbe, ext); + write_seqcount_end(&priv->count); + write_unlock_bh(&priv->lock); + } while (err == -EAGAIN);
return err; }
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Benjamin Poirier bpoirier@nvidia.com
[ Upstream commit 0add5c597f3253a9c6108a0a81d57f44ab0d9d30 ]
Due to a small omission, the offload_failed flag is missing from ipv4 fibmatch results. Make sure it is set correctly.
The issue can be witnessed using the following commands: echo "1 1" > /sys/bus/netdevsim/new_device ip link add dummy1 up type dummy ip route add 192.0.2.0/24 dev dummy1 echo 1 > /sys/kernel/debug/netdevsim/netdevsim1/fib/fail_route_offload ip route add 198.51.100.0/24 dev dummy1 ip route # 192.168.15.0/24 has rt_trap # 198.51.100.0/24 has rt_offload_failed ip route get 192.168.15.1 fibmatch # Result has rt_trap ip route get 198.51.100.1 fibmatch # Result differs from the route shown by `ip route`, it is missing # rt_offload_failed ip link del dev dummy1 echo 1 > /sys/bus/netdevsim/del_device
Fixes: 36c5100e859d ("IPv4: Add "offload failed" indication to routes") Signed-off-by: Benjamin Poirier bpoirier@nvidia.com Reviewed-by: Ido Schimmel idosch@nvidia.com Reviewed-by: Simon Horman horms@kernel.org Reviewed-by: David Ahern dsahern@kernel.org Link: https://lore.kernel.org/r/20230926182730.231208-1-bpoirier@nvidia.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/ipv4/route.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/net/ipv4/route.c b/net/ipv4/route.c index 0a53ca6ebb0d5..14fbc5cd157ef 100644 --- a/net/ipv4/route.c +++ b/net/ipv4/route.c @@ -3417,6 +3417,8 @@ static int inet_rtm_getroute(struct sk_buff *in_skb, struct nlmsghdr *nlh, fa->fa_type == fri.type) { fri.offload = READ_ONCE(fa->offload); fri.trap = READ_ONCE(fa->trap); + fri.offload_failed = + READ_ONCE(fa->offload_failed); break; } }
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ben Wolsieffer ben.wolsieffer@hefring.com
[ Upstream commit 6f195d6b0da3b689922ba9e302af2f49592fa9fc ]
The STM32MP1 keeps clk_rx enabled during suspend, and therefore the driver does not enable the clock in stm32_dwmac_init() if the device was suspended. The problem is that this same code runs on STM32 MCUs, which do disable clk_rx during suspend, causing the clock to never be re-enabled on resume.
This patch adds a variant flag to indicate that clk_rx remains enabled during suspend, and uses this to decide whether to enable the clock in stm32_dwmac_init() if the device was suspended.
This approach fixes this specific bug with limited opportunity for unintended side-effects, but I have a follow up patch that will refactor the clock configuration and hopefully make it less error prone.
Fixes: 6528e02cc9ff ("net: ethernet: stmmac: add adaptation for stm32mp157c.") Signed-off-by: Ben Wolsieffer ben.wolsieffer@hefring.com Reviewed-by: Jacob Keller jacob.e.keller@intel.com Link: https://lore.kernel.org/r/20230927175749.1419774-1-ben.wolsieffer@hefring.co... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/stmicro/stmmac/dwmac-stm32.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac-stm32.c b/drivers/net/ethernet/stmicro/stmmac/dwmac-stm32.c index bdb4de59a6727..28c8ca5fba6c5 100644 --- a/drivers/net/ethernet/stmicro/stmmac/dwmac-stm32.c +++ b/drivers/net/ethernet/stmicro/stmmac/dwmac-stm32.c @@ -105,6 +105,7 @@ struct stm32_ops { int (*parse_data)(struct stm32_dwmac *dwmac, struct device *dev); u32 syscfg_eth_mask; + bool clk_rx_enable_in_suspend; };
static int stm32_dwmac_init(struct plat_stmmacenet_data *plat_dat) @@ -122,7 +123,8 @@ static int stm32_dwmac_init(struct plat_stmmacenet_data *plat_dat) if (ret) return ret;
- if (!dwmac->dev->power.is_suspended) { + if (!dwmac->ops->clk_rx_enable_in_suspend || + !dwmac->dev->power.is_suspended) { ret = clk_prepare_enable(dwmac->clk_rx); if (ret) { clk_disable_unprepare(dwmac->clk_tx); @@ -514,7 +516,8 @@ static struct stm32_ops stm32mp1_dwmac_data = { .suspend = stm32mp1_suspend, .resume = stm32mp1_resume, .parse_data = stm32mp1_parse_data, - .syscfg_eth_mask = SYSCFG_MP1_ETH_MASK + .syscfg_eth_mask = SYSCFG_MP1_ETH_MASK, + .clk_rx_enable_in_suspend = true };
static const struct of_device_id stm32_dwmac_match[] = {
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Chengfeng Ye dg573847474@gmail.com
[ Upstream commit 08e50cf071847323414df0835109b6f3560d44f5 ]
It seems that tipc_crypto_key_revoke() could be be invoked by wokequeue tipc_crypto_work_rx() under process context and timer/rx callback under softirq context, thus the lock acquisition on &tx->lock seems better use spin_lock_bh() to prevent possible deadlock.
This flaw was found by an experimental static analysis tool I am developing for irq-related deadlock.
tipc_crypto_work_rx() <workqueue> --> tipc_crypto_key_distr() --> tipc_bcast_xmit() --> tipc_bcbase_xmit() --> tipc_bearer_bc_xmit() --> tipc_crypto_xmit() --> tipc_ehdr_build() --> tipc_crypto_key_revoke() --> spin_lock(&tx->lock) <timer interrupt> --> tipc_disc_timeout() --> tipc_bearer_xmit_skb() --> tipc_crypto_xmit() --> tipc_ehdr_build() --> tipc_crypto_key_revoke() --> spin_lock(&tx->lock) <deadlock here>
Signed-off-by: Chengfeng Ye dg573847474@gmail.com Reviewed-by: Jacob Keller jacob.e.keller@intel.com Acked-by: Jon Maloy jmaloy@redhat.com Fixes: fc1b6d6de220 ("tipc: introduce TIPC encryption & authentication") Link: https://lore.kernel.org/r/20230927181414.59928-1-dg573847474@gmail.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/tipc/crypto.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/net/tipc/crypto.c b/net/tipc/crypto.c index 302fd749c4249..43c3f1c971b8f 100644 --- a/net/tipc/crypto.c +++ b/net/tipc/crypto.c @@ -1441,14 +1441,14 @@ static int tipc_crypto_key_revoke(struct net *net, u8 tx_key) struct tipc_crypto *tx = tipc_net(net)->crypto_tx; struct tipc_key key;
- spin_lock(&tx->lock); + spin_lock_bh(&tx->lock); key = tx->key; WARN_ON(!key.active || tx_key != key.active);
/* Free the active key */ tipc_crypto_key_set_state(tx, key.passive, 0, key.pending); tipc_crypto_key_detach(tx->aead[key.active], &tx->lock); - spin_unlock(&tx->lock); + spin_unlock_bh(&tx->lock);
pr_warn("%s: key is revoked\n", tx->name); return -EKEYREVOKED;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Neal Cardwell ncardwell@google.com
[ Upstream commit 059217c18be6757b95bfd77ba53fb50b48b8a816 ]
This commit fixes quick-ack counting so that it only considers that a quick-ack has been provided if we are sending an ACK that newly acknowledges data.
The code was erroneously using the number of data segments in outgoing skbs when deciding how many quick-ack credits to remove. This logic does not make sense, and could cause poor performance in request-response workloads, like RPC traffic, where requests or responses can be multi-segment skbs.
When a TCP connection decides to send N quick-acks, that is to accelerate the cwnd growth of the congestion control module controlling the remote endpoint of the TCP connection. That quick-ack decision is purely about the incoming data and outgoing ACKs. It has nothing to do with the outgoing data or the size of outgoing data.
And in particular, an ACK only serves the intended purpose of allowing the remote congestion control to grow the congestion window quickly if the ACK is ACKing or SACKing new data.
The fix is simple: only count packets as serving the goal of the quickack mechanism if they are ACKing/SACKing new data. We can tell whether this is the case by checking inet_csk_ack_scheduled(), since we schedule an ACK exactly when we are ACKing/SACKing new data.
Fixes: fc6415bcb0f5 ("[TCP]: Fix quick-ack decrementing with TSO.") Signed-off-by: Neal Cardwell ncardwell@google.com Reviewed-by: Yuchung Cheng ycheng@google.com Reviewed-by: Eric Dumazet edumazet@google.com Link: https://lore.kernel.org/r/20231001151239.1866845-1-ncardwell.sw@gmail.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- include/net/tcp.h | 6 ++++-- net/ipv4/tcp_output.c | 7 +++---- 2 files changed, 7 insertions(+), 6 deletions(-)
diff --git a/include/net/tcp.h b/include/net/tcp.h index 10fc5c5928f71..b1b1e01c69839 100644 --- a/include/net/tcp.h +++ b/include/net/tcp.h @@ -350,12 +350,14 @@ ssize_t tcp_splice_read(struct socket *sk, loff_t *ppos, struct sk_buff *tcp_stream_alloc_skb(struct sock *sk, gfp_t gfp, bool force_schedule);
-static inline void tcp_dec_quickack_mode(struct sock *sk, - const unsigned int pkts) +static inline void tcp_dec_quickack_mode(struct sock *sk) { struct inet_connection_sock *icsk = inet_csk(sk);
if (icsk->icsk_ack.quick) { + /* How many ACKs S/ACKing new data have we sent? */ + const unsigned int pkts = inet_csk_ack_scheduled(sk) ? 1 : 0; + if (pkts >= icsk->icsk_ack.quick) { icsk->icsk_ack.quick = 0; /* Leaving quickack mode we deflate ATO. */ diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c index 9f9ca68c47026..37fd9537423f1 100644 --- a/net/ipv4/tcp_output.c +++ b/net/ipv4/tcp_output.c @@ -177,8 +177,7 @@ static void tcp_event_data_sent(struct tcp_sock *tp, }
/* Account for an ACK we sent. */ -static inline void tcp_event_ack_sent(struct sock *sk, unsigned int pkts, - u32 rcv_nxt) +static inline void tcp_event_ack_sent(struct sock *sk, u32 rcv_nxt) { struct tcp_sock *tp = tcp_sk(sk);
@@ -192,7 +191,7 @@ static inline void tcp_event_ack_sent(struct sock *sk, unsigned int pkts,
if (unlikely(rcv_nxt != tp->rcv_nxt)) return; /* Special ACK sent by DCTCP to reflect ECN */ - tcp_dec_quickack_mode(sk, pkts); + tcp_dec_quickack_mode(sk); inet_csk_clear_xmit_timer(sk, ICSK_TIME_DACK); }
@@ -1372,7 +1371,7 @@ static int __tcp_transmit_skb(struct sock *sk, struct sk_buff *skb, sk, skb);
if (likely(tcb->tcp_flags & TCPHDR_ACK)) - tcp_event_ack_sent(sk, tcp_skb_pcount(skb), rcv_nxt); + tcp_event_ack_sent(sk, rcv_nxt);
if (skb->len != tcp_header_size) { tcp_event_data_sent(tp, sk);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Neal Cardwell ncardwell@google.com
[ Upstream commit 4720852ed9afb1c5ab84e96135cb5b73d5afde6f ]
This commit fixes poor delayed ACK behavior that can cause poor TCP latency in a particular boundary condition: when an application makes a TCP socket write that is an exact multiple of the MSS size.
The problem is that there is painful boundary discontinuity in the current delayed ACK behavior. With the current delayed ACK behavior, we have:
(1) If an app reads data when > 1*MSS is unacknowledged, then tcp_cleanup_rbuf() ACKs immediately because of:
tp->rcv_nxt - tp->rcv_wup > icsk->icsk_ack.rcv_mss ||
(2) If an app reads all received data, and the packets were < 1*MSS, and either (a) the app is not ping-pong or (b) we received two packets < 1*MSS, then tcp_cleanup_rbuf() ACKs immediately beecause of:
((icsk->icsk_ack.pending & ICSK_ACK_PUSHED2) || ((icsk->icsk_ack.pending & ICSK_ACK_PUSHED) && !inet_csk_in_pingpong_mode(sk))) &&
(3) *However*: if an app reads exactly 1*MSS of data, tcp_cleanup_rbuf() does not send an immediate ACK. This is true even if the app is not ping-pong and the 1*MSS of data had the PSH bit set, suggesting the sending application completed an application write.
Thus if the app is not ping-pong, we have this painful case where
1*MSS gets an immediate ACK, and <1*MSS gets an immediate ACK, but a
write whose last skb is an exact multiple of 1*MSS can get a 40ms delayed ACK. This means that any app that transfers data in one direction and takes care to align write size or packet size with MSS can suffer this problem. With receive zero copy making 4KB MSS values more common, it is becoming more common to have application writes naturally align with MSS, and more applications are likely to encounter this delayed ACK problem.
The fix in this commit is to refine the delayed ACK heuristics with a simple check: immediately ACK a received 1*MSS skb with PSH bit set if the app reads all data. Why? If an skb has a len of exactly 1*MSS and has the PSH bit set then it is likely the end of an application write. So more data may not be arriving soon, and yet the data sender may be waiting for an ACK if cwnd-bound or using TX zero copy. Thus we set ICSK_ACK_PUSHED in this case so that tcp_cleanup_rbuf() will send an ACK immediately if the app reads all of the data and is not ping-pong. Note that this logic is also executed for the case where len > MSS, but in that case this logic does not matter (and does not hurt) because tcp_cleanup_rbuf() will always ACK immediately if the app reads data and there is more than an MSS of unACKed data.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Neal Cardwell ncardwell@google.com Reviewed-by: Yuchung Cheng ycheng@google.com Reviewed-by: Eric Dumazet edumazet@google.com Cc: Xin Guo guoxin0309@gmail.com Link: https://lore.kernel.org/r/20231001151239.1866845-2-ncardwell.sw@gmail.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/ipv4/tcp_input.c | 13 +++++++++++++ 1 file changed, 13 insertions(+)
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c index 48c2b96b08435..a5781f86ac375 100644 --- a/net/ipv4/tcp_input.c +++ b/net/ipv4/tcp_input.c @@ -243,6 +243,19 @@ static void tcp_measure_rcv_mss(struct sock *sk, const struct sk_buff *skb) if (unlikely(len > icsk->icsk_ack.rcv_mss + MAX_TCP_OPTION_SPACE)) tcp_gro_dev_warn(sk, skb, len); + /* If the skb has a len of exactly 1*MSS and has the PSH bit + * set then it is likely the end of an application write. So + * more data may not be arriving soon, and yet the data sender + * may be waiting for an ACK if cwnd-bound or using TX zero + * copy. So we set ICSK_ACK_PUSHED here so that + * tcp_cleanup_rbuf() will send an ACK immediately if the app + * reads all of the data and is not ping-pong. If len > MSS + * then this logic does not matter (and does not hurt) because + * tcp_cleanup_rbuf() will always ACK immediately if the app + * reads data and there is more than an MSS of unACKed data. + */ + if (TCP_SKB_CB(skb)->tcp_flags & TCPHDR_PSH) + icsk->icsk_ack.pending |= ICSK_ACK_PUSHED; } else { /* Otherwise, we make more careful check taking into account, * that SACKs block is variable.
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Xin Long lucien.xin@gmail.com
[ Upstream commit 2222a78075f0c19ca18db53fd6623afb4aff602d ]
During the 4-way handshake, the transport's state is set to ACTIVE in sctp_process_init() when processing INIT_ACK chunk on client or COOKIE_ECHO chunk on server.
In the collision scenario below:
192.168.1.2 > 192.168.1.1: sctp (1) [INIT] [init tag: 3922216408] 192.168.1.1 > 192.168.1.2: sctp (1) [INIT] [init tag: 144230885] 192.168.1.2 > 192.168.1.1: sctp (1) [INIT ACK] [init tag: 3922216408] 192.168.1.1 > 192.168.1.2: sctp (1) [COOKIE ECHO] 192.168.1.2 > 192.168.1.1: sctp (1) [COOKIE ACK] 192.168.1.1 > 192.168.1.2: sctp (1) [INIT ACK] [init tag: 3914796021]
when processing COOKIE_ECHO on 192.168.1.2, as it's in COOKIE_WAIT state, sctp_sf_do_dupcook_b() is called by sctp_sf_do_5_2_4_dupcook() where it creates a new association and sets its transport to ACTIVE then updates to the old association in sctp_assoc_update().
However, in sctp_assoc_update(), it will skip the transport update if it finds a transport with the same ipaddr already existing in the old asoc, and this causes the old asoc's transport state not to move to ACTIVE after the handshake.
This means if DATA retransmission happens at this moment, it won't be able to enter PF state because of the check 'transport->state == SCTP_ACTIVE' in sctp_do_8_2_transport_strike().
This patch fixes it by updating the transport in sctp_assoc_update() with sctp_assoc_add_peer() where it updates the transport state if there is already a transport with the same ipaddr exists in the old asoc.
Signed-off-by: Xin Long lucien.xin@gmail.com Reviewed-by: Simon Horman horms@kernel.org Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Link: https://lore.kernel.org/r/fd17356abe49713ded425250cc1ae51e9f5846c6.169617232... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/sctp/associola.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/net/sctp/associola.c b/net/sctp/associola.c index 796529167e8d2..c45c192b78787 100644 --- a/net/sctp/associola.c +++ b/net/sctp/associola.c @@ -1159,8 +1159,7 @@ int sctp_assoc_update(struct sctp_association *asoc, /* Add any peer addresses from the new association. */ list_for_each_entry(trans, &new->peer.transport_addr_list, transports) - if (!sctp_assoc_lookup_paddr(asoc, &trans->ipaddr) && - !sctp_assoc_add_peer(asoc, &trans->ipaddr, + if (!sctp_assoc_add_peer(asoc, &trans->ipaddr, GFP_ATOMIC, trans->state)) return -ENOMEM;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Xin Long lucien.xin@gmail.com
[ Upstream commit 1f4e803cd9c9166eb8b6c8b0b8e4124f7499fc07 ]
Currently, when hb_interval is changed by users, it won't take effect until the next expiry of hb timer. As the default value is 30s, users have to wait up to 30s to wait its hb_interval update to work.
This becomes pretty bad in containers where a much smaller value is usually set on hb_interval. This patch improves it by resetting the hb timer immediately once the value of hb_interval is updated by users.
Note that we don't address the already existing 'problem' when sending a heartbeat 'on demand' if one hb has just been sent(from the timer) mentioned in:
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg590224.html
Signed-off-by: Xin Long lucien.xin@gmail.com Reviewed-by: Simon Horman horms@kernel.org Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Link: https://lore.kernel.org/r/75465785f8ee5df2fb3acdca9b8fafdc18984098.169617266... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/sctp/socket.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/net/sctp/socket.c b/net/sctp/socket.c index 7cf207706eb66..652af155966f1 100644 --- a/net/sctp/socket.c +++ b/net/sctp/socket.c @@ -2450,6 +2450,7 @@ static int sctp_apply_peer_addr_params(struct sctp_paddrparams *params, if (trans) { trans->hbinterval = msecs_to_jiffies(params->spp_hbinterval); + sctp_transport_reset_hb_timer(trans); } else if (asoc) { asoc->hbinterval = msecs_to_jiffies(params->spp_hbinterval);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Dumazet edumazet@google.com
[ Upstream commit d0f95894fda7d4f895b29c1097f92d7fee278cb2 ]
syzbot caught another data-race in netlink when setting sk->sk_err.
Annotate all of them for good measure.
BUG: KCSAN: data-race in netlink_recvmsg / netlink_recvmsg
write to 0xffff8881613bb220 of 4 bytes by task 28147 on cpu 0: netlink_recvmsg+0x448/0x780 net/netlink/af_netlink.c:1994 sock_recvmsg_nosec net/socket.c:1027 [inline] sock_recvmsg net/socket.c:1049 [inline] __sys_recvfrom+0x1f4/0x2e0 net/socket.c:2229 __do_sys_recvfrom net/socket.c:2247 [inline] __se_sys_recvfrom net/socket.c:2243 [inline] __x64_sys_recvfrom+0x78/0x90 net/socket.c:2243 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd
write to 0xffff8881613bb220 of 4 bytes by task 28146 on cpu 1: netlink_recvmsg+0x448/0x780 net/netlink/af_netlink.c:1994 sock_recvmsg_nosec net/socket.c:1027 [inline] sock_recvmsg net/socket.c:1049 [inline] __sys_recvfrom+0x1f4/0x2e0 net/socket.c:2229 __do_sys_recvfrom net/socket.c:2247 [inline] __se_sys_recvfrom net/socket.c:2243 [inline] __x64_sys_recvfrom+0x78/0x90 net/socket.c:2243 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd
value changed: 0x00000000 -> 0x00000016
Reported by Kernel Concurrency Sanitizer on: CPU: 1 PID: 28146 Comm: syz-executor.0 Not tainted 6.6.0-rc3-syzkaller-00055-g9ed22ae6be81 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/06/2023
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Eric Dumazet edumazet@google.com Reported-by: syzbot syzkaller@googlegroups.com Reviewed-by: Simon Horman horms@kernel.org Link: https://lore.kernel.org/r/20231003183455.3410550-1-edumazet@google.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/netlink/af_netlink.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c index 20082171f24a3..9c6bc47bc7f7b 100644 --- a/net/netlink/af_netlink.c +++ b/net/netlink/af_netlink.c @@ -352,7 +352,7 @@ static void netlink_overrun(struct sock *sk) if (!nlk_test_bit(RECV_NO_ENOBUFS, sk)) { if (!test_and_set_bit(NETLINK_S_CONGESTED, &nlk_sk(sk)->state)) { - sk->sk_err = ENOBUFS; + WRITE_ONCE(sk->sk_err, ENOBUFS); sk_error_report(sk); } } @@ -1577,7 +1577,7 @@ static int do_one_set_err(struct sock *sk, struct netlink_set_err_data *p) goto out; }
- sk->sk_err = p->code; + WRITE_ONCE(sk->sk_err, p->code); sk_error_report(sk); out: return ret; @@ -1966,7 +1966,7 @@ static int netlink_recvmsg(struct socket *sock, struct msghdr *msg, size_t len, atomic_read(&sk->sk_rmem_alloc) <= sk->sk_rcvbuf / 2) { ret = netlink_dump(sk); if (ret) { - sk->sk_err = -ret; + WRITE_ONCE(sk->sk_err, -ret); sk_error_report(sk); } } @@ -2485,7 +2485,7 @@ void netlink_ack(struct sk_buff *in_skb, struct nlmsghdr *nlh, int err, err_bad_put: nlmsg_free(skb); err_skb: - NETLINK_CB(in_skb).sk->sk_err = ENOBUFS; + WRITE_ONCE(NETLINK_CB(in_skb).sk->sk_err, ENOBUFS); sk_error_report(NETLINK_CB(in_skb).sk); } EXPORT_SYMBOL(netlink_ack);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Haiyang Zhang haiyangz@microsoft.com
commit 7a54de92657455210d0ca71d4176b553952c871a upstream.
sizeof(struct hop_jumbo_hdr) is not part of tso_bytes, so remove the subtraction from header size.
Cc: stable@vger.kernel.org Fixes: bd7fc6e1957c ("net: mana: Add new MANA VF performance counters for easier troubleshooting") Signed-off-by: Haiyang Zhang haiyangz@microsoft.com Reviewed-by: Simon Horman horms@kernel.org Reviewed-by: Shradha Gupta shradhagupta@linux.microsoft.com Signed-off-by: Paolo Abeni pabeni@redhat.com Stable-dep-of: a43e8e9ffa0d ("net: mana: Fix oversized sge0 for GSO packets") Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/microsoft/mana/mana_en.c | 2 -- 1 file changed, 2 deletions(-)
--- a/drivers/net/ethernet/microsoft/mana/mana_en.c +++ b/drivers/net/ethernet/microsoft/mana/mana_en.c @@ -262,8 +262,6 @@ netdev_tx_t mana_start_xmit(struct sk_bu ihs = skb_transport_offset(skb) + sizeof(struct udphdr); } else { ihs = skb_tcp_all_headers(skb); - if (ipv6_has_hopopt_jumbo(skb)) - ihs -= sizeof(struct hop_jumbo_hdr); }
u64_stats_update_begin(&tx_stats->syncp);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Haiyang Zhang haiyangz@microsoft.com
[ Upstream commit a43e8e9ffa0d1de058964edf1a0622cbb7e27cfe ]
Handle the case when GSO SKB linear length is too large.
MANA NIC requires GSO packets to put only the header part to SGE0, otherwise the TX queue may stop at the HW level.
So, use 2 SGEs for the skb linear part which contains more than the packet header.
Fixes: ca9c54d2d6a5 ("net: mana: Add a driver for Microsoft Azure Network Adapter (MANA)") Signed-off-by: Haiyang Zhang haiyangz@microsoft.com Reviewed-by: Simon Horman horms@kernel.org Reviewed-by: Shradha Gupta shradhagupta@linux.microsoft.com Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/microsoft/mana/mana_en.c | 191 +++++++++++++----- include/net/mana/mana.h | 5 +- 2 files changed, 138 insertions(+), 58 deletions(-)
diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c index 9f9bd3571da16..6d23a815ddeb6 100644 --- a/drivers/net/ethernet/microsoft/mana/mana_en.c +++ b/drivers/net/ethernet/microsoft/mana/mana_en.c @@ -89,63 +89,137 @@ static unsigned int mana_checksum_info(struct sk_buff *skb) return 0; }
+static void mana_add_sge(struct mana_tx_package *tp, struct mana_skb_head *ash, + int sg_i, dma_addr_t da, int sge_len, u32 gpa_mkey) +{ + ash->dma_handle[sg_i] = da; + ash->size[sg_i] = sge_len; + + tp->wqe_req.sgl[sg_i].address = da; + tp->wqe_req.sgl[sg_i].mem_key = gpa_mkey; + tp->wqe_req.sgl[sg_i].size = sge_len; +} + static int mana_map_skb(struct sk_buff *skb, struct mana_port_context *apc, - struct mana_tx_package *tp) + struct mana_tx_package *tp, int gso_hs) { struct mana_skb_head *ash = (struct mana_skb_head *)skb->head; + int hsg = 1; /* num of SGEs of linear part */ struct gdma_dev *gd = apc->ac->gdma_dev; + int skb_hlen = skb_headlen(skb); + int sge0_len, sge1_len = 0; struct gdma_context *gc; struct device *dev; skb_frag_t *frag; dma_addr_t da; + int sg_i; int i;
gc = gd->gdma_context; dev = gc->dev; - da = dma_map_single(dev, skb->data, skb_headlen(skb), DMA_TO_DEVICE);
+ if (gso_hs && gso_hs < skb_hlen) { + sge0_len = gso_hs; + sge1_len = skb_hlen - gso_hs; + } else { + sge0_len = skb_hlen; + } + + da = dma_map_single(dev, skb->data, sge0_len, DMA_TO_DEVICE); if (dma_mapping_error(dev, da)) return -ENOMEM;
- ash->dma_handle[0] = da; - ash->size[0] = skb_headlen(skb); + mana_add_sge(tp, ash, 0, da, sge0_len, gd->gpa_mkey);
- tp->wqe_req.sgl[0].address = ash->dma_handle[0]; - tp->wqe_req.sgl[0].mem_key = gd->gpa_mkey; - tp->wqe_req.sgl[0].size = ash->size[0]; + if (sge1_len) { + sg_i = 1; + da = dma_map_single(dev, skb->data + sge0_len, sge1_len, + DMA_TO_DEVICE); + if (dma_mapping_error(dev, da)) + goto frag_err; + + mana_add_sge(tp, ash, sg_i, da, sge1_len, gd->gpa_mkey); + hsg = 2; + }
for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) { + sg_i = hsg + i; + frag = &skb_shinfo(skb)->frags[i]; da = skb_frag_dma_map(dev, frag, 0, skb_frag_size(frag), DMA_TO_DEVICE); - if (dma_mapping_error(dev, da)) goto frag_err;
- ash->dma_handle[i + 1] = da; - ash->size[i + 1] = skb_frag_size(frag); - - tp->wqe_req.sgl[i + 1].address = ash->dma_handle[i + 1]; - tp->wqe_req.sgl[i + 1].mem_key = gd->gpa_mkey; - tp->wqe_req.sgl[i + 1].size = ash->size[i + 1]; + mana_add_sge(tp, ash, sg_i, da, skb_frag_size(frag), + gd->gpa_mkey); }
return 0;
frag_err: - for (i = i - 1; i >= 0; i--) - dma_unmap_page(dev, ash->dma_handle[i + 1], ash->size[i + 1], + for (i = sg_i - 1; i >= hsg; i--) + dma_unmap_page(dev, ash->dma_handle[i], ash->size[i], DMA_TO_DEVICE);
- dma_unmap_single(dev, ash->dma_handle[0], ash->size[0], DMA_TO_DEVICE); + for (i = hsg - 1; i >= 0; i--) + dma_unmap_single(dev, ash->dma_handle[i], ash->size[i], + DMA_TO_DEVICE);
return -ENOMEM; }
+/* Handle the case when GSO SKB linear length is too large. + * MANA NIC requires GSO packets to put only the packet header to SGE0. + * So, we need 2 SGEs for the skb linear part which contains more than the + * header. + * Return a positive value for the number of SGEs, or a negative value + * for an error. + */ +static int mana_fix_skb_head(struct net_device *ndev, struct sk_buff *skb, + int gso_hs) +{ + int num_sge = 1 + skb_shinfo(skb)->nr_frags; + int skb_hlen = skb_headlen(skb); + + if (gso_hs < skb_hlen) { + num_sge++; + } else if (gso_hs > skb_hlen) { + if (net_ratelimit()) + netdev_err(ndev, + "TX nonlinear head: hs:%d, skb_hlen:%d\n", + gso_hs, skb_hlen); + + return -EINVAL; + } + + return num_sge; +} + +/* Get the GSO packet's header size */ +static int mana_get_gso_hs(struct sk_buff *skb) +{ + int gso_hs; + + if (skb->encapsulation) { + gso_hs = skb_inner_tcp_all_headers(skb); + } else { + if (skb_shinfo(skb)->gso_type & SKB_GSO_UDP_L4) { + gso_hs = skb_transport_offset(skb) + + sizeof(struct udphdr); + } else { + gso_hs = skb_tcp_all_headers(skb); + } + } + + return gso_hs; +} + netdev_tx_t mana_start_xmit(struct sk_buff *skb, struct net_device *ndev) { enum mana_tx_pkt_format pkt_fmt = MANA_SHORT_PKT_FMT; struct mana_port_context *apc = netdev_priv(ndev); + int gso_hs = 0; /* zero for non-GSO pkts */ u16 txq_idx = skb_get_queue_mapping(skb); struct gdma_dev *gd = apc->ac->gdma_dev; bool ipv4 = false, ipv6 = false; @@ -157,7 +231,6 @@ netdev_tx_t mana_start_xmit(struct sk_buff *skb, struct net_device *ndev) struct mana_txq *txq; struct mana_cq *cq; int err, len; - u16 ihs;
if (unlikely(!apc->port_is_up)) goto tx_drop; @@ -207,19 +280,6 @@ netdev_tx_t mana_start_xmit(struct sk_buff *skb, struct net_device *ndev) pkg.wqe_req.client_data_unit = 0;
pkg.wqe_req.num_sge = 1 + skb_shinfo(skb)->nr_frags; - WARN_ON_ONCE(pkg.wqe_req.num_sge > MAX_TX_WQE_SGL_ENTRIES); - - if (pkg.wqe_req.num_sge <= ARRAY_SIZE(pkg.sgl_array)) { - pkg.wqe_req.sgl = pkg.sgl_array; - } else { - pkg.sgl_ptr = kmalloc_array(pkg.wqe_req.num_sge, - sizeof(struct gdma_sge), - GFP_ATOMIC); - if (!pkg.sgl_ptr) - goto tx_drop_count; - - pkg.wqe_req.sgl = pkg.sgl_ptr; - }
if (skb->protocol == htons(ETH_P_IP)) ipv4 = true; @@ -227,6 +287,26 @@ netdev_tx_t mana_start_xmit(struct sk_buff *skb, struct net_device *ndev) ipv6 = true;
if (skb_is_gso(skb)) { + int num_sge; + + gso_hs = mana_get_gso_hs(skb); + + num_sge = mana_fix_skb_head(ndev, skb, gso_hs); + if (num_sge > 0) + pkg.wqe_req.num_sge = num_sge; + else + goto tx_drop_count; + + u64_stats_update_begin(&tx_stats->syncp); + if (skb->encapsulation) { + tx_stats->tso_inner_packets++; + tx_stats->tso_inner_bytes += skb->len - gso_hs; + } else { + tx_stats->tso_packets++; + tx_stats->tso_bytes += skb->len - gso_hs; + } + u64_stats_update_end(&tx_stats->syncp); + pkg.tx_oob.s_oob.is_outer_ipv4 = ipv4; pkg.tx_oob.s_oob.is_outer_ipv6 = ipv6;
@@ -250,26 +330,6 @@ netdev_tx_t mana_start_xmit(struct sk_buff *skb, struct net_device *ndev) &ipv6_hdr(skb)->daddr, 0, IPPROTO_TCP, 0); } - - if (skb->encapsulation) { - ihs = skb_inner_tcp_all_headers(skb); - u64_stats_update_begin(&tx_stats->syncp); - tx_stats->tso_inner_packets++; - tx_stats->tso_inner_bytes += skb->len - ihs; - u64_stats_update_end(&tx_stats->syncp); - } else { - if (skb_shinfo(skb)->gso_type & SKB_GSO_UDP_L4) { - ihs = skb_transport_offset(skb) + sizeof(struct udphdr); - } else { - ihs = skb_tcp_all_headers(skb); - } - - u64_stats_update_begin(&tx_stats->syncp); - tx_stats->tso_packets++; - tx_stats->tso_bytes += skb->len - ihs; - u64_stats_update_end(&tx_stats->syncp); - } - } else if (skb->ip_summed == CHECKSUM_PARTIAL) { csum_type = mana_checksum_info(skb);
@@ -292,11 +352,25 @@ netdev_tx_t mana_start_xmit(struct sk_buff *skb, struct net_device *ndev) } else { /* Can't do offload of this type of checksum */ if (skb_checksum_help(skb)) - goto free_sgl_ptr; + goto tx_drop_count; } }
- if (mana_map_skb(skb, apc, &pkg)) { + WARN_ON_ONCE(pkg.wqe_req.num_sge > MAX_TX_WQE_SGL_ENTRIES); + + if (pkg.wqe_req.num_sge <= ARRAY_SIZE(pkg.sgl_array)) { + pkg.wqe_req.sgl = pkg.sgl_array; + } else { + pkg.sgl_ptr = kmalloc_array(pkg.wqe_req.num_sge, + sizeof(struct gdma_sge), + GFP_ATOMIC); + if (!pkg.sgl_ptr) + goto tx_drop_count; + + pkg.wqe_req.sgl = pkg.sgl_ptr; + } + + if (mana_map_skb(skb, apc, &pkg, gso_hs)) { u64_stats_update_begin(&tx_stats->syncp); tx_stats->mana_map_err++; u64_stats_update_end(&tx_stats->syncp); @@ -1254,11 +1328,16 @@ static void mana_unmap_skb(struct sk_buff *skb, struct mana_port_context *apc) struct mana_skb_head *ash = (struct mana_skb_head *)skb->head; struct gdma_context *gc = apc->ac->gdma_dev->gdma_context; struct device *dev = gc->dev; - int i; + int hsg, i; + + /* Number of SGEs of linear part */ + hsg = (skb_is_gso(skb) && skb_headlen(skb) > ash->size[0]) ? 2 : 1;
- dma_unmap_single(dev, ash->dma_handle[0], ash->size[0], DMA_TO_DEVICE); + for (i = 0; i < hsg; i++) + dma_unmap_single(dev, ash->dma_handle[i], ash->size[i], + DMA_TO_DEVICE);
- for (i = 1; i < skb_shinfo(skb)->nr_frags + 1; i++) + for (i = hsg; i < skb_shinfo(skb)->nr_frags + hsg; i++) dma_unmap_page(dev, ash->dma_handle[i], ash->size[i], DMA_TO_DEVICE); } diff --git a/include/net/mana/mana.h b/include/net/mana/mana.h index 024ad8ddb27e5..571cc011b0ec5 100644 --- a/include/net/mana/mana.h +++ b/include/net/mana/mana.h @@ -101,9 +101,10 @@ struct mana_txq {
/* skb data and frags dma mappings */ struct mana_skb_head { - dma_addr_t dma_handle[MAX_SKB_FRAGS + 1]; + /* GSO pkts may have 2 SGEs for the linear part*/ + dma_addr_t dma_handle[MAX_SKB_FRAGS + 2];
- u32 size[MAX_SKB_FRAGS + 1]; + u32 size[MAX_SKB_FRAGS + 2]; };
#define MANA_HEADROOM sizeof(struct mana_skb_head)
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Christophe JAILLET christophe.jaillet@wanadoo.fr
[ Upstream commit b07b6b27a50e3a740c9aa6260ee4bb3ab29515ab ]
The commit in Fixes updated the error handling path of thunderstrike_create() and the remove function but not the error handling path of shield_probe(), should an error occur after a successful thunderstrike_create() call.
Add the missing call. Make sure it is safe to call in the probe error handling path by preventing the led_classdev from attempting to set the LED brightness to the off state on unregister.
Fixes: f88af60e74a5 ("HID: nvidia-shield: Support LED functionality for Thunderstrike") Signed-off-by: Christophe JAILLET christophe.jaillet@wanadoo.fr Reviewed-by: Rahul Rameshbabu rrameshbabu@nvidia.com Signed-off-by: Jiri Kosina jkosina@suse.cz Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/hid/hid-nvidia-shield.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/hid/hid-nvidia-shield.c b/drivers/hid/hid-nvidia-shield.c index 9c44974135079..1ce9e42f57c71 100644 --- a/drivers/hid/hid-nvidia-shield.c +++ b/drivers/hid/hid-nvidia-shield.c @@ -482,7 +482,7 @@ static inline int thunderstrike_led_create(struct thunderstrike *ts)
led->name = "thunderstrike:blue:led"; led->max_brightness = 1; - led->flags = LED_CORE_SUSPENDRESUME; + led->flags = LED_CORE_SUSPENDRESUME | LED_RETAIN_AT_SHUTDOWN; led->brightness_get = &thunderstrike_led_get_brightness; led->brightness_set = &thunderstrike_led_set_brightness;
@@ -694,6 +694,7 @@ static int shield_probe(struct hid_device *hdev, const struct hid_device_id *id) err_haptics: if (ts->haptics_dev) input_unregister_device(ts->haptics_dev); + led_classdev_unregister(&ts->led_dev); return ret; }
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jiri Kosina jkosina@suse.cz
[ Upstream commit b328dd02e19cb9d3b35de4322f5363516a20ac8c ]
usb_free_urb() does the NULL check itself, so there is no need to duplicate it prior to calling.
Reported-by: kernel test robot lkp@intel.com Fixes: e1cd4004cde7c9 ("HID: sony: Fix a potential memory leak in sony_probe()") Signed-off-by: Jiri Kosina jkosina@suse.cz Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/hid/hid-sony.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/hid/hid-sony.c b/drivers/hid/hid-sony.c index a02046a78b2da..ebc0aa4e4345f 100644 --- a/drivers/hid/hid-sony.c +++ b/drivers/hid/hid-sony.c @@ -2155,8 +2155,7 @@ static int sony_probe(struct hid_device *hdev, const struct hid_device_id *id) return ret;
err: - if (sc->ghl_urb) - usb_free_urb(sc->ghl_urb); + usb_free_urb(sc->ghl_urb);
hid_hw_stop(hdev); return ret;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Srinivas Pandruvada srinivas.pandruvada@linux.intel.com
[ Upstream commit 8f02139ad9a7e6e5c05712f8c1501eebed8eacfd ]
The EHL (Elkhart Lake) based platforms provide a OOB (Out of band) service, which allows to wakup device when the system is in S5 (Soft-Off state). This OOB service can be enabled/disabled from BIOS settings. When enabled, the ISH device gets PME wake capability. To enable PME wakeup, driver also needs to enable ACPI GPE bit.
On resume, BIOS will clear the wakeup bit. So driver need to re-enable it in resume function to keep the next wakeup capability. But this BIOS clearing of wakeup bit doesn't decrement internal OS GPE reference count, so this reenabling on every resume will cause reference count to overflow.
So first disable and reenable ACPI GPE bit using acpi_disable_gpe().
Fixes: 2e23a70edabe ("HID: intel-ish-hid: ipc: finish power flow for EHL OOB") Reported-by: Kai-Heng Feng kai.heng.feng@canonical.com Closes: https://lore.kernel.org/lkml/CAAd53p4=oLYiH2YbVSmrPNj1zpMcfp=Wxbasb5vhMXOWCA... Tested-by: Kai-Heng Feng kai.heng.feng@canonical.com Signed-off-by: Srinivas Pandruvada srinivas.pandruvada@linux.intel.com Signed-off-by: Jiri Kosina jkosina@suse.cz Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/hid/intel-ish-hid/ipc/pci-ish.c | 8 ++++++++ 1 file changed, 8 insertions(+)
diff --git a/drivers/hid/intel-ish-hid/ipc/pci-ish.c b/drivers/hid/intel-ish-hid/ipc/pci-ish.c index 55cb25038e632..710fda5f19e1c 100644 --- a/drivers/hid/intel-ish-hid/ipc/pci-ish.c +++ b/drivers/hid/intel-ish-hid/ipc/pci-ish.c @@ -133,6 +133,14 @@ static int enable_gpe(struct device *dev) } wakeup = &adev->wakeup;
+ /* + * Call acpi_disable_gpe(), so that reference count + * gpe_event_info->runtime_count doesn't overflow. + * When gpe_event_info->runtime_count = 0, the call + * to acpi_disable_gpe() simply return. + */ + acpi_disable_gpe(wakeup->gpe_device, wakeup->gpe_number); + acpi_sts = acpi_enable_gpe(wakeup->gpe_device, wakeup->gpe_number); if (ACPI_FAILURE(acpi_sts)) { dev_err(dev, "enable ose_gpe failed\n");
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Randy Dunlap rdunlap@infradead.org
[ Upstream commit 566aeed6871ac2189b5bfe03e1a5b3b7be5eca38 ]
Since FIXED_PHY depends on PHYLIB, PHYLIB needs to be set to avoid a kconfig warning:
WARNING: unmet direct dependencies detected for FIXED_PHY Depends on [n]: NETDEVICES [=y] && PHYLIB [=n] Selected by [y]: - LAN743X [=y] && NETDEVICES [=y] && ETHERNET [=y] && NET_VENDOR_MICROCHIP [=y] && PCI [=y] && PTP_1588_CLOCK_OPTIONAL [=y]
Fixes: 73c4d1b307ae ("net: lan743x: select FIXED_PHY") Signed-off-by: Randy Dunlap rdunlap@infradead.org Reported-by: kernel test robot lkp@intel.com Closes: lore.kernel.org/r/202309261802.JPbRHwti-lkp@intel.com Cc: Bryan Whitehead bryan.whitehead@microchip.com Cc: UNGLinuxDriver@microchip.com Reviewed-by: Simon Horman horms@kernel.org Tested-by: Simon Horman horms@kernel.org # build-tested Link: https://lore.kernel.org/r/20231002193544.14529-1-rdunlap@infradead.org Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/microchip/Kconfig | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/net/ethernet/microchip/Kconfig b/drivers/net/ethernet/microchip/Kconfig index 329e374b9539c..43ba71e82260c 100644 --- a/drivers/net/ethernet/microchip/Kconfig +++ b/drivers/net/ethernet/microchip/Kconfig @@ -46,6 +46,7 @@ config LAN743X tristate "LAN743x support" depends on PCI depends on PTP_1588_CLOCK_OPTIONAL + select PHYLIB select FIXED_PHY select CRC16 select CRC32
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: John David Anglin dave@parisc-linux.org
commit 914988e099fc658436fbd7b8f240160c352b6552 upstream.
Back in 2005, Kyle McMartin removed the 16-byte alignment for ldcw semaphores on PA 2.0 machines (CONFIG_PA20). This broke spinlocks on pre PA8800 processors. The main symptom was random faults in mmap'd memory (e.g., gcc compilations, etc).
Unfortunately, the errata for this ldcw change is lost.
The issue is the 16-byte alignment required for ldcw semaphore instructions can only be reduced to natural alignment when the ldcw operation can be handled coherently in cache. Only PA8800 and PA8900 processors actually support doing the operation in cache.
Aligning the spinlock dynamically adds two integer instructions to each spinlock.
Tested on rp3440, c8000 and a500.
Signed-off-by: John David Anglin dave.anglin@bell.net Link: https://lore.kernel.org/linux-parisc/6b332788-2227-127f-ba6d-55e99ecf4ed8@be... Link: https://lore.kernel.org/linux-parisc/20050609050702.GB4641@roadwarrior.mcmar... Cc: stable@vger.kernel.org Signed-off-by: Helge Deller deller@gmx.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/parisc/include/asm/ldcw.h | 37 ++++++++++++++++--------------- arch/parisc/include/asm/spinlock_types.h | 5 ---- 2 files changed, 20 insertions(+), 22 deletions(-)
--- a/arch/parisc/include/asm/ldcw.h +++ b/arch/parisc/include/asm/ldcw.h @@ -2,39 +2,42 @@ #ifndef __PARISC_LDCW_H #define __PARISC_LDCW_H
-#ifndef CONFIG_PA20 /* Because kmalloc only guarantees 8-byte alignment for kmalloc'd data, and GCC only guarantees 8-byte alignment for stack locals, we can't be assured of 16-byte alignment for atomic lock data even if we specify "__attribute ((aligned(16)))" in the type declaration. So, we use a struct containing an array of four ints for the atomic lock type and dynamically select the 16-byte aligned int from the array - for the semaphore. */ + for the semaphore. */ + +/* From: "Jim Hull" <jim.hull of hp.com> + I've attached a summary of the change, but basically, for PA 2.0, as + long as the ",CO" (coherent operation) completer is implemented, then the + 16-byte alignment requirement for ldcw and ldcd is relaxed, and instead + they only require "natural" alignment (4-byte for ldcw, 8-byte for + ldcd). + + Although the cache control hint is accepted by all PA 2.0 processors, + it is only implemented on PA8800/PA8900 CPUs. Prior PA8X00 CPUs still + require 16-byte alignment. If the address is unaligned, the operation + of the instruction is undefined. The ldcw instruction does not generate + unaligned data reference traps so misaligned accesses are not detected. + This hid the problem for years. So, restore the 16-byte alignment dropped + by Kyle McMartin in "Remove __ldcw_align for PA-RISC 2.0 processors". */
#define __PA_LDCW_ALIGNMENT 16 -#define __PA_LDCW_ALIGN_ORDER 4 #define __ldcw_align(a) ({ \ unsigned long __ret = (unsigned long) &(a)->lock[0]; \ __ret = (__ret + __PA_LDCW_ALIGNMENT - 1) \ & ~(__PA_LDCW_ALIGNMENT - 1); \ (volatile unsigned int *) __ret; \ }) -#define __LDCW "ldcw"
-#else /*CONFIG_PA20*/ -/* From: "Jim Hull" <jim.hull of hp.com> - I've attached a summary of the change, but basically, for PA 2.0, as - long as the ",CO" (coherent operation) completer is specified, then the - 16-byte alignment requirement for ldcw and ldcd is relaxed, and instead - they only require "natural" alignment (4-byte for ldcw, 8-byte for - ldcd). */ - -#define __PA_LDCW_ALIGNMENT 4 -#define __PA_LDCW_ALIGN_ORDER 2 -#define __ldcw_align(a) (&(a)->slock) +#ifdef CONFIG_PA20 #define __LDCW "ldcw,co" - -#endif /*!CONFIG_PA20*/ +#else +#define __LDCW "ldcw" +#endif
/* LDCW, the only atomic read-write operation PA-RISC has. *sigh*. We don't explicitly expose that "*a" may be written as reload --- a/arch/parisc/include/asm/spinlock_types.h +++ b/arch/parisc/include/asm/spinlock_types.h @@ -9,15 +9,10 @@ #ifndef __ASSEMBLY__
typedef struct { -#ifdef CONFIG_PA20 - volatile unsigned int slock; -# define __ARCH_SPIN_LOCK_UNLOCKED { __ARCH_SPIN_LOCK_UNLOCKED_VAL } -#else volatile unsigned int lock[4]; # define __ARCH_SPIN_LOCK_UNLOCKED \ { { __ARCH_SPIN_LOCK_UNLOCKED_VAL, __ARCH_SPIN_LOCK_UNLOCKED_VAL, \ __ARCH_SPIN_LOCK_UNLOCKED_VAL, __ARCH_SPIN_LOCK_UNLOCKED_VAL } } -#endif } arch_spinlock_t;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jordan Rife jrife@google.com
commit cedc019b9f260facfadd20c6c490e403abf292e3 upstream.
Recent changes to kernel_connect() and kernel_bind() ensure that callers are insulated from changes to the address parameter made by BPF SOCK_ADDR hooks. This patch wraps direct calls to ops->connect() and ops->bind() with kernel_connect() and kernel_bind() to ensure that SMB mounts do not see their mount address overwritten in such cases.
Link: https://lore.kernel.org/netdev/9944248dba1bce861375fcce9de663934d933ba9.came... Cc: stable@vger.kernel.org # 6.0+ Signed-off-by: Jordan Rife jrife@google.com Acked-by: Paulo Alcantara (SUSE) pc@manguebit.com Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/smb/client/connect.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-)
--- a/fs/smb/client/connect.c +++ b/fs/smb/client/connect.c @@ -2890,9 +2890,9 @@ bind_socket(struct TCP_Server_Info *serv if (server->srcaddr.ss_family != AF_UNSPEC) { /* Bind to the specified local IP address */ struct socket *socket = server->ssocket; - rc = socket->ops->bind(socket, - (struct sockaddr *) &server->srcaddr, - sizeof(server->srcaddr)); + rc = kernel_bind(socket, + (struct sockaddr *) &server->srcaddr, + sizeof(server->srcaddr)); if (rc < 0) { struct sockaddr_in *saddr4; struct sockaddr_in6 *saddr6; @@ -3041,8 +3041,8 @@ generic_ip_connect(struct TCP_Server_Inf socket->sk->sk_sndbuf, socket->sk->sk_rcvbuf, socket->sk->sk_rcvtimeo);
- rc = socket->ops->connect(socket, saddr, slen, - server->noblockcnt ? O_NONBLOCK : 0); + rc = kernel_connect(socket, saddr, slen, + server->noblockcnt ? O_NONBLOCK : 0); /* * When mounting SMB root file systems, we do not want to block in * connect. Otherwise bail out and then let cifs_reconnect() perform
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Helge Deller deller@gmx.de
commit d3b3c637e4eb8d3bbe53e5692aee66add72f9851 upstream.
John David Anglin reported that giving "nr_cpus=1" on the command line causes a crash, while "maxcpus=1" works.
Reported-by: John David Anglin dave.anglin@bell.net Signed-off-by: Helge Deller deller@gmx.de Cc: stable@vger.kernel.org # v5.18+ Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/parisc/kernel/smp.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
--- a/arch/parisc/kernel/smp.c +++ b/arch/parisc/kernel/smp.c @@ -440,7 +440,9 @@ int __cpu_up(unsigned int cpu, struct ta if (cpu_online(cpu)) return 0;
- if (num_online_cpus() < setup_max_cpus && smp_boot_one_cpu(cpu, tidle)) + if (num_online_cpus() < nr_cpu_ids && + num_online_cpus() < setup_max_cpus && + smp_boot_one_cpu(cpu, tidle)) return -EIO;
return cpu_online(cpu) ? 0 : -EIO;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Fedor Pchelkin pchelkin@ispras.ru
commit 9850ccd5dd88075b2b7fd28d96299d5535f58cc5 upstream.
Commit 4dba12881f88 ("dm zoned: support arbitrary number of devices") made the pointers to additional zoned devices to be stored in a dynamically allocated dmz->ddev array. However, this array is not freed.
Rename dmz_put_zoned_device to dmz_put_zoned_devices and fix it to free the dmz->ddev array when cleaning up zoned device information. Remove NULL assignment for all dmz->ddev elements and just free the dmz->ddev array instead.
Found by Linux Verification Center (linuxtesting.org).
Fixes: 4dba12881f88 ("dm zoned: support arbitrary number of devices") Cc: stable@vger.kernel.org Signed-off-by: Fedor Pchelkin pchelkin@ispras.ru Signed-off-by: Mike Snitzer snitzer@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/md/dm-zoned-target.c | 15 +++++++-------- 1 file changed, 7 insertions(+), 8 deletions(-)
--- a/drivers/md/dm-zoned-target.c +++ b/drivers/md/dm-zoned-target.c @@ -748,17 +748,16 @@ err: /* * Cleanup zoned device information. */ -static void dmz_put_zoned_device(struct dm_target *ti) +static void dmz_put_zoned_devices(struct dm_target *ti) { struct dmz_target *dmz = ti->private; int i;
- for (i = 0; i < dmz->nr_ddevs; i++) { - if (dmz->ddev[i]) { + for (i = 0; i < dmz->nr_ddevs; i++) + if (dmz->ddev[i]) dm_put_device(ti, dmz->ddev[i]); - dmz->ddev[i] = NULL; - } - } + + kfree(dmz->ddev); }
static int dmz_fixup_devices(struct dm_target *ti) @@ -948,7 +947,7 @@ err_bio: err_meta: dmz_dtr_metadata(dmz->metadata); err_dev: - dmz_put_zoned_device(ti); + dmz_put_zoned_devices(ti); err: kfree(dmz->dev); kfree(dmz); @@ -978,7 +977,7 @@ static void dmz_dtr(struct dm_target *ti
bioset_exit(&dmz->bio_set);
- dmz_put_zoned_device(ti); + dmz_put_zoned_devices(ti);
mutex_destroy(&dmz->chunk_lock);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Leon Romanovsky leonro@nvidia.com
commit c38d23a54445f9a8aa6831fafc9af0496ba02f9e upstream.
Like any other set command, require admin permissions to do it.
Cc: stable@vger.kernel.org Fixes: 2b34c5580226 ("RDMA/core: Add command to set ib_core device net namspace sharing mode") Link: https://lore.kernel.org/r/75d329fdd7381b52cbdf87910bef16c9965abb1f.169644343... Reviewed-by: Parav Pandit parav@nvidia.com Signed-off-by: Leon Romanovsky leonro@nvidia.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/infiniband/core/nldev.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/infiniband/core/nldev.c +++ b/drivers/infiniband/core/nldev.c @@ -2529,6 +2529,7 @@ static const struct rdma_nl_cbs nldev_cb }, [RDMA_NLDEV_CMD_SYS_SET] = { .doit = nldev_set_sys_set_doit, + .flags = RDMA_NL_ADMIN_PERM, }, [RDMA_NLDEV_CMD_STAT_SET] = { .doit = nldev_stat_set_doit,
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dan Carpenter dan.carpenter@linaro.org
commit 55e95bfccf6db8d26a66c46e1de50d53c59a6774 upstream.
Smatch complains that the error path where "action" is invalid leaks the "ce" allocation: drivers/of/dynamic.c:935 of_changeset_action() warn: possible memory leak of 'ce'
Fix this by doing the validation before the allocation.
Note that there is not any actual problem with upstream kernels. All callers of of_changeset_action() are static inlines with fixed action values.
Fixes: 914d9d831e61 ("of: dynamic: Refactor action prints to not use "%pOF" inside devtree_lock") Reported-by: kernel test robot lkp@intel.com Closes: https://lore.kernel.org/r/202309011059.EOdr4im9-lkp@intel.com/ Signed-off-by: Dan Carpenter dan.carpenter@linaro.org Reviewed-by: Geert Uytterhoeven geert+renesas@glider.be Link: https://lore.kernel.org/r/7dfaf999-30ad-491c-9615-fb1138db121c@moroto.mounta... Signed-off-by: Rob Herring robh@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/of/dynamic.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
--- a/drivers/of/dynamic.c +++ b/drivers/of/dynamic.c @@ -927,13 +927,13 @@ int of_changeset_action(struct of_change { struct of_changeset_entry *ce;
+ if (WARN_ON(action >= ARRAY_SIZE(action_names))) + return -EINVAL; + ce = kzalloc(sizeof(*ce), GFP_KERNEL); if (!ce) return -ENOMEM;
- if (WARN_ON(action >= ARRAY_SIZE(action_names))) - return -EINVAL; - /* get a reference to the node */ ce->action = action; ce->np = of_node_get(np);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Christophe JAILLET christophe.jaillet@wanadoo.fr
commit d7f393430a17c2bfcdf805462a5aa80be4285b27 upstream.
In order to be sure that 'buff' is never truncated, its size should be 12, not 11.
When building with W=1, this fixes the following warnings:
drivers/infiniband/hw/mlx4/sysfs.c: In function ‘add_port_entries’: drivers/infiniband/hw/mlx4/sysfs.c:268:34: error: ‘sprintf’ may write a terminating nul past the end of the destination [-Werror=format-overflow=] 268 | sprintf(buff, "%d", i); | ^ drivers/infiniband/hw/mlx4/sysfs.c:268:17: note: ‘sprintf’ output between 2 and 12 bytes into a destination of size 11 268 | sprintf(buff, "%d", i); | ^~~~~~~~~~~~~~~~~~~~~~ drivers/infiniband/hw/mlx4/sysfs.c:286:34: error: ‘sprintf’ may write a terminating nul past the end of the destination [-Werror=format-overflow=] 286 | sprintf(buff, "%d", i); | ^ drivers/infiniband/hw/mlx4/sysfs.c:286:17: note: ‘sprintf’ output between 2 and 12 bytes into a destination of size 11 286 | sprintf(buff, "%d", i); | ^~~~~~~~~~~~~~~~~~~~~~
Fixes: c1e7e466120b ("IB/mlx4: Add iov directory in sysfs under the ib device") Signed-off-by: Christophe JAILLET christophe.jaillet@wanadoo.fr Link: https://lore.kernel.org/r/0bb1443eb47308bc9be30232cc23004c4d4cf43e.169544853... Signed-off-by: Leon Romanovsky leon@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/infiniband/hw/mlx4/sysfs.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/infiniband/hw/mlx4/sysfs.c +++ b/drivers/infiniband/hw/mlx4/sysfs.c @@ -223,7 +223,7 @@ void del_sysfs_port_mcg_attr(struct mlx4 static int add_port_entries(struct mlx4_ib_dev *device, int port_num) { int i; - char buff[11]; + char buff[12]; struct mlx4_ib_iov_port *port = NULL; int ret = 0 ; struct ib_port_attr attr;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Bartosz Golaszewski bartosz.golaszewski@linaro.org
commit f9315f17bf778cb8079a29639419fcc8a41a3c84 upstream.
pinctrl_gpio_set_config() expects the GPIO number from the global GPIO numberspace, not the controller-relative offset, which needs to be added to the chip base.
Fixes: 5ae4cb94b313 ("gpio: aspeed: Add debounce support") Signed-off-by: Bartosz Golaszewski bartosz.golaszewski@linaro.org Reviewed-by: Andy Shevchenko andy@kernel.org Reviewed-by: Andrew Jeffery andrew@codeconstruct.com.au Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpio/gpio-aspeed.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/gpio/gpio-aspeed.c +++ b/drivers/gpio/gpio-aspeed.c @@ -973,7 +973,7 @@ static int aspeed_gpio_set_config(struct else if (param == PIN_CONFIG_BIAS_DISABLE || param == PIN_CONFIG_BIAS_PULL_DOWN || param == PIN_CONFIG_DRIVE_STRENGTH) - return pinctrl_gpio_set_config(offset, config); + return pinctrl_gpio_set_config(chip->base + offset, config); else if (param == PIN_CONFIG_DRIVE_OPEN_DRAIN || param == PIN_CONFIG_DRIVE_OPEN_SOURCE) /* Return -ENOTSUPP to trigger emulation, as per datasheet */
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Duje Mihanović duje.mihanovic@skole.hr
commit f0575116507b981e6a810e78ce3c9040395b958b upstream.
Similarly to PXA3xx and MMP2, pinctrl-single isn't capable of setting pin direction on MMP either.
Fixes: a770d946371e ("gpio: pxa: add pin control gpio direction and request") Signed-off-by: Duje Mihanović duje.mihanovic@skole.hr Reviewed-by: Andy Shevchenko andy@kernel.org Signed-off-by: Bartosz Golaszewski bartosz.golaszewski@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpio/gpio-pxa.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/gpio/gpio-pxa.c +++ b/drivers/gpio/gpio-pxa.c @@ -238,6 +238,7 @@ static bool pxa_gpio_has_pinctrl(void) switch (gpio_type) { case PXA3XX_GPIO: case MMP2_GPIO: + case MMP_GPIO: return false;
default:
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mark Zhang markzhang@nvidia.com
commit e0fe97efdb00f0f32b038a4836406a82886aec9c upstream.
Initialize the structure to 0 so that it's fields won't have random values. For example fields like rec.traffic_class (as well as rec.flow_label and rec.sl) is used to generate the user AH through: cma_iboe_join_multicast cma_make_mc_event ib_init_ah_from_mcmember
And a random traffic_class causes a random IP DSCP in RoCEv2.
Fixes: b5de0c60cc30 ("RDMA/cma: Fix use after free race in roce multicast join") Signed-off-by: Mark Zhang markzhang@nvidia.com Link: https://lore.kernel.org/r/20230927090511.603595-1-markzhang@nvidia.com Signed-off-by: Leon Romanovsky leon@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/infiniband/core/cma.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/infiniband/core/cma.c +++ b/drivers/infiniband/core/cma.c @@ -4946,7 +4946,7 @@ static int cma_iboe_join_multicast(struc int err = 0; struct sockaddr *addr = (struct sockaddr *)&mc->addr; struct net_device *ndev = NULL; - struct ib_sa_multicast ib; + struct ib_sa_multicast ib = {}; enum ib_gid_type gid_type; bool send_only;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Leon Romanovsky leonro@nvidia.com
commit 18126c767658ae8a831257c6cb7776c5ba5e7249 upstream.
The following compilation error is false alarm as RDMA devices don't have such large amount of ports to actually cause to format truncation.
drivers/infiniband/core/cma_configfs.c: In function ‘make_cma_ports’: drivers/infiniband/core/cma_configfs.c:223:57: error: ‘snprintf’ output may be truncated before the last format character [-Werror=format-truncation=] 223 | snprintf(port_str, sizeof(port_str), "%u", i + 1); | ^ drivers/infiniband/core/cma_configfs.c:223:17: note: ‘snprintf’ output between 2 and 11 bytes into a destination of size 10 223 | snprintf(port_str, sizeof(port_str), "%u", i + 1); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ cc1: all warnings being treated as errors make[5]: *** [scripts/Makefile.build:243: drivers/infiniband/core/cma_configfs.o] Error 1
Fixes: 045959db65c6 ("IB/cma: Add configfs for rdma_cm") Link: https://lore.kernel.org/r/a7e3b347ee134167fa6a3787c56ef231a04bc8c2.169443463... Signed-off-by: Leon Romanovsky leonro@nvidia.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/infiniband/core/cma_configfs.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/infiniband/core/cma_configfs.c +++ b/drivers/infiniband/core/cma_configfs.c @@ -217,7 +217,7 @@ static int make_cma_ports(struct cma_dev return -ENOMEM;
for (i = 0; i < ports_num; i++) { - char port_str[10]; + char port_str[11];
ports[i].port_num = i + 1; snprintf(port_str, sizeof(port_str), "%u", i + 1);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Selvin Xavier selvin.xavier@broadcom.com
commit 9fc5f9a92fe6897dbed7b9295b234cb7e3cc9d11 upstream.
Flag that indicate control path command completion should be cleared only after copying the command response data. As soon as the is_in_used flag is clear, the waiting thread can proceed with wrong response data. This wrong data is causing multiple issues like wrong lkey used in data traffic and wrong AH Id etc.
Use a memory barrier to ensure that the response data is copied and visible to the process waiting on a different cpu core before clearing the is_in_used flag.
Clear the is_in_used after copying the command response.
Fixes: bcfee4ce3e01 ("RDMA/bnxt_re: remove redundant cmdq_bitmap") Signed-off-by: Saravanan Vajravel saravanan.vajravel@broadcom.com Signed-off-by: Selvin Xavier selvin.xavier@broadcom.com Link: https://lore.kernel.org/r/1695199280-13520-2-git-send-email-selvin.xavier@br... Signed-off-by: Leon Romanovsky leon@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/infiniband/hw/bnxt_re/qplib_rcfw.c | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-)
--- a/drivers/infiniband/hw/bnxt_re/qplib_rcfw.c +++ b/drivers/infiniband/hw/bnxt_re/qplib_rcfw.c @@ -664,7 +664,6 @@ static int bnxt_qplib_process_qp_event(s blocked = cookie & RCFW_CMD_IS_BLOCKING; cookie &= RCFW_MAX_COOKIE_VALUE; crsqe = &rcfw->crsqe_tbl[cookie]; - crsqe->is_in_used = false;
if (WARN_ONCE(test_bit(FIRMWARE_STALL_DETECTED, &rcfw->cmdq.flags), @@ -680,8 +679,14 @@ static int bnxt_qplib_process_qp_event(s atomic_dec(&rcfw->timeout_send);
if (crsqe->is_waiter_alive) { - if (crsqe->resp) + if (crsqe->resp) { memcpy(crsqe->resp, qp_event, sizeof(*qp_event)); + /* Insert write memory barrier to ensure that + * response data is copied before clearing the + * flags + */ + smp_wmb(); + } if (!blocked) wait_cmds++; } @@ -693,6 +698,8 @@ static int bnxt_qplib_process_qp_event(s if (!is_waiter_alive) crsqe->resp = NULL;
+ crsqe->is_in_used = false; + hwq->cons += req_size;
/* This is a case to handle below scenario -
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Konstantin Meskhidze konstantin.meskhidze@huawei.com
commit c489800e0d48097fc6afebd862c6afa039110a36 upstream.
Since size of 'hdr' pointer and '*hdr' structure is equal on 64-bit machines issue probably didn't cause any wrong behavior. But anyway, fixing of typo is required.
Fixes: da0f60df7bd5 ("RDMA/uverbs: Prohibit write() calls with too small buffers") Co-developed-by: Ivanov Mikhail ivanov.mikhail1@huawei-partners.com Signed-off-by: Ivanov Mikhail ivanov.mikhail1@huawei-partners.com Signed-off-by: Konstantin Meskhidze konstantin.meskhidze@huawei.com Link: https://lore.kernel.org/r/20230905103258.1738246-1-konstantin.meskhidze@huaw... Signed-off-by: Leon Romanovsky leon@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/infiniband/core/uverbs_main.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/infiniband/core/uverbs_main.c +++ b/drivers/infiniband/core/uverbs_main.c @@ -535,7 +535,7 @@ static ssize_t verify_hdr(struct ib_uver if (hdr->in_words * 4 != count) return -EINVAL;
- if (count < method_elm->req_size + sizeof(hdr)) { + if (count < method_elm->req_size + sizeof(*hdr)) { /* * rdma-core v18 and v19 have a bug where they send DESTROY_CQ * with a 16 byte write instead of 24. Old kernels didn't
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Bart Van Assche bvanassche@acm.org
commit e193b7955dfad68035b983a0011f4ef3590c85eb upstream.
After scmd_eh_abort_handler() has called the SCSI LLD eh_abort_handler callback, it performs one of the following actions: * Call scsi_queue_insert(). * Call scsi_finish_command(). * Call scsi_eh_scmd_add(). Hence, SCSI abort handlers must not call scsi_done(). Otherwise all the above actions would trigger a use-after-free. Hence remove the scsi_done() call from srp_abort(). Keep the srp_free_req() call before returning SUCCESS because we may not see the command again if SUCCESS is returned.
Cc: Bob Pearson rpearsonhpe@gmail.com Cc: Shinichiro Kawasaki shinichiro.kawasaki@wdc.com Fixes: d8536670916a ("IB/srp: Avoid having aborted requests hang") Signed-off-by: Bart Van Assche bvanassche@acm.org Link: https://lore.kernel.org/r/20230823205727.505681-1-bvanassche@acm.org Signed-off-by: Leon Romanovsky leon@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/infiniband/ulp/srp/ib_srp.c | 16 +++++----------- 1 file changed, 5 insertions(+), 11 deletions(-)
--- a/drivers/infiniband/ulp/srp/ib_srp.c +++ b/drivers/infiniband/ulp/srp/ib_srp.c @@ -2784,7 +2784,6 @@ static int srp_abort(struct scsi_cmnd *s u32 tag; u16 ch_idx; struct srp_rdma_ch *ch; - int ret;
shost_printk(KERN_ERR, target->scsi_host, "SRP abort called\n");
@@ -2798,19 +2797,14 @@ static int srp_abort(struct scsi_cmnd *s shost_printk(KERN_ERR, target->scsi_host, "Sending SRP abort for tag %#x\n", tag); if (srp_send_tsk_mgmt(ch, tag, scmnd->device->lun, - SRP_TSK_ABORT_TASK, NULL) == 0) - ret = SUCCESS; - else if (target->rport->state == SRP_RPORT_LOST) - ret = FAST_IO_FAIL; - else - ret = FAILED; - if (ret == SUCCESS) { + SRP_TSK_ABORT_TASK, NULL) == 0) { srp_free_req(ch, req, scmnd, 0); - scmnd->result = DID_ABORT << 16; - scsi_done(scmnd); + return SUCCESS; } + if (target->rport->state == SRP_RPORT_LOST) + return FAST_IO_FAIL;
- return ret; + return FAILED; }
static int srp_reset_device(struct scsi_cmnd *scmnd)
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Bernard Metzler bmt@zurich.ibm.com
commit 53a3f777049771496f791504e7dc8ef017cba590 upstream.
In case immediate MPA request processing fails, the newly created endpoint unlinks the listening endpoint and is ready to be dropped. This special case was not handled correctly by the code handling the later TCP socket close, causing a NULL dereference crash in siw_cm_work_handler() when dereferencing a NULL listener. We now also cancel the useless MPA timeout, if immediate MPA request processing fails.
This patch furthermore simplifies MPA processing in general: Scheduling a useless TCP socket read in sk_data_ready() upcall is now surpressed, if the socket is already moved out of TCP_ESTABLISHED state.
Fixes: 6c52fdc244b5 ("rdma/siw: connection management") Signed-off-by: Bernard Metzler bmt@zurich.ibm.com Link: https://lore.kernel.org/r/20230905145822.446263-1-bmt@zurich.ibm.com Signed-off-by: Leon Romanovsky leon@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/infiniband/sw/siw/siw_cm.c | 16 ++++++++++++---- 1 file changed, 12 insertions(+), 4 deletions(-)
--- a/drivers/infiniband/sw/siw/siw_cm.c +++ b/drivers/infiniband/sw/siw/siw_cm.c @@ -976,6 +976,7 @@ static void siw_accept_newconn(struct si siw_cep_put(cep); new_cep->listen_cep = NULL; if (rv) { + siw_cancel_mpatimer(new_cep); siw_cep_set_free(new_cep); goto error; } @@ -1100,9 +1101,12 @@ static void siw_cm_work_handler(struct w /* * Socket close before MPA request received. */ - siw_dbg_cep(cep, "no mpareq: drop listener\n"); - siw_cep_put(cep->listen_cep); - cep->listen_cep = NULL; + if (cep->listen_cep) { + siw_dbg_cep(cep, + "no mpareq: drop listener\n"); + siw_cep_put(cep->listen_cep); + cep->listen_cep = NULL; + } } } release_cep = 1; @@ -1227,7 +1231,11 @@ static void siw_cm_llp_data_ready(struct if (!cep) goto out;
- siw_dbg_cep(cep, "state: %d\n", cep->state); + siw_dbg_cep(cep, "cep state: %d, socket state %d\n", + cep->state, sk->sk_state); + + if (sk->sk_state != TCP_ESTABLISHED) + goto out;
switch (cep->state) { case SIW_EPSTATE_RDMA_MODE:
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Shay Drory shayd@nvidia.com
commit 374012b0045780b7ad498be62e85153009bb7fe9 upstream.
Fix the deadlock by refactoring the MR cache cleanup flow to flush the workqueue without holding the rb_lock. This adds a race between cache cleanup and creation of new entries which we solve by denied creation of new entries after cache cleanup started.
Lockdep: WARNING: possible circular locking dependency detected [ 2785.326074 ] 6.2.0-rc6_for_upstream_debug_2023_01_31_14_02 #1 Not tainted [ 2785.339778 ] ------------------------------------------------------ [ 2785.340848 ] devlink/53872 is trying to acquire lock: [ 2785.341701 ] ffff888124f8c0c8 ((work_completion)(&(&ent->dwork)->work)){+.+.}-{0:0}, at: __flush_work+0xc8/0x900 [ 2785.343403 ] [ 2785.343403 ] but task is already holding lock: [ 2785.344464 ] ffff88817e8f1260 (&dev->cache.rb_lock){+.+.}-{3:3}, at: mlx5_mkey_cache_cleanup+0x77/0x250 [mlx5_ib] [ 2785.346273 ] [ 2785.346273 ] which lock already depends on the new lock. [ 2785.346273 ] [ 2785.347720 ] [ 2785.347720 ] the existing dependency chain (in reverse order) is: [ 2785.349003 ] [ 2785.349003 ] -> #1 (&dev->cache.rb_lock){+.+.}-{3:3}: [ 2785.350160 ] __mutex_lock+0x14c/0x15c0 [ 2785.350962 ] delayed_cache_work_func+0x2d1/0x610 [mlx5_ib] [ 2785.352044 ] process_one_work+0x7c2/0x1310 [ 2785.352879 ] worker_thread+0x59d/0xec0 [ 2785.353636 ] kthread+0x28f/0x330 [ 2785.354370 ] ret_from_fork+0x1f/0x30 [ 2785.355135 ] [ 2785.355135 ] -> #0 ((work_completion)(&(&ent->dwork)->work)){+.+.}-{0:0}: [ 2785.356515 ] __lock_acquire+0x2d8a/0x5fe0 [ 2785.357349 ] lock_acquire+0x1c1/0x540 [ 2785.358121 ] __flush_work+0xe8/0x900 [ 2785.358852 ] __cancel_work_timer+0x2c7/0x3f0 [ 2785.359711 ] mlx5_mkey_cache_cleanup+0xfb/0x250 [mlx5_ib] [ 2785.360781 ] mlx5_ib_stage_pre_ib_reg_umr_cleanup+0x16/0x30 [mlx5_ib] [ 2785.361969 ] __mlx5_ib_remove+0x68/0x120 [mlx5_ib] [ 2785.362960 ] mlx5r_remove+0x63/0x80 [mlx5_ib] [ 2785.363870 ] auxiliary_bus_remove+0x52/0x70 [ 2785.364715 ] device_release_driver_internal+0x3c1/0x600 [ 2785.365695 ] bus_remove_device+0x2a5/0x560 [ 2785.366525 ] device_del+0x492/0xb80 [ 2785.367276 ] mlx5_detach_device+0x1a9/0x360 [mlx5_core] [ 2785.368615 ] mlx5_unload_one_devl_locked+0x5a/0x110 [mlx5_core] [ 2785.369934 ] mlx5_devlink_reload_down+0x292/0x580 [mlx5_core] [ 2785.371292 ] devlink_reload+0x439/0x590 [ 2785.372075 ] devlink_nl_cmd_reload+0xaef/0xff0 [ 2785.372973 ] genl_family_rcv_msg_doit.isra.0+0x1bd/0x290 [ 2785.374011 ] genl_rcv_msg+0x3ca/0x6c0 [ 2785.374798 ] netlink_rcv_skb+0x12c/0x360 [ 2785.375612 ] genl_rcv+0x24/0x40 [ 2785.376295 ] netlink_unicast+0x438/0x710 [ 2785.377121 ] netlink_sendmsg+0x7a1/0xca0 [ 2785.377926 ] sock_sendmsg+0xc5/0x190 [ 2785.378668 ] __sys_sendto+0x1bc/0x290 [ 2785.379440 ] __x64_sys_sendto+0xdc/0x1b0 [ 2785.380255 ] do_syscall_64+0x3d/0x90 [ 2785.381031 ] entry_SYSCALL_64_after_hwframe+0x46/0xb0 [ 2785.381967 ] [ 2785.381967 ] other info that might help us debug this: [ 2785.381967 ] [ 2785.383448 ] Possible unsafe locking scenario: [ 2785.383448 ] [ 2785.384544 ] CPU0 CPU1 [ 2785.385383 ] ---- ---- [ 2785.386193 ] lock(&dev->cache.rb_lock); [ 2785.386940 ] lock((work_completion)(&(&ent->dwork)->work)); [ 2785.388327 ] lock(&dev->cache.rb_lock); [ 2785.389425 ] lock((work_completion)(&(&ent->dwork)->work)); [ 2785.390414 ] [ 2785.390414 ] *** DEADLOCK *** [ 2785.390414 ] [ 2785.391579 ] 6 locks held by devlink/53872: [ 2785.392341 ] #0: ffffffff84c17a50 (cb_lock){++++}-{3:3}, at: genl_rcv+0x15/0x40 [ 2785.393630 ] #1: ffff888142280218 (&devlink->lock_key){+.+.}-{3:3}, at: devlink_get_from_attrs_lock+0x12d/0x2d0 [ 2785.395324 ] #2: ffff8881422d3c38 (&dev->lock_key){+.+.}-{3:3}, at: mlx5_unload_one_devl_locked+0x4a/0x110 [mlx5_core] [ 2785.397322 ] #3: ffffffffa0e59068 (mlx5_intf_mutex){+.+.}-{3:3}, at: mlx5_detach_device+0x60/0x360 [mlx5_core] [ 2785.399231 ] #4: ffff88810e3cb0e8 (&dev->mutex){....}-{3:3}, at: device_release_driver_internal+0x8d/0x600 [ 2785.400864 ] #5: ffff88817e8f1260 (&dev->cache.rb_lock){+.+.}-{3:3}, at: mlx5_mkey_cache_cleanup+0x77/0x250 [mlx5_ib]
Fixes: b95845178328 ("RDMA/mlx5: Change the cache structure to an RB-tree") Signed-off-by: Shay Drory shayd@nvidia.com Signed-off-by: Michael Guralnik michaelgur@nvidia.com Signed-off-by: Leon Romanovsky leonro@nvidia.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/infiniband/hw/mlx5/mlx5_ib.h | 1 + drivers/infiniband/hw/mlx5/mr.c | 16 ++++++++++++++-- 2 files changed, 15 insertions(+), 2 deletions(-)
--- a/drivers/infiniband/hw/mlx5/mlx5_ib.h +++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h @@ -797,6 +797,7 @@ struct mlx5_mkey_cache { struct dentry *fs_root; unsigned long last_add; struct delayed_work remove_ent_dwork; + u8 disable: 1; };
struct mlx5_ib_port_resources { --- a/drivers/infiniband/hw/mlx5/mr.c +++ b/drivers/infiniband/hw/mlx5/mr.c @@ -1024,19 +1024,27 @@ void mlx5_mkey_cache_cleanup(struct mlx5 if (!dev->cache.wq) return;
- cancel_delayed_work_sync(&dev->cache.remove_ent_dwork); mutex_lock(&dev->cache.rb_lock); + dev->cache.disable = true; for (node = rb_first(root); node; node = rb_next(node)) { ent = rb_entry(node, struct mlx5_cache_ent, node); xa_lock_irq(&ent->mkeys); ent->disabled = true; xa_unlock_irq(&ent->mkeys); - cancel_delayed_work_sync(&ent->dwork); } + mutex_unlock(&dev->cache.rb_lock); + + /* + * After all entries are disabled and will not reschedule on WQ, + * flush it and all async commands. + */ + flush_workqueue(dev->cache.wq);
mlx5_mkey_cache_debugfs_cleanup(dev); mlx5_cmd_cleanup_async_ctx(&dev->async_ctx);
+ /* At this point all entries are disabled and have no concurrent work. */ + mutex_lock(&dev->cache.rb_lock); node = rb_first(root); while (node) { ent = rb_entry(node, struct mlx5_cache_ent, node); @@ -1815,6 +1823,10 @@ static int cache_ent_find_and_store(stru }
mutex_lock(&cache->rb_lock); + if (cache->disable) { + mutex_unlock(&cache->rb_lock); + return 0; + } ent = mkey_cache_ent_from_rb_key(dev, mr->mmkey.rb_key); if (ent) { if (ent->rb_key.ndescs == mr->mmkey.rb_key.ndescs) {
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Michael Guralnik michaelgur@nvidia.com
commit 4f14c6c0213e1def48f0f887d35f44095416c67d upstream.
After the change to use dynamic cache structure, new cache entries can be added and the mkey allocation can no longer assume that all mkeys created for the cache have access_flags equal to zero.
Example of a flow that exposes the issue: A user registers MR with RO on a HCA that cannot UMR RO and the mkey is created outside of the cache. When the user deregisters the MR, a new cache entry is created to store mkeys with RO.
Later, the user registers 2 MRs with RO. The first MR is reused from the new cache entry. When we try to get the second mkey from the cache we see the entry is empty so we go to the MR cache mkey allocation flow which would have allocated a mkey with no access flags, resulting the user getting a MR without RO.
Fixes: dd1b913fb0d0 ("RDMA/mlx5: Cache all user cacheable mkeys on dereg MR flow") Reviewed-by: Edward Srouji edwards@nvidia.com Signed-off-by: Michael Guralnik michaelgur@nvidia.com Link: https://lore.kernel.org/r/8a802700b82def3ace3f77cd7a9ad9d734af87e7.169520395... Signed-off-by: Leon Romanovsky leon@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/infiniband/hw/mlx5/mr.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/drivers/infiniband/hw/mlx5/mr.c +++ b/drivers/infiniband/hw/mlx5/mr.c @@ -301,7 +301,8 @@ static int get_mkc_octo_size(unsigned in
static void set_cache_mkc(struct mlx5_cache_ent *ent, void *mkc) { - set_mkc_access_pd_addr_fields(mkc, 0, 0, ent->dev->umrc.pd); + set_mkc_access_pd_addr_fields(mkc, ent->rb_key.access_flags, 0, + ent->dev->umrc.pd); MLX5_SET(mkc, mkc, free, 1); MLX5_SET(mkc, mkc, umr_en, 1); MLX5_SET(mkc, mkc, access_mode_1_0, ent->rb_key.access_mode & 0x3);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Hamdan Igbaria hamdani@nvidia.com
commit 2fad8f06a582cd431d398a0b3f9be21d069603ab upstream.
The mutex was not unlocked on some of the error flows. Moved the unlock location to include all the error flow scenarios.
Fixes: e1f4a52ac171 ("RDMA/mlx5: Create an indirect flow table for steering anchor") Reviewed-by: Mark Bloch mbloch@nvidia.com Signed-off-by: Hamdan Igbaria hamdani@nvidia.com Link: https://lore.kernel.org/r/1244a69d783da997c0af0b827c622eb00495492e.169520395... Signed-off-by: Leon Romanovsky leon@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/infiniband/hw/mlx5/fs.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/infiniband/hw/mlx5/fs.c +++ b/drivers/infiniband/hw/mlx5/fs.c @@ -2470,8 +2470,8 @@ destroy_res: mlx5_steering_anchor_destroy_res(ft_prio); put_flow_table: put_flow_table(dev, ft_prio, true); - mutex_unlock(&dev->flow_db->lock); free_obj: + mutex_unlock(&dev->flow_db->lock); kfree(obj);
return err;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Shay Drory shayd@nvidia.com
commit dab994bcc609a172bfdab15a0d4cb7e50e8b5458 upstream.
checkpath is complaining about NULL string, change it to 'Unknown'.
Fixes: 37aa5c36aa70 ("IB/mlx5: Add UARs write-combining and non-cached mapping") Signed-off-by: Shay Drory shayd@nvidia.com Link: https://lore.kernel.org/r/8638e5c14fadbde5fa9961874feae917073af920.169520395... Signed-off-by: Leon Romanovsky leon@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/infiniband/hw/mlx5/main.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/infiniband/hw/mlx5/main.c +++ b/drivers/infiniband/hw/mlx5/main.c @@ -2070,7 +2070,7 @@ static inline char *mmap_cmd2str(enum ml case MLX5_IB_MMAP_DEVICE_MEM: return "Device Memory"; default: - return NULL; + return "Unknown"; } }
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Colin Ian King colin.i.king@gmail.com
commit f286620b5dc974fe281d8feed6e228fd2f39d013 upstream.
There is a spelling mistake in a quirk entry. Fix it.
Signed-off-by: Colin Ian King colin.i.king@gmail.com Fixes: 3babae915f4c ("ALSA: hda/tas2781: Add tas2781 HDA driver") Link: https://lore.kernel.org/r/20230821080003.16678-1-colin.i.king@gmail.com Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/pci/hda/patch_realtek.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -10016,7 +10016,7 @@ static const struct snd_pci_quirk alc269 SND_PCI_QUIRK(0x17aa, 0x3869, "Lenovo Yoga7 14IAL7", ALC287_FIXUP_YOGA9_14IAP7_BASS_SPK_PIN), SND_PCI_QUIRK(0x17aa, 0x387d, "Yoga S780-16 pro Quad AAC", ALC287_FIXUP_TAS2781_I2C), SND_PCI_QUIRK(0x17aa, 0x387e, "Yoga S780-16 pro Quad YC", ALC287_FIXUP_TAS2781_I2C), - SND_PCI_QUIRK(0x17aa, 0x3881, "YB9 dual powe mode2 YC", ALC287_FIXUP_TAS2781_I2C), + SND_PCI_QUIRK(0x17aa, 0x3881, "YB9 dual power mode2 YC", ALC287_FIXUP_TAS2781_I2C), SND_PCI_QUIRK(0x17aa, 0x3884, "Y780 YG DUAL", ALC287_FIXUP_TAS2781_I2C), SND_PCI_QUIRK(0x17aa, 0x3886, "Y780 VECO DUAL", ALC287_FIXUP_TAS2781_I2C), SND_PCI_QUIRK(0x17aa, 0x38a7, "Y780P AMD YG dual", ALC287_FIXUP_TAS2781_I2C),
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kailang Yang kailang@realtek.com
commit fb6254df09bba303db2a1002085f6c0b90a456ed upstream.
If system has two speakers and one connect to 0x14 pin, use this function will disable it.
Fixes: e43252db7e20 ("ALSA: hda/realtek - ALC287 I2S speaker platform support") Signed-off-by: Kailang Yang kailang@realtek.com Link: https://lore.kernel.org/r/e3f2aac3fe6a47079d728a6443358cc2@realtek.com Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/pci/hda/patch_realtek.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
--- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -7065,8 +7065,10 @@ static void alc287_fixup_bind_dacs(struc snd_hda_override_conn_list(codec, 0x17, ARRAY_SIZE(conn), conn); spec->gen.preferred_dacs = preferred_pairs; spec->gen.auto_mute_via_amp = 1; - snd_hda_codec_write_cache(codec, 0x14, 0, AC_VERB_SET_PIN_WIDGET_CONTROL, - 0x0); /* Make sure 0x14 was disable */ + if (spec->gen.autocfg.speaker_pins[0] != 0x14) { + snd_hda_codec_write_cache(codec, 0x14, 0, AC_VERB_SET_PIN_WIDGET_CONTROL, + 0x0); /* Make sure 0x14 was disable */ + } }
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tom Lendacky thomas.lendacky@amd.com
commit 62d5e970d022ef4bde18948dd67247c3194384c1 upstream.
In snp_accept_memory(), the npages variables value is calculated from phys_addr_t variables but is an unsigned int. A very large range passed into snp_accept_memory() could lead to truncating npages to zero. This doesn't happen at the moment but let's be prepared.
Fixes: 6c3211796326 ("x86/sev: Add SNP-specific unaccepted memory support") Signed-off-by: Tom Lendacky thomas.lendacky@amd.com Signed-off-by: Borislav Petkov (AMD) bp@alien8.de Cc: stable@kernel.org Link: https://lore.kernel.org/r/6d511c25576494f682063c9fb6c705b526a3757e.168744150... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kernel/sev.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c index 2787826d9f60..d8c1e3be74c0 100644 --- a/arch/x86/kernel/sev.c +++ b/arch/x86/kernel/sev.c @@ -868,8 +868,7 @@ void snp_set_memory_private(unsigned long vaddr, unsigned long npages)
void snp_accept_memory(phys_addr_t start, phys_addr_t end) { - unsigned long vaddr; - unsigned int npages; + unsigned long vaddr, npages;
if (!cc_platform_has(CC_ATTR_GUEST_SEV_SNP)) return;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tom Lendacky thomas.lendacky@amd.com
commit 6bc6f7d9d7ac3cdbe9e8b0495538b4a0cc11f032 upstream.
SNP retrieves the majority of CPUID information from the SNP CPUID page. But there are times when that information needs to be supplemented by the hypervisor, for example, obtaining the initial APIC ID of the vCPU from leaf 1.
The current implementation uses the MSR protocol to retrieve the data from the hypervisor, even when a GHCB exists. The problem arises when an NMI arrives on return from the VMGEXIT. The NMI will be immediately serviced and may generate a #VC requiring communication with the hypervisor.
Since a GHCB exists in this case, it will be used. As part of using the GHCB, the #VC handler will write the GHCB physical address into the GHCB MSR and the #VC will be handled.
When the NMI completes, processing resumes at the site of the VMGEXIT which is expecting to read the GHCB MSR and find a CPUID MSR protocol response. Since the NMI handling overwrote the GHCB MSR response, the guest will see an invalid reply from the hypervisor and self-terminate.
Fix this problem by using the GHCB when it is available. Any NMI received is properly handled because the GHCB contents are copied into a backup page and restored on NMI exit, thus preserving the active GHCB request or result.
[ bp: Touchups. ]
Fixes: ee0bfa08a345 ("x86/compressed/64: Add support for SEV-SNP CPUID table in #VC handlers") Signed-off-by: Tom Lendacky thomas.lendacky@amd.com Signed-off-by: Borislav Petkov (AMD) bp@alien8.de Cc: stable@kernel.org Link: https://lore.kernel.org/r/a5856fa1ebe3879de91a8f6298b6bbd901c61881.169057856... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kernel/sev-shared.c | 69 ++++++++++++++++++++++++++++++++++--------- 1 file changed, 55 insertions(+), 14 deletions(-)
--- a/arch/x86/kernel/sev-shared.c +++ b/arch/x86/kernel/sev-shared.c @@ -256,7 +256,7 @@ static int __sev_cpuid_hv(u32 fn, int re return 0; }
-static int sev_cpuid_hv(struct cpuid_leaf *leaf) +static int __sev_cpuid_hv_msr(struct cpuid_leaf *leaf) { int ret;
@@ -279,6 +279,45 @@ static int sev_cpuid_hv(struct cpuid_lea return ret; }
+static int __sev_cpuid_hv_ghcb(struct ghcb *ghcb, struct es_em_ctxt *ctxt, struct cpuid_leaf *leaf) +{ + u32 cr4 = native_read_cr4(); + int ret; + + ghcb_set_rax(ghcb, leaf->fn); + ghcb_set_rcx(ghcb, leaf->subfn); + + if (cr4 & X86_CR4_OSXSAVE) + /* Safe to read xcr0 */ + ghcb_set_xcr0(ghcb, xgetbv(XCR_XFEATURE_ENABLED_MASK)); + else + /* xgetbv will cause #UD - use reset value for xcr0 */ + ghcb_set_xcr0(ghcb, 1); + + ret = sev_es_ghcb_hv_call(ghcb, ctxt, SVM_EXIT_CPUID, 0, 0); + if (ret != ES_OK) + return ret; + + if (!(ghcb_rax_is_valid(ghcb) && + ghcb_rbx_is_valid(ghcb) && + ghcb_rcx_is_valid(ghcb) && + ghcb_rdx_is_valid(ghcb))) + return ES_VMM_ERROR; + + leaf->eax = ghcb->save.rax; + leaf->ebx = ghcb->save.rbx; + leaf->ecx = ghcb->save.rcx; + leaf->edx = ghcb->save.rdx; + + return ES_OK; +} + +static int sev_cpuid_hv(struct ghcb *ghcb, struct es_em_ctxt *ctxt, struct cpuid_leaf *leaf) +{ + return ghcb ? __sev_cpuid_hv_ghcb(ghcb, ctxt, leaf) + : __sev_cpuid_hv_msr(leaf); +} + /* * This may be called early while still running on the initial identity * mapping. Use RIP-relative addressing to obtain the correct address @@ -388,19 +427,20 @@ snp_cpuid_get_validated_func(struct cpui return false; }
-static void snp_cpuid_hv(struct cpuid_leaf *leaf) +static void snp_cpuid_hv(struct ghcb *ghcb, struct es_em_ctxt *ctxt, struct cpuid_leaf *leaf) { - if (sev_cpuid_hv(leaf)) + if (sev_cpuid_hv(ghcb, ctxt, leaf)) sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_CPUID_HV); }
-static int snp_cpuid_postprocess(struct cpuid_leaf *leaf) +static int snp_cpuid_postprocess(struct ghcb *ghcb, struct es_em_ctxt *ctxt, + struct cpuid_leaf *leaf) { struct cpuid_leaf leaf_hv = *leaf;
switch (leaf->fn) { case 0x1: - snp_cpuid_hv(&leaf_hv); + snp_cpuid_hv(ghcb, ctxt, &leaf_hv);
/* initial APIC ID */ leaf->ebx = (leaf_hv.ebx & GENMASK(31, 24)) | (leaf->ebx & GENMASK(23, 0)); @@ -419,7 +459,7 @@ static int snp_cpuid_postprocess(struct break; case 0xB: leaf_hv.subfn = 0; - snp_cpuid_hv(&leaf_hv); + snp_cpuid_hv(ghcb, ctxt, &leaf_hv);
/* extended APIC ID */ leaf->edx = leaf_hv.edx; @@ -467,7 +507,7 @@ static int snp_cpuid_postprocess(struct } break; case 0x8000001E: - snp_cpuid_hv(&leaf_hv); + snp_cpuid_hv(ghcb, ctxt, &leaf_hv);
/* extended APIC ID */ leaf->eax = leaf_hv.eax; @@ -488,7 +528,7 @@ static int snp_cpuid_postprocess(struct * Returns -EOPNOTSUPP if feature not enabled. Any other non-zero return value * should be treated as fatal by caller. */ -static int snp_cpuid(struct cpuid_leaf *leaf) +static int snp_cpuid(struct ghcb *ghcb, struct es_em_ctxt *ctxt, struct cpuid_leaf *leaf) { const struct snp_cpuid_table *cpuid_table = snp_cpuid_get_table();
@@ -522,7 +562,7 @@ static int snp_cpuid(struct cpuid_leaf * return 0; }
- return snp_cpuid_postprocess(leaf); + return snp_cpuid_postprocess(ghcb, ctxt, leaf); }
/* @@ -544,14 +584,14 @@ void __init do_vc_no_ghcb(struct pt_regs leaf.fn = fn; leaf.subfn = subfn;
- ret = snp_cpuid(&leaf); + ret = snp_cpuid(NULL, NULL, &leaf); if (!ret) goto cpuid_done;
if (ret != -EOPNOTSUPP) goto fail;
- if (sev_cpuid_hv(&leaf)) + if (__sev_cpuid_hv_msr(&leaf)) goto fail;
cpuid_done: @@ -848,14 +888,15 @@ static enum es_result vc_handle_ioio(str return ret; }
-static int vc_handle_cpuid_snp(struct pt_regs *regs) +static int vc_handle_cpuid_snp(struct ghcb *ghcb, struct es_em_ctxt *ctxt) { + struct pt_regs *regs = ctxt->regs; struct cpuid_leaf leaf; int ret;
leaf.fn = regs->ax; leaf.subfn = regs->cx; - ret = snp_cpuid(&leaf); + ret = snp_cpuid(ghcb, ctxt, &leaf); if (!ret) { regs->ax = leaf.eax; regs->bx = leaf.ebx; @@ -874,7 +915,7 @@ static enum es_result vc_handle_cpuid(st enum es_result ret; int snp_cpuid_ret;
- snp_cpuid_ret = vc_handle_cpuid_snp(regs); + snp_cpuid_ret = vc_handle_cpuid_snp(ghcb, ctxt); if (!snp_cpuid_ret) return ES_OK; if (snp_cpuid_ret != -EOPNOTSUPP)
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Namjae Jeon linkinjeon@kernel.org
commit 53ff5cf89142b978b1a5ca8dc4d4425e6a09745f upstream.
Thread A + Thread B ksmbd_session_lookup | smb2_sess_setup sess = xa_load | | | xa_erase(&conn->sessions, sess->id); | | ksmbd_session_destroy(sess) --> kfree(sess) | // UAF! | sess->last_active = jiffies | +
This patch add rwsem to fix race condition between ksmbd_session_lookup and ksmbd_expire_session.
Reported-by: luosili rootlab@huawei.com Signed-off-by: Namjae Jeon linkinjeon@kernel.org Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/smb/server/connection.c | 2 ++ fs/smb/server/connection.h | 1 + fs/smb/server/mgmt/user_session.c | 10 +++++++--- 3 files changed, 10 insertions(+), 3 deletions(-)
--- a/fs/smb/server/connection.c +++ b/fs/smb/server/connection.c @@ -84,6 +84,8 @@ struct ksmbd_conn *ksmbd_conn_alloc(void spin_lock_init(&conn->llist_lock); INIT_LIST_HEAD(&conn->lock_list);
+ init_rwsem(&conn->session_lock); + down_write(&conn_list_lock); list_add(&conn->conns_list, &conn_list); up_write(&conn_list_lock); --- a/fs/smb/server/connection.h +++ b/fs/smb/server/connection.h @@ -50,6 +50,7 @@ struct ksmbd_conn { struct nls_table *local_nls; struct unicode_map *um; struct list_head conns_list; + struct rw_semaphore session_lock; /* smb session 1 per user */ struct xarray sessions; unsigned long last_active; --- a/fs/smb/server/mgmt/user_session.c +++ b/fs/smb/server/mgmt/user_session.c @@ -174,7 +174,7 @@ static void ksmbd_expire_session(struct unsigned long id; struct ksmbd_session *sess;
- down_write(&sessions_table_lock); + down_write(&conn->session_lock); xa_for_each(&conn->sessions, id, sess) { if (sess->state != SMB2_SESSION_VALID || time_after(jiffies, @@ -185,7 +185,7 @@ static void ksmbd_expire_session(struct continue; } } - up_write(&sessions_table_lock); + up_write(&conn->session_lock); }
int ksmbd_session_register(struct ksmbd_conn *conn, @@ -227,7 +227,9 @@ void ksmbd_sessions_deregister(struct ks } } } + up_write(&sessions_table_lock);
+ down_write(&conn->session_lock); xa_for_each(&conn->sessions, id, sess) { unsigned long chann_id; struct channel *chann; @@ -244,7 +246,7 @@ void ksmbd_sessions_deregister(struct ks ksmbd_session_destroy(sess); } } - up_write(&sessions_table_lock); + up_write(&conn->session_lock); }
struct ksmbd_session *ksmbd_session_lookup(struct ksmbd_conn *conn, @@ -252,9 +254,11 @@ struct ksmbd_session *ksmbd_session_look { struct ksmbd_session *sess;
+ down_read(&conn->session_lock); sess = xa_load(&conn->sessions, id); if (sess) sess->last_active = jiffies; + up_read(&conn->session_lock); return sess; }
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: luosili rootlab@huawei.com
commit c69813471a1ec081a0b9bf0c6bd7e8afd818afce upstream.
drop reference after use opinfo.
Signed-off-by: luosili rootlab@huawei.com Signed-off-by: Namjae Jeon linkinjeon@kernel.org Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/smb/server/smb2pdu.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/fs/smb/server/smb2pdu.c +++ b/fs/smb/server/smb2pdu.c @@ -8036,10 +8036,10 @@ static void smb20_oplock_break_ack(struc goto err_out; }
- opinfo_put(opinfo); - ksmbd_fd_put(work, fp); opinfo->op_state = OPLOCK_STATE_NONE; wake_up_interruptible_all(&opinfo->oplock_q); + opinfo_put(opinfo); + ksmbd_fd_put(work, fp);
rsp->StructureSize = cpu_to_le16(24); rsp->OplockLevel = rsp_oplevel;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Namjae Jeon linkinjeon@kernel.org
commit 75ac9a3dd65f7eab4d12b0a0f744234b5300a491 upstream.
There is a race condition issue between parallel smb2 lock request.
Time + Thread A | Thread A smb2_lock | smb2_lock | insert smb_lock to lock_list | spin_unlock(&work->conn->llist_lock) | | | spin_lock(&conn->llist_lock); | kfree(cmp_lock); | // UAF! | list_add(&smb_lock->llist, &rollback_list) +
This patch swaps the line for adding the smb lock to the rollback list and adding the lock list of connection to fix the race issue.
Reported-by: luosili rootlab@huawei.com Signed-off-by: Namjae Jeon linkinjeon@kernel.org Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/smb/server/smb2pdu.c | 12 +----------- 1 file changed, 1 insertion(+), 11 deletions(-)
--- a/fs/smb/server/smb2pdu.c +++ b/fs/smb/server/smb2pdu.c @@ -7029,10 +7029,6 @@ skip:
ksmbd_debug(SMB, "would have to wait for getting lock\n"); - spin_lock(&work->conn->llist_lock); - list_add_tail(&smb_lock->clist, - &work->conn->lock_list); - spin_unlock(&work->conn->llist_lock); list_add(&smb_lock->llist, &rollback_list);
argv = kmalloc(sizeof(void *), GFP_KERNEL); @@ -7063,9 +7059,6 @@ skip:
if (work->state != KSMBD_WORK_ACTIVE) { list_del(&smb_lock->llist); - spin_lock(&work->conn->llist_lock); - list_del(&smb_lock->clist); - spin_unlock(&work->conn->llist_lock); locks_free_lock(flock);
if (work->state == KSMBD_WORK_CANCELLED) { @@ -7087,19 +7080,16 @@ skip: }
list_del(&smb_lock->llist); - spin_lock(&work->conn->llist_lock); - list_del(&smb_lock->clist); - spin_unlock(&work->conn->llist_lock); release_async_work(work); goto retry; } else if (!rc) { + list_add(&smb_lock->llist, &rollback_list); spin_lock(&work->conn->llist_lock); list_add_tail(&smb_lock->clist, &work->conn->lock_list); list_add_tail(&smb_lock->flist, &fp->lock_list); spin_unlock(&work->conn->llist_lock); - list_add(&smb_lock->llist, &rollback_list); ksmbd_debug(SMB, "successful in taking lock\n"); } else { goto out;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Leon Romanovsky leonro@nvidia.com
commit c99a7457e5bb873914a74307ba2df85f6799203b upstream.
During execution of mlx5_mkey_cache_cleanup(), there is a guarantee that MR are not registered and/or destroyed. It means that we don't need newly introduced cache disable flag.
Fixes: 374012b00457 ("RDMA/mlx5: Fix mkey cache possible deadlock on cleanup") Link: https://lore.kernel.org/r/c7e9c9f98c8ae4a7413d97d9349b29f5b0a23dbe.169592162... Signed-off-by: Leon Romanovsky leonro@nvidia.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/infiniband/hw/mlx5/mlx5_ib.h | 1 - drivers/infiniband/hw/mlx5/mr.c | 5 ----- 2 files changed, 6 deletions(-)
--- a/drivers/infiniband/hw/mlx5/mlx5_ib.h +++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h @@ -797,7 +797,6 @@ struct mlx5_mkey_cache { struct dentry *fs_root; unsigned long last_add; struct delayed_work remove_ent_dwork; - u8 disable: 1; };
struct mlx5_ib_port_resources { --- a/drivers/infiniband/hw/mlx5/mr.c +++ b/drivers/infiniband/hw/mlx5/mr.c @@ -1026,7 +1026,6 @@ void mlx5_mkey_cache_cleanup(struct mlx5 return;
mutex_lock(&dev->cache.rb_lock); - dev->cache.disable = true; for (node = rb_first(root); node; node = rb_next(node)) { ent = rb_entry(node, struct mlx5_cache_ent, node); xa_lock_irq(&ent->mkeys); @@ -1824,10 +1823,6 @@ static int cache_ent_find_and_store(stru }
mutex_lock(&cache->rb_lock); - if (cache->disable) { - mutex_unlock(&cache->rb_lock); - return 0; - } ent = mkey_cache_ent_from_rb_key(dev, mr->mmkey.rb_key); if (ent) { if (ent->rb_key.ndescs == mr->mmkey.rb_key.ndescs) {
Hello,
On Mon, 9 Oct 2023 14:59:24 +0200 Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 6.5.7 release. There are 163 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 11 Oct 2023 13:00:55 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.5.7-rc1.g... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.5.y and the diffstat can be found below.
This rc kernel passes DAMON functionality test[1] on my test machine. Attaching the test results summary below. Please note that I retrieved the kernel from linux-stable-rc tree[2].
Tested-by: SeongJae Park sj@kernel.org
[1] https://github.com/awslabs/damon-tests/tree/next/corr [2] e640325b3c9a ("Linux 6.5.7-rc1")
Thanks, SJ
[...]
---
ok 1 selftests: damon: debugfs_attrs.sh ok 2 selftests: damon: debugfs_schemes.sh ok 3 selftests: damon: debugfs_target_ids.sh ok 4 selftests: damon: debugfs_empty_targets.sh ok 5 selftests: damon: debugfs_huge_count_read_write.sh ok 6 selftests: damon: debugfs_duplicate_context_creation.sh ok 7 selftests: damon: debugfs_rm_non_contexts.sh ok 8 selftests: damon: sysfs.sh ok 9 selftests: damon: sysfs_update_removed_scheme_dir.sh ok 10 selftests: damon: reclaim.sh ok 11 selftests: damon: lru_sort.sh ok 1 selftests: damon-tests: kunit.sh ok 2 selftests: damon-tests: huge_count_read_write.sh ok 3 selftests: damon-tests: buffer_overflow.sh ok 4 selftests: damon-tests: rm_contexts.sh ok 5 selftests: damon-tests: record_null_deref.sh ok 6 selftests: damon-tests: dbgfs_target_ids_read_before_terminate_race.sh ok 7 selftests: damon-tests: dbgfs_target_ids_pid_leak.sh ok 8 selftests: damon-tests: damo_tests.sh ok 9 selftests: damon-tests: masim-record.sh ok 10 selftests: damon-tests: build_i386.sh ok 11 selftests: damon-tests: build_arm64.sh ok 12 selftests: damon-tests: build_i386_idle_flag.sh ok 13 selftests: damon-tests: build_i386_highpte.sh ok 14 selftests: damon-tests: build_nomemcg.sh [33m [92mPASS [39m _remote_run_corr.sh SUCCESS
On 10/9/23 06:59, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 6.5.7 release. There are 163 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 11 Oct 2023 13:00:55 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.5.7-rc1.g... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.5.y and the diffstat can be found below.
thanks,
greg k-h
Compiled and booted on my test system. No dmesg regressions.
Tested-by: Shuah Khan skhan@linuxfoundation.org
thanks, -- Shuah
On 10/9/23 05:59, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 6.5.7 release. There are 163 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 11 Oct 2023 13:00:55 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.5.7-rc1.g... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.5.y and the diffstat can be found below.
thanks,
greg k-h
On ARCH_BRCMSTB using 32-bit and 64-bit ARM kernels, build tested on BMIPS_GENERIC:
Tested-by: Florian Fainelli florian.fainelli@broadcom.com
On Mon, Oct 09, 2023 at 02:59:24PM +0200, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 6.5.7 release. There are 163 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Successfully compiled and installed bindeb-pkgs on my computer (Acer Aspire E15, Intel Core i3 Haswell). No noticeable regressions.
Tested-by: Bagas Sanjaya bagasdotme@gmail.com
On 10/9/23 5:59 AM, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 6.5.7 release. There are 163 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 11 Oct 2023 13:00:55 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.5.7-rc1.g... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.5.y and the diffstat can be found below.
thanks,
greg k-h
Built and booted successfully on RISC-V RV64 (HiFive Unmatched).
Tested-by: Ron Economos re@w6rz.net
On Mon, 09 Oct 2023 14:59:24 +0200, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 6.5.7 release. There are 163 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 11 Oct 2023 13:00:55 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.5.7-rc1.g... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.5.y and the diffstat can be found below.
thanks,
greg k-h
All tests passing for Tegra ...
Test results for stable-v6.5: 10 builds: 10 pass, 0 fail 26 boots: 26 pass, 0 fail 116 tests: 116 pass, 0 fail
Linux version: 6.5.7-rc1-g36615f3f3371 Boards tested: tegra124-jetson-tk1, tegra186-p2771-0000, tegra194-p2972-0000, tegra194-p3509-0000+p3668-0000, tegra20-ventana, tegra210-p2371-2180, tegra210-p3450-0000, tegra30-cardhu-a04
Tested-by: Jon Hunter jonathanh@nvidia.com
Jon
On Mon, 9 Oct 2023 at 18:37, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 6.5.7 release. There are 163 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 11 Oct 2023 13:00:55 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.5.7-rc1.g... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.5.y and the diffstat can be found below.
thanks,
greg k-h
Results from Linaro’s test farm. No regressions on arm64, arm, x86_64, and i386.
Tested-by: Linux Kernel Functional Testing lkft@linaro.org
NOTE: Build warning on armv7, net/mac80211/mlme.c: In function 'ieee80211_rx_mgmt_assoc_resp': net/mac80211/mlme.c:5474:1: warning: the frame size of 1064 bytes is larger than 1024 bytes [-Wframe-larger-than=]
## Build * kernel: 6.5.7-rc1 * git: https://gitlab.com/Linaro/lkft/mirrors/stable/linux-stable-rc * git branch: linux-6.5.y * git commit: e640325b3c9a18f1de24daa332563620eb03309e * git describe: v6.5.6-164-ge640325b3c9a * test details: https://qa-reports.linaro.org/lkft/linux-stable-rc-linux-6.5.y/build/v6.5.6-...
## Test Regressions (compared to v6.5.6)
## Metric Regressions (compared to v6.5.6)
## Test Fixes (compared to v6.5.6)
## Metric Fixes (compared to v6.5.6)
## Test result summary total: 132702, pass: 113630, fail: 2151, skip: 16744, xfail: 177
## Build Summary * arc: 5 total, 5 passed, 0 failed * arm: 141 total, 141 passed, 0 failed * arm64: 52 total, 51 passed, 1 failed * i386: 41 total, 41 passed, 0 failed * mips: 30 total, 28 passed, 2 failed * parisc: 4 total, 4 passed, 0 failed * powerpc: 37 total, 35 passed, 2 failed * riscv: 26 total, 25 passed, 1 failed * s390: 16 total, 13 passed, 3 failed * sh: 14 total, 12 passed, 2 failed * sparc: 8 total, 8 passed, 0 failed * x86_64: 46 total, 46 passed, 0 failed
## Test suites summary * boot * kselftest-android * kselftest-arm64 * kselftest-breakpoints * kselftest-capabilities * kselftest-cgroup * kselftest-clone3 * kselftest-core * kselftest-cpu-hotplug * kselftest-cpufreq * kselftest-drivers-dma-buf * kselftest-efivarfs * kselftest-exec * kselftest-filesystems * kselftest-filesystems-binderfs * kselftest-filesystems-epoll * kselftest-firmware * kselftest-fpu * kselftest-ftrace * kselftest-futex * kselftest-gpio * kselftest-intel_pstate * kselftest-ipc * kselftest-ir * kselftest-kcmp * kselftest-kexec * kselftest-kvm * kselftest-lib * kselftest-livepatch * kselftest-membarrier * kselftest-memfd * kselftest-memory-hotplug * kselftest-mincore * kselftest-mount * kselftest-mqueue * kselftest-net * kselftest-net-forwarding * kselftest-net-mptcp * kselftest-netfilter * kselftest-nsfs * kselftest-openat2 * kselftest-pid_namespace * kselftest-pidfd * kselftest-proc * kselftest-pstore * kselftest-ptrace * kselftest-rseq * kselftest-rtc * kselftest-seccomp * kselftest-sigaltstack * kselftest-size * kselftest-splice * kselftest-static_keys * kselftest-sync * kselftest-sysctl * kselftest-tc-testing * kselftest-timens * kselftest-timers * kselftest-tmpfs * kselftest-tpm2 * kselftest-user * kselftest-user_events * kselftest-vDSO * kselftest-vm * kselftest-watchdog * kselftest-x86 * kselftest-zram * kunit * kvm-unit-tests * libgpiod * libhugetlbfs * log-parser-boot * log-parser-test * ltp-cap_bounds * ltp-commands * ltp-containers * ltp-controllers * ltp-cpuhotplug * ltp-crypto * ltp-cve * ltp-dio * ltp-fcntl-locktests * ltp-filecaps * ltp-fs * ltp-fs_bind * ltp-fs_perms_simple * ltp-fsx * ltp-hugetlb * ltp-io * ltp-ipc * ltp-math * ltp-mm * ltp-nptl * ltp-pty * ltp-sched * ltp-securebits * ltp-smoke * ltp-syscalls * ltp-tracing * perf * rcutorture * v4l2-compliance
-- Linaro LKFT https://lkft.linaro.org
This is the start of the stable review cycle for the 6.5.7 release. There are 163 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 11 Oct 2023 13:00:55 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.5.7-rc1.g... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.5.y and the diffstat can be found below.
thanks,
greg k-h
Compiled and booted on my x86_64 and ARM64 test systems. No errors or regressions.
Tested-by: Allen Pais apais@linux.microsoft.com
Thanks.
On Mon, Oct 09, 2023 at 02:59:24PM +0200, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 6.5.7 release. There are 163 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 11 Oct 2023 13:00:55 +0000. Anything received after that time might be too late.
Build results: total: 157 pass: 157 fail: 0 Qemu test results: total: 530 pass: 530 fail: 0
Tested-by: Guenter Roeck linux@roeck-us.net
Guenter
linux-stable-mirror@lists.linaro.org