This is the start of the stable review cycle for the 6.4.7 release. There are 227 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Thu, 27 Jul 2023 10:44:26 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.4.7-rc1.g... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.4.y and the diffstat can be found below.
thanks,
greg k-h
------------- Pseudo-Shortlog of commits:
Greg Kroah-Hartman gregkh@linuxfoundation.org Linux 6.4.7-rc1
Wayne Lin wayne.lin@amd.com drm/amd/display: Add polling method to handle MST reply packet
Srinivasan Shanmugam srinivasan.shanmugam@amd.com drm/amd/display: Clean up errors & warnings in amdgpu_dm.c
Yu Kuai yukuai3@huawei.com scsi: sg: Fix checking return value of blk_get_queue()
Yu Kuai yukuai3@huawei.com scsi/sg: don't grab scsi host module reference
Abe Kohandel abe.kohandel@intel.com spi: dw: Remove misleading comment for Mount Evans SoC
Yunxiang Li Yunxiang.Li@amd.com drm/ttm: fix bulk_move corruption when adding a entry
Mohamed Khalfella mkhalfella@purestorage.com tracing/histograms: Return an error if we fail to add histogram to hist_vars list
Miguel Ojeda ojeda@kernel.org kbuild: rust: avoid creating temporary files
Zhang Yi yi.zhang@huawei.com jbd2: recheck chechpointing non-dirty buffer
Vladimir Oltean vladimir.oltean@nxp.com net: phy: prevent stale pointer dereference in phy_init()
Eric Dumazet edumazet@google.com tcp: annotate data-races around fastopenq.max_qlen
Eric Dumazet edumazet@google.com tcp: annotate data-races around icsk->icsk_user_timeout
Eric Dumazet edumazet@google.com tcp: annotate data-races around tp->notsent_lowat
Eric Dumazet edumazet@google.com tcp: annotate data-races around rskq_defer_accept
Eric Dumazet edumazet@google.com tcp: annotate data-races around tp->linger2
Eric Dumazet edumazet@google.com tcp: annotate data-races around icsk->icsk_syn_retries
Eric Dumazet edumazet@google.com tcp: annotate data-races around tp->keepalive_probes
Eric Dumazet edumazet@google.com tcp: annotate data-races around tp->keepalive_intvl
Eric Dumazet edumazet@google.com tcp: annotate data-races around tp->keepalive_time
Eric Dumazet edumazet@google.com tcp: annotate data-races around tp->tsoffset
Eric Dumazet edumazet@google.com tcp: annotate data-races around tp->tcp_tx_delay
Tomasz Moń tomasz.mon@nordicsemi.no Bluetooth: btusb: Fix bluetooth on Intel Macbook 2014
Pauli Virtanen pav@iki.fi Bluetooth: SCO: fix sco_conn related locking and validity issues
Siddh Raman Pant code@siddh.me Bluetooth: hci_conn: return ERR_PTR instead of NULL when there is no link
Douglas Anderson dianders@chromium.org Bluetooth: hci_sync: Avoid use-after-free in dbg for hci_remove_adv_monitor()
Pauli Virtanen pav@iki.fi Bluetooth: ISO: fix iso_conn related locking and validity issues
Pauli Virtanen pav@iki.fi Bluetooth: hci_event: call disconnect callback before deleting conn
Pauli Virtanen pav@iki.fi Bluetooth: use RCU for hci_conn_params and iterate safely in hci_sync
Pablo Neira Ayuso pablo@netfilter.org netfilter: nf_tables: skip bound chain on rule flush
Pablo Neira Ayuso pablo@netfilter.org netfilter: nf_tables: skip bound chain in netns release path
Florian Westphal fw@strlen.de netfilter: nft_set_pipapo: fix improper element removal
Florian Westphal fw@strlen.de netfilter: nf_tables: can't schedule in nft_chain_validate
Florian Westphal fw@strlen.de netfilter: nf_tables: fix spurious set element insertion failure
Vitaly Rodionov vitalyr@opensource.cirrus.com ALSA: hda/realtek: Fix generic fixup definition for cs35l41 amp
Kuniyuki Iwashima kuniyu@amazon.com llc: Don't drop packet from non-root netns.
Zhang Shurong zhang_shurong@foxmail.com fbdev: au1200fb: Fix missing IRQ check in au1200fb_drv_probe
Daniel Golle daniel@makrotopia.org net: ethernet: mtk_eth_soc: always mtk_get_ib1_pkt_type
Kuniyuki Iwashima kuniyu@amazon.com Revert "tcp: avoid the lookup process failing to get sk in ehash table"
Yuanjun Gong ruc_gongyuanjun@163.com net:ipv6: check return value of pskb_trim()
Wang Ming machel@vivo.com net: ipv4: Use kfree_sensitive instead of kfree
Eric Dumazet edumazet@google.com tcp: annotate data-races around tcp_rsk(req)->ts_recent
Eric Dumazet edumazet@google.com tcp: annotate data-races around tcp_rsk(req)->txhash
Antoine Tenart atenart@kernel.org net: ipv4: use consistent txhash in TIME_WAIT and SYN_RECV
Florian Kauer florian.kauer@linutronix.de igc: Prevent garbled TX queue with XDP ZEROCOPY
Kurt Kanzenbach kurt@linutronix.de igc: Avoid transmit queue timeout for XDP
Alexander Duyck alexanderduyck@fb.com bpf, arm64: Fix BTI type used for freplace attached functions
Kumar Kartikeya Dwivedi memxor@gmail.com bpf: Repeat check_max_stack_depth for async callbacks
Kumar Kartikeya Dwivedi memxor@gmail.com bpf: Fix subprog idx logic in check_max_stack_depth
Geetha sowjanya gakula@marvell.com octeontx2-pf: Dont allocate BPIDs for LBK interfaces
Ido Schimmel idosch@nvidia.com vrf: Fix lockdep splat in output path
Jiapeng Chong jiapeng.chong@linux.alibaba.com security: keys: Modify mismatched function name
Ahmed Zaki ahmed.zaki@intel.com iavf: fix reset task race with iavf_remove()
Ahmed Zaki ahmed.zaki@intel.com iavf: fix a deadlock caused by rtnl and driver's lock circular dependencies
Marcin Szycik marcin.szycik@linux.intel.com iavf: Wait for reset in callbacks which trigger it
Przemek Kitszel przemyslaw.kitszel@intel.com iavf: make functions static where possible
Ahmed Zaki ahmed.zaki@intel.com iavf: use internal state to free traffic IRQs
Ding Hui dinghui@sangfor.com.cn iavf: Fix out-of-bounds when setting channels on remove
Ding Hui dinghui@sangfor.com.cn iavf: Fix use-after-free in free_netdev
Andrzej Hajda andrzej.hajda@intel.com drm/i915/perf: add sentinel to xehp_oa_b_counters
Heiner Kallweit hkallweit1@gmail.com r8169: fix ASPM-related problem for chip version 42 and 43
Tristram Ha Tristram.Ha@microchip.com net: dsa: microchip: correct KSZ8795 static MAC table access
Victor Nogueira victor@mojatatu.com net: sched: cls_bpf: Undo tcf_bind_filter in case of an error
Victor Nogueira victor@mojatatu.com net: sched: cls_u32: Undo refcount decrement in case update failed
Victor Nogueira victor@mojatatu.com net: sched: cls_u32: Undo tcf_bind_filter if u32_replace_hw_knode
Victor Nogueira victor@mojatatu.com net: sched: cls_matchall: Undo tcf_bind_filter in case of failure after mall_set_parms
Martin Fuzzey martin.fuzzey@flowbird.group regulator: da9063: fix null pointer deref with partial DT config
Dan Carpenter dan.carpenter@linaro.org ASoC: SOF: ipc3-dtrace: uninitialized data in dfsentry_trace_filter_write()
Michal Swiatkowski michal.swiatkowski@linux.intel.com ice: prevent NULL pointer deref during reload
Petr Oros poros@redhat.com ice: Unregister netdev and devlink_port only once
Shyam Prasad N nspmangalore@gmail.com cifs: fix mid leak during reconnection after timeout threshold
Dan Carpenter error27@gmail.com iommu/sva: Fix signedness bug in iommu_sva_alloc_pasid()
Yan Zhai yan@cloudflare.com gso: fix dodgy bit handling for GSO_UDP_L4
Daniel Golle daniel@makrotopia.org net: ethernet: mtk_eth_soc: handle probe deferral
Kuniyuki Iwashima kuniyu@amazon.com bridge: Add extack warning when enabling STP in netns.
Tanmay Patil t-patil@ti.com net: ethernet: ti: cpsw_ale: Fix cpsw_ale_get_field()/cpsw_ale_set_field()
Linus Walleij linus.walleij@linaro.org dsa: mv88e6xxx: Do a final check before timing out
Marc Zyngier maz@kernel.org arm64: Fix HFGxTR_EL2 field naming
Paulo Alcantara pc@manguebit.com smb: client: fix missed ses refcounting
Yonghong Song yhs@fb.com kallsyms: strip LTO-only suffixes from promoted global functions
Jaewon Kim jaewon02.kim@samsung.com spi: s3c64xx: clear loopback bit after loopback test
Christoph Hellwig hch@lst.de btrfs: be a bit more careful when setting mirror_num_ret in btrfs_map_block
James Clark james.clark@arm.com perf build: Fix library not found error when using CSLIBS
Yangtao Li frank.li@vivo.com fbdev: imxfb: Removed unneeded release_mem_region
Martin Kaiser martin@kaiser.cx fbdev: imxfb: warn about invalid left/right margin
Jonas Gorski jonas.gorski@gmail.com spi: bcm63xx: fix max prepend length
Biju Das biju.das.jz@bp.renesas.com pinctrl: renesas: rzg2l: Handle non-unique subnode names
Geert Uytterhoeven geert+renesas@glider.be pinctrl: renesas: rzv2m: Handle non-unique subnode names
Suren Baghdasaryan surenb@google.com sched/psi: use kernfs polling functions for PSI trigger polling
Miaohe Lin linmiaohe@huawei.com sched/fair: Use recent_used_cpu to test p->cpus_ptr
Peter Zijlstra peterz@infradead.org iov_iter: Mark copy_iovec_from_user() noclone
Srinivas Kandagatla srinivas.kandagatla@linaro.org ASoC: qcom: q6apm: do not close GPR port before closing graph
Srinivas Kandagatla srinivas.kandagatla@linaro.org ASoC: codecs: wcd938x: fix dB range for HPHL and HPHR
Johan Hovold johan+linaro@kernel.org ASoC: codecs: wcd938x: fix mbhc impedance loglevel
Vijendar Mukunda Vijendar.Mukunda@amd.com ASoC: amd: acp: fix for invalid dai id handling in acp_get_byte_count()
Hao Chen chenhao418@huawei.com net: hns3: fix strncpy() not using dest-buf length as length issue
Ying Hsu yinghsu@chromium.org igb: Fix igb_down hung on surprise removal
Yi Kuo yi@yikuo.dev wifi: iwlwifi: pcie: add device id 51F1 for killer 1675
Johannes Berg johannes.berg@intel.com wifi: iwlwifi: mvm: avoid baid size integer overflow
Mukesh Sisodiya mukesh.sisodiya@intel.com wifi: iwlwifi: Add support for new PCI Id
Gustavo A. R. Silva gustavoars@kernel.org wifi: wext-core: Fix -Wstringop-overflow warning in ioctl_standard_iw_point()
Mukesh Sisodiya mukesh.sisodiya@intel.com wifi: iwlwifi: mvm: Add NULL check before dereferencing the pointer
Petr Oros poros@redhat.com devlink: report devlink_port_type_warn source device
Jisheng Zhang jszhang@kernel.org net: ethernet: litex: add support for 64 bit stats
Gregory Greenman gregory.greenman@intel.com wifi: iwlwifi: mvm: fix potential array out of bounds access
P Praneesh quic_ppranees@quicinc.com wifi: ath11k: fix memory leak in WMI firmware stats
Balamurugan S quic_bselvara@quicinc.com wifi: ath12k: Avoid NULL pointer access during management transmit cleanup
Abe Kohandel abe.kohandel@intel.com spi: dw: Add compatible for Intel Mount Evans SoC
Ilan Peer ilan.peer@intel.com wifi: mac80211_hwsim: Fix possible NULL dereference
Wen Gong quic_wgong@quicinc.com wifi: ath11k: add support default regdb while searching board-2.bin for WCN6855
Jakub Kicinski kuba@kernel.org devlink: make health report on unregistered instance warn just once
Yonghong Song yhs@fb.com bpf: Silence a warning in btf_type_id_size()
Martin Blumenstingl martin.blumenstingl@googlemail.com wifi: rtw88: sdio: Check the HISR RX_REQUEST bit in rtw_sdio_rx_isr()
Aditi Ghag aditi.ghag@isovalent.com bpf: tcp: Avoid taking fast sock lock in iterator
Andrii Nakryiko andrii@kernel.org bpf: drop unnecessary user-triggerable WARN_ONCE in verifierl log
Brad Larson blarson@amd.com spi: cadence-quadspi: Add compatible for AMD Pensando Elba SoC
Martin KaFai Lau martin.lau@kernel.org bpf: Address KCSAN report on bpf_lru_list
Kui-Feng Lee thinker.li@gmail.com bpf: Print a warning only if writing to unprivileged_bpf_disabled.
Maxime Bizon mbizon@freebox.fr wifi: ath11k: fix registration of 6Ghz-only phy without the full channel range
Yicong Yang yangyicong@hisilicon.com sched/fair: Don't balance task to its current running CPU
Thomas Weißschuh linux@weissschuh.net tools/nolibc: ensure stack protector guard is never zero
Paul E. McKenney paulmck@kernel.org rcu: Mark additional concurrent load from ->cpu_no_qs.b.exp
Shigeru Yoshida syoshida@redhat.com rcu-tasks: Avoid pr_info() with spin lock in cblist_init_generic()
Hans de Goede hdegoede@redhat.com ACPI: video: Add backlight=native DMI quirk for Dell Studio 1569
Mark Rutland mark.rutland@arm.com arm64: mm: fix VA-range sanity check
Youngmin Nam youngmin.nam@samsung.com arm64: set __exception_irq_entry with __irq_entry as a default
Mario Limonciello mario.limonciello@amd.com ACPI: resource: Remove "Zen" specific match and quirks
Hans de Goede hdegoede@redhat.com ACPI: video: Add backlight=native DMI quirk for Lenovo ThinkPad X131e (3371 AMD version)
Hans de Goede hdegoede@redhat.com ACPI: video: Add backlight=native DMI quirk for Apple iMac11,3
Hans de Goede hdegoede@redhat.com ACPI: x86: Add ACPI_QUIRK_UART1_SKIP for Lenovo Yoga Book yb1-x90f/l
Hans de Goede hdegoede@redhat.com ACPI: button: Add lid disable DMI quirk for Nextbook Ares 8A
Hans de Goede hdegoede@redhat.com ACPI: x86: Add skip i2c clients quirk for Nextbook Ares 8A
Sandeep Dhavale dhavale@google.com erofs: Fix detection of atomic context
Filipe Manana fdmanana@suse.com btrfs: abort transaction at update_ref_for_cow() when ref count is zero
Christoph Hellwig hch@lst.de btrfs: don't check PageError in __extent_writepage
David Sterba dsterba@suse.com btrfs: add xxhash to fast checksum implementations
Thomas Gleixner tglx@linutronix.de posix-timers: Ensure timer ID search-loop limit is valid
Ming Lei ming.lei@redhat.com blk-mq: fix NULL dereference on q->elevator in blk_mq_elv_switch_none
Yu Kuai yukuai3@huawei.com scsi: sg: fix blktrace debugfs entries leakage
Yu Kuai yukuai3@huawei.com md/raid10: prevent soft lockup while flush writes
Yu Kuai yukuai3@huawei.com md: fix data corruption for raid456 when reshape restart while grow up
Immad Mir mirimmad17@gmail.com FS: JFS: Check for read-only mounted filesystem in txBegin
Immad Mir mirimmad17@gmail.com FS: JFS: Fix null-ptr-deref Read in txBegin
Gustavo A. R. Silva gustavoars@kernel.org MIPS: dec: prom: Address -Warray-bounds warning
Yogesh yogi.kernel@gmail.com fs: jfs: Fix UBSAN: array-index-out-of-bounds in dbAllocDmapLev
Matthew Anderson ruinairas1992@gmail.com ALSA: hda/realtek: Add quirks for ROG ALLY CS35l41 audio
Jan Kara jack@suse.cz udf: Fix uninitialized array access for some pathnames
Christian Brauner brauner@kernel.org ovl: check type and offset of struct vfsmount in ovl_entry
Marco Morandini marco.morandini@polimi.it HID: add quirk for 03f0:464a HP Elite Presenter Mouse
Ye Bin yebin10@huawei.com quota: fix warning in dqgrab()
Jan Kara jack@suse.cz quota: Properly disable quotas when add_dquot_ref() fails
Oswald Buddenhagen oswald.buddenhagen@gmx.de ALSA: emu10k1: roll up loops in DSP setup code for Audigy
hackyzh002 hackyzh002@gmail.com drm/radeon: Fix integer overflow in radeon_cs_parser_init
Eric Whitney enwlinux@gmail.com ext4: correct inline offset when handling xattrs in inode body
Marc Zyngier maz@kernel.org KVM: arm64: vgic-v4: Make the doorbell request robust w.r.t preemption
Marc Zyngier maz@kernel.org KVM: arm64: Disable preemption in kvm_arch_hardware_enable()
Oliver Upton oliver.upton@linux.dev KVM: arm64: Correctly handle page aging notifiers for unaligned memslot
Marc Zyngier maz@kernel.org KVM: arm64: timers: Use CNTHCTL_EL2 when setting non-CNTKCTL_EL1 bits
Johan Hovold johan+linaro@kernel.org ASoC: codecs: wcd938x: fix soundwire initialisation race
Johan Hovold johan+linaro@kernel.org ASoC: codecs: wcd938x: fix codec initialisation race
Johan Hovold johan+linaro@kernel.org ASoC: codecs: wcd934x: fix resource leaks on component remove
Johan Hovold johan+linaro@kernel.org ASoC: codecs: wcd938x: fix missing mbhc init error handling
Johan Hovold johan+linaro@kernel.org ASoC: codecs: wcd938x: fix resource leaks on component remove
Sheetal sheetal@nvidia.com ASoC: tegra: Fix AMX byte map
Johan Hovold johan+linaro@kernel.org ASoC: qdsp6: audioreach: fix topology probe deferral
Johan Hovold johan+linaro@kernel.org ASoC: codecs: wcd-mbhc-v2: fix resource leaks on component remove
Nathan Chancellor nathan@kernel.org ASoC: cs35l45: Select REGMAP_IRQ
Johan Hovold johan+linaro@kernel.org ASoC: codecs: wcd938x: fix missing clsh ctrl error handling
Thomas Petazzoni thomas.petazzoni@bootlin.com ASoC: cs42l51: fix driver to properly autoload with automatic module loading
Sameer Pujar spujar@nvidia.com ASoC: rt5640: Fix sleep in atomic context
Sheetal sheetal@nvidia.com ASoC: tegra: Fix ADX byte map
Fabio Estevam festevam@denx.de ASoC: fsl_sai: Revert "ASoC: fsl_sai: Enable MCTL_MCLK_EN bit for master mode"
Matus Gajdos matuszpd@gmail.com ASoC: fsl_sai: Disable bit clock with transmitter
Nicholas Kazlauskas nicholas.kazlauskas@amd.com drm/amd/display: Keep PHY active for DP displays on DCN31
Taimur Hassan syed.hassan@amd.com drm/amd/display: check TG is non-null before checking if enabled
Zhikai Zhai zhikai.zhai@amd.com drm/amd/display: Disable MPC split by default on special asic
Simon Ser contact@emersion.fr drm/amd/display: only accept async flips for fast updates
Jocelyn Falempe jfalempe@redhat.com drm/client: Fix memory leak in drm_client_modeset_probe
Jocelyn Falempe jfalempe@redhat.com drm/client: Fix memory leak in drm_client_target_cloned
Ben Skeggs bskeggs@redhat.com drm/nouveau/i2c: fix number of aux event slots
Ben Skeggs bskeggs@redhat.com drm/nouveau/kms/nv50-: init hpd_irq_lock for PIOR DP
Ben Skeggs bskeggs@redhat.com drm/nouveau/disp: PIOR DP uses GPIO for HPD, not PMGR AUX interrupts
Alex Deucher alexander.deucher@amd.com drm/amdgpu/pm: make mclk consistent for smu 13.0.7
Alex Deucher alexander.deucher@amd.com drm/amdgpu/pm: make gfxclock consistent for sienna cichlid
Guchun Chen guchun.chen@amd.com drm/amdgpu/vkms: relax timer deactivation by hrtimer_try_to_cancel
Ville Syrjälä ville.syrjala@linux.intel.com dma-buf/dma-resv: Stop leaking on krealloc() failure
Dan Carpenter dan.carpenter@linaro.org accel/qaic: Add consistent integer overflow checks
Dan Carpenter dan.carpenter@linaro.org accel/qaic: tighten bounds checking in decode_message()
Dan Carpenter dan.carpenter@linaro.org accel/qaic: tighten bounds checking in encode_message()
Matthieu Baerts matthieu.baerts@tessares.net selftests: tc: add ConnTrack procfs kconfig
Heiner Kallweit hkallweit1@gmail.com Revert "r8169: disable ASPM during NAPI poll"
Marc Kleine-Budde mkl@pengutronix.de can: gs_usb: fix time stamp counter initialization
Marc Kleine-Budde mkl@pengutronix.de can: gs_usb: gs_can_open(): improve error handling
YueHaibing yuehaibing@huawei.com can: bcm: Fix UAF in bcm_proc_show()
Fedor Ross fedor.ross@ifm.com can: mcp251xfd: __mcp251xfd_chip_set_mode(): increase poll timeout
Mark Brown broonie@kernel.org arm64/fpsimd: Ensure SME storage is allocated after SVE VL changes
Helge Deller deller@gmx.de ia64: mmap: Consider pgoff when searching for free mapping
Mark Brown broonie@kernel.org regmap: Account for register length in SMBus I/O limits
Rob Herring robh@kernel.org of: Preserve "of-display" device name for compatibility
Harald Freudenberger freude@linux.ibm.com s390/zcrypt: fix reply buffer calculations for CCA replies
Mark Brown broonie@kernel.org regmap: Drop initial version of maximum transfer length fixes
Matthieu Baerts matthieu.baerts@tessares.net selftests: tc: add 'ct' action kconfig dep
Dan Carpenter dan.carpenter@linaro.org accel/qaic: Fix a leak in map_user_pages()
Matthieu Baerts matthieu.baerts@tessares.net selftests: tc: set timeout to 15 minutes
Josef Bacik josef@toxicpanda.com btrfs: fix race between balance and cancel/pause
Miklos Szeredi mszeredi@redhat.com fuse: ioctl: translate ENOSYS in outarg
Filipe Manana fdmanana@suse.com btrfs: zoned: fix memory leak after finding block group with super blocks
Filipe Manana fdmanana@suse.com btrfs: fix double iput() on inode after an error during orphan cleanup
Josef Bacik josef@toxicpanda.com btrfs: set_page_extent_mapped after read_folio in btrfs_cont_expand
Qu Wenruo wqu@suse.com btrfs: raid56: always verify the P/Q contents for scrub
Bernd Schubert bschubert@ddn.com fuse: Apply flags2 only when userspace set the FUSE_INIT_EXT
Miklos Szeredi mszeredi@redhat.com fuse: add feature flag for expire-only
Miklos Szeredi mszeredi@redhat.com fuse: revalidate: don't invalidate if interrupted
Filipe Manana fdmanana@suse.com btrfs: fix warning when putting transaction with qgroups enabled after abort
Filipe Manana fdmanana@suse.com btrfs: fix iput() on error pointer after error during orphan cleanup
Georg Müller georgmueller@gmx.net perf probe: Read DWARF files from the correct CU
Georg Müller georgmueller@gmx.net perf probe: Add test for regression introduced by switch to die_get_decl_file()
Miguel Ojeda ojeda@kernel.org prctl: move PR_GET_AUXV out of PR_MCE_KILL
Petr Pavlu petr.pavlu@suse.com keys: Fix linking a duplicate key to a keyring's assoc_array
Colin Ian King colin.i.king@gmail.com selftests/mm: mkdirty: fix incorrect position of #endif
Liam R. Howlett Liam.Howlett@oracle.com maple_tree: fix node allocation testing on 32 bit
Liam R. Howlett Liam.Howlett@oracle.com mm/mlock: fix vma iterator conversion of apply_vma_lock_flags()
Peng Zhang zhangpeng.00@bytedance.com maple_tree: set the node limit when creating a new root node
Luka Guzenko l.guzenko@web.de ALSA: hda/realtek: Enable Mute LED on HP Laptop 15s-eq2xxx
Christoffer Sandberg cs@tuxedo.de ALSA: hda/realtek: Add quirk for Clevo NS70AU
Kailang Yang kailang@realtek.com ALSA: hda/realtek - remove 3k pull low procedure
Helge Deller deller@gmx.de io_uring: Fix io_uring mmap() by using architecture-provided get_unmapped_area()
Jens Axboe axboe@kernel.dk io_uring: treat -EAGAIN for REQ_F_NOWAIT as final for io-wq
-------------
Diffstat:
Makefile | 4 +- arch/arm64/include/asm/exception.h | 5 - arch/arm64/include/asm/kvm_host.h | 2 + arch/arm64/include/asm/kvm_pgtable.h | 26 +-- arch/arm64/kernel/fpsimd.c | 33 ++- arch/arm64/kvm/arch_timer.c | 6 +- arch/arm64/kvm/arm.c | 19 +- arch/arm64/kvm/hyp/pgtable.c | 47 +++- arch/arm64/kvm/mmu.c | 18 +- arch/arm64/kvm/vgic/vgic-v3.c | 2 +- arch/arm64/kvm/vgic/vgic-v4.c | 7 +- arch/arm64/mm/mmu.c | 4 +- arch/arm64/net/bpf_jit_comp.c | 8 +- arch/arm64/tools/sysreg | 12 +- arch/ia64/kernel/sys_ia64.c | 2 +- arch/mips/include/asm/dec/prom.h | 2 +- arch/parisc/kernel/sys_parisc.c | 15 +- block/blk-mq.c | 10 +- drivers/accel/qaic/qaic_control.c | 39 ++-- drivers/acpi/button.c | 9 + drivers/acpi/resource.c | 60 ----- drivers/acpi/video_detect.c | 24 ++ drivers/acpi/x86/utils.c | 26 ++- drivers/base/regmap/regmap-i2c.c | 8 +- drivers/base/regmap/regmap-spi-avmm.c | 2 +- drivers/base/regmap/regmap.c | 6 +- drivers/bluetooth/btusb.c | 1 + drivers/dma-buf/dma-resv.c | 13 +- drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.c | 5 +- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 256 +++++++++------------ drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h | 7 + .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_crtc.c | 12 + .../amd/display/amdgpu_dm/amdgpu_dm_mst_types.c | 110 +++++++++ .../amd/display/amdgpu_dm/amdgpu_dm_mst_types.h | 11 + .../amd/display/dc/clk_mgr/dcn31/dcn31_clk_mgr.c | 5 + .../drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c | 3 +- .../drm/amd/display/dc/dcn303/dcn303_resource.c | 2 +- .../drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c | 8 +- .../gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_7_ppt.c | 2 +- drivers/gpu/drm/drm_client_modeset.c | 6 + drivers/gpu/drm/i915/i915_perf.c | 1 + drivers/gpu/drm/nouveau/dispnv50/disp.c | 4 + drivers/gpu/drm/nouveau/include/nvkm/subdev/i2c.h | 4 +- drivers/gpu/drm/nouveau/nvkm/engine/disp/uconn.c | 27 ++- drivers/gpu/drm/nouveau/nvkm/subdev/i2c/base.c | 11 +- drivers/gpu/drm/radeon/radeon_cs.c | 3 +- drivers/gpu/drm/ttm/ttm_resource.c | 5 +- drivers/hid/hid-ids.h | 1 + drivers/hid/hid-quirks.c | 1 + drivers/iommu/iommu-sva.c | 3 +- drivers/md/md.c | 14 +- drivers/md/raid10.c | 2 + drivers/net/can/spi/mcp251xfd/mcp251xfd-core.c | 10 +- drivers/net/can/spi/mcp251xfd/mcp251xfd.h | 1 + drivers/net/can/usb/gs_usb.c | 130 ++++++----- drivers/net/dsa/microchip/ksz8795.c | 8 +- drivers/net/dsa/microchip/ksz_common.c | 8 +- drivers/net/dsa/microchip/ksz_common.h | 7 + drivers/net/dsa/mv88e6xxx/chip.c | 7 + drivers/net/ethernet/hisilicon/hns3/hns3_debugfs.c | 33 ++- .../ethernet/hisilicon/hns3/hns3pf/hclge_debugfs.c | 29 ++- drivers/net/ethernet/intel/iavf/iavf.h | 16 +- drivers/net/ethernet/intel/iavf/iavf_ethtool.c | 39 ++-- drivers/net/ethernet/intel/iavf/iavf_main.c | 223 ++++++++++++------ drivers/net/ethernet/intel/iavf/iavf_txrx.c | 43 ++-- drivers/net/ethernet/intel/iavf/iavf_txrx.h | 4 - drivers/net/ethernet/intel/iavf/iavf_virtchnl.c | 5 +- drivers/net/ethernet/intel/ice/ice_base.c | 2 + drivers/net/ethernet/intel/ice/ice_ethtool.c | 13 +- drivers/net/ethernet/intel/ice/ice_lib.c | 27 --- drivers/net/ethernet/intel/ice/ice_main.c | 10 +- drivers/net/ethernet/intel/igb/igb_main.c | 5 + drivers/net/ethernet/intel/igc/igc_main.c | 12 +- drivers/net/ethernet/litex/litex_liteeth.c | 19 +- .../net/ethernet/marvell/octeontx2/nic/otx2_pf.c | 5 +- drivers/net/ethernet/mediatek/mtk_eth_soc.c | 29 +-- drivers/net/ethernet/mediatek/mtk_ppe_debugfs.c | 2 +- drivers/net/ethernet/realtek/r8169_main.c | 18 +- drivers/net/ethernet/ti/cpsw_ale.c | 24 +- drivers/net/phy/phy_device.c | 21 +- drivers/net/vrf.c | 12 +- drivers/net/wireless/ath/ath11k/core.c | 53 +++-- drivers/net/wireless/ath/ath11k/mac.c | 3 +- drivers/net/wireless/ath/ath11k/wmi.c | 5 + drivers/net/wireless/ath/ath12k/mac.c | 1 + drivers/net/wireless/intel/iwlwifi/mvm/mld-key.c | 9 +- drivers/net/wireless/intel/iwlwifi/mvm/power.c | 14 +- drivers/net/wireless/intel/iwlwifi/mvm/sta.c | 2 +- drivers/net/wireless/intel/iwlwifi/pcie/drv.c | 4 + drivers/net/wireless/realtek/rtw88/sdio.c | 24 +- drivers/net/wireless/virtual/mac80211_hwsim.c | 4 +- drivers/of/platform.c | 2 +- drivers/pinctrl/renesas/pinctrl-rzg2l.c | 28 ++- drivers/pinctrl/renesas/pinctrl-rzv2m.c | 28 ++- drivers/regulator/da9063-regulator.c | 3 + drivers/s390/crypto/zcrypt_msgtype6.c | 33 ++- drivers/scsi/sg.c | 10 + drivers/spi/spi-bcm63xx.c | 2 +- drivers/spi/spi-cadence-quadspi.c | 19 ++ drivers/spi/spi-dw-mmio.c | 22 ++ drivers/spi/spi-s3c64xx.c | 2 + drivers/video/fbdev/au1200fb.c | 3 + drivers/video/fbdev/imxfb.c | 5 +- fs/btrfs/block-group.c | 1 + fs/btrfs/ctree.c | 10 +- fs/btrfs/disk-io.c | 3 + fs/btrfs/extent_io.c | 33 +-- fs/btrfs/inode.c | 35 +-- fs/btrfs/qgroup.c | 1 + fs/btrfs/raid56.c | 11 +- fs/btrfs/volumes.c | 17 +- fs/erofs/zdata.c | 2 +- fs/ext4/xattr.c | 14 ++ fs/fuse/dir.c | 2 +- fs/fuse/inode.c | 8 +- fs/fuse/ioctl.c | 21 +- fs/jbd2/checkpoint.c | 102 +++----- fs/jfs/jfs_dmap.c | 3 + fs/jfs/jfs_txnmgr.c | 5 + fs/jfs/namei.c | 5 + fs/overlayfs/ovl_entry.h | 9 + fs/quota/dquot.c | 5 +- fs/smb/client/connect.c | 19 +- fs/smb/client/dfs.c | 26 +-- fs/smb/client/smb2transport.c | 2 +- fs/udf/unicode.c | 2 +- include/kvm/arm_vgic.h | 2 +- include/linux/psi.h | 5 +- include/linux/psi_types.h | 3 + include/linux/sched/signal.h | 2 +- include/linux/tcp.h | 2 +- include/net/bluetooth/hci_core.h | 5 + include/net/ip.h | 2 +- include/net/tcp.h | 31 ++- include/uapi/linux/fuse.h | 3 + io_uring/io_uring.c | 52 ++--- kernel/bpf/bpf_lru_list.c | 21 +- kernel/bpf/bpf_lru_list.h | 7 +- kernel/bpf/btf.c | 23 +- kernel/bpf/log.c | 3 - kernel/bpf/syscall.c | 3 +- kernel/bpf/verifier.c | 32 ++- kernel/cgroup/cgroup.c | 2 +- kernel/kallsyms.c | 5 +- kernel/rcu/tasks.h | 5 +- kernel/rcu/tree_exp.h | 2 +- kernel/rcu/tree_plugin.h | 4 +- kernel/sched/fair.c | 4 +- kernel/sched/psi.c | 29 ++- kernel/sys.c | 10 +- kernel/time/posix-timers.c | 31 +-- kernel/trace/trace_events_hist.c | 3 +- lib/iov_iter.c | 2 +- lib/maple_tree.c | 3 +- mm/mlock.c | 9 +- net/bluetooth/hci_conn.c | 14 +- net/bluetooth/hci_core.c | 42 +++- net/bluetooth/hci_event.c | 15 +- net/bluetooth/hci_sync.c | 121 ++++++++-- net/bluetooth/iso.c | 55 +++-- net/bluetooth/mgmt.c | 26 +-- net/bluetooth/sco.c | 23 +- net/bridge/br_stp_if.c | 3 + net/can/bcm.c | 12 +- net/devlink/health.c | 2 +- net/devlink/leftover.c | 5 +- net/ipv4/esp4.c | 2 +- net/ipv4/inet_connection_sock.c | 2 +- net/ipv4/inet_hashtables.c | 17 +- net/ipv4/inet_timewait_sock.c | 8 +- net/ipv4/ip_output.c | 4 +- net/ipv4/tcp.c | 57 ++--- net/ipv4/tcp_fastopen.c | 6 +- net/ipv4/tcp_ipv4.c | 27 ++- net/ipv4/tcp_minisocks.c | 11 +- net/ipv4/tcp_output.c | 6 +- net/ipv4/udp_offload.c | 16 +- net/ipv6/ip6_gre.c | 3 +- net/ipv6/tcp_ipv6.c | 4 +- net/ipv6/udp_offload.c | 3 +- net/llc/llc_input.c | 3 - net/netfilter/nf_tables_api.c | 12 +- net/netfilter/nft_set_pipapo.c | 6 +- net/sched/cls_bpf.c | 99 ++++---- net/sched/cls_matchall.c | 35 +-- net/sched/cls_u32.c | 48 +++- net/wireless/wext-core.c | 6 + scripts/Makefile.build | 5 +- scripts/Makefile.host | 6 +- scripts/kallsyms.c | 6 +- security/keys/request_key.c | 35 ++- security/keys/trusted-keys/trusted_tpm2.c | 2 +- sound/pci/emu10k1/emufx.c | 112 +-------- sound/pci/hda/patch_realtek.c | 100 +++++++- sound/soc/amd/acp/amd.h | 7 +- sound/soc/codecs/Kconfig | 1 + sound/soc/codecs/cs42l51-i2c.c | 6 + sound/soc/codecs/cs42l51.c | 7 - sound/soc/codecs/cs42l51.h | 1 - sound/soc/codecs/rt5640.c | 12 +- sound/soc/codecs/wcd-mbhc-v2.c | 57 +++-- sound/soc/codecs/wcd934x.c | 12 + sound/soc/codecs/wcd938x.c | 86 ++++++- sound/soc/fsl/fsl_sai.c | 8 +- sound/soc/fsl/fsl_sai.h | 1 + sound/soc/qcom/qdsp6/q6apm.c | 7 +- sound/soc/qcom/qdsp6/topology.c | 4 +- sound/soc/sof/ipc3-dtrace.c | 9 +- sound/soc/tegra/tegra210_adx.c | 34 ++- sound/soc/tegra/tegra210_amx.c | 40 ++-- tools/include/nolibc/stackprotector.h | 5 +- tools/perf/Makefile.config | 4 +- .../tests/shell/test_uprobe_from_different_cu.sh | 77 +++++++ tools/perf/util/dwarf-aux.c | 4 +- tools/testing/radix-tree/maple.c | 6 +- tools/testing/selftests/mm/mkdirty.c | 2 +- tools/testing/selftests/tc-testing/config | 2 + tools/testing/selftests/tc-testing/settings | 1 + 218 files changed, 2462 insertions(+), 1482 deletions(-)
From: Jens Axboe axboe@kernel.dk
commit a9be202269580ca611c6cebac90eaf1795497800 upstream.
io-wq assumes that an issue is blocking, but it may not be if the request type has asked for a non-blocking attempt. If we get -EAGAIN for that case, then we need to treat it as a final result and not retry or arm poll for it.
Cc: stable@vger.kernel.org # 5.10+ Link: https://github.com/axboe/liburing/issues/897 Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- io_uring/io_uring.c | 8 ++++++++ 1 file changed, 8 insertions(+)
--- a/io_uring/io_uring.c +++ b/io_uring/io_uring.c @@ -2032,6 +2032,14 @@ fail: ret = io_issue_sqe(req, issue_flags); if (ret != -EAGAIN) break; + + /* + * If REQ_F_NOWAIT is set, then don't wait or retry with + * poll. -EAGAIN is final for that case. + */ + if (req->flags & REQ_F_NOWAIT) + break; + /* * We can get EAGAIN for iopolled IO even though we're * forcing a sync submission from here, since we can't
From: Helge Deller deller@gmx.de
commit 32832a407a7178eec3215fad9b1a3298c14b0d69 upstream.
The io_uring testcase is broken on IA-64 since commit d808459b2e31 ("io_uring: Adjust mapping wrt architecture aliasing requirements").
The reason is, that this commit introduced an own architecture independend get_unmapped_area() search algorithm which finds on IA-64 a memory region which is outside of the regular memory region used for shared userspace mappings and which can't be used on that platform due to aliasing.
To avoid similar problems on IA-64 and other platforms in the future, it's better to switch back to the architecture-provided get_unmapped_area() function and adjust the needed input parameters before the call. Beside fixing the issue, the function now becomes easier to understand and maintain.
This patch has been successfully tested with the io_uring testcase on physical x86-64, ppc64le, IA-64 and PA-RISC machines. On PA-RISC the LTP mmmap testcases did not report any regressions.
Cc: stable@vger.kernel.org # 6.4 Signed-off-by: Helge Deller deller@gmx.de Reported-by: matoro matoro_mailinglist_kernel@matoro.tk Fixes: d808459b2e31 ("io_uring: Adjust mapping wrt architecture aliasing requirements") Link: https://lore.kernel.org/r/20230721152432.196382-2-deller@gmx.de Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/parisc/kernel/sys_parisc.c | 15 +++++++++----- io_uring/io_uring.c | 42 ++++++++++++++++------------------------ 2 files changed, 27 insertions(+), 30 deletions(-)
--- a/arch/parisc/kernel/sys_parisc.c +++ b/arch/parisc/kernel/sys_parisc.c @@ -26,12 +26,17 @@ #include <linux/compat.h>
/* - * Construct an artificial page offset for the mapping based on the physical + * Construct an artificial page offset for the mapping based on the virtual * address of the kernel file mapping variable. + * If filp is zero the calculated pgoff value aliases the memory of the given + * address. This is useful for io_uring where the mapping shall alias a kernel + * address and a userspace adress where both the kernel and the userspace + * access the same memory region. */ -#define GET_FILP_PGOFF(filp) \ - (filp ? (((unsigned long) filp->f_mapping) >> 8) \ - & ((SHM_COLOUR-1) >> PAGE_SHIFT) : 0UL) +#define GET_FILP_PGOFF(filp, addr) \ + ((filp ? (((unsigned long) filp->f_mapping) >> 8) \ + & ((SHM_COLOUR-1) >> PAGE_SHIFT) : 0UL) \ + + (addr >> PAGE_SHIFT))
static unsigned long shared_align_offset(unsigned long filp_pgoff, unsigned long pgoff) @@ -111,7 +116,7 @@ static unsigned long arch_get_unmapped_a do_color_align = 0; if (filp || (flags & MAP_SHARED)) do_color_align = 1; - filp_pgoff = GET_FILP_PGOFF(filp); + filp_pgoff = GET_FILP_PGOFF(filp, addr);
if (flags & MAP_FIXED) { /* Even MAP_FIXED mappings must reside within TASK_SIZE */ --- a/io_uring/io_uring.c +++ b/io_uring/io_uring.c @@ -3433,8 +3433,6 @@ static unsigned long io_uring_mmu_get_un unsigned long addr, unsigned long len, unsigned long pgoff, unsigned long flags) { - const unsigned long mmap_end = arch_get_mmap_end(addr, len, flags); - struct vm_unmapped_area_info info; void *ptr;
/* @@ -3449,32 +3447,26 @@ static unsigned long io_uring_mmu_get_un if (IS_ERR(ptr)) return -ENOMEM;
- info.flags = VM_UNMAPPED_AREA_TOPDOWN; - info.length = len; - info.low_limit = max(PAGE_SIZE, mmap_min_addr); - info.high_limit = arch_get_mmap_base(addr, current->mm->mmap_base); + /* + * Some architectures have strong cache aliasing requirements. + * For such architectures we need a coherent mapping which aliases + * kernel memory *and* userspace memory. To achieve that: + * - use a NULL file pointer to reference physical memory, and + * - use the kernel virtual address of the shared io_uring context + * (instead of the userspace-provided address, which has to be 0UL + * anyway). + * For architectures without such aliasing requirements, the + * architecture will return any suitable mapping because addr is 0. + */ + filp = NULL; + flags |= MAP_SHARED; + pgoff = 0; /* has been translated to ptr above */ #ifdef SHM_COLOUR - info.align_mask = PAGE_MASK & (SHM_COLOUR - 1UL); + addr = (uintptr_t) ptr; #else - info.align_mask = PAGE_MASK & (SHMLBA - 1UL); + addr = 0UL; #endif - info.align_offset = (unsigned long) ptr; - - /* - * A failed mmap() very likely causes application failure, - * so fall back to the bottom-up function here. This scenario - * can happen with large stack limits and large mmap() - * allocations. - */ - addr = vm_unmapped_area(&info); - if (offset_in_page(addr)) { - info.flags = 0; - info.low_limit = TASK_UNMAPPED_BASE; - info.high_limit = mmap_end; - addr = vm_unmapped_area(&info); - } - - return addr; + return current->mm->get_unmapped_area(filp, addr, len, pgoff, flags); }
#else /* !CONFIG_MMU */
From: Kailang Yang kailang@realtek.com
commit 69ea4c9d02b7947cdd612335a61cc1a02e544ccd upstream.
This was the ALC283 depop procedure. Maybe this procedure wasn't suitable with new codec. So, let us remove it. But HP 15z-fc000 must do 3k pull low. If it reboot with plugged headset, it will have errors show don't find codec error messages. Run 3k pull low will solve issues. So, let AMD chipset will run this for workarround.
Fixes: 5aec98913095 ("ALSA: hda/realtek - ALC236 headset MIC recording issue") Signed-off-by: Kailang Yang kailang@realtek.com Cc: stable@vger.kernel.org Reported-by: Joseph C. Sible josephcsible@gmail.com Closes: https://lore.kernel.org/r/CABpewhE4REgn9RJZduuEU6Z_ijXNeQWnrxO1tg70Gkw=F8qNY... Link: https://lore.kernel.org/r/4678992299664babac4403d9978e7ba7@realtek.com Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/pci/hda/patch_realtek.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-)
--- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -122,6 +122,7 @@ struct alc_spec { unsigned int ultra_low_power:1; unsigned int has_hs_key:1; unsigned int no_internal_mic_pin:1; + unsigned int en_3kpull_low:1;
/* for PLL fix */ hda_nid_t pll_nid; @@ -3622,6 +3623,7 @@ static void alc256_shutup(struct hda_cod if (!hp_pin) hp_pin = 0x21;
+ alc_update_coefex_idx(codec, 0x57, 0x04, 0x0007, 0x1); /* Low power */ hp_pin_sense = snd_hda_jack_detect(codec, hp_pin);
if (hp_pin_sense) @@ -3638,8 +3640,7 @@ static void alc256_shutup(struct hda_cod /* If disable 3k pulldown control for alc257, the Mic detection will not work correctly * when booting with headset plugged. So skip setting it for the codec alc257 */ - if (codec->core.vendor_id != 0x10ec0236 && - codec->core.vendor_id != 0x10ec0257) + if (spec->en_3kpull_low) alc_update_coef_idx(codec, 0x46, 0, 3 << 12);
if (!spec->no_shutup_pins) @@ -10601,6 +10602,8 @@ static int patch_alc269(struct hda_codec spec->shutup = alc256_shutup; spec->init_hook = alc256_init; spec->gen.mixer_nid = 0; /* ALC256 does not have any loopback mixer path */ + if (codec->bus->pci->vendor == PCI_VENDOR_ID_AMD) + spec->en_3kpull_low = true; break; case 0x10ec0257: spec->codec_variant = ALC269_TYPE_ALC257;
From: Christoffer Sandberg cs@tuxedo.de
commit c250ef8954eda2024c8861c36e9fc1b589481fe7 upstream.
Fixes headset detection on Clevo NS70AU.
Co-developed-by: Werner Sembach wse@tuxedocomputers.com Signed-off-by: Werner Sembach wse@tuxedocomputers.com Signed-off-by: Christoffer Sandberg cs@tuxedo.de Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230718145722.10592-1-wse@tuxedocomputers.com Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/pci/hda/patch_realtek.c | 1 + 1 file changed, 1 insertion(+)
--- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -9647,6 +9647,7 @@ static const struct snd_pci_quirk alc269 SND_PCI_QUIRK(0x1558, 0x5157, "Clevo W517GU1", ALC293_FIXUP_SYSTEM76_MIC_NO_PRESENCE), SND_PCI_QUIRK(0x1558, 0x51a1, "Clevo NS50MU", ALC293_FIXUP_SYSTEM76_MIC_NO_PRESENCE), SND_PCI_QUIRK(0x1558, 0x51b1, "Clevo NS50AU", ALC256_FIXUP_SYSTEM76_MIC_NO_PRESENCE), + SND_PCI_QUIRK(0x1558, 0x51b3, "Clevo NS70AU", ALC256_FIXUP_SYSTEM76_MIC_NO_PRESENCE), SND_PCI_QUIRK(0x1558, 0x5630, "Clevo NP50RNJS", ALC256_FIXUP_SYSTEM76_MIC_NO_PRESENCE), SND_PCI_QUIRK(0x1558, 0x70a1, "Clevo NB70T[HJK]", ALC293_FIXUP_SYSTEM76_MIC_NO_PRESENCE), SND_PCI_QUIRK(0x1558, 0x70b3, "Clevo NK70SB", ALC293_FIXUP_SYSTEM76_MIC_NO_PRESENCE),
From: Luka Guzenko l.guzenko@web.de
commit 0659400f18c0e6c0c69d74fe5d09e7f6fbbd52a2 upstream.
The HP Laptop 15s-eq2xxx uses ALC236 codec and controls the mute LED using COEF 0x07 index 1. No existing quirk covers this configuration. Adds a new quirk and enables it for the device.
Signed-off-by: Luka Guzenko l.guzenko@web.de Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230718161241.393181-1-l.guzenko@web.de Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/pci/hda/patch_realtek.c | 21 +++++++++++++++++++++ 1 file changed, 21 insertions(+)
--- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -4624,6 +4624,21 @@ static void alc236_fixup_hp_mute_led_coe } }
+static void alc236_fixup_hp_mute_led_coefbit2(struct hda_codec *codec, + const struct hda_fixup *fix, int action) +{ + struct alc_spec *spec = codec->spec; + + if (action == HDA_FIXUP_ACT_PRE_PROBE) { + spec->mute_led_polarity = 0; + spec->mute_led_coef.idx = 0x07; + spec->mute_led_coef.mask = 1; + spec->mute_led_coef.on = 1; + spec->mute_led_coef.off = 0; + snd_hda_gen_add_mute_led_cdev(codec, coef_mute_led_set); + } +} + /* turn on/off mic-mute LED per capture hook by coef bit */ static int coef_micmute_led_set(struct led_classdev *led_cdev, enum led_brightness brightness) @@ -7134,6 +7149,7 @@ enum { ALC285_FIXUP_HP_GPIO_LED, ALC285_FIXUP_HP_MUTE_LED, ALC285_FIXUP_HP_SPECTRE_X360_MUTE_LED, + ALC236_FIXUP_HP_MUTE_LED_COEFBIT2, ALC236_FIXUP_HP_GPIO_LED, ALC236_FIXUP_HP_MUTE_LED, ALC236_FIXUP_HP_MUTE_LED_MICMUTE_VREF, @@ -8557,6 +8573,10 @@ static const struct hda_fixup alc269_fix .type = HDA_FIXUP_FUNC, .v.func = alc285_fixup_hp_spectre_x360_mute_led, }, + [ALC236_FIXUP_HP_MUTE_LED_COEFBIT2] = { + .type = HDA_FIXUP_FUNC, + .v.func = alc236_fixup_hp_mute_led_coefbit2, + }, [ALC236_FIXUP_HP_GPIO_LED] = { .type = HDA_FIXUP_FUNC, .v.func = alc236_fixup_hp_gpio_led, @@ -9441,6 +9461,7 @@ static const struct snd_pci_quirk alc269 SND_PCI_QUIRK(0x103c, 0x886d, "HP ZBook Fury 17.3 Inch G8 Mobile Workstation PC", ALC285_FIXUP_HP_GPIO_AMP_INIT), SND_PCI_QUIRK(0x103c, 0x8870, "HP ZBook Fury 15.6 Inch G8 Mobile Workstation PC", ALC285_FIXUP_HP_GPIO_AMP_INIT), SND_PCI_QUIRK(0x103c, 0x8873, "HP ZBook Studio 15.6 Inch G8 Mobile Workstation PC", ALC285_FIXUP_HP_GPIO_AMP_INIT), + SND_PCI_QUIRK(0x103c, 0x887a, "HP Laptop 15s-eq2xxx", ALC236_FIXUP_HP_MUTE_LED_COEFBIT2), SND_PCI_QUIRK(0x103c, 0x888d, "HP ZBook Power 15.6 inch G8 Mobile Workstation PC", ALC236_FIXUP_HP_GPIO_LED), SND_PCI_QUIRK(0x103c, 0x8895, "HP EliteBook 855 G8 Notebook PC", ALC285_FIXUP_HP_SPEAKERS_MICMUTE_LED), SND_PCI_QUIRK(0x103c, 0x8896, "HP EliteBook 855 G8 Notebook PC", ALC285_FIXUP_HP_MUTE_LED),
From: Peng Zhang zhangpeng.00@bytedance.com
commit 3c769fd88b9742954763a968e84de09f7ad78cfe upstream.
Set the node limit of the root node so that the last pivot of all nodes is the node limit (if the node is not full).
This patch also fixes a bug in mas_rev_awalk(). Effectively, always setting a maximum makes mas_logical_pivot() behave as mas_safe_pivot(). Without this fix, it is possible that very small tasks would fail to find the correct gap. Although this has not been observed with real tasks, it has been reported to happen in m68k nommu running the maple tree tests.
Link: https://lkml.kernel.org/r/20230711035444.526-1-zhangpeng.00@bytedance.com Link: https://lore.kernel.org/linux-mm/CAMuHMdV4T53fOw7VPoBgPR7fP6RYqf=CBhD_y_vOg5... Link: https://lkml.kernel.org/r/20230711035444.526-2-zhangpeng.00@bytedance.com Fixes: 54a611b60590 ("Maple Tree: add new data structure") Signed-off-by: Peng Zhang zhangpeng.00@bytedance.com Reviewed-by: Liam R. Howlett Liam.Howlett@oracle.com Tested-by: Geert Uytterhoeven geert@linux-m68k.org Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- lib/maple_tree.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/lib/maple_tree.c +++ b/lib/maple_tree.c @@ -3693,7 +3693,8 @@ static inline int mas_root_expand(struct mas->offset = slot; pivots[slot] = mas->last; if (mas->last != ULONG_MAX) - slot++; + pivots[++slot] = ULONG_MAX; + mas->depth = 1; mas_set_height(mas); ma_set_meta(node, maple_leaf_64, 0, slot);
From: Liam R. Howlett Liam.Howlett@oracle.com
commit 2658f94d679243209889cdfa8de3743cde1abea9 upstream.
apply_vma_lock_flags() calls mlock_fixup(), which could merge the VMA after where the vma iterator is located. Although this is not an issue, the next iteration of the loop will check the start of the vma to be equal to the locally saved 'tmp' variable and cause an incorrect failure scenario. Fix the error by setting tmp to the end of the vma iterator value before restarting the loop.
There is also a potential of the error code being overwritten when the loop terminates early. Fix the return issue by directly returning when an error is encountered since there is nothing to undo after the loop.
Link: https://lkml.kernel.org/r/20230711175020.4091336-1-Liam.Howlett@oracle.com Fixes: 37598f5a9d8b ("mlock: convert mlock to vma iterator") Signed-off-by: Liam R. Howlett Liam.Howlett@oracle.com Reported-by: Ryan Roberts ryan.roberts@arm.com Link: https://lore.kernel.org/linux-mm/50341ca1-d582-b33a-e3d0-acb08a65166f@arm.co... Tested-by: Ryan Roberts ryan.roberts@arm.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/mlock.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-)
--- a/mm/mlock.c +++ b/mm/mlock.c @@ -471,7 +471,6 @@ static int apply_vma_lock_flags(unsigned { unsigned long nstart, end, tmp; struct vm_area_struct *vma, *prev; - int error; VMA_ITERATOR(vmi, current->mm, start);
VM_BUG_ON(offset_in_page(start)); @@ -492,6 +491,7 @@ static int apply_vma_lock_flags(unsigned nstart = start; tmp = vma->vm_start; for_each_vma_range(vmi, vma, end) { + int error; vm_flags_t newflags;
if (vma->vm_start != tmp) @@ -505,14 +505,15 @@ static int apply_vma_lock_flags(unsigned tmp = end; error = mlock_fixup(&vmi, vma, &prev, nstart, tmp, newflags); if (error) - break; + return error; + tmp = vma_iter_end(&vmi); nstart = tmp; }
- if (vma_iter_end(&vmi) < end) + if (tmp < end) return -ENOMEM;
- return error; + return 0; }
/*
From: Liam R. Howlett Liam.Howlett@oracle.com
commit ef5c3de5211b5a3a8102b25aa83eb4cde65ac2fd upstream.
Internal node counting was altered and the 64 bit test was updated, however the 32bit test was missed.
Restore the 32bit test to a functional state.
Link: https://lore.kernel.org/linux-mm/CAMuHMdV4T53fOw7VPoBgPR7fP6RYqf=CBhD_y_vOg5... Link: https://lkml.kernel.org/r/20230712173916.168805-2-Liam.Howlett@oracle.com Fixes: 541e06b772c1 ("maple_tree: remove GFP_ZERO from kmem_cache_alloc() and kmem_cache_alloc_bulk()") Signed-off-by: Liam R. Howlett Liam.Howlett@oracle.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- tools/testing/radix-tree/maple.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
--- a/tools/testing/radix-tree/maple.c +++ b/tools/testing/radix-tree/maple.c @@ -206,9 +206,9 @@ static noinline void check_new_node(stru e = i - 1; } else { if (i >= 4) - e = i - 4; - else if (i == 3) - e = i - 2; + e = i - 3; + else if (i >= 1) + e = i - 1; else e = 0; }
From: Colin Ian King colin.i.king@gmail.com
commit 25b5949c30938c7f26dbadc948b491e0e0811c78 upstream.
The #endif is the wrong side of a } causing a build failure when __NR_userfaultfd is not defined. Fix this by moving the #end to enclose the }
Link: https://lkml.kernel.org/r/20230712134648.456349-1-colin.i.king@gmail.com Fixes: 9eac40fc0cc7 ("selftests/mm: mkdirty: test behavior of (pte|pmd)_mkdirty on VMAs without write permissions") Signed-off-by: Colin Ian King colin.i.king@gmail.com Reviewed-by: David Hildenbrand david@redhat.com Cc: Shuah Khan shuah@kernel.org Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- tools/testing/selftests/mm/mkdirty.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/tools/testing/selftests/mm/mkdirty.c +++ b/tools/testing/selftests/mm/mkdirty.c @@ -321,8 +321,8 @@ close_uffd: munmap: munmap(dst, pagesize); free(src); -#endif /* __NR_userfaultfd */ } +#endif /* __NR_userfaultfd */
int main(void) {
From: Petr Pavlu petr.pavlu@suse.com
commit d55901522f96082a43b9842d34867363c0cdbac5 upstream.
When making a DNS query inside the kernel using dns_query(), the request code can in rare cases end up creating a duplicate index key in the assoc_array of the destination keyring. It is eventually found by a BUG_ON() check in the assoc_array implementation and results in a crash.
Example report: [2158499.700025] kernel BUG at ../lib/assoc_array.c:652! [2158499.700039] invalid opcode: 0000 [#1] SMP PTI [2158499.700065] CPU: 3 PID: 31985 Comm: kworker/3:1 Kdump: loaded Not tainted 5.3.18-150300.59.90-default #1 SLE15-SP3 [2158499.700096] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020 [2158499.700351] Workqueue: cifsiod cifs_resolve_server [cifs] [2158499.700380] RIP: 0010:assoc_array_insert+0x85f/0xa40 [2158499.700401] Code: ff 74 2b 48 8b 3b 49 8b 45 18 4c 89 e6 48 83 e7 fe e8 95 ec 74 00 3b 45 88 7d db 85 c0 79 d4 0f 0b 0f 0b 0f 0b e8 41 f2 be ff <0f> 0b 0f 0b 81 7d 88 ff ff ff 7f 4c 89 eb 4c 8b ad 58 ff ff ff 0f [2158499.700448] RSP: 0018:ffffc0bd6187faf0 EFLAGS: 00010282 [2158499.700470] RAX: ffff9f1ea7da2fe8 RBX: ffff9f1ea7da2fc1 RCX: 0000000000000005 [2158499.700492] RDX: 0000000000000000 RSI: 0000000000000005 RDI: 0000000000000000 [2158499.700515] RBP: ffffc0bd6187fbb0 R08: ffff9f185faf1100 R09: 0000000000000000 [2158499.700538] R10: ffff9f1ea7da2cc0 R11: 000000005ed8cec8 R12: ffffc0bd6187fc28 [2158499.700561] R13: ffff9f15feb8d000 R14: ffff9f1ea7da2fc0 R15: ffff9f168dc0d740 [2158499.700585] FS: 0000000000000000(0000) GS:ffff9f185fac0000(0000) knlGS:0000000000000000 [2158499.700610] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [2158499.700630] CR2: 00007fdd94fca238 CR3: 0000000809d8c006 CR4: 00000000003706e0 [2158499.700702] Call Trace: [2158499.700741] ? key_alloc+0x447/0x4b0 [2158499.700768] ? __key_link_begin+0x43/0xa0 [2158499.700790] __key_link_begin+0x43/0xa0 [2158499.700814] request_key_and_link+0x2c7/0x730 [2158499.700847] ? dns_resolver_read+0x20/0x20 [dns_resolver] [2158499.700873] ? key_default_cmp+0x20/0x20 [2158499.700898] request_key_tag+0x43/0xa0 [2158499.700926] dns_query+0x114/0x2ca [dns_resolver] [2158499.701127] dns_resolve_server_name_to_ip+0x194/0x310 [cifs] [2158499.701164] ? scnprintf+0x49/0x90 [2158499.701190] ? __switch_to_asm+0x40/0x70 [2158499.701211] ? __switch_to_asm+0x34/0x70 [2158499.701405] reconn_set_ipaddr_from_hostname+0x81/0x2a0 [cifs] [2158499.701603] cifs_resolve_server+0x4b/0xd0 [cifs] [2158499.701632] process_one_work+0x1f8/0x3e0 [2158499.701658] worker_thread+0x2d/0x3f0 [2158499.701682] ? process_one_work+0x3e0/0x3e0 [2158499.701703] kthread+0x10d/0x130 [2158499.701723] ? kthread_park+0xb0/0xb0 [2158499.701746] ret_from_fork+0x1f/0x40
The situation occurs as follows: * Some kernel facility invokes dns_query() to resolve a hostname, for example, "abcdef". The function registers its global DNS resolver cache as current->cred.thread_keyring and passes the query to request_key_net() -> request_key_tag() -> request_key_and_link(). * Function request_key_and_link() creates a keyring_search_context object. Its match_data.cmp method gets set via a call to type->match_preparse() (resolves to dns_resolver_match_preparse()) to dns_resolver_cmp(). * Function request_key_and_link() continues and invokes search_process_keyrings_rcu() which returns that a given key was not found. The control is then passed to request_key_and_link() -> construct_alloc_key(). * Concurrently to that, a second task similarly makes a DNS query for "abcdef." and its result gets inserted into the DNS resolver cache. * Back on the first task, function construct_alloc_key() first runs __key_link_begin() to determine an assoc_array_edit operation to insert a new key. Index keys in the array are compared exactly as-is, using keyring_compare_object(). The operation finds that "abcdef" is not yet present in the destination keyring. * Function construct_alloc_key() continues and checks if a given key is already present on some keyring by again calling search_process_keyrings_rcu(). This search is done using dns_resolver_cmp() and "abcdef" gets matched with now present key "abcdef.". * The found key is linked on the destination keyring by calling __key_link() and using the previously calculated assoc_array_edit operation. This inserts the "abcdef." key in the array but creates a duplicity because the same index key is already present.
Fix the problem by postponing __key_link_begin() in construct_alloc_key() until an actual key which should be linked into the destination keyring is determined.
[jarkko@kernel.org: added a fixes tag and cc to stable] Cc: stable@vger.kernel.org # v5.3+ Fixes: df593ee23e05 ("keys: Hoist locking out of __key_link_begin()") Signed-off-by: Petr Pavlu petr.pavlu@suse.com Reviewed-by: Joey Lee jlee@suse.com Reviewed-by: Jarkko Sakkinen jarkko@kernel.org Signed-off-by: Jarkko Sakkinen jarkko@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- security/keys/request_key.c | 35 ++++++++++++++++++++++++----------- 1 file changed, 24 insertions(+), 11 deletions(-)
--- a/security/keys/request_key.c +++ b/security/keys/request_key.c @@ -401,17 +401,21 @@ static int construct_alloc_key(struct ke set_bit(KEY_FLAG_USER_CONSTRUCT, &key->flags);
if (dest_keyring) { - ret = __key_link_lock(dest_keyring, &ctx->index_key); + ret = __key_link_lock(dest_keyring, &key->index_key); if (ret < 0) goto link_lock_failed; - ret = __key_link_begin(dest_keyring, &ctx->index_key, &edit); - if (ret < 0) - goto link_prealloc_failed; }
- /* attach the key to the destination keyring under lock, but we do need + /* + * Attach the key to the destination keyring under lock, but we do need * to do another check just in case someone beat us to it whilst we - * waited for locks */ + * waited for locks. + * + * The caller might specify a comparison function which looks for keys + * that do not exactly match but are still equivalent from the caller's + * perspective. The __key_link_begin() operation must be done only after + * an actual key is determined. + */ mutex_lock(&key_construction_mutex);
rcu_read_lock(); @@ -420,12 +424,16 @@ static int construct_alloc_key(struct ke if (!IS_ERR(key_ref)) goto key_already_present;
- if (dest_keyring) + if (dest_keyring) { + ret = __key_link_begin(dest_keyring, &key->index_key, &edit); + if (ret < 0) + goto link_alloc_failed; __key_link(dest_keyring, key, &edit); + }
mutex_unlock(&key_construction_mutex); if (dest_keyring) - __key_link_end(dest_keyring, &ctx->index_key, edit); + __key_link_end(dest_keyring, &key->index_key, edit); mutex_unlock(&user->cons_lock); *_key = key; kleave(" = 0 [%d]", key_serial(key)); @@ -438,10 +446,13 @@ key_already_present: mutex_unlock(&key_construction_mutex); key = key_ref_to_ptr(key_ref); if (dest_keyring) { + ret = __key_link_begin(dest_keyring, &key->index_key, &edit); + if (ret < 0) + goto link_alloc_failed_unlocked; ret = __key_link_check_live_key(dest_keyring, key); if (ret == 0) __key_link(dest_keyring, key, &edit); - __key_link_end(dest_keyring, &ctx->index_key, edit); + __key_link_end(dest_keyring, &key->index_key, edit); if (ret < 0) goto link_check_failed; } @@ -456,8 +467,10 @@ link_check_failed: kleave(" = %d [linkcheck]", ret); return ret;
-link_prealloc_failed: - __key_link_end(dest_keyring, &ctx->index_key, edit); +link_alloc_failed: + mutex_unlock(&key_construction_mutex); +link_alloc_failed_unlocked: + __key_link_end(dest_keyring, &key->index_key, edit); link_lock_failed: mutex_unlock(&user->cons_lock); key_put(key);
From: Miguel Ojeda ojeda@kernel.org
commit 636e348353a7cc52609fdba5ff3270065da140d5 upstream.
Somehow PR_GET_AUXV got added into PR_MCE_KILL's switch when the patch was applied [1].
Thus move it out of the switch, to the place the patch added it.
In the recently released v6.4 kernel some user could, in principle, be already using this feature by mapping the right page and passing the PR_GET_AUXV constant as a pointer:
prctl(PR_MCE_KILL, PR_GET_AUXV, ...)
So this does change the behavior for users. We could keep the bug since the other subcases in PR_MCE_KILL (PR_MCE_KILL_CLEAR and PR_MCE_KILL_SET) do not overlap.
However, v6.4 may be recent enough (2 weeks old) that moving the lines (rather than just adding a new case) does not break anybody? Moreover, the documentation in man-pages was just committed today [2].
Link: https://lkml.kernel.org/r/20230708233344.361854-1-ojeda@kernel.org Fixes: ddc65971bb67 ("prctl: add PR_GET_AUXV to copy auxv to userspace") Link: https://lore.kernel.org/all/d81864a7f7f43bca6afa2a09fc2e850e4050ab42.1680611... [1] Link: https://git.kernel.org/pub/scm/docs/man-pages/man-pages.git/commit/?id=8cf0c... [2] Signed-off-by: Miguel Ojeda ojeda@kernel.org Cc: Josh Triplett josh@joshtriplett.org Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/sys.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-)
--- a/kernel/sys.c +++ b/kernel/sys.c @@ -2529,11 +2529,6 @@ SYSCALL_DEFINE5(prctl, int, option, unsi else return -EINVAL; break; - case PR_GET_AUXV: - if (arg4 || arg5) - return -EINVAL; - error = prctl_get_auxv((void __user *)arg2, arg3); - break; default: return -EINVAL; } @@ -2688,6 +2683,11 @@ SYSCALL_DEFINE5(prctl, int, option, unsi case PR_SET_VMA: error = prctl_set_vma(arg2, arg3, arg4, arg5); break; + case PR_GET_AUXV: + if (arg4 || arg5) + return -EINVAL; + error = prctl_get_auxv((void __user *)arg2, arg3); + break; #ifdef CONFIG_KSM case PR_SET_MEMORY_MERGE: if (arg3 || arg4 || arg5)
From: Georg Müller georgmueller@gmx.net
commit 56cbeacf143530576905623ac72ae0964f3293a6 upstream.
This patch adds a test to validate that 'perf probe' works for binaries where DWARF info is split into multiple CUs
Signed-off-by: Georg Müller georgmueller@gmx.net Acked-by: Masami Hiramatsu (Google) mhiramat@kernel.org Cc: Adrian Hunter adrian.hunter@intel.com Cc: Alexander Shishkin alexander.shishkin@linux.intel.com Cc: Ian Rogers irogers@google.com Cc: Ingo Molnar mingo@redhat.com Cc: Jiri Olsa jolsa@kernel.org Cc: Mark Rutland mark.rutland@arm.com Cc: Namhyung Kim namhyung@kernel.org Cc: Peter Zijlstra peterz@infradead.org Cc: regressions@lists.linux.dev Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230628084551.1860532-5-georgmueller@gmx.net Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- tools/perf/tests/shell/test_uprobe_from_different_cu.sh | 77 ++++++++++++++++ 1 file changed, 77 insertions(+) create mode 100755 tools/perf/tests/shell/test_uprobe_from_different_cu.sh
--- /dev/null +++ b/tools/perf/tests/shell/test_uprobe_from_different_cu.sh @@ -0,0 +1,77 @@ +#!/bin/bash +# test perf probe of function from different CU +# SPDX-License-Identifier: GPL-2.0 + +set -e + +temp_dir=$(mktemp -d /tmp/perf-uprobe-different-cu-sh.XXXXXXXXXX) + +cleanup() +{ + trap - EXIT TERM INT + if [[ "${temp_dir}" =~ ^/tmp/perf-uprobe-different-cu-sh.*$ ]]; then + echo "--- Cleaning up ---" + perf probe -x ${temp_dir}/testfile -d foo + rm -f "${temp_dir}/"* + rmdir "${temp_dir}" + fi +} + +trap_cleanup() +{ + cleanup + exit 1 +} + +trap trap_cleanup EXIT TERM INT + +cat > ${temp_dir}/testfile-foo.h << EOF +struct t +{ + int *p; + int c; +}; + +extern int foo (int i, struct t *t); +EOF + +cat > ${temp_dir}/testfile-foo.c << EOF +#include "testfile-foo.h" + +int +foo (int i, struct t *t) +{ + int j, res = 0; + for (j = 0; j < i && j < t->c; j++) + res += t->p[j]; + + return res; +} +EOF + +cat > ${temp_dir}/testfile-main.c << EOF +#include "testfile-foo.h" + +static struct t g; + +int +main (int argc, char **argv) +{ + int i; + int j[argc]; + g.c = argc; + g.p = j; + for (i = 0; i < argc; i++) + j[i] = (int) argv[i][0]; + return foo (3, &g); +} +EOF + +gcc -g -Og -flto -c ${temp_dir}/testfile-foo.c -o ${temp_dir}/testfile-foo.o +gcc -g -Og -c ${temp_dir}/testfile-main.c -o ${temp_dir}/testfile-main.o +gcc -g -Og -o ${temp_dir}/testfile ${temp_dir}/testfile-foo.o ${temp_dir}/testfile-main.o + +perf probe -x ${temp_dir}/testfile --funcs foo +perf probe -x ${temp_dir}/testfile foo + +cleanup
From: Georg Müller georgmueller@gmx.net
commit c66e1c68c13b872505f25ab641c44b77313ee7fe upstream.
After switching from dwarf_decl_file() to die_get_decl_file(), it is not possible to add probes for certain functions:
$ perf probe -x /usr/lib/systemd/systemd-logind match_unit_removed A function DIE doesn't have decl_line. Maybe broken DWARF? A function DIE doesn't have decl_line. Maybe broken DWARF? Probe point 'match_unit_removed' not found. Error: Failed to add events.
The problem is that die_get_decl_file() uses the wrong CU to search for the file. elfutils commit e1db5cdc9f has some good explanation for this:
dwarf_decl_file uses dwarf_attr_integrate to get the DW_AT_decl_file attribute. This means the attribute might come from a different DIE in a different CU. If so, we need to use the CU associated with the attribute, not the original DIE, to resolve the file name.
This patch uses the same source of information as elfutils: use attribute DW_AT_decl_file and use this CU to search for the file.
Fixes: dc9a5d2ccd5c823c ("perf probe: Fix to get declared file name from clang DWARF5") Signed-off-by: Georg Müller georgmueller@gmx.net Acked-by: Masami Hiramatsu (Google) mhiramat@kernel.org Cc: Adrian Hunter adrian.hunter@intel.com Cc: Alexander Shishkin alexander.shishkin@linux.intel.com Cc: Ian Rogers irogers@google.com Cc: Ingo Molnar mingo@redhat.com Cc: Jiri Olsa jolsa@kernel.org Cc: Mark Rutland mark.rutland@arm.com Cc: Namhyung Kim namhyung@kernel.org Cc: Peter Zijlstra peterz@infradead.org Cc: regressions@lists.linux.dev Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230628084551.1860532-6-georgmueller@gmx.net Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- tools/perf/util/dwarf-aux.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
--- a/tools/perf/util/dwarf-aux.c +++ b/tools/perf/util/dwarf-aux.c @@ -478,8 +478,10 @@ static const char *die_get_file_name(Dwa { Dwarf_Die cu_die; Dwarf_Files *files; + Dwarf_Attribute attr_mem;
- if (idx < 0 || !dwarf_diecu(dw_die, &cu_die, NULL, NULL) || + if (idx < 0 || !dwarf_attr_integrate(dw_die, DW_AT_decl_file, &attr_mem) || + !dwarf_cu_die(attr_mem.cu, &cu_die, NULL, NULL, NULL, NULL, NULL, NULL) || dwarf_getsrcfiles(&cu_die, &files, NULL) != 0) return NULL;
From: Filipe Manana fdmanana@suse.com
commit cbaee87f2ef628c10331b69a2f3def6bc32402d7 upstream.
At btrfs_orphan_cleanup(), if we can't find an inode (btrfs_iget() returns an -ENOENT error pointer), we proceed with 'ret' set to -ENOENT and the inode pointer set to ERR_PTR(-ENOENT). Later when we proceed to the body of the following if statement:
if (ret == -ENOENT || inode->i_nlink) { (...) trans = btrfs_start_transaction(root, 1); if (IS_ERR(trans)) { ret = PTR_ERR(trans); iput(inode); goto out; } (...) ret = btrfs_del_orphan_item(trans, root, found_key.objectid); btrfs_end_transaction(trans); if (ret) { iput(inode); goto out; } continue; }
If we get an error from btrfs_start_transaction() or from the call to btrfs_del_orphan_item() we end calling iput() against an inode pointer that has a value of ERR_PTR(-ENOENT), resulting in a crash with the following trace:
[876.667] BUG: kernel NULL pointer dereference, address: 0000000000000096 [876.667] #PF: supervisor read access in kernel mode [876.667] #PF: error_code(0x0000) - not-present page [876.667] PGD 0 P4D 0 [876.668] Oops: 0000 [#1] PREEMPT SMP PTI [876.668] CPU: 0 PID: 2356187 Comm: mount Tainted: G W 6.4.0-rc6-btrfs-next-134+ #1 [876.668] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.2-0-gea1b7a073390-prebuilt.qemu.org 04/01/2014 [876.668] RIP: 0010:iput+0xa/0x20 [876.668] Code: ff ff ff 66 (...) [876.669] RSP: 0018:ffffafa9c0c9f9d0 EFLAGS: 00010282 [876.669] RAX: ffffffffffffffe4 RBX: 000000000009453b RCX: 0000000000000000 [876.669] RDX: 0000000000000001 RSI: ffffafa9c0c9f930 RDI: fffffffffffffffe [876.669] RBP: ffff95c612f3b800 R08: 0000000000000001 R09: ffffffffffffffe4 [876.670] R10: 00018f2a71010000 R11: 000000000ead96e3 R12: ffff95cb7d6909a0 [876.670] R13: fffffffffffffffe R14: ffff95c60f477000 R15: 00000000ffffffe4 [876.670] FS: 00007f5fbe30a840(0000) GS:ffff95ccdfa00000(0000) knlGS:0000000000000000 [876.670] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [876.671] CR2: 0000000000000096 CR3: 000000055e9f6004 CR4: 0000000000370ef0 [876.671] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [876.671] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [876.672] Call Trace: [876.744] <TASK> [876.744] ? __die_body+0x1b/0x60 [876.744] ? page_fault_oops+0x15d/0x450 [876.745] ? __kmem_cache_alloc_node+0x47/0x410 [876.745] ? do_user_addr_fault+0x65/0x8a0 [876.745] ? exc_page_fault+0x74/0x170 [876.746] ? asm_exc_page_fault+0x22/0x30 [876.746] ? iput+0xa/0x20 [876.746] btrfs_orphan_cleanup+0x221/0x330 [btrfs] [876.746] btrfs_lookup_dentry+0x58f/0x5f0 [btrfs] [876.747] btrfs_lookup+0xe/0x30 [btrfs] [876.747] __lookup_slow+0x82/0x130 [876.785] walk_component+0xe5/0x160 [876.786] path_lookupat.isra.0+0x6e/0x150 [876.786] filename_lookup+0xcf/0x1a0 [876.786] ? mod_objcg_state+0xd2/0x360 [876.786] ? obj_cgroup_charge+0xf5/0x110 [876.787] ? should_failslab+0xa/0x20 [876.787] ? kmem_cache_alloc+0x47/0x450 [876.787] vfs_path_lookup+0x51/0x90 [876.788] mount_subtree+0x8d/0x130 [876.788] btrfs_mount+0x149/0x410 [btrfs] [876.788] ? __kmem_cache_alloc_node+0x47/0x410 [876.788] ? vfs_parse_fs_param+0xc0/0x110 [876.789] legacy_get_tree+0x24/0x50 [876.834] vfs_get_tree+0x22/0xd0 [876.852] path_mount+0x2d8/0x9c0 [876.852] do_mount+0x79/0x90 [876.852] __x64_sys_mount+0x8e/0xd0 [876.853] do_syscall_64+0x38/0x90 [876.899] entry_SYSCALL_64_after_hwframe+0x72/0xdc [876.958] RIP: 0033:0x7f5fbe50b76a [876.959] Code: 48 8b 0d a9 (...) [876.959] RSP: 002b:00007fff01925798 EFLAGS: 00000246 ORIG_RAX: 00000000000000a5 [876.959] RAX: ffffffffffffffda RBX: 00007f5fbe694264 RCX: 00007f5fbe50b76a [876.960] RDX: 0000561bde6c8720 RSI: 0000561bde6bdec0 RDI: 0000561bde6c31a0 [876.960] RBP: 0000561bde6bdc70 R08: 0000000000000000 R09: 0000000000000001 [876.960] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 [876.960] R13: 0000561bde6c31a0 R14: 0000561bde6c8720 R15: 0000561bde6bdc70 [876.960] </TASK>
So fix this by setting 'inode' to NULL whenever we get an error from btrfs_iget(), and to make the code simpler, stop testing for 'ret' being -ENOENT to check if we have an inode - instead test for 'inode' being NULL or not. Having a NULL 'inode' prevents any iput() call from crashing, as iput() ignores NULL inode pointers. Also, stop testing for a NULL return value from btrfs_iget() with PTR_ERR_OR_ZERO(), because btrfs_iget() never returns NULL - in case an inode is not found, it returns ERR_PTR(-ENOENT), and in case of memory allocation failure, it returns ERR_PTR(-ENOMEM). We also don't need the extra iput() calls on the error branches for the btrfs_start_transaction() and btrfs_del_orphan_item() calls, as we have already called iput() before, so remove them.
Fixes: a13bb2c03848 ("btrfs: add missing iputs on orphan cleanup failure") CC: stable@vger.kernel.org # 6.4 Signed-off-by: Filipe Manana fdmanana@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/inode.c | 20 ++++++++++---------- 1 file changed, 10 insertions(+), 10 deletions(-)
--- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -3546,11 +3546,14 @@ int btrfs_orphan_cleanup(struct btrfs_ro found_key.type = BTRFS_INODE_ITEM_KEY; found_key.offset = 0; inode = btrfs_iget(fs_info->sb, last_objectid, root); - ret = PTR_ERR_OR_ZERO(inode); - if (ret && ret != -ENOENT) - goto out; + if (IS_ERR(inode)) { + ret = PTR_ERR(inode); + inode = NULL; + if (ret != -ENOENT) + goto out; + }
- if (ret == -ENOENT && root == fs_info->tree_root) { + if (!inode && root == fs_info->tree_root) { struct btrfs_root *dead_root; int is_dead_root = 0;
@@ -3611,8 +3614,8 @@ int btrfs_orphan_cleanup(struct btrfs_ro * deleted but wasn't. The inode number may have been reused, * but either way, we can delete the orphan item. */ - if (ret == -ENOENT || inode->i_nlink) { - if (!ret) { + if (!inode || inode->i_nlink) { + if (inode) { ret = btrfs_drop_verity_items(BTRFS_I(inode)); iput(inode); if (ret) @@ -3621,7 +3624,6 @@ int btrfs_orphan_cleanup(struct btrfs_ro trans = btrfs_start_transaction(root, 1); if (IS_ERR(trans)) { ret = PTR_ERR(trans); - iput(inode); goto out; } btrfs_debug(fs_info, "auto deleting %Lu", @@ -3629,10 +3631,8 @@ int btrfs_orphan_cleanup(struct btrfs_ro ret = btrfs_del_orphan_item(trans, root, found_key.objectid); btrfs_end_transaction(trans); - if (ret) { - iput(inode); + if (ret) goto out; - } continue; }
From: Filipe Manana fdmanana@suse.com
commit aa84ce8a78a1a5c10cdf9c7a5fb0c999fbc2c8d6 upstream.
If we have a transaction abort with qgroups enabled we get a warning triggered when doing the final put on the transaction, like this:
[552.6789] ------------[ cut here ]------------ [552.6815] WARNING: CPU: 4 PID: 81745 at fs/btrfs/transaction.c:144 btrfs_put_transaction+0x123/0x130 [btrfs] [552.6817] Modules linked in: btrfs blake2b_generic xor (...) [552.6819] CPU: 4 PID: 81745 Comm: btrfs-transacti Tainted: G W 6.4.0-rc6-btrfs-next-134+ #1 [552.6819] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.2-0-gea1b7a073390-prebuilt.qemu.org 04/01/2014 [552.6819] RIP: 0010:btrfs_put_transaction+0x123/0x130 [btrfs] [552.6821] Code: bd a0 01 00 (...) [552.6821] RSP: 0018:ffffa168c0527e28 EFLAGS: 00010286 [552.6821] RAX: ffff936042caed00 RBX: ffff93604a3eb448 RCX: 0000000000000000 [552.6821] RDX: ffff93606421b028 RSI: ffffffff92ff0878 RDI: ffff93606421b010 [552.6821] RBP: ffff93606421b000 R08: 0000000000000000 R09: ffffa168c0d07c20 [552.6821] R10: 0000000000000000 R11: ffff93608dc52950 R12: ffffa168c0527e70 [552.6821] R13: ffff93606421b000 R14: ffff93604a3eb420 R15: ffff93606421b028 [552.6821] FS: 0000000000000000(0000) GS:ffff93675fb00000(0000) knlGS:0000000000000000 [552.6821] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [552.6821] CR2: 0000558ad262b000 CR3: 000000014feda005 CR4: 0000000000370ee0 [552.6822] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [552.6822] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [552.6822] Call Trace: [552.6822] <TASK> [552.6822] ? __warn+0x80/0x130 [552.6822] ? btrfs_put_transaction+0x123/0x130 [btrfs] [552.6824] ? report_bug+0x1f4/0x200 [552.6824] ? handle_bug+0x42/0x70 [552.6824] ? exc_invalid_op+0x14/0x70 [552.6824] ? asm_exc_invalid_op+0x16/0x20 [552.6824] ? btrfs_put_transaction+0x123/0x130 [btrfs] [552.6826] btrfs_cleanup_transaction+0xe7/0x5e0 [btrfs] [552.6828] ? _raw_spin_unlock_irqrestore+0x23/0x40 [552.6828] ? try_to_wake_up+0x94/0x5e0 [552.6828] ? __pfx_process_timeout+0x10/0x10 [552.6828] transaction_kthread+0x103/0x1d0 [btrfs] [552.6830] ? __pfx_transaction_kthread+0x10/0x10 [btrfs] [552.6832] kthread+0xee/0x120 [552.6832] ? __pfx_kthread+0x10/0x10 [552.6832] ret_from_fork+0x29/0x50 [552.6832] </TASK> [552.6832] ---[ end trace 0000000000000000 ]---
This corresponds to this line of code:
void btrfs_put_transaction(struct btrfs_transaction *transaction) { (...) WARN_ON(!RB_EMPTY_ROOT( &transaction->delayed_refs.dirty_extent_root)); (...) }
The warning happens because btrfs_qgroup_destroy_extent_records(), called in the transaction abort path, we free all entries from the rbtree "dirty_extent_root" with rbtree_postorder_for_each_entry_safe(), but we don't actually empty the rbtree - it's still pointing to nodes that were freed.
So set the rbtree's root node to NULL to avoid this warning (assign RB_ROOT).
Fixes: 81f7eb00ff5b ("btrfs: destroy qgroup extent records on transaction abort") CC: stable@vger.kernel.org # 5.10+ Reviewed-by: Josef Bacik josef@toxicpanda.com Reviewed-by: Qu Wenruo wqu@suse.com Signed-off-by: Filipe Manana fdmanana@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/qgroup.c | 1 + 1 file changed, 1 insertion(+)
--- a/fs/btrfs/qgroup.c +++ b/fs/btrfs/qgroup.c @@ -4433,4 +4433,5 @@ void btrfs_qgroup_destroy_extent_records ulist_free(entry->old_roots); kfree(entry); } + *root = RB_ROOT; }
From: Miklos Szeredi mszeredi@redhat.com
commit a9d1c4c6df0e568207907c04aed9e7beb1294c42 upstream.
If the LOOKUP request triggered from fuse_dentry_revalidate() is interrupted, then the dentry will be invalidated, possibly resulting in submounts being unmounted.
Reported-by: Xu Rongbo xurongbo@baidu.com Closes: https://lore.kernel.org/all/CAJfpegswN_CJJ6C3RZiaK6rpFmNyWmXfaEpnQUJ42KCwNF5... Fixes: 9e6268db496a ("[PATCH] FUSE - read-write operations") Cc: stable@vger.kernel.org Signed-off-by: Miklos Szeredi mszeredi@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/fuse/dir.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/fs/fuse/dir.c +++ b/fs/fuse/dir.c @@ -258,7 +258,7 @@ static int fuse_dentry_revalidate(struct spin_unlock(&fi->lock); } kfree(forget); - if (ret == -ENOMEM) + if (ret == -ENOMEM || ret == -EINTR) goto out; if (ret || fuse_invalid_attr(&outarg.attr) || fuse_stale_inode(inode, outarg.generation, &outarg.attr))
From: Miklos Szeredi mszeredi@redhat.com
commit 5cadfbd5a11e5495cac217534c5f788168b1afd7 upstream.
Add an init flag idicating whether the FUSE_EXPIRE_ONLY flag of FUSE_NOTIFY_INVAL_ENTRY is effective.
This is needed for backports of this feature, otherwise the server could just check the protocol version.
Fixes: 4f8d37020e1f ("fuse: add "expire only" mode to FUSE_NOTIFY_INVAL_ENTRY") Cc: stable@vger.kernel.org # v6.2 Signed-off-by: Miklos Szeredi mszeredi@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/fuse/inode.c | 3 ++- include/uapi/linux/fuse.h | 3 +++ 2 files changed, 5 insertions(+), 1 deletion(-)
--- a/fs/fuse/inode.c +++ b/fs/fuse/inode.c @@ -1254,7 +1254,8 @@ void fuse_send_init(struct fuse_mount *f FUSE_ABORT_ERROR | FUSE_MAX_PAGES | FUSE_CACHE_SYMLINKS | FUSE_NO_OPENDIR_SUPPORT | FUSE_EXPLICIT_INVAL_DATA | FUSE_HANDLE_KILLPRIV_V2 | FUSE_SETXATTR_EXT | FUSE_INIT_EXT | - FUSE_SECURITY_CTX | FUSE_CREATE_SUPP_GROUP; + FUSE_SECURITY_CTX | FUSE_CREATE_SUPP_GROUP | + FUSE_HAS_EXPIRE_ONLY; #ifdef CONFIG_FUSE_DAX if (fm->fc->dax) flags |= FUSE_MAP_ALIGNMENT; --- a/include/uapi/linux/fuse.h +++ b/include/uapi/linux/fuse.h @@ -206,6 +206,7 @@ * - add extension header * - add FUSE_EXT_GROUPS * - add FUSE_CREATE_SUPP_GROUP + * - add FUSE_HAS_EXPIRE_ONLY */
#ifndef _LINUX_FUSE_H @@ -369,6 +370,7 @@ struct fuse_file_lock { * FUSE_HAS_INODE_DAX: use per inode DAX * FUSE_CREATE_SUPP_GROUP: add supplementary group info to create, mkdir, * symlink and mknod (single group that matches parent) + * FUSE_HAS_EXPIRE_ONLY: kernel supports expiry-only entry invalidation */ #define FUSE_ASYNC_READ (1 << 0) #define FUSE_POSIX_LOCKS (1 << 1) @@ -406,6 +408,7 @@ struct fuse_file_lock { #define FUSE_SECURITY_CTX (1ULL << 32) #define FUSE_HAS_INODE_DAX (1ULL << 33) #define FUSE_CREATE_SUPP_GROUP (1ULL << 34) +#define FUSE_HAS_EXPIRE_ONLY (1ULL << 35)
/** * CUSE INIT request/reply flags
From: Bernd Schubert bschubert@ddn.com
commit 3066ff93476c35679cb07a97cce37d9bb07632ff upstream.
This is just a safety precaution to avoid checking flags on memory that was initialized on the user space side. libfuse zeroes struct fuse_init_out outarg, but this is not guranteed to be done in all implementations. Better is to act on flags and to only apply flags2 when FUSE_INIT_EXT is set.
There is a risk with this change, though - it might break existing user space libraries, which are already using flags2 without setting FUSE_INIT_EXT.
The corresponding libfuse patch is here https://github.com/libfuse/libfuse/pull/662
Signed-off-by: Bernd Schubert bschubert@ddn.com Fixes: 53db28933e95 ("fuse: extend init flags") Cc: stable@vger.kernel.org # v5.17 Signed-off-by: Miklos Szeredi mszeredi@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/fuse/inode.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
--- a/fs/fuse/inode.c +++ b/fs/fuse/inode.c @@ -1134,7 +1134,10 @@ static void process_init_reply(struct fu process_init_limits(fc, arg);
if (arg->minor >= 6) { - u64 flags = arg->flags | (u64) arg->flags2 << 32; + u64 flags = arg->flags; + + if (flags & FUSE_INIT_EXT) + flags |= (u64) arg->flags2 << 32;
ra_pages = arg->max_readahead / PAGE_SIZE; if (flags & FUSE_ASYNC_READ)
From: Qu Wenruo wqu@suse.com
commit 486c737f7fdc0c3f6464cf27ede811daec2769a1 upstream.
[REGRESSION] Commit 75b470332965 ("btrfs: raid56: migrate recovery and scrub recovery path to use error_bitmap") changed the behavior of scrub_rbio().
Initially if we have no error reading the raid bio, we will assign @need_check to true, then finish_parity_scrub() would later verify the content of P/Q stripes before writeback.
But after that commit we never verify the content of P/Q stripes and just writeback them.
This can lead to unrepaired P/Q stripes during scrub, or already corrupted P/Q copied to the dev-replace target.
[FIX] The situation is more complex than the regression, in fact the initial behavior is not 100% correct either.
If we have the following rare case, it can still lead to the same problem using the old behavior:
0 16K 32K 48K 64K Data 1: |IIIIIII| | Data 2: | | Parity: | |CCCCCCC| |
Where "I" means IO error, "C" means corruption.
In the above case, we're scrubbing the parity stripe, then read out all the contents of Data 1, Data 2, Parity stripes.
But found IO error in Data 1, which leads to rebuild using Data 2 and Parity and got the correct data.
In that case, we would not verify if the Parity is correct for range [16K, 32K).
So here we have to always verify the content of Parity no matter if we did recovery or not.
This patch would remove the @need_check parameter of finish_parity_scrub() completely, and would always do the P/Q verification before writeback.
Fixes: 75b470332965 ("btrfs: raid56: migrate recovery and scrub recovery path to use error_bitmap") CC: stable@vger.kernel.org # 6.2+ Signed-off-by: Qu Wenruo wqu@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/raid56.c | 11 +++-------- 1 file changed, 3 insertions(+), 8 deletions(-)
--- a/fs/btrfs/raid56.c +++ b/fs/btrfs/raid56.c @@ -71,7 +71,7 @@ static void rmw_rbio_work_locked(struct static void index_rbio_pages(struct btrfs_raid_bio *rbio); static int alloc_rbio_pages(struct btrfs_raid_bio *rbio);
-static int finish_parity_scrub(struct btrfs_raid_bio *rbio, int need_check); +static int finish_parity_scrub(struct btrfs_raid_bio *rbio); static void scrub_rbio_work_locked(struct work_struct *work);
static void free_raid_bio_pointers(struct btrfs_raid_bio *rbio) @@ -2404,7 +2404,7 @@ static int alloc_rbio_essential_pages(st return 0; }
-static int finish_parity_scrub(struct btrfs_raid_bio *rbio, int need_check) +static int finish_parity_scrub(struct btrfs_raid_bio *rbio) { struct btrfs_io_context *bioc = rbio->bioc; const u32 sectorsize = bioc->fs_info->sectorsize; @@ -2445,9 +2445,6 @@ static int finish_parity_scrub(struct bt */ clear_bit(RBIO_CACHE_READY_BIT, &rbio->flags);
- if (!need_check) - goto writeback; - p_sector.page = alloc_page(GFP_NOFS); if (!p_sector.page) return -ENOMEM; @@ -2516,7 +2513,6 @@ static int finish_parity_scrub(struct bt q_sector.page = NULL; }
-writeback: /* * time to start writing. Make bios for everything from the * higher layers (the bio_list in our rbio) and our p/q. Ignore @@ -2699,7 +2695,6 @@ static int scrub_assemble_read_bios(stru
static void scrub_rbio(struct btrfs_raid_bio *rbio) { - bool need_check = false; int sector_nr; int ret;
@@ -2722,7 +2717,7 @@ static void scrub_rbio(struct btrfs_raid * We have every sector properly prepared. Can finish the scrub * and writeback the good content. */ - ret = finish_parity_scrub(rbio, need_check); + ret = finish_parity_scrub(rbio); wait_event(rbio->io_wait, atomic_read(&rbio->stripes_pending) == 0); for (sector_nr = 0; sector_nr < rbio->stripe_nsectors; sector_nr++) { int found_errors;
From: Josef Bacik josef@toxicpanda.com
commit 17b17fcd6d446b95904a6929c40012ee7f0afc0c upstream.
While trying to get the subpage blocksize tests running, I hit the following panic on generic/476
assertion failed: PagePrivate(page) && page->private, in fs/btrfs/subpage.c:229 kernel BUG at fs/btrfs/subpage.c:229! Internal error: Oops - BUG: 00000000f2000800 [#1] SMP CPU: 1 PID: 1453 Comm: fsstress Not tainted 6.4.0-rc7+ #12 Hardware name: QEMU KVM Virtual Machine, BIOS edk2-20230301gitf80f052277c8-26.fc38 03/01/2023 pstate: 61400005 (nZCv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--) pc : btrfs_subpage_assert+0xbc/0xf0 lr : btrfs_subpage_assert+0xbc/0xf0 Call trace: btrfs_subpage_assert+0xbc/0xf0 btrfs_subpage_clear_checked+0x38/0xc0 btrfs_page_clear_checked+0x48/0x98 btrfs_truncate_block+0x5d0/0x6a8 btrfs_cont_expand+0x5c/0x528 btrfs_write_check.isra.0+0xf8/0x150 btrfs_buffered_write+0xb4/0x760 btrfs_do_write_iter+0x2f8/0x4b0 btrfs_file_write_iter+0x1c/0x30 do_iter_readv_writev+0xc8/0x158 do_iter_write+0x9c/0x210 vfs_iter_write+0x24/0x40 iter_file_splice_write+0x224/0x390 direct_splice_actor+0x38/0x68 splice_direct_to_actor+0x12c/0x260 do_splice_direct+0x90/0xe8 generic_copy_file_range+0x50/0x90 vfs_copy_file_range+0x29c/0x470 __arm64_sys_copy_file_range+0xcc/0x498 invoke_syscall.constprop.0+0x80/0xd8 do_el0_svc+0x6c/0x168 el0_svc+0x50/0x1b0 el0t_64_sync_handler+0x114/0x120 el0t_64_sync+0x194/0x198
This happens because during btrfs_cont_expand we'll get a page, set it as mapped, and if it's not Uptodate we'll read it. However between the read and re-locking the page we could have called release_folio() on the page, but left the page in the file mapping. release_folio() can clear the page private, and thus further down we blow up when we go to modify the subpage bits.
Fix this by putting the set_page_extent_mapped() after the read. This is safe because read_folio() will call set_page_extent_mapped() before it does the read, and then if we clear page private but leave it on the mapping we're completely safe re-setting set_page_extent_mapped(). With this patch I can now run generic/476 without panicing.
CC: stable@vger.kernel.org # 6.1+ Reviewed-by: Christoph Hellwig hch@lst.de Signed-off-by: Josef Bacik josef@toxicpanda.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/inode.c | 14 +++++++++++--- 1 file changed, 11 insertions(+), 3 deletions(-)
--- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -4734,9 +4734,6 @@ again: ret = -ENOMEM; goto out; } - ret = set_page_extent_mapped(page); - if (ret < 0) - goto out_unlock;
if (!PageUptodate(page)) { ret = btrfs_read_folio(NULL, page_folio(page)); @@ -4751,6 +4748,17 @@ again: goto out_unlock; } } + + /* + * We unlock the page after the io is completed and then re-lock it + * above. release_folio() could have come in between that and cleared + * PagePrivate(), but left the page in the mapping. Set the page mapped + * here to make sure it's properly set for the subpage stuff. + */ + ret = set_page_extent_mapped(page); + if (ret < 0) + goto out_unlock; + wait_on_page_writeback(page);
lock_extent(io_tree, block_start, block_end, &cached_state);
From: Filipe Manana fdmanana@suse.com
commit b777d279ff31979add57e8a3f810bceb7ef0cfb7 upstream.
At btrfs_orphan_cleanup(), if we were able to find the inode, we do an iput() on the inode, then if btrfs_drop_verity_items() succeeds and then either btrfs_start_transaction() or btrfs_del_orphan_item() fail, we do another iput() in the respective error paths, resulting in an extra iput() on the inode.
Fix this by setting inode to NULL after the first iput(), as iput() ignores a NULL inode pointer argument.
Fixes: a13bb2c03848 ("btrfs: add missing iputs on orphan cleanup failure") CC: stable@vger.kernel.org # 6.4 Reviewed-by: Boris Burkov boris@bur.io Signed-off-by: Filipe Manana fdmanana@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/inode.c | 1 + 1 file changed, 1 insertion(+)
--- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -3618,6 +3618,7 @@ int btrfs_orphan_cleanup(struct btrfs_ro if (inode) { ret = btrfs_drop_verity_items(BTRFS_I(inode)); iput(inode); + inode = NULL; if (ret) goto out; }
From: Filipe Manana fdmanana@suse.com
commit f1a07c2b4e2c473ec322b8b9ece071b8c88a3512 upstream.
At exclude_super_stripes(), if we happen to find a block group that has super blocks mapped to it and we are on a zoned filesystem, we error out as this is not supposed to happen, indicating either a bug or maybe some memory corruption for example. However we are exiting the function without freeing the memory allocated for the logical address of the super blocks. Fix this by freeing the logical address.
Fixes: 12659251ca5d ("btrfs: implement log-structured superblock for ZONED mode") CC: stable@vger.kernel.org # 5.10+ Reviewed-by: Johannes Thumshirn johannes.thumshirn@wdc.com Reviewed-by: Anand Jain anand.jain@oracle.com Signed-off-by: Filipe Manana fdmanana@suse.com Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/block-group.c | 1 + 1 file changed, 1 insertion(+)
--- a/fs/btrfs/block-group.c +++ b/fs/btrfs/block-group.c @@ -2084,6 +2084,7 @@ static int exclude_super_stripes(struct
/* Shouldn't have super stripes in sequential zones */ if (zoned && nr) { + kfree(logical); btrfs_err(fs_info, "zoned: block group %llu must not contain super block", cache->start);
From: Miklos Szeredi mszeredi@redhat.com
commit 6a567e920fd0451bf29abc418df96c3365925770 upstream.
Fuse shouldn't return ENOSYS from its ioctl implementation. If userspace responds with ENOSYS it should be translated to ENOTTY.
There are two ways to return an error from the IOCTL request:
- fuse_out_header.error - fuse_ioctl_out.result
Commit 02c0cab8e734 ("fuse: ioctl: translate ENOSYS") already fixed this issue for the first case, but missed the second case. This patch fixes the second case.
Reported-by: Jonathan Katz jkatz@eitmlabs.org Closes: https://lore.kernel.org/all/CALKgVmcC1VUV_gJVq70n--omMJZUb4HSh_FqvLTHgNBc+HC... Fixes: 02c0cab8e734 ("fuse: ioctl: translate ENOSYS") Cc: stable@vger.kernel.org Signed-off-by: Miklos Szeredi mszeredi@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/fuse/ioctl.c | 21 +++++++++++++-------- 1 file changed, 13 insertions(+), 8 deletions(-)
--- a/fs/fuse/ioctl.c +++ b/fs/fuse/ioctl.c @@ -9,14 +9,23 @@ #include <linux/compat.h> #include <linux/fileattr.h>
-static ssize_t fuse_send_ioctl(struct fuse_mount *fm, struct fuse_args *args) +static ssize_t fuse_send_ioctl(struct fuse_mount *fm, struct fuse_args *args, + struct fuse_ioctl_out *outarg) { - ssize_t ret = fuse_simple_request(fm, args); + ssize_t ret; + + args->out_args[0].size = sizeof(*outarg); + args->out_args[0].value = outarg; + + ret = fuse_simple_request(fm, args);
/* Translate ENOSYS, which shouldn't be returned from fs */ if (ret == -ENOSYS) ret = -ENOTTY;
+ if (ret >= 0 && outarg->result == -ENOSYS) + outarg->result = -ENOTTY; + return ret; }
@@ -264,13 +273,11 @@ long fuse_do_ioctl(struct file *file, un }
ap.args.out_numargs = 2; - ap.args.out_args[0].size = sizeof(outarg); - ap.args.out_args[0].value = &outarg; ap.args.out_args[1].size = out_size; ap.args.out_pages = true; ap.args.out_argvar = true;
- transferred = fuse_send_ioctl(fm, &ap.args); + transferred = fuse_send_ioctl(fm, &ap.args, &outarg); err = transferred; if (transferred < 0) goto out; @@ -399,12 +406,10 @@ static int fuse_priv_ioctl(struct inode args.in_args[1].size = inarg.in_size; args.in_args[1].value = ptr; args.out_numargs = 2; - args.out_args[0].size = sizeof(outarg); - args.out_args[0].value = &outarg; args.out_args[1].size = inarg.out_size; args.out_args[1].value = ptr;
- err = fuse_send_ioctl(fm, &args); + err = fuse_send_ioctl(fm, &args, &outarg); if (!err) { if (outarg.result < 0) err = outarg.result;
From: Josef Bacik josef@toxicpanda.com
commit b19c98f237cd76981aaded52c258ce93f7daa8cb upstream.
Syzbot reported a panic that looks like this:
assertion failed: fs_info->exclusive_operation == BTRFS_EXCLOP_BALANCE_PAUSED, in fs/btrfs/ioctl.c:465 ------------[ cut here ]------------ kernel BUG at fs/btrfs/messages.c:259! RIP: 0010:btrfs_assertfail+0x2c/0x30 fs/btrfs/messages.c:259 Call Trace: <TASK> btrfs_exclop_balance fs/btrfs/ioctl.c:465 [inline] btrfs_ioctl_balance fs/btrfs/ioctl.c:3564 [inline] btrfs_ioctl+0x531e/0x5b30 fs/btrfs/ioctl.c:4632 vfs_ioctl fs/ioctl.c:51 [inline] __do_sys_ioctl fs/ioctl.c:870 [inline] __se_sys_ioctl fs/ioctl.c:856 [inline] __x64_sys_ioctl+0x197/0x210 fs/ioctl.c:856 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd
The reproducer is running a balance and a cancel or pause in parallel. The way balance finishes is a bit wonky, if we were paused we need to save the balance_ctl in the fs_info, but clear it otherwise and cleanup. However we rely on the return values being specific errors, or having a cancel request or no pause request. If balance completes and returns 0, but we have a pause or cancel request we won't do the appropriate cleanup, and then the next time we try to start a balance we'll trip this ASSERT.
The error handling is just wrong here, we always want to clean up, unless we got -ECANCELLED and we set the appropriate pause flag in the exclusive op. With this patch the reproducer ran for an hour without tripping, previously it would trip in less than a few minutes.
Reported-by: syzbot+c0f3acf145cb465426d5@syzkaller.appspotmail.com CC: stable@vger.kernel.org # 6.1+ Signed-off-by: Josef Bacik josef@toxicpanda.com Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/volumes.c | 14 ++++---------- 1 file changed, 4 insertions(+), 10 deletions(-)
--- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -4071,14 +4071,6 @@ static int alloc_profile_is_valid(u64 fl return has_single_bit_set(flags); }
-static inline int balance_need_close(struct btrfs_fs_info *fs_info) -{ - /* cancel requested || normal exit path */ - return atomic_read(&fs_info->balance_cancel_req) || - (atomic_read(&fs_info->balance_pause_req) == 0 && - atomic_read(&fs_info->balance_cancel_req) == 0); -} - /* * Validate target profile against allowed profiles and return true if it's OK. * Otherwise print the error message and return false. @@ -4268,6 +4260,7 @@ int btrfs_balance(struct btrfs_fs_info * u64 num_devices; unsigned seq; bool reducing_redundancy; + bool paused = false; int i;
if (btrfs_fs_closing(fs_info) || @@ -4398,6 +4391,7 @@ int btrfs_balance(struct btrfs_fs_info * if (ret == -ECANCELED && atomic_read(&fs_info->balance_pause_req)) { btrfs_info(fs_info, "balance: paused"); btrfs_exclop_balance(fs_info, BTRFS_EXCLOP_BALANCE_PAUSED); + paused = true; } /* * Balance can be canceled by: @@ -4426,8 +4420,8 @@ int btrfs_balance(struct btrfs_fs_info * btrfs_update_ioctl_balance_args(fs_info, bargs); }
- if ((ret && ret != -ECANCELED && ret != -ENOSPC) || - balance_need_close(fs_info)) { + /* We didn't pause, we can clean everything up. */ + if (!paused) { reset_balance_state(fs_info); btrfs_exclop_finish(fs_info); }
From: Matthieu Baerts matthieu.baerts@tessares.net
commit fda05798c22a354efde09a76bdfc276b2d591829 upstream.
When looking for something else in LKFT reports [1], I noticed that the TC selftest ended with a timeout error:
not ok 1 selftests: tc-testing: tdc.sh # TIMEOUT 45 seconds
The timeout had been introduced 3 years ago, see the Fixes commit below.
This timeout is only in place when executing the selftests via the kselftests runner scripts. I guess this is not what most TC devs are using and nobody noticed the issue before.
The new timeout is set to 15 minutes as suggested by Pedro [2]. It looks like it is plenty more time than what it takes in "normal" conditions.
Fixes: 852c8cbf34d3 ("selftests/kselftest/runner.sh: Add 45 second timeout per test") Cc: stable@vger.kernel.org Link: https://qa-reports.linaro.org/lkft/linux-next-master/build/next-20230711/tes... [1] Link: https://lore.kernel.org/netdev/0e061d4a-9a23-9f58-3b35-d8919de332d7@tessares... [2] Suggested-by: Pedro Tammela pctammela@mojatatu.com Signed-off-by: Matthieu Baerts matthieu.baerts@tessares.net Reviewed-by: Zhengchao Shao shaozhengchao@huawei.com Link: https://lore.kernel.org/r/20230713-tc-selftests-lkft-v1-1-1eb4fd3a96e7@tessa... Acked-by: Jamal Hadi Salim jhs@mojatatu.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- tools/testing/selftests/tc-testing/settings | 1 + 1 file changed, 1 insertion(+) create mode 100644 tools/testing/selftests/tc-testing/settings
--- /dev/null +++ b/tools/testing/selftests/tc-testing/settings @@ -0,0 +1 @@ +timeout=900
From: Dan Carpenter dan.carpenter@linaro.org
commit 73274c33d961f4aa0f968f763e2c9f4210b4f4a3 upstream.
If get_user_pages_fast() allocates some pages but not as many as we wanted, then the current code leaks those pages. Call put_page() on the pages before returning.
Fixes: 129776ac2e38 ("accel/qaic: Add control path") Signed-off-by: Dan Carpenter dan.carpenter@linaro.org Reviewed-by: Pranjal Ramajor Asha Kanojiya quic_pkanojiy@quicinc.com Reviewed-by: Jeffrey Hugo quic_jhugo@quicinc.com Reviewed-by: Dafna Hirschfeld dhirschfeld@habana.ai Cc: stable@vger.kernel.org # 6.4.x Signed-off-by: Jeffrey Hugo quic_jhugo@quicinc.com Link: https://patchwork.freedesktop.org/patch/msgid/ZK0Q+ZuONTsBG+1T@moroto Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/accel/qaic/qaic_control.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-)
--- a/drivers/accel/qaic/qaic_control.c +++ b/drivers/accel/qaic/qaic_control.c @@ -418,9 +418,12 @@ static int find_and_map_user_pages(struc }
ret = get_user_pages_fast(xfer_start_addr, nr_pages, 0, page_list); - if (ret < 0 || ret != nr_pages) { - ret = -EFAULT; + if (ret < 0) goto free_page_list; + if (ret != nr_pages) { + nr_pages = ret; + ret = -EFAULT; + goto put_pages; }
sgt = kmalloc(sizeof(*sgt), GFP_KERNEL);
From: Matthieu Baerts matthieu.baerts@tessares.net
commit 719b4774a8cb1a501e2d22a5a4a3a0a870e427d5 upstream.
When looking for something else in LKFT reports [1], I noticed most of the tests were skipped because the "teardown stage" did not complete successfully.
Pedro found out this is due to the fact CONFIG_NF_FLOW_TABLE is required but not listed in the 'config' file. Adding it to the list fixes the issues on LKFT side. CONFIG_NET_ACT_CT is now set to 'm' in the final kconfig.
Fixes: c34b961a2492 ("net/sched: act_ct: Create nf flow table per zone") Cc: stable@vger.kernel.org Link: https://qa-reports.linaro.org/lkft/linux-next-master/build/next-20230711/tes... [1] Link: https://lore.kernel.org/netdev/0e061d4a-9a23-9f58-3b35-d8919de332d7@tessares... [2] Suggested-by: Pedro Tammela pctammela@mojatatu.com Signed-off-by: Matthieu Baerts matthieu.baerts@tessares.net Tested-by: Zhengchao Shao shaozhengchao@huawei.com Link: https://lore.kernel.org/r/20230713-tc-selftests-lkft-v1-2-1eb4fd3a96e7@tessa... Acked-by: Jamal Hadi Salim jhs@mojatatu.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- tools/testing/selftests/tc-testing/config | 1 + 1 file changed, 1 insertion(+)
--- a/tools/testing/selftests/tc-testing/config +++ b/tools/testing/selftests/tc-testing/config @@ -5,6 +5,7 @@ CONFIG_NF_CONNTRACK=m CONFIG_NF_CONNTRACK_MARK=y CONFIG_NF_CONNTRACK_ZONES=y CONFIG_NF_CONNTRACK_LABELS=y +CONFIG_NF_FLOW_TABLE=m CONFIG_NF_NAT=m CONFIG_NETFILTER_XT_TARGET_LOG=m
From: Mark Brown broonie@kernel.org
commit bc64734825c59e18a27ac266b07e14944c111fd8 upstream.
When problems were noticed with the register address not being taken into account when limiting raw transfers with I2C devices we fixed this in the core. Unfortunately it has subsequently been realised that a lot of buses were relying on the prior behaviour, partly due to unclear documentation not making it obvious what was intended in the core. This is all more involved to fix than is sensible for a fix commit so let's just drop the original fixes, a separate commit will fix the originally observed problem in an I2C specific way
Fixes: 3981514180c9 ("regmap: Account for register length when chunking") Fixes: c8e796895e23 ("regmap: spi-avmm: Fix regmap_bus max_raw_write") Signed-off-by: Mark Brown broonie@kernel.org Reviewed-by: Xu Yilun yilun.xu@intel.com Cc: stable@kernel.org Link: https://lore.kernel.org/r/20230712-regmap-max-transfer-v1-1-80e2aed22e83@ker... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/base/regmap/regmap-spi-avmm.c | 2 +- drivers/base/regmap/regmap.c | 6 ++---- 2 files changed, 3 insertions(+), 5 deletions(-)
--- a/drivers/base/regmap/regmap-spi-avmm.c +++ b/drivers/base/regmap/regmap-spi-avmm.c @@ -660,7 +660,7 @@ static const struct regmap_bus regmap_sp .reg_format_endian_default = REGMAP_ENDIAN_NATIVE, .val_format_endian_default = REGMAP_ENDIAN_NATIVE, .max_raw_read = SPI_AVMM_VAL_SIZE * MAX_READ_CNT, - .max_raw_write = SPI_AVMM_REG_SIZE + SPI_AVMM_VAL_SIZE * MAX_WRITE_CNT, + .max_raw_write = SPI_AVMM_VAL_SIZE * MAX_WRITE_CNT, .free_context = spi_avmm_bridge_ctx_free, };
--- a/drivers/base/regmap/regmap.c +++ b/drivers/base/regmap/regmap.c @@ -2082,8 +2082,6 @@ int _regmap_raw_write(struct regmap *map size_t val_count = val_len / val_bytes; size_t chunk_count, chunk_bytes; size_t chunk_regs = val_count; - size_t max_data = map->max_raw_write - map->format.reg_bytes - - map->format.pad_bytes; int ret, i;
if (!val_count) @@ -2091,8 +2089,8 @@ int _regmap_raw_write(struct regmap *map
if (map->use_single_write) chunk_regs = 1; - else if (map->max_raw_write && val_len > max_data) - chunk_regs = max_data / val_bytes; + else if (map->max_raw_write && val_len > map->max_raw_write) + chunk_regs = map->max_raw_write / val_bytes;
chunk_count = val_count / chunk_regs; chunk_bytes = chunk_regs * val_bytes;
From: Harald Freudenberger freude@linux.ibm.com
commit 4cfca532ddc3474b3fc42592d0e4237544344b1a upstream.
The length information for available buffer space for CCA replies is covered with two fields in the T6 header prepended on each CCA reply: fromcardlen1 and fromcardlen2. The sum of these both values must not exceed the AP bus limit for this card (24KB for CEX8, 12KB CEX7 and older) minus the always present headers.
The current code adjusted the fromcardlen2 value in case of exceeding the AP bus limit when there was a non-zero value given from userspace. Some tests now showed that this was the wrong assumption. Instead the userspace value given for this field should always be trusted and if the sum of the two fields exceeds the AP bus limit for this card the first field fromcardlen1 should be adjusted instead.
So now the calculation is done with this new insight in mind. Also some additional checks for overflow have been introduced and some comments to provide some documentation for future maintainers of this complicated calculation code.
Furthermore the 128 bytes of fix overhead which is used in the current code is not correct. Investigations showed that for a reply always the same two header structs are prepended before a possible payload. So this is also fixed with this patch.
Signed-off-by: Harald Freudenberger freude@linux.ibm.com Reviewed-by: Holger Dengler dengler@linux.ibm.com Cc: stable@vger.kernel.org Signed-off-by: Heiko Carstens hca@linux.ibm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/s390/crypto/zcrypt_msgtype6.c | 33 +++++++++++++++++++++++---------- 1 file changed, 23 insertions(+), 10 deletions(-)
--- a/drivers/s390/crypto/zcrypt_msgtype6.c +++ b/drivers/s390/crypto/zcrypt_msgtype6.c @@ -1111,23 +1111,36 @@ static long zcrypt_msgtype6_send_cprb(bo struct ica_xcRB *xcrb, struct ap_message *ap_msg) { - int rc; struct response_type *rtype = ap_msg->private; struct { struct type6_hdr hdr; struct CPRBX cprbx; /* ... more data blocks ... */ } __packed * msg = ap_msg->msg; + unsigned int max_payload_size; + int rc, delta;
- /* - * Set the queue's reply buffer length minus 128 byte padding - * as reply limit for the card firmware. - */ - msg->hdr.fromcardlen1 = min_t(unsigned int, msg->hdr.fromcardlen1, - zq->reply.bufsize - 128); - if (msg->hdr.fromcardlen2) - msg->hdr.fromcardlen2 = - zq->reply.bufsize - msg->hdr.fromcardlen1 - 128; + /* calculate maximum payload for this card and msg type */ + max_payload_size = zq->reply.bufsize - sizeof(struct type86_fmt2_msg); + + /* limit each of the two from fields to the maximum payload size */ + msg->hdr.fromcardlen1 = min(msg->hdr.fromcardlen1, max_payload_size); + msg->hdr.fromcardlen2 = min(msg->hdr.fromcardlen2, max_payload_size); + + /* calculate delta if the sum of both exceeds max payload size */ + delta = msg->hdr.fromcardlen1 + msg->hdr.fromcardlen2 + - max_payload_size; + if (delta > 0) { + /* + * Sum exceeds maximum payload size, prune fromcardlen1 + * (always trust fromcardlen2) + */ + if (delta > msg->hdr.fromcardlen1) { + rc = -EINVAL; + goto out; + } + msg->hdr.fromcardlen1 -= delta; + }
init_completion(&rtype->work); rc = ap_queue_message(zq->queue, ap_msg);
From: Rob Herring robh@kernel.org
commit 0bb8f49cd2cc8cb32ac51189ff9fcbe7ec3d9d65 upstream.
Since commit 241d2fb56a18 ("of: Make OF framebuffer device names unique"), as spotted by Frédéric Bonnard, the historical "of-display" device is gone: the updated logic creates "of-display.0" instead, then as many "of-display.N" as required.
This means that offb no longer finds the expected device, which prevents the Debian Installer from setting up its interface, at least on ppc64el.
Fix this by keeping "of-display" for the first device and "of-display.N" for subsequent devices.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=217328 Link: https://bugs.debian.org/1033058 Fixes: 241d2fb56a18 ("of: Make OF framebuffer device names unique") Cc: stable@vger.kernel.org Cc: Cyril Brulebois cyril@debamax.com Cc: Thomas Zimmermann tzimmermann@suse.de Cc: Helge Deller deller@gmx.de Acked-by: Helge Deller deller@gmx.de Acked-by: Thomas Zimmermann tzimmermann@suse.de Reviewed-by: Michal Suchánek msuchanek@suse.de Link: https://lore.kernel.org/r/20230710174007.2291013-1-robh@kernel.org Signed-off-by: Rob Herring robh@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/of/platform.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/of/platform.c +++ b/drivers/of/platform.c @@ -553,7 +553,7 @@ static int __init of_platform_default_po if (!of_get_property(node, "linux,opened", NULL) || !of_get_property(node, "linux,boot-display", NULL)) continue; - dev = of_platform_device_create(node, "of-display.0", NULL); + dev = of_platform_device_create(node, "of-display", NULL); of_node_put(node); if (WARN_ON(!dev)) return -ENOMEM;
From: Mark Brown broonie@kernel.org
commit 0c9d2eb5e94792fe64019008a04d4df5e57625af upstream.
The SMBus I2C buses have limits on the size of transfers they can do but do not factor in the register length meaning we may try to do a transfer longer than our length limit, the core will not take care of this. Future changes will factor this out into the core but there are a number of users that assume current behaviour so let's just do something conservative here.
This does not take account padding bits but practically speaking these are very rarely if ever used on I2C buses given that they generally run slowly enough to mean there's no issue.
Cc: stable@kernel.org Signed-off-by: Mark Brown broonie@kernel.org Reviewed-by: Xu Yilun yilun.xu@intel.com Link: https://lore.kernel.org/r/20230712-regmap-max-transfer-v1-2-80e2aed22e83@ker... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/base/regmap/regmap-i2c.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
--- a/drivers/base/regmap/regmap-i2c.c +++ b/drivers/base/regmap/regmap-i2c.c @@ -242,8 +242,8 @@ static int regmap_i2c_smbus_i2c_read(voi static const struct regmap_bus regmap_i2c_smbus_i2c_block = { .write = regmap_i2c_smbus_i2c_write, .read = regmap_i2c_smbus_i2c_read, - .max_raw_read = I2C_SMBUS_BLOCK_MAX, - .max_raw_write = I2C_SMBUS_BLOCK_MAX, + .max_raw_read = I2C_SMBUS_BLOCK_MAX - 1, + .max_raw_write = I2C_SMBUS_BLOCK_MAX - 1, };
static int regmap_i2c_smbus_i2c_write_reg16(void *context, const void *data, @@ -299,8 +299,8 @@ static int regmap_i2c_smbus_i2c_read_reg static const struct regmap_bus regmap_i2c_smbus_i2c_block_reg16 = { .write = regmap_i2c_smbus_i2c_write_reg16, .read = regmap_i2c_smbus_i2c_read_reg16, - .max_raw_read = I2C_SMBUS_BLOCK_MAX, - .max_raw_write = I2C_SMBUS_BLOCK_MAX, + .max_raw_read = I2C_SMBUS_BLOCK_MAX - 2, + .max_raw_write = I2C_SMBUS_BLOCK_MAX - 2, };
static const struct regmap_bus *regmap_get_i2c_bus(struct i2c_client *i2c,
From: Helge Deller deller@gmx.de
commit 07e981137f17e5275b6fa5fd0c28b0ddb4519702 upstream.
IA64 is the only architecture which does not consider the pgoff value when searching for a possible free memory region with vm_unmapped_area(). Adding this seems to have no negative side effect on IA64, so add it now to make IA64 consistent with all other architectures.
Cc: stable@vger.kernel.org # 6.4 Signed-off-by: Helge Deller deller@gmx.de Tested-by: matoro matoro_mailinglist_kernel@matoro.tk Cc: Andrew Morton akpm@linux-foundation.org Cc: linux-ia64@vger.kernel.org Link: https://lore.kernel.org/r/20230721152432.196382-3-deller@gmx.de Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/ia64/kernel/sys_ia64.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/ia64/kernel/sys_ia64.c b/arch/ia64/kernel/sys_ia64.c index 6e948d015332..eb561cc93632 100644 --- a/arch/ia64/kernel/sys_ia64.c +++ b/arch/ia64/kernel/sys_ia64.c @@ -63,7 +63,7 @@ arch_get_unmapped_area (struct file *filp, unsigned long addr, unsigned long len info.low_limit = addr; info.high_limit = TASK_SIZE; info.align_mask = align_mask; - info.align_offset = 0; + info.align_offset = pgoff << PAGE_SHIFT; return vm_unmapped_area(&info); }
From: Mark Brown broonie@kernel.org
commit d4d5be94a87872421ea2569044092535aff0b886 upstream.
When we reconfigure the SVE vector length we discard the backing storage for the SVE vectors and then reallocate on next SVE use, leaving the SME specific state alone. This means that we do not enable SME traps if they were already disabled. That means that userspace code can enter streaming mode without trapping, putting the task in a state where if we try to save the state of the task we will fault.
Since the ABI does not specify that changing the SVE vector length disturbs SME state, and since SVE code may not be aware of SME code in the process, we shouldn't simply discard any ZA state. Instead immediately reallocate the storage for SVE, and disable SME if we change the SVE vector length while there is no SME state active.
Disabling SME traps on SVE vector length changes would make the overall code more complex since we would have a state where we have valid SME state stored but might get a SME trap.
Fixes: 9e4ab6c89109 ("arm64/sme: Implement vector length configuration prctl()s") Reported-by: David Spickett David.Spickett@arm.com Signed-off-by: Mark Brown broonie@kernel.org Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230720-arm64-fix-sve-sme-vl-change-v2-1-8eea06b8... Signed-off-by: Will Deacon will@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm64/kernel/fpsimd.c | 33 +++++++++++++++++++++++++-------- 1 file changed, 25 insertions(+), 8 deletions(-)
--- a/arch/arm64/kernel/fpsimd.c +++ b/arch/arm64/kernel/fpsimd.c @@ -847,6 +847,8 @@ void sve_sync_from_fpsimd_zeropad(struct int vec_set_vector_length(struct task_struct *task, enum vec_type type, unsigned long vl, unsigned long flags) { + bool free_sme = false; + if (flags & ~(unsigned long)(PR_SVE_VL_INHERIT | PR_SVE_SET_VL_ONEXEC)) return -EINVAL; @@ -897,21 +899,36 @@ int vec_set_vector_length(struct task_st task->thread.fp_type = FP_STATE_FPSIMD; }
- if (system_supports_sme() && type == ARM64_VEC_SME) { - task->thread.svcr &= ~(SVCR_SM_MASK | - SVCR_ZA_MASK); - clear_thread_flag(TIF_SME); + if (system_supports_sme()) { + if (type == ARM64_VEC_SME || + !(task->thread.svcr & (SVCR_SM_MASK | SVCR_ZA_MASK))) { + /* + * We are changing the SME VL or weren't using + * SME anyway, discard the state and force a + * reallocation. + */ + task->thread.svcr &= ~(SVCR_SM_MASK | + SVCR_ZA_MASK); + clear_thread_flag(TIF_SME); + free_sme = true; + } }
if (task == current) put_cpu_fpsimd_context();
/* - * Force reallocation of task SVE and SME state to the correct - * size on next use: + * Free the changed states if they are not in use, SME will be + * reallocated to the correct size on next use and we just + * allocate SVE now in case it is needed for use in streaming + * mode. */ - sve_free(task); - if (system_supports_sme() && type == ARM64_VEC_SME) + if (system_supports_sve()) { + sve_free(task); + sve_alloc(task, true); + } + + if (free_sme) sme_free(task);
task_set_vl(task, type, vl);
From: Fedor Ross fedor.ross@ifm.com
commit 9efa1a5407e81265ea502cab83be4de503decc49 upstream.
The mcp251xfd controller needs an idle bus to enter 'Normal CAN 2.0 mode' or . The maximum length of a CAN frame is 736 bits (64 data bytes, CAN-FD, EFF mode, worst case bit stuffing and interframe spacing). For low bit rates like 10 kbit/s the arbitrarily chosen MCP251XFD_POLL_TIMEOUT_US of 1 ms is too small.
Otherwise during polling for the CAN controller to enter 'Normal CAN 2.0 mode' the timeout limit is exceeded and the configuration fails with:
| $ ip link set dev can1 up type can bitrate 10000 | [ 731.911072] mcp251xfd spi2.1 can1: Controller failed to enter mode CAN 2.0 Mode (6) and stays in Configuration Mode (4) (con=0x068b0760, osc=0x00000468). | [ 731.927192] mcp251xfd spi2.1 can1: CRC read error at address 0x0e0c (length=4, data=00 00 00 00, CRC=0x0000) retrying. | [ 731.938101] A link change request failed with some changes committed already. Interface can1 may have been left with an inconsistent configuration, please check. | RTNETLINK answers: Connection timed out
Make MCP251XFD_POLL_TIMEOUT_US timeout calculation dynamic. Use maximum of 1ms and bit time of 1 full 64 data bytes CAN-FD frame in EFF mode, worst case bit stuffing and interframe spacing at the current bit rate.
For easier backporting define the macro MCP251XFD_FRAME_LEN_MAX_BITS that holds the max frame length in bits, which is 736. This can be replaced by can_frame_bits(true, true, true, true, CANFD_MAX_DLEN) in a cleanup patch later.
Fixes: 55e5b97f003e8 ("can: mcp25xxfd: add driver for Microchip MCP25xxFD SPI CAN") Signed-off-by: Fedor Ross fedor.ross@ifm.com Signed-off-by: Marek Vasut marex@denx.de Cc: stable@vger.kernel.org Link: https://lore.kernel.org/all/20230717-mcp251xfd-fix-increase-poll-timeout-v5-... Signed-off-by: Marc Kleine-Budde mkl@pengutronix.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/can/spi/mcp251xfd/mcp251xfd-core.c | 10 ++++++++-- drivers/net/can/spi/mcp251xfd/mcp251xfd.h | 1 + 2 files changed, 9 insertions(+), 2 deletions(-)
--- a/drivers/net/can/spi/mcp251xfd/mcp251xfd-core.c +++ b/drivers/net/can/spi/mcp251xfd/mcp251xfd-core.c @@ -227,6 +227,8 @@ static int __mcp251xfd_chip_set_mode(const struct mcp251xfd_priv *priv, const u8 mode_req, bool nowait) { + const struct can_bittiming *bt = &priv->can.bittiming; + unsigned long timeout_us = MCP251XFD_POLL_TIMEOUT_US; u32 con = 0, con_reqop, osc = 0; u8 mode; int err; @@ -246,12 +248,16 @@ __mcp251xfd_chip_set_mode(const struct m if (mode_req == MCP251XFD_REG_CON_MODE_SLEEP || nowait) return 0;
+ if (bt->bitrate) + timeout_us = max_t(unsigned long, timeout_us, + MCP251XFD_FRAME_LEN_MAX_BITS * USEC_PER_SEC / + bt->bitrate); + err = regmap_read_poll_timeout(priv->map_reg, MCP251XFD_REG_CON, con, !mcp251xfd_reg_invalid(con) && FIELD_GET(MCP251XFD_REG_CON_OPMOD_MASK, con) == mode_req, - MCP251XFD_POLL_SLEEP_US, - MCP251XFD_POLL_TIMEOUT_US); + MCP251XFD_POLL_SLEEP_US, timeout_us); if (err != -ETIMEDOUT && err != -EBADMSG) return err;
--- a/drivers/net/can/spi/mcp251xfd/mcp251xfd.h +++ b/drivers/net/can/spi/mcp251xfd/mcp251xfd.h @@ -387,6 +387,7 @@ static_assert(MCP251XFD_TIMESTAMP_WORK_D #define MCP251XFD_OSC_STAB_TIMEOUT_US (10 * MCP251XFD_OSC_STAB_SLEEP_US) #define MCP251XFD_POLL_SLEEP_US (10) #define MCP251XFD_POLL_TIMEOUT_US (USEC_PER_MSEC) +#define MCP251XFD_FRAME_LEN_MAX_BITS (736)
/* Misc */ #define MCP251XFD_NAPI_WEIGHT 32
From: YueHaibing yuehaibing@huawei.com
commit 55c3b96074f3f9b0aee19bf93cd71af7516582bb upstream.
BUG: KASAN: slab-use-after-free in bcm_proc_show+0x969/0xa80 Read of size 8 at addr ffff888155846230 by task cat/7862
CPU: 1 PID: 7862 Comm: cat Not tainted 6.5.0-rc1-00153-gc8746099c197 #230 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014 Call Trace: <TASK> dump_stack_lvl+0xd5/0x150 print_report+0xc1/0x5e0 kasan_report+0xba/0xf0 bcm_proc_show+0x969/0xa80 seq_read_iter+0x4f6/0x1260 seq_read+0x165/0x210 proc_reg_read+0x227/0x300 vfs_read+0x1d5/0x8d0 ksys_read+0x11e/0x240 do_syscall_64+0x35/0xb0 entry_SYSCALL_64_after_hwframe+0x63/0xcd
Allocated by task 7846: kasan_save_stack+0x1e/0x40 kasan_set_track+0x21/0x30 __kasan_kmalloc+0x9e/0xa0 bcm_sendmsg+0x264b/0x44e0 sock_sendmsg+0xda/0x180 ____sys_sendmsg+0x735/0x920 ___sys_sendmsg+0x11d/0x1b0 __sys_sendmsg+0xfa/0x1d0 do_syscall_64+0x35/0xb0 entry_SYSCALL_64_after_hwframe+0x63/0xcd
Freed by task 7846: kasan_save_stack+0x1e/0x40 kasan_set_track+0x21/0x30 kasan_save_free_info+0x27/0x40 ____kasan_slab_free+0x161/0x1c0 slab_free_freelist_hook+0x119/0x220 __kmem_cache_free+0xb4/0x2e0 rcu_core+0x809/0x1bd0
bcm_op is freed before procfs entry be removed in bcm_release(), this lead to bcm_proc_show() may read the freed bcm_op.
Fixes: ffd980f976e7 ("[CAN]: Add broadcast manager (bcm) protocol") Signed-off-by: YueHaibing yuehaibing@huawei.com Reviewed-by: Oliver Hartkopp socketcan@hartkopp.net Acked-by: Oliver Hartkopp socketcan@hartkopp.net Link: https://lore.kernel.org/all/20230715092543.15548-1-yuehaibing@huawei.com Cc: stable@vger.kernel.org Signed-off-by: Marc Kleine-Budde mkl@pengutronix.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/can/bcm.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-)
--- a/net/can/bcm.c +++ b/net/can/bcm.c @@ -1526,6 +1526,12 @@ static int bcm_release(struct socket *so
lock_sock(sk);
+#if IS_ENABLED(CONFIG_PROC_FS) + /* remove procfs entry */ + if (net->can.bcmproc_dir && bo->bcm_proc_read) + remove_proc_entry(bo->procname, net->can.bcmproc_dir); +#endif /* CONFIG_PROC_FS */ + list_for_each_entry_safe(op, next, &bo->tx_ops, list) bcm_remove_op(op);
@@ -1561,12 +1567,6 @@ static int bcm_release(struct socket *so list_for_each_entry_safe(op, next, &bo->rx_ops, list) bcm_remove_op(op);
-#if IS_ENABLED(CONFIG_PROC_FS) - /* remove procfs entry */ - if (net->can.bcmproc_dir && bo->bcm_proc_read) - remove_proc_entry(bo->procname, net->can.bcmproc_dir); -#endif /* CONFIG_PROC_FS */ - /* remove device reference */ if (bo->bound) { bo->bound = 0;
From: Marc Kleine-Budde mkl@pengutronix.de
commit 2603be9e8167ddc7bea95dcfab9ffc33414215aa upstream.
The gs_usb driver handles USB devices with more than 1 CAN channel. The RX path for all channels share the same bulk endpoint (the transmitted bulk data encodes the channel number). These per-device resources are allocated and submitted by the first opened channel.
During this allocation, the resources are either released immediately in case of a failure or the URBs are anchored. All anchored URBs are finally killed with gs_usb_disconnect().
Currently, gs_can_open() returns with an error if the allocation of a URB or a buffer fails. However, if usb_submit_urb() fails, the driver continues with the URBs submitted so far, even if no URBs were successfully submitted.
Treat every error as fatal and free all allocated resources immediately.
Switch to goto-style error handling, to prepare the driver for more per-device resource allocation.
Cc: stable@vger.kernel.org Cc: John Whittington git@jbrengineering.co.uk Link: https://lore.kernel.org/all/20230716-gs_usb-fix-time-stamp-counter-v1-1-9017... Signed-off-by: Marc Kleine-Budde mkl@pengutronix.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/can/usb/gs_usb.c | 31 ++++++++++++++++++++++--------- 1 file changed, 22 insertions(+), 9 deletions(-)
--- a/drivers/net/can/usb/gs_usb.c +++ b/drivers/net/can/usb/gs_usb.c @@ -833,6 +833,7 @@ static int gs_can_open(struct net_device .mode = cpu_to_le32(GS_CAN_MODE_START), }; struct gs_host_frame *hf; + struct urb *urb = NULL; u32 ctrlmode; u32 flags = 0; int rc, i; @@ -856,13 +857,14 @@ static int gs_can_open(struct net_device
if (!parent->active_channels) { for (i = 0; i < GS_MAX_RX_URBS; i++) { - struct urb *urb; u8 *buf;
/* alloc rx urb */ urb = usb_alloc_urb(0, GFP_KERNEL); - if (!urb) - return -ENOMEM; + if (!urb) { + rc = -ENOMEM; + goto out_usb_kill_anchored_urbs; + }
/* alloc rx buffer */ buf = kmalloc(dev->parent->hf_size_rx, @@ -870,8 +872,8 @@ static int gs_can_open(struct net_device if (!buf) { netdev_err(netdev, "No memory left for USB buffer\n"); - usb_free_urb(urb); - return -ENOMEM; + rc = -ENOMEM; + goto out_usb_free_urb; }
/* fill, anchor, and submit rx urb */ @@ -894,9 +896,7 @@ static int gs_can_open(struct net_device netdev_err(netdev, "usb_submit failed (err=%d)\n", rc);
- usb_unanchor_urb(urb); - usb_free_urb(urb); - break; + goto out_usb_unanchor_urb; }
/* Drop reference, @@ -945,7 +945,8 @@ static int gs_can_open(struct net_device if (dev->feature & GS_CAN_FEATURE_HW_TIMESTAMP) gs_usb_timestamp_stop(dev); dev->can.state = CAN_STATE_STOPPED; - return rc; + + goto out_usb_kill_anchored_urbs; }
parent->active_channels++; @@ -953,6 +954,18 @@ static int gs_can_open(struct net_device netif_start_queue(netdev);
return 0; + +out_usb_unanchor_urb: + usb_unanchor_urb(urb); +out_usb_free_urb: + usb_free_urb(urb); +out_usb_kill_anchored_urbs: + if (!parent->active_channels) + usb_kill_anchored_urbs(&dev->tx_submitted); + + close_candev(netdev); + + return rc; }
static int gs_usb_get_state(const struct net_device *netdev,
From: Marc Kleine-Budde mkl@pengutronix.de
commit 5886e4d5ecec3e22844efed90b2dd383ef804b3a upstream.
If the gs_usb device driver is unloaded (or unbound) before the interface is shut down, the USB stack first calls the struct usb_driver::disconnect and then the struct net_device_ops::ndo_stop callback.
In gs_usb_disconnect() all pending bulk URBs are killed, i.e. no more RX'ed CAN frames are send from the USB device to the host. Later in gs_can_close() a reset control message is send to each CAN channel to remove the controller from the CAN bus. In this race window the USB device can still receive CAN frames from the bus and internally queue them to be send to the host.
At least in the current version of the candlelight firmware, the queue of received CAN frames is not emptied during the reset command. After loading (or binding) the gs_usb driver, new URBs are submitted during the struct net_device_ops::ndo_open callback and the candlelight firmware starts sending its already queued CAN frames to the host.
However, this scenario was not considered when implementing the hardware timestamp function. The cycle counter/time counter infrastructure is set up (gs_usb_timestamp_init()) after the USBs are submitted, resulting in a NULL pointer dereference if timecounter_cyc2time() (via the call chain: gs_usb_receive_bulk_callback() -> gs_usb_set_timestamp() -> gs_usb_skb_set_timestamp()) is called too early.
Move the gs_usb_timestamp_init() function before the URBs are submitted to fix this problem.
For a comprehensive solution, we need to consider gs_usb devices with more than 1 channel. The cycle counter/time counter infrastructure is setup per channel, but the RX URBs are per device. Once gs_can_open() of _a_ channel has been called, and URBs have been submitted, the gs_usb_receive_bulk_callback() can be called for _all_ available channels, even for channels that are not running, yet. As cycle counter/time counter has not set up, this will again lead to a NULL pointer dereference.
Convert the cycle counter/time counter from a "per channel" to a "per device" functionality. Also set it up, before submitting any URBs to the device.
Further in gs_usb_receive_bulk_callback(), don't process any URBs for not started CAN channels, only resubmit the URB.
Fixes: 45dfa45f52e6 ("can: gs_usb: add RX and TX hardware timestamp support") Closes: https://github.com/candle-usb/candleLight_fw/issues/137#issuecomment-1623532... Cc: stable@vger.kernel.org Cc: John Whittington git@jbrengineering.co.uk Link: https://lore.kernel.org/all/20230716-gs_usb-fix-time-stamp-counter-v1-2-9017... Signed-off-by: Marc Kleine-Budde mkl@pengutronix.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/can/usb/gs_usb.c | 101 ++++++++++++++++++++++--------------------- 1 file changed, 53 insertions(+), 48 deletions(-)
--- a/drivers/net/can/usb/gs_usb.c +++ b/drivers/net/can/usb/gs_usb.c @@ -303,12 +303,6 @@ struct gs_can { struct can_bittiming_const bt_const, data_bt_const; unsigned int channel; /* channel number */
- /* time counter for hardware timestamps */ - struct cyclecounter cc; - struct timecounter tc; - spinlock_t tc_lock; /* spinlock to guard access tc->cycle_last */ - struct delayed_work timestamp; - u32 feature; unsigned int hf_size_tx;
@@ -325,6 +319,13 @@ struct gs_usb { struct gs_can *canch[GS_MAX_INTF]; struct usb_anchor rx_submitted; struct usb_device *udev; + + /* time counter for hardware timestamps */ + struct cyclecounter cc; + struct timecounter tc; + spinlock_t tc_lock; /* spinlock to guard access tc->cycle_last */ + struct delayed_work timestamp; + unsigned int hf_size_rx; u8 active_channels; }; @@ -388,15 +389,15 @@ static int gs_cmd_reset(struct gs_can *d GFP_KERNEL); }
-static inline int gs_usb_get_timestamp(const struct gs_can *dev, +static inline int gs_usb_get_timestamp(const struct gs_usb *parent, u32 *timestamp_p) { __le32 timestamp; int rc;
- rc = usb_control_msg_recv(dev->udev, 0, GS_USB_BREQ_TIMESTAMP, + rc = usb_control_msg_recv(parent->udev, 0, GS_USB_BREQ_TIMESTAMP, USB_DIR_IN | USB_TYPE_VENDOR | USB_RECIP_INTERFACE, - dev->channel, 0, + 0, 0, ×tamp, sizeof(timestamp), USB_CTRL_GET_TIMEOUT, GFP_KERNEL); @@ -410,20 +411,20 @@ static inline int gs_usb_get_timestamp(c
static u64 gs_usb_timestamp_read(const struct cyclecounter *cc) __must_hold(&dev->tc_lock) { - struct gs_can *dev = container_of(cc, struct gs_can, cc); + struct gs_usb *parent = container_of(cc, struct gs_usb, cc); u32 timestamp = 0; int err;
- lockdep_assert_held(&dev->tc_lock); + lockdep_assert_held(&parent->tc_lock);
/* drop lock for synchronous USB transfer */ - spin_unlock_bh(&dev->tc_lock); - err = gs_usb_get_timestamp(dev, ×tamp); - spin_lock_bh(&dev->tc_lock); + spin_unlock_bh(&parent->tc_lock); + err = gs_usb_get_timestamp(parent, ×tamp); + spin_lock_bh(&parent->tc_lock); if (err) - netdev_err(dev->netdev, - "Error %d while reading timestamp. HW timestamps may be inaccurate.", - err); + dev_err(&parent->udev->dev, + "Error %d while reading timestamp. HW timestamps may be inaccurate.", + err);
return timestamp; } @@ -431,14 +432,14 @@ static u64 gs_usb_timestamp_read(const s static void gs_usb_timestamp_work(struct work_struct *work) { struct delayed_work *delayed_work = to_delayed_work(work); - struct gs_can *dev; + struct gs_usb *parent;
- dev = container_of(delayed_work, struct gs_can, timestamp); - spin_lock_bh(&dev->tc_lock); - timecounter_read(&dev->tc); - spin_unlock_bh(&dev->tc_lock); + parent = container_of(delayed_work, struct gs_usb, timestamp); + spin_lock_bh(&parent->tc_lock); + timecounter_read(&parent->tc); + spin_unlock_bh(&parent->tc_lock);
- schedule_delayed_work(&dev->timestamp, + schedule_delayed_work(&parent->timestamp, GS_USB_TIMESTAMP_WORK_DELAY_SEC * HZ); }
@@ -446,37 +447,38 @@ static void gs_usb_skb_set_timestamp(str struct sk_buff *skb, u32 timestamp) { struct skb_shared_hwtstamps *hwtstamps = skb_hwtstamps(skb); + struct gs_usb *parent = dev->parent; u64 ns;
- spin_lock_bh(&dev->tc_lock); - ns = timecounter_cyc2time(&dev->tc, timestamp); - spin_unlock_bh(&dev->tc_lock); + spin_lock_bh(&parent->tc_lock); + ns = timecounter_cyc2time(&parent->tc, timestamp); + spin_unlock_bh(&parent->tc_lock);
hwtstamps->hwtstamp = ns_to_ktime(ns); }
-static void gs_usb_timestamp_init(struct gs_can *dev) +static void gs_usb_timestamp_init(struct gs_usb *parent) { - struct cyclecounter *cc = &dev->cc; + struct cyclecounter *cc = &parent->cc;
cc->read = gs_usb_timestamp_read; cc->mask = CYCLECOUNTER_MASK(32); cc->shift = 32 - bits_per(NSEC_PER_SEC / GS_USB_TIMESTAMP_TIMER_HZ); cc->mult = clocksource_hz2mult(GS_USB_TIMESTAMP_TIMER_HZ, cc->shift);
- spin_lock_init(&dev->tc_lock); - spin_lock_bh(&dev->tc_lock); - timecounter_init(&dev->tc, &dev->cc, ktime_get_real_ns()); - spin_unlock_bh(&dev->tc_lock); + spin_lock_init(&parent->tc_lock); + spin_lock_bh(&parent->tc_lock); + timecounter_init(&parent->tc, &parent->cc, ktime_get_real_ns()); + spin_unlock_bh(&parent->tc_lock);
- INIT_DELAYED_WORK(&dev->timestamp, gs_usb_timestamp_work); - schedule_delayed_work(&dev->timestamp, + INIT_DELAYED_WORK(&parent->timestamp, gs_usb_timestamp_work); + schedule_delayed_work(&parent->timestamp, GS_USB_TIMESTAMP_WORK_DELAY_SEC * HZ); }
-static void gs_usb_timestamp_stop(struct gs_can *dev) +static void gs_usb_timestamp_stop(struct gs_usb *parent) { - cancel_delayed_work_sync(&dev->timestamp); + cancel_delayed_work_sync(&parent->timestamp); }
static void gs_update_state(struct gs_can *dev, struct can_frame *cf) @@ -560,6 +562,9 @@ static void gs_usb_receive_bulk_callback if (!netif_device_present(netdev)) return;
+ if (!netif_running(netdev)) + goto resubmit_urb; + if (hf->echo_id == -1) { /* normal rx */ if (hf->flags & GS_CAN_FLAG_FD) { skb = alloc_canfd_skb(dev->netdev, &cfd); @@ -856,6 +861,9 @@ static int gs_can_open(struct net_device }
if (!parent->active_channels) { + if (dev->feature & GS_CAN_FEATURE_HW_TIMESTAMP) + gs_usb_timestamp_init(parent); + for (i = 0; i < GS_MAX_RX_URBS; i++) { u8 *buf;
@@ -926,13 +934,9 @@ static int gs_can_open(struct net_device flags |= GS_CAN_MODE_FD;
/* if hardware supports timestamps, enable it */ - if (dev->feature & GS_CAN_FEATURE_HW_TIMESTAMP) { + if (dev->feature & GS_CAN_FEATURE_HW_TIMESTAMP) flags |= GS_CAN_MODE_HW_TIMESTAMP;
- /* start polling timestamp */ - gs_usb_timestamp_init(dev); - } - /* finally start device */ dev->can.state = CAN_STATE_ERROR_ACTIVE; dm.flags = cpu_to_le32(flags); @@ -942,8 +946,6 @@ static int gs_can_open(struct net_device GFP_KERNEL); if (rc) { netdev_err(netdev, "Couldn't start device (err=%d)\n", rc); - if (dev->feature & GS_CAN_FEATURE_HW_TIMESTAMP) - gs_usb_timestamp_stop(dev); dev->can.state = CAN_STATE_STOPPED;
goto out_usb_kill_anchored_urbs; @@ -960,9 +962,13 @@ out_usb_unanchor_urb: out_usb_free_urb: usb_free_urb(urb); out_usb_kill_anchored_urbs: - if (!parent->active_channels) + if (!parent->active_channels) { usb_kill_anchored_urbs(&dev->tx_submitted);
+ if (dev->feature & GS_CAN_FEATURE_HW_TIMESTAMP) + gs_usb_timestamp_stop(parent); + } + close_candev(netdev);
return rc; @@ -1011,14 +1017,13 @@ static int gs_can_close(struct net_devic
netif_stop_queue(netdev);
- /* stop polling timestamp */ - if (dev->feature & GS_CAN_FEATURE_HW_TIMESTAMP) - gs_usb_timestamp_stop(dev); - /* Stop polling */ parent->active_channels--; if (!parent->active_channels) { usb_kill_anchored_urbs(&parent->rx_submitted); + + if (dev->feature & GS_CAN_FEATURE_HW_TIMESTAMP) + gs_usb_timestamp_stop(parent); }
/* Stop sending URBs */
From: Heiner Kallweit hkallweit1@gmail.com
commit e31a9fedc7d8d80722b19628e66fcb5a36981780 upstream.
This reverts commit e1ed3e4d91112027b90c7ee61479141b3f948e6a.
Turned out the change causes a performance regression.
Link: https://lore.kernel.org/netdev/20230713124914.GA12924@green245/T/ Cc: stable@vger.kernel.org Signed-off-by: Heiner Kallweit hkallweit1@gmail.com Link: https://lore.kernel.org/r/055c6bc2-74fa-8c67-9897-3f658abb5ae7@gmail.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/realtek/r8169_main.c | 11 +---------- 1 file changed, 1 insertion(+), 10 deletions(-)
--- a/drivers/net/ethernet/realtek/r8169_main.c +++ b/drivers/net/ethernet/realtek/r8169_main.c @@ -4514,10 +4514,6 @@ static irqreturn_t rtl8169_interrupt(int }
if (napi_schedule_prep(&tp->napi)) { - rtl_unlock_config_regs(tp); - rtl_hw_aspm_clkreq_enable(tp, false); - rtl_lock_config_regs(tp); - rtl_irq_disable(tp); __napi_schedule(&tp->napi); } @@ -4577,14 +4573,9 @@ static int rtl8169_poll(struct napi_stru
work_done = rtl_rx(dev, tp, budget);
- if (work_done < budget && napi_complete_done(napi, work_done)) { + if (work_done < budget && napi_complete_done(napi, work_done)) rtl_irq_enable(tp);
- rtl_unlock_config_regs(tp); - rtl_hw_aspm_clkreq_enable(tp, true); - rtl_lock_config_regs(tp); - } - return work_done; }
From: Matthieu Baerts matthieu.baerts@tessares.net
commit 031c99e71fedcce93b6785d38b7d287bf59e3952 upstream.
When looking at the TC selftest reports, I noticed one test was failing because /proc/net/nf_conntrack was not available.
not ok 373 3992 - Add ct action triggering DNAT tuple conflict Could not match regex pattern. Verify command output: cat: /proc/net/nf_conntrack: No such file or directory
It is only available if NF_CONNTRACK_PROCFS kconfig is set. So the issue can be fixed simply by adding it to the list of required kconfig.
Fixes: e46905641316 ("tc-testing: add test for ct DNAT tuple collision") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/netdev/0e061d4a-9a23-9f58-3b35-d8919de332d7@tessares... [1] Signed-off-by: Matthieu Baerts matthieu.baerts@tessares.net Tested-by: Zhengchao Shao shaozhengchao@huawei.com Link: https://lore.kernel.org/r/20230713-tc-selftests-lkft-v1-3-1eb4fd3a96e7@tessa... Acked-by: Jamal Hadi Salim jhs@mojatatu.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- tools/testing/selftests/tc-testing/config | 1 + 1 file changed, 1 insertion(+)
--- a/tools/testing/selftests/tc-testing/config +++ b/tools/testing/selftests/tc-testing/config @@ -5,6 +5,7 @@ CONFIG_NF_CONNTRACK=m CONFIG_NF_CONNTRACK_MARK=y CONFIG_NF_CONNTRACK_ZONES=y CONFIG_NF_CONNTRACK_LABELS=y +CONFIG_NF_CONNTRACK_PROCFS=y CONFIG_NF_FLOW_TABLE=m CONFIG_NF_NAT=m CONFIG_NETFILTER_XT_TARGET_LOG=m
From: Dan Carpenter dan.carpenter@linaro.org
commit ea33cb6fc2788f9fe248d49e1c0b2553a58436ef upstream.
There are several issues in this code. The check at the start of the loop:
if (user_len >= user_msg->len) {
This check does not ensure that we have enough space for the trans_hdr (8 bytes). Instead the check needs to be:
if (user_len > user_msg->len - sizeof(*trans_hdr)) {
That subtraction is done as an unsigned long we want to avoid negatives. Add a lower bound to the start of the function.
if (user_msg->len < sizeof(*trans_hdr))
There is a second integer underflow which can happen if trans_hdr->len is zero inside the encode_passthrough() function.
memcpy(out_trans->data, in_trans->data, in_trans->hdr.len - sizeof(in_trans->hdr));
Instead of adding a check to encode_passthrough() it's better to check in this central place. Add that check:
if (trans_hdr->len < sizeof(trans_hdr)
The final concern is that the "user_len + trans_hdr->len" might have an integer overflow bug. Use size_add() to prevent that.
- if (user_len + trans_hdr->len > user_msg->len) { + if (size_add(user_len, trans_hdr->len) > user_msg->len) {
Fixes: 129776ac2e38 ("accel/qaic: Add control path") Signed-off-by: Dan Carpenter dan.carpenter@linaro.org Reviewed-by: Pranjal Ramajor Asha Kanojiya quic_pkanojiy@quicinc.com Reviewed-by: Jeffrey Hugo quic_jhugo@quicinc.com Cc: stable@vger.kernel.org # 6.4.x Signed-off-by: Jeffrey Hugo quic_jhugo@quicinc.com Link: https://patchwork.freedesktop.org/patch/msgid/9a0cb0c1-a974-4f10-bc8d-944379... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/accel/qaic/qaic_control.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-)
--- a/drivers/accel/qaic/qaic_control.c +++ b/drivers/accel/qaic/qaic_control.c @@ -14,6 +14,7 @@ #include <linux/mm.h> #include <linux/moduleparam.h> #include <linux/mutex.h> +#include <linux/overflow.h> #include <linux/pci.h> #include <linux/scatterlist.h> #include <linux/types.h> @@ -751,7 +752,8 @@ static int encode_message(struct qaic_de int ret; int i;
- if (!user_msg->count) { + if (!user_msg->count || + user_msg->len < sizeof(*trans_hdr)) { ret = -EINVAL; goto out; } @@ -768,12 +770,13 @@ static int encode_message(struct qaic_de }
for (i = 0; i < user_msg->count; ++i) { - if (user_len >= user_msg->len) { + if (user_len > user_msg->len - sizeof(*trans_hdr)) { ret = -EINVAL; break; } trans_hdr = (struct qaic_manage_trans_hdr *)(user_msg->data + user_len); - if (user_len + trans_hdr->len > user_msg->len) { + if (trans_hdr->len < sizeof(trans_hdr) || + size_add(user_len, trans_hdr->len) > user_msg->len) { ret = -EINVAL; break; }
From: Dan Carpenter dan.carpenter@linaro.org
commit 51b56382ed2a2b03347372272362b3baa623ed1e upstream.
Copy the bounds checking from encode_message() to decode_message().
This patch addresses the following concerns. Ensure that there is enough space for at least one header so that we don't have a negative size later.
if (msg_hdr_len < sizeof(*trans_hdr))
Ensure that we have enough space to read the next header from the msg->data.
if (msg_len > msg_hdr_len - sizeof(*trans_hdr)) return -EINVAL;
Check that the trans_hdr->len is not below the minimum size:
if (hdr_len < sizeof(*trans_hdr))
This minimum check ensures that we don't corrupt memory in decode_passthrough() when we do.
memcpy(out_trans->data, in_trans->data, len - sizeof(in_trans->hdr));
And finally, use size_add() to prevent an integer overflow:
if (size_add(msg_len, hdr_len) > msg_hdr_len)
Fixes: 129776ac2e38 ("accel/qaic: Add control path") Signed-off-by: Dan Carpenter dan.carpenter@linaro.org Reviewed-by: Pranjal Ramajor Asha Kanojiya quic_pkanojiy@quicinc.com Reviewed-by: Jeffrey Hugo quic_jhugo@quicinc.com Cc: stable@vger.kernel.org # 6.4.x Signed-off-by: Jeffrey Hugo quic_jhugo@quicinc.com Link: https://patchwork.freedesktop.org/patch/msgid/ZK0Q5nbLyDO7kJa+@moroto Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/accel/qaic/qaic_control.c | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-)
--- a/drivers/accel/qaic/qaic_control.c +++ b/drivers/accel/qaic/qaic_control.c @@ -959,15 +959,23 @@ static int decode_message(struct qaic_de int ret; int i;
- if (msg_hdr_len > QAIC_MANAGE_MAX_MSG_LENGTH) + if (msg_hdr_len < sizeof(*trans_hdr) || + msg_hdr_len > QAIC_MANAGE_MAX_MSG_LENGTH) return -EINVAL;
user_msg->len = 0; user_msg->count = le32_to_cpu(msg->hdr.count);
for (i = 0; i < user_msg->count; ++i) { + u32 hdr_len; + + if (msg_len > msg_hdr_len - sizeof(*trans_hdr)) + return -EINVAL; + trans_hdr = (struct wire_trans_hdr *)(msg->data + msg_len); - if (msg_len + le32_to_cpu(trans_hdr->len) > msg_hdr_len) + hdr_len = le32_to_cpu(trans_hdr->len); + if (hdr_len < sizeof(*trans_hdr) || + size_add(msg_len, hdr_len) > msg_hdr_len) return -EINVAL;
switch (le32_to_cpu(trans_hdr->type)) {
From: Dan Carpenter dan.carpenter@linaro.org
commit 47d87f71d00b7091b43a56f608f7151b33e5772e upstream.
The encode_dma() function has integer overflow checks. The encode_passthrough(), encode_activate() and encode_status() functions did not. I added integer overflow checking everywhere. I also updated the integer overflow checking in encode_dma() to use size_add() so everything is consistent.
Fixes: 129776ac2e38 ("accel/qaic: Add control path") Signed-off-by: Dan Carpenter dan.carpenter@linaro.org Reviewed-by: Pranjal Ramajor Asha Kanojiya quic_pkanojiy@quicinc.com Reviewed-by: Jeffrey Hugo quic_jhugo@quicinc.com Cc: stable@vger.kernel.org # 6.4.x [jhugo: tweak if in encode_dma() to match existing style] Signed-off-by: Jeffrey Hugo quic_jhugo@quicinc.com Link: https://patchwork.freedesktop.org/patch/msgid/ZK0Q7IsPkj6WSCcL@moroto Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/accel/qaic/qaic_control.c | 11 ++++------- 1 file changed, 4 insertions(+), 7 deletions(-)
--- a/drivers/accel/qaic/qaic_control.c +++ b/drivers/accel/qaic/qaic_control.c @@ -367,7 +367,7 @@ static int encode_passthrough(struct qai if (in_trans->hdr.len % 8 != 0) return -EINVAL;
- if (msg_hdr_len + in_trans->hdr.len > QAIC_MANAGE_EXT_MSG_LENGTH) + if (size_add(msg_hdr_len, in_trans->hdr.len) > QAIC_MANAGE_EXT_MSG_LENGTH) return -ENOSPC;
trans_wrapper = add_wrapper(wrappers, @@ -561,11 +561,8 @@ static int encode_dma(struct qaic_device msg = &wrapper->msg; msg_hdr_len = le32_to_cpu(msg->hdr.len);
- if (msg_hdr_len > (UINT_MAX - QAIC_MANAGE_EXT_MSG_LENGTH)) - return -EINVAL; - /* There should be enough space to hold at least one ASP entry. */ - if (msg_hdr_len + sizeof(*out_trans) + sizeof(struct wire_addr_size_pair) > + if (size_add(msg_hdr_len, sizeof(*out_trans) + sizeof(struct wire_addr_size_pair)) > QAIC_MANAGE_EXT_MSG_LENGTH) return -ENOMEM;
@@ -638,7 +635,7 @@ static int encode_activate(struct qaic_d msg = &wrapper->msg; msg_hdr_len = le32_to_cpu(msg->hdr.len);
- if (msg_hdr_len + sizeof(*out_trans) > QAIC_MANAGE_MAX_MSG_LENGTH) + if (size_add(msg_hdr_len, sizeof(*out_trans)) > QAIC_MANAGE_MAX_MSG_LENGTH) return -ENOSPC;
if (!in_trans->queue_size) @@ -722,7 +719,7 @@ static int encode_status(struct qaic_dev msg = &wrapper->msg; msg_hdr_len = le32_to_cpu(msg->hdr.len);
- if (msg_hdr_len + in_trans->hdr.len > QAIC_MANAGE_MAX_MSG_LENGTH) + if (size_add(msg_hdr_len, in_trans->hdr.len) > QAIC_MANAGE_MAX_MSG_LENGTH) return -ENOSPC;
trans_wrapper = add_wrapper(wrappers, sizeof(*trans_wrapper));
From: Ville Syrjälä ville.syrjala@linux.intel.com
commit 05abb3be91d8788328231ee02973ab3d47f5e3d2 upstream.
Currently dma_resv_get_fences() will leak the previously allocated array if the fence iteration got restarted and the krealloc_array() fails.
Free the old array by hand, and make sure we still clear the returned *fences so the caller won't end up accessing freed memory. Some (but not all) of the callers of dma_resv_get_fences() seem to still trawl through the array even when dma_resv_get_fences() failed. And let's zero out *num_fences as well for good measure.
Cc: Sumit Semwal sumit.semwal@linaro.org Cc: Christian König christian.koenig@amd.com Cc: linux-media@vger.kernel.org Cc: dri-devel@lists.freedesktop.org Cc: linaro-mm-sig@lists.linaro.org Fixes: d3c80698c9f5 ("dma-buf: use new iterator in dma_resv_get_fences v3") Signed-off-by: Ville Syrjälä ville.syrjala@linux.intel.com Reviewed-by: Christian König christian.koenig@amd.com Cc: stable@vger.kernel.org Link: https://patchwork.freedesktop.org/patch/msgid/20230713194745.1751-1-ville.sy... Signed-off-by: Christian König christian.koenig@amd.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/dma-buf/dma-resv.c | 13 +++++++++---- 1 file changed, 9 insertions(+), 4 deletions(-)
--- a/drivers/dma-buf/dma-resv.c +++ b/drivers/dma-buf/dma-resv.c @@ -571,6 +571,7 @@ int dma_resv_get_fences(struct dma_resv dma_resv_for_each_fence_unlocked(&cursor, fence) {
if (dma_resv_iter_is_restarted(&cursor)) { + struct dma_fence **new_fences; unsigned int count;
while (*num_fences) @@ -579,13 +580,17 @@ int dma_resv_get_fences(struct dma_resv count = cursor.num_fences + 1;
/* Eventually re-allocate the array */ - *fences = krealloc_array(*fences, count, - sizeof(void *), - GFP_KERNEL); - if (count && !*fences) { + new_fences = krealloc_array(*fences, count, + sizeof(void *), + GFP_KERNEL); + if (count && !new_fences) { + kfree(*fences); + *fences = NULL; + *num_fences = 0; dma_resv_iter_end(&cursor); return -ENOMEM; } + *fences = new_fences; }
(*fences)[(*num_fences)++] = dma_fence_get(fence);
From: Guchun Chen guchun.chen@amd.com
commit b42ae87a7b3878afaf4c3852ca66c025a5b996e0 upstream.
In below thousands of screen rotation loop tests with virtual display enabled, a CPU hard lockup issue may happen, leading system to unresponsive and crash.
do { xrandr --output Virtual --rotate inverted xrandr --output Virtual --rotate right xrandr --output Virtual --rotate left xrandr --output Virtual --rotate normal } while (1);
NMI watchdog: Watchdog detected hard LOCKUP on cpu 1
? hrtimer_run_softirq+0x140/0x140 ? store_vblank+0xe0/0xe0 [drm] hrtimer_cancel+0x15/0x30 amdgpu_vkms_disable_vblank+0x15/0x30 [amdgpu] drm_vblank_disable_and_save+0x185/0x1f0 [drm] drm_crtc_vblank_off+0x159/0x4c0 [drm] ? record_print_text.cold+0x11/0x11 ? wait_for_completion_timeout+0x232/0x280 ? drm_crtc_wait_one_vblank+0x40/0x40 [drm] ? bit_wait_io_timeout+0xe0/0xe0 ? wait_for_completion_interruptible+0x1d7/0x320 ? mutex_unlock+0x81/0xd0 amdgpu_vkms_crtc_atomic_disable
It's caused by a stuck in lock dependency in such scenario on different CPUs.
CPU1 CPU2 drm_crtc_vblank_off hrtimer_interrupt grab event_lock (irq disabled) __hrtimer_run_queues grab vbl_lock/vblank_time_block amdgpu_vkms_vblank_simulate amdgpu_vkms_disable_vblank drm_handle_vblank hrtimer_cancel grab dev->event_lock
So CPU1 stucks in hrtimer_cancel as timer callback is running endless on current clock base, as that timer queue on CPU2 has no chance to finish it because of failing to hold the lock. So NMI watchdog will throw the errors after its threshold, and all later CPUs are impacted/blocked.
So use hrtimer_try_to_cancel to fix this, as disable_vblank callback does not need to wait the handler to finish. And also it's not necessary to check the return value of hrtimer_try_to_cancel, because even if it's -1 which means current timer callback is running, it will be reprogrammed in hrtimer_start with calling enable_vblank to make it works.
v2: only re-arm timer when vblank is enabled (Christian) and add a Fixes tag as well
v3: drop warn printing (Christian)
v4: drop superfluous check of blank->enabled in timer function, as it's guaranteed in drm_handle_vblank (Christian)
Fixes: 84ec374bd580 ("drm/amdgpu: create amdgpu_vkms (v4)") Cc: stable@vger.kernel.org Suggested-by: Christian König christian.koenig@amd.com Signed-off-by: Guchun Chen guchun.chen@amd.com Reviewed-by: Christian König christian.koenig@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.c @@ -55,8 +55,9 @@ static enum hrtimer_restart amdgpu_vkms_ DRM_WARN("%s: vblank timer overrun\n", __func__);
ret = drm_crtc_handle_vblank(crtc); + /* Don't queue timer again when vblank is disabled. */ if (!ret) - DRM_ERROR("amdgpu_vkms failure on handling vblank"); + return HRTIMER_NORESTART;
return HRTIMER_RESTART; } @@ -81,7 +82,7 @@ static void amdgpu_vkms_disable_vblank(s { struct amdgpu_crtc *amdgpu_crtc = to_amdgpu_crtc(crtc);
- hrtimer_cancel(&amdgpu_crtc->vblank_timer); + hrtimer_try_to_cancel(&amdgpu_crtc->vblank_timer); }
static bool amdgpu_vkms_get_vblank_timestamp(struct drm_crtc *crtc,
From: Alex Deucher alexander.deucher@amd.com
commit a4eb11824170d742531998f4ebd1c6a18b63db47 upstream.
Use average gfxclock for consistency with other dGPUs.
Reviewed-by: Kenneth Feng kenneth.feng@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Cc: stable@vger.kernel.org # 6.1.x Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c +++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c @@ -1927,12 +1927,16 @@ static int sienna_cichlid_read_sensor(st *size = 4; break; case AMDGPU_PP_SENSOR_GFX_MCLK: - ret = sienna_cichlid_get_current_clk_freq_by_table(smu, SMU_UCLK, (uint32_t *)data); + ret = sienna_cichlid_get_smu_metrics_data(smu, + METRICS_CURR_UCLK, + (uint32_t *)data); *(uint32_t *)data *= 100; *size = 4; break; case AMDGPU_PP_SENSOR_GFX_SCLK: - ret = sienna_cichlid_get_current_clk_freq_by_table(smu, SMU_GFXCLK, (uint32_t *)data); + ret = sienna_cichlid_get_smu_metrics_data(smu, + METRICS_AVERAGE_GFXCLK, + (uint32_t *)data); *(uint32_t *)data *= 100; *size = 4; break;
From: Alex Deucher alexander.deucher@amd.com
commit 068c8bb10f37bb84824625dbbda053a3a3e0d6e1 upstream.
Use current uclk to be consistent with other dGPUs.
Reviewed-by: Kenneth Feng kenneth.feng@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Cc: stable@vger.kernel.org # 6.1.x Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_7_ppt.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_7_ppt.c +++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_7_ppt.c @@ -940,7 +940,7 @@ static int smu_v13_0_7_read_sensor(struc break; case AMDGPU_PP_SENSOR_GFX_MCLK: ret = smu_v13_0_7_get_smu_metrics_data(smu, - METRICS_AVERAGE_UCLK, + METRICS_CURR_UCLK, (uint32_t *)data); *(uint32_t *)data *= 100; *size = 4;
From: Ben Skeggs bskeggs@redhat.com
commit 2b5d1c29f6c4cb19369ef92881465e5ede75f4ef upstream.
Fixes crash on boards with ANX9805 TMDS/DP encoders.
Cc: stable@vger.kernel.org # 6.4+ Signed-off-by: Ben Skeggs bskeggs@redhat.com Reviewed-by: Karol Herbst kherbst@redhat.com Signed-off-by: Karol Herbst kherbst@redhat.com Link: https://patchwork.freedesktop.org/patch/msgid/20230719044051.6975-2-skeggsb@... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/nouveau/nvkm/engine/disp/uconn.c | 29 +++++++++++++++-------- 1 file changed, 19 insertions(+), 10 deletions(-)
--- a/drivers/gpu/drm/nouveau/nvkm/engine/disp/uconn.c +++ b/drivers/gpu/drm/nouveau/nvkm/engine/disp/uconn.c @@ -81,20 +81,29 @@ nvkm_uconn_uevent(struct nvkm_object *ob return -ENOSYS;
list_for_each_entry(outp, &conn->disp->outps, head) { - if (outp->info.connector == conn->index && outp->dp.aux) { - if (args->v0.types & NVIF_CONN_EVENT_V0_PLUG ) bits |= NVKM_I2C_PLUG; - if (args->v0.types & NVIF_CONN_EVENT_V0_UNPLUG) bits |= NVKM_I2C_UNPLUG; - if (args->v0.types & NVIF_CONN_EVENT_V0_IRQ ) bits |= NVKM_I2C_IRQ; - - return nvkm_uevent_add(uevent, &device->i2c->event, outp->dp.aux->id, bits, - nvkm_uconn_uevent_aux); - } + if (outp->info.connector == conn->index) + break; + } + + if (&outp->head == &conn->disp->outps) + return -EINVAL; + + if (outp->dp.aux && !outp->info.location) { + if (args->v0.types & NVIF_CONN_EVENT_V0_PLUG ) bits |= NVKM_I2C_PLUG; + if (args->v0.types & NVIF_CONN_EVENT_V0_UNPLUG) bits |= NVKM_I2C_UNPLUG; + if (args->v0.types & NVIF_CONN_EVENT_V0_IRQ ) bits |= NVKM_I2C_IRQ; + + return nvkm_uevent_add(uevent, &device->i2c->event, outp->dp.aux->id, bits, + nvkm_uconn_uevent_aux); }
if (args->v0.types & NVIF_CONN_EVENT_V0_PLUG ) bits |= NVKM_GPIO_HI; if (args->v0.types & NVIF_CONN_EVENT_V0_UNPLUG) bits |= NVKM_GPIO_LO; - if (args->v0.types & NVIF_CONN_EVENT_V0_IRQ) - return -EINVAL; + if (args->v0.types & NVIF_CONN_EVENT_V0_IRQ) { + /* TODO: support DP IRQ on ANX9805 and remove this hack. */ + if (!outp->info.location) + return -EINVAL; + }
return nvkm_uevent_add(uevent, &device->gpio->event, conn->info.hpd, bits, nvkm_uconn_uevent_gpio);
From: Ben Skeggs bskeggs@redhat.com
commit ea293f823a8805735d9e00124df81a8f448ed1ae upstream.
Fixes OOPS on boards with ANX9805 DP encoders.
Cc: stable@vger.kernel.org # 6.4+ Signed-off-by: Ben Skeggs bskeggs@redhat.com Reviewed-by: Karol Herbst kherbst@redhat.com Signed-off-by: Karol Herbst kherbst@redhat.com Link: https://patchwork.freedesktop.org/patch/msgid/20230719044051.6975-3-skeggsb@... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/nouveau/dispnv50/disp.c | 4 ++++ 1 file changed, 4 insertions(+)
--- a/drivers/gpu/drm/nouveau/dispnv50/disp.c +++ b/drivers/gpu/drm/nouveau/dispnv50/disp.c @@ -1873,6 +1873,8 @@ nv50_pior_destroy(struct drm_encoder *en nvif_outp_dtor(&nv_encoder->outp);
drm_encoder_cleanup(encoder); + + mutex_destroy(&nv_encoder->dp.hpd_irq_lock); kfree(encoder); }
@@ -1917,6 +1919,8 @@ nv50_pior_create(struct drm_connector *c nv_encoder->i2c = ddc; nv_encoder->aux = aux;
+ mutex_init(&nv_encoder->dp.hpd_irq_lock); + encoder = to_drm_encoder(nv_encoder); encoder->possible_crtcs = dcbe->heads; encoder->possible_clones = 0;
From: Ben Skeggs bskeggs@redhat.com
commit 752a281032b2d6f4564be827e082bde6f7d2fd4f upstream.
This was completely bogus before, using maximum DCB device index rather than maximum AUX ID to size the buffer that stores event refcounts.
*Pretty* unlikely to have been an actual problem on most configurations, that is, unless you've got one of the rare boards that have off-chip DP.
There, it'll likely crash.
Cc: stable@vger.kernel.org # 6.4+ Signed-off-by: Ben Skeggs bskeggs@redhat.com Reviewed-by: Karol Herbst kherbst@redhat.com Signed-off-by: Karol Herbst kherbst@redhat.com Link: https://patchwork.freedesktop.org/patch/msgid/20230719044051.6975-1-skeggsb@... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/nouveau/include/nvkm/subdev/i2c.h | 4 ++-- drivers/gpu/drm/nouveau/nvkm/subdev/i2c/base.c | 11 +++++++++-- 2 files changed, 11 insertions(+), 4 deletions(-)
diff --git a/drivers/gpu/drm/nouveau/include/nvkm/subdev/i2c.h b/drivers/gpu/drm/nouveau/include/nvkm/subdev/i2c.h index 40a1065ae626..ef441dfdea09 100644 --- a/drivers/gpu/drm/nouveau/include/nvkm/subdev/i2c.h +++ b/drivers/gpu/drm/nouveau/include/nvkm/subdev/i2c.h @@ -16,7 +16,7 @@ struct nvkm_i2c_bus { const struct nvkm_i2c_bus_func *func; struct nvkm_i2c_pad *pad; #define NVKM_I2C_BUS_CCB(n) /* 'n' is ccb index */ (n) -#define NVKM_I2C_BUS_EXT(n) /* 'n' is dcb external encoder type */ ((n) + 0x100) +#define NVKM_I2C_BUS_EXT(n) /* 'n' is dcb external encoder type */ ((n) + 0x10) #define NVKM_I2C_BUS_PRI /* ccb primary comm. port */ -1 #define NVKM_I2C_BUS_SEC /* ccb secondary comm. port */ -2 int id; @@ -38,7 +38,7 @@ struct nvkm_i2c_aux { const struct nvkm_i2c_aux_func *func; struct nvkm_i2c_pad *pad; #define NVKM_I2C_AUX_CCB(n) /* 'n' is ccb index */ (n) -#define NVKM_I2C_AUX_EXT(n) /* 'n' is dcb external encoder type */ ((n) + 0x100) +#define NVKM_I2C_AUX_EXT(n) /* 'n' is dcb external encoder type */ ((n) + 0x10) int id;
struct mutex mutex; diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/i2c/base.c b/drivers/gpu/drm/nouveau/nvkm/subdev/i2c/base.c index 976539de4220..731b2f68d3db 100644 --- a/drivers/gpu/drm/nouveau/nvkm/subdev/i2c/base.c +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/i2c/base.c @@ -260,10 +260,11 @@ nvkm_i2c_new_(const struct nvkm_i2c_func *func, struct nvkm_device *device, { struct nvkm_bios *bios = device->bios; struct nvkm_i2c *i2c; + struct nvkm_i2c_aux *aux; struct dcb_i2c_entry ccbE; struct dcb_output dcbE; u8 ver, hdr; - int ret, i; + int ret, i, ids;
if (!(i2c = *pi2c = kzalloc(sizeof(*i2c), GFP_KERNEL))) return -ENOMEM; @@ -406,5 +407,11 @@ nvkm_i2c_new_(const struct nvkm_i2c_func *func, struct nvkm_device *device, } }
- return nvkm_event_init(&nvkm_i2c_intr_func, &i2c->subdev, 4, i, &i2c->event); + ids = 0; + list_for_each_entry(aux, &i2c->aux, head) + ids = max(ids, aux->id + 1); + if (!ids) + return 0; + + return nvkm_event_init(&nvkm_i2c_intr_func, &i2c->subdev, 4, ids, &i2c->event); }
From: Jocelyn Falempe jfalempe@redhat.com
commit c2a88e8bdf5f6239948d75283d0ae7e0c7945b03 upstream.
dmt_mode is allocated and never freed in this function. It was found with the ast driver, but most drivers using generic fbdev setup are probably affected.
This fixes the following kmemleak report: backtrace: [<00000000b391296d>] drm_mode_duplicate+0x45/0x220 [drm] [<00000000e45bb5b3>] drm_client_target_cloned.constprop.0+0x27b/0x480 [drm] [<00000000ed2d3a37>] drm_client_modeset_probe+0x6bd/0xf50 [drm] [<0000000010e5cc9d>] __drm_fb_helper_initial_config_and_unlock+0xb4/0x2c0 [drm_kms_helper] [<00000000909f82ca>] drm_fbdev_client_hotplug+0x2bc/0x4d0 [drm_kms_helper] [<00000000063a69aa>] drm_client_register+0x169/0x240 [drm] [<00000000a8c61525>] ast_pci_probe+0x142/0x190 [ast] [<00000000987f19bb>] local_pci_probe+0xdc/0x180 [<000000004fca231b>] work_for_cpu_fn+0x4e/0xa0 [<0000000000b85301>] process_one_work+0x8b7/0x1540 [<000000003375b17c>] worker_thread+0x70a/0xed0 [<00000000b0d43cd9>] kthread+0x29f/0x340 [<000000008d770833>] ret_from_fork+0x1f/0x30 unreferenced object 0xff11000333089a00 (size 128):
cc: stable@vger.kernel.org Fixes: 1d42bbc8f7f9 ("drm/fbdev: fix cloning on fbcon") Reported-by: Zhang Yi yizhan@redhat.com Signed-off-by: Jocelyn Falempe jfalempe@redhat.com Reviewed-by: Javier Martinez Canillas javierm@redhat.com Reviewed-by: Thomas Zimmermann tzimmermann@suse.de Link: https://patchwork.freedesktop.org/patch/msgid/20230711092203.68157-2-jfalemp... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/drm_client_modeset.c | 5 +++++ 1 file changed, 5 insertions(+)
--- a/drivers/gpu/drm/drm_client_modeset.c +++ b/drivers/gpu/drm/drm_client_modeset.c @@ -311,6 +311,9 @@ static bool drm_client_target_cloned(str can_clone = true; dmt_mode = drm_mode_find_dmt(dev, 1024, 768, 60, false);
+ if (!dmt_mode) + goto fail; + for (i = 0; i < connector_count; i++) { if (!enabled[i]) continue; @@ -326,11 +329,13 @@ static bool drm_client_target_cloned(str if (!modes[i]) can_clone = false; } + kfree(dmt_mode);
if (can_clone) { DRM_DEBUG_KMS("can clone using 1024x768\n"); return true; } +fail: DRM_INFO("kms: can't enable cloning when we probably wanted to.\n"); return false; }
From: Jocelyn Falempe jfalempe@redhat.com
commit 2329cc7a101af1a844fbf706c0724c0baea38365 upstream.
When a new mode is set to modeset->mode, the previous mode should be freed. This fixes the following kmemleak report:
drm_mode_duplicate+0x45/0x220 [drm] drm_client_modeset_probe+0x944/0xf50 [drm] __drm_fb_helper_initial_config_and_unlock+0xb4/0x2c0 [drm_kms_helper] drm_fbdev_client_hotplug+0x2bc/0x4d0 [drm_kms_helper] drm_client_register+0x169/0x240 [drm] ast_pci_probe+0x142/0x190 [ast] local_pci_probe+0xdc/0x180 work_for_cpu_fn+0x4e/0xa0 process_one_work+0x8b7/0x1540 worker_thread+0x70a/0xed0 kthread+0x29f/0x340 ret_from_fork+0x1f/0x30
cc: stable@vger.kernel.org Reported-by: Zhang Yi yizhan@redhat.com Signed-off-by: Jocelyn Falempe jfalempe@redhat.com Reviewed-by: Javier Martinez Canillas javierm@redhat.com Reviewed-by: Thomas Zimmermann tzimmermann@suse.de Link: https://patchwork.freedesktop.org/patch/msgid/20230711092203.68157-3-jfalemp... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/drm_client_modeset.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/gpu/drm/drm_client_modeset.c +++ b/drivers/gpu/drm/drm_client_modeset.c @@ -867,6 +867,7 @@ int drm_client_modeset_probe(struct drm_ break; }
+ kfree(modeset->mode); modeset->mode = drm_mode_duplicate(dev, mode); drm_connector_get(connector); modeset->connectors[modeset->num_connectors++] = connector;
From: Simon Ser contact@emersion.fr
commit 1ca67aba8d11c2849d395013e1fdce02918d5657 upstream.
Up until now, amdgpu was silently degrading to vsync when user-space requested an async flip but the hardware didn't support it.
The hardware doesn't support immediate flips when the update changes the FB pitch, the DCC state, the rotation, enables or disables CRTCs or planes, etc. This is reflected in the dm_crtc_state.update_type field: UPDATE_TYPE_FAST means that immediate flip is supported.
Silently degrading async flips to vsync is not the expected behavior from a uAPI point-of-view. Xorg expects async flips to fail if unsupported, to be able to fall back to a blit. i915 already behaves this way.
This patch aligns amdgpu with uAPI expectations and returns a failure when an async flip is not possible.
Signed-off-by: Simon Ser contact@emersion.fr Reviewed-by: André Almeida andrealmeid@igalia.com Reviewed-by: Alex Deucher alexander.deucher@amd.com Reviewed-by: Harry Wentland harry.wentland@amd.com Signed-off-by: André Almeida andrealmeid@igalia.com Signed-off-by: Hamza Mahfooz hamza.mahfooz@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 8 ++++++++ drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_crtc.c | 12 ++++++++++++ 2 files changed, 20 insertions(+)
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c @@ -8055,7 +8055,15 @@ static void amdgpu_dm_commit_planes(stru * Only allow immediate flips for fast updates that don't * change memory domain, FB pitch, DCC state, rotation or * mirroring. + * + * dm_crtc_helper_atomic_check() only accepts async flips with + * fast updates. */ + if (crtc->state->async_flip && + acrtc_state->update_type != UPDATE_TYPE_FAST) + drm_warn_once(state->dev, + "[PLANE:%d:%s] async flip with non-fast update\n", + plane->base.id, plane->name); bundle->flip_addrs[planes_count].flip_immediate = crtc->state->async_flip && acrtc_state->update_type == UPDATE_TYPE_FAST && --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_crtc.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_crtc.c @@ -398,6 +398,18 @@ static int dm_crtc_helper_atomic_check(s return -EINVAL; }
+ /* + * Only allow async flips for fast updates that don't change the FB + * pitch, the DCC state, rotation, etc. + */ + if (crtc_state->async_flip && + dm_crtc_state->update_type != UPDATE_TYPE_FAST) { + drm_dbg_atomic(crtc->dev, + "[CRTC:%d:%s] async flips are only supported for fast updates\n", + crtc->base.id, crtc->name); + return -EINVAL; + } + /* In some use cases, like reset, no stream is attached */ if (!dm_crtc_state->stream) return 0;
From: Zhikai Zhai zhikai.zhai@amd.com
commit a460beefe77d780ac48f19d39333852a7f93ffc1 upstream.
[WHY] All of pipes will be used when the MPC split enable on the dcn which just has 2 pipes. Then MPO enter will trigger the minimal transition which need programe dcn from 2 pipes MPC split to 2 pipes MPO. This action will cause lag if happen frequently.
[HOW] Disable the MPC split for the platform which dcn resource is limited
Cc: Mario Limonciello mario.limonciello@amd.com Cc: Alex Deucher alexander.deucher@amd.com Cc: stable@vger.kernel.org Reviewed-by: Alvin Lee alvin.lee2@amd.com Acked-by: Alan Liu haoping.liu@amd.com Signed-off-by: Zhikai Zhai zhikai.zhai@amd.com Tested-by: Daniel Wheeler daniel.wheeler@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/display/dc/dcn303/dcn303_resource.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/gpu/drm/amd/display/dc/dcn303/dcn303_resource.c +++ b/drivers/gpu/drm/amd/display/dc/dcn303/dcn303_resource.c @@ -65,7 +65,7 @@ static const struct dc_debug_options deb .timing_trace = false, .clock_trace = true, .disable_pplib_clock_request = true, - .pipe_split_policy = MPC_SPLIT_DYNAMIC, + .pipe_split_policy = MPC_SPLIT_AVOID, .force_single_disp_pipe_split = false, .disable_dcc = DCC_ENABLE, .vsr_support = true,
From: Taimur Hassan syed.hassan@amd.com
commit 5a25cefc0920088bb9afafeb80ad3dcd84fe278b upstream.
[Why & How] If there is no TG allocation we can dereference a NULL pointer when checking if the TG is enabled.
Cc: Mario Limonciello mario.limonciello@amd.com Cc: Alex Deucher alexander.deucher@amd.com Cc: stable@vger.kernel.org Reviewed-by: Nicholas Kazlauskas nicholas.kazlauskas@amd.com Acked-by: Alan Liu haoping.liu@amd.com Signed-off-by: Taimur Hassan syed.hassan@amd.com Tested-by: Daniel Wheeler daniel.wheeler@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c +++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c @@ -3309,7 +3309,8 @@ void dcn10_wait_for_mpcc_disconnect( if (pipe_ctx->stream_res.opp->mpcc_disconnect_pending[mpcc_inst]) { struct hubp *hubp = get_hubp_by_inst(res_pool, mpcc_inst);
- if (pipe_ctx->stream_res.tg->funcs->is_tg_enabled(pipe_ctx->stream_res.tg)) + if (pipe_ctx->stream_res.tg && + pipe_ctx->stream_res.tg->funcs->is_tg_enabled(pipe_ctx->stream_res.tg)) res_pool->mpc->funcs->wait_for_idle(res_pool->mpc, mpcc_inst); pipe_ctx->stream_res.opp->mpcc_disconnect_pending[mpcc_inst] = false; hubp->funcs->set_blank(hubp, true);
From: Nicholas Kazlauskas nicholas.kazlauskas@amd.com
commit 2387ccf43e3c6cb5dbd757c5ef410cca9f14b971 upstream.
[Why & How] Port of a change that went into DCN314 to keep the PHY enabled when we have a connected and active DP display.
The PHY can hang if PHY refclk is disabled inadvertently.
Cc: Mario Limonciello mario.limonciello@amd.com Cc: Alex Deucher alexander.deucher@amd.com Cc: stable@vger.kernel.org Reviewed-by: Josip Pavic josip.pavic@amd.com Acked-by: Alan Liu haoping.liu@amd.com Signed-off-by: Nicholas Kazlauskas nicholas.kazlauskas@amd.com Tested-by: Daniel Wheeler daniel.wheeler@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/display/dc/clk_mgr/dcn31/dcn31_clk_mgr.c | 5 +++++ 1 file changed, 5 insertions(+)
--- a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn31/dcn31_clk_mgr.c +++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn31/dcn31_clk_mgr.c @@ -87,6 +87,11 @@ static int dcn31_get_active_display_cnt_ stream->signal == SIGNAL_TYPE_DVI_SINGLE_LINK || stream->signal == SIGNAL_TYPE_DVI_DUAL_LINK) tmds_present = true; + + /* Checking stream / link detection ensuring that PHY is active*/ + if (dc_is_dp_signal(stream->signal) && !stream->dpms_off) + display_count++; + }
for (i = 0; i < dc->link_count; i++) {
From: Matus Gajdos matuszpd@gmail.com
commit 269f399dc19f0e5c51711c3ba3bd06e0ef6ef403 upstream.
Otherwise bit clock remains running writing invalid data to the DAC.
Signed-off-by: Matus Gajdos matuszpd@gmail.com Acked-by: Shengjiu Wang shengjiu.wang@gmail.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230712124934.32232-1-matuszpd@gmail.com Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/soc/fsl/fsl_sai.c | 2 +- sound/soc/fsl/fsl_sai.h | 1 + 2 files changed, 2 insertions(+), 1 deletion(-)
--- a/sound/soc/fsl/fsl_sai.c +++ b/sound/soc/fsl/fsl_sai.c @@ -719,7 +719,7 @@ static void fsl_sai_config_disable(struc u32 xcsr, count = 100;
regmap_update_bits(sai->regmap, FSL_SAI_xCSR(tx, ofs), - FSL_SAI_CSR_TERE, 0); + FSL_SAI_CSR_TERE | FSL_SAI_CSR_BCE, 0);
/* TERE will remain set till the end of current frame */ do { --- a/sound/soc/fsl/fsl_sai.h +++ b/sound/soc/fsl/fsl_sai.h @@ -91,6 +91,7 @@ /* SAI Transmit/Receive Control Register */ #define FSL_SAI_CSR_TERE BIT(31) #define FSL_SAI_CSR_SE BIT(30) +#define FSL_SAI_CSR_BCE BIT(28) #define FSL_SAI_CSR_FR BIT(25) #define FSL_SAI_CSR_SR BIT(24) #define FSL_SAI_CSR_xF_SHIFT 16
From: Fabio Estevam festevam@denx.de
commit 86867aca7330e4fbcfa2a117e20b48bbb6c758a9 upstream.
This reverts commit ff87d619ac180444db297f043962a5c325ded47b.
Andreas reports that on an i.MX8MP-based system where MCLK needs to be used as an input, the MCLK pin is actually an output, despite not having the 'fsl,sai-mclk-direction-output' property present in the devicetree.
This is caused by commit ff87d619ac18 ("ASoC: fsl_sai: Enable MCTL_MCLK_EN bit for master mode") that sets FSL_SAI_MCTL_MCLK_EN unconditionally for imx8mm/8mn/8mp/93, causing the MCLK to always be configured as output.
FSL_SAI_MCTL_MCLK_EN corresponds to the MOE (MCLK Output Enable) bit of register MCR and the drivers sets it when the 'fsl,sai-mclk-direction-output' devicetree property is present.
Revert the commit to allow SAI to use MCLK as input as well.
Cc: stable@vger.kernel.org Fixes: ff87d619ac18 ("ASoC: fsl_sai: Enable MCTL_MCLK_EN bit for master mode") Reported-by: Andreas Henriksson andreas@fatal.se Signed-off-by: Fabio Estevam festevam@denx.de Acked-by: Shengjiu Wang shengjiu.wang@gmail.com Link: https://lore.kernel.org/r/20230706221827.1938990-1-festevam@gmail.com Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/soc/fsl/fsl_sai.c | 6 ------ 1 file changed, 6 deletions(-)
--- a/sound/soc/fsl/fsl_sai.c +++ b/sound/soc/fsl/fsl_sai.c @@ -507,12 +507,6 @@ static int fsl_sai_set_bclk(struct snd_s savediv / 2 - 1); }
- if (sai->soc_data->max_register >= FSL_SAI_MCTL) { - /* SAI is in master mode at this point, so enable MCLK */ - regmap_update_bits(sai->regmap, FSL_SAI_MCTL, - FSL_SAI_MCTL_MCLK_EN, FSL_SAI_MCTL_MCLK_EN); - } - return 0; }
From: Sheetal sheetal@nvidia.com
commit 6dfe70be0b0dec0f9297811501bec26c05fd96ad upstream.
Byte mask for channel-1 of stream-1 is not getting enabled and this causes failures during ADX use cases. This happens because the byte map value 0 matches the byte map array and put() callback returns without enabling the corresponding bits in the byte mask.
ADX supports 4 output streams and each stream can have a maximum of 16 channels. Each byte in the input frame is uniquely mapped to a byte in one of these 4 outputs. This mapping is done with the help of byte map array via user space control setting. The byte map array size in the driver is 16 and each array element is of size 4 bytes. This corresponds to 64 byte map values.
Each byte in the byte map array can have any value between 0 to 255 to enable the corresponding bits in the byte mask. The value 256 is used as a way to disable the byte map. However the byte map array element cannot store this value. The put() callback disables the byte mask for 256 value and byte map value is reset to 0 for this case. This causes problems during subsequent runs since put() callback, for value of 0, just returns without enabling the byte mask. In short, the problem is coming because 0 and 256 control values are stored as 0 in the byte map array.
Right now fix the put() callback by actually looking at the byte mask array state to identify if any change is needed and update the fields accordingly. The get() callback needs an update as well to return the correct control value that user has set before. Note that when user set 256, the value is stored as 0 and byte mask is disabled. So byte mask state is used to either return 256 or the value from byte map array.
Given above, this looks bit complicated and all this happens because the byte map array is tightly packed and cannot actually store the 256 value. Right now the priority is to fix the existing failure and a TODO item is put to improve this logic.
Fixes: 3c97881b8c8a ("ASoC: tegra: Fix kcontrol put callback in ADX") Cc: stable@vger.kernel.org Signed-off-by: Sheetal sheetal@nvidia.com Reviewed-by: Mohan Kumar D mkumard@nvidia.com Reviewed-by: Sameer Pujar spujar@nvidia.com Link: https://lore.kernel.org/r/1688015537-31682-3-git-send-email-spujar@nvidia.co... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/soc/tegra/tegra210_adx.c | 34 ++++++++++++++++++++++------------ 1 file changed, 22 insertions(+), 12 deletions(-)
--- a/sound/soc/tegra/tegra210_adx.c +++ b/sound/soc/tegra/tegra210_adx.c @@ -2,7 +2,7 @@ // // tegra210_adx.c - Tegra210 ADX driver // -// Copyright (c) 2021 NVIDIA CORPORATION. All rights reserved. +// Copyright (c) 2021-2023 NVIDIA CORPORATION. All rights reserved.
#include <linux/clk.h> #include <linux/device.h> @@ -175,10 +175,20 @@ static int tegra210_adx_get_byte_map(str mc = (struct soc_mixer_control *)kcontrol->private_value; enabled = adx->byte_mask[mc->reg / 32] & (1 << (mc->reg % 32));
+ /* + * TODO: Simplify this logic to just return from bytes_map[] + * + * Presently below is required since bytes_map[] is + * tightly packed and cannot store the control value of 256. + * Byte mask state is used to know if 256 needs to be returned. + * Note that for control value of 256, the put() call stores 0 + * in the bytes_map[] and disables the corresponding bit in + * byte_mask[]. + */ if (enabled) ucontrol->value.integer.value[0] = bytes_map[mc->reg]; else - ucontrol->value.integer.value[0] = 0; + ucontrol->value.integer.value[0] = 256;
return 0; } @@ -192,19 +202,19 @@ static int tegra210_adx_put_byte_map(str int value = ucontrol->value.integer.value[0]; struct soc_mixer_control *mc = (struct soc_mixer_control *)kcontrol->private_value; + unsigned int mask_val = adx->byte_mask[mc->reg / 32];
- if (value == bytes_map[mc->reg]) + if (value >= 0 && value <= 255) + mask_val |= (1 << (mc->reg % 32)); + else + mask_val &= ~(1 << (mc->reg % 32)); + + if (mask_val == adx->byte_mask[mc->reg / 32]) return 0;
- if (value >= 0 && value <= 255) { - /* update byte map and enable slot */ - bytes_map[mc->reg] = value; - adx->byte_mask[mc->reg / 32] |= (1 << (mc->reg % 32)); - } else { - /* reset byte map and disable slot */ - bytes_map[mc->reg] = 0; - adx->byte_mask[mc->reg / 32] &= ~(1 << (mc->reg % 32)); - } + /* Update byte map and slot */ + bytes_map[mc->reg] = value % 256; + adx->byte_mask[mc->reg / 32] = mask_val;
return 1; }
From: Sameer Pujar spujar@nvidia.com
commit 70a6404ff610aa4889d98977da131c37f9ff9d1f upstream.
Following prints are observed while testing audio on Jetson AGX Orin which has onboard RT5640 audio codec:
BUG: sleeping function called from invalid context at kernel/workqueue.c:3027 in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 0, name: swapper/0 preempt_count: 10001, expected: 0 RCU nest depth: 0, expected: 0 ------------[ cut here ]------------ WARNING: CPU: 0 PID: 0 at kernel/irq/handle.c:159 __handle_irq_event_percpu+0x1e0/0x270 ---[ end trace ad1c64905aac14a6 ]-
The IRQ handler rt5640_irq() runs in interrupt context and can sleep during cancel_delayed_work_sync().
Fix this by running IRQ handler, rt5640_irq(), in thread context. Hence replace request_irq() calls with devm_request_threaded_irq().
Fixes: 051dade34695 ("ASoC: rt5640: Fix the wrong state of JD1 and JD2") Cc: stable@vger.kernel.org Cc: Oder Chiou oder_chiou@realtek.com Signed-off-by: Sameer Pujar spujar@nvidia.com Link: https://lore.kernel.org/r/1688015537-31682-4-git-send-email-spujar@nvidia.co... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/soc/codecs/rt5640.c | 12 +++++++----- 1 file changed, 7 insertions(+), 5 deletions(-)
--- a/sound/soc/codecs/rt5640.c +++ b/sound/soc/codecs/rt5640.c @@ -2567,9 +2567,10 @@ static void rt5640_enable_jack_detect(st if (jack_data && jack_data->use_platform_clock) rt5640->use_platform_clock = jack_data->use_platform_clock;
- ret = request_irq(rt5640->irq, rt5640_irq, - IRQF_TRIGGER_RISING | IRQF_TRIGGER_FALLING | IRQF_ONESHOT, - "rt5640", rt5640); + ret = devm_request_threaded_irq(component->dev, rt5640->irq, + NULL, rt5640_irq, + IRQF_TRIGGER_RISING | IRQF_TRIGGER_FALLING | IRQF_ONESHOT, + "rt5640", rt5640); if (ret) { dev_warn(component->dev, "Failed to reguest IRQ %d: %d\n", rt5640->irq, ret); rt5640_disable_jack_detect(component); @@ -2622,8 +2623,9 @@ static void rt5640_enable_hda_jack_detec
rt5640->jack = jack;
- ret = request_irq(rt5640->irq, rt5640_irq, - IRQF_TRIGGER_RISING | IRQF_ONESHOT, "rt5640", rt5640); + ret = devm_request_threaded_irq(component->dev, rt5640->irq, + NULL, rt5640_irq, IRQF_TRIGGER_RISING | IRQF_ONESHOT, + "rt5640", rt5640); if (ret) { dev_warn(component->dev, "Failed to reguest IRQ %d: %d\n", rt5640->irq, ret); rt5640->irq = -ENXIO;
From: Thomas Petazzoni thomas.petazzoni@bootlin.com
commit e51df4f81b02bcdd828a04de7c1eb6a92988b61e upstream.
In commit 2cb1e0259f50 ("ASoC: cs42l51: re-hook of_match_table pointer"), 9 years ago, some random guy fixed the cs42l51 after it was split into a core part and an I2C part to properly match based on a Device Tree compatible string.
However, the fix in this commit is wrong: the MODULE_DEVICE_TABLE(of, ....) is in the core part of the driver, not the I2C part. Therefore, automatic module loading based on module.alias, based on matching with the DT compatible string, loads the core part of the driver, but not the I2C part. And threfore, the i2c_driver is not registered, and the codec is not known to the system, nor matched with a DT node with the corresponding compatible string.
In order to fix that, we move the MODULE_DEVICE_TABLE(of, ...) into the I2C part of the driver. The cs42l51_of_match[] array is also moved as well, as it is not possible to have this definition in one file, and the MODULE_DEVICE_TABLE(of, ...) invocation in another file, due to how MODULE_DEVICE_TABLE works.
Thanks to this commit, the I2C part of the driver now properly autoloads, and thanks to its dependency on the core part, the core part gets autoloaded as well, resulting in a functional sound card without having to manually load kernel modules.
Fixes: 2cb1e0259f50 ("ASoC: cs42l51: re-hook of_match_table pointer") Cc: stable@vger.kernel.org Signed-off-by: Thomas Petazzoni thomas.petazzoni@bootlin.com Link: https://lore.kernel.org/r/20230713112112.778576-1-thomas.petazzoni@bootlin.c... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/soc/codecs/cs42l51-i2c.c | 6 ++++++ sound/soc/codecs/cs42l51.c | 7 ------- sound/soc/codecs/cs42l51.h | 1 - 3 files changed, 6 insertions(+), 8 deletions(-)
--- a/sound/soc/codecs/cs42l51-i2c.c +++ b/sound/soc/codecs/cs42l51-i2c.c @@ -19,6 +19,12 @@ static struct i2c_device_id cs42l51_i2c_ }; MODULE_DEVICE_TABLE(i2c, cs42l51_i2c_id);
+const struct of_device_id cs42l51_of_match[] = { + { .compatible = "cirrus,cs42l51", }, + { } +}; +MODULE_DEVICE_TABLE(of, cs42l51_of_match); + static int cs42l51_i2c_probe(struct i2c_client *i2c) { struct regmap_config config; --- a/sound/soc/codecs/cs42l51.c +++ b/sound/soc/codecs/cs42l51.c @@ -826,13 +826,6 @@ int __maybe_unused cs42l51_resume(struct } EXPORT_SYMBOL_GPL(cs42l51_resume);
-const struct of_device_id cs42l51_of_match[] = { - { .compatible = "cirrus,cs42l51", }, - { } -}; -MODULE_DEVICE_TABLE(of, cs42l51_of_match); -EXPORT_SYMBOL_GPL(cs42l51_of_match); - MODULE_AUTHOR("Arnaud Patard arnaud.patard@rtp-net.org"); MODULE_DESCRIPTION("Cirrus Logic CS42L51 ALSA SoC Codec Driver"); MODULE_LICENSE("GPL"); --- a/sound/soc/codecs/cs42l51.h +++ b/sound/soc/codecs/cs42l51.h @@ -16,7 +16,6 @@ int cs42l51_probe(struct device *dev, st void cs42l51_remove(struct device *dev); int __maybe_unused cs42l51_suspend(struct device *dev); int __maybe_unused cs42l51_resume(struct device *dev); -extern const struct of_device_id cs42l51_of_match[];
#define CS42L51_CHIP_ID 0x1B #define CS42L51_CHIP_REV_A 0x00
From: Johan Hovold johan+linaro@kernel.org
commit ed0dd9205bf69593edb495cb4b086dbae96a3f05 upstream.
Allocation of the clash control structure may fail so add the missing error handling to avoid dereferencing an error pointer.
Fixes: 8d78602aa87a ("ASoC: codecs: wcd938x: add basic driver") Cc: stable@vger.kernel.org # 5.14 Cc: Srinivas Kandagatla srinivas.kandagatla@linaro.org Signed-off-by: Johan Hovold johan+linaro@kernel.org Reviewed-by: Srinivas Kandagatla srinivas.kandagatla@linaro.org Link: https://lore.kernel.org/r/20230705123018.30903-4-johan+linaro@kernel.org Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/soc/codecs/wcd938x.c | 4 ++++ 1 file changed, 4 insertions(+)
--- a/sound/soc/codecs/wcd938x.c +++ b/sound/soc/codecs/wcd938x.c @@ -3090,6 +3090,10 @@ static int wcd938x_soc_codec_probe(struc WCD938X_ID_MASK);
wcd938x->clsh_info = wcd_clsh_ctrl_alloc(component, WCD938X); + if (IS_ERR(wcd938x->clsh_info)) { + pm_runtime_put(dev); + return PTR_ERR(wcd938x->clsh_info); + }
wcd938x_io_init(wcd938x); /* Set all interrupts as edge triggered */
From: Nathan Chancellor nathan@kernel.org
commit d9ba2975e98a4bec0a9f8d4be4c1de8883fccb71 upstream.
After commit 6085f9e6dc19 ("ASoC: cs35l45: IRQ support"), without any other configuration that selects CONFIG_REGMAP_IRQ, modpost errors out with:
ERROR: modpost: "regmap_irq_get_virq" [sound/soc/codecs/snd-soc-cs35l45.ko] undefined! ERROR: modpost: "devm_regmap_add_irq_chip" [sound/soc/codecs/snd-soc-cs35l45.ko] undefined!
Add the Kconfig selection to ensure these functions get built and included, which resolves the build failure.
Cc: stable@vger.kernel.org Fixes: 6085f9e6dc19 ("ASoC: cs35l45: IRQ support") Reported-by: Marcus Seyfarth m.seyfarth@gmail.com Closes: https://github.com/ClangBuiltLinux/linux/issues/1882 Signed-off-by: Nathan Chancellor nathan@kernel.org Link: https://lore.kernel.org/r/20230703-cs35l45-select-regmap_irq-v1-1-37d7e838b6... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/soc/codecs/Kconfig | 1 + 1 file changed, 1 insertion(+)
--- a/sound/soc/codecs/Kconfig +++ b/sound/soc/codecs/Kconfig @@ -701,6 +701,7 @@ config SND_SOC_CS35L41_I2C
config SND_SOC_CS35L45 tristate + select REGMAP_IRQ
config SND_SOC_CS35L45_SPI tristate "Cirrus Logic CS35L45 CODEC (SPI)"
From: Johan Hovold johan+linaro@kernel.org
commit a5475829adcc600bc69ee9ff7c9e3e43fb4f8d30 upstream.
The MBHC resources must be released on component probe failure and removal so can not be tied to the lifetime of the component device.
This is specifically needed to allow probe deferrals of the sound card which otherwise fails when reprobing the codec component:
snd-sc8280xp sound: ASoC: failed to instantiate card -517 genirq: Flags mismatch irq 299. 00002001 (mbhc sw intr) vs. 00002001 (mbhc sw intr) wcd938x_codec audio-codec: Failed to request mbhc interrupts -16 wcd938x_codec audio-codec: mbhc initialization failed wcd938x_codec audio-codec: ASoC: error at snd_soc_component_probe on audio-codec: -16 snd-sc8280xp sound: ASoC: failed to instantiate card -16
Fixes: 0e5c9e7ff899 ("ASoC: codecs: wcd: add multi button Headset detection support") Cc: stable@vger.kernel.org # 5.14 Cc: Srinivas Kandagatla srinivas.kandagatla@linaro.org Signed-off-by: Johan Hovold johan+linaro@kernel.org Reviewed-by: Srinivas Kandagatla srinivas.kandagatla@linaro.org Link: https://lore.kernel.org/r/20230705123018.30903-7-johan+linaro@kernel.org Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/soc/codecs/wcd-mbhc-v2.c | 57 +++++++++++++++++++++++++++++------------ 1 file changed, 41 insertions(+), 16 deletions(-)
--- a/sound/soc/codecs/wcd-mbhc-v2.c +++ b/sound/soc/codecs/wcd-mbhc-v2.c @@ -1454,7 +1454,7 @@ struct wcd_mbhc *wcd_mbhc_init(struct sn return ERR_PTR(-EINVAL); }
- mbhc = devm_kzalloc(dev, sizeof(*mbhc), GFP_KERNEL); + mbhc = kzalloc(sizeof(*mbhc), GFP_KERNEL); if (!mbhc) return ERR_PTR(-ENOMEM);
@@ -1474,61 +1474,76 @@ struct wcd_mbhc *wcd_mbhc_init(struct sn
INIT_WORK(&mbhc->correct_plug_swch, wcd_correct_swch_plug);
- ret = devm_request_threaded_irq(dev, mbhc->intr_ids->mbhc_sw_intr, NULL, + ret = request_threaded_irq(mbhc->intr_ids->mbhc_sw_intr, NULL, wcd_mbhc_mech_plug_detect_irq, IRQF_ONESHOT | IRQF_TRIGGER_RISING, "mbhc sw intr", mbhc); if (ret) - goto err; + goto err_free_mbhc;
- ret = devm_request_threaded_irq(dev, mbhc->intr_ids->mbhc_btn_press_intr, NULL, + ret = request_threaded_irq(mbhc->intr_ids->mbhc_btn_press_intr, NULL, wcd_mbhc_btn_press_handler, IRQF_ONESHOT | IRQF_TRIGGER_RISING, "Button Press detect", mbhc); if (ret) - goto err; + goto err_free_sw_intr;
- ret = devm_request_threaded_irq(dev, mbhc->intr_ids->mbhc_btn_release_intr, NULL, + ret = request_threaded_irq(mbhc->intr_ids->mbhc_btn_release_intr, NULL, wcd_mbhc_btn_release_handler, IRQF_ONESHOT | IRQF_TRIGGER_RISING, "Button Release detect", mbhc); if (ret) - goto err; + goto err_free_btn_press_intr;
- ret = devm_request_threaded_irq(dev, mbhc->intr_ids->mbhc_hs_ins_intr, NULL, + ret = request_threaded_irq(mbhc->intr_ids->mbhc_hs_ins_intr, NULL, wcd_mbhc_adc_hs_ins_irq, IRQF_ONESHOT | IRQF_TRIGGER_RISING, "Elect Insert", mbhc); if (ret) - goto err; + goto err_free_btn_release_intr;
disable_irq_nosync(mbhc->intr_ids->mbhc_hs_ins_intr);
- ret = devm_request_threaded_irq(dev, mbhc->intr_ids->mbhc_hs_rem_intr, NULL, + ret = request_threaded_irq(mbhc->intr_ids->mbhc_hs_rem_intr, NULL, wcd_mbhc_adc_hs_rem_irq, IRQF_ONESHOT | IRQF_TRIGGER_RISING, "Elect Remove", mbhc); if (ret) - goto err; + goto err_free_hs_ins_intr;
disable_irq_nosync(mbhc->intr_ids->mbhc_hs_rem_intr);
- ret = devm_request_threaded_irq(dev, mbhc->intr_ids->hph_left_ocp, NULL, + ret = request_threaded_irq(mbhc->intr_ids->hph_left_ocp, NULL, wcd_mbhc_hphl_ocp_irq, IRQF_ONESHOT | IRQF_TRIGGER_RISING, "HPH_L OCP detect", mbhc); if (ret) - goto err; + goto err_free_hs_rem_intr;
- ret = devm_request_threaded_irq(dev, mbhc->intr_ids->hph_right_ocp, NULL, + ret = request_threaded_irq(mbhc->intr_ids->hph_right_ocp, NULL, wcd_mbhc_hphr_ocp_irq, IRQF_ONESHOT | IRQF_TRIGGER_RISING, "HPH_R OCP detect", mbhc); if (ret) - goto err; + goto err_free_hph_left_ocp;
return mbhc; -err: + +err_free_hph_left_ocp: + free_irq(mbhc->intr_ids->hph_left_ocp, mbhc); +err_free_hs_rem_intr: + free_irq(mbhc->intr_ids->mbhc_hs_rem_intr, mbhc); +err_free_hs_ins_intr: + free_irq(mbhc->intr_ids->mbhc_hs_ins_intr, mbhc); +err_free_btn_release_intr: + free_irq(mbhc->intr_ids->mbhc_btn_release_intr, mbhc); +err_free_btn_press_intr: + free_irq(mbhc->intr_ids->mbhc_btn_press_intr, mbhc); +err_free_sw_intr: + free_irq(mbhc->intr_ids->mbhc_sw_intr, mbhc); +err_free_mbhc: + kfree(mbhc); + dev_err(dev, "Failed to request mbhc interrupts %d\n", ret);
return ERR_PTR(ret); @@ -1537,9 +1552,19 @@ EXPORT_SYMBOL(wcd_mbhc_init);
void wcd_mbhc_deinit(struct wcd_mbhc *mbhc) { + free_irq(mbhc->intr_ids->hph_right_ocp, mbhc); + free_irq(mbhc->intr_ids->hph_left_ocp, mbhc); + free_irq(mbhc->intr_ids->mbhc_hs_rem_intr, mbhc); + free_irq(mbhc->intr_ids->mbhc_hs_ins_intr, mbhc); + free_irq(mbhc->intr_ids->mbhc_btn_release_intr, mbhc); + free_irq(mbhc->intr_ids->mbhc_btn_press_intr, mbhc); + free_irq(mbhc->intr_ids->mbhc_sw_intr, mbhc); + mutex_lock(&mbhc->lock); wcd_cancel_hs_detect_plug(mbhc, &mbhc->correct_plug_swch); mutex_unlock(&mbhc->lock); + + kfree(mbhc); } EXPORT_SYMBOL(wcd_mbhc_deinit);
From: Johan Hovold johan+linaro@kernel.org
commit 46ec420573cefa1fc98025e7e6841bdafd6f1e20 upstream.
Propagate errors when failing to load the topology component so that probe deferrals can be handled.
Fixes: 36ad9bf1d93d ("ASoC: qdsp6: audioreach: add topology support") Cc: stable@vger.kernel.org # 5.17 Cc: Srinivas Kandagatla srinivas.kandagatla@linaro.org Signed-off-by: Johan Hovold johan+linaro@kernel.org Reviewed-by: Srinivas Kandagatla srinivas.kandagatla@linaro.org Link: https://lore.kernel.org/r/20230705123018.30903-3-johan+linaro@kernel.org Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/soc/qcom/qdsp6/topology.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/sound/soc/qcom/qdsp6/topology.c +++ b/sound/soc/qcom/qdsp6/topology.c @@ -1277,8 +1277,8 @@ int audioreach_tplg_init(struct snd_soc_
ret = snd_soc_tplg_component_load(component, &audioreach_tplg_ops, fw); if (ret < 0) { - dev_err(dev, "tplg component load failed%d\n", ret); - ret = -EINVAL; + if (ret != -EPROBE_DEFER) + dev_err(dev, "tplg component load failed: %d\n", ret); }
release_firmware(fw);
From: Sheetal sheetal@nvidia.com
commit 49bd7b08149417a30aa7d92c8c85b3518de44a76 upstream.
Byte mask for channel-1 of stream-1 is not getting enabled and this causes failures during AMX use cases. This happens because the byte map value 0 matches the byte map array and put() callback returns without enabling the corresponding bits in the byte mask.
AMX supports 4 input streams and each stream can take a maximum of 16 channels. Each byte in the output frame is uniquely mapped to a byte in one of these 4 inputs. This mapping is done with the help of byte map array via user space control setting. The byte map array size in the driver is 16 and each array element is of size 4 bytes. This corresponds to 64 byte map values.
Each byte in the byte map array can have any value between 0 to 255 to enable the corresponding bits in the byte mask. The value 256 is used as a way to disable the byte map. However the byte map array element cannot store this value. The put() callback disables the byte mask for 256 value and byte map value is reset to 0 for this case. This causes problems during subsequent runs since put() callback, for value of 0, just returns without enabling the byte mask. In short, the problem is coming because 0 and 256 control values are stored as 0 in the byte map array.
Right now fix the put() callback by actually looking at the byte mask array state to identify if any change is needed and update the fields accordingly. The get() callback needs an update as well to return the correct control value that user has set before. Note that when user sets 256, the value is stored as 0 and byte mask is disabled. So byte mask state is used to either return 256 or the value from byte map array.
Given above, this looks bit complicated and all this happens because the byte map array is tightly packed and cannot actually store the 256 value. Right now the priority is to fix the existing failure and a TODO item is put to improve this logic.
Fixes: 8db78ace1ba8 ("ASoC: tegra: Fix kcontrol put callback in AMX") Cc: stable@vger.kernel.org Signed-off-by: Sheetal sheetal@nvidia.com Reviewed-by: Mohan Kumar D mkumard@nvidia.com Reviewed-by: Sameer Pujar spujar@nvidia.com Link: https://lore.kernel.org/r/1688015537-31682-2-git-send-email-spujar@nvidia.co... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/soc/tegra/tegra210_amx.c | 40 ++++++++++++++++++++++------------------ 1 file changed, 22 insertions(+), 18 deletions(-)
--- a/sound/soc/tegra/tegra210_amx.c +++ b/sound/soc/tegra/tegra210_amx.c @@ -2,7 +2,7 @@ // // tegra210_amx.c - Tegra210 AMX driver // -// Copyright (c) 2021 NVIDIA CORPORATION. All rights reserved. +// Copyright (c) 2021-2023 NVIDIA CORPORATION. All rights reserved.
#include <linux/clk.h> #include <linux/device.h> @@ -203,10 +203,20 @@ static int tegra210_amx_get_byte_map(str else enabled = amx->byte_mask[0] & (1 << reg);
+ /* + * TODO: Simplify this logic to just return from bytes_map[] + * + * Presently below is required since bytes_map[] is + * tightly packed and cannot store the control value of 256. + * Byte mask state is used to know if 256 needs to be returned. + * Note that for control value of 256, the put() call stores 0 + * in the bytes_map[] and disables the corresponding bit in + * byte_mask[]. + */ if (enabled) ucontrol->value.integer.value[0] = bytes_map[reg]; else - ucontrol->value.integer.value[0] = 0; + ucontrol->value.integer.value[0] = 256;
return 0; } @@ -221,25 +231,19 @@ static int tegra210_amx_put_byte_map(str unsigned char *bytes_map = (unsigned char *)&amx->map; int reg = mc->reg; int value = ucontrol->value.integer.value[0]; + unsigned int mask_val = amx->byte_mask[reg / 32];
- if (value == bytes_map[reg]) + if (value >= 0 && value <= 255) + mask_val |= (1 << (reg % 32)); + else + mask_val &= ~(1 << (reg % 32)); + + if (mask_val == amx->byte_mask[reg / 32]) return 0;
- if (value >= 0 && value <= 255) { - /* Update byte map and enable slot */ - bytes_map[reg] = value; - if (reg > 31) - amx->byte_mask[1] |= (1 << (reg - 32)); - else - amx->byte_mask[0] |= (1 << reg); - } else { - /* Reset byte map and disable slot */ - bytes_map[reg] = 0; - if (reg > 31) - amx->byte_mask[1] &= ~(1 << (reg - 32)); - else - amx->byte_mask[0] &= ~(1 << reg); - } + /* Update byte map and slot */ + bytes_map[reg] = value % 256; + amx->byte_mask[reg / 32] = mask_val;
return 1; }
From: Johan Hovold johan+linaro@kernel.org
commit a3406f87775fee986876e03f93a84385f54d5999 upstream.
Make sure to release allocated resources on component probe failure and on remove.
This is specifically needed to allow probe deferrals of the sound card which otherwise fails when reprobing the codec component:
snd-sc8280xp sound: ASoC: failed to instantiate card -517 genirq: Flags mismatch irq 289. 00002001 (HPHR PDM WD INT) vs. 00002001 (HPHR PDM WD INT) wcd938x_codec audio-codec: Failed to request HPHR WD interrupt (-16) genirq: Flags mismatch irq 290. 00002001 (HPHL PDM WD INT) vs. 00002001 (HPHL PDM WD INT) wcd938x_codec audio-codec: Failed to request HPHL WD interrupt (-16) genirq: Flags mismatch irq 291. 00002001 (AUX PDM WD INT) vs. 00002001 (AUX PDM WD INT) wcd938x_codec audio-codec: Failed to request Aux WD interrupt (-16) genirq: Flags mismatch irq 292. 00002001 (mbhc sw intr) vs. 00002001 (mbhc sw intr) wcd938x_codec audio-codec: Failed to request mbhc interrupts -16
Fixes: 8d78602aa87a ("ASoC: codecs: wcd938x: add basic driver") Cc: stable@vger.kernel.org # 5.14 Cc: Srinivas Kandagatla srinivas.kandagatla@linaro.org Signed-off-by: Johan Hovold johan+linaro@kernel.org Reviewed-by: Srinivas Kandagatla srinivas.kandagatla@linaro.org Link: https://lore.kernel.org/r/20230705123018.30903-5-johan+linaro@kernel.org Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/soc/codecs/wcd938x.c | 55 +++++++++++++++++++++++++++++++++++++++------ 1 file changed, 48 insertions(+), 7 deletions(-)
--- a/sound/soc/codecs/wcd938x.c +++ b/sound/soc/codecs/wcd938x.c @@ -2633,6 +2633,14 @@ static int wcd938x_mbhc_init(struct snd_
return 0; } + +static void wcd938x_mbhc_deinit(struct snd_soc_component *component) +{ + struct wcd938x_priv *wcd938x = snd_soc_component_get_drvdata(component); + + wcd_mbhc_deinit(wcd938x->wcd_mbhc); +} + /* END MBHC */
static const struct snd_kcontrol_new wcd938x_snd_controls[] = { @@ -3113,20 +3121,26 @@ static int wcd938x_soc_codec_probe(struc ret = request_threaded_irq(wcd938x->hphr_pdm_wd_int, NULL, wcd938x_wd_handle_irq, IRQF_ONESHOT | IRQF_TRIGGER_RISING, "HPHR PDM WD INT", wcd938x); - if (ret) + if (ret) { dev_err(dev, "Failed to request HPHR WD interrupt (%d)\n", ret); + goto err_free_clsh_ctrl; + }
ret = request_threaded_irq(wcd938x->hphl_pdm_wd_int, NULL, wcd938x_wd_handle_irq, IRQF_ONESHOT | IRQF_TRIGGER_RISING, "HPHL PDM WD INT", wcd938x); - if (ret) + if (ret) { dev_err(dev, "Failed to request HPHL WD interrupt (%d)\n", ret); + goto err_free_hphr_pdm_wd_int; + }
ret = request_threaded_irq(wcd938x->aux_pdm_wd_int, NULL, wcd938x_wd_handle_irq, IRQF_ONESHOT | IRQF_TRIGGER_RISING, "AUX PDM WD INT", wcd938x); - if (ret) + if (ret) { dev_err(dev, "Failed to request Aux WD interrupt (%d)\n", ret); + goto err_free_hphl_pdm_wd_int; + }
/* Disable watchdog interrupt for HPH and AUX */ disable_irq_nosync(wcd938x->hphr_pdm_wd_int); @@ -3141,7 +3155,7 @@ static int wcd938x_soc_codec_probe(struc dev_err(component->dev, "%s: Failed to add snd ctrls for variant: %d\n", __func__, wcd938x->variant); - goto err; + goto err_free_aux_pdm_wd_int; } break; case WCD9385: @@ -3151,7 +3165,7 @@ static int wcd938x_soc_codec_probe(struc dev_err(component->dev, "%s: Failed to add snd ctrls for variant: %d\n", __func__, wcd938x->variant); - goto err; + goto err_free_aux_pdm_wd_int; } break; default: @@ -3159,12 +3173,38 @@ static int wcd938x_soc_codec_probe(struc }
ret = wcd938x_mbhc_init(component); - if (ret) + if (ret) { dev_err(component->dev, "mbhc initialization failed\n"); -err: + goto err_free_aux_pdm_wd_int; + } + + return 0; + +err_free_aux_pdm_wd_int: + free_irq(wcd938x->aux_pdm_wd_int, wcd938x); +err_free_hphl_pdm_wd_int: + free_irq(wcd938x->hphl_pdm_wd_int, wcd938x); +err_free_hphr_pdm_wd_int: + free_irq(wcd938x->hphr_pdm_wd_int, wcd938x); +err_free_clsh_ctrl: + wcd_clsh_ctrl_free(wcd938x->clsh_info); + return ret; }
+static void wcd938x_soc_codec_remove(struct snd_soc_component *component) +{ + struct wcd938x_priv *wcd938x = snd_soc_component_get_drvdata(component); + + wcd938x_mbhc_deinit(component); + + free_irq(wcd938x->aux_pdm_wd_int, wcd938x); + free_irq(wcd938x->hphl_pdm_wd_int, wcd938x); + free_irq(wcd938x->hphr_pdm_wd_int, wcd938x); + + wcd_clsh_ctrl_free(wcd938x->clsh_info); +} + static int wcd938x_codec_set_jack(struct snd_soc_component *comp, struct snd_soc_jack *jack, void *data) { @@ -3181,6 +3221,7 @@ static int wcd938x_codec_set_jack(struct static const struct snd_soc_component_driver soc_codec_dev_wcd938x = { .name = "wcd938x_codec", .probe = wcd938x_soc_codec_probe, + .remove = wcd938x_soc_codec_remove, .controls = wcd938x_snd_controls, .num_controls = ARRAY_SIZE(wcd938x_snd_controls), .dapm_widgets = wcd938x_dapm_widgets,
From: Johan Hovold johan+linaro@kernel.org
commit 7dfae2631bfbdebecd35fe7b472ab3cc95c9ed66 upstream.
MBHC initialisation can fail so add the missing error handling to avoid dereferencing an error pointer when later configuring the jack:
Unable to handle kernel paging request at virtual address fffffffffffffff8
pc : wcd_mbhc_start+0x28/0x380 [snd_soc_wcd_mbhc] lr : wcd938x_codec_set_jack+0x28/0x48 [snd_soc_wcd938x]
Call trace: wcd_mbhc_start+0x28/0x380 [snd_soc_wcd_mbhc] wcd938x_codec_set_jack+0x28/0x48 [snd_soc_wcd938x] snd_soc_component_set_jack+0x28/0x8c [snd_soc_core] qcom_snd_wcd_jack_setup+0x7c/0x19c [snd_soc_qcom_common] sc8280xp_snd_init+0x20/0x2c [snd_soc_sc8280xp] snd_soc_link_init+0x28/0x90 [snd_soc_core] snd_soc_bind_card+0x628/0xbfc [snd_soc_core] snd_soc_register_card+0xec/0x104 [snd_soc_core] devm_snd_soc_register_card+0x4c/0xa4 [snd_soc_core] sc8280xp_platform_probe+0xf0/0x108 [snd_soc_sc8280xp]
Fixes: bcee7ed09b8e ("ASoC: codecs: wcd938x: add Multi Button Headset Control support") Cc: stable@vger.kernel.org # 5.15 Cc: Srinivas Kandagatla srinivas.kandagatla@linaro.org Signed-off-by: Johan Hovold johan+linaro@kernel.org Link: https://lore.kernel.org/r/20230703124701.11734-1-johan+linaro@kernel.org Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/soc/codecs/wcd938x.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/sound/soc/codecs/wcd938x.c +++ b/sound/soc/codecs/wcd938x.c @@ -2625,6 +2625,8 @@ static int wcd938x_mbhc_init(struct snd_ WCD938X_IRQ_HPHR_OCP_INT);
wcd938x->wcd_mbhc = wcd_mbhc_init(component, &mbhc_cb, intr_ids, wcd_mbhc_fields, true); + if (IS_ERR(wcd938x->wcd_mbhc)) + return PTR_ERR(wcd938x->wcd_mbhc);
snd_soc_add_component_controls(component, impedance_detect_controls, ARRAY_SIZE(impedance_detect_controls));
From: Johan Hovold johan+linaro@kernel.org
commit 798590cc7d3c2b5f3a7548d96dd4d8a081c1bc39 upstream.
Make sure to release allocated MBHC resources also on component remove.
This is specifically needed to allow probe deferrals of the sound card which otherwise fails when reprobing the codec component.
Fixes: 9fb9b1690f0b ("ASoC: codecs: wcd934x: add mbhc support") Cc: stable@vger.kernel.org # 5.14 Cc: Srinivas Kandagatla srinivas.kandagatla@linaro.org Signed-off-by: Johan Hovold johan+linaro@kernel.org Reviewed-by: Srinivas Kandagatla srinivas.kandagatla@linaro.org Link: https://lore.kernel.org/r/20230705123018.30903-6-johan+linaro@kernel.org Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/soc/codecs/wcd934x.c | 12 ++++++++++++ 1 file changed, 12 insertions(+)
--- a/sound/soc/codecs/wcd934x.c +++ b/sound/soc/codecs/wcd934x.c @@ -3044,6 +3044,17 @@ static int wcd934x_mbhc_init(struct snd_
return 0; } + +static void wcd934x_mbhc_deinit(struct snd_soc_component *component) +{ + struct wcd934x_codec *wcd = snd_soc_component_get_drvdata(component); + + if (!wcd->mbhc) + return; + + wcd_mbhc_deinit(wcd->mbhc); +} + static int wcd934x_comp_probe(struct snd_soc_component *component) { struct wcd934x_codec *wcd = dev_get_drvdata(component->dev); @@ -3077,6 +3088,7 @@ static void wcd934x_comp_remove(struct s { struct wcd934x_codec *wcd = dev_get_drvdata(comp->dev);
+ wcd934x_mbhc_deinit(comp); wcd_clsh_ctrl_free(wcd->clsh_ctrl); }
From: Johan Hovold johan+linaro@kernel.org
commit 85a61b1ce461a3f62f1019e5e6423c393c542bff upstream.
Make sure to resume the codec and soundwire device before trying to read the codec variant and configure the device during component probe.
This specifically avoids interpreting (a masked and shifted) -EBUSY errno as the variant:
wcd938x_codec audio-codec: ASoC: error at soc_component_read_no_lock on audio-codec for register: [0x000034b0] -16
when the soundwire device happens to be suspended, which in turn prevents some headphone controls from being registered.
Fixes: 8d78602aa87a ("ASoC: codecs: wcd938x: add basic driver") Cc: stable@vger.kernel.org # 5.14 Cc: Srinivas Kandagatla srinivas.kandagatla@linaro.org Reported-by: Steev Klimaszewski steev@kali.org Signed-off-by: Johan Hovold johan+linaro@kernel.org Link: https://lore.kernel.org/r/20230630120318.6571-1-johan+linaro@kernel.org Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/soc/codecs/wcd938x.c | 6 ++++++ 1 file changed, 6 insertions(+)
--- a/sound/soc/codecs/wcd938x.c +++ b/sound/soc/codecs/wcd938x.c @@ -3095,6 +3095,10 @@ static int wcd938x_soc_codec_probe(struc
snd_soc_component_init_regmap(component, wcd938x->regmap);
+ ret = pm_runtime_resume_and_get(dev); + if (ret < 0) + return ret; + wcd938x->variant = snd_soc_component_read_field(component, WCD938X_DIGITAL_EFUSE_REG_0, WCD938X_ID_MASK); @@ -3112,6 +3116,8 @@ static int wcd938x_soc_codec_probe(struc (WCD938X_DIGITAL_INTR_LEVEL_0 + i), 0); }
+ pm_runtime_put(dev); + wcd938x->hphr_pdm_wd_int = regmap_irq_get_virq(wcd938x->irq_chip, WCD938X_IRQ_HPHR_PDM_WD_INT); wcd938x->hphl_pdm_wd_int = regmap_irq_get_virq(wcd938x->irq_chip,
From: Johan Hovold johan+linaro@kernel.org
commit 6f49256897083848ce9a59651f6b53fc80462397 upstream.
Make sure that the soundwire device used for register accesses has been enumerated and initialised before trying to read the codec variant during component probe.
This specifically avoids interpreting (a masked and shifted) -EBUSY errno as the variant:
wcd938x_codec audio-codec: ASoC: error at soc_component_read_no_lock on audio-codec for register: [0x000034b0] -16
in case the soundwire device has not yet been initialised, which in turn prevents some headphone controls from being registered.
Fixes: 8d78602aa87a ("ASoC: codecs: wcd938x: add basic driver") Cc: stable@vger.kernel.org # 5.14 Cc: Srinivas Kandagatla srinivas.kandagatla@linaro.org Reported-by: Steev Klimaszewski steev@kali.org Signed-off-by: Johan Hovold johan+linaro@kernel.org Tested-by: Steev Klimaszewski steev@kali.org Link: https://lore.kernel.org/r/20230701094723.29379-1-johan+linaro@kernel.org Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/soc/codecs/wcd938x.c | 9 +++++++++ 1 file changed, 9 insertions(+)
--- a/sound/soc/codecs/wcd938x.c +++ b/sound/soc/codecs/wcd938x.c @@ -3090,9 +3090,18 @@ static int wcd938x_irq_init(struct wcd93 static int wcd938x_soc_codec_probe(struct snd_soc_component *component) { struct wcd938x_priv *wcd938x = snd_soc_component_get_drvdata(component); + struct sdw_slave *tx_sdw_dev = wcd938x->tx_sdw_dev; struct device *dev = component->dev; + unsigned long time_left; int ret, i;
+ time_left = wait_for_completion_timeout(&tx_sdw_dev->initialization_complete, + msecs_to_jiffies(2000)); + if (!time_left) { + dev_err(dev, "soundwire device init timeout\n"); + return -ETIMEDOUT; + } + snd_soc_component_init_regmap(component, wcd938x->regmap);
ret = pm_runtime_resume_and_get(dev);
From: Marc Zyngier maz@kernel.org
commit fe769e6c1f80f542d6f4e7f7c8c6bf20c1307f99 upstream.
It recently appeared that, when running VHE, there is a notable difference between using CNTKCTL_EL1 and CNTHCTL_EL2, despite what the architecture documents:
- When accessed from EL2, bits [19:18] and [16:10] of CNTKCTL_EL1 have the same assignment as CNTHCTL_EL2 - When accessed from EL1, bits [19:18] and [16:10] are RES0
It is all OK, until you factor in NV, where the EL2 guest runs at EL1. In this configuration, CNTKCTL_EL11 doesn't trap, nor ends up in the VNCR page. This means that any write from the guest affecting CNTHCTL_EL2 using CNTKCTL_EL1 ends up losing some state. Not good.
The fix it obvious: don't use CNTKCTL_EL1 if you want to change bits that are not part of the EL1 definition of CNTKCTL_EL1, and use CNTHCTL_EL2 instead. This doesn't change anything for a bare-metal OS, and fixes it when running under NV. The NV hypervisor will itself have to work harder to merge the two accessors.
Note that there is a pending update to the architecture to address this issue by making the affected bits UNKNOWN when CNTKCTL_EL1 is used from EL2 with VHE enabled.
Fixes: c605ee245097 ("KVM: arm64: timers: Allow physical offset without CNTPOFF_EL2") Signed-off-by: Marc Zyngier maz@kernel.org Cc: stable@vger.kernel.org # v6.4 Reviewed-by: Eric Auger eric.auger@redhat.com Link: https://lore.kernel.org/r/20230627140557.544885-1-maz@kernel.org Signed-off-by: Oliver Upton oliver.upton@linux.dev Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm64/kvm/arch_timer.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
--- a/arch/arm64/kvm/arch_timer.c +++ b/arch/arm64/kvm/arch_timer.c @@ -827,8 +827,8 @@ static void timer_set_traps(struct kvm_v assign_clear_set_bit(tpt, CNTHCTL_EL1PCEN << 10, set, clr); assign_clear_set_bit(tpc, CNTHCTL_EL1PCTEN << 10, set, clr);
- /* This only happens on VHE, so use the CNTKCTL_EL1 accessor */ - sysreg_clear_set(cntkctl_el1, clr, set); + /* This only happens on VHE, so use the CNTHCTL_EL2 accessor. */ + sysreg_clear_set(cnthctl_el2, clr, set); }
void kvm_timer_vcpu_load(struct kvm_vcpu *vcpu) @@ -1559,7 +1559,7 @@ no_vgic: void kvm_timer_init_vhe(void) { if (cpus_have_final_cap(ARM64_HAS_ECV_CNTPOFF)) - sysreg_clear_set(cntkctl_el1, 0, CNTHCTL_ECV); + sysreg_clear_set(cnthctl_el2, 0, CNTHCTL_ECV); }
int kvm_arm_timer_set_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr)
From: Oliver Upton oliver.upton@linux.dev
commit df6556adf27b7372cfcd97e1c0afb0d516c8279f upstream.
Userspace is allowed to select any PAGE_SIZE aligned hva to back guest memory. This is even the case with hugepages, although it is a rather suboptimal configuration as PTE level mappings are used at stage-2.
The arm64 page aging handlers have an assumption that the specified range is exactly one page/block of memory, which in the aforementioned case is not necessarily true. All together this leads to the WARN() in kvm_age_gfn() firing.
However, the WARN is only part of the issue as the table walkers visit at most a single leaf PTE. For hugepage-backed memory in a memslot that isn't hugepage-aligned, page aging entirely misses accesses to the hugepage beyond the first page in the memslot.
Add a new walker dedicated to handling page aging MMU notifiers capable of walking a range of PTEs. Convert kvm(_test)_age_gfn() over to the new walker and drop the WARN that caught the issue in the first place. The implementation of this walker was inspired by the test_clear_young() implementation by Yu Zhao [*], but repurposed to address a bug in the existing aging implementation.
Cc: stable@vger.kernel.org # v5.15 Fixes: 056aad67f836 ("kvm: arm/arm64: Rework gpa callback handlers") Link: https://lore.kernel.org/kvmarm/20230526234435.662652-6-yuzhao@google.com/ Co-developed-by: Yu Zhao yuzhao@google.com Signed-off-by: Yu Zhao yuzhao@google.com Reported-by: Reiji Watanabe reijiw@google.com Reviewed-by: Marc Zyngier maz@kernel.org Reviewed-by: Shaoqin Huang shahuang@redhat.com Link: https://lore.kernel.org/r/20230627235405.4069823-1-oliver.upton@linux.dev Signed-off-by: Oliver Upton oliver.upton@linux.dev Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm64/include/asm/kvm_pgtable.h | 26 ++++++------------- arch/arm64/kvm/hyp/pgtable.c | 47 ++++++++++++++++++++++++++++------- arch/arm64/kvm/mmu.c | 18 +++++-------- 3 files changed, 55 insertions(+), 36 deletions(-)
--- a/arch/arm64/include/asm/kvm_pgtable.h +++ b/arch/arm64/include/asm/kvm_pgtable.h @@ -556,22 +556,26 @@ int kvm_pgtable_stage2_wrprotect(struct kvm_pte_t kvm_pgtable_stage2_mkyoung(struct kvm_pgtable *pgt, u64 addr);
/** - * kvm_pgtable_stage2_mkold() - Clear the access flag in a page-table entry. + * kvm_pgtable_stage2_test_clear_young() - Test and optionally clear the access + * flag in a page-table entry. * @pgt: Page-table structure initialised by kvm_pgtable_stage2_init*(). * @addr: Intermediate physical address to identify the page-table entry. + * @size: Size of the address range to visit. + * @mkold: True if the access flag should be cleared. * * The offset of @addr within a page is ignored. * - * If there is a valid, leaf page-table entry used to translate @addr, then - * clear the access flag in that entry. + * Tests and conditionally clears the access flag for every valid, leaf + * page-table entry used to translate the range [@addr, @addr + @size). * * Note that it is the caller's responsibility to invalidate the TLB after * calling this function to ensure that the updated permissions are visible * to the CPUs. * - * Return: The old page-table entry prior to clearing the flag, 0 on failure. + * Return: True if any of the visited PTEs had the access flag set. */ -kvm_pte_t kvm_pgtable_stage2_mkold(struct kvm_pgtable *pgt, u64 addr); +bool kvm_pgtable_stage2_test_clear_young(struct kvm_pgtable *pgt, u64 addr, + u64 size, bool mkold);
/** * kvm_pgtable_stage2_relax_perms() - Relax the permissions enforced by a @@ -594,18 +598,6 @@ int kvm_pgtable_stage2_relax_perms(struc enum kvm_pgtable_prot prot);
/** - * kvm_pgtable_stage2_is_young() - Test whether a page-table entry has the - * access flag set. - * @pgt: Page-table structure initialised by kvm_pgtable_stage2_init*(). - * @addr: Intermediate physical address to identify the page-table entry. - * - * The offset of @addr within a page is ignored. - * - * Return: True if the page-table entry has the access flag set, false otherwise. - */ -bool kvm_pgtable_stage2_is_young(struct kvm_pgtable *pgt, u64 addr); - -/** * kvm_pgtable_stage2_flush_range() - Clean and invalidate data cache to Point * of Coherency for guest stage-2 address * range. --- a/arch/arm64/kvm/hyp/pgtable.c +++ b/arch/arm64/kvm/hyp/pgtable.c @@ -1173,25 +1173,54 @@ kvm_pte_t kvm_pgtable_stage2_mkyoung(str return pte; }
-kvm_pte_t kvm_pgtable_stage2_mkold(struct kvm_pgtable *pgt, u64 addr) +struct stage2_age_data { + bool mkold; + bool young; +}; + +static int stage2_age_walker(const struct kvm_pgtable_visit_ctx *ctx, + enum kvm_pgtable_walk_flags visit) { - kvm_pte_t pte = 0; - stage2_update_leaf_attrs(pgt, addr, 1, 0, KVM_PTE_LEAF_ATTR_LO_S2_AF, - &pte, NULL, 0); + kvm_pte_t new = ctx->old & ~KVM_PTE_LEAF_ATTR_LO_S2_AF; + struct stage2_age_data *data = ctx->arg; + + if (!kvm_pte_valid(ctx->old) || new == ctx->old) + return 0; + + data->young = true; + + /* + * stage2_age_walker() is always called while holding the MMU lock for + * write, so this will always succeed. Nonetheless, this deliberately + * follows the race detection pattern of the other stage-2 walkers in + * case the locking mechanics of the MMU notifiers is ever changed. + */ + if (data->mkold && !stage2_try_set_pte(ctx, new)) + return -EAGAIN; + /* * "But where's the TLBI?!", you scream. * "Over in the core code", I sigh. * * See the '->clear_flush_young()' callback on the KVM mmu notifier. */ - return pte; + return 0; }
-bool kvm_pgtable_stage2_is_young(struct kvm_pgtable *pgt, u64 addr) +bool kvm_pgtable_stage2_test_clear_young(struct kvm_pgtable *pgt, u64 addr, + u64 size, bool mkold) { - kvm_pte_t pte = 0; - stage2_update_leaf_attrs(pgt, addr, 1, 0, 0, &pte, NULL, 0); - return pte & KVM_PTE_LEAF_ATTR_LO_S2_AF; + struct stage2_age_data data = { + .mkold = mkold, + }; + struct kvm_pgtable_walker walker = { + .cb = stage2_age_walker, + .arg = &data, + .flags = KVM_PGTABLE_WALK_LEAF, + }; + + WARN_ON(kvm_pgtable_walk(pgt, addr, size, &walker)); + return data.young; }
int kvm_pgtable_stage2_relax_perms(struct kvm_pgtable *pgt, u64 addr, --- a/arch/arm64/kvm/mmu.c +++ b/arch/arm64/kvm/mmu.c @@ -1639,27 +1639,25 @@ bool kvm_set_spte_gfn(struct kvm *kvm, s bool kvm_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range) { u64 size = (range->end - range->start) << PAGE_SHIFT; - kvm_pte_t kpte; - pte_t pte;
if (!kvm->arch.mmu.pgt) return false;
- WARN_ON(size != PAGE_SIZE && size != PMD_SIZE && size != PUD_SIZE); - - kpte = kvm_pgtable_stage2_mkold(kvm->arch.mmu.pgt, - range->start << PAGE_SHIFT); - pte = __pte(kpte); - return pte_valid(pte) && pte_young(pte); + return kvm_pgtable_stage2_test_clear_young(kvm->arch.mmu.pgt, + range->start << PAGE_SHIFT, + size, true); }
bool kvm_test_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range) { + u64 size = (range->end - range->start) << PAGE_SHIFT; + if (!kvm->arch.mmu.pgt) return false;
- return kvm_pgtable_stage2_is_young(kvm->arch.mmu.pgt, - range->start << PAGE_SHIFT); + return kvm_pgtable_stage2_test_clear_young(kvm->arch.mmu.pgt, + range->start << PAGE_SHIFT, + size, false); }
phys_addr_t kvm_mmu_get_httbr(void)
From: Marc Zyngier maz@kernel.org
commit 970dee09b230895fe2230d2b32ad05a2826818c6 upstream.
Since 0bf50497f03b ("KVM: Drop kvm_count_lock and instead protect kvm_usage_count with kvm_lock"), hotplugging back a CPU whilst a guest is running results in a number of ugly splats as most of this code expects to run with preemption disabled, which isn't the case anymore.
While the context is preemptable, it isn't migratable, which should be enough. But we have plenty of preemptible() checks all over the place, and our per-CPU accessors also disable preemption.
Since this affects released versions, let's do the easy fix first, disabling preemption in kvm_arch_hardware_enable(). We can always revisit this with a more invasive fix in the future.
Fixes: 0bf50497f03b ("KVM: Drop kvm_count_lock and instead protect kvm_usage_count with kvm_lock") Reported-by: Kristina Martsenko kristina.martsenko@arm.com Tested-by: Kristina Martsenko kristina.martsenko@arm.com Signed-off-by: Marc Zyngier maz@kernel.org Link: https://lore.kernel.org/r/aeab7562-2d39-e78e-93b1-4711f8cc3fa5@arm.com Cc: stable@vger.kernel.org # v6.3, v6.4 Link: https://lore.kernel.org/r/20230703163548.1498943-1-maz@kernel.org Signed-off-by: Oliver Upton oliver.upton@linux.dev Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm64/kvm/arm.c | 13 ++++++++++++- 1 file changed, 12 insertions(+), 1 deletion(-)
--- a/arch/arm64/kvm/arm.c +++ b/arch/arm64/kvm/arm.c @@ -1793,8 +1793,17 @@ static void _kvm_arch_hardware_enable(vo
int kvm_arch_hardware_enable(void) { - int was_enabled = __this_cpu_read(kvm_arm_hardware_enabled); + int was_enabled;
+ /* + * Most calls to this function are made with migration + * disabled, but not with preemption disabled. The former is + * enough to ensure correctness, but most of the helpers + * expect the later and will throw a tantrum otherwise. + */ + preempt_disable(); + + was_enabled = __this_cpu_read(kvm_arm_hardware_enabled); _kvm_arch_hardware_enable(NULL);
if (!was_enabled) { @@ -1802,6 +1811,8 @@ int kvm_arch_hardware_enable(void) kvm_timer_cpu_up(); }
+ preempt_enable(); + return 0; }
From: Marc Zyngier maz@kernel.org
commit b321c31c9b7b309dcde5e8854b741c8e6a9a05f0 upstream.
Xiang reports that VMs occasionally fail to boot on GICv4.1 systems when running a preemptible kernel, as it is possible that a vCPU is blocked without requesting a doorbell interrupt.
The issue is that any preemption that occurs between vgic_v4_put() and schedule() on the block path will mark the vPE as nonresident and *not* request a doorbell irq. This occurs because when the vcpu thread is resumed on its way to block, vcpu_load() will make the vPE resident again. Once the vcpu actually blocks, we don't request a doorbell anymore, and the vcpu won't be woken up on interrupt delivery.
Fix it by tracking that we're entering WFI, and key the doorbell request on that flag. This allows us not to make the vPE resident when going through a preempt/schedule cycle, meaning we don't lose any state.
Cc: stable@vger.kernel.org Fixes: 8e01d9a396e6 ("KVM: arm64: vgic-v4: Move the GICv4 residency flow to be driven by vcpu_load/put") Reported-by: Xiang Chen chenxiang66@hisilicon.com Suggested-by: Zenghui Yu yuzenghui@huawei.com Tested-by: Xiang Chen chenxiang66@hisilicon.com Co-developed-by: Oliver Upton oliver.upton@linux.dev Signed-off-by: Marc Zyngier maz@kernel.org Acked-by: Zenghui Yu yuzenghui@huawei.com Link: https://lore.kernel.org/r/20230713070657.3873244-1-maz@kernel.org Signed-off-by: Oliver Upton oliver.upton@linux.dev Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm64/include/asm/kvm_host.h | 2 ++ arch/arm64/kvm/arm.c | 6 ++++-- arch/arm64/kvm/vgic/vgic-v3.c | 2 +- arch/arm64/kvm/vgic/vgic-v4.c | 7 +++++-- include/kvm/arm_vgic.h | 2 +- 5 files changed, 13 insertions(+), 6 deletions(-)
--- a/arch/arm64/include/asm/kvm_host.h +++ b/arch/arm64/include/asm/kvm_host.h @@ -701,6 +701,8 @@ struct kvm_vcpu_arch { #define DBG_SS_ACTIVE_PENDING __vcpu_single_flag(sflags, BIT(5)) /* PMUSERENR for the guest EL0 is on physical CPU */ #define PMUSERENR_ON_CPU __vcpu_single_flag(sflags, BIT(6)) +/* WFI instruction trapped */ +#define IN_WFI __vcpu_single_flag(sflags, BIT(7))
/* Pointer to the vcpu's SVE FFR for sve_{save,load}_state() */ --- a/arch/arm64/kvm/arm.c +++ b/arch/arm64/kvm/arm.c @@ -704,13 +704,15 @@ void kvm_vcpu_wfi(struct kvm_vcpu *vcpu) */ preempt_disable(); kvm_vgic_vmcr_sync(vcpu); - vgic_v4_put(vcpu, true); + vcpu_set_flag(vcpu, IN_WFI); + vgic_v4_put(vcpu); preempt_enable();
kvm_vcpu_halt(vcpu); vcpu_clear_flag(vcpu, IN_WFIT);
preempt_disable(); + vcpu_clear_flag(vcpu, IN_WFI); vgic_v4_load(vcpu); preempt_enable(); } @@ -778,7 +780,7 @@ static int check_vcpu_requests(struct kv if (kvm_check_request(KVM_REQ_RELOAD_GICv4, vcpu)) { /* The distributor enable bits were changed */ preempt_disable(); - vgic_v4_put(vcpu, false); + vgic_v4_put(vcpu); vgic_v4_load(vcpu); preempt_enable(); } --- a/arch/arm64/kvm/vgic/vgic-v3.c +++ b/arch/arm64/kvm/vgic/vgic-v3.c @@ -749,7 +749,7 @@ void vgic_v3_put(struct kvm_vcpu *vcpu) { struct vgic_v3_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.vgic_v3;
- WARN_ON(vgic_v4_put(vcpu, false)); + WARN_ON(vgic_v4_put(vcpu));
vgic_v3_vmcr_sync(vcpu);
--- a/arch/arm64/kvm/vgic/vgic-v4.c +++ b/arch/arm64/kvm/vgic/vgic-v4.c @@ -336,14 +336,14 @@ void vgic_v4_teardown(struct kvm *kvm) its_vm->vpes = NULL; }
-int vgic_v4_put(struct kvm_vcpu *vcpu, bool need_db) +int vgic_v4_put(struct kvm_vcpu *vcpu) { struct its_vpe *vpe = &vcpu->arch.vgic_cpu.vgic_v3.its_vpe;
if (!vgic_supports_direct_msis(vcpu->kvm) || !vpe->resident) return 0;
- return its_make_vpe_non_resident(vpe, need_db); + return its_make_vpe_non_resident(vpe, !!vcpu_get_flag(vcpu, IN_WFI)); }
int vgic_v4_load(struct kvm_vcpu *vcpu) @@ -354,6 +354,9 @@ int vgic_v4_load(struct kvm_vcpu *vcpu) if (!vgic_supports_direct_msis(vcpu->kvm) || vpe->resident) return 0;
+ if (vcpu_get_flag(vcpu, IN_WFI)) + return 0; + /* * Before making the VPE resident, make sure the redistributor * corresponding to our current CPU expects us here. See the --- a/include/kvm/arm_vgic.h +++ b/include/kvm/arm_vgic.h @@ -431,7 +431,7 @@ int kvm_vgic_v4_unset_forwarding(struct
int vgic_v4_load(struct kvm_vcpu *vcpu); void vgic_v4_commit(struct kvm_vcpu *vcpu); -int vgic_v4_put(struct kvm_vcpu *vcpu, bool need_db); +int vgic_v4_put(struct kvm_vcpu *vcpu);
/* CPU HP callbacks */ void kvm_vgic_cpu_up(void);
From: Eric Whitney enwlinux@gmail.com
commit 6909cf5c4101214f4305a62d582a5b93c7e1eb9a upstream.
When run on a file system where the inline_data feature has been enabled, xfstests generic/269, generic/270, and generic/476 cause ext4 to emit error messages indicating that inline directory entries are corrupted. This occurs because the inline offset used to locate inline directory entries in the inode body is not updated when an xattr in that shared region is deleted and the region is shifted in memory to recover the space it occupied. If the deleted xattr precedes the system.data attribute, which points to the inline directory entries, that attribute will be moved further up in the region. The inline offset continues to point to whatever is located in system.data's former location, with unfortunate effects when used to access directory entries or (presumably) inline data in the inode body.
Cc: stable@kernel.org Signed-off-by: Eric Whitney enwlinux@gmail.com Link: https://lore.kernel.org/r/20230522181520.1570360-1-enwlinux@gmail.com Signed-off-by: Theodore Ts'o tytso@mit.edu Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ext4/xattr.c | 14 ++++++++++++++ 1 file changed, 14 insertions(+)
--- a/fs/ext4/xattr.c +++ b/fs/ext4/xattr.c @@ -1782,6 +1782,20 @@ static int ext4_xattr_set_entry(struct e memmove(here, (void *)here + size, (void *)last - (void *)here + sizeof(__u32)); memset(last, 0, size); + + /* + * Update i_inline_off - moved ibody region might contain + * system.data attribute. Handling a failure here won't + * cause other complications for setting an xattr. + */ + if (!is_block && ext4_has_inline_data(inode)) { + ret = ext4_find_inline_data_nolock(inode); + if (ret) { + ext4_warning_inode(inode, + "unable to update i_inline_off"); + goto out; + } + } } else if (s->not_found) { /* Insert new name. */ size_t size = EXT4_XATTR_LEN(name_len);
[ Upstream commit f828b681d0cd566f86351c0b913e6cb6ed8c7b9c ]
The type of size is unsigned, if size is 0x40000000, there will be an integer overflow, size will be zero after size *= sizeof(uint32_t), will cause uninitialized memory to be referenced later
Reviewed-by: Christian König christian.koenig@amd.com Signed-off-by: hackyzh002 hackyzh002@gmail.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/radeon/radeon_cs.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/drivers/gpu/drm/radeon/radeon_cs.c +++ b/drivers/gpu/drm/radeon/radeon_cs.c @@ -270,7 +270,8 @@ int radeon_cs_parser_init(struct radeon_ { struct drm_radeon_cs *cs = data; uint64_t *chunk_array_ptr; - unsigned size, i; + u64 size; + unsigned i; u32 ring = RADEON_CS_RING_GFX; s32 priority = 0;
[ Upstream commit 8cabf83c7aa54530e699be56249fb44f9505c4f3 ]
There is no apparent reason for the massive code duplication.
Signed-off-by: Oswald Buddenhagen oswald.buddenhagen@gmx.de Link: https://lore.kernel.org/r/20230510173917.3073107-3-oswald.buddenhagen@gmx.de Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Sasha Levin sashal@kernel.org --- sound/pci/emu10k1/emufx.c | 112 +++------------------------------------------- 1 file changed, 9 insertions(+), 103 deletions(-)
--- a/sound/pci/emu10k1/emufx.c +++ b/sound/pci/emu10k1/emufx.c @@ -1559,14 +1559,8 @@ A_OP(icode, &ptr, iMAC0, A_GPR(var), A_G gpr += 2;
/* Master volume (will be renamed later) */ - A_OP(icode, &ptr, iMAC0, A_GPR(playback+0+SND_EMU10K1_PLAYBACK_CHANNELS), A_C_00000000, A_GPR(gpr), A_GPR(playback+0+SND_EMU10K1_PLAYBACK_CHANNELS)); - A_OP(icode, &ptr, iMAC0, A_GPR(playback+1+SND_EMU10K1_PLAYBACK_CHANNELS), A_C_00000000, A_GPR(gpr), A_GPR(playback+1+SND_EMU10K1_PLAYBACK_CHANNELS)); - A_OP(icode, &ptr, iMAC0, A_GPR(playback+2+SND_EMU10K1_PLAYBACK_CHANNELS), A_C_00000000, A_GPR(gpr), A_GPR(playback+2+SND_EMU10K1_PLAYBACK_CHANNELS)); - A_OP(icode, &ptr, iMAC0, A_GPR(playback+3+SND_EMU10K1_PLAYBACK_CHANNELS), A_C_00000000, A_GPR(gpr), A_GPR(playback+3+SND_EMU10K1_PLAYBACK_CHANNELS)); - A_OP(icode, &ptr, iMAC0, A_GPR(playback+4+SND_EMU10K1_PLAYBACK_CHANNELS), A_C_00000000, A_GPR(gpr), A_GPR(playback+4+SND_EMU10K1_PLAYBACK_CHANNELS)); - A_OP(icode, &ptr, iMAC0, A_GPR(playback+5+SND_EMU10K1_PLAYBACK_CHANNELS), A_C_00000000, A_GPR(gpr), A_GPR(playback+5+SND_EMU10K1_PLAYBACK_CHANNELS)); - A_OP(icode, &ptr, iMAC0, A_GPR(playback+6+SND_EMU10K1_PLAYBACK_CHANNELS), A_C_00000000, A_GPR(gpr), A_GPR(playback+6+SND_EMU10K1_PLAYBACK_CHANNELS)); - A_OP(icode, &ptr, iMAC0, A_GPR(playback+7+SND_EMU10K1_PLAYBACK_CHANNELS), A_C_00000000, A_GPR(gpr), A_GPR(playback+7+SND_EMU10K1_PLAYBACK_CHANNELS)); + for (z = 0; z < 8; z++) + A_OP(icode, &ptr, iMAC0, A_GPR(playback+z+SND_EMU10K1_PLAYBACK_CHANNELS), A_C_00000000, A_GPR(gpr), A_GPR(playback+z+SND_EMU10K1_PLAYBACK_CHANNELS)); snd_emu10k1_init_mono_control(&controls[nctl++], "Wave Master Playback Volume", gpr, 0); gpr += 2;
@@ -1653,102 +1647,14 @@ A_OP(icode, &ptr, iMAC0, A_GPR(var), A_G dev_dbg(emu->card->dev, "emufx.c: gpr=0x%x, tmp=0x%x\n", gpr, tmp); */ - /* For the EMU1010: How to get 32bit values from the DSP. High 16bits into L, low 16bits into R. */ - /* A_P16VIN(0) is delayed by one sample, - * so all other A_P16VIN channels will need to also be delayed - */ - /* Left ADC in. 1 of 2 */ snd_emu10k1_audigy_dsp_convert_32_to_2x16( icode, &ptr, tmp, bit_shifter16, A_P16VIN(0x0), A_FXBUS2(0) ); - /* Right ADC in 1 of 2 */ - gpr_map[gpr++] = 0x00000000; - /* Delaying by one sample: instead of copying the input - * value A_P16VIN to output A_FXBUS2 as in the first channel, - * we use an auxiliary register, delaying the value by one - * sample - */ - snd_emu10k1_audigy_dsp_convert_32_to_2x16( icode, &ptr, tmp, bit_shifter16, A_GPR(gpr - 1), A_FXBUS2(2) ); - A_OP(icode, &ptr, iACC3, A_GPR(gpr - 1), A_P16VIN(0x1), A_C_00000000, A_C_00000000); - gpr_map[gpr++] = 0x00000000; - snd_emu10k1_audigy_dsp_convert_32_to_2x16( icode, &ptr, tmp, bit_shifter16, A_GPR(gpr - 1), A_FXBUS2(4) ); - A_OP(icode, &ptr, iACC3, A_GPR(gpr - 1), A_P16VIN(0x2), A_C_00000000, A_C_00000000); - gpr_map[gpr++] = 0x00000000; - snd_emu10k1_audigy_dsp_convert_32_to_2x16( icode, &ptr, tmp, bit_shifter16, A_GPR(gpr - 1), A_FXBUS2(6) ); - A_OP(icode, &ptr, iACC3, A_GPR(gpr - 1), A_P16VIN(0x3), A_C_00000000, A_C_00000000); - /* For 96kHz mode */ - /* Left ADC in. 2 of 2 */ - gpr_map[gpr++] = 0x00000000; - snd_emu10k1_audigy_dsp_convert_32_to_2x16( icode, &ptr, tmp, bit_shifter16, A_GPR(gpr - 1), A_FXBUS2(0x8) ); - A_OP(icode, &ptr, iACC3, A_GPR(gpr - 1), A_P16VIN(0x4), A_C_00000000, A_C_00000000); - /* Right ADC in 2 of 2 */ - gpr_map[gpr++] = 0x00000000; - snd_emu10k1_audigy_dsp_convert_32_to_2x16( icode, &ptr, tmp, bit_shifter16, A_GPR(gpr - 1), A_FXBUS2(0xa) ); - A_OP(icode, &ptr, iACC3, A_GPR(gpr - 1), A_P16VIN(0x5), A_C_00000000, A_C_00000000); - gpr_map[gpr++] = 0x00000000; - snd_emu10k1_audigy_dsp_convert_32_to_2x16( icode, &ptr, tmp, bit_shifter16, A_GPR(gpr - 1), A_FXBUS2(0xc) ); - A_OP(icode, &ptr, iACC3, A_GPR(gpr - 1), A_P16VIN(0x6), A_C_00000000, A_C_00000000); - gpr_map[gpr++] = 0x00000000; - snd_emu10k1_audigy_dsp_convert_32_to_2x16( icode, &ptr, tmp, bit_shifter16, A_GPR(gpr - 1), A_FXBUS2(0xe) ); - A_OP(icode, &ptr, iACC3, A_GPR(gpr - 1), A_P16VIN(0x7), A_C_00000000, A_C_00000000); - /* Pavel Hofman - we still have voices, A_FXBUS2s, and - * A_P16VINs available - - * let's add 8 more capture channels - total of 16 - */ - gpr_map[gpr++] = 0x00000000; - snd_emu10k1_audigy_dsp_convert_32_to_2x16(icode, &ptr, tmp, - bit_shifter16, - A_GPR(gpr - 1), - A_FXBUS2(0x10)); - A_OP(icode, &ptr, iACC3, A_GPR(gpr - 1), A_P16VIN(0x8), - A_C_00000000, A_C_00000000); - gpr_map[gpr++] = 0x00000000; - snd_emu10k1_audigy_dsp_convert_32_to_2x16(icode, &ptr, tmp, - bit_shifter16, - A_GPR(gpr - 1), - A_FXBUS2(0x12)); - A_OP(icode, &ptr, iACC3, A_GPR(gpr - 1), A_P16VIN(0x9), - A_C_00000000, A_C_00000000); - gpr_map[gpr++] = 0x00000000; - snd_emu10k1_audigy_dsp_convert_32_to_2x16(icode, &ptr, tmp, - bit_shifter16, - A_GPR(gpr - 1), - A_FXBUS2(0x14)); - A_OP(icode, &ptr, iACC3, A_GPR(gpr - 1), A_P16VIN(0xa), - A_C_00000000, A_C_00000000); - gpr_map[gpr++] = 0x00000000; - snd_emu10k1_audigy_dsp_convert_32_to_2x16(icode, &ptr, tmp, - bit_shifter16, - A_GPR(gpr - 1), - A_FXBUS2(0x16)); - A_OP(icode, &ptr, iACC3, A_GPR(gpr - 1), A_P16VIN(0xb), - A_C_00000000, A_C_00000000); - gpr_map[gpr++] = 0x00000000; - snd_emu10k1_audigy_dsp_convert_32_to_2x16(icode, &ptr, tmp, - bit_shifter16, - A_GPR(gpr - 1), - A_FXBUS2(0x18)); - A_OP(icode, &ptr, iACC3, A_GPR(gpr - 1), A_P16VIN(0xc), - A_C_00000000, A_C_00000000); - gpr_map[gpr++] = 0x00000000; - snd_emu10k1_audigy_dsp_convert_32_to_2x16(icode, &ptr, tmp, - bit_shifter16, - A_GPR(gpr - 1), - A_FXBUS2(0x1a)); - A_OP(icode, &ptr, iACC3, A_GPR(gpr - 1), A_P16VIN(0xd), - A_C_00000000, A_C_00000000); - gpr_map[gpr++] = 0x00000000; - snd_emu10k1_audigy_dsp_convert_32_to_2x16(icode, &ptr, tmp, - bit_shifter16, - A_GPR(gpr - 1), - A_FXBUS2(0x1c)); - A_OP(icode, &ptr, iACC3, A_GPR(gpr - 1), A_P16VIN(0xe), - A_C_00000000, A_C_00000000); - gpr_map[gpr++] = 0x00000000; - snd_emu10k1_audigy_dsp_convert_32_to_2x16(icode, &ptr, tmp, - bit_shifter16, - A_GPR(gpr - 1), - A_FXBUS2(0x1e)); - A_OP(icode, &ptr, iACC3, A_GPR(gpr - 1), A_P16VIN(0xf), - A_C_00000000, A_C_00000000); + /* A_P16VIN(0) is delayed by one sample, so all other A_P16VIN channels + * will need to also be delayed; we use an auxiliary register for that. */ + for (z = 1; z < 0x10; z++) { + snd_emu10k1_audigy_dsp_convert_32_to_2x16( icode, &ptr, tmp, bit_shifter16, A_GPR(gpr), A_FXBUS2(z * 2) ); + A_OP(icode, &ptr, iACC3, A_GPR(gpr), A_P16VIN(z), A_C_00000000, A_C_00000000); + gpr_map[gpr++] = 0x00000000; + } }
#if 0
[ Upstream commit 6a4e3363792e30177cc3965697e34ddcea8b900b ]
When add_dquot_ref() fails (usually due to IO error or ENOMEM), we want to disable quotas we are trying to enable. However dquot_disable() call was passed just the flags we are enabling so in case flags == DQUOT_USAGE_ENABLED dquot_disable() call will just fail with EINVAL instead of properly disabling quotas. Fix the problem by always passing DQUOT_LIMITS_ENABLED | DQUOT_USAGE_ENABLED to dquot_disable() in this case.
Reported-and-tested-by: Ye Bin yebin10@huawei.com Reported-by: syzbot+e633c79ceaecbf479854@syzkaller.appspotmail.com Signed-off-by: Jan Kara jack@suse.cz Message-Id: 20230605140731.2427629-2-yebin10@huawei.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/quota/dquot.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/fs/quota/dquot.c +++ b/fs/quota/dquot.c @@ -2420,7 +2420,8 @@ int dquot_load_quota_sb(struct super_blo
error = add_dquot_ref(sb, type); if (error) - dquot_disable(sb, type, flags); + dquot_disable(sb, type, + DQUOT_USAGE_ENABLED | DQUOT_LIMITS_ENABLED);
return error; out_fmt:
[ Upstream commit d6a95db3c7ad160bc16b89e36449705309b52bcb ]
There's issue as follows when do fault injection: WARNING: CPU: 1 PID: 14870 at include/linux/quotaops.h:51 dquot_disable+0x13b7/0x18c0 Modules linked in: CPU: 1 PID: 14870 Comm: fsconfig Not tainted 6.3.0-next-20230505-00006-g5107a9c821af-dirty #541 RIP: 0010:dquot_disable+0x13b7/0x18c0 RSP: 0018:ffffc9000acc79e0 EFLAGS: 00010246 RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff88825e41b980 RDX: 0000000000000000 RSI: ffff88825e41b980 RDI: 0000000000000002 RBP: ffff888179f68000 R08: ffffffff82087ca7 R09: 0000000000000000 R10: 0000000000000001 R11: ffffed102f3ed026 R12: ffff888179f68130 R13: ffff888179f68110 R14: dffffc0000000000 R15: ffff888179f68118 FS: 00007f450a073740(0000) GS:ffff88882fc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007ffe96f2efd8 CR3: 000000025c8ad000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> dquot_load_quota_sb+0xd53/0x1060 dquot_resume+0x172/0x230 ext4_reconfigure+0x1dc6/0x27b0 reconfigure_super+0x515/0xa90 __x64_sys_fsconfig+0xb19/0xd20 do_syscall_64+0x39/0xb0 entry_SYSCALL_64_after_hwframe+0x63/0xcd
Above issue may happens as follows: ProcessA ProcessB ProcessC sys_fsconfig vfs_fsconfig_locked reconfigure_super ext4_remount dquot_suspend -> suspend all type quota
sys_fsconfig vfs_fsconfig_locked reconfigure_super ext4_remount dquot_resume ret = dquot_load_quota_sb add_dquot_ref do_open -> open file O_RDWR vfs_open do_dentry_open get_write_access atomic_inc_unless_negative(&inode->i_writecount) ext4_file_open dquot_file_open dquot_initialize __dquot_initialize dqget atomic_inc(&dquot->dq_count);
__dquot_initialize __dquot_initialize dqget if (!test_bit(DQ_ACTIVE_B, &dquot->dq_flags)) ext4_acquire_dquot -> Return error DQ_ACTIVE_B flag isn't set dquot_disable invalidate_dquots if (atomic_read(&dquot->dq_count)) dqgrab WARN_ON_ONCE(!test_bit(DQ_ACTIVE_B, &dquot->dq_flags)) -> Trigger warning
In the above scenario, 'dquot->dq_flags' has no DQ_ACTIVE_B is normal when dqgrab(). To solve above issue just replace the dqgrab() use in invalidate_dquots() with atomic_inc(&dquot->dq_count).
Signed-off-by: Ye Bin yebin10@huawei.com Signed-off-by: Jan Kara jack@suse.cz Message-Id: 20230605140731.2427629-3-yebin10@huawei.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/quota/dquot.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/fs/quota/dquot.c +++ b/fs/quota/dquot.c @@ -555,7 +555,7 @@ restart: continue; /* Wait for dquot users */ if (atomic_read(&dquot->dq_count)) { - dqgrab(dquot); + atomic_inc(&dquot->dq_count); spin_unlock(&dq_list_lock); /* * Once dqput() wakes us up, we know it's time to free
[ Upstream commit 0db117359e47750d8bd310d19f13e1c4ef7fc26a ]
HP Elite Presenter Mouse HID Record Descriptor shows two mouses (Repord ID 0x1 and 0x2), one keypad (Report ID 0x5), two Consumer Controls (Report IDs 0x6 and 0x3). Previous to this commit it registers one mouse, one keypad and one Consumer Control, and it was usable only as a digitl laser pointer (one of the two mouses). This patch defines the 464a USB device ID and enables the HID_QUIRK_MULTI_INPUT quirk for it, allowing to use the device both as a mouse and a digital laser pointer.
Signed-off-by: Marco Morandini marco.morandini@polimi.it Signed-off-by: Jiri Kosina jkosina@suse.cz Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/hid/hid-ids.h | 1 + drivers/hid/hid-quirks.c | 1 + 2 files changed, 2 insertions(+)
--- a/drivers/hid/hid-ids.h +++ b/drivers/hid/hid-ids.h @@ -620,6 +620,7 @@ #define USB_DEVICE_ID_UGCI_FIGHTING 0x0030
#define USB_VENDOR_ID_HP 0x03f0 +#define USB_PRODUCT_ID_HP_ELITE_PRESENTER_MOUSE_464A 0x464a #define USB_PRODUCT_ID_HP_LOGITECH_OEM_USB_OPTICAL_MOUSE_0A4A 0x0a4a #define USB_PRODUCT_ID_HP_LOGITECH_OEM_USB_OPTICAL_MOUSE_0B4A 0x0b4a #define USB_PRODUCT_ID_HP_PIXART_OEM_USB_OPTICAL_MOUSE 0x134a --- a/drivers/hid/hid-quirks.c +++ b/drivers/hid/hid-quirks.c @@ -96,6 +96,7 @@ static const struct hid_device_id hid_qu { HID_USB_DEVICE(USB_VENDOR_ID_HOLTEK_ALT, USB_DEVICE_ID_HOLTEK_ALT_KEYBOARD_A096), HID_QUIRK_NO_INIT_REPORTS }, { HID_USB_DEVICE(USB_VENDOR_ID_HOLTEK_ALT, USB_DEVICE_ID_HOLTEK_ALT_KEYBOARD_A293), HID_QUIRK_ALWAYS_POLL }, { HID_USB_DEVICE(USB_VENDOR_ID_HP, USB_PRODUCT_ID_HP_LOGITECH_OEM_USB_OPTICAL_MOUSE_0A4A), HID_QUIRK_ALWAYS_POLL }, + { HID_BLUETOOTH_DEVICE(USB_VENDOR_ID_HP, USB_PRODUCT_ID_HP_ELITE_PRESENTER_MOUSE_464A), HID_QUIRK_MULTI_INPUT }, { HID_USB_DEVICE(USB_VENDOR_ID_HP, USB_PRODUCT_ID_HP_LOGITECH_OEM_USB_OPTICAL_MOUSE_0B4A), HID_QUIRK_ALWAYS_POLL }, { HID_USB_DEVICE(USB_VENDOR_ID_HP, USB_PRODUCT_ID_HP_PIXART_OEM_USB_OPTICAL_MOUSE), HID_QUIRK_ALWAYS_POLL }, { HID_USB_DEVICE(USB_VENDOR_ID_HP, USB_PRODUCT_ID_HP_PIXART_OEM_USB_OPTICAL_MOUSE_094A), HID_QUIRK_ALWAYS_POLL },
[ Upstream commit f723edb8a532cd26e1ff0a2b271d73762d48f762 ]
Porting overlayfs to the new amount api I started experiencing random crashes that couldn't be explained easily. So after much debugging and reasoning it became clear that struct ovl_entry requires the point to struct vfsmount to be the first member and of type struct vfsmount.
During the port I added a new member at the beginning of struct ovl_entry which broke all over the place in the form of random crashes and cache corruptions. While there's a comment in ovl_free_fs() to the effect of "Hack! Reuse ofs->layers as a vfsmount array before freeing it" there's no such comment on struct ovl_entry which makes this easy to trip over.
Add a comment and two static asserts for both the offset and the type of pointer in struct ovl_entry.
Signed-off-by: Christian Brauner brauner@kernel.org Signed-off-by: Amir Goldstein amir73il@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/overlayfs/ovl_entry.h | 9 +++++++++ 1 file changed, 9 insertions(+)
--- a/fs/overlayfs/ovl_entry.h +++ b/fs/overlayfs/ovl_entry.h @@ -32,6 +32,7 @@ struct ovl_sb { };
struct ovl_layer { + /* ovl_free_fs() relies on @mnt being the first member! */ struct vfsmount *mnt; /* Trap in ovl inode cache */ struct inode *trap; @@ -42,6 +43,14 @@ struct ovl_layer { int fsid; };
+/* + * ovl_free_fs() relies on @mnt being the first member when unmounting + * the private mounts created for each layer. Let's check both the + * offset and type. + */ +static_assert(offsetof(struct ovl_layer, mnt) == 0); +static_assert(__same_type(typeof_member(struct ovl_layer, mnt), struct vfsmount *)); + struct ovl_path { const struct ovl_layer *layer; struct dentry *dentry;
[ Upstream commit 028f6055c912588e6f72722d89c30b401bbcf013 ]
For filenames that begin with . and are between 2 and 5 characters long, UDF charset conversion code would read uninitialized memory in the output buffer. The only practical impact is that the name may be prepended a "unification hash" when it is not actually needed but still it is good to fix this.
Reported-by: syzbot+cd311b1e43cc25f90d18@syzkaller.appspotmail.com Link: https://lore.kernel.org/all/000000000000e2638a05fe9dc8f9@google.com Signed-off-by: Jan Kara jack@suse.cz Signed-off-by: Sasha Levin sashal@kernel.org --- fs/udf/unicode.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/udf/unicode.c b/fs/udf/unicode.c index 622569007b530..2142cbd1dde24 100644 --- a/fs/udf/unicode.c +++ b/fs/udf/unicode.c @@ -247,7 +247,7 @@ static int udf_name_from_CS0(struct super_block *sb, }
if (translate) { - if (str_o_len <= 2 && str_o[0] == '.' && + if (str_o_len > 0 && str_o_len <= 2 && str_o[0] == '.' && (str_o_len == 1 || str_o[1] == '.')) needsCRC = 1; if (needsCRC) {
[ Upstream commit 724418b84e6248cd27599607b7e5fac365b8e3f5 ]
This requires a patched ACPI table or a firmware from ASUS to work because the system does not come with the _DSD field for the CSC3551.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=217550 Signed-off-by: Matthew Anderson ruinairas1992@gmail.com Tested-by: Philip Mueller philm@manjaro.org Link: https://lore.kernel.org/r/20230621161714.9442-1-ruinairas1992@gmail.com Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Sasha Levin sashal@kernel.org --- sound/pci/hda/patch_realtek.c | 46 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 46 insertions(+)
--- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -7136,6 +7136,10 @@ enum { ALC294_FIXUP_ASUS_DUAL_SPK, ALC285_FIXUP_THINKPAD_X1_GEN7, ALC285_FIXUP_THINKPAD_HEADSET_JACK, + ALC294_FIXUP_ASUS_ALLY, + ALC294_FIXUP_ASUS_ALLY_PINS, + ALC294_FIXUP_ASUS_ALLY_VERBS, + ALC294_FIXUP_ASUS_ALLY_SPEAKER, ALC294_FIXUP_ASUS_HPE, ALC294_FIXUP_ASUS_COEF_1B, ALC294_FIXUP_ASUS_GX502_HP, @@ -8449,6 +8453,47 @@ static const struct hda_fixup alc269_fix .chained = true, .chain_id = ALC294_FIXUP_SPK2_TO_DAC1 }, + [ALC294_FIXUP_ASUS_ALLY] = { + .type = HDA_FIXUP_FUNC, + .v.func = cs35l41_fixup_i2c_two, + .chained = true, + .chain_id = ALC294_FIXUP_ASUS_ALLY_PINS + }, + [ALC294_FIXUP_ASUS_ALLY_PINS] = { + .type = HDA_FIXUP_PINS, + .v.pins = (const struct hda_pintbl[]) { + { 0x19, 0x03a11050 }, + { 0x1a, 0x03a11c30 }, + { 0x21, 0x03211420 }, + { } + }, + .chained = true, + .chain_id = ALC294_FIXUP_ASUS_ALLY_VERBS + }, + [ALC294_FIXUP_ASUS_ALLY_VERBS] = { + .type = HDA_FIXUP_VERBS, + .v.verbs = (const struct hda_verb[]) { + { 0x20, AC_VERB_SET_COEF_INDEX, 0x45 }, + { 0x20, AC_VERB_SET_PROC_COEF, 0x5089 }, + { 0x20, AC_VERB_SET_COEF_INDEX, 0x46 }, + { 0x20, AC_VERB_SET_PROC_COEF, 0x0004 }, + { 0x20, AC_VERB_SET_COEF_INDEX, 0x47 }, + { 0x20, AC_VERB_SET_PROC_COEF, 0xa47a }, + { 0x20, AC_VERB_SET_COEF_INDEX, 0x49 }, + { 0x20, AC_VERB_SET_PROC_COEF, 0x0049}, + { 0x20, AC_VERB_SET_COEF_INDEX, 0x4a }, + { 0x20, AC_VERB_SET_PROC_COEF, 0x201b }, + { 0x20, AC_VERB_SET_COEF_INDEX, 0x6b }, + { 0x20, AC_VERB_SET_PROC_COEF, 0x4278}, + { } + }, + .chained = true, + .chain_id = ALC294_FIXUP_ASUS_ALLY_SPEAKER + }, + [ALC294_FIXUP_ASUS_ALLY_SPEAKER] = { + .type = HDA_FIXUP_FUNC, + .v.func = alc285_fixup_speaker2_to_dac1, + }, [ALC285_FIXUP_THINKPAD_X1_GEN7] = { .type = HDA_FIXUP_FUNC, .v.func = alc285_fixup_thinkpad_x1_gen7, @@ -9557,6 +9602,7 @@ static const struct snd_pci_quirk alc269 SND_PCI_QUIRK(0x1043, 0x16e3, "ASUS UX50", ALC269_FIXUP_STEREO_DMIC), SND_PCI_QUIRK(0x1043, 0x1740, "ASUS UX430UA", ALC295_FIXUP_ASUS_DACS), SND_PCI_QUIRK(0x1043, 0x17d1, "ASUS UX431FL", ALC294_FIXUP_ASUS_DUAL_SPK), + SND_PCI_QUIRK(0x1043, 0x17f3, "ROG Ally RC71L_RC71L", ALC294_FIXUP_ASUS_ALLY), SND_PCI_QUIRK(0x1043, 0x1881, "ASUS Zephyrus S/M", ALC294_FIXUP_ASUS_GX502_PINS), SND_PCI_QUIRK(0x1043, 0x18b1, "Asus MJ401TA", ALC256_FIXUP_ASUS_HEADSET_MIC), SND_PCI_QUIRK(0x1043, 0x18f1, "Asus FX505DT", ALC256_FIXUP_ASUS_HEADSET_MIC),
[ Upstream commit 4e302336d5ca1767a06beee7596a72d3bdc8d983 ]
Syzkaller reported the following issue:
UBSAN: array-index-out-of-bounds in fs/jfs/jfs_dmap.c:1965:6 index -84 is out of range for type 's8[341]' (aka 'signed char[341]') CPU: 1 PID: 4995 Comm: syz-executor146 Not tainted 6.4.0-rc6-syzkaller-00037-gb6dad5178cea #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 05/27/2023 Call Trace: <TASK> __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0x1e7/0x2d0 lib/dump_stack.c:106 ubsan_epilogue lib/ubsan.c:217 [inline] __ubsan_handle_out_of_bounds+0x11c/0x150 lib/ubsan.c:348 dbAllocDmapLev+0x3e5/0x430 fs/jfs/jfs_dmap.c:1965 dbAllocCtl+0x113/0x920 fs/jfs/jfs_dmap.c:1809 dbAllocAG+0x28f/0x10b0 fs/jfs/jfs_dmap.c:1350 dbAlloc+0x658/0xca0 fs/jfs/jfs_dmap.c:874 dtSplitUp fs/jfs/jfs_dtree.c:974 [inline] dtInsert+0xda7/0x6b00 fs/jfs/jfs_dtree.c:863 jfs_create+0x7b6/0xbb0 fs/jfs/namei.c:137 lookup_open fs/namei.c:3492 [inline] open_last_lookups fs/namei.c:3560 [inline] path_openat+0x13df/0x3170 fs/namei.c:3788 do_filp_open+0x234/0x490 fs/namei.c:3818 do_sys_openat2+0x13f/0x500 fs/open.c:1356 do_sys_open fs/open.c:1372 [inline] __do_sys_openat fs/open.c:1388 [inline] __se_sys_openat fs/open.c:1383 [inline] __x64_sys_openat+0x247/0x290 fs/open.c:1383 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd RIP: 0033:0x7f1f4e33f7e9 Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 51 14 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 c0 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007ffc21129578 EFLAGS: 00000246 ORIG_RAX: 0000000000000101 RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f1f4e33f7e9 RDX: 000000000000275a RSI: 0000000020000040 RDI: 00000000ffffff9c RBP: 00007f1f4e2ff080 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 00007f1f4e2ff110 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 </TASK>
The bug occurs when the dbAllocDmapLev()function attempts to access dp->tree.stree[leafidx + LEAFIND] while the leafidx value is negative.
To rectify this, the patch introduces a safeguard within the dbAllocDmapLev() function. A check has been added to verify if leafidx is negative. If it is, the function immediately returns an I/O error, preventing any further execution that could potentially cause harm.
Tested via syzbot.
Reported-by: syzbot+853a6f4dfa3cf37d3aea@syzkaller.appspotmail.com Link: https://syzkaller.appspot.com/bug?extid=ae2f5a27a07ae44b0f17 Signed-off-by: Yogesh yogi.kernel@gmail.com Signed-off-by: Dave Kleikamp dave.kleikamp@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/jfs/jfs_dmap.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/fs/jfs/jfs_dmap.c +++ b/fs/jfs/jfs_dmap.c @@ -1959,6 +1959,9 @@ dbAllocDmapLev(struct bmap * bmp, if (dbFindLeaf((dmtree_t *) & dp->tree, l2nb, &leafidx)) return -ENOSPC;
+ if (leafidx < 0) + return -EIO; + /* determine the block number within the file system corresponding * to the leaf at which free space was found. */
[ Upstream commit 7b191b9b55df2a844bd32d1d380f47a7df1c2896 ]
Zero-length arrays are deprecated, and we are replacing them with flexible array members instead. So, replace zero-length array with flexible-array member in struct memmap.
Address the following warning found after building (with GCC-13) mips64 with decstation_64_defconfig: In function 'rex_setup_memory_region', inlined from 'prom_meminit' at arch/mips/dec/prom/memory.c:91:3: arch/mips/dec/prom/memory.c:72:31: error: array subscript i is outside array bounds of 'unsigned char[0]' [-Werror=array-bounds=] 72 | if (bm->bitmap[i] == 0xff) | ~~~~~~~~~~^~~ In file included from arch/mips/dec/prom/memory.c:16: ./arch/mips/include/asm/dec/prom.h: In function 'prom_meminit': ./arch/mips/include/asm/dec/prom.h:73:23: note: while referencing 'bitmap' 73 | unsigned char bitmap[0];
This helps with the ongoing efforts to globally enable -Warray-bounds.
This results in no differences in binary output.
Link: https://github.com/KSPP/linux/issues/79 Link: https://github.com/KSPP/linux/issues/323 Signed-off-by: Gustavo A. R. Silva gustavoars@kernel.org Signed-off-by: Thomas Bogendoerfer tsbogend@alpha.franken.de Signed-off-by: Sasha Levin sashal@kernel.org --- arch/mips/include/asm/dec/prom.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/mips/include/asm/dec/prom.h b/arch/mips/include/asm/dec/prom.h index 1e1247add1cf8..908e96e3a3117 100644 --- a/arch/mips/include/asm/dec/prom.h +++ b/arch/mips/include/asm/dec/prom.h @@ -70,7 +70,7 @@ static inline bool prom_is_rex(u32 magic) */ typedef struct { int pagesize; - unsigned char bitmap[0]; + unsigned char bitmap[]; } memmap;
[ Upstream commit 47cfdc338d674d38f4b2f22b7612cc6a2763ba27 ]
Syzkaller reported an issue where txBegin may be called on a superblock in a read-only mounted filesystem which leads to NULL pointer deref. This could be solved by checking if the filesystem is read-only before calling txBegin, and returning with appropiate error code.
Reported-By: syzbot+f1faa20eec55e0c8644c@syzkaller.appspotmail.com Link: https://syzkaller.appspot.com/bug?id=be7e52c50c5182cc09a09ea6fc456446b2039de...
Signed-off-by: Immad Mir mirimmad17@gmail.com Signed-off-by: Dave Kleikamp dave.kleikamp@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/jfs/namei.c | 5 +++++ 1 file changed, 5 insertions(+)
--- a/fs/jfs/namei.c +++ b/fs/jfs/namei.c @@ -799,6 +799,11 @@ static int jfs_link(struct dentry *old_d if (rc) goto out;
+ if (isReadOnly(ip)) { + jfs_error(ip->i_sb, "read-only filesystem\n"); + return -EROFS; + } + tid = txBegin(ip->i_sb, 0);
mutex_lock_nested(&JFS_IP(dir)->commit_mutex, COMMIT_MUTEX_PARENT);
[ Upstream commit 95e2b352c03b0a86c5717ba1d24ea20969abcacc ]
This patch adds a check for read-only mounted filesystem in txBegin before starting a transaction potentially saving from NULL pointer deref.
Signed-off-by: Immad Mir mirimmad17@gmail.com Signed-off-by: Dave Kleikamp dave.kleikamp@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/jfs/jfs_txnmgr.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/fs/jfs/jfs_txnmgr.c b/fs/jfs/jfs_txnmgr.c index c8ce7f1bc5942..6f6a5b9203d3f 100644 --- a/fs/jfs/jfs_txnmgr.c +++ b/fs/jfs/jfs_txnmgr.c @@ -354,6 +354,11 @@ tid_t txBegin(struct super_block *sb, int flag) jfs_info("txBegin: flag = 0x%x", flag); log = JFS_SBI(sb)->log;
+ if (!log) { + jfs_error(sb, "read-only filesystem\n"); + return 0; + } + TXN_LOCK();
INCREMENT(TxStat.txBegin);
From: Yu Kuai yukuai3@huawei.com
[ Upstream commit 873f50ece41aad5c4f788a340960c53774b5526e ]
Currently, if reshape is interrupted, echo "reshape" to sync_action will restart reshape from scratch, for example:
echo frozen > sync_action echo reshape > sync_action
This will corrupt data before reshape_position if the array is growing, fix the problem by continue reshape from reshape_position.
Reported-by: Peter Neuwirth reddunur@online.de Link: https://lore.kernel.org/linux-raid/e2f96772-bfbc-f43b-6da1-f520e5164536@onli... Signed-off-by: Yu Kuai yukuai3@huawei.com Signed-off-by: Song Liu song@kernel.org Link: https://lore.kernel.org/r/20230512015610.821290-3-yukuai1@huaweicloud.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/md/md.c | 14 ++++++++++++-- 1 file changed, 12 insertions(+), 2 deletions(-)
diff --git a/drivers/md/md.c b/drivers/md/md.c index 350094f1cb09f..18384251399ab 100644 --- a/drivers/md/md.c +++ b/drivers/md/md.c @@ -4807,11 +4807,21 @@ action_store(struct mddev *mddev, const char *page, size_t len) return -EINVAL; err = mddev_lock(mddev); if (!err) { - if (test_bit(MD_RECOVERY_RUNNING, &mddev->recovery)) + if (test_bit(MD_RECOVERY_RUNNING, &mddev->recovery)) { err = -EBUSY; - else { + } else if (mddev->reshape_position == MaxSector || + mddev->pers->check_reshape == NULL || + mddev->pers->check_reshape(mddev)) { clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery); err = mddev->pers->start_reshape(mddev); + } else { + /* + * If reshape is still in progress, and + * md_check_recovery() can continue to reshape, + * don't restart reshape because data can be + * corrupted for raid456. + */ + clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery); } mddev_unlock(mddev); }
From: Yu Kuai yukuai3@huawei.com
[ Upstream commit 010444623e7f4da6b4a4dd603a7da7469981e293 ]
Currently, there is no limit for raid1/raid10 plugged bio. While flushing writes, raid1 has cond_resched() while raid10 doesn't, and too many writes can cause soft lockup.
Follow up soft lockup can be triggered easily with writeback test for raid10 with ramdisks:
watchdog: BUG: soft lockup - CPU#10 stuck for 27s! [md0_raid10:1293] Call Trace: <TASK> call_rcu+0x16/0x20 put_object+0x41/0x80 __delete_object+0x50/0x90 delete_object_full+0x2b/0x40 kmemleak_free+0x46/0xa0 slab_free_freelist_hook.constprop.0+0xed/0x1a0 kmem_cache_free+0xfd/0x300 mempool_free_slab+0x1f/0x30 mempool_free+0x3a/0x100 bio_free+0x59/0x80 bio_put+0xcf/0x2c0 free_r10bio+0xbf/0xf0 raid_end_bio_io+0x78/0xb0 one_write_done+0x8a/0xa0 raid10_end_write_request+0x1b4/0x430 bio_endio+0x175/0x320 brd_submit_bio+0x3b9/0x9b7 [brd] __submit_bio+0x69/0xe0 submit_bio_noacct_nocheck+0x1e6/0x5a0 submit_bio_noacct+0x38c/0x7e0 flush_pending_writes+0xf0/0x240 raid10d+0xac/0x1ed0
Fix the problem by adding cond_resched() to raid10 like what raid1 did.
Note that unlimited plugged bio still need to be optimized, for example, in the case of lots of dirty pages writeback, this will take lots of memory and io will spend a long time in plug, hence io latency is bad.
Signed-off-by: Yu Kuai yukuai3@huawei.com Signed-off-by: Song Liu song@kernel.org Link: https://lore.kernel.org/r/20230529131106.2123367-2-yukuai1@huaweicloud.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/md/raid10.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c index 9d23963496194..ee75b058438f3 100644 --- a/drivers/md/raid10.c +++ b/drivers/md/raid10.c @@ -920,6 +920,7 @@ static void flush_pending_writes(struct r10conf *conf)
raid1_submit_write(bio); bio = next; + cond_resched(); } blk_finish_plug(&plug); } else @@ -1132,6 +1133,7 @@ static void raid10_unplug(struct blk_plug_cb *cb, bool from_schedule)
raid1_submit_write(bio); bio = next; + cond_resched(); } kfree(plug); }
From: Yu Kuai yukuai3@huawei.com
[ Upstream commit db59133e927916d8a25ee1fd8264f2808040909d ]
sg_ioctl() support to enable blktrace, which will create debugfs entries "/sys/kernel/debug/block/sgx/", however, there is no guarantee that user will remove these entries through ioctl, and deleting sg device doesn't cleanup these blktrace entries.
This problem can be fixed by cleanup blktrace while releasing request_queue, however, it's not a good idea to do this special handling in common layer just for sg device.
Fix this problem by shutdown bltkrace in sg_device_destroy(), where the device is deleted and all the users close the device, also grab a scsi_device reference from sg_add_device() to prevent scsi_device to be freed before sg_device_destroy();
Signed-off-by: Yu Kuai yukuai3@huawei.com Reviewed-by: Christoph Hellwig hch@lst.de Reviewed-by: Martin K. Petersen martin.petersen@oracle.com Link: https://lore.kernel.org/r/20230610022003.2557284-3-yukuai1@huaweicloud.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/sg.c | 9 +++++++++ 1 file changed, 9 insertions(+)
diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c index 037f8c98a6d36..0adfbd77437f3 100644 --- a/drivers/scsi/sg.c +++ b/drivers/scsi/sg.c @@ -1496,6 +1496,10 @@ sg_add_device(struct device *cl_dev) int error; unsigned long iflags;
+ error = scsi_device_get(scsidp); + if (error) + return error; + error = -ENOMEM; cdev = cdev_alloc(); if (!cdev) { @@ -1553,6 +1557,7 @@ sg_add_device(struct device *cl_dev) out: if (cdev) cdev_del(cdev); + scsi_device_put(scsidp); return error; }
@@ -1560,6 +1565,7 @@ static void sg_device_destroy(struct kref *kref) { struct sg_device *sdp = container_of(kref, struct sg_device, d_ref); + struct request_queue *q = sdp->device->request_queue; unsigned long flags;
/* CAUTION! Note that the device can still be found via idr_find() @@ -1567,6 +1573,9 @@ sg_device_destroy(struct kref *kref) * any other cleanup. */
+ blk_trace_remove(q); + scsi_device_put(sdp->device); + write_lock_irqsave(&sg_index_lock, flags); idr_remove(&sg_index_idr, sdp->index); write_unlock_irqrestore(&sg_index_lock, flags);
From: Ming Lei ming.lei@redhat.com
[ Upstream commit 245165658e1c9f95c0fecfe02b9b1ebd30a1198a ]
After grabbing q->sysfs_lock, q->elevator may become NULL because of elevator switch.
Fix the NULL dereference on q->elevator by checking it with lock.
Reported-by: Guangwu Zhang guazhang@redhat.com Signed-off-by: Ming Lei ming.lei@redhat.com Link: https://lore.kernel.org/r/20230616132354.415109-1-ming.lei@redhat.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- block/blk-mq.c | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-)
diff --git a/block/blk-mq.c b/block/blk-mq.c index b9f4546139894..73ed8ccb09ce8 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -4617,9 +4617,6 @@ static bool blk_mq_elv_switch_none(struct list_head *head, { struct blk_mq_qe_pair *qe;
- if (!q->elevator) - return true; - qe = kmalloc(sizeof(*qe), GFP_NOIO | __GFP_NOWARN | __GFP_NORETRY); if (!qe) return false; @@ -4627,6 +4624,12 @@ static bool blk_mq_elv_switch_none(struct list_head *head, /* q->elevator needs protection from ->sysfs_lock */ mutex_lock(&q->sysfs_lock);
+ /* the check has to be done with holding sysfs_lock */ + if (!q->elevator) { + kfree(qe); + goto unlock; + } + INIT_LIST_HEAD(&qe->node); qe->q = q; qe->type = q->elevator->type; @@ -4634,6 +4637,7 @@ static bool blk_mq_elv_switch_none(struct list_head *head, __elevator_get(qe->type); list_add(&qe->node, head); elevator_disable(q); +unlock: mutex_unlock(&q->sysfs_lock);
return true;
From: Thomas Gleixner tglx@linutronix.de
[ Upstream commit 8ce8849dd1e78dadcee0ec9acbd259d239b7069f ]
posix_timer_add() tries to allocate a posix timer ID by starting from the cached ID which was stored by the last successful allocation.
This is done in a loop searching the ID space for a free slot one by one. The loop has to terminate when the search wrapped around to the starting point.
But that's racy vs. establishing the starting point. That is read out lockless, which leads to the following problem:
CPU0 CPU1 posix_timer_add() start = sig->posix_timer_id; lock(hash_lock); ... posix_timer_add() if (++sig->posix_timer_id < 0) start = sig->posix_timer_id; sig->posix_timer_id = 0;
So CPU1 can observe a negative start value, i.e. -1, and the loop break never happens because the condition can never be true:
if (sig->posix_timer_id == start) break;
While this is unlikely to ever turn into an endless loop as the ID space is huge (INT_MAX), the racy read of the start value caught the attention of KCSAN and Dmitry unearthed that incorrectness.
Rewrite it so that all id operations are under the hash lock.
Reported-by: syzbot+5c54bd3eb218bb595aa9@syzkaller.appspotmail.com Reported-by: Dmitry Vyukov dvyukov@google.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Reviewed-by: Frederic Weisbecker frederic@kernel.org Link: https://lore.kernel.org/r/87bkhzdn6g.ffs@tglx Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/sched/signal.h | 2 +- kernel/time/posix-timers.c | 31 ++++++++++++++++++------------- 2 files changed, 19 insertions(+), 14 deletions(-)
diff --git a/include/linux/sched/signal.h b/include/linux/sched/signal.h index 20099268fa257..669e8cff40c74 100644 --- a/include/linux/sched/signal.h +++ b/include/linux/sched/signal.h @@ -135,7 +135,7 @@ struct signal_struct { #ifdef CONFIG_POSIX_TIMERS
/* POSIX.1b Interval Timers */ - int posix_timer_id; + unsigned int next_posix_timer_id; struct list_head posix_timers;
/* ITIMER_REAL timer for the process */ diff --git a/kernel/time/posix-timers.c b/kernel/time/posix-timers.c index ed3c4a9543982..2d6cf93ca370a 100644 --- a/kernel/time/posix-timers.c +++ b/kernel/time/posix-timers.c @@ -140,25 +140,30 @@ static struct k_itimer *posix_timer_by_id(timer_t id) static int posix_timer_add(struct k_itimer *timer) { struct signal_struct *sig = current->signal; - int first_free_id = sig->posix_timer_id; struct hlist_head *head; - int ret = -ENOENT; + unsigned int cnt, id;
- do { + /* + * FIXME: Replace this by a per signal struct xarray once there is + * a plan to handle the resulting CRIU regression gracefully. + */ + for (cnt = 0; cnt <= INT_MAX; cnt++) { spin_lock(&hash_lock); - head = &posix_timers_hashtable[hash(sig, sig->posix_timer_id)]; - if (!__posix_timers_find(head, sig, sig->posix_timer_id)) { + id = sig->next_posix_timer_id; + + /* Write the next ID back. Clamp it to the positive space */ + sig->next_posix_timer_id = (id + 1) & INT_MAX; + + head = &posix_timers_hashtable[hash(sig, id)]; + if (!__posix_timers_find(head, sig, id)) { hlist_add_head_rcu(&timer->t_hash, head); - ret = sig->posix_timer_id; + spin_unlock(&hash_lock); + return id; } - if (++sig->posix_timer_id < 0) - sig->posix_timer_id = 0; - if ((sig->posix_timer_id == first_free_id) && (ret == -ENOENT)) - /* Loop over all possible ids completed */ - ret = -EAGAIN; spin_unlock(&hash_lock); - } while (ret == -ENOENT); - return ret; + } + /* POSIX return code when no timer ID could be allocated */ + return -EAGAIN; }
static inline void unlock_timer(struct k_itimer *timr, unsigned long flags)
From: David Sterba dsterba@suse.com
[ Upstream commit efcfcbc6a36195c42d98e0ee697baba36da94dc8 ]
The implementation of XXHASH is now CPU only but still fast enough to be considered for the synchronous checksumming, like non-generic crc32c.
A userspace benchmark comparing it to various implementations (patched hash-speedtest from btrfs-progs):
Block size: 4096 Iterations: 1000000 Implementation: builtin Units: CPU cycles
NULL-NOP: cycles: 73384294, cycles/i 73 NULL-MEMCPY: cycles: 228033868, cycles/i 228, 61664.320 MiB/s CRC32C-ref: cycles: 24758559416, cycles/i 24758, 567.950 MiB/s CRC32C-NI: cycles: 1194350470, cycles/i 1194, 11773.433 MiB/s CRC32C-ADLERSW: cycles: 6150186216, cycles/i 6150, 2286.372 MiB/s CRC32C-ADLERHW: cycles: 626979180, cycles/i 626, 22427.453 MiB/s CRC32C-PCL: cycles: 466746732, cycles/i 466, 30126.699 MiB/s XXHASH: cycles: 860656400, cycles/i 860, 16338.188 MiB/s
Comparing purely software implementation (ref), current outdated accelerated using crc32q instruction (NI), optimized implementations by M. Adler (https://stackoverflow.com/questions/17645167/implementing-sse-4-2s-crc32c-in...) and the best one that was taken from kernel using the PCLMULQDQ instruction (PCL).
Reviewed-by: Christoph Hellwig hch@lst.de Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/btrfs/disk-io.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index fc59eb4024438..795b30913c542 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -2265,6 +2265,9 @@ static int btrfs_init_csum_hash(struct btrfs_fs_info *fs_info, u16 csum_type) if (!strstr(crypto_shash_driver_name(csum_shash), "generic")) set_bit(BTRFS_FS_CSUM_IMPL_FAST, &fs_info->flags); break; + case BTRFS_CSUM_TYPE_XXHASH: + set_bit(BTRFS_FS_CSUM_IMPL_FAST, &fs_info->flags); + break; default: break; }
From: Christoph Hellwig hch@lst.de
[ Upstream commit 3e92499e3b004baffb479d61e191b41b604ece9a ]
__extent_writepage currenly sets PageError whenever any error happens, and the also checks for PageError to decide if to call error handling. This leads to very unclear responsibility for cleaning up on errors. In the VM and generic writeback helpers the basic idea is that once I/O is fired off all error handling responsibility is delegated to the end I/O handler. But if that end I/O handler sets the PageError bit, and the submitter checks it, the bit could in some cases leak into the submission context for fast enough I/O.
Fix this by simply not checking PageError and just using the local ret variable to check for submission errors. This also fundamentally solves the long problem documented in a comment in __extent_writepage by never leaking the error bit into the submission context.
Reviewed-by: Josef Bacik josef@toxicpanda.com Signed-off-by: Christoph Hellwig hch@lst.de Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/btrfs/extent_io.c | 33 +-------------------------------- 1 file changed, 1 insertion(+), 32 deletions(-)
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index e3ae55d8bae14..a37a6587efaf0 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -1592,38 +1592,7 @@ static int __extent_writepage(struct page *page, struct btrfs_bio_ctrl *bio_ctrl set_page_writeback(page); end_page_writeback(page); } - /* - * Here we used to have a check for PageError() and then set @ret and - * call end_extent_writepage(). - * - * But in fact setting @ret here will cause different error paths - * between subpage and regular sectorsize. - * - * For regular page size, we never submit current page, but only add - * current page to current bio. - * The bio submission can only happen in next page. - * Thus if we hit the PageError() branch, @ret is already set to - * non-zero value and will not get updated for regular sectorsize. - * - * But for subpage case, it's possible we submit part of current page, - * thus can get PageError() set by submitted bio of the same page, - * while our @ret is still 0. - * - * So here we unify the behavior and don't set @ret. - * Error can still be properly passed to higher layer as page will - * be set error, here we just don't handle the IO failure. - * - * NOTE: This is just a hotfix for subpage. - * The root fix will be properly ending ordered extent when we hit - * an error during writeback. - * - * But that needs a bigger refactoring, as we not only need to grab the - * submitted OE, but also need to know exactly at which bytenr we hit - * the error. - * Currently the full page based __extent_writepage_io() is not - * capable of that. - */ - if (PageError(page)) + if (ret) end_extent_writepage(page, ret, page_start, page_end); unlock_page(page); ASSERT(ret <= 0);
From: Filipe Manana fdmanana@suse.com
[ Upstream commit eced687e224eb3cc5a501cf53ad9291337c8dbc5 ]
At update_ref_for_cow() we are calling btrfs_handle_fs_error() if we find that the extent buffer has an unexpected ref count of zero, however we can simply use btrfs_abort_transaction(), which achieves the same purposes: to turn the fs to error state, abort the current transaction and turn the fs to RO mode as well. Besides that, btrfs_abort_transaction() also prints a stack trace which makes it more useful.
Also, as this is a very unexpected situation, indicating a serious corruption/inconsistency, tag the if branch as 'unlikely', set the error code to -EUCLEAN instead of -EROFS, and log an explicit message.
Reviewed-by: Qu Wenruo wqu@suse.com Signed-off-by: Filipe Manana fdmanana@suse.com Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/btrfs/ctree.c | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-)
diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c index 4912d624ca3d3..886e661a218fc 100644 --- a/fs/btrfs/ctree.c +++ b/fs/btrfs/ctree.c @@ -417,9 +417,13 @@ static noinline int update_ref_for_cow(struct btrfs_trans_handle *trans, &refs, &flags); if (ret) return ret; - if (refs == 0) { - ret = -EROFS; - btrfs_handle_fs_error(fs_info, ret, NULL); + if (unlikely(refs == 0)) { + btrfs_crit(fs_info, + "found 0 references for tree block at bytenr %llu level %d root %llu", + buf->start, btrfs_header_level(buf), + btrfs_root_id(root)); + ret = -EUCLEAN; + btrfs_abort_transaction(trans, ret); return ret; } } else {
From: Sandeep Dhavale dhavale@google.com
[ Upstream commit 12d0a24afd9ea58e581ea64d64e066f2027b28d9 ]
Current check for atomic context is not sufficient as z_erofs_decompressqueue_endio can be called under rcu lock from blk_mq_flush_plug_list(). See the stacktrace [1]
In such case we should hand off the decompression work for async processing rather than trying to do sync decompression in current context. Patch fixes the detection by checking for rcu_read_lock_any_held() and while at it use more appropriate !in_task() check than in_atomic().
Background: Historically erofs would always schedule a kworker for decompression which would incur the scheduling cost regardless of the context. But z_erofs_decompressqueue_endio() may not always be in atomic context and we could actually benefit from doing the decompression in z_erofs_decompressqueue_endio() if we are in thread context, for example when running with dm-verity. This optimization was later added in patch [2] which has shown improvement in performance benchmarks.
============================================== [1] Problem stacktrace [name:core&]BUG: sleeping function called from invalid context at kernel/locking/mutex.c:291 [name:core&]in_atomic(): 0, irqs_disabled(): 0, non_block: 0, pid: 1615, name: CpuMonitorServi [name:core&]preempt_count: 0, expected: 0 [name:core&]RCU nest depth: 1, expected: 0 CPU: 7 PID: 1615 Comm: CpuMonitorServi Tainted: G S W OE 6.1.25-android14-5-maybe-dirty-mainline #1 Hardware name: MT6897 (DT) Call trace: dump_backtrace+0x108/0x15c show_stack+0x20/0x30 dump_stack_lvl+0x6c/0x8c dump_stack+0x20/0x48 __might_resched+0x1fc/0x308 __might_sleep+0x50/0x88 mutex_lock+0x2c/0x110 z_erofs_decompress_queue+0x11c/0xc10 z_erofs_decompress_kickoff+0x110/0x1a4 z_erofs_decompressqueue_endio+0x154/0x180 bio_endio+0x1b0/0x1d8 __dm_io_complete+0x22c/0x280 clone_endio+0xe4/0x280 bio_endio+0x1b0/0x1d8 blk_update_request+0x138/0x3a4 blk_mq_plug_issue_direct+0xd4/0x19c blk_mq_flush_plug_list+0x2b0/0x354 __blk_flush_plug+0x110/0x160 blk_finish_plug+0x30/0x4c read_pages+0x2fc/0x370 page_cache_ra_unbounded+0xa4/0x23c page_cache_ra_order+0x290/0x320 do_sync_mmap_readahead+0x108/0x2c0 filemap_fault+0x19c/0x52c __do_fault+0xc4/0x114 handle_mm_fault+0x5b4/0x1168 do_page_fault+0x338/0x4b4 do_translation_fault+0x40/0x60 do_mem_abort+0x60/0xc8 el0_da+0x4c/0xe0 el0t_64_sync_handler+0xd4/0xfc el0t_64_sync+0x1a0/0x1a4
[2] Link: https://lore.kernel.org/all/20210317035448.13921-1-huangjianan@oppo.com/
Reported-by: Will Shiu Will.Shiu@mediatek.com Suggested-by: Gao Xiang xiang@kernel.org Signed-off-by: Sandeep Dhavale dhavale@google.com Reviewed-by: Gao Xiang hsiangkao@linux.alibaba.com Reviewed-by: Alexandre Mergnat amergnat@baylibre.com Link: https://lore.kernel.org/r/20230621220848.3379029-1-dhavale@google.com Signed-off-by: Gao Xiang hsiangkao@linux.alibaba.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/erofs/zdata.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/erofs/zdata.c b/fs/erofs/zdata.c index 997ca4b32e87f..4a1c238600c52 100644 --- a/fs/erofs/zdata.c +++ b/fs/erofs/zdata.c @@ -1411,7 +1411,7 @@ static void z_erofs_decompress_kickoff(struct z_erofs_decompressqueue *io, if (atomic_add_return(bios, &io->pending_bios)) return; /* Use (kthread_)work and sync decompression for atomic contexts only */ - if (in_atomic() || irqs_disabled()) { + if (!in_task() || irqs_disabled() || rcu_read_lock_any_held()) { #ifdef CONFIG_EROFS_FS_PCPU_KTHREAD struct kthread_worker *worker;
From: Hans de Goede hdegoede@redhat.com
[ Upstream commit 69d6b37695c1f2320cfa330e1e1636d50dd5040a ]
The Nextbook Ares 8A is a x86 ACPI tablet which ships with Android x86 as factory OS. Its DSDT contains a bunch of I2C devices which are not actually there (the Android x86 kernel fork ignores I2C devices described in the DSDT).
On this specific model this just not cause resource conflicts, one of the probe() calls for the non existing i2c_clients actually ends up toggling a GPIO or executing a _PS3 after a failed probe which turns the tablet off.
Add a ACPI_QUIRK_SKIP_I2C_CLIENTS for the Nextbook Ares 8 to the acpi_quirk_skip_dmi_ids table to avoid the bogus i2c_clients and to fix the tablet turning off during boot because of this.
Also add the "10EC5651" HID for the RealTek ALC5651 codec used in this tablet to the list of HIDs for which not to skipi2c_client instantiation, since the Intel SST sound driver relies on the codec being instantiated through ACPI.
Signed-off-by: Hans de Goede hdegoede@redhat.com Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/acpi/x86/utils.c | 13 ++++++++++++- 1 file changed, 12 insertions(+), 1 deletion(-)
diff --git a/drivers/acpi/x86/utils.c b/drivers/acpi/x86/utils.c index 9c2d6f35f88a0..4cfee2da06756 100644 --- a/drivers/acpi/x86/utils.c +++ b/drivers/acpi/x86/utils.c @@ -365,7 +365,7 @@ static const struct dmi_system_id acpi_quirk_skip_dmi_ids[] = { ACPI_QUIRK_SKIP_ACPI_AC_AND_BATTERY), }, { - /* Nextbook Ares 8 */ + /* Nextbook Ares 8 (BYT version)*/ .matches = { DMI_MATCH(DMI_SYS_VENDOR, "Insyde"), DMI_MATCH(DMI_PRODUCT_NAME, "M890BAP"), @@ -374,6 +374,16 @@ static const struct dmi_system_id acpi_quirk_skip_dmi_ids[] = { ACPI_QUIRK_SKIP_ACPI_AC_AND_BATTERY | ACPI_QUIRK_SKIP_GPIO_EVENT_HANDLERS), }, + { + /* Nextbook Ares 8A (CHT version)*/ + .matches = { + DMI_MATCH(DMI_SYS_VENDOR, "Insyde"), + DMI_MATCH(DMI_PRODUCT_NAME, "CherryTrail"), + DMI_MATCH(DMI_BIOS_VERSION, "M882"), + }, + .driver_data = (void *)(ACPI_QUIRK_SKIP_I2C_CLIENTS | + ACPI_QUIRK_SKIP_ACPI_AC_AND_BATTERY), + }, { /* Whitelabel (sold as various brands) TM800A550L */ .matches = { @@ -392,6 +402,7 @@ static const struct dmi_system_id acpi_quirk_skip_dmi_ids[] = { #if IS_ENABLED(CONFIG_X86_ANDROID_TABLETS) static const struct acpi_device_id i2c_acpi_known_good_ids[] = { { "10EC5640", 0 }, /* RealTek ALC5640 audio codec */ + { "10EC5651", 0 }, /* RealTek ALC5651 audio codec */ { "INT33F4", 0 }, /* X-Powers AXP288 PMIC */ { "INT33FD", 0 }, /* Intel Crystal Cove PMIC */ { "INT34D3", 0 }, /* Intel Whiskey Cove PMIC */
From: Hans de Goede hdegoede@redhat.com
[ Upstream commit 4fd5556608bfa9c2bf276fc115ef04288331aded ]
The LID0 device on the Nextbook Ares 8A tablet always reports lid closed causing userspace to suspend the device as soon as booting is complete.
Add a DMI quirk to disable the broken lid functionality.
Signed-off-by: Hans de Goede hdegoede@redhat.com Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/acpi/button.c | 9 +++++++++ 1 file changed, 9 insertions(+)
diff --git a/drivers/acpi/button.c b/drivers/acpi/button.c index 475e1eddfa3b4..ef77c14c72a92 100644 --- a/drivers/acpi/button.c +++ b/drivers/acpi/button.c @@ -77,6 +77,15 @@ static const struct dmi_system_id dmi_lid_quirks[] = { }, .driver_data = (void *)(long)ACPI_BUTTON_LID_INIT_DISABLED, }, + { + /* Nextbook Ares 8A tablet, _LID device always reports lid closed */ + .matches = { + DMI_MATCH(DMI_SYS_VENDOR, "Insyde"), + DMI_MATCH(DMI_PRODUCT_NAME, "CherryTrail"), + DMI_MATCH(DMI_BIOS_VERSION, "M882"), + }, + .driver_data = (void *)(long)ACPI_BUTTON_LID_INIT_DISABLED, + }, { /* * Lenovo Yoga 9 14ITL5, initial notification of the LID device
From: Hans de Goede hdegoede@redhat.com
[ Upstream commit f91280f35895d6dcb53f504968fafd1da0b00397 ]
The Lenovo Yoga Book yb1-x90f/l 2-in-1 which ships with Android as Factory OS has (another) bug in its DSDT where the UART resource for the BTH0 ACPI device contains "\_SB.PCIO.URT1" as path to the UART.
Note that is with a letter 'O' instead of the number '0' which is wrong.
This causes Linux to instantiate a standard /dev/ttyS? device for the UART instead of a /sys/bus/serial device, which in turn causes bluetooth to not work.
Similar DSDT bugs have been encountered before and to work around those the acpi_quirk_skip_serdev_enumeration() helper exists.
Previous devices had the broken resource pointing to the first UART, while the BT HCI was on the second UART, which ACPI_QUIRK_UART1_TTY_UART2_SKIP deals with. Add a new ACPI_QUIRK_UART1_SKIP quirk for skipping enumeration of UART1 instead for the Yoga Book case and add this quirk to the existing DMI quirk table entry for the yb1-x90f/l .
This leaves the UART1 controller unbound allowing the x86-android-tablets module to manually instantiate a serdev for it fixing bluetooth.
Signed-off-by: Hans de Goede hdegoede@redhat.com Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/acpi/x86/utils.c | 13 +++++++++---- 1 file changed, 9 insertions(+), 4 deletions(-)
diff --git a/drivers/acpi/x86/utils.c b/drivers/acpi/x86/utils.c index 4cfee2da06756..c2b925f8cd4e4 100644 --- a/drivers/acpi/x86/utils.c +++ b/drivers/acpi/x86/utils.c @@ -259,10 +259,11 @@ bool force_storage_d3(void) * drivers/platform/x86/x86-android-tablets.c kernel module. */ #define ACPI_QUIRK_SKIP_I2C_CLIENTS BIT(0) -#define ACPI_QUIRK_UART1_TTY_UART2_SKIP BIT(1) -#define ACPI_QUIRK_SKIP_ACPI_AC_AND_BATTERY BIT(2) -#define ACPI_QUIRK_USE_ACPI_AC_AND_BATTERY BIT(3) -#define ACPI_QUIRK_SKIP_GPIO_EVENT_HANDLERS BIT(4) +#define ACPI_QUIRK_UART1_SKIP BIT(1) +#define ACPI_QUIRK_UART1_TTY_UART2_SKIP BIT(2) +#define ACPI_QUIRK_SKIP_ACPI_AC_AND_BATTERY BIT(3) +#define ACPI_QUIRK_USE_ACPI_AC_AND_BATTERY BIT(4) +#define ACPI_QUIRK_SKIP_GPIO_EVENT_HANDLERS BIT(5)
static const struct dmi_system_id acpi_quirk_skip_dmi_ids[] = { /* @@ -319,6 +320,7 @@ static const struct dmi_system_id acpi_quirk_skip_dmi_ids[] = { DMI_EXACT_MATCH(DMI_PRODUCT_VERSION, "YETI-11"), }, .driver_data = (void *)(ACPI_QUIRK_SKIP_I2C_CLIENTS | + ACPI_QUIRK_UART1_SKIP | ACPI_QUIRK_SKIP_ACPI_AC_AND_BATTERY | ACPI_QUIRK_SKIP_GPIO_EVENT_HANDLERS), }, @@ -449,6 +451,9 @@ int acpi_quirk_skip_serdev_enumeration(struct device *controller_parent, bool *s if (dmi_id) quirks = (unsigned long)dmi_id->driver_data;
+ if ((quirks & ACPI_QUIRK_UART1_SKIP) && uid == 1) + *skip = true; + if (quirks & ACPI_QUIRK_UART1_TTY_UART2_SKIP) { if (uid == 1) return -ENODEV; /* Create tty cdev instead of serdev */
From: Hans de Goede hdegoede@redhat.com
[ Upstream commit 48436f2e9834b46b47b038b605c8142a1c07bc85 ]
Linux defaults to picking the non-working ACPI video backlight interface on the Apple iMac11,3 .
Add a DMI quirk to pick the working native radeon_bl0 interface instead.
Signed-off-by: Hans de Goede hdegoede@redhat.com Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/acpi/video_detect.c | 8 ++++++++ 1 file changed, 8 insertions(+)
diff --git a/drivers/acpi/video_detect.c b/drivers/acpi/video_detect.c index bcc25d457581d..61586caebb01b 100644 --- a/drivers/acpi/video_detect.c +++ b/drivers/acpi/video_detect.c @@ -470,6 +470,14 @@ static const struct dmi_system_id video_detect_dmi_table[] = { DMI_MATCH(DMI_PRODUCT_NAME, "82BK"), }, }, + { + .callback = video_detect_force_native, + /* Apple iMac11,3 */ + .matches = { + DMI_MATCH(DMI_SYS_VENDOR, "Apple Inc."), + DMI_MATCH(DMI_PRODUCT_NAME, "iMac11,3"), + }, + }, { /* https://bugzilla.redhat.com/show_bug.cgi?id=1217249 */ .callback = video_detect_force_native,
From: Hans de Goede hdegoede@redhat.com
[ Upstream commit bd5d93df86a7ddf98a2a37e9c3751e3cb334a66c ]
Linux defaults to picking the non-working ACPI video backlight interface on the Lenovo ThinkPad X131e (3371 AMD version).
Add a DMI quirk to pick the working native radeon_bl0 interface instead.
Signed-off-by: Hans de Goede hdegoede@redhat.com Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/acpi/video_detect.c | 8 ++++++++ 1 file changed, 8 insertions(+)
diff --git a/drivers/acpi/video_detect.c b/drivers/acpi/video_detect.c index 61586caebb01b..b87783c5872dd 100644 --- a/drivers/acpi/video_detect.c +++ b/drivers/acpi/video_detect.c @@ -470,6 +470,14 @@ static const struct dmi_system_id video_detect_dmi_table[] = { DMI_MATCH(DMI_PRODUCT_NAME, "82BK"), }, }, + { + .callback = video_detect_force_native, + /* Lenovo ThinkPad X131e (3371 AMD version) */ + .matches = { + DMI_MATCH(DMI_SYS_VENDOR, "LENOVO"), + DMI_MATCH(DMI_PRODUCT_NAME, "3371"), + }, + }, { .callback = video_detect_force_native, /* Apple iMac11,3 */
From: Mario Limonciello mario.limonciello@amd.com
[ Upstream commit a9c4a912b7dc7ff922d4b9261160c001558f9755 ]
commit 9946e39fe8d0 ("ACPI: resource: skip IRQ override on AMD Zen platforms") attempted to overhaul the override logic so it didn't apply on X86 AMD Zen systems. This was intentional so that systems would prefer DSDT values instead of default MADT value for IRQ 1 on Ryzen 6000 systems which typically uses ActiveLow for IRQ1.
This turned out to be a bad assumption because several vendors add Interrupt Source Override but don't fix the DSDT. A pile of quirks was collecting that proved this wasn't sustaintable.
Furthermore some vendors have used ActiveHigh for IRQ1. To solve this problem revert the following commits: * commit 17bb7046e7ce ("ACPI: resource: Do IRQ override on all TongFang GMxRGxx") * commit f3cb9b740869 ("ACPI: resource: do IRQ override on Lenovo 14ALC7") * commit bfcdf58380b1 ("ACPI: resource: do IRQ override on LENOVO IdeaPad") * commit 7592b79ba4a9 ("ACPI: resource: do IRQ override on XMG Core 15") * commit 9946e39fe8d0 ("ACPI: resource: skip IRQ override on AMD Zen platforms")
Reported-by: evilsnoo@proton.me Link: https://bugzilla.kernel.org/show_bug.cgi?id=217394 Reported-by: ruinairas1992@gmail.com Link: https://bugzilla.kernel.org/show_bug.cgi?id=217406 Reported-by: nmschulte@gmail.com Link: https://bugzilla.kernel.org/show_bug.cgi?id=217336 Signed-off-by: Mario Limonciello mario.limonciello@amd.com Tested-by: Werner Sembach wse@tuxedocomputers.com Tested-by: Chuanhong Guo gch981213@gmail.com Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/acpi/resource.c | 60 ----------------------------------------- 1 file changed, 60 deletions(-)
diff --git a/drivers/acpi/resource.c b/drivers/acpi/resource.c index 0800a9d775580..1dd8d5aebf678 100644 --- a/drivers/acpi/resource.c +++ b/drivers/acpi/resource.c @@ -470,52 +470,6 @@ static const struct dmi_system_id asus_laptop[] = { { } };
-static const struct dmi_system_id lenovo_laptop[] = { - { - .ident = "LENOVO IdeaPad Flex 5 14ALC7", - .matches = { - DMI_MATCH(DMI_SYS_VENDOR, "LENOVO"), - DMI_MATCH(DMI_PRODUCT_NAME, "82R9"), - }, - }, - { - .ident = "LENOVO IdeaPad Flex 5 16ALC7", - .matches = { - DMI_MATCH(DMI_SYS_VENDOR, "LENOVO"), - DMI_MATCH(DMI_PRODUCT_NAME, "82RA"), - }, - }, - { } -}; - -static const struct dmi_system_id tongfang_gm_rg[] = { - { - .ident = "TongFang GMxRGxx/XMG CORE 15 (M22)/TUXEDO Stellaris 15 Gen4 AMD", - .matches = { - DMI_MATCH(DMI_BOARD_NAME, "GMxRGxx"), - }, - }, - { } -}; - -static const struct dmi_system_id maingear_laptop[] = { - { - .ident = "MAINGEAR Vector Pro 2 15", - .matches = { - DMI_MATCH(DMI_SYS_VENDOR, "Micro Electronics Inc"), - DMI_MATCH(DMI_PRODUCT_NAME, "MG-VCP2-15A3070T"), - } - }, - { - .ident = "MAINGEAR Vector Pro 2 17", - .matches = { - DMI_MATCH(DMI_SYS_VENDOR, "Micro Electronics Inc"), - DMI_MATCH(DMI_PRODUCT_NAME, "MG-VCP2-17A3070T"), - }, - }, - { } -}; - static const struct dmi_system_id lg_laptop[] = { { .ident = "LG Electronics 17U70P", @@ -539,10 +493,6 @@ struct irq_override_cmp { static const struct irq_override_cmp override_table[] = { { medion_laptop, 1, ACPI_LEVEL_SENSITIVE, ACPI_ACTIVE_LOW, 0, false }, { asus_laptop, 1, ACPI_LEVEL_SENSITIVE, ACPI_ACTIVE_LOW, 0, false }, - { lenovo_laptop, 6, ACPI_LEVEL_SENSITIVE, ACPI_ACTIVE_LOW, 0, true }, - { lenovo_laptop, 10, ACPI_LEVEL_SENSITIVE, ACPI_ACTIVE_LOW, 0, true }, - { tongfang_gm_rg, 1, ACPI_EDGE_SENSITIVE, ACPI_ACTIVE_LOW, 1, true }, - { maingear_laptop, 1, ACPI_EDGE_SENSITIVE, ACPI_ACTIVE_LOW, 1, true }, { lg_laptop, 1, ACPI_LEVEL_SENSITIVE, ACPI_ACTIVE_LOW, 0, false }, };
@@ -562,16 +512,6 @@ static bool acpi_dev_irq_override(u32 gsi, u8 triggering, u8 polarity, return entry->override; }
-#ifdef CONFIG_X86 - /* - * IRQ override isn't needed on modern AMD Zen systems and - * this override breaks active low IRQs on AMD Ryzen 6000 and - * newer systems. Skip it. - */ - if (boot_cpu_has(X86_FEATURE_ZEN)) - return false; -#endif - return true; }
Hi,
On 2023-07-25 12:44, Greg Kroah-Hartman wrote:
From: Mario Limonciello mario.limonciello@amd.com
[ Upstream commit a9c4a912b7dc7ff922d4b9261160c001558f9755 ]
commit 9946e39fe8d0 ("ACPI: resource: skip IRQ override on AMD Zen platforms") attempted to overhaul the override logic so it didn't apply on X86 AMD Zen systems. This was intentional so that systems would prefer DSDT values instead of default MADT value for IRQ 1 on Ryzen 6000 systems which typically uses ActiveLow for IRQ1.
This turned out to be a bad assumption because several vendors add Interrupt Source Override but don't fix the DSDT. A pile of quirks was collecting that proved this wasn't sustaintable.
Furthermore some vendors have used ActiveHigh for IRQ1. To solve this problem revert the following commits:
- commit 17bb7046e7ce ("ACPI: resource: Do IRQ override on all TongFang
GMxRGxx")
- commit f3cb9b740869 ("ACPI: resource: do IRQ override on Lenovo 14ALC7")
- commit bfcdf58380b1 ("ACPI: resource: do IRQ override on LENOVO IdeaPad")
- commit 7592b79ba4a9 ("ACPI: resource: do IRQ override on XMG Core 15")
- commit 9946e39fe8d0 ("ACPI: resource: skip IRQ override on AMD Zen
platforms")
Unfortunately this breaks the keyboard on Lenovo Yoga 7 14ARB7: https://lore.kernel.org/all/596b9c4a-fb83-a8ab-3a44-6052d83fa546@augustwiker... https://github.com/tomsom/yoga-linux/issues/47
Regards, August Wikerfors
On Thu, Jul 27, 2023 at 01:06:25AM +0200, August Wikerfors wrote:
Hi,
On 2023-07-25 12:44, Greg Kroah-Hartman wrote:
From: Mario Limonciello mario.limonciello@amd.com
[ Upstream commit a9c4a912b7dc7ff922d4b9261160c001558f9755 ]
commit 9946e39fe8d0 ("ACPI: resource: skip IRQ override on AMD Zen platforms") attempted to overhaul the override logic so it didn't apply on X86 AMD Zen systems. This was intentional so that systems would prefer DSDT values instead of default MADT value for IRQ 1 on Ryzen 6000 systems which typically uses ActiveLow for IRQ1.
This turned out to be a bad assumption because several vendors add Interrupt Source Override but don't fix the DSDT. A pile of quirks was collecting that proved this wasn't sustaintable.
Furthermore some vendors have used ActiveHigh for IRQ1. To solve this problem revert the following commits:
- commit 17bb7046e7ce ("ACPI: resource: Do IRQ override on all TongFang
GMxRGxx")
- commit f3cb9b740869 ("ACPI: resource: do IRQ override on Lenovo 14ALC7")
- commit bfcdf58380b1 ("ACPI: resource: do IRQ override on LENOVO IdeaPad")
- commit 7592b79ba4a9 ("ACPI: resource: do IRQ override on XMG Core 15")
- commit 9946e39fe8d0 ("ACPI: resource: skip IRQ override on AMD Zen
platforms")
Unfortunately this breaks the keyboard on Lenovo Yoga 7 14ARB7: https://lore.kernel.org/all/596b9c4a-fb83-a8ab-3a44-6052d83fa546@augustwiker... https://github.com/tomsom/yoga-linux/issues/47
Help to fix it in Linus's tree and then we will be glad to take the fix into the stable trees as well.
thanks,
greg k-h
From: Youngmin Nam youngmin.nam@samsung.com
[ Upstream commit f6794950f0e5ba37e3bbedda4d6ab0aad7395dd3 ]
filter_irq_stacks() is supposed to cut entries which are related irq entries from its call stack. And in_irqentry_text() which is called by filter_irq_stacks() uses __irqentry_text_start/end symbol to find irq entries in callstack.
But it doesn't work correctly as without "CONFIG_FUNCTION_GRAPH_TRACER", arm64 kernel doesn't include gic_handle_irq which is entry point of arm64 irq between __irqentry_text_start and __irqentry_text_end as we discussed in below link. https://lore.kernel.org/all/CACT4Y+aReMGLYua2rCLHgFpS9io5cZC04Q8GLs-uNmrn1ez...
This problem can makes unintentional deep call stack entries especially in KASAN enabled situation as below.
[ 2479.383395]I[0:launcher-loader: 1719] Stack depot reached limit capacity [ 2479.383538]I[0:launcher-loader: 1719] WARNING: CPU: 0 PID: 1719 at lib/stackdepot.c:129 __stack_depot_save+0x464/0x46c [ 2479.385693]I[0:launcher-loader: 1719] pstate: 624000c5 (nZCv daIF +PAN -UAO +TCO -DIT -SSBS BTYPE=--) [ 2479.385724]I[0:launcher-loader: 1719] pc : __stack_depot_save+0x464/0x46c [ 2479.385751]I[0:launcher-loader: 1719] lr : __stack_depot_save+0x460/0x46c [ 2479.385774]I[0:launcher-loader: 1719] sp : ffffffc0080073c0 [ 2479.385793]I[0:launcher-loader: 1719] x29: ffffffc0080073e0 x28: ffffffd00b78a000 x27: 0000000000000000 [ 2479.385839]I[0:launcher-loader: 1719] x26: 000000000004d1dd x25: ffffff891474f000 x24: 00000000ca64d1dd [ 2479.385882]I[0:launcher-loader: 1719] x23: 0000000000000200 x22: 0000000000000220 x21: 0000000000000040 [ 2479.385925]I[0:launcher-loader: 1719] x20: ffffffc008007440 x19: 0000000000000000 x18: 0000000000000000 [ 2479.385969]I[0:launcher-loader: 1719] x17: 2065726568207475 x16: 000000000000005e x15: 2d2d2d2d2d2d2d20 [ 2479.386013]I[0:launcher-loader: 1719] x14: 5d39313731203a72 x13: 00000000002f6b30 x12: 00000000002f6af8 [ 2479.386057]I[0:launcher-loader: 1719] x11: 00000000ffffffff x10: ffffffb90aacf000 x9 : e8a74a6c16008800 [ 2479.386101]I[0:launcher-loader: 1719] x8 : e8a74a6c16008800 x7 : 00000000002f6b30 x6 : 00000000002f6af8 [ 2479.386145]I[0:launcher-loader: 1719] x5 : ffffffc0080070c8 x4 : ffffffd00b192380 x3 : ffffffd0092b313c [ 2479.386189]I[0:launcher-loader: 1719] x2 : 0000000000000001 x1 : 0000000000000004 x0 : 0000000000000022 [ 2479.386231]I[0:launcher-loader: 1719] Call trace: [ 2479.386248]I[0:launcher-loader: 1719] __stack_depot_save+0x464/0x46c [ 2479.386273]I[0:launcher-loader: 1719] kasan_save_stack+0x58/0x70 [ 2479.386303]I[0:launcher-loader: 1719] save_stack_info+0x34/0x138 [ 2479.386331]I[0:launcher-loader: 1719] kasan_save_free_info+0x18/0x24 [ 2479.386358]I[0:launcher-loader: 1719] ____kasan_slab_free+0x16c/0x170 [ 2479.386385]I[0:launcher-loader: 1719] __kasan_slab_free+0x10/0x20 [ 2479.386410]I[0:launcher-loader: 1719] kmem_cache_free+0x238/0x53c [ 2479.386435]I[0:launcher-loader: 1719] mempool_free_slab+0x1c/0x28 [ 2479.386460]I[0:launcher-loader: 1719] mempool_free+0x7c/0x1a0 [ 2479.386484]I[0:launcher-loader: 1719] bvec_free+0x34/0x80 [ 2479.386514]I[0:launcher-loader: 1719] bio_free+0x60/0x98 [ 2479.386540]I[0:launcher-loader: 1719] bio_put+0x50/0x21c [ 2479.386567]I[0:launcher-loader: 1719] f2fs_write_end_io+0x4ac/0x4d0 [ 2479.386594]I[0:launcher-loader: 1719] bio_endio+0x2dc/0x300 [ 2479.386622]I[0:launcher-loader: 1719] __dm_io_complete+0x324/0x37c [ 2479.386650]I[0:launcher-loader: 1719] dm_io_dec_pending+0x60/0xa4 [ 2479.386676]I[0:launcher-loader: 1719] clone_endio+0xf8/0x2f0 [ 2479.386700]I[0:launcher-loader: 1719] bio_endio+0x2dc/0x300 [ 2479.386727]I[0:launcher-loader: 1719] blk_update_request+0x258/0x63c [ 2479.386754]I[0:launcher-loader: 1719] scsi_end_request+0x50/0x304 [ 2479.386782]I[0:launcher-loader: 1719] scsi_io_completion+0x88/0x160 [ 2479.386808]I[0:launcher-loader: 1719] scsi_finish_command+0x17c/0x194 [ 2479.386833]I[0:launcher-loader: 1719] scsi_complete+0xcc/0x158 [ 2479.386859]I[0:launcher-loader: 1719] blk_mq_complete_request+0x4c/0x5c [ 2479.386885]I[0:launcher-loader: 1719] scsi_done_internal+0xf4/0x1e0 [ 2479.386910]I[0:launcher-loader: 1719] scsi_done+0x14/0x20 [ 2479.386935]I[0:launcher-loader: 1719] ufshcd_compl_one_cqe+0x578/0x71c [ 2479.386963]I[0:launcher-loader: 1719] ufshcd_mcq_poll_cqe_nolock+0xc8/0x150 [ 2479.386991]I[0:launcher-loader: 1719] ufshcd_intr+0x868/0xc0c [ 2479.387017]I[0:launcher-loader: 1719] __handle_irq_event_percpu+0xd0/0x348 [ 2479.387044]I[0:launcher-loader: 1719] handle_irq_event_percpu+0x24/0x74 [ 2479.387068]I[0:launcher-loader: 1719] handle_irq_event+0x74/0xe0 [ 2479.387091]I[0:launcher-loader: 1719] handle_fasteoi_irq+0x174/0x240 [ 2479.387118]I[0:launcher-loader: 1719] handle_irq_desc+0x7c/0x2c0 [ 2479.387147]I[0:launcher-loader: 1719] generic_handle_domain_irq+0x1c/0x28 [ 2479.387174]I[0:launcher-loader: 1719] gic_handle_irq+0x64/0x158 [ 2479.387204]I[0:launcher-loader: 1719] call_on_irq_stack+0x2c/0x54 [ 2479.387231]I[0:launcher-loader: 1719] do_interrupt_handler+0x70/0xa0 [ 2479.387258]I[0:launcher-loader: 1719] el1_interrupt+0x34/0x68 [ 2479.387283]I[0:launcher-loader: 1719] el1h_64_irq_handler+0x18/0x24 [ 2479.387308]I[0:launcher-loader: 1719] el1h_64_irq+0x68/0x6c [ 2479.387332]I[0:launcher-loader: 1719] blk_attempt_bio_merge+0x8/0x170 [ 2479.387356]I[0:launcher-loader: 1719] blk_mq_attempt_bio_merge+0x78/0x98 [ 2479.387383]I[0:launcher-loader: 1719] blk_mq_submit_bio+0x324/0xa40 [ 2479.387409]I[0:launcher-loader: 1719] __submit_bio+0x104/0x138 [ 2479.387436]I[0:launcher-loader: 1719] submit_bio_noacct_nocheck+0x1d0/0x4a0 [ 2479.387462]I[0:launcher-loader: 1719] submit_bio_noacct+0x618/0x804 [ 2479.387487]I[0:launcher-loader: 1719] submit_bio+0x164/0x180 [ 2479.387511]I[0:launcher-loader: 1719] f2fs_submit_read_bio+0xe4/0x1c4 [ 2479.387537]I[0:launcher-loader: 1719] f2fs_mpage_readpages+0x888/0xa4c [ 2479.387563]I[0:launcher-loader: 1719] f2fs_readahead+0xd4/0x19c [ 2479.387587]I[0:launcher-loader: 1719] read_pages+0xb0/0x4ac [ 2479.387614]I[0:launcher-loader: 1719] page_cache_ra_unbounded+0x238/0x288 [ 2479.387642]I[0:launcher-loader: 1719] do_page_cache_ra+0x60/0x6c [ 2479.387669]I[0:launcher-loader: 1719] page_cache_ra_order+0x318/0x364 [ 2479.387695]I[0:launcher-loader: 1719] ondemand_readahead+0x30c/0x3d8 [ 2479.387722]I[0:launcher-loader: 1719] page_cache_sync_ra+0xb4/0xc8 [ 2479.387749]I[0:launcher-loader: 1719] filemap_read+0x268/0xd24 [ 2479.387777]I[0:launcher-loader: 1719] f2fs_file_read_iter+0x1a0/0x62c [ 2479.387806]I[0:launcher-loader: 1719] vfs_read+0x258/0x34c [ 2479.387831]I[0:launcher-loader: 1719] ksys_pread64+0x8c/0xd0 [ 2479.387857]I[0:launcher-loader: 1719] __arm64_sys_pread64+0x48/0x54 [ 2479.387881]I[0:launcher-loader: 1719] invoke_syscall+0x58/0x158 [ 2479.387909]I[0:launcher-loader: 1719] el0_svc_common+0xf0/0x134 [ 2479.387935]I[0:launcher-loader: 1719] do_el0_svc+0x44/0x114 [ 2479.387961]I[0:launcher-loader: 1719] el0_svc+0x2c/0x80 [ 2479.387985]I[0:launcher-loader: 1719] el0t_64_sync_handler+0x48/0x114 [ 2479.388010]I[0:launcher-loader: 1719] el0t_64_sync+0x190/0x194 [ 2479.388038]I[0:launcher-loader: 1719] Kernel panic - not syncing: kernel: panic_on_warn set ...
So let's set __exception_irq_entry with __irq_entry as a default. Applying this patch, we can see gic_hande_irq is included in Systemp.map as below.
* Before ffffffc008010000 T __do_softirq ffffffc008010000 T __irqentry_text_end ffffffc008010000 T __irqentry_text_start ffffffc008010000 T __softirqentry_text_start ffffffc008010000 T _stext ffffffc00801066c T __softirqentry_text_end ffffffc008010670 T __entry_text_start
* After ffffffc008010000 T __irqentry_text_start ffffffc008010000 T _stext ffffffc008010000 t gic_handle_irq ffffffc00801013c t gic_handle_irq ffffffc008010294 T __irqentry_text_end ffffffc008010298 T __do_softirq ffffffc008010298 T __softirqentry_text_start ffffffc008010904 T __softirqentry_text_end ffffffc008010908 T __entry_text_start
Signed-off-by: Youngmin Nam youngmin.nam@samsung.com Signed-off-by: SEO HOYOUNG hy50.seo@samsung.com Reviewed-by: Mark Rutland mark.rutland@arm.com Link: https://lore.kernel.org/r/20230424010436.779733-1-youngmin.nam@samsung.com Signed-off-by: Catalin Marinas catalin.marinas@arm.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm64/include/asm/exception.h | 5 ----- 1 file changed, 5 deletions(-)
diff --git a/arch/arm64/include/asm/exception.h b/arch/arm64/include/asm/exception.h index e73af709cb7ad..88d8dfeed0db6 100644 --- a/arch/arm64/include/asm/exception.h +++ b/arch/arm64/include/asm/exception.h @@ -8,16 +8,11 @@ #define __ASM_EXCEPTION_H
#include <asm/esr.h> -#include <asm/kprobes.h> #include <asm/ptrace.h>
#include <linux/interrupt.h>
-#ifdef CONFIG_FUNCTION_GRAPH_TRACER #define __exception_irq_entry __irq_entry -#else -#define __exception_irq_entry __kprobes -#endif
static inline unsigned long disr_to_esr(u64 disr) {
From: Mark Rutland mark.rutland@arm.com
[ Upstream commit ab9b4008092c86dc12497af155a0901cc1156999 ]
Both create_mapping_noalloc() and update_mapping_prot() sanity-check their 'virt' parameter, but the check itself doesn't make much sense. The condition used today appears to be a historical accident.
The sanity-check condition:
if ((virt >= PAGE_END) && (virt < VMALLOC_START)) { [ ... warning here ... ] return; }
... can only be true for the KASAN shadow region or the module region, and there's no reason to exclude these specifically for creating and updateing mappings.
When arm64 support was first upstreamed in commit:
c1cc1552616d0f35 ("arm64: MMU initialisation")
... the condition was:
if (virt < VMALLOC_START) { [ ... warning here ... ] return; }
At the time, VMALLOC_START was the lowest kernel address, and this was checking whether 'virt' would be translated via TTBR1.
Subsequently in commit:
14c127c957c1c607 ("arm64: mm: Flip kernel VA space")
... the condition was changed to:
if ((virt >= VA_START) && (virt < VMALLOC_START)) { [ ... warning here ... ] return; }
This appear to have been a thinko. The commit moved the linear map to the bottom of the kernel address space, with VMALLOC_START being at the halfway point. The old condition would warn for changes to the linear map below this, and at the time VA_START was the end of the linear map.
Subsequently we cleaned up the naming of VA_START in commit:
77ad4ce69321abbe ("arm64: memory: rename VA_START to PAGE_END")
... keeping the erroneous condition as:
if ((virt >= PAGE_END) && (virt < VMALLOC_START)) { [ ... warning here ... ] return; }
Correct the condition to check against the start of the TTBR1 address space, which is currently PAGE_OFFSET. This simplifies the logic, and more clearly matches the "outside kernel range" message in the warning.
Signed-off-by: Mark Rutland mark.rutland@arm.com Cc: Russell King linux@armlinux.org.uk Cc: Steve Capper steve.capper@arm.com Cc: Will Deacon will@kernel.org Reviewed-by: Russell King (Oracle) rmk+kernel@armlinux.org.uk Link: https://lore.kernel.org/r/20230615102628.1052103-1-mark.rutland@arm.com Signed-off-by: Catalin Marinas catalin.marinas@arm.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm64/mm/mmu.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c index af6bc8403ee46..72b3c21820b96 100644 --- a/arch/arm64/mm/mmu.c +++ b/arch/arm64/mm/mmu.c @@ -451,7 +451,7 @@ static phys_addr_t pgd_pgtable_alloc(int shift) void __init create_mapping_noalloc(phys_addr_t phys, unsigned long virt, phys_addr_t size, pgprot_t prot) { - if ((virt >= PAGE_END) && (virt < VMALLOC_START)) { + if (virt < PAGE_OFFSET) { pr_warn("BUG: not creating mapping for %pa at 0x%016lx - outside kernel range\n", &phys, virt); return; @@ -478,7 +478,7 @@ void __init create_pgd_mapping(struct mm_struct *mm, phys_addr_t phys, static void update_mapping_prot(phys_addr_t phys, unsigned long virt, phys_addr_t size, pgprot_t prot) { - if ((virt >= PAGE_END) && (virt < VMALLOC_START)) { + if (virt < PAGE_OFFSET) { pr_warn("BUG: not updating mapping for %pa at 0x%016lx - outside kernel range\n", &phys, virt); return;
From: Hans de Goede hdegoede@redhat.com
[ Upstream commit 23d28cc0444be3f694eb986cd653b6888b78431d ]
The Dell Studio 1569 predates Windows 8, so it defaults to using acpi_video# for backlight control, but this is non functional on this model.
Add a DMI quirk to use the native intel_backlight interface which does work properly.
Reported-by: raycekarneal raycekarneal@gmail.com Signed-off-by: Hans de Goede hdegoede@redhat.com Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/acpi/video_detect.c | 8 ++++++++ 1 file changed, 8 insertions(+)
diff --git a/drivers/acpi/video_detect.c b/drivers/acpi/video_detect.c index b87783c5872dd..e7d04ab864a16 100644 --- a/drivers/acpi/video_detect.c +++ b/drivers/acpi/video_detect.c @@ -528,6 +528,14 @@ static const struct dmi_system_id video_detect_dmi_table[] = { DMI_MATCH(DMI_PRODUCT_NAME, "Precision 7510"), }, }, + { + .callback = video_detect_force_native, + /* Dell Studio 1569 */ + .matches = { + DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."), + DMI_MATCH(DMI_PRODUCT_NAME, "Studio 1569"), + }, + }, { .callback = video_detect_force_native, /* Acer Aspire 3830TG */
From: Shigeru Yoshida syoshida@redhat.com
[ Upstream commit 5fc8cbe4cf0fd34ded8045c385790c3bf04f6785 ]
pr_info() is called with rtp->cbs_gbl_lock spin lock locked. Because pr_info() calls printk() that might sleep, this will result in BUG like below:
[ 0.206455] cblist_init_generic: Setting adjustable number of callback queues. [ 0.206463] [ 0.206464] ============================= [ 0.206464] [ BUG: Invalid wait context ] [ 0.206465] 5.19.0-00428-g9de1f9c8ca51 #5 Not tainted [ 0.206466] ----------------------------- [ 0.206466] swapper/0/1 is trying to lock: [ 0.206467] ffffffffa0167a58 (&port_lock_key){....}-{3:3}, at: serial8250_console_write+0x327/0x4a0 [ 0.206473] other info that might help us debug this: [ 0.206473] context-{5:5} [ 0.206474] 3 locks held by swapper/0/1: [ 0.206474] #0: ffffffff9eb597e0 (rcu_tasks.cbs_gbl_lock){....}-{2:2}, at: cblist_init_generic.constprop.0+0x14/0x1f0 [ 0.206478] #1: ffffffff9eb579c0 (console_lock){+.+.}-{0:0}, at: _printk+0x63/0x7e [ 0.206482] #2: ffffffff9ea77780 (console_owner){....}-{0:0}, at: console_emit_next_record.constprop.0+0x111/0x330 [ 0.206485] stack backtrace: [ 0.206486] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.19.0-00428-g9de1f9c8ca51 #5 [ 0.206488] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.0-1.fc36 04/01/2014 [ 0.206489] Call Trace: [ 0.206490] <TASK> [ 0.206491] dump_stack_lvl+0x6a/0x9f [ 0.206493] __lock_acquire.cold+0x2d7/0x2fe [ 0.206496] ? stack_trace_save+0x46/0x70 [ 0.206497] lock_acquire+0xd1/0x2f0 [ 0.206499] ? serial8250_console_write+0x327/0x4a0 [ 0.206500] ? __lock_acquire+0x5c7/0x2720 [ 0.206502] _raw_spin_lock_irqsave+0x3d/0x90 [ 0.206504] ? serial8250_console_write+0x327/0x4a0 [ 0.206506] serial8250_console_write+0x327/0x4a0 [ 0.206508] console_emit_next_record.constprop.0+0x180/0x330 [ 0.206511] console_unlock+0xf7/0x1f0 [ 0.206512] vprintk_emit+0xf7/0x330 [ 0.206514] _printk+0x63/0x7e [ 0.206516] cblist_init_generic.constprop.0.cold+0x24/0x32 [ 0.206518] rcu_init_tasks_generic+0x5/0xd9 [ 0.206522] kernel_init_freeable+0x15b/0x2a2 [ 0.206523] ? rest_init+0x160/0x160 [ 0.206526] kernel_init+0x11/0x120 [ 0.206527] ret_from_fork+0x1f/0x30 [ 0.206530] </TASK> [ 0.207018] cblist_init_generic: Setting shift to 1 and lim to 1.
This patch moves pr_info() so that it is called without rtp->cbs_gbl_lock locked.
Signed-off-by: Shigeru Yoshida syoshida@redhat.com Tested-by: "Zhang, Qiang1" qiang1.zhang@intel.com Signed-off-by: Paul E. McKenney paulmck@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/rcu/tasks.h | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h index 8f08c087142b0..9b9ce09f8f358 100644 --- a/kernel/rcu/tasks.h +++ b/kernel/rcu/tasks.h @@ -241,7 +241,6 @@ static void cblist_init_generic(struct rcu_tasks *rtp) if (rcu_task_enqueue_lim < 0) { rcu_task_enqueue_lim = 1; rcu_task_cb_adjust = true; - pr_info("%s: Setting adjustable number of callback queues.\n", __func__); } else if (rcu_task_enqueue_lim == 0) { rcu_task_enqueue_lim = 1; } @@ -272,6 +271,10 @@ static void cblist_init_generic(struct rcu_tasks *rtp) raw_spin_unlock_rcu_node(rtpcp); // irqs remain disabled. } raw_spin_unlock_irqrestore(&rtp->cbs_gbl_lock, flags); + + if (rcu_task_cb_adjust) + pr_info("%s: Setting adjustable number of callback queues.\n", __func__); + pr_info("%s: Setting shift to %d and lim to %d.\n", __func__, data_race(rtp->percpu_enqueue_shift), data_race(rtp->percpu_enqueue_lim)); }
From: Paul E. McKenney paulmck@kernel.org
[ Upstream commit 9146eb25495ea8bfb5010192e61e3ed5805ce9ef ]
The per-CPU rcu_data structure's ->cpu_no_qs.b.exp field is updated only on the instance corresponding to the current CPU, but can be read more widely. Unmarked accesses are OK from the corresponding CPU, but only if interrupts are disabled, given that interrupt handlers can and do modify this field.
Unfortunately, although the load from rcu_preempt_deferred_qs() is always carried out from the corresponding CPU, interrupts are not necessarily disabled. This commit therefore upgrades this load to READ_ONCE.
Similarly, the diagnostic access from synchronize_rcu_expedited_wait() might run with interrupts disabled and from some other CPU. This commit therefore marks this load with data_race().
Finally, the C-language access in rcu_preempt_ctxt_queue() is OK as is because interrupts are disabled and this load is always from the corresponding CPU. This commit adds a comment giving the rationale for this access being safe.
This data race was reported by KCSAN. Not appropriate for backporting due to failure being unlikely.
Signed-off-by: Paul E. McKenney paulmck@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/rcu/tree_exp.h | 2 +- kernel/rcu/tree_plugin.h | 4 +++- 2 files changed, 4 insertions(+), 2 deletions(-)
diff --git a/kernel/rcu/tree_exp.h b/kernel/rcu/tree_exp.h index 3b7abb58157df..8239b39d945bd 100644 --- a/kernel/rcu/tree_exp.h +++ b/kernel/rcu/tree_exp.h @@ -643,7 +643,7 @@ static void synchronize_rcu_expedited_wait(void) "O."[!!cpu_online(cpu)], "o."[!!(rdp->grpmask & rnp->expmaskinit)], "N."[!!(rdp->grpmask & rnp->expmaskinitnext)], - "D."[!!(rdp->cpu_no_qs.b.exp)]); + "D."[!!data_race(rdp->cpu_no_qs.b.exp)]); } } pr_cont(" } %lu jiffies s: %lu root: %#lx/%c\n", diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h index 7b0fe741a0886..41021080ad258 100644 --- a/kernel/rcu/tree_plugin.h +++ b/kernel/rcu/tree_plugin.h @@ -257,6 +257,8 @@ static void rcu_preempt_ctxt_queue(struct rcu_node *rnp, struct rcu_data *rdp) * GP should not be able to end until we report, so there should be * no need to check for a subsequent expedited GP. (Though we are * still in a quiescent state in any case.) + * + * Interrupts are disabled, so ->cpu_no_qs.b.exp cannot change. */ if (blkd_state & RCU_EXP_BLKD && rdp->cpu_no_qs.b.exp) rcu_report_exp_rdp(rdp); @@ -941,7 +943,7 @@ notrace void rcu_preempt_deferred_qs(struct task_struct *t) { struct rcu_data *rdp = this_cpu_ptr(&rcu_data);
- if (rdp->cpu_no_qs.b.exp) + if (READ_ONCE(rdp->cpu_no_qs.b.exp)) rcu_report_exp_rdp(rdp); }
From: Thomas Weißschuh linux@weissschuh.net
[ Upstream commit 88fc7eb54ecc6db8b773341ce39ad201066fa7da ]
The all-zero pattern is one of the more probable out-of-bound writes so add a special case to not accidentally accept it.
Also it enables the reliable detection of stack protector initialization during testing.
Signed-off-by: Thomas Weißschuh linux@weissschuh.net Signed-off-by: Willy Tarreau w@1wt.eu Signed-off-by: Paul E. McKenney paulmck@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- tools/include/nolibc/stackprotector.h | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/tools/include/nolibc/stackprotector.h b/tools/include/nolibc/stackprotector.h index d119cbbbc256f..9890e86c26172 100644 --- a/tools/include/nolibc/stackprotector.h +++ b/tools/include/nolibc/stackprotector.h @@ -45,8 +45,9 @@ __attribute__((weak,no_stack_protector,section(".text.nolibc_stack_chk"))) void __stack_chk_init(void) { my_syscall3(__NR_getrandom, &__stack_chk_guard, sizeof(__stack_chk_guard), 0); - /* a bit more randomness in case getrandom() fails */ - __stack_chk_guard ^= (uintptr_t) &__stack_chk_guard; + /* a bit more randomness in case getrandom() fails, ensure the guard is never 0 */ + if (__stack_chk_guard != (uintptr_t) &__stack_chk_guard) + __stack_chk_guard ^= (uintptr_t) &__stack_chk_guard; } #endif // defined(NOLIBC_STACKPROTECTOR)
From: Yicong Yang yangyicong@hisilicon.com
[ Upstream commit 0dd37d6dd33a9c23351e6115ae8cdac7863bc7de ]
We've run into the case that the balancer tries to balance a migration disabled task and trigger the warning in set_task_cpu() like below:
------------[ cut here ]------------ WARNING: CPU: 7 PID: 0 at kernel/sched/core.c:3115 set_task_cpu+0x188/0x240 Modules linked in: hclgevf xt_CHECKSUM ipt_REJECT nf_reject_ipv4 <...snip> CPU: 7 PID: 0 Comm: swapper/7 Kdump: loaded Tainted: G O 6.1.0-rc4+ #1 Hardware name: Huawei TaiShan 2280 V2/BC82AMDC, BIOS 2280-V2 CS V5.B221.01 12/09/2021 pstate: 604000c9 (nZCv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : set_task_cpu+0x188/0x240 lr : load_balance+0x5d0/0xc60 sp : ffff80000803bc70 x29: ffff80000803bc70 x28: ffff004089e190e8 x27: ffff004089e19040 x26: ffff007effcabc38 x25: 0000000000000000 x24: 0000000000000001 x23: ffff80000803be84 x22: 000000000000000c x21: ffffb093e79e2a78 x20: 000000000000000c x19: ffff004089e19040 x18: 0000000000000000 x17: 0000000000001fad x16: 0000000000000030 x15: 0000000000000000 x14: 0000000000000003 x13: 0000000000000000 x12: 0000000000000000 x11: 0000000000000001 x10: 0000000000000400 x9 : ffffb093e4cee530 x8 : 00000000fffffffe x7 : 0000000000ce168a x6 : 000000000000013e x5 : 00000000ffffffe1 x4 : 0000000000000001 x3 : 0000000000000b2a x2 : 0000000000000b2a x1 : ffffb093e6d6c510 x0 : 0000000000000001 Call trace: set_task_cpu+0x188/0x240 load_balance+0x5d0/0xc60 rebalance_domains+0x26c/0x380 _nohz_idle_balance.isra.0+0x1e0/0x370 run_rebalance_domains+0x6c/0x80 __do_softirq+0x128/0x3d8 ____do_softirq+0x18/0x24 call_on_irq_stack+0x2c/0x38 do_softirq_own_stack+0x24/0x3c __irq_exit_rcu+0xcc/0xf4 irq_exit_rcu+0x18/0x24 el1_interrupt+0x4c/0xe4 el1h_64_irq_handler+0x18/0x2c el1h_64_irq+0x74/0x78 arch_cpu_idle+0x18/0x4c default_idle_call+0x58/0x194 do_idle+0x244/0x2b0 cpu_startup_entry+0x30/0x3c secondary_start_kernel+0x14c/0x190 __secondary_switched+0xb0/0xb4 ---[ end trace 0000000000000000 ]---
Further investigation shows that the warning is superfluous, the migration disabled task is just going to be migrated to its current running CPU. This is because that on load balance if the dst_cpu is not allowed by the task, we'll re-select a new_dst_cpu as a candidate. If no task can be balanced to dst_cpu we'll try to balance the task to the new_dst_cpu instead. In this case when the migration disabled task is not on CPU it only allows to run on its current CPU, load balance will select its current CPU as new_dst_cpu and later triggers the warning above.
The new_dst_cpu is chosen from the env->dst_grpmask. Currently it contains CPUs in sched_group_span() and if we have overlapped groups it's possible to run into this case. This patch makes env->dst_grpmask of group_balance_mask() which exclude any CPUs from the busiest group and solve the issue. For balancing in a domain with no overlapped groups the behaviour keeps same as before.
Suggested-by: Vincent Guittot vincent.guittot@linaro.org Signed-off-by: Yicong Yang yangyicong@hisilicon.com Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Reviewed-by: Vincent Guittot vincent.guittot@linaro.org Link: https://lore.kernel.org/r/20230530082507.10444-1-yangyicong@huawei.com Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/sched/fair.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index 4da5f35417626..e427056b440bb 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -10762,7 +10762,7 @@ static int load_balance(int this_cpu, struct rq *this_rq, .sd = sd, .dst_cpu = this_cpu, .dst_rq = this_rq, - .dst_grpmask = sched_group_span(sd->groups), + .dst_grpmask = group_balance_mask(sd->groups), .idle = idle, .loop_break = SCHED_NR_MIGRATE_BREAK, .cpus = cpus,
From: Maxime Bizon mbizon@freebox.fr
[ Upstream commit e2ceb1de2f83aafd8003f0b72dfd4b7441e97d14 ]
Because of what seems to be a typo, a 6Ghz-only phy for which the BDF does not allow the 7115Mhz channel will fail to register:
WARNING: CPU: 2 PID: 106 at net/wireless/core.c:907 wiphy_register+0x914/0x954 Modules linked in: ath11k_pci sbsa_gwdt CPU: 2 PID: 106 Comm: kworker/u8:5 Not tainted 6.3.0-rc7-next-20230418-00549-g1e096a17625a-dirty #9 Hardware name: Freebox V7R Board (DT) Workqueue: ath11k_qmi_driver_event ath11k_qmi_driver_event_work pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : wiphy_register+0x914/0x954 lr : ieee80211_register_hw+0x67c/0xc10 sp : ffffff800b123aa0 x29: ffffff800b123aa0 x28: 0000000000000000 x27: 0000000000000000 x26: 0000000000000000 x25: 0000000000000006 x24: ffffffc008d51418 x23: ffffffc008cb0838 x22: ffffff80176c2460 x21: 0000000000000168 x20: ffffff80176c0000 x19: ffffff80176c03e0 x18: 0000000000000014 x17: 00000000cbef338c x16: 00000000d2a26f21 x15: 00000000ad6bb85f x14: 0000000000000020 x13: 0000000000000020 x12: 00000000ffffffbd x11: 0000000000000208 x10: 00000000fffffdf7 x9 : ffffffc009394718 x8 : ffffff80176c0528 x7 : 000000007fffffff x6 : 0000000000000006 x5 : 0000000000000005 x4 : ffffff800b304284 x3 : ffffff800b304284 x2 : ffffff800b304d98 x1 : 0000000000000000 x0 : 0000000000000000 Call trace: wiphy_register+0x914/0x954 ieee80211_register_hw+0x67c/0xc10 ath11k_mac_register+0x7c4/0xe10 ath11k_core_qmi_firmware_ready+0x1f4/0x570 ath11k_qmi_driver_event_work+0x198/0x590 process_one_work+0x1b8/0x328 worker_thread+0x6c/0x414 kthread+0x100/0x104 ret_from_fork+0x10/0x20 ---[ end trace 0000000000000000 ]--- ath11k_pci 0002:01:00.0: ieee80211 registration failed: -22 ath11k_pci 0002:01:00.0: failed register the radio with mac80211: -22 ath11k_pci 0002:01:00.0: failed to create pdev core: -22
Signed-off-by: Maxime Bizon mbizon@freebox.fr Signed-off-by: Kalle Valo quic_kvalo@quicinc.com Link: https://lore.kernel.org/r/20230421145445.2612280-1-mbizon@freebox.fr Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/ath/ath11k/mac.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/wireless/ath/ath11k/mac.c b/drivers/net/wireless/ath/ath11k/mac.c index 1c93f1afccc57..05920ad413c55 100644 --- a/drivers/net/wireless/ath/ath11k/mac.c +++ b/drivers/net/wireless/ath/ath11k/mac.c @@ -8892,7 +8892,7 @@ static int ath11k_mac_setup_channels_rates(struct ath11k *ar, }
if (supported_bands & WMI_HOST_WLAN_5G_CAP) { - if (reg_cap->high_5ghz_chan >= ATH11K_MAX_6G_FREQ) { + if (reg_cap->high_5ghz_chan >= ATH11K_MIN_6G_FREQ) { channels = kmemdup(ath11k_6ghz_channels, sizeof(ath11k_6ghz_channels), GFP_KERNEL); if (!channels) {
From: Kui-Feng Lee thinker.li@gmail.com
[ Upstream commit fedf99200ab086c42a572fca1d7266b06cdc3e3f ]
Only print the warning message if you are writing to "/proc/sys/kernel/unprivileged_bpf_disabled".
The kernel may print an annoying warning when you read "/proc/sys/kernel/unprivileged_bpf_disabled" saying
WARNING: Unprivileged eBPF is enabled with eIBRS on, data leaks possible via Spectre v2 BHB attacks!
However, this message is only meaningful when the feature is disabled or enabled.
Signed-off-by: Kui-Feng Lee kuifeng@meta.com Signed-off-by: Andrii Nakryiko andrii@kernel.org Acked-by: Yonghong Song yhs@fb.com Link: https://lore.kernel.org/bpf/20230502181418.308479-1-kuifeng@meta.com Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/bpf/syscall.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c index f1c8733f76b83..5524fcf6fb2a4 100644 --- a/kernel/bpf/syscall.c +++ b/kernel/bpf/syscall.c @@ -5394,7 +5394,8 @@ static int bpf_unpriv_handler(struct ctl_table *table, int write, *(int *)table->data = unpriv_enable; }
- unpriv_ebpf_notify(unpriv_enable); + if (write) + unpriv_ebpf_notify(unpriv_enable);
return ret; }
From: Martin KaFai Lau martin.lau@kernel.org
[ Upstream commit ee9fd0ac3017c4313be91a220a9ac4c99dde7ad4 ]
KCSAN reported a data-race when accessing node->ref. Although node->ref does not have to be accurate, take this chance to use a more common READ_ONCE() and WRITE_ONCE() pattern instead of data_race().
There is an existing bpf_lru_node_is_ref() and bpf_lru_node_set_ref(). This patch also adds bpf_lru_node_clear_ref() to do the WRITE_ONCE(node->ref, 0) also.
================================================================== BUG: KCSAN: data-race in __bpf_lru_list_rotate / __htab_lru_percpu_map_update_elem
write to 0xffff888137038deb of 1 bytes by task 11240 on cpu 1: __bpf_lru_node_move kernel/bpf/bpf_lru_list.c:113 [inline] __bpf_lru_list_rotate_active kernel/bpf/bpf_lru_list.c:149 [inline] __bpf_lru_list_rotate+0x1bf/0x750 kernel/bpf/bpf_lru_list.c:240 bpf_lru_list_pop_free_to_local kernel/bpf/bpf_lru_list.c:329 [inline] bpf_common_lru_pop_free kernel/bpf/bpf_lru_list.c:447 [inline] bpf_lru_pop_free+0x638/0xe20 kernel/bpf/bpf_lru_list.c:499 prealloc_lru_pop kernel/bpf/hashtab.c:290 [inline] __htab_lru_percpu_map_update_elem+0xe7/0x820 kernel/bpf/hashtab.c:1316 bpf_percpu_hash_update+0x5e/0x90 kernel/bpf/hashtab.c:2313 bpf_map_update_value+0x2a9/0x370 kernel/bpf/syscall.c:200 generic_map_update_batch+0x3ae/0x4f0 kernel/bpf/syscall.c:1687 bpf_map_do_batch+0x2d9/0x3d0 kernel/bpf/syscall.c:4534 __sys_bpf+0x338/0x810 __do_sys_bpf kernel/bpf/syscall.c:5096 [inline] __se_sys_bpf kernel/bpf/syscall.c:5094 [inline] __x64_sys_bpf+0x43/0x50 kernel/bpf/syscall.c:5094 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd
read to 0xffff888137038deb of 1 bytes by task 11241 on cpu 0: bpf_lru_node_set_ref kernel/bpf/bpf_lru_list.h:70 [inline] __htab_lru_percpu_map_update_elem+0x2f1/0x820 kernel/bpf/hashtab.c:1332 bpf_percpu_hash_update+0x5e/0x90 kernel/bpf/hashtab.c:2313 bpf_map_update_value+0x2a9/0x370 kernel/bpf/syscall.c:200 generic_map_update_batch+0x3ae/0x4f0 kernel/bpf/syscall.c:1687 bpf_map_do_batch+0x2d9/0x3d0 kernel/bpf/syscall.c:4534 __sys_bpf+0x338/0x810 __do_sys_bpf kernel/bpf/syscall.c:5096 [inline] __se_sys_bpf kernel/bpf/syscall.c:5094 [inline] __x64_sys_bpf+0x43/0x50 kernel/bpf/syscall.c:5094 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd
value changed: 0x01 -> 0x00
Reported by Kernel Concurrency Sanitizer on: CPU: 0 PID: 11241 Comm: syz-executor.3 Not tainted 6.3.0-rc7-syzkaller-00136-g6a66fdd29ea1 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/30/2023 ==================================================================
Reported-by: syzbot+ebe648a84e8784763f82@syzkaller.appspotmail.com Signed-off-by: Martin KaFai Lau martin.lau@kernel.org Acked-by: Yonghong Song yhs@fb.com Link: https://lore.kernel.org/r/20230511043748.1384166-1-martin.lau@linux.dev Signed-off-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/bpf/bpf_lru_list.c | 21 +++++++++++++-------- kernel/bpf/bpf_lru_list.h | 7 ++----- 2 files changed, 15 insertions(+), 13 deletions(-)
diff --git a/kernel/bpf/bpf_lru_list.c b/kernel/bpf/bpf_lru_list.c index d99e89f113c43..3dabdd137d102 100644 --- a/kernel/bpf/bpf_lru_list.c +++ b/kernel/bpf/bpf_lru_list.c @@ -41,7 +41,12 @@ static struct list_head *local_pending_list(struct bpf_lru_locallist *loc_l) /* bpf_lru_node helpers */ static bool bpf_lru_node_is_ref(const struct bpf_lru_node *node) { - return node->ref; + return READ_ONCE(node->ref); +} + +static void bpf_lru_node_clear_ref(struct bpf_lru_node *node) +{ + WRITE_ONCE(node->ref, 0); }
static void bpf_lru_list_count_inc(struct bpf_lru_list *l, @@ -89,7 +94,7 @@ static void __bpf_lru_node_move_in(struct bpf_lru_list *l,
bpf_lru_list_count_inc(l, tgt_type); node->type = tgt_type; - node->ref = 0; + bpf_lru_node_clear_ref(node); list_move(&node->list, &l->lists[tgt_type]); }
@@ -110,7 +115,7 @@ static void __bpf_lru_node_move(struct bpf_lru_list *l, bpf_lru_list_count_inc(l, tgt_type); node->type = tgt_type; } - node->ref = 0; + bpf_lru_node_clear_ref(node);
/* If the moving node is the next_inactive_rotation candidate, * move the next_inactive_rotation pointer also. @@ -353,7 +358,7 @@ static void __local_list_add_pending(struct bpf_lru *lru, *(u32 *)((void *)node + lru->hash_offset) = hash; node->cpu = cpu; node->type = BPF_LRU_LOCAL_LIST_T_PENDING; - node->ref = 0; + bpf_lru_node_clear_ref(node); list_add(&node->list, local_pending_list(loc_l)); }
@@ -419,7 +424,7 @@ static struct bpf_lru_node *bpf_percpu_lru_pop_free(struct bpf_lru *lru, if (!list_empty(free_list)) { node = list_first_entry(free_list, struct bpf_lru_node, list); *(u32 *)((void *)node + lru->hash_offset) = hash; - node->ref = 0; + bpf_lru_node_clear_ref(node); __bpf_lru_node_move(l, node, BPF_LRU_LIST_T_INACTIVE); }
@@ -522,7 +527,7 @@ static void bpf_common_lru_push_free(struct bpf_lru *lru, }
node->type = BPF_LRU_LOCAL_LIST_T_FREE; - node->ref = 0; + bpf_lru_node_clear_ref(node); list_move(&node->list, local_free_list(loc_l));
raw_spin_unlock_irqrestore(&loc_l->lock, flags); @@ -568,7 +573,7 @@ static void bpf_common_lru_populate(struct bpf_lru *lru, void *buf,
node = (struct bpf_lru_node *)(buf + node_offset); node->type = BPF_LRU_LIST_T_FREE; - node->ref = 0; + bpf_lru_node_clear_ref(node); list_add(&node->list, &l->lists[BPF_LRU_LIST_T_FREE]); buf += elem_size; } @@ -594,7 +599,7 @@ static void bpf_percpu_lru_populate(struct bpf_lru *lru, void *buf, node = (struct bpf_lru_node *)(buf + node_offset); node->cpu = cpu; node->type = BPF_LRU_LIST_T_FREE; - node->ref = 0; + bpf_lru_node_clear_ref(node); list_add(&node->list, &l->lists[BPF_LRU_LIST_T_FREE]); i++; buf += elem_size; diff --git a/kernel/bpf/bpf_lru_list.h b/kernel/bpf/bpf_lru_list.h index 4ea227c9c1ade..8f3c8b2b4490e 100644 --- a/kernel/bpf/bpf_lru_list.h +++ b/kernel/bpf/bpf_lru_list.h @@ -64,11 +64,8 @@ struct bpf_lru {
static inline void bpf_lru_node_set_ref(struct bpf_lru_node *node) { - /* ref is an approximation on access frequency. It does not - * have to be very accurate. Hence, no protection is used. - */ - if (!node->ref) - node->ref = 1; + if (!READ_ONCE(node->ref)) + WRITE_ONCE(node->ref, 1); }
int bpf_lru_init(struct bpf_lru *lru, bool percpu, u32 hash_offset,
From: Brad Larson blarson@amd.com
[ Upstream commit f5c2f9f9584353bc816d76a65c97dd03dc61678c ]
The AMD Pensando Elba SoC has the Cadence QSPI controller integrated.
The quirk CQSPI_NEEDS_APB_AHB_HAZARD_WAR is added and if enabled a dummy readback from the controller is performed to ensure synchronization.
Signed-off-by: Brad Larson <blarson@amd.com Link: https://lore.kernel.org/r/20230515181606.65953-8-blarson@amd.com Signed-off-by: Mark Brown <broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/spi/spi-cadence-quadspi.c | 19 +++++++++++++++++++ 1 file changed, 19 insertions(+)
diff --git a/drivers/spi/spi-cadence-quadspi.c b/drivers/spi/spi-cadence-quadspi.c index 32449bef4415a..abf10f92415dc 100644 --- a/drivers/spi/spi-cadence-quadspi.c +++ b/drivers/spi/spi-cadence-quadspi.c @@ -40,6 +40,7 @@ #define CQSPI_SUPPORT_EXTERNAL_DMA BIT(2) #define CQSPI_NO_SUPPORT_WR_COMPLETION BIT(3) #define CQSPI_SLOW_SRAM BIT(4) +#define CQSPI_NEEDS_APB_AHB_HAZARD_WAR BIT(5)
/* Capabilities */ #define CQSPI_SUPPORTS_OCTAL BIT(0) @@ -90,6 +91,7 @@ struct cqspi_st { u32 pd_dev_id; bool wr_completion; bool slow_sram; + bool apb_ahb_hazard; };
struct cqspi_driver_platdata { @@ -1027,6 +1029,13 @@ static int cqspi_indirect_write_execute(struct cqspi_flash_pdata *f_pdata, if (cqspi->wr_delay) ndelay(cqspi->wr_delay);
+ /* + * If a hazard exists between the APB and AHB interfaces, perform a + * dummy readback from the controller to ensure synchronization. + */ + if (cqspi->apb_ahb_hazard) + readl(reg_base + CQSPI_REG_INDIRECTWR); + while (remaining > 0) { size_t write_words, mod_bytes;
@@ -1754,6 +1763,8 @@ static int cqspi_probe(struct platform_device *pdev) cqspi->wr_completion = false; if (ddata->quirks & CQSPI_SLOW_SRAM) cqspi->slow_sram = true; + if (ddata->quirks & CQSPI_NEEDS_APB_AHB_HAZARD_WAR) + cqspi->apb_ahb_hazard = true;
if (of_device_is_compatible(pdev->dev.of_node, "xlnx,versal-ospi-1.0")) { @@ -1888,6 +1899,10 @@ static const struct cqspi_driver_platdata jh7110_qspi = { .quirks = CQSPI_DISABLE_DAC_MODE, };
+static const struct cqspi_driver_platdata pensando_cdns_qspi = { + .quirks = CQSPI_NEEDS_APB_AHB_HAZARD_WAR | CQSPI_DISABLE_DAC_MODE, +}; + static const struct of_device_id cqspi_dt_ids[] = { { .compatible = "cdns,qspi-nor", @@ -1917,6 +1932,10 @@ static const struct of_device_id cqspi_dt_ids[] = { .compatible = "starfive,jh7110-qspi", .data = &jh7110_qspi, }, + { + .compatible = "amd,pensando-elba-qspi", + .data = &pensando_cdns_qspi, + }, { /* end of table */ } };
From: Andrii Nakryiko andrii@kernel.org
[ Upstream commit cff36398bd4c7d322d424433db437f3c3391c491 ]
It's trivial for user to trigger "verifier log line truncated" warning, as verifier has a fixed-sized buffer of 1024 bytes (as of now), and there are at least two pieces of user-provided information that can be output through this buffer, and both can be arbitrarily sized by user: - BTF names; - BTF.ext source code lines strings.
Verifier log buffer should be properly sized for typical verifier state output. But it's sort-of expected that this buffer won't be long enough in some circumstances. So let's drop the check. In any case code will work correctly, at worst truncating a part of a single line output.
Reported-by: syzbot+8b2a08dfbd25fd933d75@syzkaller.appspotmail.com Signed-off-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/r/20230516180409.3549088-1-andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/bpf/log.c | 3 --- 1 file changed, 3 deletions(-)
diff --git a/kernel/bpf/log.c b/kernel/bpf/log.c index 046ddff37a76d..850494423530e 100644 --- a/kernel/bpf/log.c +++ b/kernel/bpf/log.c @@ -62,9 +62,6 @@ void bpf_verifier_vlog(struct bpf_verifier_log *log, const char *fmt,
n = vscnprintf(log->kbuf, BPF_VERIFIER_TMP_LOG_SIZE, fmt, args);
- WARN_ONCE(n >= BPF_VERIFIER_TMP_LOG_SIZE - 1, - "verifier log line truncated - local buffer too short\n"); - if (log->level == BPF_LOG_KERNEL) { bool newline = n > 0 && log->kbuf[n - 1] == '\n';
From: Aditi Ghag aditi.ghag@isovalent.com
[ Upstream commit 9378096e8a656fb5c4099b26b1370c56f056eab9 ]
This is a preparatory commit to replace `lock_sock_fast` with `lock_sock`,and facilitate BPF programs executed from the TCP sockets iterator to be able to destroy TCP sockets using the bpf_sock_destroy kfunc (implemented in follow-up commits).
Previously, BPF TCP iterator was acquiring the sock lock with BH disabled. This led to scenarios where the sockets hash table bucket lock can be acquired with BH enabled in some path versus disabled in other. In such situation, kernel issued a warning since it thinks that in the BH enabled path the same bucket lock *might* be acquired again in the softirq context (BH disabled), which will lead to a potential dead lock. Since bpf_sock_destroy also happens in a process context, the potential deadlock warning is likely a false alarm.
Here is a snippet of annotated stack trace that motivated this change:
```
Possible interrupt unsafe locking scenario:
CPU0 CPU1 ---- ---- lock(&h->lhash2[i].lock); local_bh_disable(); lock(&h->lhash2[i].lock); kernel imagined possible scenario: local_bh_disable(); /* Possible softirq */ lock(&h->lhash2[i].lock); *** Potential Deadlock ***
process context:
lock_acquire+0xcd/0x330 _raw_spin_lock+0x33/0x40 ------> Acquire (bucket) lhash2.lock with BH enabled __inet_hash+0x4b/0x210 inet_csk_listen_start+0xe6/0x100 inet_listen+0x95/0x1d0 __sys_listen+0x69/0xb0 __x64_sys_listen+0x14/0x20 do_syscall_64+0x3c/0x90 entry_SYSCALL_64_after_hwframe+0x72/0xdc
bpf_sock_destroy run from iterator:
lock_acquire+0xcd/0x330 _raw_spin_lock+0x33/0x40 ------> Acquire (bucket) lhash2.lock with BH disabled inet_unhash+0x9a/0x110 tcp_set_state+0x6a/0x210 tcp_abort+0x10d/0x200 bpf_prog_6793c5ca50c43c0d_iter_tcp6_server+0xa4/0xa9 bpf_iter_run_prog+0x1ff/0x340 ------> lock_sock_fast that acquires sock lock with BH disabled bpf_iter_tcp_seq_show+0xca/0x190 bpf_seq_read+0x177/0x450
```
Also, Yonghong reported a deadlock for non-listening TCP sockets that this change resolves. Previously, `lock_sock_fast` held the sock spin lock with BH which was again being acquired in `tcp_abort`:
``` watchdog: BUG: soft lockup - CPU#0 stuck for 86s! [test_progs:2331] RIP: 0010:queued_spin_lock_slowpath+0xd8/0x500 Call Trace: <TASK> _raw_spin_lock+0x84/0x90 tcp_abort+0x13c/0x1f0 bpf_prog_88539c5453a9dd47_iter_tcp6_client+0x82/0x89 bpf_iter_run_prog+0x1aa/0x2c0 ? preempt_count_sub+0x1c/0xd0 ? from_kuid_munged+0x1c8/0x210 bpf_iter_tcp_seq_show+0x14e/0x1b0 bpf_seq_read+0x36c/0x6a0
bpf_iter_tcp_seq_show lock_sock_fast __lock_sock_fast spin_lock_bh(&sk->sk_lock.slock); /* * Fast path return with bottom halves disabled and * sock::sk_lock.slock held.* */
... tcp_abort local_bh_disable(); spin_lock(&((sk)->sk_lock.slock)); // from bh_lock_sock(sk)
```
With the switch to `lock_sock`, it calls `spin_unlock_bh` before returning:
``` lock_sock lock_sock_nested spin_lock_bh(&sk->sk_lock.slock); : spin_unlock_bh(&sk->sk_lock.slock); ```
Acked-by: Yonghong Song yhs@meta.com Acked-by: Stanislav Fomichev sdf@google.com Signed-off-by: Aditi Ghag aditi.ghag@isovalent.com Link: https://lore.kernel.org/r/20230519225157.760788-2-aditi.ghag@isovalent.com Signed-off-by: Martin KaFai Lau martin.lau@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/ipv4/tcp_ipv4.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c index 06d2573685ca9..434e5f0c8b99d 100644 --- a/net/ipv4/tcp_ipv4.c +++ b/net/ipv4/tcp_ipv4.c @@ -2963,7 +2963,6 @@ static int bpf_iter_tcp_seq_show(struct seq_file *seq, void *v) struct bpf_iter_meta meta; struct bpf_prog *prog; struct sock *sk = v; - bool slow; uid_t uid; int ret;
@@ -2971,7 +2970,7 @@ static int bpf_iter_tcp_seq_show(struct seq_file *seq, void *v) return 0;
if (sk_fullsock(sk)) - slow = lock_sock_fast(sk); + lock_sock(sk);
if (unlikely(sk_unhashed(sk))) { ret = SEQ_SKIP; @@ -2995,7 +2994,7 @@ static int bpf_iter_tcp_seq_show(struct seq_file *seq, void *v)
unlock: if (sk_fullsock(sk)) - unlock_sock_fast(sk, slow); + release_sock(sk); return ret;
}
From: Martin Blumenstingl martin.blumenstingl@googlemail.com
[ Upstream commit e967229ead0e6c5047a1cfd5a0db58ceb930800b ]
rtw_sdio_rx_isr() is responsible for receiving data from the wifi chip and is called from the SDIO interrupt handler when the interrupt status register (HISR) has the RX_REQUEST bit set. After the first batch of data has been processed by the driver the wifi chip may have more data ready to be read, which is managed by a loop in rtw_sdio_rx_isr().
It turns out that there are cases where the RX buffer length (from the REG_SDIO_RX0_REQ_LEN register) does not match the data we receive. The following two cases were observed with a RTL8723DS card: - RX length is smaller than the total packet length including overhead and actual data bytes (whose length is part of the buffer we read from the wifi chip and is stored in rtw_rx_pkt_stat.pkt_len). This can result in errors like: skbuff: skb_over_panic: text:ffff8000011924ac len:3341 put:3341 (one case observed was: RX buffer length = 1536 bytes but rtw_rx_pkt_stat.pkt_len = 1546 bytes, this is not valid as it means we need to read beyond the end of the buffer) - RX length looks valid but rtw_rx_pkt_stat.pkt_len is zero
Check if the RX_REQUEST is set in the HISR register for each iteration inside rtw_sdio_rx_isr(). This mimics what the RTL8723DS vendor driver does and makes the driver only read more data if the RX_REQUEST bit is set (which seems to be a way for the card's hardware or firmware to tell the host that data is ready to be processed).
For RTW_WCPU_11AC chips this check is not needed. The RTL8822BS vendor driver for example states that this check is unnecessary (but still uses it) and the RTL8822CS drops this check entirely.
Reviewed-by: Ping-Ke Shih pkshih@realtek.com Signed-off-by: Martin Blumenstingl martin.blumenstingl@googlemail.com Signed-off-by: Kalle Valo kvalo@kernel.org Link: https://lore.kernel.org/r/20230522202425.1827005-2-martin.blumenstingl@googl... Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/realtek/rtw88/sdio.c | 24 ++++++++++++++++++++--- 1 file changed, 21 insertions(+), 3 deletions(-)
diff --git a/drivers/net/wireless/realtek/rtw88/sdio.c b/drivers/net/wireless/realtek/rtw88/sdio.c index 06fce7c3addaa..2c1fb2dabd40a 100644 --- a/drivers/net/wireless/realtek/rtw88/sdio.c +++ b/drivers/net/wireless/realtek/rtw88/sdio.c @@ -998,9 +998,9 @@ static void rtw_sdio_rxfifo_recv(struct rtw_dev *rtwdev, u32 rx_len)
static void rtw_sdio_rx_isr(struct rtw_dev *rtwdev) { - u32 rx_len, total_rx_bytes = 0; + u32 rx_len, hisr, total_rx_bytes = 0;
- while (total_rx_bytes < SZ_64K) { + do { if (rtw_chip_wcpu_11n(rtwdev)) rx_len = rtw_read16(rtwdev, REG_SDIO_RX0_REQ_LEN); else @@ -1012,7 +1012,25 @@ static void rtw_sdio_rx_isr(struct rtw_dev *rtwdev) rtw_sdio_rxfifo_recv(rtwdev, rx_len);
total_rx_bytes += rx_len; - } + + if (rtw_chip_wcpu_11n(rtwdev)) { + /* Stop if no more RX requests are pending, even if + * rx_len could be greater than zero in the next + * iteration. This is needed because the RX buffer may + * already contain data while either HW or FW are not + * done filling that buffer yet. Still reading the + * buffer can result in packets where + * rtw_rx_pkt_stat.pkt_len is zero or points beyond the + * end of the buffer. + */ + hisr = rtw_read32(rtwdev, REG_SDIO_HISR); + } else { + /* RTW_WCPU_11AC chips have improved hardware or + * firmware and can use rx_len unconditionally. + */ + hisr = REG_SDIO_HISR_RX_REQUEST; + } + } while (total_rx_bytes < SZ_64K && hisr & REG_SDIO_HISR_RX_REQUEST); }
static void rtw_sdio_handle_interrupt(struct sdio_func *sdio_func)
From: Yonghong Song yhs@fb.com
[ Upstream commit e6c2f594ed961273479505b42040782820190305 ]
syzbot reported a warning in [1] with the following stacktrace: WARNING: CPU: 0 PID: 5005 at kernel/bpf/btf.c:1988 btf_type_id_size+0x2d9/0x9d0 kernel/bpf/btf.c:1988 ... RIP: 0010:btf_type_id_size+0x2d9/0x9d0 kernel/bpf/btf.c:1988 ... Call Trace: <TASK> map_check_btf kernel/bpf/syscall.c:1024 [inline] map_create+0x1157/0x1860 kernel/bpf/syscall.c:1198 __sys_bpf+0x127f/0x5420 kernel/bpf/syscall.c:5040 __do_sys_bpf kernel/bpf/syscall.c:5162 [inline] __se_sys_bpf kernel/bpf/syscall.c:5160 [inline] __x64_sys_bpf+0x79/0xc0 kernel/bpf/syscall.c:5160 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd
With the following btf [1] DECL_TAG 'a' type_id=4 component_idx=-1 [2] PTR '(anon)' type_id=0 [3] TYPE_TAG 'a' type_id=2 [4] VAR 'a' type_id=3, linkage=static and when the bpf_attr.btf_key_type_id = 1 (DECL_TAG), the following WARN_ON_ONCE in btf_type_id_size() is triggered: if (WARN_ON_ONCE(!btf_type_is_modifier(size_type) && !btf_type_is_var(size_type))) return NULL;
Note that 'return NULL' is the correct behavior as we don't want a DECL_TAG type to be used as a btf_{key,value}_type_id even for the case like 'DECL_TAG -> STRUCT'. So there is no correctness issue here, we just want to silence warning.
To silence the warning, I added DECL_TAG as one of kinds in btf_type_nosize() which will cause btf_type_id_size() returning NULL earlier without the warning.
[1] https://lore.kernel.org/bpf/000000000000e0df8d05fc75ba86@google.com/
Reported-by: syzbot+958967f249155967d42a@syzkaller.appspotmail.com Signed-off-by: Yonghong Song yhs@fb.com Link: https://lore.kernel.org/r/20230530205029.264910-1-yhs@fb.com Signed-off-by: Martin KaFai Lau martin.lau@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/bpf/btf.c | 19 ++++++++++--------- 1 file changed, 10 insertions(+), 9 deletions(-)
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c index 25ca17a8e1964..8b4e92439d1d6 100644 --- a/kernel/bpf/btf.c +++ b/kernel/bpf/btf.c @@ -485,25 +485,26 @@ static bool btf_type_is_fwd(const struct btf_type *t) return BTF_INFO_KIND(t->info) == BTF_KIND_FWD; }
-static bool btf_type_nosize(const struct btf_type *t) +static bool btf_type_is_datasec(const struct btf_type *t) { - return btf_type_is_void(t) || btf_type_is_fwd(t) || - btf_type_is_func(t) || btf_type_is_func_proto(t); + return BTF_INFO_KIND(t->info) == BTF_KIND_DATASEC; }
-static bool btf_type_nosize_or_null(const struct btf_type *t) +static bool btf_type_is_decl_tag(const struct btf_type *t) { - return !t || btf_type_nosize(t); + return BTF_INFO_KIND(t->info) == BTF_KIND_DECL_TAG; }
-static bool btf_type_is_datasec(const struct btf_type *t) +static bool btf_type_nosize(const struct btf_type *t) { - return BTF_INFO_KIND(t->info) == BTF_KIND_DATASEC; + return btf_type_is_void(t) || btf_type_is_fwd(t) || + btf_type_is_func(t) || btf_type_is_func_proto(t) || + btf_type_is_decl_tag(t); }
-static bool btf_type_is_decl_tag(const struct btf_type *t) +static bool btf_type_nosize_or_null(const struct btf_type *t) { - return BTF_INFO_KIND(t->info) == BTF_KIND_DECL_TAG; + return !t || btf_type_nosize(t); }
static bool btf_type_is_decl_tag_target(const struct btf_type *t)
From: Jakub Kicinski kuba@kernel.org
[ Upstream commit 6f4b98147b8dfcabacb19b5c6abd087af66d0049 ]
Devlink health is involved in error recovery. Machines in bad state tend to be fairly unreliable, and occasionally get stuck in error loops. Even with a reasonable grace period devlink health may get a thousand reports in an hour.
In case of reporting on an unregistered devlink instance the subsequent reports don't add much value. Switch to WARN_ON_ONCE() to avoid flooding dmesg and fleet monitoring dashboards.
Reviewed-by: Jiri Pirko jiri@nvidia.com Link: https://lore.kernel.org/r/20230531015523.48961-1-kuba@kernel.org Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/devlink/health.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/devlink/health.c b/net/devlink/health.c index 0839706d5741a..194340a8bb863 100644 --- a/net/devlink/health.c +++ b/net/devlink/health.c @@ -480,7 +480,7 @@ static void devlink_recover_notify(struct devlink_health_reporter *reporter, int err;
WARN_ON(cmd != DEVLINK_CMD_HEALTH_REPORTER_RECOVER); - WARN_ON(!xa_get_mark(&devlinks, devlink->index, DEVLINK_REGISTERED)); + ASSERT_DEVLINK_REGISTERED(devlink);
msg = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL); if (!msg)
From: Wen Gong quic_wgong@quicinc.com
[ Upstream commit 88ca89202f8e8afb5225eb5244d79cd67c15d744 ]
Sometimes board-2.bin does not have the regdb data which matched the parameters such as vendor, device, subsystem-vendor, subsystem-device and etc. Add default regdb data with 'bus=%s' into board-2.bin for WCN6855, then ath11k use 'bus=pci' to search regdb data in board-2.bin for WCN6855.
kernel: [ 122.515808] ath11k_pci 0000:03:00.0: boot using board name 'bus=pci,vendor=17cb,device=1103,subsystem-vendor=17cb,subsystem-device=3374,qmi-chip-id=2,qmi-board-id=262' kernel: [ 122.517240] ath11k_pci 0000:03:00.0: boot firmware request ath11k/WCN6855/hw2.0/board-2.bin size 6179564 kernel: [ 122.517280] ath11k_pci 0000:03:00.0: failed to fetch regdb data for bus=pci,vendor=17cb,device=1103,subsystem-vendor=17cb,subsystem-device=3374,qmi-chip-id=2,qmi-board-id=262 from ath11k/WCN6855/hw2.0/board-2.bin kernel: [ 122.517464] ath11k_pci 0000:03:00.0: boot using board name 'bus=pci' kernel: [ 122.518901] ath11k_pci 0000:03:00.0: boot firmware request ath11k/WCN6855/hw2.0/board-2.bin size 6179564 kernel: [ 122.518915] ath11k_pci 0000:03:00.0: board name kernel: [ 122.518917] ath11k_pci 0000:03:00.0: 00000000: 62 75 73 3d 70 63 69 bus=pci kernel: [ 122.518918] ath11k_pci 0000:03:00.0: boot found match regdb data for name 'bus=pci' kernel: [ 122.518920] ath11k_pci 0000:03:00.0: boot found regdb data for 'bus=pci' kernel: [ 122.518921] ath11k_pci 0000:03:00.0: fetched regdb
Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3
Signed-off-by: Wen Gong quic_wgong@quicinc.com Signed-off-by: Kalle Valo quic_kvalo@quicinc.com Link: https://lore.kernel.org/r/20230517133959.8224-1-quic_wgong@quicinc.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/ath/ath11k/core.c | 53 +++++++++++++++++++------- 1 file changed, 40 insertions(+), 13 deletions(-)
diff --git a/drivers/net/wireless/ath/ath11k/core.c b/drivers/net/wireless/ath/ath11k/core.c index 9de23c11e18bb..8ab1a62351b98 100644 --- a/drivers/net/wireless/ath/ath11k/core.c +++ b/drivers/net/wireless/ath/ath11k/core.c @@ -962,7 +962,8 @@ int ath11k_core_check_dt(struct ath11k_base *ab) }
static int __ath11k_core_create_board_name(struct ath11k_base *ab, char *name, - size_t name_len, bool with_variant) + size_t name_len, bool with_variant, + bool bus_type_mode) { /* strlen(',variant=') + strlen(ab->qmi.target.bdf_ext) */ char variant[9 + ATH11K_QMI_BDF_EXT_STR_LENGTH] = { 0 }; @@ -973,15 +974,20 @@ static int __ath11k_core_create_board_name(struct ath11k_base *ab, char *name,
switch (ab->id.bdf_search) { case ATH11K_BDF_SEARCH_BUS_AND_BOARD: - scnprintf(name, name_len, - "bus=%s,vendor=%04x,device=%04x,subsystem-vendor=%04x,subsystem-device=%04x,qmi-chip-id=%d,qmi-board-id=%d%s", - ath11k_bus_str(ab->hif.bus), - ab->id.vendor, ab->id.device, - ab->id.subsystem_vendor, - ab->id.subsystem_device, - ab->qmi.target.chip_id, - ab->qmi.target.board_id, - variant); + if (bus_type_mode) + scnprintf(name, name_len, + "bus=%s", + ath11k_bus_str(ab->hif.bus)); + else + scnprintf(name, name_len, + "bus=%s,vendor=%04x,device=%04x,subsystem-vendor=%04x,subsystem-device=%04x,qmi-chip-id=%d,qmi-board-id=%d%s", + ath11k_bus_str(ab->hif.bus), + ab->id.vendor, ab->id.device, + ab->id.subsystem_vendor, + ab->id.subsystem_device, + ab->qmi.target.chip_id, + ab->qmi.target.board_id, + variant); break; default: scnprintf(name, name_len, @@ -1000,13 +1006,19 @@ static int __ath11k_core_create_board_name(struct ath11k_base *ab, char *name, static int ath11k_core_create_board_name(struct ath11k_base *ab, char *name, size_t name_len) { - return __ath11k_core_create_board_name(ab, name, name_len, true); + return __ath11k_core_create_board_name(ab, name, name_len, true, false); }
static int ath11k_core_create_fallback_board_name(struct ath11k_base *ab, char *name, size_t name_len) { - return __ath11k_core_create_board_name(ab, name, name_len, false); + return __ath11k_core_create_board_name(ab, name, name_len, false, false); +} + +static int ath11k_core_create_bus_type_board_name(struct ath11k_base *ab, char *name, + size_t name_len) +{ + return __ath11k_core_create_board_name(ab, name, name_len, false, true); }
const struct firmware *ath11k_core_firmware_request(struct ath11k_base *ab, @@ -1310,7 +1322,7 @@ int ath11k_core_fetch_bdf(struct ath11k_base *ab, struct ath11k_board_data *bd)
int ath11k_core_fetch_regdb(struct ath11k_base *ab, struct ath11k_board_data *bd) { - char boardname[BOARD_NAME_SIZE]; + char boardname[BOARD_NAME_SIZE], default_boardname[BOARD_NAME_SIZE]; int ret;
ret = ath11k_core_create_board_name(ab, boardname, BOARD_NAME_SIZE); @@ -1327,6 +1339,21 @@ int ath11k_core_fetch_regdb(struct ath11k_base *ab, struct ath11k_board_data *bd if (!ret) goto exit;
+ ret = ath11k_core_create_bus_type_board_name(ab, default_boardname, + BOARD_NAME_SIZE); + if (ret) { + ath11k_dbg(ab, ATH11K_DBG_BOOT, + "failed to create default board name for regdb: %d", ret); + goto exit; + } + + ret = ath11k_core_fetch_board_data_api_n(ab, bd, default_boardname, + ATH11K_BD_IE_REGDB, + ATH11K_BD_IE_REGDB_NAME, + ATH11K_BD_IE_REGDB_DATA); + if (!ret) + goto exit; + ret = ath11k_core_fetch_board_data_api_1(ab, bd, ATH11K_REGDB_FILE_NAME); if (ret) ath11k_dbg(ab, ATH11K_DBG_BOOT, "failed to fetch %s from %s\n",
From: Ilan Peer ilan.peer@intel.com
[ Upstream commit 0cc80943ef518a1c51a1111e9346d1daf11dd545 ]
In a call to mac80211_hwsim_select_tx_link() the sta pointer might be NULL, thus need to check that it is not NULL before accessing it.
Signed-off-by: Ilan Peer ilan.peer@intel.com Signed-off-by: Gregory Greenman gregory.greenman@intel.com Link: https://lore.kernel.org/r/20230604120651.f4d889fc98c4.Iae85f527ed245a37637a8... Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/virtual/mac80211_hwsim.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/net/wireless/virtual/mac80211_hwsim.c b/drivers/net/wireless/virtual/mac80211_hwsim.c index 89c7a1420381d..ed5af63025979 100644 --- a/drivers/net/wireless/virtual/mac80211_hwsim.c +++ b/drivers/net/wireless/virtual/mac80211_hwsim.c @@ -4,7 +4,7 @@ * Copyright (c) 2008, Jouni Malinen j@w1.fi * Copyright (c) 2011, Javier Lopez jlopex@gmail.com * Copyright (c) 2016 - 2017 Intel Deutschland GmbH - * Copyright (C) 2018 - 2022 Intel Corporation + * Copyright (C) 2018 - 2023 Intel Corporation */
/* @@ -1864,7 +1864,7 @@ mac80211_hwsim_select_tx_link(struct mac80211_hwsim_data *data,
WARN_ON(is_multicast_ether_addr(hdr->addr1));
- if (WARN_ON_ONCE(!sta->valid_links)) + if (WARN_ON_ONCE(!sta || !sta->valid_links)) return &vif->bss_conf;
for (i = 0; i < ARRAY_SIZE(vif->link_conf); i++) {
From: Abe Kohandel abe.kohandel@intel.com
[ Upstream commit 0760d5d0e9f0c0e2200a0323a61d1995bb745dee ]
The Intel Mount Evans SoC's Integrated Management Complex uses the SPI controller for access to a NOR SPI FLASH. However, the SoC doesn't provide a mechanism to override the native chip select signal.
This driver doesn't use DMA for memory operations when a chip select override is not provided due to the native chip select timing behavior. As a result no DMA configuration is done for the controller and this configuration is not tested.
The controller also has an errata where a full TX FIFO can result in data corruption. The suggested workaround is to never completely fill the FIFO. The TX FIFO has a size of 32 so the fifo_len is set to 31.
Signed-off-by: Abe Kohandel abe.kohandel@intel.com Reviewed-by: Andy Shevchenko andriy.shevchenko@linux.intel.com Link: https://lore.kernel.org/r/20230606145402.474866-2-abe.kohandel@intel.com Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/spi/spi-dw-mmio.c | 29 +++++++++++++++++++++++++++++ 1 file changed, 29 insertions(+)
diff --git a/drivers/spi/spi-dw-mmio.c b/drivers/spi/spi-dw-mmio.c index 15f5e9cb54ad4..5a38cb09a650d 100644 --- a/drivers/spi/spi-dw-mmio.c +++ b/drivers/spi/spi-dw-mmio.c @@ -236,6 +236,31 @@ static int dw_spi_intel_init(struct platform_device *pdev, return 0; }
+/* + * The Intel Mount Evans SoC's Integrated Management Complex uses the + * SPI controller for access to a NOR SPI FLASH. However, the SoC doesn't + * provide a mechanism to override the native chip select signal. + * + * This driver doesn't use DMA for memory operations when a chip select + * override is not provided due to the native chip select timing behavior. + * As a result no DMA configuration is done for the controller and this + * configuration is not tested. + */ +static int dw_spi_mountevans_imc_init(struct platform_device *pdev, + struct dw_spi_mmio *dwsmmio) +{ + /* + * The Intel Mount Evans SoC's Integrated Management Complex DW + * apb_ssi_v4.02a controller has an errata where a full TX FIFO can + * result in data corruption. The suggested workaround is to never + * completely fill the FIFO. The TX FIFO has a size of 32 so the + * fifo_len is set to 31. + */ + dwsmmio->dws.fifo_len = 31; + + return 0; +} + static int dw_spi_canaan_k210_init(struct platform_device *pdev, struct dw_spi_mmio *dwsmmio) { @@ -405,6 +430,10 @@ static const struct of_device_id dw_spi_mmio_of_match[] = { { .compatible = "snps,dwc-ssi-1.01a", .data = dw_spi_hssi_init}, { .compatible = "intel,keembay-ssi", .data = dw_spi_intel_init}, { .compatible = "intel,thunderbay-ssi", .data = dw_spi_intel_init}, + { + .compatible = "intel,mountevans-imc-ssi", + .data = dw_spi_mountevans_imc_init, + }, { .compatible = "microchip,sparx5-spi", dw_spi_mscc_sparx5_init}, { .compatible = "canaan,k210-spi", dw_spi_canaan_k210_init}, { .compatible = "amd,pensando-elba-spi", .data = dw_spi_elba_init},
From: Balamurugan S quic_bselvara@quicinc.com
[ Upstream commit 054b5580a36e435692c203c19abdcb9f7734320e ]
Currently 'ar' reference is not added in skb_cb. Though this is generally not used during transmit completion callbacks, on interface removal the remaining idr cleanup callback uses the ar pointer from skb_cb from management txmgmt_idr. Hence fill them during transmit call for proper usage to avoid NULL pointer dereference.
Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.0.1-00029-QCAHKSWPL_SILICONZ-1
Signed-off-by: Balamurugan S quic_bselvara@quicinc.com Signed-off-by: Kalle Valo quic_kvalo@quicinc.com Link: https://lore.kernel.org/r/20230518071046.14337-1-quic_bselvara@quicinc.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/ath/ath12k/mac.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/net/wireless/ath/ath12k/mac.c b/drivers/net/wireless/ath/ath12k/mac.c index ee792822b4113..58acfe8fdf8c0 100644 --- a/drivers/net/wireless/ath/ath12k/mac.c +++ b/drivers/net/wireless/ath/ath12k/mac.c @@ -4425,6 +4425,7 @@ static int ath12k_mac_mgmt_tx_wmi(struct ath12k *ar, struct ath12k_vif *arvif, int buf_id; int ret;
+ ATH12K_SKB_CB(skb)->ar = ar; spin_lock_bh(&ar->txmgmt_idr_lock); buf_id = idr_alloc(&ar->txmgmt_idr, skb, 0, ATH12K_TX_MGMT_NUM_PENDING_MAX, GFP_ATOMIC);
From: P Praneesh quic_ppranees@quicinc.com
[ Upstream commit 6aafa1c2d3e3fea2ebe84c018003f2a91722e607 ]
Memory allocated for firmware pdev, vdev and beacon statistics are not released during rmmod.
Fix it by calling ath11k_fw_stats_free() function before hardware unregister.
While at it, avoid calling ath11k_fw_stats_free() while processing the firmware stats received in the WMI event because the local list is getting spliced and reinitialised and hence there are no elements in the list after splicing.
Tested-on: QCN9074 hw1.0 PCI WLAN.HK.2.7.0.1-01744-QCAHKSWPL_SILICONZ-1
Signed-off-by: P Praneesh quic_ppranees@quicinc.com Signed-off-by: Aditya Kumar Singh quic_adisi@quicinc.com Signed-off-by: Kalle Valo quic_kvalo@quicinc.com Link: https://lore.kernel.org/r/20230606091128.14202-1-quic_adisi@quicinc.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/ath/ath11k/mac.c | 1 + drivers/net/wireless/ath/ath11k/wmi.c | 5 +++++ 2 files changed, 6 insertions(+)
diff --git a/drivers/net/wireless/ath/ath11k/mac.c b/drivers/net/wireless/ath/ath11k/mac.c index 05920ad413c55..01ff197b017f7 100644 --- a/drivers/net/wireless/ath/ath11k/mac.c +++ b/drivers/net/wireless/ath/ath11k/mac.c @@ -9468,6 +9468,7 @@ void ath11k_mac_destroy(struct ath11k_base *ab) if (!ar) continue;
+ ath11k_fw_stats_free(&ar->fw_stats); ieee80211_free_hw(ar->hw); pdev->ar = NULL; } diff --git a/drivers/net/wireless/ath/ath11k/wmi.c b/drivers/net/wireless/ath/ath11k/wmi.c index d0b59bc2905a9..42d9b29623a47 100644 --- a/drivers/net/wireless/ath/ath11k/wmi.c +++ b/drivers/net/wireless/ath/ath11k/wmi.c @@ -8103,6 +8103,11 @@ static void ath11k_update_stats_event(struct ath11k_base *ab, struct sk_buff *sk rcu_read_unlock(); spin_unlock_bh(&ar->data_lock);
+ /* Since the stats's pdev, vdev and beacon list are spliced and reinitialised + * at this point, no need to free the individual list. + */ + return; + free: ath11k_fw_stats_free(&stats); }
From: Gregory Greenman gregory.greenman@intel.com
[ Upstream commit 637452360ecde9ac972d19416e9606529576b302 ]
Account for IWL_SEC_WEP_KEY_OFFSET when needed while verifying key_len size in iwl_mvm_sec_key_add().
Signed-off-by: Gregory Greenman gregory.greenman@intel.com Link: https://lore.kernel.org/r/20230613155501.f193b7493a93.I6948ba625b9318924b96a... Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/intel/iwlwifi/mvm/mld-key.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/drivers/net/wireless/intel/iwlwifi/mvm/mld-key.c b/drivers/net/wireless/intel/iwlwifi/mvm/mld-key.c index 8853821b37168..1e659bd07392a 100644 --- a/drivers/net/wireless/intel/iwlwifi/mvm/mld-key.c +++ b/drivers/net/wireless/intel/iwlwifi/mvm/mld-key.c @@ -1,6 +1,6 @@ // SPDX-License-Identifier: GPL-2.0 OR BSD-3-Clause /* - * Copyright (C) 2022 Intel Corporation + * Copyright (C) 2022 - 2023 Intel Corporation */ #include <linux/kernel.h> #include <net/mac80211.h> @@ -179,9 +179,14 @@ int iwl_mvm_sec_key_add(struct iwl_mvm *mvm, .u.add.key_flags = cpu_to_le32(key_flags), .u.add.tx_seq = cpu_to_le64(atomic64_read(&keyconf->tx_pn)), }; + int max_key_len = sizeof(cmd.u.add.key); int ret;
- if (WARN_ON(keyconf->keylen > sizeof(cmd.u.add.key))) + if (keyconf->cipher == WLAN_CIPHER_SUITE_WEP40 || + keyconf->cipher == WLAN_CIPHER_SUITE_WEP104) + max_key_len -= IWL_SEC_WEP_KEY_OFFSET; + + if (WARN_ON(keyconf->keylen > max_key_len)) return -EINVAL;
if (WARN_ON(!sta_mask))
From: Jisheng Zhang jszhang@kernel.org
[ Upstream commit 18da174d865a87d47d2f33f5b0a322efcf067728 ]
Implement 64 bit per cpu stats to fix the overflow of netdev->stats on 32 bit platforms. To simplify the code, we use net core pcpu_sw_netstats infrastructure. One small drawback is some memory overhead because litex uses just one queue, but we allocate the counters per cpu.
Signed-off-by: Jisheng Zhang jszhang@kernel.org Reviewed-by: Simon Horman simon.horman@corigine.com Acked-by: Gabriel Somlo gsomlo@gmail.com Link: https://lore.kernel.org/r/20230614162035.300-1-jszhang@kernel.org Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/litex/litex_liteeth.c | 19 +++++++++++++++---- 1 file changed, 15 insertions(+), 4 deletions(-)
diff --git a/drivers/net/ethernet/litex/litex_liteeth.c b/drivers/net/ethernet/litex/litex_liteeth.c index 35f24e0f09349..ffa96059079c6 100644 --- a/drivers/net/ethernet/litex/litex_liteeth.c +++ b/drivers/net/ethernet/litex/litex_liteeth.c @@ -78,8 +78,7 @@ static int liteeth_rx(struct net_device *netdev) memcpy_fromio(data, priv->rx_base + rx_slot * priv->slot_size, len); skb->protocol = eth_type_trans(skb, netdev);
- netdev->stats.rx_packets++; - netdev->stats.rx_bytes += len; + dev_sw_netstats_rx_add(netdev, len);
return netif_rx(skb);
@@ -185,8 +184,7 @@ static netdev_tx_t liteeth_start_xmit(struct sk_buff *skb, litex_write16(priv->base + LITEETH_READER_LENGTH, skb->len); litex_write8(priv->base + LITEETH_READER_START, 1);
- netdev->stats.tx_bytes += skb->len; - netdev->stats.tx_packets++; + dev_sw_netstats_tx_add(netdev, 1, skb->len);
priv->tx_slot = (priv->tx_slot + 1) % priv->num_tx_slots; dev_kfree_skb_any(skb); @@ -194,9 +192,17 @@ static netdev_tx_t liteeth_start_xmit(struct sk_buff *skb, return NETDEV_TX_OK; }
+static void +liteeth_get_stats64(struct net_device *netdev, struct rtnl_link_stats64 *stats) +{ + netdev_stats_to_stats64(stats, &netdev->stats); + dev_fetch_sw_netstats(stats, netdev->tstats); +} + static const struct net_device_ops liteeth_netdev_ops = { .ndo_open = liteeth_open, .ndo_stop = liteeth_stop, + .ndo_get_stats64 = liteeth_get_stats64, .ndo_start_xmit = liteeth_start_xmit, };
@@ -242,6 +248,11 @@ static int liteeth_probe(struct platform_device *pdev) priv->netdev = netdev; priv->dev = &pdev->dev;
+ netdev->tstats = devm_netdev_alloc_pcpu_stats(&pdev->dev, + struct pcpu_sw_netstats); + if (!netdev->tstats) + return -ENOMEM; + irq = platform_get_irq(pdev, 0); if (irq < 0) return irq;
From: Petr Oros poros@redhat.com
[ Upstream commit a52305a81d6bb74b90b400dfa56455d37872fe4b ]
devlink_port_type_warn is scheduled for port devlink and warning when the port type is not set. But from this warning it is not easy found out which device (driver) has no devlink port set.
[ 3709.975552] Type was not set for devlink port. [ 3709.975579] WARNING: CPU: 1 PID: 13092 at net/devlink/leftover.c:6775 devlink_port_type_warn+0x11/0x20 [ 3709.993967] Modules linked in: openvswitch nf_conncount nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nfnetlink bluetooth rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs vhost_net vhost vhost_iotlb tap tun bridge stp llc qrtr intel_rapl_msr intel_rapl_common i10nm_edac nfit libnvdimm x86_pkg_temp_thermal mlx5_ib intel_powerclamp coretemp dell_wmi ledtrig_audio sparse_keymap ipmi_ssif kvm_intel ib_uverbs rfkill ib_core video kvm iTCO_wdt acpi_ipmi intel_vsec irqbypass ipmi_si iTCO_vendor_support dcdbas ipmi_devintf mei_me ipmi_msghandler rapl mei intel_cstate isst_if_mmio isst_if_mbox_pci dell_smbios intel_uncore isst_if_common i2c_i801 dell_wmi_descriptor wmi_bmof i2c_smbus intel_pch_thermal pcspkr acpi_power_meter xfs libcrc32c sd_mod sg nvme_tcp mgag200 i2c_algo_bit nvme_fabrics drm_shmem_helper drm_kms_helper nvme syscopyarea ahci sysfillrect sysimgblt nvme_core fb_sys_fops crct10dif_pclmul libahci mlx5_core sfc crc32_pclmul nvme_common drm [ 3709.994030] crc32c_intel mtd t10_pi mlxfw libata tg3 mdio megaraid_sas psample ghash_clmulni_intel pci_hyperv_intf wmi dm_multipath sunrpc dm_mirror dm_region_hash dm_log dm_mod be2iscsi bnx2i cnic uio cxgb4i cxgb4 tls libcxgbi libcxgb qla4xxx iscsi_boot_sysfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi fuse [ 3710.108431] CPU: 1 PID: 13092 Comm: kworker/1:1 Kdump: loaded Not tainted 5.14.0-319.el9.x86_64 #1 [ 3710.108435] Hardware name: Dell Inc. PowerEdge R750/0PJ80M, BIOS 1.8.2 09/14/2022 [ 3710.108437] Workqueue: events devlink_port_type_warn [ 3710.108440] RIP: 0010:devlink_port_type_warn+0x11/0x20 [ 3710.108443] Code: 84 76 fe ff ff 48 c7 03 20 0e 1a ad 31 c0 e9 96 fd ff ff 66 0f 1f 44 00 00 0f 1f 44 00 00 48 c7 c7 18 24 4e ad e8 ef 71 62 ff <0f> 0b c3 cc cc cc cc 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 f6 87 [ 3710.108445] RSP: 0018:ff3b6d2e8b3c7e90 EFLAGS: 00010282 [ 3710.108447] RAX: 0000000000000000 RBX: ff366d6580127080 RCX: 0000000000000027 [ 3710.108448] RDX: 0000000000000027 RSI: 00000000ffff86de RDI: ff366d753f41f8c8 [ 3710.108449] RBP: ff366d658ff5a0c0 R08: ff366d753f41f8c0 R09: ff3b6d2e8b3c7e18 [ 3710.108450] R10: 0000000000000001 R11: 0000000000000023 R12: ff366d753f430600 [ 3710.108451] R13: ff366d753f436900 R14: 0000000000000000 R15: ff366d753f436905 [ 3710.108452] FS: 0000000000000000(0000) GS:ff366d753f400000(0000) knlGS:0000000000000000 [ 3710.108453] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 3710.108454] CR2: 00007f1c57bc74e0 CR3: 000000111d26a001 CR4: 0000000000773ee0 [ 3710.108456] PKRU: 55555554 [ 3710.108457] Call Trace: [ 3710.108458] <TASK> [ 3710.108459] process_one_work+0x1e2/0x3b0 [ 3710.108466] ? rescuer_thread+0x390/0x390 [ 3710.108468] worker_thread+0x50/0x3a0 [ 3710.108471] ? rescuer_thread+0x390/0x390 [ 3710.108473] kthread+0xdd/0x100 [ 3710.108477] ? kthread_complete_and_exit+0x20/0x20 [ 3710.108479] ret_from_fork+0x1f/0x30 [ 3710.108485] </TASK> [ 3710.108486] ---[ end trace 1b4b23cd0c65d6a0 ]---
After patch: [ 402.473064] ice 0000:41:00.0: Type was not set for devlink port. [ 402.473064] ice 0000:41:00.1: Type was not set for devlink port.
Signed-off-by: Petr Oros poros@redhat.com Reviewed-by: Pavan Chebbi pavan.chebbi@broadcom.com Reviewed-by: Jakub Kicinski kuba@kernel.org Link: https://lore.kernel.org/r/20230615095447.8259-1-poros@redhat.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/devlink/leftover.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/net/devlink/leftover.c b/net/devlink/leftover.c index cd02549680767..790e61b2a9404 100644 --- a/net/devlink/leftover.c +++ b/net/devlink/leftover.c @@ -6772,7 +6772,10 @@ void devlink_notify_unregister(struct devlink *devlink)
static void devlink_port_type_warn(struct work_struct *work) { - WARN(true, "Type was not set for devlink port."); + struct devlink_port *port = container_of(to_delayed_work(work), + struct devlink_port, + type_warn_dw); + dev_warn(port->devlink->dev, "Type was not set for devlink port."); }
static bool devlink_port_type_should_warn(struct devlink_port *devlink_port)
From: Mukesh Sisodiya mukesh.sisodiya@intel.com
[ Upstream commit 7dd50fd5478056929a012c6bf8b3c6f87c7e9e87 ]
While vif pointers are protected by the corresponding "*active" fields, static checkers can get confused sometimes. Add an explicit check.
Signed-off-by: Mukesh Sisodiya mukesh.sisodiya@intel.com Signed-off-by: Gregory Greenman gregory.greenman@intel.com Link: https://lore.kernel.org/r/20230614154951.78749ae91fb5.Id3c05d13eeee6638f0930... Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/intel/iwlwifi/mvm/power.c | 14 ++++++++------ 1 file changed, 8 insertions(+), 6 deletions(-)
diff --git a/drivers/net/wireless/intel/iwlwifi/mvm/power.c b/drivers/net/wireless/intel/iwlwifi/mvm/power.c index ac1dae52556f8..19839cc44eb3d 100644 --- a/drivers/net/wireless/intel/iwlwifi/mvm/power.c +++ b/drivers/net/wireless/intel/iwlwifi/mvm/power.c @@ -647,30 +647,32 @@ static void iwl_mvm_power_set_pm(struct iwl_mvm *mvm, return;
/* enable PM on bss if bss stand alone */ - if (vifs->bss_active && !vifs->p2p_active && !vifs->ap_active) { + if (bss_mvmvif && vifs->bss_active && !vifs->p2p_active && + !vifs->ap_active) { bss_mvmvif->pm_enabled = true; return; }
/* enable PM on p2p if p2p stand alone */ - if (vifs->p2p_active && !vifs->bss_active && !vifs->ap_active) { + if (p2p_mvmvif && vifs->p2p_active && !vifs->bss_active && + !vifs->ap_active) { p2p_mvmvif->pm_enabled = true; return; }
- if (vifs->bss_active && vifs->p2p_active) + if (p2p_mvmvif && bss_mvmvif && vifs->bss_active && vifs->p2p_active) client_same_channel = iwl_mvm_have_links_same_channel(bss_mvmvif, p2p_mvmvif);
- if (vifs->bss_active && vifs->ap_active) + if (bss_mvmvif && ap_mvmvif && vifs->bss_active && vifs->ap_active) ap_same_channel = iwl_mvm_have_links_same_channel(bss_mvmvif, ap_mvmvif);
/* clients are not stand alone: enable PM if DCM */ if (!(client_same_channel || ap_same_channel)) { - if (vifs->bss_active) + if (bss_mvmvif && vifs->bss_active) bss_mvmvif->pm_enabled = true; - if (vifs->p2p_active) + if (p2p_mvmvif && vifs->p2p_active) p2p_mvmvif->pm_enabled = true; return; }
From: Gustavo A. R. Silva gustavoars@kernel.org
[ Upstream commit 71e7552c90db2a2767f5c17c7ec72296b0d92061 ]
-Wstringop-overflow is legitimately warning us about extra_size pontentially being zero at some point, hence potenially ending up _allocating_ zero bytes of memory for extra pointer and then trying to access such object in a call to copy_from_user().
Fix this by adding a sanity check to ensure we never end up trying to allocate zero bytes of data for extra pointer, before continue executing the rest of the code in the function.
Address the following -Wstringop-overflow warning seen when built m68k architecture with allyesconfig configuration: from net/wireless/wext-core.c:11: In function '_copy_from_user', inlined from 'copy_from_user' at include/linux/uaccess.h:183:7, inlined from 'ioctl_standard_iw_point' at net/wireless/wext-core.c:825:7: arch/m68k/include/asm/string.h:48:25: warning: '__builtin_memset' writing 1 or more bytes into a region of size 0 overflows the destination [-Wstringop-overflow=] 48 | #define memset(d, c, n) __builtin_memset(d, c, n) | ^~~~~~~~~~~~~~~~~~~~~~~~~ include/linux/uaccess.h:153:17: note: in expansion of macro 'memset' 153 | memset(to + (n - res), 0, res); | ^~~~~~ In function 'kmalloc', inlined from 'kzalloc' at include/linux/slab.h:694:9, inlined from 'ioctl_standard_iw_point' at net/wireless/wext-core.c:819:10: include/linux/slab.h:577:16: note: at offset 1 into destination object of size 0 allocated by '__kmalloc' 577 | return __kmalloc(size, flags); | ^~~~~~~~~~~~~~~~~~~~~~
This help with the ongoing efforts to globally enable -Wstringop-overflow.
Link: https://github.com/KSPP/linux/issues/315 Signed-off-by: Gustavo A. R. Silva gustavoars@kernel.org Reviewed-by: Simon Horman simon.horman@corigine.com Link: https://lore.kernel.org/r/ZItSlzvIpjdjNfd8@work Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/wireless/wext-core.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/net/wireless/wext-core.c b/net/wireless/wext-core.c index a125fd1fa1342..a161c64d1765e 100644 --- a/net/wireless/wext-core.c +++ b/net/wireless/wext-core.c @@ -815,6 +815,12 @@ static int ioctl_standard_iw_point(struct iw_point *iwp, unsigned int cmd, } }
+ /* Sanity-check to ensure we never end up _allocating_ zero + * bytes of data for extra. + */ + if (extra_size <= 0) + return -EFAULT; + /* kzalloc() ensures NULL-termination for essid_compat. */ extra = kzalloc(extra_size, GFP_KERNEL); if (!extra)
From: Mukesh Sisodiya mukesh.sisodiya@intel.com
[ Upstream commit 35bd6f1d043d089fcb60450e1287cc65f0095787 ]
Add support for the PCI Id 51F1 without IMR support.
Signed-off-by: Mukesh Sisodiya mukesh.sisodiya@intel.com Signed-off-by: Gregory Greenman gregory.greenman@intel.com Link: https://lore.kernel.org/r/20230620125813.9800e652e789.Ic06a085832ac3f988c8ef... Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/intel/iwlwifi/pcie/drv.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/net/wireless/intel/iwlwifi/pcie/drv.c b/drivers/net/wireless/intel/iwlwifi/pcie/drv.c index 79115eb1c2852..e9fe6cea891aa 100644 --- a/drivers/net/wireless/intel/iwlwifi/pcie/drv.c +++ b/drivers/net/wireless/intel/iwlwifi/pcie/drv.c @@ -495,6 +495,7 @@ static const struct pci_device_id iwl_hw_card_ids[] = { {IWL_PCI_DEVICE(0x7AF0, PCI_ANY_ID, iwl_so_trans_cfg)}, {IWL_PCI_DEVICE(0x51F0, PCI_ANY_ID, iwl_so_long_latency_trans_cfg)}, {IWL_PCI_DEVICE(0x51F1, PCI_ANY_ID, iwl_so_long_latency_imr_trans_cfg)}, + {IWL_PCI_DEVICE(0x51F1, PCI_ANY_ID, iwl_so_long_latency_trans_cfg)}, {IWL_PCI_DEVICE(0x54F0, PCI_ANY_ID, iwl_so_long_latency_trans_cfg)}, {IWL_PCI_DEVICE(0x7F70, PCI_ANY_ID, iwl_so_trans_cfg)},
@@ -544,6 +545,7 @@ static const struct iwl_dev_info iwl_dev_info_table[] = { IWL_DEV_INFO(0x51F0, 0x1551, iwl9560_2ac_cfg_soc, iwl9560_killer_1550i_160_name), IWL_DEV_INFO(0x51F0, 0x1691, iwlax411_2ax_cfg_so_gf4_a0, iwl_ax411_killer_1690s_name), IWL_DEV_INFO(0x51F0, 0x1692, iwlax411_2ax_cfg_so_gf4_a0, iwl_ax411_killer_1690i_name), + IWL_DEV_INFO(0x51F1, 0x1692, iwlax411_2ax_cfg_so_gf4_a0, iwl_ax411_killer_1690i_name), IWL_DEV_INFO(0x54F0, 0x1691, iwlax411_2ax_cfg_so_gf4_a0, iwl_ax411_killer_1690s_name), IWL_DEV_INFO(0x54F0, 0x1692, iwlax411_2ax_cfg_so_gf4_a0, iwl_ax411_killer_1690i_name), IWL_DEV_INFO(0x7A70, 0x1691, iwlax411_2ax_cfg_so_gf4_a0, iwl_ax411_killer_1690s_name),
From: Johannes Berg johannes.berg@intel.com
[ Upstream commit 1a528ab1da324d078ec60283c34c17848580df24 ]
Roee reported various hard-to-debug crashes with pings in EHT aggregation scenarios. Enabling KASAN showed that we access the BAID allocation out of bounds, and looking at the code a bit shows that since the reorder buffer entry (struct iwl_mvm_reorder_buf_entry) is 128 bytes if debug such as lockdep is enabled, then staring from an agg size 512 we overflow the size calculation, and allocate a much smaller structure than we should, causing slab corruption once we initialize this.
Fix this by simply using u32 instead of u16.
Reported-by: Roee Goldfiner roee.h.goldfiner@intel.com Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Gregory Greenman gregory.greenman@intel.com Link: https://lore.kernel.org/r/20230620125813.f428c856030d.I2c2bb808e945adb71bc15... Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/intel/iwlwifi/mvm/sta.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/wireless/intel/iwlwifi/mvm/sta.c b/drivers/net/wireless/intel/iwlwifi/mvm/sta.c index b85e363544f8b..7f9a809dd081c 100644 --- a/drivers/net/wireless/intel/iwlwifi/mvm/sta.c +++ b/drivers/net/wireless/intel/iwlwifi/mvm/sta.c @@ -2884,7 +2884,7 @@ int iwl_mvm_sta_rx_agg(struct iwl_mvm *mvm, struct ieee80211_sta *sta, }
if (iwl_mvm_has_new_rx_api(mvm) && start) { - u16 reorder_buf_size = buf_size * sizeof(baid_data->entries[0]); + u32 reorder_buf_size = buf_size * sizeof(baid_data->entries[0]);
/* sparse doesn't like the __align() so don't check */ #ifndef __CHECKER__
From: Yi Kuo yi@yikuo.dev
[ Upstream commit f4daceae4087bbb3e9a56044b44601d520d009d2 ]
Intel Killer AX1675i/s with device id 51f1 would show "No config found for PCI dev 51f1/1672" in dmesg and refuse to work. Add the new device id 51F1 for 1675i/s to fix the issue.
Signed-off-by: Yi Kuo yi@yikuo.dev Signed-off-by: Gregory Greenman gregory.greenman@intel.com Link: https://lore.kernel.org/r/20230621130444.ee224675380b.I921c905e21e8d041ad808... Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/intel/iwlwifi/pcie/drv.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/net/wireless/intel/iwlwifi/pcie/drv.c b/drivers/net/wireless/intel/iwlwifi/pcie/drv.c index e9fe6cea891aa..e086664a4eaca 100644 --- a/drivers/net/wireless/intel/iwlwifi/pcie/drv.c +++ b/drivers/net/wireless/intel/iwlwifi/pcie/drv.c @@ -684,6 +684,8 @@ static const struct iwl_dev_info iwl_dev_info_table[] = { IWL_DEV_INFO(0x2726, 0x1672, iwlax211_2ax_cfg_so_gf_a0, iwl_ax211_killer_1675i_name), IWL_DEV_INFO(0x51F0, 0x1671, iwlax211_2ax_cfg_so_gf_a0, iwl_ax211_killer_1675s_name), IWL_DEV_INFO(0x51F0, 0x1672, iwlax211_2ax_cfg_so_gf_a0, iwl_ax211_killer_1675i_name), + IWL_DEV_INFO(0x51F1, 0x1671, iwlax211_2ax_cfg_so_gf_a0, iwl_ax211_killer_1675s_name), + IWL_DEV_INFO(0x51F1, 0x1672, iwlax211_2ax_cfg_so_gf_a0, iwl_ax211_killer_1675i_name), IWL_DEV_INFO(0x54F0, 0x1671, iwlax211_2ax_cfg_so_gf_a0, iwl_ax211_killer_1675s_name), IWL_DEV_INFO(0x54F0, 0x1672, iwlax211_2ax_cfg_so_gf_a0, iwl_ax211_killer_1675i_name), IWL_DEV_INFO(0x7A70, 0x1671, iwlax211_2ax_cfg_so_gf_a0, iwl_ax211_killer_1675s_name),
From: Ying Hsu yinghsu@chromium.org
[ Upstream commit 004d25060c78fc31f66da0fa439c544dda1ac9d5 ]
In a setup where a Thunderbolt hub connects to Ethernet and a display through USB Type-C, users may experience a hung task timeout when they remove the cable between the PC and the Thunderbolt hub. This is because the igb_down function is called multiple times when the Thunderbolt hub is unplugged. For example, the igb_io_error_detected triggers the first call, and the igb_remove triggers the second call. The second call to igb_down will block at napi_synchronize. Here's the call trace: __schedule+0x3b0/0xddb ? __mod_timer+0x164/0x5d3 schedule+0x44/0xa8 schedule_timeout+0xb2/0x2a4 ? run_local_timers+0x4e/0x4e msleep+0x31/0x38 igb_down+0x12c/0x22a [igb 6615058754948bfde0bf01429257eb59f13030d4] __igb_close+0x6f/0x9c [igb 6615058754948bfde0bf01429257eb59f13030d4] igb_close+0x23/0x2b [igb 6615058754948bfde0bf01429257eb59f13030d4] __dev_close_many+0x95/0xec dev_close_many+0x6e/0x103 unregister_netdevice_many+0x105/0x5b1 unregister_netdevice_queue+0xc2/0x10d unregister_netdev+0x1c/0x23 igb_remove+0xa7/0x11c [igb 6615058754948bfde0bf01429257eb59f13030d4] pci_device_remove+0x3f/0x9c device_release_driver_internal+0xfe/0x1b4 pci_stop_bus_device+0x5b/0x7f pci_stop_bus_device+0x30/0x7f pci_stop_bus_device+0x30/0x7f pci_stop_and_remove_bus_device+0x12/0x19 pciehp_unconfigure_device+0x76/0xe9 pciehp_disable_slot+0x6e/0x131 pciehp_handle_presence_or_link_change+0x7a/0x3f7 pciehp_ist+0xbe/0x194 irq_thread_fn+0x22/0x4d ? irq_thread+0x1fd/0x1fd irq_thread+0x17b/0x1fd ? irq_forced_thread_fn+0x5f/0x5f kthread+0x142/0x153 ? __irq_get_irqchip_state+0x46/0x46 ? kthread_associate_blkcg+0x71/0x71 ret_from_fork+0x1f/0x30
In this case, igb_io_error_detected detaches the network interface and requests a PCIE slot reset, however, the PCIE reset callback is not being invoked and thus the Ethernet connection breaks down. As the PCIE error in this case is a non-fatal one, requesting a slot reset can be avoided. This patch fixes the task hung issue and preserves Ethernet connection by ignoring non-fatal PCIE errors.
Signed-off-by: Ying Hsu yinghsu@chromium.org Tested-by: Pucha Himasekhar Reddy himasekharx.reddy.pucha@intel.com (A Contingent worker at Intel) Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Reviewed-by: Simon Horman simon.horman@corigine.com Link: https://lore.kernel.org/r/20230620174732.4145155-1-anthony.l.nguyen@intel.co... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/igb/igb_main.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c index bb3db387d49cf..ba5e1d1320f67 100644 --- a/drivers/net/ethernet/intel/igb/igb_main.c +++ b/drivers/net/ethernet/intel/igb/igb_main.c @@ -9585,6 +9585,11 @@ static pci_ers_result_t igb_io_error_detected(struct pci_dev *pdev, struct net_device *netdev = pci_get_drvdata(pdev); struct igb_adapter *adapter = netdev_priv(netdev);
+ if (state == pci_channel_io_normal) { + dev_warn(&pdev->dev, "Non-correctable non-fatal error reported.\n"); + return PCI_ERS_RESULT_CAN_RECOVER; + } + netif_device_detach(netdev);
if (state == pci_channel_io_perm_failure)
From: Hao Chen chenhao418@huawei.com
[ Upstream commit 1cf3d5567f273a8746d1bade00633a93204f80f0 ]
Now, strncpy() in hns3_dbg_fill_content() use src-length as copy-length, it may result in dest-buf overflow.
This patch is to fix intel compile warning for csky-linux-gcc (GCC) 12.1.0 compiler.
The warning reports as below:
hclge_debugfs.c:92:25: warning: 'strncpy' specified bound depends on the length of the source argument [-Wstringop-truncation]
strncpy(pos, items[i].name, strlen(items[i].name));
hclge_debugfs.c:90:25: warning: 'strncpy' output truncated before terminating nul copying as many bytes from a string as its length [-Wstringop-truncation]
strncpy(pos, result[i], strlen(result[i]));
strncpy() use src-length as copy-length, it may result in dest-buf overflow.
So,this patch add some values check to avoid this issue.
Signed-off-by: Hao Chen chenhao418@huawei.com Reported-by: kernel test robot lkp@intel.com Closes: https://lore.kernel.org/lkml/202207170606.7WtHs9yS-lkp@intel.com/T/ Signed-off-by: Hao Lan lanhao@huawei.com Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- .../ethernet/hisilicon/hns3/hns3_debugfs.c | 31 ++++++++++++++----- .../hisilicon/hns3/hns3pf/hclge_debugfs.c | 29 ++++++++++++++--- 2 files changed, 48 insertions(+), 12 deletions(-)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_debugfs.c b/drivers/net/ethernet/hisilicon/hns3/hns3_debugfs.c index d385ffc218766..32bb14303473b 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3_debugfs.c +++ b/drivers/net/ethernet/hisilicon/hns3/hns3_debugfs.c @@ -438,19 +438,36 @@ static void hns3_dbg_fill_content(char *content, u16 len, const struct hns3_dbg_item *items, const char **result, u16 size) { +#define HNS3_DBG_LINE_END_LEN 2 char *pos = content; + u16 item_len; u16 i;
+ if (!len) { + return; + } else if (len <= HNS3_DBG_LINE_END_LEN) { + *pos++ = '\0'; + return; + } + memset(content, ' ', len); - for (i = 0; i < size; i++) { - if (result) - strncpy(pos, result[i], strlen(result[i])); - else - strncpy(pos, items[i].name, strlen(items[i].name)); + len -= HNS3_DBG_LINE_END_LEN;
- pos += strlen(items[i].name) + items[i].interval; + for (i = 0; i < size; i++) { + item_len = strlen(items[i].name) + items[i].interval; + if (len < item_len) + break; + + if (result) { + if (item_len < strlen(result[i])) + break; + strscpy(pos, result[i], strlen(result[i])); + } else { + strscpy(pos, items[i].name, strlen(items[i].name)); + } + pos += item_len; + len -= item_len; } - *pos++ = '\n'; *pos++ = '\0'; } diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_debugfs.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_debugfs.c index a0b46e7d863eb..233c132dc513e 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_debugfs.c +++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_debugfs.c @@ -88,16 +88,35 @@ static void hclge_dbg_fill_content(char *content, u16 len, const struct hclge_dbg_item *items, const char **result, u16 size) { +#define HCLGE_DBG_LINE_END_LEN 2 char *pos = content; + u16 item_len; u16 i;
+ if (!len) { + return; + } else if (len <= HCLGE_DBG_LINE_END_LEN) { + *pos++ = '\0'; + return; + } + memset(content, ' ', len); + len -= HCLGE_DBG_LINE_END_LEN; + for (i = 0; i < size; i++) { - if (result) - strncpy(pos, result[i], strlen(result[i])); - else - strncpy(pos, items[i].name, strlen(items[i].name)); - pos += strlen(items[i].name) + items[i].interval; + item_len = strlen(items[i].name) + items[i].interval; + if (len < item_len) + break; + + if (result) { + if (item_len < strlen(result[i])) + break; + strscpy(pos, result[i], strlen(result[i])); + } else { + strscpy(pos, items[i].name, strlen(items[i].name)); + } + pos += item_len; + len -= item_len; } *pos++ = '\n'; *pos++ = '\0';
From: Vijendar Mukunda Vijendar.Mukunda@amd.com
[ Upstream commit 85aeab362201cf52c34cd429e4f6c75a0b42f9a3 ]
For invalid dai id, instead of returning -EINVAL return bytes count as zero in acp_get_byte_count() function.
Fixes: 623621a9f9e1 ("ASoC: amd: Add common framework to support I2S on ACP SOC")
Signed-off-by: Vijendar Mukunda Vijendar.Mukunda@amd.com Link: https://lore.kernel.org/r/20230626105356.2580125-6-Vijendar.Mukunda@amd.com Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/amd/acp/amd.h | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/sound/soc/amd/acp/amd.h b/sound/soc/amd/acp/amd.h index 5f2119f422715..12a176a50fd6e 100644 --- a/sound/soc/amd/acp/amd.h +++ b/sound/soc/amd/acp/amd.h @@ -173,7 +173,7 @@ int snd_amd_acp_find_config(struct pci_dev *pci);
static inline u64 acp_get_byte_count(struct acp_dev_data *adata, int dai_id, int direction) { - u64 byte_count, low = 0, high = 0; + u64 byte_count = 0, low = 0, high = 0;
if (direction == SNDRV_PCM_STREAM_PLAYBACK) { switch (dai_id) { @@ -191,7 +191,7 @@ static inline u64 acp_get_byte_count(struct acp_dev_data *adata, int dai_id, int break; default: dev_err(adata->dev, "Invalid dai id %x\n", dai_id); - return -EINVAL; + goto POINTER_RETURN_BYTES; } } else { switch (dai_id) { @@ -213,12 +213,13 @@ static inline u64 acp_get_byte_count(struct acp_dev_data *adata, int dai_id, int break; default: dev_err(adata->dev, "Invalid dai id %x\n", dai_id); - return -EINVAL; + goto POINTER_RETURN_BYTES; } } /* Get 64 bit value from two 32 bit registers */ byte_count = (high << 32) | low;
+POINTER_RETURN_BYTES: return byte_count; }
From: Johan Hovold johan+linaro@kernel.org
[ Upstream commit e5ce198bd5c6923b6a51e1493b1401f84c24b26d ]
Demote the MBHC impedance measurement printk, which is not an error message, from error to debug level.
While at it, fix the capitalisation of "ohm" and add the missing space before the opening parenthesis.
Fixes: bcee7ed09b8e ("ASoC: codecs: wcd938x: add Multi Button Headset Control support") Signed-off-by: Johan Hovold johan+linaro@kernel.org Reviewed-by: Srinivas Kandagatla srinivas.kandagatla@linaro.org Link: https://lore.kernel.org/r/20230630142717.5314-2-johan+linaro@kernel.org Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/codecs/wcd938x.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/sound/soc/codecs/wcd938x.c b/sound/soc/codecs/wcd938x.c index 0ff8f784b5eca..8bb6a5ff7b0f6 100644 --- a/sound/soc/codecs/wcd938x.c +++ b/sound/soc/codecs/wcd938x.c @@ -2165,8 +2165,8 @@ static inline void wcd938x_mbhc_get_result_params(struct wcd938x_priv *wcd938x, else if (x1 < minCode_param[noff]) *zdet = WCD938X_ZDET_FLOATING_IMPEDANCE;
- pr_err("%s: d1=%d, c1=%d, x1=0x%x, z_val=%d(milliOhm)\n", - __func__, d1, c1, x1, *zdet); + pr_debug("%s: d1=%d, c1=%d, x1=0x%x, z_val=%d (milliohm)\n", + __func__, d1, c1, x1, *zdet); ramp_down: i = 0; while (x1) {
From: Srinivas Kandagatla srinivas.kandagatla@linaro.org
[ Upstream commit c03226ba15fe3c42d13907ec7d8536396602557b ]
dB range for HPHL and HPHR gains are from +6dB to -30dB in steps of 1.5dB with register values range from 0 to 24.
Current code maps these dB ranges incorrectly, fix them to allow proper volume setting.
Fixes: e8ba1e05bdc0 ("ASoC: codecs: wcd938x: add basic controls") Signed-off-by: Srinivas Kandagatla srinivas.kandagatla@linaro.org Link: https://lore.kernel.org/r/20230705125723.40464-1-srinivas.kandagatla@linaro.... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/codecs/wcd938x.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/sound/soc/codecs/wcd938x.c b/sound/soc/codecs/wcd938x.c index 8bb6a5ff7b0f6..4a0b990f56e12 100644 --- a/sound/soc/codecs/wcd938x.c +++ b/sound/soc/codecs/wcd938x.c @@ -210,7 +210,7 @@ struct wcd938x_priv { };
static const SNDRV_CTL_TLVD_DECLARE_DB_MINMAX(ear_pa_gain, 600, -1800); -static const SNDRV_CTL_TLVD_DECLARE_DB_MINMAX(line_gain, 600, -3000); +static const DECLARE_TLV_DB_SCALE(line_gain, -3000, 150, -3000); static const SNDRV_CTL_TLVD_DECLARE_DB_MINMAX(analog_gain, 0, 3000);
struct wcd938x_mbhc_zdet_param { @@ -2662,8 +2662,8 @@ static const struct snd_kcontrol_new wcd938x_snd_controls[] = { wcd938x_get_swr_port, wcd938x_set_swr_port), SOC_SINGLE_EXT("DSD_R Switch", WCD938X_DSD_R, 0, 1, 0, wcd938x_get_swr_port, wcd938x_set_swr_port), - SOC_SINGLE_TLV("HPHL Volume", WCD938X_HPH_L_EN, 0, 0x18, 0, line_gain), - SOC_SINGLE_TLV("HPHR Volume", WCD938X_HPH_R_EN, 0, 0x18, 0, line_gain), + SOC_SINGLE_TLV("HPHL Volume", WCD938X_HPH_L_EN, 0, 0x18, 1, line_gain), + SOC_SINGLE_TLV("HPHR Volume", WCD938X_HPH_R_EN, 0, 0x18, 1, line_gain), WCD938X_EAR_PA_GAIN_TLV("EAR_PA Volume", WCD938X_ANA_EAR_COMPANDER_CTL, 2, 0x10, 0, ear_pa_gain), SOC_SINGLE_EXT("ADC1 Switch", WCD938X_ADC1, 1, 1, 0,
From: Srinivas Kandagatla srinivas.kandagatla@linaro.org
[ Upstream commit c1be62923d4d86e7c06b1224626e27eb8d9ab32e ]
Closing GPR port before graph close can result in un handled notifications from DSP, this results in spam of errors from GPR driver as there is no one to handle these notification at that point in time.
Fix this by closing GPR port after graph close is finished.
Fixes: 5477518b8a0e ("ASoC: qdsp6: audioreach: add q6apm support") Signed-off-by: Srinivas Kandagatla srinivas.kandagatla@linaro.org Link: https://lore.kernel.org/r/20230705131842.41584-1-srinivas.kandagatla@linaro.... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/qcom/qdsp6/q6apm.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/sound/soc/qcom/qdsp6/q6apm.c b/sound/soc/qcom/qdsp6/q6apm.c index a7a3f973eb6d5..cdebf209c8a55 100644 --- a/sound/soc/qcom/qdsp6/q6apm.c +++ b/sound/soc/qcom/qdsp6/q6apm.c @@ -446,6 +446,8 @@ static int graph_callback(struct gpr_resp_pkt *data, void *priv, int op)
switch (hdr->opcode) { case DATA_CMD_RSP_WR_SH_MEM_EP_DATA_BUFFER_DONE_V2: + if (!graph->ar_graph) + break; client_event = APM_CLIENT_EVENT_DATA_WRITE_DONE; mutex_lock(&graph->lock); token = hdr->token & APM_WRITE_TOKEN_MASK; @@ -479,6 +481,8 @@ static int graph_callback(struct gpr_resp_pkt *data, void *priv, int op) wake_up(&graph->cmd_wait); break; case DATA_CMD_RSP_RD_SH_MEM_EP_DATA_BUFFER_V2: + if (!graph->ar_graph) + break; client_event = APM_CLIENT_EVENT_DATA_READ_DONE; mutex_lock(&graph->lock); rd_done = data->payload; @@ -581,8 +585,9 @@ int q6apm_graph_close(struct q6apm_graph *graph) { struct audioreach_graph *ar_graph = graph->ar_graph;
- gpr_free_port(graph->port); + graph->ar_graph = NULL; kref_put(&ar_graph->refcount, q6apm_put_audioreach_graph); + gpr_free_port(graph->port); kfree(graph);
return 0;
From: Peter Zijlstra peterz@infradead.org
[ Upstream commit 719a937b7003933de1298ffa4b881dd6a234e244 ]
Extend commit 50f9a76ef127 ("iov_iter: Mark copy_compat_iovec_from_user() noinline") to also cover copy_iovec_from_user(). Different compiler versions cause the same problem on different functions.
lib/iov_iter.o: warning: objtool: .altinstr_replacement+0x1f: redundant UACCESS disable lib/iov_iter.o: warning: objtool: iovec_from_user+0x84: call to copy_iovec_from_user.part.0() with UACCESS enabled lib/iov_iter.o: warning: objtool: __import_iovec+0x143: call to copy_iovec_from_user.part.0() with UACCESS enabled
Fixes: 50f9a76ef127 ("iov_iter: Mark copy_compat_iovec_from_user() noinline") Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Tested-by: Borislav Petkov (AMD) bp@alien8.de Link: https://lkml.kernel.org/r/20230616124354.GD4253@hirez.programming.kicks-ass.... Signed-off-by: Sasha Levin sashal@kernel.org --- lib/iov_iter.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/lib/iov_iter.c b/lib/iov_iter.c index 960223ed91991..061cc3ed58f5b 100644 --- a/lib/iov_iter.c +++ b/lib/iov_iter.c @@ -1795,7 +1795,7 @@ static __noclone int copy_compat_iovec_from_user(struct iovec *iov, return ret; }
-static int copy_iovec_from_user(struct iovec *iov, +static __noclone int copy_iovec_from_user(struct iovec *iov, const struct iovec __user *uiov, unsigned long nr_segs) { int ret = -EFAULT;
From: Miaohe Lin linmiaohe@huawei.com
[ Upstream commit ae2ad293d6be143ad223f5f947cca07bcbe42595 ]
When checking whether a recently used CPU can be a potential idle candidate, recent_used_cpu should be used to test p->cpus_ptr as p->recent_used_cpu is not equal to recent_used_cpu and candidate decision is made based on recent_used_cpu here.
Fixes: 89aafd67f28c ("sched/fair: Use prev instead of new target as recent_used_cpu") Signed-off-by: Miaohe Lin linmiaohe@huawei.com Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Reviewed-by: Phil Auld pauld@redhat.com Acked-by: Mel Gorman mgorman@suse.de Link: https://lore.kernel.org/r/20230620080747.359122-1-linmiaohe@huawei.com Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/sched/fair.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index e427056b440bb..dacb56d7e9147 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -7174,7 +7174,7 @@ static int select_idle_sibling(struct task_struct *p, int prev, int target) recent_used_cpu != target && cpus_share_cache(recent_used_cpu, target) && (available_idle_cpu(recent_used_cpu) || sched_idle_cpu(recent_used_cpu)) && - cpumask_test_cpu(p->recent_used_cpu, p->cpus_ptr) && + cpumask_test_cpu(recent_used_cpu, p->cpus_ptr) && asym_fits_cpu(task_util, util_min, util_max, recent_used_cpu)) { return recent_used_cpu; }
From: Suren Baghdasaryan surenb@google.com
[ Upstream commit aff037078ecaecf34a7c2afab1341815f90fba5e ]
Destroying psi trigger in cgroup_file_release causes UAF issues when a cgroup is removed from under a polling process. This is happening because cgroup removal causes a call to cgroup_file_release while the actual file is still alive. Destroying the trigger at this point would also destroy its waitqueue head and if there is still a polling process on that file accessing the waitqueue, it will step on the freed pointer:
do_select vfs_poll do_rmdir cgroup_rmdir kernfs_drain_open_files cgroup_file_release cgroup_pressure_release psi_trigger_destroy wake_up_pollfree(&t->event_wait) // vfs_poll is unblocked synchronize_rcu kfree(t) poll_freewait -> UAF access to the trigger's waitqueue head
Patch [1] fixed this issue for epoll() case using wake_up_pollfree(), however the same issue exists for synchronous poll() case. The root cause of this issue is that the lifecycles of the psi trigger's waitqueue and of the file associated with the trigger are different. Fix this by using kernfs_generic_poll function when polling on cgroup-specific psi triggers. It internally uses kernfs_open_node->poll waitqueue head with its lifecycle tied to the file's lifecycle. This also renders the fix in [1] obsolete, so revert it.
[1] commit c2dbe32d5db5 ("sched/psi: Fix use-after-free in ep_remove_wait_queue()")
Fixes: 0e94682b73bf ("psi: introduce psi monitor") Closes: https://lore.kernel.org/all/20230613062306.101831-1-lujialin4@huawei.com/ Reported-by: Lu Jialin lujialin4@huawei.com Signed-off-by: Suren Baghdasaryan surenb@google.com Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Link: https://lkml.kernel.org/r/20230630005612.1014540-1-surenb@google.com Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/psi.h | 5 +++-- include/linux/psi_types.h | 3 +++ kernel/cgroup/cgroup.c | 2 +- kernel/sched/psi.c | 29 +++++++++++++++++++++-------- 4 files changed, 28 insertions(+), 11 deletions(-)
diff --git a/include/linux/psi.h b/include/linux/psi.h index ab26200c28033..e0745873e3f26 100644 --- a/include/linux/psi.h +++ b/include/linux/psi.h @@ -23,8 +23,9 @@ void psi_memstall_enter(unsigned long *flags); void psi_memstall_leave(unsigned long *flags);
int psi_show(struct seq_file *s, struct psi_group *group, enum psi_res res); -struct psi_trigger *psi_trigger_create(struct psi_group *group, - char *buf, enum psi_res res, struct file *file); +struct psi_trigger *psi_trigger_create(struct psi_group *group, char *buf, + enum psi_res res, struct file *file, + struct kernfs_open_file *of); void psi_trigger_destroy(struct psi_trigger *t);
__poll_t psi_trigger_poll(void **trigger_ptr, struct file *file, diff --git a/include/linux/psi_types.h b/include/linux/psi_types.h index 040c089581c6c..f1fd3a8044e0e 100644 --- a/include/linux/psi_types.h +++ b/include/linux/psi_types.h @@ -137,6 +137,9 @@ struct psi_trigger { /* Wait queue for polling */ wait_queue_head_t event_wait;
+ /* Kernfs file for cgroup triggers */ + struct kernfs_open_file *of; + /* Pending event flag */ int event;
diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c index 4d42f0cbc11ea..3299ec69ce0d1 100644 --- a/kernel/cgroup/cgroup.c +++ b/kernel/cgroup/cgroup.c @@ -3785,7 +3785,7 @@ static ssize_t pressure_write(struct kernfs_open_file *of, char *buf, }
psi = cgroup_psi(cgrp); - new = psi_trigger_create(psi, buf, res, of->file); + new = psi_trigger_create(psi, buf, res, of->file, of); if (IS_ERR(new)) { cgroup_put(cgrp); return PTR_ERR(new); diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c index e072f6b31bf30..80d8c10e93638 100644 --- a/kernel/sched/psi.c +++ b/kernel/sched/psi.c @@ -494,8 +494,12 @@ static u64 update_triggers(struct psi_group *group, u64 now, bool *update_total, continue;
/* Generate an event */ - if (cmpxchg(&t->event, 0, 1) == 0) - wake_up_interruptible(&t->event_wait); + if (cmpxchg(&t->event, 0, 1) == 0) { + if (t->of) + kernfs_notify(t->of->kn); + else + wake_up_interruptible(&t->event_wait); + } t->last_event_time = now; /* Reset threshold breach flag once event got generated */ t->pending_event = false; @@ -1272,8 +1276,9 @@ int psi_show(struct seq_file *m, struct psi_group *group, enum psi_res res) return 0; }
-struct psi_trigger *psi_trigger_create(struct psi_group *group, - char *buf, enum psi_res res, struct file *file) +struct psi_trigger *psi_trigger_create(struct psi_group *group, char *buf, + enum psi_res res, struct file *file, + struct kernfs_open_file *of) { struct psi_trigger *t; enum psi_states state; @@ -1333,7 +1338,9 @@ struct psi_trigger *psi_trigger_create(struct psi_group *group,
t->event = 0; t->last_event_time = 0; - init_waitqueue_head(&t->event_wait); + t->of = of; + if (!of) + init_waitqueue_head(&t->event_wait); t->pending_event = false; t->aggregator = privileged ? PSI_POLL : PSI_AVGS;
@@ -1390,7 +1397,10 @@ void psi_trigger_destroy(struct psi_trigger *t) * being accessed later. Can happen if cgroup is deleted from under a * polling process. */ - wake_up_pollfree(&t->event_wait); + if (t->of) + kernfs_notify(t->of->kn); + else + wake_up_interruptible(&t->event_wait);
if (t->aggregator == PSI_AVGS) { mutex_lock(&group->avgs_lock); @@ -1462,7 +1472,10 @@ __poll_t psi_trigger_poll(void **trigger_ptr, if (!t) return DEFAULT_POLLMASK | EPOLLERR | EPOLLPRI;
- poll_wait(file, &t->event_wait, wait); + if (t->of) + kernfs_generic_poll(t->of, wait); + else + poll_wait(file, &t->event_wait, wait);
if (cmpxchg(&t->event, 1, 0) == 1) ret |= EPOLLPRI; @@ -1532,7 +1545,7 @@ static ssize_t psi_write(struct file *file, const char __user *user_buf, return -EBUSY; }
- new = psi_trigger_create(&psi_system, buf, res, file); + new = psi_trigger_create(&psi_system, buf, res, file, NULL); if (IS_ERR(new)) { mutex_unlock(&seq->lock); return PTR_ERR(new);
From: Geert Uytterhoeven geert+renesas@glider.be
[ Upstream commit f46a0b47cc0829acd050213194c5a77351e619b2 ]
The eMMC and SDHI pin control configuration nodes in DT have subnodes with the same names ("data" and "ctrl"). As the RZ/V2M pin control driver considers only the names of the subnodes, this leads to conflicts:
pinctrl-rzv2m b6250000.pinctrl: pin P8_2 already requested by 85000000.mmc; cannot claim for 85020000.mmc pinctrl-rzv2m b6250000.pinctrl: pin-130 (85020000.mmc) status -22 renesas_sdhi_internal_dmac 85020000.mmc: Error applying setting, reverse things back
Fix this by constructing unique names from the node names of both the pin control configuration node and its child node, where appropriate.
Reported by: Fabrizio Castro fabrizio.castro.jz@renesas.com
Fixes: 92a9b825257614af ("pinctrl: renesas: Add RZ/V2M pin and gpio controller driver") Signed-off-by: Geert Uytterhoeven geert+renesas@glider.be Tested-by: Fabrizio Castro fabrizio.castro.jz@renesas.com Link: https://lore.kernel.org/r/607bd6ab4905b0b1b119a06ef953fa1184505777.168839671... Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pinctrl/renesas/pinctrl-rzv2m.c | 28 ++++++++++++++++++------- 1 file changed, 20 insertions(+), 8 deletions(-)
diff --git a/drivers/pinctrl/renesas/pinctrl-rzv2m.c b/drivers/pinctrl/renesas/pinctrl-rzv2m.c index e5472293bc7fb..35b23c1a5684d 100644 --- a/drivers/pinctrl/renesas/pinctrl-rzv2m.c +++ b/drivers/pinctrl/renesas/pinctrl-rzv2m.c @@ -209,6 +209,7 @@ static int rzv2m_map_add_config(struct pinctrl_map *map,
static int rzv2m_dt_subnode_to_map(struct pinctrl_dev *pctldev, struct device_node *np, + struct device_node *parent, struct pinctrl_map **map, unsigned int *num_maps, unsigned int *index) @@ -226,6 +227,7 @@ static int rzv2m_dt_subnode_to_map(struct pinctrl_dev *pctldev, struct property *prop; int ret, gsel, fsel; const char **pin_fn; + const char *name; const char *pin;
pinmux = of_find_property(np, "pinmux", NULL); @@ -309,8 +311,19 @@ static int rzv2m_dt_subnode_to_map(struct pinctrl_dev *pctldev, psel_val[i] = MUX_FUNC(value); }
+ if (parent) { + name = devm_kasprintf(pctrl->dev, GFP_KERNEL, "%pOFn.%pOFn", + parent, np); + if (!name) { + ret = -ENOMEM; + goto done; + } + } else { + name = np->name; + } + /* Register a single pin group listing all the pins we read from DT */ - gsel = pinctrl_generic_add_group(pctldev, np->name, pins, num_pinmux, NULL); + gsel = pinctrl_generic_add_group(pctldev, name, pins, num_pinmux, NULL); if (gsel < 0) { ret = gsel; goto done; @@ -320,17 +333,16 @@ static int rzv2m_dt_subnode_to_map(struct pinctrl_dev *pctldev, * Register a single group function where the 'data' is an array PSEL * register values read from DT. */ - pin_fn[0] = np->name; - fsel = pinmux_generic_add_function(pctldev, np->name, pin_fn, 1, - psel_val); + pin_fn[0] = name; + fsel = pinmux_generic_add_function(pctldev, name, pin_fn, 1, psel_val); if (fsel < 0) { ret = fsel; goto remove_group; }
maps[idx].type = PIN_MAP_TYPE_MUX_GROUP; - maps[idx].data.mux.group = np->name; - maps[idx].data.mux.function = np->name; + maps[idx].data.mux.group = name; + maps[idx].data.mux.function = name; idx++;
dev_dbg(pctrl->dev, "Parsed %pOF with %d pins\n", np, num_pinmux); @@ -377,7 +389,7 @@ static int rzv2m_dt_node_to_map(struct pinctrl_dev *pctldev, index = 0;
for_each_child_of_node(np, child) { - ret = rzv2m_dt_subnode_to_map(pctldev, child, map, + ret = rzv2m_dt_subnode_to_map(pctldev, child, np, map, num_maps, &index); if (ret < 0) { of_node_put(child); @@ -386,7 +398,7 @@ static int rzv2m_dt_node_to_map(struct pinctrl_dev *pctldev, }
if (*num_maps == 0) { - ret = rzv2m_dt_subnode_to_map(pctldev, np, map, + ret = rzv2m_dt_subnode_to_map(pctldev, np, NULL, map, num_maps, &index); if (ret < 0) goto done;
From: Biju Das biju.das.jz@bp.renesas.com
[ Upstream commit bfc374a145ae133613e05b9b89be561f169cb58d ]
Currently, sd1 and sd0 have unique subnode names 'sd1_mux' and 'sd0_mux'. If we change these to non-unique subnode names such as 'mux' this can lead to the below conflict as the RZ/G2L pin control driver considers only the names of the subnodes.
pinctrl-rzg2l 11030000.pinctrl: pin P47_0 already requested by 11c00000.mmc; cannot claim for 11c10000.mmc pinctrl-rzg2l 11030000.pinctrl: pin-376 (11c10000.mmc) status -22 pinctrl-rzg2l 11030000.pinctrl: could not request pin 376 (P47_0) from group mux on device pinctrl-rzg2l renesas_sdhi_internal_dmac 11c10000.mmc: Error applying setting, reverse things back
Fix this by constructing unique names from the node names of both the pin control configuration node and its child node, where appropriate.
Based on the work done by Geert for the RZ/V2M pinctrl driver.
Fixes: c4c4637eb57f ("pinctrl: renesas: Add RZ/G2L pin and gpio controller driver") Signed-off-by: Biju Das biju.das.jz@bp.renesas.com Reviewed-by: Geert Uytterhoeven geert+renesas@glider.be Link: https://lore.kernel.org/r/20230704111858.215278-1-biju.das.jz@bp.renesas.com Signed-off-by: Geert Uytterhoeven geert+renesas@glider.be Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pinctrl/renesas/pinctrl-rzg2l.c | 28 ++++++++++++++++++------- 1 file changed, 20 insertions(+), 8 deletions(-)
diff --git a/drivers/pinctrl/renesas/pinctrl-rzg2l.c b/drivers/pinctrl/renesas/pinctrl-rzg2l.c index 9511d920565e9..b53d26167da52 100644 --- a/drivers/pinctrl/renesas/pinctrl-rzg2l.c +++ b/drivers/pinctrl/renesas/pinctrl-rzg2l.c @@ -249,6 +249,7 @@ static int rzg2l_map_add_config(struct pinctrl_map *map,
static int rzg2l_dt_subnode_to_map(struct pinctrl_dev *pctldev, struct device_node *np, + struct device_node *parent, struct pinctrl_map **map, unsigned int *num_maps, unsigned int *index) @@ -266,6 +267,7 @@ static int rzg2l_dt_subnode_to_map(struct pinctrl_dev *pctldev, struct property *prop; int ret, gsel, fsel; const char **pin_fn; + const char *name; const char *pin;
pinmux = of_find_property(np, "pinmux", NULL); @@ -349,8 +351,19 @@ static int rzg2l_dt_subnode_to_map(struct pinctrl_dev *pctldev, psel_val[i] = MUX_FUNC(value); }
+ if (parent) { + name = devm_kasprintf(pctrl->dev, GFP_KERNEL, "%pOFn.%pOFn", + parent, np); + if (!name) { + ret = -ENOMEM; + goto done; + } + } else { + name = np->name; + } + /* Register a single pin group listing all the pins we read from DT */ - gsel = pinctrl_generic_add_group(pctldev, np->name, pins, num_pinmux, NULL); + gsel = pinctrl_generic_add_group(pctldev, name, pins, num_pinmux, NULL); if (gsel < 0) { ret = gsel; goto done; @@ -360,17 +373,16 @@ static int rzg2l_dt_subnode_to_map(struct pinctrl_dev *pctldev, * Register a single group function where the 'data' is an array PSEL * register values read from DT. */ - pin_fn[0] = np->name; - fsel = pinmux_generic_add_function(pctldev, np->name, pin_fn, 1, - psel_val); + pin_fn[0] = name; + fsel = pinmux_generic_add_function(pctldev, name, pin_fn, 1, psel_val); if (fsel < 0) { ret = fsel; goto remove_group; }
maps[idx].type = PIN_MAP_TYPE_MUX_GROUP; - maps[idx].data.mux.group = np->name; - maps[idx].data.mux.function = np->name; + maps[idx].data.mux.group = name; + maps[idx].data.mux.function = name; idx++;
dev_dbg(pctrl->dev, "Parsed %pOF with %d pins\n", np, num_pinmux); @@ -417,7 +429,7 @@ static int rzg2l_dt_node_to_map(struct pinctrl_dev *pctldev, index = 0;
for_each_child_of_node(np, child) { - ret = rzg2l_dt_subnode_to_map(pctldev, child, map, + ret = rzg2l_dt_subnode_to_map(pctldev, child, np, map, num_maps, &index); if (ret < 0) { of_node_put(child); @@ -426,7 +438,7 @@ static int rzg2l_dt_node_to_map(struct pinctrl_dev *pctldev, }
if (*num_maps == 0) { - ret = rzg2l_dt_subnode_to_map(pctldev, np, map, + ret = rzg2l_dt_subnode_to_map(pctldev, np, NULL, map, num_maps, &index); if (ret < 0) goto done;
From: Jonas Gorski jonas.gorski@gmail.com
[ Upstream commit 5158814cbb37bbb38344b3ecddc24ba2ed0365f2 ]
The command word is defined as following:
/* Command */ #define SPI_CMD_COMMAND_SHIFT 0 #define SPI_CMD_DEVICE_ID_SHIFT 4 #define SPI_CMD_PREPEND_BYTE_CNT_SHIFT 8 #define SPI_CMD_ONE_BYTE_SHIFT 11 #define SPI_CMD_ONE_WIRE_SHIFT 12
If the prepend byte count field starts at bit 8, and the next defined bit is SPI_CMD_ONE_BYTE at bit 11, it can be at most 3 bits wide, and thus the max value is 7, not 15.
Fixes: b17de076062a ("spi/bcm63xx: work around inability to keep CS up") Signed-off-by: Jonas Gorski jonas.gorski@gmail.com Link: https://lore.kernel.org/r/20230629071453.62024-1-jonas.gorski@gmail.com Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/spi/spi-bcm63xx.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/spi/spi-bcm63xx.c b/drivers/spi/spi-bcm63xx.c index 9aecb77c3d892..07b5b71b23520 100644 --- a/drivers/spi/spi-bcm63xx.c +++ b/drivers/spi/spi-bcm63xx.c @@ -126,7 +126,7 @@ enum bcm63xx_regs_spi { SPI_MSG_DATA_SIZE, };
-#define BCM63XX_SPI_MAX_PREPEND 15 +#define BCM63XX_SPI_MAX_PREPEND 7
#define BCM63XX_SPI_MAX_CS 8 #define BCM63XX_SPI_BUS_NUM 0
From: Martin Kaiser martin@kaiser.cx
[ Upstream commit 4e47382fbca916d7db95cbf9e2d7ca2e9d1ca3fe ]
Warn about invalid var->left_margin or var->right_margin. Their values are read from the device tree.
We store var->left_margin-3 and var->right_margin-1 in register fields. These fields should be >= 0.
Fixes: 7e8549bcee00 ("imxfb: Fix margin settings") Signed-off-by: Martin Kaiser martin@kaiser.cx Signed-off-by: Helge Deller deller@gmx.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/video/fbdev/imxfb.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/video/fbdev/imxfb.c b/drivers/video/fbdev/imxfb.c index adf36690c342b..5fbcb78a9caee 100644 --- a/drivers/video/fbdev/imxfb.c +++ b/drivers/video/fbdev/imxfb.c @@ -613,10 +613,10 @@ static int imxfb_activate_var(struct fb_var_screeninfo *var, struct fb_info *inf if (var->hsync_len < 1 || var->hsync_len > 64) printk(KERN_ERR "%s: invalid hsync_len %d\n", info->fix.id, var->hsync_len); - if (var->left_margin > 255) + if (var->left_margin < 3 || var->left_margin > 255) printk(KERN_ERR "%s: invalid left_margin %d\n", info->fix.id, var->left_margin); - if (var->right_margin > 255) + if (var->right_margin < 1 || var->right_margin > 255) printk(KERN_ERR "%s: invalid right_margin %d\n", info->fix.id, var->right_margin); if (var->yres < 1 || var->yres > ymax_mask)
From: Yangtao Li frank.li@vivo.com
[ Upstream commit 45fcc058a75bf5d65cf4c32da44a252fbe873cd4 ]
Remove unnecessary release_mem_region from the error path to prevent mem region from being released twice, which could avoid resource leak or other unexpected issues.
Fixes: b083c22d5114 ("video: fbdev: imxfb: Convert request_mem_region + ioremap to devm_ioremap_resource") Signed-off-by: Yangtao Li frank.li@vivo.com Signed-off-by: Helge Deller deller@gmx.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/video/fbdev/imxfb.c | 1 - 1 file changed, 1 deletion(-)
diff --git a/drivers/video/fbdev/imxfb.c b/drivers/video/fbdev/imxfb.c index 5fbcb78a9caee..c8b1c73412d36 100644 --- a/drivers/video/fbdev/imxfb.c +++ b/drivers/video/fbdev/imxfb.c @@ -1043,7 +1043,6 @@ static int imxfb_probe(struct platform_device *pdev) failed_map: failed_ioremap: failed_getclock: - release_mem_region(res->start, resource_size(res)); failed_of_parse: kfree(info->pseudo_palette); failed_init:
From: James Clark james.clark@arm.com
[ Upstream commit 1feece2780ac2f8de45177fe53979726cee4b3d1 ]
-L only specifies the search path for libraries directly provided in the link line with -l. Because -lopencsd isn't specified, it's only linked because it's a dependency of -lopencsd_c_api. Dependencies like this are resolved using the default system search paths or -rpath-link=... rather than -L. This means that compilation only works if OpenCSD is installed to the system rather than provided with the CSLIBS (-L) option.
This could be fixed by adding -Wl,-rpath-link=$(CSLIBS) but that is less conventional than just adding -lopencsd to the link line so that it uses -L. -lopencsd seems to have been removed in commit ed17b1914978eddb ("perf tools: Drop requirement for libstdc++.so for libopencsd check") because it was thought that there was a chance compilation would work even if it didn't exist, but I think that only applies to libstdc++ so there is no harm to add it back. libopencsd.so and libopencsd_c_api.so would always exist together.
Testing =======
The following scenarios now all work:
* Cross build with OpenCSD installed * Cross build using CSLIBS=... * Native build with OpenCSD installed * Native build using CSLIBS=... * Static cross build with OpenCSD installed * Static cross build with CSLIBS=...
Committer testing:
⬢[acme@toolbox perf-tools]$ alias m alias m='make -k BUILD_BPF_SKEL=1 CORESIGHT=1 O=/tmp/build/perf-tools -C tools/perf install-bin && git status && perf test python ; perf record -o /dev/null sleep 0.01 ; perf stat --null sleep 0.01' ⬢[acme@toolbox perf-tools]$ ldd ~/bin/perf | grep csd libopencsd_c_api.so.1 => /lib64/libopencsd_c_api.so.1 (0x00007fd49c44e000) libopencsd.so.1 => /lib64/libopencsd.so.1 (0x00007fd49bd56000) ⬢[acme@toolbox perf-tools]$ cat /etc/redhat-release Fedora release 36 (Thirty Six) ⬢[acme@toolbox perf-tools]$
Fixes: ed17b1914978eddb ("perf tools: Drop requirement for libstdc++.so for libopencsd check") Reported-by: Radhey Shyam Pandey radhey.shyam.pandey@amd.com Signed-off-by: James Clark james.clark@arm.com Tested-by: Arnaldo Carvalho de Melo acme@redhat.com Tested-by: Radhey Shyam Pandey radhey.shyam.pandey@amd.com Cc: Adrian Hunter adrian.hunter@intel.com Cc: Alexander Shishkin alexander.shishkin@linux.intel.com Cc: Ian Rogers irogers@google.com Cc: Ingo Molnar mingo@redhat.com Cc: Jiri Olsa jolsa@kernel.org Cc: Mark Rutland mark.rutland@arm.com Cc: Namhyung Kim namhyung@kernel.org Cc: Peter Zijlstra peterz@infradead.org Cc: Uwe Kleine-König uwe@kleine-koenig.org Cc: coresight@lists.linaro.org Closes: https://lore.kernel.org/linux-arm-kernel/56905d7a-a91e-883a-b707-9d5f686ba5f... Link: https://lore.kernel.org/all/36cc4dc6-bf4b-1093-1c0a-876e368af183@kleine-koen... Link: https://lore.kernel.org/r/20230707154546.456720-1-james.clark@arm.com Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/perf/Makefile.config | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/tools/perf/Makefile.config b/tools/perf/Makefile.config index a794d9eca93d8..72f068682c9a2 100644 --- a/tools/perf/Makefile.config +++ b/tools/perf/Makefile.config @@ -155,9 +155,9 @@ FEATURE_CHECK_LDFLAGS-libcrypto = -lcrypto ifdef CSINCLUDES LIBOPENCSD_CFLAGS := -I$(CSINCLUDES) endif -OPENCSDLIBS := -lopencsd_c_api +OPENCSDLIBS := -lopencsd_c_api -lopencsd ifeq ($(findstring -static,${LDFLAGS}),-static) - OPENCSDLIBS += -lopencsd -lstdc++ + OPENCSDLIBS += -lstdc++ endif ifdef CSLIBS LIBOPENCSD_LDFLAGS := -L$(CSLIBS)
From: Christoph Hellwig hch@lst.de
[ Upstream commit 4e7de35eb7d1a1d4f2dda15f39fbedd4798a0b8d ]
The mirror_num_ret is allowed to be NULL, although it has to be set when smap is set. Unfortunately that is not a well enough specifiable invariant for static type checkers, so add a NULL check to make sure they are fine.
Fixes: 03793cbbc80f ("btrfs: add fast path for single device io in __btrfs_map_block") Reported-by: Dan Carpenter dan.carpenter@linaro.org Reviewed-by: Qu Wenruo wqu@suse.com Reviewed-by: Johannes Thumshirn johannes.thumshirn@wdc.com Signed-off-by: Christoph Hellwig hch@lst.de Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/btrfs/volumes.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index 5ec000813f047..436e15e3759da 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -6399,7 +6399,8 @@ int __btrfs_map_block(struct btrfs_fs_info *fs_info, enum btrfs_map_op op, (!need_full_stripe(op) || !dev_replace_is_ongoing || !dev_replace->tgtdev)) { set_io_stripe(smap, map, stripe_index, stripe_offset, stripe_nr); - *mirror_num_ret = mirror_num; + if (mirror_num_ret) + *mirror_num_ret = mirror_num; *bioc_ret = NULL; ret = 0; goto out;
From: Jaewon Kim jaewon02.kim@samsung.com
[ Upstream commit 9ec3c5517e22a12d2ff1b71e844f7913641460c6 ]
When SPI loopback transfer is performed, S3C64XX_SPI_MODE_SELF_LOOPBACK bit still remained. It works as loopback even if the next transfer is not spi loopback mode. If not SPI_LOOP, needs to clear S3C64XX_SPI_MODE_SELF_LOOPBACK bit.
Signed-off-by: Jaewon Kim jaewon02.kim@samsung.com Fixes: ffb7bcd3b27e ("spi: s3c64xx: support loopback mode") Reviewed-by: Chanho Park chanho61.park@samsung.com Link: https://lore.kernel.org/r/20230711082020.138165-1-jaewon02.kim@samsung.com Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/spi/spi-s3c64xx.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/spi/spi-s3c64xx.c b/drivers/spi/spi-s3c64xx.c index 7ac17f0d18a95..1a8b31e20baf2 100644 --- a/drivers/spi/spi-s3c64xx.c +++ b/drivers/spi/spi-s3c64xx.c @@ -668,6 +668,8 @@ static int s3c64xx_spi_config(struct s3c64xx_spi_driver_data *sdd)
if ((sdd->cur_mode & SPI_LOOP) && sdd->port_conf->has_loopback) val |= S3C64XX_SPI_MODE_SELF_LOOPBACK; + else + val &= ~S3C64XX_SPI_MODE_SELF_LOOPBACK;
writel(val, regs + S3C64XX_SPI_MODE_CFG);
From: Yonghong Song yhs@fb.com
[ Upstream commit 8cc32a9bbf2934d90762d9de0187adcb5ad46a11 ]
Commit 6eb4bd92c1ce ("kallsyms: strip LTO suffixes from static functions") stripped all function/variable suffixes started with '.' regardless of whether those suffixes are generated at LTO mode or not. In fact, as far as I know, in LTO mode, when a static function/variable is promoted to the global scope, '.llvm.<...>' suffix is added.
The existing mechanism breaks live patch for a LTO kernel even if no <symbol>.llvm.<...> symbols are involved. For example, for the following kernel symbols: $ grep bpf_verifier_vlog /proc/kallsyms ffffffff81549f60 t bpf_verifier_vlog ffffffff8268b430 d bpf_verifier_vlog._entry ffffffff8282a958 d bpf_verifier_vlog._entry_ptr ffffffff82e12a1f d bpf_verifier_vlog.__already_done 'bpf_verifier_vlog' is a static function. '_entry', '_entry_ptr' and '__already_done' are static variables used inside 'bpf_verifier_vlog', so llvm promotes them to file-level static with prefix 'bpf_verifier_vlog.'. Note that the func-level to file-level static function promotion also happens without LTO.
Given a symbol name 'bpf_verifier_vlog', with LTO kernel, current mechanism will return 4 symbols to live patch subsystem which current live patching subsystem cannot handle it. With non-LTO kernel, only one symbol is returned.
In [1], we have a lengthy discussion, the suggestion is to separate two cases: (1). new symbols with suffix which are generated regardless of whether LTO is enabled or not, and (2). new symbols with suffix generated only when LTO is enabled.
The cleanup_symbol_name() should only remove suffixes for case (2). Case (1) should not be changed so it can work uniformly with or without LTO.
This patch removed LTO-only suffix '.llvm.<...>' so live patching and tracing should work the same way for non-LTO kernel. The cleanup_symbol_name() in scripts/kallsyms.c is also changed to have the same filtering pattern so both kernel and kallsyms tool have the same expectation on the order of symbols.
[1] https://lore.kernel.org/live-patching/20230615170048.2382735-1-song@kernel.o...
Fixes: 6eb4bd92c1ce ("kallsyms: strip LTO suffixes from static functions") Reported-by: Song Liu song@kernel.org Signed-off-by: Yonghong Song yhs@fb.com Reviewed-by: Zhen Lei thunder.leizhen@huawei.com Reviewed-by: Nick Desaulniers ndesaulniers@google.com Acked-by: Song Liu song@kernel.org Link: https://lore.kernel.org/r/20230628181926.4102448-1-yhs@fb.com Signed-off-by: Kees Cook keescook@chromium.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/kallsyms.c | 5 ++--- scripts/kallsyms.c | 6 +++--- 2 files changed, 5 insertions(+), 6 deletions(-)
diff --git a/kernel/kallsyms.c b/kernel/kallsyms.c index 77747391f49b6..4874508bb950e 100644 --- a/kernel/kallsyms.c +++ b/kernel/kallsyms.c @@ -174,11 +174,10 @@ static bool cleanup_symbol_name(char *s) * LLVM appends various suffixes for local functions and variables that * must be promoted to global scope as part of LTO. This can break * hooking of static functions with kprobes. '.' is not a valid - * character in an identifier in C. Suffixes observed: + * character in an identifier in C. Suffixes only in LLVM LTO observed: * - foo.llvm.[0-9a-f]+ - * - foo.[0-9a-f]+ */ - res = strchr(s, '.'); + res = strstr(s, ".llvm."); if (res) { *res = '\0'; return true; diff --git a/scripts/kallsyms.c b/scripts/kallsyms.c index 0d2db41177b23..13af6d0ff845d 100644 --- a/scripts/kallsyms.c +++ b/scripts/kallsyms.c @@ -346,10 +346,10 @@ static void cleanup_symbol_name(char *s) * ASCII[_] = 5f * ASCII[a-z] = 61,7a * - * As above, replacing '.' with '\0' does not affect the main sorting, - * but it helps us with subsorting. + * As above, replacing the first '.' in ".llvm." with '\0' does not + * affect the main sorting, but it helps us with subsorting. */ - p = strchr(s, '.'); + p = strstr(s, ".llvm."); if (p) *p = '\0'; }
From: Paulo Alcantara pc@manguebit.com
[ Upstream commit bf99f6be2d20146942bce6f9e90a0ceef12cbc1e ]
Use new cifs_smb_ses_inc_refcount() helper to get an active reference of @ses and @ses->dfs_root_ses (if set). This will prevent @ses->dfs_root_ses of being put in the next call to cifs_put_smb_ses() and thus potentially causing an use-after-free bug.
Fixes: 8e3554150d6c ("cifs: fix sharing of DFS connections") Signed-off-by: Paulo Alcantara (SUSE) pc@manguebit.com Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/smb/client/dfs.c | 26 ++++++++++---------------- fs/smb/client/smb2transport.c | 2 +- 2 files changed, 11 insertions(+), 17 deletions(-)
diff --git a/fs/smb/client/dfs.c b/fs/smb/client/dfs.c index 26d14dd0482ef..cf83617236d8b 100644 --- a/fs/smb/client/dfs.c +++ b/fs/smb/client/dfs.c @@ -66,6 +66,12 @@ static int get_session(struct cifs_mount_ctx *mnt_ctx, const char *full_path) return rc; }
+/* + * Track individual DFS referral servers used by new DFS mount. + * + * On success, their lifetime will be shared by final tcon (dfs_ses_list). + * Otherwise, they will be put by dfs_put_root_smb_sessions() in cifs_mount(). + */ static int add_root_smb_session(struct cifs_mount_ctx *mnt_ctx) { struct smb3_fs_context *ctx = mnt_ctx->fs_ctx; @@ -80,11 +86,12 @@ static int add_root_smb_session(struct cifs_mount_ctx *mnt_ctx) INIT_LIST_HEAD(&root_ses->list);
spin_lock(&cifs_tcp_ses_lock); - ses->ses_count++; + cifs_smb_ses_inc_refcount(ses); spin_unlock(&cifs_tcp_ses_lock); root_ses->ses = ses; list_add_tail(&root_ses->list, &mnt_ctx->dfs_ses_list); } + /* Select new DFS referral server so that new referrals go through it */ ctx->dfs_root_ses = ses; return 0; } @@ -244,7 +251,6 @@ static int __dfs_mount_share(struct cifs_mount_ctx *mnt_ctx) int dfs_mount_share(struct cifs_mount_ctx *mnt_ctx, bool *isdfs) { struct smb3_fs_context *ctx = mnt_ctx->fs_ctx; - struct cifs_ses *ses; bool nodfs = ctx->nodfs; int rc;
@@ -278,20 +284,8 @@ int dfs_mount_share(struct cifs_mount_ctx *mnt_ctx, bool *isdfs) }
*isdfs = true; - /* - * Prevent DFS root session of being put in the first call to - * cifs_mount_put_conns(). If another DFS root server was not found - * while chasing the referrals (@ctx->dfs_root_ses == @ses), then we - * can safely put extra refcount of @ses. - */ - ses = mnt_ctx->ses; - mnt_ctx->ses = NULL; - mnt_ctx->server = NULL; - rc = __dfs_mount_share(mnt_ctx); - if (ses == ctx->dfs_root_ses) - cifs_put_smb_ses(ses); - - return rc; + add_root_smb_session(mnt_ctx); + return __dfs_mount_share(mnt_ctx); }
/* Update dfs referral path of superblock */ diff --git a/fs/smb/client/smb2transport.c b/fs/smb/client/smb2transport.c index 22954a9c7a6c7..355e8700530fc 100644 --- a/fs/smb/client/smb2transport.c +++ b/fs/smb/client/smb2transport.c @@ -159,7 +159,7 @@ smb2_find_smb_ses_unlocked(struct TCP_Server_Info *server, __u64 ses_id) spin_unlock(&ses->ses_lock); continue; } - ++ses->ses_count; + cifs_smb_ses_inc_refcount(ses); spin_unlock(&ses->ses_lock); return ses; }
From: Marc Zyngier maz@kernel.org
[ Upstream commit 55b87b74996383230586f4f9f801ae304c70e649 ]
The HFGxTR_EL2 fields do not always follow the naming described in the spec, nor do they match the name of the register they trap in the rest of the kernel.
It is a bit sad that they were written by hand despite the availability of a machine readable version...
Fixes: cc077e7facbe ("arm64/sysreg: Convert HFG[RW]TR_EL2 to automatic generation") Signed-off-by: Marc Zyngier maz@kernel.org Cc: Mark Brown broonie@kernel.org Cc: Will Deacon will@kernel.org Cc: Catalin Marinas catalin.marinas@arm.org Cc: Mark Rutland mark.rutland@arm.com Reviewed-by: Mark Brown broonie@kernel.org Link: https://lore.kernel.org/r/20230703130416.1495307-1-maz@kernel.org Signed-off-by: Will Deacon will@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm64/tools/sysreg | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/arch/arm64/tools/sysreg b/arch/arm64/tools/sysreg index c9a0d1fa32090..930c8cc0812fc 100644 --- a/arch/arm64/tools/sysreg +++ b/arch/arm64/tools/sysreg @@ -1890,7 +1890,7 @@ Field 0 SM EndSysreg
SysregFields HFGxTR_EL2 -Field 63 nAMIAIR2_EL1 +Field 63 nAMAIR2_EL1 Field 62 nMAIR2_EL1 Field 61 nS2POR_EL1 Field 60 nPOR_EL1 @@ -1905,9 +1905,9 @@ Field 52 nGCS_EL0 Res0 51 Field 50 nACCDATA_EL1 Field 49 ERXADDR_EL1 -Field 48 EXRPFGCDN_EL1 -Field 47 EXPFGCTL_EL1 -Field 46 EXPFGF_EL1 +Field 48 ERXPFGCDN_EL1 +Field 47 ERXPFGCTL_EL1 +Field 46 ERXPFGF_EL1 Field 45 ERXMISCn_EL1 Field 44 ERXSTATUS_EL1 Field 43 ERXCTLR_EL1 @@ -1922,8 +1922,8 @@ Field 35 TPIDR_EL0 Field 34 TPIDRRO_EL0 Field 33 TPIDR_EL1 Field 32 TCR_EL1 -Field 31 SCTXNUM_EL0 -Field 30 SCTXNUM_EL1 +Field 31 SCXTNUM_EL0 +Field 30 SCXTNUM_EL1 Field 29 SCTLR_EL1 Field 28 REVIDR_EL1 Field 27 PAR_EL1
From: Linus Walleij linus.walleij@linaro.org
[ Upstream commit 95ce158b6c93b28842b54b42ad1cb221b9844062 ]
I get sporadic timeouts from the driver when using the MV88E6352. Reading the status again after the loop fixes the problem: the operation is successful but goes undetected.
Some added prints show things like this:
[ 58.356209] mv88e6085 mdio_mux-0.1:00: Timeout while waiting for switch, addr 1b reg 0b, mask 8000, val 0000, data c000 [ 58.367487] mv88e6085 mdio_mux-0.1:00: Timeout waiting for ATU op 4000, fid 0001 (...) [ 61.826293] mv88e6085 mdio_mux-0.1:00: Timeout while waiting for switch, addr 1c reg 18, mask 8000, val 0000, data 9860 [ 61.837560] mv88e6085 mdio_mux-0.1:00: Timeout waiting for PHY command 1860 to complete
The reason is probably not the commands: I think those are mostly fine with the 50+50ms timeout, but the problem appears when OpenWrt brings up several interfaces in parallel on a system with 7 populated ports: if one of them take more than 50 ms and waits one or more of the others can get stuck on the mutex for the switch and then this can easily multiply.
As we sleep and wait, the function loop needs a final check after exiting the loop if we were successful.
Suggested-by: Andrew Lunn andrew@lunn.ch Cc: Tobias Waldekranz tobias@waldekranz.com Fixes: 35da1dfd9484 ("net: dsa: mv88e6xxx: Improve performance of busy bit polling") Signed-off-by: Linus Walleij linus.walleij@linaro.org Reviewed-by: Andrew Lunn andrew@lunn.ch Link: https://lore.kernel.org/r/20230712223405.861899-1-linus.walleij@linaro.org Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/dsa/mv88e6xxx/chip.c | 7 +++++++ 1 file changed, 7 insertions(+)
diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c index 08a46ffd53af9..642e93e8623eb 100644 --- a/drivers/net/dsa/mv88e6xxx/chip.c +++ b/drivers/net/dsa/mv88e6xxx/chip.c @@ -109,6 +109,13 @@ int mv88e6xxx_wait_mask(struct mv88e6xxx_chip *chip, int addr, int reg, usleep_range(1000, 2000); }
+ err = mv88e6xxx_read(chip, addr, reg, &data); + if (err) + return err; + + if ((data & mask) == val) + return 0; + dev_err(chip->dev, "Timeout while waiting for switch\n"); return -ETIMEDOUT; }
From: Tanmay Patil t-patil@ti.com
[ Upstream commit b685f1a58956fa36cc01123f253351b25bfacfda ]
CPSW ALE has 75 bit ALE entries which are stored within three 32 bit words. The cpsw_ale_get_field() and cpsw_ale_set_field() functions assume that the field will be strictly contained within one word. However, this is not guaranteed to be the case and it is possible for ALE field entries to span across up to two words at the most.
Fix the methods to handle getting/setting fields spanning up to two words.
Fixes: db82173f23c5 ("netdev: driver: ethernet: add cpsw address lookup engine support") Signed-off-by: Tanmay Patil t-patil@ti.com [s-vadapalli@ti.com: rephrased commit message and added Fixes tag] Signed-off-by: Siddharth Vadapalli s-vadapalli@ti.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/ti/cpsw_ale.c | 24 +++++++++++++++++++----- 1 file changed, 19 insertions(+), 5 deletions(-)
diff --git a/drivers/net/ethernet/ti/cpsw_ale.c b/drivers/net/ethernet/ti/cpsw_ale.c index 0c5e783e574c4..64bf22cd860c9 100644 --- a/drivers/net/ethernet/ti/cpsw_ale.c +++ b/drivers/net/ethernet/ti/cpsw_ale.c @@ -106,23 +106,37 @@ struct cpsw_ale_dev_id {
static inline int cpsw_ale_get_field(u32 *ale_entry, u32 start, u32 bits) { - int idx; + int idx, idx2; + u32 hi_val = 0;
idx = start / 32; + idx2 = (start + bits - 1) / 32; + /* Check if bits to be fetched exceed a word */ + if (idx != idx2) { + idx2 = 2 - idx2; /* flip */ + hi_val = ale_entry[idx2] << ((idx2 * 32) - start); + } start -= idx * 32; idx = 2 - idx; /* flip */ - return (ale_entry[idx] >> start) & BITMASK(bits); + return (hi_val + (ale_entry[idx] >> start)) & BITMASK(bits); }
static inline void cpsw_ale_set_field(u32 *ale_entry, u32 start, u32 bits, u32 value) { - int idx; + int idx, idx2;
value &= BITMASK(bits); - idx = start / 32; + idx = start / 32; + idx2 = (start + bits - 1) / 32; + /* Check if bits to be set exceed a word */ + if (idx != idx2) { + idx2 = 2 - idx2; /* flip */ + ale_entry[idx2] &= ~(BITMASK(bits + start - (idx2 * 32))); + ale_entry[idx2] |= (value >> ((idx2 * 32) - start)); + } start -= idx * 32; - idx = 2 - idx; /* flip */ + idx = 2 - idx; /* flip */ ale_entry[idx] &= ~(BITMASK(bits) << start); ale_entry[idx] |= (value << start); }
From: Kuniyuki Iwashima kuniyu@amazon.com
[ Upstream commit 56a16035bb6effb37177867cea94c13a8382f745 ]
When we create an L2 loop on a bridge in netns, we will see packets storm even if STP is enabled.
# unshare -n # ip link add br0 type bridge # ip link add veth0 type veth peer name veth1 # ip link set veth0 master br0 up # ip link set veth1 master br0 up # ip link set br0 type bridge stp_state 1 # ip link set br0 up # sleep 30 # ip -s link show br0 2: br0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000 link/ether b6:61:98:1c:1c:b5 brd ff:ff:ff:ff:ff:ff RX: bytes packets errors dropped missed mcast 956553768 12861249 0 0 0 12861249 <-. Keep TX: bytes packets errors dropped carrier collsns | increasing 1027834 11951 0 0 0 0 <-' rapidly
This is because llc_rcv() drops all packets in non-root netns and BPDU is dropped.
Let's add extack warning when enabling STP in netns.
# unshare -n # ip link add br0 type bridge # ip link set br0 type bridge stp_state 1 Warning: bridge: STP does not work in non-root netns.
Note this commit will be reverted later when we namespacify the whole LLC infra.
Fixes: e730c15519d0 ("[NET]: Make packet reception network namespace safe") Suggested-by: Harry Coin hcoin@quietfountain.com Link: https://lore.kernel.org/netdev/0f531295-e289-022d-5add-5ceffa0df9bc@quietfou... Suggested-by: Ido Schimmel idosch@idosch.org Signed-off-by: Kuniyuki Iwashima kuniyu@amazon.com Acked-by: Nikolay Aleksandrov razor@blackwall.org Reviewed-by: Ido Schimmel idosch@nvidia.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/bridge/br_stp_if.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/net/bridge/br_stp_if.c b/net/bridge/br_stp_if.c index 75204d36d7f90..b65962682771f 100644 --- a/net/bridge/br_stp_if.c +++ b/net/bridge/br_stp_if.c @@ -201,6 +201,9 @@ int br_stp_set_enabled(struct net_bridge *br, unsigned long val, { ASSERT_RTNL();
+ if (!net_eq(dev_net(br->dev), &init_net)) + NL_SET_ERR_MSG_MOD(extack, "STP does not work in non-root netns"); + if (br_mrp_enabled(br)) { NL_SET_ERR_MSG_MOD(extack, "STP can't be enabled if MRP is already enabled");
From: Daniel Golle daniel@makrotopia.org
[ Upstream commit 1d6d537dc55d1f42d16290f00157ac387985b95b ]
Move the call to of_get_ethdev_address to mtk_add_mac which is part of the probe function and can hence itself return -EPROBE_DEFER should of_get_ethdev_address return -EPROBE_DEFER. This allows us to entirely get rid of the mtk_init function.
The problem of of_get_ethdev_address returning -EPROBE_DEFER surfaced in situations in which the NVMEM provider holding the MAC address has not yet be loaded at the time mtk_eth_soc is initially probed. In this case probing of mtk_eth_soc should be deferred instead of falling back to use a random MAC address, so once the NVMEM provider becomes available probing can be repeated.
Fixes: 656e705243fd ("net-next: mediatek: add support for MT7623 ethernet") Signed-off-by: Daniel Golle daniel@makrotopia.org Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/mediatek/mtk_eth_soc.c | 29 ++++++++------------- 1 file changed, 11 insertions(+), 18 deletions(-)
diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c b/drivers/net/ethernet/mediatek/mtk_eth_soc.c index 834c644b67db5..2d15342c260ae 100644 --- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c +++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c @@ -3846,23 +3846,6 @@ static int mtk_hw_deinit(struct mtk_eth *eth) return 0; }
-static int __init mtk_init(struct net_device *dev) -{ - struct mtk_mac *mac = netdev_priv(dev); - struct mtk_eth *eth = mac->hw; - int ret; - - ret = of_get_ethdev_address(mac->of_node, dev); - if (ret) { - /* If the mac address is invalid, use random mac address */ - eth_hw_addr_random(dev); - dev_err(eth->dev, "generated random MAC address %pM\n", - dev->dev_addr); - } - - return 0; -} - static void mtk_uninit(struct net_device *dev) { struct mtk_mac *mac = netdev_priv(dev); @@ -4278,7 +4261,6 @@ static const struct ethtool_ops mtk_ethtool_ops = { };
static const struct net_device_ops mtk_netdev_ops = { - .ndo_init = mtk_init, .ndo_uninit = mtk_uninit, .ndo_open = mtk_open, .ndo_stop = mtk_stop, @@ -4340,6 +4322,17 @@ static int mtk_add_mac(struct mtk_eth *eth, struct device_node *np) mac->hw = eth; mac->of_node = np;
+ err = of_get_ethdev_address(mac->of_node, eth->netdev[id]); + if (err == -EPROBE_DEFER) + return err; + + if (err) { + /* If the mac address is invalid, use random mac address */ + eth_hw_addr_random(eth->netdev[id]); + dev_err(eth->dev, "generated random MAC address %pM\n", + eth->netdev[id]->dev_addr); + } + memset(mac->hwlro_ip, 0, sizeof(mac->hwlro_ip)); mac->hwlro_ip_cnt = 0;
From: Yan Zhai yan@cloudflare.com
[ Upstream commit 9840036786d90cea11a90d1f30b6dc003b34ee67 ]
Commit 1fd54773c267 ("udp: allow header check for dodgy GSO_UDP_L4 packets.") checks DODGY bit for UDP, but for packets that can be fed directly to the device after gso_segs reset, it actually falls through to fragmentation:
https://lore.kernel.org/all/CAJPywTKDdjtwkLVUW6LRA2FU912qcDmQOQGt2WaDo28KzYD...
This change restores the expected behavior of GSO_UDP_L4 packets.
Fixes: 1fd54773c267 ("udp: allow header check for dodgy GSO_UDP_L4 packets.") Suggested-by: Willem de Bruijn willemdebruijn.kernel@gmail.com Signed-off-by: Yan Zhai yan@cloudflare.com Reviewed-by: Willem de Bruijn willemb@google.com Acked-by: Jason Wang jasowang@redhat.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/ipv4/udp_offload.c | 16 +++++++++++----- net/ipv6/udp_offload.c | 3 +-- 2 files changed, 12 insertions(+), 7 deletions(-)
diff --git a/net/ipv4/udp_offload.c b/net/ipv4/udp_offload.c index 1f01e15ca24fd..4a61832e7f69b 100644 --- a/net/ipv4/udp_offload.c +++ b/net/ipv4/udp_offload.c @@ -273,13 +273,20 @@ struct sk_buff *__udp_gso_segment(struct sk_buff *gso_skb, __sum16 check; __be16 newlen;
- if (skb_shinfo(gso_skb)->gso_type & SKB_GSO_FRAGLIST) - return __udp_gso_segment_list(gso_skb, features, is_ipv6); - mss = skb_shinfo(gso_skb)->gso_size; if (gso_skb->len <= sizeof(*uh) + mss) return ERR_PTR(-EINVAL);
+ if (skb_gso_ok(gso_skb, features | NETIF_F_GSO_ROBUST)) { + /* Packet is from an untrusted source, reset gso_segs. */ + skb_shinfo(gso_skb)->gso_segs = DIV_ROUND_UP(gso_skb->len - sizeof(*uh), + mss); + return NULL; + } + + if (skb_shinfo(gso_skb)->gso_type & SKB_GSO_FRAGLIST) + return __udp_gso_segment_list(gso_skb, features, is_ipv6); + skb_pull(gso_skb, sizeof(*uh));
/* clear destructor to avoid skb_segment assigning it to tail */ @@ -387,8 +394,7 @@ static struct sk_buff *udp4_ufo_fragment(struct sk_buff *skb, if (!pskb_may_pull(skb, sizeof(struct udphdr))) goto out;
- if (skb_shinfo(skb)->gso_type & SKB_GSO_UDP_L4 && - !skb_gso_ok(skb, features | NETIF_F_GSO_ROBUST)) + if (skb_shinfo(skb)->gso_type & SKB_GSO_UDP_L4) return __udp_gso_segment(skb, features, false);
mss = skb_shinfo(skb)->gso_size; diff --git a/net/ipv6/udp_offload.c b/net/ipv6/udp_offload.c index c39c1e32f9804..e0e10f6bcdc18 100644 --- a/net/ipv6/udp_offload.c +++ b/net/ipv6/udp_offload.c @@ -42,8 +42,7 @@ static struct sk_buff *udp6_ufo_fragment(struct sk_buff *skb, if (!pskb_may_pull(skb, sizeof(struct udphdr))) goto out;
- if (skb_shinfo(skb)->gso_type & SKB_GSO_UDP_L4 && - !skb_gso_ok(skb, features | NETIF_F_GSO_ROBUST)) + if (skb_shinfo(skb)->gso_type & SKB_GSO_UDP_L4) return __udp_gso_segment(skb, features, true);
mss = skb_shinfo(skb)->gso_size;
From: Dan Carpenter error27@gmail.com
[ Upstream commit c20ecf7bb6153149b81a9277eda23398957656f2 ]
The ida_alloc_range() function returns negative error codes on error. On success it returns values in the min to max range (inclusive). It never returns more then INT_MAX even if "max" is higher. It never returns values in the 0 to (min - 1) range.
The bug is that "min" is an unsigned int so negative error codes will be promoted to high positive values errors treated as success.
Fixes: 1a14bf0fc7ed ("iommu/sva: Use GFP_KERNEL for pasid allocation") Signed-off-by: Dan Carpenter error27@gmail.com Reviewed-by: Lu Baolu baolu.lu@linux.intel.com Link: https://lore.kernel.org/r/6b32095d-7491-4ebb-a850-12e96209eaaf@kili.mountain Signed-off-by: Joerg Roedel jroedel@suse.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/iommu/iommu-sva.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/iommu/iommu-sva.c b/drivers/iommu/iommu-sva.c index 3ebd4b6586b3e..05c0fb2acbc44 100644 --- a/drivers/iommu/iommu-sva.c +++ b/drivers/iommu/iommu-sva.c @@ -34,8 +34,9 @@ static int iommu_sva_alloc_pasid(struct mm_struct *mm, ioasid_t min, ioasid_t ma }
ret = ida_alloc_range(&iommu_global_pasid_ida, min, max, GFP_KERNEL); - if (ret < min) + if (ret < 0) goto out; + mm->pasid = ret; ret = 0; out:
From: Shyam Prasad N nspmangalore@gmail.com
[ Upstream commit 69cba9d3c1284e0838ae408830a02c4a063104bc ]
When the number of responses with status of STATUS_IO_TIMEOUT exceeds a specified threshold (NUM_STATUS_IO_TIMEOUT), we reconnect the connection. But we do not return the mid, or the credits returned for the mid, or reduce the number of in-flight requests.
This bug could result in the server->in_flight count to go bad, and also cause a leak in the mids.
This change moves the check to a few lines below where the response is decrypted, even of the response is read from the transform header. This way, the code for returning the mids can be reused.
Also, the cifs_reconnect was reconnecting just the transport connection before. In case of multi-channel, this may not be what we want to do after several timeouts. Changed that to reconnect the session and the tree too.
Also renamed NUM_STATUS_IO_TIMEOUT to a more appropriate name MAX_STATUS_IO_TIMEOUT.
Fixes: 8e670f77c4a5 ("Handle STATUS_IO_TIMEOUT gracefully") Signed-off-by: Shyam Prasad N sprasad@microsoft.com Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/smb/client/connect.c | 19 +++++++++++++++---- 1 file changed, 15 insertions(+), 4 deletions(-)
diff --git a/fs/smb/client/connect.c b/fs/smb/client/connect.c index d9f0b3b94f007..853209268f507 100644 --- a/fs/smb/client/connect.c +++ b/fs/smb/client/connect.c @@ -60,7 +60,7 @@ extern bool disable_legacy_dialects; #define TLINK_IDLE_EXPIRE (600 * HZ)
/* Drop the connection to not overload the server */ -#define NUM_STATUS_IO_TIMEOUT 5 +#define MAX_STATUS_IO_TIMEOUT 5
static int ip_connect(struct TCP_Server_Info *server); static int generic_ip_connect(struct TCP_Server_Info *server); @@ -1117,6 +1117,7 @@ cifs_demultiplex_thread(void *p) struct mid_q_entry *mids[MAX_COMPOUND]; char *bufs[MAX_COMPOUND]; unsigned int noreclaim_flag, num_io_timeout = 0; + bool pending_reconnect = false;
noreclaim_flag = memalloc_noreclaim_save(); cifs_dbg(FYI, "Demultiplex PID: %d\n", task_pid_nr(current)); @@ -1156,6 +1157,8 @@ cifs_demultiplex_thread(void *p) cifs_dbg(FYI, "RFC1002 header 0x%x\n", pdu_length); if (!is_smb_response(server, buf[0])) continue; + + pending_reconnect = false; next_pdu: server->pdu_size = pdu_length;
@@ -1213,10 +1216,13 @@ cifs_demultiplex_thread(void *p) if (server->ops->is_status_io_timeout && server->ops->is_status_io_timeout(buf)) { num_io_timeout++; - if (num_io_timeout > NUM_STATUS_IO_TIMEOUT) { - cifs_reconnect(server, false); + if (num_io_timeout > MAX_STATUS_IO_TIMEOUT) { + cifs_server_dbg(VFS, + "Number of request timeouts exceeded %d. Reconnecting", + MAX_STATUS_IO_TIMEOUT); + + pending_reconnect = true; num_io_timeout = 0; - continue; } }
@@ -1263,6 +1269,11 @@ cifs_demultiplex_thread(void *p) buf = server->smallbuf; goto next_pdu; } + + /* do this reconnect at the very end after processing all MIDs */ + if (pending_reconnect) + cifs_reconnect(server, true); + } /* end while !EXITING */
/* buffer usually freed in free_mid - need to free it here on exit */
From: Petr Oros poros@redhat.com
[ Upstream commit 24a3298ac9e6bd8de838ab79f7868207170d556d ]
Since commit 6624e780a577fc ("ice: split ice_vsi_setup into smaller functions") ice_vsi_release does things twice. There is unregister netdev which is unregistered in ice_deinit_eth also.
It also unregisters the devlink_port twice which is also unregistered in ice_deinit_eth(). This double deregistration is hidden because devl_port_unregister ignores the return value of xa_erase.
[ 68.642167] Call Trace: [ 68.650385] ice_devlink_destroy_pf_port+0xe/0x20 [ice] [ 68.655656] ice_vsi_release+0x445/0x690 [ice] [ 68.660147] ice_deinit+0x99/0x280 [ice] [ 68.664117] ice_remove+0x1b6/0x5c0 [ice]
[ 171.103841] Call Trace: [ 171.109607] ice_devlink_destroy_pf_port+0xf/0x20 [ice] [ 171.114841] ice_remove+0x158/0x270 [ice] [ 171.118854] pci_device_remove+0x3b/0xc0 [ 171.122779] device_release_driver_internal+0xc7/0x170 [ 171.127912] driver_detach+0x54/0x8c [ 171.131491] bus_remove_driver+0x77/0xd1 [ 171.135406] pci_unregister_driver+0x2d/0xb0 [ 171.139670] ice_module_exit+0xc/0x55f [ice]
Fixes: 6624e780a577 ("ice: split ice_vsi_setup into smaller functions") Signed-off-by: Petr Oros poros@redhat.com Reviewed-by: Maciej Fijalkowski maciej.fijalkowski@intel.com Tested-by: Pucha Himasekhar Reddy himasekharx.reddy.pucha@intel.com (A Contingent worker at Intel) Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/ice/ice_lib.c | 27 ------------------------ 1 file changed, 27 deletions(-)
diff --git a/drivers/net/ethernet/intel/ice/ice_lib.c b/drivers/net/ethernet/intel/ice/ice_lib.c index 11ae0e41f518a..284a1f0bfdb54 100644 --- a/drivers/net/ethernet/intel/ice/ice_lib.c +++ b/drivers/net/ethernet/intel/ice/ice_lib.c @@ -3272,39 +3272,12 @@ int ice_vsi_release(struct ice_vsi *vsi) return -ENODEV; pf = vsi->back;
- /* do not unregister while driver is in the reset recovery pending - * state. Since reset/rebuild happens through PF service task workqueue, - * it's not a good idea to unregister netdev that is associated to the - * PF that is running the work queue items currently. This is done to - * avoid check_flush_dependency() warning on this wq - */ - if (vsi->netdev && !ice_is_reset_in_progress(pf->state) && - (test_bit(ICE_VSI_NETDEV_REGISTERED, vsi->state))) { - unregister_netdev(vsi->netdev); - clear_bit(ICE_VSI_NETDEV_REGISTERED, vsi->state); - } - - if (vsi->type == ICE_VSI_PF) - ice_devlink_destroy_pf_port(pf); - if (test_bit(ICE_FLAG_RSS_ENA, pf->flags)) ice_rss_clean(vsi);
ice_vsi_close(vsi); ice_vsi_decfg(vsi);
- if (vsi->netdev) { - if (test_bit(ICE_VSI_NETDEV_REGISTERED, vsi->state)) { - unregister_netdev(vsi->netdev); - clear_bit(ICE_VSI_NETDEV_REGISTERED, vsi->state); - } - if (test_bit(ICE_VSI_NETDEV_ALLOCD, vsi->state)) { - free_netdev(vsi->netdev); - vsi->netdev = NULL; - clear_bit(ICE_VSI_NETDEV_ALLOCD, vsi->state); - } - } - /* retain SW VSI data structure since it is needed to unregister and * free VSI netdev when PF is not in reset recovery pending state,\ * for ex: during rmmod.
From: Michal Swiatkowski michal.swiatkowski@linux.intel.com
[ Upstream commit b3e7b3a6ee92ab927f750a6b19615ce88ece808f ]
Calling ethtool during reload can lead to call trace, because VSI isn't configured for some time, but netdev is alive.
To fix it add rtnl lock for VSI deconfig and config. Set ::num_q_vectors to 0 after freeing and add a check for ::tx/rx_rings in ring related ethtool ops.
Add proper unroll of filters in ice_start_eth().
Reproduction: $watch -n 0.1 -d 'ethtool -g enp24s0f0np0' $devlink dev reload pci/0000:18:00.0 action driver_reinit
Call trace before fix: [66303.926205] BUG: kernel NULL pointer dereference, address: 0000000000000000 [66303.926259] #PF: supervisor read access in kernel mode [66303.926286] #PF: error_code(0x0000) - not-present page [66303.926311] PGD 0 P4D 0 [66303.926332] Oops: 0000 [#1] PREEMPT SMP PTI [66303.926358] CPU: 4 PID: 933821 Comm: ethtool Kdump: loaded Tainted: G OE 6.4.0-rc5+ #1 [66303.926400] Hardware name: Intel Corporation S2600WFT/S2600WFT, BIOS SE5C620.86B.00.01.0014.070920180847 07/09/2018 [66303.926446] RIP: 0010:ice_get_ringparam+0x22/0x50 [ice] [66303.926649] Code: 90 90 90 90 90 90 90 90 f3 0f 1e fa 0f 1f 44 00 00 48 8b 87 c0 09 00 00 c7 46 04 e0 1f 00 00 c7 46 10 e0 1f 00 00 48 8b 50 20 <48> 8b 12 0f b7 52 3a 89 56 14 48 8b 40 28 48 8b 00 0f b7 40 58 48 [66303.926722] RSP: 0018:ffffad40472f39c8 EFLAGS: 00010246 [66303.926749] RAX: ffff98a8ada05828 RBX: ffff98a8c46dd060 RCX: ffffad40472f3b48 [66303.926781] RDX: 0000000000000000 RSI: ffff98a8c46dd068 RDI: ffff98a8b23c4000 [66303.926811] RBP: ffffad40472f3b48 R08: 00000000000337b0 R09: 0000000000000000 [66303.926843] R10: 0000000000000001 R11: 0000000000000100 R12: ffff98a8b23c4000 [66303.926874] R13: ffff98a8c46dd060 R14: 000000000000000f R15: ffffad40472f3a50 [66303.926906] FS: 00007f6397966740(0000) GS:ffff98b390900000(0000) knlGS:0000000000000000 [66303.926941] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [66303.926967] CR2: 0000000000000000 CR3: 000000011ac20002 CR4: 00000000007706e0 [66303.926999] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [66303.927029] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [66303.927060] PKRU: 55555554 [66303.927075] Call Trace: [66303.927094] <TASK> [66303.927111] ? __die+0x23/0x70 [66303.927140] ? page_fault_oops+0x171/0x4e0 [66303.927176] ? exc_page_fault+0x7f/0x180 [66303.927209] ? asm_exc_page_fault+0x26/0x30 [66303.927244] ? ice_get_ringparam+0x22/0x50 [ice] [66303.927433] rings_prepare_data+0x62/0x80 [66303.927469] ethnl_default_doit+0xe2/0x350 [66303.927501] genl_family_rcv_msg_doit.isra.0+0xe3/0x140 [66303.927538] genl_rcv_msg+0x1b1/0x2c0 [66303.927561] ? __pfx_ethnl_default_doit+0x10/0x10 [66303.927590] ? __pfx_genl_rcv_msg+0x10/0x10 [66303.927615] netlink_rcv_skb+0x58/0x110 [66303.927644] genl_rcv+0x28/0x40 [66303.927665] netlink_unicast+0x19e/0x290 [66303.927691] netlink_sendmsg+0x254/0x4d0 [66303.927717] sock_sendmsg+0x93/0xa0 [66303.927743] __sys_sendto+0x126/0x170 [66303.927780] __x64_sys_sendto+0x24/0x30 [66303.928593] do_syscall_64+0x5d/0x90 [66303.929370] ? __count_memcg_events+0x60/0xa0 [66303.930146] ? count_memcg_events.constprop.0+0x1a/0x30 [66303.930920] ? handle_mm_fault+0x9e/0x350 [66303.931688] ? do_user_addr_fault+0x258/0x740 [66303.932452] ? exc_page_fault+0x7f/0x180 [66303.933193] entry_SYSCALL_64_after_hwframe+0x72/0xdc
Fixes: 5b246e533d01 ("ice: split probe into smaller functions") Reviewed-by: Przemek Kitszel przemyslaw.kitszel@intel.com Signed-off-by: Michal Swiatkowski michal.swiatkowski@linux.intel.com Reviewed-by: Simon Horman simon.horman@corigine.com Tested-by: Pucha Himasekhar Reddy himasekharx.reddy.pucha@intel.com (A Contingent worker at Intel) Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/ice/ice_base.c | 2 ++ drivers/net/ethernet/intel/ice/ice_ethtool.c | 13 +++++++++++-- drivers/net/ethernet/intel/ice/ice_main.c | 10 ++++++++-- 3 files changed, 21 insertions(+), 4 deletions(-)
diff --git a/drivers/net/ethernet/intel/ice/ice_base.c b/drivers/net/ethernet/intel/ice/ice_base.c index 1911d644dfa8d..619cb07a40691 100644 --- a/drivers/net/ethernet/intel/ice/ice_base.c +++ b/drivers/net/ethernet/intel/ice/ice_base.c @@ -758,6 +758,8 @@ void ice_vsi_free_q_vectors(struct ice_vsi *vsi)
ice_for_each_q_vector(vsi, v_idx) ice_free_q_vector(vsi, v_idx); + + vsi->num_q_vectors = 0; }
/** diff --git a/drivers/net/ethernet/intel/ice/ice_ethtool.c b/drivers/net/ethernet/intel/ice/ice_ethtool.c index f86e814354a31..ec4138e684bd2 100644 --- a/drivers/net/ethernet/intel/ice/ice_ethtool.c +++ b/drivers/net/ethernet/intel/ice/ice_ethtool.c @@ -2920,8 +2920,13 @@ ice_get_ringparam(struct net_device *netdev, struct ethtool_ringparam *ring,
ring->rx_max_pending = ICE_MAX_NUM_DESC; ring->tx_max_pending = ICE_MAX_NUM_DESC; - ring->rx_pending = vsi->rx_rings[0]->count; - ring->tx_pending = vsi->tx_rings[0]->count; + if (vsi->tx_rings && vsi->rx_rings) { + ring->rx_pending = vsi->rx_rings[0]->count; + ring->tx_pending = vsi->tx_rings[0]->count; + } else { + ring->rx_pending = 0; + ring->tx_pending = 0; + }
/* Rx mini and jumbo rings are not supported */ ring->rx_mini_max_pending = 0; @@ -2955,6 +2960,10 @@ ice_set_ringparam(struct net_device *netdev, struct ethtool_ringparam *ring, return -EINVAL; }
+ /* Return if there is no rings (device is reloading) */ + if (!vsi->tx_rings || !vsi->rx_rings) + return -EBUSY; + new_tx_cnt = ALIGN(ring->tx_pending, ICE_REQ_DESC_MULTIPLE); if (new_tx_cnt != ring->tx_pending) netdev_info(netdev, "Requested Tx descriptor count rounded up to %d\n", diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c index 1277e0a044ee4..fbe70458fda27 100644 --- a/drivers/net/ethernet/intel/ice/ice_main.c +++ b/drivers/net/ethernet/intel/ice/ice_main.c @@ -4655,9 +4655,9 @@ static int ice_start_eth(struct ice_vsi *vsi) if (err) return err;
- rtnl_lock(); err = ice_vsi_open(vsi); - rtnl_unlock(); + if (err) + ice_fltr_remove_all(vsi);
return err; } @@ -5120,6 +5120,7 @@ int ice_load(struct ice_pf *pf) params = ice_vsi_to_params(vsi); params.flags = ICE_VSI_FLAG_INIT;
+ rtnl_lock(); err = ice_vsi_cfg(vsi, ¶ms); if (err) goto err_vsi_cfg; @@ -5127,6 +5128,7 @@ int ice_load(struct ice_pf *pf) err = ice_start_eth(ice_get_main_vsi(pf)); if (err) goto err_start_eth; + rtnl_unlock();
err = ice_init_rdma(pf); if (err) @@ -5141,9 +5143,11 @@ int ice_load(struct ice_pf *pf)
err_init_rdma: ice_vsi_close(ice_get_main_vsi(pf)); + rtnl_lock(); err_start_eth: ice_vsi_decfg(ice_get_main_vsi(pf)); err_vsi_cfg: + rtnl_unlock(); ice_deinit_dev(pf); return err; } @@ -5156,8 +5160,10 @@ void ice_unload(struct ice_pf *pf) { ice_deinit_features(pf); ice_deinit_rdma(pf); + rtnl_lock(); ice_stop_eth(ice_get_main_vsi(pf)); ice_vsi_decfg(ice_get_main_vsi(pf)); + rtnl_unlock(); ice_deinit_dev(pf); }
From: Dan Carpenter dan.carpenter@linaro.org
[ Upstream commit 469e2f28c2cbee2430058c1c9bb6d1675d7195fb ]
This doesn't check how many bytes the simple_write_to_buffer() writes to the buffer. The only thing that we know is that the first byte is initialized and the last byte of the buffer is set to NUL. However the middle bytes could be uninitialized.
There is no need to use simple_write_to_buffer(). This code does not support partial writes but instead passes "pos = 0" as the starting offset regardless of what the user passed as "*ppos". Just use the copy_from_user() function and initialize the whole buffer.
Fixes: 671e0b90051e ("ASoC: SOF: Clone the trace code to ipc3-dtrace as fw_tracing implementation") Signed-off-by: Dan Carpenter dan.carpenter@linaro.org Link: https://lore.kernel.org/r/74148292-ce4d-4e01-a1a7-921e6767da14@moroto.mounta... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/sof/ipc3-dtrace.c | 9 ++++----- 1 file changed, 4 insertions(+), 5 deletions(-)
diff --git a/sound/soc/sof/ipc3-dtrace.c b/sound/soc/sof/ipc3-dtrace.c index 1d3bca2d28dd6..35da85a45a9ae 100644 --- a/sound/soc/sof/ipc3-dtrace.c +++ b/sound/soc/sof/ipc3-dtrace.c @@ -186,7 +186,6 @@ static ssize_t dfsentry_trace_filter_write(struct file *file, const char __user struct snd_sof_dfsentry *dfse = file->private_data; struct sof_ipc_trace_filter_elem *elems = NULL; struct snd_sof_dev *sdev = dfse->sdev; - loff_t pos = 0; int num_elems; char *string; int ret; @@ -201,11 +200,11 @@ static ssize_t dfsentry_trace_filter_write(struct file *file, const char __user if (!string) return -ENOMEM;
- /* assert null termination */ - string[count] = 0; - ret = simple_write_to_buffer(string, count, &pos, from, count); - if (ret < 0) + if (copy_from_user(string, from, count)) { + ret = -EFAULT; goto error; + } + string[count] = '\0';
ret = trace_filter_parse(sdev, string, &num_elems, &elems); if (ret < 0)
From: Martin Fuzzey martin.fuzzey@flowbird.group
[ Upstream commit 98e2dd5f7a8be5cb2501a897e96910393a49f0ff ]
When some of the da9063 regulators do not have corresponding DT nodes a null pointer dereference occurs on boot because such regulators have no init_data causing the pointers calculated in da9063_check_xvp_constraints() to be invalid.
Do not dereference them in this case.
Fixes: b8717a80e6ee ("regulator: da9063: implement setter for voltage monitoring") Signed-off-by: Martin Fuzzey martin.fuzzey@flowbird.group Link: https://lore.kernel.org/r/20230616143736.2946173-1-martin.fuzzey@flowbird.gr... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/regulator/da9063-regulator.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/regulator/da9063-regulator.c b/drivers/regulator/da9063-regulator.c index c5dd77be558b6..dfd5ec9f75c90 100644 --- a/drivers/regulator/da9063-regulator.c +++ b/drivers/regulator/da9063-regulator.c @@ -778,6 +778,9 @@ static int da9063_check_xvp_constraints(struct regulator_config *config) const struct notification_limit *uv_l = &constr->under_voltage_limits; const struct notification_limit *ov_l = &constr->over_voltage_limits;
+ if (!config->init_data) /* No config in DT, pointers will be invalid */ + return 0; + /* make sure that only one severity is used to clarify if unchanged, enabled or disabled */ if ((!!uv_l->prot + !!uv_l->err + !!uv_l->warn) > 1) { dev_err(config->dev, "%s: at most one voltage monitoring severity allowed!\n",
From: Victor Nogueira victor@mojatatu.com
[ Upstream commit b3d0e0489430735e2e7626aa37e6462cdd136e9d ]
In case an error occurred after mall_set_parms executed successfully, we must undo the tcf_bind_filter call it issues.
Fix that by calling tcf_unbind_filter in err_replace_hw_filter label.
Fixes: ec2507d2a306 ("net/sched: cls_matchall: Fix error path") Signed-off-by: Victor Nogueira victor@mojatatu.com Acked-by: Jamal Hadi Salim jhs@mojatatu.com Reviewed-by: Pedro Tammela pctammela@mojatatu.com Reviewed-by: Simon Horman simon.horman@corigine.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/sched/cls_matchall.c | 35 ++++++++++++----------------------- 1 file changed, 12 insertions(+), 23 deletions(-)
diff --git a/net/sched/cls_matchall.c b/net/sched/cls_matchall.c index fa3bbd187eb97..c4ed11df62548 100644 --- a/net/sched/cls_matchall.c +++ b/net/sched/cls_matchall.c @@ -159,26 +159,6 @@ static const struct nla_policy mall_policy[TCA_MATCHALL_MAX + 1] = { [TCA_MATCHALL_FLAGS] = { .type = NLA_U32 }, };
-static int mall_set_parms(struct net *net, struct tcf_proto *tp, - struct cls_mall_head *head, - unsigned long base, struct nlattr **tb, - struct nlattr *est, u32 flags, u32 fl_flags, - struct netlink_ext_ack *extack) -{ - int err; - - err = tcf_exts_validate_ex(net, tp, tb, est, &head->exts, flags, - fl_flags, extack); - if (err < 0) - return err; - - if (tb[TCA_MATCHALL_CLASSID]) { - head->res.classid = nla_get_u32(tb[TCA_MATCHALL_CLASSID]); - tcf_bind_filter(tp, &head->res, base); - } - return 0; -} - static int mall_change(struct net *net, struct sk_buff *in_skb, struct tcf_proto *tp, unsigned long base, u32 handle, struct nlattr **tca, @@ -187,6 +167,7 @@ static int mall_change(struct net *net, struct sk_buff *in_skb, { struct cls_mall_head *head = rtnl_dereference(tp->root); struct nlattr *tb[TCA_MATCHALL_MAX + 1]; + bool bound_to_filter = false; struct cls_mall_head *new; u32 userflags = 0; int err; @@ -226,11 +207,17 @@ static int mall_change(struct net *net, struct sk_buff *in_skb, goto err_alloc_percpu; }
- err = mall_set_parms(net, tp, new, base, tb, tca[TCA_RATE], - flags, new->flags, extack); - if (err) + err = tcf_exts_validate_ex(net, tp, tb, tca[TCA_RATE], + &new->exts, flags, new->flags, extack); + if (err < 0) goto err_set_parms;
+ if (tb[TCA_MATCHALL_CLASSID]) { + new->res.classid = nla_get_u32(tb[TCA_MATCHALL_CLASSID]); + tcf_bind_filter(tp, &new->res, base); + bound_to_filter = true; + } + if (!tc_skip_hw(new->flags)) { err = mall_replace_hw_filter(tp, new, (unsigned long)new, extack); @@ -246,6 +233,8 @@ static int mall_change(struct net *net, struct sk_buff *in_skb, return 0;
err_replace_hw_filter: + if (bound_to_filter) + tcf_unbind_filter(tp, &new->res); err_set_parms: free_percpu(new->pf); err_alloc_percpu:
From: Victor Nogueira victor@mojatatu.com
[ Upstream commit 9cb36faedeafb9720ac236aeae2ea57091d90a09 ]
When u32_replace_hw_knode fails, we need to undo the tcf_bind_filter operation done at u32_set_parms.
Fixes: d34e3e181395 ("net: cls_u32: Add support for skip-sw flag to tc u32 classifier.") Signed-off-by: Victor Nogueira victor@mojatatu.com Acked-by: Jamal Hadi Salim jhs@mojatatu.com Reviewed-by: Pedro Tammela pctammela@mojatatu.com Reviewed-by: Simon Horman simon.horman@corigine.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/sched/cls_u32.c | 41 ++++++++++++++++++++++++++++++----------- 1 file changed, 30 insertions(+), 11 deletions(-)
diff --git a/net/sched/cls_u32.c b/net/sched/cls_u32.c index d15d50de79802..ed358466d042a 100644 --- a/net/sched/cls_u32.c +++ b/net/sched/cls_u32.c @@ -712,8 +712,23 @@ static const struct nla_policy u32_policy[TCA_U32_MAX + 1] = { [TCA_U32_FLAGS] = { .type = NLA_U32 }, };
+static void u32_unbind_filter(struct tcf_proto *tp, struct tc_u_knode *n, + struct nlattr **tb) +{ + if (tb[TCA_U32_CLASSID]) + tcf_unbind_filter(tp, &n->res); +} + +static void u32_bind_filter(struct tcf_proto *tp, struct tc_u_knode *n, + unsigned long base, struct nlattr **tb) +{ + if (tb[TCA_U32_CLASSID]) { + n->res.classid = nla_get_u32(tb[TCA_U32_CLASSID]); + tcf_bind_filter(tp, &n->res, base); + } +} + static int u32_set_parms(struct net *net, struct tcf_proto *tp, - unsigned long base, struct tc_u_knode *n, struct nlattr **tb, struct nlattr *est, u32 flags, u32 fl_flags, struct netlink_ext_ack *extack) @@ -760,10 +775,6 @@ static int u32_set_parms(struct net *net, struct tcf_proto *tp, if (ht_old) ht_old->refcnt--; } - if (tb[TCA_U32_CLASSID]) { - n->res.classid = nla_get_u32(tb[TCA_U32_CLASSID]); - tcf_bind_filter(tp, &n->res, base); - }
if (ifindex >= 0) n->ifindex = ifindex; @@ -903,17 +914,20 @@ static int u32_change(struct net *net, struct sk_buff *in_skb, if (!new) return -ENOMEM;
- err = u32_set_parms(net, tp, base, new, tb, - tca[TCA_RATE], flags, new->flags, - extack); + err = u32_set_parms(net, tp, new, tb, tca[TCA_RATE], + flags, new->flags, extack);
if (err) { __u32_destroy_key(new); return err; }
+ u32_bind_filter(tp, new, base, tb); + err = u32_replace_hw_knode(tp, new, flags, extack); if (err) { + u32_unbind_filter(tp, new, tb); + __u32_destroy_key(new); return err; } @@ -1074,15 +1088,18 @@ static int u32_change(struct net *net, struct sk_buff *in_skb, } #endif
- err = u32_set_parms(net, tp, base, n, tb, tca[TCA_RATE], + err = u32_set_parms(net, tp, n, tb, tca[TCA_RATE], flags, n->flags, extack); + + u32_bind_filter(tp, n, base, tb); + if (err == 0) { struct tc_u_knode __rcu **ins; struct tc_u_knode *pins;
err = u32_replace_hw_knode(tp, n, flags, extack); if (err) - goto errhw; + goto errunbind;
if (!tc_in_hw(n->flags)) n->flags |= TCA_CLS_FLAGS_NOT_IN_HW; @@ -1100,7 +1117,9 @@ static int u32_change(struct net *net, struct sk_buff *in_skb, return 0; }
-errhw: +errunbind: + u32_unbind_filter(tp, n, tb); + #ifdef CONFIG_CLS_U32_MARK free_percpu(n->pcpu_success); #endif
From: Victor Nogueira victor@mojatatu.com
[ Upstream commit e8d3d78c19be0264a5692bed477c303523aead31 ]
In the case of an update, when TCA_U32_LINK is set, u32_set_parms will decrement the refcount of the ht_down (struct tc_u_hnode) pointer present in the older u32 filter which we are replacing. However, if u32_replace_hw_knode errors out, the update command fails and that ht_down pointer continues decremented. To fix that, when u32_replace_hw_knode fails, check if ht_down's refcount was decremented and undo the decrement.
Fixes: d34e3e181395 ("net: cls_u32: Add support for skip-sw flag to tc u32 classifier.") Signed-off-by: Victor Nogueira victor@mojatatu.com Acked-by: Jamal Hadi Salim jhs@mojatatu.com Reviewed-by: Pedro Tammela pctammela@mojatatu.com Reviewed-by: Simon Horman simon.horman@corigine.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/sched/cls_u32.c | 7 +++++++ 1 file changed, 7 insertions(+)
diff --git a/net/sched/cls_u32.c b/net/sched/cls_u32.c index ed358466d042a..5abf31e432caf 100644 --- a/net/sched/cls_u32.c +++ b/net/sched/cls_u32.c @@ -928,6 +928,13 @@ static int u32_change(struct net *net, struct sk_buff *in_skb, if (err) { u32_unbind_filter(tp, new, tb);
+ if (tb[TCA_U32_LINK]) { + struct tc_u_hnode *ht_old; + + ht_old = rtnl_dereference(n->ht_down); + if (ht_old) + ht_old->refcnt++; + } __u32_destroy_key(new); return err; }
From: Victor Nogueira victor@mojatatu.com
[ Upstream commit 26a22194927e8521e304ed75c2f38d8068d55fc7 ]
If cls_bpf_offload errors out, we must also undo tcf_bind_filter that was done before the error.
Fix that by calling tcf_unbind_filter in errout_parms.
Fixes: eadb41489fd2 ("net: cls_bpf: add support for marking filters as hardware-only") Signed-off-by: Victor Nogueira victor@mojatatu.com Acked-by: Jamal Hadi Salim jhs@mojatatu.com Reviewed-by: Pedro Tammela pctammela@mojatatu.com Reviewed-by: Simon Horman simon.horman@corigine.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/sched/cls_bpf.c | 99 +++++++++++++++++++++------------------------ 1 file changed, 47 insertions(+), 52 deletions(-)
diff --git a/net/sched/cls_bpf.c b/net/sched/cls_bpf.c index 466c26df853a0..382c7a71f81f2 100644 --- a/net/sched/cls_bpf.c +++ b/net/sched/cls_bpf.c @@ -406,56 +406,6 @@ static int cls_bpf_prog_from_efd(struct nlattr **tb, struct cls_bpf_prog *prog, return 0; }
-static int cls_bpf_set_parms(struct net *net, struct tcf_proto *tp, - struct cls_bpf_prog *prog, unsigned long base, - struct nlattr **tb, struct nlattr *est, u32 flags, - struct netlink_ext_ack *extack) -{ - bool is_bpf, is_ebpf, have_exts = false; - u32 gen_flags = 0; - int ret; - - is_bpf = tb[TCA_BPF_OPS_LEN] && tb[TCA_BPF_OPS]; - is_ebpf = tb[TCA_BPF_FD]; - if ((!is_bpf && !is_ebpf) || (is_bpf && is_ebpf)) - return -EINVAL; - - ret = tcf_exts_validate(net, tp, tb, est, &prog->exts, flags, - extack); - if (ret < 0) - return ret; - - if (tb[TCA_BPF_FLAGS]) { - u32 bpf_flags = nla_get_u32(tb[TCA_BPF_FLAGS]); - - if (bpf_flags & ~TCA_BPF_FLAG_ACT_DIRECT) - return -EINVAL; - - have_exts = bpf_flags & TCA_BPF_FLAG_ACT_DIRECT; - } - if (tb[TCA_BPF_FLAGS_GEN]) { - gen_flags = nla_get_u32(tb[TCA_BPF_FLAGS_GEN]); - if (gen_flags & ~CLS_BPF_SUPPORTED_GEN_FLAGS || - !tc_flags_valid(gen_flags)) - return -EINVAL; - } - - prog->exts_integrated = have_exts; - prog->gen_flags = gen_flags; - - ret = is_bpf ? cls_bpf_prog_from_ops(tb, prog) : - cls_bpf_prog_from_efd(tb, prog, gen_flags, tp); - if (ret < 0) - return ret; - - if (tb[TCA_BPF_CLASSID]) { - prog->res.classid = nla_get_u32(tb[TCA_BPF_CLASSID]); - tcf_bind_filter(tp, &prog->res, base); - } - - return 0; -} - static int cls_bpf_change(struct net *net, struct sk_buff *in_skb, struct tcf_proto *tp, unsigned long base, u32 handle, struct nlattr **tca, @@ -463,9 +413,12 @@ static int cls_bpf_change(struct net *net, struct sk_buff *in_skb, struct netlink_ext_ack *extack) { struct cls_bpf_head *head = rtnl_dereference(tp->root); + bool is_bpf, is_ebpf, have_exts = false; struct cls_bpf_prog *oldprog = *arg; struct nlattr *tb[TCA_BPF_MAX + 1]; + bool bound_to_filter = false; struct cls_bpf_prog *prog; + u32 gen_flags = 0; int ret;
if (tca[TCA_OPTIONS] == NULL) @@ -504,11 +457,51 @@ static int cls_bpf_change(struct net *net, struct sk_buff *in_skb, goto errout; prog->handle = handle;
- ret = cls_bpf_set_parms(net, tp, prog, base, tb, tca[TCA_RATE], flags, - extack); + is_bpf = tb[TCA_BPF_OPS_LEN] && tb[TCA_BPF_OPS]; + is_ebpf = tb[TCA_BPF_FD]; + if ((!is_bpf && !is_ebpf) || (is_bpf && is_ebpf)) { + ret = -EINVAL; + goto errout_idr; + } + + ret = tcf_exts_validate(net, tp, tb, tca[TCA_RATE], &prog->exts, + flags, extack); + if (ret < 0) + goto errout_idr; + + if (tb[TCA_BPF_FLAGS]) { + u32 bpf_flags = nla_get_u32(tb[TCA_BPF_FLAGS]); + + if (bpf_flags & ~TCA_BPF_FLAG_ACT_DIRECT) { + ret = -EINVAL; + goto errout_idr; + } + + have_exts = bpf_flags & TCA_BPF_FLAG_ACT_DIRECT; + } + if (tb[TCA_BPF_FLAGS_GEN]) { + gen_flags = nla_get_u32(tb[TCA_BPF_FLAGS_GEN]); + if (gen_flags & ~CLS_BPF_SUPPORTED_GEN_FLAGS || + !tc_flags_valid(gen_flags)) { + ret = -EINVAL; + goto errout_idr; + } + } + + prog->exts_integrated = have_exts; + prog->gen_flags = gen_flags; + + ret = is_bpf ? cls_bpf_prog_from_ops(tb, prog) : + cls_bpf_prog_from_efd(tb, prog, gen_flags, tp); if (ret < 0) goto errout_idr;
+ if (tb[TCA_BPF_CLASSID]) { + prog->res.classid = nla_get_u32(tb[TCA_BPF_CLASSID]); + tcf_bind_filter(tp, &prog->res, base); + bound_to_filter = true; + } + ret = cls_bpf_offload(tp, prog, oldprog, extack); if (ret) goto errout_parms; @@ -530,6 +523,8 @@ static int cls_bpf_change(struct net *net, struct sk_buff *in_skb, return 0;
errout_parms: + if (bound_to_filter) + tcf_unbind_filter(tp, &prog->res); cls_bpf_free_parms(prog); errout_idr: if (!oldprog)
From: Tristram Ha Tristram.Ha@microchip.com
[ Upstream commit 4bdf79d686b49ac49373b36466acfb93972c7d7c ]
The KSZ8795 driver code was modified to use on KSZ8863/73, which has different register definitions. Some of the new KSZ8795 register information are wrong compared to previous code.
KSZ8795 also behaves differently in that the STATIC_MAC_TABLE_USE_FID and STATIC_MAC_TABLE_FID bits are off by 1 when doing MAC table reading than writing. To compensate that a special code was added to shift the register value by 1 before applying those bits. This is wrong when the code is running on KSZ8863, so this special code is only executed when KSZ8795 is detected.
Fixes: 4b20a07e103f ("net: dsa: microchip: ksz8795: add support for ksz88xx chips") Signed-off-by: Tristram Ha Tristram.Ha@microchip.com Reviewed-by: Horatiu Vultur horatiu.vultur@microchip.com Reviewed-by: Simon Horman simon.horman@corigine.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/dsa/microchip/ksz8795.c | 8 +++++++- drivers/net/dsa/microchip/ksz_common.c | 8 ++++---- drivers/net/dsa/microchip/ksz_common.h | 7 +++++++ 3 files changed, 18 insertions(+), 5 deletions(-)
diff --git a/drivers/net/dsa/microchip/ksz8795.c b/drivers/net/dsa/microchip/ksz8795.c index f56fca1b1a222..cc5b19a3d0df2 100644 --- a/drivers/net/dsa/microchip/ksz8795.c +++ b/drivers/net/dsa/microchip/ksz8795.c @@ -506,7 +506,13 @@ static int ksz8_r_sta_mac_table(struct ksz_device *dev, u16 addr, (data_hi & masks[STATIC_MAC_TABLE_FWD_PORTS]) >> shifts[STATIC_MAC_FWD_PORTS]; alu->is_override = (data_hi & masks[STATIC_MAC_TABLE_OVERRIDE]) ? 1 : 0; - data_hi >>= 1; + + /* KSZ8795 family switches have STATIC_MAC_TABLE_USE_FID and + * STATIC_MAC_TABLE_FID definitions off by 1 when doing read on the + * static MAC table compared to doing write. + */ + if (ksz_is_ksz87xx(dev)) + data_hi >>= 1; alu->is_static = true; alu->is_use_fid = (data_hi & masks[STATIC_MAC_TABLE_USE_FID]) ? 1 : 0; alu->fid = (data_hi & masks[STATIC_MAC_TABLE_FID]) >> diff --git a/drivers/net/dsa/microchip/ksz_common.c b/drivers/net/dsa/microchip/ksz_common.c index a4428be5f483c..a0ba2605bb620 100644 --- a/drivers/net/dsa/microchip/ksz_common.c +++ b/drivers/net/dsa/microchip/ksz_common.c @@ -331,13 +331,13 @@ static const u32 ksz8795_masks[] = { [STATIC_MAC_TABLE_VALID] = BIT(21), [STATIC_MAC_TABLE_USE_FID] = BIT(23), [STATIC_MAC_TABLE_FID] = GENMASK(30, 24), - [STATIC_MAC_TABLE_OVERRIDE] = BIT(26), - [STATIC_MAC_TABLE_FWD_PORTS] = GENMASK(24, 20), + [STATIC_MAC_TABLE_OVERRIDE] = BIT(22), + [STATIC_MAC_TABLE_FWD_PORTS] = GENMASK(20, 16), [DYNAMIC_MAC_TABLE_ENTRIES_H] = GENMASK(6, 0), - [DYNAMIC_MAC_TABLE_MAC_EMPTY] = BIT(8), + [DYNAMIC_MAC_TABLE_MAC_EMPTY] = BIT(7), [DYNAMIC_MAC_TABLE_NOT_READY] = BIT(7), [DYNAMIC_MAC_TABLE_ENTRIES] = GENMASK(31, 29), - [DYNAMIC_MAC_TABLE_FID] = GENMASK(26, 20), + [DYNAMIC_MAC_TABLE_FID] = GENMASK(22, 16), [DYNAMIC_MAC_TABLE_SRC_PORT] = GENMASK(26, 24), [DYNAMIC_MAC_TABLE_TIMESTAMP] = GENMASK(28, 27), [P_MII_TX_FLOW_CTRL] = BIT(5), diff --git a/drivers/net/dsa/microchip/ksz_common.h b/drivers/net/dsa/microchip/ksz_common.h index 8abecaf6089ef..33d9a2f6af27a 100644 --- a/drivers/net/dsa/microchip/ksz_common.h +++ b/drivers/net/dsa/microchip/ksz_common.h @@ -569,6 +569,13 @@ static inline void ksz_regmap_unlock(void *__mtx) mutex_unlock(mtx); }
+static inline bool ksz_is_ksz87xx(struct ksz_device *dev) +{ + return dev->chip_id == KSZ8795_CHIP_ID || + dev->chip_id == KSZ8794_CHIP_ID || + dev->chip_id == KSZ8765_CHIP_ID; +} + static inline bool ksz_is_ksz88x3(struct ksz_device *dev) { return dev->chip_id == KSZ8830_CHIP_ID;
From: Heiner Kallweit hkallweit1@gmail.com
[ Upstream commit 162d626f3013215b82b6514ca14f20932c7ccce5 ]
Referenced commit missed that for chip versions 42 and 43 ASPM remained disabled in the respective rtl_hw_start_...() routines. This resulted in problems as described in the referenced bug ticket. Therefore re-instantiate the previous logic.
Fixes: 5fc3f6c90cca ("r8169: consolidate disabling ASPM before EPHY access") Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217635 Signed-off-by: Heiner Kallweit hkallweit1@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/realtek/r8169_main.c | 7 +++++++ 1 file changed, 7 insertions(+)
diff --git a/drivers/net/ethernet/realtek/r8169_main.c b/drivers/net/ethernet/realtek/r8169_main.c index ca0140963ff3a..b69122686407d 100644 --- a/drivers/net/ethernet/realtek/r8169_main.c +++ b/drivers/net/ethernet/realtek/r8169_main.c @@ -2747,6 +2747,13 @@ static void rtl_hw_aspm_clkreq_enable(struct rtl8169_private *tp, bool enable) return;
if (enable) { + /* On these chip versions ASPM can even harm + * bus communication of other PCI devices. + */ + if (tp->mac_version == RTL_GIGA_MAC_VER_42 || + tp->mac_version == RTL_GIGA_MAC_VER_43) + return; + rtl_mod_config5(tp, 0, ASPM_en); rtl_mod_config2(tp, 0, ClkReqEn);
From: Andrzej Hajda andrzej.hajda@intel.com
[ Upstream commit 785b3f667b4bf98804cad135005e964df0c750de ]
Arrays passed to reg_in_range_table should end with empty record.
The patch solves KASAN detected bug with signature: BUG: KASAN: global-out-of-bounds in xehp_is_valid_b_counter_addr+0x2c7/0x350 [i915] Read of size 4 at addr ffffffffa1555d90 by task perf/1518
CPU: 4 PID: 1518 Comm: perf Tainted: G U 6.4.0-kasan_438-g3303d06107f3+ #1 Hardware name: Intel Corporation Meteor Lake Client Platform/MTL-P DDR5 SODIMM SBS RVP, BIOS MTLPFWI1.R00.3223.D80.2305311348 05/31/2023 Call Trace: <TASK> ... xehp_is_valid_b_counter_addr+0x2c7/0x350 [i915]
Fixes: 0fa9349dda03 ("drm/i915/perf: complete programming whitelisting for XEHPSDV") Signed-off-by: Andrzej Hajda andrzej.hajda@intel.com Reviewed-by: Andi Shyti andi.shyti@linux.intel.com Reviewed-by: Nirmoy Das nirmoy.das@intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20230711153410.1224997-1-andrz... (cherry picked from commit 2f42c5afb34b5696cf5fe79e744f99be9b218798) Signed-off-by: Tvrtko Ursulin tvrtko.ursulin@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/i915/i915_perf.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c index 3035cba2c6a29..d7caae281fb92 100644 --- a/drivers/gpu/drm/i915/i915_perf.c +++ b/drivers/gpu/drm/i915/i915_perf.c @@ -4442,6 +4442,7 @@ static const struct i915_range mtl_oam_b_counters[] = { static const struct i915_range xehp_oa_b_counters[] = { { .start = 0xdc48, .end = 0xdc48 }, /* OAA_ENABLE_REG */ { .start = 0xdd00, .end = 0xdd48 }, /* OAG_LCE0_0 - OAA_LENABLE_REG */ + {} };
static const struct i915_range gen7_oa_mux_regs[] = {
From: Ding Hui dinghui@sangfor.com.cn
[ Upstream commit 5f4fa1672d98fe99d2297b03add35346f1685d6b ]
We do netif_napi_add() for all allocated q_vectors[], but potentially do netif_napi_del() for part of them, then kfree q_vectors and leave invalid pointers at dev->napi_list.
Reproducer:
[root@host ~]# cat repro.sh #!/bin/bash
pf_dbsf="0000:41:00.0" vf0_dbsf="0000:41:02.0" g_pids=()
function do_set_numvf() { echo 2 >/sys/bus/pci/devices/${pf_dbsf}/sriov_numvfs sleep $((RANDOM%3+1)) echo 0 >/sys/bus/pci/devices/${pf_dbsf}/sriov_numvfs sleep $((RANDOM%3+1)) }
function do_set_channel() { local nic=$(ls -1 --indicator-style=none /sys/bus/pci/devices/${vf0_dbsf}/net/) [ -z "$nic" ] && { sleep $((RANDOM%3)) ; return 1; } ifconfig $nic 192.168.18.5 netmask 255.255.255.0 ifconfig $nic up ethtool -L $nic combined 1 ethtool -L $nic combined 4 sleep $((RANDOM%3)) }
function on_exit() { local pid for pid in "${g_pids[@]}"; do kill -0 "$pid" &>/dev/null && kill "$pid" &>/dev/null done g_pids=() }
trap "on_exit; exit" EXIT
while :; do do_set_numvf ; done & g_pids+=($!) while :; do do_set_channel ; done & g_pids+=($!)
wait
Result:
[ 4093.900222] ================================================================== [ 4093.900230] BUG: KASAN: use-after-free in free_netdev+0x308/0x390 [ 4093.900232] Read of size 8 at addr ffff88b4dc145640 by task repro.sh/6699 [ 4093.900233] [ 4093.900236] CPU: 10 PID: 6699 Comm: repro.sh Kdump: loaded Tainted: G O --------- -t - 4.18.0 #1 [ 4093.900238] Hardware name: Powerleader PR2008AL/H12DSi-N6, BIOS 2.0 04/09/2021 [ 4093.900239] Call Trace: [ 4093.900244] dump_stack+0x71/0xab [ 4093.900249] print_address_description+0x6b/0x290 [ 4093.900251] ? free_netdev+0x308/0x390 [ 4093.900252] kasan_report+0x14a/0x2b0 [ 4093.900254] free_netdev+0x308/0x390 [ 4093.900261] iavf_remove+0x825/0xd20 [iavf] [ 4093.900265] pci_device_remove+0xa8/0x1f0 [ 4093.900268] device_release_driver_internal+0x1c6/0x460 [ 4093.900271] pci_stop_bus_device+0x101/0x150 [ 4093.900273] pci_stop_and_remove_bus_device+0xe/0x20 [ 4093.900275] pci_iov_remove_virtfn+0x187/0x420 [ 4093.900277] ? pci_iov_add_virtfn+0xe10/0xe10 [ 4093.900278] ? pci_get_subsys+0x90/0x90 [ 4093.900280] sriov_disable+0xed/0x3e0 [ 4093.900282] ? bus_find_device+0x12d/0x1a0 [ 4093.900290] i40e_free_vfs+0x754/0x1210 [i40e] [ 4093.900298] ? i40e_reset_all_vfs+0x880/0x880 [i40e] [ 4093.900299] ? pci_get_device+0x7c/0x90 [ 4093.900300] ? pci_get_subsys+0x90/0x90 [ 4093.900306] ? pci_vfs_assigned.part.7+0x144/0x210 [ 4093.900309] ? __mutex_lock_slowpath+0x10/0x10 [ 4093.900315] i40e_pci_sriov_configure+0x1fa/0x2e0 [i40e] [ 4093.900318] sriov_numvfs_store+0x214/0x290 [ 4093.900320] ? sriov_totalvfs_show+0x30/0x30 [ 4093.900321] ? __mutex_lock_slowpath+0x10/0x10 [ 4093.900323] ? __check_object_size+0x15a/0x350 [ 4093.900326] kernfs_fop_write+0x280/0x3f0 [ 4093.900329] vfs_write+0x145/0x440 [ 4093.900330] ksys_write+0xab/0x160 [ 4093.900332] ? __ia32_sys_read+0xb0/0xb0 [ 4093.900334] ? fput_many+0x1a/0x120 [ 4093.900335] ? filp_close+0xf0/0x130 [ 4093.900338] do_syscall_64+0xa0/0x370 [ 4093.900339] ? page_fault+0x8/0x30 [ 4093.900341] entry_SYSCALL_64_after_hwframe+0x65/0xca [ 4093.900357] RIP: 0033:0x7f16ad4d22c0 [ 4093.900359] Code: 73 01 c3 48 8b 0d d8 cb 2c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 83 3d 89 24 2d 00 00 75 10 b8 01 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 fe dd 01 00 48 89 04 24 [ 4093.900360] RSP: 002b:00007ffd6491b7f8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 [ 4093.900362] RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007f16ad4d22c0 [ 4093.900363] RDX: 0000000000000002 RSI: 0000000001a41408 RDI: 0000000000000001 [ 4093.900364] RBP: 0000000001a41408 R08: 00007f16ad7a1780 R09: 00007f16ae1f2700 [ 4093.900364] R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000000002 [ 4093.900365] R13: 0000000000000001 R14: 00007f16ad7a0620 R15: 0000000000000001 [ 4093.900367] [ 4093.900368] Allocated by task 820: [ 4093.900371] kasan_kmalloc+0xa6/0xd0 [ 4093.900373] __kmalloc+0xfb/0x200 [ 4093.900376] iavf_init_interrupt_scheme+0x63b/0x1320 [iavf] [ 4093.900380] iavf_watchdog_task+0x3d51/0x52c0 [iavf] [ 4093.900382] process_one_work+0x56a/0x11f0 [ 4093.900383] worker_thread+0x8f/0xf40 [ 4093.900384] kthread+0x2a0/0x390 [ 4093.900385] ret_from_fork+0x1f/0x40 [ 4093.900387] 0xffffffffffffffff [ 4093.900387] [ 4093.900388] Freed by task 6699: [ 4093.900390] __kasan_slab_free+0x137/0x190 [ 4093.900391] kfree+0x8b/0x1b0 [ 4093.900394] iavf_free_q_vectors+0x11d/0x1a0 [iavf] [ 4093.900397] iavf_remove+0x35a/0xd20 [iavf] [ 4093.900399] pci_device_remove+0xa8/0x1f0 [ 4093.900400] device_release_driver_internal+0x1c6/0x460 [ 4093.900401] pci_stop_bus_device+0x101/0x150 [ 4093.900402] pci_stop_and_remove_bus_device+0xe/0x20 [ 4093.900403] pci_iov_remove_virtfn+0x187/0x420 [ 4093.900404] sriov_disable+0xed/0x3e0 [ 4093.900409] i40e_free_vfs+0x754/0x1210 [i40e] [ 4093.900415] i40e_pci_sriov_configure+0x1fa/0x2e0 [i40e] [ 4093.900416] sriov_numvfs_store+0x214/0x290 [ 4093.900417] kernfs_fop_write+0x280/0x3f0 [ 4093.900418] vfs_write+0x145/0x440 [ 4093.900419] ksys_write+0xab/0x160 [ 4093.900420] do_syscall_64+0xa0/0x370 [ 4093.900421] entry_SYSCALL_64_after_hwframe+0x65/0xca [ 4093.900422] 0xffffffffffffffff [ 4093.900422] [ 4093.900424] The buggy address belongs to the object at ffff88b4dc144200 which belongs to the cache kmalloc-8k of size 8192 [ 4093.900425] The buggy address is located 5184 bytes inside of 8192-byte region [ffff88b4dc144200, ffff88b4dc146200) [ 4093.900425] The buggy address belongs to the page: [ 4093.900427] page:ffffea00d3705000 refcount:1 mapcount:0 mapping:ffff88bf04415c80 index:0x0 compound_mapcount: 0 [ 4093.900430] flags: 0x10000000008100(slab|head) [ 4093.900433] raw: 0010000000008100 dead000000000100 dead000000000200 ffff88bf04415c80 [ 4093.900434] raw: 0000000000000000 0000000000030003 00000001ffffffff 0000000000000000 [ 4093.900434] page dumped because: kasan: bad access detected [ 4093.900435] [ 4093.900435] Memory state around the buggy address: [ 4093.900436] ffff88b4dc145500: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 4093.900437] ffff88b4dc145580: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 4093.900438] >ffff88b4dc145600: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 4093.900438] ^ [ 4093.900439] ffff88b4dc145680: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 4093.900440] ffff88b4dc145700: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 4093.900440] ==================================================================
Although the patch #2 (of 2) can avoid the issue triggered by this repro.sh, there still are other potential risks that if num_active_queues is changed to less than allocated q_vectors[] by unexpected, the mismatched netif_napi_add/del() can also cause UAF.
Since we actually call netif_napi_add() for all allocated q_vectors unconditionally in iavf_alloc_q_vectors(), so we should fix it by letting netif_napi_del() match to netif_napi_add().
Fixes: 5eae00c57f5e ("i40evf: main driver core") Signed-off-by: Ding Hui dinghui@sangfor.com.cn Cc: Donglin Peng pengdonglin@sangfor.com.cn Cc: Huang Cun huangcun@sangfor.com.cn Reviewed-by: Simon Horman simon.horman@corigine.com Reviewed-by: Madhu Chittim madhu.chittim@intel.com Reviewed-by: Leon Romanovsky leonro@nvidia.com Tested-by: Rafal Romanowski rafal.romanowski@intel.com Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/iavf/iavf_main.c | 5 +---- 1 file changed, 1 insertion(+), 4 deletions(-)
diff --git a/drivers/net/ethernet/intel/iavf/iavf_main.c b/drivers/net/ethernet/intel/iavf/iavf_main.c index 4a66873882d12..601de8e8f3654 100644 --- a/drivers/net/ethernet/intel/iavf/iavf_main.c +++ b/drivers/net/ethernet/intel/iavf/iavf_main.c @@ -1840,19 +1840,16 @@ static int iavf_alloc_q_vectors(struct iavf_adapter *adapter) static void iavf_free_q_vectors(struct iavf_adapter *adapter) { int q_idx, num_q_vectors; - int napi_vectors;
if (!adapter->q_vectors) return;
num_q_vectors = adapter->num_msix_vectors - NONQ_VECS; - napi_vectors = adapter->num_active_queues;
for (q_idx = 0; q_idx < num_q_vectors; q_idx++) { struct iavf_q_vector *q_vector = &adapter->q_vectors[q_idx];
- if (q_idx < napi_vectors) - netif_napi_del(&q_vector->napi); + netif_napi_del(&q_vector->napi); } kfree(adapter->q_vectors); adapter->q_vectors = NULL;
From: Ding Hui dinghui@sangfor.com.cn
[ Upstream commit 7c4bced3caa749ce468b0c5de711c98476b23a52 ]
If we set channels greater during iavf_remove(), and waiting reset done would be timeout, then returned with error but changed num_active_queues directly, that will lead to OOB like the following logs. Because the num_active_queues is greater than tx/rx_rings[] allocated actually.
Reproducer:
[root@host ~]# cat repro.sh #!/bin/bash
pf_dbsf="0000:41:00.0" vf0_dbsf="0000:41:02.0" g_pids=()
function do_set_numvf() { echo 2 >/sys/bus/pci/devices/${pf_dbsf}/sriov_numvfs sleep $((RANDOM%3+1)) echo 0 >/sys/bus/pci/devices/${pf_dbsf}/sriov_numvfs sleep $((RANDOM%3+1)) }
function do_set_channel() { local nic=$(ls -1 --indicator-style=none /sys/bus/pci/devices/${vf0_dbsf}/net/) [ -z "$nic" ] && { sleep $((RANDOM%3)) ; return 1; } ifconfig $nic 192.168.18.5 netmask 255.255.255.0 ifconfig $nic up ethtool -L $nic combined 1 ethtool -L $nic combined 4 sleep $((RANDOM%3)) }
function on_exit() { local pid for pid in "${g_pids[@]}"; do kill -0 "$pid" &>/dev/null && kill "$pid" &>/dev/null done g_pids=() }
trap "on_exit; exit" EXIT
while :; do do_set_numvf ; done & g_pids+=($!) while :; do do_set_channel ; done & g_pids+=($!)
wait
Result:
[ 3506.152887] iavf 0000:41:02.0: Removing device [ 3510.400799] ================================================================== [ 3510.400820] BUG: KASAN: slab-out-of-bounds in iavf_free_all_tx_resources+0x156/0x160 [iavf] [ 3510.400823] Read of size 8 at addr ffff88b6f9311008 by task repro.sh/55536 [ 3510.400823] [ 3510.400830] CPU: 101 PID: 55536 Comm: repro.sh Kdump: loaded Tainted: G O --------- -t - 4.18.0 #1 [ 3510.400832] Hardware name: Powerleader PR2008AL/H12DSi-N6, BIOS 2.0 04/09/2021 [ 3510.400835] Call Trace: [ 3510.400851] dump_stack+0x71/0xab [ 3510.400860] print_address_description+0x6b/0x290 [ 3510.400865] ? iavf_free_all_tx_resources+0x156/0x160 [iavf] [ 3510.400868] kasan_report+0x14a/0x2b0 [ 3510.400873] iavf_free_all_tx_resources+0x156/0x160 [iavf] [ 3510.400880] iavf_remove+0x2b6/0xc70 [iavf] [ 3510.400884] ? iavf_free_all_rx_resources+0x160/0x160 [iavf] [ 3510.400891] ? wait_woken+0x1d0/0x1d0 [ 3510.400895] ? notifier_call_chain+0xc1/0x130 [ 3510.400903] pci_device_remove+0xa8/0x1f0 [ 3510.400910] device_release_driver_internal+0x1c6/0x460 [ 3510.400916] pci_stop_bus_device+0x101/0x150 [ 3510.400919] pci_stop_and_remove_bus_device+0xe/0x20 [ 3510.400924] pci_iov_remove_virtfn+0x187/0x420 [ 3510.400927] ? pci_iov_add_virtfn+0xe10/0xe10 [ 3510.400929] ? pci_get_subsys+0x90/0x90 [ 3510.400932] sriov_disable+0xed/0x3e0 [ 3510.400936] ? bus_find_device+0x12d/0x1a0 [ 3510.400953] i40e_free_vfs+0x754/0x1210 [i40e] [ 3510.400966] ? i40e_reset_all_vfs+0x880/0x880 [i40e] [ 3510.400968] ? pci_get_device+0x7c/0x90 [ 3510.400970] ? pci_get_subsys+0x90/0x90 [ 3510.400982] ? pci_vfs_assigned.part.7+0x144/0x210 [ 3510.400987] ? __mutex_lock_slowpath+0x10/0x10 [ 3510.400996] i40e_pci_sriov_configure+0x1fa/0x2e0 [i40e] [ 3510.401001] sriov_numvfs_store+0x214/0x290 [ 3510.401005] ? sriov_totalvfs_show+0x30/0x30 [ 3510.401007] ? __mutex_lock_slowpath+0x10/0x10 [ 3510.401011] ? __check_object_size+0x15a/0x350 [ 3510.401018] kernfs_fop_write+0x280/0x3f0 [ 3510.401022] vfs_write+0x145/0x440 [ 3510.401025] ksys_write+0xab/0x160 [ 3510.401028] ? __ia32_sys_read+0xb0/0xb0 [ 3510.401031] ? fput_many+0x1a/0x120 [ 3510.401032] ? filp_close+0xf0/0x130 [ 3510.401038] do_syscall_64+0xa0/0x370 [ 3510.401041] ? page_fault+0x8/0x30 [ 3510.401043] entry_SYSCALL_64_after_hwframe+0x65/0xca [ 3510.401073] RIP: 0033:0x7f3a9bb842c0 [ 3510.401079] Code: 73 01 c3 48 8b 0d d8 cb 2c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 83 3d 89 24 2d 00 00 75 10 b8 01 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 fe dd 01 00 48 89 04 24 [ 3510.401080] RSP: 002b:00007ffc05f1fe18 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 [ 3510.401083] RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007f3a9bb842c0 [ 3510.401085] RDX: 0000000000000002 RSI: 0000000002327408 RDI: 0000000000000001 [ 3510.401086] RBP: 0000000002327408 R08: 00007f3a9be53780 R09: 00007f3a9c8a4700 [ 3510.401086] R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000000002 [ 3510.401087] R13: 0000000000000001 R14: 00007f3a9be52620 R15: 0000000000000001 [ 3510.401090] [ 3510.401093] Allocated by task 76795: [ 3510.401098] kasan_kmalloc+0xa6/0xd0 [ 3510.401099] __kmalloc+0xfb/0x200 [ 3510.401104] iavf_init_interrupt_scheme+0x26f/0x1310 [iavf] [ 3510.401108] iavf_watchdog_task+0x1d58/0x4050 [iavf] [ 3510.401114] process_one_work+0x56a/0x11f0 [ 3510.401115] worker_thread+0x8f/0xf40 [ 3510.401117] kthread+0x2a0/0x390 [ 3510.401119] ret_from_fork+0x1f/0x40 [ 3510.401122] 0xffffffffffffffff [ 3510.401123]
In timeout handling, we should keep the original num_active_queues and reset num_req_queues to 0.
Fixes: 4e5e6b5d9d13 ("iavf: Fix return of set the new channel count") Signed-off-by: Ding Hui dinghui@sangfor.com.cn Cc: Donglin Peng pengdonglin@sangfor.com.cn Cc: Huang Cun huangcun@sangfor.com.cn Reviewed-by: Leon Romanovsky leonro@nvidia.com Tested-by: Rafal Romanowski rafal.romanowski@intel.com Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/iavf/iavf_ethtool.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/intel/iavf/iavf_ethtool.c b/drivers/net/ethernet/intel/iavf/iavf_ethtool.c index 6f171d1d85b75..92443f8e9fbdf 100644 --- a/drivers/net/ethernet/intel/iavf/iavf_ethtool.c +++ b/drivers/net/ethernet/intel/iavf/iavf_ethtool.c @@ -1863,7 +1863,7 @@ static int iavf_set_channels(struct net_device *netdev, } if (i == IAVF_RESET_WAIT_COMPLETE_COUNT) { adapter->flags &= ~IAVF_FLAG_REINIT_ITR_NEEDED; - adapter->num_active_queues = num_req; + adapter->num_req_queues = 0; return -EOPNOTSUPP; }
From: Ahmed Zaki ahmed.zaki@intel.com
[ Upstream commit a77ed5c5b768e9649be240a2d864e5cd9c6a2015 ]
If the system tries to close the netdev while iavf_reset_task() is running, __LINK_STATE_START will be cleared and netif_running() will return false in iavf_reinit_interrupt_scheme(). This will result in iavf_free_traffic_irqs() not being called and a leak as follows:
[7632.489326] remove_proc_entry: removing non-empty directory 'irq/999', leaking at least 'iavf-enp24s0f0v0-TxRx-0' [7632.490214] WARNING: CPU: 0 PID: 10 at fs/proc/generic.c:718 remove_proc_entry+0x19b/0x1b0
is shown when pci_disable_msix() is later called. Fix by using the internal adapter state. The traffic IRQs will always exist if state == __IAVF_RUNNING.
Fixes: 5b36e8d04b44 ("i40evf: Enable VF to request an alternate queue allocation") Signed-off-by: Ahmed Zaki ahmed.zaki@intel.com Tested-by: Rafal Romanowski rafal.romanowski@intel.com Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/iavf/iavf_main.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/drivers/net/ethernet/intel/iavf/iavf_main.c b/drivers/net/ethernet/intel/iavf/iavf_main.c index 601de8e8f3654..b698f8917f049 100644 --- a/drivers/net/ethernet/intel/iavf/iavf_main.c +++ b/drivers/net/ethernet/intel/iavf/iavf_main.c @@ -1941,15 +1941,16 @@ static void iavf_free_rss(struct iavf_adapter *adapter) /** * iavf_reinit_interrupt_scheme - Reallocate queues and vectors * @adapter: board private structure + * @running: true if adapter->state == __IAVF_RUNNING * * Returns 0 on success, negative on failure **/ -static int iavf_reinit_interrupt_scheme(struct iavf_adapter *adapter) +static int iavf_reinit_interrupt_scheme(struct iavf_adapter *adapter, bool running) { struct net_device *netdev = adapter->netdev; int err;
- if (netif_running(netdev)) + if (running) iavf_free_traffic_irqs(adapter); iavf_free_misc_irq(adapter); iavf_reset_interrupt_capability(adapter); @@ -3065,7 +3066,7 @@ static void iavf_reset_task(struct work_struct *work)
if ((adapter->flags & IAVF_FLAG_REINIT_MSIX_NEEDED) || (adapter->flags & IAVF_FLAG_REINIT_ITR_NEEDED)) { - err = iavf_reinit_interrupt_scheme(adapter); + err = iavf_reinit_interrupt_scheme(adapter, running); if (err) goto reset_err; }
From: Przemek Kitszel przemyslaw.kitszel@intel.com
[ Upstream commit a4aadf0f5905661cd25c366b96cc1c840f05b756 ]
Make all possible functions static.
Move iavf_force_wb() up to avoid forward declaration.
Suggested-by: Maciej Fijalkowski maciej.fijalkowski@intel.com Reviewed-by: Maciej Fijalkowski maciej.fijalkowski@intel.com Signed-off-by: Przemek Kitszel przemyslaw.kitszel@intel.com Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Stable-dep-of: c2ed2403f12c ("iavf: Wait for reset in callbacks which trigger it") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/iavf/iavf.h | 10 ----- drivers/net/ethernet/intel/iavf/iavf_main.c | 14 +++---- drivers/net/ethernet/intel/iavf/iavf_txrx.c | 43 ++++++++++----------- drivers/net/ethernet/intel/iavf/iavf_txrx.h | 4 -- 4 files changed, 28 insertions(+), 43 deletions(-)
diff --git a/drivers/net/ethernet/intel/iavf/iavf.h b/drivers/net/ethernet/intel/iavf/iavf.h index 39d0fe76a38ff..f80f2735e6886 100644 --- a/drivers/net/ethernet/intel/iavf/iavf.h +++ b/drivers/net/ethernet/intel/iavf/iavf.h @@ -523,9 +523,6 @@ void iavf_schedule_request_stats(struct iavf_adapter *adapter); void iavf_reset(struct iavf_adapter *adapter); void iavf_set_ethtool_ops(struct net_device *netdev); void iavf_update_stats(struct iavf_adapter *adapter); -void iavf_reset_interrupt_capability(struct iavf_adapter *adapter); -int iavf_init_interrupt_scheme(struct iavf_adapter *adapter); -void iavf_irq_enable_queues(struct iavf_adapter *adapter); void iavf_free_all_tx_resources(struct iavf_adapter *adapter); void iavf_free_all_rx_resources(struct iavf_adapter *adapter);
@@ -579,17 +576,10 @@ void iavf_enable_vlan_stripping_v2(struct iavf_adapter *adapter, u16 tpid); void iavf_disable_vlan_stripping_v2(struct iavf_adapter *adapter, u16 tpid); void iavf_enable_vlan_insertion_v2(struct iavf_adapter *adapter, u16 tpid); void iavf_disable_vlan_insertion_v2(struct iavf_adapter *adapter, u16 tpid); -int iavf_replace_primary_mac(struct iavf_adapter *adapter, - const u8 *new_mac); -void -iavf_set_vlan_offload_features(struct iavf_adapter *adapter, - netdev_features_t prev_features, - netdev_features_t features); void iavf_add_fdir_filter(struct iavf_adapter *adapter); void iavf_del_fdir_filter(struct iavf_adapter *adapter); void iavf_add_adv_rss_cfg(struct iavf_adapter *adapter); void iavf_del_adv_rss_cfg(struct iavf_adapter *adapter); struct iavf_mac_filter *iavf_add_filter(struct iavf_adapter *adapter, const u8 *macaddr); -int iavf_lock_timeout(struct mutex *lock, unsigned int msecs); #endif /* _IAVF_H_ */ diff --git a/drivers/net/ethernet/intel/iavf/iavf_main.c b/drivers/net/ethernet/intel/iavf/iavf_main.c index b698f8917f049..b24e54823e6ae 100644 --- a/drivers/net/ethernet/intel/iavf/iavf_main.c +++ b/drivers/net/ethernet/intel/iavf/iavf_main.c @@ -253,7 +253,7 @@ enum iavf_status iavf_free_virt_mem_d(struct iavf_hw *hw, * * Returns 0 on success, negative on failure **/ -int iavf_lock_timeout(struct mutex *lock, unsigned int msecs) +static int iavf_lock_timeout(struct mutex *lock, unsigned int msecs) { unsigned int wait, delay = 10;
@@ -362,7 +362,7 @@ static void iavf_irq_disable(struct iavf_adapter *adapter) * iavf_irq_enable_queues - Enable interrupt for all queues * @adapter: board private structure **/ -void iavf_irq_enable_queues(struct iavf_adapter *adapter) +static void iavf_irq_enable_queues(struct iavf_adapter *adapter) { struct iavf_hw *hw = &adapter->hw; int i; @@ -1003,8 +1003,8 @@ struct iavf_mac_filter *iavf_add_filter(struct iavf_adapter *adapter, * * Do not call this with mac_vlan_list_lock! **/ -int iavf_replace_primary_mac(struct iavf_adapter *adapter, - const u8 *new_mac) +static int iavf_replace_primary_mac(struct iavf_adapter *adapter, + const u8 *new_mac) { struct iavf_hw *hw = &adapter->hw; struct iavf_mac_filter *f; @@ -1860,7 +1860,7 @@ static void iavf_free_q_vectors(struct iavf_adapter *adapter) * @adapter: board private structure * **/ -void iavf_reset_interrupt_capability(struct iavf_adapter *adapter) +static void iavf_reset_interrupt_capability(struct iavf_adapter *adapter) { if (!adapter->msix_entries) return; @@ -1875,7 +1875,7 @@ void iavf_reset_interrupt_capability(struct iavf_adapter *adapter) * @adapter: board private structure to initialize * **/ -int iavf_init_interrupt_scheme(struct iavf_adapter *adapter) +static int iavf_init_interrupt_scheme(struct iavf_adapter *adapter) { int err;
@@ -2174,7 +2174,7 @@ static int iavf_process_aq_command(struct iavf_adapter *adapter) * the watchdog if any changes are requested to expedite the request via * virtchnl. **/ -void +static void iavf_set_vlan_offload_features(struct iavf_adapter *adapter, netdev_features_t prev_features, netdev_features_t features) diff --git a/drivers/net/ethernet/intel/iavf/iavf_txrx.c b/drivers/net/ethernet/intel/iavf/iavf_txrx.c index e989feda133c1..8c5f6096b0022 100644 --- a/drivers/net/ethernet/intel/iavf/iavf_txrx.c +++ b/drivers/net/ethernet/intel/iavf/iavf_txrx.c @@ -54,7 +54,7 @@ static void iavf_unmap_and_free_tx_resource(struct iavf_ring *ring, * iavf_clean_tx_ring - Free any empty Tx buffers * @tx_ring: ring to be cleaned **/ -void iavf_clean_tx_ring(struct iavf_ring *tx_ring) +static void iavf_clean_tx_ring(struct iavf_ring *tx_ring) { unsigned long bi_size; u16 i; @@ -110,7 +110,7 @@ void iavf_free_tx_resources(struct iavf_ring *tx_ring) * Since there is no access to the ring head register * in XL710, we need to use our local copies **/ -u32 iavf_get_tx_pending(struct iavf_ring *ring, bool in_sw) +static u32 iavf_get_tx_pending(struct iavf_ring *ring, bool in_sw) { u32 head, tail;
@@ -127,6 +127,24 @@ u32 iavf_get_tx_pending(struct iavf_ring *ring, bool in_sw) return 0; }
+/** + * iavf_force_wb - Issue SW Interrupt so HW does a wb + * @vsi: the VSI we care about + * @q_vector: the vector on which to force writeback + **/ +static void iavf_force_wb(struct iavf_vsi *vsi, struct iavf_q_vector *q_vector) +{ + u32 val = IAVF_VFINT_DYN_CTLN1_INTENA_MASK | + IAVF_VFINT_DYN_CTLN1_ITR_INDX_MASK | /* set noitr */ + IAVF_VFINT_DYN_CTLN1_SWINT_TRIG_MASK | + IAVF_VFINT_DYN_CTLN1_SW_ITR_INDX_ENA_MASK + /* allow 00 to be written to the index */; + + wr32(&vsi->back->hw, + IAVF_VFINT_DYN_CTLN1(q_vector->reg_idx), + val); +} + /** * iavf_detect_recover_hung - Function to detect and recover hung_queues * @vsi: pointer to vsi struct with tx queues @@ -352,25 +370,6 @@ static void iavf_enable_wb_on_itr(struct iavf_vsi *vsi, q_vector->arm_wb_state = true; }
-/** - * iavf_force_wb - Issue SW Interrupt so HW does a wb - * @vsi: the VSI we care about - * @q_vector: the vector on which to force writeback - * - **/ -void iavf_force_wb(struct iavf_vsi *vsi, struct iavf_q_vector *q_vector) -{ - u32 val = IAVF_VFINT_DYN_CTLN1_INTENA_MASK | - IAVF_VFINT_DYN_CTLN1_ITR_INDX_MASK | /* set noitr */ - IAVF_VFINT_DYN_CTLN1_SWINT_TRIG_MASK | - IAVF_VFINT_DYN_CTLN1_SW_ITR_INDX_ENA_MASK - /* allow 00 to be written to the index */; - - wr32(&vsi->back->hw, - IAVF_VFINT_DYN_CTLN1(q_vector->reg_idx), - val); -} - static inline bool iavf_container_is_rx(struct iavf_q_vector *q_vector, struct iavf_ring_container *rc) { @@ -687,7 +686,7 @@ int iavf_setup_tx_descriptors(struct iavf_ring *tx_ring) * iavf_clean_rx_ring - Free Rx buffers * @rx_ring: ring to be cleaned **/ -void iavf_clean_rx_ring(struct iavf_ring *rx_ring) +static void iavf_clean_rx_ring(struct iavf_ring *rx_ring) { unsigned long bi_size; u16 i; diff --git a/drivers/net/ethernet/intel/iavf/iavf_txrx.h b/drivers/net/ethernet/intel/iavf/iavf_txrx.h index 2624bf6d009e3..7e6ee32d19b69 100644 --- a/drivers/net/ethernet/intel/iavf/iavf_txrx.h +++ b/drivers/net/ethernet/intel/iavf/iavf_txrx.h @@ -442,15 +442,11 @@ static inline unsigned int iavf_rx_pg_order(struct iavf_ring *ring)
bool iavf_alloc_rx_buffers(struct iavf_ring *rxr, u16 cleaned_count); netdev_tx_t iavf_xmit_frame(struct sk_buff *skb, struct net_device *netdev); -void iavf_clean_tx_ring(struct iavf_ring *tx_ring); -void iavf_clean_rx_ring(struct iavf_ring *rx_ring); int iavf_setup_tx_descriptors(struct iavf_ring *tx_ring); int iavf_setup_rx_descriptors(struct iavf_ring *rx_ring); void iavf_free_tx_resources(struct iavf_ring *tx_ring); void iavf_free_rx_resources(struct iavf_ring *rx_ring); int iavf_napi_poll(struct napi_struct *napi, int budget); -void iavf_force_wb(struct iavf_vsi *vsi, struct iavf_q_vector *q_vector); -u32 iavf_get_tx_pending(struct iavf_ring *ring, bool in_sw); void iavf_detect_recover_hung(struct iavf_vsi *vsi); int __iavf_maybe_stop_tx(struct iavf_ring *tx_ring, int size); bool __iavf_chk_linearize(struct sk_buff *skb);
From: Marcin Szycik marcin.szycik@linux.intel.com
[ Upstream commit c2ed2403f12c74a74a0091ed5d830e72c58406e8 ]
There was a fail when trying to add the interface to bonding right after changing the MTU on the interface. It was caused by bonding interface unable to open the interface due to interface being in __RESETTING state because of MTU change.
Add new reset_waitqueue to indicate that reset has finished.
Add waiting for reset to finish in callbacks which trigger hw reset: iavf_set_priv_flags(), iavf_change_mtu() and iavf_set_ringparam(). We use a 5000ms timeout period because on Hyper-V based systems, this operation takes around 3000-4000ms. In normal circumstances, it doesn't take more than 500ms to complete.
Add a function iavf_wait_for_reset() to reuse waiting for reset code and use it also in iavf_set_channels(), which already waits for reset. We don't use error handling in iavf_set_channels() as this could cause the device to be in incorrect state if the reset was scheduled but hit timeout or the waitng function was interrupted by a signal.
Fixes: 4e5e6b5d9d13 ("iavf: Fix return of set the new channel count") Signed-off-by: Marcin Szycik marcin.szycik@linux.intel.com Co-developed-by: Dawid Wesierski dawidx.wesierski@intel.com Signed-off-by: Dawid Wesierski dawidx.wesierski@intel.com Signed-off-by: Sylwester Dziedziuch sylwesterx.dziedziuch@intel.com Signed-off-by: Kamil Maziarz kamil.maziarz@intel.com Signed-off-by: Mateusz Palczewski mateusz.palczewski@intel.com Tested-by: Rafal Romanowski rafal.romanowski@intel.com Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/iavf/iavf.h | 2 + .../net/ethernet/intel/iavf/iavf_ethtool.c | 31 ++++++----- drivers/net/ethernet/intel/iavf/iavf_main.c | 51 ++++++++++++++++++- .../net/ethernet/intel/iavf/iavf_virtchnl.c | 1 + 4 files changed, 68 insertions(+), 17 deletions(-)
diff --git a/drivers/net/ethernet/intel/iavf/iavf.h b/drivers/net/ethernet/intel/iavf/iavf.h index f80f2735e6886..a5cab19eb6a8b 100644 --- a/drivers/net/ethernet/intel/iavf/iavf.h +++ b/drivers/net/ethernet/intel/iavf/iavf.h @@ -257,6 +257,7 @@ struct iavf_adapter { struct work_struct adminq_task; struct delayed_work client_task; wait_queue_head_t down_waitqueue; + wait_queue_head_t reset_waitqueue; wait_queue_head_t vc_waitqueue; struct iavf_q_vector *q_vectors; struct list_head vlan_filter_list; @@ -582,4 +583,5 @@ void iavf_add_adv_rss_cfg(struct iavf_adapter *adapter); void iavf_del_adv_rss_cfg(struct iavf_adapter *adapter); struct iavf_mac_filter *iavf_add_filter(struct iavf_adapter *adapter, const u8 *macaddr); +int iavf_wait_for_reset(struct iavf_adapter *adapter); #endif /* _IAVF_H_ */ diff --git a/drivers/net/ethernet/intel/iavf/iavf_ethtool.c b/drivers/net/ethernet/intel/iavf/iavf_ethtool.c index 92443f8e9fbdf..b7141c2a941d1 100644 --- a/drivers/net/ethernet/intel/iavf/iavf_ethtool.c +++ b/drivers/net/ethernet/intel/iavf/iavf_ethtool.c @@ -484,6 +484,7 @@ static int iavf_set_priv_flags(struct net_device *netdev, u32 flags) { struct iavf_adapter *adapter = netdev_priv(netdev); u32 orig_flags, new_flags, changed_flags; + int ret = 0; u32 i;
orig_flags = READ_ONCE(adapter->flags); @@ -533,10 +534,13 @@ static int iavf_set_priv_flags(struct net_device *netdev, u32 flags) if (netif_running(netdev)) { adapter->flags |= IAVF_FLAG_RESET_NEEDED; queue_work(adapter->wq, &adapter->reset_task); + ret = iavf_wait_for_reset(adapter); + if (ret) + netdev_warn(netdev, "Changing private flags timeout or interrupted waiting for reset"); } }
- return 0; + return ret; }
/** @@ -627,6 +631,7 @@ static int iavf_set_ringparam(struct net_device *netdev, { struct iavf_adapter *adapter = netdev_priv(netdev); u32 new_rx_count, new_tx_count; + int ret = 0;
if ((ring->rx_mini_pending) || (ring->rx_jumbo_pending)) return -EINVAL; @@ -673,9 +678,12 @@ static int iavf_set_ringparam(struct net_device *netdev, if (netif_running(netdev)) { adapter->flags |= IAVF_FLAG_RESET_NEEDED; queue_work(adapter->wq, &adapter->reset_task); + ret = iavf_wait_for_reset(adapter); + if (ret) + netdev_warn(netdev, "Changing ring parameters timeout or interrupted waiting for reset"); }
- return 0; + return ret; }
/** @@ -1830,7 +1838,7 @@ static int iavf_set_channels(struct net_device *netdev, { struct iavf_adapter *adapter = netdev_priv(netdev); u32 num_req = ch->combined_count; - int i; + int ret = 0;
if ((adapter->vf_res->vf_cap_flags & VIRTCHNL_VF_OFFLOAD_ADQ) && adapter->num_tc) { @@ -1854,20 +1862,11 @@ static int iavf_set_channels(struct net_device *netdev, adapter->flags |= IAVF_FLAG_REINIT_ITR_NEEDED; iavf_schedule_reset(adapter);
- /* wait for the reset is done */ - for (i = 0; i < IAVF_RESET_WAIT_COMPLETE_COUNT; i++) { - msleep(IAVF_RESET_WAIT_MS); - if (adapter->flags & IAVF_FLAG_RESET_PENDING) - continue; - break; - } - if (i == IAVF_RESET_WAIT_COMPLETE_COUNT) { - adapter->flags &= ~IAVF_FLAG_REINIT_ITR_NEEDED; - adapter->num_req_queues = 0; - return -EOPNOTSUPP; - } + ret = iavf_wait_for_reset(adapter); + if (ret) + netdev_warn(netdev, "Changing channel count timeout or interrupted waiting for reset");
- return 0; + return ret; }
/** diff --git a/drivers/net/ethernet/intel/iavf/iavf_main.c b/drivers/net/ethernet/intel/iavf/iavf_main.c index b24e54823e6ae..8cb9b74b3ebea 100644 --- a/drivers/net/ethernet/intel/iavf/iavf_main.c +++ b/drivers/net/ethernet/intel/iavf/iavf_main.c @@ -166,6 +166,45 @@ static struct iavf_adapter *iavf_pdev_to_adapter(struct pci_dev *pdev) return netdev_priv(pci_get_drvdata(pdev)); }
+/** + * iavf_is_reset_in_progress - Check if a reset is in progress + * @adapter: board private structure + */ +static bool iavf_is_reset_in_progress(struct iavf_adapter *adapter) +{ + if (adapter->state == __IAVF_RESETTING || + adapter->flags & (IAVF_FLAG_RESET_PENDING | + IAVF_FLAG_RESET_NEEDED)) + return true; + + return false; +} + +/** + * iavf_wait_for_reset - Wait for reset to finish. + * @adapter: board private structure + * + * Returns 0 if reset finished successfully, negative on timeout or interrupt. + */ +int iavf_wait_for_reset(struct iavf_adapter *adapter) +{ + int ret = wait_event_interruptible_timeout(adapter->reset_waitqueue, + !iavf_is_reset_in_progress(adapter), + msecs_to_jiffies(5000)); + + /* If ret < 0 then it means wait was interrupted. + * If ret == 0 then it means we got a timeout while waiting + * for reset to finish. + * If ret > 0 it means reset has finished. + */ + if (ret > 0) + return 0; + else if (ret < 0) + return -EINTR; + else + return -EBUSY; +} + /** * iavf_allocate_dma_mem_d - OS specific memory alloc for shared code * @hw: pointer to the HW structure @@ -3161,6 +3200,7 @@ static void iavf_reset_task(struct work_struct *work)
adapter->flags &= ~IAVF_FLAG_REINIT_ITR_NEEDED;
+ wake_up(&adapter->reset_waitqueue); mutex_unlock(&adapter->client_lock); mutex_unlock(&adapter->crit_lock);
@@ -4325,6 +4365,7 @@ static int iavf_close(struct net_device *netdev) static int iavf_change_mtu(struct net_device *netdev, int new_mtu) { struct iavf_adapter *adapter = netdev_priv(netdev); + int ret = 0;
netdev_dbg(netdev, "changing MTU from %d to %d\n", netdev->mtu, new_mtu); @@ -4337,9 +4378,14 @@ static int iavf_change_mtu(struct net_device *netdev, int new_mtu) if (netif_running(netdev)) { adapter->flags |= IAVF_FLAG_RESET_NEEDED; queue_work(adapter->wq, &adapter->reset_task); + ret = iavf_wait_for_reset(adapter); + if (ret < 0) + netdev_warn(netdev, "MTU change interrupted waiting for reset"); + else if (ret) + netdev_warn(netdev, "MTU change timed out waiting for reset"); }
- return 0; + return ret; }
#define NETIF_VLAN_OFFLOAD_FEATURES (NETIF_F_HW_VLAN_CTAG_RX | \ @@ -4940,6 +4986,9 @@ static int iavf_probe(struct pci_dev *pdev, const struct pci_device_id *ent) /* Setup the wait queue for indicating transition to down status */ init_waitqueue_head(&adapter->down_waitqueue);
+ /* Setup the wait queue for indicating transition to running state */ + init_waitqueue_head(&adapter->reset_waitqueue); + /* Setup the wait queue for indicating virtchannel events */ init_waitqueue_head(&adapter->vc_waitqueue);
diff --git a/drivers/net/ethernet/intel/iavf/iavf_virtchnl.c b/drivers/net/ethernet/intel/iavf/iavf_virtchnl.c index 7c0578b5457b9..1bab896aaf40c 100644 --- a/drivers/net/ethernet/intel/iavf/iavf_virtchnl.c +++ b/drivers/net/ethernet/intel/iavf/iavf_virtchnl.c @@ -2285,6 +2285,7 @@ void iavf_virtchnl_completion(struct iavf_adapter *adapter, case VIRTCHNL_OP_ENABLE_QUEUES: /* enable transmits */ iavf_irq_enable(adapter, true); + wake_up(&adapter->reset_waitqueue); adapter->flags &= ~IAVF_FLAG_QUEUES_DISABLED; break; case VIRTCHNL_OP_DISABLE_QUEUES:
From: Ahmed Zaki ahmed.zaki@intel.com
[ Upstream commit d1639a17319ba78a018280cd2df6577a7e5d9fab ]
A driver's lock (crit_lock) is used to serialize all the driver's tasks. Lockdep, however, shows a circular dependency between rtnl and crit_lock. This happens when an ndo that already holds the rtnl requests the driver to reset, since the reset task (in some paths) tries to grab rtnl to either change real number of queues of update netdev features.
[566.241851] ====================================================== [566.241893] WARNING: possible circular locking dependency detected [566.241936] 6.2.14-100.fc36.x86_64+debug #1 Tainted: G OE [566.241984] ------------------------------------------------------ [566.242025] repro.sh/2604 is trying to acquire lock: [566.242061] ffff9280fc5ceee8 (&adapter->crit_lock){+.+.}-{3:3}, at: iavf_close+0x3c/0x240 [iavf] [566.242167] but task is already holding lock: [566.242209] ffffffff9976d350 (rtnl_mutex){+.+.}-{3:3}, at: iavf_remove+0x6b5/0x730 [iavf] [566.242300] which lock already depends on the new lock.
[566.242353] the existing dependency chain (in reverse order) is: [566.242401] -> #1 (rtnl_mutex){+.+.}-{3:3}: [566.242451] __mutex_lock+0xc1/0xbb0 [566.242489] iavf_init_interrupt_scheme+0x179/0x440 [iavf] [566.242560] iavf_watchdog_task+0x80b/0x1400 [iavf] [566.242627] process_one_work+0x2b3/0x560 [566.242663] worker_thread+0x4f/0x3a0 [566.242696] kthread+0xf2/0x120 [566.242730] ret_from_fork+0x29/0x50 [566.242763] -> #0 (&adapter->crit_lock){+.+.}-{3:3}: [566.242815] __lock_acquire+0x15ff/0x22b0 [566.242869] lock_acquire+0xd2/0x2c0 [566.242901] __mutex_lock+0xc1/0xbb0 [566.242934] iavf_close+0x3c/0x240 [iavf] [566.242997] __dev_close_many+0xac/0x120 [566.243036] dev_close_many+0x8b/0x140 [566.243071] unregister_netdevice_many_notify+0x165/0x7c0 [566.243116] unregister_netdevice_queue+0xd3/0x110 [566.243157] iavf_remove+0x6c1/0x730 [iavf] [566.243217] pci_device_remove+0x33/0xa0 [566.243257] device_release_driver_internal+0x1bc/0x240 [566.243299] pci_stop_bus_device+0x6c/0x90 [566.243338] pci_stop_and_remove_bus_device+0xe/0x20 [566.243380] pci_iov_remove_virtfn+0xd1/0x130 [566.243417] sriov_disable+0x34/0xe0 [566.243448] ice_free_vfs+0x2da/0x330 [ice] [566.244383] ice_sriov_configure+0x88/0xad0 [ice] [566.245353] sriov_numvfs_store+0xde/0x1d0 [566.246156] kernfs_fop_write_iter+0x15e/0x210 [566.246921] vfs_write+0x288/0x530 [566.247671] ksys_write+0x74/0xf0 [566.248408] do_syscall_64+0x58/0x80 [566.249145] entry_SYSCALL_64_after_hwframe+0x72/0xdc [566.249886] other info that might help us debug this:
[566.252014] Possible unsafe locking scenario:
[566.253432] CPU0 CPU1 [566.254118] ---- ---- [566.254800] lock(rtnl_mutex); [566.255514] lock(&adapter->crit_lock); [566.256233] lock(rtnl_mutex); [566.256897] lock(&adapter->crit_lock); [566.257388] *** DEADLOCK ***
The deadlock can be triggered by a script that is continuously resetting the VF adapter while doing other operations requiring RTNL, e.g:
while :; do ip link set $VF up ethtool --set-channels $VF combined 2 ip link set $VF down ip link set $VF up ethtool --set-channels $VF combined 4 ip link set $VF down done
Any operation that triggers a reset can substitute "ethtool --set-channles"
As a fix, add a new task "finish_config" that do all the work which needs rtnl lock. With the exception of iavf_remove(), all work that require rtnl should be called from this task.
As for iavf_remove(), at the point where we need to call unregister_netdevice() (and grab rtnl_lock), we make sure the finish_config task is not running (cancel_work_sync()) to safely grab rtnl. Subsequent finish_config work cannot restart after that since the task is guarded by the __IAVF_IN_REMOVE_TASK bit in iavf_schedule_finish_config().
Fixes: 5ac49f3c2702 ("iavf: use mutexes for locking of critical sections") Signed-off-by: Ahmed Zaki ahmed.zaki@intel.com Signed-off-by: Mateusz Palczewski mateusz.palczewski@intel.com Tested-by: Rafal Romanowski rafal.romanowski@intel.com Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/iavf/iavf.h | 2 + drivers/net/ethernet/intel/iavf/iavf_main.c | 114 +++++++++++++----- .../net/ethernet/intel/iavf/iavf_virtchnl.c | 1 + 3 files changed, 85 insertions(+), 32 deletions(-)
diff --git a/drivers/net/ethernet/intel/iavf/iavf.h b/drivers/net/ethernet/intel/iavf/iavf.h index a5cab19eb6a8b..bf5e3c8e97e04 100644 --- a/drivers/net/ethernet/intel/iavf/iavf.h +++ b/drivers/net/ethernet/intel/iavf/iavf.h @@ -255,6 +255,7 @@ struct iavf_adapter { struct workqueue_struct *wq; struct work_struct reset_task; struct work_struct adminq_task; + struct work_struct finish_config; struct delayed_work client_task; wait_queue_head_t down_waitqueue; wait_queue_head_t reset_waitqueue; @@ -521,6 +522,7 @@ int iavf_process_config(struct iavf_adapter *adapter); int iavf_parse_vf_resource_msg(struct iavf_adapter *adapter); void iavf_schedule_reset(struct iavf_adapter *adapter); void iavf_schedule_request_stats(struct iavf_adapter *adapter); +void iavf_schedule_finish_config(struct iavf_adapter *adapter); void iavf_reset(struct iavf_adapter *adapter); void iavf_set_ethtool_ops(struct net_device *netdev); void iavf_update_stats(struct iavf_adapter *adapter); diff --git a/drivers/net/ethernet/intel/iavf/iavf_main.c b/drivers/net/ethernet/intel/iavf/iavf_main.c index 8cb9b74b3ebea..161750c1598f8 100644 --- a/drivers/net/ethernet/intel/iavf/iavf_main.c +++ b/drivers/net/ethernet/intel/iavf/iavf_main.c @@ -1702,10 +1702,10 @@ static int iavf_set_interrupt_capability(struct iavf_adapter *adapter) adapter->msix_entries[vector].entry = vector;
err = iavf_acquire_msix_vectors(adapter, v_budget); + if (!err) + iavf_schedule_finish_config(adapter);
out: - netif_set_real_num_rx_queues(adapter->netdev, pairs); - netif_set_real_num_tx_queues(adapter->netdev, pairs); return err; }
@@ -1925,9 +1925,7 @@ static int iavf_init_interrupt_scheme(struct iavf_adapter *adapter) goto err_alloc_queues; }
- rtnl_lock(); err = iavf_set_interrupt_capability(adapter); - rtnl_unlock(); if (err) { dev_err(&adapter->pdev->dev, "Unable to setup interrupt capabilities\n"); @@ -2013,6 +2011,78 @@ static int iavf_reinit_interrupt_scheme(struct iavf_adapter *adapter, bool runni return err; }
+/** + * iavf_finish_config - do all netdev work that needs RTNL + * @work: our work_struct + * + * Do work that needs both RTNL and crit_lock. + **/ +static void iavf_finish_config(struct work_struct *work) +{ + struct iavf_adapter *adapter; + int pairs, err; + + adapter = container_of(work, struct iavf_adapter, finish_config); + + /* Always take RTNL first to prevent circular lock dependency */ + rtnl_lock(); + mutex_lock(&adapter->crit_lock); + + if ((adapter->flags & IAVF_FLAG_SETUP_NETDEV_FEATURES) && + adapter->netdev_registered && + !test_bit(__IAVF_IN_REMOVE_TASK, &adapter->crit_section)) { + netdev_update_features(adapter->netdev); + adapter->flags &= ~IAVF_FLAG_SETUP_NETDEV_FEATURES; + } + + switch (adapter->state) { + case __IAVF_DOWN: + if (!adapter->netdev_registered) { + err = register_netdevice(adapter->netdev); + if (err) { + dev_err(&adapter->pdev->dev, "Unable to register netdev (%d)\n", + err); + + /* go back and try again.*/ + iavf_free_rss(adapter); + iavf_free_misc_irq(adapter); + iavf_reset_interrupt_capability(adapter); + iavf_change_state(adapter, + __IAVF_INIT_CONFIG_ADAPTER); + goto out; + } + adapter->netdev_registered = true; + } + + /* Set the real number of queues when reset occurs while + * state == __IAVF_DOWN + */ + fallthrough; + case __IAVF_RUNNING: + pairs = adapter->num_active_queues; + netif_set_real_num_rx_queues(adapter->netdev, pairs); + netif_set_real_num_tx_queues(adapter->netdev, pairs); + break; + + default: + break; + } + +out: + mutex_unlock(&adapter->crit_lock); + rtnl_unlock(); +} + +/** + * iavf_schedule_finish_config - Set the flags and schedule a reset event + * @adapter: board private structure + **/ +void iavf_schedule_finish_config(struct iavf_adapter *adapter) +{ + if (!test_bit(__IAVF_IN_REMOVE_TASK, &adapter->crit_section)) + queue_work(adapter->wq, &adapter->finish_config); +} + /** * iavf_process_aq_command - process aq_required flags * and sends aq command @@ -2650,22 +2720,8 @@ static void iavf_init_config_adapter(struct iavf_adapter *adapter)
netif_carrier_off(netdev); adapter->link_up = false; - - /* set the semaphore to prevent any callbacks after device registration - * up to time when state of driver will be set to __IAVF_DOWN - */ - rtnl_lock(); - if (!adapter->netdev_registered) { - err = register_netdevice(netdev); - if (err) { - rtnl_unlock(); - goto err_register; - } - } - - adapter->netdev_registered = true; - netif_tx_stop_all_queues(netdev); + if (CLIENT_ALLOWED(adapter)) { err = iavf_lan_add_device(adapter); if (err) @@ -2678,7 +2734,6 @@ static void iavf_init_config_adapter(struct iavf_adapter *adapter)
iavf_change_state(adapter, __IAVF_DOWN); set_bit(__IAVF_VSI_DOWN, adapter->vsi.state); - rtnl_unlock();
iavf_misc_irq_enable(adapter); wake_up(&adapter->down_waitqueue); @@ -2698,10 +2753,11 @@ static void iavf_init_config_adapter(struct iavf_adapter *adapter) /* request initial VLAN offload settings */ iavf_set_vlan_offload_features(adapter, 0, netdev->features);
+ iavf_schedule_finish_config(adapter); return; + err_mem: iavf_free_rss(adapter); -err_register: iavf_free_misc_irq(adapter); err_sw_init: iavf_reset_interrupt_capability(adapter); @@ -2728,15 +2784,6 @@ static void iavf_watchdog_task(struct work_struct *work) goto restart_watchdog; }
- if ((adapter->flags & IAVF_FLAG_SETUP_NETDEV_FEATURES) && - adapter->netdev_registered && - !test_bit(__IAVF_IN_REMOVE_TASK, &adapter->crit_section) && - rtnl_trylock()) { - netdev_update_features(adapter->netdev); - rtnl_unlock(); - adapter->flags &= ~IAVF_FLAG_SETUP_NETDEV_FEATURES; - } - if (adapter->flags & IAVF_FLAG_PF_COMMS_FAILED) iavf_change_state(adapter, __IAVF_COMM_FAILED);
@@ -4978,6 +5025,7 @@ static int iavf_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
INIT_WORK(&adapter->reset_task, iavf_reset_task); INIT_WORK(&adapter->adminq_task, iavf_adminq_task); + INIT_WORK(&adapter->finish_config, iavf_finish_config); INIT_DELAYED_WORK(&adapter->watchdog_task, iavf_watchdog_task); INIT_DELAYED_WORK(&adapter->client_task, iavf_client_task); queue_delayed_work(adapter->wq, &adapter->watchdog_task, @@ -5120,13 +5168,15 @@ static void iavf_remove(struct pci_dev *pdev) usleep_range(500, 1000); } cancel_delayed_work_sync(&adapter->watchdog_task); + cancel_work_sync(&adapter->finish_config);
+ rtnl_lock(); if (adapter->netdev_registered) { - rtnl_lock(); unregister_netdevice(netdev); adapter->netdev_registered = false; - rtnl_unlock(); } + rtnl_unlock(); + if (CLIENT_ALLOWED(adapter)) { err = iavf_lan_del_device(adapter); if (err) diff --git a/drivers/net/ethernet/intel/iavf/iavf_virtchnl.c b/drivers/net/ethernet/intel/iavf/iavf_virtchnl.c index 1bab896aaf40c..073ac29ed84c7 100644 --- a/drivers/net/ethernet/intel/iavf/iavf_virtchnl.c +++ b/drivers/net/ethernet/intel/iavf/iavf_virtchnl.c @@ -2237,6 +2237,7 @@ void iavf_virtchnl_completion(struct iavf_adapter *adapter,
iavf_process_config(adapter); adapter->flags |= IAVF_FLAG_SETUP_NETDEV_FEATURES; + iavf_schedule_finish_config(adapter);
iavf_set_queue_vlan_tag_loc(adapter);
From: Ahmed Zaki ahmed.zaki@intel.com
[ Upstream commit c34743daca0eb1dc855831a5210f0800a850088e ]
The reset task is currently scheduled from the watchdog or adminq tasks. First, all direct calls to schedule the reset task are replaced with the iavf_schedule_reset(), which is modified to accept the flag showing the type of reset.
To prevent the reset task from starting once iavf_remove() starts, we need to check the __IAVF_IN_REMOVE_TASK bit before we schedule it. This is now easily added to iavf_schedule_reset().
Finally, remove the check for IAVF_FLAG_RESET_NEEDED in the watchdog task. It is redundant since all callers who set the flag immediately schedules the reset task.
Fixes: 3ccd54ef44eb ("iavf: Fix init state closure on remove") Fixes: 14756b2ae265 ("iavf: Fix __IAVF_RESETTING state usage") Signed-off-by: Ahmed Zaki ahmed.zaki@intel.com Signed-off-by: Mateusz Palczewski mateusz.palczewski@intel.com Tested-by: Rafal Romanowski rafal.romanowski@intel.com Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/iavf/iavf.h | 2 +- .../net/ethernet/intel/iavf/iavf_ethtool.c | 8 ++--- drivers/net/ethernet/intel/iavf/iavf_main.c | 32 +++++++------------ .../net/ethernet/intel/iavf/iavf_virtchnl.c | 3 +- 4 files changed, 16 insertions(+), 29 deletions(-)
diff --git a/drivers/net/ethernet/intel/iavf/iavf.h b/drivers/net/ethernet/intel/iavf/iavf.h index bf5e3c8e97e04..8cbdebc5b6989 100644 --- a/drivers/net/ethernet/intel/iavf/iavf.h +++ b/drivers/net/ethernet/intel/iavf/iavf.h @@ -520,7 +520,7 @@ int iavf_up(struct iavf_adapter *adapter); void iavf_down(struct iavf_adapter *adapter); int iavf_process_config(struct iavf_adapter *adapter); int iavf_parse_vf_resource_msg(struct iavf_adapter *adapter); -void iavf_schedule_reset(struct iavf_adapter *adapter); +void iavf_schedule_reset(struct iavf_adapter *adapter, u64 flags); void iavf_schedule_request_stats(struct iavf_adapter *adapter); void iavf_schedule_finish_config(struct iavf_adapter *adapter); void iavf_reset(struct iavf_adapter *adapter); diff --git a/drivers/net/ethernet/intel/iavf/iavf_ethtool.c b/drivers/net/ethernet/intel/iavf/iavf_ethtool.c index b7141c2a941d1..2f47cfa7f06e2 100644 --- a/drivers/net/ethernet/intel/iavf/iavf_ethtool.c +++ b/drivers/net/ethernet/intel/iavf/iavf_ethtool.c @@ -532,8 +532,7 @@ static int iavf_set_priv_flags(struct net_device *netdev, u32 flags) /* issue a reset to force legacy-rx change to take effect */ if (changed_flags & IAVF_FLAG_LEGACY_RX) { if (netif_running(netdev)) { - adapter->flags |= IAVF_FLAG_RESET_NEEDED; - queue_work(adapter->wq, &adapter->reset_task); + iavf_schedule_reset(adapter, IAVF_FLAG_RESET_NEEDED); ret = iavf_wait_for_reset(adapter); if (ret) netdev_warn(netdev, "Changing private flags timeout or interrupted waiting for reset"); @@ -676,8 +675,7 @@ static int iavf_set_ringparam(struct net_device *netdev, }
if (netif_running(netdev)) { - adapter->flags |= IAVF_FLAG_RESET_NEEDED; - queue_work(adapter->wq, &adapter->reset_task); + iavf_schedule_reset(adapter, IAVF_FLAG_RESET_NEEDED); ret = iavf_wait_for_reset(adapter); if (ret) netdev_warn(netdev, "Changing ring parameters timeout or interrupted waiting for reset"); @@ -1860,7 +1858,7 @@ static int iavf_set_channels(struct net_device *netdev,
adapter->num_req_queues = num_req; adapter->flags |= IAVF_FLAG_REINIT_ITR_NEEDED; - iavf_schedule_reset(adapter); + iavf_schedule_reset(adapter, IAVF_FLAG_RESET_NEEDED);
ret = iavf_wait_for_reset(adapter); if (ret) diff --git a/drivers/net/ethernet/intel/iavf/iavf_main.c b/drivers/net/ethernet/intel/iavf/iavf_main.c index 161750c1598f8..ba96312feb505 100644 --- a/drivers/net/ethernet/intel/iavf/iavf_main.c +++ b/drivers/net/ethernet/intel/iavf/iavf_main.c @@ -309,12 +309,14 @@ static int iavf_lock_timeout(struct mutex *lock, unsigned int msecs) /** * iavf_schedule_reset - Set the flags and schedule a reset event * @adapter: board private structure + * @flags: IAVF_FLAG_RESET_PENDING or IAVF_FLAG_RESET_NEEDED **/ -void iavf_schedule_reset(struct iavf_adapter *adapter) +void iavf_schedule_reset(struct iavf_adapter *adapter, u64 flags) { - if (!(adapter->flags & - (IAVF_FLAG_RESET_PENDING | IAVF_FLAG_RESET_NEEDED))) { - adapter->flags |= IAVF_FLAG_RESET_NEEDED; + if (!test_bit(__IAVF_IN_REMOVE_TASK, &adapter->crit_section) && + !(adapter->flags & + (IAVF_FLAG_RESET_PENDING | IAVF_FLAG_RESET_NEEDED))) { + adapter->flags |= flags; queue_work(adapter->wq, &adapter->reset_task); } } @@ -342,7 +344,7 @@ static void iavf_tx_timeout(struct net_device *netdev, unsigned int txqueue) struct iavf_adapter *adapter = netdev_priv(netdev);
adapter->tx_timeout_count++; - iavf_schedule_reset(adapter); + iavf_schedule_reset(adapter, IAVF_FLAG_RESET_NEEDED); }
/** @@ -2490,7 +2492,7 @@ int iavf_parse_vf_resource_msg(struct iavf_adapter *adapter) adapter->vsi_res->num_queue_pairs); adapter->flags |= IAVF_FLAG_REINIT_MSIX_NEEDED; adapter->num_req_queues = adapter->vsi_res->num_queue_pairs; - iavf_schedule_reset(adapter); + iavf_schedule_reset(adapter, IAVF_FLAG_RESET_NEEDED);
return -EAGAIN; } @@ -2787,14 +2789,6 @@ static void iavf_watchdog_task(struct work_struct *work) if (adapter->flags & IAVF_FLAG_PF_COMMS_FAILED) iavf_change_state(adapter, __IAVF_COMM_FAILED);
- if (adapter->flags & IAVF_FLAG_RESET_NEEDED) { - adapter->aq_required = 0; - adapter->current_op = VIRTCHNL_OP_UNKNOWN; - mutex_unlock(&adapter->crit_lock); - queue_work(adapter->wq, &adapter->reset_task); - return; - } - switch (adapter->state) { case __IAVF_STARTUP: iavf_startup(adapter); @@ -2922,11 +2916,10 @@ static void iavf_watchdog_task(struct work_struct *work) /* check for hw reset */ reg_val = rd32(hw, IAVF_VF_ARQLEN1) & IAVF_VF_ARQLEN1_ARQENABLE_MASK; if (!reg_val) { - adapter->flags |= IAVF_FLAG_RESET_PENDING; adapter->aq_required = 0; adapter->current_op = VIRTCHNL_OP_UNKNOWN; dev_err(&adapter->pdev->dev, "Hardware reset detected\n"); - queue_work(adapter->wq, &adapter->reset_task); + iavf_schedule_reset(adapter, IAVF_FLAG_RESET_PENDING); mutex_unlock(&adapter->crit_lock); queue_delayed_work(adapter->wq, &adapter->watchdog_task, HZ * 2); @@ -3324,9 +3317,7 @@ static void iavf_adminq_task(struct work_struct *work) } while (pending); mutex_unlock(&adapter->crit_lock);
- if ((adapter->flags & - (IAVF_FLAG_RESET_PENDING | IAVF_FLAG_RESET_NEEDED)) || - adapter->state == __IAVF_RESETTING) + if (iavf_is_reset_in_progress(adapter)) goto freedom;
/* check for error indications */ @@ -4423,8 +4414,7 @@ static int iavf_change_mtu(struct net_device *netdev, int new_mtu) }
if (netif_running(netdev)) { - adapter->flags |= IAVF_FLAG_RESET_NEEDED; - queue_work(adapter->wq, &adapter->reset_task); + iavf_schedule_reset(adapter, IAVF_FLAG_RESET_NEEDED); ret = iavf_wait_for_reset(adapter); if (ret < 0) netdev_warn(netdev, "MTU change interrupted waiting for reset"); diff --git a/drivers/net/ethernet/intel/iavf/iavf_virtchnl.c b/drivers/net/ethernet/intel/iavf/iavf_virtchnl.c index 073ac29ed84c7..be3c007ce90a9 100644 --- a/drivers/net/ethernet/intel/iavf/iavf_virtchnl.c +++ b/drivers/net/ethernet/intel/iavf/iavf_virtchnl.c @@ -1961,9 +1961,8 @@ void iavf_virtchnl_completion(struct iavf_adapter *adapter, case VIRTCHNL_EVENT_RESET_IMPENDING: dev_info(&adapter->pdev->dev, "Reset indication received from the PF\n"); if (!(adapter->flags & IAVF_FLAG_RESET_PENDING)) { - adapter->flags |= IAVF_FLAG_RESET_PENDING; dev_info(&adapter->pdev->dev, "Scheduling reset task\n"); - queue_work(adapter->wq, &adapter->reset_task); + iavf_schedule_reset(adapter, IAVF_FLAG_RESET_PENDING); } break; default:
From: Jiapeng Chong jiapeng.chong@linux.alibaba.com
[ Upstream commit 2a4152742025c5f21482e8cebc581702a0fa5b01 ]
No functional modification involved.
security/keys/trusted-keys/trusted_tpm2.c:203: warning: expecting prototype for tpm_buf_append_auth(). Prototype was for tpm2_buf_append_auth() instead.
Fixes: 2e19e10131a0 ("KEYS: trusted: Move TPM2 trusted keys code") Reported-by: Abaci Robot abaci@linux.alibaba.com Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=5524 Signed-off-by: Jiapeng Chong jiapeng.chong@linux.alibaba.com Reviewed-by: Paul Moore paul@paul-moore.com Signed-off-by: Jarkko Sakkinen jarkko@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- security/keys/trusted-keys/trusted_tpm2.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/security/keys/trusted-keys/trusted_tpm2.c b/security/keys/trusted-keys/trusted_tpm2.c index 2b2c8eb258d5b..bc700f85f80be 100644 --- a/security/keys/trusted-keys/trusted_tpm2.c +++ b/security/keys/trusted-keys/trusted_tpm2.c @@ -186,7 +186,7 @@ int tpm2_key_priv(void *context, size_t hdrlen, }
/** - * tpm_buf_append_auth() - append TPMS_AUTH_COMMAND to the buffer. + * tpm2_buf_append_auth() - append TPMS_AUTH_COMMAND to the buffer. * * @buf: an allocated tpm_buf instance * @session_handle: session handle
From: Ido Schimmel idosch@nvidia.com
[ Upstream commit 2033ab90380d46e0e9f0520fd6776a73d107fd95 ]
Cited commit converted the neighbour code to use the standard RCU variant instead of the RCU-bh variant, but the VRF code still uses rcu_read_lock_bh() / rcu_read_unlock_bh() around the neighbour lookup code in its IPv4 and IPv6 output paths, resulting in lockdep splats [1][2]. Can be reproduced using [3].
Fix by switching to rcu_read_lock() / rcu_read_unlock().
[1] ============================= WARNING: suspicious RCU usage 6.5.0-rc1-custom-g9c099e6dbf98 #403 Not tainted ----------------------------- include/net/neighbour.h:302 suspicious rcu_dereference_check() usage!
other info that might help us debug this:
rcu_scheduler_active = 2, debug_locks = 1 2 locks held by ping/183: #0: ffff888105ea1d80 (sk_lock-AF_INET){+.+.}-{0:0}, at: raw_sendmsg+0xc6c/0x33c0 #1: ffffffff85b46820 (rcu_read_lock_bh){....}-{1:2}, at: vrf_output+0x2e3/0x2030
stack backtrace: CPU: 0 PID: 183 Comm: ping Not tainted 6.5.0-rc1-custom-g9c099e6dbf98 #403 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-1.fc37 04/01/2014 Call Trace: <TASK> dump_stack_lvl+0xc1/0xf0 lockdep_rcu_suspicious+0x211/0x3b0 vrf_output+0x1380/0x2030 ip_push_pending_frames+0x125/0x2a0 raw_sendmsg+0x200d/0x33c0 inet_sendmsg+0xa2/0xe0 __sys_sendto+0x2aa/0x420 __x64_sys_sendto+0xe5/0x1c0 do_syscall_64+0x38/0x80 entry_SYSCALL_64_after_hwframe+0x63/0xcd
[2] ============================= WARNING: suspicious RCU usage 6.5.0-rc1-custom-g9c099e6dbf98 #403 Not tainted ----------------------------- include/net/neighbour.h:302 suspicious rcu_dereference_check() usage!
other info that might help us debug this:
rcu_scheduler_active = 2, debug_locks = 1 2 locks held by ping6/182: #0: ffff888114b63000 (sk_lock-AF_INET6){+.+.}-{0:0}, at: rawv6_sendmsg+0x1602/0x3e50 #1: ffffffff85b46820 (rcu_read_lock_bh){....}-{1:2}, at: vrf_output6+0xe9/0x1310
stack backtrace: CPU: 0 PID: 182 Comm: ping6 Not tainted 6.5.0-rc1-custom-g9c099e6dbf98 #403 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-1.fc37 04/01/2014 Call Trace: <TASK> dump_stack_lvl+0xc1/0xf0 lockdep_rcu_suspicious+0x211/0x3b0 vrf_output6+0xd32/0x1310 ip6_local_out+0xb4/0x1a0 ip6_send_skb+0xbc/0x340 ip6_push_pending_frames+0xe5/0x110 rawv6_sendmsg+0x2e6e/0x3e50 inet_sendmsg+0xa2/0xe0 __sys_sendto+0x2aa/0x420 __x64_sys_sendto+0xe5/0x1c0 do_syscall_64+0x38/0x80 entry_SYSCALL_64_after_hwframe+0x63/0xcd
[3] #!/bin/bash
ip link add name vrf-red up numtxqueues 2 type vrf table 10 ip link add name swp1 up master vrf-red type dummy ip address add 192.0.2.1/24 dev swp1 ip address add 2001:db8:1::1/64 dev swp1 ip neigh add 192.0.2.2 lladdr 00:11:22:33:44:55 nud perm dev swp1 ip neigh add 2001:db8:1::2 lladdr 00:11:22:33:44:55 nud perm dev swp1 ip vrf exec vrf-red ping 192.0.2.2 -c 1 &> /dev/null ip vrf exec vrf-red ping6 2001:db8:1::2 -c 1 &> /dev/null
Fixes: 09eed1192cec ("neighbour: switch to standard rcu, instead of rcu_bh") Reported-by: Naresh Kamboju naresh.kamboju@linaro.org Link: https://lore.kernel.org/netdev/CA+G9fYtEr-=GbcXNDYo3XOkwR+uYgehVoDjsP0pFLUpZ... Signed-off-by: Ido Schimmel idosch@nvidia.com Reviewed-by: David Ahern dsahern@kernel.org Reviewed-by: Eric Dumazet edumazet@google.com Link: https://lore.kernel.org/r/20230715153605.4068066-1-idosch@nvidia.com Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/vrf.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/drivers/net/vrf.c b/drivers/net/vrf.c index bdb3a76a352e4..6043e63b42f97 100644 --- a/drivers/net/vrf.c +++ b/drivers/net/vrf.c @@ -664,7 +664,7 @@ static int vrf_finish_output6(struct net *net, struct sock *sk, skb->protocol = htons(ETH_P_IPV6); skb->dev = dev;
- rcu_read_lock_bh(); + rcu_read_lock(); nexthop = rt6_nexthop((struct rt6_info *)dst, &ipv6_hdr(skb)->daddr); neigh = __ipv6_neigh_lookup_noref(dst->dev, nexthop); if (unlikely(!neigh)) @@ -672,10 +672,10 @@ static int vrf_finish_output6(struct net *net, struct sock *sk, if (!IS_ERR(neigh)) { sock_confirm_neigh(skb, neigh); ret = neigh_output(neigh, skb, false); - rcu_read_unlock_bh(); + rcu_read_unlock(); return ret; } - rcu_read_unlock_bh(); + rcu_read_unlock();
IP6_INC_STATS(dev_net(dst->dev), ip6_dst_idev(dst), IPSTATS_MIB_OUTNOROUTES); @@ -889,7 +889,7 @@ static int vrf_finish_output(struct net *net, struct sock *sk, struct sk_buff *s } }
- rcu_read_lock_bh(); + rcu_read_lock();
neigh = ip_neigh_for_gw(rt, skb, &is_v6gw); if (!IS_ERR(neigh)) { @@ -898,11 +898,11 @@ static int vrf_finish_output(struct net *net, struct sock *sk, struct sk_buff *s sock_confirm_neigh(skb, neigh); /* if crossing protocols, can not use the cached header */ ret = neigh_output(neigh, skb, is_v6gw); - rcu_read_unlock_bh(); + rcu_read_unlock(); return ret; }
- rcu_read_unlock_bh(); + rcu_read_unlock(); vrf_tx_error(skb->dev, skb); return -EINVAL; }
From: Geetha sowjanya gakula@marvell.com
[ Upstream commit 8fcd7c7b3a38ab5e452f542fda8f7940e77e479a ]
Current driver enables backpressure for LBK interfaces. But these interfaces do not support this feature. Hence, this patch fixes the issue by skipping the backpressure configuration for these interfaces.
Fixes: 75f36270990c ("octeontx2-pf: Support to enable/disable pause frames via ethtool"). Signed-off-by: Geetha sowjanya gakula@marvell.com Signed-off-by: Sunil Goutham sgoutham@marvell.com Link: https://lore.kernel.org/r/20230716093741.28063-1-gakula@marvell.com Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/marvell/octeontx2/nic/otx2_pf.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_pf.c b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_pf.c index 18284ad751572..384d26bee9b23 100644 --- a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_pf.c +++ b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_pf.c @@ -1452,8 +1452,9 @@ static int otx2_init_hw_resources(struct otx2_nic *pf) if (err) goto err_free_npa_lf;
- /* Enable backpressure */ - otx2_nix_config_bp(pf, true); + /* Enable backpressure for CGX mapped PF/VFs */ + if (!is_otx2_lbkvf(pf->pdev)) + otx2_nix_config_bp(pf, true);
/* Init Auras and pools used by NIX RQ, for free buffer ptrs */ err = otx2_rq_aura_pool_init(pf);
From: Kumar Kartikeya Dwivedi memxor@gmail.com
[ Upstream commit ba7b3e7d5f9014be65879ede8fd599cb222901c9 ]
The assignment to idx in check_max_stack_depth happens once we see a bpf_pseudo_call or bpf_pseudo_func. This is not an issue as the rest of the code performs a few checks and then pushes the frame to the frame stack, except the case of async callbacks. If the async callback case causes the loop iteration to be skipped, the idx assignment will be incorrect on the next iteration of the loop. The value stored in the frame stack (as the subprogno of the current subprog) will be incorrect.
This leads to incorrect checks and incorrect tail_call_reachable marking. Save the target subprog in a new variable and only assign to idx once we are done with the is_async_cb check which may skip pushing of frame to the frame stack and subsequent stack depth checks and tail call markings.
Fixes: 7ddc80a476c2 ("bpf: Teach stack depth check about async callbacks.") Signed-off-by: Kumar Kartikeya Dwivedi memxor@gmail.com Link: https://lore.kernel.org/r/20230717161530.1238-2-memxor@gmail.com Signed-off-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/bpf/verifier.c | 11 ++++++----- 1 file changed, 6 insertions(+), 5 deletions(-)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index aac31e33323bb..e95bfe45fd890 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -5429,7 +5429,7 @@ static int check_max_stack_depth(struct bpf_verifier_env *env) continue_func: subprog_end = subprog[idx + 1].start; for (; i < subprog_end; i++) { - int next_insn; + int next_insn, sidx;
if (!bpf_pseudo_call(insn + i) && !bpf_pseudo_func(insn + i)) continue; @@ -5439,14 +5439,14 @@ static int check_max_stack_depth(struct bpf_verifier_env *env)
/* find the callee */ next_insn = i + insn[i].imm + 1; - idx = find_subprog(env, next_insn); - if (idx < 0) { + sidx = find_subprog(env, next_insn); + if (sidx < 0) { WARN_ONCE(1, "verifier bug. No program starts at insn %d\n", next_insn); return -EFAULT; } - if (subprog[idx].is_async_cb) { - if (subprog[idx].has_tail_call) { + if (subprog[sidx].is_async_cb) { + if (subprog[sidx].has_tail_call) { verbose(env, "verifier bug. subprog has tail_call and async cb\n"); return -EFAULT; } @@ -5455,6 +5455,7 @@ static int check_max_stack_depth(struct bpf_verifier_env *env) continue; } i = next_insn; + idx = sidx;
if (subprog[idx].has_tail_call) tail_call_reachable = true;
From: Kumar Kartikeya Dwivedi memxor@gmail.com
[ Upstream commit b5e9ad522c4ccd32d322877515cff8d47ed731b9 ]
While the check_max_stack_depth function explores call chains emanating from the main prog, which is typically enough to cover all possible call chains, it doesn't explore those rooted at async callbacks unless the async callback will have been directly called, since unlike non-async callbacks it skips their instruction exploration as they don't contribute to stack depth.
It could be the case that the async callback leads to a callchain which exceeds the stack depth, but this is never reachable while only exploring the entry point from main subprog. Hence, repeat the check for the main subprog *and* all async callbacks marked by the symbolic execution pass of the verifier, as execution of the program may begin at any of them.
Consider functions with following stack depths: main: 256 async: 256 foo: 256
main: rX = async bpf_timer_set_callback(...)
async: foo()
Here, async is not descended as it does not contribute to stack depth of main (since it is referenced using bpf_pseudo_func and not bpf_pseudo_call). However, when async is invoked asynchronously, it will end up breaching the MAX_BPF_STACK limit by calling foo.
Hence, in addition to main, we also need to explore call chains beginning at all async callback subprogs in a program.
Fixes: 7ddc80a476c2 ("bpf: Teach stack depth check about async callbacks.") Signed-off-by: Kumar Kartikeya Dwivedi memxor@gmail.com Link: https://lore.kernel.org/r/20230717161530.1238-3-memxor@gmail.com Signed-off-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/bpf/verifier.c | 21 +++++++++++++++++++-- 1 file changed, 19 insertions(+), 2 deletions(-)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index e95bfe45fd890..4fbfe1d086467 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -5381,16 +5381,17 @@ static int update_stack_depth(struct bpf_verifier_env *env, * Since recursion is prevented by check_cfg() this algorithm * only needs a local stack of MAX_CALL_FRAMES to remember callsites */ -static int check_max_stack_depth(struct bpf_verifier_env *env) +static int check_max_stack_depth_subprog(struct bpf_verifier_env *env, int idx) { - int depth = 0, frame = 0, idx = 0, i = 0, subprog_end; struct bpf_subprog_info *subprog = env->subprog_info; struct bpf_insn *insn = env->prog->insnsi; + int depth = 0, frame = 0, i, subprog_end; bool tail_call_reachable = false; int ret_insn[MAX_CALL_FRAMES]; int ret_prog[MAX_CALL_FRAMES]; int j;
+ i = subprog[idx].start; process_func: /* protect against potential stack overflow that might happen when * bpf2bpf calls get combined with tailcalls. Limit the caller's stack @@ -5491,6 +5492,22 @@ static int check_max_stack_depth(struct bpf_verifier_env *env) goto continue_func; }
+static int check_max_stack_depth(struct bpf_verifier_env *env) +{ + struct bpf_subprog_info *si = env->subprog_info; + int ret; + + for (int i = 0; i < env->subprog_cnt; i++) { + if (!i || si[i].is_async_cb) { + ret = check_max_stack_depth_subprog(env, i); + if (ret < 0) + return ret; + } + continue; + } + return 0; +} + #ifndef CONFIG_BPF_JIT_ALWAYS_ON static int get_callee_stack_depth(struct bpf_verifier_env *env, const struct bpf_insn *insn, int idx)
From: Alexander Duyck alexanderduyck@fb.com
[ Upstream commit a3f25d614bc73b45e8f02adc6769876dfd16ca84 ]
When running an freplace attached bpf program on an arm64 system w were seeing the following issue: Unhandled 64-bit el1h sync exception on CPU47, ESR 0x0000000036000003 -- BTI
After a bit of work to track it down I determined that what appeared to be happening is that the 'bti c' at the start of the program was somehow being reached after a 'br' instruction. Further digging pointed me toward the fact that the function was attached via freplace. This in turn led me to build_plt which I believe is invoking the long jump which is triggering this error.
To resolve it we can replace the 'bti c' with 'bti jc' and add a comment explaining why this has to be modified as such.
Fixes: b2ad54e1533e ("bpf, arm64: Implement bpf_arch_text_poke() for arm64") Signed-off-by: Alexander Duyck alexanderduyck@fb.com Acked-by: Xu Kuohai xukuohai@huawei.com Link: https://lore.kernel.org/r/168926677665.316237.9953845318337455525.stgit@ahdu... Signed-off-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm64/net/bpf_jit_comp.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/net/bpf_jit_comp.c b/arch/arm64/net/bpf_jit_comp.c index b26da8efa616e..0ce5f13eabb1b 100644 --- a/arch/arm64/net/bpf_jit_comp.c +++ b/arch/arm64/net/bpf_jit_comp.c @@ -322,7 +322,13 @@ static int build_prologue(struct jit_ctx *ctx, bool ebpf_from_cbpf) * */
- emit_bti(A64_BTI_C, ctx); + /* bpf function may be invoked by 3 instruction types: + * 1. bl, attached via freplace to bpf prog via short jump + * 2. br, attached via freplace to bpf prog via long jump + * 3. blr, working as a function pointer, used by emit_call. + * So BTI_JC should used here to support both br and blr. + */ + emit_bti(A64_BTI_JC, ctx);
emit(A64_MOV(1, A64_R(9), A64_LR), ctx); emit(A64_NOP, ctx);
From: Kurt Kanzenbach kurt@linutronix.de
[ Upstream commit 95b681485563c64585de78662ee52d06b7fa47d9 ]
High XDP load triggers the netdev watchdog:
|NETDEV WATCHDOG: enp3s0 (igc): transmit queue 2 timed out
The reason is the Tx queue transmission start (txq->trans_start) is not updated in XDP code path. Therefore, add it for all XDP transmission functions.
Signed-off-by: Kurt Kanzenbach kurt@linutronix.de Tested-by: Naama Meir naamax.meir@linux.intel.com Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Stable-dep-of: 78adb4bcf99e ("igc: Prevent garbled TX queue with XDP ZEROCOPY") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/igc/igc_main.c | 8 ++++++++ 1 file changed, 8 insertions(+)
diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c index 44aa4342cbbb5..ef4ea46442f21 100644 --- a/drivers/net/ethernet/intel/igc/igc_main.c +++ b/drivers/net/ethernet/intel/igc/igc_main.c @@ -2417,6 +2417,8 @@ static int igc_xdp_xmit_back(struct igc_adapter *adapter, struct xdp_buff *xdp) nq = txring_txq(ring);
__netif_tx_lock(nq, cpu); + /* Avoid transmit queue timeout since we share it with the slow path */ + txq_trans_cond_update(nq); res = igc_xdp_init_tx_descriptor(ring, xdpf); __netif_tx_unlock(nq); return res; @@ -2833,6 +2835,9 @@ static void igc_xdp_xmit_zc(struct igc_ring *ring)
__netif_tx_lock(nq, cpu);
+ /* Avoid transmit queue timeout since we share it with the slow path */ + txq_trans_cond_update(nq); + budget = igc_desc_unused(ring);
while (xsk_tx_peek_desc(pool, &xdp_desc) && budget--) { @@ -6385,6 +6390,9 @@ static int igc_xdp_xmit(struct net_device *dev, int num_frames,
__netif_tx_lock(nq, cpu);
+ /* Avoid transmit queue timeout since we share it with the slow path */ + txq_trans_cond_update(nq); + drops = 0; for (i = 0; i < num_frames; i++) { int err;
From: Florian Kauer florian.kauer@linutronix.de
[ Upstream commit 78adb4bcf99effbb960c5f9091e2e062509d1030 ]
In normal operation, each populated queue item has next_to_watch pointing to the last TX desc of the packet, while each cleaned item has it set to 0. In particular, next_to_use that points to the next (necessarily clean) item to use has next_to_watch set to 0.
When the TX queue is used both by an application using AF_XDP with ZEROCOPY as well as a second non-XDP application generating high traffic, the queue pointers can get in an invalid state where next_to_use points to an item where next_to_watch is NOT set to 0.
However, the implementation assumes at several places that this is never the case, so if it does hold, bad things happen. In particular, within the loop inside of igc_clean_tx_irq(), next_to_clean can overtake next_to_use. Finally, this prevents any further transmission via this queue and it never gets unblocked or signaled. Secondly, if the queue is in this garbled state, the inner loop of igc_clean_tx_ring() will never terminate, completely hogging a CPU core.
The reason is that igc_xdp_xmit_zc() reads next_to_use before acquiring the lock, and writing it back (potentially unmodified) later. If it got modified before locking, the outdated next_to_use is written pointing to an item that was already used elsewhere (and thus next_to_watch got written).
Fixes: 9acf59a752d4 ("igc: Enable TX via AF_XDP zero-copy") Signed-off-by: Florian Kauer florian.kauer@linutronix.de Reviewed-by: Kurt Kanzenbach kurt@linutronix.de Tested-by: Kurt Kanzenbach kurt@linutronix.de Acked-by: Vinicius Costa Gomes vinicius.gomes@intel.com Reviewed-by: Simon Horman simon.horman@corigine.com Tested-by: Naama Meir naamax.meir@linux.intel.com Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Link: https://lore.kernel.org/r/20230717175444.3217831-1-anthony.l.nguyen@intel.co... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/igc/igc_main.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c index ef4ea46442f21..496a4eb687b00 100644 --- a/drivers/net/ethernet/intel/igc/igc_main.c +++ b/drivers/net/ethernet/intel/igc/igc_main.c @@ -2826,9 +2826,8 @@ static void igc_xdp_xmit_zc(struct igc_ring *ring) struct netdev_queue *nq = txring_txq(ring); union igc_adv_tx_desc *tx_desc = NULL; int cpu = smp_processor_id(); - u16 ntu = ring->next_to_use; struct xdp_desc xdp_desc; - u16 budget; + u16 budget, ntu;
if (!netif_carrier_ok(ring->netdev)) return; @@ -2838,6 +2837,7 @@ static void igc_xdp_xmit_zc(struct igc_ring *ring) /* Avoid transmit queue timeout since we share it with the slow path */ txq_trans_cond_update(nq);
+ ntu = ring->next_to_use; budget = igc_desc_unused(ring);
while (xsk_tx_peek_desc(pool, &xdp_desc) && budget--) {
From: Antoine Tenart atenart@kernel.org
[ Upstream commit c0a8966e2bc7d31f77a7246947ebc09c1ff06066 ]
When using IPv4/TCP, skb->hash comes from sk->sk_txhash except in TIME_WAIT and SYN_RECV where it's not set in the reply skb from ip_send_unicast_reply. Those packets will have a mismatched hash with others from the same flow as their hashes will be 0. IPv6 does not have the same issue as the hash is set from the socket txhash in those cases.
This commits sets the hash in the reply skb from ip_send_unicast_reply, which makes the IPv4 code behaving like IPv6.
Signed-off-by: Antoine Tenart atenart@kernel.org Reviewed-by: Eric Dumazet edumazet@google.com Signed-off-by: Paolo Abeni pabeni@redhat.com Stable-dep-of: 5e5265522a9a ("tcp: annotate data-races around tcp_rsk(req)->txhash") Signed-off-by: Sasha Levin sashal@kernel.org --- include/net/ip.h | 2 +- net/ipv4/ip_output.c | 4 +++- net/ipv4/tcp_ipv4.c | 14 +++++++++----- 3 files changed, 13 insertions(+), 7 deletions(-)
diff --git a/include/net/ip.h b/include/net/ip.h index acec504c469a0..83a1a9bc3ceb1 100644 --- a/include/net/ip.h +++ b/include/net/ip.h @@ -282,7 +282,7 @@ void ip_send_unicast_reply(struct sock *sk, struct sk_buff *skb, const struct ip_options *sopt, __be32 daddr, __be32 saddr, const struct ip_reply_arg *arg, - unsigned int len, u64 transmit_time); + unsigned int len, u64 transmit_time, u32 txhash);
#define IP_INC_STATS(net, field) SNMP_INC_STATS64((net)->mib.ip_statistics, field) #define __IP_INC_STATS(net, field) __SNMP_INC_STATS64((net)->mib.ip_statistics, field) diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c index 61892268e8a6c..a1bead441026e 100644 --- a/net/ipv4/ip_output.c +++ b/net/ipv4/ip_output.c @@ -1692,7 +1692,7 @@ void ip_send_unicast_reply(struct sock *sk, struct sk_buff *skb, const struct ip_options *sopt, __be32 daddr, __be32 saddr, const struct ip_reply_arg *arg, - unsigned int len, u64 transmit_time) + unsigned int len, u64 transmit_time, u32 txhash) { struct ip_options_data replyopts; struct ipcm_cookie ipc; @@ -1755,6 +1755,8 @@ void ip_send_unicast_reply(struct sock *sk, struct sk_buff *skb, arg->csum)); nskb->ip_summed = CHECKSUM_NONE; nskb->mono_delivery_time = !!transmit_time; + if (txhash) + skb_set_hash(nskb, txhash, PKT_HASH_TYPE_L4); ip_push_pending_frames(sk, &fl4); } out: diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c index 434e5f0c8b99d..a64069077e388 100644 --- a/net/ipv4/tcp_ipv4.c +++ b/net/ipv4/tcp_ipv4.c @@ -692,6 +692,7 @@ static void tcp_v4_send_reset(const struct sock *sk, struct sk_buff *skb) u64 transmit_time = 0; struct sock *ctl_sk; struct net *net; + u32 txhash = 0;
/* Never send a reset in response to a reset. */ if (th->rst) @@ -829,6 +830,8 @@ static void tcp_v4_send_reset(const struct sock *sk, struct sk_buff *skb) inet_twsk(sk)->tw_priority : sk->sk_priority; transmit_time = tcp_transmit_time(sk); xfrm_sk_clone_policy(ctl_sk, sk); + txhash = (sk->sk_state == TCP_TIME_WAIT) ? + inet_twsk(sk)->tw_txhash : sk->sk_txhash; } else { ctl_sk->sk_mark = 0; ctl_sk->sk_priority = 0; @@ -837,7 +840,7 @@ static void tcp_v4_send_reset(const struct sock *sk, struct sk_buff *skb) skb, &TCP_SKB_CB(skb)->header.h4.opt, ip_hdr(skb)->saddr, ip_hdr(skb)->daddr, &arg, arg.iov[0].iov_len, - transmit_time); + transmit_time, txhash);
xfrm_sk_free_policy(ctl_sk); sock_net_set(ctl_sk, &init_net); @@ -859,7 +862,7 @@ static void tcp_v4_send_ack(const struct sock *sk, struct sk_buff *skb, u32 seq, u32 ack, u32 win, u32 tsval, u32 tsecr, int oif, struct tcp_md5sig_key *key, - int reply_flags, u8 tos) + int reply_flags, u8 tos, u32 txhash) { const struct tcphdr *th = tcp_hdr(skb); struct { @@ -935,7 +938,7 @@ static void tcp_v4_send_ack(const struct sock *sk, skb, &TCP_SKB_CB(skb)->header.h4.opt, ip_hdr(skb)->saddr, ip_hdr(skb)->daddr, &arg, arg.iov[0].iov_len, - transmit_time); + transmit_time, txhash);
sock_net_set(ctl_sk, &init_net); __TCP_INC_STATS(net, TCP_MIB_OUTSEGS); @@ -955,7 +958,8 @@ static void tcp_v4_timewait_ack(struct sock *sk, struct sk_buff *skb) tw->tw_bound_dev_if, tcp_twsk_md5_key(tcptw), tw->tw_transparent ? IP_REPLY_ARG_NOSRCCHECK : 0, - tw->tw_tos + tw->tw_tos, + tw->tw_txhash );
inet_twsk_put(tw); @@ -988,7 +992,7 @@ static void tcp_v4_reqsk_send_ack(const struct sock *sk, struct sk_buff *skb, 0, tcp_md5_do_lookup(sk, l3index, addr, AF_INET), inet_rsk(req)->no_srccheck ? IP_REPLY_ARG_NOSRCCHECK : 0, - ip_hdr(skb)->tos); + ip_hdr(skb)->tos, tcp_rsk(req)->txhash); }
/*
From: Eric Dumazet edumazet@google.com
[ Upstream commit 5e5265522a9a7f91d1b0bd411d634bdaf16c80cd ]
TCP request sockets are lockless, some of their fields can change while being read by another cpu as syzbot noticed.
This is usually harmless, but we should annotate the known races.
This patch takes care of tcp_rsk(req)->txhash, a separate one is needed for tcp_rsk(req)->ts_recent.
BUG: KCSAN: data-race in tcp_make_synack / tcp_rtx_synack
write to 0xffff8881362304bc of 4 bytes by task 32083 on cpu 1: tcp_rtx_synack+0x9d/0x2a0 net/ipv4/tcp_output.c:4213 inet_rtx_syn_ack+0x38/0x80 net/ipv4/inet_connection_sock.c:880 tcp_check_req+0x379/0xc70 net/ipv4/tcp_minisocks.c:665 tcp_v6_rcv+0x125b/0x1b20 net/ipv6/tcp_ipv6.c:1673 ip6_protocol_deliver_rcu+0x92f/0xf30 net/ipv6/ip6_input.c:437 ip6_input_finish net/ipv6/ip6_input.c:482 [inline] NF_HOOK include/linux/netfilter.h:303 [inline] ip6_input+0xbd/0x1b0 net/ipv6/ip6_input.c:491 dst_input include/net/dst.h:468 [inline] ip6_rcv_finish+0x1e2/0x2e0 net/ipv6/ip6_input.c:79 NF_HOOK include/linux/netfilter.h:303 [inline] ipv6_rcv+0x74/0x150 net/ipv6/ip6_input.c:309 __netif_receive_skb_one_core net/core/dev.c:5452 [inline] __netif_receive_skb+0x90/0x1b0 net/core/dev.c:5566 netif_receive_skb_internal net/core/dev.c:5652 [inline] netif_receive_skb+0x4a/0x310 net/core/dev.c:5711 tun_rx_batched+0x3bf/0x400 tun_get_user+0x1d24/0x22b0 drivers/net/tun.c:1997 tun_chr_write_iter+0x18e/0x240 drivers/net/tun.c:2043 call_write_iter include/linux/fs.h:1871 [inline] new_sync_write fs/read_write.c:491 [inline] vfs_write+0x4ab/0x7d0 fs/read_write.c:584 ksys_write+0xeb/0x1a0 fs/read_write.c:637 __do_sys_write fs/read_write.c:649 [inline] __se_sys_write fs/read_write.c:646 [inline] __x64_sys_write+0x42/0x50 fs/read_write.c:646 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd
read to 0xffff8881362304bc of 4 bytes by task 32078 on cpu 0: tcp_make_synack+0x367/0xb40 net/ipv4/tcp_output.c:3663 tcp_v6_send_synack+0x72/0x420 net/ipv6/tcp_ipv6.c:544 tcp_conn_request+0x11a8/0x1560 net/ipv4/tcp_input.c:7059 tcp_v6_conn_request+0x13f/0x180 net/ipv6/tcp_ipv6.c:1175 tcp_rcv_state_process+0x156/0x1de0 net/ipv4/tcp_input.c:6494 tcp_v6_do_rcv+0x98a/0xb70 net/ipv6/tcp_ipv6.c:1509 tcp_v6_rcv+0x17b8/0x1b20 net/ipv6/tcp_ipv6.c:1735 ip6_protocol_deliver_rcu+0x92f/0xf30 net/ipv6/ip6_input.c:437 ip6_input_finish net/ipv6/ip6_input.c:482 [inline] NF_HOOK include/linux/netfilter.h:303 [inline] ip6_input+0xbd/0x1b0 net/ipv6/ip6_input.c:491 dst_input include/net/dst.h:468 [inline] ip6_rcv_finish+0x1e2/0x2e0 net/ipv6/ip6_input.c:79 NF_HOOK include/linux/netfilter.h:303 [inline] ipv6_rcv+0x74/0x150 net/ipv6/ip6_input.c:309 __netif_receive_skb_one_core net/core/dev.c:5452 [inline] __netif_receive_skb+0x90/0x1b0 net/core/dev.c:5566 netif_receive_skb_internal net/core/dev.c:5652 [inline] netif_receive_skb+0x4a/0x310 net/core/dev.c:5711 tun_rx_batched+0x3bf/0x400 tun_get_user+0x1d24/0x22b0 drivers/net/tun.c:1997 tun_chr_write_iter+0x18e/0x240 drivers/net/tun.c:2043 call_write_iter include/linux/fs.h:1871 [inline] new_sync_write fs/read_write.c:491 [inline] vfs_write+0x4ab/0x7d0 fs/read_write.c:584 ksys_write+0xeb/0x1a0 fs/read_write.c:637 __do_sys_write fs/read_write.c:649 [inline] __se_sys_write fs/read_write.c:646 [inline] __x64_sys_write+0x42/0x50 fs/read_write.c:646 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd
value changed: 0x91d25731 -> 0xe79325cd
Reported by Kernel Concurrency Sanitizer on: CPU: 0 PID: 32078 Comm: syz-executor.4 Not tainted 6.5.0-rc1-syzkaller-00033-geb26cbb1a754 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 07/03/2023
Fixes: 58d607d3e52f ("tcp: provide skb->hash to synack packets") Signed-off-by: Eric Dumazet edumazet@google.com Reported-by: syzbot syzkaller@googlegroups.com Reviewed-by: Kuniyuki Iwashima kuniyu@amazon.com Link: https://lore.kernel.org/r/20230717144445.653164-2-edumazet@google.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/ipv4/tcp_ipv4.c | 3 ++- net/ipv4/tcp_minisocks.c | 2 +- net/ipv4/tcp_output.c | 4 ++-- net/ipv6/tcp_ipv6.c | 2 +- 4 files changed, 6 insertions(+), 5 deletions(-)
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c index a64069077e388..52229c75e76f6 100644 --- a/net/ipv4/tcp_ipv4.c +++ b/net/ipv4/tcp_ipv4.c @@ -992,7 +992,8 @@ static void tcp_v4_reqsk_send_ack(const struct sock *sk, struct sk_buff *skb, 0, tcp_md5_do_lookup(sk, l3index, addr, AF_INET), inet_rsk(req)->no_srccheck ? IP_REPLY_ARG_NOSRCCHECK : 0, - ip_hdr(skb)->tos, tcp_rsk(req)->txhash); + ip_hdr(skb)->tos, + READ_ONCE(tcp_rsk(req)->txhash)); }
/* diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c index dac0d62120e62..909f3b4ed2059 100644 --- a/net/ipv4/tcp_minisocks.c +++ b/net/ipv4/tcp_minisocks.c @@ -528,7 +528,7 @@ struct sock *tcp_create_openreq_child(const struct sock *sk, newicsk->icsk_ack.lrcvtime = tcp_jiffies32;
newtp->lsndtime = tcp_jiffies32; - newsk->sk_txhash = treq->txhash; + newsk->sk_txhash = READ_ONCE(treq->txhash); newtp->total_retrans = req->num_retrans;
tcp_init_xmit_timers(newsk); diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c index cfe128b81a010..1538b59913777 100644 --- a/net/ipv4/tcp_output.c +++ b/net/ipv4/tcp_output.c @@ -3578,7 +3578,7 @@ struct sk_buff *tcp_make_synack(const struct sock *sk, struct dst_entry *dst, rcu_read_lock(); md5 = tcp_rsk(req)->af_specific->req_md5_lookup(sk, req_to_sk(req)); #endif - skb_set_hash(skb, tcp_rsk(req)->txhash, PKT_HASH_TYPE_L4); + skb_set_hash(skb, READ_ONCE(tcp_rsk(req)->txhash), PKT_HASH_TYPE_L4); /* bpf program will be interested in the tcp_flags */ TCP_SKB_CB(skb)->tcp_flags = TCPHDR_SYN | TCPHDR_ACK; tcp_header_size = tcp_synack_options(sk, req, mss, skb, &opts, md5, @@ -4121,7 +4121,7 @@ int tcp_rtx_synack(const struct sock *sk, struct request_sock *req)
/* Paired with WRITE_ONCE() in sock_setsockopt() */ if (READ_ONCE(sk->sk_txrehash) == SOCK_TXREHASH_ENABLED) - tcp_rsk(req)->txhash = net_tx_rndhash(); + WRITE_ONCE(tcp_rsk(req)->txhash, net_tx_rndhash()); res = af_ops->send_synack(sk, NULL, &fl, req, NULL, TCP_SYNACK_NORMAL, NULL); if (!res) { diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c index 7132eb213a7a2..a3c86b714b242 100644 --- a/net/ipv6/tcp_ipv6.c +++ b/net/ipv6/tcp_ipv6.c @@ -1133,7 +1133,7 @@ static void tcp_v6_reqsk_send_ack(const struct sock *sk, struct sk_buff *skb, req->ts_recent, sk->sk_bound_dev_if, tcp_v6_md5_do_lookup(sk, &ipv6_hdr(skb)->saddr, l3index), ipv6_get_dsfield(ipv6_hdr(skb)), 0, sk->sk_priority, - tcp_rsk(req)->txhash); + READ_ONCE(tcp_rsk(req)->txhash)); }
From: Eric Dumazet edumazet@google.com
[ Upstream commit eba20811f32652bc1a52d5e7cc403859b86390d9 ]
TCP request sockets are lockless, tcp_rsk(req)->ts_recent can change while being read by another cpu as syzbot noticed.
This is harmless, but we should annotate the known races.
Note that tcp_check_req() changes req->ts_recent a bit early, we might change this in the future.
BUG: KCSAN: data-race in tcp_check_req / tcp_check_req
write to 0xffff88813c8afb84 of 4 bytes by interrupt on cpu 1: tcp_check_req+0x694/0xc70 net/ipv4/tcp_minisocks.c:762 tcp_v4_rcv+0x12db/0x1b70 net/ipv4/tcp_ipv4.c:2071 ip_protocol_deliver_rcu+0x356/0x6d0 net/ipv4/ip_input.c:205 ip_local_deliver_finish+0x13c/0x1a0 net/ipv4/ip_input.c:233 NF_HOOK include/linux/netfilter.h:303 [inline] ip_local_deliver+0xec/0x1c0 net/ipv4/ip_input.c:254 dst_input include/net/dst.h:468 [inline] ip_rcv_finish net/ipv4/ip_input.c:449 [inline] NF_HOOK include/linux/netfilter.h:303 [inline] ip_rcv+0x197/0x270 net/ipv4/ip_input.c:569 __netif_receive_skb_one_core net/core/dev.c:5493 [inline] __netif_receive_skb+0x90/0x1b0 net/core/dev.c:5607 process_backlog+0x21f/0x380 net/core/dev.c:5935 __napi_poll+0x60/0x3b0 net/core/dev.c:6498 napi_poll net/core/dev.c:6565 [inline] net_rx_action+0x32b/0x750 net/core/dev.c:6698 __do_softirq+0xc1/0x265 kernel/softirq.c:571 do_softirq+0x7e/0xb0 kernel/softirq.c:472 __local_bh_enable_ip+0x64/0x70 kernel/softirq.c:396 local_bh_enable+0x1f/0x20 include/linux/bottom_half.h:33 rcu_read_unlock_bh include/linux/rcupdate.h:843 [inline] __dev_queue_xmit+0xabb/0x1d10 net/core/dev.c:4271 dev_queue_xmit include/linux/netdevice.h:3088 [inline] neigh_hh_output include/net/neighbour.h:528 [inline] neigh_output include/net/neighbour.h:542 [inline] ip_finish_output2+0x700/0x840 net/ipv4/ip_output.c:229 ip_finish_output+0xf4/0x240 net/ipv4/ip_output.c:317 NF_HOOK_COND include/linux/netfilter.h:292 [inline] ip_output+0xe5/0x1b0 net/ipv4/ip_output.c:431 dst_output include/net/dst.h:458 [inline] ip_local_out net/ipv4/ip_output.c:126 [inline] __ip_queue_xmit+0xa4d/0xa70 net/ipv4/ip_output.c:533 ip_queue_xmit+0x38/0x40 net/ipv4/ip_output.c:547 __tcp_transmit_skb+0x1194/0x16e0 net/ipv4/tcp_output.c:1399 tcp_transmit_skb net/ipv4/tcp_output.c:1417 [inline] tcp_write_xmit+0x13ff/0x2fd0 net/ipv4/tcp_output.c:2693 __tcp_push_pending_frames+0x6a/0x1a0 net/ipv4/tcp_output.c:2877 tcp_push_pending_frames include/net/tcp.h:1952 [inline] __tcp_sock_set_cork net/ipv4/tcp.c:3336 [inline] tcp_sock_set_cork+0xe8/0x100 net/ipv4/tcp.c:3343 rds_tcp_xmit_path_complete+0x3b/0x40 net/rds/tcp_send.c:52 rds_send_xmit+0xf8d/0x1420 net/rds/send.c:422 rds_send_worker+0x42/0x1d0 net/rds/threads.c:200 process_one_work+0x3e6/0x750 kernel/workqueue.c:2408 worker_thread+0x5f2/0xa10 kernel/workqueue.c:2555 kthread+0x1d7/0x210 kernel/kthread.c:379 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:308
read to 0xffff88813c8afb84 of 4 bytes by interrupt on cpu 0: tcp_check_req+0x32a/0xc70 net/ipv4/tcp_minisocks.c:622 tcp_v4_rcv+0x12db/0x1b70 net/ipv4/tcp_ipv4.c:2071 ip_protocol_deliver_rcu+0x356/0x6d0 net/ipv4/ip_input.c:205 ip_local_deliver_finish+0x13c/0x1a0 net/ipv4/ip_input.c:233 NF_HOOK include/linux/netfilter.h:303 [inline] ip_local_deliver+0xec/0x1c0 net/ipv4/ip_input.c:254 dst_input include/net/dst.h:468 [inline] ip_rcv_finish net/ipv4/ip_input.c:449 [inline] NF_HOOK include/linux/netfilter.h:303 [inline] ip_rcv+0x197/0x270 net/ipv4/ip_input.c:569 __netif_receive_skb_one_core net/core/dev.c:5493 [inline] __netif_receive_skb+0x90/0x1b0 net/core/dev.c:5607 process_backlog+0x21f/0x380 net/core/dev.c:5935 __napi_poll+0x60/0x3b0 net/core/dev.c:6498 napi_poll net/core/dev.c:6565 [inline] net_rx_action+0x32b/0x750 net/core/dev.c:6698 __do_softirq+0xc1/0x265 kernel/softirq.c:571 run_ksoftirqd+0x17/0x20 kernel/softirq.c:939 smpboot_thread_fn+0x30a/0x4a0 kernel/smpboot.c:164 kthread+0x1d7/0x210 kernel/kthread.c:379 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:308
value changed: 0x1cd237f1 -> 0x1cd237f2
Fixes: 079096f103fa ("tcp/dccp: install syn_recv requests into ehash table") Signed-off-by: Eric Dumazet edumazet@google.com Reported-by: syzbot syzkaller@googlegroups.com Reviewed-by: Kuniyuki Iwashima kuniyu@amazon.com Link: https://lore.kernel.org/r/20230717144445.653164-3-edumazet@google.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/ipv4/tcp_ipv4.c | 2 +- net/ipv4/tcp_minisocks.c | 9 ++++++--- net/ipv4/tcp_output.c | 2 +- net/ipv6/tcp_ipv6.c | 2 +- 4 files changed, 9 insertions(+), 6 deletions(-)
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c index 52229c75e76f6..5d3e49ceb6917 100644 --- a/net/ipv4/tcp_ipv4.c +++ b/net/ipv4/tcp_ipv4.c @@ -988,7 +988,7 @@ static void tcp_v4_reqsk_send_ack(const struct sock *sk, struct sk_buff *skb, tcp_rsk(req)->rcv_nxt, req->rsk_rcv_wnd >> inet_rsk(req)->rcv_wscale, tcp_time_stamp_raw() + tcp_rsk(req)->ts_off, - req->ts_recent, + READ_ONCE(req->ts_recent), 0, tcp_md5_do_lookup(sk, l3index, addr, AF_INET), inet_rsk(req)->no_srccheck ? IP_REPLY_ARG_NOSRCCHECK : 0, diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c index 909f3b4ed2059..62641d42b06b5 100644 --- a/net/ipv4/tcp_minisocks.c +++ b/net/ipv4/tcp_minisocks.c @@ -555,7 +555,7 @@ struct sock *tcp_create_openreq_child(const struct sock *sk, newtp->max_window = newtp->snd_wnd;
if (newtp->rx_opt.tstamp_ok) { - newtp->rx_opt.ts_recent = req->ts_recent; + newtp->rx_opt.ts_recent = READ_ONCE(req->ts_recent); newtp->rx_opt.ts_recent_stamp = ktime_get_seconds(); newtp->tcp_header_len = sizeof(struct tcphdr) + TCPOLEN_TSTAMP_ALIGNED; } else { @@ -619,7 +619,7 @@ struct sock *tcp_check_req(struct sock *sk, struct sk_buff *skb, tcp_parse_options(sock_net(sk), skb, &tmp_opt, 0, NULL);
if (tmp_opt.saw_tstamp) { - tmp_opt.ts_recent = req->ts_recent; + tmp_opt.ts_recent = READ_ONCE(req->ts_recent); if (tmp_opt.rcv_tsecr) tmp_opt.rcv_tsecr -= tcp_rsk(req)->ts_off; /* We do not store true stamp, but it is not required, @@ -758,8 +758,11 @@ struct sock *tcp_check_req(struct sock *sk, struct sk_buff *skb,
/* In sequence, PAWS is OK. */
+ /* TODO: We probably should defer ts_recent change once + * we take ownership of @req. + */ if (tmp_opt.saw_tstamp && !after(TCP_SKB_CB(skb)->seq, tcp_rsk(req)->rcv_nxt)) - req->ts_recent = tmp_opt.rcv_tsval; + WRITE_ONCE(req->ts_recent, tmp_opt.rcv_tsval);
if (TCP_SKB_CB(skb)->seq == tcp_rsk(req)->rcv_isn) { /* Truncate SYN, it is out of window starting diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c index 1538b59913777..518cb4abc8b4f 100644 --- a/net/ipv4/tcp_output.c +++ b/net/ipv4/tcp_output.c @@ -876,7 +876,7 @@ static unsigned int tcp_synack_options(const struct sock *sk, if (likely(ireq->tstamp_ok)) { opts->options |= OPTION_TS; opts->tsval = tcp_skb_timestamp(skb) + tcp_rsk(req)->ts_off; - opts->tsecr = req->ts_recent; + opts->tsecr = READ_ONCE(req->ts_recent); remaining -= TCPOLEN_TSTAMP_ALIGNED; } if (likely(ireq->sack_ok)) { diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c index a3c86b714b242..f7c248a7f8d1d 100644 --- a/net/ipv6/tcp_ipv6.c +++ b/net/ipv6/tcp_ipv6.c @@ -1130,7 +1130,7 @@ static void tcp_v6_reqsk_send_ack(const struct sock *sk, struct sk_buff *skb, tcp_rsk(req)->rcv_nxt, req->rsk_rcv_wnd >> inet_rsk(req)->rcv_wscale, tcp_time_stamp_raw() + tcp_rsk(req)->ts_off, - req->ts_recent, sk->sk_bound_dev_if, + READ_ONCE(req->ts_recent), sk->sk_bound_dev_if, tcp_v6_md5_do_lookup(sk, &ipv6_hdr(skb)->saddr, l3index), ipv6_get_dsfield(ipv6_hdr(skb)), 0, sk->sk_priority, READ_ONCE(tcp_rsk(req)->txhash));
From: Wang Ming machel@vivo.com
[ Upstream commit daa751444fd9d4184270b1479d8af49aaf1a1ee6 ]
key might contain private part of the key, so better use kfree_sensitive to free it.
Fixes: 38320c70d282 ("[IPSEC]: Use crypto_aead and authenc in ESP") Signed-off-by: Wang Ming machel@vivo.com Reviewed-by: Tariq Toukan tariqt@nvidia.com Reviewed-by: Kuniyuki Iwashima kuniyu@amazon.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/ipv4/esp4.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/ipv4/esp4.c b/net/ipv4/esp4.c index ba06ed42e4284..2be2d49225573 100644 --- a/net/ipv4/esp4.c +++ b/net/ipv4/esp4.c @@ -1132,7 +1132,7 @@ static int esp_init_authenc(struct xfrm_state *x, err = crypto_aead_setkey(aead, key, keylen);
free_key: - kfree(key); + kfree_sensitive(key);
error: return err;
From: Yuanjun Gong ruc_gongyuanjun@163.com
[ Upstream commit 4258faa130be4ea43e5e2d839467da421b8ff274 ]
goto tx_err if an unexpected result is returned by pskb_tirm() in ip6erspan_tunnel_xmit().
Fixes: 5a963eb61b7c ("ip6_gre: Add ERSPAN native tunnel support") Signed-off-by: Yuanjun Gong ruc_gongyuanjun@163.com Reviewed-by: David Ahern dsahern@kernel.org Reviewed-by: Kuniyuki Iwashima kuniyu@amazon.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/ipv6/ip6_gre.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/net/ipv6/ip6_gre.c b/net/ipv6/ip6_gre.c index da80974ad23ae..070d87abf7c02 100644 --- a/net/ipv6/ip6_gre.c +++ b/net/ipv6/ip6_gre.c @@ -955,7 +955,8 @@ static netdev_tx_t ip6erspan_tunnel_xmit(struct sk_buff *skb, goto tx_err;
if (skb->len > dev->mtu + dev->hard_header_len) { - pskb_trim(skb, dev->mtu + dev->hard_header_len); + if (pskb_trim(skb, dev->mtu + dev->hard_header_len)) + goto tx_err; truncate = true; }
From: Kuniyuki Iwashima kuniyu@amazon.com
[ Upstream commit 81b3ade5d2b98ad6e0a473b0e1e420a801275592 ]
This reverts commit 3f4ca5fafc08881d7a57daa20449d171f2887043.
Commit 3f4ca5fafc08 ("tcp: avoid the lookup process failing to get sk in ehash table") reversed the order in how a socket is inserted into ehash to fix an issue that ehash-lookup could fail when reqsk/full sk/twsk are swapped. However, it introduced another lookup failure.
The full socket in ehash is allocated from a slab with SLAB_TYPESAFE_BY_RCU and does not have SOCK_RCU_FREE, so the socket could be reused even while it is being referenced on another CPU doing RCU lookup.
Let's say a socket is reused and inserted into the same hash bucket during lookup. After the blamed commit, a new socket is inserted at the end of the list. If that happens, we will skip sockets placed after the previous position of the reused socket, resulting in ehash lookup failure.
As described in Documentation/RCU/rculist_nulls.rst, we should insert a new socket at the head of the list to avoid such an issue.
This issue, the swap-lookup-failure, and another variant reported in [0] can all be handled properly by adding a locked ehash lookup suggested by Eric Dumazet [1].
However, this issue could occur for every packet, thus more likely than the other two races, so let's revert the change for now.
Link: https://lore.kernel.org/netdev/20230606064306.9192-1-duanmuquan@baidu.com/ [0] Link: https://lore.kernel.org/netdev/CANn89iK8snOz8TYOhhwfimC7ykYA78GA3Nyv8x06SZYa... [1] Fixes: 3f4ca5fafc08 ("tcp: avoid the lookup process failing to get sk in ehash table") Signed-off-by: Kuniyuki Iwashima kuniyu@amazon.com Link: https://lore.kernel.org/r/20230717215918.15723-1-kuniyu@amazon.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/ipv4/inet_hashtables.c | 17 ++--------------- net/ipv4/inet_timewait_sock.c | 8 ++++---- 2 files changed, 6 insertions(+), 19 deletions(-)
diff --git a/net/ipv4/inet_hashtables.c b/net/ipv4/inet_hashtables.c index e7391bf310a75..0819d6001b9ab 100644 --- a/net/ipv4/inet_hashtables.c +++ b/net/ipv4/inet_hashtables.c @@ -650,20 +650,8 @@ bool inet_ehash_insert(struct sock *sk, struct sock *osk, bool *found_dup_sk) spin_lock(lock); if (osk) { WARN_ON_ONCE(sk->sk_hash != osk->sk_hash); - ret = sk_hashed(osk); - if (ret) { - /* Before deleting the node, we insert a new one to make - * sure that the look-up-sk process would not miss either - * of them and that at least one node would exist in ehash - * table all the time. Otherwise there's a tiny chance - * that lookup process could find nothing in ehash table. - */ - __sk_nulls_add_node_tail_rcu(sk, list); - sk_nulls_del_node_init_rcu(osk); - } - goto unlock; - } - if (found_dup_sk) { + ret = sk_nulls_del_node_init_rcu(osk); + } else if (found_dup_sk) { *found_dup_sk = inet_ehash_lookup_by_sk(sk, list); if (*found_dup_sk) ret = false; @@ -672,7 +660,6 @@ bool inet_ehash_insert(struct sock *sk, struct sock *osk, bool *found_dup_sk) if (ret) __sk_nulls_add_node_rcu(sk, list);
-unlock: spin_unlock(lock);
return ret; diff --git a/net/ipv4/inet_timewait_sock.c b/net/ipv4/inet_timewait_sock.c index 40052414c7c71..2c1b245dba8e8 100644 --- a/net/ipv4/inet_timewait_sock.c +++ b/net/ipv4/inet_timewait_sock.c @@ -88,10 +88,10 @@ void inet_twsk_put(struct inet_timewait_sock *tw) } EXPORT_SYMBOL_GPL(inet_twsk_put);
-static void inet_twsk_add_node_tail_rcu(struct inet_timewait_sock *tw, - struct hlist_nulls_head *list) +static void inet_twsk_add_node_rcu(struct inet_timewait_sock *tw, + struct hlist_nulls_head *list) { - hlist_nulls_add_tail_rcu(&tw->tw_node, list); + hlist_nulls_add_head_rcu(&tw->tw_node, list); }
static void inet_twsk_add_bind_node(struct inet_timewait_sock *tw, @@ -144,7 +144,7 @@ void inet_twsk_hashdance(struct inet_timewait_sock *tw, struct sock *sk,
spin_lock(lock);
- inet_twsk_add_node_tail_rcu(tw, &ehead->chain); + inet_twsk_add_node_rcu(tw, &ehead->chain);
/* Step 3: Remove SK from hash chain */ if (__sk_nulls_del_node_init_rcu(sk))
From: Daniel Golle daniel@makrotopia.org
[ Upstream commit 9f9d4c1a2e82174a4e799ec405284a2b0de32b6a ]
entries and bind debugfs files would display wrong data on NETSYS_V2 and later because instead of using mtk_get_ib1_pkt_type the driver would use MTK_FOE_IB1_PACKET_TYPE which corresponds to NETSYS_V1(.x) SoCs. Use mtk_get_ib1_pkt_type so entries and bind records display correctly.
Fixes: 03a3180e5c09e ("net: ethernet: mtk_eth_soc: introduce flow offloading support for mt7986") Signed-off-by: Daniel Golle daniel@makrotopia.org Acked-by: Lorenzo Bianconi lorenzo@kernel.org Link: https://lore.kernel.org/r/c0ae03d0182f4d27b874cbdf0059bc972c317f3c.168972713... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/mediatek/mtk_ppe_debugfs.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/mediatek/mtk_ppe_debugfs.c b/drivers/net/ethernet/mediatek/mtk_ppe_debugfs.c index 316fe2e70fead..1a97feca77f23 100644 --- a/drivers/net/ethernet/mediatek/mtk_ppe_debugfs.c +++ b/drivers/net/ethernet/mediatek/mtk_ppe_debugfs.c @@ -98,7 +98,7 @@ mtk_ppe_debugfs_foe_show(struct seq_file *m, void *private, bool bind)
acct = mtk_foe_entry_get_mib(ppe, i, NULL);
- type = FIELD_GET(MTK_FOE_IB1_PACKET_TYPE, entry->ib1); + type = mtk_get_ib1_pkt_type(ppe->eth, entry->ib1); seq_printf(m, "%05x %s %7s", i, mtk_foe_entry_state_str(state), mtk_foe_pkt_type_str(type));
From: Zhang Shurong zhang_shurong@foxmail.com
[ Upstream commit 4e88761f5f8c7869f15a2046b1a1116f4fab4ac8 ]
This func misses checking for platform_get_irq()'s call and may passes the negative error codes to request_irq(), which takes unsigned IRQ #, causing it to fail with -EINVAL, overriding an original error code.
Fix this by stop calling request_irq() with invalid IRQ #s.
Fixes: 1630d85a8312 ("au1200fb: fix hardcoded IRQ") Signed-off-by: Zhang Shurong zhang_shurong@foxmail.com Signed-off-by: Helge Deller deller@gmx.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/video/fbdev/au1200fb.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/video/fbdev/au1200fb.c b/drivers/video/fbdev/au1200fb.c index aed88ce45bf09..d8f085d4ede30 100644 --- a/drivers/video/fbdev/au1200fb.c +++ b/drivers/video/fbdev/au1200fb.c @@ -1732,6 +1732,9 @@ static int au1200fb_drv_probe(struct platform_device *dev)
/* Now hook interrupt too */ irq = platform_get_irq(dev, 0); + if (irq < 0) + return irq; + ret = request_irq(irq, au1200fb_handle_irq, IRQF_SHARED, "lcd", (void *)dev); if (ret) {
From: Kuniyuki Iwashima kuniyu@amazon.com
[ Upstream commit 6631463b6e6673916d2481f692938f393148aa82 ]
Now these upper layer protocol handlers can be called from llc_rcv() as sap->rcv_func(), which is registered by llc_sap_open().
* function which is passed to register_8022_client() -> no in-kernel user calls register_8022_client().
* snap_rcv() `- proto->rcvfunc() : registered by register_snap_client() -> aarp_rcv() and atalk_rcv() drop packets from non-root netns
* stp_pdu_rcv() `- garp_protos[]->rcv() : registered by stp_proto_register() -> garp_pdu_rcv() and br_stp_rcv() are netns-aware
So, we can safely remove the netns restriction in llc_rcv().
Fixes: e730c15519d0 ("[NET]: Make packet reception network namespace safe") Signed-off-by: Kuniyuki Iwashima kuniyu@amazon.com Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/llc/llc_input.c | 3 --- 1 file changed, 3 deletions(-)
diff --git a/net/llc/llc_input.c b/net/llc/llc_input.c index c309b72a58779..7cac441862e21 100644 --- a/net/llc/llc_input.c +++ b/net/llc/llc_input.c @@ -163,9 +163,6 @@ int llc_rcv(struct sk_buff *skb, struct net_device *dev, void (*sta_handler)(struct sk_buff *skb); void (*sap_handler)(struct llc_sap *sap, struct sk_buff *skb);
- if (!net_eq(dev_net(dev), &init_net)) - goto drop; - /* * When the interface is in promisc. mode, drop all the crap that it * receives, do not try to analyse it.
From: Vitaly Rodionov vitalyr@opensource.cirrus.com
[ Upstream commit f7b069cf08816252f494d193b9ecdff172bf9aa1 ]
Generic fixup for CS35L41 amplifies should not have vendor specific chained fixup. For ThinkPad laptops with led issue, we can just add specific fixup.
Fixes: a6ac60b36dade (ALSA: hda/realtek: Fix mute led issue on thinkpad with cs35l41 s-codec) Signed-off-by: Vitaly Rodionov vitalyr@opensource.cirrus.com Link: https://lore.kernel.org/r/20230720082022.13033-1-vitalyr@opensource.cirrus.c... Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Sasha Levin sashal@kernel.org --- sound/pci/hda/patch_realtek.c | 25 +++++++++++++++---------- 1 file changed, 15 insertions(+), 10 deletions(-)
--- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -7224,6 +7224,7 @@ enum { ALC287_FIXUP_YOGA9_14IAP7_BASS_SPK_PIN, ALC295_FIXUP_DELL_INSPIRON_TOP_SPEAKERS, ALC236_FIXUP_DELL_DUAL_CODECS, + ALC287_FIXUP_CS35L41_I2C_2_THINKPAD_ACPI, };
/* A special fixup for Lenovo C940 and Yoga Duet 7; @@ -9135,8 +9136,6 @@ static const struct hda_fixup alc269_fix [ALC287_FIXUP_CS35L41_I2C_2] = { .type = HDA_FIXUP_FUNC, .v.func = cs35l41_fixup_i2c_two, - .chained = true, - .chain_id = ALC269_FIXUP_THINKPAD_ACPI, }, [ALC287_FIXUP_CS35L41_I2C_2_HP_GPIO_LED] = { .type = HDA_FIXUP_FUNC, @@ -9273,6 +9272,12 @@ static const struct hda_fixup alc269_fix .chained = true, .chain_id = ALC255_FIXUP_DELL1_MIC_NO_PRESENCE, }, + [ALC287_FIXUP_CS35L41_I2C_2_THINKPAD_ACPI] = { + .type = HDA_FIXUP_FUNC, + .v.func = cs35l41_fixup_i2c_two, + .chained = true, + .chain_id = ALC269_FIXUP_THINKPAD_ACPI, + }, };
static const struct snd_pci_quirk alc269_fixup_tbl[] = { @@ -9798,14 +9803,14 @@ static const struct snd_pci_quirk alc269 SND_PCI_QUIRK(0x17aa, 0x22be, "Thinkpad X1 Carbon 8th", ALC285_FIXUP_THINKPAD_HEADSET_JACK), SND_PCI_QUIRK(0x17aa, 0x22c1, "Thinkpad P1 Gen 3", ALC285_FIXUP_THINKPAD_NO_BASS_SPK_HEADSET_JACK), SND_PCI_QUIRK(0x17aa, 0x22c2, "Thinkpad X1 Extreme Gen 3", ALC285_FIXUP_THINKPAD_NO_BASS_SPK_HEADSET_JACK), - SND_PCI_QUIRK(0x17aa, 0x22f1, "Thinkpad", ALC287_FIXUP_CS35L41_I2C_2), - SND_PCI_QUIRK(0x17aa, 0x22f2, "Thinkpad", ALC287_FIXUP_CS35L41_I2C_2), - SND_PCI_QUIRK(0x17aa, 0x22f3, "Thinkpad", ALC287_FIXUP_CS35L41_I2C_2), - SND_PCI_QUIRK(0x17aa, 0x2316, "Thinkpad P1 Gen 6", ALC287_FIXUP_CS35L41_I2C_2), - SND_PCI_QUIRK(0x17aa, 0x2317, "Thinkpad P1 Gen 6", ALC287_FIXUP_CS35L41_I2C_2), - SND_PCI_QUIRK(0x17aa, 0x2318, "Thinkpad Z13 Gen2", ALC287_FIXUP_CS35L41_I2C_2), - SND_PCI_QUIRK(0x17aa, 0x2319, "Thinkpad Z16 Gen2", ALC287_FIXUP_CS35L41_I2C_2), - SND_PCI_QUIRK(0x17aa, 0x231a, "Thinkpad Z16 Gen2", ALC287_FIXUP_CS35L41_I2C_2), + SND_PCI_QUIRK(0x17aa, 0x22f1, "Thinkpad", ALC287_FIXUP_CS35L41_I2C_2_THINKPAD_ACPI), + SND_PCI_QUIRK(0x17aa, 0x22f2, "Thinkpad", ALC287_FIXUP_CS35L41_I2C_2_THINKPAD_ACPI), + SND_PCI_QUIRK(0x17aa, 0x22f3, "Thinkpad", ALC287_FIXUP_CS35L41_I2C_2_THINKPAD_ACPI), + SND_PCI_QUIRK(0x17aa, 0x2316, "Thinkpad P1 Gen 6", ALC287_FIXUP_CS35L41_I2C_2_THINKPAD_ACPI), + SND_PCI_QUIRK(0x17aa, 0x2317, "Thinkpad P1 Gen 6", ALC287_FIXUP_CS35L41_I2C_2_THINKPAD_ACPI), + SND_PCI_QUIRK(0x17aa, 0x2318, "Thinkpad Z13 Gen2", ALC287_FIXUP_CS35L41_I2C_2_THINKPAD_ACPI), + SND_PCI_QUIRK(0x17aa, 0x2319, "Thinkpad Z16 Gen2", ALC287_FIXUP_CS35L41_I2C_2_THINKPAD_ACPI), + SND_PCI_QUIRK(0x17aa, 0x231a, "Thinkpad Z16 Gen2", ALC287_FIXUP_CS35L41_I2C_2_THINKPAD_ACPI), SND_PCI_QUIRK(0x17aa, 0x30bb, "ThinkCentre AIO", ALC233_FIXUP_LENOVO_LINE2_MIC_HOTKEY), SND_PCI_QUIRK(0x17aa, 0x30e2, "ThinkCentre AIO", ALC233_FIXUP_LENOVO_LINE2_MIC_HOTKEY), SND_PCI_QUIRK(0x17aa, 0x310c, "ThinkCentre Station", ALC294_FIXUP_LENOVO_MIC_LOCATION),
From: Florian Westphal fw@strlen.de
[ Upstream commit ddbd8be68941985f166f5107109a90ce13147c44 ]
On some platforms there is a padding hole in the nft_verdict structure, between the verdict code and the chain pointer.
On element insertion, if the new element clashes with an existing one and NLM_F_EXCL flag isn't set, we want to ignore the -EEXIST error as long as the data associated with duplicated element is the same as the existing one. The data equality check uses memcmp.
For normal data (NFT_DATA_VALUE) this works fine, but for NFT_DATA_VERDICT padding area leads to spurious failure even if the verdict data is the same.
This then makes the insertion fail with 'already exists' error, even though the new "key : data" matches an existing entry and userspace told the kernel that it doesn't want to receive an error indication.
Fixes: c016c7e45ddf ("netfilter: nf_tables: honor NLM_F_EXCL flag in set element insertion") Signed-off-by: Florian Westphal fw@strlen.de Signed-off-by: Sasha Levin sashal@kernel.org --- net/netfilter/nf_tables_api.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c index 18546f9b2a63a..51909bcc181fa 100644 --- a/net/netfilter/nf_tables_api.c +++ b/net/netfilter/nf_tables_api.c @@ -10482,6 +10482,9 @@ static int nft_verdict_init(const struct nft_ctx *ctx, struct nft_data *data,
if (!tb[NFTA_VERDICT_CODE]) return -EINVAL; + + /* zero padding hole for memcmp */ + memset(data, 0, sizeof(*data)); data->verdict.code = ntohl(nla_get_be32(tb[NFTA_VERDICT_CODE]));
switch (data->verdict.code) {
From: Florian Westphal fw@strlen.de
[ Upstream commit 314c82841602a111c04a7210c21dc77e0d560242 ]
Can be called via nft set element list iteration, which may acquire rcu and/or bh read lock (depends on set type).
BUG: sleeping function called from invalid context at net/netfilter/nf_tables_api.c:3353 in_atomic(): 0, irqs_disabled(): 0, non_block: 0, pid: 1232, name: nft preempt_count: 0, expected: 0 RCU nest depth: 1, expected: 0 2 locks held by nft/1232: #0: ffff8881180e3ea8 (&nft_net->commit_mutex){+.+.}-{3:3}, at: nf_tables_valid_genid #1: ffffffff83f5f540 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire Call Trace: nft_chain_validate nft_lookup_validate_setelem nft_pipapo_walk nft_lookup_validate nft_chain_validate nft_immediate_validate nft_chain_validate nf_tables_validate nf_tables_abort
No choice but to move it to nf_tables_validate().
Fixes: 81ea01066741 ("netfilter: nf_tables: add rescheduling points during loop detection walks") Signed-off-by: Florian Westphal fw@strlen.de Signed-off-by: Sasha Levin sashal@kernel.org --- net/netfilter/nf_tables_api.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c index 51909bcc181fa..f3a4aa9054876 100644 --- a/net/netfilter/nf_tables_api.c +++ b/net/netfilter/nf_tables_api.c @@ -3684,8 +3684,6 @@ int nft_chain_validate(const struct nft_ctx *ctx, const struct nft_chain *chain) if (err < 0) return err; } - - cond_resched(); }
return 0; @@ -3709,6 +3707,8 @@ static int nft_table_validate(struct net *net, const struct nft_table *table) err = nft_chain_validate(&ctx, chain); if (err < 0) return err; + + cond_resched(); }
return 0;
From: Florian Westphal fw@strlen.de
[ Upstream commit 87b5a5c209405cb6b57424cdfa226a6dbd349232 ]
end key should be equal to start unless NFT_SET_EXT_KEY_END is present.
Its possible to add elements that only have a start key ("{ 1.0.0.0 . 2.0.0.0 }") without an internval end.
Insertion treats this via:
if (nft_set_ext_exists(ext, NFT_SET_EXT_KEY_END)) end = (const u8 *)nft_set_ext_key_end(ext)->data; else end = start;
but removal side always uses nft_set_ext_key_end(). This is wrong and leads to garbage remaining in the set after removal next lookup/insert attempt will give:
BUG: KASAN: slab-use-after-free in pipapo_get+0x8eb/0xb90 Read of size 1 at addr ffff888100d50586 by task nft-pipapo_uaf_/1399 Call Trace: kasan_report+0x105/0x140 pipapo_get+0x8eb/0xb90 nft_pipapo_insert+0x1dc/0x1710 nf_tables_newsetelem+0x31f5/0x4e00 ..
Fixes: 3c4287f62044 ("nf_tables: Add set type for arbitrary concatenation of ranges") Reported-by: lonial con kongln9170@gmail.com Reviewed-by: Stefano Brivio sbrivio@redhat.com Signed-off-by: Florian Westphal fw@strlen.de Signed-off-by: Sasha Levin sashal@kernel.org --- net/netfilter/nft_set_pipapo.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/net/netfilter/nft_set_pipapo.c b/net/netfilter/nft_set_pipapo.c index 0452ee586c1cc..a81829c10feab 100644 --- a/net/netfilter/nft_set_pipapo.c +++ b/net/netfilter/nft_set_pipapo.c @@ -1930,7 +1930,11 @@ static void nft_pipapo_remove(const struct net *net, const struct nft_set *set, int i, start, rules_fx;
match_start = data; - match_end = (const u8 *)nft_set_ext_key_end(&e->ext)->data; + + if (nft_set_ext_exists(&e->ext, NFT_SET_EXT_KEY_END)) + match_end = (const u8 *)nft_set_ext_key_end(&e->ext)->data; + else + match_end = data;
start = first_rule; rules_fx = rules_f0;
From: Pablo Neira Ayuso pablo@netfilter.org
[ Upstream commit 751d460ccff3137212f47d876221534bf0490996 ]
Skip bound chain from netns release path, the rule that owns this chain releases these objects.
Fixes: d0e2c7de92c7 ("netfilter: nf_tables: add NFT_CHAIN_BINDING") Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Florian Westphal fw@strlen.de Signed-off-by: Sasha Levin sashal@kernel.org --- net/netfilter/nf_tables_api.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c index f3a4aa9054876..e3049c7db9041 100644 --- a/net/netfilter/nf_tables_api.c +++ b/net/netfilter/nf_tables_api.c @@ -10767,6 +10767,9 @@ static void __nft_release_table(struct net *net, struct nft_table *table) ctx.family = table->family; ctx.table = table; list_for_each_entry(chain, &table->chains, list) { + if (nft_chain_is_bound(chain)) + continue; + ctx.chain = chain; list_for_each_entry_safe(rule, nr, &chain->rules, list) { list_del(&rule->list);
From: Pablo Neira Ayuso pablo@netfilter.org
[ Upstream commit 6eaf41e87a223ae6f8e7a28d6e78384ad7e407f8 ]
Skip bound chain when flushing table rules, the rule that owns this chain releases these objects.
Otherwise, the following warning is triggered:
WARNING: CPU: 2 PID: 1217 at net/netfilter/nf_tables_api.c:2013 nf_tables_chain_destroy+0x1f7/0x210 [nf_tables] CPU: 2 PID: 1217 Comm: chain-flush Not tainted 6.1.39 #1 RIP: 0010:nf_tables_chain_destroy+0x1f7/0x210 [nf_tables]
Fixes: d0e2c7de92c7 ("netfilter: nf_tables: add NFT_CHAIN_BINDING") Reported-by: Kevin Rich kevinrich1337@gmail.com Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Florian Westphal fw@strlen.de Signed-off-by: Sasha Levin sashal@kernel.org --- net/netfilter/nf_tables_api.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c index e3049c7db9041..ccf0b3d80fd97 100644 --- a/net/netfilter/nf_tables_api.c +++ b/net/netfilter/nf_tables_api.c @@ -4086,6 +4086,8 @@ static int nf_tables_delrule(struct sk_buff *skb, const struct nfnl_info *info, list_for_each_entry(chain, &table->chains, list) { if (!nft_is_active_next(net, chain)) continue; + if (nft_chain_is_bound(chain)) + continue;
ctx.chain = chain; err = nft_delrule_by_chain(&ctx);
From: Pauli Virtanen pav@iki.fi
[ Upstream commit 195ef75e19287b4bc413da3e3e3722b030ac881e ]
hci_update_accept_list_sync iterates over hdev->pend_le_conns and hdev->pend_le_reports, and waits for controller events in the loop body, without holding hdev lock.
Meanwhile, these lists and the items may be modified e.g. by le_scan_cleanup. This can invalidate the list cursor or any other item in the list, resulting to invalid behavior (eg use-after-free).
Use RCU for the hci_conn_params action lists. Since the loop bodies in hci_sync block and we cannot use RCU or hdev->lock for the whole loop, copy list items first and then iterate on the copy. Only the flags field is written from elsewhere, so READ_ONCE/WRITE_ONCE should guarantee we read valid values.
Free params everywhere with hci_conn_params_free so the cleanup is guaranteed to be done properly.
This fixes the following, which can be triggered e.g. by BlueZ new mgmt-tester case "Add + Remove Device Nowait - Success", or by changing hci_le_set_cig_params to always return false, and running iso-tester:
================================================================== BUG: KASAN: slab-use-after-free in hci_update_passive_scan_sync (net/bluetooth/hci_sync.c:2536 net/bluetooth/hci_sync.c:2723 net/bluetooth/hci_sync.c:2841) Read of size 8 at addr ffff888001265018 by task kworker/u3:0/32
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-1.fc38 04/01/2014 Workqueue: hci0 hci_cmd_sync_work Call Trace: <TASK> dump_stack_lvl (./arch/x86/include/asm/irqflags.h:134 lib/dump_stack.c:107) print_report (mm/kasan/report.c:320 mm/kasan/report.c:430) ? __virt_addr_valid (./include/linux/mmzone.h:1915 ./include/linux/mmzone.h:2011 arch/x86/mm/physaddr.c:65) ? hci_update_passive_scan_sync (net/bluetooth/hci_sync.c:2536 net/bluetooth/hci_sync.c:2723 net/bluetooth/hci_sync.c:2841) kasan_report (mm/kasan/report.c:538) ? hci_update_passive_scan_sync (net/bluetooth/hci_sync.c:2536 net/bluetooth/hci_sync.c:2723 net/bluetooth/hci_sync.c:2841) hci_update_passive_scan_sync (net/bluetooth/hci_sync.c:2536 net/bluetooth/hci_sync.c:2723 net/bluetooth/hci_sync.c:2841) ? __pfx_hci_update_passive_scan_sync (net/bluetooth/hci_sync.c:2780) ? mutex_lock (kernel/locking/mutex.c:282) ? __pfx_mutex_lock (kernel/locking/mutex.c:282) ? __pfx_mutex_unlock (kernel/locking/mutex.c:538) ? __pfx_update_passive_scan_sync (net/bluetooth/hci_sync.c:2861) hci_cmd_sync_work (net/bluetooth/hci_sync.c:306) process_one_work (./arch/x86/include/asm/preempt.h:27 kernel/workqueue.c:2399) worker_thread (./include/linux/list.h:292 kernel/workqueue.c:2538) ? __pfx_worker_thread (kernel/workqueue.c:2480) kthread (kernel/kthread.c:376) ? __pfx_kthread (kernel/kthread.c:331) ret_from_fork (arch/x86/entry/entry_64.S:314) </TASK>
Allocated by task 31: kasan_save_stack (mm/kasan/common.c:46) kasan_set_track (mm/kasan/common.c:52) __kasan_kmalloc (mm/kasan/common.c:374 mm/kasan/common.c:383) hci_conn_params_add (./include/linux/slab.h:580 ./include/linux/slab.h:720 net/bluetooth/hci_core.c:2277) hci_connect_le_scan (net/bluetooth/hci_conn.c:1419 net/bluetooth/hci_conn.c:1589) hci_connect_cis (net/bluetooth/hci_conn.c:2266) iso_connect_cis (net/bluetooth/iso.c:390) iso_sock_connect (net/bluetooth/iso.c:899) __sys_connect (net/socket.c:2003 net/socket.c:2020) __x64_sys_connect (net/socket.c:2027) do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80) entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:120)
Freed by task 15: kasan_save_stack (mm/kasan/common.c:46) kasan_set_track (mm/kasan/common.c:52) kasan_save_free_info (mm/kasan/generic.c:523) __kasan_slab_free (mm/kasan/common.c:238 mm/kasan/common.c:200 mm/kasan/common.c:244) __kmem_cache_free (mm/slub.c:1807 mm/slub.c:3787 mm/slub.c:3800) hci_conn_params_del (net/bluetooth/hci_core.c:2323) le_scan_cleanup (net/bluetooth/hci_conn.c:202) process_one_work (./arch/x86/include/asm/preempt.h:27 kernel/workqueue.c:2399) worker_thread (./include/linux/list.h:292 kernel/workqueue.c:2538) kthread (kernel/kthread.c:376) ret_from_fork (arch/x86/entry/entry_64.S:314) ==================================================================
Fixes: e8907f76544f ("Bluetooth: hci_sync: Make use of hci_cmd_sync_queue set 3") Signed-off-by: Pauli Virtanen pav@iki.fi Signed-off-by: Luiz Augusto von Dentz luiz.von.dentz@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- include/net/bluetooth/hci_core.h | 5 ++ net/bluetooth/hci_conn.c | 10 +-- net/bluetooth/hci_core.c | 38 ++++++++-- net/bluetooth/hci_event.c | 12 ++-- net/bluetooth/hci_sync.c | 117 ++++++++++++++++++++++++++++--- net/bluetooth/mgmt.c | 26 +++---- 6 files changed, 164 insertions(+), 44 deletions(-)
diff --git a/include/net/bluetooth/hci_core.h b/include/net/bluetooth/hci_core.h index 9654567cfae37..870b6d3c5146b 100644 --- a/include/net/bluetooth/hci_core.h +++ b/include/net/bluetooth/hci_core.h @@ -822,6 +822,7 @@ struct hci_conn_params {
struct hci_conn *conn; bool explicit_connect; + /* Accessed without hdev->lock: */ hci_conn_flags_t flags; u8 privacy_mode; }; @@ -1573,7 +1574,11 @@ struct hci_conn_params *hci_conn_params_add(struct hci_dev *hdev, bdaddr_t *addr, u8 addr_type); void hci_conn_params_del(struct hci_dev *hdev, bdaddr_t *addr, u8 addr_type); void hci_conn_params_clear_disabled(struct hci_dev *hdev); +void hci_conn_params_free(struct hci_conn_params *param);
+void hci_pend_le_list_del_init(struct hci_conn_params *param); +void hci_pend_le_list_add(struct hci_conn_params *param, + struct list_head *list); struct hci_conn_params *hci_pend_le_action_lookup(struct list_head *list, bdaddr_t *addr, u8 addr_type); diff --git a/net/bluetooth/hci_conn.c b/net/bluetooth/hci_conn.c index 2275e0d9f8419..7b0c74ef93296 100644 --- a/net/bluetooth/hci_conn.c +++ b/net/bluetooth/hci_conn.c @@ -118,7 +118,7 @@ static void hci_connect_le_scan_cleanup(struct hci_conn *conn, u8 status) */ params->explicit_connect = false;
- list_del_init(¶ms->action); + hci_pend_le_list_del_init(params);
switch (params->auto_connect) { case HCI_AUTO_CONN_EXPLICIT: @@ -127,10 +127,10 @@ static void hci_connect_le_scan_cleanup(struct hci_conn *conn, u8 status) return; case HCI_AUTO_CONN_DIRECT: case HCI_AUTO_CONN_ALWAYS: - list_add(¶ms->action, &hdev->pend_le_conns); + hci_pend_le_list_add(params, &hdev->pend_le_conns); break; case HCI_AUTO_CONN_REPORT: - list_add(¶ms->action, &hdev->pend_le_reports); + hci_pend_le_list_add(params, &hdev->pend_le_reports); break; default: break; @@ -1426,8 +1426,8 @@ static int hci_explicit_conn_params_set(struct hci_dev *hdev, if (params->auto_connect == HCI_AUTO_CONN_DISABLED || params->auto_connect == HCI_AUTO_CONN_REPORT || params->auto_connect == HCI_AUTO_CONN_EXPLICIT) { - list_del_init(¶ms->action); - list_add(¶ms->action, &hdev->pend_le_conns); + hci_pend_le_list_del_init(params); + hci_pend_le_list_add(params, &hdev->pend_le_conns); }
params->explicit_connect = true; diff --git a/net/bluetooth/hci_core.c b/net/bluetooth/hci_core.c index 48917c68358de..b421e196f60c3 100644 --- a/net/bluetooth/hci_core.c +++ b/net/bluetooth/hci_core.c @@ -2249,21 +2249,45 @@ struct hci_conn_params *hci_conn_params_lookup(struct hci_dev *hdev, return NULL; }
-/* This function requires the caller holds hdev->lock */ +/* This function requires the caller holds hdev->lock or rcu_read_lock */ struct hci_conn_params *hci_pend_le_action_lookup(struct list_head *list, bdaddr_t *addr, u8 addr_type) { struct hci_conn_params *param;
- list_for_each_entry(param, list, action) { + rcu_read_lock(); + + list_for_each_entry_rcu(param, list, action) { if (bacmp(¶m->addr, addr) == 0 && - param->addr_type == addr_type) + param->addr_type == addr_type) { + rcu_read_unlock(); return param; + } }
+ rcu_read_unlock(); + return NULL; }
+/* This function requires the caller holds hdev->lock */ +void hci_pend_le_list_del_init(struct hci_conn_params *param) +{ + if (list_empty(¶m->action)) + return; + + list_del_rcu(¶m->action); + synchronize_rcu(); + INIT_LIST_HEAD(¶m->action); +} + +/* This function requires the caller holds hdev->lock */ +void hci_pend_le_list_add(struct hci_conn_params *param, + struct list_head *list) +{ + list_add_rcu(¶m->action, list); +} + /* This function requires the caller holds hdev->lock */ struct hci_conn_params *hci_conn_params_add(struct hci_dev *hdev, bdaddr_t *addr, u8 addr_type) @@ -2297,14 +2321,15 @@ struct hci_conn_params *hci_conn_params_add(struct hci_dev *hdev, return params; }
-static void hci_conn_params_free(struct hci_conn_params *params) +void hci_conn_params_free(struct hci_conn_params *params) { + hci_pend_le_list_del_init(params); + if (params->conn) { hci_conn_drop(params->conn); hci_conn_put(params->conn); }
- list_del(¶ms->action); list_del(¶ms->list); kfree(params); } @@ -2342,8 +2367,7 @@ void hci_conn_params_clear_disabled(struct hci_dev *hdev) continue; }
- list_del(¶ms->list); - kfree(params); + hci_conn_params_free(params); }
BT_DBG("All LE disabled connection parameters were removed"); diff --git a/net/bluetooth/hci_event.c b/net/bluetooth/hci_event.c index 21e26d3b286cc..72b6d189d3de2 100644 --- a/net/bluetooth/hci_event.c +++ b/net/bluetooth/hci_event.c @@ -1564,7 +1564,7 @@ static u8 hci_cc_le_set_privacy_mode(struct hci_dev *hdev, void *data,
params = hci_conn_params_lookup(hdev, &cp->bdaddr, cp->bdaddr_type); if (params) - params->privacy_mode = cp->mode; + WRITE_ONCE(params->privacy_mode, cp->mode);
hci_dev_unlock(hdev);
@@ -2804,8 +2804,8 @@ static void hci_cs_disconnect(struct hci_dev *hdev, u8 status)
case HCI_AUTO_CONN_DIRECT: case HCI_AUTO_CONN_ALWAYS: - list_del_init(¶ms->action); - list_add(¶ms->action, &hdev->pend_le_conns); + hci_pend_le_list_del_init(params); + hci_pend_le_list_add(params, &hdev->pend_le_conns); break;
default: @@ -3423,8 +3423,8 @@ static void hci_disconn_complete_evt(struct hci_dev *hdev, void *data,
case HCI_AUTO_CONN_DIRECT: case HCI_AUTO_CONN_ALWAYS: - list_del_init(¶ms->action); - list_add(¶ms->action, &hdev->pend_le_conns); + hci_pend_le_list_del_init(params); + hci_pend_le_list_add(params, &hdev->pend_le_conns); hci_update_passive_scan(hdev); break;
@@ -5961,7 +5961,7 @@ static void le_conn_complete_evt(struct hci_dev *hdev, u8 status, params = hci_pend_le_action_lookup(&hdev->pend_le_conns, &conn->dst, conn->dst_type); if (params) { - list_del_init(¶ms->action); + hci_pend_le_list_del_init(params); if (params->conn) { hci_conn_drop(params->conn); hci_conn_put(params->conn); diff --git a/net/bluetooth/hci_sync.c b/net/bluetooth/hci_sync.c index b5b1b610df335..1bcb54272dc67 100644 --- a/net/bluetooth/hci_sync.c +++ b/net/bluetooth/hci_sync.c @@ -2160,15 +2160,23 @@ static int hci_le_del_accept_list_sync(struct hci_dev *hdev, return 0; }
+struct conn_params { + bdaddr_t addr; + u8 addr_type; + hci_conn_flags_t flags; + u8 privacy_mode; +}; + /* Adds connection to resolve list if needed. * Setting params to NULL programs local hdev->irk */ static int hci_le_add_resolve_list_sync(struct hci_dev *hdev, - struct hci_conn_params *params) + struct conn_params *params) { struct hci_cp_le_add_to_resolv_list cp; struct smp_irk *irk; struct bdaddr_list_with_irk *entry; + struct hci_conn_params *p;
if (!use_ll_privacy(hdev)) return 0; @@ -2203,6 +2211,16 @@ static int hci_le_add_resolve_list_sync(struct hci_dev *hdev, /* Default privacy mode is always Network */ params->privacy_mode = HCI_NETWORK_PRIVACY;
+ rcu_read_lock(); + p = hci_pend_le_action_lookup(&hdev->pend_le_conns, + ¶ms->addr, params->addr_type); + if (!p) + p = hci_pend_le_action_lookup(&hdev->pend_le_reports, + ¶ms->addr, params->addr_type); + if (p) + WRITE_ONCE(p->privacy_mode, HCI_NETWORK_PRIVACY); + rcu_read_unlock(); + done: if (hci_dev_test_flag(hdev, HCI_PRIVACY)) memcpy(cp.local_irk, hdev->irk, 16); @@ -2215,7 +2233,7 @@ static int hci_le_add_resolve_list_sync(struct hci_dev *hdev,
/* Set Device Privacy Mode. */ static int hci_le_set_privacy_mode_sync(struct hci_dev *hdev, - struct hci_conn_params *params) + struct conn_params *params) { struct hci_cp_le_set_privacy_mode cp; struct smp_irk *irk; @@ -2240,6 +2258,8 @@ static int hci_le_set_privacy_mode_sync(struct hci_dev *hdev, bacpy(&cp.bdaddr, &irk->bdaddr); cp.mode = HCI_DEVICE_PRIVACY;
+ /* Note: params->privacy_mode is not updated since it is a copy */ + return __hci_cmd_sync_status(hdev, HCI_OP_LE_SET_PRIVACY_MODE, sizeof(cp), &cp, HCI_CMD_TIMEOUT); } @@ -2249,7 +2269,7 @@ static int hci_le_set_privacy_mode_sync(struct hci_dev *hdev, * properly set the privacy mode. */ static int hci_le_add_accept_list_sync(struct hci_dev *hdev, - struct hci_conn_params *params, + struct conn_params *params, u8 *num_entries) { struct hci_cp_le_add_to_accept_list cp; @@ -2447,6 +2467,52 @@ struct sk_buff *hci_read_local_oob_data_sync(struct hci_dev *hdev, return __hci_cmd_sync_sk(hdev, opcode, 0, NULL, 0, HCI_CMD_TIMEOUT, sk); }
+static struct conn_params *conn_params_copy(struct list_head *list, size_t *n) +{ + struct hci_conn_params *params; + struct conn_params *p; + size_t i; + + rcu_read_lock(); + + i = 0; + list_for_each_entry_rcu(params, list, action) + ++i; + *n = i; + + rcu_read_unlock(); + + p = kvcalloc(*n, sizeof(struct conn_params), GFP_KERNEL); + if (!p) + return NULL; + + rcu_read_lock(); + + i = 0; + list_for_each_entry_rcu(params, list, action) { + /* Racing adds are handled in next scan update */ + if (i >= *n) + break; + + /* No hdev->lock, but: addr, addr_type are immutable. + * privacy_mode is only written by us or in + * hci_cc_le_set_privacy_mode that we wait for. + * We should be idempotent so MGMT updating flags + * while we are processing is OK. + */ + bacpy(&p[i].addr, ¶ms->addr); + p[i].addr_type = params->addr_type; + p[i].flags = READ_ONCE(params->flags); + p[i].privacy_mode = READ_ONCE(params->privacy_mode); + ++i; + } + + rcu_read_unlock(); + + *n = i; + return p; +} + /* Device must not be scanning when updating the accept list. * * Update is done using the following sequence: @@ -2466,11 +2532,12 @@ struct sk_buff *hci_read_local_oob_data_sync(struct hci_dev *hdev, */ static u8 hci_update_accept_list_sync(struct hci_dev *hdev) { - struct hci_conn_params *params; + struct conn_params *params; struct bdaddr_list *b, *t; u8 num_entries = 0; bool pend_conn, pend_report; u8 filter_policy; + size_t i, n; int err;
/* Pause advertising if resolving list can be used as controllers @@ -2504,6 +2571,7 @@ static u8 hci_update_accept_list_sync(struct hci_dev *hdev) if (hci_conn_hash_lookup_le(hdev, &b->bdaddr, b->bdaddr_type)) continue;
+ /* Pointers not dereferenced, no locks needed */ pend_conn = hci_pend_le_action_lookup(&hdev->pend_le_conns, &b->bdaddr, b->bdaddr_type); @@ -2532,23 +2600,50 @@ static u8 hci_update_accept_list_sync(struct hci_dev *hdev) * available accept list entries in the controller, then * just abort and return filer policy value to not use the * accept list. + * + * The list and params may be mutated while we wait for events, + * so make a copy and iterate it. */ - list_for_each_entry(params, &hdev->pend_le_conns, action) { - err = hci_le_add_accept_list_sync(hdev, params, &num_entries); - if (err) + + params = conn_params_copy(&hdev->pend_le_conns, &n); + if (!params) { + err = -ENOMEM; + goto done; + } + + for (i = 0; i < n; ++i) { + err = hci_le_add_accept_list_sync(hdev, ¶ms[i], + &num_entries); + if (err) { + kvfree(params); goto done; + } }
+ kvfree(params); + /* After adding all new pending connections, walk through * the list of pending reports and also add these to the * accept list if there is still space. Abort if space runs out. */ - list_for_each_entry(params, &hdev->pend_le_reports, action) { - err = hci_le_add_accept_list_sync(hdev, params, &num_entries); - if (err) + + params = conn_params_copy(&hdev->pend_le_reports, &n); + if (!params) { + err = -ENOMEM; + goto done; + } + + for (i = 0; i < n; ++i) { + err = hci_le_add_accept_list_sync(hdev, ¶ms[i], + &num_entries); + if (err) { + kvfree(params); goto done; + } }
+ kvfree(params); + /* Use the allowlist unless the following conditions are all true: * - We are not currently suspending * - There are 1 or more ADV monitors registered and it's not offloaded @@ -4839,12 +4934,12 @@ static void hci_pend_le_actions_clear(struct hci_dev *hdev) struct hci_conn_params *p;
list_for_each_entry(p, &hdev->le_conn_params, list) { + hci_pend_le_list_del_init(p); if (p->conn) { hci_conn_drop(p->conn); hci_conn_put(p->conn); p->conn = NULL; } - list_del_init(&p->action); }
BT_DBG("All LE pending actions cleared"); diff --git a/net/bluetooth/mgmt.c b/net/bluetooth/mgmt.c index f7b2d0971f240..1e07d0f289723 100644 --- a/net/bluetooth/mgmt.c +++ b/net/bluetooth/mgmt.c @@ -1297,15 +1297,15 @@ static void restart_le_actions(struct hci_dev *hdev) /* Needed for AUTO_OFF case where might not "really" * have been powered off. */ - list_del_init(&p->action); + hci_pend_le_list_del_init(p);
switch (p->auto_connect) { case HCI_AUTO_CONN_DIRECT: case HCI_AUTO_CONN_ALWAYS: - list_add(&p->action, &hdev->pend_le_conns); + hci_pend_le_list_add(p, &hdev->pend_le_conns); break; case HCI_AUTO_CONN_REPORT: - list_add(&p->action, &hdev->pend_le_reports); + hci_pend_le_list_add(p, &hdev->pend_le_reports); break; default: break; @@ -5169,7 +5169,7 @@ static int set_device_flags(struct sock *sk, struct hci_dev *hdev, void *data, goto unlock; }
- params->flags = current_flags; + WRITE_ONCE(params->flags, current_flags); status = MGMT_STATUS_SUCCESS;
/* Update passive scan if HCI_CONN_FLAG_DEVICE_PRIVACY @@ -7580,7 +7580,7 @@ static int hci_conn_params_set(struct hci_dev *hdev, bdaddr_t *addr, if (params->auto_connect == auto_connect) return 0;
- list_del_init(¶ms->action); + hci_pend_le_list_del_init(params);
switch (auto_connect) { case HCI_AUTO_CONN_DISABLED: @@ -7589,18 +7589,18 @@ static int hci_conn_params_set(struct hci_dev *hdev, bdaddr_t *addr, * connect to device, keep connecting. */ if (params->explicit_connect) - list_add(¶ms->action, &hdev->pend_le_conns); + hci_pend_le_list_add(params, &hdev->pend_le_conns); break; case HCI_AUTO_CONN_REPORT: if (params->explicit_connect) - list_add(¶ms->action, &hdev->pend_le_conns); + hci_pend_le_list_add(params, &hdev->pend_le_conns); else - list_add(¶ms->action, &hdev->pend_le_reports); + hci_pend_le_list_add(params, &hdev->pend_le_reports); break; case HCI_AUTO_CONN_DIRECT: case HCI_AUTO_CONN_ALWAYS: if (!is_connected(hdev, addr, addr_type)) - list_add(¶ms->action, &hdev->pend_le_conns); + hci_pend_le_list_add(params, &hdev->pend_le_conns); break; }
@@ -7823,9 +7823,7 @@ static int remove_device(struct sock *sk, struct hci_dev *hdev, goto unlock; }
- list_del(¶ms->action); - list_del(¶ms->list); - kfree(params); + hci_conn_params_free(params);
device_removed(sk, hdev, &cp->addr.bdaddr, cp->addr.type); } else { @@ -7856,9 +7854,7 @@ static int remove_device(struct sock *sk, struct hci_dev *hdev, p->auto_connect = HCI_AUTO_CONN_EXPLICIT; continue; } - list_del(&p->action); - list_del(&p->list); - kfree(p); + hci_conn_params_free(p); }
bt_dev_dbg(hdev, "All LE connection parameters were removed");
From: Pauli Virtanen pav@iki.fi
[ Upstream commit 7f7cfcb6f0825652973b780f248603e23f16ee90 ]
In hci_cs_disconnect, we do hci_conn_del even if disconnection failed.
ISO, L2CAP and SCO connections refer to the hci_conn without hci_conn_get, so disconn_cfm must be called so they can clean up their conn, otherwise use-after-free occurs.
ISO: ========================================================== iso_sock_connect:880: sk 00000000eabd6557 iso_connect_cis:356: 70:1a:b8:98:ff:a2 -> 28:3d:c2:4a:7e:da ... iso_conn_add:140: hcon 000000001696f1fd conn 00000000b6251073 hci_dev_put:1487: hci0 orig refcnt 17 __iso_chan_add:214: conn 00000000b6251073 iso_sock_clear_timer:117: sock 00000000eabd6557 state 3 ... hci_rx_work:4085: hci0 Event packet hci_event_packet:7601: hci0: event 0x0f hci_cmd_status_evt:4346: hci0: opcode 0x0406 hci_cs_disconnect:2760: hci0: status 0x0c hci_sent_cmd_data:3107: hci0 opcode 0x0406 hci_conn_del:1151: hci0 hcon 000000001696f1fd handle 2560 hci_conn_unlink:1102: hci0: hcon 000000001696f1fd hci_conn_drop:1451: hcon 00000000d8521aaf orig refcnt 2 hci_chan_list_flush:2780: hcon 000000001696f1fd hci_dev_put:1487: hci0 orig refcnt 21 hci_dev_put:1487: hci0 orig refcnt 20 hci_req_cmd_complete:3978: opcode 0x0406 status 0x0c ... <no iso_* activity on sk/conn> ... iso_sock_sendmsg:1098: sock 00000000dea5e2e0, sk 00000000eabd6557 BUG: kernel NULL pointer dereference, address: 0000000000000668 PGD 0 P4D 0 Oops: 0000 [#1] PREEMPT SMP PTI Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-1.fc38 04/01/2014 RIP: 0010:iso_sock_sendmsg (net/bluetooth/iso.c:1112) bluetooth ==========================================================
L2CAP: ================================================================== hci_cmd_status_evt:4359: hci0: opcode 0x0406 hci_cs_disconnect:2760: hci0: status 0x0c hci_sent_cmd_data:3085: hci0 opcode 0x0406 hci_conn_del:1151: hci0 hcon ffff88800c999000 handle 3585 hci_conn_unlink:1102: hci0: hcon ffff88800c999000 hci_chan_list_flush:2780: hcon ffff88800c999000 hci_chan_del:2761: hci0 hcon ffff88800c999000 chan ffff888018ddd280 ... BUG: KASAN: slab-use-after-free in hci_send_acl+0x2d/0x540 [bluetooth] Read of size 8 at addr ffff888018ddd298 by task bluetoothd/1175
CPU: 0 PID: 1175 Comm: bluetoothd Tainted: G E 6.4.0-rc4+ #2 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-1.fc38 04/01/2014 Call Trace: <TASK> dump_stack_lvl+0x5b/0x90 print_report+0xcf/0x670 ? __virt_addr_valid+0xf8/0x180 ? hci_send_acl+0x2d/0x540 [bluetooth] kasan_report+0xa8/0xe0 ? hci_send_acl+0x2d/0x540 [bluetooth] hci_send_acl+0x2d/0x540 [bluetooth] ? __pfx___lock_acquire+0x10/0x10 l2cap_chan_send+0x1fd/0x1300 [bluetooth] ? l2cap_sock_sendmsg+0xf2/0x170 [bluetooth] ? __pfx_l2cap_chan_send+0x10/0x10 [bluetooth] ? lock_release+0x1d5/0x3c0 ? mark_held_locks+0x1a/0x90 l2cap_sock_sendmsg+0x100/0x170 [bluetooth] sock_write_iter+0x275/0x280 ? __pfx_sock_write_iter+0x10/0x10 ? __pfx___lock_acquire+0x10/0x10 do_iter_readv_writev+0x176/0x220 ? __pfx_do_iter_readv_writev+0x10/0x10 ? find_held_lock+0x83/0xa0 ? selinux_file_permission+0x13e/0x210 do_iter_write+0xda/0x340 vfs_writev+0x1b4/0x400 ? __pfx_vfs_writev+0x10/0x10 ? __seccomp_filter+0x112/0x750 ? populate_seccomp_data+0x182/0x220 ? __fget_light+0xdf/0x100 ? do_writev+0x19d/0x210 do_writev+0x19d/0x210 ? __pfx_do_writev+0x10/0x10 ? mark_held_locks+0x1a/0x90 do_syscall_64+0x60/0x90 ? lockdep_hardirqs_on_prepare+0x149/0x210 ? do_syscall_64+0x6c/0x90 ? lockdep_hardirqs_on_prepare+0x149/0x210 entry_SYSCALL_64_after_hwframe+0x72/0xdc RIP: 0033:0x7ff45cb23e64 Code: 15 d1 1f 0d 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b8 0f 1f 00 f3 0f 1e fa 80 3d 9d a7 0d 00 00 74 13 b8 14 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 54 c3 0f 1f 00 48 83 ec 28 89 54 24 1c 48 89 RSP: 002b:00007fff21ae09b8 EFLAGS: 00000202 ORIG_RAX: 0000000000000014 RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007ff45cb23e64 RDX: 0000000000000001 RSI: 00007fff21ae0aa0 RDI: 0000000000000017 RBP: 00007fff21ae0aa0 R08: 000000000095a8a0 R09: 0000607000053f40 R10: 0000000000000001 R11: 0000000000000202 R12: 00007fff21ae0ac0 R13: 00000fffe435c150 R14: 00007fff21ae0a80 R15: 000060f000000040 </TASK>
Allocated by task 771: kasan_save_stack+0x33/0x60 kasan_set_track+0x25/0x30 __kasan_kmalloc+0xaa/0xb0 hci_chan_create+0x67/0x1b0 [bluetooth] l2cap_conn_add.part.0+0x17/0x590 [bluetooth] l2cap_connect_cfm+0x266/0x6b0 [bluetooth] hci_le_remote_feat_complete_evt+0x167/0x310 [bluetooth] hci_event_packet+0x38d/0x800 [bluetooth] hci_rx_work+0x287/0xb20 [bluetooth] process_one_work+0x4f7/0x970 worker_thread+0x8f/0x620 kthread+0x17f/0x1c0 ret_from_fork+0x2c/0x50
Freed by task 771: kasan_save_stack+0x33/0x60 kasan_set_track+0x25/0x30 kasan_save_free_info+0x2e/0x50 ____kasan_slab_free+0x169/0x1c0 slab_free_freelist_hook+0x9e/0x1c0 __kmem_cache_free+0xc0/0x310 hci_chan_list_flush+0x46/0x90 [bluetooth] hci_conn_cleanup+0x7d/0x330 [bluetooth] hci_cs_disconnect+0x35d/0x530 [bluetooth] hci_cmd_status_evt+0xef/0x2b0 [bluetooth] hci_event_packet+0x38d/0x800 [bluetooth] hci_rx_work+0x287/0xb20 [bluetooth] process_one_work+0x4f7/0x970 worker_thread+0x8f/0x620 kthread+0x17f/0x1c0 ret_from_fork+0x2c/0x50 ==================================================================
Fixes: b8d290525e39 ("Bluetooth: clean up connection in hci_cs_disconnect") Signed-off-by: Pauli Virtanen pav@iki.fi Signed-off-by: Luiz Augusto von Dentz luiz.von.dentz@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/bluetooth/hci_event.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/net/bluetooth/hci_event.c b/net/bluetooth/hci_event.c index 72b6d189d3de2..cb0b5fe7a6f8c 100644 --- a/net/bluetooth/hci_event.c +++ b/net/bluetooth/hci_event.c @@ -2784,6 +2784,9 @@ static void hci_cs_disconnect(struct hci_dev *hdev, u8 status) hci_enable_advertising(hdev); }
+ /* Inform sockets conn is gone before we delete it */ + hci_disconn_cfm(conn, HCI_ERROR_UNSPECIFIED); + goto done; }
From: Pauli Virtanen pav@iki.fi
[ Upstream commit d40ae85ee62e3666f45bc61864b22121346f88ef ]
sk->sk_state indicates whether iso_pi(sk)->conn is valid. Operations that check/update sk_state and access conn should hold lock_sock, otherwise they can race.
The order of taking locks is hci_dev_lock > lock_sock > iso_conn_lock, which is how it is in connect/disconnect_cfm -> iso_conn_del -> iso_chan_del.
Fix locking in iso_connect_cis/bis and sendmsg/recvmsg to take lock_sock around updating sk_state and conn.
iso_conn_del must not occur during iso_connect_cis/bis, as it frees the iso_conn. Hold hdev->lock longer to prevent that.
This should not reintroduce the issue fixed in commit 241f51931c35 ("Bluetooth: ISO: Avoid circular locking dependency"), since the we acquire locks in order. We retain the fix in iso_sock_connect to release lock_sock before iso_connect_* acquires hdev->lock.
Similarly for commit 6a5ad251b7cd ("Bluetooth: ISO: Fix possible circular locking dependency"). We retain the fix in iso_conn_ready to not acquire iso_conn_lock before lock_sock.
iso_conn_add shall return iso_conn with valid hcon. Make it so also when reusing an old CIS connection waiting for disconnect timeout (see __iso_sock_close where conn->hcon is set to NULL).
Trace with iso_conn_del after iso_chan_add in iso_connect_cis: =============================================================== iso_sock_create:771: sock 00000000be9b69b7 iso_sock_init:693: sk 000000004dff667e iso_sock_bind:827: sk 000000004dff667e 70:1a:b8:98:ff:a2 type 1 iso_sock_setsockopt:1289: sk 000000004dff667e iso_sock_setsockopt:1289: sk 000000004dff667e iso_sock_setsockopt:1289: sk 000000004dff667e iso_sock_connect:875: sk 000000004dff667e iso_connect_cis:353: 70:1a:b8:98:ff:a2 -> 28:3d:c2:4a:7e:da hci_get_route:1199: 70:1a:b8:98:ff:a2 -> 28:3d:c2:4a:7e:da hci_conn_add:1005: hci0 dst 28:3d:c2:4a:7e:da iso_conn_add:140: hcon 000000007b65d182 conn 00000000daf8625e __iso_chan_add:214: conn 00000000daf8625e iso_connect_cfm:1700: hcon 000000007b65d182 bdaddr 28:3d:c2:4a:7e:da status 12 iso_conn_del:187: hcon 000000007b65d182 conn 00000000daf8625e, err 16 iso_sock_clear_timer:117: sock 000000004dff667e state 3 <Note: sk_state is BT_BOUND (3), so iso_connect_cis is still running at this point> iso_chan_del:153: sk 000000004dff667e, conn 00000000daf8625e, err 16 hci_conn_del:1151: hci0 hcon 000000007b65d182 handle 65535 hci_conn_unlink:1102: hci0: hcon 000000007b65d182 hci_chan_list_flush:2780: hcon 000000007b65d182 iso_sock_getsockopt:1376: sk 000000004dff667e iso_sock_getname:1070: sock 00000000be9b69b7, sk 000000004dff667e iso_sock_getname:1070: sock 00000000be9b69b7, sk 000000004dff667e iso_sock_getsockopt:1376: sk 000000004dff667e iso_sock_getname:1070: sock 00000000be9b69b7, sk 000000004dff667e iso_sock_getname:1070: sock 00000000be9b69b7, sk 000000004dff667e iso_sock_shutdown:1434: sock 00000000be9b69b7, sk 000000004dff667e, how 1 __iso_sock_close:632: sk 000000004dff667e state 5 socket 00000000be9b69b7 <Note: sk_state is BT_CONNECT (5), even though iso_chan_del sets BT_CLOSED (6). Only iso_connect_cis sets it to BT_CONNECT, so it must be that iso_chan_del occurred between iso_chan_add and end of iso_connect_cis.> BUG: kernel NULL pointer dereference, address: 0000000000000000 PGD 8000000006467067 P4D 8000000006467067 PUD 3f5f067 PMD 0 Oops: 0000 [#1] PREEMPT SMP PTI Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-1.fc38 04/01/2014 RIP: 0010:__iso_sock_close (net/bluetooth/iso.c:664) bluetooth ===============================================================
Trace with iso_conn_del before iso_chan_add in iso_connect_cis: =============================================================== iso_connect_cis:356: 70:1a:b8:98:ff:a2 -> 28:3d:c2:4a:7e:da ... iso_conn_add:140: hcon 0000000093bc551f conn 00000000768ae504 hci_dev_put:1487: hci0 orig refcnt 21 hci_event_packet:7607: hci0: event 0x0e hci_cmd_complete_evt:4231: hci0: opcode 0x2062 hci_cc_le_set_cig_params:3846: hci0: status 0x07 hci_sent_cmd_data:3107: hci0 opcode 0x2062 iso_connect_cfm:1703: hcon 0000000093bc551f bdaddr 28:3d:c2:4a:7e:da status 7 iso_conn_del:187: hcon 0000000093bc551f conn 00000000768ae504, err 12 hci_conn_del:1151: hci0 hcon 0000000093bc551f handle 65535 hci_conn_unlink:1102: hci0: hcon 0000000093bc551f hci_chan_list_flush:2780: hcon 0000000093bc551f __iso_chan_add:214: conn 00000000768ae504 <Note: this conn was already freed in iso_conn_del above> iso_sock_clear_timer:117: sock 0000000098323f95 state 3 general protection fault, probably for non-canonical address 0x30b29c630930aec8: 0000 [#1] PREEMPT SMP PTI CPU: 1 PID: 1920 Comm: bluetoothd Tainted: G E 6.3.0-rc7+ #4 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-1.fc38 04/01/2014 RIP: 0010:detach_if_pending+0x28/0xd0 Code: 90 90 0f 1f 44 00 00 48 8b 47 08 48 85 c0 0f 84 ad 00 00 00 55 89 d5 53 48 83 3f 00 48 89 fb 74 7d 66 90 48 8b 03 48 8b 53 08 <> RSP: 0018:ffffb90841a67d08 EFLAGS: 00010007 RAX: 0000000000000000 RBX: ffff9141bd5061b8 RCX: 0000000000000000 RDX: 30b29c630930aec8 RSI: ffff9141fdd21e80 RDI: ffff9141bd5061b8 RBP: 0000000000000001 R08: 0000000000000000 R09: ffffb90841a67b88 R10: 0000000000000003 R11: ffffffff8613f558 R12: ffff9141fdd21e80 R13: 0000000000000000 R14: ffff9141b5976010 R15: ffff914185755338 FS: 00007f45768bd840(0000) GS:ffff9141fdd00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000619000424074 CR3: 0000000009f5e005 CR4: 0000000000170ee0 Call Trace: <TASK> timer_delete+0x48/0x80 try_to_grab_pending+0xdf/0x170 __cancel_work+0x37/0xb0 iso_connect_cis+0x141/0x400 [bluetooth] ===============================================================
Trace with NULL conn->hcon in state BT_CONNECT: =============================================================== __iso_sock_close:619: sk 00000000f7c71fc5 state 1 socket 00000000d90c5fe5 ... __iso_sock_close:619: sk 00000000f7c71fc5 state 8 socket 00000000d90c5fe5 iso_chan_del:153: sk 00000000f7c71fc5, conn 0000000022c03a7e, err 104 ... iso_sock_connect:862: sk 00000000129b56c3 iso_connect_cis:348: 70:1a:b8:98:ff:a2 -> 28:3d:c2:4a:7d:2a hci_get_route:1199: 70:1a:b8:98:ff:a2 -> 28:3d:c2:4a:7d:2a hci_dev_hold:1495: hci0 orig refcnt 19 __iso_chan_add:214: conn 0000000022c03a7e <Note: reusing old conn> iso_sock_clear_timer:117: sock 00000000129b56c3 state 3 ... iso_sock_ready:1485: sk 00000000129b56c3 ... iso_sock_sendmsg:1077: sock 00000000e5013966, sk 00000000129b56c3 BUG: kernel NULL pointer dereference, address: 00000000000006a8 PGD 0 P4D 0 Oops: 0000 [#1] PREEMPT SMP PTI CPU: 1 PID: 1403 Comm: wireplumber Tainted: G E 6.3.0-rc7+ #4 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-1.fc38 04/01/2014 RIP: 0010:iso_sock_sendmsg+0x63/0x2a0 [bluetooth] ===============================================================
Fixes: 241f51931c35 ("Bluetooth: ISO: Avoid circular locking dependency") Fixes: 6a5ad251b7cd ("Bluetooth: ISO: Fix possible circular locking dependency") Signed-off-by: Pauli Virtanen pav@iki.fi Signed-off-by: Luiz Augusto von Dentz luiz.von.dentz@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/bluetooth/iso.c | 53 ++++++++++++++++++++++++++------------------- 1 file changed, 31 insertions(+), 22 deletions(-)
diff --git a/net/bluetooth/iso.c b/net/bluetooth/iso.c index 34d55a85d8f6f..94d5bc104fede 100644 --- a/net/bluetooth/iso.c +++ b/net/bluetooth/iso.c @@ -123,8 +123,11 @@ static struct iso_conn *iso_conn_add(struct hci_conn *hcon) { struct iso_conn *conn = hcon->iso_data;
- if (conn) + if (conn) { + if (!conn->hcon) + conn->hcon = hcon; return conn; + }
conn = kzalloc(sizeof(*conn), GFP_KERNEL); if (!conn) @@ -300,14 +303,13 @@ static int iso_connect_bis(struct sock *sk) goto unlock; }
- hci_dev_unlock(hdev); - hci_dev_put(hdev); + lock_sock(sk);
err = iso_chan_add(conn, sk, NULL); - if (err) - return err; - - lock_sock(sk); + if (err) { + release_sock(sk); + goto unlock; + }
/* Update source addr of the socket */ bacpy(&iso_pi(sk)->src, &hcon->src); @@ -321,7 +323,6 @@ static int iso_connect_bis(struct sock *sk) }
release_sock(sk); - return err;
unlock: hci_dev_unlock(hdev); @@ -389,14 +390,13 @@ static int iso_connect_cis(struct sock *sk) goto unlock; }
- hci_dev_unlock(hdev); - hci_dev_put(hdev); + lock_sock(sk);
err = iso_chan_add(conn, sk, NULL); - if (err) - return err; - - lock_sock(sk); + if (err) { + release_sock(sk); + goto unlock; + }
/* Update source addr of the socket */ bacpy(&iso_pi(sk)->src, &hcon->src); @@ -413,7 +413,6 @@ static int iso_connect_cis(struct sock *sk) }
release_sock(sk); - return err;
unlock: hci_dev_unlock(hdev); @@ -1072,8 +1071,8 @@ static int iso_sock_sendmsg(struct socket *sock, struct msghdr *msg, size_t len) { struct sock *sk = sock->sk; - struct iso_conn *conn = iso_pi(sk)->conn; struct sk_buff *skb, **frag; + size_t mtu; int err;
BT_DBG("sock %p, sk %p", sock, sk); @@ -1085,11 +1084,18 @@ static int iso_sock_sendmsg(struct socket *sock, struct msghdr *msg, if (msg->msg_flags & MSG_OOB) return -EOPNOTSUPP;
- if (sk->sk_state != BT_CONNECTED) + lock_sock(sk); + + if (sk->sk_state != BT_CONNECTED) { + release_sock(sk); return -ENOTCONN; + } + + mtu = iso_pi(sk)->conn->hcon->hdev->iso_mtu; + + release_sock(sk);
- skb = bt_skb_sendmsg(sk, msg, len, conn->hcon->hdev->iso_mtu, - HCI_ISO_DATA_HDR_SIZE, 0); + skb = bt_skb_sendmsg(sk, msg, len, mtu, HCI_ISO_DATA_HDR_SIZE, 0); if (IS_ERR(skb)) return PTR_ERR(skb);
@@ -1102,8 +1108,7 @@ static int iso_sock_sendmsg(struct socket *sock, struct msghdr *msg, while (len) { struct sk_buff *tmp;
- tmp = bt_skb_sendmsg(sk, msg, len, conn->hcon->hdev->iso_mtu, - 0, 0); + tmp = bt_skb_sendmsg(sk, msg, len, mtu, 0, 0); if (IS_ERR(tmp)) { kfree_skb(skb); return PTR_ERR(tmp); @@ -1158,15 +1163,19 @@ static int iso_sock_recvmsg(struct socket *sock, struct msghdr *msg, BT_DBG("sk %p", sk);
if (test_and_clear_bit(BT_SK_DEFER_SETUP, &bt_sk(sk)->flags)) { + lock_sock(sk); switch (sk->sk_state) { case BT_CONNECT2: - lock_sock(sk); iso_conn_defer_accept(pi->conn->hcon); sk->sk_state = BT_CONFIG; release_sock(sk); return 0; case BT_CONNECT: + release_sock(sk); return iso_connect_cis(sk); + default: + release_sock(sk); + break; } }
From: Douglas Anderson dianders@chromium.org
[ Upstream commit de6dfcefd107667ce2dbedf4d9337f5ed557a4a1 ]
KASAN reports that there's a use-after-free in hci_remove_adv_monitor(). Trawling through the disassembly, you can see that the complaint is from the access in bt_dev_dbg() under the HCI_ADV_MONITOR_EXT_MSFT case. The problem case happens because msft_remove_monitor() can end up freeing the monitor structure. Specifically: hci_remove_adv_monitor() -> msft_remove_monitor() -> msft_remove_monitor_sync() -> msft_le_cancel_monitor_advertisement_cb() -> hci_free_adv_monitor()
Let's fix the problem by just stashing the relevant data when it's still valid.
Fixes: 7cf5c2978f23 ("Bluetooth: hci_sync: Refactor remove Adv Monitor") Signed-off-by: Douglas Anderson dianders@chromium.org Signed-off-by: Luiz Augusto von Dentz luiz.von.dentz@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/bluetooth/hci_core.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/net/bluetooth/hci_core.c b/net/bluetooth/hci_core.c index b421e196f60c3..1ec83985f1ab0 100644 --- a/net/bluetooth/hci_core.c +++ b/net/bluetooth/hci_core.c @@ -1972,6 +1972,7 @@ static int hci_remove_adv_monitor(struct hci_dev *hdev, struct adv_monitor *monitor) { int status = 0; + int handle;
switch (hci_get_adv_monitor_offload_ext(hdev)) { case HCI_ADV_MONITOR_EXT_NONE: /* also goes here when powered off */ @@ -1980,9 +1981,10 @@ static int hci_remove_adv_monitor(struct hci_dev *hdev, goto free_monitor;
case HCI_ADV_MONITOR_EXT_MSFT: + handle = monitor->handle; status = msft_remove_monitor(hdev, monitor); bt_dev_dbg(hdev, "%s remove monitor %d msft status %d", - hdev->name, monitor->handle, status); + hdev->name, handle, status); break; }
From: Siddh Raman Pant code@siddh.me
[ Upstream commit b4066eb04bb67e7ff66e5aaab0db4a753f37eaad ]
hci_connect_sco currently returns NULL when there is no link (i.e. when hci_conn_link() returns NULL).
sco_connect() expects an ERR_PTR in case of any error (see line 266 in sco.c). Thus, hcon set as NULL passes through to sco_conn_add(), which tries to get hcon->hdev, resulting in dereferencing a NULL pointer as reported by syzkaller.
The same issue exists for iso_connect_cis() calling hci_connect_cis().
Thus, make hci_connect_sco() and hci_connect_cis() return ERR_PTR instead of NULL.
Reported-and-tested-by: syzbot+37acd5d80d00d609d233@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=37acd5d80d00d609d233 Fixes: 06149746e720 ("Bluetooth: hci_conn: Add support for linking multiple hcon") Signed-off-by: Siddh Raman Pant code@siddh.me Signed-off-by: Luiz Augusto von Dentz luiz.von.dentz@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/bluetooth/hci_conn.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/net/bluetooth/hci_conn.c b/net/bluetooth/hci_conn.c index 7b0c74ef93296..31c115b225e7e 100644 --- a/net/bluetooth/hci_conn.c +++ b/net/bluetooth/hci_conn.c @@ -1684,7 +1684,7 @@ struct hci_conn *hci_connect_sco(struct hci_dev *hdev, int type, bdaddr_t *dst, if (!link) { hci_conn_drop(acl); hci_conn_drop(sco); - return NULL; + return ERR_PTR(-ENOLINK); }
sco->setting = setting; @@ -2256,7 +2256,7 @@ struct hci_conn *hci_connect_cis(struct hci_dev *hdev, bdaddr_t *dst, if (!link) { hci_conn_drop(le); hci_conn_drop(cis); - return NULL; + return ERR_PTR(-ENOLINK); }
/* If LE is already connected and CIS handle is already set proceed to
From: Pauli Virtanen pav@iki.fi
[ Upstream commit 3dcaa192ac2159193bc6ab57bc5369dcb84edd8e ]
Operations that check/update sk_state and access conn should hold lock_sock, otherwise they can race.
The order of taking locks is hci_dev_lock > lock_sock > sco_conn_lock, which is how it is in connect/disconnect_cfm -> sco_conn_del -> sco_chan_del.
Fix locking in sco_connect to take lock_sock around updating sk_state and conn.
sco_conn_del must not occur during sco_connect, as it frees the sco_conn. Hold hdev->lock longer to prevent that.
sco_conn_add shall return sco_conn with valid hcon. Make it so also when reusing an old SCO connection waiting for disconnect timeout (see __sco_sock_close where conn->hcon is set to NULL).
This should not reintroduce the issue fixed in the earlier commit 9a8ec9e8ebb5 ("Bluetooth: SCO: Fix possible circular locking dependency on sco_connect_cfm"), the relevant fix of releasing lock_sock in sco_sock_connect before acquiring hdev->lock is retained.
These changes mirror similar fixes earlier in ISO sockets.
Fixes: 9a8ec9e8ebb5 ("Bluetooth: SCO: Fix possible circular locking dependency on sco_connect_cfm") Signed-off-by: Pauli Virtanen pav@iki.fi Signed-off-by: Luiz Augusto von Dentz luiz.von.dentz@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/bluetooth/sco.c | 23 ++++++++++++----------- 1 file changed, 12 insertions(+), 11 deletions(-)
diff --git a/net/bluetooth/sco.c b/net/bluetooth/sco.c index cd1a27ac555d0..7762604ddfc05 100644 --- a/net/bluetooth/sco.c +++ b/net/bluetooth/sco.c @@ -126,8 +126,11 @@ static struct sco_conn *sco_conn_add(struct hci_conn *hcon) struct hci_dev *hdev = hcon->hdev; struct sco_conn *conn = hcon->sco_data;
- if (conn) + if (conn) { + if (!conn->hcon) + conn->hcon = hcon; return conn; + }
conn = kzalloc(sizeof(struct sco_conn), GFP_KERNEL); if (!conn) @@ -268,21 +271,21 @@ static int sco_connect(struct sock *sk) goto unlock; }
- hci_dev_unlock(hdev); - hci_dev_put(hdev); - conn = sco_conn_add(hcon); if (!conn) { hci_conn_drop(hcon); - return -ENOMEM; + err = -ENOMEM; + goto unlock; }
- err = sco_chan_add(conn, sk, NULL); - if (err) - return err; - lock_sock(sk);
+ err = sco_chan_add(conn, sk, NULL); + if (err) { + release_sock(sk); + goto unlock; + } + /* Update source addr of the socket */ bacpy(&sco_pi(sk)->src, &hcon->src);
@@ -296,8 +299,6 @@ static int sco_connect(struct sock *sk)
release_sock(sk);
- return err; - unlock: hci_dev_unlock(hdev); hci_dev_put(hdev);
From: Tomasz Moń tomasz.mon@nordicsemi.no
[ Upstream commit 95b7015433053cd5f648ad2a7b8f43b2c99c949a ]
Commit c13380a55522 ("Bluetooth: btusb: Do not require hardcoded interface numbers") inadvertedly broke bluetooth on Intel Macbook 2014. The intention was to keep behavior intact when BTUSB_IFNUM_2 is set and otherwise allow any interface numbers. The problem is that the new logic condition omits the case where bInterfaceNumber is 0.
Fix BTUSB_IFNUM_2 handling by allowing both interface number 0 and 2 when the flag is set.
Fixes: c13380a55522 ("Bluetooth: btusb: Do not require hardcoded interface numbers") Reported-by: John Holland johnbholland@icloud.com Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217651 Signed-off-by: Tomasz Moń tomasz.mon@nordicsemi.no Tested-by: John Hollandjohnbholland@icloud.com Signed-off-by: Luiz Augusto von Dentz luiz.von.dentz@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/bluetooth/btusb.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/bluetooth/btusb.c b/drivers/bluetooth/btusb.c index 2a8e2bb038f58..50e23762ec5e9 100644 --- a/drivers/bluetooth/btusb.c +++ b/drivers/bluetooth/btusb.c @@ -4099,6 +4099,7 @@ static int btusb_probe(struct usb_interface *intf, BT_DBG("intf %p id %p", intf, id);
if ((id->driver_info & BTUSB_IFNUM_2) && + (intf->cur_altsetting->desc.bInterfaceNumber != 0) && (intf->cur_altsetting->desc.bInterfaceNumber != 2)) return -ENODEV;
From: Eric Dumazet edumazet@google.com
[ Upstream commit 348b81b68b13ebd489a3e6a46aa1c384c731c919 ]
do_tcp_getsockopt() reads tp->tcp_tx_delay while another cpu might change its value.
Fixes: a842fe1425cb ("tcp: add optional per socket transmit delay") Signed-off-by: Eric Dumazet edumazet@google.com Link: https://lore.kernel.org/r/20230719212857.3943972-2-edumazet@google.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/ipv4/tcp.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index 8d20d9221238c..c0e0add372f75 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -3783,7 +3783,7 @@ int do_tcp_setsockopt(struct sock *sk, int level, int optname, case TCP_TX_DELAY: if (val) tcp_enable_tx_delay(); - tp->tcp_tx_delay = val; + WRITE_ONCE(tp->tcp_tx_delay, val); break; default: err = -ENOPROTOOPT; @@ -4263,7 +4263,7 @@ int do_tcp_getsockopt(struct sock *sk, int level, break;
case TCP_TX_DELAY: - val = tp->tcp_tx_delay; + val = READ_ONCE(tp->tcp_tx_delay); break;
case TCP_TIMESTAMP:
From: Eric Dumazet edumazet@google.com
[ Upstream commit dd23c9f1e8d5c1d2e3d29393412385ccb9c7a948 ]
do_tcp_getsockopt() reads tp->tsoffset while another cpu might change its value.
Fixes: 93be6ce0e91b ("tcp: set and get per-socket timestamp") Signed-off-by: Eric Dumazet edumazet@google.com Link: https://lore.kernel.org/r/20230719212857.3943972-3-edumazet@google.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/ipv4/tcp.c | 4 ++-- net/ipv4/tcp_ipv4.c | 5 +++-- 2 files changed, 5 insertions(+), 4 deletions(-)
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index c0e0add372f75..15b1191411ec3 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -3765,7 +3765,7 @@ int do_tcp_setsockopt(struct sock *sk, int level, int optname, if (!tp->repair) err = -EPERM; else - tp->tsoffset = val - tcp_time_stamp_raw(); + WRITE_ONCE(tp->tsoffset, val - tcp_time_stamp_raw()); break; case TCP_REPAIR_WINDOW: err = tcp_repair_set_window(tp, optval, optlen); @@ -4267,7 +4267,7 @@ int do_tcp_getsockopt(struct sock *sk, int level, break;
case TCP_TIMESTAMP: - val = tcp_time_stamp_raw() + tp->tsoffset; + val = tcp_time_stamp_raw() + READ_ONCE(tp->tsoffset); break; case TCP_NOTSENT_LOWAT: val = tp->notsent_lowat; diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c index 5d3e49ceb6917..f37d13ee7b4cc 100644 --- a/net/ipv4/tcp_ipv4.c +++ b/net/ipv4/tcp_ipv4.c @@ -307,8 +307,9 @@ int tcp_v4_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len) inet->inet_daddr, inet->inet_sport, usin->sin_port)); - tp->tsoffset = secure_tcp_ts_off(net, inet->inet_saddr, - inet->inet_daddr); + WRITE_ONCE(tp->tsoffset, + secure_tcp_ts_off(net, inet->inet_saddr, + inet->inet_daddr)); }
inet->inet_id = get_random_u16();
From: Eric Dumazet edumazet@google.com
[ Upstream commit 4164245c76ff906c9086758e1c3f87082a7f5ef5 ]
do_tcp_getsockopt() reads tp->keepalive_time while another cpu might change its value.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Eric Dumazet edumazet@google.com Link: https://lore.kernel.org/r/20230719212857.3943972-4-edumazet@google.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- include/net/tcp.h | 7 +++++-- net/ipv4/tcp.c | 3 ++- 2 files changed, 7 insertions(+), 3 deletions(-)
diff --git a/include/net/tcp.h b/include/net/tcp.h index 5066e4586cf09..9a12e8c09ea04 100644 --- a/include/net/tcp.h +++ b/include/net/tcp.h @@ -1522,9 +1522,12 @@ static inline int keepalive_intvl_when(const struct tcp_sock *tp) static inline int keepalive_time_when(const struct tcp_sock *tp) { struct net *net = sock_net((struct sock *)tp); + int val;
- return tp->keepalive_time ? : - READ_ONCE(net->ipv4.sysctl_tcp_keepalive_time); + /* Paired with WRITE_ONCE() in tcp_sock_set_keepidle_locked() */ + val = READ_ONCE(tp->keepalive_time); + + return val ? : READ_ONCE(net->ipv4.sysctl_tcp_keepalive_time); }
static inline int keepalive_probes(const struct tcp_sock *tp) diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index 15b1191411ec3..c3b743093d482 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -3421,7 +3421,8 @@ int tcp_sock_set_keepidle_locked(struct sock *sk, int val) if (val < 1 || val > MAX_TCP_KEEPIDLE) return -EINVAL;
- tp->keepalive_time = val * HZ; + /* Paired with WRITE_ONCE() in keepalive_time_when() */ + WRITE_ONCE(tp->keepalive_time, val * HZ); if (sock_flag(sk, SOCK_KEEPOPEN) && !((1 << sk->sk_state) & (TCPF_CLOSE | TCPF_LISTEN))) { u32 elapsed = keepalive_time_elapsed(tp);
From: Eric Dumazet edumazet@google.com
[ Upstream commit 5ecf9d4f52ff2f1d4d44c9b68bc75688e82f13b4 ]
do_tcp_getsockopt() reads tp->keepalive_intvl while another cpu might change its value.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Eric Dumazet edumazet@google.com Link: https://lore.kernel.org/r/20230719212857.3943972-5-edumazet@google.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- include/net/tcp.h | 9 +++++++-- net/ipv4/tcp.c | 4 ++-- 2 files changed, 9 insertions(+), 4 deletions(-)
diff --git a/include/net/tcp.h b/include/net/tcp.h index 9a12e8c09ea04..45d50a40795da 100644 --- a/include/net/tcp.h +++ b/include/net/tcp.h @@ -1514,9 +1514,14 @@ void tcp_leave_memory_pressure(struct sock *sk); static inline int keepalive_intvl_when(const struct tcp_sock *tp) { struct net *net = sock_net((struct sock *)tp); + int val; + + /* Paired with WRITE_ONCE() in tcp_sock_set_keepintvl() + * and do_tcp_setsockopt(). + */ + val = READ_ONCE(tp->keepalive_intvl);
- return tp->keepalive_intvl ? : - READ_ONCE(net->ipv4.sysctl_tcp_keepalive_intvl); + return val ? : READ_ONCE(net->ipv4.sysctl_tcp_keepalive_intvl); }
static inline int keepalive_time_when(const struct tcp_sock *tp) diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index c3b743093d482..514817119bd4d 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -3454,7 +3454,7 @@ int tcp_sock_set_keepintvl(struct sock *sk, int val) return -EINVAL;
lock_sock(sk); - tcp_sk(sk)->keepalive_intvl = val * HZ; + WRITE_ONCE(tcp_sk(sk)->keepalive_intvl, val * HZ); release_sock(sk); return 0; } @@ -3668,7 +3668,7 @@ int do_tcp_setsockopt(struct sock *sk, int level, int optname, if (val < 1 || val > MAX_TCP_KEEPINTVL) err = -EINVAL; else - tp->keepalive_intvl = val * HZ; + WRITE_ONCE(tp->keepalive_intvl, val * HZ); break; case TCP_KEEPCNT: if (val < 1 || val > MAX_TCP_KEEPCNT)
From: Eric Dumazet edumazet@google.com
[ Upstream commit 6e5e1de616bf5f3df1769abc9292191dfad9110a ]
do_tcp_getsockopt() reads tp->keepalive_probes while another cpu might change its value.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Eric Dumazet edumazet@google.com Link: https://lore.kernel.org/r/20230719212857.3943972-6-edumazet@google.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- include/net/tcp.h | 9 +++++++-- net/ipv4/tcp.c | 5 +++-- 2 files changed, 10 insertions(+), 4 deletions(-)
diff --git a/include/net/tcp.h b/include/net/tcp.h index 45d50a40795da..f5c20afab6286 100644 --- a/include/net/tcp.h +++ b/include/net/tcp.h @@ -1538,9 +1538,14 @@ static inline int keepalive_time_when(const struct tcp_sock *tp) static inline int keepalive_probes(const struct tcp_sock *tp) { struct net *net = sock_net((struct sock *)tp); + int val; + + /* Paired with WRITE_ONCE() in tcp_sock_set_keepcnt() + * and do_tcp_setsockopt(). + */ + val = READ_ONCE(tp->keepalive_probes);
- return tp->keepalive_probes ? : - READ_ONCE(net->ipv4.sysctl_tcp_keepalive_probes); + return val ? : READ_ONCE(net->ipv4.sysctl_tcp_keepalive_probes); }
static inline u32 keepalive_time_elapsed(const struct tcp_sock *tp) diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index 514817119bd4d..cc7966cfad1a3 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -3466,7 +3466,8 @@ int tcp_sock_set_keepcnt(struct sock *sk, int val) return -EINVAL;
lock_sock(sk); - tcp_sk(sk)->keepalive_probes = val; + /* Paired with READ_ONCE() in keepalive_probes() */ + WRITE_ONCE(tcp_sk(sk)->keepalive_probes, val); release_sock(sk); return 0; } @@ -3674,7 +3675,7 @@ int do_tcp_setsockopt(struct sock *sk, int level, int optname, if (val < 1 || val > MAX_TCP_KEEPCNT) err = -EINVAL; else - tp->keepalive_probes = val; + WRITE_ONCE(tp->keepalive_probes, val); break; case TCP_SYNCNT: if (val < 1 || val > MAX_TCP_SYNCNT)
From: Eric Dumazet edumazet@google.com
[ Upstream commit 3a037f0f3c4bfe44518f2fbb478aa2f99a9cd8bb ]
do_tcp_getsockopt() and reqsk_timer_handler() read icsk->icsk_syn_retries while another cpu might change its value.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Eric Dumazet edumazet@google.com Link: https://lore.kernel.org/r/20230719212857.3943972-7-edumazet@google.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/ipv4/inet_connection_sock.c | 2 +- net/ipv4/tcp.c | 6 +++--- 2 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c index 1386787eaf1a5..3105a676eba76 100644 --- a/net/ipv4/inet_connection_sock.c +++ b/net/ipv4/inet_connection_sock.c @@ -1016,7 +1016,7 @@ static void reqsk_timer_handler(struct timer_list *t)
icsk = inet_csk(sk_listener); net = sock_net(sk_listener); - max_syn_ack_retries = icsk->icsk_syn_retries ? : + max_syn_ack_retries = READ_ONCE(icsk->icsk_syn_retries) ? : READ_ONCE(net->ipv4.sysctl_tcp_synack_retries); /* Normally all the openreqs are young and become mature * (i.e. converted to established socket) for first timeout. diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index cc7966cfad1a3..488cf4ae75fab 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -3400,7 +3400,7 @@ int tcp_sock_set_syncnt(struct sock *sk, int val) return -EINVAL;
lock_sock(sk); - inet_csk(sk)->icsk_syn_retries = val; + WRITE_ONCE(inet_csk(sk)->icsk_syn_retries, val); release_sock(sk); return 0; } @@ -3681,7 +3681,7 @@ int do_tcp_setsockopt(struct sock *sk, int level, int optname, if (val < 1 || val > MAX_TCP_SYNCNT) err = -EINVAL; else - icsk->icsk_syn_retries = val; + WRITE_ONCE(icsk->icsk_syn_retries, val); break;
case TCP_SAVE_SYN: @@ -4102,7 +4102,7 @@ int do_tcp_getsockopt(struct sock *sk, int level, val = keepalive_probes(tp); break; case TCP_SYNCNT: - val = icsk->icsk_syn_retries ? : + val = READ_ONCE(icsk->icsk_syn_retries) ? : READ_ONCE(net->ipv4.sysctl_tcp_syn_retries); break; case TCP_LINGER2:
From: Eric Dumazet edumazet@google.com
[ Upstream commit 9df5335ca974e688389c875546e5819778a80d59 ]
do_tcp_getsockopt() reads tp->linger2 while another cpu might change its value.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Eric Dumazet edumazet@google.com Link: https://lore.kernel.org/r/20230719212857.3943972-8-edumazet@google.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/ipv4/tcp.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index 488cf4ae75fab..0ebe775bde688 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -3694,11 +3694,11 @@ int do_tcp_setsockopt(struct sock *sk, int level, int optname,
case TCP_LINGER2: if (val < 0) - tp->linger2 = -1; + WRITE_ONCE(tp->linger2, -1); else if (val > TCP_FIN_TIMEOUT_MAX / HZ) - tp->linger2 = TCP_FIN_TIMEOUT_MAX; + WRITE_ONCE(tp->linger2, TCP_FIN_TIMEOUT_MAX); else - tp->linger2 = val * HZ; + WRITE_ONCE(tp->linger2, val * HZ); break;
case TCP_DEFER_ACCEPT: @@ -4106,7 +4106,7 @@ int do_tcp_getsockopt(struct sock *sk, int level, READ_ONCE(net->ipv4.sysctl_tcp_syn_retries); break; case TCP_LINGER2: - val = tp->linger2; + val = READ_ONCE(tp->linger2); if (val >= 0) val = (val ? : READ_ONCE(net->ipv4.sysctl_tcp_fin_timeout)) / HZ; break;
From: Eric Dumazet edumazet@google.com
[ Upstream commit ae488c74422fb1dcd807c0201804b3b5e8a322a3 ]
do_tcp_getsockopt() reads rskq_defer_accept while another cpu might change its value.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Eric Dumazet edumazet@google.com Link: https://lore.kernel.org/r/20230719212857.3943972-9-edumazet@google.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/ipv4/tcp.c | 11 ++++++----- 1 file changed, 6 insertions(+), 5 deletions(-)
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index 0ebe775bde688..c95d8b43390b6 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -3703,9 +3703,9 @@ int do_tcp_setsockopt(struct sock *sk, int level, int optname,
case TCP_DEFER_ACCEPT: /* Translate value in seconds to number of retransmits */ - icsk->icsk_accept_queue.rskq_defer_accept = - secs_to_retrans(val, TCP_TIMEOUT_INIT / HZ, - TCP_RTO_MAX / HZ); + WRITE_ONCE(icsk->icsk_accept_queue.rskq_defer_accept, + secs_to_retrans(val, TCP_TIMEOUT_INIT / HZ, + TCP_RTO_MAX / HZ)); break;
case TCP_WINDOW_CLAMP: @@ -4111,8 +4111,9 @@ int do_tcp_getsockopt(struct sock *sk, int level, val = (val ? : READ_ONCE(net->ipv4.sysctl_tcp_fin_timeout)) / HZ; break; case TCP_DEFER_ACCEPT: - val = retrans_to_secs(icsk->icsk_accept_queue.rskq_defer_accept, - TCP_TIMEOUT_INIT / HZ, TCP_RTO_MAX / HZ); + val = READ_ONCE(icsk->icsk_accept_queue.rskq_defer_accept); + val = retrans_to_secs(val, TCP_TIMEOUT_INIT / HZ, + TCP_RTO_MAX / HZ); break; case TCP_WINDOW_CLAMP: val = tp->window_clamp;
From: Eric Dumazet edumazet@google.com
[ Upstream commit 1aeb87bc1440c5447a7fa2d6e3c2cca52cbd206b ]
tp->notsent_lowat can be read locklessly from do_tcp_getsockopt() and tcp_poll().
Fixes: c9bee3b7fdec ("tcp: TCP_NOTSENT_LOWAT socket option") Signed-off-by: Eric Dumazet edumazet@google.com Link: https://lore.kernel.org/r/20230719212857.3943972-10-edumazet@google.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- include/net/tcp.h | 6 +++++- net/ipv4/tcp.c | 4 ++-- 2 files changed, 7 insertions(+), 3 deletions(-)
diff --git a/include/net/tcp.h b/include/net/tcp.h index f5c20afab6286..182337a8cf94a 100644 --- a/include/net/tcp.h +++ b/include/net/tcp.h @@ -2066,7 +2066,11 @@ void __tcp_v4_send_check(struct sk_buff *skb, __be32 saddr, __be32 daddr); static inline u32 tcp_notsent_lowat(const struct tcp_sock *tp) { struct net *net = sock_net((struct sock *)tp); - return tp->notsent_lowat ?: READ_ONCE(net->ipv4.sysctl_tcp_notsent_lowat); + u32 val; + + val = READ_ONCE(tp->notsent_lowat); + + return val ?: READ_ONCE(net->ipv4.sysctl_tcp_notsent_lowat); }
bool tcp_stream_memory_free(const struct sock *sk, int wake); diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index c95d8b43390b6..4556ba6e7d74d 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -3773,7 +3773,7 @@ int do_tcp_setsockopt(struct sock *sk, int level, int optname, err = tcp_repair_set_window(tp, optval, optlen); break; case TCP_NOTSENT_LOWAT: - tp->notsent_lowat = val; + WRITE_ONCE(tp->notsent_lowat, val); sk->sk_write_space(sk); break; case TCP_INQ: @@ -4273,7 +4273,7 @@ int do_tcp_getsockopt(struct sock *sk, int level, val = tcp_time_stamp_raw() + READ_ONCE(tp->tsoffset); break; case TCP_NOTSENT_LOWAT: - val = tp->notsent_lowat; + val = READ_ONCE(tp->notsent_lowat); break; case TCP_INQ: val = tp->recvmsg_inq;
From: Eric Dumazet edumazet@google.com
[ Upstream commit 26023e91e12c68669db416b97234328a03d8e499 ]
This field can be read locklessly from do_tcp_getsockopt()
Fixes: dca43c75e7e5 ("tcp: Add TCP_USER_TIMEOUT socket option.") Signed-off-by: Eric Dumazet edumazet@google.com Link: https://lore.kernel.org/r/20230719212857.3943972-11-edumazet@google.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/ipv4/tcp.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index 4556ba6e7d74d..c9b955d9d7ace 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -3409,7 +3409,7 @@ EXPORT_SYMBOL(tcp_sock_set_syncnt); void tcp_sock_set_user_timeout(struct sock *sk, u32 val) { lock_sock(sk); - inet_csk(sk)->icsk_user_timeout = val; + WRITE_ONCE(inet_csk(sk)->icsk_user_timeout, val); release_sock(sk); } EXPORT_SYMBOL(tcp_sock_set_user_timeout); @@ -3729,7 +3729,7 @@ int do_tcp_setsockopt(struct sock *sk, int level, int optname, if (val < 0) err = -EINVAL; else - icsk->icsk_user_timeout = val; + WRITE_ONCE(icsk->icsk_user_timeout, val); break;
case TCP_FASTOPEN: @@ -4250,7 +4250,7 @@ int do_tcp_getsockopt(struct sock *sk, int level, break;
case TCP_USER_TIMEOUT: - val = icsk->icsk_user_timeout; + val = READ_ONCE(icsk->icsk_user_timeout); break;
case TCP_FASTOPEN:
From: Eric Dumazet edumazet@google.com
[ Upstream commit 70f360dd7042cb843635ece9d28335a4addff9eb ]
This field can be read locklessly.
Fixes: 1536e2857bd3 ("tcp: Add a TCP_FASTOPEN socket option to get a max backlog on its listner") Signed-off-by: Eric Dumazet edumazet@google.com Link: https://lore.kernel.org/r/20230719212857.3943972-12-edumazet@google.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/tcp.h | 2 +- net/ipv4/tcp.c | 2 +- net/ipv4/tcp_fastopen.c | 6 ++++-- 3 files changed, 6 insertions(+), 4 deletions(-)
diff --git a/include/linux/tcp.h b/include/linux/tcp.h index b4c08ac869835..91a37c99ba665 100644 --- a/include/linux/tcp.h +++ b/include/linux/tcp.h @@ -513,7 +513,7 @@ static inline void fastopen_queue_tune(struct sock *sk, int backlog) struct request_sock_queue *queue = &inet_csk(sk)->icsk_accept_queue; int somaxconn = READ_ONCE(sock_net(sk)->core.sysctl_somaxconn);
- queue->fastopenq.max_qlen = min_t(unsigned int, backlog, somaxconn); + WRITE_ONCE(queue->fastopenq.max_qlen, min_t(unsigned int, backlog, somaxconn)); }
static inline void tcp_move_syn(struct tcp_sock *tp, diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index c9b955d9d7ace..79f29e138fc9f 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -4254,7 +4254,7 @@ int do_tcp_getsockopt(struct sock *sk, int level, break;
case TCP_FASTOPEN: - val = icsk->icsk_accept_queue.fastopenq.max_qlen; + val = READ_ONCE(icsk->icsk_accept_queue.fastopenq.max_qlen); break;
case TCP_FASTOPEN_CONNECT: diff --git a/net/ipv4/tcp_fastopen.c b/net/ipv4/tcp_fastopen.c index 45cc7f1ca2961..85e4953f11821 100644 --- a/net/ipv4/tcp_fastopen.c +++ b/net/ipv4/tcp_fastopen.c @@ -296,6 +296,7 @@ static struct sock *tcp_fastopen_create_child(struct sock *sk, static bool tcp_fastopen_queue_check(struct sock *sk) { struct fastopen_queue *fastopenq; + int max_qlen;
/* Make sure the listener has enabled fastopen, and we don't * exceed the max # of pending TFO requests allowed before trying @@ -308,10 +309,11 @@ static bool tcp_fastopen_queue_check(struct sock *sk) * temporarily vs a server not supporting Fast Open at all. */ fastopenq = &inet_csk(sk)->icsk_accept_queue.fastopenq; - if (fastopenq->max_qlen == 0) + max_qlen = READ_ONCE(fastopenq->max_qlen); + if (max_qlen == 0) return false;
- if (fastopenq->qlen >= fastopenq->max_qlen) { + if (fastopenq->qlen >= max_qlen) { struct request_sock *req1; spin_lock(&fastopenq->lock); req1 = fastopenq->rskq_rst_head;
From: Vladimir Oltean vladimir.oltean@nxp.com
[ Upstream commit 1c613beaf877c0c0d755853dc62687e2013e55c4 ]
mdio_bus_init() and phy_driver_register() both have error paths, and if those are ever hit, ethtool will have a stale pointer to the phy_ethtool_phy_ops stub structure, which references memory from a module that failed to load (phylib).
It is probably hard to force an error in this code path even manually, but the error teardown path of phy_init() should be the same as phy_exit(), which is now simply not the case.
Fixes: 55d8f053ce1b ("net: phy: Register ethtool PHY operations") Link: https://lore.kernel.org/netdev/ZLaiJ4G6TaJYGJyU@shell.armlinux.org.uk/ Suggested-by: Russell King (Oracle) linux@armlinux.org.uk Signed-off-by: Vladimir Oltean vladimir.oltean@nxp.com Link: https://lore.kernel.org/r/20230720000231.1939689-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/phy/phy_device.c | 21 ++++++++++++++------- 1 file changed, 14 insertions(+), 7 deletions(-)
diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c index 53598210be6cb..2c4e6de8f4d9f 100644 --- a/drivers/net/phy/phy_device.c +++ b/drivers/net/phy/phy_device.c @@ -3452,23 +3452,30 @@ static int __init phy_init(void) { int rc;
+ ethtool_set_ethtool_phy_ops(&phy_ethtool_phy_ops); + rc = mdio_bus_init(); if (rc) - return rc; + goto err_ethtool_phy_ops;
- ethtool_set_ethtool_phy_ops(&phy_ethtool_phy_ops); features_init();
rc = phy_driver_register(&genphy_c45_driver, THIS_MODULE); if (rc) - goto err_c45; + goto err_mdio_bus;
rc = phy_driver_register(&genphy_driver, THIS_MODULE); - if (rc) { - phy_driver_unregister(&genphy_c45_driver); + if (rc) + goto err_c45; + + return 0; + err_c45: - mdio_bus_exit(); - } + phy_driver_unregister(&genphy_c45_driver); +err_mdio_bus: + mdio_bus_exit(); +err_ethtool_phy_ops: + ethtool_set_ethtool_phy_ops(NULL);
return rc; }
From: Zhang Yi yi.zhang@huawei.com
commit c2d6fd9d6f35079f1669f0100f05b46708c74b7f upstream.
There is a long-standing metadata corruption issue that happens from time to time, but it's very difficult to reproduce and analyse, benefit from the JBD2_CYCLE_RECORD option, we found out that the problem is the checkpointing process miss to write out some buffers which are raced by another do_get_write_access(). Looks below for detail.
jbd2_log_do_checkpoint() //transaction X //buffer A is dirty and not belones to any transaction __buffer_relink_io() //move it to the IO list __flush_batch() write_dirty_buffer() do_get_write_access() clear_buffer_dirty __jbd2_journal_file_buffer() //add buffer A to a new transaction Y lock_buffer(bh) //doesn't write out __jbd2_journal_remove_checkpoint() //finish checkpoint except buffer A //filesystem corrupt if the new transaction Y isn't fully write out.
Due to the t_checkpoint_list walking loop in jbd2_log_do_checkpoint() have already handles waiting for buffers under IO and re-added new transaction to complete commit, and it also removing cleaned buffers, this makes sure the list will eventually get empty. So it's fine to leave buffers on the t_checkpoint_list while flushing out and completely stop using the t_checkpoint_io_list.
Cc: stable@vger.kernel.org Suggested-by: Jan Kara jack@suse.cz Signed-off-by: Zhang Yi yi.zhang@huawei.com Tested-by: Zhihao Cheng chengzhihao1@huawei.com Reviewed-by: Jan Kara jack@suse.cz Link: https://lore.kernel.org/r/20230606135928.434610-2-yi.zhang@huaweicloud.com Signed-off-by: Theodore Ts'o tytso@mit.edu Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/jbd2/checkpoint.c | 102 ++++++++++++++------------------------------------- 1 file changed, 29 insertions(+), 73 deletions(-)
--- a/fs/jbd2/checkpoint.c +++ b/fs/jbd2/checkpoint.c @@ -58,28 +58,6 @@ static inline void __buffer_unlink(struc }
/* - * Move a buffer from the checkpoint list to the checkpoint io list - * - * Called with j_list_lock held - */ -static inline void __buffer_relink_io(struct journal_head *jh) -{ - transaction_t *transaction = jh->b_cp_transaction; - - __buffer_unlink_first(jh); - - if (!transaction->t_checkpoint_io_list) { - jh->b_cpnext = jh->b_cpprev = jh; - } else { - jh->b_cpnext = transaction->t_checkpoint_io_list; - jh->b_cpprev = transaction->t_checkpoint_io_list->b_cpprev; - jh->b_cpprev->b_cpnext = jh; - jh->b_cpnext->b_cpprev = jh; - } - transaction->t_checkpoint_io_list = jh; -} - -/* * Check a checkpoint buffer could be release or not. * * Requires j_list_lock @@ -183,6 +161,7 @@ __flush_batch(journal_t *journal, int *b struct buffer_head *bh = journal->j_chkpt_bhs[i]; BUFFER_TRACE(bh, "brelse"); __brelse(bh); + journal->j_chkpt_bhs[i] = NULL; } *batch_count = 0; } @@ -242,6 +221,11 @@ restart: jh = transaction->t_checkpoint_list; bh = jh2bh(jh);
+ /* + * The buffer may be writing back, or flushing out in the + * last couple of cycles, or re-adding into a new transaction, + * need to check it again until it's unlocked. + */ if (buffer_locked(bh)) { get_bh(bh); spin_unlock(&journal->j_list_lock); @@ -287,28 +271,32 @@ restart: } if (!buffer_dirty(bh)) { BUFFER_TRACE(bh, "remove from checkpoint"); - if (__jbd2_journal_remove_checkpoint(jh)) - /* The transaction was released; we're done */ + /* + * If the transaction was released or the checkpoint + * list was empty, we're done. + */ + if (__jbd2_journal_remove_checkpoint(jh) || + !transaction->t_checkpoint_list) goto out; - continue; + } else { + /* + * We are about to write the buffer, it could be + * raced by some other transaction shrink or buffer + * re-log logic once we release the j_list_lock, + * leave it on the checkpoint list and check status + * again to make sure it's clean. + */ + BUFFER_TRACE(bh, "queue"); + get_bh(bh); + J_ASSERT_BH(bh, !buffer_jwrite(bh)); + journal->j_chkpt_bhs[batch_count++] = bh; + transaction->t_chp_stats.cs_written++; + transaction->t_checkpoint_list = jh->b_cpnext; } - /* - * Important: we are about to write the buffer, and - * possibly block, while still holding the journal - * lock. We cannot afford to let the transaction - * logic start messing around with this buffer before - * we write it to disk, as that would break - * recoverability. - */ - BUFFER_TRACE(bh, "queue"); - get_bh(bh); - J_ASSERT_BH(bh, !buffer_jwrite(bh)); - journal->j_chkpt_bhs[batch_count++] = bh; - __buffer_relink_io(jh); - transaction->t_chp_stats.cs_written++; + if ((batch_count == JBD2_NR_BATCH) || - need_resched() || - spin_needbreak(&journal->j_list_lock)) + need_resched() || spin_needbreak(&journal->j_list_lock) || + jh2bh(transaction->t_checkpoint_list) == journal->j_chkpt_bhs[0]) goto unlock_and_flush; }
@@ -322,38 +310,6 @@ restart: goto restart; }
- /* - * Now we issued all of the transaction's buffers, let's deal - * with the buffers that are out for I/O. - */ -restart2: - /* Did somebody clean up the transaction in the meanwhile? */ - if (journal->j_checkpoint_transactions != transaction || - transaction->t_tid != this_tid) - goto out; - - while (transaction->t_checkpoint_io_list) { - jh = transaction->t_checkpoint_io_list; - bh = jh2bh(jh); - if (buffer_locked(bh)) { - get_bh(bh); - spin_unlock(&journal->j_list_lock); - wait_on_buffer(bh); - /* the journal_head may have gone by now */ - BUFFER_TRACE(bh, "brelse"); - __brelse(bh); - spin_lock(&journal->j_list_lock); - goto restart2; - } - - /* - * Now in whatever state the buffer currently is, we - * know that it has been written out and so we can - * drop it from the list - */ - if (__jbd2_journal_remove_checkpoint(jh)) - break; - } out: spin_unlock(&journal->j_list_lock); result = jbd2_cleanup_journal_tail(journal);
From: Miguel Ojeda ojeda@kernel.org
commit df01b7cfcef08bf3fdcac2909d0e1910781d6bfd upstream.
`rustc` outputs by default the temporary files (i.e. the ones saved by `-Csave-temps`, such as `*.rcgu*` files) in the current working directory when `-o` and `--out-dir` are not given (even if `--emit=x=path` is given, i.e. it does not use those for temporaries).
Since out-of-tree modules are compiled from the `linux` tree, `rustc` then tries to create them there, which may not be accessible.
Thus pass `--out-dir` explicitly, even if it is just for the temporary files.
Similarly, do so for Rust host programs too.
Reported-by: Raphael Nestler raphael.nestler@gmail.com Closes: https://github.com/Rust-for-Linux/linux/issues/1015 Reported-by: Andrea Righi andrea.righi@canonical.com Tested-by: Raphael Nestler raphael.nestler@gmail.com # non-hostprogs Tested-by: Andrea Righi andrea.righi@canonical.com # non-hostprogs Fixes: 295d8398c67e ("kbuild: specify output names separately for each emission type from rustc") Cc: stable@vger.kernel.org Signed-off-by: Miguel Ojeda ojeda@kernel.org Tested-by: Martin Rodriguez Reboredo yakoyoku@gmail.com Signed-off-by: Masahiro Yamada masahiroy@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- scripts/Makefile.build | 5 ++++- scripts/Makefile.host | 6 +++++- 2 files changed, 9 insertions(+), 2 deletions(-)
--- a/scripts/Makefile.build +++ b/scripts/Makefile.build @@ -279,6 +279,9 @@ $(obj)/%.lst: $(src)/%.c FORCE
rust_allowed_features := core_ffi_c,explicit_generic_args_with_impl_trait,new_uninit,pin_macro
+# `--out-dir` is required to avoid temporaries being created by `rustc` in the +# current working directory, which may be not accessible in the out-of-tree +# modules case. rust_common_cmd = \ RUST_MODFILE=$(modfile) $(RUSTC_OR_CLIPPY) $(rust_flags) \ -Zallow-features=$(rust_allowed_features) \ @@ -287,7 +290,7 @@ rust_common_cmd = \ --extern alloc --extern kernel \ --crate-type rlib -L $(objtree)/rust/ \ --crate-name $(basename $(notdir $@)) \ - --emit=dep-info=$(depfile) + --out-dir $(dir $@) --emit=dep-info=$(depfile)
# `--emit=obj`, `--emit=asm` and `--emit=llvm-ir` imply a single codegen unit # will be used. We explicitly request `-Ccodegen-units=1` in any case, and --- a/scripts/Makefile.host +++ b/scripts/Makefile.host @@ -86,7 +86,11 @@ hostc_flags = -Wp,-MMD,$(depfile) \ hostcxx_flags = -Wp,-MMD,$(depfile) \ $(KBUILD_HOSTCXXFLAGS) $(HOST_EXTRACXXFLAGS) \ $(HOSTCXXFLAGS_$(target-stem).o) -hostrust_flags = --emit=dep-info=$(depfile) \ + +# `--out-dir` is required to avoid temporaries being created by `rustc` in the +# current working directory, which may be not accessible in the out-of-tree +# modules case. +hostrust_flags = --out-dir $(dir $@) --emit=dep-info=$(depfile) \ $(KBUILD_HOSTRUSTFLAGS) $(HOST_EXTRARUSTFLAGS) \ $(HOSTRUSTFLAGS_$(target-stem))
From: Mohamed Khalfella mkhalfella@purestorage.com
commit 4b8b3905165ef98386a3c06f196c85d21292d029 upstream.
Commit 6018b585e8c6 ("tracing/histograms: Add histograms to hist_vars if they have referenced variables") added a check to fail histogram creation if save_hist_vars() failed to add histogram to hist_vars list. But the commit failed to set ret to failed return code before jumping to unregister histogram, fix it.
Link: https://lore.kernel.org/linux-trace-kernel/20230714203341.51396-1-mkhalfella...
Cc: stable@vger.kernel.org Fixes: 6018b585e8c6 ("tracing/histograms: Add histograms to hist_vars if they have referenced variables") Signed-off-by: Mohamed Khalfella mkhalfella@purestorage.com Signed-off-by: Steven Rostedt (Google) rostedt@goodmis.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/trace/trace_events_hist.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/kernel/trace/trace_events_hist.c +++ b/kernel/trace/trace_events_hist.c @@ -6668,7 +6668,8 @@ static int event_hist_trigger_parse(stru goto out_unreg;
if (has_hist_vars(hist_data) || hist_data->n_var_refs) { - if (save_hist_vars(hist_data)) + ret = save_hist_vars(hist_data); + if (ret) goto out_unreg; }
From: Yunxiang Li Yunxiang.Li@amd.com
commit 4481913607e58196c48a4fef5e6f45350684ec3c upstream.
When the resource is the first in the bulk_move range, adding it again (thus moving it to the tail) will corrupt the list since the first pointer is not moved. This eventually lead to null pointer deref in ttm_lru_bulk_move_del()
Fixes: fee2ede15542 ("drm/ttm: rework bulk move handling v5") Signed-off-by: Yunxiang Li Yunxiang.Li@amd.com Reviewed-by: Christian König christian.koenig@amd.com CC: stable@vger.kernel.org Link: https://patchwork.freedesktop.org/patch/msgid/20230622141902.28718-3-Yunxian... Signed-off-by: Christian König christian.koenig@amd.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/ttm/ttm_resource.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
--- a/drivers/gpu/drm/ttm/ttm_resource.c +++ b/drivers/gpu/drm/ttm/ttm_resource.c @@ -86,6 +86,8 @@ static void ttm_lru_bulk_move_pos_tail(s struct ttm_resource *res) { if (pos->last != res) { + if (pos->first == res) + pos->first = list_next_entry(res, lru); list_move(&res->lru, &pos->last->lru); pos->last = res; } @@ -111,7 +113,8 @@ static void ttm_lru_bulk_move_del(struct { struct ttm_lru_bulk_move_pos *pos = ttm_lru_bulk_move_pos(bulk, res);
- if (unlikely(pos->first == res && pos->last == res)) { + if (unlikely(WARN_ON(!pos->first || !pos->last) || + (pos->first == res && pos->last == res))) { pos->first = NULL; pos->last = NULL; } else if (pos->first == res) {
From: Abe Kohandel abe.kohandel@intel.com
commit 5b6d0b91f84cff3f28724076f93f6f9e2ef8d775 upstream.
Remove a misleading comment about the DMA operations of the Intel Mount Evans SoC's SPI Controller as requested by Serge.
Signed-off-by: Abe Kohandel abe.kohandel@intel.com Link: https://lore.kernel.org/linux-spi/20230606191333.247ucbf7h3tlooxf@mobilestat... Fixes: 0760d5d0e9f0 ("spi: dw: Add compatible for Intel Mount Evans SoC") Reviewed-by: Serge Semin fancer.lancer@gmail.com Link: https://lore.kernel.org/r/20230606231844.726272-1-abe.kohandel@intel.com Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/spi/spi-dw-mmio.c | 9 +-------- 1 file changed, 1 insertion(+), 8 deletions(-)
--- a/drivers/spi/spi-dw-mmio.c +++ b/drivers/spi/spi-dw-mmio.c @@ -237,14 +237,7 @@ static int dw_spi_intel_init(struct plat }
/* - * The Intel Mount Evans SoC's Integrated Management Complex uses the - * SPI controller for access to a NOR SPI FLASH. However, the SoC doesn't - * provide a mechanism to override the native chip select signal. - * - * This driver doesn't use DMA for memory operations when a chip select - * override is not provided due to the native chip select timing behavior. - * As a result no DMA configuration is done for the controller and this - * configuration is not tested. + * DMA-based mem ops are not configured for this device and are not tested. */ static int dw_spi_mountevans_imc_init(struct platform_device *pdev, struct dw_spi_mmio *dwsmmio)
From: Yu Kuai yukuai3@huawei.com
commit fcaa174a9c995cf0af3967e55644a1543ea07e36 upstream.
In order to prevent request_queue to be freed before cleaning up blktrace debugfs entries, commit db59133e9279 ("scsi: sg: fix blktrace debugfs entries leakage") use scsi_device_get(), however, scsi_device_get() will also grab scsi module reference and scsi module can't be removed.
It's reported that blktests can't unload scsi_debug after block/001:
blktests (master) # ./check block block/001 (stress device hotplugging) [failed] +++ /root/blktests/results/nodev/block/001.out.bad 2023-06-19 Running block/001 Stressing sd +modprobe: FATAL: Module scsi_debug is in use.
Fix this problem by grabbing request_queue reference directly, so that scsi host module can still be unloaded while request_queue will be pinged by sg device.
Reported-by: Chaitanya Kulkarni chaitanyak@nvidia.com Link: https://lore.kernel.org/all/1760da91-876d-fc9c-ab51-999a6f66ad50@nvidia.com/ Fixes: db59133e9279 ("scsi: sg: fix blktrace debugfs entries leakage") Signed-off-by: Yu Kuai yukuai3@huawei.com Reviewed-by: Christoph Hellwig hch@lst.de Link: https://lore.kernel.org/r/20230621160111.1433521-1-yukuai1@huaweicloud.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/scsi/sg.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
--- a/drivers/scsi/sg.c +++ b/drivers/scsi/sg.c @@ -1496,7 +1496,7 @@ sg_add_device(struct device *cl_dev) int error; unsigned long iflags;
- error = scsi_device_get(scsidp); + error = blk_get_queue(scsidp->request_queue); if (error) return error;
@@ -1557,7 +1557,7 @@ cdev_add_err: out: if (cdev) cdev_del(cdev); - scsi_device_put(scsidp); + blk_put_queue(scsidp->request_queue); return error; }
@@ -1574,7 +1574,7 @@ sg_device_destroy(struct kref *kref) */
blk_trace_remove(q); - scsi_device_put(sdp->device); + blk_put_queue(q);
write_lock_irqsave(&sg_index_lock, flags); idr_remove(&sg_index_idr, sdp->index);
From: Yu Kuai yukuai3@huawei.com
commit 80b6051085c5fedcb1dfd7b2562a63a83655c4d8 upstream.
Commit fcaa174a9c99 ("scsi/sg: don't grab scsi host module reference") make a mess how blk_get_queue() is called, blk_get_queue() returns true on success while the caller expects it returns 0 on success.
Fix this problem and also add a corresponding error message on failure.
Fixes: fcaa174a9c99 ("scsi/sg: don't grab scsi host module reference") Reported-by: Marc Hartmayer mhartmay@linux.ibm.com Closes: https://lore.kernel.org/all/87lefv622n.fsf@linux.ibm.com/ Signed-off-by: Yu Kuai yukuai3@huawei.com Link: https://lore.kernel.org/r/20230705024001.177585-1-yukuai1@huaweicloud.com Tested-by: Shin'ichiro Kawasaki shinichiro.kawasaki@wdc.com Tested-by: Marc Hartmayer mhartmay@linux.ibm.com Reviewed-by: Christoph Hellwig hch@lst.de Reviewed-by: Marc Hartmayer mhartmay@linux.ibm.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/scsi/sg.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)
--- a/drivers/scsi/sg.c +++ b/drivers/scsi/sg.c @@ -1496,9 +1496,10 @@ sg_add_device(struct device *cl_dev) int error; unsigned long iflags;
- error = blk_get_queue(scsidp->request_queue); - if (error) - return error; + if (!blk_get_queue(scsidp->request_queue)) { + pr_warn("%s: get scsi_device queue failed\n", __func__); + return -ENODEV; + }
error = -ENOMEM; cdev = cdev_alloc();
From: Srinivasan Shanmugam srinivasan.shanmugam@amd.com
commit 87279fdf5ee0ad1360765ef70389d1c4d0f81bb6 upstream.
Fix the following errors & warnings reported by checkpatch:
ERROR: space required before the open brace '{' ERROR: space required before the open parenthesis '(' ERROR: that open brace { should be on the previous line ERROR: space prohibited before that ',' (ctx:WxW) ERROR: else should follow close brace '}' ERROR: open brace '{' following function definitions go on the next line ERROR: code indent should use tabs where possible
WARNING: braces {} are not necessary for single statement blocks WARNING: void function return statements are not generally useful WARNING: Block comments use * on subsequent lines WARNING: Block comments use a trailing */ on a separate line
Cc: Rodrigo Siqueira Rodrigo.Siqueira@amd.com Cc: Aurabindo Pillai aurabindo.pillai@amd.com Cc: Alex Deucher alexander.deucher@amd.com Signed-off-by: Srinivasan Shanmugam srinivasan.shanmugam@amd.com Reviewed-by: Rodrigo Siqueira Rodrigo.Siqueira@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 133 ++++++++++------------ 1 file changed, 65 insertions(+), 68 deletions(-)
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c @@ -416,12 +416,12 @@ static void dm_pflip_high_irq(void *inte
spin_lock_irqsave(&adev_to_drm(adev)->event_lock, flags);
- if (amdgpu_crtc->pflip_status != AMDGPU_FLIP_SUBMITTED){ - DC_LOG_PFLIP("amdgpu_crtc->pflip_status = %d !=AMDGPU_FLIP_SUBMITTED(%d) on crtc:%d[%p] \n", - amdgpu_crtc->pflip_status, - AMDGPU_FLIP_SUBMITTED, - amdgpu_crtc->crtc_id, - amdgpu_crtc); + if (amdgpu_crtc->pflip_status != AMDGPU_FLIP_SUBMITTED) { + DC_LOG_PFLIP("amdgpu_crtc->pflip_status = %d !=AMDGPU_FLIP_SUBMITTED(%d) on crtc:%d[%p]\n", + amdgpu_crtc->pflip_status, + AMDGPU_FLIP_SUBMITTED, + amdgpu_crtc->crtc_id, + amdgpu_crtc); spin_unlock_irqrestore(&adev_to_drm(adev)->event_lock, flags); return; } @@ -875,7 +875,7 @@ static int dm_set_powergating_state(void }
/* Prototypes of private functions */ -static int dm_early_init(void* handle); +static int dm_early_init(void *handle);
/* Allocate memory for FBC compressed data */ static void amdgpu_dm_fbc_init(struct drm_connector *connector) @@ -1274,7 +1274,7 @@ static void mmhub_read_system_context(st pa_config->system_aperture.start_addr = (uint64_t)logical_addr_low << 18; pa_config->system_aperture.end_addr = (uint64_t)logical_addr_high << 18;
- pa_config->system_aperture.agp_base = (uint64_t)agp_base << 24 ; + pa_config->system_aperture.agp_base = (uint64_t)agp_base << 24; pa_config->system_aperture.agp_bot = (uint64_t)agp_bot << 24; pa_config->system_aperture.agp_top = (uint64_t)agp_top << 24;
@@ -1357,8 +1357,7 @@ static void dm_handle_hpd_rx_offload_wor DP_TEST_RESPONSE, &test_response.raw, sizeof(test_response)); - } - else if ((dc_link->connector_signal != SIGNAL_TYPE_EDP) && + } else if ((dc_link->connector_signal != SIGNAL_TYPE_EDP) && dc_link_check_link_loss_status(dc_link, &offload_work->data) && dc_link_dp_allow_hpd_rx_irq(dc_link)) { /* offload_work->data is from handle_hpd_rx_irq-> @@ -1546,7 +1545,7 @@ static int amdgpu_dm_init(struct amdgpu_ mutex_init(&adev->dm.dc_lock); mutex_init(&adev->dm.audio_lock);
- if(amdgpu_dm_irq_init(adev)) { + if (amdgpu_dm_irq_init(adev)) { DRM_ERROR("amdgpu: failed to initialize DM IRQ support.\n"); goto error; } @@ -1691,9 +1690,8 @@ static int amdgpu_dm_init(struct amdgpu_ if (amdgpu_dc_debug_mask & DC_DISABLE_STUTTER) adev->dm.dc->debug.disable_stutter = true;
- if (amdgpu_dc_debug_mask & DC_DISABLE_DSC) { + if (amdgpu_dc_debug_mask & DC_DISABLE_DSC) adev->dm.dc->debug.disable_dsc = true; - }
if (amdgpu_dc_debug_mask & DC_DISABLE_CLOCK_GATING) adev->dm.dc->debug.disable_clock_gate = true; @@ -1937,8 +1935,6 @@ static void amdgpu_dm_fini(struct amdgpu mutex_destroy(&adev->dm.audio_lock); mutex_destroy(&adev->dm.dc_lock); mutex_destroy(&adev->dm.dpia_aux_lock); - - return; }
static int load_dmcu_fw(struct amdgpu_device *adev) @@ -1947,7 +1943,7 @@ static int load_dmcu_fw(struct amdgpu_de int r; const struct dmcu_firmware_header_v1_0 *hdr;
- switch(adev->asic_type) { + switch (adev->asic_type) { #if defined(CONFIG_DRM_AMD_DC_SI) case CHIP_TAHITI: case CHIP_PITCAIRN: @@ -2704,7 +2700,7 @@ static void dm_gpureset_commit_state(str struct dc_scaling_info scaling_infos[MAX_SURFACES]; struct dc_flip_addrs flip_addrs[MAX_SURFACES]; struct dc_stream_update stream_update; - } * bundle; + } *bundle; int k, m;
bundle = kzalloc(sizeof(*bundle), GFP_KERNEL); @@ -2734,8 +2730,6 @@ static void dm_gpureset_commit_state(str
cleanup: kfree(bundle); - - return; }
static int dm_resume(void *handle) @@ -2949,8 +2943,7 @@ static const struct amd_ip_funcs amdgpu_ .set_powergating_state = dm_set_powergating_state, };
-const struct amdgpu_ip_block_version dm_ip_block = -{ +const struct amdgpu_ip_block_version dm_ip_block = { .type = AMD_IP_BLOCK_TYPE_DCE, .major = 1, .minor = 0, @@ -2995,9 +2988,12 @@ static void update_connector_ext_caps(st caps->ext_caps = &aconnector->dc_link->dpcd_sink_ext_caps; caps->aux_support = false;
- if (caps->ext_caps->bits.oled == 1 /*|| - caps->ext_caps->bits.sdr_aux_backlight_control == 1 || - caps->ext_caps->bits.hdr_aux_backlight_control == 1*/) + if (caps->ext_caps->bits.oled == 1 + /* + * || + * caps->ext_caps->bits.sdr_aux_backlight_control == 1 || + * caps->ext_caps->bits.hdr_aux_backlight_control == 1 + */) caps->aux_support = true;
if (amdgpu_backlight == 0) @@ -3264,6 +3260,7 @@ static void dm_handle_mst_sideband_msg(s process_count < max_process_count) { u8 ack[DP_PSR_ERROR_STATUS - DP_SINK_COUNT_ESI] = {}; u8 retry; + dret = 0;
process_count++; @@ -3463,7 +3460,7 @@ static void register_hpd_handlers(struct aconnector = to_amdgpu_dm_connector(connector); dc_link = aconnector->dc_link;
- if (DC_IRQ_SOURCE_INVALID != dc_link->irq_source_hpd) { + if (dc_link->irq_source_hpd != DC_IRQ_SOURCE_INVALID) { int_params.int_context = INTERRUPT_LOW_IRQ_CONTEXT; int_params.irq_source = dc_link->irq_source_hpd;
@@ -3472,7 +3469,7 @@ static void register_hpd_handlers(struct (void *) aconnector); }
- if (DC_IRQ_SOURCE_INVALID != dc_link->irq_source_hpd_rx) { + if (dc_link->irq_source_hpd_rx != DC_IRQ_SOURCE_INVALID) {
/* Also register for DP short pulse (hpd_rx). */ int_params.int_context = INTERRUPT_LOW_IRQ_CONTEXT; @@ -3498,7 +3495,7 @@ static int dce60_register_irq_handlers(s struct dc_interrupt_params int_params = {0}; int r; int i; - unsigned client_id = AMDGPU_IRQ_CLIENTID_LEGACY; + unsigned int client_id = AMDGPU_IRQ_CLIENTID_LEGACY;
int_params.requested_polarity = INTERRUPT_POLARITY_DEFAULT; int_params.current_polarity = INTERRUPT_POLARITY_DEFAULT; @@ -3512,11 +3509,12 @@ static int dce60_register_irq_handlers(s * Base driver will call amdgpu_dm_irq_handler() for ALL interrupts * coming from DC hardware. * amdgpu_dm_irq_handler() will re-direct the interrupt to DC - * for acknowledging and handling. */ + * for acknowledging and handling. + */
/* Use VBLANK interrupt */ for (i = 0; i < adev->mode_info.num_crtc; i++) { - r = amdgpu_irq_add_id(adev, client_id, i+1 , &adev->crtc_irq); + r = amdgpu_irq_add_id(adev, client_id, i + 1, &adev->crtc_irq); if (r) { DRM_ERROR("Failed to add crtc irq id!\n"); return r; @@ -3524,7 +3522,7 @@ static int dce60_register_irq_handlers(s
int_params.int_context = INTERRUPT_HIGH_IRQ_CONTEXT; int_params.irq_source = - dc_interrupt_to_irq_source(dc, i+1 , 0); + dc_interrupt_to_irq_source(dc, i + 1, 0);
c_irq_params = &adev->dm.vblank_params[int_params.irq_source - DC_IRQ_SOURCE_VBLANK1];
@@ -3580,7 +3578,7 @@ static int dce110_register_irq_handlers( struct dc_interrupt_params int_params = {0}; int r; int i; - unsigned client_id = AMDGPU_IRQ_CLIENTID_LEGACY; + unsigned int client_id = AMDGPU_IRQ_CLIENTID_LEGACY;
if (adev->family >= AMDGPU_FAMILY_AI) client_id = SOC15_IH_CLIENTID_DCE; @@ -3597,7 +3595,8 @@ static int dce110_register_irq_handlers( * Base driver will call amdgpu_dm_irq_handler() for ALL interrupts * coming from DC hardware. * amdgpu_dm_irq_handler() will re-direct the interrupt to DC - * for acknowledging and handling. */ + * for acknowledging and handling. + */
/* Use VBLANK interrupt */ for (i = VISLANDS30_IV_SRCID_D1_VERTICAL_INTERRUPT0; i <= VISLANDS30_IV_SRCID_D6_VERTICAL_INTERRUPT0; i++) { @@ -4044,7 +4043,7 @@ static void amdgpu_dm_update_backlight_c }
static int get_brightness_range(const struct amdgpu_dm_backlight_caps *caps, - unsigned *min, unsigned *max) + unsigned int *min, unsigned int *max) { if (!caps) return 0; @@ -4064,7 +4063,7 @@ static int get_brightness_range(const st static u32 convert_brightness_from_user(const struct amdgpu_dm_backlight_caps *caps, uint32_t brightness) { - unsigned min, max; + unsigned int min, max;
if (!get_brightness_range(caps, &min, &max)) return brightness; @@ -4077,7 +4076,7 @@ static u32 convert_brightness_from_user( static u32 convert_brightness_to_user(const struct amdgpu_dm_backlight_caps *caps, uint32_t brightness) { - unsigned min, max; + unsigned int min, max;
if (!get_brightness_range(caps, &min, &max)) return brightness; @@ -4557,7 +4556,6 @@ fail: static void amdgpu_dm_destroy_drm_device(struct amdgpu_display_manager *dm) { drm_atomic_private_obj_fini(&dm->atomic_obj); - return; }
/****************************************************************************** @@ -5375,6 +5373,7 @@ static bool adjust_colour_depth_from_dis { enum dc_color_depth depth = timing_out->display_color_depth; int normalized_clk; + do { normalized_clk = timing_out->pix_clk_100hz / 10; /* YCbCr 4:2:0 requires additional adjustment of 1/2 */ @@ -5590,6 +5589,7 @@ create_fake_sink(struct amdgpu_dm_connec { struct dc_sink_init_data sink_init_data = { 0 }; struct dc_sink *sink = NULL; + sink_init_data.link = aconnector->dc_link; sink_init_data.sink_signal = aconnector->dc_link->connector_signal;
@@ -5713,7 +5713,7 @@ get_highest_refresh_rate_mode(struct amd return &aconnector->freesync_vid_base;
/* Find the preferred mode */ - list_for_each_entry (m, list_head, head) { + list_for_each_entry(m, list_head, head) { if (m->type & DRM_MODE_TYPE_PREFERRED) { m_pref = m; break; @@ -5737,7 +5737,7 @@ get_highest_refresh_rate_mode(struct amd * For some monitors, preferred mode is not the mode with highest * supported refresh rate. */ - list_for_each_entry (m, list_head, head) { + list_for_each_entry(m, list_head, head) { current_refresh = drm_mode_vrefresh(m);
if (m->hdisplay == m_pref->hdisplay && @@ -6010,7 +6010,7 @@ create_stream_for_sink(struct amdgpu_dm_ * This may not be an error, the use case is when we have no * usermode calls to reset and set mode upon hotplug. In this * case, we call set mode ourselves to restore the previous mode - * and the modelist may not be filled in in time. + * and the modelist may not be filled in time. */ DRM_DEBUG_DRIVER("No preferred mode found\n"); } else { @@ -6034,9 +6034,9 @@ create_stream_for_sink(struct amdgpu_dm_ drm_mode_set_crtcinfo(&mode, 0);
/* - * If scaling is enabled and refresh rate didn't change - * we copy the vic and polarities of the old timings - */ + * If scaling is enabled and refresh rate didn't change + * we copy the vic and polarities of the old timings + */ if (!scale || mode_refresh != preferred_refresh) fill_stream_properties_from_drm_display_mode( stream, &mode, &aconnector->base, con_state, NULL, @@ -6756,6 +6756,7 @@ static int dm_encoder_helper_atomic_chec
if (!state->duplicated) { int max_bpc = conn_state->max_requested_bpc; + is_y420 = drm_mode_is_420_also(&connector->display_info, adjusted_mode) && aconnector->force_yuv420_output; color_depth = convert_color_depth_from_display_info(connector, @@ -7074,7 +7075,7 @@ static bool is_duplicate_mode(struct amd { struct drm_display_mode *m;
- list_for_each_entry (m, &aconnector->base.probed_modes, head) { + list_for_each_entry(m, &aconnector->base.probed_modes, head) { if (drm_mode_equal(m, mode)) return true; } @@ -7384,7 +7385,6 @@ static int amdgpu_dm_connector_init(stru
link->priv = aconnector;
- DRM_DEBUG_DRIVER("%s()\n", __func__);
i2c = create_i2c(link->ddc, link->link_index, &res); if (!i2c) { @@ -8106,8 +8106,7 @@ static void amdgpu_dm_commit_planes(stru * DRI3/Present extension with defined target_msc. */ last_flip_vblank = amdgpu_get_vblank_counter_kms(pcrtc); - } - else { + } else { /* For variable refresh rate mode only: * Get vblank of last completed flip to avoid > 1 vrr * flips per video frame by use of throttling, but allow @@ -8440,8 +8439,8 @@ static void amdgpu_dm_atomic_commit_tail dc_resource_state_copy_construct_current(dm->dc, dc_state); }
- for_each_oldnew_crtc_in_state (state, crtc, old_crtc_state, - new_crtc_state, i) { + for_each_oldnew_crtc_in_state(state, crtc, old_crtc_state, + new_crtc_state, i) { struct amdgpu_crtc *acrtc = to_amdgpu_crtc(crtc);
dm_old_crtc_state = to_dm_crtc_state(old_crtc_state); @@ -8464,9 +8463,7 @@ static void amdgpu_dm_atomic_commit_tail dm_old_crtc_state = to_dm_crtc_state(old_crtc_state);
drm_dbg_state(state->dev, - "amdgpu_crtc id:%d crtc_state_flags: enable:%d, active:%d, " - "planes_changed:%d, mode_changed:%d,active_changed:%d," - "connectors_changed:%d\n", + "amdgpu_crtc id:%d crtc_state_flags: enable:%d, active:%d, planes_changed:%d, mode_changed:%d,active_changed:%d,connectors_changed:%d\n", acrtc->crtc_id, new_crtc_state->enable, new_crtc_state->active, @@ -9035,8 +9032,8 @@ static int do_aquire_global_lock(struct &commit->flip_done, 10*HZ);
if (ret == 0) - DRM_ERROR("[CRTC:%d:%s] hw_done or flip_done " - "timed out\n", crtc->base.id, crtc->name); + DRM_ERROR("[CRTC:%d:%s] hw_done or flip_done timed out\n", + crtc->base.id, crtc->name);
drm_crtc_commit_put(commit); } @@ -9121,7 +9118,8 @@ is_timing_unchanged_for_freesync(struct return false; }
-static void set_freesync_fixed_config(struct dm_crtc_state *dm_new_crtc_state) { +static void set_freesync_fixed_config(struct dm_crtc_state *dm_new_crtc_state) +{ u64 num, den, res; struct drm_crtc_state *new_crtc_state = &dm_new_crtc_state->base;
@@ -9244,9 +9242,7 @@ static int dm_update_crtc_state(struct a goto skip_modeset;
drm_dbg_state(state->dev, - "amdgpu_crtc id:%d crtc_state_flags: enable:%d, active:%d, " - "planes_changed:%d, mode_changed:%d,active_changed:%d," - "connectors_changed:%d\n", + "amdgpu_crtc id:%d crtc_state_flags: enable:%d, active:%d, planes_changed:%d, mode_changed:%d,active_changed:%d,connectors_changed:%d\n", acrtc->crtc_id, new_crtc_state->enable, new_crtc_state->active, @@ -9275,8 +9271,7 @@ static int dm_update_crtc_state(struct a old_crtc_state)) { new_crtc_state->mode_changed = false; DRM_DEBUG_DRIVER( - "Mode change not required for front porch change, " - "setting mode_changed to %d", + "Mode change not required for front porch change, setting mode_changed to %d", new_crtc_state->mode_changed);
set_freesync_fixed_config(dm_new_crtc_state); @@ -9288,9 +9283,8 @@ static int dm_update_crtc_state(struct a struct drm_display_mode *high_mode;
high_mode = get_highest_refresh_rate_mode(aconnector, false); - if (!drm_mode_equal(&new_crtc_state->mode, high_mode)) { + if (!drm_mode_equal(&new_crtc_state->mode, high_mode)) set_freesync_fixed_config(dm_new_crtc_state); - } }
ret = dm_atomic_get_state(state, &dm_state); @@ -9458,6 +9452,7 @@ static bool should_reset_plane(struct dr */ for_each_oldnew_plane_in_state(state, other, old_other_state, new_other_state, i) { struct amdgpu_framebuffer *old_afb, *new_afb; + if (other->type == DRM_PLANE_TYPE_CURSOR) continue;
@@ -9556,11 +9551,12 @@ static int dm_check_cursor_fb(struct amd }
/* Core DRM takes care of checking FB modifiers, so we only need to - * check tiling flags when the FB doesn't have a modifier. */ + * check tiling flags when the FB doesn't have a modifier. + */ if (!(fb->flags & DRM_MODE_FB_MODIFIERS)) { if (adev->family < AMDGPU_FAMILY_AI) { linear = AMDGPU_TILING_GET(afb->tiling_flags, ARRAY_MODE) != DC_ARRAY_2D_TILED_THIN1 && - AMDGPU_TILING_GET(afb->tiling_flags, ARRAY_MODE) != DC_ARRAY_1D_TILED_THIN1 && + AMDGPU_TILING_GET(afb->tiling_flags, ARRAY_MODE) != DC_ARRAY_1D_TILED_THIN1 && AMDGPU_TILING_GET(afb->tiling_flags, MICRO_TILE_MODE) == 0; } else { linear = AMDGPU_TILING_GET(afb->tiling_flags, SWIZZLE_MODE) == 0; @@ -9782,12 +9778,12 @@ static int dm_check_crtc_cursor(struct d /* On DCE and DCN there is no dedicated hardware cursor plane. We get a * cursor per pipe but it's going to inherit the scaling and * positioning from the underlying pipe. Check the cursor plane's - * blending properties match the underlying planes'. */ + * blending properties match the underlying planes'. + */
new_cursor_state = drm_atomic_get_new_plane_state(state, cursor); - if (!new_cursor_state || !new_cursor_state->fb) { + if (!new_cursor_state || !new_cursor_state->fb) return 0; - }
dm_get_oriented_plane_size(new_cursor_state, &cursor_src_w, &cursor_src_h); cursor_scale_w = new_cursor_state->crtc_w * 1000 / cursor_src_w; @@ -9832,6 +9828,7 @@ static int add_affected_mst_dsc_crtcs(st struct drm_connector_state *conn_state, *old_conn_state; struct amdgpu_dm_connector *aconnector = NULL; int i; + for_each_oldnew_connector_in_state(state, connector, old_conn_state, conn_state, i) { if (!conn_state->crtc) conn_state = old_conn_state; @@ -10266,7 +10263,7 @@ static int amdgpu_dm_atomic_check(struct }
/* Store the overall update type for use later in atomic check. */ - for_each_new_crtc_in_state (state, crtc, new_crtc_state, i) { + for_each_new_crtc_in_state(state, crtc, new_crtc_state, i) { struct dm_crtc_state *dm_new_crtc_state = to_dm_crtc_state(new_crtc_state);
@@ -10288,7 +10285,7 @@ fail: else if (ret == -EINTR || ret == -EAGAIN || ret == -ERESTARTSYS) DRM_DEBUG_DRIVER("Atomic check stopped due to signal.\n"); else - DRM_DEBUG_DRIVER("Atomic check failed with err: %d \n", ret); + DRM_DEBUG_DRIVER("Atomic check failed with err: %d\n", ret);
trace_amdgpu_dm_atomic_check_finish(state, ret);
From: Wayne Lin wayne.lin@amd.com
commit 4f6d9e38c4d244ad106eb9ebd8c0e1215e866f35 upstream.
[Why] Specific TBT4 dock doesn't send out short HPD to notify source that IRQ event DOWN_REP_MSG_RDY is set. Which violates the spec and cause source can't send out streams to mst sinks.
[How] To cover this misbehavior, add an additional polling method to detect DOWN_REP_MSG_RDY is set. HPD driven handling method is still kept. Just hook up our handler to drm mgr->cbs->poll_hpd_irq().
Cc: Mario Limonciello mario.limonciello@amd.com Cc: Alex Deucher alexander.deucher@amd.com Cc: stable@vger.kernel.org Reviewed-by: Jerry Zuo jerry.zuo@amd.com Acked-by: Alan Liu haoping.liu@amd.com Signed-off-by: Wayne Lin wayne.lin@amd.com Tested-by: Daniel Wheeler daniel.wheeler@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 117 +++--------- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h | 7 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c | 110 +++++++++++ drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.h | 11 + 4 files changed, 159 insertions(+), 86 deletions(-)
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c @@ -1339,6 +1339,15 @@ static void dm_handle_hpd_rx_offload_wor if (amdgpu_in_reset(adev)) goto skip;
+ if (offload_work->data.bytes.device_service_irq.bits.UP_REQ_MSG_RDY || + offload_work->data.bytes.device_service_irq.bits.DOWN_REP_MSG_RDY) { + dm_handle_mst_sideband_msg_ready_event(&aconnector->mst_mgr, DOWN_OR_UP_MSG_RDY_EVENT); + spin_lock_irqsave(&offload_work->offload_wq->offload_lock, flags); + offload_work->offload_wq->is_handling_mst_msg_rdy_event = false; + spin_unlock_irqrestore(&offload_work->offload_wq->offload_lock, flags); + goto skip; + } + mutex_lock(&adev->dm.dc_lock); if (offload_work->data.bytes.device_service_irq.bits.AUTOMATED_TEST) { dc_link_dp_handle_automated_test(dc_link); @@ -3227,87 +3236,6 @@ static void handle_hpd_irq(void *param)
}
-static void dm_handle_mst_sideband_msg(struct amdgpu_dm_connector *aconnector) -{ - u8 esi[DP_PSR_ERROR_STATUS - DP_SINK_COUNT_ESI] = { 0 }; - u8 dret; - bool new_irq_handled = false; - int dpcd_addr; - int dpcd_bytes_to_read; - - const int max_process_count = 30; - int process_count = 0; - - const struct dc_link_status *link_status = dc_link_get_status(aconnector->dc_link); - - if (link_status->dpcd_caps->dpcd_rev.raw < 0x12) { - dpcd_bytes_to_read = DP_LANE0_1_STATUS - DP_SINK_COUNT; - /* DPCD 0x200 - 0x201 for downstream IRQ */ - dpcd_addr = DP_SINK_COUNT; - } else { - dpcd_bytes_to_read = DP_PSR_ERROR_STATUS - DP_SINK_COUNT_ESI; - /* DPCD 0x2002 - 0x2005 for downstream IRQ */ - dpcd_addr = DP_SINK_COUNT_ESI; - } - - dret = drm_dp_dpcd_read( - &aconnector->dm_dp_aux.aux, - dpcd_addr, - esi, - dpcd_bytes_to_read); - - while (dret == dpcd_bytes_to_read && - process_count < max_process_count) { - u8 ack[DP_PSR_ERROR_STATUS - DP_SINK_COUNT_ESI] = {}; - u8 retry; - - dret = 0; - - process_count++; - - DRM_DEBUG_DRIVER("ESI %02x %02x %02x\n", esi[0], esi[1], esi[2]); - /* handle HPD short pulse irq */ - if (aconnector->mst_mgr.mst_state) - drm_dp_mst_hpd_irq_handle_event(&aconnector->mst_mgr, - esi, - ack, - &new_irq_handled); - - if (new_irq_handled) { - /* ACK at DPCD to notify down stream */ - for (retry = 0; retry < 3; retry++) { - ssize_t wret; - - wret = drm_dp_dpcd_writeb(&aconnector->dm_dp_aux.aux, - dpcd_addr + 1, - ack[1]); - if (wret == 1) - break; - } - - if (retry == 3) { - DRM_ERROR("Failed to ack MST event.\n"); - return; - } - - drm_dp_mst_hpd_irq_send_new_request(&aconnector->mst_mgr); - /* check if there is new irq to be handled */ - dret = drm_dp_dpcd_read( - &aconnector->dm_dp_aux.aux, - dpcd_addr, - esi, - dpcd_bytes_to_read); - - new_irq_handled = false; - } else { - break; - } - } - - if (process_count == max_process_count) - DRM_DEBUG_DRIVER("Loop exceeded max iterations\n"); -} - static void schedule_hpd_rx_offload_work(struct hpd_rx_irq_offload_work_queue *offload_wq, union hpd_irq_data hpd_irq_data) { @@ -3369,7 +3297,23 @@ static void handle_hpd_rx_irq(void *para if (dc_link_dp_allow_hpd_rx_irq(dc_link)) { if (hpd_irq_data.bytes.device_service_irq.bits.UP_REQ_MSG_RDY || hpd_irq_data.bytes.device_service_irq.bits.DOWN_REP_MSG_RDY) { - dm_handle_mst_sideband_msg(aconnector); + bool skip = false; + + /* + * DOWN_REP_MSG_RDY is also handled by polling method + * mgr->cbs->poll_hpd_irq() + */ + spin_lock(&offload_wq->offload_lock); + skip = offload_wq->is_handling_mst_msg_rdy_event; + + if (!skip) + offload_wq->is_handling_mst_msg_rdy_event = true; + + spin_unlock(&offload_wq->offload_lock); + + if (!skip) + schedule_hpd_rx_offload_work(offload_wq, hpd_irq_data); + goto out; }
@@ -3478,11 +3422,11 @@ static void register_hpd_handlers(struct amdgpu_dm_irq_register_interrupt(adev, &int_params, handle_hpd_rx_irq, (void *) aconnector); - - if (adev->dm.hpd_rx_offload_wq) - adev->dm.hpd_rx_offload_wq[dc_link->link_index].aconnector = - aconnector; } + + if (adev->dm.hpd_rx_offload_wq) + adev->dm.hpd_rx_offload_wq[connector->index].aconnector = + aconnector; } }
@@ -7235,6 +7179,7 @@ void amdgpu_dm_connector_init_helper(str aconnector->as_type = ADAPTIVE_SYNC_TYPE_NONE; memset(&aconnector->vsdb_info, 0, sizeof(aconnector->vsdb_info)); mutex_init(&aconnector->hpd_lock); + mutex_init(&aconnector->handle_mst_msg_ready);
/* * configure support HPD hot plug connector_>polled default value is 0 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h @@ -195,6 +195,11 @@ struct hpd_rx_irq_offload_work_queue { */ bool is_handling_link_loss; /** + * @is_handling_mst_msg_rdy_event: Used to prevent inserting mst message + * ready event when we're already handling mst message ready event + */ + bool is_handling_mst_msg_rdy_event; + /** * @aconnector: The aconnector that this work queue is attached to */ struct amdgpu_dm_connector *aconnector; @@ -638,6 +643,8 @@ struct amdgpu_dm_connector { struct drm_dp_mst_port *mst_output_port; struct amdgpu_dm_connector *mst_root; struct drm_dp_aux *dsc_aux; + struct mutex handle_mst_msg_ready; + /* TODO see if we can merge with ddc_bus or make a dm_connector */ struct amdgpu_i2c_adapter *i2c;
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c @@ -619,8 +619,118 @@ dm_dp_add_mst_connector(struct drm_dp_ms return connector; }
+void dm_handle_mst_sideband_msg_ready_event( + struct drm_dp_mst_topology_mgr *mgr, + enum mst_msg_ready_type msg_rdy_type) +{ + uint8_t esi[DP_PSR_ERROR_STATUS - DP_SINK_COUNT_ESI] = { 0 }; + uint8_t dret; + bool new_irq_handled = false; + int dpcd_addr; + uint8_t dpcd_bytes_to_read; + const uint8_t max_process_count = 30; + uint8_t process_count = 0; + u8 retry; + struct amdgpu_dm_connector *aconnector = + container_of(mgr, struct amdgpu_dm_connector, mst_mgr); + + + const struct dc_link_status *link_status = dc_link_get_status(aconnector->dc_link); + + if (link_status->dpcd_caps->dpcd_rev.raw < 0x12) { + dpcd_bytes_to_read = DP_LANE0_1_STATUS - DP_SINK_COUNT; + /* DPCD 0x200 - 0x201 for downstream IRQ */ + dpcd_addr = DP_SINK_COUNT; + } else { + dpcd_bytes_to_read = DP_PSR_ERROR_STATUS - DP_SINK_COUNT_ESI; + /* DPCD 0x2002 - 0x2005 for downstream IRQ */ + dpcd_addr = DP_SINK_COUNT_ESI; + } + + mutex_lock(&aconnector->handle_mst_msg_ready); + + while (process_count < max_process_count) { + u8 ack[DP_PSR_ERROR_STATUS - DP_SINK_COUNT_ESI] = {}; + + process_count++; + + dret = drm_dp_dpcd_read( + &aconnector->dm_dp_aux.aux, + dpcd_addr, + esi, + dpcd_bytes_to_read); + + if (dret != dpcd_bytes_to_read) { + DRM_DEBUG_KMS("DPCD read and acked number is not as expected!"); + break; + } + + DRM_DEBUG_DRIVER("ESI %02x %02x %02x\n", esi[0], esi[1], esi[2]); + + switch (msg_rdy_type) { + case DOWN_REP_MSG_RDY_EVENT: + /* Only handle DOWN_REP_MSG_RDY case*/ + esi[1] &= DP_DOWN_REP_MSG_RDY; + break; + case UP_REQ_MSG_RDY_EVENT: + /* Only handle UP_REQ_MSG_RDY case*/ + esi[1] &= DP_UP_REQ_MSG_RDY; + break; + default: + /* Handle both cases*/ + esi[1] &= (DP_DOWN_REP_MSG_RDY | DP_UP_REQ_MSG_RDY); + break; + } + + if (!esi[1]) + break; + + /* handle MST irq */ + if (aconnector->mst_mgr.mst_state) + drm_dp_mst_hpd_irq_handle_event(&aconnector->mst_mgr, + esi, + ack, + &new_irq_handled); + + if (new_irq_handled) { + /* ACK at DPCD to notify down stream */ + for (retry = 0; retry < 3; retry++) { + ssize_t wret; + + wret = drm_dp_dpcd_writeb(&aconnector->dm_dp_aux.aux, + dpcd_addr + 1, + ack[1]); + if (wret == 1) + break; + } + + if (retry == 3) { + DRM_ERROR("Failed to ack MST event.\n"); + return; + } + + drm_dp_mst_hpd_irq_send_new_request(&aconnector->mst_mgr); + + new_irq_handled = false; + } else { + break; + } + } + + mutex_unlock(&aconnector->handle_mst_msg_ready); + + if (process_count == max_process_count) + DRM_DEBUG_DRIVER("Loop exceeded max iterations\n"); +} + +static void dm_handle_mst_down_rep_msg_ready(struct drm_dp_mst_topology_mgr *mgr) +{ + dm_handle_mst_sideband_msg_ready_event(mgr, DOWN_REP_MSG_RDY_EVENT); +} + static const struct drm_dp_mst_topology_cbs dm_mst_cbs = { .add_connector = dm_dp_add_mst_connector, + .poll_hpd_irq = dm_handle_mst_down_rep_msg_ready, };
void amdgpu_dm_initialize_dp_connector(struct amdgpu_display_manager *dm, --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.h +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.h @@ -49,6 +49,13 @@ #define PBN_FEC_OVERHEAD_MULTIPLIER_8B_10B 1031 #define PBN_FEC_OVERHEAD_MULTIPLIER_128B_132B 1000
+enum mst_msg_ready_type { + NONE_MSG_RDY_EVENT = 0, + DOWN_REP_MSG_RDY_EVENT = 1, + UP_REQ_MSG_RDY_EVENT = 2, + DOWN_OR_UP_MSG_RDY_EVENT = 3 +}; + struct amdgpu_display_manager; struct amdgpu_dm_connector;
@@ -61,6 +68,10 @@ void amdgpu_dm_initialize_dp_connector(s void dm_dp_create_fake_mst_encoders(struct amdgpu_device *adev);
+void dm_handle_mst_sideband_msg_ready_event( + struct drm_dp_mst_topology_mgr *mgr, + enum mst_msg_ready_type msg_rdy_type); + struct dsc_mst_fairness_vars { int pbn; bool dsc_enabled;
On Tue, 25 Jul 2023 12:42:47 +0200, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 6.4.7 release. There are 227 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Thu, 27 Jul 2023 10:44:26 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.4.7-rc1.g... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.4.y and the diffstat can be found below.
thanks,
greg k-h
All tests passing for Tegra ...
Test results for stable-v6.4: 11 builds: 11 pass, 0 fail 28 boots: 28 pass, 0 fail 130 tests: 130 pass, 0 fail
Linux version: 6.4.7-rc1-g3c19c5641cce Boards tested: tegra124-jetson-tk1, tegra186-p2771-0000, tegra194-p2972-0000, tegra194-p3509-0000+p3668-0000, tegra20-ventana, tegra210-p2371-2180, tegra210-p3450-0000, tegra30-cardhu-a04
Tested-by: Jon Hunter jonathanh@nvidia.com
Jon
Hi,
On Tue, 25 Jul 2023 12:42:47 +0200 Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 6.4.7 release. There are 227 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Thu, 27 Jul 2023 10:44:26 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.4.7-rc1.g... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.4.y and the diffstat can be found below.
This rc kernel passes DAMON functionality test[1] on my test machine. Attaching the test results summary below. Please note that I retrieved the kernel from linux-stable-rc tree[2].
Tested-by: SeongJae Park sj@kernel.org
[1] https://github.com/awslabs/damon-tests/tree/next/corr [2] commit 3c19c5641cce ("Linux 6.4.7-rc1")
Thanks, SJ
[...]
---
ok 1 selftests: damon: debugfs_attrs.sh ok 2 selftests: damon: debugfs_schemes.sh ok 3 selftests: damon: debugfs_target_ids.sh ok 4 selftests: damon: debugfs_empty_targets.sh ok 5 selftests: damon: debugfs_huge_count_read_write.sh ok 6 selftests: damon: debugfs_duplicate_context_creation.sh ok 7 selftests: damon: debugfs_rm_non_contexts.sh ok 8 selftests: damon: sysfs.sh ok 9 selftests: damon: sysfs_update_removed_scheme_dir.sh ok 10 selftests: damon: reclaim.sh ok 11 selftests: damon: lru_sort.sh ok 1 selftests: damon-tests: kunit.sh ok 2 selftests: damon-tests: huge_count_read_write.sh ok 3 selftests: damon-tests: buffer_overflow.sh ok 4 selftests: damon-tests: rm_contexts.sh ok 5 selftests: damon-tests: record_null_deref.sh ok 6 selftests: damon-tests: dbgfs_target_ids_read_before_terminate_race.sh ok 7 selftests: damon-tests: dbgfs_target_ids_pid_leak.sh ok 8 selftests: damon-tests: damo_tests.sh ok 9 selftests: damon-tests: masim-record.sh ok 10 selftests: damon-tests: build_i386.sh ok 11 selftests: damon-tests: build_m68k.sh ok 12 selftests: damon-tests: build_arm64.sh ok 13 selftests: damon-tests: build_i386_idle_flag.sh ok 14 selftests: damon-tests: build_i386_highpte.sh ok 15 selftests: damon-tests: build_nomemcg.sh
PASS
On 7/25/23 04:42, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 6.4.7 release. There are 227 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Thu, 27 Jul 2023 10:44:26 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.4.7-rc1.g... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.4.y and the diffstat can be found below.
thanks,
greg k-h
Compiled and booted on my test system. No dmesg regressions.
Tested-by: Shuah Khan skhan@linuxfoundation.org
thanks, -- Shuah
On 7/25/23 03:42, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 6.4.7 release. There are 227 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Thu, 27 Jul 2023 10:44:26 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.4.7-rc1.g... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.4.y and the diffstat can be found below.
thanks,
greg k-h
On ARCH_BRCMSTB using 32-bit and 64-bit ARM kernels, build tested on BMIPS_GENERIC:
Tested-by: Florian Fainelli florian.fainelli@broadcom.com
On Tue, Jul 25, 2023 at 12:42:47PM +0200, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 6.4.7 release. There are 227 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Successfully compiled and installed bindeb-pkgs on my computer (Acer Aspire E15, Intel Core i3 Haswell). No noticeable regressions.
Tested-by: Bagas Sanjaya bagasdotme@gmail.com
Hey Greg,
Ran tests and boot tested on my system, no regressions found
Tested-by: Fenil Jain fkjainco@gmail.com
On Tue, Jul 25, 2023 at 12:42:47PM +0200, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 6.4.7 release. There are 227 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Tested-by: Conor Dooley conor.dooley@microchip.com
Thanks, Conor.
On 7/25/23 3:42 AM, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 6.4.7 release. There are 227 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Thu, 27 Jul 2023 10:44:26 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.4.7-rc1.g... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.4.y and the diffstat can be found below.
thanks,
greg k-h
Built and booted successfully on RISC-V RV64 (HiFive Unmatched).
Tested-by: Ron Economos re@w6rz.net
On Tue, 25 Jul 2023 at 16:19, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 6.4.7 release. There are 227 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Thu, 27 Jul 2023 10:44:26 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.4.7-rc1.g... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.4.y and the diffstat can be found below.
thanks,
greg k-h
Results from Linaro’s test farm. No regressions on arm64, arm, x86_64, and i386.
Tested-by: Linux Kernel Functional Testing lkft@linaro.org
## Build * kernel: 6.4.7-rc1 * git: https://gitlab.com/Linaro/lkft/mirrors/stable/linux-stable-rc * git branch: linux-6.4.y * git commit: 3c19c5641cce21ec84a7d62be76d53f454531f48 * git describe: v6.4.6-228-g3c19c5641cce * test details: https://qa-reports.linaro.org/lkft/linux-stable-rc-linux-6.4.y/build/v6.4.6-...
## Test Regressions (compared to v6.4.5)
## Metric Regressions (compared to v6.4.5)
## Test Fixes (compared to v6.4.5)
## Metric Fixes (compared to v6.4.5)
## Test result summary total: 166993, pass: 145128, fail: 2201, skip: 19509, xfail: 155
## Build Summary * arc: 5 total, 5 passed, 0 failed * arm: 141 total, 141 passed, 0 failed * arm64: 50 total, 50 passed, 0 failed * i386: 37 total, 37 passed, 0 failed * mips: 26 total, 26 passed, 0 failed * parisc: 3 total, 3 passed, 0 failed * powerpc: 34 total, 34 passed, 0 failed * riscv: 22 total, 22 passed, 0 failed * s390: 12 total, 12 passed, 0 failed * sh: 12 total, 12 passed, 0 failed * sparc: 6 total, 6 passed, 0 failed * x86_64: 42 total, 42 passed, 0 failed
## Test suites summary * boot * kselftest-android * kselftest-arm64 * kselftest-breakpoints * kselftest-capabilities * kselftest-cgroup * kselftest-clone3 * kselftest-core * kselftest-cpu-hotplug * kselftest-cpufreq * kselftest-drivers-dma-buf * kselftest-efivarfs * kselftest-exec * kselftest-filesystems * kselftest-filesystems-binderfs * kselftest-filesystems-epoll * kselftest-firmware * kselftest-fpu * kselftest-ftrace * kselftest-futex * kselftest-gpio * kselftest-intel_pstate * kselftest-ipc * kselftest-ir * kselftest-kcmp * kselftest-kexec * kselftest-kvm * kselftest-lib * kselftest-livepatch * kselftest-membarrier * kselftest-memfd * kselftest-memory-hotplug * kselftest-mincore * kselftest-mount * kselftest-mqueue * kselftest-net * kselftest-net-forwarding * kselftest-net-mptcp * kselftest-netfilter * kselftest-nsfs * kselftest-openat2 * kselftest-pid_namespace * kselftest-pidfd * kselftest-proc * kselftest-pstore * kselftest-ptrace * kselftest-rseq * kselftest-rtc * kselftest-seccomp * kselftest-sigaltstack * kselftest-size * kselftest-splice * kselftest-static_keys * kselftest-sync * kselftest-sysctl * kselftest-tc-testing * kselftest-timens * kselftest-timers * kselftest-tmpfs * kselftest-tpm2 * kselftest-user * kselftest-user_events * kselftest-vDSO * kselftest-vm * kselftest-watchdog * kselftest-x86 * kselftest-zram * kunit * kvm-unit-tests * libgpiod * libhugetlbfs * log-parser-boot * log-parser-test * ltp-cap_bounds * ltp-commands * ltp-containers * ltp-controllers * ltp-cpuhotplug * ltp-crypto * ltp-cve * ltp-dio * ltp-fcntl-locktests * ltp-filecaps * ltp-fs * ltp-fs_bind * ltp-fs_perms_simple * ltp-fsx * ltp-hugetlb * ltp-io * ltp-ipc * ltp-math * ltp-mm * ltp-nptl * ltp-pty * ltp-sched * ltp-securebits * ltp-smoke * ltp-syscalls * ltp-tracing * network-basic-tests * perf * rcutorture * v4l2-compliance
-- Linaro LKFT https://lkft.linaro.org
On Tue, Jul 25, 2023 at 12:42:47PM +0200, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 6.4.7 release. There are 227 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Thu, 27 Jul 2023 10:44:26 +0000. Anything received after that time might be too late.
Build results: total: 157 pass: 157 fail: 0 Qemu test results: total: 522 pass: 522 fail: 0
Tested-by: Guenter Roeck linux@roeck-us.net
Guenter
On Tue, Jul 25, 2023 at 12:42:47PM +0200, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 6.4.7 release. There are 227 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Thu, 27 Jul 2023 10:44:26 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.4.7-rc1.g... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.4.y and the diffstat can be found below.
I saw this when running rcutorture, this one happened in the TREE04 configuration. This is likely due to the stuttering issues we are discussing in the other thread. Anyway I am just making a note here while I am continuing to look into it.
Other than that, all tests pass: Tested-by: Joel Fernandes (Google) joel@joelfernandes.org
[ 1676.206713] ------------[ cut here ]------------ [ 1676.213985] rcutorture_oom_notify invoked upon OOM during forward-progress testing. [ 1676.224945] WARNING: CPU: 7 PID: 103 at kernel/rcu/rcutorture.c:2841 rcutorture_oom_notify+0x3c/0x1d0 [ 1676.238323] Modules linked in: [ 1676.242750] CPU: 7 PID: 103 Comm: rcu_torture_fwd Not tainted 6.4.7-rc1-g3c19c5641cce #6 [ 1676.254378] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014 [ 1676.268003] RIP: 0010:rcutorture_oom_notify+0x3c/0x1d0 [ 1676.275468] Code: d5 53 e8 e7 23 d4 00 48 8b 1d 70 34 45 02 48 85 db 0f 84 88 01 00 00 48 c7 c6 e0 f6 a0 b2 48 c7 c7 88 91 ee b2 e8 14 25 f7 ff <0f> 0b 8b 35 8c d8 a2 01 85 f6 7e 40 45 31 ed 4d 63 e5 41 83 c5 01 [ 1676.302738] RSP: 0000:ffffa7c6c0397a98 EFLAGS: 00010282 [ 1676.310984] RAX: 0000000000000000 RBX: ffff897a418cc000 RCX: 00000000ffffdfff [ 1676.322207] RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000009ffb [ 1676.333232] RBP: ffffa7c6c0397b28 R08: 00000000ffffdfff R09: 00000000ffffdfff [ 1676.342365] R10: ffffffffb32591e0 R11: ffffffffb32591e0 R12: 0000000000000000 [ 1676.352563] R13: ffffa7c6c0397b28 R14: 00000000ffffffff R15: 0000000000000000 [ 1676.362721] FS: 0000000000000000(0000) GS:ffff897a5f5c0000(0000) knlGS:0000000000000000 [ 1676.374816] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1676.383256] CR2: 0000000000000000 CR3: 000000001e22e000 CR4: 00000000000006e0 [ 1676.392499] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 1676.401739] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 1676.410804] Call Trace: [ 1676.414279] <TASK> [ 1676.417140] ? rcutorture_oom_notify+0x3c/0x1d0 [ 1676.422944] ? __warn+0x7c/0x120 [ 1676.427146] ? rcutorture_oom_notify+0x3c/0x1d0 [ 1676.432902] ? report_bug+0x15d/0x180 [ 1676.437783] ? handle_bug+0x3c/0x70 [ 1676.442369] ? exc_invalid_op+0x17/0x70 [ 1676.447269] ? asm_exc_invalid_op+0x1a/0x20 [ 1676.452574] ? rcutorture_oom_notify+0x3c/0x1d0 [ 1676.458128] ? rcutorture_oom_notify+0x3c/0x1d0 [ 1676.463880] notifier_call_chain+0x55/0xb0 [ 1676.469255] blocking_notifier_call_chain+0x3a/0x60 [ 1676.475244] out_of_memory+0x3bc/0x710 [ 1676.480323] __alloc_pages_slowpath.constprop.0+0xbb6/0xd00 [ 1676.487347] __alloc_pages+0x2cb/0x2e0 [ 1676.492200] allocate_slab+0x348/0x3e0 [ 1676.496983] ? sysvec_reschedule_ipi+0x31/0xd0 [ 1676.502607] ___slab_alloc+0x2d8/0x7a0 [ 1676.507406] ? rcu_torture_fwd_prog+0x3d8/0xa60 [ 1676.513157] ? asm_sysvec_apic_timer_interrupt+0x1a/0x20 [ 1676.519767] ? rcu_nocb_do_flush_bypass+0xc6/0x110 [ 1676.525875] ? rcu_torture_fwd_prog+0x3d8/0xa60 [ 1676.531607] __kmem_cache_alloc_node+0x183/0x1a0 [ 1676.537506] kmalloc_trace+0x25/0x90 [ 1676.542240] rcu_torture_fwd_prog+0x3d8/0xa60 [ 1676.547800] ? __pfx_rcu_torture_fwd_prog+0x10/0x10 [ 1676.554051] ? kthread+0xcb/0xf0 [ 1676.558286] ? __pfx_rcu_torture_fwd_prog+0x10/0x10 [ 1676.564594] kthread+0xcb/0xf0 [ 1676.568731] ? __pfx_kthread+0x10/0x10 [ 1676.573590] ret_from_fork+0x2c/0x50 [ 1676.578317] </TASK> [ 1676.581240] ---[ end trace 0000000000000000 ]---
thanks,
- Joel
thanks,
greg k-h
Pseudo-Shortlog of commits:
Greg Kroah-Hartman gregkh@linuxfoundation.org Linux 6.4.7-rc1
Wayne Lin wayne.lin@amd.com drm/amd/display: Add polling method to handle MST reply packet
Srinivasan Shanmugam srinivasan.shanmugam@amd.com drm/amd/display: Clean up errors & warnings in amdgpu_dm.c
Yu Kuai yukuai3@huawei.com scsi: sg: Fix checking return value of blk_get_queue()
Yu Kuai yukuai3@huawei.com scsi/sg: don't grab scsi host module reference
Abe Kohandel abe.kohandel@intel.com spi: dw: Remove misleading comment for Mount Evans SoC
Yunxiang Li Yunxiang.Li@amd.com drm/ttm: fix bulk_move corruption when adding a entry
Mohamed Khalfella mkhalfella@purestorage.com tracing/histograms: Return an error if we fail to add histogram to hist_vars list
Miguel Ojeda ojeda@kernel.org kbuild: rust: avoid creating temporary files
Zhang Yi yi.zhang@huawei.com jbd2: recheck chechpointing non-dirty buffer
Vladimir Oltean vladimir.oltean@nxp.com net: phy: prevent stale pointer dereference in phy_init()
Eric Dumazet edumazet@google.com tcp: annotate data-races around fastopenq.max_qlen
Eric Dumazet edumazet@google.com tcp: annotate data-races around icsk->icsk_user_timeout
Eric Dumazet edumazet@google.com tcp: annotate data-races around tp->notsent_lowat
Eric Dumazet edumazet@google.com tcp: annotate data-races around rskq_defer_accept
Eric Dumazet edumazet@google.com tcp: annotate data-races around tp->linger2
Eric Dumazet edumazet@google.com tcp: annotate data-races around icsk->icsk_syn_retries
Eric Dumazet edumazet@google.com tcp: annotate data-races around tp->keepalive_probes
Eric Dumazet edumazet@google.com tcp: annotate data-races around tp->keepalive_intvl
Eric Dumazet edumazet@google.com tcp: annotate data-races around tp->keepalive_time
Eric Dumazet edumazet@google.com tcp: annotate data-races around tp->tsoffset
Eric Dumazet edumazet@google.com tcp: annotate data-races around tp->tcp_tx_delay
Tomasz Moń tomasz.mon@nordicsemi.no Bluetooth: btusb: Fix bluetooth on Intel Macbook 2014
Pauli Virtanen pav@iki.fi Bluetooth: SCO: fix sco_conn related locking and validity issues
Siddh Raman Pant code@siddh.me Bluetooth: hci_conn: return ERR_PTR instead of NULL when there is no link
Douglas Anderson dianders@chromium.org Bluetooth: hci_sync: Avoid use-after-free in dbg for hci_remove_adv_monitor()
Pauli Virtanen pav@iki.fi Bluetooth: ISO: fix iso_conn related locking and validity issues
Pauli Virtanen pav@iki.fi Bluetooth: hci_event: call disconnect callback before deleting conn
Pauli Virtanen pav@iki.fi Bluetooth: use RCU for hci_conn_params and iterate safely in hci_sync
Pablo Neira Ayuso pablo@netfilter.org netfilter: nf_tables: skip bound chain on rule flush
Pablo Neira Ayuso pablo@netfilter.org netfilter: nf_tables: skip bound chain in netns release path
Florian Westphal fw@strlen.de netfilter: nft_set_pipapo: fix improper element removal
Florian Westphal fw@strlen.de netfilter: nf_tables: can't schedule in nft_chain_validate
Florian Westphal fw@strlen.de netfilter: nf_tables: fix spurious set element insertion failure
Vitaly Rodionov vitalyr@opensource.cirrus.com ALSA: hda/realtek: Fix generic fixup definition for cs35l41 amp
Kuniyuki Iwashima kuniyu@amazon.com llc: Don't drop packet from non-root netns.
Zhang Shurong zhang_shurong@foxmail.com fbdev: au1200fb: Fix missing IRQ check in au1200fb_drv_probe
Daniel Golle daniel@makrotopia.org net: ethernet: mtk_eth_soc: always mtk_get_ib1_pkt_type
Kuniyuki Iwashima kuniyu@amazon.com Revert "tcp: avoid the lookup process failing to get sk in ehash table"
Yuanjun Gong ruc_gongyuanjun@163.com net:ipv6: check return value of pskb_trim()
Wang Ming machel@vivo.com net: ipv4: Use kfree_sensitive instead of kfree
Eric Dumazet edumazet@google.com tcp: annotate data-races around tcp_rsk(req)->ts_recent
Eric Dumazet edumazet@google.com tcp: annotate data-races around tcp_rsk(req)->txhash
Antoine Tenart atenart@kernel.org net: ipv4: use consistent txhash in TIME_WAIT and SYN_RECV
Florian Kauer florian.kauer@linutronix.de igc: Prevent garbled TX queue with XDP ZEROCOPY
Kurt Kanzenbach kurt@linutronix.de igc: Avoid transmit queue timeout for XDP
Alexander Duyck alexanderduyck@fb.com bpf, arm64: Fix BTI type used for freplace attached functions
Kumar Kartikeya Dwivedi memxor@gmail.com bpf: Repeat check_max_stack_depth for async callbacks
Kumar Kartikeya Dwivedi memxor@gmail.com bpf: Fix subprog idx logic in check_max_stack_depth
Geetha sowjanya gakula@marvell.com octeontx2-pf: Dont allocate BPIDs for LBK interfaces
Ido Schimmel idosch@nvidia.com vrf: Fix lockdep splat in output path
Jiapeng Chong jiapeng.chong@linux.alibaba.com security: keys: Modify mismatched function name
Ahmed Zaki ahmed.zaki@intel.com iavf: fix reset task race with iavf_remove()
Ahmed Zaki ahmed.zaki@intel.com iavf: fix a deadlock caused by rtnl and driver's lock circular dependencies
Marcin Szycik marcin.szycik@linux.intel.com iavf: Wait for reset in callbacks which trigger it
Przemek Kitszel przemyslaw.kitszel@intel.com iavf: make functions static where possible
Ahmed Zaki ahmed.zaki@intel.com iavf: use internal state to free traffic IRQs
Ding Hui dinghui@sangfor.com.cn iavf: Fix out-of-bounds when setting channels on remove
Ding Hui dinghui@sangfor.com.cn iavf: Fix use-after-free in free_netdev
Andrzej Hajda andrzej.hajda@intel.com drm/i915/perf: add sentinel to xehp_oa_b_counters
Heiner Kallweit hkallweit1@gmail.com r8169: fix ASPM-related problem for chip version 42 and 43
Tristram Ha Tristram.Ha@microchip.com net: dsa: microchip: correct KSZ8795 static MAC table access
Victor Nogueira victor@mojatatu.com net: sched: cls_bpf: Undo tcf_bind_filter in case of an error
Victor Nogueira victor@mojatatu.com net: sched: cls_u32: Undo refcount decrement in case update failed
Victor Nogueira victor@mojatatu.com net: sched: cls_u32: Undo tcf_bind_filter if u32_replace_hw_knode
Victor Nogueira victor@mojatatu.com net: sched: cls_matchall: Undo tcf_bind_filter in case of failure after mall_set_parms
Martin Fuzzey martin.fuzzey@flowbird.group regulator: da9063: fix null pointer deref with partial DT config
Dan Carpenter dan.carpenter@linaro.org ASoC: SOF: ipc3-dtrace: uninitialized data in dfsentry_trace_filter_write()
Michal Swiatkowski michal.swiatkowski@linux.intel.com ice: prevent NULL pointer deref during reload
Petr Oros poros@redhat.com ice: Unregister netdev and devlink_port only once
Shyam Prasad N nspmangalore@gmail.com cifs: fix mid leak during reconnection after timeout threshold
Dan Carpenter error27@gmail.com iommu/sva: Fix signedness bug in iommu_sva_alloc_pasid()
Yan Zhai yan@cloudflare.com gso: fix dodgy bit handling for GSO_UDP_L4
Daniel Golle daniel@makrotopia.org net: ethernet: mtk_eth_soc: handle probe deferral
Kuniyuki Iwashima kuniyu@amazon.com bridge: Add extack warning when enabling STP in netns.
Tanmay Patil t-patil@ti.com net: ethernet: ti: cpsw_ale: Fix cpsw_ale_get_field()/cpsw_ale_set_field()
Linus Walleij linus.walleij@linaro.org dsa: mv88e6xxx: Do a final check before timing out
Marc Zyngier maz@kernel.org arm64: Fix HFGxTR_EL2 field naming
Paulo Alcantara pc@manguebit.com smb: client: fix missed ses refcounting
Yonghong Song yhs@fb.com kallsyms: strip LTO-only suffixes from promoted global functions
Jaewon Kim jaewon02.kim@samsung.com spi: s3c64xx: clear loopback bit after loopback test
Christoph Hellwig hch@lst.de btrfs: be a bit more careful when setting mirror_num_ret in btrfs_map_block
James Clark james.clark@arm.com perf build: Fix library not found error when using CSLIBS
Yangtao Li frank.li@vivo.com fbdev: imxfb: Removed unneeded release_mem_region
Martin Kaiser martin@kaiser.cx fbdev: imxfb: warn about invalid left/right margin
Jonas Gorski jonas.gorski@gmail.com spi: bcm63xx: fix max prepend length
Biju Das biju.das.jz@bp.renesas.com pinctrl: renesas: rzg2l: Handle non-unique subnode names
Geert Uytterhoeven geert+renesas@glider.be pinctrl: renesas: rzv2m: Handle non-unique subnode names
Suren Baghdasaryan surenb@google.com sched/psi: use kernfs polling functions for PSI trigger polling
Miaohe Lin linmiaohe@huawei.com sched/fair: Use recent_used_cpu to test p->cpus_ptr
Peter Zijlstra peterz@infradead.org iov_iter: Mark copy_iovec_from_user() noclone
Srinivas Kandagatla srinivas.kandagatla@linaro.org ASoC: qcom: q6apm: do not close GPR port before closing graph
Srinivas Kandagatla srinivas.kandagatla@linaro.org ASoC: codecs: wcd938x: fix dB range for HPHL and HPHR
Johan Hovold johan+linaro@kernel.org ASoC: codecs: wcd938x: fix mbhc impedance loglevel
Vijendar Mukunda Vijendar.Mukunda@amd.com ASoC: amd: acp: fix for invalid dai id handling in acp_get_byte_count()
Hao Chen chenhao418@huawei.com net: hns3: fix strncpy() not using dest-buf length as length issue
Ying Hsu yinghsu@chromium.org igb: Fix igb_down hung on surprise removal
Yi Kuo yi@yikuo.dev wifi: iwlwifi: pcie: add device id 51F1 for killer 1675
Johannes Berg johannes.berg@intel.com wifi: iwlwifi: mvm: avoid baid size integer overflow
Mukesh Sisodiya mukesh.sisodiya@intel.com wifi: iwlwifi: Add support for new PCI Id
Gustavo A. R. Silva gustavoars@kernel.org wifi: wext-core: Fix -Wstringop-overflow warning in ioctl_standard_iw_point()
Mukesh Sisodiya mukesh.sisodiya@intel.com wifi: iwlwifi: mvm: Add NULL check before dereferencing the pointer
Petr Oros poros@redhat.com devlink: report devlink_port_type_warn source device
Jisheng Zhang jszhang@kernel.org net: ethernet: litex: add support for 64 bit stats
Gregory Greenman gregory.greenman@intel.com wifi: iwlwifi: mvm: fix potential array out of bounds access
P Praneesh quic_ppranees@quicinc.com wifi: ath11k: fix memory leak in WMI firmware stats
Balamurugan S quic_bselvara@quicinc.com wifi: ath12k: Avoid NULL pointer access during management transmit cleanup
Abe Kohandel abe.kohandel@intel.com spi: dw: Add compatible for Intel Mount Evans SoC
Ilan Peer ilan.peer@intel.com wifi: mac80211_hwsim: Fix possible NULL dereference
Wen Gong quic_wgong@quicinc.com wifi: ath11k: add support default regdb while searching board-2.bin for WCN6855
Jakub Kicinski kuba@kernel.org devlink: make health report on unregistered instance warn just once
Yonghong Song yhs@fb.com bpf: Silence a warning in btf_type_id_size()
Martin Blumenstingl martin.blumenstingl@googlemail.com wifi: rtw88: sdio: Check the HISR RX_REQUEST bit in rtw_sdio_rx_isr()
Aditi Ghag aditi.ghag@isovalent.com bpf: tcp: Avoid taking fast sock lock in iterator
Andrii Nakryiko andrii@kernel.org bpf: drop unnecessary user-triggerable WARN_ONCE in verifierl log
Brad Larson blarson@amd.com spi: cadence-quadspi: Add compatible for AMD Pensando Elba SoC
Martin KaFai Lau martin.lau@kernel.org bpf: Address KCSAN report on bpf_lru_list
Kui-Feng Lee thinker.li@gmail.com bpf: Print a warning only if writing to unprivileged_bpf_disabled.
Maxime Bizon mbizon@freebox.fr wifi: ath11k: fix registration of 6Ghz-only phy without the full channel range
Yicong Yang yangyicong@hisilicon.com sched/fair: Don't balance task to its current running CPU
Thomas Weißschuh linux@weissschuh.net tools/nolibc: ensure stack protector guard is never zero
Paul E. McKenney paulmck@kernel.org rcu: Mark additional concurrent load from ->cpu_no_qs.b.exp
Shigeru Yoshida syoshida@redhat.com rcu-tasks: Avoid pr_info() with spin lock in cblist_init_generic()
Hans de Goede hdegoede@redhat.com ACPI: video: Add backlight=native DMI quirk for Dell Studio 1569
Mark Rutland mark.rutland@arm.com arm64: mm: fix VA-range sanity check
Youngmin Nam youngmin.nam@samsung.com arm64: set __exception_irq_entry with __irq_entry as a default
Mario Limonciello mario.limonciello@amd.com ACPI: resource: Remove "Zen" specific match and quirks
Hans de Goede hdegoede@redhat.com ACPI: video: Add backlight=native DMI quirk for Lenovo ThinkPad X131e (3371 AMD version)
Hans de Goede hdegoede@redhat.com ACPI: video: Add backlight=native DMI quirk for Apple iMac11,3
Hans de Goede hdegoede@redhat.com ACPI: x86: Add ACPI_QUIRK_UART1_SKIP for Lenovo Yoga Book yb1-x90f/l
Hans de Goede hdegoede@redhat.com ACPI: button: Add lid disable DMI quirk for Nextbook Ares 8A
Hans de Goede hdegoede@redhat.com ACPI: x86: Add skip i2c clients quirk for Nextbook Ares 8A
Sandeep Dhavale dhavale@google.com erofs: Fix detection of atomic context
Filipe Manana fdmanana@suse.com btrfs: abort transaction at update_ref_for_cow() when ref count is zero
Christoph Hellwig hch@lst.de btrfs: don't check PageError in __extent_writepage
David Sterba dsterba@suse.com btrfs: add xxhash to fast checksum implementations
Thomas Gleixner tglx@linutronix.de posix-timers: Ensure timer ID search-loop limit is valid
Ming Lei ming.lei@redhat.com blk-mq: fix NULL dereference on q->elevator in blk_mq_elv_switch_none
Yu Kuai yukuai3@huawei.com scsi: sg: fix blktrace debugfs entries leakage
Yu Kuai yukuai3@huawei.com md/raid10: prevent soft lockup while flush writes
Yu Kuai yukuai3@huawei.com md: fix data corruption for raid456 when reshape restart while grow up
Immad Mir mirimmad17@gmail.com FS: JFS: Check for read-only mounted filesystem in txBegin
Immad Mir mirimmad17@gmail.com FS: JFS: Fix null-ptr-deref Read in txBegin
Gustavo A. R. Silva gustavoars@kernel.org MIPS: dec: prom: Address -Warray-bounds warning
Yogesh yogi.kernel@gmail.com fs: jfs: Fix UBSAN: array-index-out-of-bounds in dbAllocDmapLev
Matthew Anderson ruinairas1992@gmail.com ALSA: hda/realtek: Add quirks for ROG ALLY CS35l41 audio
Jan Kara jack@suse.cz udf: Fix uninitialized array access for some pathnames
Christian Brauner brauner@kernel.org ovl: check type and offset of struct vfsmount in ovl_entry
Marco Morandini marco.morandini@polimi.it HID: add quirk for 03f0:464a HP Elite Presenter Mouse
Ye Bin yebin10@huawei.com quota: fix warning in dqgrab()
Jan Kara jack@suse.cz quota: Properly disable quotas when add_dquot_ref() fails
Oswald Buddenhagen oswald.buddenhagen@gmx.de ALSA: emu10k1: roll up loops in DSP setup code for Audigy
hackyzh002 hackyzh002@gmail.com drm/radeon: Fix integer overflow in radeon_cs_parser_init
Eric Whitney enwlinux@gmail.com ext4: correct inline offset when handling xattrs in inode body
Marc Zyngier maz@kernel.org KVM: arm64: vgic-v4: Make the doorbell request robust w.r.t preemption
Marc Zyngier maz@kernel.org KVM: arm64: Disable preemption in kvm_arch_hardware_enable()
Oliver Upton oliver.upton@linux.dev KVM: arm64: Correctly handle page aging notifiers for unaligned memslot
Marc Zyngier maz@kernel.org KVM: arm64: timers: Use CNTHCTL_EL2 when setting non-CNTKCTL_EL1 bits
Johan Hovold johan+linaro@kernel.org ASoC: codecs: wcd938x: fix soundwire initialisation race
Johan Hovold johan+linaro@kernel.org ASoC: codecs: wcd938x: fix codec initialisation race
Johan Hovold johan+linaro@kernel.org ASoC: codecs: wcd934x: fix resource leaks on component remove
Johan Hovold johan+linaro@kernel.org ASoC: codecs: wcd938x: fix missing mbhc init error handling
Johan Hovold johan+linaro@kernel.org ASoC: codecs: wcd938x: fix resource leaks on component remove
Sheetal sheetal@nvidia.com ASoC: tegra: Fix AMX byte map
Johan Hovold johan+linaro@kernel.org ASoC: qdsp6: audioreach: fix topology probe deferral
Johan Hovold johan+linaro@kernel.org ASoC: codecs: wcd-mbhc-v2: fix resource leaks on component remove
Nathan Chancellor nathan@kernel.org ASoC: cs35l45: Select REGMAP_IRQ
Johan Hovold johan+linaro@kernel.org ASoC: codecs: wcd938x: fix missing clsh ctrl error handling
Thomas Petazzoni thomas.petazzoni@bootlin.com ASoC: cs42l51: fix driver to properly autoload with automatic module loading
Sameer Pujar spujar@nvidia.com ASoC: rt5640: Fix sleep in atomic context
Sheetal sheetal@nvidia.com ASoC: tegra: Fix ADX byte map
Fabio Estevam festevam@denx.de ASoC: fsl_sai: Revert "ASoC: fsl_sai: Enable MCTL_MCLK_EN bit for master mode"
Matus Gajdos matuszpd@gmail.com ASoC: fsl_sai: Disable bit clock with transmitter
Nicholas Kazlauskas nicholas.kazlauskas@amd.com drm/amd/display: Keep PHY active for DP displays on DCN31
Taimur Hassan syed.hassan@amd.com drm/amd/display: check TG is non-null before checking if enabled
Zhikai Zhai zhikai.zhai@amd.com drm/amd/display: Disable MPC split by default on special asic
Simon Ser contact@emersion.fr drm/amd/display: only accept async flips for fast updates
Jocelyn Falempe jfalempe@redhat.com drm/client: Fix memory leak in drm_client_modeset_probe
Jocelyn Falempe jfalempe@redhat.com drm/client: Fix memory leak in drm_client_target_cloned
Ben Skeggs bskeggs@redhat.com drm/nouveau/i2c: fix number of aux event slots
Ben Skeggs bskeggs@redhat.com drm/nouveau/kms/nv50-: init hpd_irq_lock for PIOR DP
Ben Skeggs bskeggs@redhat.com drm/nouveau/disp: PIOR DP uses GPIO for HPD, not PMGR AUX interrupts
Alex Deucher alexander.deucher@amd.com drm/amdgpu/pm: make mclk consistent for smu 13.0.7
Alex Deucher alexander.deucher@amd.com drm/amdgpu/pm: make gfxclock consistent for sienna cichlid
Guchun Chen guchun.chen@amd.com drm/amdgpu/vkms: relax timer deactivation by hrtimer_try_to_cancel
Ville Syrjälä ville.syrjala@linux.intel.com dma-buf/dma-resv: Stop leaking on krealloc() failure
Dan Carpenter dan.carpenter@linaro.org accel/qaic: Add consistent integer overflow checks
Dan Carpenter dan.carpenter@linaro.org accel/qaic: tighten bounds checking in decode_message()
Dan Carpenter dan.carpenter@linaro.org accel/qaic: tighten bounds checking in encode_message()
Matthieu Baerts matthieu.baerts@tessares.net selftests: tc: add ConnTrack procfs kconfig
Heiner Kallweit hkallweit1@gmail.com Revert "r8169: disable ASPM during NAPI poll"
Marc Kleine-Budde mkl@pengutronix.de can: gs_usb: fix time stamp counter initialization
Marc Kleine-Budde mkl@pengutronix.de can: gs_usb: gs_can_open(): improve error handling
YueHaibing yuehaibing@huawei.com can: bcm: Fix UAF in bcm_proc_show()
Fedor Ross fedor.ross@ifm.com can: mcp251xfd: __mcp251xfd_chip_set_mode(): increase poll timeout
Mark Brown broonie@kernel.org arm64/fpsimd: Ensure SME storage is allocated after SVE VL changes
Helge Deller deller@gmx.de ia64: mmap: Consider pgoff when searching for free mapping
Mark Brown broonie@kernel.org regmap: Account for register length in SMBus I/O limits
Rob Herring robh@kernel.org of: Preserve "of-display" device name for compatibility
Harald Freudenberger freude@linux.ibm.com s390/zcrypt: fix reply buffer calculations for CCA replies
Mark Brown broonie@kernel.org regmap: Drop initial version of maximum transfer length fixes
Matthieu Baerts matthieu.baerts@tessares.net selftests: tc: add 'ct' action kconfig dep
Dan Carpenter dan.carpenter@linaro.org accel/qaic: Fix a leak in map_user_pages()
Matthieu Baerts matthieu.baerts@tessares.net selftests: tc: set timeout to 15 minutes
Josef Bacik josef@toxicpanda.com btrfs: fix race between balance and cancel/pause
Miklos Szeredi mszeredi@redhat.com fuse: ioctl: translate ENOSYS in outarg
Filipe Manana fdmanana@suse.com btrfs: zoned: fix memory leak after finding block group with super blocks
Filipe Manana fdmanana@suse.com btrfs: fix double iput() on inode after an error during orphan cleanup
Josef Bacik josef@toxicpanda.com btrfs: set_page_extent_mapped after read_folio in btrfs_cont_expand
Qu Wenruo wqu@suse.com btrfs: raid56: always verify the P/Q contents for scrub
Bernd Schubert bschubert@ddn.com fuse: Apply flags2 only when userspace set the FUSE_INIT_EXT
Miklos Szeredi mszeredi@redhat.com fuse: add feature flag for expire-only
Miklos Szeredi mszeredi@redhat.com fuse: revalidate: don't invalidate if interrupted
Filipe Manana fdmanana@suse.com btrfs: fix warning when putting transaction with qgroups enabled after abort
Filipe Manana fdmanana@suse.com btrfs: fix iput() on error pointer after error during orphan cleanup
Georg Müller georgmueller@gmx.net perf probe: Read DWARF files from the correct CU
Georg Müller georgmueller@gmx.net perf probe: Add test for regression introduced by switch to die_get_decl_file()
Miguel Ojeda ojeda@kernel.org prctl: move PR_GET_AUXV out of PR_MCE_KILL
Petr Pavlu petr.pavlu@suse.com keys: Fix linking a duplicate key to a keyring's assoc_array
Colin Ian King colin.i.king@gmail.com selftests/mm: mkdirty: fix incorrect position of #endif
Liam R. Howlett Liam.Howlett@oracle.com maple_tree: fix node allocation testing on 32 bit
Liam R. Howlett Liam.Howlett@oracle.com mm/mlock: fix vma iterator conversion of apply_vma_lock_flags()
Peng Zhang zhangpeng.00@bytedance.com maple_tree: set the node limit when creating a new root node
Luka Guzenko l.guzenko@web.de ALSA: hda/realtek: Enable Mute LED on HP Laptop 15s-eq2xxx
Christoffer Sandberg cs@tuxedo.de ALSA: hda/realtek: Add quirk for Clevo NS70AU
Kailang Yang kailang@realtek.com ALSA: hda/realtek - remove 3k pull low procedure
Helge Deller deller@gmx.de io_uring: Fix io_uring mmap() by using architecture-provided get_unmapped_area()
Jens Axboe axboe@kernel.dk io_uring: treat -EAGAIN for REQ_F_NOWAIT as final for io-wq
Diffstat:
Makefile | 4 +- arch/arm64/include/asm/exception.h | 5 - arch/arm64/include/asm/kvm_host.h | 2 + arch/arm64/include/asm/kvm_pgtable.h | 26 +-- arch/arm64/kernel/fpsimd.c | 33 ++- arch/arm64/kvm/arch_timer.c | 6 +- arch/arm64/kvm/arm.c | 19 +- arch/arm64/kvm/hyp/pgtable.c | 47 +++- arch/arm64/kvm/mmu.c | 18 +- arch/arm64/kvm/vgic/vgic-v3.c | 2 +- arch/arm64/kvm/vgic/vgic-v4.c | 7 +- arch/arm64/mm/mmu.c | 4 +- arch/arm64/net/bpf_jit_comp.c | 8 +- arch/arm64/tools/sysreg | 12 +- arch/ia64/kernel/sys_ia64.c | 2 +- arch/mips/include/asm/dec/prom.h | 2 +- arch/parisc/kernel/sys_parisc.c | 15 +- block/blk-mq.c | 10 +- drivers/accel/qaic/qaic_control.c | 39 ++-- drivers/acpi/button.c | 9 + drivers/acpi/resource.c | 60 ----- drivers/acpi/video_detect.c | 24 ++ drivers/acpi/x86/utils.c | 26 ++- drivers/base/regmap/regmap-i2c.c | 8 +- drivers/base/regmap/regmap-spi-avmm.c | 2 +- drivers/base/regmap/regmap.c | 6 +- drivers/bluetooth/btusb.c | 1 + drivers/dma-buf/dma-resv.c | 13 +- drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.c | 5 +- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 256 +++++++++------------ drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h | 7 + .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_crtc.c | 12 + .../amd/display/amdgpu_dm/amdgpu_dm_mst_types.c | 110 +++++++++ .../amd/display/amdgpu_dm/amdgpu_dm_mst_types.h | 11 + .../amd/display/dc/clk_mgr/dcn31/dcn31_clk_mgr.c | 5 + .../drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c | 3 +- .../drm/amd/display/dc/dcn303/dcn303_resource.c | 2 +- .../drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c | 8 +- .../gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_7_ppt.c | 2 +- drivers/gpu/drm/drm_client_modeset.c | 6 + drivers/gpu/drm/i915/i915_perf.c | 1 + drivers/gpu/drm/nouveau/dispnv50/disp.c | 4 + drivers/gpu/drm/nouveau/include/nvkm/subdev/i2c.h | 4 +- drivers/gpu/drm/nouveau/nvkm/engine/disp/uconn.c | 27 ++- drivers/gpu/drm/nouveau/nvkm/subdev/i2c/base.c | 11 +- drivers/gpu/drm/radeon/radeon_cs.c | 3 +- drivers/gpu/drm/ttm/ttm_resource.c | 5 +- drivers/hid/hid-ids.h | 1 + drivers/hid/hid-quirks.c | 1 + drivers/iommu/iommu-sva.c | 3 +- drivers/md/md.c | 14 +- drivers/md/raid10.c | 2 + drivers/net/can/spi/mcp251xfd/mcp251xfd-core.c | 10 +- drivers/net/can/spi/mcp251xfd/mcp251xfd.h | 1 + drivers/net/can/usb/gs_usb.c | 130 ++++++----- drivers/net/dsa/microchip/ksz8795.c | 8 +- drivers/net/dsa/microchip/ksz_common.c | 8 +- drivers/net/dsa/microchip/ksz_common.h | 7 + drivers/net/dsa/mv88e6xxx/chip.c | 7 + drivers/net/ethernet/hisilicon/hns3/hns3_debugfs.c | 33 ++- .../ethernet/hisilicon/hns3/hns3pf/hclge_debugfs.c | 29 ++- drivers/net/ethernet/intel/iavf/iavf.h | 16 +- drivers/net/ethernet/intel/iavf/iavf_ethtool.c | 39 ++-- drivers/net/ethernet/intel/iavf/iavf_main.c | 223 ++++++++++++------ drivers/net/ethernet/intel/iavf/iavf_txrx.c | 43 ++-- drivers/net/ethernet/intel/iavf/iavf_txrx.h | 4 - drivers/net/ethernet/intel/iavf/iavf_virtchnl.c | 5 +- drivers/net/ethernet/intel/ice/ice_base.c | 2 + drivers/net/ethernet/intel/ice/ice_ethtool.c | 13 +- drivers/net/ethernet/intel/ice/ice_lib.c | 27 --- drivers/net/ethernet/intel/ice/ice_main.c | 10 +- drivers/net/ethernet/intel/igb/igb_main.c | 5 + drivers/net/ethernet/intel/igc/igc_main.c | 12 +- drivers/net/ethernet/litex/litex_liteeth.c | 19 +- .../net/ethernet/marvell/octeontx2/nic/otx2_pf.c | 5 +- drivers/net/ethernet/mediatek/mtk_eth_soc.c | 29 +-- drivers/net/ethernet/mediatek/mtk_ppe_debugfs.c | 2 +- drivers/net/ethernet/realtek/r8169_main.c | 18 +- drivers/net/ethernet/ti/cpsw_ale.c | 24 +- drivers/net/phy/phy_device.c | 21 +- drivers/net/vrf.c | 12 +- drivers/net/wireless/ath/ath11k/core.c | 53 +++-- drivers/net/wireless/ath/ath11k/mac.c | 3 +- drivers/net/wireless/ath/ath11k/wmi.c | 5 + drivers/net/wireless/ath/ath12k/mac.c | 1 + drivers/net/wireless/intel/iwlwifi/mvm/mld-key.c | 9 +- drivers/net/wireless/intel/iwlwifi/mvm/power.c | 14 +- drivers/net/wireless/intel/iwlwifi/mvm/sta.c | 2 +- drivers/net/wireless/intel/iwlwifi/pcie/drv.c | 4 + drivers/net/wireless/realtek/rtw88/sdio.c | 24 +- drivers/net/wireless/virtual/mac80211_hwsim.c | 4 +- drivers/of/platform.c | 2 +- drivers/pinctrl/renesas/pinctrl-rzg2l.c | 28 ++- drivers/pinctrl/renesas/pinctrl-rzv2m.c | 28 ++- drivers/regulator/da9063-regulator.c | 3 + drivers/s390/crypto/zcrypt_msgtype6.c | 33 ++- drivers/scsi/sg.c | 10 + drivers/spi/spi-bcm63xx.c | 2 +- drivers/spi/spi-cadence-quadspi.c | 19 ++ drivers/spi/spi-dw-mmio.c | 22 ++ drivers/spi/spi-s3c64xx.c | 2 + drivers/video/fbdev/au1200fb.c | 3 + drivers/video/fbdev/imxfb.c | 5 +- fs/btrfs/block-group.c | 1 + fs/btrfs/ctree.c | 10 +- fs/btrfs/disk-io.c | 3 + fs/btrfs/extent_io.c | 33 +-- fs/btrfs/inode.c | 35 +-- fs/btrfs/qgroup.c | 1 + fs/btrfs/raid56.c | 11 +- fs/btrfs/volumes.c | 17 +- fs/erofs/zdata.c | 2 +- fs/ext4/xattr.c | 14 ++ fs/fuse/dir.c | 2 +- fs/fuse/inode.c | 8 +- fs/fuse/ioctl.c | 21 +- fs/jbd2/checkpoint.c | 102 +++----- fs/jfs/jfs_dmap.c | 3 + fs/jfs/jfs_txnmgr.c | 5 + fs/jfs/namei.c | 5 + fs/overlayfs/ovl_entry.h | 9 + fs/quota/dquot.c | 5 +- fs/smb/client/connect.c | 19 +- fs/smb/client/dfs.c | 26 +-- fs/smb/client/smb2transport.c | 2 +- fs/udf/unicode.c | 2 +- include/kvm/arm_vgic.h | 2 +- include/linux/psi.h | 5 +- include/linux/psi_types.h | 3 + include/linux/sched/signal.h | 2 +- include/linux/tcp.h | 2 +- include/net/bluetooth/hci_core.h | 5 + include/net/ip.h | 2 +- include/net/tcp.h | 31 ++- include/uapi/linux/fuse.h | 3 + io_uring/io_uring.c | 52 ++--- kernel/bpf/bpf_lru_list.c | 21 +- kernel/bpf/bpf_lru_list.h | 7 +- kernel/bpf/btf.c | 23 +- kernel/bpf/log.c | 3 - kernel/bpf/syscall.c | 3 +- kernel/bpf/verifier.c | 32 ++- kernel/cgroup/cgroup.c | 2 +- kernel/kallsyms.c | 5 +- kernel/rcu/tasks.h | 5 +- kernel/rcu/tree_exp.h | 2 +- kernel/rcu/tree_plugin.h | 4 +- kernel/sched/fair.c | 4 +- kernel/sched/psi.c | 29 ++- kernel/sys.c | 10 +- kernel/time/posix-timers.c | 31 +-- kernel/trace/trace_events_hist.c | 3 +- lib/iov_iter.c | 2 +- lib/maple_tree.c | 3 +- mm/mlock.c | 9 +- net/bluetooth/hci_conn.c | 14 +- net/bluetooth/hci_core.c | 42 +++- net/bluetooth/hci_event.c | 15 +- net/bluetooth/hci_sync.c | 121 ++++++++-- net/bluetooth/iso.c | 55 +++-- net/bluetooth/mgmt.c | 26 +-- net/bluetooth/sco.c | 23 +- net/bridge/br_stp_if.c | 3 + net/can/bcm.c | 12 +- net/devlink/health.c | 2 +- net/devlink/leftover.c | 5 +- net/ipv4/esp4.c | 2 +- net/ipv4/inet_connection_sock.c | 2 +- net/ipv4/inet_hashtables.c | 17 +- net/ipv4/inet_timewait_sock.c | 8 +- net/ipv4/ip_output.c | 4 +- net/ipv4/tcp.c | 57 ++--- net/ipv4/tcp_fastopen.c | 6 +- net/ipv4/tcp_ipv4.c | 27 ++- net/ipv4/tcp_minisocks.c | 11 +- net/ipv4/tcp_output.c | 6 +- net/ipv4/udp_offload.c | 16 +- net/ipv6/ip6_gre.c | 3 +- net/ipv6/tcp_ipv6.c | 4 +- net/ipv6/udp_offload.c | 3 +- net/llc/llc_input.c | 3 - net/netfilter/nf_tables_api.c | 12 +- net/netfilter/nft_set_pipapo.c | 6 +- net/sched/cls_bpf.c | 99 ++++---- net/sched/cls_matchall.c | 35 +-- net/sched/cls_u32.c | 48 +++- net/wireless/wext-core.c | 6 + scripts/Makefile.build | 5 +- scripts/Makefile.host | 6 +- scripts/kallsyms.c | 6 +- security/keys/request_key.c | 35 ++- security/keys/trusted-keys/trusted_tpm2.c | 2 +- sound/pci/emu10k1/emufx.c | 112 +-------- sound/pci/hda/patch_realtek.c | 100 +++++++- sound/soc/amd/acp/amd.h | 7 +- sound/soc/codecs/Kconfig | 1 + sound/soc/codecs/cs42l51-i2c.c | 6 + sound/soc/codecs/cs42l51.c | 7 - sound/soc/codecs/cs42l51.h | 1 - sound/soc/codecs/rt5640.c | 12 +- sound/soc/codecs/wcd-mbhc-v2.c | 57 +++-- sound/soc/codecs/wcd934x.c | 12 + sound/soc/codecs/wcd938x.c | 86 ++++++- sound/soc/fsl/fsl_sai.c | 8 +- sound/soc/fsl/fsl_sai.h | 1 + sound/soc/qcom/qdsp6/q6apm.c | 7 +- sound/soc/qcom/qdsp6/topology.c | 4 +- sound/soc/sof/ipc3-dtrace.c | 9 +- sound/soc/tegra/tegra210_adx.c | 34 ++- sound/soc/tegra/tegra210_amx.c | 40 ++-- tools/include/nolibc/stackprotector.h | 5 +- tools/perf/Makefile.config | 4 +- .../tests/shell/test_uprobe_from_different_cu.sh | 77 +++++++ tools/perf/util/dwarf-aux.c | 4 +- tools/testing/radix-tree/maple.c | 6 +- tools/testing/selftests/mm/mkdirty.c | 2 +- tools/testing/selftests/tc-testing/config | 2 + tools/testing/selftests/tc-testing/settings | 1 + 218 files changed, 2462 insertions(+), 1482 deletions(-)
Hi!
This is the start of the stable review cycle for the 6.4.7 release. There are 227 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Thu, 27 Jul 2023 10:44:26 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.4.7-rc1.g... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.4.y and the diffstat can be found below.
I saw this when running rcutorture, this one happened in the TREE04 configuration. This is likely due to the stuttering issues we are discussing in the other thread. Anyway I am just making a note here while I am continuing to look into it.
So is the stuttering new in 6.4.7?
Other than that, all tests pass: Tested-by: Joel Fernandes (Google) joel@joelfernandes.org
...or you still believe 6.4.7 is okay to release?
Best regards, Pavel
On Jul 27, 2023, at 7:35 AM, Pavel Machek pavel@denx.de wrote:
Hi!
This is the start of the stable review cycle for the 6.4.7 release. There are 227 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Thu, 27 Jul 2023 10:44:26 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.4.7-rc1.g... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.4.y and the diffstat can be found below.
I saw this when running rcutorture, this one happened in the TREE04 configuration. This is likely due to the stuttering issues we are discussing in the other thread. Anyway I am just making a note here while I am continuing to look into it.
So is the stuttering new in 6.4.7?
No it is an old feature in RCU torture tests. But is dependent on timing. Something changed in recent kernels that is making the issues with it more likely. Its hard to bisect as failure sometimes takes hours.
Other than that, all tests pass: Tested-by: Joel Fernandes (Google) joel@joelfernandes.org
...or you still believe 6.4.7 is okay to release?
As such, it should be Ok. However naturally I am not happy that the RCU testing is intermittently failing. These issues have been seen in last several 6.4 stable releases so since those were released, maybe this one can be too? The fix for stuttering is currently being reviewed.
Thanks,
- Joel
Best regards, Pavel -- DENX Software Engineering GmbH, Managing Director: Erika Unter HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
On Thu, Jul 27, 2023 at 09:26:52AM -0400, Joel Fernandes wrote:
On Jul 27, 2023, at 7:35 AM, Pavel Machek pavel@denx.de wrote:
Hi!
This is the start of the stable review cycle for the 6.4.7 release. There are 227 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Thu, 27 Jul 2023 10:44:26 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.4.7-rc1.g... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.4.y and the diffstat can be found below.
I saw this when running rcutorture, this one happened in the TREE04 configuration. This is likely due to the stuttering issues we are discussing in the other thread. Anyway I am just making a note here while I am continuing to look into it.
So is the stuttering new in 6.4.7?
No it is an old feature in RCU torture tests. But is dependent on timing. Something changed in recent kernels that is making the issues with it more likely. Its hard to bisect as failure sometimes takes hours.
Other than that, all tests pass: Tested-by: Joel Fernandes (Google) joel@joelfernandes.org
...or you still believe 6.4.7 is okay to release?
As such, it should be Ok. However naturally I am not happy that the RCU testing is intermittently failing. These issues have been seen in last several 6.4 stable releases so since those were released, maybe this one can be too? The fix for stuttering is currently being reviewed.
Or, to look at it another way, the stuttering fix is specific to torture testing. Would we really want to hold up a -stable release only because rcutorture occasionally gives a false-positive failure on certain types of systems?
Thanx, Paul
Thanks,
- Joel
Best regards, Pavel -- DENX Software Engineering GmbH, Managing Director: Erika Unter HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
On 7/27/23 07:06, Paul E. McKenney wrote:
On Thu, Jul 27, 2023 at 09:26:52AM -0400, Joel Fernandes wrote:
On Jul 27, 2023, at 7:35 AM, Pavel Machek pavel@denx.de wrote:
Hi!
This is the start of the stable review cycle for the 6.4.7 release. There are 227 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Thu, 27 Jul 2023 10:44:26 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.4.7-rc1.g... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.4.y and the diffstat can be found below.
I saw this when running rcutorture, this one happened in the TREE04 configuration. This is likely due to the stuttering issues we are discussing in the other thread. Anyway I am just making a note here while I am continuing to look into it.
So is the stuttering new in 6.4.7?
No it is an old feature in RCU torture tests. But is dependent on timing. Something changed in recent kernels that is making the issues with it more likely. Its hard to bisect as failure sometimes takes hours.
Other than that, all tests pass: Tested-by: Joel Fernandes (Google) joel@joelfernandes.org
...or you still believe 6.4.7 is okay to release?
As such, it should be Ok. However naturally I am not happy that the RCU testing is intermittently failing. These issues have been seen in last several 6.4 stable releases so since those were released, maybe this one can be too? The fix for stuttering is currently being reviewed.
Or, to look at it another way, the stuttering fix is specific to torture testing. Would we really want to hold up a -stable release only because rcutorture occasionally gives a false-positive failure on certain types of systems?
No. However, (unrelated) in linux-next, rcu tests sometimes result in apparent hangs or long runtime.
[ 0.778841] Mount-cache hash table entries: 512 (order: 0, 4096 bytes, linear) [ 0.779011] Mountpoint-cache hash table entries: 512 (order: 0, 4096 bytes, linear) [ 0.797998] Running RCU synchronous self tests [ 0.798209] Running RCU synchronous self tests [ 0.912368] smpboot: CPU0: AMD Opteron 63xx class CPU (family: 0x15, model: 0x2, stepping: 0x0) [ 0.923398] RCU Tasks: Setting shift to 2 and lim to 1 rcu_task_cb_adjust=1. [ 0.925419] Running RCU-tasks wait API self tests
(hangs until aborted). This is primarily with Opteron CPUs, but also with others such as Haswell, Icelake-Server, and pentium3. It is all but impossible to bisect because it doesn't happen all the time. All I was able to figure out was that it has to do with rcu changes in linux-next. I'd be much more concerned about that.
Guenter
On Thu, Jul 27, 2023 at 07:39:54AM -0700, Guenter Roeck wrote:
On 7/27/23 07:06, Paul E. McKenney wrote:
On Thu, Jul 27, 2023 at 09:26:52AM -0400, Joel Fernandes wrote:
On Jul 27, 2023, at 7:35 AM, Pavel Machek pavel@denx.de wrote:
Hi!
This is the start of the stable review cycle for the 6.4.7 release. There are 227 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Thu, 27 Jul 2023 10:44:26 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.4.7-rc1.g... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.4.y and the diffstat can be found below.
I saw this when running rcutorture, this one happened in the TREE04 configuration. This is likely due to the stuttering issues we are discussing in the other thread. Anyway I am just making a note here while I am continuing to look into it.
So is the stuttering new in 6.4.7?
No it is an old feature in RCU torture tests. But is dependent on timing. Something changed in recent kernels that is making the issues with it more likely. Its hard to bisect as failure sometimes takes hours.
Other than that, all tests pass: Tested-by: Joel Fernandes (Google) joel@joelfernandes.org
...or you still believe 6.4.7 is okay to release?
As such, it should be Ok. However naturally I am not happy that the RCU testing is intermittently failing. These issues have been seen in last several 6.4 stable releases so since those were released, maybe this one can be too? The fix for stuttering is currently being reviewed.
Or, to look at it another way, the stuttering fix is specific to torture testing. Would we really want to hold up a -stable release only because rcutorture occasionally gives a false-positive failure on certain types of systems?
No. However, (unrelated) in linux-next, rcu tests sometimes result in apparent hangs or long runtime.
[ 0.778841] Mount-cache hash table entries: 512 (order: 0, 4096 bytes, linear) [ 0.779011] Mountpoint-cache hash table entries: 512 (order: 0, 4096 bytes, linear) [ 0.797998] Running RCU synchronous self tests [ 0.798209] Running RCU synchronous self tests [ 0.912368] smpboot: CPU0: AMD Opteron 63xx class CPU (family: 0x15, model: 0x2, stepping: 0x0) [ 0.923398] RCU Tasks: Setting shift to 2 and lim to 1 rcu_task_cb_adjust=1. [ 0.925419] Running RCU-tasks wait API self tests
(hangs until aborted). This is primarily with Opteron CPUs, but also with others such as Haswell, Icelake-Server, and pentium3. It is all but impossible to bisect because it doesn't happen all the time. All I was able to figure out was that it has to do with rcu changes in linux-next. I'd be much more concerned about that.
First I have heard of this, so thank you for letting me know.
About what fraction of the time does this happen?
Thanx, Paul
On 7/27/23 09:07, Paul E. McKenney wrote:
...]
No. However, (unrelated) in linux-next, rcu tests sometimes result in apparent hangs or long runtime.
[ 0.778841] Mount-cache hash table entries: 512 (order: 0, 4096 bytes, linear) [ 0.779011] Mountpoint-cache hash table entries: 512 (order: 0, 4096 bytes, linear) [ 0.797998] Running RCU synchronous self tests [ 0.798209] Running RCU synchronous self tests [ 0.912368] smpboot: CPU0: AMD Opteron 63xx class CPU (family: 0x15, model: 0x2, stepping: 0x0) [ 0.923398] RCU Tasks: Setting shift to 2 and lim to 1 rcu_task_cb_adjust=1. [ 0.925419] Running RCU-tasks wait API self tests
(hangs until aborted). This is primarily with Opteron CPUs, but also with others such as Haswell, Icelake-Server, and pentium3. It is all but impossible to bisect because it doesn't happen all the time. All I was able to figure out was that it has to do with rcu changes in linux-next. I'd be much more concerned about that.
First I have heard of this, so thank you for letting me know.
About what fraction of the time does this happen?
Here is a sample test log from yesterday's -next. This is with x86_64. Today's -next always crashes, so no data.
Building x86_64:q35:Broadwell-noTSX:defconfig:smp:net,e1000:mem256:ata:hd ... running ....... passed Building x86_64:q35:Cascadelake-Server:defconfig:smp:net,e1000e:mem256:ata:cd ... running .................R....... passed Building x86_64:q35:IvyBridge:defconfig:smp2:net,i82801:efi:mem512:nvme:hd ... running ...... passed Building x86_64:q35:SandyBridge:defconfig:smp4:net,ne2k_pci:efi32:mem1G:usb:hd ... running ......... passed Building x86_64:q35:SandyBridge:defconfig:smp8:net,ne2k_pci:mem1G:usb-hub:hd ... running ....... passed Building x86_64:q35:Haswell:defconfig:smp:tpm-tis:net,pcnet:mem2G:usb-uas:hd ... running .................R.... passed Building x86_64:q35:Skylake-Client:defconfig:smp2:tpm-tis:net,rtl8139:efi:mem4G:sdhci:mmc:hd ... running ....... passed Building x86_64:q35:Conroe:defconfig:smp4:net,tulip:efi32:mem256:scsi[DC395]:hd ... running ....... passed Building x86_64:q35:Denverton:defconfig:smp2:net,tulip:efi:mem256:scsi[DC395]:hd ... running ....... passed Building x86_64:q35:EPYC-Milan:defconfig:smp:tpm-crb:net,tulip:mem256:scsi[DC395]:hd ... running ....... passed Building x86_64:q35:Nehalem:defconfig:smp:net,virtio-net:mem512:scsi[AM53C974]:hd ... running ....... passed Building x86_64:q35:Nehalem:defconfig:smp:net,virtio-net-old:mem512:scsi[AM53C974]:hd ... running ....... passed Building x86_64:q35:Westmere-IBRS:defconfig:smp2:tpm-crb:net,usb-ohci:efi:mem1G:scsi[53C810]:cd ... running .................R........... passed Building x86_64:q35:Skylake-Server:defconfig:smp4:tpm-tis:net,e1000-82544gc:efi32:mem2G:scsi[53C895A]:hd ... running ............. passed Building x86_64:pc:EPYC:defconfig:smp:pci-bridge:net,usb-uhci:mem4G:scsi[FUSION]:hd ... running ..................R.......... passed Building x86_64:q35:EPYC-IBPB:defconfig:smp2:net,e1000-82545em:efi:mem8G:scsi[MEGASAS]:hd ... running ....... passed Building x86_64:q35:Opteron_G5:defconfig:smp4:net,i82559c:efi32:mem256:scsi[MEGASAS2]:hd ... running ...... passed Building x86_64:q35:Opteron_G5:defconfig:smp4:net,i82559c:mem256:scsi[MEGASAS2]:hd ... running .................R.............. failed (silent) Building x86_64:pc:Opteron_G5:defconfig:smp4:net,i82559c:mem256:scsi[MEGASAS2]:hd ... running .......... passed Building x86_64:pc:phenom:defconfig:smp:net,i82559er:mem512:initrd ... running ........ passed Building x86_64:q35:Opteron_G1:defconfig:smp2:net,i82562:efi:mem1G:initrd ... running ...... passed Building x86_64:pc:Opteron_G2:defconfig:smp:net,usb:efi32:mem2G:scsi[virtio-pci]:hd ... running .................R................. passed Building x86_64:pc:Opteron_G2:defconfig:smp:net,usb:efi32:mem2G:scsi[virtio-pci-old]:hd ... running ................... passed Building x86_64:q35:core2duo:defconfig:smp2:net,i82559a:mem4G:virtio-pci:hd ... running ......... passed Building x86_64:q35:Broadwell:defconfig:smp4:net,i82558b:efi:mem8G:virtio:hd ... running ....... passed Building x86_64:q35:Nehalem:defconfig:smp2:net,i82558a:efi32:mem1G:virtio:hd ... running .................R... passed Building x86_64:q35:Icelake-Server:defconfig:preempt:smp4:net,ne2k_pci:efi:mem2G:virtio:cd ... running ......... passed Building x86_64:q35:Icelake-Server:defconfig:preempt:smp8:net,i82557a:mem4G:nvme:hd ... running ...... passed Building x86_64:q35:Skylake-Client-IBRS:defconfig:preempt:smp2:net,i82558b:efi32:mem1G:sdhci:mmc:hd ... running ...... passed Building x86_64:q35:KnightsMill:defconfig:preempt:smp6:net,i82550:mem512:initrd ... running ...... passed Building x86_64:q35:Cooperlake:defconfig:smp2:net,usb-ohci:efi:mem1G:scsi[53C810]:hd ... running ....... passed Building x86_64:q35:EPYC-Rome:defconfig:smp4:net,igb:mem2G:scsi[53C895A]:hd ... running ......... passed Building x86_64:pc:Opteron_G3:defconfig:nosmp:net,e1000:mem1G:usb:hd ... running ....................R................. failed (silent) Building x86_64:q35:Opteron_G4:defconfig:nosmp:net,ne2k_pci:efi:mem512:ata:hd ... running .....................R....... passed Building x86_64:q35:Haswell-noTSX-IBRS:defconfig:nosmp:net,pcnet:efi32:mem2G:ata:hd ... running .................R.............. failed (silent)
An earlier test run:
Building x86_64:q35:Broadwell-noTSX:defconfig:smp:net,e1000:mem256:ata:hd ... running ....... passed Building x86_64:q35:Cascadelake-Server:defconfig:smp:net,e1000e:mem256:ata:cd ... running .................R....... passed Building x86_64:q35:IvyBridge:defconfig:smp2:net,i82801:efi:mem512:nvme:hd ... running ........ passed Building x86_64:q35:SandyBridge:defconfig:smp4:net,ne2k_pci:efi32:mem1G:usb:hd ... running .......... passed Building x86_64:q35:SandyBridge:defconfig:smp8:net,ne2k_pci:mem1G:usb-hub:hd ... running ....... passed Building x86_64:q35:Haswell:defconfig:smp:tpm-tis:net,pcnet:mem2G:usb-uas:hd ... running .................R.... passed Building x86_64:q35:Skylake-Client:defconfig:smp2:tpm-tis:net,rtl8139:efi:mem4G:sdhci:mmc:hd ... running ....... passed Building x86_64:q35:Conroe:defconfig:smp4:net,tulip:efi32:mem256:scsi[DC395]:hd ... running ......... passed Building x86_64:q35:Denverton:defconfig:smp2:net,tulip:efi:mem256:scsi[DC395]:hd ... running ....... passed Building x86_64:q35:EPYC-Milan:defconfig:smp:tpm-crb:net,tulip:mem256:scsi[DC395]:hd ... running ....... passed Building x86_64:q35:Nehalem:defconfig:smp:net,virtio-net:mem512:scsi[AM53C974]:hd ... running ....... passed Building x86_64:q35:Nehalem:defconfig:smp:net,virtio-net-old:mem512:scsi[AM53C974]:hd ... running ........ passed Building x86_64:q35:Westmere-IBRS:defconfig:smp2:tpm-crb:net,usb-ohci:efi:mem1G:scsi[53C810]:cd ... running .......... passed Building x86_64:q35:Skylake-Server:defconfig:smp4:tpm-tis:net,e1000-82544gc:efi32:mem2G:scsi[53C895A]:hd ... running .................R..... passed Building x86_64:pc:EPYC:defconfig:smp:pci-bridge:net,usb-uhci:mem4G:scsi[FUSION]:hd ... running .................R.............. failed (silent) Building x86_64:q35:EPYC-IBPB:defconfig:smp2:net,e1000-82545em:efi:mem8G:scsi[MEGASAS]:hd ... running ....... passed Building x86_64:q35:Opteron_G5:defconfig:smp4:net,i82559c:efi32:mem256:scsi[MEGASAS2]:hd ... running ....... passed Building x86_64:q35:Opteron_G5:defconfig:smp4:net,i82559c:mem256:scsi[MEGASAS2]:hd ... running ....... passed Building x86_64:pc:Opteron_G5:defconfig:smp4:net,i82559c:mem256:scsi[MEGASAS2]:hd ... running .......... passed Building x86_64:pc:phenom:defconfig:smp:net,i82559er:mem512:initrd ... running ........ passed Building x86_64:q35:Opteron_G1:defconfig:smp2:net,i82562:efi:mem1G:initrd ... running ...... passed Building x86_64:pc:Opteron_G2:defconfig:smp:net,usb:efi32:mem2G:scsi[virtio-pci]:hd ... running .......... passed Building x86_64:pc:Opteron_G2:defconfig:smp:net,usb:efi32:mem2G:scsi[virtio-pci-old]:hd ... running .......... passed Building x86_64:q35:core2duo:defconfig:smp2:net,i82559a:mem4G:virtio-pci:hd ... running ...... passed Building x86_64:q35:Broadwell:defconfig:smp4:net,i82558b:efi:mem8G:virtio:hd ... running ....... passed Building x86_64:q35:Nehalem:defconfig:smp2:net,i82558a:efi32:mem1G:virtio:hd ... running ...... passed Building x86_64:q35:Icelake-Server:defconfig:preempt:smp4:net,ne2k_pci:efi:mem2G:virtio:cd ... running ......... passed Building x86_64:q35:Icelake-Server:defconfig:preempt:smp8:net,i82557a:mem4G:nvme:hd ... running ....... passed Building x86_64:q35:Skylake-Client-IBRS:defconfig:preempt:smp2:net,i82558b:efi32:mem1G:sdhci:mmc:hd ... running ....... passed Building x86_64:q35:KnightsMill:defconfig:preempt:smp6:net,i82550:mem512:initrd ... running ....... passed Building x86_64:q35:Cooperlake:defconfig:smp2:net,usb-ohci:efi:mem1G:scsi[53C810]:hd ... running ........ passed Building x86_64:q35:EPYC-Rome:defconfig:smp4:net,igb:mem2G:scsi[53C895A]:hd ... running ......... passed Building x86_64:pc:Opteron_G3:defconfig:nosmp:net,e1000:mem1G:usb:hd ... running ....................R................. failed (silent) Building x86_64:q35:Opteron_G4:defconfig:nosmp:net,ne2k_pci:efi:mem512:ata:hd ... running ....... passed Building x86_64:q35:Haswell-noTSX-IBRS:defconfig:nosmp:net,pcnet:efi32:mem2G:ata:hd ... running ....... passed
"R" means retry, and the dots reflect time expired. It looks like it happens most of the time, but not always, on affected CPUs. I don't have specific data for non-Intel CPUs. I don't think I see the problem there, but there is too much interference from other problems to be sure.
For comparison, here is the result from the latest mainline:
Building x86_64:q35:Broadwell-noTSX:defconfig:smp:net,e1000:mem256:ata:hd ... running ....... passed Building x86_64:q35:Cascadelake-Server:defconfig:smp:net,e1000e:mem256:ata:cd ... running .......... passed Building x86_64:q35:IvyBridge:defconfig:smp2:net,i82801:efi:mem512:nvme:hd ... running ...... passed Building x86_64:q35:SandyBridge:defconfig:smp4:net,ne2k_pci:efi32:mem1G:usb:hd ... running ......... passed Building x86_64:q35:SandyBridge:defconfig:smp8:net,ne2k_pci:mem1G:usb-hub:hd ... running ........... passed Building x86_64:q35:Haswell:defconfig:smp:tpm-tis:net,pcnet:mem2G:usb-uas:hd ... running ........ passed Building x86_64:q35:Skylake-Client:defconfig:smp2:tpm-tis:net,rtl8139:efi:mem4G:sdhci:mmc:hd ... running ....... passed Building x86_64:q35:Conroe:defconfig:smp4:net,tulip:efi32:mem256:scsi[DC395]:hd ... running ....... passed Building x86_64:q35:Denverton:defconfig:smp2:net,tulip:efi:mem256:scsi[DC395]:hd ... running ....... passed Building x86_64:q35:EPYC-Milan:defconfig:smp:tpm-crb:net,tulip:mem256:scsi[DC395]:hd ... running ....... passed Building x86_64:q35:Nehalem:defconfig:smp:net,virtio-net:mem512:scsi[AM53C974]:hd ... running ....... passed Building x86_64:q35:Nehalem:defconfig:smp:net,virtio-net-old:mem512:scsi[AM53C974]:hd ... running ....... passed Building x86_64:q35:Westmere-IBRS:defconfig:smp2:tpm-crb:net,usb-ohci:efi:mem1G:scsi[53C810]:cd ... running .......... passed Building x86_64:q35:Skylake-Server:defconfig:smp4:tpm-tis:net,e1000-82544gc:efi32:mem2G:scsi[53C895A]:hd ... running ....... passed Building x86_64:pc:EPYC:defconfig:smp:pci-bridge:net,usb-uhci:mem4G:scsi[FUSION]:hd ... running ............. passed Building x86_64:q35:EPYC-IBPB:defconfig:smp2:net,e1000-82545em:efi:mem8G:scsi[MEGASAS]:hd ... running ....... passed Building x86_64:q35:Opteron_G5:defconfig:smp4:net,i82559c:efi32:mem256:scsi[MEGASAS2]:hd ... running ....... passed Building x86_64:q35:Opteron_G5:defconfig:smp4:net,i82559c:mem256:scsi[MEGASAS2]:hd ... running ...... passed Building x86_64:pc:Opteron_G5:defconfig:smp4:net,i82559c:mem256:scsi[MEGASAS2]:hd ... running ......... passed Building x86_64:pc:phenom:defconfig:smp:net,i82559er:mem512:initrd ... running ......... passed Building x86_64:q35:Opteron_G1:defconfig:smp2:net,i82562:efi:mem1G:initrd ... running ......... passed Building x86_64:pc:Opteron_G2:defconfig:smp:net,usb:efi32:mem2G:scsi[virtio-pci]:hd ... running ......... passed Building x86_64:pc:Opteron_G2:defconfig:smp:net,usb:efi32:mem2G:scsi[virtio-pci-old]:hd ... running ......... passed Building x86_64:q35:core2duo:defconfig:smp2:net,i82559a:mem4G:virtio-pci:hd ... running ...... passed Building x86_64:q35:Broadwell:defconfig:smp4:net,i82558b:efi:mem8G:virtio:hd ... running ....... passed Building x86_64:q35:Nehalem:defconfig:smp2:net,i82558a:efi32:mem1G:virtio:hd ... running ...... passed Building x86_64:q35:Icelake-Server:defconfig:preempt:smp4:net,ne2k_pci:efi:mem2G:virtio:cd ... running ............ passed Building x86_64:q35:Icelake-Server:defconfig:preempt:smp8:net,i82557a:mem4G:nvme:hd ... running ....... passed Building x86_64:q35:Skylake-Client-IBRS:defconfig:preempt:smp2:net,i82558b:efi32:mem1G:sdhci:mmc:hd ... running ...... passed Building x86_64:q35:KnightsMill:defconfig:preempt:smp6:net,i82550:mem512:initrd ... running ...... passed Building x86_64:q35:Cooperlake:defconfig:smp2:net,usb-ohci:efi:mem1G:scsi[53C810]:hd ... running ....... passed Building x86_64:q35:EPYC-Rome:defconfig:smp4:net,igb:mem2G:scsi[53C895A]:hd ... running .......... passed Building x86_64:pc:Opteron_G3:defconfig:nosmp:net,e1000:mem1G:usb:hd ... running .......... passed Building x86_64:q35:Opteron_G4:defconfig:nosmp:net,ne2k_pci:efi:mem512:ata:hd ... running ...... passed Building x86_64:q35:Haswell-noTSX-IBRS:defconfig:nosmp:net,pcnet:efi32:mem2G:ata:hd ... running ...... passed
Guenter
On Thu, Jul 27, 2023 at 10:39:17AM -0700, Guenter Roeck wrote:
On 7/27/23 09:07, Paul E. McKenney wrote:
...]
No. However, (unrelated) in linux-next, rcu tests sometimes result in apparent hangs or long runtime.
[ 0.778841] Mount-cache hash table entries: 512 (order: 0, 4096 bytes, linear) [ 0.779011] Mountpoint-cache hash table entries: 512 (order: 0, 4096 bytes, linear) [ 0.797998] Running RCU synchronous self tests [ 0.798209] Running RCU synchronous self tests [ 0.912368] smpboot: CPU0: AMD Opteron 63xx class CPU (family: 0x15, model: 0x2, stepping: 0x0) [ 0.923398] RCU Tasks: Setting shift to 2 and lim to 1 rcu_task_cb_adjust=1. [ 0.925419] Running RCU-tasks wait API self tests
(hangs until aborted). This is primarily with Opteron CPUs, but also with others such as Haswell, Icelake-Server, and pentium3. It is all but impossible to bisect because it doesn't happen all the time. All I was able to figure out was that it has to do with rcu changes in linux-next. I'd be much more concerned about that.
First I have heard of this, so thank you for letting me know.
About what fraction of the time does this happen?
Here is a sample test log from yesterday's -next. This is with x86_64. Today's -next always crashes, so no data.
Building x86_64:q35:Broadwell-noTSX:defconfig:smp:net,e1000:mem256:ata:hd ... running ....... passed Building x86_64:q35:Cascadelake-Server:defconfig:smp:net,e1000e:mem256:ata:cd ... running .................R....... passed Building x86_64:q35:IvyBridge:defconfig:smp2:net,i82801:efi:mem512:nvme:hd ... running ...... passed Building x86_64:q35:SandyBridge:defconfig:smp4:net,ne2k_pci:efi32:mem1G:usb:hd ... running ......... passed Building x86_64:q35:SandyBridge:defconfig:smp8:net,ne2k_pci:mem1G:usb-hub:hd ... running ....... passed Building x86_64:q35:Haswell:defconfig:smp:tpm-tis:net,pcnet:mem2G:usb-uas:hd ... running .................R.... passed Building x86_64:q35:Skylake-Client:defconfig:smp2:tpm-tis:net,rtl8139:efi:mem4G:sdhci:mmc:hd ... running ....... passed Building x86_64:q35:Conroe:defconfig:smp4:net,tulip:efi32:mem256:scsi[DC395]:hd ... running ....... passed Building x86_64:q35:Denverton:defconfig:smp2:net,tulip:efi:mem256:scsi[DC395]:hd ... running ....... passed Building x86_64:q35:EPYC-Milan:defconfig:smp:tpm-crb:net,tulip:mem256:scsi[DC395]:hd ... running ....... passed Building x86_64:q35:Nehalem:defconfig:smp:net,virtio-net:mem512:scsi[AM53C974]:hd ... running ....... passed Building x86_64:q35:Nehalem:defconfig:smp:net,virtio-net-old:mem512:scsi[AM53C974]:hd ... running ....... passed Building x86_64:q35:Westmere-IBRS:defconfig:smp2:tpm-crb:net,usb-ohci:efi:mem1G:scsi[53C810]:cd ... running .................R........... passed Building x86_64:q35:Skylake-Server:defconfig:smp4:tpm-tis:net,e1000-82544gc:efi32:mem2G:scsi[53C895A]:hd ... running ............. passed Building x86_64:pc:EPYC:defconfig:smp:pci-bridge:net,usb-uhci:mem4G:scsi[FUSION]:hd ... running ..................R.......... passed Building x86_64:q35:EPYC-IBPB:defconfig:smp2:net,e1000-82545em:efi:mem8G:scsi[MEGASAS]:hd ... running ....... passed Building x86_64:q35:Opteron_G5:defconfig:smp4:net,i82559c:efi32:mem256:scsi[MEGASAS2]:hd ... running ...... passed Building x86_64:q35:Opteron_G5:defconfig:smp4:net,i82559c:mem256:scsi[MEGASAS2]:hd ... running .................R.............. failed (silent) Building x86_64:pc:Opteron_G5:defconfig:smp4:net,i82559c:mem256:scsi[MEGASAS2]:hd ... running .......... passed Building x86_64:pc:phenom:defconfig:smp:net,i82559er:mem512:initrd ... running ........ passed Building x86_64:q35:Opteron_G1:defconfig:smp2:net,i82562:efi:mem1G:initrd ... running ...... passed Building x86_64:pc:Opteron_G2:defconfig:smp:net,usb:efi32:mem2G:scsi[virtio-pci]:hd ... running .................R................. passed Building x86_64:pc:Opteron_G2:defconfig:smp:net,usb:efi32:mem2G:scsi[virtio-pci-old]:hd ... running ................... passed Building x86_64:q35:core2duo:defconfig:smp2:net,i82559a:mem4G:virtio-pci:hd ... running ......... passed Building x86_64:q35:Broadwell:defconfig:smp4:net,i82558b:efi:mem8G:virtio:hd ... running ....... passed Building x86_64:q35:Nehalem:defconfig:smp2:net,i82558a:efi32:mem1G:virtio:hd ... running .................R... passed Building x86_64:q35:Icelake-Server:defconfig:preempt:smp4:net,ne2k_pci:efi:mem2G:virtio:cd ... running ......... passed Building x86_64:q35:Icelake-Server:defconfig:preempt:smp8:net,i82557a:mem4G:nvme:hd ... running ...... passed Building x86_64:q35:Skylake-Client-IBRS:defconfig:preempt:smp2:net,i82558b:efi32:mem1G:sdhci:mmc:hd ... running ...... passed Building x86_64:q35:KnightsMill:defconfig:preempt:smp6:net,i82550:mem512:initrd ... running ...... passed Building x86_64:q35:Cooperlake:defconfig:smp2:net,usb-ohci:efi:mem1G:scsi[53C810]:hd ... running ....... passed Building x86_64:q35:EPYC-Rome:defconfig:smp4:net,igb:mem2G:scsi[53C895A]:hd ... running ......... passed Building x86_64:pc:Opteron_G3:defconfig:nosmp:net,e1000:mem1G:usb:hd ... running ....................R................. failed (silent) Building x86_64:q35:Opteron_G4:defconfig:nosmp:net,ne2k_pci:efi:mem512:ata:hd ... running .....................R....... passed Building x86_64:q35:Haswell-noTSX-IBRS:defconfig:nosmp:net,pcnet:efi32:mem2G:ata:hd ... running .................R.............. failed (silent)
An earlier test run:
Building x86_64:q35:Broadwell-noTSX:defconfig:smp:net,e1000:mem256:ata:hd ... running ....... passed Building x86_64:q35:Cascadelake-Server:defconfig:smp:net,e1000e:mem256:ata:cd ... running .................R....... passed Building x86_64:q35:IvyBridge:defconfig:smp2:net,i82801:efi:mem512:nvme:hd ... running ........ passed Building x86_64:q35:SandyBridge:defconfig:smp4:net,ne2k_pci:efi32:mem1G:usb:hd ... running .......... passed Building x86_64:q35:SandyBridge:defconfig:smp8:net,ne2k_pci:mem1G:usb-hub:hd ... running ....... passed Building x86_64:q35:Haswell:defconfig:smp:tpm-tis:net,pcnet:mem2G:usb-uas:hd ... running .................R.... passed Building x86_64:q35:Skylake-Client:defconfig:smp2:tpm-tis:net,rtl8139:efi:mem4G:sdhci:mmc:hd ... running ....... passed Building x86_64:q35:Conroe:defconfig:smp4:net,tulip:efi32:mem256:scsi[DC395]:hd ... running ......... passed Building x86_64:q35:Denverton:defconfig:smp2:net,tulip:efi:mem256:scsi[DC395]:hd ... running ....... passed Building x86_64:q35:EPYC-Milan:defconfig:smp:tpm-crb:net,tulip:mem256:scsi[DC395]:hd ... running ....... passed Building x86_64:q35:Nehalem:defconfig:smp:net,virtio-net:mem512:scsi[AM53C974]:hd ... running ....... passed Building x86_64:q35:Nehalem:defconfig:smp:net,virtio-net-old:mem512:scsi[AM53C974]:hd ... running ........ passed Building x86_64:q35:Westmere-IBRS:defconfig:smp2:tpm-crb:net,usb-ohci:efi:mem1G:scsi[53C810]:cd ... running .......... passed Building x86_64:q35:Skylake-Server:defconfig:smp4:tpm-tis:net,e1000-82544gc:efi32:mem2G:scsi[53C895A]:hd ... running .................R..... passed Building x86_64:pc:EPYC:defconfig:smp:pci-bridge:net,usb-uhci:mem4G:scsi[FUSION]:hd ... running .................R.............. failed (silent) Building x86_64:q35:EPYC-IBPB:defconfig:smp2:net,e1000-82545em:efi:mem8G:scsi[MEGASAS]:hd ... running ....... passed Building x86_64:q35:Opteron_G5:defconfig:smp4:net,i82559c:efi32:mem256:scsi[MEGASAS2]:hd ... running ....... passed Building x86_64:q35:Opteron_G5:defconfig:smp4:net,i82559c:mem256:scsi[MEGASAS2]:hd ... running ....... passed Building x86_64:pc:Opteron_G5:defconfig:smp4:net,i82559c:mem256:scsi[MEGASAS2]:hd ... running .......... passed Building x86_64:pc:phenom:defconfig:smp:net,i82559er:mem512:initrd ... running ........ passed Building x86_64:q35:Opteron_G1:defconfig:smp2:net,i82562:efi:mem1G:initrd ... running ...... passed Building x86_64:pc:Opteron_G2:defconfig:smp:net,usb:efi32:mem2G:scsi[virtio-pci]:hd ... running .......... passed Building x86_64:pc:Opteron_G2:defconfig:smp:net,usb:efi32:mem2G:scsi[virtio-pci-old]:hd ... running .......... passed Building x86_64:q35:core2duo:defconfig:smp2:net,i82559a:mem4G:virtio-pci:hd ... running ...... passed Building x86_64:q35:Broadwell:defconfig:smp4:net,i82558b:efi:mem8G:virtio:hd ... running ....... passed Building x86_64:q35:Nehalem:defconfig:smp2:net,i82558a:efi32:mem1G:virtio:hd ... running ...... passed Building x86_64:q35:Icelake-Server:defconfig:preempt:smp4:net,ne2k_pci:efi:mem2G:virtio:cd ... running ......... passed Building x86_64:q35:Icelake-Server:defconfig:preempt:smp8:net,i82557a:mem4G:nvme:hd ... running ....... passed Building x86_64:q35:Skylake-Client-IBRS:defconfig:preempt:smp2:net,i82558b:efi32:mem1G:sdhci:mmc:hd ... running ....... passed Building x86_64:q35:KnightsMill:defconfig:preempt:smp6:net,i82550:mem512:initrd ... running ....... passed Building x86_64:q35:Cooperlake:defconfig:smp2:net,usb-ohci:efi:mem1G:scsi[53C810]:hd ... running ........ passed Building x86_64:q35:EPYC-Rome:defconfig:smp4:net,igb:mem2G:scsi[53C895A]:hd ... running ......... passed Building x86_64:pc:Opteron_G3:defconfig:nosmp:net,e1000:mem1G:usb:hd ... running ....................R................. failed (silent) Building x86_64:q35:Opteron_G4:defconfig:nosmp:net,ne2k_pci:efi:mem512:ata:hd ... running ....... passed Building x86_64:q35:Haswell-noTSX-IBRS:defconfig:nosmp:net,pcnet:efi32:mem2G:ata:hd ... running ....... passed
"R" means retry, and the dots reflect time expired. It looks like it happens most of the time, but not always, on affected CPUs. I don't have specific data for non-Intel CPUs. I don't think I see the problem there, but there is too much interference from other problems to be sure.
For comparison, here is the result from the latest mainline:
Building x86_64:q35:Broadwell-noTSX:defconfig:smp:net,e1000:mem256:ata:hd ... running ....... passed Building x86_64:q35:Cascadelake-Server:defconfig:smp:net,e1000e:mem256:ata:cd ... running .......... passed Building x86_64:q35:IvyBridge:defconfig:smp2:net,i82801:efi:mem512:nvme:hd ... running ...... passed Building x86_64:q35:SandyBridge:defconfig:smp4:net,ne2k_pci:efi32:mem1G:usb:hd ... running ......... passed Building x86_64:q35:SandyBridge:defconfig:smp8:net,ne2k_pci:mem1G:usb-hub:hd ... running ........... passed Building x86_64:q35:Haswell:defconfig:smp:tpm-tis:net,pcnet:mem2G:usb-uas:hd ... running ........ passed Building x86_64:q35:Skylake-Client:defconfig:smp2:tpm-tis:net,rtl8139:efi:mem4G:sdhci:mmc:hd ... running ....... passed Building x86_64:q35:Conroe:defconfig:smp4:net,tulip:efi32:mem256:scsi[DC395]:hd ... running ....... passed Building x86_64:q35:Denverton:defconfig:smp2:net,tulip:efi:mem256:scsi[DC395]:hd ... running ....... passed Building x86_64:q35:EPYC-Milan:defconfig:smp:tpm-crb:net,tulip:mem256:scsi[DC395]:hd ... running ....... passed Building x86_64:q35:Nehalem:defconfig:smp:net,virtio-net:mem512:scsi[AM53C974]:hd ... running ....... passed Building x86_64:q35:Nehalem:defconfig:smp:net,virtio-net-old:mem512:scsi[AM53C974]:hd ... running ....... passed Building x86_64:q35:Westmere-IBRS:defconfig:smp2:tpm-crb:net,usb-ohci:efi:mem1G:scsi[53C810]:cd ... running .......... passed Building x86_64:q35:Skylake-Server:defconfig:smp4:tpm-tis:net,e1000-82544gc:efi32:mem2G:scsi[53C895A]:hd ... running ....... passed Building x86_64:pc:EPYC:defconfig:smp:pci-bridge:net,usb-uhci:mem4G:scsi[FUSION]:hd ... running ............. passed Building x86_64:q35:EPYC-IBPB:defconfig:smp2:net,e1000-82545em:efi:mem8G:scsi[MEGASAS]:hd ... running ....... passed Building x86_64:q35:Opteron_G5:defconfig:smp4:net,i82559c:efi32:mem256:scsi[MEGASAS2]:hd ... running ....... passed Building x86_64:q35:Opteron_G5:defconfig:smp4:net,i82559c:mem256:scsi[MEGASAS2]:hd ... running ...... passed Building x86_64:pc:Opteron_G5:defconfig:smp4:net,i82559c:mem256:scsi[MEGASAS2]:hd ... running ......... passed Building x86_64:pc:phenom:defconfig:smp:net,i82559er:mem512:initrd ... running ......... passed Building x86_64:q35:Opteron_G1:defconfig:smp2:net,i82562:efi:mem1G:initrd ... running ......... passed Building x86_64:pc:Opteron_G2:defconfig:smp:net,usb:efi32:mem2G:scsi[virtio-pci]:hd ... running ......... passed Building x86_64:pc:Opteron_G2:defconfig:smp:net,usb:efi32:mem2G:scsi[virtio-pci-old]:hd ... running ......... passed Building x86_64:q35:core2duo:defconfig:smp2:net,i82559a:mem4G:virtio-pci:hd ... running ...... passed Building x86_64:q35:Broadwell:defconfig:smp4:net,i82558b:efi:mem8G:virtio:hd ... running ....... passed Building x86_64:q35:Nehalem:defconfig:smp2:net,i82558a:efi32:mem1G:virtio:hd ... running ...... passed Building x86_64:q35:Icelake-Server:defconfig:preempt:smp4:net,ne2k_pci:efi:mem2G:virtio:cd ... running ............ passed Building x86_64:q35:Icelake-Server:defconfig:preempt:smp8:net,i82557a:mem4G:nvme:hd ... running ....... passed Building x86_64:q35:Skylake-Client-IBRS:defconfig:preempt:smp2:net,i82558b:efi32:mem1G:sdhci:mmc:hd ... running ...... passed Building x86_64:q35:KnightsMill:defconfig:preempt:smp6:net,i82550:mem512:initrd ... running ...... passed Building x86_64:q35:Cooperlake:defconfig:smp2:net,usb-ohci:efi:mem1G:scsi[53C810]:hd ... running ....... passed Building x86_64:q35:EPYC-Rome:defconfig:smp4:net,igb:mem2G:scsi[53C895A]:hd ... running .......... passed Building x86_64:pc:Opteron_G3:defconfig:nosmp:net,e1000:mem1G:usb:hd ... running .......... passed Building x86_64:q35:Opteron_G4:defconfig:nosmp:net,ne2k_pci:efi:mem512:ata:hd ... running ...... passed Building x86_64:q35:Haswell-noTSX-IBRS:defconfig:nosmp:net,pcnet:efi32:mem2G:ata:hd ... running ...... passed
I freely confess that I am having a hard time imagining what would be CPU dependent in that code. Timing, maybe? Whatever the reason, I am not seeing these failures in my testing.
So which of the following Kconfig options is defined in your .config? CONFIG_TASKS_RCU, CONFIG_TASKS_RUDE_RCU, and CONFIG_TASKS_TRACE_RCU.
If you have more than one of them, could you please apply this patch and show me the corresponding console output from the resulting hang?
Thanx, Paul
------------------------------------------------------------------------
commit 709a917710dc01798e01750ea628ece4bfc42b7b Author: Paul E. McKenney paulmck@kernel.org Date: Thu Jul 27 13:13:46 2023 -0700
rcu-tasks: Add printk()s to localize boot-time self-test hang
Currently, rcu_tasks_initiate_self_tests() prints a message and then initiates self tests on up to three different RCU Tasks flavors. If one of the flavors has a grace-period hang, it is not easy to work out which of the three hung. This commit therefore prints a message prior to each individual test.
Reported-by: Guenter Roeck linux@roeck-us.net Signed-off-by: Paul E. McKenney paulmck@kernel.org
diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h index 56c470a489c8..427433c90935 100644 --- a/kernel/rcu/tasks.h +++ b/kernel/rcu/tasks.h @@ -1981,20 +1981,22 @@ static void test_rcu_tasks_callback(struct rcu_head *rhp)
static void rcu_tasks_initiate_self_tests(void) { - pr_info("Running RCU-tasks wait API self tests\n"); #ifdef CONFIG_TASKS_RCU + pr_info("Running RCU Tasks wait API self tests\n"); tests[0].runstart = jiffies; synchronize_rcu_tasks(); call_rcu_tasks(&tests[0].rh, test_rcu_tasks_callback); #endif
#ifdef CONFIG_TASKS_RUDE_RCU + pr_info("Running RCU Tasks Rude wait API self tests\n"); tests[1].runstart = jiffies; synchronize_rcu_tasks_rude(); call_rcu_tasks_rude(&tests[1].rh, test_rcu_tasks_callback); #endif
#ifdef CONFIG_TASKS_TRACE_RCU + pr_info("Running RCU Tasks Trace wait API self tests\n"); tests[2].runstart = jiffies; synchronize_rcu_tasks_trace(); call_rcu_tasks_trace(&tests[2].rh, test_rcu_tasks_callback);
On Jul 27, 2023, at 4:33 PM, Paul E. McKenney paulmck@kernel.org wrote:
On Thu, Jul 27, 2023 at 10:39:17AM -0700, Guenter Roeck wrote:
On 7/27/23 09:07, Paul E. McKenney wrote:
...]
No. However, (unrelated) in linux-next, rcu tests sometimes result in apparent hangs or long runtime.
[ 0.778841] Mount-cache hash table entries: 512 (order: 0, 4096 bytes, linear) [ 0.779011] Mountpoint-cache hash table entries: 512 (order: 0, 4096 bytes, linear) [ 0.797998] Running RCU synchronous self tests [ 0.798209] Running RCU synchronous self tests [ 0.912368] smpboot: CPU0: AMD Opteron 63xx class CPU (family: 0x15, model: 0x2, stepping: 0x0) [ 0.923398] RCU Tasks: Setting shift to 2 and lim to 1 rcu_task_cb_adjust=1. [ 0.925419] Running RCU-tasks wait API self tests
(hangs until aborted). This is primarily with Opteron CPUs, but also with others such as Haswell, Icelake-Server, and pentium3. It is all but impossible to bisect because it doesn't happen all the time. All I was able to figure out was that it has to do with rcu changes in linux-next. I'd be much more concerned about that.
First I have heard of this, so thank you for letting me know.
About what fraction of the time does this happen?
Here is a sample test log from yesterday's -next. This is with x86_64. Today's -next always crashes, so no data.
Building x86_64:q35:Broadwell-noTSX:defconfig:smp:net,e1000:mem256:ata:hd ... running ....... passed Building x86_64:q35:Cascadelake-Server:defconfig:smp:net,e1000e:mem256:ata:cd ... running .................R....... passed Building x86_64:q35:IvyBridge:defconfig:smp2:net,i82801:efi:mem512:nvme:hd ... running ...... passed Building x86_64:q35:SandyBridge:defconfig:smp4:net,ne2k_pci:efi32:mem1G:usb:hd ... running ......... passed Building x86_64:q35:SandyBridge:defconfig:smp8:net,ne2k_pci:mem1G:usb-hub:hd ... running ....... passed Building x86_64:q35:Haswell:defconfig:smp:tpm-tis:net,pcnet:mem2G:usb-uas:hd ... running .................R.... passed Building x86_64:q35:Skylake-Client:defconfig:smp2:tpm-tis:net,rtl8139:efi:mem4G:sdhci:mmc:hd ... running ....... passed Building x86_64:q35:Conroe:defconfig:smp4:net,tulip:efi32:mem256:scsi[DC395]:hd ... running ....... passed Building x86_64:q35:Denverton:defconfig:smp2:net,tulip:efi:mem256:scsi[DC395]:hd ... running ....... passed Building x86_64:q35:EPYC-Milan:defconfig:smp:tpm-crb:net,tulip:mem256:scsi[DC395]:hd ... running ....... passed Building x86_64:q35:Nehalem:defconfig:smp:net,virtio-net:mem512:scsi[AM53C974]:hd ... running ....... passed Building x86_64:q35:Nehalem:defconfig:smp:net,virtio-net-old:mem512:scsi[AM53C974]:hd ... running ....... passed Building x86_64:q35:Westmere-IBRS:defconfig:smp2:tpm-crb:net,usb-ohci:efi:mem1G:scsi[53C810]:cd ... running .................R........... passed Building x86_64:q35:Skylake-Server:defconfig:smp4:tpm-tis:net,e1000-82544gc:efi32:mem2G:scsi[53C895A]:hd ... running ............. passed Building x86_64:pc:EPYC:defconfig:smp:pci-bridge:net,usb-uhci:mem4G:scsi[FUSION]:hd ... running ..................R.......... passed Building x86_64:q35:EPYC-IBPB:defconfig:smp2:net,e1000-82545em:efi:mem8G:scsi[MEGASAS]:hd ... running ....... passed Building x86_64:q35:Opteron_G5:defconfig:smp4:net,i82559c:efi32:mem256:scsi[MEGASAS2]:hd ... running ...... passed Building x86_64:q35:Opteron_G5:defconfig:smp4:net,i82559c:mem256:scsi[MEGASAS2]:hd ... running .................R.............. failed (silent) Building x86_64:pc:Opteron_G5:defconfig:smp4:net,i82559c:mem256:scsi[MEGASAS2]:hd ... running .......... passed Building x86_64:pc:phenom:defconfig:smp:net,i82559er:mem512:initrd ... running ........ passed Building x86_64:q35:Opteron_G1:defconfig:smp2:net,i82562:efi:mem1G:initrd ... running ...... passed Building x86_64:pc:Opteron_G2:defconfig:smp:net,usb:efi32:mem2G:scsi[virtio-pci]:hd ... running .................R................. passed Building x86_64:pc:Opteron_G2:defconfig:smp:net,usb:efi32:mem2G:scsi[virtio-pci-old]:hd ... running ................... passed Building x86_64:q35:core2duo:defconfig:smp2:net,i82559a:mem4G:virtio-pci:hd ... running ......... passed Building x86_64:q35:Broadwell:defconfig:smp4:net,i82558b:efi:mem8G:virtio:hd ... running ....... passed Building x86_64:q35:Nehalem:defconfig:smp2:net,i82558a:efi32:mem1G:virtio:hd ... running .................R... passed Building x86_64:q35:Icelake-Server:defconfig:preempt:smp4:net,ne2k_pci:efi:mem2G:virtio:cd ... running ......... passed Building x86_64:q35:Icelake-Server:defconfig:preempt:smp8:net,i82557a:mem4G:nvme:hd ... running ...... passed Building x86_64:q35:Skylake-Client-IBRS:defconfig:preempt:smp2:net,i82558b:efi32:mem1G:sdhci:mmc:hd ... running ...... passed Building x86_64:q35:KnightsMill:defconfig:preempt:smp6:net,i82550:mem512:initrd ... running ...... passed Building x86_64:q35:Cooperlake:defconfig:smp2:net,usb-ohci:efi:mem1G:scsi[53C810]:hd ... running ....... passed Building x86_64:q35:EPYC-Rome:defconfig:smp4:net,igb:mem2G:scsi[53C895A]:hd ... running ......... passed Building x86_64:pc:Opteron_G3:defconfig:nosmp:net,e1000:mem1G:usb:hd ... running ....................R................. failed (silent) Building x86_64:q35:Opteron_G4:defconfig:nosmp:net,ne2k_pci:efi:mem512:ata:hd ... running .....................R....... passed Building x86_64:q35:Haswell-noTSX-IBRS:defconfig:nosmp:net,pcnet:efi32:mem2G:ata:hd ... running .................R.............. failed (silent)
An earlier test run:
Building x86_64:q35:Broadwell-noTSX:defconfig:smp:net,e1000:mem256:ata:hd ... running ....... passed Building x86_64:q35:Cascadelake-Server:defconfig:smp:net,e1000e:mem256:ata:cd ... running .................R....... passed Building x86_64:q35:IvyBridge:defconfig:smp2:net,i82801:efi:mem512:nvme:hd ... running ........ passed Building x86_64:q35:SandyBridge:defconfig:smp4:net,ne2k_pci:efi32:mem1G:usb:hd ... running .......... passed Building x86_64:q35:SandyBridge:defconfig:smp8:net,ne2k_pci:mem1G:usb-hub:hd ... running ....... passed Building x86_64:q35:Haswell:defconfig:smp:tpm-tis:net,pcnet:mem2G:usb-uas:hd ... running .................R.... passed Building x86_64:q35:Skylake-Client:defconfig:smp2:tpm-tis:net,rtl8139:efi:mem4G:sdhci:mmc:hd ... running ....... passed Building x86_64:q35:Conroe:defconfig:smp4:net,tulip:efi32:mem256:scsi[DC395]:hd ... running ......... passed Building x86_64:q35:Denverton:defconfig:smp2:net,tulip:efi:mem256:scsi[DC395]:hd ... running ....... passed Building x86_64:q35:EPYC-Milan:defconfig:smp:tpm-crb:net,tulip:mem256:scsi[DC395]:hd ... running ....... passed Building x86_64:q35:Nehalem:defconfig:smp:net,virtio-net:mem512:scsi[AM53C974]:hd ... running ....... passed Building x86_64:q35:Nehalem:defconfig:smp:net,virtio-net-old:mem512:scsi[AM53C974]:hd ... running ........ passed Building x86_64:q35:Westmere-IBRS:defconfig:smp2:tpm-crb:net,usb-ohci:efi:mem1G:scsi[53C810]:cd ... running .......... passed Building x86_64:q35:Skylake-Server:defconfig:smp4:tpm-tis:net,e1000-82544gc:efi32:mem2G:scsi[53C895A]:hd ... running .................R..... passed Building x86_64:pc:EPYC:defconfig:smp:pci-bridge:net,usb-uhci:mem4G:scsi[FUSION]:hd ... running .................R.............. failed (silent) Building x86_64:q35:EPYC-IBPB:defconfig:smp2:net,e1000-82545em:efi:mem8G:scsi[MEGASAS]:hd ... running ....... passed Building x86_64:q35:Opteron_G5:defconfig:smp4:net,i82559c:efi32:mem256:scsi[MEGASAS2]:hd ... running ....... passed Building x86_64:q35:Opteron_G5:defconfig:smp4:net,i82559c:mem256:scsi[MEGASAS2]:hd ... running ....... passed Building x86_64:pc:Opteron_G5:defconfig:smp4:net,i82559c:mem256:scsi[MEGASAS2]:hd ... running .......... passed Building x86_64:pc:phenom:defconfig:smp:net,i82559er:mem512:initrd ... running ........ passed Building x86_64:q35:Opteron_G1:defconfig:smp2:net,i82562:efi:mem1G:initrd ... running ...... passed Building x86_64:pc:Opteron_G2:defconfig:smp:net,usb:efi32:mem2G:scsi[virtio-pci]:hd ... running .......... passed Building x86_64:pc:Opteron_G2:defconfig:smp:net,usb:efi32:mem2G:scsi[virtio-pci-old]:hd ... running .......... passed Building x86_64:q35:core2duo:defconfig:smp2:net,i82559a:mem4G:virtio-pci:hd ... running ...... passed Building x86_64:q35:Broadwell:defconfig:smp4:net,i82558b:efi:mem8G:virtio:hd ... running ....... passed Building x86_64:q35:Nehalem:defconfig:smp2:net,i82558a:efi32:mem1G:virtio:hd ... running ...... passed Building x86_64:q35:Icelake-Server:defconfig:preempt:smp4:net,ne2k_pci:efi:mem2G:virtio:cd ... running ......... passed Building x86_64:q35:Icelake-Server:defconfig:preempt:smp8:net,i82557a:mem4G:nvme:hd ... running ....... passed Building x86_64:q35:Skylake-Client-IBRS:defconfig:preempt:smp2:net,i82558b:efi32:mem1G:sdhci:mmc:hd ... running ....... passed Building x86_64:q35:KnightsMill:defconfig:preempt:smp6:net,i82550:mem512:initrd ... running ....... passed Building x86_64:q35:Cooperlake:defconfig:smp2:net,usb-ohci:efi:mem1G:scsi[53C810]:hd ... running ........ passed Building x86_64:q35:EPYC-Rome:defconfig:smp4:net,igb:mem2G:scsi[53C895A]:hd ... running ......... passed Building x86_64:pc:Opteron_G3:defconfig:nosmp:net,e1000:mem1G:usb:hd ... running ....................R................. failed (silent) Building x86_64:q35:Opteron_G4:defconfig:nosmp:net,ne2k_pci:efi:mem512:ata:hd ... running ....... passed Building x86_64:q35:Haswell-noTSX-IBRS:defconfig:nosmp:net,pcnet:efi32:mem2G:ata:hd ... running ....... passed
"R" means retry, and the dots reflect time expired. It looks like it happens most of the time, but not always, on affected CPUs. I don't have specific data for non-Intel CPUs. I don't think I see the problem there, but there is too much interference from other problems to be sure.
For comparison, here is the result from the latest mainline:
Building x86_64:q35:Broadwell-noTSX:defconfig:smp:net,e1000:mem256:ata:hd ... running ....... passed Building x86_64:q35:Cascadelake-Server:defconfig:smp:net,e1000e:mem256:ata:cd ... running .......... passed Building x86_64:q35:IvyBridge:defconfig:smp2:net,i82801:efi:mem512:nvme:hd ... running ...... passed Building x86_64:q35:SandyBridge:defconfig:smp4:net,ne2k_pci:efi32:mem1G:usb:hd ... running ......... passed Building x86_64:q35:SandyBridge:defconfig:smp8:net,ne2k_pci:mem1G:usb-hub:hd ... running ........... passed Building x86_64:q35:Haswell:defconfig:smp:tpm-tis:net,pcnet:mem2G:usb-uas:hd ... running ........ passed Building x86_64:q35:Skylake-Client:defconfig:smp2:tpm-tis:net,rtl8139:efi:mem4G:sdhci:mmc:hd ... running ....... passed Building x86_64:q35:Conroe:defconfig:smp4:net,tulip:efi32:mem256:scsi[DC395]:hd ... running ....... passed Building x86_64:q35:Denverton:defconfig:smp2:net,tulip:efi:mem256:scsi[DC395]:hd ... running ....... passed Building x86_64:q35:EPYC-Milan:defconfig:smp:tpm-crb:net,tulip:mem256:scsi[DC395]:hd ... running ....... passed Building x86_64:q35:Nehalem:defconfig:smp:net,virtio-net:mem512:scsi[AM53C974]:hd ... running ....... passed Building x86_64:q35:Nehalem:defconfig:smp:net,virtio-net-old:mem512:scsi[AM53C974]:hd ... running ....... passed Building x86_64:q35:Westmere-IBRS:defconfig:smp2:tpm-crb:net,usb-ohci:efi:mem1G:scsi[53C810]:cd ... running .......... passed Building x86_64:q35:Skylake-Server:defconfig:smp4:tpm-tis:net,e1000-82544gc:efi32:mem2G:scsi[53C895A]:hd ... running ....... passed Building x86_64:pc:EPYC:defconfig:smp:pci-bridge:net,usb-uhci:mem4G:scsi[FUSION]:hd ... running ............. passed Building x86_64:q35:EPYC-IBPB:defconfig:smp2:net,e1000-82545em:efi:mem8G:scsi[MEGASAS]:hd ... running ....... passed Building x86_64:q35:Opteron_G5:defconfig:smp4:net,i82559c:efi32:mem256:scsi[MEGASAS2]:hd ... running ....... passed Building x86_64:q35:Opteron_G5:defconfig:smp4:net,i82559c:mem256:scsi[MEGASAS2]:hd ... running ...... passed Building x86_64:pc:Opteron_G5:defconfig:smp4:net,i82559c:mem256:scsi[MEGASAS2]:hd ... running ......... passed Building x86_64:pc:phenom:defconfig:smp:net,i82559er:mem512:initrd ... running ......... passed Building x86_64:q35:Opteron_G1:defconfig:smp2:net,i82562:efi:mem1G:initrd ... running ......... passed Building x86_64:pc:Opteron_G2:defconfig:smp:net,usb:efi32:mem2G:scsi[virtio-pci]:hd ... running ......... passed Building x86_64:pc:Opteron_G2:defconfig:smp:net,usb:efi32:mem2G:scsi[virtio-pci-old]:hd ... running ......... passed Building x86_64:q35:core2duo:defconfig:smp2:net,i82559a:mem4G:virtio-pci:hd ... running ...... passed Building x86_64:q35:Broadwell:defconfig:smp4:net,i82558b:efi:mem8G:virtio:hd ... running ....... passed Building x86_64:q35:Nehalem:defconfig:smp2:net,i82558a:efi32:mem1G:virtio:hd ... running ...... passed Building x86_64:q35:Icelake-Server:defconfig:preempt:smp4:net,ne2k_pci:efi:mem2G:virtio:cd ... running ............ passed Building x86_64:q35:Icelake-Server:defconfig:preempt:smp8:net,i82557a:mem4G:nvme:hd ... running ....... passed Building x86_64:q35:Skylake-Client-IBRS:defconfig:preempt:smp2:net,i82558b:efi32:mem1G:sdhci:mmc:hd ... running ...... passed Building x86_64:q35:KnightsMill:defconfig:preempt:smp6:net,i82550:mem512:initrd ... running ...... passed Building x86_64:q35:Cooperlake:defconfig:smp2:net,usb-ohci:efi:mem1G:scsi[53C810]:hd ... running ....... passed Building x86_64:q35:EPYC-Rome:defconfig:smp4:net,igb:mem2G:scsi[53C895A]:hd ... running .......... passed Building x86_64:pc:Opteron_G3:defconfig:nosmp:net,e1000:mem1G:usb:hd ... running .......... passed Building x86_64:q35:Opteron_G4:defconfig:nosmp:net,ne2k_pci:efi:mem512:ata:hd ... running ...... passed Building x86_64:q35:Haswell-noTSX-IBRS:defconfig:nosmp:net,pcnet:efi32:mem2G:ata:hd ... running ...... passed
I freely confess that I am having a hard time imagining what would be CPU dependent in that code. Timing, maybe? Whatever the reason, I am not seeing these failures in my testing.
So which of the following Kconfig options is defined in your .config? CONFIG_TASKS_RCU, CONFIG_TASKS_RUDE_RCU, and CONFIG_TASKS_TRACE_RCU.
If you have more than one of them, could you please apply this patch and show me the corresponding console output from the resulting hang?
FWIW, I am not able to repro this issue either. If a .config can be shared of the problem system, I can try it out to see if it can be reproduced on my side.
Cheers,
- Joel
Thanx, Paul
commit 709a917710dc01798e01750ea628ece4bfc42b7b Author: Paul E. McKenney paulmck@kernel.org Date: Thu Jul 27 13:13:46 2023 -0700
rcu-tasks: Add printk()s to localize boot-time self-test hang
Currently, rcu_tasks_initiate_self_tests() prints a message and then initiates self tests on up to three different RCU Tasks flavors. If one of the flavors has a grace-period hang, it is not easy to work out which of the three hung. This commit therefore prints a message prior to each individual test.
Reported-by: Guenter Roeck linux@roeck-us.net Signed-off-by: Paul E. McKenney paulmck@kernel.org
diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h index 56c470a489c8..427433c90935 100644 --- a/kernel/rcu/tasks.h +++ b/kernel/rcu/tasks.h @@ -1981,20 +1981,22 @@ static void test_rcu_tasks_callback(struct rcu_head *rhp)
static void rcu_tasks_initiate_self_tests(void) {
- pr_info("Running RCU-tasks wait API self tests\n");
#ifdef CONFIG_TASKS_RCU
- pr_info("Running RCU Tasks wait API self tests\n"); tests[0].runstart = jiffies; synchronize_rcu_tasks(); call_rcu_tasks(&tests[0].rh, test_rcu_tasks_callback);
#endif
#ifdef CONFIG_TASKS_RUDE_RCU
- pr_info("Running RCU Tasks Rude wait API self tests\n"); tests[1].runstart = jiffies; synchronize_rcu_tasks_rude(); call_rcu_tasks_rude(&tests[1].rh, test_rcu_tasks_callback);
#endif
#ifdef CONFIG_TASKS_TRACE_RCU
- pr_info("Running RCU Tasks Trace wait API self tests\n"); tests[2].runstart = jiffies; synchronize_rcu_tasks_trace(); call_rcu_tasks_trace(&tests[2].rh, test_rcu_tasks_callback);
On 7/27/23 16:18, Joel Fernandes wrote:
[ ... ]
I freely confess that I am having a hard time imagining what would be CPU dependent in that code. Timing, maybe? Whatever the reason, I am not seeing these failures in my testing.
So which of the following Kconfig options is defined in your .config? CONFIG_TASKS_RCU, CONFIG_TASKS_RUDE_RCU, and CONFIG_TASKS_TRACE_RCU.
If you have more than one of them, could you please apply this patch and show me the corresponding console output from the resulting hang?
FWIW, I am not able to repro this issue either. If a .config can be shared of the problem system, I can try it out to see if it can be reproduced on my side.
I managed to bisect the problem. See bisect log below. Bisect repeated twice. so it should be reliable. I don't really understand it, but the following reverts fix the problem. This is on top of next-20230721 because next-20230728 crashes immediately in my tests.
0caafe9b94ab (HEAD) Revert "sched/fair: Remove sched_feat(START_DEBIT)" 518bdbd39fdb Revert "sched/fair: Add lag based placement" a011162c3e32 Revert "sched/fair: Implement an EEVDF-like scheduling policy" df579720bf98 Revert "sched/fair: Commit to lag based placement" aac459a7e738 Revert "sched/smp: Use lag to simplify cross-runqueue placement" 8d686eb173e1 Revert "sched/fair: Commit to EEVDF" 486474c50f95 Revert "sched/debug: Rename sysctl_sched_min_granularity to sysctl_sched_base_slice" 79e94d67d08a Revert "sched/fair: Propagate enqueue flags into place_entity()" ae867bc97b71 (tag: next-20230721) Add linux-next specific files for 20230721
For context: x86 images (32 and 64 bit) in -next tend to hang at
[ 2.309323] RCU Tasks: Setting shift to 0 and lim to 1 rcu_task_cb_adjust=1. [ 2.311634] Running RCU-tasks wait API self tests
The hang is not seen with every boot; it happens roughly about once every 10 boot attempts. It is not CPU dependent as I initially thought.
Configuration file is at http://server.roeck-us.net/qemu/x86-next/config. Example qemu command line:
qemu-system-x86_64 -kernel arch/x86/boot/bzImage -M q35 -cpu Broadwell-noTSX -no-reboot \ -snapshot -device e1000,netdev=net0 -netdev user,id=net0 -m 256 \ -drive file=rootfs.ext2,format=raw,if=ide \ --append "earlycon=uart8250,io,0x3f8,9600n8 root=/dev/sda console=ttyS0" \ -nographic -monitor none
Guenter
--- # bad: [ae867bc97b713121b2a7f5fcac68378a0774739b] Add linux-next specific files for 20230721 # good: [fdf0eaf11452d72945af31804e2a1048ee1b574c] Linux 6.5-rc2 git bisect start 'HEAD' 'v6.5-rc2' # good: [f09bf8f6c8cbbff6f52523abcda88c86db72e31c] Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git git bisect good f09bf8f6c8cbbff6f52523abcda88c86db72e31c # good: [86374a6210aeebceb927204d80f9e65739134bc3] Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound.git git bisect good 86374a6210aeebceb927204d80f9e65739134bc3 # bad: [d588c93cae9e3dff15d125e755edcba5d842f41a] Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace.git git bisect bad d588c93cae9e3dff15d125e755edcba5d842f41a # good: [acadcaf8c67062ad4c1a0ad0e05bf429b04740c5] Merge branch 'for-next' of git://git.kernel.dk/linux-block.git git bisect good acadcaf8c67062ad4c1a0ad0e05bf429b04740c5 # good: [2c73542f4cdc59fd23514f9e963d0b3419bd5e16] Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd.git git bisect good 2c73542f4cdc59fd23514f9e963d0b3419bd5e16 # good: [be15b91155cd5a6c4ac8f46740ae62e610981b79] Merge remote-tracking branch 'spi/for-6.6' into spi-next git bisect good be15b91155cd5a6c4ac8f46740ae62e610981b79 # bad: [8f4995b370a57e7ad92c0f66664d171b23234337] Merge branch into tip/master: 'sched/eevdf' git bisect bad 8f4995b370a57e7ad92c0f66664d171b23234337 # bad: [99d4d26551b56f4e523dd04e4970b94aa796a64e] rbtree: Add rb_add_augmented_cached() helper git bisect bad 99d4d26551b56f4e523dd04e4970b94aa796a64e # good: [7ff1693236f5d97a939dbeb660c07671a2d57071] sched/fair: Implement prefer sibling imbalance calculation between asymmetric groups git bisect good 7ff1693236f5d97a939dbeb660c07671a2d57071 # good: [48b5583719cdfbdee238f9549a6a1a47af2b0469] sched/headers: Rename task_struct::state to task_struct::__state in the comments too git bisect good 48b5583719cdfbdee238f9549a6a1a47af2b0469 # good: [af4cf40470c22efa3987200fd19478199e08e103] sched/fair: Add cfs_rq::avg_vruntime git bisect good af4cf40470c22efa3987200fd19478199e08e103 # bad: [86bfbb7ce4f67a88df2639198169b685668e7349] sched/fair: Add lag based placement git bisect bad 86bfbb7ce4f67a88df2639198169b685668e7349 # bad: [e0c2ff903c320d3fd3c2c604dc401b3b7c0a1d13] sched/fair: Remove sched_feat(START_DEBIT) git bisect bad e0c2ff903c320d3fd3c2c604dc401b3b7c0a1d13 # first bad commit: [e0c2ff903c320d3fd3c2c604dc401b3b7c0a1d13] sched/fair: Remove sched_feat(START_DEBIT)
On Sat, Jul 29, 2023 at 09:00:02PM -0700, Guenter Roeck wrote:
On 7/27/23 16:18, Joel Fernandes wrote:
[ ... ]
I freely confess that I am having a hard time imagining what would be CPU dependent in that code. Timing, maybe? Whatever the reason, I am not seeing these failures in my testing.
So which of the following Kconfig options is defined in your .config? CONFIG_TASKS_RCU, CONFIG_TASKS_RUDE_RCU, and CONFIG_TASKS_TRACE_RCU.
If you have more than one of them, could you please apply this patch and show me the corresponding console output from the resulting hang?
FWIW, I am not able to repro this issue either. If a .config can be shared of the problem system, I can try it out to see if it can be reproduced on my side.
I managed to bisect the problem. See bisect log below. Bisect repeated twice. so it should be reliable. I don't really understand it, but the following reverts fix the problem. This is on top of next-20230721 because next-20230728 crashes immediately in my tests.
0caafe9b94ab (HEAD) Revert "sched/fair: Remove sched_feat(START_DEBIT)" 518bdbd39fdb Revert "sched/fair: Add lag based placement" a011162c3e32 Revert "sched/fair: Implement an EEVDF-like scheduling policy" df579720bf98 Revert "sched/fair: Commit to lag based placement" aac459a7e738 Revert "sched/smp: Use lag to simplify cross-runqueue placement" 8d686eb173e1 Revert "sched/fair: Commit to EEVDF" 486474c50f95 Revert "sched/debug: Rename sysctl_sched_min_granularity to sysctl_sched_base_slice" 79e94d67d08a Revert "sched/fair: Propagate enqueue flags into place_entity()" ae867bc97b71 (tag: next-20230721) Add linux-next specific files for 20230721
For context: x86 images (32 and 64 bit) in -next tend to hang at
[ 2.309323] RCU Tasks: Setting shift to 0 and lim to 1 rcu_task_cb_adjust=1. [ 2.311634] Running RCU-tasks wait API self tests
The hang is not seen with every boot; it happens roughly about once every 10 boot attempts. It is not CPU dependent as I initially thought.
Configuration file is at http://server.roeck-us.net/qemu/x86-next/config. Example qemu command line:
Hurmph, let me see if I can reproduce on next-20230731 (not having the older next thingies around).
On 7/31/23 07:19, Peter Zijlstra wrote:
On Sat, Jul 29, 2023 at 09:00:02PM -0700, Guenter Roeck wrote:
On 7/27/23 16:18, Joel Fernandes wrote:
[ ... ]
I freely confess that I am having a hard time imagining what would be CPU dependent in that code. Timing, maybe? Whatever the reason, I am not seeing these failures in my testing.
So which of the following Kconfig options is defined in your .config? CONFIG_TASKS_RCU, CONFIG_TASKS_RUDE_RCU, and CONFIG_TASKS_TRACE_RCU.
If you have more than one of them, could you please apply this patch and show me the corresponding console output from the resulting hang?
FWIW, I am not able to repro this issue either. If a .config can be shared of the problem system, I can try it out to see if it can be reproduced on my side.
I managed to bisect the problem. See bisect log below. Bisect repeated twice. so it should be reliable. I don't really understand it, but the following reverts fix the problem. This is on top of next-20230721 because next-20230728 crashes immediately in my tests.
0caafe9b94ab (HEAD) Revert "sched/fair: Remove sched_feat(START_DEBIT)" 518bdbd39fdb Revert "sched/fair: Add lag based placement" a011162c3e32 Revert "sched/fair: Implement an EEVDF-like scheduling policy" df579720bf98 Revert "sched/fair: Commit to lag based placement" aac459a7e738 Revert "sched/smp: Use lag to simplify cross-runqueue placement" 8d686eb173e1 Revert "sched/fair: Commit to EEVDF" 486474c50f95 Revert "sched/debug: Rename sysctl_sched_min_granularity to sysctl_sched_base_slice" 79e94d67d08a Revert "sched/fair: Propagate enqueue flags into place_entity()" ae867bc97b71 (tag: next-20230721) Add linux-next specific files for 20230721
For context: x86 images (32 and 64 bit) in -next tend to hang at
[ 2.309323] RCU Tasks: Setting shift to 0 and lim to 1 rcu_task_cb_adjust=1. [ 2.311634] Running RCU-tasks wait API self tests
The hang is not seen with every boot; it happens roughly about once every 10 boot attempts. It is not CPU dependent as I initially thought.
Configuration file is at http://server.roeck-us.net/qemu/x86-next/config. Example qemu command line:
Hurmph, let me see if I can reproduce on next-20230731 (not having the older next thingies around).
That crashes hard with my configuration.
[ 6.353191] kernel tried to execute NX-protected page - exploit attempt? (uid: 0) [ 6.353392] BUG: unable to handle page fault for address: ffff9b10c0013cd0 [ 6.353531] #PF: supervisor instruction fetch in kernel mode [ 6.353624] #PF: error_code(0x0011) - permissions violation [ 6.353751] PGD 1000067 P4D 1000067 PUD 1205067 PMD 1206067 PTE 800000000124e063 [ 6.354011] Oops: 0011 [#1] PREEMPT SMP PTI [ 6.354164] CPU: 0 PID: 182 Comm: kunit_try_catch Tainted: G N 6.5.0-rc4-next-20230731 #1 [ 6.354315] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.2-0-gea1b7a073390-prebuilt.qemu.org 04/01/2014 [ 6.354525] RIP: 0010:0xffff9b10c0013cd0 [ 6.354793] Code: ff ff 60 64 ce a9 ff ff ff ff 00 00 00 00 00 00 00 00 d1 3a bc a8 ff ff ff ff 00 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 <f0> 00 01 44 10 8a ff ff b8 01 01 44 10 8a ff ff 00 00 00 00 00 00 [ 6.355059] RSP: 0000:ffff9b10c027fd60 EFLAGS: 00000246 [ 6.355157] RAX: ffff9b10c0013cd0 RBX: ffff8a1043bdb400 RCX: 0000000000000000 [ 6.355259] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8a1043bdb400 [ 6.355358] RBP: ffff9b10c027fdc8 R08: 0000000000000001 R09: 0000000000000001 [ 6.355456] R10: 0000000000000001 R11: 0000000000000001 R12: ffff9b10c027fe74 [ 6.355556] R13: ffff8a10440100f0 R14: ffff8a10440101b8 R15: ffff9b10c027fe74 [ 6.355679] FS: 0000000000000000(0000) GS:ffff8a104fc00000(0000) knlGS:0000000000000000 [ 6.355798] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 6.355886] CR2: ffff9b10c0013cd0 CR3: 000000000e048000 CR4: 00000000003506f0 [ 6.356029] Call Trace: [ 6.356158] <TASK> [ 6.356334] ? __die+0x1f/0x70 [ 6.356472] ? page_fault_oops+0x14a/0x460 [ 6.356547] ? exc_page_fault+0xee/0x1c0 [ 6.356612] ? asm_exc_page_fault+0x26/0x30 [ 6.356703] ? kunit_filter_attr_tests+0xc4/0x2e0 [ 6.356796] kunit_filter_suites+0x2e2/0x460 [ 6.356889] ? __pfx_kunit_generic_run_threadfn_adapter+0x10/0x10 [ 6.356979] filter_suites_test+0xea/0x2c0 [ 6.357051] ? __pfx_kunit_generic_run_threadfn_adapter+0x10/0x10 [ 6.357148] kunit_generic_run_threadfn_adapter+0x15/0x20 [ 6.357228] kthread+0xef/0x120 [ 6.357282] ? __pfx_kthread+0x10/0x10 [ 6.357343] ret_from_fork+0x2f/0x50 [ 6.357399] ? __pfx_kthread+0x10/0x10 [ 6.357458] ret_from_fork_asm+0x1b/0x30 [ 6.357560] </TASK> [ 6.357632] Modules linked in: [ 6.357786] CR2: ffff9b10c0013cd0 [ 6.358010] ---[ end trace 0000000000000000 ]---
Enabling CONFIG_ZERO_CALL_USED_REGS might fix (hide) this, but I have not tried.
Guenter
On Mon, Jul 31, 2023 at 07:35:13AM -0700, Guenter Roeck wrote:
Hurmph, let me see if I can reproduce on next-20230731 (not having the older next thingies around).
That crashes hard with my configuration.
[ 6.353191] kernel tried to execute NX-protected page - exploit attempt? (uid: 0) [ 6.353392] BUG: unable to handle page fault for address: ffff9b10c0013cd0 [ 6.353531] #PF: supervisor instruction fetch in kernel mode [ 6.353624] #PF: error_code(0x0011) - permissions violation [ 6.353751] PGD 1000067 P4D 1000067 PUD 1205067 PMD 1206067 PTE 800000000124e063 [ 6.354011] Oops: 0011 [#1] PREEMPT SMP PTI [ 6.354164] CPU: 0 PID: 182 Comm: kunit_try_catch Tainted: G N 6.5.0-rc4-next-20230731 #1 [ 6.354315] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.2-0-gea1b7a073390-prebuilt.qemu.org 04/01/2014 [ 6.354525] RIP: 0010:0xffff9b10c0013cd0 [ 6.354793] Code: ff ff 60 64 ce a9 ff ff ff ff 00 00 00 00 00 00 00 00 d1 3a bc a8 ff ff ff ff 00 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 <f0> 00 01 44 10 8a ff ff b8 01 01 44 10 8a ff ff 00 00 00 00 00 00 [ 6.355059] RSP: 0000:ffff9b10c027fd60 EFLAGS: 00000246 [ 6.355157] RAX: ffff9b10c0013cd0 RBX: ffff8a1043bdb400 RCX: 0000000000000000 [ 6.355259] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8a1043bdb400 [ 6.355358] RBP: ffff9b10c027fdc8 R08: 0000000000000001 R09: 0000000000000001 [ 6.355456] R10: 0000000000000001 R11: 0000000000000001 R12: ffff9b10c027fe74 [ 6.355556] R13: ffff8a10440100f0 R14: ffff8a10440101b8 R15: ffff9b10c027fe74 [ 6.355679] FS: 0000000000000000(0000) GS:ffff8a104fc00000(0000) knlGS:0000000000000000 [ 6.355798] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 6.355886] CR2: ffff9b10c0013cd0 CR3: 000000000e048000 CR4: 00000000003506f0 [ 6.356029] Call Trace: [ 6.356158] <TASK> [ 6.356334] ? __die+0x1f/0x70 [ 6.356472] ? page_fault_oops+0x14a/0x460 [ 6.356547] ? exc_page_fault+0xee/0x1c0 [ 6.356612] ? asm_exc_page_fault+0x26/0x30 [ 6.356703] ? kunit_filter_attr_tests+0xc4/0x2e0 [ 6.356796] kunit_filter_suites+0x2e2/0x460 [ 6.356889] ? __pfx_kunit_generic_run_threadfn_adapter+0x10/0x10 [ 6.356979] filter_suites_test+0xea/0x2c0 [ 6.357051] ? __pfx_kunit_generic_run_threadfn_adapter+0x10/0x10 [ 6.357148] kunit_generic_run_threadfn_adapter+0x15/0x20 [ 6.357228] kthread+0xef/0x120 [ 6.357282] ? __pfx_kthread+0x10/0x10 [ 6.357343] ret_from_fork+0x2f/0x50 [ 6.357399] ? __pfx_kthread+0x10/0x10 [ 6.357458] ret_from_fork_asm+0x1b/0x30 [ 6.357560] </TASK> [ 6.357632] Modules linked in: [ 6.357786] CR2: ffff9b10c0013cd0 [ 6.358010] ---[ end trace 0000000000000000 ]---
I get:
[ 2.423691] ------------[ cut here ]------------ [ 2.424994] WARNING: CPU: 0 PID: 184 at mm/slab_common.c:992 free_large_kmalloc+0x4f/0x80 [ 2.426183] Modules linked in: [ 2.426624] CPU: 0 PID: 184 Comm: kunit_try_catch Tainted: G N 6.5.0-rc4-next-20230731 #1 [ 2.427964] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.0-debian-1.16.0-5 04/01/2014 [ 2.429265] RIP: 0010:free_large_kmalloc+0x4f/0x80 [ 2.429952] Code: f7 da 48 63 d2 48 8b 03 be 06 00 00 00 48 c1 e8 3a 48 8b 3c c5 60 ba 11 ab e8 0d 52 ff ff 89 ee 48 89 df 5b 5d e9 41 df 03 00 <0f> 0b 80 3d 49 43 e9 01 00 75 [ 2.432511] RSP: 0000:ffffadcb0024bdb8 EFLAGS: 00010246 [ 2.433259] RAX: 0100000000001000 RBX: ffffd16bc018aa40 RCX: ffffadcb0024bd7c [ 2.434262] RDX: ffffd16bc018aa48 RSI: ffffffffa96a9ec7 RDI: ffffd16bc018aa40 [ 2.435265] RBP: ffffadcb0024be60 R08: 0000000000000001 R09: 0000000000000001 [ 2.436269] R10: 0000000000000001 R11: 0000000000000000 R12: ffff8a7084014410 [ 2.437267] R13: ffff8a70840c4000 R14: 0000000000000002 R15: ffff8a70840564a8 [ 2.438271] FS: 0000000000000000(0000) GS:ffff8a708f800000(0000) knlGS:0000000000000000 [ 2.439403] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 2.440215] CR2: ffff8a7089401000 CR3: 0000000007a48001 CR4: 0000000000170ef0 [ 2.441218] Call Trace: [ 2.441568] <TASK> [ 2.441883] ? free_large_kmalloc+0x4f/0x80 [ 2.442491] ? __warn+0x80/0x170 [ 2.442988] ? free_large_kmalloc+0x4f/0x80 [ 2.443591] ? report_bug+0x171/0x1a0 [ 2.444145] ? handle_bug+0x3c/0x70 [ 2.444662] ? exc_invalid_op+0x17/0x70 [ 2.445225] ? asm_exc_invalid_op+0x1a/0x20 [ 2.445844] ? kunit_add_action+0xc7/0x140 [ 2.446455] ? free_large_kmalloc+0x4f/0x80 [ 2.447054] kunit_filter_suites+0x468/0x480 [ 2.447662] ? kunit_add_action+0xc7/0x140 [ 2.448258] ? __pfx_kunit_generic_run_threadfn_adapter+0x10/0x10 [ 2.449105] filter_suites_test+0xea/0x2c0 [ 2.449702] kunit_generic_run_threadfn_adapter+0x15/0x20 [ 2.450469] kthread+0xf0/0x120 [ 2.450940] ? __pfx_kthread+0x10/0x10 [ 2.451481] ret_from_fork+0x2f/0x50 [ 2.452012] ? __pfx_kthread+0x10/0x10 [ 2.452557] ret_from_fork_asm+0x1b/0x30 [ 2.453146] </TASK> [ 2.453474] irq event stamp: 677 [ 2.453943] hardirqs last enabled at (689): [<ffffffffa911c24a>] console_unlock+0x10a/0x160 [ 2.455151] hardirqs last disabled at (700): [<ffffffffa911c22f>] console_unlock+0xef/0x160 [ 2.456329] softirqs last enabled at (662): [<ffffffffa909179a>] irq_exit_rcu+0x7a/0xa0 [ 2.457474] softirqs last disabled at (657): [<ffffffffa909179a>] irq_exit_rcu+0x7a/0xa0 [ 2.458610] ---[ end trace 0000000000000000 ]---
But then it continues and eventually reaches:
Linux version 6.5.0-rc4-next-20230731 (root@ivb-ep) (gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40) #1 SMP PREEMPT_DYNAMIC Mon Jul 31 15:39:05 CEST 2023 Network interface test passed Boot successful. / #
Full log attached.
On 7/31/23 07:47, Peter Zijlstra wrote:
On Mon, Jul 31, 2023 at 07:35:13AM -0700, Guenter Roeck wrote:
Hurmph, let me see if I can reproduce on next-20230731 (not having the older next thingies around).
That crashes hard with my configuration.
[ 6.353191] kernel tried to execute NX-protected page - exploit attempt? (uid: 0) [ 6.353392] BUG: unable to handle page fault for address: ffff9b10c0013cd0 [ 6.353531] #PF: supervisor instruction fetch in kernel mode [ 6.353624] #PF: error_code(0x0011) - permissions violation [ 6.353751] PGD 1000067 P4D 1000067 PUD 1205067 PMD 1206067 PTE 800000000124e063 [ 6.354011] Oops: 0011 [#1] PREEMPT SMP PTI [ 6.354164] CPU: 0 PID: 182 Comm: kunit_try_catch Tainted: G N 6.5.0-rc4-next-20230731 #1 [ 6.354315] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.2-0-gea1b7a073390-prebuilt.qemu.org 04/01/2014 [ 6.354525] RIP: 0010:0xffff9b10c0013cd0 [ 6.354793] Code: ff ff 60 64 ce a9 ff ff ff ff 00 00 00 00 00 00 00 00 d1 3a bc a8 ff ff ff ff 00 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 <f0> 00 01 44 10 8a ff ff b8 01 01 44 10 8a ff ff 00 00 00 00 00 00 [ 6.355059] RSP: 0000:ffff9b10c027fd60 EFLAGS: 00000246 [ 6.355157] RAX: ffff9b10c0013cd0 RBX: ffff8a1043bdb400 RCX: 0000000000000000 [ 6.355259] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8a1043bdb400 [ 6.355358] RBP: ffff9b10c027fdc8 R08: 0000000000000001 R09: 0000000000000001 [ 6.355456] R10: 0000000000000001 R11: 0000000000000001 R12: ffff9b10c027fe74 [ 6.355556] R13: ffff8a10440100f0 R14: ffff8a10440101b8 R15: ffff9b10c027fe74 [ 6.355679] FS: 0000000000000000(0000) GS:ffff8a104fc00000(0000) knlGS:0000000000000000 [ 6.355798] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 6.355886] CR2: ffff9b10c0013cd0 CR3: 000000000e048000 CR4: 00000000003506f0 [ 6.356029] Call Trace: [ 6.356158] <TASK> [ 6.356334] ? __die+0x1f/0x70 [ 6.356472] ? page_fault_oops+0x14a/0x460 [ 6.356547] ? exc_page_fault+0xee/0x1c0 [ 6.356612] ? asm_exc_page_fault+0x26/0x30 [ 6.356703] ? kunit_filter_attr_tests+0xc4/0x2e0 [ 6.356796] kunit_filter_suites+0x2e2/0x460 [ 6.356889] ? __pfx_kunit_generic_run_threadfn_adapter+0x10/0x10 [ 6.356979] filter_suites_test+0xea/0x2c0 [ 6.357051] ? __pfx_kunit_generic_run_threadfn_adapter+0x10/0x10 [ 6.357148] kunit_generic_run_threadfn_adapter+0x15/0x20 [ 6.357228] kthread+0xef/0x120 [ 6.357282] ? __pfx_kthread+0x10/0x10 [ 6.357343] ret_from_fork+0x2f/0x50 [ 6.357399] ? __pfx_kthread+0x10/0x10 [ 6.357458] ret_from_fork_asm+0x1b/0x30 [ 6.357560] </TASK> [ 6.357632] Modules linked in: [ 6.357786] CR2: ffff9b10c0013cd0 [ 6.358010] ---[ end trace 0000000000000000 ]---
I get:
[ 2.423691] ------------[ cut here ]------------ [ 2.424994] WARNING: CPU: 0 PID: 184 at mm/slab_common.c:992 free_large_kmalloc+0x4f/0x80 [ 2.426183] Modules linked in: [ 2.426624] CPU: 0 PID: 184 Comm: kunit_try_catch Tainted: G N 6.5.0-rc4-next-20230731 #1 [ 2.427964] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.0-debian-1.16.0-5 04/01/2014 [ 2.429265] RIP: 0010:free_large_kmalloc+0x4f/0x80 [ 2.429952] Code: f7 da 48 63 d2 48 8b 03 be 06 00 00 00 48 c1 e8 3a 48 8b 3c c5 60 ba 11 ab e8 0d 52 ff ff 89 ee 48 89 df 5b 5d e9 41 df 03 00 <0f> 0b 80 3d 49 43 e9 01 00 75 [ 2.432511] RSP: 0000:ffffadcb0024bdb8 EFLAGS: 00010246 [ 2.433259] RAX: 0100000000001000 RBX: ffffd16bc018aa40 RCX: ffffadcb0024bd7c [ 2.434262] RDX: ffffd16bc018aa48 RSI: ffffffffa96a9ec7 RDI: ffffd16bc018aa40 [ 2.435265] RBP: ffffadcb0024be60 R08: 0000000000000001 R09: 0000000000000001 [ 2.436269] R10: 0000000000000001 R11: 0000000000000000 R12: ffff8a7084014410 [ 2.437267] R13: ffff8a70840c4000 R14: 0000000000000002 R15: ffff8a70840564a8 [ 2.438271] FS: 0000000000000000(0000) GS:ffff8a708f800000(0000) knlGS:0000000000000000 [ 2.439403] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 2.440215] CR2: ffff8a7089401000 CR3: 0000000007a48001 CR4: 0000000000170ef0 [ 2.441218] Call Trace: [ 2.441568] <TASK> [ 2.441883] ? free_large_kmalloc+0x4f/0x80 [ 2.442491] ? __warn+0x80/0x170 [ 2.442988] ? free_large_kmalloc+0x4f/0x80 [ 2.443591] ? report_bug+0x171/0x1a0 [ 2.444145] ? handle_bug+0x3c/0x70 [ 2.444662] ? exc_invalid_op+0x17/0x70 [ 2.445225] ? asm_exc_invalid_op+0x1a/0x20 [ 2.445844] ? kunit_add_action+0xc7/0x140 [ 2.446455] ? free_large_kmalloc+0x4f/0x80 [ 2.447054] kunit_filter_suites+0x468/0x480 [ 2.447662] ? kunit_add_action+0xc7/0x140 [ 2.448258] ? __pfx_kunit_generic_run_threadfn_adapter+0x10/0x10 [ 2.449105] filter_suites_test+0xea/0x2c0 [ 2.449702] kunit_generic_run_threadfn_adapter+0x15/0x20 [ 2.450469] kthread+0xf0/0x120 [ 2.450940] ? __pfx_kthread+0x10/0x10 [ 2.451481] ret_from_fork+0x2f/0x50 [ 2.452012] ? __pfx_kthread+0x10/0x10 [ 2.452557] ret_from_fork_asm+0x1b/0x30 [ 2.453146] </TASK> [ 2.453474] irq event stamp: 677 [ 2.453943] hardirqs last enabled at (689): [<ffffffffa911c24a>] console_unlock+0x10a/0x160 [ 2.455151] hardirqs last disabled at (700): [<ffffffffa911c22f>] console_unlock+0xef/0x160 [ 2.456329] softirqs last enabled at (662): [<ffffffffa909179a>] irq_exit_rcu+0x7a/0xa0 [ 2.457474] softirqs last disabled at (657): [<ffffffffa909179a>] irq_exit_rcu+0x7a/0xa0 [ 2.458610] ---[ end trace 0000000000000000 ]---
Same problem. I see the warning on some architectures, the crash on others. The fix for that problem is at https://lore.kernel.org/linux-kselftest/20230729010003.4058582-1-ruanjinjie@... It is caused by the "kunit: Add test attributes API" patch series. See https://lore.kernel.org/lkml/5205b6aa-c9ea-8f9c-f42c-b840346f740c@roeck-us.n...
Guenter
On Mon, Jul 31, 2023 at 04:19:34PM +0200, Peter Zijlstra wrote:
On Sat, Jul 29, 2023 at 09:00:02PM -0700, Guenter Roeck wrote:
On 7/27/23 16:18, Joel Fernandes wrote:
[ ... ]
I freely confess that I am having a hard time imagining what would be CPU dependent in that code. Timing, maybe? Whatever the reason, I am not seeing these failures in my testing.
So which of the following Kconfig options is defined in your .config? CONFIG_TASKS_RCU, CONFIG_TASKS_RUDE_RCU, and CONFIG_TASKS_TRACE_RCU.
If you have more than one of them, could you please apply this patch and show me the corresponding console output from the resulting hang?
FWIW, I am not able to repro this issue either. If a .config can be shared of the problem system, I can try it out to see if it can be reproduced on my side.
I managed to bisect the problem. See bisect log below. Bisect repeated twice. so it should be reliable. I don't really understand it, but the following reverts fix the problem. This is on top of next-20230721 because next-20230728 crashes immediately in my tests.
0caafe9b94ab (HEAD) Revert "sched/fair: Remove sched_feat(START_DEBIT)" 518bdbd39fdb Revert "sched/fair: Add lag based placement" a011162c3e32 Revert "sched/fair: Implement an EEVDF-like scheduling policy" df579720bf98 Revert "sched/fair: Commit to lag based placement" aac459a7e738 Revert "sched/smp: Use lag to simplify cross-runqueue placement" 8d686eb173e1 Revert "sched/fair: Commit to EEVDF" 486474c50f95 Revert "sched/debug: Rename sysctl_sched_min_granularity to sysctl_sched_base_slice" 79e94d67d08a Revert "sched/fair: Propagate enqueue flags into place_entity()" ae867bc97b71 (tag: next-20230721) Add linux-next specific files for 20230721
For context: x86 images (32 and 64 bit) in -next tend to hang at
[ 2.309323] RCU Tasks: Setting shift to 0 and lim to 1 rcu_task_cb_adjust=1. [ 2.311634] Running RCU-tasks wait API self tests
The hang is not seen with every boot; it happens roughly about once every 10 boot attempts. It is not CPU dependent as I initially thought.
Configuration file is at http://server.roeck-us.net/qemu/x86-next/config. Example qemu command line:
Hurmph, let me see if I can reproduce on next-20230731 (not having the older next thingies around).
I've taken your config above, and the rootfs.ext2 and run-sh from x86/. I've then modified run-sh to use:
qemu-system-x86_64 -enable-kvm -cpu host
What I'm seeing is that some boots get stuck at:
[ 0.608230] Running RCU-tasks wait API self tests
Is this the right 'problem' ?
On 7/31/23 07:39, Peter Zijlstra wrote:
On Mon, Jul 31, 2023 at 04:19:34PM +0200, Peter Zijlstra wrote:
On Sat, Jul 29, 2023 at 09:00:02PM -0700, Guenter Roeck wrote:
On 7/27/23 16:18, Joel Fernandes wrote:
[ ... ]
I freely confess that I am having a hard time imagining what would be CPU dependent in that code. Timing, maybe? Whatever the reason, I am not seeing these failures in my testing.
So which of the following Kconfig options is defined in your .config? CONFIG_TASKS_RCU, CONFIG_TASKS_RUDE_RCU, and CONFIG_TASKS_TRACE_RCU.
If you have more than one of them, could you please apply this patch and show me the corresponding console output from the resulting hang?
FWIW, I am not able to repro this issue either. If a .config can be shared of the problem system, I can try it out to see if it can be reproduced on my side.
I managed to bisect the problem. See bisect log below. Bisect repeated twice. so it should be reliable. I don't really understand it, but the following reverts fix the problem. This is on top of next-20230721 because next-20230728 crashes immediately in my tests.
0caafe9b94ab (HEAD) Revert "sched/fair: Remove sched_feat(START_DEBIT)" 518bdbd39fdb Revert "sched/fair: Add lag based placement" a011162c3e32 Revert "sched/fair: Implement an EEVDF-like scheduling policy" df579720bf98 Revert "sched/fair: Commit to lag based placement" aac459a7e738 Revert "sched/smp: Use lag to simplify cross-runqueue placement" 8d686eb173e1 Revert "sched/fair: Commit to EEVDF" 486474c50f95 Revert "sched/debug: Rename sysctl_sched_min_granularity to sysctl_sched_base_slice" 79e94d67d08a Revert "sched/fair: Propagate enqueue flags into place_entity()" ae867bc97b71 (tag: next-20230721) Add linux-next specific files for 20230721
For context: x86 images (32 and 64 bit) in -next tend to hang at
[ 2.309323] RCU Tasks: Setting shift to 0 and lim to 1 rcu_task_cb_adjust=1. [ 2.311634] Running RCU-tasks wait API self tests
The hang is not seen with every boot; it happens roughly about once every 10 boot attempts. It is not CPU dependent as I initially thought.
Configuration file is at http://server.roeck-us.net/qemu/x86-next/config. Example qemu command line:
Hurmph, let me see if I can reproduce on next-20230731 (not having the older next thingies around).
I've taken your config above, and the rootfs.ext2 and run-sh from x86/. I've then modified run-sh to use:
qemu-system-x86_64 -enable-kvm -cpu host
What I'm seeing is that some boots get stuck at:
[ 0.608230] Running RCU-tasks wait API self tests
Is this the right 'problem' ?
Yes, exactly.
Thanks, Guenter
On Mon, Jul 31, 2023 at 07:48:19AM -0700, Guenter Roeck wrote:
I've taken your config above, and the rootfs.ext2 and run-sh from x86/. I've then modified run-sh to use:
qemu-system-x86_64 -enable-kvm -cpu host
What I'm seeing is that some boots get stuck at:
[ 0.608230] Running RCU-tasks wait API self tests
Is this the right 'problem' ?
Yes, exactly.
Excellent! Let me prod that with something sharp, see what comes creeping out.
On Mon, 2023-07-31 at 16:52 +0200, Peter Zijlstra wrote:
On Mon, Jul 31, 2023 at 07:48:19AM -0700, Guenter Roeck wrote:
I've taken your config above, and the rootfs.ext2 and run-sh from x86/. I've then modified run-sh to use:
qemu-system-x86_64 -enable-kvm -cpu host
What I'm seeing is that some boots get stuck at:
[ 0.608230] Running RCU-tasks wait API self tests
Is this the right 'problem' ?
Yes, exactly.
Excellent! Let me prod that with something sharp, see what comes creeping out.
In an effort to get up to speed with this area of the kernel, I've been playing around with this too today and managed to reproduce the problem using the same configuration. I'm completely new to this code but I think I may have found the root of the problem.
What I've found is that there is a race condition between starting the RCU tasks grace-period thread in rcu_spawn_tasks_kthread_generic() and a subsequent call to synchronize_rcu_tasks_generic(). This results in rtp->tasks_gp_mutex being locked in the initial thread which subsequently blocks the newly started grace- period thread.
The problem is that although synchronize_rcu_tasks_generic() checks to see if the grace-period kthread is running, it uses rtp->kthread_ptr to achieve this. This is only set in the thread entry point and not when the thread is created, meaning that it is set only after the creating thread yields or is preempted. If this has not happened before the next call to synchronize_rcu_tasks_generic() then a deadlock occurs.
I've created a debug patch that introduces a new flag in rcu_tasks that is set when the kthread is created and used this in synchronize_rcu_tasks_generic() in place of READ_ONCE(rtp->kthread_ptr). This fixes the issue in my test environment.
I'm happy to have a go at submitting a patch for this if it helps.
On Mon, Jul 31, 2023 at 05:08:29PM +0100, Roy Hopkins wrote:
On Mon, 2023-07-31 at 16:52 +0200, Peter Zijlstra wrote:
On Mon, Jul 31, 2023 at 07:48:19AM -0700, Guenter Roeck wrote:
I've taken your config above, and the rootfs.ext2 and run-sh from x86/. I've then modified run-sh to use:
qemu-system-x86_64 -enable-kvm -cpu host
What I'm seeing is that some boots get stuck at:
[ 0.608230] Running RCU-tasks wait API self tests
Is this the right 'problem' ?
Yes, exactly.
Excellent! Let me prod that with something sharp, see what comes creeping out.
In an effort to get up to speed with this area of the kernel, I've been playing around with this too today and managed to reproduce the problem using the same configuration. I'm completely new to this code but I think I may have found the root of the problem.
What I've found is that there is a race condition between starting the RCU tasks grace-period thread in rcu_spawn_tasks_kthread_generic() and a subsequent call to synchronize_rcu_tasks_generic(). This results in rtp->tasks_gp_mutex being locked in the initial thread which subsequently blocks the newly started grace- period thread.
The problem is that although synchronize_rcu_tasks_generic() checks to see if the grace-period kthread is running, it uses rtp->kthread_ptr to achieve this. This is only set in the thread entry point and not when the thread is created, meaning that it is set only after the creating thread yields or is preempted. If this has not happened before the next call to synchronize_rcu_tasks_generic() then a deadlock occurs.
I've created a debug patch that introduces a new flag in rcu_tasks that is set when the kthread is created and used this in synchronize_rcu_tasks_generic() in place of READ_ONCE(rtp->kthread_ptr). This fixes the issue in my test environment.
I'm happy to have a go at submitting a patch for this if it helps.
Ha!, I was poking around the same thing. My hack below seems to (so far, <20 boots) help things.
diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h index 56c470a489c8..b083b5a30025 100644 --- a/kernel/rcu/tasks.h +++ b/kernel/rcu/tasks.h @@ -652,7 +658,11 @@ static void __init rcu_spawn_tasks_kthread_generic(struct rcu_tasks *rtp) t = kthread_run(rcu_tasks_kthread, rtp, "%s_kthread", rtp->kname); if (WARN_ONCE(IS_ERR(t), "%s: Could not start %s grace-period kthread, OOM is now expected behavior\n", __func__, rtp->name)) return; - smp_mb(); /* Ensure others see full kthread. */ + for (;;) { + cond_resched(); + if (smp_load_acquire(&rtp->kthread_ptr)) + break; + } }
#ifndef CONFIG_TINY_RCU
On Mon, 2023-07-31 at 18:14 +0200, Peter Zijlstra wrote:
Ha!, I was poking around the same thing. My hack below seems to (so far, <20 boots) help things.
diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h index 56c470a489c8..b083b5a30025 100644 --- a/kernel/rcu/tasks.h +++ b/kernel/rcu/tasks.h @@ -652,7 +658,11 @@ static void __init rcu_spawn_tasks_kthread_generic(struct rcu_tasks *rtp) t = kthread_run(rcu_tasks_kthread, rtp, "%s_kthread", rtp->kname); if (WARN_ONCE(IS_ERR(t), "%s: Could not start %s grace-period kthread, OOM is now expected behavior\n", __func__, rtp->name)) return; - smp_mb(); /* Ensure others see full kthread. */ + for (;;) { + cond_resched(); + if (smp_load_acquire(&rtp->kthread_ptr)) + break; + } } #ifndef CONFIG_TINY_RCU
FWIW, here's my hack which seems to fix it.
diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h index 9b9ce09f8f35..2e76fbfff9c6 100644 --- a/kernel/rcu/tasks.h +++ b/kernel/rcu/tasks.h @@ -52,6 +52,7 @@ struct rcu_tasks_percpu { * @cbs_gbl_lock: Lock protecting callback list. * @tasks_gp_mutex: Mutex protecting grace period, needed during mid-boot dead zone. * @kthread_ptr: This flavor's grace-period/callback-invocation kthread. + * @kthread_started: Flag that indicates whether kthread has been launched. * @gp_func: This flavor's grace-period-wait function. * @gp_state: Grace period's most recent state transition (debugging). * @gp_sleep: Per-grace-period sleep to prevent CPU-bound looping. @@ -92,6 +93,7 @@ struct rcu_tasks { unsigned long n_ipis; unsigned long n_ipis_fails; struct task_struct *kthread_ptr; + int kthread_started; rcu_tasks_gp_func_t gp_func; pregp_func_t pregp_func; pertask_func_t pertask_func; @@ -582,7 +584,7 @@ static void synchronize_rcu_tasks_generic(struct rcu_tasks *rtp) return;
// If the grace-period kthread is running, use it. - if (READ_ONCE(rtp->kthread_ptr)) { + if (READ_ONCE(rtp->kthread_started)) { wait_rcu_gp(rtp->call_func); return; } @@ -595,6 +597,7 @@ static void __init rcu_spawn_tasks_kthread_generic(struct rcu_tasks *rtp) struct task_struct *t;
t = kthread_run(rcu_tasks_kthread, rtp, "%s_kthread", rtp->kname); + rtp->kthread_started = 1; if (WARN_ONCE(IS_ERR(t), "%s: Could not start %s grace-period kthread, OOM is now expected behavior\n", __func__, rtp->name)) return; smp_mb(); /* Ensure others see full kthread. */
On 7/31/23 09:14, Peter Zijlstra wrote:
On Mon, Jul 31, 2023 at 05:08:29PM +0100, Roy Hopkins wrote:
On Mon, 2023-07-31 at 16:52 +0200, Peter Zijlstra wrote:
On Mon, Jul 31, 2023 at 07:48:19AM -0700, Guenter Roeck wrote:
I've taken your config above, and the rootfs.ext2 and run-sh from x86/. I've then modified run-sh to use:
qemu-system-x86_64 -enable-kvm -cpu host
What I'm seeing is that some boots get stuck at:
[ 0.608230] Running RCU-tasks wait API self tests
Is this the right 'problem' ?
Yes, exactly.
Excellent! Let me prod that with something sharp, see what comes creeping out.
In an effort to get up to speed with this area of the kernel, I've been playing around with this too today and managed to reproduce the problem using the same configuration. I'm completely new to this code but I think I may have found the root of the problem.
What I've found is that there is a race condition between starting the RCU tasks grace-period thread in rcu_spawn_tasks_kthread_generic() and a subsequent call to synchronize_rcu_tasks_generic(). This results in rtp->tasks_gp_mutex being locked in the initial thread which subsequently blocks the newly started grace- period thread.
The problem is that although synchronize_rcu_tasks_generic() checks to see if the grace-period kthread is running, it uses rtp->kthread_ptr to achieve this. This is only set in the thread entry point and not when the thread is created, meaning that it is set only after the creating thread yields or is preempted. If this has not happened before the next call to synchronize_rcu_tasks_generic() then a deadlock occurs.
I've created a debug patch that introduces a new flag in rcu_tasks that is set when the kthread is created and used this in synchronize_rcu_tasks_generic() in place of READ_ONCE(rtp->kthread_ptr). This fixes the issue in my test environment.
I'm happy to have a go at submitting a patch for this if it helps.
Ha!, I was poking around the same thing. My hack below seems to (so far, <20 boots) help things.
So, dumb question: How comes this bisects to "sched/fair: Remove sched_feat(START_DEBIT)" ?
Thanks, Guenter
diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h index 56c470a489c8..b083b5a30025 100644 --- a/kernel/rcu/tasks.h +++ b/kernel/rcu/tasks.h @@ -652,7 +658,11 @@ static void __init rcu_spawn_tasks_kthread_generic(struct rcu_tasks *rtp) t = kthread_run(rcu_tasks_kthread, rtp, "%s_kthread", rtp->kname); if (WARN_ONCE(IS_ERR(t), "%s: Could not start %s grace-period kthread, OOM is now expected behavior\n", __func__, rtp->name)) return;
- smp_mb(); /* Ensure others see full kthread. */
- for (;;) {
cond_resched();
if (smp_load_acquire(&rtp->kthread_ptr))
break;
- } }
#ifndef CONFIG_TINY_RCU
On Mon, Jul 31, 2023 at 09:34:29AM -0700, Guenter Roeck wrote:
Ha!, I was poking around the same thing. My hack below seems to (so far, <20 boots) help things.
So, dumb question: How comes this bisects to "sched/fair: Remove sched_feat(START_DEBIT)" ?
That commit changes the timings of things; dumb luck otherwise.
On 7/31/23 14:15, Peter Zijlstra wrote:
On Mon, Jul 31, 2023 at 09:34:29AM -0700, Guenter Roeck wrote:
Ha!, I was poking around the same thing. My hack below seems to (so far, <20 boots) help things.
So, dumb question: How comes this bisects to "sched/fair: Remove sched_feat(START_DEBIT)" ?
That commit changes the timings of things; dumb luck otherwise.
Kind of scary. So I only experienced the problem because the START_DEBIT patch happened to be queued roughly at the same time, and it might otherwise have found its way unnoticed into the upstream kernel. That makes me wonder if this or other similar patches may uncover similar problems elsewhere in the kernel (i.e., either hide new or existing race conditions or expose existing ones).
This in turn makes me wonder if it would be possible to define a test which would uncover such problems without the START_DEBIT patch. Any idea ?
Guenter
On Tue, Aug 01, 2023 at 10:32:45AM -0700, Guenter Roeck wrote:
On 7/31/23 14:15, Peter Zijlstra wrote:
On Mon, Jul 31, 2023 at 09:34:29AM -0700, Guenter Roeck wrote:
Ha!, I was poking around the same thing. My hack below seems to (so far, <20 boots) help things.
So, dumb question: How comes this bisects to "sched/fair: Remove sched_feat(START_DEBIT)" ?
That commit changes the timings of things; dumb luck otherwise.
Kind of scary. So I only experienced the problem because the START_DEBIT patch happened to be queued roughly at the same time, and it might otherwise have found its way unnoticed into the upstream kernel. That makes me wonder if this or other similar patches may uncover similar problems elsewhere in the kernel (i.e., either hide new or existing race conditions or expose existing ones).
This in turn makes me wonder if it would be possible to define a test which would uncover such problems without the START_DEBIT patch. Any idea ?
IIRC some of the thread sanitizers use breakpoints to inject random sleeps, specifically to tickle races.
On Tue, Aug 01, 2023 at 09:08:52PM +0200, Peter Zijlstra wrote:
On Tue, Aug 01, 2023 at 10:32:45AM -0700, Guenter Roeck wrote:
On 7/31/23 14:15, Peter Zijlstra wrote:
On Mon, Jul 31, 2023 at 09:34:29AM -0700, Guenter Roeck wrote:
Ha!, I was poking around the same thing. My hack below seems to (so far, <20 boots) help things.
So, dumb question: How comes this bisects to "sched/fair: Remove sched_feat(START_DEBIT)" ?
That commit changes the timings of things; dumb luck otherwise.
Kind of scary. So I only experienced the problem because the START_DEBIT patch happened to be queued roughly at the same time, and it might otherwise have found its way unnoticed into the upstream kernel. That makes me wonder if this or other similar patches may uncover similar problems elsewhere in the kernel (i.e., either hide new or existing race conditions or expose existing ones).
This in turn makes me wonder if it would be possible to define a test which would uncover such problems without the START_DEBIT patch. Any idea ?
IIRC some of the thread sanitizers use breakpoints to inject random sleeps, specifically to tickle races.
I have heard of are some of these, arguably including KCSAN, but they would have a tough time on this one.
They would have to inject many milliseconds between the check of ->kthread_ptr in synchronize_rcu_tasks_generic() and that mutex_lock() in rcu_tasks_one_gp(). Plus this window only occurs during boot shortly before init is spawned.
On the other hand, randomly injecting delay just before acquiring each lock would cover this case. But such a sanitzer would still only get one shot per boot of the kernel for this particular bug.
Thanx, Paul
On Tue, Aug 01, 2023 at 10:32:45AM -0700, Guenter Roeck wrote:
On 7/31/23 14:15, Peter Zijlstra wrote:
On Mon, Jul 31, 2023 at 09:34:29AM -0700, Guenter Roeck wrote:
Ha!, I was poking around the same thing. My hack below seems to (so far, <20 boots) help things.
So, dumb question: How comes this bisects to "sched/fair: Remove sched_feat(START_DEBIT)" ?
That commit changes the timings of things; dumb luck otherwise.
Kind of scary. So I only experienced the problem because the START_DEBIT patch happened to be queued roughly at the same time, and it might otherwise have found its way unnoticed into the upstream kernel. That makes me wonder if this or other similar patches may uncover similar problems elsewhere in the kernel (i.e., either hide new or existing race conditions or expose existing ones).
This in turn makes me wonder if it would be possible to define a test which would uncover such problems without the START_DEBIT patch. Any idea ?
Thank you all for tracking this down!
One way is to put a schedule_timeout_idle(100) right before the call to rcu_tasks_one_gp() from synchronize_rcu_tasks_generic(). That is quite specific to this particular issue, but it does have the virtue of making it actually happen in my testing.
There have been a few academic projects that inject delays at points chosen by various heuristics plus some randomness. But this would be a bit of a challenge to those because each kernel only passes through this window once at boot time.
Please see below for my preferred fix. Does this work for you guys?
Back to figuring out why recent kernels occasionally to blow up all rcutorture guest OSes...
Thanx, Paul
------------------------------------------------------------------------
diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h index 7294be62727b..2d5b8385c357 100644 --- a/kernel/rcu/tasks.h +++ b/kernel/rcu/tasks.h @@ -570,10 +570,12 @@ static void rcu_tasks_one_gp(struct rcu_tasks *rtp, bool midboot) if (unlikely(midboot)) { needgpcb = 0x2; } else { + mutex_unlock(&rtp->tasks_gp_mutex); set_tasks_gp_state(rtp, RTGS_WAIT_CBS); rcuwait_wait_event(&rtp->cbs_wait, (needgpcb = rcu_tasks_need_gpcb(rtp)), TASK_IDLE); + mutex_lock(&rtp->tasks_gp_mutex); }
if (needgpcb & 0x2) {
On Tue, Aug 01, 2023 at 12:11:04PM -0700, Paul E. McKenney wrote:
On Tue, Aug 01, 2023 at 10:32:45AM -0700, Guenter Roeck wrote:
On 7/31/23 14:15, Peter Zijlstra wrote:
On Mon, Jul 31, 2023 at 09:34:29AM -0700, Guenter Roeck wrote:
Ha!, I was poking around the same thing. My hack below seems to (so far, <20 boots) help things.
So, dumb question: How comes this bisects to "sched/fair: Remove sched_feat(START_DEBIT)" ?
That commit changes the timings of things; dumb luck otherwise.
Kind of scary. So I only experienced the problem because the START_DEBIT patch happened to be queued roughly at the same time, and it might otherwise have found its way unnoticed into the upstream kernel.
And just to set the record straight, this bug has been in mainline for about a year, since v5.19.
Thanx, Paul
That makes me wonder if this
or other similar patches may uncover similar problems elsewhere in the kernel (i.e., either hide new or existing race conditions or expose existing ones).
This in turn makes me wonder if it would be possible to define a test which would uncover such problems without the START_DEBIT patch. Any idea ?
Thank you all for tracking this down!
One way is to put a schedule_timeout_idle(100) right before the call to rcu_tasks_one_gp() from synchronize_rcu_tasks_generic(). That is quite specific to this particular issue, but it does have the virtue of making it actually happen in my testing.
There have been a few academic projects that inject delays at points chosen by various heuristics plus some randomness. But this would be a bit of a challenge to those because each kernel only passes through this window once at boot time.
Please see below for my preferred fix. Does this work for you guys?
Back to figuring out why recent kernels occasionally to blow up all rcutorture guest OSes...
Thanx, Paul
diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h index 7294be62727b..2d5b8385c357 100644 --- a/kernel/rcu/tasks.h +++ b/kernel/rcu/tasks.h @@ -570,10 +570,12 @@ static void rcu_tasks_one_gp(struct rcu_tasks *rtp, bool midboot) if (unlikely(midboot)) { needgpcb = 0x2; } else {
set_tasks_gp_state(rtp, RTGS_WAIT_CBS); rcuwait_wait_event(&rtp->cbs_wait, (needgpcb = rcu_tasks_need_gpcb(rtp)), TASK_IDLE);mutex_unlock(&rtp->tasks_gp_mutex);
}mutex_lock(&rtp->tasks_gp_mutex);
if (needgpcb & 0x2) {
On Tue, 2023-08-01 at 12:11 -0700, Paul E. McKenney wrote:
On Tue, Aug 01, 2023 at 10:32:45AM -0700, Guenter Roeck wrote:
Please see below for my preferred fix. Does this work for you guys?
Back to figuring out why recent kernels occasionally to blow up all rcutorture guest OSes...
Thanx, Paul
diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h index 7294be62727b..2d5b8385c357 100644 --- a/kernel/rcu/tasks.h +++ b/kernel/rcu/tasks.h @@ -570,10 +570,12 @@ static void rcu_tasks_one_gp(struct rcu_tasks *rtp, bool midboot) if (unlikely(midboot)) { needgpcb = 0x2; } else { + mutex_unlock(&rtp->tasks_gp_mutex); set_tasks_gp_state(rtp, RTGS_WAIT_CBS); rcuwait_wait_event(&rtp->cbs_wait, (needgpcb = rcu_tasks_need_gpcb(rtp)), TASK_IDLE); + mutex_lock(&rtp->tasks_gp_mutex); } if (needgpcb & 0x2) {
Your preferred fix looks good to me.
With the original code I can quite easily reproduce the problem on my system every 10 reboots or so. With your fix in place the problem no longer occurs.
On Wed, Aug 02, 2023 at 02:57:56PM +0100, Roy Hopkins wrote:
On Tue, 2023-08-01 at 12:11 -0700, Paul E. McKenney wrote:
On Tue, Aug 01, 2023 at 10:32:45AM -0700, Guenter Roeck wrote:
Please see below for my preferred fix. Does this work for you guys?
Back to figuring out why recent kernels occasionally to blow up all rcutorture guest OSes...
Thanx, Paul
diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h index 7294be62727b..2d5b8385c357 100644 --- a/kernel/rcu/tasks.h +++ b/kernel/rcu/tasks.h @@ -570,10 +570,12 @@ static void rcu_tasks_one_gp(struct rcu_tasks *rtp, bool midboot) if (unlikely(midboot)) { needgpcb = 0x2; } else { + mutex_unlock(&rtp->tasks_gp_mutex); set_tasks_gp_state(rtp, RTGS_WAIT_CBS); rcuwait_wait_event(&rtp->cbs_wait, (needgpcb = rcu_tasks_need_gpcb(rtp)), TASK_IDLE); + mutex_lock(&rtp->tasks_gp_mutex); } if (needgpcb & 0x2) {
Your preferred fix looks good to me.
With the original code I can quite easily reproduce the problem on my system every 10 reboots or so. With your fix in place the problem no longer occurs.
Very good, thank you! May I add your Tested-by?
Thanx, Paul
On Wed, 2023-08-02 at 08:05 -0700, Paul E. McKenney wrote:
On Wed, Aug 02, 2023 at 02:57:56PM +0100, Roy Hopkins wrote:
On Tue, 2023-08-01 at 12:11 -0700, Paul E. McKenney wrote:
On Tue, Aug 01, 2023 at 10:32:45AM -0700, Guenter Roeck wrote:
Please see below for my preferred fix. Does this work for you guys?
Back to figuring out why recent kernels occasionally to blow up all rcutorture guest OSes...
Thanx, Paul
diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h index 7294be62727b..2d5b8385c357 100644 --- a/kernel/rcu/tasks.h +++ b/kernel/rcu/tasks.h @@ -570,10 +570,12 @@ static void rcu_tasks_one_gp(struct rcu_tasks *rtp, bool midboot) if (unlikely(midboot)) { needgpcb = 0x2; } else { + mutex_unlock(&rtp->tasks_gp_mutex); set_tasks_gp_state(rtp, RTGS_WAIT_CBS); rcuwait_wait_event(&rtp->cbs_wait, (needgpcb = rcu_tasks_need_gpcb(rtp)), TASK_IDLE); + mutex_lock(&rtp->tasks_gp_mutex); } if (needgpcb & 0x2) {
Your preferred fix looks good to me.
With the original code I can quite easily reproduce the problem on my system every 10 reboots or so. With your fix in place the problem no longer occurs.
Very good, thank you! May I add your Tested-by?
Thanx, Paul
Yes, please do.
On Wed, Aug 02, 2023 at 04:31:12PM +0100, Roy Hopkins wrote:
On Wed, 2023-08-02 at 08:05 -0700, Paul E. McKenney wrote:
On Wed, Aug 02, 2023 at 02:57:56PM +0100, Roy Hopkins wrote:
On Tue, 2023-08-01 at 12:11 -0700, Paul E. McKenney wrote:
On Tue, Aug 01, 2023 at 10:32:45AM -0700, Guenter Roeck wrote:
Please see below for my preferred fix. Does this work for you guys?
Back to figuring out why recent kernels occasionally to blow up all rcutorture guest OSes...
Thanx, Paul
diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h index 7294be62727b..2d5b8385c357 100644 --- a/kernel/rcu/tasks.h +++ b/kernel/rcu/tasks.h @@ -570,10 +570,12 @@ static void rcu_tasks_one_gp(struct rcu_tasks *rtp, bool midboot) if (unlikely(midboot)) { needgpcb = 0x2; } else { + mutex_unlock(&rtp->tasks_gp_mutex); set_tasks_gp_state(rtp, RTGS_WAIT_CBS); rcuwait_wait_event(&rtp->cbs_wait, (needgpcb = rcu_tasks_need_gpcb(rtp)), TASK_IDLE); + mutex_lock(&rtp->tasks_gp_mutex); } if (needgpcb & 0x2) {
Your preferred fix looks good to me.
With the original code I can quite easily reproduce the problem on my system every 10 reboots or so. With your fix in place the problem no longer occurs.
Very good, thank you! May I add your Tested-by?
Thanx, Paul
Yes, please do.
Thank you again, and I will apply this on my next rebase.
Thanx, Paul
On 8/2/23 08:05, Paul E. McKenney wrote:
On Wed, Aug 02, 2023 at 02:57:56PM +0100, Roy Hopkins wrote:
On Tue, 2023-08-01 at 12:11 -0700, Paul E. McKenney wrote:
On Tue, Aug 01, 2023 at 10:32:45AM -0700, Guenter Roeck wrote:
Please see below for my preferred fix. Does this work for you guys?
Back to figuring out why recent kernels occasionally to blow up all rcutorture guest OSes...
Thanx, Paul
diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h index 7294be62727b..2d5b8385c357 100644 --- a/kernel/rcu/tasks.h +++ b/kernel/rcu/tasks.h @@ -570,10 +570,12 @@ static void rcu_tasks_one_gp(struct rcu_tasks *rtp, bool midboot) if (unlikely(midboot)) { needgpcb = 0x2; } else { + mutex_unlock(&rtp->tasks_gp_mutex); set_tasks_gp_state(rtp, RTGS_WAIT_CBS); rcuwait_wait_event(&rtp->cbs_wait, (needgpcb = rcu_tasks_need_gpcb(rtp)), TASK_IDLE); + mutex_lock(&rtp->tasks_gp_mutex); } if (needgpcb & 0x2) {
Your preferred fix looks good to me.
With the original code I can quite easily reproduce the problem on my system every 10 reboots or so. With your fix in place the problem no longer occurs.
Very good, thank you! May I add your Tested-by?
FWIW, I am still working on it. So far I get
[ 8.191589] KTAP version 1 [ 8.191769] # Subtest: kunit_executor_test [ 8.191972] # module: kunit [ 8.192012] 1..8 [ 8.197643] ok 1 parse_filter_test [ 8.201851] ok 2 filter_suites_test [ 8.206713] ok 3 filter_suites_test_glob_test [ 8.211806] ok 4 filter_suites_to_empty_test [ 8.214077] kunit executor: filter operation not found: speed>slow, module!=example [ 8.217933] # parse_filter_attr_test: ASSERTION FAILED at lib/kunit/executor_test.c:126 [ 8.217933] Expected err == 0, but [ 8.217933] err == -22 (0xffffffffffffffea) [ 8.217933] [ 8.217933] failed to parse filter '(efault)' [ 8.221266] not ok 5 parse_filter_attr_test [ 8.224224] kunit executor: filter operation not found: speed>slow [ 8.225837] # filter_attr_test: ASSERTION FAILED at lib/kunit/executor_test.c:165 [ 8.225837] Expected err == 0, but [ 8.225837] err == -22 (0xffffffffffffffea) [ 8.228850] not ok 6 filter_attr_test [ 8.230942] kunit executor: filter operation not found: module!=dummy [ 8.232167] # filter_attr_empty_test: ASSERTION FAILED at lib/kunit/executor_test.c:190 [ 8.232167] Expected err == 0, but [ 8.232167] err == -22 (0xffffffffffffffea) [ 8.235317] not ok 7 filter_attr_empty_test [ 8.237065] kunit executor: filter operation not found: speed>slow [ 8.238796] # filter_attr_skip_test: ASSERTION FAILED at lib/kunit/executor_test.c:209 [ 8.238796] Expected err == 0, but [ 8.238796] err == -22 (0xffffffffffffffea) [ 8.241897] not ok 8 filter_attr_skip_test [ 8.241947] # kunit_executor_test: pass:4 fail:4 skip:0 total:8 [ 8.242144] # Totals: pass:4 fail:4 skip:0 total:8
and it looks like the console no longer works. Most likely this is some other problem that was introduced while tests were broken. It will take me some time to track that down.
Guenter
On Wed, Aug 02, 2023 at 08:45:06AM -0700, Guenter Roeck wrote:
On 8/2/23 08:05, Paul E. McKenney wrote:
On Wed, Aug 02, 2023 at 02:57:56PM +0100, Roy Hopkins wrote:
On Tue, 2023-08-01 at 12:11 -0700, Paul E. McKenney wrote:
On Tue, Aug 01, 2023 at 10:32:45AM -0700, Guenter Roeck wrote:
Please see below for my preferred fix. Does this work for you guys?
Back to figuring out why recent kernels occasionally to blow up all rcutorture guest OSes...
Thanx, Paul
diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h index 7294be62727b..2d5b8385c357 100644 --- a/kernel/rcu/tasks.h +++ b/kernel/rcu/tasks.h @@ -570,10 +570,12 @@ static void rcu_tasks_one_gp(struct rcu_tasks *rtp, bool midboot) if (unlikely(midboot)) { needgpcb = 0x2; } else { + mutex_unlock(&rtp->tasks_gp_mutex); set_tasks_gp_state(rtp, RTGS_WAIT_CBS); rcuwait_wait_event(&rtp->cbs_wait, (needgpcb = rcu_tasks_need_gpcb(rtp)), TASK_IDLE); + mutex_lock(&rtp->tasks_gp_mutex); } if (needgpcb & 0x2) {
Your preferred fix looks good to me.
With the original code I can quite easily reproduce the problem on my system every 10 reboots or so. With your fix in place the problem no longer occurs.
Very good, thank you! May I add your Tested-by?
FWIW, I am still working on it. So far I get
[ 8.191589] KTAP version 1 [ 8.191769] # Subtest: kunit_executor_test [ 8.191972] # module: kunit [ 8.192012] 1..8 [ 8.197643] ok 1 parse_filter_test [ 8.201851] ok 2 filter_suites_test [ 8.206713] ok 3 filter_suites_test_glob_test [ 8.211806] ok 4 filter_suites_to_empty_test [ 8.214077] kunit executor: filter operation not found: speed>slow, module!=example [ 8.217933] # parse_filter_attr_test: ASSERTION FAILED at lib/kunit/executor_test.c:126 [ 8.217933] Expected err == 0, but [ 8.217933] err == -22 (0xffffffffffffffea) [ 8.217933] [ 8.217933] failed to parse filter '(efault)' [ 8.221266] not ok 5 parse_filter_attr_test [ 8.224224] kunit executor: filter operation not found: speed>slow [ 8.225837] # filter_attr_test: ASSERTION FAILED at lib/kunit/executor_test.c:165 [ 8.225837] Expected err == 0, but [ 8.225837] err == -22 (0xffffffffffffffea) [ 8.228850] not ok 6 filter_attr_test [ 8.230942] kunit executor: filter operation not found: module!=dummy [ 8.232167] # filter_attr_empty_test: ASSERTION FAILED at lib/kunit/executor_test.c:190 [ 8.232167] Expected err == 0, but [ 8.232167] err == -22 (0xffffffffffffffea) [ 8.235317] not ok 7 filter_attr_empty_test [ 8.237065] kunit executor: filter operation not found: speed>slow [ 8.238796] # filter_attr_skip_test: ASSERTION FAILED at lib/kunit/executor_test.c:209 [ 8.238796] Expected err == 0, but [ 8.238796] err == -22 (0xffffffffffffffea) [ 8.241897] not ok 8 filter_attr_skip_test [ 8.241947] # kunit_executor_test: pass:4 fail:4 skip:0 total:8 [ 8.242144] # Totals: pass:4 fail:4 skip:0 total:8
and it looks like the console no longer works. Most likely this is some other problem that was introduced while tests were broken. It will take me some time to track that down.
No rush.
Given that this bug is a year old, that it happens only when debug options are enabled, and that it has only been seen in current -next, my plan is to submit it into the next merge window.
So this one stays mutable for about another 10 days.
On the strength of Roy's Tested-by, however, I will push this patch into -next soon, so that should make things a bit easier. Or so I hope.
And again, thank you all for tracking this down!
Thanx, Paul
Two quick comments, both of them "this code is a bit odd" rather than anything else.
On Tue, 1 Aug 2023 at 12:11, Paul E. McKenney paulmck@kernel.org wrote:
diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h
Why is this file called "tasks.h"?
It's not a header file. It makes no sense. It's full of C code. It's included in only one place. It's just _weird_.
However, more relevantly:
mutex_unlock(&rtp->tasks_gp_mutex); set_tasks_gp_state(rtp, RTGS_WAIT_CBS);
Isn't the tasks_gp_mutex the thing that protects the gp state here? Shouldn't it be after setting?
rcuwait_wait_event(&rtp->cbs_wait, (needgpcb = rcu_tasks_need_gpcb(rtp)), TASK_IDLE);
Also, looking at rcu_tasks_need_gpcb() that is now called outside the lock, it does something quite odd.
At the very top of the function does
for (cpu = 0; cpu < smp_load_acquire(&rtp->percpu_dequeue_lim); cpu++) {
and 'smp_load_acquire()' is all about saying "everything *after* this load is ordered,
But the way it is done in that loop, it is indeed done at the beginning of the loop, but then it's done *after* the loop too, so the last smp_load_acquire seems a bit nonsensical.
If you want to load a value and say "this value is now sensible for everything that follows", I think you should load it *first*. No?
IOW, wouldn't the whole sequence make more sense as
dequeue_limit = smp_load_acquire(&rtp->percpu_dequeue_lim); for (cpu = 0; cpu < dequeue_limit; cpu++) {
and say that everything in rcu_tasks_need_gpcb() is ordered wrt the initial limit on entry?
I dunno. That use of "smp_load_acquire()" just seems odd. Memory ordering is hard to understand to begin with, but then when you have things like loops that do the same ordered load multiple times, it goes from "hard to understand" to positively confusing.
Linus
On Wed, Aug 02, 2023 at 10:14:51AM -0700, Linus Torvalds wrote:
Two quick comments, both of them "this code is a bit odd" rather than anything else.
Good to get eyes on this code, so thank you very much!!!
On Tue, 1 Aug 2023 at 12:11, Paul E. McKenney paulmck@kernel.org wrote:
diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h
Why is this file called "tasks.h"?
It's not a header file. It makes no sense. It's full of C code. It's included in only one place. It's just _weird_.
You are right, it is weird.
This is a holdover from when I was much more concerned about being criticized for having #ifdef in a .c file, and pretty much every line in this file is under some combination or another of #ifdefs. This concern led to kernel/rcu/tree_plugin.h being set up in this way back when preemptible RCU was introduced, and for good or for bad I just kept following that pattern.
We could convert this to a .c file, keep the #ifdefs, drop some instances of "static", add a bunch of declarations, and maybe (or maybe not) push a function or two into some .h file for performance/inlining reasons. Me, I would prefer to leave it alone, but we can certainly change it.
However, more relevantly:
mutex_unlock(&rtp->tasks_gp_mutex); set_tasks_gp_state(rtp, RTGS_WAIT_CBS);
Isn't the tasks_gp_mutex the thing that protects the gp state here? Shouldn't it be after setting?
Much of the gp state is protected by being accessed only by the gp kthread. But there is a window in time where the gp might be driven directly out of the synchronize_rcu_tasks() call. That window in time does not have a definite end, so this ->tasks_gp_mutex does the needed mutual exclusion during the transition of gp processing to the newly created gp kthread.
rcuwait_wait_event(&rtp->cbs_wait, (needgpcb = rcu_tasks_need_gpcb(rtp)), TASK_IDLE);
Also, looking at rcu_tasks_need_gpcb() that is now called outside the lock, it does something quite odd.
The state of each callback list is protected by the ->lock field of the rcu_tasks_percpu structure. Yes, rcu_segcblist_n_cbs() is invoked int rcu_tasks_need_gpcb() outside of the lock, but it is designed for lockless use. If it is modified just after the check, then there will be a later wakeup on the one hand or we will just uselessly acquire that ->lock this one time on the other.
Also, ncbs records the number of callbacks seen in that first loop, then used later, where its value might be stale. This might result in a collapse back to single-callback-queue operation and a later expansion back up. Except that at this point we are still in single-CPU mode, so there should not be any lock contention, which means that there should still be but a single callback queue. The transition itself is protected by ->cbs_gbl_lock.
At the very top of the function does
for (cpu = 0; cpu < smp_load_acquire(&rtp->percpu_dequeue_lim); cpu++) {
and 'smp_load_acquire()' is all about saying "everything *after* this load is ordered,
But the way it is done in that loop, it is indeed done at the beginning of the loop, but then it's done *after* the loop too, so the last smp_load_acquire seems a bit nonsensical.
If you want to load a value and say "this value is now sensible for everything that follows", I think you should load it *first*. No?
IOW, wouldn't the whole sequence make more sense as
dequeue_limit = smp_load_acquire(&rtp->percpu_dequeue_lim); for (cpu = 0; cpu < dequeue_limit; cpu++) {
and say that everything in rcu_tasks_need_gpcb() is ordered wrt the initial limit on entry?
I dunno. That use of "smp_load_acquire()" just seems odd. Memory ordering is hard to understand to begin with, but then when you have things like loops that do the same ordered load multiple times, it goes from "hard to understand" to positively confusing.
Excellent point. I am queueing that change with your Suggested-by. If testing goes well, it will be as shown below.
Thanx, Paul
------------------------------------------------------------------------
diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h index 83049a893de5..94bb5abdbb37 100644 --- a/kernel/rcu/tasks.h +++ b/kernel/rcu/tasks.h @@ -432,6 +432,7 @@ static void rcu_barrier_tasks_generic(struct rcu_tasks *rtp) static int rcu_tasks_need_gpcb(struct rcu_tasks *rtp) { int cpu; + int dequeue_limit; unsigned long flags; bool gpdone = poll_state_synchronize_rcu(rtp->percpu_dequeue_gpseq); long n; @@ -439,7 +440,8 @@ static int rcu_tasks_need_gpcb(struct rcu_tasks *rtp) long ncbsnz = 0; int needgpcb = 0;
- for (cpu = 0; cpu < smp_load_acquire(&rtp->percpu_dequeue_lim); cpu++) { + dequeue_limit = smp_load_acquire(&rtp->percpu_dequeue_lim); + for (cpu = 0; cpu < dequeue_limit; cpu++) { struct rcu_tasks_percpu *rtpcp = per_cpu_ptr(rtp->rtpcpu, cpu);
/* Advance and accelerate any new callbacks. */
On 7/27/23 13:33, Paul E. McKenney wrote: [ ... ]
So which of the following Kconfig options is defined in your .config? CONFIG_TASKS_RCU, CONFIG_TASKS_RUDE_RCU, and CONFIG_TASKS_TRACE_RCU.
Only CONFIG_TASKS_RCU. I added another log message after call_rcu_tasks(). It never returns from that function.
[ 1.168993] Running RCU synchronous self tests [ 1.169219] Running RCU synchronous self tests [ 1.285795] smpboot: CPU0: Intel Xeon Processor (Cascadelake) (family: 0x6, model: 0x55, stepping: 0x6) [ 1.302827] RCU Tasks: Setting shift to 0 and lim to 1 rcu_task_cb_adjust=1. [ 1.304526] Running RCU Tasks wait API self tests
... and then nothing for at least 10 minutes (then I gave up and stopped the test).
Qemu command line:
qemu-system-x86_64 -kernel \ arch/x86/boot/bzImage -M q35 -cpu Cascadelake-Server -no-reboot \ -snapshot -device e1000e,netdev=net0 -netdev user,id=net0 -m 256 \ -drive file=rootfs.iso,format=raw,if=ide,media=cdrom \ --append "earlycon=uart8250,io,0x3f8,9600n8 panic=-1 slub_debug=FZPUA root=/dev/sr0 rootwait console=ttyS0 noreboot" \ -d unimp,guest_errors -nographic -monitor none
Again, this doesn't happen all the time. With Cascadelake-Server I see it maybe once every 5 boot attempts. I tried with qemu v8.0 and v8.1. Note that it does seem to happen with various CPU types, only for some it seems to me more likely to happen (so maybe the CPU type was a red herring). It does seem to depend on the system load, and happen more often if the system is under heavy load.
Guenter
On Thu, Jul 27, 2023 at 09:22:52PM -0700, Guenter Roeck wrote:
On 7/27/23 13:33, Paul E. McKenney wrote: [ ... ]
So which of the following Kconfig options is defined in your .config? CONFIG_TASKS_RCU, CONFIG_TASKS_RUDE_RCU, and CONFIG_TASKS_TRACE_RCU.
Only CONFIG_TASKS_RCU. I added another log message after call_rcu_tasks(). It never returns from that function.
[ 1.168993] Running RCU synchronous self tests [ 1.169219] Running RCU synchronous self tests [ 1.285795] smpboot: CPU0: Intel Xeon Processor (Cascadelake) (family: 0x6, model: 0x55, stepping: 0x6) [ 1.302827] RCU Tasks: Setting shift to 0 and lim to 1 rcu_task_cb_adjust=1. [ 1.304526] Running RCU Tasks wait API self tests
... and then nothing for at least 10 minutes (then I gave up and stopped the test).
Qemu command line:
qemu-system-x86_64 -kernel \ arch/x86/boot/bzImage -M q35 -cpu Cascadelake-Server -no-reboot \ -snapshot -device e1000e,netdev=net0 -netdev user,id=net0 -m 256 \ -drive file=rootfs.iso,format=raw,if=ide,media=cdrom \ --append "earlycon=uart8250,io,0x3f8,9600n8 panic=-1 slub_debug=FZPUA root=/dev/sr0 rootwait console=ttyS0 noreboot" \ -d unimp,guest_errors -nographic -monitor none
Again, this doesn't happen all the time. With Cascadelake-Server I see it maybe once every 5 boot attempts. I tried with qemu v8.0 and v8.1. Note that it does seem to happen with various CPU types, only for some it seems to me more likely to happen (so maybe the CPU type was a red herring). It does seem to depend on the system load, and happen more often if the system is under heavy load.
Hmmm... What kernel are you using as your qemu/KVM hypervisor?
And I echo Joel's requests for your .config file.
Thanx, Paul
On Sun, Jul 30, 2023 at 08:54:46PM -0700, Paul E. McKenney wrote:
On Thu, Jul 27, 2023 at 09:22:52PM -0700, Guenter Roeck wrote:
On 7/27/23 13:33, Paul E. McKenney wrote: [ ... ]
So which of the following Kconfig options is defined in your .config? CONFIG_TASKS_RCU, CONFIG_TASKS_RUDE_RCU, and CONFIG_TASKS_TRACE_RCU.
Only CONFIG_TASKS_RCU. I added another log message after call_rcu_tasks(). It never returns from that function.
[ 1.168993] Running RCU synchronous self tests [ 1.169219] Running RCU synchronous self tests [ 1.285795] smpboot: CPU0: Intel Xeon Processor (Cascadelake) (family: 0x6, model: 0x55, stepping: 0x6) [ 1.302827] RCU Tasks: Setting shift to 0 and lim to 1 rcu_task_cb_adjust=1. [ 1.304526] Running RCU Tasks wait API self tests
... and then nothing for at least 10 minutes (then I gave up and stopped the test).
Qemu command line:
qemu-system-x86_64 -kernel \ arch/x86/boot/bzImage -M q35 -cpu Cascadelake-Server -no-reboot \ -snapshot -device e1000e,netdev=net0 -netdev user,id=net0 -m 256 \ -drive file=rootfs.iso,format=raw,if=ide,media=cdrom \ --append "earlycon=uart8250,io,0x3f8,9600n8 panic=-1 slub_debug=FZPUA root=/dev/sr0 rootwait console=ttyS0 noreboot" \ -d unimp,guest_errors -nographic -monitor none
Again, this doesn't happen all the time. With Cascadelake-Server I see it maybe once every 5 boot attempts. I tried with qemu v8.0 and v8.1. Note that it does seem to happen with various CPU types, only for some it seems to me more likely to happen (so maybe the CPU type was a red herring). It does seem to depend on the system load, and happen more often if the system is under heavy load.
Hmmm... What kernel are you using as your qemu/KVM hypervisor?
Never mind, I now see your bisection result. Good show, thank you!!!
Thanx, Paul
And I echo Joel's requests for your .config file.
Thanx, Paul
On 7/30/23 20:54, Paul E. McKenney wrote:
On Thu, Jul 27, 2023 at 09:22:52PM -0700, Guenter Roeck wrote:
On 7/27/23 13:33, Paul E. McKenney wrote: [ ... ]
So which of the following Kconfig options is defined in your .config? CONFIG_TASKS_RCU, CONFIG_TASKS_RUDE_RCU, and CONFIG_TASKS_TRACE_RCU.
Only CONFIG_TASKS_RCU. I added another log message after call_rcu_tasks(). It never returns from that function.
[ 1.168993] Running RCU synchronous self tests [ 1.169219] Running RCU synchronous self tests [ 1.285795] smpboot: CPU0: Intel Xeon Processor (Cascadelake) (family: 0x6, model: 0x55, stepping: 0x6) [ 1.302827] RCU Tasks: Setting shift to 0 and lim to 1 rcu_task_cb_adjust=1. [ 1.304526] Running RCU Tasks wait API self tests
... and then nothing for at least 10 minutes (then I gave up and stopped the test).
Qemu command line:
qemu-system-x86_64 -kernel \ arch/x86/boot/bzImage -M q35 -cpu Cascadelake-Server -no-reboot \ -snapshot -device e1000e,netdev=net0 -netdev user,id=net0 -m 256 \ -drive file=rootfs.iso,format=raw,if=ide,media=cdrom \ --append "earlycon=uart8250,io,0x3f8,9600n8 panic=-1 slub_debug=FZPUA root=/dev/sr0 rootwait console=ttyS0 noreboot" \ -d unimp,guest_errors -nographic -monitor none
Again, this doesn't happen all the time. With Cascadelake-Server I see it maybe once every 5 boot attempts. I tried with qemu v8.0 and v8.1. Note that it does seem to happen with various CPU types, only for some it seems to me more likely to happen (so maybe the CPU type was a red herring). It does seem to depend on the system load, and happen more often if the system is under heavy load.
Hmmm... What kernel are you using as your qemu/KVM hypervisor?
Not sure I understand the question. KVM is disabled in my systems. The host CPUs are Ryzen 3900X and 5900X, but I don't really see why that would matter.
And I echo Joel's requests for your .config file.
Did you see the e-mail I sent about this problem earlier today ?
https://lore.kernel.org/lkml/3da81a5c-700b-8e21-1bde-27dd3a0b8945@roeck-us.n...
I think I'll declare this to be a problem with my test environment and disable RCU debugging.
Thanks, Guenter
linux-stable-mirror@lists.linaro.org