This is the start of the stable review cycle for the 6.5.13 release. There are 491 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sun, 26 Nov 2023 17:19:17 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.5.13-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.5.y and the diffstat can be found below.
thanks,
greg k-h
------------- Pseudo-Shortlog of commits:
Greg Kroah-Hartman gregkh@linuxfoundation.org Linux 6.5.13-rc1
Steven Rostedt (Google) rostedt@goodmis.org tracing: Have trace_event_file have ref counters
Michael Ellerman mpe@ellerman.id.au powerpc/powernv: Fix fortify source warnings in opal-prd.c
Lewis Huang lewis.huang@amd.com drm/amd/display: Change the DMCUB mailbox memory location from FB to inbox
Tianci Yin tianci.yin@amd.com drm/amd/display: Enable fast plane updates on DCN3.2 and above
Mario Limonciello mario.limonciello@amd.com drm/amd/display: fix a NULL pointer dereference in amdgpu_dm_i2c_xfer()
Fangzhi Zuo jerry.zuo@amd.com drm/amd/display: Fix DSC not Enabled on Direct MST Sink
Nicholas Kazlauskas nicholas.kazlauskas@amd.com drm/amd/display: Guard against invalid RPTR/WPTR being set
Felix Kuehling Felix.Kuehling@amd.com drm/amdgpu: Fix possible null pointer dereference
Christian König christian.koenig@amd.com drm/amdgpu: lower CS errors to debug severity
Christian König christian.koenig@amd.com drm/amdgpu: fix error handling in amdgpu_bo_list_get()
Christian König christian.koenig@amd.com drm/amdgpu: fix error handling in amdgpu_vm_init
Alex Deucher alexander.deucher@amd.com drm/amdgpu: don't use ATRM for external devices
Alex Deucher alexander.deucher@amd.com drm/amdgpu: add a retry for IP discovery init
Tim Huang Tim.Huang@amd.com drm/amdgpu: fix GRBM read timeout when do mes_self_test
Alex Deucher alexander.deucher@amd.com drm/amdgpu: don't use pci_is_thunderbolt_attached()
Alex Deucher alexander.deucher@amd.com drm/amdgpu/smu13: drop compute workload workaround
Ma Jun Jun.Ma2@amd.com drm/amd/pm: Fix error of MACO flag setting code
Nirmoy Das nirmoy.das@intel.com drm/i915: Flush WC GGTT only on required platforms
Kunwu Chan chentao@kylinos.cn drm/i915: Fix potential spectre vulnerability
Ville Syrjälä ville.syrjala@linux.intel.com drm/i915: Bump GLK CDCLK frequency when driving multiple pipes
Chaitanya Kumar Borah chaitanya.kumar.borah@intel.com drm/i915/mtl: Support HBR3 rate with C10 phy and eDP in MTL
Jani Nikula jani.nikula@intel.com drm: bridge: it66121: ->get_edid callback must not return err pointers
Bas Nieuwenhuizen bas@basnieuwenhuizen.nl drm/amd/pm: Handle non-terminated overdrive commands.
Brian Foster bfoster@redhat.com ext4: fix racy may inline data check in dio write
Jan Kara jack@suse.cz ext4: properly sync file size update after O_SYNC direct IO
Kemeng Shi shikemeng@huaweicloud.com ext4: add missed brelse in update_backups
Kemeng Shi shikemeng@huaweicloud.com ext4: remove gdb backup copy for meta bg in setup_new_flex_group_blocks
Zhang Yi yi.zhang@huawei.com ext4: correct the start block of counting reserved clusters
Kemeng Shi shikemeng@huaweicloud.com ext4: correct return value of ext4_convert_meta_bg
Ojaswin Mujoo ojaswin@linux.ibm.com ext4: mark buffer new if it is unwritten to avoid stale data exposure
Kemeng Shi shikemeng@huaweicloud.com ext4: correct offset of gdb backup in non meta_bg group to update_backups
Max Kellermann max.kellermann@ionos.com ext4: apply umask if ACL support is disabled
Zhang Yi yi.zhang@huawei.com ext4: make sure allocate pending entry not fail
Baokun Li libaokun1@huawei.com ext4: fix race between writepages and remount
Heiner Kallweit hkallweit1@gmail.com Revert "net: r8169: Disable multicast filter for RTL8168H and RTL8107E"
Andrey Konovalov andrey.konovalov@linaro.org media: qcom: camss: Fix csid-gen2 for test pattern generator
Bryan O'Donoghue bryan.odonoghue@linaro.org media: qcom: camss: Fix invalid clock enable bit disjunction
Bryan O'Donoghue bryan.odonoghue@linaro.org media: qcom: camss: Fix set CSI2_RX_CFG1_VC_MODE when VC is greater than 3
Bryan O'Donoghue bryan.odonoghue@linaro.org media: qcom: camss: Fix missing vfe_lite clocks check
Bryan O'Donoghue bryan.odonoghue@linaro.org media: qcom: camss: Fix VFE-480 vfe_disable_output()
Bryan O'Donoghue bryan.odonoghue@linaro.org media: qcom: camss: Fix VFE-17x vfe_disable_output()
Bryan O'Donoghue bryan.odonoghue@linaro.org media: qcom: camss: Fix vfe_get() error jump
Bryan O'Donoghue bryan.odonoghue@linaro.org media: qcom: camss: Fix pm_domain_on sequence in probe
Victor Shih victor.shih@genesyslogic.com.tw mmc: sdhci-pci-gli: GL9750: Mask the replay timer timeout of AER
ChunHao Lin hau@realtek.com r8169: add handling DASH when DASH is disabled
ChunHao Lin hau@realtek.com r8169: fix network lost after resume on DASH systems
Paolo Abeni pabeni@redhat.com selftests: mptcp: fix fastclose with csum failure
Paolo Abeni pabeni@redhat.com mptcp: fix setsockopt(IP_TOS) subflow locking
Geliang Tang geliang.tang@suse.com mptcp: add validity check for sending RM_ADDR
Paolo Abeni pabeni@redhat.com mptcp: deal with large GSO size
Roman Gushchin roman.gushchin@linux.dev mm: kmem: drop __GFP_NOFAIL when allocating objcg vectors
Stefan Roesch shr@devkernel.io mm: fix for negative counter: nr_file_hugepages
Victor Shih victor.shih@genesyslogic.com.tw mmc: sdhci-pci-gli: A workaround to allow GL9750 to enter ASPM L1.2
Nam Cao namcaov@gmail.com riscv: kprobes: allow writing to x0
Song Shuai suagrfillet@gmail.com riscv: correct pt_level name via pgtable_l5/4_enabled
Song Shuai suagrfillet@gmail.com riscv: mm: Update the comment of CONFIG_PAGE_OFFSET
Nam Cao namcaov@gmail.com riscv: put interrupt entries into .irqentry.text
Minda Chen minda.chen@starfivetech.com riscv: Using TOOLCHAIN_HAS_ZIHINTPAUSE marco replace zihintpause
Nathan Chancellor nathan@kernel.org LoongArch: Mark __percpu functions as always inline
Chuck Lever chuck.lever@oracle.com NFSD: Update nfsd_cache_append() to use xdr_stream
Mahmoud Adam mngyadam@amazon.com nfsd: fix file memleak on client_opens_release
Mikulas Patocka mpatocka@redhat.com dm-verity: don't use blocking calls from tasklets
Mikulas Patocka mpatocka@redhat.com dm-bufio: fix no-sleep mode
Jani Nikula jani.nikula@intel.com drm/mediatek/dp: fix memory leak on ->get_edid callback error path
Jani Nikula jani.nikula@intel.com drm/mediatek/dp: fix memory leak on ->get_edid callback audio detection
Sakari Ailus sakari.ailus@linux.intel.com media: ccs: Correctly initialise try compose rectangle
Vikash Garodia quic_vgarodia@quicinc.com media: venus: hfi: add checks to handle capabilities from firmware
Vikash Garodia quic_vgarodia@quicinc.com media: venus: hfi: fix the check to handle session buffer requirement
Vikash Garodia quic_vgarodia@quicinc.com media: venus: hfi_parser: Add check to keep the number of codecs within range
Sean Young sean@mess.org media: sharp: fix sharp encoding
Sean Young sean@mess.org media: lirc: drop trailing space from scancode transmit
Jaegeuk Kim jaegeuk@kernel.org f2fs: split initial and dynamic conditions for extent_cache
Su Hui suhui@nfschina.com f2fs: avoid format-overflow warning
Jaegeuk Kim jaegeuk@kernel.org f2fs: set the default compress_level on ioctl
Jaegeuk Kim jaegeuk@kernel.org f2fs: do not return EFSCORRUPTED, but try to run online repair
Heiner Kallweit hkallweit1@gmail.com i2c: i801: fix potential race in i801_block_transaction_byte_by_byte
Andreas Gruenbacher agruenba@redhat.com gfs2: don't withdraw if init_threads() got interrupted
Klaus Kudielka klaus.kudielka@gmail.com net: phylink: initialize carrier state at creation
Alexander Sverdlin alexander.sverdlin@siemens.com net: dsa: lan9303: consequently nested-lock physical MDIO
Andrew Lunn andrew@lunn.ch net: ethtool: Fix documentation of ethtool_sprintf()
Harald Freudenberger freude@linux.ibm.com s390/ap: fix AP bus crash on early config change callback invocation
Tam Nguyen tamnguyenchi@os.amperecomputing.com i2c: designware: Disable TX_EMPTY irq while waiting for block length byte
Darren Hart darren@os.amperecomputing.com sbsa_gwdt: Calculate timeout with 64-bit math
Ondrej Mosnacek omosnace@redhat.com lsm: fix default return value for inode_getsecctx
Ondrej Mosnacek omosnace@redhat.com lsm: fix default return value for vm_enough_memory
Robert Marko robert.marko@sartura.hr Revert "i2c: pxa: move to generic GPIO recovery"
Johnathan Mantey johnathanx.mantey@intel.com Revert ncsi: Propagate carrier gain/loss events to the NCSI controller
Stefan Binding sbinding@opensource.cirrus.com ALSA: hda/realtek: Add quirks for HP Laptops
Matus Malych matus@malych.org ALSA: hda/realtek: Enable Mute LED on HP 255 G10
Chandradeep Dey codesigning@chandradeepdey.com ALSA: hda/realtek - Enable internal speaker of ASUS K6500ZC
Kailang Yang kailang@realtek.com ALSA: hda/realtek - Add Dell ALC295 to pin fall back table
Eymen Yigit eymenyg01@gmail.com ALSA: hda/realtek: Enable Mute LED on HP 255 G8
Takashi Iwai tiwai@suse.de ALSA: info: Fix potential deadlock at disconnection
Naohiro Aota naohiro.aota@wdc.com btrfs: zoned: wait for data BG to be finished on direct IO allocation
Dave Chinner dchinner@redhat.com xfs: recovery should not clear di_flushiter unconditionally
David Howells dhowells@redhat.com cifs: Fix encryption of cleared, but unset rq_iter data buffers
Shyam Prasad N sprasad@microsoft.com cifs: do not reset chan_max if multichannel is not supported at mount
Shyam Prasad N sprasad@microsoft.com cifs: force interface update before a fresh session setup
Shyam Prasad N sprasad@microsoft.com cifs: reconnect helper should set reconnect for the right channel
Paulo Alcantara pc@manguebit.com smb: client: fix potential deadlock when releasing mids
Paulo Alcantara pc@manguebit.com smb: client: fix use-after-free in smb2_query_info_compound()
Paulo Alcantara pc@manguebit.com smb: client: fix use-after-free bug in cifs_debug_data_proc_show()
Steve French stfrench@microsoft.com smb3: fix caching of ctime on setxattr
Steve French stfrench@microsoft.com smb3: allow dumping session and tcon id to improve stats analysis and debugging
Steve French stfrench@microsoft.com smb3: fix touch -h of symlink
Steve French stfrench@microsoft.com smb3: fix creating FIFOs when mounting with "sfu" mount option
Jeff Layton jlayton@kernel.org fs: add ctime accessors infrastructure
Basavaraj Natikar Basavaraj.Natikar@amd.com xhci: Enable RPM on controllers that support low-power states
Helge Deller deller@gmx.de parisc/power: Fix power soft-off when running on qemu
Helge Deller deller@gmx.de parisc/pgtable: Do not drop upper 5 address bits of physical address
Helge Deller deller@gmx.de parisc: Prevent booting 64-bit kernels on PA1.x machines
Zi Yan ziy@nvidia.com mm/hugetlb: use nth_page() in place of direct struct page manipulation
Peter Xu peterx@redhat.com mm/hugetlb: prepare hugetlb_follow_page_mask() for FOLL_PIN
Joel Fernandes (Google) joel@joelfernandes.org rcutorture: Fix stuttering races and other issues
Paul E. McKenney paulmck@kernel.org torture: Make torture_hrtimeout_ns() take an hrtimer mode parameter
Paul E. McKenney paulmck@kernel.org torture: Move stutter_wait() timeouts to hrtimers
Paul E. McKenney paulmck@kernel.org torture: Make torture_hrtimeout_*() use TASK_IDLE
Dietmar Eggemann dietmar.eggemann@arm.com torture: Add lock_torture writer_fifo module parameter
Paul E. McKenney paulmck@kernel.org torture: Add a kthread-creation callback to _torture_create_kthread()
Lukas Wunner lukas@wunner.de PCI: Lengthen reset delay for VideoPropulsion Torrent QN16e card
Lizhi Hou lizhi.hou@amd.com PCI: Add quirks to generate device tree node for Xilinx Alveo U50
Lizhi Hou lizhi.hou@amd.com PCI: Create device tree node for bridge
Lizhi Hou lizhi.hou@amd.com of: dynamic: Add interfaces for creating device node dynamically
Manivannan Sadhasivam mani@kernel.org PCI: qcom-ep: Add dedicated callback for writing to DBI2 registers
Pengfei Li pengfei.li_1@nxp.com pmdomain: imx: Make imx pgc power domain also set the fwnode
Tomeu Vizoso tomeu@tomeuvizoso.net pmdomain: amlogic: Fix mask for the second NNA mem PD domain
Maíra Canal mcanal@igalia.com pmdomain: bcm: bcm2835-power: check if the ASB register is equal to enable
Dan Williams dan.j.williams@intel.com cxl/port: Fix delete_endpoint() vs parent unregistration race
Jim Harris jim.harris@samsung.com cxl/region: Fix x1 root-decoder granularity calculations
Frank Li Frank.Li@nxp.com i3c: master: svc: fix random hot join failure since timeout error
Frank Li Frank.Li@nxp.com i3c: master: svc: fix SDA keep low when polling IBIWON timeout happen
Frank Li Frank.Li@nxp.com i3c: master: svc: fix check wrong status register in irq handler
Frank Li Frank.Li@nxp.com i3c: master: svc: fix ibi may not return mandatory data byte
Frank Li Frank.Li@nxp.com i3c: master: svc: fix wrong data return when IBI happen during start frame
Frank Li Frank.Li@nxp.com i3c: master: svc: fix race condition in ibi work thread
Joshua Yeong joshua.yeong@starfivetech.com i3c: master: cdns: Fix reading status register
Jim Harris jim.harris@samsung.com cxl/region: Do not try to cleanup after cxl_region_setup_targets() fails
Linus Walleij linus.walleij@linaro.org mtd: cfi_cmdset_0001: Byte swap OTP info
Florent Revest revest@chromium.org mm: make PR_MDWE_REFUSE_EXEC_GAIN an unsigned long
Zi Yan ziy@nvidia.com mm/memory_hotplug: use pfn math in place of direct struct page manipulation
Zi Yan ziy@nvidia.com mm/cma: use nth_page() in place of direct struct page manipulation
Heiko Carstens hca@linux.ibm.com s390/cmma: fix handling of swapper_pg_dir and invalid_pg_dir
Heiko Carstens hca@linux.ibm.com s390/cmma: fix detection of DAT pages
Heiko Carstens hca@linux.ibm.com s390/cmma: fix initial kernel address space page table walk
Heiko Carstens hca@linux.ibm.com s390/mm: add missing arch_set_page_dat() call to vmem_crst_alloc()
Alain Volmat alain.volmat@foss.st.com dmaengine: stm32-mdma: correct desc prep when channel running
Gaurav Batra gbatra@linux.vnet.ibm.com powerpc/pseries/iommu: enable_ddw incorrectly returns direct mapping for SR-IOV device
Sanjuán García, Jorge Jorge.SanjuanGarcia@duagon.com mcb: fix error handling for different scenarios when parsing
Saravana Kannan saravanak@google.com driver core: Release all resources during unbind before updating device links
Steven Rostedt (Google) rostedt@goodmis.org tracing: Have the user copy of synthetic event address use correct context
Tiezhu Yang yangtiezhu@loongson.cn selftests/clone3: Fix broken test under !CONFIG_TIME_NS
Benjamin Bara benjamin.bara@skidata.com i2c: core: Run atomic i2c xfer when !preemptible
Benjamin Bara benjamin.bara@skidata.com kernel/reboot: emergency_restart: Set correct system_state
Eric Biggers ebiggers@google.com quota: explicitly forbid quota files from being encrypted
Zhihao Cheng chengzhihao1@huawei.com jbd2: fix potential data lost in recovering journal raced with synchronizing fs bdev
Krzysztof Kozlowski krzysztof.kozlowski@linaro.org ASoC: codecs: wsa-macro: fix uninitialized stack variables with name prefix
Jamie Lentin jm@lentin.co.uk hid: lenovo: Resend all settings on reset_resume for compact keyboards
Ilpo Järvinen ilpo.jarvinen@linux.intel.com selftests/resctrl: Reduce failures due to outliers in MBA/MBM tests
Ilpo Järvinen ilpo.jarvinen@linux.intel.com selftests/resctrl: Move _GNU_SOURCE define into Makefile
Ilpo Järvinen ilpo.jarvinen@linux.intel.com selftests/resctrl: Remove duplicate feature check from CMT test
Ilpo Järvinen ilpo.jarvinen@linux.intel.com selftests/resctrl: Fix uninitialized .sa_flags
Srinivas Kandagatla srinivas.kandagatla@linaro.org ASoC: codecs: wsa883x: make use of new mute_unmute_on_trigger flag
Srinivas Kandagatla srinivas.kandagatla@linaro.org ASoC: soc-dai: add flag to mute and unmute stream during trigger
Pablo Neira Ayuso pablo@netfilter.org netfilter: nf_tables: split async and sync catchall in two functions
Pablo Neira Ayuso pablo@netfilter.org netfilter: nf_tables: remove catchall element in GC sync path
Mimi Zohar zohar@linux.ibm.com ima: detect changes to the backing overlay file
Amir Goldstein amir73il@gmail.com ima: annotate iint mutex to avoid lockdep false positive warnings
Johan Hovold johan+linaro@kernel.org mfd: qcom-spmi-pmic: Fix revid implementation
Johan Hovold johan+linaro@kernel.org mfd: qcom-spmi-pmic: Fix reference leaks in revid helper
Christian Marangi ansuelsmth@gmail.com leds: trigger: netdev: Move size check in set_device_name
Vignesh Viswanathan quic_viswanat@quicinc.com arm64: dts: qcom: ipq6018: Fix tcsr_mutex register size
Vignesh Viswanathan quic_viswanat@quicinc.com arm64: dts: qcom: ipq9574: Fix hwlock index for SMEM
Vasily Khoruzhick anarsoul@gmail.com ACPI: FPDT: properly handle invalid FPDT subtables
Kathiravan Thirumoorthy quic_kathirav@quicinc.com firmware: qcom_scm: use 64-bit calling convention only when client is 64-bit
Vignesh Viswanathan quic_viswanat@quicinc.com arm64: dts: qcom: ipq8074: Fix hwlock index for SMEM
Vignesh Viswanathan quic_viswanat@quicinc.com arm64: dts: qcom: ipq5332: Fix hwlock index for SMEM
David Arcari darcari@redhat.com thermal: intel: powerclamp: fix mismatch in get function for max_idle
Josef Bacik josef@toxicpanda.com btrfs: don't arbitrarily slow down delalloc if we're committing
Catalin Marinas catalin.marinas@arm.com rcu: kmemleak: Ignore kmemleak false positives when RCU-freeing objects
Brian Geffon bgeffon@google.com PM: hibernate: Clean up sync_read handling in snapshot_write_next()
Brian Geffon bgeffon@google.com PM: hibernate: Use __get_safe_page() rather than touching the list
Biju Das biju.das.jz@bp.renesas.com dt-bindings: timer: renesas,rz-mtu3: Fix overflow/underflow interrupt names
Vignesh Viswanathan quic_viswanat@quicinc.com arm64: dts: qcom: ipq6018: Fix hwlock index for SMEM
Joel Fernandes (Google) joel@joelfernandes.org rcu/tree: Defer setting of jiffies during stall reset
Chuck Lever chuck.lever@oracle.com svcrdma: Drop connection after an RDMA Read error
Ajay Singh ajay.kathat@microchip.com wifi: wilc1000: use vmm_table as array in wilc struct
Uwe Kleine-König u.kleine-koenig@pengutronix.de PCI: exynos: Don't discard .remove() callback
Uwe Kleine-König u.kleine-koenig@pengutronix.de PCI: kirin: Don't discard .remove() callback
Heiner Kallweit hkallweit1@gmail.com PCI/ASPM: Fix L1 substate handling in aspm_attr_store_common()
Bean Huo beanhuo@micron.com mmc: Add quirk MMC_QUIRK_BROKEN_CACHE_FLUSH for Micron eMMC Q2J54A
Nitin Yadav n-yadav@ti.com mmc: sdhci_am654: fix start loop index for TAP value parsing
Dan Carpenter dan.carpenter@linaro.org mmc: vub300: fix an error code
Namjae Jeon linkinjeon@kernel.org ksmbd: fix slab out of bounds write in smb_inherit_dacl()
Namjae Jeon linkinjeon@kernel.org ksmbd: handle malformed smb1 message
Marios Makassikis mmakassikis@freebox.fr ksmbd: fix recursive locking in vfs helpers
Kathiravan Thirumoorthy quic_kathirav@quicinc.com clk: qcom: ipq6018: drop the CLK_SET_RATE_PARENT flag from PLL clocks
Kathiravan Thirumoorthy quic_kathirav@quicinc.com clk: qcom: ipq8074: drop the CLK_SET_RATE_PARENT flag from PLL clocks
Gustavo A. R. Silva gustavoars@kernel.org clk: visconti: Fix undefined behavior bug in struct visconti_pll_provider
Gustavo A. R. Silva gustavoars@kernel.org clk: socfpga: Fix undefined behavior bug in struct stratix10_clock_data
Ville Syrjälä ville.syrjala@linux.intel.com powercap: intel_rapl: Downgrade BIOS locked limits pr_warn() to pr_debug()
Christian Marangi ansuelsmth@gmail.com cpufreq: stats: Fix buffer overflow detection in trans_stats()
Helge Deller deller@gmx.de parisc/power: Add power soft-off when running on qemu
Helge Deller deller@gmx.de parisc/pdc: Add width field to struct pdc_model
Helge Deller deller@gmx.de parisc/agp: Use 64-bit LE values in SBA IOMMU PDIR table
Maria Yu quic_aiquny@quicinc.com arm64: module: Fix PLT counting when CONFIG_RANDOMIZE_BASE=n
Nathan Chancellor nathan@kernel.org arm64: Restrict CPU_BIG_ENDIAN to GNU as or LLVM IAS 15.x or newer
Uwe Kleine-König u.kleine-koenig@pengutronix.de PCI: keystone: Don't discard .probe() callback
Uwe Kleine-König u.kleine-koenig@pengutronix.de PCI: keystone: Don't discard .remove() callback
Jarkko Sakkinen jarkko@kernel.org KEYS: trusted: Rollback init_trusted() consistently
Sumit Garg sumit.garg@linaro.org KEYS: trusted: tee: Refactor register SHM usage
Hao Jia jiahao.os@bytedance.com sched/core: Fix RQCF_ACT_SKIP leak
Herve Codina herve.codina@bootlin.com genirq/generic_chip: Make irq_remove_generic_chip() irqdomain aware
Rong Chen rong.chen@amlogic.com mmc: meson-gx: Remove setting of CMD_CFG_ERROR
Johan Hovold johan+linaro@kernel.org wifi: ath12k: fix dfs-radar and temperature event locking
Johan Hovold johan+linaro@kernel.org wifi: ath12k: fix htt mlo-offset event locking
Johan Hovold johan+linaro@kernel.org wifi: ath11k: fix gtk offload status event locking
Johan Hovold johan+linaro@kernel.org wifi: ath11k: fix htt pktlog locking
Johan Hovold johan+linaro@kernel.org wifi: ath11k: fix dfs radar event locking
Johan Hovold johan+linaro@kernel.org wifi: ath11k: fix temperature event locking
Mark Brown broonie@kernel.org regmap: Ensure range selector registers are updated after cache sync
Werner Sembach wse@tuxedocomputers.com ACPI: resource: Do IRQ override on TongFang GMxXGxx
John David Anglin dave@parisc-linux.org parisc: Add nop instructions after TLB inserts
SeongJae Park sj@kernel.org mm/damon/sysfs: check error from damon_sysfs_update_target()
SeongJae Park sj@kernel.org mm/damon/sysfs-schemes: handle tried regions sysfs directory allocation failure
SeongJae Park sj@kernel.org mm/damon/sysfs-schemes: handle tried region directory allocation failure
SeongJae Park sj@kernel.org mm/damon/core: avoid divide-by-zero during monitoring results update
SeongJae Park sj@kernel.org mm/damon: implement a function for max nr_accesses safe calculation
SeongJae Park sj@kernel.org mm/damon/ops-common: avoid divide-by-zero during region hotness calculation
SeongJae Park sj@kernel.org mm/damon/lru_sort: avoid divide-by-zero in hot threshold calculation
Mikulas Patocka mpatocka@redhat.com dm crypt: account large pages in cc->n_allocated_pages
Helge Deller deller@gmx.de fbdev: stifb: Make the STI next font pointer a 32-bit signed offset
Koichiro Den den@valinux.co.jp iommufd: Fix missing update of domains_itree after splitting iopt_area
Krister Johansen kjlx@templeofstupid.com watchdog: move softlockup_panic back to early_param
SeongJae Park sj@kernel.org mm/damon/sysfs: update monitoring target regions for online input commit
SeongJae Park sj@kernel.org mm/damon/sysfs: remove requested targets when online-commit inputs
Lukas Wunner lukas@wunner.de PCI/sysfs: Protect driver's D3cold preference from user space
David Woodhouse dwmw@amazon.co.uk hvc/xen: fix event channel handling for secondary consoles
David Woodhouse dwmw@amazon.co.uk hvc/xen: fix error path in xen_hvc_init() to always register frontend driver
David Woodhouse dwmw@amazon.co.uk hvc/xen: fix console unplug
Pavel Krasavin pkrasavin@imaqliq.com tty: serial: meson: fix hard LOCKUP on crtscts mode
Muhammad Usama Anjum usama.anjum@collabora.com tty/sysrq: replace smp_processor_id() with get_cpu()
Krister Johansen kjlx@templeofstupid.com proc: sysctl: prevent aliased sysctls from getting passed to init
Paul Moore paul@paul-moore.com audit: don't WARN_ON_ONCE(!current->mm) in audit_exe_compare()
Paul Moore paul@paul-moore.com audit: don't take task_lock() in audit_exe_compare() code path
Johannes Weiner hannes@cmpxchg.org sched: psi: fix unprivileged polling against cgroups
Victor Shih victor.shih@genesyslogic.com.tw mmc: sdhci-pci-gli: GL9755: Mask the replay timer timeout of AER
Haitao Shan hshan@google.com KVM: x86: Fix lapic timer interrupt lost after loading a snapshot.
Tao Su tao1.su@linux.intel.com KVM: x86: Clear bit12 of ICR after APIC-write VM-exit
Maciej S. Szmigiero maciej.szmigiero@oracle.com KVM: x86: Ignore MSR_AMD64_TW_CFG access
Nicolas Saenz Julienne nsaenz@amazon.com KVM: x86: hyper-v: Don't auto-enable stimer on write from user-space
Pu Wen puwen@hygon.cn x86/cpu/hygon: Fix the CPU topology evaluation for real
Koichiro Den den@valinux.co.jp x86/apic/msi: Fix misconfigured non-maskable MSI quirk
Mario Limonciello mario.limonciello@amd.com x86/PCI: Avoid PME from D3hot/D3cold for AMD Rembrandt and Phoenix USB4
Roxana Nicolescu roxana.nicolescu@canonical.com crypto: x86/sha - load modules based on CPU features
Peter Wang peter.wang@mediatek.com scsi: ufs: core: Fix racing issue between ufshcd_mcq_abort() and ISR
Quinn Tran qutran@marvell.com scsi: qla2xxx: Fix system crash due to bad pointer access
Manivannan Sadhasivam mani@kernel.org scsi: ufs: qcom: Update PHY settings only when scaling to higher gears
Chandrakanth patil chandrakanth.patil@broadcom.com scsi: megaraid_sas: Increase register read retry rount from 3 to 30 for selected registers
Ranjan Kumar ranjan.kumar@broadcom.com scsi: mpt3sas: Fix loop logic
Shung-Hsi Yu shung-hsi.yu@suse.com bpf: Fix precision tracking for BPF_ALU | BPF_TO_BE | BPF_END
Hao Sun sunhao.th@gmail.com bpf: Fix check_stack_write_fixed_off() to correctly spill imm
Kees Cook keescook@chromium.org randstruct: Fix gcc-plugin performance mode to stay in group
Nicholas Piggin npiggin@gmail.com powerpc/perf: Fix disabling BHRB and instruction sampling
Adrian Hunter adrian.hunter@intel.com perf intel-pt: Fix async branch flags
Vikash Garodia quic_vgarodia@quicinc.com media: venus: hfi: add checks to perform sanity on queue pointers
Alexandre Ghiti alexghiti@rivosinc.com drivers: perf: Check find_first_bit() return value
Ilkka Koskinen ilkka@os.amperecomputing.com perf: arm_cspmu: Reject events meant for other PMUs
Harshit Mogalapalli harshit.m.mogalapalli@oracle.com i915/perf: Fix NULL deref bugs with drm_dbg() calls
Peter Zijlstra peterz@infradead.org perf/core: Fix cpuctx refcounting
Ekaterina Esina eesina@astralinux.ru cifs: fix check of rc in function generate_smb3signingkey
Anastasia Belova abelova@astralinux.ru cifs: spnego: add ';' in HOST_KEY_LEN
Naomi Chu naomi.chu@mediatek.com scsi: ufs: core: Expand MCQ queue slot to DeviceQueueDepth + 1
Chen Yu yu.c.chen@intel.com tools/power/turbostat: Enable the C-state Pre-wake printing
Zhang Rui rui.zhang@intel.com tools/power/turbostat: Fix a knl bug
Vlad Buslov vladbu@nvidia.com macvlan: Don't propagate promisc change to lower dev in passthru
Xin Long lucien.xin@gmail.com net: sched: do not offload flows with a helper in act_ct
Rahul Rameshbabu rrameshbabu@nvidia.com net/mlx5e: Check return value of snprintf writing to fw_version buffer for representors
Rahul Rameshbabu rrameshbabu@nvidia.com net/mlx5e: Check return value of snprintf writing to fw_version buffer
Saeed Mahameed saeedm@nvidia.com net/mlx5e: Reduce the size of icosq_str
Rahul Rameshbabu rrameshbabu@nvidia.com net/mlx5: Increase size of irq name buffer
Rahul Rameshbabu rrameshbabu@nvidia.com net/mlx5e: Update doorbell for port timestamping CQ before the software counter
Rahul Rameshbabu rrameshbabu@nvidia.com net/mlx5e: Add recovery flow for tx devlink health reporter for unhealthy PTP SQ
Rahul Rameshbabu rrameshbabu@nvidia.com net/mlx5e: Make tx_port_ts logic resilient to out-of-order CQEs
Rahul Rameshbabu rrameshbabu@nvidia.com net/mlx5: Consolidate devlink documentation in devlink/mlx5.rst
Vlad Buslov vladbu@nvidia.com net/mlx5e: Fix pedit endianness
Gavin Li gavinl@nvidia.com net/mlx5e: fix double free of encap_header in update funcs
Dust Li dust.li@linux.alibaba.com net/mlx5e: fix double free of encap_header
Rahul Rameshbabu rrameshbabu@nvidia.com net/mlx5: Decouple PHC .adjtime and .adjphase implementations
Jens Axboe axboe@kernel.dk io_uring/fdinfo: remove need for sqpoll lock for thread/pid retrieval
Ziwei Xiao ziweixiao@google.com gve: Fixes for napi_poll when budget is 0
Shannon Nelson shannon.nelson@amd.com pds_core: fix up some format-truncation complaints
Shannon Nelson shannon.nelson@amd.com pds_core: use correct index to mask irq
Baruch Siach baruch@tkos.co.il net: stmmac: avoid rx queue overrun
Baruch Siach baruch@tkos.co.il net: stmmac: fix rx budget limit check
Pablo Neira Ayuso pablo@netfilter.org netfilter: nf_tables: bogus ENOENT when destroying element which does not exist
Dan Carpenter dan.carpenter@linaro.org netfilter: nf_tables: fix pointer math issue in nft_byteorder_eval()
Linkui Xiao xiaolinkui@kylinos.cn netfilter: nf_conntrack_bridge: initialize err to 0
Eric Dumazet edumazet@google.com af_unix: fix use-after-free in unix_stream_read_actor()
Linus Walleij linus.walleij@linaro.org net: ethernet: cortina: Fix MTU max setting
Linus Walleij linus.walleij@linaro.org net: ethernet: cortina: Handle large frames
Linus Walleij linus.walleij@linaro.org net: ethernet: cortina: Fix max RX frame define
Eric Dumazet edumazet@google.com bonding: stop the device in bond_setup_by_slave()
Eric Dumazet edumazet@google.com ptp: annotate data-race around q->head and q->tail
Christoph Hellwig hch@infradead.org blk-mq: make sure active queue usage is held for bio_integrity_prep()
Juergen Gross jgross@suse.com xen/events: fix delayed eoi list handling
Willem de Bruijn willemb@google.com ppp: limit MRU to 64K
Sven Auhagen sven.auhagen@voleatech.de net: mvneta: fix calls to page_pool_get_stats
Shigeru Yoshida syoshida@redhat.com tipc: Fix kernel-infoleak due to uninitialized TLV value
Jijie Shao shaojijie@huawei.com net: hns3: fix VF wrong speed and duplex issue
Jijie Shao shaojijie@huawei.com net: hns3: fix VF reset fail issue
Yonglong Liu liuyonglong@huawei.com net: hns3: fix variable may not initialized problem in hns3_init_mac_addr()
Yonglong Liu liuyonglong@huawei.com net: hns3: fix out-of-bounds access may occur when coalesce info is read via debugfs
Jian Shen shenjian15@huawei.com net: hns3: fix incorrect capability bit display for copper port
Yonglong Liu liuyonglong@huawei.com net: hns3: add barrier in vf mailbox reply process
Jian Shen shenjian15@huawei.com net: hns3: fix add VLAN fail issue
Juergen Gross jgross@suse.com xen/events: avoid using info_for_irq() in xen_send_IPI_one()
Shigeru Yoshida syoshida@redhat.com tty: Fix uninit-value access in ppp_sync_receive()
Eric Dumazet edumazet@google.com ipvlan: add ipvlan_route_v6_outbound() helper
Stanislav Fomichev sdf@google.com net: set SOCK_RCU_FREE before inserting socket into hashtable
Andrii Nakryiko andrii@kernel.org bpf: fix precision backtracking instruction iteration
Andrii Nakryiko andrii@kernel.org bpf: handle ldimm64 properly in check_cfg()
Kees Cook keescook@chromium.org gcc-plugins: randstruct: Only warn about true flexible arrays
Dan Carpenter dan.carpenter@linaro.org vhost-vdpa: fix use after free in vhost_vdpa_probe()
Stefano Garzarella sgarzare@redhat.com vdpa_sim_blk: allocate the buffer zeroed
Nirmoy Das nirmoy.das@intel.com drm/i915/tc: Fix -Wformat-truncation in intel_tc_port_init
Andreas Gruenbacher agruenba@redhat.com gfs2: Silence "suspicious RCU usage in gfs2_permission" warning
Nam Cao namcaov@gmail.com riscv: provide riscv-specific is_trap_insn()
Andrew Jones ajones@ventanamicro.com RISC-V: hwprobe: Fix vDSO SIGSEGV
felix fuzhen5@huawei.com SUNRPC: Fix RPC client cleaned up the freed pipefs dentries
Olga Kornievskaia kolga@netapp.com NFSv4.1: fix SP4_MACH_CRED protection for pnfs IO
Dan Carpenter dan.carpenter@linaro.org SUNRPC: Add an IS_ERR() check back to where it was
Olga Kornievskaia kolga@netapp.com NFSv4.1: fix handling NFS4ERR_DELAY when testing for session trunking
Arnd Bergmann arnd@arndb.de drm/i915/mtl: avoid stringop-overflow warning
Yi Yang yiyang13@huawei.com mtd: rawnand: meson: check return value of devm_kasprintf()
Yi Yang yiyang13@huawei.com mtd: rawnand: intel: check return value of devm_kasprintf()
Trond Myklebust trond.myklebust@hammerspace.com SUNRPC: ECONNRESET might require a rebind
Krzysztof Kozlowski krzysztof.kozlowski@linaro.org dt-bindings: serial: fix regex pattern for matching serial node children
Jinghao Jia jinghao@linux.ibm.com samples/bpf: syscall_tp_user: Fix array out-of-bound access
Jinghao Jia jinghao@linux.ibm.com samples/bpf: syscall_tp_user: Rename num_progs into nr_tests
Finn Thain fthain@linux-m68k.org sched/core: Optimize in_task() and in_interrupt() a bit
Miri Korenblit miriam.rachel.korenblit@intel.com wifi: iwlwifi: Use FW rate for non-data frames
Yi Yang yiyang13@huawei.com mtd: rawnand: tegra: add missing check for platform_get_irq()
Dan Carpenter dan.carpenter@linaro.org pwm: Fix double shift bug
Vitaly Prosyak vitaly.prosyak@amd.com drm/amdgpu: fix software pci_unplug on some chips
Alex Spataru alex_spataru@outlook.com ALSA: hda/realtek: Add quirk for ASUS UX7602ZM
Zongmin Zhou zhouzongmin@kylinos.cn drm/qxl: prevent memory leak
Tony Lindgren tony@atomide.com ASoC: ti: omap-mcbsp: Fix runtime PM underflow warnings
Philipp Stanner pstanner@redhat.com i2c: dev: copy userspace array safely
Deepak Gupta debug@rivosinc.com riscv: VMAP_STACK overflow detection thread-safe
Douglas Anderson dianders@chromium.org kgdb: Flush console before entering kgdb on panic
Wayne Lin wayne.lin@amd.com drm/amd/display: Avoid NULL dereference of timing generator
Takashi Iwai tiwai@suse.de media: imon: fix access to invalid resource for the second interface
Sakari Ailus sakari.ailus@linux.intel.com media: ccs: Fix driver quirk struct documentation
Ilpo Järvinen ilpo.jarvinen@linux.intel.com media: cobalt: Use FIELD_GET() to extract Link Width
Al Viro viro@zeniv.linux.org.uk gfs2: fix an oops in gfs2_permission
Bob Peterson rpeterso@redhat.com gfs2: ignore negated quota changes
Hans Verkuil hverkuil-cisco@xs4all.nl media: ipu-bridge: increase sensor_name size
Hans Verkuil hverkuil-cisco@xs4all.nl media: vivid: avoid integer overflow
Rajeshwar R Shinde coolrrsh@gmail.com media: gspca: cpia1: shift-out-of-bounds in set_flicker
Billy Tsai billy_tsai@aspeedtech.com i3c: master: mipi-i3c-hci: Fix a kernel panic for accessing DAT_data.
zhenwei pi pizhenwei@bytedance.com virtio-blk: fix implicit overflow on virtio_max_dma_size
Axel Lin axel.lin@ingics.com i2c: sun6i-p2wi: Prevent potential division by zero
Wolfram Sang wsa+renesas@sang-engineering.com i2c: fix memleak in i2c_new_client_device()
Jarkko Nikula jarkko.nikula@linux.intel.com i2c: i801: Add support for Intel Birch Stream SoC
Jarkko Nikula jarkko.nikula@linux.intel.com i3c: mipi-i3c-hci: Fix out of bounds access in hci_dma_irq_handler
Dominique Martinet asmadeus@codewreck.org 9p: v9fs_listxattr: fix %s null argument warning
Marco Elver elver@google.com 9p/trans_fd: Annotate data-racy writes to file::f_flags
Hardik Gajjar hgajjar@de.adit-jv.com usb: gadget: f_ncm: Always set current gadget in ncm_bind()
Wesley Cheng quic_wcheng@quicinc.com usb: host: xhci: Avoid XHCI resume delay if SSUSB device is not present
Zhiguo Niu zhiguo.niu@unisoc.com f2fs: fix error handling of __get_node_page
Zhiguo Niu zhiguo.niu@unisoc.com f2fs: fix error path of __f2fs_build_free_nids
Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com soundwire: dmi-quirks: update HP Omen match
Neil Armstrong neil.armstrong@linaro.org usb: ucsi: glink: use the connector orientation GPIO to provide switch events
Stanley Chang stanley_chang@realtek.com usb: dwc3: core: configure TX/RX threshold for DWC3_IP
Konrad Dybcio konrad.dybcio@linaro.org phy: qualcomm: phy-qcom-eusb2-repeater: Zero out untouched tuning regs
Konrad Dybcio konrad.dybcio@linaro.org phy: qualcomm: phy-qcom-eusb2-repeater: Use regmap_fields
Konrad Dybcio konrad.dybcio@linaro.org dt-bindings: phy: qcom,snps-eusb2-repeater: Add magic tuning overrides
Yi Yang yiyang13@huawei.com tty: vcc: Add check for kstrdup() in vcc_probe()
Mika Westerberg mika.westerberg@linux.intel.com thunderbolt: Apply USB 3.x bandwidth quirk only in software connection manager
Zhang Shurong zhang_shurong@foxmail.com iio: adc: stm32-adc: harden against NULL pointer deref in stm32_adc_probe()
Jarkko Nikula jarkko.nikula@linux.intel.com mfd: intel-lpss: Add Intel Lunar Lake-M PCI IDs
Yuezhang Mo Yuezhang.Mo@sony.com exfat: support handle zero-size directory
Jiri Kosina jkosina@suse.cz HID: Add quirk for Dell Pro Wireless Keyboard and Mouse KM5221W
Longfang Liu liulongfang@huawei.com crypto: hisilicon/qm - prevent soft lockup in receive loop
Hans de Goede hdegoede@redhat.com ASoC: Intel: soc-acpi-cht: Add Lenovo Yoga Tab 3 Pro YT3-X90 quirk
Bjorn Helgaas bhelgaas@google.com PCI: Use FIELD_GET() in Sapphire RX 5600 XT Pulse quirk
Yoshihiro Shimoda yoshihiro.shimoda.uh@renesas.com misc: pci_endpoint_test: Add Device ID for R-Car S4-8 PCIe controller
Yoshihiro Shimoda yoshihiro.shimoda.uh@renesas.com PCI: dwc: Add missing PCI_EXP_LNKCAP_MLW handling
Yoshihiro Shimoda yoshihiro.shimoda.uh@renesas.com PCI: dwc: Add dw_pcie_link_set_max_link_width()
Bartosz Pawlowski bartosz.pawlowski@intel.com PCI: Disable ATS for specific Intel IPU E2000 devices
Bartosz Pawlowski bartosz.pawlowski@intel.com PCI: Extract ATS disabling to a helper function
Ilpo Järvinen ilpo.jarvinen@linux.intel.com PCI: Use FIELD_GET() to extract Link Width
Wenchao Hao haowenchao2@huawei.com scsi: libfc: Fix potential NULL pointer dereference in fc_lport_ptp_setup()
Ilpo Järvinen ilpo.jarvinen@linux.intel.com PCI: Do error check on own line to split long "if" conditions
Ilpo Järvinen ilpo.jarvinen@linux.intel.com atm: iphase: Do PCI error checks on own line
Ilpo Järvinen ilpo.jarvinen@linux.intel.com PCI: mvebu: Use FIELD_PREP() with Link Width
Ilpo Järvinen ilpo.jarvinen@linux.intel.com PCI: tegra194: Use FIELD_GET()/FIELD_PREP() with Link Width fields
Linus Walleij linus.walleij@linaro.org gpiolib: of: Add quirk for mt2701-cs42448 ASoC sound
Cezary Rojewski cezary.rojewski@intel.com ALSA: hda: Fix possible null-ptr-deref when assigning a stream
Vincent Whitchurch vincent.whitchurch@axis.com ARM: 9320/1: fix stack depot IRQ stack filter
Mikhail Khvainitski me@khvoinitsky.org HID: lenovo: Detect quirk-free fw on cptkbd and stop applying workaround
Manas Ghandat ghandatmanas@gmail.com jfs: fix array-index-out-of-bounds in diAlloc
Manas Ghandat ghandatmanas@gmail.com jfs: fix array-index-out-of-bounds in dbFindLeaf
Juntong Deng juntong.deng@outlook.com fs/jfs: Add validity check for db_maxag and db_agpref
Juntong Deng juntong.deng@outlook.com fs/jfs: Add check for negative db_l2nbperpage
Tyrel Datwyler tyreld@linux.ibm.com scsi: ibmvfc: Remove BUG_ON in the case of an empty event pool
Yihang Li liyihang9@huawei.com scsi: hisi_sas: Set debugfs_dir pointer to NULL after removing debugfs
Ilpo Järvinen ilpo.jarvinen@linux.intel.com RDMA/hfi1: Use FIELD_GET() to extract Link Width
Rander Wang rander.wang@intel.com ASoC: SOF: ipc4: handle EXCEPTION_CAUGHT notification from firmware
Geoffrey D. Bennett g@b4.vu ALSA: scarlett2: Move USB IDs out from device_info struct
Lu Jialin lujialin4@huawei.com crypto: pcrypt - Fix hungtask for PADATA_RESET
Richard Fitzgerald rf@opensource.cirrus.com ASoC: SOF: Pass PCI SSID to machine driver
Richard Fitzgerald rf@opensource.cirrus.com ASoC: soc-card: Add storage for PCI SSID
Trevor Wu trevor.wu@mediatek.com ASoC: mediatek: mt8188-mt6359: support dynamic pinctrl
zhujun2 zhujun2@cmss.chinamobile.com selftests/efivarfs: create-read: fix a resource leak
Laurentiu Tudor laurentiu.tudor@nxp.com arm64: dts: ls208xa: use a pseudo-bus to constrain usb dma size
Lin.Cao lincao12@amd.com drm/amd: check num of link levels when update pcie param
Samson Tam samson.tam@amd.com drm/amd/display: fix num_ways overflow error
Mario Limonciello mario.limonciello@amd.com drm/amd: Disable PP_PCIE_DPM_MASK when dynamic speed switching not supported
Qu Huang qu.huang@linux.dev drm/amdgpu: Fix a null pointer access when the smc_rreg pointer is NULL
Jesse Zhang jesse.zhang@amd.com drm/amdkfd: Fix shift out-of-bounds issue
Ondrej Jirman megi@xff.cz drm/panel: st7703: Pick different reset sequence
Ma Ke make_ruc2021@163.com drm/amdgpu/vkms: fix a possible null pointer dereference
Ma Ke make_ruc2021@163.com drm/radeon: fix a possible null pointer dereference
Ma Ke make_ruc2021@163.com drm/panel/panel-tpo-tpg110: fix a possible null pointer dereference
Ma Ke make_ruc2021@163.com drm/panel: fix a possible null pointer dereference
Stanley.Yang Stanley.Yang@amd.com drm/amdgpu: Fix potential null pointer derefernce
Mario Limonciello mario.limonciello@amd.com drm/amd: Fix UBSAN array-index-out-of-bounds for Polaris and Tonga
Mario Limonciello mario.limonciello@amd.com drm/amd: Fix UBSAN array-index-out-of-bounds for SMU7
Jani Nikula jani.nikula@intel.com drm/msm/dp: skip validity check for DP CTS EDID checksum
Philipp Stanner pstanner@redhat.com drm: vmwgfx_surface.c: copy user-array safely
Philipp Stanner pstanner@redhat.com drm_lease.c: copy user-array safely
Philipp Stanner pstanner@redhat.com kernel: watch_queue: copy user-array safely
Philipp Stanner pstanner@redhat.com kernel: kexec: copy user-array safely
Philipp Stanner pstanner@redhat.com string.h: add array-wrappers for (v)memdup_user()
Wenjing Liu wenjing.liu@amd.com drm/amd/display: use full update for clip size increase of large plane source
Mario Limonciello mario.limonciello@amd.com drm/amd: Update `update_pcie_parameters` functions to use uint8_t arguments
Xiaogang Chen xiaogang.chen@amd.com drm/amdkfd: Fix a race condition of vram buffer unref in svm code
David (Ming Qiang) Wu David.Wu3@amd.com drm/amdgpu: not to save bo in the case of RAS err_event_athub
Yu Kuai yukuai3@huawei.com md: don't rely on 'mddev->pers' to be set in mddev_suspend()
Ville Syrjälä ville.syrjala@linux.intel.com drm/edid: Fixup h/vsync_end instead of h/vtotal
Wenjing Liu wenjing.liu@amd.com drm/amd/display: add seamless pipe topology transition check
Alvin Lee alvin.lee2@amd.com drm/amd/display: Don't lock phantom pipe on disabling
Alvin Lee Alvin.Lee2@amd.com drm/amd/display: Blank phantom OTG before enabling
baozhu.liu lucas.liu@siengine.com drm/komeda: drop all currently held locks if deadlock happens
Harish Kasiviswanathan Harish.Kasiviswanathan@amd.com drm/amdkfd: ratelimited SQ interrupt messages
Sui Jingfeng suijingfeng@loongson.cn drm/gma500: Fix call trace when psb_gem_mm_init() fails
Olli Asikainen olli.asikainen@gmail.com platform/x86: thinkpad_acpi: Add battery quirk for Thinkpad X120e
Herve Codina herve.codina@bootlin.com of: address: Fix address translation when address-size is greater than 2
Tzung-Bi Shih tzungbi@kernel.org platform/chrome: kunit: initialize lock for fake ec_dev
Hans de Goede hdegoede@redhat.com gpiolib: acpi: Add a ignore interrupt quirk for Peaq C1010
Gerhard Engleder gerhard@engleder-embedded.com tsnep: Fix tsnep_request_irq() format-overflow warning
Jonathan Denose jdenose@chromium.org ACPI: EC: Add quirk for HP 250 G7 Notebook PC
ZhengHan Wang wzhmmmmm@gmail.com Bluetooth: Fix double free in hci_conn_cleanup
youwan Wang wangyouwan@126.com Bluetooth: btusb: Add date->evt_skb is NULL check
Gregory Greenman gregory.greenman@intel.com wifi: iwlwifi: mvm: fix size check for fw_link_id
Andrii Nakryiko andrii@kernel.org bpf: Ensure proper register state printing for cond jumps
Arseniy Krasnov avkrasnov@salutedevices.com vsock: read from socket's error queue
Raju Lakkaraju Raju.Lakkaraju@microchip.com net: sfp: add quirk for FS's 2.5G copper SFP
Douglas Anderson dianders@chromium.org wifi: ath10k: Don't touch the CE interrupt registers after power up
Ma Ke make_ruc2021@163.com wifi: ath12k: mhi: fix potential memory leak in ath12k_mhi_register()
Eric Dumazet edumazet@google.com net: annotate data-races around sk->sk_dst_pending_confirm
Eric Dumazet edumazet@google.com net: annotate data-races around sk->sk_tx_queue_mapping
Ingo Rohloff lundril@gmx.de wifi: mt76: mt7921e: Support MT7992 IP in Xiaomi Redmibook 15 Pro (2023)
Christian Marangi ansuelsmth@gmail.com net: sfp: add quirk for Fiberstone GPON-ONU-34-20BI
Shiju Jose shiju.jose@huawei.com ACPI: APEI: Fix AER info corruption when error status data has multiple sections
Baochen Qiang quic_bqiang@quicinc.com wifi: ath12k: fix possible out-of-bound write in ath12k_wmi_ext_hal_reg_caps()
Dmitry Antipov dmantipov@yandex.ru wifi: ath10k: fix clang-specific fortify warning
Baochen Qiang quic_bqiang@quicinc.com wifi: ath12k: fix possible out-of-bound read in ath12k_htt_pull_ppdu_stats()
Dmitry Antipov dmantipov@yandex.ru wifi: ath9k: fix clang-specific fortify warnings
Kumar Kartikeya Dwivedi memxor@gmail.com bpf: Detect IP == ksym.end as part of BPF program
Sieng-Piaw Liew liew.s.piaw@gmail.com atl1c: Work around the DMA RX overflow issue
Ping-Ke Shih pkshih@realtek.com wifi: mac80211: don't return unset power in ieee80211_get_tx_power()
Dmitry Antipov dmantipov@yandex.ru wifi: mac80211_hwsim: fix clang-specific fortify warning
Harshitha Prem quic_hprem@quicinc.com wifi: ath12k: Ignore fragments from uninitialized peer in dp
Dmitry Antipov dmantipov@yandex.ru wifi: plfxlc: fix clang-specific fortify warning
Mike Rapoport (IBM) rppt@kernel.org x86/mm: Drop the 4 MB restriction on minimal NUMA node memory size
Frederic Weisbecker frederic@kernel.org workqueue: Provide one lock class key per work_on_cpu() callsite
Ran Xiaokai ran.xiaokai@zte.com.cn cpu/hotplug: Don't offline the last non-isolated CPU
Rik van Riel riel@surriel.com smp,csd: Throw an error if a CSD lock is stuck for too long
Frederic Weisbecker frederic@kernel.org srcu: Only accelerate on enqueue time
Ronald Wahl ronald.wahl@raritan.com clocksource/drivers/timer-atmel-tcb: Fix initialization on SAM9 hardware
Jacky Bai ping.bai@nxp.com clocksource/drivers/timer-imx-gpt: Fix potential memory leak
Ricardo Cañuelo ricardo.canuelo@collabora.com selftests/lkdtm: Disable CONFIG_UBSAN_TRAP in test config
Denis Arefev arefev@swemel.ru srcu: Fix srcu_struct node grpmask overflow on 64-bit systems
Zhen Lei thunder.leizhen@huawei.com rcu: Dump memory object info if callback function is invalid
Shuai Xue xueshuai@linux.alibaba.com perf/core: Bail out early if the request AUX area is out of bound
Josh Poimboeuf jpoimboe@kernel.org x86/retpoline: Make sure there are no unconverted return thunks due to KCSAN
Kent Overstreet kent.overstreet@gmail.com lib/generic-radix-tree.c: Don't overflow in peek()
Filipe Manana fdmanana@suse.com btrfs: abort transaction on generation mismatch when marking eb as dirty
John Stultz jstultz@google.com locking/ww_mutex/test: Fix potential workqueue corruption
-------------
Diffstat:
Documentation/admin-guide/kernel-parameters.txt | 11 + .../bindings/phy/qcom,snps-eusb2-repeater.yaml | 21 ++ .../devicetree/bindings/serial/serial.yaml | 2 +- .../devicetree/bindings/timer/renesas,rz-mtu3.yaml | 38 +-- Documentation/i2c/busses/i2c-i801.rst | 1 + .../ethernet/mellanox/mlx5/counters.rst | 6 + .../ethernet/mellanox/mlx5/devlink.rst | 313 ------------------ .../ethernet/mellanox/mlx5/index.rst | 1 - Documentation/networking/devlink/mlx5.rst | 182 +++++++++++ Makefile | 4 +- arch/arm/include/asm/exception.h | 4 - arch/arm64/Kconfig | 2 + arch/arm64/boot/dts/freescale/fsl-ls208xa.dtsi | 46 +-- arch/arm64/boot/dts/qcom/ipq5332.dtsi | 2 +- arch/arm64/boot/dts/qcom/ipq6018.dtsi | 4 +- arch/arm64/boot/dts/qcom/ipq8074.dtsi | 2 +- arch/arm64/boot/dts/qcom/ipq9574.dtsi | 2 +- arch/arm64/kernel/module-plts.c | 6 - arch/loongarch/include/asm/percpu.h | 10 +- arch/parisc/include/uapi/asm/pdc.h | 1 + arch/parisc/kernel/entry.S | 88 +++-- arch/parisc/kernel/head.S | 5 +- arch/powerpc/perf/core-book3s.c | 5 +- arch/powerpc/platforms/powernv/opal-prd.c | 17 +- arch/powerpc/platforms/pseries/iommu.c | 8 +- arch/riscv/include/asm/asm-prototypes.h | 1 - arch/riscv/include/asm/asm.h | 22 ++ arch/riscv/include/asm/hwprobe.h | 5 + arch/riscv/include/asm/page.h | 4 +- arch/riscv/include/asm/thread_info.h | 3 - arch/riscv/include/asm/vdso/processor.h | 2 +- arch/riscv/kernel/asm-offsets.c | 1 + arch/riscv/kernel/entry.S | 72 +---- arch/riscv/kernel/probes/simulate-insn.c | 2 +- arch/riscv/kernel/probes/uprobes.c | 6 + arch/riscv/kernel/traps.c | 36 +-- arch/riscv/kernel/vdso/hwprobe.c | 2 +- arch/riscv/mm/ptdump.c | 3 + arch/s390/mm/page-states.c | 25 +- arch/s390/mm/vmem.c | 8 +- arch/x86/crypto/sha1_ssse3_glue.c | 12 + arch/x86/crypto/sha256_ssse3_glue.c | 12 + arch/x86/include/asm/kvm-x86-ops.h | 1 + arch/x86/include/asm/kvm_host.h | 1 + arch/x86/include/asm/msr-index.h | 1 + arch/x86/include/asm/numa.h | 7 - arch/x86/kernel/apic/msi.c | 8 +- arch/x86/kernel/cpu/hygon.c | 8 +- arch/x86/kvm/hyperv.c | 10 +- arch/x86/kvm/lapic.c | 30 +- arch/x86/kvm/vmx/vmx.c | 4 +- arch/x86/kvm/x86.c | 2 + arch/x86/mm/numa.c | 7 - arch/x86/pci/fixup.c | 59 ++++ block/blk-mq.c | 75 ++--- crypto/pcrypt.c | 4 + drivers/acpi/acpi_fpdt.c | 45 ++- drivers/acpi/apei/ghes.c | 23 +- drivers/acpi/ec.c | 10 + drivers/acpi/resource.c | 12 + drivers/atm/iphase.c | 20 +- drivers/base/dd.c | 2 +- drivers/base/regmap/regcache.c | 30 ++ drivers/block/virtio_blk.c | 4 +- drivers/bluetooth/btusb.c | 3 + drivers/char/agp/parisc-agp.c | 16 +- drivers/clk/qcom/gcc-ipq6018.c | 6 - drivers/clk/qcom/gcc-ipq8074.c | 6 - drivers/clk/socfpga/stratix10-clk.h | 4 +- drivers/clk/visconti/pll.h | 4 +- drivers/clocksource/timer-atmel-tcb.c | 1 + drivers/clocksource/timer-imx-gpt.c | 18 +- drivers/cpufreq/cpufreq_stats.c | 14 +- drivers/crypto/hisilicon/qm.c | 2 + drivers/cxl/core/port.c | 34 +- drivers/cxl/core/region.c | 23 +- drivers/dma/stm32-mdma.c | 4 +- drivers/firmware/qcom_scm.c | 7 + drivers/gpio/gpiolib-acpi.c | 20 ++ drivers/gpio/gpiolib-of.c | 4 + drivers/gpu/drm/amd/amdgpu/amdgpu_bios.c | 5 + drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c | 1 + drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 +- drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 6 + drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 13 +- drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c | 23 +- drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c | 16 + drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 9 +- drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c | 7 + drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.c | 2 + drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 35 +- drivers/gpu/drm/amd/amdgpu/nbio_v2_3.c | 5 +- drivers/gpu/drm/amd/amdkfd/kfd_int_process_v10.c | 6 +- drivers/gpu/drm/amd/amdkfd/kfd_int_process_v11.c | 6 +- drivers/gpu/drm/amd/amdkfd/kfd_int_process_v9.c | 6 +- drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 13 +- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 24 +- .../amd/display/amdgpu_dm/amdgpu_dm_mst_types.c | 29 +- drivers/gpu/drm/amd/display/dc/core/dc.c | 70 ++-- drivers/gpu/drm/amd/display/dc/core/dc_stream.c | 4 +- drivers/gpu/drm/amd/display/dc/dc.h | 5 + .../drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c | 3 +- drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c | 10 +- drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.c | 105 +++++- drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.h | 9 + drivers/gpu/drm/amd/display/dc/dcn32/dcn32_init.c | 2 + drivers/gpu/drm/amd/display/dc/inc/hw_sequencer.h | 8 + drivers/gpu/drm/amd/display/dmub/dmub_srv.h | 22 +- drivers/gpu/drm/amd/display/dmub/src/dmub_srv.c | 50 ++- drivers/gpu/drm/amd/include/pptable.h | 4 +- drivers/gpu/drm/amd/pm/amdgpu_pm.c | 8 +- .../gpu/drm/amd/pm/powerplay/hwmgr/pptable_v1_0.h | 16 +- .../gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c | 4 +- drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c | 2 +- drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h | 2 +- drivers/gpu/drm/amd/pm/swsmu/inc/smu_v13_0.h | 4 +- drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c | 4 +- .../drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c | 10 +- drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c | 9 +- .../gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c | 40 +-- .../gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_7_ppt.c | 9 +- .../drm/arm/display/komeda/komeda_pipeline_state.c | 9 +- drivers/gpu/drm/bridge/ite-it66121.c | 4 +- drivers/gpu/drm/drm_edid.c | 18 +- drivers/gpu/drm/drm_lease.c | 4 +- drivers/gpu/drm/gma500/psb_drv.h | 1 + drivers/gpu/drm/gma500/psb_irq.c | 5 + drivers/gpu/drm/i915/display/intel_cdclk.c | 12 + drivers/gpu/drm/i915/display/intel_dp.c | 2 +- drivers/gpu/drm/i915/display/intel_tc.c | 11 +- drivers/gpu/drm/i915/gem/i915_gem_context.c | 1 + drivers/gpu/drm/i915/gt/intel_ggtt.c | 35 +- drivers/gpu/drm/i915/gt/intel_rc6.c | 16 +- drivers/gpu/drm/i915/i915_perf.c | 15 +- drivers/gpu/drm/mediatek/mtk_dp.c | 6 +- drivers/gpu/drm/msm/dp/dp_panel.c | 21 +- drivers/gpu/drm/panel/panel-arm-versatile.c | 2 + drivers/gpu/drm/panel/panel-sitronix-st7703.c | 25 +- drivers/gpu/drm/panel/panel-tpo-tpg110.c | 2 + drivers/gpu/drm/qxl/qxl_display.c | 3 + drivers/gpu/drm/radeon/radeon_connectors.c | 2 + drivers/gpu/drm/vmwgfx/vmwgfx_surface.c | 4 +- drivers/hid/hid-ids.h | 1 + drivers/hid/hid-lenovo.c | 118 ++++--- drivers/hid/hid-quirks.c | 1 + drivers/i2c/busses/Kconfig | 1 + drivers/i2c/busses/i2c-designware-master.c | 19 +- drivers/i2c/busses/i2c-i801.c | 22 +- drivers/i2c/busses/i2c-pxa.c | 76 ++++- drivers/i2c/busses/i2c-sun6i-p2wi.c | 5 + drivers/i2c/i2c-core-base.c | 13 +- drivers/i2c/i2c-core.h | 2 +- drivers/i2c/i2c-dev.c | 4 +- drivers/i3c/master/i3c-master-cdns.c | 6 +- drivers/i3c/master/mipi-i3c-hci/dat_v1.c | 29 +- drivers/i3c/master/mipi-i3c-hci/dma.c | 2 +- drivers/i3c/master/svc-i3c-master.c | 54 +++- drivers/iio/adc/stm32-adc-core.c | 9 +- drivers/infiniband/hw/hfi1/pcie.c | 9 +- drivers/iommu/iommufd/io_pagetable.c | 10 + drivers/leds/trigger/ledtrig-netdev.c | 6 +- drivers/mcb/mcb-core.c | 1 + drivers/mcb/mcb-parse.c | 2 +- drivers/md/dm-bufio.c | 87 +++-- drivers/md/dm-crypt.c | 15 +- drivers/md/dm-verity-fec.c | 4 +- drivers/md/dm-verity-target.c | 23 +- drivers/md/dm-verity.h | 2 +- drivers/md/md.c | 2 +- drivers/media/i2c/ccs/ccs-core.c | 2 +- drivers/media/i2c/ccs/ccs-quirk.h | 4 +- drivers/media/pci/cobalt/cobalt-driver.c | 11 +- drivers/media/pci/intel/ipu-bridge.h | 2 +- .../media/platform/qcom/camss/camss-csid-gen2.c | 11 +- .../platform/qcom/camss/camss-csiphy-3ph-1-0.c | 2 +- drivers/media/platform/qcom/camss/camss-vfe-170.c | 22 +- drivers/media/platform/qcom/camss/camss-vfe-480.c | 22 +- drivers/media/platform/qcom/camss/camss-vfe.c | 5 +- drivers/media/platform/qcom/camss/camss.c | 12 +- drivers/media/platform/qcom/venus/hfi_msgs.c | 2 +- drivers/media/platform/qcom/venus/hfi_parser.c | 15 + drivers/media/platform/qcom/venus/hfi_venus.c | 10 + drivers/media/rc/imon.c | 6 + drivers/media/rc/ir-sharp-decoder.c | 8 +- drivers/media/rc/lirc_dev.c | 6 +- drivers/media/test-drivers/vivid/vivid-rds-gen.c | 2 +- drivers/media/usb/gspca/cpia1.c | 3 + drivers/mfd/intel-lpss-pci.c | 13 + drivers/mfd/qcom-spmi-pmic.c | 101 ++++-- drivers/misc/pci_endpoint_test.c | 4 + drivers/mmc/core/block.c | 4 +- drivers/mmc/core/card.h | 4 + drivers/mmc/core/mmc.c | 8 +- drivers/mmc/core/quirks.h | 7 +- drivers/mmc/host/meson-gx-mmc.c | 1 - drivers/mmc/host/sdhci-pci-gli.c | 30 ++ drivers/mmc/host/sdhci_am654.c | 2 +- drivers/mmc/host/vub300.c | 1 + drivers/mtd/chips/cfi_cmdset_0001.c | 20 +- drivers/mtd/nand/raw/intel-nand-controller.c | 10 + drivers/mtd/nand/raw/meson_nand.c | 3 + drivers/mtd/nand/raw/tegra_nand.c | 4 + drivers/net/bonding/bond_main.c | 6 + drivers/net/dsa/lan9303_mdio.c | 4 +- drivers/net/ethernet/amd/pds_core/adminq.c | 2 +- drivers/net/ethernet/amd/pds_core/core.h | 2 +- drivers/net/ethernet/amd/pds_core/dev.c | 8 +- drivers/net/ethernet/amd/pds_core/devlink.c | 2 +- drivers/net/ethernet/atheros/atl1c/atl1c.h | 3 - drivers/net/ethernet/atheros/atl1c/atl1c_main.c | 67 +--- drivers/net/ethernet/cortina/gemini.c | 45 ++- drivers/net/ethernet/cortina/gemini.h | 4 +- drivers/net/ethernet/engleder/tsnep.h | 2 +- drivers/net/ethernet/engleder/tsnep_main.c | 12 +- drivers/net/ethernet/google/gve/gve_main.c | 8 +- drivers/net/ethernet/google/gve/gve_rx.c | 4 - drivers/net/ethernet/google/gve/gve_tx.c | 4 - drivers/net/ethernet/hisilicon/hns3/hns3_debugfs.c | 9 +- drivers/net/ethernet/hisilicon/hns3/hns3_enet.c | 2 +- .../ethernet/hisilicon/hns3/hns3pf/hclge_main.c | 33 +- .../ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c | 25 +- .../ethernet/hisilicon/hns3/hns3vf/hclgevf_main.h | 1 + .../ethernet/hisilicon/hns3/hns3vf/hclgevf_mbx.c | 7 + drivers/net/ethernet/marvell/mvneta.c | 28 +- .../net/ethernet/mellanox/mlx5/core/en/health.h | 1 + drivers/net/ethernet/mellanox/mlx5/core/en/ptp.c | 255 +++++++++++---- drivers/net/ethernet/mellanox/mlx5/core/en/ptp.h | 59 +++- .../ethernet/mellanox/mlx5/core/en/reporter_rx.c | 4 +- .../ethernet/mellanox/mlx5/core/en/reporter_tx.c | 65 ++++ .../net/ethernet/mellanox/mlx5/core/en/tc_tun.c | 30 +- .../net/ethernet/mellanox/mlx5/core/en_ethtool.c | 16 +- drivers/net/ethernet/mellanox/mlx5/core/en_rep.c | 12 +- drivers/net/ethernet/mellanox/mlx5/core/en_stats.c | 4 +- drivers/net/ethernet/mellanox/mlx5/core/en_stats.h | 4 +- drivers/net/ethernet/mellanox/mlx5/core/en_tc.c | 60 ++-- drivers/net/ethernet/mellanox/mlx5/core/en_tx.c | 28 +- .../net/ethernet/mellanox/mlx5/core/lib/clock.c | 7 +- drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c | 6 +- drivers/net/ethernet/mellanox/mlx5/core/pci_irq.h | 3 + drivers/net/ethernet/realtek/r8169_main.c | 46 ++- drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 3 +- drivers/net/ipvlan/ipvlan_core.c | 41 ++- drivers/net/macvlan.c | 2 +- drivers/net/phy/phylink.c | 1 + drivers/net/phy/sfp.c | 8 + drivers/net/ppp/ppp_synctty.c | 6 +- drivers/net/wireless/ath/ath10k/debug.c | 2 +- drivers/net/wireless/ath/ath10k/snoc.c | 18 +- drivers/net/wireless/ath/ath11k/dp_rx.c | 8 +- drivers/net/wireless/ath/ath11k/wmi.c | 19 +- drivers/net/wireless/ath/ath12k/dp.c | 1 + drivers/net/wireless/ath/ath12k/dp_rx.c | 33 +- drivers/net/wireless/ath/ath12k/mhi.c | 11 +- drivers/net/wireless/ath/ath12k/peer.h | 3 + drivers/net/wireless/ath/ath12k/wmi.c | 17 +- drivers/net/wireless/ath/ath9k/debug.c | 2 +- drivers/net/wireless/ath/ath9k/htc_drv_debug.c | 2 +- drivers/net/wireless/intel/iwlwifi/mvm/link.c | 4 +- drivers/net/wireless/intel/iwlwifi/mvm/tx.c | 14 +- drivers/net/wireless/mediatek/mt76/mt7921/pci.c | 2 + drivers/net/wireless/microchip/wilc1000/wlan.c | 2 +- drivers/net/wireless/purelifi/plfxlc/mac.c | 2 +- drivers/net/wireless/virtual/mac80211_hwsim.c | 2 +- drivers/of/address.c | 30 +- drivers/of/dynamic.c | 164 ++++++++++ drivers/of/unittest.c | 19 +- drivers/parisc/power.c | 16 +- drivers/pci/Kconfig | 12 + drivers/pci/Makefile | 1 + drivers/pci/bus.c | 2 + drivers/pci/controller/dwc/pci-exynos.c | 4 +- drivers/pci/controller/dwc/pci-keystone.c | 8 +- drivers/pci/controller/dwc/pcie-designware.c | 93 +++--- drivers/pci/controller/dwc/pcie-kirin.c | 4 +- drivers/pci/controller/dwc/pcie-qcom-ep.c | 17 + drivers/pci/controller/dwc/pcie-tegra194.c | 9 +- drivers/pci/controller/pci-mvebu.c | 2 +- drivers/pci/of.c | 79 +++++ drivers/pci/of_property.c | 355 +++++++++++++++++++++ drivers/pci/pci-acpi.c | 2 +- drivers/pci/pci-sysfs.c | 10 +- drivers/pci/pci.c | 22 +- drivers/pci/pci.h | 12 + drivers/pci/pcie/aer.c | 10 + drivers/pci/pcie/aspm.c | 2 + drivers/pci/probe.c | 6 +- drivers/pci/quirks.c | 64 +++- drivers/pci/remove.c | 1 + drivers/perf/arm_cspmu/arm_cspmu.c | 3 + drivers/perf/riscv_pmu_sbi.c | 5 + drivers/phy/qualcomm/phy-qcom-eusb2-repeater.c | 131 ++++++-- drivers/platform/chrome/cros_ec_proto_test.c | 1 + drivers/platform/x86/thinkpad_acpi.c | 1 + drivers/powercap/intel_rapl_common.c | 2 +- drivers/ptp/ptp_chardev.c | 3 +- drivers/ptp/ptp_clock.c | 5 +- drivers/ptp/ptp_private.h | 8 +- drivers/ptp/ptp_sysfs.c | 3 +- drivers/s390/crypto/ap_bus.c | 4 + drivers/scsi/hisi_sas/hisi_sas_v3_hw.c | 13 +- drivers/scsi/ibmvscsi/ibmvfc.c | 124 ++++++- drivers/scsi/libfc/fc_lport.c | 6 + drivers/scsi/megaraid/megaraid_sas_base.c | 4 +- drivers/scsi/mpt3sas/mpt3sas_base.c | 4 +- drivers/scsi/qla2xxx/qla_os.c | 12 +- drivers/soc/amlogic/meson-ee-pwrc.c | 2 +- drivers/soc/bcm/bcm2835-power.c | 2 +- drivers/soc/imx/gpc.c | 1 + drivers/soundwire/dmi-quirks.c | 2 +- drivers/thermal/intel/intel_powerclamp.c | 2 +- drivers/thunderbolt/quirks.c | 3 + drivers/tty/hvc/hvc_xen.c | 39 ++- drivers/tty/serial/meson_uart.c | 14 +- drivers/tty/sysrq.c | 3 +- drivers/tty/vcc.c | 16 +- drivers/ufs/core/ufs-mcq.c | 5 +- drivers/ufs/core/ufshcd.c | 3 +- drivers/ufs/host/ufs-qcom.c | 9 +- drivers/usb/dwc3/core.c | 160 +++++++--- drivers/usb/dwc3/core.h | 13 + drivers/usb/gadget/function/f_ncm.c | 27 +- drivers/usb/host/xhci-pci.c | 4 +- drivers/usb/host/xhci.c | 12 +- drivers/usb/typec/ucsi/ucsi_glink.c | 54 +++- drivers/vdpa/vdpa_sim/vdpa_sim_blk.c | 4 +- drivers/vhost/vdpa.c | 1 - drivers/watchdog/sbsa_gwdt.c | 4 +- drivers/xen/events/events_base.c | 16 +- fs/9p/xattr.c | 5 +- fs/btrfs/block-group.c | 4 +- fs/btrfs/ctree.c | 109 ++++--- fs/btrfs/ctree.h | 11 +- fs/btrfs/delalloc-space.c | 3 - fs/btrfs/delayed-inode.c | 2 +- fs/btrfs/dev-replace.c | 2 +- fs/btrfs/dir-item.c | 8 +- fs/btrfs/disk-io.c | 13 +- fs/btrfs/disk-io.h | 3 +- fs/btrfs/extent-tree.c | 36 ++- fs/btrfs/file-item.c | 17 +- fs/btrfs/file.c | 34 +- fs/btrfs/free-space-cache.c | 6 +- fs/btrfs/free-space-tree.c | 17 +- fs/btrfs/inode-item.c | 16 +- fs/btrfs/inode.c | 17 +- fs/btrfs/ioctl.c | 4 +- fs/btrfs/qgroup.c | 14 +- fs/btrfs/relocation.c | 10 +- fs/btrfs/root-tree.c | 4 +- fs/btrfs/tests/extent-buffer-tests.c | 6 +- fs/btrfs/tests/inode-tests.c | 12 +- fs/btrfs/tree-log.c | 12 +- fs/btrfs/uuid-tree.c | 6 +- fs/btrfs/volumes.c | 10 +- fs/btrfs/xattr.c | 8 +- fs/exfat/namei.c | 29 +- fs/ext4/acl.h | 5 + fs/ext4/ext4.h | 3 +- fs/ext4/extents_status.c | 127 +++++--- fs/ext4/file.c | 169 +++++----- fs/ext4/inode.c | 14 +- fs/ext4/resize.c | 23 +- fs/ext4/super.c | 14 + fs/f2fs/compress.c | 2 +- fs/f2fs/extent_cache.c | 53 ++- fs/f2fs/file.c | 9 + fs/f2fs/node.c | 18 +- fs/f2fs/xattr.c | 20 +- fs/gfs2/inode.c | 14 +- fs/gfs2/ops_fstype.c | 4 +- fs/gfs2/quota.c | 11 + fs/gfs2/super.c | 2 +- fs/inode.c | 16 + fs/jbd2/recovery.c | 8 + fs/jfs/jfs_dmap.c | 23 +- fs/jfs/jfs_imap.c | 5 +- fs/nfs/nfs4proc.c | 12 +- fs/nfsd/nfs4state.c | 2 +- fs/nfsd/nfscache.c | 23 +- fs/overlayfs/super.c | 2 +- fs/proc/proc_sysctl.c | 8 +- fs/quota/dquot.c | 14 + fs/smb/client/cached_dir.c | 84 +++-- fs/smb/client/cifs_debug.c | 6 + fs/smb/client/cifs_ioctl.h | 6 + fs/smb/client/cifs_spnego.c | 4 +- fs/smb/client/cifsfs.c | 1 + fs/smb/client/cifsglob.h | 12 +- fs/smb/client/cifspdu.h | 2 +- fs/smb/client/cifsproto.h | 7 +- fs/smb/client/connect.c | 15 +- fs/smb/client/inode.c | 4 + fs/smb/client/ioctl.c | 25 ++ fs/smb/client/sess.c | 1 - fs/smb/client/smb2misc.c | 2 +- fs/smb/client/smb2ops.c | 8 +- fs/smb/client/smb2transport.c | 5 +- fs/smb/client/transport.c | 11 +- fs/smb/client/xattr.c | 5 +- fs/smb/server/smb_common.c | 11 + fs/smb/server/smbacl.c | 29 +- fs/smb/server/vfs.c | 23 +- fs/xfs/xfs_inode_item_recover.c | 32 +- include/acpi/ghes.h | 4 + include/linux/bpf.h | 8 +- include/linux/damon.h | 7 + include/linux/ethtool.h | 4 +- include/linux/f2fs_fs.h | 1 + include/linux/fs.h | 45 ++- include/linux/generic-radix-tree.h | 7 + include/linux/irq.h | 26 +- include/linux/lsm_hook_defs.h | 4 +- include/linux/mmc/card.h | 2 + include/linux/msi.h | 6 - include/linux/of.h | 23 ++ include/linux/pci_ids.h | 2 + include/linux/perf_event.h | 13 +- include/linux/preempt.h | 15 +- include/linux/pwm.h | 4 +- include/linux/socket.h | 1 + include/linux/string.h | 40 +++ include/linux/sunrpc/clnt.h | 1 + include/linux/sysctl.h | 6 + include/linux/torture.h | 10 +- include/linux/trace_events.h | 4 + include/linux/workqueue.h | 46 ++- include/net/netfilter/nf_tables.h | 4 +- include/net/sock.h | 26 +- include/net/tc_act/tc_ct.h | 9 + include/sound/soc-acpi.h | 7 + include/sound/soc-card.h | 37 +++ include/sound/soc-dai.h | 1 + include/sound/soc.h | 11 + include/sound/sof.h | 8 + include/uapi/linux/prctl.h | 2 +- include/uapi/linux/vm_sockets.h | 17 + include/video/sticore.h | 2 +- init/Makefile | 1 + init/main.c | 4 + io_uring/fdinfo.c | 9 +- io_uring/sqpoll.c | 12 +- kernel/audit_watch.c | 9 +- kernel/bpf/core.c | 6 +- kernel/bpf/verifier.c | 64 +++- kernel/cgroup/cgroup.c | 12 - kernel/cpu.c | 11 +- kernel/debug/debug_core.c | 3 + kernel/events/core.c | 17 + kernel/events/ring_buffer.c | 6 + kernel/irq/debugfs.c | 1 - kernel/irq/generic-chip.c | 25 +- kernel/irq/msi.c | 12 +- kernel/kexec.c | 2 +- kernel/locking/locktorture.c | 12 +- kernel/locking/test-ww_mutex.c | 20 +- kernel/padata.c | 2 +- kernel/power/snapshot.c | 16 +- kernel/rcu/rcu.h | 7 + kernel/rcu/srcutiny.c | 1 + kernel/rcu/srcutree.c | 11 +- kernel/rcu/tasks.h | 1 + kernel/rcu/tiny.c | 1 + kernel/rcu/tree.c | 22 ++ kernel/rcu/tree.h | 4 + kernel/rcu/tree_stall.h | 20 +- kernel/reboot.c | 1 + kernel/sched/core.c | 5 +- kernel/smp.c | 13 +- kernel/torture.c | 67 ++-- kernel/trace/trace.c | 15 + kernel/trace/trace.h | 3 + kernel/trace/trace_events.c | 43 ++- kernel/trace/trace_events_filter.c | 3 + kernel/trace/trace_events_synth.c | 2 +- kernel/watch_queue.c | 2 +- kernel/watchdog.c | 7 + kernel/workqueue.c | 20 +- lib/generic-radix-tree.c | 17 +- mm/cma.c | 2 +- mm/damon/core.c | 10 +- mm/damon/lru_sort.c | 4 +- mm/damon/ops-common.c | 5 +- mm/damon/sysfs-schemes.c | 5 + mm/damon/sysfs.c | 89 +++--- mm/huge_memory.c | 16 +- mm/hugetlb.c | 33 +- mm/memcontrol.c | 3 +- mm/memory_hotplug.c | 2 +- net/9p/client.c | 2 +- net/9p/trans_fd.c | 13 +- net/bluetooth/hci_conn.c | 6 +- net/bluetooth/hci_sysfs.c | 23 +- net/bridge/netfilter/nf_conntrack_bridge.c | 2 +- net/core/sock.c | 2 +- net/ipv4/inet_hashtables.c | 2 +- net/ipv4/tcp_output.c | 2 +- net/mac80211/cfg.c | 4 + net/mptcp/pm_netlink.c | 5 +- net/mptcp/protocol.c | 4 + net/mptcp/sockopt.c | 3 + net/ncsi/ncsi-aen.c | 5 - net/netfilter/nf_tables_api.c | 58 ++-- net/netfilter/nft_byteorder.c | 5 +- net/netfilter/nft_meta.c | 2 +- net/sched/act_ct.c | 3 + net/sunrpc/clnt.c | 7 +- net/sunrpc/rpcb_clnt.c | 4 + net/sunrpc/xprtrdma/svc_rdma_recvfrom.c | 3 +- net/tipc/netlink_compat.c | 1 + net/unix/af_unix.c | 9 +- net/vmw_vsock/af_vsock.c | 6 + samples/bpf/syscall_tp_user.c | 45 ++- scripts/Makefile.vmlinux | 1 + scripts/gcc-plugins/randomize_layout_plugin.c | 21 +- security/integrity/iint.c | 48 ++- security/integrity/ima/ima_api.c | 5 + security/integrity/ima/ima_main.c | 16 +- security/integrity/integrity.h | 2 + security/keys/trusted-keys/trusted_core.c | 20 +- security/keys/trusted-keys/trusted_tee.c | 64 ++-- sound/core/info.c | 21 +- sound/hda/hdac_stream.c | 6 +- sound/pci/hda/patch_realtek.c | 26 +- sound/soc/codecs/lpass-wsa-macro.c | 3 + sound/soc/codecs/wsa883x.c | 7 +- sound/soc/intel/common/soc-acpi-intel-cht-match.c | 43 +++ sound/soc/mediatek/mt8188/mt8188-mt6359.c | 21 ++ sound/soc/soc-dai.c | 7 + sound/soc/soc-pcm.c | 12 +- sound/soc/sof/ipc4.c | 3 + sound/soc/sof/sof-audio.c | 7 + sound/soc/sof/sof-pci-dev.c | 8 + sound/soc/ti/omap-mcbsp.c | 6 +- sound/usb/mixer_scarlett_gen2.c | 63 ++-- tools/include/uapi/linux/prctl.h | 2 +- tools/perf/util/intel-pt.c | 2 + tools/power/x86/turbostat/turbostat.c | 3 +- tools/testing/cxl/test/cxl.c | 2 +- tools/testing/selftests/bpf/verifier/ld_imm64.c | 8 +- tools/testing/selftests/clone3/clone3.c | 7 +- tools/testing/selftests/efivarfs/create-read.c | 2 + tools/testing/selftests/lkdtm/config | 1 - tools/testing/selftests/lkdtm/tests.txt | 2 +- tools/testing/selftests/net/mptcp/mptcp_join.sh | 2 +- tools/testing/selftests/resctrl/Makefile | 2 +- tools/testing/selftests/resctrl/cmt_test.c | 3 - tools/testing/selftests/resctrl/mba_test.c | 2 +- tools/testing/selftests/resctrl/mbm_test.c | 2 +- tools/testing/selftests/resctrl/resctrl.h | 1 - tools/testing/selftests/resctrl/resctrl_val.c | 4 +- 550 files changed, 6181 insertions(+), 2845 deletions(-)
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: John Stultz jstultz@google.com
[ Upstream commit bccdd808902f8c677317cec47c306e42b93b849e ]
In some cases running with the test-ww_mutex code, I was seeing odd behavior where sometimes it seemed flush_workqueue was returning before all the work threads were finished.
Often this would cause strange crashes as the mutexes would be freed while they were being used.
Looking at the code, there is a lifetime problem as the controlling thread that spawns the work allocates the "struct stress" structures that are passed to the workqueue threads. Then when the workqueue threads are finished, they free the stress struct that was passed to them.
Unfortunately the workqueue work_struct node is in the stress struct. Which means the work_struct is freed before the work thread returns and while flush_workqueue is waiting.
It seems like a better idea to have the controlling thread both allocate and free the stress structures, so that we can be sure we don't corrupt the workqueue by freeing the structure prematurely.
So this patch reworks the test to do so, and with this change I no longer see the early flush_workqueue returns.
Signed-off-by: John Stultz jstultz@google.com Signed-off-by: Ingo Molnar mingo@kernel.org Link: https://lore.kernel.org/r/20230922043616.19282-3-jstultz@google.com Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/locking/test-ww_mutex.c | 20 ++++++++++++-------- 1 file changed, 12 insertions(+), 8 deletions(-)
diff --git a/kernel/locking/test-ww_mutex.c b/kernel/locking/test-ww_mutex.c index 93cca6e698600..7c5a8f05497f2 100644 --- a/kernel/locking/test-ww_mutex.c +++ b/kernel/locking/test-ww_mutex.c @@ -466,7 +466,6 @@ static void stress_inorder_work(struct work_struct *work) } while (!time_after(jiffies, stress->timeout));
kfree(order); - kfree(stress); }
struct reorder_lock { @@ -531,7 +530,6 @@ static void stress_reorder_work(struct work_struct *work) list_for_each_entry_safe(ll, ln, &locks, link) kfree(ll); kfree(order); - kfree(stress); }
static void stress_one_work(struct work_struct *work) @@ -552,8 +550,6 @@ static void stress_one_work(struct work_struct *work) break; } } while (!time_after(jiffies, stress->timeout)); - - kfree(stress); }
#define STRESS_INORDER BIT(0) @@ -564,15 +560,24 @@ static void stress_one_work(struct work_struct *work) static int stress(int nlocks, int nthreads, unsigned int flags) { struct ww_mutex *locks; - int n; + struct stress *stress_array; + int n, count;
locks = kmalloc_array(nlocks, sizeof(*locks), GFP_KERNEL); if (!locks) return -ENOMEM;
+ stress_array = kmalloc_array(nthreads, sizeof(*stress_array), + GFP_KERNEL); + if (!stress_array) { + kfree(locks); + return -ENOMEM; + } + for (n = 0; n < nlocks; n++) ww_mutex_init(&locks[n], &ww_class);
+ count = 0; for (n = 0; nthreads; n++) { struct stress *stress; void (*fn)(struct work_struct *work); @@ -596,9 +601,7 @@ static int stress(int nlocks, int nthreads, unsigned int flags) if (!fn) continue;
- stress = kmalloc(sizeof(*stress), GFP_KERNEL); - if (!stress) - break; + stress = &stress_array[count++];
INIT_WORK(&stress->work, fn); stress->locks = locks; @@ -613,6 +616,7 @@ static int stress(int nlocks, int nthreads, unsigned int flags)
for (n = 0; n < nlocks; n++) ww_mutex_destroy(&locks[n]); + kfree(stress_array); kfree(locks);
return 0;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Filipe Manana fdmanana@suse.com
[ Upstream commit 50564b651d01c19ce732819c5b3c3fd60707188e ]
When marking an extent buffer as dirty, at btrfs_mark_buffer_dirty(), we check if its generation matches the running transaction and if not we just print a warning. Such mismatch is an indicator that something really went wrong and only printing a warning message (and stack trace) is not enough to prevent a corruption. Allowing a transaction to commit with such an extent buffer will trigger an error if we ever try to read it from disk due to a generation mismatch with its parent generation.
So abort the current transaction with -EUCLEAN if we notice a generation mismatch. For this we need to pass a transaction handle to btrfs_mark_buffer_dirty() which is always available except in test code, in which case we can pass NULL since it operates on dummy extent buffers and all test roots have a single node/leaf (root node at level 0).
Signed-off-by: Filipe Manana fdmanana@suse.com Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/btrfs/block-group.c | 4 +- fs/btrfs/ctree.c | 109 +++++++++++++++------------ fs/btrfs/ctree.h | 11 ++- fs/btrfs/delayed-inode.c | 2 +- fs/btrfs/dev-replace.c | 2 +- fs/btrfs/dir-item.c | 8 +- fs/btrfs/disk-io.c | 13 +++- fs/btrfs/disk-io.h | 3 +- fs/btrfs/extent-tree.c | 36 +++++---- fs/btrfs/file-item.c | 17 +++-- fs/btrfs/file.c | 34 ++++----- fs/btrfs/free-space-cache.c | 6 +- fs/btrfs/free-space-tree.c | 17 +++-- fs/btrfs/inode-item.c | 16 ++-- fs/btrfs/inode.c | 10 +-- fs/btrfs/ioctl.c | 4 +- fs/btrfs/qgroup.c | 14 ++-- fs/btrfs/relocation.c | 10 +-- fs/btrfs/root-tree.c | 4 +- fs/btrfs/tests/extent-buffer-tests.c | 6 +- fs/btrfs/tests/inode-tests.c | 12 ++- fs/btrfs/tree-log.c | 12 +-- fs/btrfs/uuid-tree.c | 6 +- fs/btrfs/volumes.c | 10 +-- fs/btrfs/xattr.c | 8 +- 25 files changed, 205 insertions(+), 169 deletions(-)
diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c index 5e7a19fca79c4..bf65f801d8439 100644 --- a/fs/btrfs/block-group.c +++ b/fs/btrfs/block-group.c @@ -2587,7 +2587,7 @@ static int insert_dev_extent(struct btrfs_trans_handle *trans, btrfs_set_dev_extent_chunk_offset(leaf, extent, chunk_offset);
btrfs_set_dev_extent_length(leaf, extent, num_bytes); - btrfs_mark_buffer_dirty(leaf); + btrfs_mark_buffer_dirty(trans, leaf); out: btrfs_free_path(path); return ret; @@ -3011,7 +3011,7 @@ static int update_block_group_item(struct btrfs_trans_handle *trans, cache->global_root_id); btrfs_set_stack_block_group_flags(&bgi, cache->flags); write_extent_buffer(leaf, &bgi, bi, sizeof(bgi)); - btrfs_mark_buffer_dirty(leaf); + btrfs_mark_buffer_dirty(trans, leaf); fail: btrfs_release_path(path); /* diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c index 617d4827eec26..118ad4d2cbbe2 100644 --- a/fs/btrfs/ctree.c +++ b/fs/btrfs/ctree.c @@ -359,7 +359,7 @@ int btrfs_copy_root(struct btrfs_trans_handle *trans, return ret; }
- btrfs_mark_buffer_dirty(cow); + btrfs_mark_buffer_dirty(trans, cow); *cow_ret = cow; return 0; } @@ -627,7 +627,7 @@ static noinline int __btrfs_cow_block(struct btrfs_trans_handle *trans, cow->start); btrfs_set_node_ptr_generation(parent, parent_slot, trans->transid); - btrfs_mark_buffer_dirty(parent); + btrfs_mark_buffer_dirty(trans, parent); if (last_ref) { ret = btrfs_tree_mod_log_free_eb(buf); if (ret) { @@ -643,7 +643,7 @@ static noinline int __btrfs_cow_block(struct btrfs_trans_handle *trans, if (unlock_orig) btrfs_tree_unlock(buf); free_extent_buffer_stale(buf); - btrfs_mark_buffer_dirty(cow); + btrfs_mark_buffer_dirty(trans, cow); *cow_ret = cow; return 0; } @@ -1197,7 +1197,7 @@ static noinline int balance_level(struct btrfs_trans_handle *trans, goto out; } btrfs_set_node_key(parent, &right_key, pslot + 1); - btrfs_mark_buffer_dirty(parent); + btrfs_mark_buffer_dirty(trans, parent); } } if (btrfs_header_nritems(mid) == 1) { @@ -1255,7 +1255,7 @@ static noinline int balance_level(struct btrfs_trans_handle *trans, goto out; } btrfs_set_node_key(parent, &mid_key, pslot); - btrfs_mark_buffer_dirty(parent); + btrfs_mark_buffer_dirty(trans, parent); }
/* update the path */ @@ -1362,7 +1362,7 @@ static noinline int push_nodes_for_insert(struct btrfs_trans_handle *trans, return ret; } btrfs_set_node_key(parent, &disk_key, pslot); - btrfs_mark_buffer_dirty(parent); + btrfs_mark_buffer_dirty(trans, parent); if (btrfs_header_nritems(left) > orig_slot) { path->nodes[level] = left; path->slots[level + 1] -= 1; @@ -1422,7 +1422,7 @@ static noinline int push_nodes_for_insert(struct btrfs_trans_handle *trans, return ret; } btrfs_set_node_key(parent, &disk_key, pslot + 1); - btrfs_mark_buffer_dirty(parent); + btrfs_mark_buffer_dirty(trans, parent);
if (btrfs_header_nritems(mid) <= orig_slot) { path->nodes[level] = right; @@ -2678,7 +2678,8 @@ int btrfs_get_next_valid_item(struct btrfs_root *root, struct btrfs_key *key, * higher levels * */ -static void fixup_low_keys(struct btrfs_path *path, +static void fixup_low_keys(struct btrfs_trans_handle *trans, + struct btrfs_path *path, struct btrfs_disk_key *key, int level) { int i; @@ -2695,7 +2696,7 @@ static void fixup_low_keys(struct btrfs_path *path, BTRFS_MOD_LOG_KEY_REPLACE); BUG_ON(ret < 0); btrfs_set_node_key(t, key, tslot); - btrfs_mark_buffer_dirty(path->nodes[i]); + btrfs_mark_buffer_dirty(trans, path->nodes[i]); if (tslot != 0) break; } @@ -2707,10 +2708,11 @@ static void fixup_low_keys(struct btrfs_path *path, * This function isn't completely safe. It's the caller's responsibility * that the new key won't break the order */ -void btrfs_set_item_key_safe(struct btrfs_fs_info *fs_info, +void btrfs_set_item_key_safe(struct btrfs_trans_handle *trans, struct btrfs_path *path, const struct btrfs_key *new_key) { + struct btrfs_fs_info *fs_info = trans->fs_info; struct btrfs_disk_key disk_key; struct extent_buffer *eb; int slot; @@ -2748,9 +2750,9 @@ void btrfs_set_item_key_safe(struct btrfs_fs_info *fs_info,
btrfs_cpu_key_to_disk(&disk_key, new_key); btrfs_set_item_key(eb, &disk_key, slot); - btrfs_mark_buffer_dirty(eb); + btrfs_mark_buffer_dirty(trans, eb); if (slot == 0) - fixup_low_keys(path, &disk_key, 1); + fixup_low_keys(trans, path, &disk_key, 1); }
/* @@ -2881,8 +2883,8 @@ static int push_node_left(struct btrfs_trans_handle *trans, } btrfs_set_header_nritems(src, src_nritems - push_items); btrfs_set_header_nritems(dst, dst_nritems + push_items); - btrfs_mark_buffer_dirty(src); - btrfs_mark_buffer_dirty(dst); + btrfs_mark_buffer_dirty(trans, src); + btrfs_mark_buffer_dirty(trans, dst);
return ret; } @@ -2957,8 +2959,8 @@ static int balance_node_right(struct btrfs_trans_handle *trans, btrfs_set_header_nritems(src, src_nritems - push_items); btrfs_set_header_nritems(dst, dst_nritems + push_items);
- btrfs_mark_buffer_dirty(src); - btrfs_mark_buffer_dirty(dst); + btrfs_mark_buffer_dirty(trans, src); + btrfs_mark_buffer_dirty(trans, dst);
return ret; } @@ -3007,7 +3009,7 @@ static noinline int insert_new_root(struct btrfs_trans_handle *trans,
btrfs_set_node_ptr_generation(c, 0, lower_gen);
- btrfs_mark_buffer_dirty(c); + btrfs_mark_buffer_dirty(trans, c);
old = root->node; ret = btrfs_tree_mod_log_insert_root(root->node, c, false); @@ -3079,7 +3081,7 @@ static int insert_ptr(struct btrfs_trans_handle *trans, WARN_ON(trans->transid == 0); btrfs_set_node_ptr_generation(lower, slot, trans->transid); btrfs_set_header_nritems(lower, nritems + 1); - btrfs_mark_buffer_dirty(lower); + btrfs_mark_buffer_dirty(trans, lower);
return 0; } @@ -3158,8 +3160,8 @@ static noinline int split_node(struct btrfs_trans_handle *trans, btrfs_set_header_nritems(split, c_nritems - mid); btrfs_set_header_nritems(c, mid);
- btrfs_mark_buffer_dirty(c); - btrfs_mark_buffer_dirty(split); + btrfs_mark_buffer_dirty(trans, c); + btrfs_mark_buffer_dirty(trans, split);
ret = insert_ptr(trans, path, &disk_key, split->start, path->slots[level + 1] + 1, level + 1); @@ -3325,15 +3327,15 @@ static noinline int __push_leaf_right(struct btrfs_trans_handle *trans, btrfs_set_header_nritems(left, left_nritems);
if (left_nritems) - btrfs_mark_buffer_dirty(left); + btrfs_mark_buffer_dirty(trans, left); else btrfs_clear_buffer_dirty(trans, left);
- btrfs_mark_buffer_dirty(right); + btrfs_mark_buffer_dirty(trans, right);
btrfs_item_key(right, &disk_key, 0); btrfs_set_node_key(upper, &disk_key, slot + 1); - btrfs_mark_buffer_dirty(upper); + btrfs_mark_buffer_dirty(trans, upper);
/* then fixup the leaf pointer in the path */ if (path->slots[0] >= left_nritems) { @@ -3545,14 +3547,14 @@ static noinline int __push_leaf_left(struct btrfs_trans_handle *trans, btrfs_set_token_item_offset(&token, i, push_space); }
- btrfs_mark_buffer_dirty(left); + btrfs_mark_buffer_dirty(trans, left); if (right_nritems) - btrfs_mark_buffer_dirty(right); + btrfs_mark_buffer_dirty(trans, right); else btrfs_clear_buffer_dirty(trans, right);
btrfs_item_key(right, &disk_key, 0); - fixup_low_keys(path, &disk_key, 1); + fixup_low_keys(trans, path, &disk_key, 1);
/* then fixup the leaf pointer in the path */ if (path->slots[0] < push_items) { @@ -3683,8 +3685,8 @@ static noinline int copy_for_split(struct btrfs_trans_handle *trans, if (ret < 0) return ret;
- btrfs_mark_buffer_dirty(right); - btrfs_mark_buffer_dirty(l); + btrfs_mark_buffer_dirty(trans, right); + btrfs_mark_buffer_dirty(trans, l); BUG_ON(path->slots[0] != slot);
if (mid <= slot) { @@ -3925,7 +3927,7 @@ static noinline int split_leaf(struct btrfs_trans_handle *trans, path->nodes[0] = right; path->slots[0] = 0; if (path->slots[1] == 0) - fixup_low_keys(path, &disk_key, 1); + fixup_low_keys(trans, path, &disk_key, 1); } /* * We create a new leaf 'right' for the required ins_len and @@ -4024,7 +4026,8 @@ static noinline int setup_leaf_for_split(struct btrfs_trans_handle *trans, return ret; }
-static noinline int split_item(struct btrfs_path *path, +static noinline int split_item(struct btrfs_trans_handle *trans, + struct btrfs_path *path, const struct btrfs_key *new_key, unsigned long split_offset) { @@ -4083,7 +4086,7 @@ static noinline int split_item(struct btrfs_path *path, write_extent_buffer(leaf, buf + split_offset, btrfs_item_ptr_offset(leaf, slot), item_size - split_offset); - btrfs_mark_buffer_dirty(leaf); + btrfs_mark_buffer_dirty(trans, leaf);
BUG_ON(btrfs_leaf_free_space(leaf) < 0); kfree(buf); @@ -4117,7 +4120,7 @@ int btrfs_split_item(struct btrfs_trans_handle *trans, if (ret) return ret;
- ret = split_item(path, new_key, split_offset); + ret = split_item(trans, path, new_key, split_offset); return ret; }
@@ -4127,7 +4130,8 @@ int btrfs_split_item(struct btrfs_trans_handle *trans, * off the end of the item or if we shift the item to chop bytes off * the front. */ -void btrfs_truncate_item(struct btrfs_path *path, u32 new_size, int from_end) +void btrfs_truncate_item(struct btrfs_trans_handle *trans, + struct btrfs_path *path, u32 new_size, int from_end) { int slot; struct extent_buffer *leaf; @@ -4203,11 +4207,11 @@ void btrfs_truncate_item(struct btrfs_path *path, u32 new_size, int from_end) btrfs_set_disk_key_offset(&disk_key, offset + size_diff); btrfs_set_item_key(leaf, &disk_key, slot); if (slot == 0) - fixup_low_keys(path, &disk_key, 1); + fixup_low_keys(trans, path, &disk_key, 1); }
btrfs_set_item_size(leaf, slot, new_size); - btrfs_mark_buffer_dirty(leaf); + btrfs_mark_buffer_dirty(trans, leaf);
if (btrfs_leaf_free_space(leaf) < 0) { btrfs_print_leaf(leaf); @@ -4218,7 +4222,8 @@ void btrfs_truncate_item(struct btrfs_path *path, u32 new_size, int from_end) /* * make the item pointed to by the path bigger, data_size is the added size. */ -void btrfs_extend_item(struct btrfs_path *path, u32 data_size) +void btrfs_extend_item(struct btrfs_trans_handle *trans, + struct btrfs_path *path, u32 data_size) { int slot; struct extent_buffer *leaf; @@ -4268,7 +4273,7 @@ void btrfs_extend_item(struct btrfs_path *path, u32 data_size) data_end = old_data; old_size = btrfs_item_size(leaf, slot); btrfs_set_item_size(leaf, slot, old_size + data_size); - btrfs_mark_buffer_dirty(leaf); + btrfs_mark_buffer_dirty(trans, leaf);
if (btrfs_leaf_free_space(leaf) < 0) { btrfs_print_leaf(leaf); @@ -4279,6 +4284,7 @@ void btrfs_extend_item(struct btrfs_path *path, u32 data_size) /* * Make space in the node before inserting one or more items. * + * @trans: transaction handle * @root: root we are inserting items to * @path: points to the leaf/slot where we are going to insert new items * @batch: information about the batch of items to insert @@ -4286,7 +4292,8 @@ void btrfs_extend_item(struct btrfs_path *path, u32 data_size) * Main purpose is to save stack depth by doing the bulk of the work in a * function that doesn't call btrfs_search_slot */ -static void setup_items_for_insert(struct btrfs_root *root, struct btrfs_path *path, +static void setup_items_for_insert(struct btrfs_trans_handle *trans, + struct btrfs_root *root, struct btrfs_path *path, const struct btrfs_item_batch *batch) { struct btrfs_fs_info *fs_info = root->fs_info; @@ -4306,7 +4313,7 @@ static void setup_items_for_insert(struct btrfs_root *root, struct btrfs_path *p */ if (path->slots[0] == 0) { btrfs_cpu_key_to_disk(&disk_key, &batch->keys[0]); - fixup_low_keys(path, &disk_key, 1); + fixup_low_keys(trans, path, &disk_key, 1); } btrfs_unlock_up_safe(path, 1);
@@ -4365,7 +4372,7 @@ static void setup_items_for_insert(struct btrfs_root *root, struct btrfs_path *p }
btrfs_set_header_nritems(leaf, nritems + batch->nr); - btrfs_mark_buffer_dirty(leaf); + btrfs_mark_buffer_dirty(trans, leaf);
if (btrfs_leaf_free_space(leaf) < 0) { btrfs_print_leaf(leaf); @@ -4376,12 +4383,14 @@ static void setup_items_for_insert(struct btrfs_root *root, struct btrfs_path *p /* * Insert a new item into a leaf. * + * @trans: Transaction handle. * @root: The root of the btree. * @path: A path pointing to the target leaf and slot. * @key: The key of the new item. * @data_size: The size of the data associated with the new key. */ -void btrfs_setup_item_for_insert(struct btrfs_root *root, +void btrfs_setup_item_for_insert(struct btrfs_trans_handle *trans, + struct btrfs_root *root, struct btrfs_path *path, const struct btrfs_key *key, u32 data_size) @@ -4393,7 +4402,7 @@ void btrfs_setup_item_for_insert(struct btrfs_root *root, batch.total_data_size = data_size; batch.nr = 1;
- setup_items_for_insert(root, path, &batch); + setup_items_for_insert(trans, root, path, &batch); }
/* @@ -4419,7 +4428,7 @@ int btrfs_insert_empty_items(struct btrfs_trans_handle *trans, slot = path->slots[0]; BUG_ON(slot < 0);
- setup_items_for_insert(root, path, batch); + setup_items_for_insert(trans, root, path, batch); return 0; }
@@ -4444,7 +4453,7 @@ int btrfs_insert_item(struct btrfs_trans_handle *trans, struct btrfs_root *root, leaf = path->nodes[0]; ptr = btrfs_item_ptr_offset(leaf, path->slots[0]); write_extent_buffer(leaf, data, ptr, data_size); - btrfs_mark_buffer_dirty(leaf); + btrfs_mark_buffer_dirty(trans, leaf); } btrfs_free_path(path); return ret; @@ -4475,7 +4484,7 @@ int btrfs_duplicate_item(struct btrfs_trans_handle *trans, return ret;
path->slots[0]++; - btrfs_setup_item_for_insert(root, path, new_key, item_size); + btrfs_setup_item_for_insert(trans, root, path, new_key, item_size); leaf = path->nodes[0]; memcpy_extent_buffer(leaf, btrfs_item_ptr_offset(leaf, path->slots[0]), @@ -4533,9 +4542,9 @@ int btrfs_del_ptr(struct btrfs_trans_handle *trans, struct btrfs_root *root, struct btrfs_disk_key disk_key;
btrfs_node_key(parent, &disk_key, 0); - fixup_low_keys(path, &disk_key, level + 1); + fixup_low_keys(trans, path, &disk_key, level + 1); } - btrfs_mark_buffer_dirty(parent); + btrfs_mark_buffer_dirty(trans, parent); return 0; }
@@ -4632,7 +4641,7 @@ int btrfs_del_items(struct btrfs_trans_handle *trans, struct btrfs_root *root, struct btrfs_disk_key disk_key;
btrfs_item_key(leaf, &disk_key, 0); - fixup_low_keys(path, &disk_key, 1); + fixup_low_keys(trans, path, &disk_key, 1); }
/* @@ -4697,11 +4706,11 @@ int btrfs_del_items(struct btrfs_trans_handle *trans, struct btrfs_root *root, * dirtied this buffer */ if (path->nodes[0] == leaf) - btrfs_mark_buffer_dirty(leaf); + btrfs_mark_buffer_dirty(trans, leaf); free_extent_buffer(leaf); } } else { - btrfs_mark_buffer_dirty(leaf); + btrfs_mark_buffer_dirty(trans, leaf); } } return ret; diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h index ff40acd63a374..06333a74d6c4c 100644 --- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -518,7 +518,7 @@ int btrfs_previous_item(struct btrfs_root *root, int type); int btrfs_previous_extent_item(struct btrfs_root *root, struct btrfs_path *path, u64 min_objectid); -void btrfs_set_item_key_safe(struct btrfs_fs_info *fs_info, +void btrfs_set_item_key_safe(struct btrfs_trans_handle *trans, struct btrfs_path *path, const struct btrfs_key *new_key); struct extent_buffer *btrfs_root_node(struct btrfs_root *root); @@ -545,8 +545,10 @@ int btrfs_block_can_be_shared(struct btrfs_trans_handle *trans, struct extent_buffer *buf); int btrfs_del_ptr(struct btrfs_trans_handle *trans, struct btrfs_root *root, struct btrfs_path *path, int level, int slot); -void btrfs_extend_item(struct btrfs_path *path, u32 data_size); -void btrfs_truncate_item(struct btrfs_path *path, u32 new_size, int from_end); +void btrfs_extend_item(struct btrfs_trans_handle *trans, + struct btrfs_path *path, u32 data_size); +void btrfs_truncate_item(struct btrfs_trans_handle *trans, + struct btrfs_path *path, u32 new_size, int from_end); int btrfs_split_item(struct btrfs_trans_handle *trans, struct btrfs_root *root, struct btrfs_path *path, @@ -610,7 +612,8 @@ struct btrfs_item_batch { int nr; };
-void btrfs_setup_item_for_insert(struct btrfs_root *root, +void btrfs_setup_item_for_insert(struct btrfs_trans_handle *trans, + struct btrfs_root *root, struct btrfs_path *path, const struct btrfs_key *key, u32 data_size); diff --git a/fs/btrfs/delayed-inode.c b/fs/btrfs/delayed-inode.c index 142e0a0f6a9fe..5d3229b42b3e2 100644 --- a/fs/btrfs/delayed-inode.c +++ b/fs/btrfs/delayed-inode.c @@ -1030,7 +1030,7 @@ static int __btrfs_update_delayed_inode(struct btrfs_trans_handle *trans, struct btrfs_inode_item); write_extent_buffer(leaf, &node->inode_item, (unsigned long)inode_item, sizeof(struct btrfs_inode_item)); - btrfs_mark_buffer_dirty(leaf); + btrfs_mark_buffer_dirty(trans, leaf);
if (!test_bit(BTRFS_DELAYED_NODE_DEL_IREF, &node->flags)) goto out; diff --git a/fs/btrfs/dev-replace.c b/fs/btrfs/dev-replace.c index 5f10965fd72bf..5549cbd9bdf6a 100644 --- a/fs/btrfs/dev-replace.c +++ b/fs/btrfs/dev-replace.c @@ -442,7 +442,7 @@ int btrfs_run_dev_replace(struct btrfs_trans_handle *trans) dev_replace->item_needs_writeback = 0; up_write(&dev_replace->rwsem);
- btrfs_mark_buffer_dirty(eb); + btrfs_mark_buffer_dirty(trans, eb);
out: btrfs_free_path(path); diff --git a/fs/btrfs/dir-item.c b/fs/btrfs/dir-item.c index 082eb0e195981..9c07d5c3e5ad2 100644 --- a/fs/btrfs/dir-item.c +++ b/fs/btrfs/dir-item.c @@ -38,7 +38,7 @@ static struct btrfs_dir_item *insert_with_overflow(struct btrfs_trans_handle di = btrfs_match_dir_item_name(fs_info, path, name, name_len); if (di) return ERR_PTR(-EEXIST); - btrfs_extend_item(path, data_size); + btrfs_extend_item(trans, path, data_size); } else if (ret < 0) return ERR_PTR(ret); WARN_ON(ret > 0); @@ -93,7 +93,7 @@ int btrfs_insert_xattr_item(struct btrfs_trans_handle *trans,
write_extent_buffer(leaf, name, name_ptr, name_len); write_extent_buffer(leaf, data, data_ptr, data_len); - btrfs_mark_buffer_dirty(path->nodes[0]); + btrfs_mark_buffer_dirty(trans, path->nodes[0]);
return ret; } @@ -153,7 +153,7 @@ int btrfs_insert_dir_item(struct btrfs_trans_handle *trans, name_ptr = (unsigned long)(dir_item + 1);
write_extent_buffer(leaf, name->name, name_ptr, name->len); - btrfs_mark_buffer_dirty(leaf); + btrfs_mark_buffer_dirty(trans, leaf);
second_insert: /* FIXME, use some real flag for selecting the extra index */ @@ -439,7 +439,7 @@ int btrfs_delete_one_dir_name(struct btrfs_trans_handle *trans, start = btrfs_item_ptr_offset(leaf, path->slots[0]); memmove_extent_buffer(leaf, ptr, ptr + sub_item_len, item_len - (ptr + sub_item_len - start)); - btrfs_truncate_item(path, item_len - sub_item_len, 1); + btrfs_truncate_item(trans, path, item_len - sub_item_len, 1); } return ret; } diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index 681594df7334f..1ae781f533582 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -872,7 +872,7 @@ struct btrfs_root *btrfs_create_tree(struct btrfs_trans_handle *trans, }
root->node = leaf; - btrfs_mark_buffer_dirty(leaf); + btrfs_mark_buffer_dirty(trans, leaf);
root->commit_root = btrfs_root_node(root); set_bit(BTRFS_ROOT_TRACK_DIRTY, &root->state); @@ -947,7 +947,7 @@ int btrfs_alloc_log_tree_node(struct btrfs_trans_handle *trans,
root->node = leaf;
- btrfs_mark_buffer_dirty(root->node); + btrfs_mark_buffer_dirty(trans, root->node); btrfs_tree_unlock(root->node);
return 0; @@ -4426,7 +4426,8 @@ void __cold close_ctree(struct btrfs_fs_info *fs_info) btrfs_close_devices(fs_info->fs_devices); }
-void btrfs_mark_buffer_dirty(struct extent_buffer *buf) +void btrfs_mark_buffer_dirty(struct btrfs_trans_handle *trans, + struct extent_buffer *buf) { struct btrfs_fs_info *fs_info = buf->fs_info; u64 transid = btrfs_header_generation(buf); @@ -4440,10 +4441,14 @@ void btrfs_mark_buffer_dirty(struct extent_buffer *buf) if (unlikely(test_bit(EXTENT_BUFFER_UNMAPPED, &buf->bflags))) return; #endif + /* This is an active transaction (its state < TRANS_STATE_UNBLOCKED). */ + ASSERT(trans->transid == fs_info->generation); btrfs_assert_tree_write_locked(buf); - if (transid != fs_info->generation) + if (transid != fs_info->generation) { WARN(1, KERN_CRIT "btrfs transid mismatch buffer %llu, found %llu running %llu\n", buf->start, transid, fs_info->generation); + btrfs_abort_transaction(trans, -EUCLEAN); + } set_extent_buffer_dirty(buf); #ifdef CONFIG_BTRFS_FS_CHECK_INTEGRITY /* diff --git a/fs/btrfs/disk-io.h b/fs/btrfs/disk-io.h index b03767f4d7edf..e5bdb96912438 100644 --- a/fs/btrfs/disk-io.h +++ b/fs/btrfs/disk-io.h @@ -105,7 +105,8 @@ static inline struct btrfs_root *btrfs_grab_root(struct btrfs_root *root) }
void btrfs_put_root(struct btrfs_root *root); -void btrfs_mark_buffer_dirty(struct extent_buffer *buf); +void btrfs_mark_buffer_dirty(struct btrfs_trans_handle *trans, + struct extent_buffer *buf); int btrfs_buffer_uptodate(struct extent_buffer *buf, u64 parent_transid, int atomic); int btrfs_read_extent_buffer(struct extent_buffer *buf, diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c index 14ea6b587e97b..118c56c512bd8 100644 --- a/fs/btrfs/extent-tree.c +++ b/fs/btrfs/extent-tree.c @@ -596,7 +596,7 @@ static noinline int insert_extent_data_ref(struct btrfs_trans_handle *trans, btrfs_set_extent_data_ref_count(leaf, ref, num_refs); } } - btrfs_mark_buffer_dirty(leaf); + btrfs_mark_buffer_dirty(trans, leaf); ret = 0; fail: btrfs_release_path(path); @@ -644,7 +644,7 @@ static noinline int remove_extent_data_ref(struct btrfs_trans_handle *trans, btrfs_set_extent_data_ref_count(leaf, ref1, num_refs); else if (key.type == BTRFS_SHARED_DATA_REF_KEY) btrfs_set_shared_data_ref_count(leaf, ref2, num_refs); - btrfs_mark_buffer_dirty(leaf); + btrfs_mark_buffer_dirty(trans, leaf); } return ret; } @@ -997,7 +997,7 @@ int lookup_inline_extent_backref(struct btrfs_trans_handle *trans, * helper to add new inline back ref */ static noinline_for_stack -void setup_inline_extent_backref(struct btrfs_fs_info *fs_info, +void setup_inline_extent_backref(struct btrfs_trans_handle *trans, struct btrfs_path *path, struct btrfs_extent_inline_ref *iref, u64 parent, u64 root_objectid, @@ -1020,7 +1020,7 @@ void setup_inline_extent_backref(struct btrfs_fs_info *fs_info, type = extent_ref_type(parent, owner); size = btrfs_extent_inline_ref_size(type);
- btrfs_extend_item(path, size); + btrfs_extend_item(trans, path, size);
ei = btrfs_item_ptr(leaf, path->slots[0], struct btrfs_extent_item); refs = btrfs_extent_refs(leaf, ei); @@ -1054,7 +1054,7 @@ void setup_inline_extent_backref(struct btrfs_fs_info *fs_info, } else { btrfs_set_extent_inline_ref_offset(leaf, iref, root_objectid); } - btrfs_mark_buffer_dirty(leaf); + btrfs_mark_buffer_dirty(trans, leaf); }
static int lookup_extent_backref(struct btrfs_trans_handle *trans, @@ -1087,7 +1087,9 @@ static int lookup_extent_backref(struct btrfs_trans_handle *trans, /* * helper to update/remove inline back ref */ -static noinline_for_stack int update_inline_extent_backref(struct btrfs_path *path, +static noinline_for_stack int update_inline_extent_backref( + struct btrfs_trans_handle *trans, + struct btrfs_path *path, struct btrfs_extent_inline_ref *iref, int refs_to_mod, struct btrfs_delayed_extent_op *extent_op) @@ -1195,9 +1197,9 @@ static noinline_for_stack int update_inline_extent_backref(struct btrfs_path *pa memmove_extent_buffer(leaf, ptr, ptr + size, end - ptr - size); item_size -= size; - btrfs_truncate_item(path, item_size, 1); + btrfs_truncate_item(trans, path, item_size, 1); } - btrfs_mark_buffer_dirty(leaf); + btrfs_mark_buffer_dirty(trans, leaf); return 0; }
@@ -1227,9 +1229,10 @@ int insert_inline_extent_backref(struct btrfs_trans_handle *trans, bytenr, num_bytes, root_objectid, path->slots[0]); return -EUCLEAN; } - ret = update_inline_extent_backref(path, iref, refs_to_add, extent_op); + ret = update_inline_extent_backref(trans, path, iref, + refs_to_add, extent_op); } else if (ret == -ENOENT) { - setup_inline_extent_backref(trans->fs_info, path, iref, parent, + setup_inline_extent_backref(trans, path, iref, parent, root_objectid, owner, offset, refs_to_add, extent_op); ret = 0; @@ -1247,7 +1250,8 @@ static int remove_extent_backref(struct btrfs_trans_handle *trans,
BUG_ON(!is_data && refs_to_drop != 1); if (iref) - ret = update_inline_extent_backref(path, iref, -refs_to_drop, NULL); + ret = update_inline_extent_backref(trans, path, iref, + -refs_to_drop, NULL); else if (is_data) ret = remove_extent_data_ref(trans, root, path, refs_to_drop); else @@ -1531,7 +1535,7 @@ static int __btrfs_inc_extent_ref(struct btrfs_trans_handle *trans, if (extent_op) __run_delayed_extent_op(extent_op, leaf, item);
- btrfs_mark_buffer_dirty(leaf); + btrfs_mark_buffer_dirty(trans, leaf); btrfs_release_path(path);
/* now insert the actual backref */ @@ -1697,7 +1701,7 @@ static int run_delayed_extent_op(struct btrfs_trans_handle *trans, ei = btrfs_item_ptr(leaf, path->slots[0], struct btrfs_extent_item); __run_delayed_extent_op(extent_op, leaf, ei);
- btrfs_mark_buffer_dirty(leaf); + btrfs_mark_buffer_dirty(trans, leaf); out: btrfs_free_path(path); return err; @@ -3171,7 +3175,7 @@ static int __btrfs_free_extent(struct btrfs_trans_handle *trans, } } else { btrfs_set_extent_refs(leaf, ei, refs); - btrfs_mark_buffer_dirty(leaf); + btrfs_mark_buffer_dirty(trans, leaf); } if (found_extent) { ret = remove_extent_backref(trans, extent_root, path, @@ -4679,7 +4683,7 @@ static int alloc_reserved_file_extent(struct btrfs_trans_handle *trans, btrfs_set_extent_data_ref_count(leaf, ref, ref_mod); }
- btrfs_mark_buffer_dirty(path->nodes[0]); + btrfs_mark_buffer_dirty(trans, path->nodes[0]); btrfs_free_path(path);
return alloc_reserved_extent(trans, ins->objectid, ins->offset); @@ -4754,7 +4758,7 @@ static int alloc_reserved_tree_block(struct btrfs_trans_handle *trans, btrfs_set_extent_inline_ref_offset(leaf, iref, ref->root); }
- btrfs_mark_buffer_dirty(leaf); + btrfs_mark_buffer_dirty(trans, leaf); btrfs_free_path(path);
return alloc_reserved_extent(trans, node->bytenr, fs_info->nodesize); diff --git a/fs/btrfs/file-item.c b/fs/btrfs/file-item.c index 1ce5dd1544995..45cae356e89ba 100644 --- a/fs/btrfs/file-item.c +++ b/fs/btrfs/file-item.c @@ -194,7 +194,7 @@ int btrfs_insert_hole_extent(struct btrfs_trans_handle *trans, btrfs_set_file_extent_encryption(leaf, item, 0); btrfs_set_file_extent_other_encoding(leaf, item, 0);
- btrfs_mark_buffer_dirty(leaf); + btrfs_mark_buffer_dirty(trans, leaf); out: btrfs_free_path(path); return ret; @@ -811,11 +811,12 @@ blk_status_t btrfs_alloc_dummy_sum(struct btrfs_bio *bbio) * This calls btrfs_truncate_item with the correct args based on the overlap, * and fixes up the key as required. */ -static noinline void truncate_one_csum(struct btrfs_fs_info *fs_info, +static noinline void truncate_one_csum(struct btrfs_trans_handle *trans, struct btrfs_path *path, struct btrfs_key *key, u64 bytenr, u64 len) { + struct btrfs_fs_info *fs_info = trans->fs_info; struct extent_buffer *leaf; const u32 csum_size = fs_info->csum_size; u64 csum_end; @@ -836,7 +837,7 @@ static noinline void truncate_one_csum(struct btrfs_fs_info *fs_info, */ u32 new_size = (bytenr - key->offset) >> blocksize_bits; new_size *= csum_size; - btrfs_truncate_item(path, new_size, 1); + btrfs_truncate_item(trans, path, new_size, 1); } else if (key->offset >= bytenr && csum_end > end_byte && end_byte > key->offset) { /* @@ -848,10 +849,10 @@ static noinline void truncate_one_csum(struct btrfs_fs_info *fs_info, u32 new_size = (csum_end - end_byte) >> blocksize_bits; new_size *= csum_size;
- btrfs_truncate_item(path, new_size, 0); + btrfs_truncate_item(trans, path, new_size, 0);
key->offset = end_byte; - btrfs_set_item_key_safe(fs_info, path, key); + btrfs_set_item_key_safe(trans, path, key); } else { BUG(); } @@ -994,7 +995,7 @@ int btrfs_del_csums(struct btrfs_trans_handle *trans,
key.offset = end_byte - 1; } else { - truncate_one_csum(fs_info, path, &key, bytenr, len); + truncate_one_csum(trans, path, &key, bytenr, len); if (key.offset < bytenr) break; } @@ -1202,7 +1203,7 @@ int btrfs_csum_file_blocks(struct btrfs_trans_handle *trans, diff /= csum_size; diff *= csum_size;
- btrfs_extend_item(path, diff); + btrfs_extend_item(trans, path, diff); ret = 0; goto csum; } @@ -1249,7 +1250,7 @@ int btrfs_csum_file_blocks(struct btrfs_trans_handle *trans, ins_size /= csum_size; total_bytes += ins_size * fs_info->sectorsize;
- btrfs_mark_buffer_dirty(path->nodes[0]); + btrfs_mark_buffer_dirty(trans, path->nodes[0]); if (total_bytes < sums->len) { btrfs_release_path(path); cond_resched(); diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c index eae9175f2c29b..a407af38a9237 100644 --- a/fs/btrfs/file.c +++ b/fs/btrfs/file.c @@ -368,7 +368,7 @@ int btrfs_drop_extents(struct btrfs_trans_handle *trans, btrfs_set_file_extent_offset(leaf, fi, extent_offset); btrfs_set_file_extent_num_bytes(leaf, fi, extent_end - args->start); - btrfs_mark_buffer_dirty(leaf); + btrfs_mark_buffer_dirty(trans, leaf);
if (update_refs && disk_bytenr > 0) { btrfs_init_generic_ref(&ref, @@ -405,13 +405,13 @@ int btrfs_drop_extents(struct btrfs_trans_handle *trans,
memcpy(&new_key, &key, sizeof(new_key)); new_key.offset = args->end; - btrfs_set_item_key_safe(fs_info, path, &new_key); + btrfs_set_item_key_safe(trans, path, &new_key);
extent_offset += args->end - key.offset; btrfs_set_file_extent_offset(leaf, fi, extent_offset); btrfs_set_file_extent_num_bytes(leaf, fi, extent_end - args->end); - btrfs_mark_buffer_dirty(leaf); + btrfs_mark_buffer_dirty(trans, leaf); if (update_refs && disk_bytenr > 0) args->bytes_found += args->end - key.offset; break; @@ -431,7 +431,7 @@ int btrfs_drop_extents(struct btrfs_trans_handle *trans,
btrfs_set_file_extent_num_bytes(leaf, fi, args->start - key.offset); - btrfs_mark_buffer_dirty(leaf); + btrfs_mark_buffer_dirty(trans, leaf); if (update_refs && disk_bytenr > 0) args->bytes_found += extent_end - args->start; if (args->end == extent_end) @@ -536,7 +536,8 @@ int btrfs_drop_extents(struct btrfs_trans_handle *trans, if (btrfs_comp_cpu_keys(&key, &slot_key) > 0) path->slots[0]++; } - btrfs_setup_item_for_insert(root, path, &key, args->extent_item_size); + btrfs_setup_item_for_insert(trans, root, path, &key, + args->extent_item_size); args->extent_inserted = true; }
@@ -593,7 +594,6 @@ static int extent_mergeable(struct extent_buffer *leaf, int slot, int btrfs_mark_extent_written(struct btrfs_trans_handle *trans, struct btrfs_inode *inode, u64 start, u64 end) { - struct btrfs_fs_info *fs_info = trans->fs_info; struct btrfs_root *root = inode->root; struct extent_buffer *leaf; struct btrfs_path *path; @@ -664,7 +664,7 @@ int btrfs_mark_extent_written(struct btrfs_trans_handle *trans, ino, bytenr, orig_offset, &other_start, &other_end)) { new_key.offset = end; - btrfs_set_item_key_safe(fs_info, path, &new_key); + btrfs_set_item_key_safe(trans, path, &new_key); fi = btrfs_item_ptr(leaf, path->slots[0], struct btrfs_file_extent_item); btrfs_set_file_extent_generation(leaf, fi, @@ -679,7 +679,7 @@ int btrfs_mark_extent_written(struct btrfs_trans_handle *trans, trans->transid); btrfs_set_file_extent_num_bytes(leaf, fi, end - other_start); - btrfs_mark_buffer_dirty(leaf); + btrfs_mark_buffer_dirty(trans, leaf); goto out; } } @@ -698,7 +698,7 @@ int btrfs_mark_extent_written(struct btrfs_trans_handle *trans, trans->transid); path->slots[0]++; new_key.offset = start; - btrfs_set_item_key_safe(fs_info, path, &new_key); + btrfs_set_item_key_safe(trans, path, &new_key);
fi = btrfs_item_ptr(leaf, path->slots[0], struct btrfs_file_extent_item); @@ -708,7 +708,7 @@ int btrfs_mark_extent_written(struct btrfs_trans_handle *trans, other_end - start); btrfs_set_file_extent_offset(leaf, fi, start - orig_offset); - btrfs_mark_buffer_dirty(leaf); + btrfs_mark_buffer_dirty(trans, leaf); goto out; } } @@ -742,7 +742,7 @@ int btrfs_mark_extent_written(struct btrfs_trans_handle *trans, btrfs_set_file_extent_offset(leaf, fi, split - orig_offset); btrfs_set_file_extent_num_bytes(leaf, fi, extent_end - split); - btrfs_mark_buffer_dirty(leaf); + btrfs_mark_buffer_dirty(trans, leaf);
btrfs_init_generic_ref(&ref, BTRFS_ADD_DELAYED_REF, bytenr, num_bytes, 0); @@ -814,7 +814,7 @@ int btrfs_mark_extent_written(struct btrfs_trans_handle *trans, btrfs_set_file_extent_type(leaf, fi, BTRFS_FILE_EXTENT_REG); btrfs_set_file_extent_generation(leaf, fi, trans->transid); - btrfs_mark_buffer_dirty(leaf); + btrfs_mark_buffer_dirty(trans, leaf); } else { fi = btrfs_item_ptr(leaf, del_slot - 1, struct btrfs_file_extent_item); @@ -823,7 +823,7 @@ int btrfs_mark_extent_written(struct btrfs_trans_handle *trans, btrfs_set_file_extent_generation(leaf, fi, trans->transid); btrfs_set_file_extent_num_bytes(leaf, fi, extent_end - key.offset); - btrfs_mark_buffer_dirty(leaf); + btrfs_mark_buffer_dirty(trans, leaf);
ret = btrfs_del_items(trans, root, path, del_slot, del_nr); if (ret < 0) { @@ -2103,7 +2103,7 @@ static int fill_holes(struct btrfs_trans_handle *trans, btrfs_set_file_extent_ram_bytes(leaf, fi, num_bytes); btrfs_set_file_extent_offset(leaf, fi, 0); btrfs_set_file_extent_generation(leaf, fi, trans->transid); - btrfs_mark_buffer_dirty(leaf); + btrfs_mark_buffer_dirty(trans, leaf); goto out; }
@@ -2111,7 +2111,7 @@ static int fill_holes(struct btrfs_trans_handle *trans, u64 num_bytes;
key.offset = offset; - btrfs_set_item_key_safe(fs_info, path, &key); + btrfs_set_item_key_safe(trans, path, &key); fi = btrfs_item_ptr(leaf, path->slots[0], struct btrfs_file_extent_item); num_bytes = btrfs_file_extent_num_bytes(leaf, fi) + end - @@ -2120,7 +2120,7 @@ static int fill_holes(struct btrfs_trans_handle *trans, btrfs_set_file_extent_ram_bytes(leaf, fi, num_bytes); btrfs_set_file_extent_offset(leaf, fi, 0); btrfs_set_file_extent_generation(leaf, fi, trans->transid); - btrfs_mark_buffer_dirty(leaf); + btrfs_mark_buffer_dirty(trans, leaf); goto out; } btrfs_release_path(path); @@ -2272,7 +2272,7 @@ static int btrfs_insert_replace_extent(struct btrfs_trans_handle *trans, btrfs_set_file_extent_num_bytes(leaf, extent, replace_len); if (extent_info->is_new_extent) btrfs_set_file_extent_generation(leaf, extent, trans->transid); - btrfs_mark_buffer_dirty(leaf); + btrfs_mark_buffer_dirty(trans, leaf); btrfs_release_path(path);
ret = btrfs_inode_set_file_extent_range(inode, extent_info->file_offset, diff --git a/fs/btrfs/free-space-cache.c b/fs/btrfs/free-space-cache.c index 8808004180759..6b7383ae5a70c 100644 --- a/fs/btrfs/free-space-cache.c +++ b/fs/btrfs/free-space-cache.c @@ -195,7 +195,7 @@ static int __create_free_space_inode(struct btrfs_root *root, btrfs_set_inode_nlink(leaf, inode_item, 1); btrfs_set_inode_transid(leaf, inode_item, trans->transid); btrfs_set_inode_block_group(leaf, inode_item, offset); - btrfs_mark_buffer_dirty(leaf); + btrfs_mark_buffer_dirty(trans, leaf); btrfs_release_path(path);
key.objectid = BTRFS_FREE_SPACE_OBJECTID; @@ -213,7 +213,7 @@ static int __create_free_space_inode(struct btrfs_root *root, struct btrfs_free_space_header); memzero_extent_buffer(leaf, (unsigned long)header, sizeof(*header)); btrfs_set_free_space_key(leaf, header, &disk_key); - btrfs_mark_buffer_dirty(leaf); + btrfs_mark_buffer_dirty(trans, leaf); btrfs_release_path(path);
return 0; @@ -1185,7 +1185,7 @@ update_cache_item(struct btrfs_trans_handle *trans, btrfs_set_free_space_entries(leaf, header, entries); btrfs_set_free_space_bitmaps(leaf, header, bitmaps); btrfs_set_free_space_generation(leaf, header, trans->transid); - btrfs_mark_buffer_dirty(leaf); + btrfs_mark_buffer_dirty(trans, leaf); btrfs_release_path(path);
return 0; diff --git a/fs/btrfs/free-space-tree.c b/fs/btrfs/free-space-tree.c index f169378e2ca6e..ae060a26e1191 100644 --- a/fs/btrfs/free-space-tree.c +++ b/fs/btrfs/free-space-tree.c @@ -89,7 +89,7 @@ static int add_new_free_space_info(struct btrfs_trans_handle *trans, struct btrfs_free_space_info); btrfs_set_free_space_extent_count(leaf, info, 0); btrfs_set_free_space_flags(leaf, info, 0); - btrfs_mark_buffer_dirty(leaf); + btrfs_mark_buffer_dirty(trans, leaf);
ret = 0; out: @@ -287,7 +287,7 @@ int convert_free_space_to_bitmaps(struct btrfs_trans_handle *trans, flags |= BTRFS_FREE_SPACE_USING_BITMAPS; btrfs_set_free_space_flags(leaf, info, flags); expected_extent_count = btrfs_free_space_extent_count(leaf, info); - btrfs_mark_buffer_dirty(leaf); + btrfs_mark_buffer_dirty(trans, leaf); btrfs_release_path(path);
if (extent_count != expected_extent_count) { @@ -324,7 +324,7 @@ int convert_free_space_to_bitmaps(struct btrfs_trans_handle *trans, ptr = btrfs_item_ptr_offset(leaf, path->slots[0]); write_extent_buffer(leaf, bitmap_cursor, ptr, data_size); - btrfs_mark_buffer_dirty(leaf); + btrfs_mark_buffer_dirty(trans, leaf); btrfs_release_path(path);
i += extent_size; @@ -430,7 +430,7 @@ int convert_free_space_to_extents(struct btrfs_trans_handle *trans, flags &= ~BTRFS_FREE_SPACE_USING_BITMAPS; btrfs_set_free_space_flags(leaf, info, flags); expected_extent_count = btrfs_free_space_extent_count(leaf, info); - btrfs_mark_buffer_dirty(leaf); + btrfs_mark_buffer_dirty(trans, leaf); btrfs_release_path(path);
nrbits = block_group->length >> block_group->fs_info->sectorsize_bits; @@ -495,7 +495,7 @@ static int update_free_space_extent_count(struct btrfs_trans_handle *trans,
extent_count += new_extents; btrfs_set_free_space_extent_count(path->nodes[0], info, extent_count); - btrfs_mark_buffer_dirty(path->nodes[0]); + btrfs_mark_buffer_dirty(trans, path->nodes[0]); btrfs_release_path(path);
if (!(flags & BTRFS_FREE_SPACE_USING_BITMAPS) && @@ -533,7 +533,8 @@ int free_space_test_bit(struct btrfs_block_group *block_group, return !!extent_buffer_test_bit(leaf, ptr, i); }
-static void free_space_set_bits(struct btrfs_block_group *block_group, +static void free_space_set_bits(struct btrfs_trans_handle *trans, + struct btrfs_block_group *block_group, struct btrfs_path *path, u64 *start, u64 *size, int bit) { @@ -563,7 +564,7 @@ static void free_space_set_bits(struct btrfs_block_group *block_group, extent_buffer_bitmap_set(leaf, ptr, first, last - first); else extent_buffer_bitmap_clear(leaf, ptr, first, last - first); - btrfs_mark_buffer_dirty(leaf); + btrfs_mark_buffer_dirty(trans, leaf);
*size -= end - *start; *start = end; @@ -656,7 +657,7 @@ static int modify_free_space_bitmap(struct btrfs_trans_handle *trans, cur_start = start; cur_size = size; while (1) { - free_space_set_bits(block_group, path, &cur_start, &cur_size, + free_space_set_bits(trans, block_group, path, &cur_start, &cur_size, !remove); if (cur_size == 0) break; diff --git a/fs/btrfs/inode-item.c b/fs/btrfs/inode-item.c index 4c322b720a80a..d3ff97374d48a 100644 --- a/fs/btrfs/inode-item.c +++ b/fs/btrfs/inode-item.c @@ -167,7 +167,7 @@ static int btrfs_del_inode_extref(struct btrfs_trans_handle *trans, memmove_extent_buffer(leaf, ptr, ptr + del_len, item_size - (ptr + del_len - item_start));
- btrfs_truncate_item(path, item_size - del_len, 1); + btrfs_truncate_item(trans, path, item_size - del_len, 1);
out: btrfs_free_path(path); @@ -229,7 +229,7 @@ int btrfs_del_inode_ref(struct btrfs_trans_handle *trans, item_start = btrfs_item_ptr_offset(leaf, path->slots[0]); memmove_extent_buffer(leaf, ptr, ptr + sub_item_len, item_size - (ptr + sub_item_len - item_start)); - btrfs_truncate_item(path, item_size - sub_item_len, 1); + btrfs_truncate_item(trans, path, item_size - sub_item_len, 1); out: btrfs_free_path(path);
@@ -282,7 +282,7 @@ static int btrfs_insert_inode_extref(struct btrfs_trans_handle *trans, name)) goto out;
- btrfs_extend_item(path, ins_len); + btrfs_extend_item(trans, path, ins_len); ret = 0; } if (ret < 0) @@ -299,7 +299,7 @@ static int btrfs_insert_inode_extref(struct btrfs_trans_handle *trans,
ptr = (unsigned long)&extref->name; write_extent_buffer(path->nodes[0], name->name, ptr, name->len); - btrfs_mark_buffer_dirty(path->nodes[0]); + btrfs_mark_buffer_dirty(trans, path->nodes[0]);
out: btrfs_free_path(path); @@ -338,7 +338,7 @@ int btrfs_insert_inode_ref(struct btrfs_trans_handle *trans, goto out;
old_size = btrfs_item_size(path->nodes[0], path->slots[0]); - btrfs_extend_item(path, ins_len); + btrfs_extend_item(trans, path, ins_len); ref = btrfs_item_ptr(path->nodes[0], path->slots[0], struct btrfs_inode_ref); ref = (struct btrfs_inode_ref *)((unsigned long)ref + old_size); @@ -364,7 +364,7 @@ int btrfs_insert_inode_ref(struct btrfs_trans_handle *trans, ptr = (unsigned long)(ref + 1); } write_extent_buffer(path->nodes[0], name->name, ptr, name->len); - btrfs_mark_buffer_dirty(path->nodes[0]); + btrfs_mark_buffer_dirty(trans, path->nodes[0]);
out: btrfs_free_path(path); @@ -591,7 +591,7 @@ int btrfs_truncate_inode_items(struct btrfs_trans_handle *trans, num_dec = (orig_num_bytes - extent_num_bytes); if (extent_start != 0) control->sub_bytes += num_dec; - btrfs_mark_buffer_dirty(leaf); + btrfs_mark_buffer_dirty(trans, leaf); } else { extent_num_bytes = btrfs_file_extent_disk_num_bytes(leaf, fi); @@ -617,7 +617,7 @@ int btrfs_truncate_inode_items(struct btrfs_trans_handle *trans,
btrfs_set_file_extent_ram_bytes(leaf, fi, size); size = btrfs_file_extent_calc_inline_size(size); - btrfs_truncate_item(path, size, 1); + btrfs_truncate_item(trans, path, size, 1); } else if (!del_item) { /* * We have to bail so the last_size is set to diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 0f4498dfa30c9..58e104f118f96 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -573,7 +573,7 @@ static int insert_inline_extent(struct btrfs_trans_handle *trans, kunmap_local(kaddr); put_page(page); } - btrfs_mark_buffer_dirty(leaf); + btrfs_mark_buffer_dirty(trans, leaf); btrfs_release_path(path);
/* @@ -3072,7 +3072,7 @@ static int insert_reserved_file_extent(struct btrfs_trans_handle *trans, btrfs_item_ptr_offset(leaf, path->slots[0]), sizeof(struct btrfs_file_extent_item));
- btrfs_mark_buffer_dirty(leaf); + btrfs_mark_buffer_dirty(trans, leaf); btrfs_release_path(path);
/* @@ -4134,7 +4134,7 @@ static noinline int btrfs_update_inode_item(struct btrfs_trans_handle *trans, struct btrfs_inode_item);
fill_inode_item(trans, leaf, inode_item, &inode->vfs_inode); - btrfs_mark_buffer_dirty(leaf); + btrfs_mark_buffer_dirty(trans, leaf); btrfs_set_inode_last_trans(trans, inode); ret = 0; failed: @@ -6476,7 +6476,7 @@ int btrfs_create_new_inode(struct btrfs_trans_handle *trans, } }
- btrfs_mark_buffer_dirty(path->nodes[0]); + btrfs_mark_buffer_dirty(trans, path->nodes[0]); /* * We don't need the path anymore, plus inheriting properties, adding * ACLs, security xattrs, orphan item or adding the link, will result in @@ -9630,7 +9630,7 @@ static int btrfs_symlink(struct mnt_idmap *idmap, struct inode *dir,
ptr = btrfs_file_extent_inline_start(ei); write_extent_buffer(leaf, symname, ptr, name_len); - btrfs_mark_buffer_dirty(leaf); + btrfs_mark_buffer_dirty(trans, leaf); btrfs_free_path(path);
d_instantiate_new(dentry, inode); diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c index 6d0df9bc1e72b..8bdf9bed25c75 100644 --- a/fs/btrfs/ioctl.c +++ b/fs/btrfs/ioctl.c @@ -663,7 +663,7 @@ static noinline int create_subvol(struct mnt_idmap *idmap, goto out; }
- btrfs_mark_buffer_dirty(leaf); + btrfs_mark_buffer_dirty(trans, leaf);
inode_item = &root_item->inode; btrfs_set_stack_inode_generation(inode_item, 1); @@ -2947,7 +2947,7 @@ static long btrfs_ioctl_default_subvol(struct file *file, void __user *argp)
btrfs_cpu_key_to_disk(&disk_key, &new_root->root_key); btrfs_set_dir_item_key(path->nodes[0], di, &disk_key); - btrfs_mark_buffer_dirty(path->nodes[0]); + btrfs_mark_buffer_dirty(trans, path->nodes[0]); btrfs_release_path(path);
btrfs_set_fs_incompat(fs_info, DEFAULT_SUBVOL); diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c index 2637d6b157ff9..74cabaa59be71 100644 --- a/fs/btrfs/qgroup.c +++ b/fs/btrfs/qgroup.c @@ -622,7 +622,7 @@ static int add_qgroup_relation_item(struct btrfs_trans_handle *trans, u64 src,
ret = btrfs_insert_empty_item(trans, quota_root, path, &key, 0);
- btrfs_mark_buffer_dirty(path->nodes[0]); + btrfs_mark_buffer_dirty(trans, path->nodes[0]);
btrfs_free_path(path); return ret; @@ -700,7 +700,7 @@ static int add_qgroup_item(struct btrfs_trans_handle *trans, btrfs_set_qgroup_info_excl(leaf, qgroup_info, 0); btrfs_set_qgroup_info_excl_cmpr(leaf, qgroup_info, 0);
- btrfs_mark_buffer_dirty(leaf); + btrfs_mark_buffer_dirty(trans, leaf);
btrfs_release_path(path);
@@ -719,7 +719,7 @@ static int add_qgroup_item(struct btrfs_trans_handle *trans, btrfs_set_qgroup_limit_rsv_rfer(leaf, qgroup_limit, 0); btrfs_set_qgroup_limit_rsv_excl(leaf, qgroup_limit, 0);
- btrfs_mark_buffer_dirty(leaf); + btrfs_mark_buffer_dirty(trans, leaf);
ret = 0; out: @@ -808,7 +808,7 @@ static int update_qgroup_limit_item(struct btrfs_trans_handle *trans, btrfs_set_qgroup_limit_rsv_rfer(l, qgroup_limit, qgroup->rsv_rfer); btrfs_set_qgroup_limit_rsv_excl(l, qgroup_limit, qgroup->rsv_excl);
- btrfs_mark_buffer_dirty(l); + btrfs_mark_buffer_dirty(trans, l);
out: btrfs_free_path(path); @@ -854,7 +854,7 @@ static int update_qgroup_info_item(struct btrfs_trans_handle *trans, btrfs_set_qgroup_info_excl(l, qgroup_info, qgroup->excl); btrfs_set_qgroup_info_excl_cmpr(l, qgroup_info, qgroup->excl_cmpr);
- btrfs_mark_buffer_dirty(l); + btrfs_mark_buffer_dirty(trans, l);
out: btrfs_free_path(path); @@ -896,7 +896,7 @@ static int update_qgroup_status_item(struct btrfs_trans_handle *trans) btrfs_set_qgroup_status_rescan(l, ptr, fs_info->qgroup_rescan_progress.objectid);
- btrfs_mark_buffer_dirty(l); + btrfs_mark_buffer_dirty(trans, l);
out: btrfs_free_path(path); @@ -1069,7 +1069,7 @@ int btrfs_quota_enable(struct btrfs_fs_info *fs_info) BTRFS_QGROUP_STATUS_FLAGS_MASK); btrfs_set_qgroup_status_rescan(leaf, ptr, 0);
- btrfs_mark_buffer_dirty(leaf); + btrfs_mark_buffer_dirty(trans, leaf);
key.objectid = 0; key.type = BTRFS_ROOT_REF_KEY; diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c index 62ed57551824c..31781af447553 100644 --- a/fs/btrfs/relocation.c +++ b/fs/btrfs/relocation.c @@ -1181,7 +1181,7 @@ int replace_file_extents(struct btrfs_trans_handle *trans, } } if (dirty) - btrfs_mark_buffer_dirty(leaf); + btrfs_mark_buffer_dirty(trans, leaf); if (inode) btrfs_add_delayed_iput(BTRFS_I(inode)); return ret; @@ -1374,13 +1374,13 @@ int replace_path(struct btrfs_trans_handle *trans, struct reloc_control *rc, */ btrfs_set_node_blockptr(parent, slot, new_bytenr); btrfs_set_node_ptr_generation(parent, slot, new_ptr_gen); - btrfs_mark_buffer_dirty(parent); + btrfs_mark_buffer_dirty(trans, parent);
btrfs_set_node_blockptr(path->nodes[level], path->slots[level], old_bytenr); btrfs_set_node_ptr_generation(path->nodes[level], path->slots[level], old_ptr_gen); - btrfs_mark_buffer_dirty(path->nodes[level]); + btrfs_mark_buffer_dirty(trans, path->nodes[level]);
btrfs_init_generic_ref(&ref, BTRFS_ADD_DELAYED_REF, old_bytenr, blocksize, path->nodes[level]->start); @@ -2517,7 +2517,7 @@ static int do_relocation(struct btrfs_trans_handle *trans, node->eb->start); btrfs_set_node_ptr_generation(upper->eb, slot, trans->transid); - btrfs_mark_buffer_dirty(upper->eb); + btrfs_mark_buffer_dirty(trans, upper->eb);
btrfs_init_generic_ref(&ref, BTRFS_ADD_DELAYED_REF, node->eb->start, blocksize, @@ -3833,7 +3833,7 @@ static int __insert_orphan_inode(struct btrfs_trans_handle *trans, btrfs_set_inode_mode(leaf, item, S_IFREG | 0600); btrfs_set_inode_flags(leaf, item, BTRFS_INODE_NOCOMPRESS | BTRFS_INODE_PREALLOC); - btrfs_mark_buffer_dirty(leaf); + btrfs_mark_buffer_dirty(trans, leaf); out: btrfs_free_path(path); return ret; diff --git a/fs/btrfs/root-tree.c b/fs/btrfs/root-tree.c index 859874579456f..5b0f1bccc409c 100644 --- a/fs/btrfs/root-tree.c +++ b/fs/btrfs/root-tree.c @@ -191,7 +191,7 @@ int btrfs_update_root(struct btrfs_trans_handle *trans, struct btrfs_root btrfs_set_root_generation_v2(item, btrfs_root_generation(item));
write_extent_buffer(l, item, ptr, sizeof(*item)); - btrfs_mark_buffer_dirty(path->nodes[0]); + btrfs_mark_buffer_dirty(trans, path->nodes[0]); out: btrfs_free_path(path); return ret; @@ -438,7 +438,7 @@ int btrfs_add_root_ref(struct btrfs_trans_handle *trans, u64 root_id, btrfs_set_root_ref_name_len(leaf, ref, name->len); ptr = (unsigned long)(ref + 1); write_extent_buffer(leaf, name->name, ptr, name->len); - btrfs_mark_buffer_dirty(leaf); + btrfs_mark_buffer_dirty(trans, leaf);
if (key.type == BTRFS_ROOT_BACKREF_KEY) { btrfs_release_path(path); diff --git a/fs/btrfs/tests/extent-buffer-tests.c b/fs/btrfs/tests/extent-buffer-tests.c index 5ef0b90e25c3b..6a43a64ba55ad 100644 --- a/fs/btrfs/tests/extent-buffer-tests.c +++ b/fs/btrfs/tests/extent-buffer-tests.c @@ -61,7 +61,11 @@ static int test_btrfs_split_item(u32 sectorsize, u32 nodesize) key.type = BTRFS_EXTENT_CSUM_KEY; key.offset = 0;
- btrfs_setup_item_for_insert(root, path, &key, value_len); + /* + * Passing a NULL trans handle is fine here, we have a dummy root eb + * and the tree is a single node (level 0). + */ + btrfs_setup_item_for_insert(NULL, root, path, &key, value_len); write_extent_buffer(eb, value, btrfs_item_ptr_offset(eb, 0), value_len);
diff --git a/fs/btrfs/tests/inode-tests.c b/fs/btrfs/tests/inode-tests.c index 05b03f5eab83b..492d69d2fa737 100644 --- a/fs/btrfs/tests/inode-tests.c +++ b/fs/btrfs/tests/inode-tests.c @@ -34,7 +34,11 @@ static void insert_extent(struct btrfs_root *root, u64 start, u64 len, key.type = BTRFS_EXTENT_DATA_KEY; key.offset = start;
- btrfs_setup_item_for_insert(root, &path, &key, value_len); + /* + * Passing a NULL trans handle is fine here, we have a dummy root eb + * and the tree is a single node (level 0). + */ + btrfs_setup_item_for_insert(NULL, root, &path, &key, value_len); fi = btrfs_item_ptr(leaf, slot, struct btrfs_file_extent_item); btrfs_set_file_extent_generation(leaf, fi, 1); btrfs_set_file_extent_type(leaf, fi, type); @@ -64,7 +68,11 @@ static void insert_inode_item_key(struct btrfs_root *root) key.type = BTRFS_INODE_ITEM_KEY; key.offset = 0;
- btrfs_setup_item_for_insert(root, &path, &key, value_len); + /* + * Passing a NULL trans handle is fine here, we have a dummy root eb + * and the tree is a single node (level 0). + */ + btrfs_setup_item_for_insert(NULL, root, &path, &key, value_len); }
/* diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c index a00e7a0bc713d..ad0d934991741 100644 --- a/fs/btrfs/tree-log.c +++ b/fs/btrfs/tree-log.c @@ -504,9 +504,9 @@ static int overwrite_item(struct btrfs_trans_handle *trans, found_size = btrfs_item_size(path->nodes[0], path->slots[0]); if (found_size > item_size) - btrfs_truncate_item(path, item_size, 1); + btrfs_truncate_item(trans, path, item_size, 1); else if (found_size < item_size) - btrfs_extend_item(path, item_size - found_size); + btrfs_extend_item(trans, path, item_size - found_size); } else if (ret) { return ret; } @@ -574,7 +574,7 @@ static int overwrite_item(struct btrfs_trans_handle *trans, } } no_copy: - btrfs_mark_buffer_dirty(path->nodes[0]); + btrfs_mark_buffer_dirty(trans, path->nodes[0]); btrfs_release_path(path); return 0; } @@ -3530,7 +3530,7 @@ static noinline int insert_dir_log_key(struct btrfs_trans_handle *trans, last_offset = max(last_offset, curr_end); } btrfs_set_dir_log_end(path->nodes[0], item, last_offset); - btrfs_mark_buffer_dirty(path->nodes[0]); + btrfs_mark_buffer_dirty(trans, path->nodes[0]); btrfs_release_path(path); return 0; } @@ -4488,7 +4488,7 @@ static noinline int copy_items(struct btrfs_trans_handle *trans, dst_index++; }
- btrfs_mark_buffer_dirty(dst_path->nodes[0]); + btrfs_mark_buffer_dirty(trans, dst_path->nodes[0]); btrfs_release_path(dst_path); out: kfree(ins_data); @@ -4693,7 +4693,7 @@ static int log_one_extent(struct btrfs_trans_handle *trans, write_extent_buffer(leaf, &fi, btrfs_item_ptr_offset(leaf, path->slots[0]), sizeof(fi)); - btrfs_mark_buffer_dirty(leaf); + btrfs_mark_buffer_dirty(trans, leaf);
btrfs_release_path(path);
diff --git a/fs/btrfs/uuid-tree.c b/fs/btrfs/uuid-tree.c index 7c7001f42b14c..5be74f9e47ebf 100644 --- a/fs/btrfs/uuid-tree.c +++ b/fs/btrfs/uuid-tree.c @@ -124,7 +124,7 @@ int btrfs_uuid_tree_add(struct btrfs_trans_handle *trans, u8 *uuid, u8 type, * An item with that type already exists. * Extend the item and store the new subid at the end. */ - btrfs_extend_item(path, sizeof(subid_le)); + btrfs_extend_item(trans, path, sizeof(subid_le)); eb = path->nodes[0]; slot = path->slots[0]; offset = btrfs_item_ptr_offset(eb, slot); @@ -139,7 +139,7 @@ int btrfs_uuid_tree_add(struct btrfs_trans_handle *trans, u8 *uuid, u8 type, ret = 0; subid_le = cpu_to_le64(subid_cpu); write_extent_buffer(eb, &subid_le, offset, sizeof(subid_le)); - btrfs_mark_buffer_dirty(eb); + btrfs_mark_buffer_dirty(trans, eb);
out: btrfs_free_path(path); @@ -221,7 +221,7 @@ int btrfs_uuid_tree_remove(struct btrfs_trans_handle *trans, u8 *uuid, u8 type, move_src = offset + sizeof(subid); move_len = item_size - (move_src - btrfs_item_ptr_offset(eb, slot)); memmove_extent_buffer(eb, move_dst, move_src, move_len); - btrfs_truncate_item(path, item_size - sizeof(subid), 1); + btrfs_truncate_item(trans, path, item_size - sizeof(subid), 1);
out: btrfs_free_path(path); diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index 5019e9244d2d2..1df496c809376 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -1908,7 +1908,7 @@ static int btrfs_add_dev_item(struct btrfs_trans_handle *trans, ptr = btrfs_device_fsid(dev_item); write_extent_buffer(leaf, trans->fs_info->fs_devices->metadata_uuid, ptr, BTRFS_FSID_SIZE); - btrfs_mark_buffer_dirty(leaf); + btrfs_mark_buffer_dirty(trans, leaf);
ret = 0; out: @@ -2613,7 +2613,7 @@ static int btrfs_finish_sprout(struct btrfs_trans_handle *trans) if (device->fs_devices->seeding) { btrfs_set_device_generation(leaf, dev_item, device->generation); - btrfs_mark_buffer_dirty(leaf); + btrfs_mark_buffer_dirty(trans, leaf); }
path->slots[0]++; @@ -2911,7 +2911,7 @@ static noinline int btrfs_update_device(struct btrfs_trans_handle *trans, btrfs_device_get_disk_total_bytes(device)); btrfs_set_device_bytes_used(leaf, dev_item, btrfs_device_get_bytes_used(device)); - btrfs_mark_buffer_dirty(leaf); + btrfs_mark_buffer_dirty(trans, leaf);
out: btrfs_free_path(path); @@ -3499,7 +3499,7 @@ static int insert_balance_item(struct btrfs_fs_info *fs_info,
btrfs_set_balance_flags(leaf, item, bctl->flags);
- btrfs_mark_buffer_dirty(leaf); + btrfs_mark_buffer_dirty(trans, leaf); out: btrfs_free_path(path); err = btrfs_commit_transaction(trans); @@ -7513,7 +7513,7 @@ static int update_dev_stat_item(struct btrfs_trans_handle *trans, for (i = 0; i < BTRFS_DEV_STAT_VALUES_MAX; i++) btrfs_set_dev_stats_value(eb, ptr, i, btrfs_dev_stat_read(device, i)); - btrfs_mark_buffer_dirty(eb); + btrfs_mark_buffer_dirty(trans, eb);
out: btrfs_free_path(path); diff --git a/fs/btrfs/xattr.c b/fs/btrfs/xattr.c index fc4b20c2688a0..c454b8ce6babe 100644 --- a/fs/btrfs/xattr.c +++ b/fs/btrfs/xattr.c @@ -188,15 +188,15 @@ int btrfs_setxattr(struct btrfs_trans_handle *trans, struct inode *inode, if (old_data_len + name_len + sizeof(*di) == item_size) { /* No other xattrs packed in the same leaf item. */ if (size > old_data_len) - btrfs_extend_item(path, size - old_data_len); + btrfs_extend_item(trans, path, size - old_data_len); else if (size < old_data_len) - btrfs_truncate_item(path, data_size, 1); + btrfs_truncate_item(trans, path, data_size, 1); } else { /* There are other xattrs packed in the same item. */ ret = btrfs_delete_one_dir_name(trans, root, path, di); if (ret) goto out; - btrfs_extend_item(path, data_size); + btrfs_extend_item(trans, path, data_size); }
ptr = btrfs_item_ptr(leaf, slot, char); @@ -205,7 +205,7 @@ int btrfs_setxattr(struct btrfs_trans_handle *trans, struct inode *inode, btrfs_set_dir_data_len(leaf, di, size); data_ptr = ((unsigned long)(di + 1)) + name_len; write_extent_buffer(leaf, value, data_ptr, size); - btrfs_mark_buffer_dirty(leaf); + btrfs_mark_buffer_dirty(trans, leaf); } else { /* * Insert, and we had space for the xattr, so path->slots[0] is
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kent Overstreet kent.overstreet@gmail.com
[ Upstream commit 9492261ff2460252cf2d8de89cdf854c7e2b28a0 ]
When we started spreading new inode numbers throughout most of the 64 bit inode space, that triggered some corner case bugs, in particular some integer overflows related to the radix tree code. Oops.
Signed-off-by: Kent Overstreet kent.overstreet@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/generic-radix-tree.h | 7 +++++++ lib/generic-radix-tree.c | 17 ++++++++++++++--- 2 files changed, 21 insertions(+), 3 deletions(-)
diff --git a/include/linux/generic-radix-tree.h b/include/linux/generic-radix-tree.h index 107613f7d7920..f6cd0f909d9fb 100644 --- a/include/linux/generic-radix-tree.h +++ b/include/linux/generic-radix-tree.h @@ -38,6 +38,7 @@
#include <asm/page.h> #include <linux/bug.h> +#include <linux/limits.h> #include <linux/log2.h> #include <linux/math.h> #include <linux/types.h> @@ -184,6 +185,12 @@ void *__genradix_iter_peek(struct genradix_iter *, struct __genradix *, size_t); static inline void __genradix_iter_advance(struct genradix_iter *iter, size_t obj_size) { + if (iter->offset + obj_size < iter->offset) { + iter->offset = SIZE_MAX; + iter->pos = SIZE_MAX; + return; + } + iter->offset += obj_size;
if (!is_power_of_2(obj_size) && diff --git a/lib/generic-radix-tree.c b/lib/generic-radix-tree.c index f25eb111c0516..7dfa88282b006 100644 --- a/lib/generic-radix-tree.c +++ b/lib/generic-radix-tree.c @@ -166,6 +166,10 @@ void *__genradix_iter_peek(struct genradix_iter *iter, struct genradix_root *r; struct genradix_node *n; unsigned level, i; + + if (iter->offset == SIZE_MAX) + return NULL; + restart: r = READ_ONCE(radix->root); if (!r) @@ -184,10 +188,17 @@ void *__genradix_iter_peek(struct genradix_iter *iter, (GENRADIX_ARY - 1);
while (!n->children[i]) { + size_t objs_per_ptr = genradix_depth_size(level); + + if (iter->offset + objs_per_ptr < iter->offset) { + iter->offset = SIZE_MAX; + iter->pos = SIZE_MAX; + return NULL; + } + i++; - iter->offset = round_down(iter->offset + - genradix_depth_size(level), - genradix_depth_size(level)); + iter->offset = round_down(iter->offset + objs_per_ptr, + objs_per_ptr); iter->pos = (iter->offset >> PAGE_SHIFT) * objs_per_page; if (i == GENRADIX_ARY)
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Josh Poimboeuf jpoimboe@kernel.org
[ Upstream commit 2d7ce49f58dc95495b3e22e45d2be7de909b2c63 ]
Enabling CONFIG_KCSAN leads to unconverted, default return thunks to remain after patching.
As David Kaplan describes in his debugging of the issue, it is caused by a couple of KCSAN-generated constructors which aren't processed by objtool:
"When KCSAN is enabled, GCC generates lots of constructor functions named _sub_I_00099_0 which call __tsan_init and then return. The returns in these are generally annotated normally by objtool and fixed up at runtime. But objtool runs on vmlinux.o and vmlinux.o does not include a couple of object files that are in vmlinux, like init/version-timestamp.o and .vmlinux.export.o, both of which contain _sub_I_00099_0 functions. As a result, the returns in these functions are not annotated, and the panic occurs when we call one of them in do_ctors and it uses the default return thunk.
This difference can be seen by counting the number of these functions in the object files: $ objdump -d vmlinux.o|grep -c "<_sub_I_00099_0>:" 2601 $ objdump -d vmlinux|grep -c "<_sub_I_00099_0>:" 2603
If these functions are only run during kernel boot, there is no speculation concern."
Fix it by disabling KCSAN on version-timestamp.o and .vmlinux.export.o so the extra functions don't get generated. KASAN and GCOV are already disabled for those files.
[ bp: Massage commit message. ]
Closes: https://lore.kernel.org/lkml/20231016214810.GA3942238@dev-arch.thelio-3990X/ Reported-by: Nathan Chancellor nathan@kernel.org Signed-off-by: Josh Poimboeuf jpoimboe@kernel.org Signed-off-by: Borislav Petkov (AMD) bp@alien8.de Reviewed-by: Nick Desaulniers ndesaulniers@google.com Acked-by: Marco Elver elver@google.com Tested-by: Nathan Chancellor nathan@kernel.org Link: https://lore.kernel.org/r/20231017165946.v4i2d4exyqwqq3bx@treble Signed-off-by: Sasha Levin sashal@kernel.org --- init/Makefile | 1 + scripts/Makefile.vmlinux | 1 + 2 files changed, 2 insertions(+)
diff --git a/init/Makefile b/init/Makefile index ec557ada3c12e..cbac576c57d63 100644 --- a/init/Makefile +++ b/init/Makefile @@ -60,4 +60,5 @@ include/generated/utsversion.h: FORCE $(obj)/version-timestamp.o: include/generated/utsversion.h CFLAGS_version-timestamp.o := -include include/generated/utsversion.h KASAN_SANITIZE_version-timestamp.o := n +KCSAN_SANITIZE_version-timestamp.o := n GCOV_PROFILE_version-timestamp.o := n diff --git a/scripts/Makefile.vmlinux b/scripts/Makefile.vmlinux index 3cd6ca15f390d..c9f3e03124d7f 100644 --- a/scripts/Makefile.vmlinux +++ b/scripts/Makefile.vmlinux @@ -19,6 +19,7 @@ quiet_cmd_cc_o_c = CC $@
ifdef CONFIG_MODULES KASAN_SANITIZE_.vmlinux.export.o := n +KCSAN_SANITIZE_.vmlinux.export.o := n GCOV_PROFILE_.vmlinux.export.o := n targets += .vmlinux.export.o vmlinux: .vmlinux.export.o
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Shuai Xue xueshuai@linux.alibaba.com
[ Upstream commit 54aee5f15b83437f23b2b2469bcf21bdd9823916 ]
When perf-record with a large AUX area, e.g 4GB, it fails with:
#perf record -C 0 -m ,4G -e arm_spe_0// -- sleep 1 failed to mmap with 12 (Cannot allocate memory)
and it reveals a WARNING with __alloc_pages():
------------[ cut here ]------------ WARNING: CPU: 44 PID: 17573 at mm/page_alloc.c:5568 __alloc_pages+0x1ec/0x248 Call trace: __alloc_pages+0x1ec/0x248 __kmalloc_large_node+0xc0/0x1f8 __kmalloc_node+0x134/0x1e8 rb_alloc_aux+0xe0/0x298 perf_mmap+0x440/0x660 mmap_region+0x308/0x8a8 do_mmap+0x3c0/0x528 vm_mmap_pgoff+0xf4/0x1b8 ksys_mmap_pgoff+0x18c/0x218 __arm64_sys_mmap+0x38/0x58 invoke_syscall+0x50/0x128 el0_svc_common.constprop.0+0x58/0x188 do_el0_svc+0x34/0x50 el0_svc+0x34/0x108 el0t_64_sync_handler+0xb8/0xc0 el0t_64_sync+0x1a4/0x1a8
'rb->aux_pages' allocated by kcalloc() is a pointer array which is used to maintains AUX trace pages. The allocated page for this array is physically contiguous (and virtually contiguous) with an order of 0..MAX_ORDER. If the size of pointer array crosses the limitation set by MAX_ORDER, it reveals a WARNING.
So bail out early with -ENOMEM if the request AUX area is out of bound, e.g.:
#perf record -C 0 -m ,4G -e arm_spe_0// -- sleep 1 failed to mmap with 12 (Cannot allocate memory)
Signed-off-by: Shuai Xue xueshuai@linux.alibaba.com Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/events/ring_buffer.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/kernel/events/ring_buffer.c b/kernel/events/ring_buffer.c index a0433f37b0243..4a260ceed9c73 100644 --- a/kernel/events/ring_buffer.c +++ b/kernel/events/ring_buffer.c @@ -699,6 +699,12 @@ int rb_alloc_aux(struct perf_buffer *rb, struct perf_event *event, watermark = 0; }
+ /* + * kcalloc_node() is unable to allocate buffer if the size is larger + * than: PAGE_SIZE << MAX_ORDER; directly bail out in this case. + */ + if (get_order((unsigned long)nr_pages * sizeof(void *)) > MAX_ORDER) + return -ENOMEM; rb->aux_pages = kcalloc_node(nr_pages, sizeof(void *), GFP_KERNEL, node); if (!rb->aux_pages)
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Zhen Lei thunder.leizhen@huawei.com
[ Upstream commit 2cbc482d325ee58001472c4359b311958c4efdd1 ]
When a structure containing an RCU callback rhp is (incorrectly) freed and reallocated after rhp is passed to call_rcu(), it is not unusual for rhp->func to be set to NULL. This defeats the debugging prints used by __call_rcu_common() in kernels built with CONFIG_DEBUG_OBJECTS_RCU_HEAD=y, which expect to identify the offending code using the identity of this function.
And in kernels build without CONFIG_DEBUG_OBJECTS_RCU_HEAD=y, things are even worse, as can be seen from this splat:
Unable to handle kernel NULL pointer dereference at virtual address 0 ... ... PC is at 0x0 LR is at rcu_do_batch+0x1c0/0x3b8 ... ... (rcu_do_batch) from (rcu_core+0x1d4/0x284) (rcu_core) from (__do_softirq+0x24c/0x344) (__do_softirq) from (__irq_exit_rcu+0x64/0x108) (__irq_exit_rcu) from (irq_exit+0x8/0x10) (irq_exit) from (__handle_domain_irq+0x74/0x9c) (__handle_domain_irq) from (gic_handle_irq+0x8c/0x98) (gic_handle_irq) from (__irq_svc+0x5c/0x94) (__irq_svc) from (arch_cpu_idle+0x20/0x3c) (arch_cpu_idle) from (default_idle_call+0x4c/0x78) (default_idle_call) from (do_idle+0xf8/0x150) (do_idle) from (cpu_startup_entry+0x18/0x20) (cpu_startup_entry) from (0xc01530)
This commit therefore adds calls to mem_dump_obj(rhp) to output some information, for example:
slab kmalloc-256 start ffff410c45019900 pointer offset 0 size 256
This provides the rough size of the memory block and the offset of the rcu_head structure, which as least provides at least a few clues to help locate the problem. If the problem is reproducible, additional slab debugging can be enabled, for example, CONFIG_DEBUG_SLAB=y, which can provide significantly more information.
Signed-off-by: Zhen Lei thunder.leizhen@huawei.com Signed-off-by: Paul E. McKenney paulmck@kernel.org Signed-off-by: Frederic Weisbecker frederic@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/rcu/rcu.h | 7 +++++++ kernel/rcu/srcutiny.c | 1 + kernel/rcu/srcutree.c | 1 + kernel/rcu/tasks.h | 1 + kernel/rcu/tiny.c | 1 + kernel/rcu/tree.c | 1 + 6 files changed, 12 insertions(+)
diff --git a/kernel/rcu/rcu.h b/kernel/rcu/rcu.h index 98c1544cf572b..7ae3496b0b58d 100644 --- a/kernel/rcu/rcu.h +++ b/kernel/rcu/rcu.h @@ -10,6 +10,7 @@ #ifndef __LINUX_RCU_H #define __LINUX_RCU_H
+#include <linux/slab.h> #include <trace/events/rcu.h>
/* @@ -248,6 +249,12 @@ static inline void debug_rcu_head_unqueue(struct rcu_head *head) } #endif /* #else !CONFIG_DEBUG_OBJECTS_RCU_HEAD */
+static inline void debug_rcu_head_callback(struct rcu_head *rhp) +{ + if (unlikely(!rhp->func)) + kmem_dump_obj(rhp); +} + extern int rcu_cpu_stall_suppress_at_boot;
static inline bool rcu_stall_is_suppressed_at_boot(void) diff --git a/kernel/rcu/srcutiny.c b/kernel/rcu/srcutiny.c index 336af24e0fe35..c38e5933a5d69 100644 --- a/kernel/rcu/srcutiny.c +++ b/kernel/rcu/srcutiny.c @@ -138,6 +138,7 @@ void srcu_drive_gp(struct work_struct *wp) while (lh) { rhp = lh; lh = lh->next; + debug_rcu_head_callback(rhp); local_bh_disable(); rhp->func(rhp); local_bh_enable(); diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c index 253ed509b6abb..c6481032d42be 100644 --- a/kernel/rcu/srcutree.c +++ b/kernel/rcu/srcutree.c @@ -1735,6 +1735,7 @@ static void srcu_invoke_callbacks(struct work_struct *work) rhp = rcu_cblist_dequeue(&ready_cbs); for (; rhp != NULL; rhp = rcu_cblist_dequeue(&ready_cbs)) { debug_rcu_head_unqueue(rhp); + debug_rcu_head_callback(rhp); local_bh_disable(); rhp->func(rhp); local_bh_enable(); diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h index b770add3f8430..b64d45cdea9c4 100644 --- a/kernel/rcu/tasks.h +++ b/kernel/rcu/tasks.h @@ -493,6 +493,7 @@ static void rcu_tasks_invoke_cbs(struct rcu_tasks *rtp, struct rcu_tasks_percpu raw_spin_unlock_irqrestore_rcu_node(rtpcp, flags); len = rcl.len; for (rhp = rcu_cblist_dequeue(&rcl); rhp; rhp = rcu_cblist_dequeue(&rcl)) { + debug_rcu_head_callback(rhp); local_bh_disable(); rhp->func(rhp); local_bh_enable(); diff --git a/kernel/rcu/tiny.c b/kernel/rcu/tiny.c index 42f7589e51e09..fec804b790803 100644 --- a/kernel/rcu/tiny.c +++ b/kernel/rcu/tiny.c @@ -97,6 +97,7 @@ static inline bool rcu_reclaim_tiny(struct rcu_head *head)
trace_rcu_invoke_callback("", head); f = head->func; + debug_rcu_head_callback(head); WRITE_ONCE(head->func, (rcu_callback_t)0L); f(head); rcu_lock_release(&rcu_callback_map); diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index 1449cb69a0e0e..9b9ebe6205b93 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -2131,6 +2131,7 @@ static void rcu_do_batch(struct rcu_data *rdp) trace_rcu_invoke_callback(rcu_state.name, rhp);
f = rhp->func; + debug_rcu_head_callback(rhp); WRITE_ONCE(rhp->func, (rcu_callback_t)0L); f(rhp);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Denis Arefev arefev@swemel.ru
[ Upstream commit d8d5b7bf6f2105883bbd91bbd4d5b67e4e3dff71 ]
The value of a bitwise expression 1 << (cpu - sdp->mynode->grplo) is subject to overflow due to a failure to cast operands to a larger data type before performing the bitwise operation.
The maximum result of this subtraction is defined by the RCU_FANOUT_LEAF Kconfig option, which on 64-bit systems defaults to 16 (resulting in a maximum shift of 15), but which can be set up as high as 64 (resulting in a maximum shift of 63). A value of 31 can result in sign extension, resulting in 0xffffffff80000000 instead of the desired 0x80000000. A value of 32 or greater triggers undefined behavior per the C standard.
This bug has not been known to cause issues because almost all kernels take the default CONFIG_RCU_FANOUT_LEAF=16. Furthermore, as long as a given compiler gives a deterministic non-zero result for 1<<N for N>=32, the code correctly invokes all SRCU callbacks, albeit wasting CPU time along the way.
This commit therefore substitutes the correct 1UL for the buggy 1.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Signed-off-by: Denis Arefev arefev@swemel.ru Reviewed-by: Mathieu Desnoyers mathieu.desnoyers@efficios.com Reviewed-by: Joel Fernandes (Google) joel@joelfernandes.org Cc: David Laight David.Laight@aculab.com Signed-off-by: Paul E. McKenney paulmck@kernel.org Signed-off-by: Frederic Weisbecker frederic@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/rcu/srcutree.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c index c6481032d42be..7522517b63b6f 100644 --- a/kernel/rcu/srcutree.c +++ b/kernel/rcu/srcutree.c @@ -223,7 +223,7 @@ static bool init_srcu_struct_nodes(struct srcu_struct *ssp, gfp_t gfp_flags) snp->grplo = cpu; snp->grphi = cpu; } - sdp->grpmask = 1 << (cpu - sdp->mynode->grplo); + sdp->grpmask = 1UL << (cpu - sdp->mynode->grplo); } smp_store_release(&ssp->srcu_sup->srcu_size_state, SRCU_SIZE_WAIT_BARRIER); return true; @@ -833,7 +833,7 @@ static void srcu_schedule_cbs_snp(struct srcu_struct *ssp, struct srcu_node *snp int cpu;
for (cpu = snp->grplo; cpu <= snp->grphi; cpu++) { - if (!(mask & (1 << (cpu - snp->grplo)))) + if (!(mask & (1UL << (cpu - snp->grplo)))) continue; srcu_schedule_cbs_sdp(per_cpu_ptr(ssp->sda, cpu), delay); }
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ricardo Cañuelo ricardo.canuelo@collabora.com
[ Upstream commit cf77bf698887c3b9ebed76dea492b07a3c2c7632 ]
The lkdtm selftest config fragment enables CONFIG_UBSAN_TRAP to make the ARRAY_BOUNDS test kill the calling process when an out-of-bound access is detected by UBSAN. However, after this [1] commit, UBSAN is triggered under many new scenarios that weren't detected before, such as in struct definitions with fixed-size trailing arrays used as flexible arrays. As a result, CONFIG_UBSAN_TRAP=y has become a very aggressive option to enable except for specific situations.
`make kselftest-merge` applies CONFIG_UBSAN_TRAP=y to the kernel config for all selftests, which makes many of them fail because of system hangs during boot.
This change removes the config option from the lkdtm kselftest and configures the ARRAY_BOUNDS test to look for UBSAN reports rather than relying on the calling process being killed.
[1] commit 2d47c6956ab3 ("ubsan: Tighten UBSAN_BOUNDS on GCC")'
Signed-off-by: Ricardo Cañuelo ricardo.canuelo@collabora.com Reviewed-by: Kees Cook keescook@chromium.org Link: https://lore.kernel.org/r/20230802063252.1917997-1-ricardo.canuelo@collabora... Signed-off-by: Kees Cook keescook@chromium.org Signed-off-by: Sasha Levin sashal@kernel.org --- tools/testing/selftests/lkdtm/config | 1 - tools/testing/selftests/lkdtm/tests.txt | 2 +- 2 files changed, 1 insertion(+), 2 deletions(-)
diff --git a/tools/testing/selftests/lkdtm/config b/tools/testing/selftests/lkdtm/config index 5d52f64dfb430..7afe05e8c4d79 100644 --- a/tools/testing/selftests/lkdtm/config +++ b/tools/testing/selftests/lkdtm/config @@ -9,7 +9,6 @@ CONFIG_INIT_ON_FREE_DEFAULT_ON=y CONFIG_INIT_ON_ALLOC_DEFAULT_ON=y CONFIG_UBSAN=y CONFIG_UBSAN_BOUNDS=y -CONFIG_UBSAN_TRAP=y CONFIG_STACKPROTECTOR_STRONG=y CONFIG_SLUB_DEBUG=y CONFIG_SLUB_DEBUG_ON=y diff --git a/tools/testing/selftests/lkdtm/tests.txt b/tools/testing/selftests/lkdtm/tests.txt index 607b8d7e3ea34..2f3a1b96da6e3 100644 --- a/tools/testing/selftests/lkdtm/tests.txt +++ b/tools/testing/selftests/lkdtm/tests.txt @@ -7,7 +7,7 @@ EXCEPTION #EXHAUST_STACK Corrupts memory on failure #CORRUPT_STACK Crashes entire system on success #CORRUPT_STACK_STRONG Crashes entire system on success -ARRAY_BOUNDS +ARRAY_BOUNDS call trace:|UBSAN: array-index-out-of-bounds CORRUPT_LIST_ADD list_add corruption CORRUPT_LIST_DEL list_del corruption STACK_GUARD_PAGE_LEADING
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jacky Bai ping.bai@nxp.com
[ Upstream commit 8051a993ce222a5158bccc6ac22ace9253dd71cb ]
Fix coverity Issue CID 250382: Resource leak (RESOURCE_LEAK). Add kfree when error return.
Signed-off-by: Jacky Bai ping.bai@nxp.com Reviewed-by: Peng Fan peng.fan@nxp.com Signed-off-by: Daniel Lezcano daniel.lezcano@linaro.org Link: https://lore.kernel.org/r/20231009083922.1942971-1-ping.bai@nxp.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/clocksource/timer-imx-gpt.c | 18 +++++++++++++----- 1 file changed, 13 insertions(+), 5 deletions(-)
diff --git a/drivers/clocksource/timer-imx-gpt.c b/drivers/clocksource/timer-imx-gpt.c index 28ab4f1a7c713..6a878d227a13b 100644 --- a/drivers/clocksource/timer-imx-gpt.c +++ b/drivers/clocksource/timer-imx-gpt.c @@ -434,12 +434,16 @@ static int __init mxc_timer_init_dt(struct device_node *np, enum imx_gpt_type t return -ENOMEM;
imxtm->base = of_iomap(np, 0); - if (!imxtm->base) - return -ENXIO; + if (!imxtm->base) { + ret = -ENXIO; + goto err_kfree; + }
imxtm->irq = irq_of_parse_and_map(np, 0); - if (imxtm->irq <= 0) - return -EINVAL; + if (imxtm->irq <= 0) { + ret = -EINVAL; + goto err_kfree; + }
imxtm->clk_ipg = of_clk_get_by_name(np, "ipg");
@@ -452,11 +456,15 @@ static int __init mxc_timer_init_dt(struct device_node *np, enum imx_gpt_type t
ret = _mxc_timer_init(imxtm); if (ret) - return ret; + goto err_kfree;
initialized = 1;
return 0; + +err_kfree: + kfree(imxtm); + return ret; }
static int __init imx1_timer_init_dt(struct device_node *np)
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ronald Wahl ronald.wahl@raritan.com
[ Upstream commit 6d3bc4c02d59996d1d3180d8ed409a9d7d5900e0 ]
On SAM9 hardware two cascaded 16 bit timers are used to form a 32 bit high resolution timer that is used as scheduler clock when the kernel has been configured that way (CONFIG_ATMEL_CLOCKSOURCE_TCB).
The driver initially triggers a reset-to-zero of the two timers but this reset is only performed on the next rising clock. For the first timer this is ok - it will be in the next 60ns (16MHz clock). For the chained second timer this will only happen after the first timer overflows, i.e. after 2^16 clocks (~4ms with a 16MHz clock). So with other words the scheduler clock resets to 0 after the first 2^16 clock cycles.
It looks like that the scheduler does not like this and behaves wrongly over its lifetime, e.g. some tasks are scheduled with a long delay. Why that is and if there are additional requirements for this behaviour has not been further analysed.
There is a simple fix for resetting the second timer as well when the first timer is reset and this is to set the ATMEL_TC_ASWTRG_SET bit in the Channel Mode register (CMR) of the first timer. This will also rise the TIOA line (clock input of the second timer) when a software trigger respective SYNC is issued.
Signed-off-by: Ronald Wahl ronald.wahl@raritan.com Acked-by: Alexandre Belloni alexandre.belloni@bootlin.com Signed-off-by: Daniel Lezcano daniel.lezcano@linaro.org Link: https://lore.kernel.org/r/20231007161803.31342-1-rwahl@gmx.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/clocksource/timer-atmel-tcb.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/clocksource/timer-atmel-tcb.c b/drivers/clocksource/timer-atmel-tcb.c index 27af17c995900..2a90c92a9182a 100644 --- a/drivers/clocksource/timer-atmel-tcb.c +++ b/drivers/clocksource/timer-atmel-tcb.c @@ -315,6 +315,7 @@ static void __init tcb_setup_dual_chan(struct atmel_tc *tc, int mck_divisor_idx) writel(mck_divisor_idx /* likely divide-by-8 */ | ATMEL_TC_WAVE | ATMEL_TC_WAVESEL_UP /* free-run */ + | ATMEL_TC_ASWTRG_SET /* TIOA0 rises at software trigger */ | ATMEL_TC_ACPA_SET /* TIOA0 rises at 0 */ | ATMEL_TC_ACPC_CLEAR, /* (duty cycle 50%) */ tcaddr + ATMEL_TC_REG(0, CMR));
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Frederic Weisbecker frederic@kernel.org
[ Upstream commit 8a77f38bcd28d3c22ab7dd8eff3f299d43c00411 ]
Acceleration in SRCU happens on enqueue time for each new callback. This operation is expected not to fail and therefore any similar attempt from other places shouldn't find any remaining callbacks to accelerate.
Moreover accelerations performed beyond enqueue time are error prone because rcu_seq_snap() then may return the snapshot for a new grace period that is not going to be started.
Remove these dangerous and needless accelerations and introduce instead assertions reporting leaking unaccelerated callbacks beyond enqueue time.
Co-developed-by: Yong He alexyonghe@tencent.com Signed-off-by: Yong He alexyonghe@tencent.com Co-developed-by: Joel Fernandes (Google) joel@joelfernandes.org Signed-off-by: Joel Fernandes (Google) joel@joelfernandes.org Co-developed-by: Neeraj upadhyay Neeraj.Upadhyay@amd.com Signed-off-by: Neeraj upadhyay Neeraj.Upadhyay@amd.com Reviewed-by: Like Xu likexu@tencent.com Signed-off-by: Frederic Weisbecker frederic@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/rcu/srcutree.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-)
diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c index 7522517b63b6f..2f770a9a2a13a 100644 --- a/kernel/rcu/srcutree.c +++ b/kernel/rcu/srcutree.c @@ -782,8 +782,7 @@ static void srcu_gp_start(struct srcu_struct *ssp) spin_lock_rcu_node(sdp); /* Interrupts already disabled. */ rcu_segcblist_advance(&sdp->srcu_cblist, rcu_seq_current(&ssp->srcu_sup->srcu_gp_seq)); - (void)rcu_segcblist_accelerate(&sdp->srcu_cblist, - rcu_seq_snap(&ssp->srcu_sup->srcu_gp_seq)); + WARN_ON_ONCE(!rcu_segcblist_segempty(&sdp->srcu_cblist, RCU_NEXT_TAIL)); spin_unlock_rcu_node(sdp); /* Interrupts remain disabled. */ WRITE_ONCE(ssp->srcu_sup->srcu_gp_start, jiffies); WRITE_ONCE(ssp->srcu_sup->srcu_n_exp_nodelay, 0); @@ -1719,6 +1718,7 @@ static void srcu_invoke_callbacks(struct work_struct *work) ssp = sdp->ssp; rcu_cblist_init(&ready_cbs); spin_lock_irq_rcu_node(sdp); + WARN_ON_ONCE(!rcu_segcblist_segempty(&sdp->srcu_cblist, RCU_NEXT_TAIL)); rcu_segcblist_advance(&sdp->srcu_cblist, rcu_seq_current(&ssp->srcu_sup->srcu_gp_seq)); if (sdp->srcu_cblist_invoking || @@ -1748,8 +1748,6 @@ static void srcu_invoke_callbacks(struct work_struct *work) */ spin_lock_irq_rcu_node(sdp); rcu_segcblist_add_len(&sdp->srcu_cblist, -len); - (void)rcu_segcblist_accelerate(&sdp->srcu_cblist, - rcu_seq_snap(&ssp->srcu_sup->srcu_gp_seq)); sdp->srcu_cblist_invoking = false; more = rcu_segcblist_ready_cbs(&sdp->srcu_cblist); spin_unlock_irq_rcu_node(sdp);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Rik van Riel riel@surriel.com
[ Upstream commit 94b3f0b5af2c7af69e3d6e0cdd9b0ea535f22186 ]
The CSD lock seems to get stuck in 2 "modes". When it gets stuck temporarily, it usually gets released in a few seconds, and sometimes up to one or two minutes.
If the CSD lock stays stuck for more than several minutes, it never seems to get unstuck, and gradually more and more things in the system end up also getting stuck.
In the latter case, we should just give up, so the system can dump out a little more information about what went wrong, and, with panic_on_oops and a kdump kernel loaded, dump a whole bunch more information about what might have gone wrong. In addition, there is an smp.panic_on_ipistall kernel boot parameter that by default retains the old behavior, but when set enables the panic after the CSD lock has been stuck for more than the specified number of milliseconds, as in 300,000 for five minutes.
[ paulmck: Apply Imran Khan feedback. ] [ paulmck: Apply Leonardo Bras feedback. ]
Link: https://lore.kernel.org/lkml/bc7cc8b0-f587-4451-8bcd-0daae627bcc7@paulmck-la... Signed-off-by: Rik van Riel riel@surriel.com Signed-off-by: Paul E. McKenney paulmck@kernel.org Reviewed-by: Imran Khan imran.f.khan@oracle.com Reviewed-by: Leonardo Bras leobras@redhat.com Cc: Peter Zijlstra peterz@infradead.org Cc: Valentin Schneider vschneid@redhat.com Cc: Juergen Gross jgross@suse.com Cc: Jonathan Corbet corbet@lwn.net Cc: Randy Dunlap rdunlap@infradead.org Signed-off-by: Sasha Levin sashal@kernel.org --- Documentation/admin-guide/kernel-parameters.txt | 7 +++++++ kernel/smp.c | 13 ++++++++++++- 2 files changed, 19 insertions(+), 1 deletion(-)
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt index 23ebe34ff901e..b951b4c2808f1 100644 --- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -5781,6 +5781,13 @@ This feature may be more efficiently disabled using the csdlock_debug- kernel parameter.
+ smp.panic_on_ipistall= [KNL] + If a csd_lock_timeout extends for more than + the specified number of milliseconds, panic the + system. By default, let CSD-lock acquisition + take as long as they take. Specifying 300,000 + for this value provides a 5-minute timeout. + smsc-ircc2.nopnp [HW] Don't use PNP to discover SMC devices smsc-ircc2.ircc_cfg= [HW] Device configuration I/O port smsc-ircc2.ircc_sir= [HW] SIR base I/O port diff --git a/kernel/smp.c b/kernel/smp.c index 385179dae360e..0a9a3262f7822 100644 --- a/kernel/smp.c +++ b/kernel/smp.c @@ -168,6 +168,8 @@ static DEFINE_PER_CPU(void *, cur_csd_info);
static ulong csd_lock_timeout = 5000; /* CSD lock timeout in milliseconds. */ module_param(csd_lock_timeout, ulong, 0444); +static int panic_on_ipistall; /* CSD panic timeout in milliseconds, 300000 for five minutes. */ +module_param(panic_on_ipistall, int, 0444);
static atomic_t csd_bug_count = ATOMIC_INIT(0);
@@ -228,6 +230,7 @@ static bool csd_lock_wait_toolong(struct __call_single_data *csd, u64 ts0, u64 * }
ts2 = sched_clock(); + /* How long since we last checked for a stuck CSD lock.*/ ts_delta = ts2 - *ts1; if (likely(ts_delta <= csd_lock_timeout_ns || csd_lock_timeout_ns == 0)) return false; @@ -241,9 +244,17 @@ static bool csd_lock_wait_toolong(struct __call_single_data *csd, u64 ts0, u64 * else cpux = cpu; cpu_cur_csd = smp_load_acquire(&per_cpu(cur_csd, cpux)); /* Before func and info. */ + /* How long since this CSD lock was stuck. */ + ts_delta = ts2 - ts0; pr_alert("csd: %s non-responsive CSD lock (#%d) on CPU#%d, waiting %llu ns for CPU#%02d %pS(%ps).\n", - firsttime ? "Detected" : "Continued", *bug_id, raw_smp_processor_id(), ts2 - ts0, + firsttime ? "Detected" : "Continued", *bug_id, raw_smp_processor_id(), ts_delta, cpu, csd->func, csd->info); + /* + * If the CSD lock is still stuck after 5 minutes, it is unlikely + * to become unstuck. Use a signed comparison to avoid triggering + * on underflows when the TSC is out of sync between sockets. + */ + BUG_ON(panic_on_ipistall > 0 && (s64)ts_delta > ((s64)panic_on_ipistall * NSEC_PER_MSEC)); if (cpu_cur_csd && csd != cpu_cur_csd) { pr_alert("\tcsd: CSD lock (#%d) handling prior %pS(%ps) request.\n", *bug_id, READ_ONCE(per_cpu(cur_csd_func, cpux)),
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ran Xiaokai ran.xiaokai@zte.com.cn
[ Upstream commit 38685e2a0476127db766f81b1c06019ddc4c9ffa ]
If a system has isolated CPUs via the "isolcpus=" command line parameter, then an attempt to offline the last housekeeping CPU will result in a WARN_ON() when rebuilding the scheduler domains and a subsequent panic due to and unhandled empty CPU mas in partition_sched_domains_locked().
cpuset_hotplug_workfn() rebuild_sched_domains_locked() ndoms = generate_sched_domains(&doms, &attr); cpumask_and(doms[0], top_cpuset.effective_cpus, housekeeping_cpumask(HK_FLAG_DOMAIN));
Thus results in an empty CPU mask which triggers the warning and then the subsequent crash:
WARNING: CPU: 4 PID: 80 at kernel/sched/topology.c:2366 build_sched_domains+0x120c/0x1408 Call trace: build_sched_domains+0x120c/0x1408 partition_sched_domains_locked+0x234/0x880 rebuild_sched_domains_locked+0x37c/0x798 rebuild_sched_domains+0x30/0x58 cpuset_hotplug_workfn+0x2a8/0x930
Unable to handle kernel paging request at virtual address fffe80027ab37080 partition_sched_domains_locked+0x318/0x880 rebuild_sched_domains_locked+0x37c/0x798
Aside of the resulting crash, it does not make any sense to offline the last last housekeeping CPU.
Prevent this by masking out the non-housekeeping CPUs when selecting a target CPU for initiating the CPU unplug operation via the work queue.
Suggested-by: Thomas Gleixner tglx@linutronix.de Signed-off-by: Ran Xiaokai ran.xiaokai@zte.com.cn Signed-off-by: Thomas Gleixner tglx@linutronix.de Link: https://lore.kernel.org/r/202310171709530660462@zte.com.cn Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/cpu.c | 11 +++++++---- 1 file changed, 7 insertions(+), 4 deletions(-)
diff --git a/kernel/cpu.c b/kernel/cpu.c index 26119d2154102..189ba5fd9af4b 100644 --- a/kernel/cpu.c +++ b/kernel/cpu.c @@ -1503,11 +1503,14 @@ static int cpu_down_maps_locked(unsigned int cpu, enum cpuhp_state target) /* * Ensure that the control task does not run on the to be offlined * CPU to prevent a deadlock against cfs_b->period_timer. + * Also keep at least one housekeeping cpu onlined to avoid generating + * an empty sched_domain span. */ - cpu = cpumask_any_but(cpu_online_mask, cpu); - if (cpu >= nr_cpu_ids) - return -EBUSY; - return work_on_cpu(cpu, __cpu_down_maps_locked, &work); + for_each_cpu_and(cpu, cpu_online_mask, housekeeping_cpumask(HK_TYPE_DOMAIN)) { + if (cpu != work.cpu) + return work_on_cpu(cpu, __cpu_down_maps_locked, &work); + } + return -EBUSY; }
static int cpu_down(unsigned int cpu, enum cpuhp_state target)
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Frederic Weisbecker frederic@kernel.org
[ Upstream commit 265f3ed077036f053981f5eea0b5b43e7c5b39ff ]
All callers of work_on_cpu() share the same lock class key for all the functions queued. As a result the workqueue related locking scenario for a function A may be spuriously accounted as an inversion against the locking scenario of function B such as in the following model:
long A(void *arg) { mutex_lock(&mutex); mutex_unlock(&mutex); }
long B(void *arg) { }
void launchA(void) { work_on_cpu(0, A, NULL); }
void launchB(void) { mutex_lock(&mutex); work_on_cpu(1, B, NULL); mutex_unlock(&mutex); }
launchA and launchB running concurrently have no chance to deadlock. However the above can be reported by lockdep as a possible locking inversion because the works containing A() and B() are treated as belonging to the same locking class.
The following shows an existing example of such a spurious lockdep splat:
====================================================== WARNING: possible circular locking dependency detected 6.6.0-rc1-00065-g934ebd6e5359 #35409 Not tainted ------------------------------------------------------ kworker/0:1/9 is trying to acquire lock: ffffffff9bc72f30 (cpu_hotplug_lock){++++}-{0:0}, at: _cpu_down+0x57/0x2b0
but task is already holding lock: ffff9e3bc0057e60 ((work_completion)(&wfc.work)){+.+.}-{0:0}, at: process_scheduled_works+0x216/0x500
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #2 ((work_completion)(&wfc.work)){+.+.}-{0:0}: __flush_work+0x83/0x4e0 work_on_cpu+0x97/0xc0 rcu_nocb_cpu_offload+0x62/0xb0 rcu_nocb_toggle+0xd0/0x1d0 kthread+0xe6/0x120 ret_from_fork+0x2f/0x40 ret_from_fork_asm+0x1b/0x30
-> #1 (rcu_state.barrier_mutex){+.+.}-{3:3}: __mutex_lock+0x81/0xc80 rcu_nocb_cpu_deoffload+0x38/0xb0 rcu_nocb_toggle+0x144/0x1d0 kthread+0xe6/0x120 ret_from_fork+0x2f/0x40 ret_from_fork_asm+0x1b/0x30
-> #0 (cpu_hotplug_lock){++++}-{0:0}: __lock_acquire+0x1538/0x2500 lock_acquire+0xbf/0x2a0 percpu_down_write+0x31/0x200 _cpu_down+0x57/0x2b0 __cpu_down_maps_locked+0x10/0x20 work_for_cpu_fn+0x15/0x20 process_scheduled_works+0x2a7/0x500 worker_thread+0x173/0x330 kthread+0xe6/0x120 ret_from_fork+0x2f/0x40 ret_from_fork_asm+0x1b/0x30
other info that might help us debug this:
Chain exists of: cpu_hotplug_lock --> rcu_state.barrier_mutex --> (work_completion)(&wfc.work)
Possible unsafe locking scenario:
CPU0 CPU1 ---- ---- lock((work_completion)(&wfc.work)); lock(rcu_state.barrier_mutex); lock((work_completion)(&wfc.work)); lock(cpu_hotplug_lock);
*** DEADLOCK ***
2 locks held by kworker/0:1/9: #0: ffff900481068b38 ((wq_completion)events){+.+.}-{0:0}, at: process_scheduled_works+0x212/0x500 #1: ffff9e3bc0057e60 ((work_completion)(&wfc.work)){+.+.}-{0:0}, at: process_scheduled_works+0x216/0x500
stack backtrace: CPU: 0 PID: 9 Comm: kworker/0:1 Not tainted 6.6.0-rc1-00065-g934ebd6e5359 #35409 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014 Workqueue: events work_for_cpu_fn Call Trace: rcu-torture: rcu_torture_read_exit: Start of episode <TASK> dump_stack_lvl+0x4a/0x80 check_noncircular+0x132/0x150 __lock_acquire+0x1538/0x2500 lock_acquire+0xbf/0x2a0 ? _cpu_down+0x57/0x2b0 percpu_down_write+0x31/0x200 ? _cpu_down+0x57/0x2b0 _cpu_down+0x57/0x2b0 __cpu_down_maps_locked+0x10/0x20 work_for_cpu_fn+0x15/0x20 process_scheduled_works+0x2a7/0x500 worker_thread+0x173/0x330 ? __pfx_worker_thread+0x10/0x10 kthread+0xe6/0x120 ? __pfx_kthread+0x10/0x10 ret_from_fork+0x2f/0x40 ? __pfx_kthread+0x10/0x10 ret_from_fork_asm+0x1b/0x30 </TASK
Fix this with providing one lock class key per work_on_cpu() caller.
Reported-and-tested-by: Paul E. McKenney paulmck@kernel.org Signed-off-by: Frederic Weisbecker frederic@kernel.org Signed-off-by: Tejun Heo tj@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/workqueue.h | 46 +++++++++++++++++++++++++++++++++------ kernel/workqueue.c | 20 ++++++++++------- 2 files changed, 51 insertions(+), 15 deletions(-)
diff --git a/include/linux/workqueue.h b/include/linux/workqueue.h index 683efe29fa698..ca26c1f94f044 100644 --- a/include/linux/workqueue.h +++ b/include/linux/workqueue.h @@ -222,18 +222,16 @@ static inline unsigned int work_static(struct work_struct *work) { return 0; } * to generate better code. */ #ifdef CONFIG_LOCKDEP -#define __INIT_WORK(_work, _func, _onstack) \ +#define __INIT_WORK_KEY(_work, _func, _onstack, _key) \ do { \ - static struct lock_class_key __key; \ - \ __init_work((_work), _onstack); \ (_work)->data = (atomic_long_t) WORK_DATA_INIT(); \ - lockdep_init_map(&(_work)->lockdep_map, "(work_completion)"#_work, &__key, 0); \ + lockdep_init_map(&(_work)->lockdep_map, "(work_completion)"#_work, (_key), 0); \ INIT_LIST_HEAD(&(_work)->entry); \ (_work)->func = (_func); \ } while (0) #else -#define __INIT_WORK(_work, _func, _onstack) \ +#define __INIT_WORK_KEY(_work, _func, _onstack, _key) \ do { \ __init_work((_work), _onstack); \ (_work)->data = (atomic_long_t) WORK_DATA_INIT(); \ @@ -242,12 +240,22 @@ static inline unsigned int work_static(struct work_struct *work) { return 0; } } while (0) #endif
+#define __INIT_WORK(_work, _func, _onstack) \ + do { \ + static __maybe_unused struct lock_class_key __key; \ + \ + __INIT_WORK_KEY(_work, _func, _onstack, &__key); \ + } while (0) + #define INIT_WORK(_work, _func) \ __INIT_WORK((_work), (_func), 0)
#define INIT_WORK_ONSTACK(_work, _func) \ __INIT_WORK((_work), (_func), 1)
+#define INIT_WORK_ONSTACK_KEY(_work, _func, _key) \ + __INIT_WORK_KEY((_work), (_func), 1, _key) + #define __INIT_DELAYED_WORK(_work, _func, _tflags) \ do { \ INIT_WORK(&(_work)->work, (_func)); \ @@ -683,8 +691,32 @@ static inline long work_on_cpu_safe(int cpu, long (*fn)(void *), void *arg) return fn(arg); } #else -long work_on_cpu(int cpu, long (*fn)(void *), void *arg); -long work_on_cpu_safe(int cpu, long (*fn)(void *), void *arg); +long work_on_cpu_key(int cpu, long (*fn)(void *), + void *arg, struct lock_class_key *key); +/* + * A new key is defined for each caller to make sure the work + * associated with the function doesn't share its locking class. + */ +#define work_on_cpu(_cpu, _fn, _arg) \ +({ \ + static struct lock_class_key __key; \ + \ + work_on_cpu_key(_cpu, _fn, _arg, &__key); \ +}) + +long work_on_cpu_safe_key(int cpu, long (*fn)(void *), + void *arg, struct lock_class_key *key); + +/* + * A new key is defined for each caller to make sure the work + * associated with the function doesn't share its locking class. + */ +#define work_on_cpu_safe(_cpu, _fn, _arg) \ +({ \ + static struct lock_class_key __key; \ + \ + work_on_cpu_safe_key(_cpu, _fn, _arg, &__key); \ +}) #endif /* CONFIG_SMP */
#ifdef CONFIG_FREEZER diff --git a/kernel/workqueue.c b/kernel/workqueue.c index e4a37d7a6752d..a7fcb25417726 100644 --- a/kernel/workqueue.c +++ b/kernel/workqueue.c @@ -5571,50 +5571,54 @@ static void work_for_cpu_fn(struct work_struct *work) }
/** - * work_on_cpu - run a function in thread context on a particular cpu + * work_on_cpu_key - run a function in thread context on a particular cpu * @cpu: the cpu to run on * @fn: the function to run * @arg: the function arg + * @key: The lock class key for lock debugging purposes * * It is up to the caller to ensure that the cpu doesn't go offline. * The caller must not hold any locks which would prevent @fn from completing. * * Return: The value @fn returns. */ -long work_on_cpu(int cpu, long (*fn)(void *), void *arg) +long work_on_cpu_key(int cpu, long (*fn)(void *), + void *arg, struct lock_class_key *key) { struct work_for_cpu wfc = { .fn = fn, .arg = arg };
- INIT_WORK_ONSTACK(&wfc.work, work_for_cpu_fn); + INIT_WORK_ONSTACK_KEY(&wfc.work, work_for_cpu_fn, key); schedule_work_on(cpu, &wfc.work); flush_work(&wfc.work); destroy_work_on_stack(&wfc.work); return wfc.ret; } -EXPORT_SYMBOL_GPL(work_on_cpu); +EXPORT_SYMBOL_GPL(work_on_cpu_key);
/** - * work_on_cpu_safe - run a function in thread context on a particular cpu + * work_on_cpu_safe_key - run a function in thread context on a particular cpu * @cpu: the cpu to run on * @fn: the function to run * @arg: the function argument + * @key: The lock class key for lock debugging purposes * * Disables CPU hotplug and calls work_on_cpu(). The caller must not hold * any locks which would prevent @fn from completing. * * Return: The value @fn returns. */ -long work_on_cpu_safe(int cpu, long (*fn)(void *), void *arg) +long work_on_cpu_safe_key(int cpu, long (*fn)(void *), + void *arg, struct lock_class_key *key) { long ret = -ENODEV;
cpus_read_lock(); if (cpu_online(cpu)) - ret = work_on_cpu(cpu, fn, arg); + ret = work_on_cpu_key(cpu, fn, arg, key); cpus_read_unlock(); return ret; } -EXPORT_SYMBOL_GPL(work_on_cpu_safe); +EXPORT_SYMBOL_GPL(work_on_cpu_safe_key); #endif /* CONFIG_SMP */
#ifdef CONFIG_FREEZER
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mike Rapoport (IBM) rppt@kernel.org
[ Upstream commit a1e2b8b36820d8c91275f207e77e91645b7c6836 ]
Qi Zheng reported crashes in a production environment and provided a simplified example as a reproducer:
| For example, if we use Qemu to start a two NUMA node kernel, | one of the nodes has 2M memory (less than NODE_MIN_SIZE), | and the other node has 2G, then we will encounter the | following panic: | | BUG: kernel NULL pointer dereference, address: 0000000000000000 | <...> | RIP: 0010:_raw_spin_lock_irqsave+0x22/0x40 | <...> | Call Trace: | <TASK> | deactivate_slab() | bootstrap() | kmem_cache_init() | start_kernel() | secondary_startup_64_no_verify()
The crashes happen because of inconsistency between the nodemask that has nodes with less than 4MB as memoryless, and the actual memory fed into the core mm.
The commit:
9391a3f9c7f1 ("[PATCH] x86_64: Clear more state when ignoring empty node in SRAT parsing")
... that introduced minimal size of a NUMA node does not explain why a node size cannot be less than 4MB and what boot failures this restriction might fix.
Fixes have been submitted to the core MM code to tighten up the memory topologies it accepts and to not crash on weird input:
mm: page_alloc: skip memoryless nodes entirely mm: memory_hotplug: drop memoryless node from fallback lists
Andrew has accepted them into the -mm tree, but there are no stable SHA1's yet.
This patch drops the limitation for minimal node size on x86:
- which works around the crash without the fixes to the core MM. - makes x86 topologies less weird, - removes an arbitrary and undocumented limitation on NUMA topologies.
[ mingo: Improved changelog clarity. ]
Reported-by: Qi Zheng zhengqi.arch@bytedance.com Tested-by: Mario Casquero mcasquer@redhat.com Signed-off-by: Mike Rapoport (IBM) rppt@kernel.org Signed-off-by: Ingo Molnar mingo@kernel.org Acked-by: David Hildenbrand david@redhat.com Acked-by: Michal Hocko mhocko@suse.com Cc: Dave Hansen dave.hansen@linux.intel.com Cc: Rik van Riel riel@surriel.com Link: https://lore.kernel.org/r/ZS+2qqjEO5/867br@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/include/asm/numa.h | 7 ------- arch/x86/mm/numa.c | 7 ------- 2 files changed, 14 deletions(-)
diff --git a/arch/x86/include/asm/numa.h b/arch/x86/include/asm/numa.h index e3bae2b60a0db..ef2844d691735 100644 --- a/arch/x86/include/asm/numa.h +++ b/arch/x86/include/asm/numa.h @@ -12,13 +12,6 @@
#define NR_NODE_MEMBLKS (MAX_NUMNODES*2)
-/* - * Too small node sizes may confuse the VM badly. Usually they - * result from BIOS bugs. So dont recognize nodes as standalone - * NUMA entities that have less than this amount of RAM listed: - */ -#define NODE_MIN_SIZE (4*1024*1024) - extern int numa_off;
/* diff --git a/arch/x86/mm/numa.c b/arch/x86/mm/numa.c index c01c5506fd4ae..aa39d678fe81d 100644 --- a/arch/x86/mm/numa.c +++ b/arch/x86/mm/numa.c @@ -602,13 +602,6 @@ static int __init numa_register_memblks(struct numa_meminfo *mi) if (start >= end) continue;
- /* - * Don't confuse VM with a node that doesn't have the - * minimum amount of memory: - */ - if (end && (end - start) < NODE_MIN_SIZE) - continue; - alloc_node_data(nid); }
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dmitry Antipov dmantipov@yandex.ru
[ Upstream commit a763e92c78615ea838f5b9a841398b1d4adb968e ]
When compiling with clang 16.0.6 and CONFIG_FORTIFY_SOURCE=y, I've noticed the following (somewhat confusing due to absence of an actual source code location):
In file included from drivers/net/wireless/purelifi/plfxlc/mac.c:6: In file included from ./include/linux/netdevice.h:24: In file included from ./include/linux/timer.h:6: In file included from ./include/linux/ktime.h:24: In file included from ./include/linux/time.h:60: In file included from ./include/linux/time32.h:13: In file included from ./include/linux/timex.h:67: In file included from ./arch/x86/include/asm/timex.h:5: In file included from ./arch/x86/include/asm/processor.h:23: In file included from ./arch/x86/include/asm/msr.h:11: In file included from ./arch/x86/include/asm/cpumask.h:5: In file included from ./include/linux/cpumask.h:12: In file included from ./include/linux/bitmap.h:11: In file included from ./include/linux/string.h:254: ./include/linux/fortify-string.h:592:4: warning: call to '__read_overflow2_field' declared with 'warning' attribute: detected read beyond size of field (2nd parameter); maybe use struct_group()? [-Wattribute-warning] __read_overflow2_field(q_size_field, size);
The compiler actually complains on 'plfxlc_get_et_strings()' where fortification logic inteprets call to 'memcpy()' as an attempt to copy the whole 'et_strings' array from its first member and so issues an overread warning. This warning may be silenced by passing an address of the whole array and not the first member to 'memcpy()'.
Signed-off-by: Dmitry Antipov dmantipov@yandex.ru Signed-off-by: Kalle Valo kvalo@kernel.org Link: https://lore.kernel.org/r/20230829094541.234751-1-dmantipov@yandex.ru Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/purelifi/plfxlc/mac.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/wireless/purelifi/plfxlc/mac.c b/drivers/net/wireless/purelifi/plfxlc/mac.c index 94ee831b5de35..506d2f31efb5a 100644 --- a/drivers/net/wireless/purelifi/plfxlc/mac.c +++ b/drivers/net/wireless/purelifi/plfxlc/mac.c @@ -666,7 +666,7 @@ static void plfxlc_get_et_strings(struct ieee80211_hw *hw, u32 sset, u8 *data) { if (sset == ETH_SS_STATS) - memcpy(data, *et_strings, sizeof(et_strings)); + memcpy(data, et_strings, sizeof(et_strings)); }
static void plfxlc_get_et_stats(struct ieee80211_hw *hw,
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Harshitha Prem quic_hprem@quicinc.com
[ Upstream commit bbc86757ca62423c3b6bd8f7176da1ff43450769 ]
When max virtual ap interfaces are configured in all the bands with ACS and hostapd restart is done every 60s, a crash is observed at random times.
In the above scenario, a fragmented packet is received for self peer, for which rx_tid and rx_frags are not initialized in datapath. While handling this fragment, crash is observed as the rx_frag list is uninitialized and when we walk in ath12k_dp_rx_h_sort_frags, skb null leads to exception.
To address this, before processing received fragments we check dp_setup_done flag is set to ensure that peer has completed its dp peer setup for fragment queue, else ignore processing the fragments.
Call trace: PC points to "ath12k_dp_process_rx_err+0x4e8/0xfcc [ath12k]" LR points to "ath12k_dp_process_rx_err+0x480/0xfcc [ath12k]". The Backtrace obtained is as follows: ath12k_dp_process_rx_err+0x4e8/0xfcc [ath12k] ath12k_dp_service_srng+0x78/0x260 [ath12k] ath12k_pci_write32+0x990/0xb0c [ath12k] __napi_poll+0x30/0xa4 net_rx_action+0x118/0x270 __do_softirq+0x10c/0x244 irq_exit+0x64/0xb4 __handle_domain_irq+0x88/0xac gic_handle_irq+0x74/0xbc el1_irq+0xf0/0x1c0 arch_cpu_idle+0x10/0x18 do_idle+0x104/0x248 cpu_startup_entry+0x20/0x64 rest_init+0xd0/0xdc arch_call_rest_init+0xc/0x14
Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.0.1-00029-QCAHKSWPL_SILICONZ-1
Signed-off-by: Harshitha Prem quic_hprem@quicinc.com Signed-off-by: Kalle Valo quic_kvalo@quicinc.com Link: https://lore.kernel.org/r/20230821130343.29495-2-quic_hprem@quicinc.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/ath/ath12k/dp.c | 1 + drivers/net/wireless/ath/ath12k/dp_rx.c | 9 +++++++++ drivers/net/wireless/ath/ath12k/peer.h | 3 +++ 3 files changed, 13 insertions(+)
diff --git a/drivers/net/wireless/ath/ath12k/dp.c b/drivers/net/wireless/ath/ath12k/dp.c index f933896f2a68d..6893466f61f04 100644 --- a/drivers/net/wireless/ath/ath12k/dp.c +++ b/drivers/net/wireless/ath/ath12k/dp.c @@ -38,6 +38,7 @@ void ath12k_dp_peer_cleanup(struct ath12k *ar, int vdev_id, const u8 *addr)
ath12k_dp_rx_peer_tid_cleanup(ar, peer); crypto_free_shash(peer->tfm_mmic); + peer->dp_setup_done = false; spin_unlock_bh(&ab->base_lock); }
diff --git a/drivers/net/wireless/ath/ath12k/dp_rx.c b/drivers/net/wireless/ath/ath12k/dp_rx.c index fcb91b8ef00e3..73edcb1908b91 100644 --- a/drivers/net/wireless/ath/ath12k/dp_rx.c +++ b/drivers/net/wireless/ath/ath12k/dp_rx.c @@ -2747,6 +2747,7 @@ int ath12k_dp_rx_peer_frag_setup(struct ath12k *ar, const u8 *peer_mac, int vdev }
peer->tfm_mmic = tfm; + peer->dp_setup_done = true; spin_unlock_bh(&ab->base_lock);
return 0; @@ -3213,6 +3214,14 @@ static int ath12k_dp_rx_frag_h_mpdu(struct ath12k *ar, ret = -ENOENT; goto out_unlock; } + + if (!peer->dp_setup_done) { + ath12k_warn(ab, "The peer %pM [%d] has uninitialized datapath\n", + peer->addr, peer_id); + ret = -ENOENT; + goto out_unlock; + } + rx_tid = &peer->rx_tid[tid];
if ((!skb_queue_empty(&rx_tid->rx_frags) && seqno != rx_tid->cur_sn) || diff --git a/drivers/net/wireless/ath/ath12k/peer.h b/drivers/net/wireless/ath/ath12k/peer.h index b296dc0e2f671..c6edb24cbedd8 100644 --- a/drivers/net/wireless/ath/ath12k/peer.h +++ b/drivers/net/wireless/ath/ath12k/peer.h @@ -44,6 +44,9 @@ struct ath12k_peer { struct ppdu_user_delayba ppdu_stats_delayba; bool delayba_flag; bool is_authorized; + + /* protected by ab->data_lock */ + bool dp_setup_done; };
void ath12k_peer_unmap_event(struct ath12k_base *ab, u16 peer_id);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dmitry Antipov dmantipov@yandex.ru
[ Upstream commit cbaccdc42483c65016f1bae89128c08dc17cfb2a ]
When compiling with clang 16.0.6 and CONFIG_FORTIFY_SOURCE=y, I've noticed the following (somewhat confusing due to absence of an actual source code location):
In file included from drivers/net/wireless/virtual/mac80211_hwsim.c:18: In file included from ./include/linux/slab.h:16: In file included from ./include/linux/gfp.h:7: In file included from ./include/linux/mmzone.h:8: In file included from ./include/linux/spinlock.h:56: In file included from ./include/linux/preempt.h:79: In file included from ./arch/x86/include/asm/preempt.h:9: In file included from ./include/linux/thread_info.h:60: In file included from ./arch/x86/include/asm/thread_info.h:53: In file included from ./arch/x86/include/asm/cpufeature.h:5: In file included from ./arch/x86/include/asm/processor.h:23: In file included from ./arch/x86/include/asm/msr.h:11: In file included from ./arch/x86/include/asm/cpumask.h:5: In file included from ./include/linux/cpumask.h:12: In file included from ./include/linux/bitmap.h:11: In file included from ./include/linux/string.h:254: ./include/linux/fortify-string.h:592:4: warning: call to '__read_overflow2_field' declared with 'warning' attribute: detected read beyond size of field (2nd parameter); maybe use struct_group()? [-Wattribute-warning] __read_overflow2_field(q_size_field, size);
The compiler actually complains on 'mac80211_hwsim_get_et_strings()' where fortification logic inteprets call to 'memcpy()' as an attempt to copy the whole 'mac80211_hwsim_gstrings_stats' array from its first member and so issues an overread warning. This warning may be silenced by passing an address of the whole array and not the first member to 'memcpy()'.
Signed-off-by: Dmitry Antipov dmantipov@yandex.ru Link: https://lore.kernel.org/r/20230829094140.234636-1-dmantipov@yandex.ru Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/virtual/mac80211_hwsim.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/wireless/virtual/mac80211_hwsim.c b/drivers/net/wireless/virtual/mac80211_hwsim.c index 23307c8baea21..6dc153a267872 100644 --- a/drivers/net/wireless/virtual/mac80211_hwsim.c +++ b/drivers/net/wireless/virtual/mac80211_hwsim.c @@ -3170,7 +3170,7 @@ static void mac80211_hwsim_get_et_strings(struct ieee80211_hw *hw, u32 sset, u8 *data) { if (sset == ETH_SS_STATS) - memcpy(data, *mac80211_hwsim_gstrings_stats, + memcpy(data, mac80211_hwsim_gstrings_stats, sizeof(mac80211_hwsim_gstrings_stats)); }
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ping-Ke Shih pkshih@realtek.com
[ Upstream commit e160ab85166e77347d0cbe5149045cb25e83937f ]
We can get a UBSAN warning if ieee80211_get_tx_power() returns the INT_MIN value mac80211 internally uses for "unset power level".
UBSAN: signed-integer-overflow in net/wireless/nl80211.c:3816:5 -2147483648 * 100 cannot be represented in type 'int' CPU: 0 PID: 20433 Comm: insmod Tainted: G WC OE Call Trace: dump_stack+0x74/0x92 ubsan_epilogue+0x9/0x50 handle_overflow+0x8d/0xd0 __ubsan_handle_mul_overflow+0xe/0x10 nl80211_send_iface+0x688/0x6b0 [cfg80211] [...] cfg80211_register_wdev+0x78/0xb0 [cfg80211] cfg80211_netdev_notifier_call+0x200/0x620 [cfg80211] [...] ieee80211_if_add+0x60e/0x8f0 [mac80211] ieee80211_register_hw+0xda5/0x1170 [mac80211]
In this case, simply return an error instead, to indicate that no data is available.
Cc: Zong-Zhe Yang kevin_yang@realtek.com Signed-off-by: Ping-Ke Shih pkshih@realtek.com Link: https://lore.kernel.org/r/20230203023636.4418-1-pkshih@realtek.com Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/mac80211/cfg.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/net/mac80211/cfg.c b/net/mac80211/cfg.c index 0e3a1753a51c6..715da615f0359 100644 --- a/net/mac80211/cfg.c +++ b/net/mac80211/cfg.c @@ -3121,6 +3121,10 @@ static int ieee80211_get_tx_power(struct wiphy *wiphy, else *dbm = sdata->vif.bss_conf.txpower;
+ /* INT_MIN indicates no power level was set yet */ + if (*dbm == INT_MIN) + return -EINVAL; + return 0; }
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sieng-Piaw Liew liew.s.piaw@gmail.com
[ Upstream commit 86565682e9053e5deb128193ea9e88531bbae9cf ]
This is based on alx driver commit 881d0327db37 ("net: alx: Work around the DMA RX overflow issue").
The alx and atl1c drivers had RX overflow error which was why a custom allocator was created to avoid certain addresses. The simpler workaround then created for alx driver, but not for atl1c due to lack of tester.
Instead of using a custom allocator, check the allocated skb address and use skb_reserve() to move away from problematic 0x...fc0 address.
Tested on AR8131 on Acer 4540.
Signed-off-by: Sieng-Piaw Liew liew.s.piaw@gmail.com Link: https://lore.kernel.org/r/20230912010711.12036-1-liew.s.piaw@gmail.com Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/atheros/atl1c/atl1c.h | 3 - .../net/ethernet/atheros/atl1c/atl1c_main.c | 67 +++++-------------- 2 files changed, 16 insertions(+), 54 deletions(-)
diff --git a/drivers/net/ethernet/atheros/atl1c/atl1c.h b/drivers/net/ethernet/atheros/atl1c/atl1c.h index 43d821fe7a542..63ba64dbb7310 100644 --- a/drivers/net/ethernet/atheros/atl1c/atl1c.h +++ b/drivers/net/ethernet/atheros/atl1c/atl1c.h @@ -504,15 +504,12 @@ struct atl1c_rrd_ring { u16 next_to_use; u16 next_to_clean; struct napi_struct napi; - struct page *rx_page; - unsigned int rx_page_offset; };
/* board specific private data structure */ struct atl1c_adapter { struct net_device *netdev; struct pci_dev *pdev; - unsigned int rx_frag_size; struct atl1c_hw hw; struct atl1c_hw_stats hw_stats; struct mii_if_info mii; /* MII interface info */ diff --git a/drivers/net/ethernet/atheros/atl1c/atl1c_main.c b/drivers/net/ethernet/atheros/atl1c/atl1c_main.c index 940c5d1ff9cfc..74b78164cf74a 100644 --- a/drivers/net/ethernet/atheros/atl1c/atl1c_main.c +++ b/drivers/net/ethernet/atheros/atl1c/atl1c_main.c @@ -483,15 +483,10 @@ static int atl1c_set_mac_addr(struct net_device *netdev, void *p) static void atl1c_set_rxbufsize(struct atl1c_adapter *adapter, struct net_device *dev) { - unsigned int head_size; int mtu = dev->mtu;
adapter->rx_buffer_len = mtu > AT_RX_BUF_SIZE ? roundup(mtu + ETH_HLEN + ETH_FCS_LEN + VLAN_HLEN, 8) : AT_RX_BUF_SIZE; - - head_size = SKB_DATA_ALIGN(adapter->rx_buffer_len + NET_SKB_PAD + NET_IP_ALIGN) + - SKB_DATA_ALIGN(sizeof(struct skb_shared_info)); - adapter->rx_frag_size = roundup_pow_of_two(head_size); }
static netdev_features_t atl1c_fix_features(struct net_device *netdev, @@ -964,7 +959,6 @@ static void atl1c_init_ring_ptrs(struct atl1c_adapter *adapter) static void atl1c_free_ring_resources(struct atl1c_adapter *adapter) { struct pci_dev *pdev = adapter->pdev; - int i;
dma_free_coherent(&pdev->dev, adapter->ring_header.size, adapter->ring_header.desc, adapter->ring_header.dma); @@ -977,12 +971,6 @@ static void atl1c_free_ring_resources(struct atl1c_adapter *adapter) kfree(adapter->tpd_ring[0].buffer_info); adapter->tpd_ring[0].buffer_info = NULL; } - for (i = 0; i < adapter->rx_queue_count; ++i) { - if (adapter->rrd_ring[i].rx_page) { - put_page(adapter->rrd_ring[i].rx_page); - adapter->rrd_ring[i].rx_page = NULL; - } - } }
/** @@ -1754,48 +1742,11 @@ static inline void atl1c_rx_checksum(struct atl1c_adapter *adapter, skb_checksum_none_assert(skb); }
-static struct sk_buff *atl1c_alloc_skb(struct atl1c_adapter *adapter, - u32 queue, bool napi_mode) -{ - struct atl1c_rrd_ring *rrd_ring = &adapter->rrd_ring[queue]; - struct sk_buff *skb; - struct page *page; - - if (adapter->rx_frag_size > PAGE_SIZE) { - if (likely(napi_mode)) - return napi_alloc_skb(&rrd_ring->napi, - adapter->rx_buffer_len); - else - return netdev_alloc_skb_ip_align(adapter->netdev, - adapter->rx_buffer_len); - } - - page = rrd_ring->rx_page; - if (!page) { - page = alloc_page(GFP_ATOMIC); - if (unlikely(!page)) - return NULL; - rrd_ring->rx_page = page; - rrd_ring->rx_page_offset = 0; - } - - skb = build_skb(page_address(page) + rrd_ring->rx_page_offset, - adapter->rx_frag_size); - if (likely(skb)) { - skb_reserve(skb, NET_SKB_PAD + NET_IP_ALIGN); - rrd_ring->rx_page_offset += adapter->rx_frag_size; - if (rrd_ring->rx_page_offset >= PAGE_SIZE) - rrd_ring->rx_page = NULL; - else - get_page(page); - } - return skb; -} - static int atl1c_alloc_rx_buffer(struct atl1c_adapter *adapter, u32 queue, bool napi_mode) { struct atl1c_rfd_ring *rfd_ring = &adapter->rfd_ring[queue]; + struct atl1c_rrd_ring *rrd_ring = &adapter->rrd_ring[queue]; struct pci_dev *pdev = adapter->pdev; struct atl1c_buffer *buffer_info, *next_info; struct sk_buff *skb; @@ -1814,13 +1765,27 @@ static int atl1c_alloc_rx_buffer(struct atl1c_adapter *adapter, u32 queue, while (next_info->flags & ATL1C_BUFFER_FREE) { rfd_desc = ATL1C_RFD_DESC(rfd_ring, rfd_next_to_use);
- skb = atl1c_alloc_skb(adapter, queue, napi_mode); + /* When DMA RX address is set to something like + * 0x....fc0, it will be very likely to cause DMA + * RFD overflow issue. + * + * To work around it, we apply rx skb with 64 bytes + * longer space, and offset the address whenever + * 0x....fc0 is detected. + */ + if (likely(napi_mode)) + skb = napi_alloc_skb(&rrd_ring->napi, adapter->rx_buffer_len + 64); + else + skb = netdev_alloc_skb(adapter->netdev, adapter->rx_buffer_len + 64); if (unlikely(!skb)) { if (netif_msg_rx_err(adapter)) dev_warn(&pdev->dev, "alloc rx buffer failed\n"); break; }
+ if (((unsigned long)skb->data & 0xfff) == 0xfc0) + skb_reserve(skb, 64); + /* * Make buffer alignment 2 beyond a 16 byte boundary * this will result in a 16 byte aligned IP header after
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kumar Kartikeya Dwivedi memxor@gmail.com
[ Upstream commit 66d9111f3517f85ef2af0337ece02683ce0faf21 ]
Now that bpf_throw kfunc is the first such call instruction that has noreturn semantics within the verifier, this also kicks in dead code elimination in unprecedented ways. For one, any instruction following a bpf_throw call will never be marked as seen. Moreover, if a callchain ends up throwing, any instructions after the call instruction to the eventually throwing subprog in callers will also never be marked as seen.
The tempting way to fix this would be to emit extra 'int3' instructions which bump the jited_len of a program, and ensure that during runtime when a program throws, we can discover its boundaries even if the call instruction to bpf_throw (or to subprogs that always throw) is emitted as the final instruction in the program.
An example of such a program would be this:
do_something(): ... r0 = 0 exit
foo(): r1 = 0 call bpf_throw r0 = 0 exit
bar(cond): if r1 != 0 goto pc+2 call do_something exit call foo r0 = 0 // Never seen by verifier exit //
main(ctx): r1 = ... call bar r0 = 0 exit
Here, if we do end up throwing, the stacktrace would be the following:
bpf_throw foo bar main
In bar, the final instruction emitted will be the call to foo, as such, the return address will be the subsequent instruction (which the JIT emits as int3 on x86). This will end up lying outside the jited_len of the program, thus, when unwinding, we will fail to discover the return address as belonging to any program and end up in a panic due to the unreliable stack unwinding of BPF programs that we never expect.
To remedy this case, make bpf_prog_ksym_find treat IP == ksym.end as part of the BPF program, so that is_bpf_text_address returns true when such a case occurs, and we are able to unwind reliably when the final instruction ends up being a call instruction.
Signed-off-by: Kumar Kartikeya Dwivedi memxor@gmail.com Link: https://lore.kernel.org/r/20230912233214.1518551-12-memxor@gmail.com Signed-off-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/bpf/core.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c index e3e45b651cd40..33d1a76b7fc5d 100644 --- a/kernel/bpf/core.c +++ b/kernel/bpf/core.c @@ -613,7 +613,11 @@ static __always_inline int bpf_tree_comp(void *key, struct latch_tree_node *n)
if (val < ksym->start) return -1; - if (val >= ksym->end) + /* Ensure that we detect return addresses as part of the program, when + * the final instruction is a call for a program part of the stack + * trace. Therefore, do val > ksym->end instead of val >= ksym->end. + */ + if (val > ksym->end) return 1;
return 0;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dmitry Antipov dmantipov@yandex.ru
[ Upstream commit 95f97fe0ac974467ab4da215985a32b2fdf48af0 ]
When compiling with clang 16.0.6 and CONFIG_FORTIFY_SOURCE=y, I've noticed the following (somewhat confusing due to absence of an actual source code location):
In file included from drivers/net/wireless/ath/ath9k/debug.c:17: In file included from ./include/linux/slab.h:16: In file included from ./include/linux/gfp.h:7: In file included from ./include/linux/mmzone.h:8: In file included from ./include/linux/spinlock.h:56: In file included from ./include/linux/preempt.h:79: In file included from ./arch/x86/include/asm/preempt.h:9: In file included from ./include/linux/thread_info.h:60: In file included from ./arch/x86/include/asm/thread_info.h:53: In file included from ./arch/x86/include/asm/cpufeature.h:5: In file included from ./arch/x86/include/asm/processor.h:23: In file included from ./arch/x86/include/asm/msr.h:11: In file included from ./arch/x86/include/asm/cpumask.h:5: In file included from ./include/linux/cpumask.h:12: In file included from ./include/linux/bitmap.h:11: In file included from ./include/linux/string.h:254: ./include/linux/fortify-string.h:592:4: warning: call to '__read_overflow2_field' declared with 'warning' attribute: detected read beyond size of field (2nd parameter); maybe use struct_group()? [-Wattribute-warning] __read_overflow2_field(q_size_field, size);
In file included from drivers/net/wireless/ath/ath9k/htc_drv_debug.c:17: In file included from drivers/net/wireless/ath/ath9k/htc.h:20: In file included from ./include/linux/module.h:13: In file included from ./include/linux/stat.h:19: In file included from ./include/linux/time.h:60: In file included from ./include/linux/time32.h:13: In file included from ./include/linux/timex.h:67: In file included from ./arch/x86/include/asm/timex.h:5: In file included from ./arch/x86/include/asm/processor.h:23: In file included from ./arch/x86/include/asm/msr.h:11: In file included from ./arch/x86/include/asm/cpumask.h:5: In file included from ./include/linux/cpumask.h:12: In file included from ./include/linux/bitmap.h:11: In file included from ./include/linux/string.h:254: ./include/linux/fortify-string.h:592:4: warning: call to '__read_overflow2_field' declared with 'warning' attribute: detected read beyond size of field (2nd parameter); maybe use struct_group()? [-Wattribute-warning] __read_overflow2_field(q_size_field, size);
The compiler actually complains on 'ath9k_get_et_strings()' and 'ath9k_htc_get_et_strings()' due to the same reason: fortification logic inteprets call to 'memcpy()' as an attempt to copy the whole array from it's first member and so issues an overread warning. These warnings may be silenced by passing an address of the whole array and not the first member to 'memcpy()'.
Signed-off-by: Dmitry Antipov dmantipov@yandex.ru Acked-by: Toke Høiland-Jørgensen toke@toke.dk Signed-off-by: Kalle Valo quic_kvalo@quicinc.com Link: https://lore.kernel.org/r/20230829093856.234584-1-dmantipov@yandex.ru Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/ath/ath9k/debug.c | 2 +- drivers/net/wireless/ath/ath9k/htc_drv_debug.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/net/wireless/ath/ath9k/debug.c b/drivers/net/wireless/ath/ath9k/debug.c index fb7a2952d0ce8..d9bac1c343490 100644 --- a/drivers/net/wireless/ath/ath9k/debug.c +++ b/drivers/net/wireless/ath/ath9k/debug.c @@ -1333,7 +1333,7 @@ void ath9k_get_et_strings(struct ieee80211_hw *hw, u32 sset, u8 *data) { if (sset == ETH_SS_STATS) - memcpy(data, *ath9k_gstrings_stats, + memcpy(data, ath9k_gstrings_stats, sizeof(ath9k_gstrings_stats)); }
diff --git a/drivers/net/wireless/ath/ath9k/htc_drv_debug.c b/drivers/net/wireless/ath/ath9k/htc_drv_debug.c index c55aab01fff5d..e79bbcd3279af 100644 --- a/drivers/net/wireless/ath/ath9k/htc_drv_debug.c +++ b/drivers/net/wireless/ath/ath9k/htc_drv_debug.c @@ -428,7 +428,7 @@ void ath9k_htc_get_et_strings(struct ieee80211_hw *hw, u32 sset, u8 *data) { if (sset == ETH_SS_STATS) - memcpy(data, *ath9k_htc_gstrings_stats, + memcpy(data, ath9k_htc_gstrings_stats, sizeof(ath9k_htc_gstrings_stats)); }
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Baochen Qiang quic_bqiang@quicinc.com
[ Upstream commit 1bc44a505a229bb1dd4957e11aa594edeea3690e ]
len is extracted from HTT message and could be an unexpected value in case errors happen, so add validation before using to avoid possible out-of-bound read in the following message iteration and parsing.
The same issue also applies to ppdu_info->ppdu_stats.common.num_users, so validate it before using too.
These are found during code review.
Compile test only.
Signed-off-by: Baochen Qiang quic_bqiang@quicinc.com Acked-by: Jeff Johnson quic_jjohnson@quicinc.com Signed-off-by: Kalle Valo quic_kvalo@quicinc.com Link: https://lore.kernel.org/r/20230901015602.45112-1-quic_bqiang@quicinc.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/ath/ath12k/dp_rx.c | 17 +++++++++++++++++ 1 file changed, 17 insertions(+)
diff --git a/drivers/net/wireless/ath/ath12k/dp_rx.c b/drivers/net/wireless/ath/ath12k/dp_rx.c index 73edcb1908b91..8dd744c381eeb 100644 --- a/drivers/net/wireless/ath/ath12k/dp_rx.c +++ b/drivers/net/wireless/ath/ath12k/dp_rx.c @@ -1555,6 +1555,13 @@ static int ath12k_htt_pull_ppdu_stats(struct ath12k_base *ab,
msg = (struct ath12k_htt_ppdu_stats_msg *)skb->data; len = le32_get_bits(msg->info, HTT_T2H_PPDU_STATS_INFO_PAYLOAD_SIZE); + if (len > (skb->len - struct_size(msg, data, 0))) { + ath12k_warn(ab, + "HTT PPDU STATS event has unexpected payload size %u, should be smaller than %u\n", + len, skb->len); + return -EINVAL; + } + pdev_id = le32_get_bits(msg->info, HTT_T2H_PPDU_STATS_INFO_PDEV_ID); ppdu_id = le32_to_cpu(msg->ppdu_id);
@@ -1583,6 +1590,16 @@ static int ath12k_htt_pull_ppdu_stats(struct ath12k_base *ab, goto exit; }
+ if (ppdu_info->ppdu_stats.common.num_users >= HTT_PPDU_STATS_MAX_USERS) { + spin_unlock_bh(&ar->data_lock); + ath12k_warn(ab, + "HTT PPDU STATS event has unexpected num_users %u, should be smaller than %u\n", + ppdu_info->ppdu_stats.common.num_users, + HTT_PPDU_STATS_MAX_USERS); + ret = -EINVAL; + goto exit; + } + /* back up data rate tlv for all peers */ if (ppdu_info->frame_type == HTT_STATS_PPDU_FTYPE_DATA && (ppdu_info->tlv_bitmap & (1 << HTT_PPDU_STATS_TAG_USR_COMMON)) &&
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dmitry Antipov dmantipov@yandex.ru
[ Upstream commit cb4c132ebfeac5962f7258ffc831caa0c4dada1a ]
When compiling with clang 16.0.6 and CONFIG_FORTIFY_SOURCE=y, I've noticed the following (somewhat confusing due to absence of an actual source code location):
In file included from drivers/net/wireless/ath/ath10k/debug.c:8: In file included from ./include/linux/module.h:13: In file included from ./include/linux/stat.h:19: In file included from ./include/linux/time.h:60: In file included from ./include/linux/time32.h:13: In file included from ./include/linux/timex.h:67: In file included from ./arch/x86/include/asm/timex.h:5: In file included from ./arch/x86/include/asm/processor.h:23: In file included from ./arch/x86/include/asm/msr.h:11: In file included from ./arch/x86/include/asm/cpumask.h:5: In file included from ./include/linux/cpumask.h:12: In file included from ./include/linux/bitmap.h:11: In file included from ./include/linux/string.h:254: ./include/linux/fortify-string.h:592:4: warning: call to '__read_overflow2_field' declared with 'warning' attribute: detected read beyond size of field (2nd parameter); maybe use struct_group()? [-Wattribute-warning] __read_overflow2_field(q_size_field, size);
The compiler actually complains on 'ath10k_debug_get_et_strings()' where fortification logic inteprets call to 'memcpy()' as an attempt to copy the whole 'ath10k_gstrings_stats' array from it's first member and so issues an overread warning. This warning may be silenced by passing an address of the whole array and not the first member to 'memcpy()'.
Signed-off-by: Dmitry Antipov dmantipov@yandex.ru Acked-by: Jeff Johnson quic_jjohnson@quicinc.com Signed-off-by: Kalle Valo quic_kvalo@quicinc.com Link: https://lore.kernel.org/r/20230829093652.234537-1-dmantipov@yandex.ru Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/ath/ath10k/debug.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/wireless/ath/ath10k/debug.c b/drivers/net/wireless/ath/ath10k/debug.c index f9518e1c99039..fe89bc61e5317 100644 --- a/drivers/net/wireless/ath/ath10k/debug.c +++ b/drivers/net/wireless/ath/ath10k/debug.c @@ -1140,7 +1140,7 @@ void ath10k_debug_get_et_strings(struct ieee80211_hw *hw, u32 sset, u8 *data) { if (sset == ETH_SS_STATS) - memcpy(data, *ath10k_gstrings_stats, + memcpy(data, ath10k_gstrings_stats, sizeof(ath10k_gstrings_stats)); }
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Baochen Qiang quic_bqiang@quicinc.com
[ Upstream commit b302dce3d9edea5b93d1902a541684a967f3c63c ]
reg_cap.phy_id is extracted from WMI event and could be an unexpected value in case some errors happen. As a result out-of-bound write may occur to soc->hal_reg_cap. Fix it by validating reg_cap.phy_id before using it.
This is found during code review.
Compile tested only.
Signed-off-by: Baochen Qiang quic_bqiang@quicinc.com Acked-by: Jeff Johnson quic_jjohnson@quicinc.com Signed-off-by: Kalle Valo quic_kvalo@quicinc.com Link: https://lore.kernel.org/r/20230830020716.5420-1-quic_bqiang@quicinc.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/ath/ath12k/wmi.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/drivers/net/wireless/ath/ath12k/wmi.c b/drivers/net/wireless/ath/ath12k/wmi.c index eebc5a65ce3b4..416b22fa53ebf 100644 --- a/drivers/net/wireless/ath/ath12k/wmi.c +++ b/drivers/net/wireless/ath/ath12k/wmi.c @@ -3799,6 +3799,12 @@ static int ath12k_wmi_ext_hal_reg_caps(struct ath12k_base *soc, ath12k_warn(soc, "failed to extract reg cap %d\n", i); return ret; } + + if (reg_cap.phy_id >= MAX_RADIOS) { + ath12k_warn(soc, "unexpected phy id %u\n", reg_cap.phy_id); + return -EINVAL; + } + soc->hal_reg_cap[reg_cap.phy_id] = reg_cap; } return 0;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Shiju Jose shiju.jose@huawei.com
[ Upstream commit e2abc47a5a1a9f641e7cacdca643fdd40729bf6e ]
ghes_handle_aer() passes AER data to the PCI core for logging and recovery by calling aer_recover_queue() with a pointer to struct aer_capability_regs.
The problem was that aer_recover_queue() queues the pointer directly without copying the aer_capability_regs data. The pointer was to the ghes->estatus buffer, which could be reused before aer_recover_work_func() reads the data.
To avoid this problem, allocate a new aer_capability_regs structure from the ghes_estatus_pool, copy the AER data from the ghes->estatus buffer into it, pass a pointer to the new struct to aer_recover_queue(), and free it after aer_recover_work_func() has processed it.
Reported-by: Bjorn Helgaas helgaas@kernel.org Acked-by: Bjorn Helgaas bhelgaas@google.com Signed-off-by: Shiju Jose shiju.jose@huawei.com [ rjw: Subject edits ] Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/acpi/apei/ghes.c | 23 ++++++++++++++++++++++- drivers/pci/pcie/aer.c | 10 ++++++++++ include/acpi/ghes.h | 4 ++++ 3 files changed, 36 insertions(+), 1 deletion(-)
diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c index ef59d6ea16da0..63ad0541db381 100644 --- a/drivers/acpi/apei/ghes.c +++ b/drivers/acpi/apei/ghes.c @@ -209,6 +209,20 @@ int ghes_estatus_pool_init(unsigned int num_ghes) return -ENOMEM; }
+/** + * ghes_estatus_pool_region_free - free previously allocated memory + * from the ghes_estatus_pool. + * @addr: address of memory to free. + * @size: size of memory to free. + * + * Returns none. + */ +void ghes_estatus_pool_region_free(unsigned long addr, u32 size) +{ + gen_pool_free(ghes_estatus_pool, addr, size); +} +EXPORT_SYMBOL_GPL(ghes_estatus_pool_region_free); + static int map_gen_v2(struct ghes *ghes) { return apei_map_generic_address(&ghes->generic_v2->read_ack_register); @@ -564,6 +578,7 @@ static void ghes_handle_aer(struct acpi_hest_generic_data *gdata) pcie_err->validation_bits & CPER_PCIE_VALID_AER_INFO) { unsigned int devfn; int aer_severity; + u8 *aer_info;
devfn = PCI_DEVFN(pcie_err->device_id.device, pcie_err->device_id.function); @@ -577,11 +592,17 @@ static void ghes_handle_aer(struct acpi_hest_generic_data *gdata) if (gdata->flags & CPER_SEC_RESET) aer_severity = AER_FATAL;
+ aer_info = (void *)gen_pool_alloc(ghes_estatus_pool, + sizeof(struct aer_capability_regs)); + if (!aer_info) + return; + memcpy(aer_info, pcie_err->aer_info, sizeof(struct aer_capability_regs)); + aer_recover_queue(pcie_err->device_id.segment, pcie_err->device_id.bus, devfn, aer_severity, (struct aer_capability_regs *) - pcie_err->aer_info); + aer_info); } #endif } diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c index f6c24ded134cd..67025ee2b7454 100644 --- a/drivers/pci/pcie/aer.c +++ b/drivers/pci/pcie/aer.c @@ -29,6 +29,7 @@ #include <linux/kfifo.h> #include <linux/slab.h> #include <acpi/apei.h> +#include <acpi/ghes.h> #include <ras/ras_event.h>
#include "../pci.h" @@ -1010,6 +1011,15 @@ static void aer_recover_work_func(struct work_struct *work) continue; } cper_print_aer(pdev, entry.severity, entry.regs); + /* + * Memory for aer_capability_regs(entry.regs) is being allocated from the + * ghes_estatus_pool to protect it from overwriting when multiple sections + * are present in the error status. Thus free the same after processing + * the data. + */ + ghes_estatus_pool_region_free((unsigned long)entry.regs, + sizeof(struct aer_capability_regs)); + if (entry.severity == AER_NONFATAL) pcie_do_recovery(pdev, pci_channel_io_normal, aer_root_reset); diff --git a/include/acpi/ghes.h b/include/acpi/ghes.h index 3c8bba9f1114a..be1dd4c1a9174 100644 --- a/include/acpi/ghes.h +++ b/include/acpi/ghes.h @@ -73,8 +73,12 @@ int ghes_register_vendor_record_notifier(struct notifier_block *nb); void ghes_unregister_vendor_record_notifier(struct notifier_block *nb);
struct list_head *ghes_get_devices(void); + +void ghes_estatus_pool_region_free(unsigned long addr, u32 size); #else static inline struct list_head *ghes_get_devices(void) { return NULL; } + +static inline void ghes_estatus_pool_region_free(unsigned long addr, u32 size) { return; } #endif
int ghes_estatus_pool_init(unsigned int num_ghes);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Christian Marangi ansuelsmth@gmail.com
[ Upstream commit d387e34fec407f881fdf165b5d7ec128ebff362f ]
Fiberstone GPON-ONU-34-20B can operate at 2500base-X, but report 1.2GBd NRZ in their EEPROM.
The module also require the ignore tx fault fixup similar to Huawei MA5671A as it gets disabled on error messages with serial redirection enabled.
Signed-off-by: Christian Marangi ansuelsmth@gmail.com Link: https://lore.kernel.org/r/20230919124720.8210-1-ansuelsmth@gmail.com Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/phy/sfp.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/drivers/net/phy/sfp.c b/drivers/net/phy/sfp.c index d855a18308d78..338b9769d91a1 100644 --- a/drivers/net/phy/sfp.c +++ b/drivers/net/phy/sfp.c @@ -452,6 +452,11 @@ static const struct sfp_quirk sfp_quirks[] = { // Rollball protocol to talk to the PHY. SFP_QUIRK_F("FS", "SFP-10G-T", sfp_fixup_fs_10gt),
+ // Fiberstore GPON-ONU-34-20BI can operate at 2500base-X, but report 1.2GBd + // NRZ in their EEPROM + SFP_QUIRK("FS", "GPON-ONU-34-20BI", sfp_quirk_2500basex, + sfp_fixup_ignore_tx_fault), + SFP_QUIRK_F("HALNy", "HL-GSFP", sfp_fixup_halny_gsfp),
// HG MXPD-483II-F 2.5G supports 2500Base-X, but incorrectly reports
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ingo Rohloff lundril@gmx.de
[ Upstream commit fce9c967820a72f600abbf061d7077861685a14d ]
In the Xiaomi Redmibook 15 Pro (2023) laptop I have got, a wifi chip is used, which according to its PCI Vendor ID is from "ITTIM Technology".
This chip works flawlessly with the mt7921e module. The driver doesn't bind to this PCI device, because the Vendor ID from "ITTIM Technology" is not recognized.
This patch adds the PCI Vendor ID from "ITTIM Technology" to the list of PCI Vendor IDs and lets the mt7921e driver bind to the mentioned wifi chip.
Signed-off-by: Ingo Rohloff lundril@gmx.de Signed-off-by: Felix Fietkau nbd@nbd.name Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/mediatek/mt76/mt7921/pci.c | 2 ++ include/linux/pci_ids.h | 2 ++ 2 files changed, 4 insertions(+)
diff --git a/drivers/net/wireless/mediatek/mt76/mt7921/pci.c b/drivers/net/wireless/mediatek/mt76/mt7921/pci.c index 95610a117d2f0..ed5a220763ce6 100644 --- a/drivers/net/wireless/mediatek/mt76/mt7921/pci.c +++ b/drivers/net/wireless/mediatek/mt76/mt7921/pci.c @@ -17,6 +17,8 @@ static const struct pci_device_id mt7921_pci_device_table[] = { .driver_data = (kernel_ulong_t)MT7921_FIRMWARE_WM }, { PCI_DEVICE(PCI_VENDOR_ID_MEDIATEK, 0x7922), .driver_data = (kernel_ulong_t)MT7922_FIRMWARE_WM }, + { PCI_DEVICE(PCI_VENDOR_ID_ITTIM, 0x7922), + .driver_data = (kernel_ulong_t)MT7922_FIRMWARE_WM }, { PCI_DEVICE(PCI_VENDOR_ID_MEDIATEK, 0x0608), .driver_data = (kernel_ulong_t)MT7921_FIRMWARE_WM }, { PCI_DEVICE(PCI_VENDOR_ID_MEDIATEK, 0x0616), diff --git a/include/linux/pci_ids.h b/include/linux/pci_ids.h index 7702f078ef4ad..54bc1ca7b66fc 100644 --- a/include/linux/pci_ids.h +++ b/include/linux/pci_ids.h @@ -180,6 +180,8 @@ #define PCI_DEVICE_ID_BERKOM_A4T 0xffa4 #define PCI_DEVICE_ID_BERKOM_SCITEL_QUADRO 0xffa8
+#define PCI_VENDOR_ID_ITTIM 0x0b48 + #define PCI_VENDOR_ID_COMPAQ 0x0e11 #define PCI_DEVICE_ID_COMPAQ_TOKENRING 0x0508 #define PCI_DEVICE_ID_COMPAQ_TACHYON 0xa0fc
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Dumazet edumazet@google.com
[ Upstream commit 0bb4d124d34044179b42a769a0c76f389ae973b6 ]
This field can be read or written without socket lock being held.
Add annotations to avoid load-store tearing.
Signed-off-by: Eric Dumazet edumazet@google.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- include/net/sock.h | 20 ++++++++++++++++---- 1 file changed, 16 insertions(+), 4 deletions(-)
diff --git a/include/net/sock.h b/include/net/sock.h index fc189910e63fc..df19f99d26357 100644 --- a/include/net/sock.h +++ b/include/net/sock.h @@ -2006,21 +2006,33 @@ static inline void sk_tx_queue_set(struct sock *sk, int tx_queue) /* sk_tx_queue_mapping accept only upto a 16-bit value */ if (WARN_ON_ONCE((unsigned short)tx_queue >= USHRT_MAX)) return; - sk->sk_tx_queue_mapping = tx_queue; + /* Paired with READ_ONCE() in sk_tx_queue_get() and + * other WRITE_ONCE() because socket lock might be not held. + */ + WRITE_ONCE(sk->sk_tx_queue_mapping, tx_queue); }
#define NO_QUEUE_MAPPING USHRT_MAX
static inline void sk_tx_queue_clear(struct sock *sk) { - sk->sk_tx_queue_mapping = NO_QUEUE_MAPPING; + /* Paired with READ_ONCE() in sk_tx_queue_get() and + * other WRITE_ONCE() because socket lock might be not held. + */ + WRITE_ONCE(sk->sk_tx_queue_mapping, NO_QUEUE_MAPPING); }
static inline int sk_tx_queue_get(const struct sock *sk) { - if (sk && sk->sk_tx_queue_mapping != NO_QUEUE_MAPPING) - return sk->sk_tx_queue_mapping; + if (sk) { + /* Paired with WRITE_ONCE() in sk_tx_queue_clear() + * and sk_tx_queue_set(). + */ + int val = READ_ONCE(sk->sk_tx_queue_mapping);
+ if (val != NO_QUEUE_MAPPING) + return val; + } return -1; }
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Dumazet edumazet@google.com
[ Upstream commit eb44ad4e635132754bfbcb18103f1dcb7058aedd ]
This field can be read or written without socket lock being held.
Add annotations to avoid load-store tearing.
Signed-off-by: Eric Dumazet edumazet@google.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- include/net/sock.h | 6 +++--- net/core/sock.c | 2 +- net/ipv4/tcp_output.c | 2 +- 3 files changed, 5 insertions(+), 5 deletions(-)
diff --git a/include/net/sock.h b/include/net/sock.h index df19f99d26357..b9f0ef4bb527a 100644 --- a/include/net/sock.h +++ b/include/net/sock.h @@ -2181,7 +2181,7 @@ static inline void __dst_negative_advice(struct sock *sk) if (ndst != dst) { rcu_assign_pointer(sk->sk_dst_cache, ndst); sk_tx_queue_clear(sk); - sk->sk_dst_pending_confirm = 0; + WRITE_ONCE(sk->sk_dst_pending_confirm, 0); } } } @@ -2198,7 +2198,7 @@ __sk_dst_set(struct sock *sk, struct dst_entry *dst) struct dst_entry *old_dst;
sk_tx_queue_clear(sk); - sk->sk_dst_pending_confirm = 0; + WRITE_ONCE(sk->sk_dst_pending_confirm, 0); old_dst = rcu_dereference_protected(sk->sk_dst_cache, lockdep_sock_is_held(sk)); rcu_assign_pointer(sk->sk_dst_cache, dst); @@ -2211,7 +2211,7 @@ sk_dst_set(struct sock *sk, struct dst_entry *dst) struct dst_entry *old_dst;
sk_tx_queue_clear(sk); - sk->sk_dst_pending_confirm = 0; + WRITE_ONCE(sk->sk_dst_pending_confirm, 0); old_dst = xchg((__force struct dst_entry **)&sk->sk_dst_cache, dst); dst_release(old_dst); } diff --git a/net/core/sock.c b/net/core/sock.c index eef27812013a4..6df04c705200a 100644 --- a/net/core/sock.c +++ b/net/core/sock.c @@ -600,7 +600,7 @@ struct dst_entry *__sk_dst_check(struct sock *sk, u32 cookie) INDIRECT_CALL_INET(dst->ops->check, ip6_dst_check, ipv4_dst_check, dst, cookie) == NULL) { sk_tx_queue_clear(sk); - sk->sk_dst_pending_confirm = 0; + WRITE_ONCE(sk->sk_dst_pending_confirm, 0); RCU_INIT_POINTER(sk->sk_dst_cache, NULL); dst_release(dst); return NULL; diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c index afa819eede6a3..c2403fea8ec9a 100644 --- a/net/ipv4/tcp_output.c +++ b/net/ipv4/tcp_output.c @@ -1316,7 +1316,7 @@ static int __tcp_transmit_skb(struct sock *sk, struct sk_buff *skb, skb->destructor = skb_is_tcp_pure_ack(skb) ? __sock_wfree : tcp_wfree; refcount_add(skb->truesize, &sk->sk_wmem_alloc);
- skb_set_dst_pending_confirm(skb, sk->sk_dst_pending_confirm); + skb_set_dst_pending_confirm(skb, READ_ONCE(sk->sk_dst_pending_confirm));
/* Build TCP header and checksum it. */ th = (struct tcphdr *)skb->data;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ma Ke make_ruc2021@163.com
[ Upstream commit 47c27aa7ded4b8ead19b3487cc42a6185b762903 ]
mhi_alloc_controller() allocates a memory space for mhi_ctrl. When some errors occur, mhi_ctrl should be freed by mhi_free_controller() and set ab_pci->mhi_ctrl = NULL.
We can fix it by calling mhi_free_controller() when the failure happens and set ab_pci->mhi_ctrl = NULL in all of the places where we call mhi_free_controller().
Signed-off-by: Ma Ke make_ruc2021@163.com Signed-off-by: Kalle Valo quic_kvalo@quicinc.com Link: https://lore.kernel.org/r/20230922021036.3604157-1-make_ruc2021@163.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/ath/ath12k/mhi.c | 11 +++++++---- 1 file changed, 7 insertions(+), 4 deletions(-)
diff --git a/drivers/net/wireless/ath/ath12k/mhi.c b/drivers/net/wireless/ath/ath12k/mhi.c index 42f1140baa4fe..f83d3e09ae366 100644 --- a/drivers/net/wireless/ath/ath12k/mhi.c +++ b/drivers/net/wireless/ath/ath12k/mhi.c @@ -370,8 +370,7 @@ int ath12k_mhi_register(struct ath12k_pci *ab_pci) ret = ath12k_mhi_get_msi(ab_pci); if (ret) { ath12k_err(ab, "failed to get msi for mhi\n"); - mhi_free_controller(mhi_ctrl); - return ret; + goto free_controller; }
mhi_ctrl->iova_start = 0; @@ -388,11 +387,15 @@ int ath12k_mhi_register(struct ath12k_pci *ab_pci) ret = mhi_register_controller(mhi_ctrl, ab->hw_params->mhi_config); if (ret) { ath12k_err(ab, "failed to register to mhi bus, err = %d\n", ret); - mhi_free_controller(mhi_ctrl); - return ret; + goto free_controller; }
return 0; + +free_controller: + mhi_free_controller(mhi_ctrl); + ab_pci->mhi_ctrl = NULL; + return ret; }
void ath12k_mhi_unregister(struct ath12k_pci *ab_pci)
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Douglas Anderson dianders@chromium.org
[ Upstream commit 170c75d43a77dc937c58f07ecf847ba1b42ab74e ]
As talked about in commit d66d24ac300c ("ath10k: Keep track of which interrupts fired, don't poll them"), if we access the copy engine register at a bad time then ath10k can go boom. However, it's not necessarily easy to know when it's safe to access them.
The ChromeOS test labs saw a crash that looked like this at shutdown/reboot time (on a chromeos-5.15 kernel, but likely the problem could also reproduce upstream):
Internal error: synchronous external abort: 96000010 [#1] PREEMPT SMP ... CPU: 4 PID: 6168 Comm: reboot Not tainted 5.15.111-lockdep-19350-g1d624fe6758f #1 010b9b233ab055c27c6dc88efb0be2f4e9e86f51 Hardware name: Google Kingoftown (DT) ... pc : ath10k_snoc_read32+0x50/0x74 [ath10k_snoc] lr : ath10k_snoc_read32+0x24/0x74 [ath10k_snoc] ... Call trace: ath10k_snoc_read32+0x50/0x74 [ath10k_snoc ...] ath10k_ce_disable_interrupt+0x190/0x65c [ath10k_core ...] ath10k_ce_disable_interrupts+0x8c/0x120 [ath10k_core ...] ath10k_snoc_hif_stop+0x78/0x660 [ath10k_snoc ...] ath10k_core_stop+0x13c/0x1ec [ath10k_core ...] ath10k_halt+0x398/0x5b0 [ath10k_core ...] ath10k_stop+0xfc/0x1a8 [ath10k_core ...] drv_stop+0x148/0x6b4 [mac80211 ...] ieee80211_stop_device+0x70/0x80 [mac80211 ...] ieee80211_do_stop+0x10d8/0x15b0 [mac80211 ...] ieee80211_stop+0x144/0x1a0 [mac80211 ...] __dev_close_many+0x1e8/0x2c0 dev_close_many+0x198/0x33c dev_close+0x140/0x210 cfg80211_shutdown_all_interfaces+0xc8/0x1e0 [cfg80211 ...] ieee80211_remove_interfaces+0x118/0x5c4 [mac80211 ...] ieee80211_unregister_hw+0x64/0x1f4 [mac80211 ...] ath10k_mac_unregister+0x4c/0xf0 [ath10k_core ...] ath10k_core_unregister+0x80/0xb0 [ath10k_core ...] ath10k_snoc_free_resources+0xb8/0x1ec [ath10k_snoc ...] ath10k_snoc_shutdown+0x98/0xd0 [ath10k_snoc ...] platform_shutdown+0x7c/0xa0 device_shutdown+0x3e0/0x58c kernel_restart_prepare+0x68/0xa0 kernel_restart+0x28/0x7c
Though there's no known way to reproduce the problem, it makes sense that it would be the same issue where we're trying to access copy engine registers when it's not allowed.
Let's fix this by changing how we "disable" the interrupts. Instead of tweaking the copy engine registers we'll just use disable_irq() and enable_irq(). Then we'll configure the interrupts once at power up time.
Tested-on: WCN3990 hw1.0 SNOC WLAN.HL.3.2.2.c10-00754-QCAHLSWMTPL-1
Signed-off-by: Douglas Anderson dianders@chromium.org Signed-off-by: Kalle Valo quic_kvalo@quicinc.com Link: https://lore.kernel.org/r/20230630151842.1.If764ede23c4e09a43a842771c2ddf996... Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/ath/ath10k/snoc.c | 18 ++++++++++++++---- 1 file changed, 14 insertions(+), 4 deletions(-)
diff --git a/drivers/net/wireless/ath/ath10k/snoc.c b/drivers/net/wireless/ath/ath10k/snoc.c index 26214c00cd0d7..2c39bad7ebfb9 100644 --- a/drivers/net/wireless/ath/ath10k/snoc.c +++ b/drivers/net/wireless/ath/ath10k/snoc.c @@ -828,12 +828,20 @@ static void ath10k_snoc_hif_get_default_pipe(struct ath10k *ar,
static inline void ath10k_snoc_irq_disable(struct ath10k *ar) { - ath10k_ce_disable_interrupts(ar); + struct ath10k_snoc *ar_snoc = ath10k_snoc_priv(ar); + int id; + + for (id = 0; id < CE_COUNT_MAX; id++) + disable_irq(ar_snoc->ce_irqs[id].irq_line); }
static inline void ath10k_snoc_irq_enable(struct ath10k *ar) { - ath10k_ce_enable_interrupts(ar); + struct ath10k_snoc *ar_snoc = ath10k_snoc_priv(ar); + int id; + + for (id = 0; id < CE_COUNT_MAX; id++) + enable_irq(ar_snoc->ce_irqs[id].irq_line); }
static void ath10k_snoc_rx_pipe_cleanup(struct ath10k_snoc_pipe *snoc_pipe) @@ -1090,6 +1098,8 @@ static int ath10k_snoc_hif_power_up(struct ath10k *ar, goto err_free_rri; }
+ ath10k_ce_enable_interrupts(ar); + return 0;
err_free_rri: @@ -1253,8 +1263,8 @@ static int ath10k_snoc_request_irq(struct ath10k *ar)
for (id = 0; id < CE_COUNT_MAX; id++) { ret = request_irq(ar_snoc->ce_irqs[id].irq_line, - ath10k_snoc_per_engine_handler, 0, - ce_name[id], ar); + ath10k_snoc_per_engine_handler, + IRQF_NO_AUTOEN, ce_name[id], ar); if (ret) { ath10k_err(ar, "failed to register IRQ handler for CE %d: %d\n",
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Raju Lakkaraju Raju.Lakkaraju@microchip.com
[ Upstream commit e27aca3760c08b7b05aea71068bd609aa93e7b35 ]
Add a quirk for a copper SFP that identifies itself as "FS" "SFP-2.5G-T". This module's PHY is inaccessible, and can only run at 2500base-X with the host without negotiation. Add a quirk to enable the 2500base-X interface mode with 2500base-T support and disable auto negotiation.
Signed-off-by: Raju Lakkaraju Raju.Lakkaraju@microchip.com Link: https://lore.kernel.org/r/20230925080059.266240-1-Raju.Lakkaraju@microchip.c... Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/phy/sfp.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/net/phy/sfp.c b/drivers/net/phy/sfp.c index 338b9769d91a1..f411ded5344a8 100644 --- a/drivers/net/phy/sfp.c +++ b/drivers/net/phy/sfp.c @@ -468,6 +468,9 @@ static const struct sfp_quirk sfp_quirks[] = { SFP_QUIRK("HUAWEI", "MA5671A", sfp_quirk_2500basex, sfp_fixup_ignore_tx_fault),
+ // FS 2.5G Base-T + SFP_QUIRK_M("FS", "SFP-2.5G-T", sfp_quirk_oem_2_5g), + // Lantech 8330-262D-E can operate at 2500base-X, but incorrectly report // 2500MBd NRZ in their EEPROM SFP_QUIRK_M("Lantech", "8330-262D-E", sfp_quirk_2500basex),
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Arseniy Krasnov avkrasnov@salutedevices.com
[ Upstream commit 49dbe25adac42d3e06f65d1420946bec65896222 ]
This adds handling of MSG_ERRQUEUE input flag in receive call. This flag is used to read socket's error queue instead of data queue. Possible scenario of error queue usage is receiving completions for transmission with MSG_ZEROCOPY flag. This patch also adds new defines: 'SOL_VSOCK' and 'VSOCK_RECVERR'.
Signed-off-by: Arseniy Krasnov avkrasnov@salutedevices.com Reviewed-by: Stefano Garzarella sgarzare@redhat.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/socket.h | 1 + include/uapi/linux/vm_sockets.h | 17 +++++++++++++++++ net/vmw_vsock/af_vsock.c | 6 ++++++ 3 files changed, 24 insertions(+)
diff --git a/include/linux/socket.h b/include/linux/socket.h index 39b74d83c7c4a..cfcb7e2c3813f 100644 --- a/include/linux/socket.h +++ b/include/linux/socket.h @@ -383,6 +383,7 @@ struct ucred { #define SOL_MPTCP 284 #define SOL_MCTP 285 #define SOL_SMC 286 +#define SOL_VSOCK 287
/* IPX options */ #define IPX_TYPE 1 diff --git a/include/uapi/linux/vm_sockets.h b/include/uapi/linux/vm_sockets.h index c60ca33eac594..ed07181d4eff9 100644 --- a/include/uapi/linux/vm_sockets.h +++ b/include/uapi/linux/vm_sockets.h @@ -191,4 +191,21 @@ struct sockaddr_vm {
#define IOCTL_VM_SOCKETS_GET_LOCAL_CID _IO(7, 0xb9)
+/* MSG_ZEROCOPY notifications are encoded in the standard error format, + * sock_extended_err. See Documentation/networking/msg_zerocopy.rst in + * kernel source tree for more details. + */ + +/* 'cmsg_level' field value of 'struct cmsghdr' for notification parsing + * when MSG_ZEROCOPY flag is used on transmissions. + */ + +#define SOL_VSOCK 287 + +/* 'cmsg_type' field value of 'struct cmsghdr' for notification parsing + * when MSG_ZEROCOPY flag is used on transmissions. + */ + +#define VSOCK_RECVERR 1 + #endif /* _UAPI_VM_SOCKETS_H */ diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c index 020cf17ab7e47..ccd8cefeea7ba 100644 --- a/net/vmw_vsock/af_vsock.c +++ b/net/vmw_vsock/af_vsock.c @@ -89,6 +89,7 @@ #include <linux/types.h> #include <linux/bitops.h> #include <linux/cred.h> +#include <linux/errqueue.h> #include <linux/init.h> #include <linux/io.h> #include <linux/kernel.h> @@ -110,6 +111,7 @@ #include <linux/workqueue.h> #include <net/sock.h> #include <net/af_vsock.h> +#include <uapi/linux/vm_sockets.h>
static int __vsock_bind(struct sock *sk, struct sockaddr_vm *addr); static void vsock_sk_destruct(struct sock *sk); @@ -2134,6 +2136,10 @@ vsock_connectible_recvmsg(struct socket *sock, struct msghdr *msg, size_t len, int err;
sk = sock->sk; + + if (unlikely(flags & MSG_ERRQUEUE)) + return sock_recv_errqueue(sk, msg, len, SOL_VSOCK, VSOCK_RECVERR); + vsk = vsock_sk(sk); err = 0;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Andrii Nakryiko andrii@kernel.org
[ Upstream commit 1a8a315f008a58f54fecb012b928aa6a494435b3 ]
Verifier emits relevant register state involved in any given instruction next to it after `;` to the right, if possible. Or, worst case, on the separate line repeating instruction index.
E.g., a nice and simple case would be:
2: (d5) if r0 s<= 0x0 goto pc+1 ; R0_w=0
But if there is some intervening extra output (e.g., precision backtracking log) involved, we are supposed to see the state after the precision backtrack log:
4: (75) if r0 s>= 0x0 goto pc+1 mark_precise: frame0: last_idx 4 first_idx 0 subseq_idx -1 mark_precise: frame0: regs=r0 stack= before 2: (d5) if r0 s<= 0x0 goto pc+1 mark_precise: frame0: regs=r0 stack= before 1: (b7) r0 = 0 6: R0_w=0
First off, note that in `6: R0_w=0` instruction index corresponds to the next instruction, not to the conditional jump instruction itself, which is wrong and we'll get to that.
But besides that, the above is a happy case that does work today. Yet, if it so happens that precision backtracking had to traverse some of the parent states, this `6: R0_w=0` state output would be missing.
This is due to a quirk of print_verifier_state() routine, which performs mark_verifier_state_clean(env) at the end. This marks all registers as "non-scratched", which means that subsequent logic to print *relevant* registers (that is, "scratched ones") fails and doesn't see anything relevant to print and skips the output altogether.
print_verifier_state() is used both to print instruction context, but also to print an **entire** verifier state indiscriminately, e.g., during precision backtracking (and in a few other situations, like during entering or exiting subprogram). Which means if we have to print entire parent state before getting to printing instruction context state, instruction context is marked as clean and is omitted.
Long story short, this is definitely not intentional. So we fix this behavior in this patch by teaching print_verifier_state() to clear scratch state only if it was used to print instruction state, not the parent/callback state. This is determined by print_all option, so if it's not set, we don't clear scratch state. This fixes missing instruction state for these cases.
As for the mismatched instruction index, we fix that by making sure we call print_insn_state() early inside check_cond_jmp_op() before we adjusted insn_idx based on jump branch taken logic. And with that we get desired correct information:
9: (16) if w4 == 0x1 goto pc+9 mark_precise: frame0: last_idx 9 first_idx 9 subseq_idx -1 mark_precise: frame0: parent state regs=r4 stack=: R2_w=1944 R4_rw=P1 R10=fp0 mark_precise: frame0: last_idx 8 first_idx 0 subseq_idx 9 mark_precise: frame0: regs=r4 stack= before 8: (66) if w4 s> 0x3 goto pc+5 mark_precise: frame0: regs=r4 stack= before 7: (b7) r4 = 1 9: R4=1
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Daniel Borkmann daniel@iogearbox.net Acked-by: John Fastabend john.fastabend@gmail.com Acked-by: Eduard Zingerman eddyz87@gmail.com Link: https://lore.kernel.org/bpf/20231011223728.3188086-6-andrii@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/bpf/verifier.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index e7e2687c35884..c3b6f282128bf 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -1513,7 +1513,8 @@ static void print_verifier_state(struct bpf_verifier_env *env, if (state->in_async_callback_fn) verbose(env, " async_cb"); verbose(env, "\n"); - mark_verifier_state_clean(env); + if (!print_all) + mark_verifier_state_clean(env); }
static inline u32 vlog_alignment(u32 pos) @@ -13915,6 +13916,8 @@ static int check_cond_jmp_op(struct bpf_verifier_env *env, !sanitize_speculative_path(env, insn, *insn_idx + 1, *insn_idx)) return -EFAULT; + if (env->log.level & BPF_LOG_LEVEL) + print_insn_state(env, this_branch->frame[this_branch->curframe]); *insn_idx += insn->off; return 0; } else if (pred == 0) { @@ -13927,6 +13930,8 @@ static int check_cond_jmp_op(struct bpf_verifier_env *env, *insn_idx + insn->off + 1, *insn_idx)) return -EFAULT; + if (env->log.level & BPF_LOG_LEVEL) + print_insn_state(env, this_branch->frame[this_branch->curframe]); return 0; }
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Gregory Greenman gregory.greenman@intel.com
[ Upstream commit e25bd1853cc8308158d97e5b3696ea3689fa0840 ]
Check that fw_link_id does not exceed the size of link_id_to_link_conf array. There's no any codepath that can cause that, but it's still safer to verify in case fw_link_id gets corrupted.
Signed-off-by: Gregory Greenman gregory.greenman@intel.com Link: https://lore.kernel.org/r/20231017115047.3385bd11f423.I2d30fdb464f951c648217... Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/intel/iwlwifi/mvm/link.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/net/wireless/intel/iwlwifi/mvm/link.c b/drivers/net/wireless/intel/iwlwifi/mvm/link.c index 6e1ad65527d12..4ab55a1fcbf04 100644 --- a/drivers/net/wireless/intel/iwlwifi/mvm/link.c +++ b/drivers/net/wireless/intel/iwlwifi/mvm/link.c @@ -60,7 +60,7 @@ int iwl_mvm_add_link(struct iwl_mvm *mvm, struct ieee80211_vif *vif, if (link_info->fw_link_id == IWL_MVM_FW_LINK_ID_INVALID) { link_info->fw_link_id = iwl_mvm_get_free_fw_link_id(mvm, mvmvif); - if (link_info->fw_link_id == IWL_MVM_FW_LINK_ID_INVALID) + if (link_info->fw_link_id >= ARRAY_SIZE(mvm->link_id_to_link_conf)) return -EINVAL;
rcu_assign_pointer(mvm->link_id_to_link_conf[link_info->fw_link_id], @@ -243,7 +243,7 @@ int iwl_mvm_remove_link(struct iwl_mvm *mvm, struct ieee80211_vif *vif, int ret;
if (WARN_ON(!link_info || - link_info->fw_link_id == IWL_MVM_FW_LINK_ID_INVALID)) + link_info->fw_link_id >= ARRAY_SIZE(mvm->link_id_to_link_conf))) return -EINVAL;
RCU_INIT_POINTER(mvm->link_id_to_link_conf[link_info->fw_link_id],
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: youwan Wang wangyouwan@126.com
[ Upstream commit 624820f7c8826dd010e8b1963303c145f99816e9 ]
fix crash because of null pointers
[ 6104.969662] BUG: kernel NULL pointer dereference, address: 00000000000000c8 [ 6104.969667] #PF: supervisor read access in kernel mode [ 6104.969668] #PF: error_code(0x0000) - not-present page [ 6104.969670] PGD 0 P4D 0 [ 6104.969673] Oops: 0000 [#1] SMP NOPTI [ 6104.969684] RIP: 0010:btusb_mtk_hci_wmt_sync+0x144/0x220 [btusb] [ 6104.969688] RSP: 0018:ffffb8d681533d48 EFLAGS: 00010246 [ 6104.969689] RAX: 0000000000000000 RBX: ffff8ad560bb2000 RCX: 0000000000000006 [ 6104.969691] RDX: 0000000000000000 RSI: ffffb8d681533d08 RDI: 0000000000000000 [ 6104.969692] RBP: ffffb8d681533d70 R08: 0000000000000001 R09: 0000000000000001 [ 6104.969694] R10: 0000000000000001 R11: 00000000fa83b2da R12: ffff8ad461d1d7c0 [ 6104.969695] R13: 0000000000000000 R14: ffff8ad459618c18 R15: ffffb8d681533d90 [ 6104.969697] FS: 00007f5a1cab9d40(0000) GS:ffff8ad578200000(0000) knlGS:00000 [ 6104.969699] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 6104.969700] CR2: 00000000000000c8 CR3: 000000018620c001 CR4: 0000000000760ef0 [ 6104.969701] PKRU: 55555554 [ 6104.969702] Call Trace: [ 6104.969708] btusb_mtk_shutdown+0x44/0x80 [btusb] [ 6104.969732] hci_dev_do_close+0x470/0x5c0 [bluetooth] [ 6104.969748] hci_rfkill_set_block+0x56/0xa0 [bluetooth] [ 6104.969753] rfkill_set_block+0x92/0x160 [ 6104.969755] rfkill_fop_write+0x136/0x1e0 [ 6104.969759] __vfs_write+0x18/0x40 [ 6104.969761] vfs_write+0xdf/0x1c0 [ 6104.969763] ksys_write+0xb1/0xe0 [ 6104.969765] __x64_sys_write+0x1a/0x20 [ 6104.969769] do_syscall_64+0x51/0x180 [ 6104.969771] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 6104.969773] RIP: 0033:0x7f5a21f18fef [ 6104.9] RSP: 002b:00007ffeefe39010 EFLAGS: 00000293 ORIG_RAX: 0000000000000001 [ 6104.969780] RAX: ffffffffffffffda RBX: 000055c10a7560a0 RCX: 00007f5a21f18fef [ 6104.969781] RDX: 0000000000000008 RSI: 00007ffeefe39060 RDI: 0000000000000012 [ 6104.969782] RBP: 00007ffeefe39060 R08: 0000000000000000 R09: 0000000000000017 [ 6104.969784] R10: 00007ffeefe38d97 R11: 0000000000000293 R12: 0000000000000002 [ 6104.969785] R13: 00007ffeefe39220 R14: 00007ffeefe391a0 R15: 000055c10a72acf0
Signed-off-by: youwan Wang wangyouwan@126.com Signed-off-by: Luiz Augusto von Dentz luiz.von.dentz@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/bluetooth/btusb.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/bluetooth/btusb.c b/drivers/bluetooth/btusb.c index ca9e2a210fff2..ea29469fe0cff 100644 --- a/drivers/bluetooth/btusb.c +++ b/drivers/bluetooth/btusb.c @@ -2803,6 +2803,9 @@ static int btusb_mtk_hci_wmt_sync(struct hci_dev *hdev, goto err_free_wc; }
+ if (data->evt_skb == NULL) + goto err_free_wc; + /* Parse and handle the return WMT event */ wmt_evt = (struct btmtk_hci_wmt_evt *)data->evt_skb->data; if (wmt_evt->whdr.op != hdr->op) {
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: ZhengHan Wang wzhmmmmm@gmail.com
[ Upstream commit a85fb91e3d728bdfc80833167e8162cce8bc7004 ]
syzbot reports a slab use-after-free in hci_conn_hash_flush [1]. After releasing an object using hci_conn_del_sysfs in the hci_conn_cleanup function, releasing the same object again using the hci_dev_put and hci_conn_put functions causes a double free. Here's a simplified flow:
hci_conn_del_sysfs: hci_dev_put put_device kobject_put kref_put kobject_release kobject_cleanup kfree_const kfree(name)
hci_dev_put: ... kfree(name)
hci_conn_put: put_device ... kfree(name)
This patch drop the hci_dev_put and hci_conn_put function call in hci_conn_cleanup function, because the object is freed in hci_conn_del_sysfs function.
This patch also fixes the refcounting in hci_conn_add_sysfs() and hci_conn_del_sysfs() to take into account device_add() failures.
This fixes CVE-2023-28464.
Link: https://syzkaller.appspot.com/bug?id=1bb51491ca5df96a5f724899d1dbb87afda6141... [1]
Signed-off-by: ZhengHan Wang wzhmmmmm@gmail.com Co-developed-by: Luiz Augusto von Dentz luiz.von.dentz@intel.com Signed-off-by: Luiz Augusto von Dentz luiz.von.dentz@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/bluetooth/hci_conn.c | 6 ++---- net/bluetooth/hci_sysfs.c | 23 ++++++++++++----------- 2 files changed, 14 insertions(+), 15 deletions(-)
diff --git a/net/bluetooth/hci_conn.c b/net/bluetooth/hci_conn.c index 4e03642488230..c090627ff9751 100644 --- a/net/bluetooth/hci_conn.c +++ b/net/bluetooth/hci_conn.c @@ -172,13 +172,11 @@ static void hci_conn_cleanup(struct hci_conn *conn) hdev->notify(hdev, HCI_NOTIFY_CONN_DEL); }
- hci_conn_del_sysfs(conn); - debugfs_remove_recursive(conn->debugfs);
- hci_dev_put(hdev); + hci_conn_del_sysfs(conn);
- hci_conn_put(conn); + hci_dev_put(hdev); }
static void hci_acl_create_connection(struct hci_conn *conn) diff --git a/net/bluetooth/hci_sysfs.c b/net/bluetooth/hci_sysfs.c index 15b33579007cb..367e32fe30eb8 100644 --- a/net/bluetooth/hci_sysfs.c +++ b/net/bluetooth/hci_sysfs.c @@ -35,7 +35,7 @@ void hci_conn_init_sysfs(struct hci_conn *conn) { struct hci_dev *hdev = conn->hdev;
- BT_DBG("conn %p", conn); + bt_dev_dbg(hdev, "conn %p", conn);
conn->dev.type = &bt_link; conn->dev.class = &bt_class; @@ -48,27 +48,30 @@ void hci_conn_add_sysfs(struct hci_conn *conn) { struct hci_dev *hdev = conn->hdev;
- BT_DBG("conn %p", conn); + bt_dev_dbg(hdev, "conn %p", conn);
if (device_is_registered(&conn->dev)) return;
dev_set_name(&conn->dev, "%s:%d", hdev->name, conn->handle);
- if (device_add(&conn->dev) < 0) { + if (device_add(&conn->dev) < 0) bt_dev_err(hdev, "failed to register connection device"); - return; - } - - hci_dev_hold(hdev); }
void hci_conn_del_sysfs(struct hci_conn *conn) { struct hci_dev *hdev = conn->hdev;
- if (!device_is_registered(&conn->dev)) + bt_dev_dbg(hdev, "conn %p", conn); + + if (!device_is_registered(&conn->dev)) { + /* If device_add() has *not* succeeded, use *only* put_device() + * to drop the reference count. + */ + put_device(&conn->dev); return; + }
while (1) { struct device *dev; @@ -80,9 +83,7 @@ void hci_conn_del_sysfs(struct hci_conn *conn) put_device(dev); }
- device_del(&conn->dev); - - hci_dev_put(hdev); + device_unregister(&conn->dev); }
static void bt_host_release(struct device *dev)
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jonathan Denose jdenose@chromium.org
[ Upstream commit 891ddc03e2f4395e24795596e032f57d5ab37fe7 ]
Add GPE quirk entry for HP 250 G7 Notebook PC.
This change allows the lid switch to be identified as the lid switch and not a keyboard button. With the lid switch properly identified, the device triggers suspend correctly on lid close.
Signed-off-by: Jonathan Denose jdenose@google.com Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/acpi/ec.c | 10 ++++++++++ 1 file changed, 10 insertions(+)
diff --git a/drivers/acpi/ec.c b/drivers/acpi/ec.c index c95d0edb0be9e..a59c11df73754 100644 --- a/drivers/acpi/ec.c +++ b/drivers/acpi/ec.c @@ -1924,6 +1924,16 @@ static const struct dmi_system_id ec_dmi_table[] __initconst = { DMI_MATCH(DMI_PRODUCT_NAME, "HP Pavilion Gaming Laptop 15-dk1xxx"), }, }, + { + /* + * HP 250 G7 Notebook PC + */ + .callback = ec_honor_dsdt_gpe, + .matches = { + DMI_MATCH(DMI_SYS_VENDOR, "HP"), + DMI_MATCH(DMI_PRODUCT_NAME, "HP 250 G7 Notebook PC"), + }, + }, { /* * Samsung hardware
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Gerhard Engleder gerhard@engleder-embedded.com
[ Upstream commit 00e984cb986b31e9313745e51daceaa1e1eb7351 ]
Compiler warns about a possible format-overflow in tsnep_request_irq(): drivers/net/ethernet/engleder/tsnep_main.c:884:55: warning: 'sprintf' may write a terminating nul past the end of the destination [-Wformat-overflow=] sprintf(queue->name, "%s-rx-%d", name, ^ drivers/net/ethernet/engleder/tsnep_main.c:881:55: warning: 'sprintf' may write a terminating nul past the end of the destination [-Wformat-overflow=] sprintf(queue->name, "%s-tx-%d", name, ^ drivers/net/ethernet/engleder/tsnep_main.c:878:49: warning: '-txrx-' directive writing 6 bytes into a region of size between 5 and 25 [-Wformat-overflow=] sprintf(queue->name, "%s-txrx-%d", name, ^~~~~~
Actually overflow cannot happen. Name is limited to IFNAMSIZ, because netdev_name() is called during ndo_open(). queue_index is single char, because less than 10 queues are supported.
Fix warning with snprintf(). Additionally increase buffer to 32 bytes, because those 7 additional bytes were unused anyway.
Reported-by: kernel test robot lkp@intel.com Closes: https://lore.kernel.org/oe-kbuild-all/202310182028.vmDthIUa-lkp@intel.com/ Signed-off-by: Gerhard Engleder gerhard@engleder-embedded.com Reviewed-by: Jacob Keller jacob.e.keller@intel.com Link: https://lore.kernel.org/r/20231023183856.58373-1-gerhard@engleder-embedded.c... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/engleder/tsnep.h | 2 +- drivers/net/ethernet/engleder/tsnep_main.c | 12 ++++++------ 2 files changed, 7 insertions(+), 7 deletions(-)
diff --git a/drivers/net/ethernet/engleder/tsnep.h b/drivers/net/ethernet/engleder/tsnep.h index 11b29f56aaf9c..b91abe9efb517 100644 --- a/drivers/net/ethernet/engleder/tsnep.h +++ b/drivers/net/ethernet/engleder/tsnep.h @@ -142,7 +142,7 @@ struct tsnep_rx {
struct tsnep_queue { struct tsnep_adapter *adapter; - char name[IFNAMSIZ + 9]; + char name[IFNAMSIZ + 16];
struct tsnep_tx *tx; struct tsnep_rx *rx; diff --git a/drivers/net/ethernet/engleder/tsnep_main.c b/drivers/net/ethernet/engleder/tsnep_main.c index 479156576bc8a..e3fc894fa3f6f 100644 --- a/drivers/net/ethernet/engleder/tsnep_main.c +++ b/drivers/net/ethernet/engleder/tsnep_main.c @@ -1778,14 +1778,14 @@ static int tsnep_request_irq(struct tsnep_queue *queue, bool first) dev = queue->adapter; } else { if (queue->tx && queue->rx) - sprintf(queue->name, "%s-txrx-%d", name, - queue->rx->queue_index); + snprintf(queue->name, sizeof(queue->name), "%s-txrx-%d", + name, queue->rx->queue_index); else if (queue->tx) - sprintf(queue->name, "%s-tx-%d", name, - queue->tx->queue_index); + snprintf(queue->name, sizeof(queue->name), "%s-tx-%d", + name, queue->tx->queue_index); else - sprintf(queue->name, "%s-rx-%d", name, - queue->rx->queue_index); + snprintf(queue->name, sizeof(queue->name), "%s-rx-%d", + name, queue->rx->queue_index); handler = tsnep_irq_txrx; dev = queue; }
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Hans de Goede hdegoede@redhat.com
[ Upstream commit 6cc64f6173751d212c9833bde39e856b4f585a3e ]
On the Peaq C1010 2-in-1 INT33FC:00 pin 3 is connected to a "dolby" button. At the ACPI level an _AEI event-handler is connected which sets an ACPI variable to 1 on both edges. This variable can be polled + cleared to 0 using WMI.
Since the variable is set on both edges the WMI interface is pretty useless even when polling. So instead of writing a custom WMI driver for this the x86-android-tablets code instantiates a gpio-keys platform device for the "dolby" button.
Add an ignore_interrupt quirk for INT33FC:00 pin 3 on the Peaq C1010, so that it is not seen as busy when the gpio-keys driver requests it.
Note this replaces a hack in x86-android-tablets where it would call acpi_gpiochip_free_interrupts() on the INT33FC:00 GPIO controller. acpi_gpiochip_free_interrupts() is considered private (internal) gpiolib API so x86-android-tablets should stop using it.
Signed-off-by: Hans de Goede hdegoede@redhat.com Reviewed-by: Andy Shevchenko andy.shevchenko@gmail.com Reviewed-by: Mika Westerberg mika.westerberg@linux.intel.com Acked-by: Linus Walleij linus.walleij@linaro.org Acked-by: Bartosz Golaszewski bartosz.golaszewski@linaro.org Link: https://lore.kernel.org/r/20230909141816.58358-3-hdegoede@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpio/gpiolib-acpi.c | 20 ++++++++++++++++++++ 1 file changed, 20 insertions(+)
diff --git a/drivers/gpio/gpiolib-acpi.c b/drivers/gpio/gpiolib-acpi.c index a775d2bdac94f..980ec04892173 100644 --- a/drivers/gpio/gpiolib-acpi.c +++ b/drivers/gpio/gpiolib-acpi.c @@ -1655,6 +1655,26 @@ static const struct dmi_system_id gpiolib_acpi_quirks[] __initconst = { .ignore_wake = "SYNA1202:00@16", }, }, + { + /* + * On the Peaq C1010 2-in-1 INT33FC:00 pin 3 is connected to + * a "dolby" button. At the ACPI level an _AEI event-handler + * is connected which sets an ACPI variable to 1 on both + * edges. This variable can be polled + cleared to 0 using + * WMI. But since the variable is set on both edges the WMI + * interface is pretty useless even when polling. + * So instead the x86-android-tablets code instantiates + * a gpio-keys platform device for it. + * Ignore the _AEI handler for the pin, so that it is not busy. + */ + .matches = { + DMI_MATCH(DMI_SYS_VENDOR, "PEAQ"), + DMI_MATCH(DMI_PRODUCT_NAME, "PEAQ PMM C1010 MD99187"), + }, + .driver_data = &(struct acpi_gpiolib_dmi_quirk) { + .ignore_interrupt = "INT33FC:00@3", + }, + }, {} /* Terminating entry */ };
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tzung-Bi Shih tzungbi@kernel.org
[ Upstream commit e410b4ade83d06a046f6e32b5085997502ba0559 ]
cros_ec_cmd_xfer() uses ec_dev->lock. Initialize it.
Otherwise, dmesg shows the following:
DEBUG_LOCKS_WARN_ON(lock->magic != lock) ... Call Trace: ? __mutex_lock ? __warn ? __mutex_lock ... ? cros_ec_cmd_xfer
Reviewed-by: Guenter Roeck groeck@chromium.org Link: https://lore.kernel.org/r/20231003080504.4011337-1-tzungbi@kernel.org Signed-off-by: Tzung-Bi Shih tzungbi@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/platform/chrome/cros_ec_proto_test.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/platform/chrome/cros_ec_proto_test.c b/drivers/platform/chrome/cros_ec_proto_test.c index 5b9748e0463bc..63e38671e95a6 100644 --- a/drivers/platform/chrome/cros_ec_proto_test.c +++ b/drivers/platform/chrome/cros_ec_proto_test.c @@ -2668,6 +2668,7 @@ static int cros_ec_proto_test_init(struct kunit *test) ec_dev->dev->release = cros_ec_proto_test_release; ec_dev->cmd_xfer = cros_kunit_ec_xfer_mock; ec_dev->pkt_xfer = cros_kunit_ec_xfer_mock; + mutex_init(&ec_dev->lock);
priv->msg = (struct cros_ec_command *)priv->_msg;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Herve Codina herve.codina@bootlin.com
[ Upstream commit 42604f8eb7ba04b589375049cc76282dad4677d2 ]
With the recent addition of of_pci_prop_ranges() in commit 407d1a51921e ("PCI: Create device tree node for bridge"), the ranges property can have a 3 cells child address, a 3 cells parent address and a 2 cells child size.
A range item property for a PCI device is filled as follow: <BAR_nbr> 0 0 <phys.hi> <phys.mid> <phys.low> <BAR_sizeh> <BAR_sizel> <-- Child --> <-- Parent (PCI definition) --> <- BAR size (64bit) -->
This allow to translate BAR addresses from the DT. For instance: pci@0,0 { #address-cells = <0x03>; #size-cells = <0x02>; device_type = "pci"; compatible = "pci11ab,100", "pciclass,060400", "pciclass,0604"; ranges = <0x82000000 0x00 0xe8000000 0x82000000 0x00 0xe8000000 0x00 0x4400000>; ... dev@0,0 { #address-cells = <0x03>; #size-cells = <0x02>; compatible = "pci1055,9660", "pciclass,020000", "pciclass,0200"; /* Translations for BAR0 to BAR5 */ ranges = <0x00 0x00 0x00 0x82010000 0x00 0xe8000000 0x00 0x2000000 0x01 0x00 0x00 0x82010000 0x00 0xea000000 0x00 0x1000000 0x02 0x00 0x00 0x82010000 0x00 0xeb000000 0x00 0x800000 0x03 0x00 0x00 0x82010000 0x00 0xeb800000 0x00 0x800000 0x04 0x00 0x00 0x82010000 0x00 0xec000000 0x00 0x20000 0x05 0x00 0x00 0x82010000 0x00 0xec020000 0x00 0x2000>; ... pci-ep-bus@0 { #address-cells = <0x01>; #size-cells = <0x01>; compatible = "simple-bus"; /* Translate 0xe2000000 to BAR0 and 0xe0000000 to BAR1 */ ranges = <0xe2000000 0x00 0x00 0x00 0x2000000 0xe0000000 0x01 0x00 0x00 0x1000000>; ... }; }; };
During the translation process, the "default-flags" map() function is used to select the matching item in the ranges table and determine the address offset from this matching item. This map() function simply calls of_read_number() and when address-size is greater than 2, the map() function skips the extra high address part (ie part over 64bit). This lead to a wrong matching item and a wrong offset computation. Also during the translation itself, the extra high part related to the parent address is not present in the translated address.
Fix the "default-flags" map() and translate() in order to take into account the child extra high address part in map() and the parent extra high address part in translate() and so having a correct address translation for ranges patterns such as the one given in the example above.
Signed-off-by: Herve Codina herve.codina@bootlin.com Link: https://lore.kernel.org/r/20231017110221.189299-2-herve.codina@bootlin.com Signed-off-by: Rob Herring robh@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/of/address.c | 30 ++++++++++++++++++++++++++++-- 1 file changed, 28 insertions(+), 2 deletions(-)
diff --git a/drivers/of/address.c b/drivers/of/address.c index e692809ff8227..3219c51777507 100644 --- a/drivers/of/address.c +++ b/drivers/of/address.c @@ -100,6 +100,32 @@ static unsigned int of_bus_default_get_flags(const __be32 *addr) return IORESOURCE_MEM; }
+static u64 of_bus_default_flags_map(__be32 *addr, const __be32 *range, int na, + int ns, int pna) +{ + u64 cp, s, da; + + /* Check that flags match */ + if (*addr != *range) + return OF_BAD_ADDR; + + /* Read address values, skipping high cell */ + cp = of_read_number(range + 1, na - 1); + s = of_read_number(range + na + pna, ns); + da = of_read_number(addr + 1, na - 1); + + pr_debug("default flags map, cp=%llx, s=%llx, da=%llx\n", cp, s, da); + + if (da < cp || da >= (cp + s)) + return OF_BAD_ADDR; + return da - cp; +} + +static int of_bus_default_flags_translate(__be32 *addr, u64 offset, int na) +{ + /* Keep "flags" part (high cell) in translated address */ + return of_bus_default_translate(addr + 1, offset, na - 1); +}
#ifdef CONFIG_PCI static unsigned int of_bus_pci_get_flags(const __be32 *addr) @@ -374,8 +400,8 @@ static struct of_bus of_busses[] = { .addresses = "reg", .match = of_bus_default_flags_match, .count_cells = of_bus_default_count_cells, - .map = of_bus_default_map, - .translate = of_bus_default_translate, + .map = of_bus_default_flags_map, + .translate = of_bus_default_flags_translate, .has_flags = true, .get_flags = of_bus_default_flags_get_flags, },
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Olli Asikainen olli.asikainen@gmail.com
[ Upstream commit 916646758aea81a143ce89103910f715ed923346 ]
Thinkpad X120e also needs this battery quirk.
Signed-off-by: Olli Asikainen olli.asikainen@gmail.com Link: https://lore.kernel.org/r/20231024190922.2742-1-olli.asikainen@gmail.com Reviewed-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com Signed-off-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/platform/x86/thinkpad_acpi.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/platform/x86/thinkpad_acpi.c b/drivers/platform/x86/thinkpad_acpi.c index ad460417f901a..4b13d3e704bf3 100644 --- a/drivers/platform/x86/thinkpad_acpi.c +++ b/drivers/platform/x86/thinkpad_acpi.c @@ -9810,6 +9810,7 @@ static const struct tpacpi_quirk battery_quirk_table[] __initconst = { * Individual addressing is broken on models that expose the * primary battery as BAT1. */ + TPACPI_Q_LNV('8', 'F', true), /* Thinkpad X120e */ TPACPI_Q_LNV('J', '7', true), /* B5400 */ TPACPI_Q_LNV('J', 'I', true), /* Thinkpad 11e */ TPACPI_Q_LNV3('R', '0', 'B', true), /* Thinkpad 11e gen 3 */
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sui Jingfeng suijingfeng@loongson.cn
[ Upstream commit da596080b2b400c50fe9f8f237bcaf09fed06af8 ]
Because the gma_irq_install() is call after psb_gem_mm_init() function, when psb_gem_mm_init() fails, the interrupt line haven't been allocated. Yet the gma_irq_uninstall() is called in the psb_driver_unload() function without checking if checking the irq is registered or not.
The calltrace is appended as following:
[ 20.539253] ioremap memtype_reserve failed -16 [ 20.543895] gma500 0000:00:02.0: Failure to map stolen base. [ 20.565049] ------------[ cut here ]------------ [ 20.565066] Trying to free already-free IRQ 16 [ 20.565087] WARNING: CPU: 1 PID: 381 at kernel/irq/manage.c:1893 free_irq+0x209/0x370 [ 20.565316] CPU: 1 PID: 381 Comm: systemd-udevd Tainted: G C 6.5.0-rc1+ #368 [ 20.565329] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./IMB-140D Plus, BIOS P1.10 11/18/2013 [ 20.565338] RIP: 0010:free_irq+0x209/0x370 [ 20.565357] Code: 41 5d 41 5e 41 5f 5d 31 d2 89 d1 89 d6 89 d7 41 89 d1 c3 cc cc cc cc 8b 75 d0 48 c7 c7 e0 77 12 9f 4c 89 4d c8 e8 57 fe f4 ff <0f> 0b 48 8b 75 c8 4c 89 f7 e8 29 f3 f1 00 49 8b 47 40 48 8b 40 78 [ 20.565369] RSP: 0018:ffffae3b40733808 EFLAGS: 00010046 [ 20.565382] RAX: 0000000000000000 RBX: ffff9f8082bfe000 RCX: 0000000000000000 [ 20.565390] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 [ 20.565397] RBP: ffffae3b40733840 R08: 0000000000000000 R09: 0000000000000000 [ 20.565405] R10: 0000000000000000 R11: 0000000000000000 R12: ffff9f80871c3100 [ 20.565413] R13: ffff9f80835d3360 R14: ffff9f80835d32a4 R15: ffff9f80835d3200 [ 20.565424] FS: 00007f13d36458c0(0000) GS:ffff9f8138880000(0000) knlGS:0000000000000000 [ 20.565434] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 20.565441] CR2: 00007f0d046f3f20 CR3: 0000000006c8c000 CR4: 00000000000006e0 [ 20.565450] Call Trace: [ 20.565458] <TASK> [ 20.565470] ? show_regs+0x72/0x90 [ 20.565488] ? free_irq+0x209/0x370 [ 20.565504] ? __warn+0x8d/0x160 [ 20.565520] ? free_irq+0x209/0x370 [ 20.565536] ? report_bug+0x1bb/0x1d0 [ 20.565555] ? handle_bug+0x46/0x90 [ 20.565572] ? exc_invalid_op+0x19/0x80 [ 20.565587] ? asm_exc_invalid_op+0x1b/0x20 [ 20.565607] ? free_irq+0x209/0x370 [ 20.565625] ? free_irq+0x209/0x370 [ 20.565644] gma_irq_uninstall+0x15b/0x1e0 [gma500_gfx] [ 20.565728] psb_driver_unload+0x27/0x190 [gma500_gfx] [ 20.565800] psb_pci_probe+0x5d2/0x790 [gma500_gfx] [ 20.565873] local_pci_probe+0x48/0xb0 [ 20.565892] pci_device_probe+0xc8/0x280 [ 20.565912] really_probe+0x1d2/0x440 [ 20.565929] __driver_probe_device+0x8a/0x190 [ 20.565944] driver_probe_device+0x23/0xd0 [ 20.565957] __driver_attach+0x10f/0x220 [ 20.565971] ? __pfx___driver_attach+0x10/0x10 [ 20.565984] bus_for_each_dev+0x7a/0xe0 [ 20.566002] driver_attach+0x1e/0x30 [ 20.566014] bus_add_driver+0x127/0x240 [ 20.566029] driver_register+0x64/0x140 [ 20.566043] ? __pfx_psb_init+0x10/0x10 [gma500_gfx] [ 20.566111] __pci_register_driver+0x68/0x80 [ 20.566128] psb_init+0x2c/0xff0 [gma500_gfx] [ 20.566194] do_one_initcall+0x46/0x330 [ 20.566214] ? kmalloc_trace+0x2a/0xb0 [ 20.566233] do_init_module+0x6a/0x270 [ 20.566250] load_module+0x207f/0x23a0 [ 20.566278] init_module_from_file+0x9c/0xf0 [ 20.566293] ? init_module_from_file+0x9c/0xf0 [ 20.566315] idempotent_init_module+0x184/0x240 [ 20.566335] __x64_sys_finit_module+0x64/0xd0 [ 20.566352] do_syscall_64+0x59/0x90 [ 20.566366] ? ksys_mmap_pgoff+0x123/0x270 [ 20.566378] ? __secure_computing+0x9b/0x110 [ 20.566392] ? exit_to_user_mode_prepare+0x39/0x190 [ 20.566406] ? syscall_exit_to_user_mode+0x2a/0x50 [ 20.566420] ? do_syscall_64+0x69/0x90 [ 20.566433] ? do_syscall_64+0x69/0x90 [ 20.566445] ? do_syscall_64+0x69/0x90 [ 20.566458] entry_SYSCALL_64_after_hwframe+0x6e/0xd8 [ 20.566472] RIP: 0033:0x7f13d351ea3d [ 20.566485] Code: 5b 41 5c c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d c3 a3 0f 00 f7 d8 64 89 01 48 [ 20.566496] RSP: 002b:00007ffe566c1fd8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139 [ 20.566510] RAX: ffffffffffffffda RBX: 000055e66806eec0 RCX: 00007f13d351ea3d [ 20.566519] RDX: 0000000000000000 RSI: 00007f13d36d9441 RDI: 0000000000000010 [ 20.566527] RBP: 0000000000020000 R08: 0000000000000000 R09: 0000000000000002 [ 20.566535] R10: 0000000000000010 R11: 0000000000000246 R12: 00007f13d36d9441 [ 20.566543] R13: 000055e6681108c0 R14: 000055e66805ba70 R15: 000055e66819a9c0 [ 20.566559] </TASK> [ 20.566566] ---[ end trace 0000000000000000 ]---
Signed-off-by: Sui Jingfeng suijingfeng@loongson.cn Signed-off-by: Patrik Jakobsson patrik.r.jakobsson@gmail.com Link: https://patchwork.freedesktop.org/patch/msgid/20230727185855.713318-1-suijin... Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/gma500/psb_drv.h | 1 + drivers/gpu/drm/gma500/psb_irq.c | 5 +++++ 2 files changed, 6 insertions(+)
diff --git a/drivers/gpu/drm/gma500/psb_drv.h b/drivers/gpu/drm/gma500/psb_drv.h index f7f709df99b49..70d9adafa2333 100644 --- a/drivers/gpu/drm/gma500/psb_drv.h +++ b/drivers/gpu/drm/gma500/psb_drv.h @@ -424,6 +424,7 @@ struct drm_psb_private { uint32_t pipestat[PSB_NUM_PIPE];
spinlock_t irqmask_lock; + bool irq_enabled;
/* Power */ bool pm_initialized; diff --git a/drivers/gpu/drm/gma500/psb_irq.c b/drivers/gpu/drm/gma500/psb_irq.c index 343c51250207d..7bbb79b0497d8 100644 --- a/drivers/gpu/drm/gma500/psb_irq.c +++ b/drivers/gpu/drm/gma500/psb_irq.c @@ -327,6 +327,8 @@ int gma_irq_install(struct drm_device *dev)
gma_irq_postinstall(dev);
+ dev_priv->irq_enabled = true; + return 0; }
@@ -337,6 +339,9 @@ void gma_irq_uninstall(struct drm_device *dev) unsigned long irqflags; unsigned int i;
+ if (!dev_priv->irq_enabled) + return; + spin_lock_irqsave(&dev_priv->irqmask_lock, irqflags);
if (dev_priv->ops->hotplug_enable)
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Harish Kasiviswanathan Harish.Kasiviswanathan@amd.com
[ Upstream commit 37fb87910724f21a1f27a75743d4f9accdee77fb ]
No functional change. Use ratelimited version of pr_ to avoid overflowing of dmesg buffer
Signed-off-by: Harish Kasiviswanathan Harish.Kasiviswanathan@amd.com Reviewed-by: Philip Yang philip.yang@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/amdkfd/kfd_int_process_v10.c | 6 +++--- drivers/gpu/drm/amd/amdkfd/kfd_int_process_v11.c | 6 +++--- drivers/gpu/drm/amd/amdkfd/kfd_int_process_v9.c | 6 +++--- 3 files changed, 9 insertions(+), 9 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_int_process_v10.c b/drivers/gpu/drm/amd/amdkfd/kfd_int_process_v10.c index c7991e07b6be5..a7697ec8188e0 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_int_process_v10.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_int_process_v10.c @@ -268,7 +268,7 @@ static void event_interrupt_wq_v10(struct kfd_node *dev, SQ_INTERRUPT_WORD_WAVE_CTXID1, ENCODING); switch (encoding) { case SQ_INTERRUPT_WORD_ENCODING_AUTO: - pr_debug( + pr_debug_ratelimited( "sq_intr: auto, se %d, ttrace %d, wlt %d, ttrac_buf0_full %d, ttrac_buf1_full %d, ttrace_utc_err %d\n", REG_GET_FIELD(context_id1, SQ_INTERRUPT_WORD_AUTO_CTXID1, SE_ID), @@ -284,7 +284,7 @@ static void event_interrupt_wq_v10(struct kfd_node *dev, THREAD_TRACE_UTC_ERROR)); break; case SQ_INTERRUPT_WORD_ENCODING_INST: - pr_debug("sq_intr: inst, se %d, data 0x%x, sa %d, priv %d, wave_id %d, simd_id %d, wgp_id %d\n", + pr_debug_ratelimited("sq_intr: inst, se %d, data 0x%x, sa %d, priv %d, wave_id %d, simd_id %d, wgp_id %d\n", REG_GET_FIELD(context_id1, SQ_INTERRUPT_WORD_WAVE_CTXID1, SE_ID), REG_GET_FIELD(context_id0, SQ_INTERRUPT_WORD_WAVE_CTXID0, @@ -310,7 +310,7 @@ static void event_interrupt_wq_v10(struct kfd_node *dev, case SQ_INTERRUPT_WORD_ENCODING_ERROR: sq_intr_err_type = REG_GET_FIELD(context_id0, KFD_CTXID0, ERR_TYPE); - pr_warn("sq_intr: error, se %d, data 0x%x, sa %d, priv %d, wave_id %d, simd_id %d, wgp_id %d, err_type %d\n", + pr_warn_ratelimited("sq_intr: error, se %d, data 0x%x, sa %d, priv %d, wave_id %d, simd_id %d, wgp_id %d, err_type %d\n", REG_GET_FIELD(context_id1, SQ_INTERRUPT_WORD_WAVE_CTXID1, SE_ID), REG_GET_FIELD(context_id0, SQ_INTERRUPT_WORD_WAVE_CTXID0, diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_int_process_v11.c b/drivers/gpu/drm/amd/amdkfd/kfd_int_process_v11.c index f933bd231fb9c..2a65792fd1162 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_int_process_v11.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_int_process_v11.c @@ -150,7 +150,7 @@ enum SQ_INTERRUPT_ERROR_TYPE {
static void print_sq_intr_info_auto(uint32_t context_id0, uint32_t context_id1) { - pr_debug( + pr_debug_ratelimited( "sq_intr: auto, ttrace %d, wlt %d, ttrace_buf_full %d, reg_tms %d, cmd_tms %d, host_cmd_ovf %d, host_reg_ovf %d, immed_ovf %d, ttrace_utc_err %d\n", REG_GET_FIELD(context_id0, SQ_INTERRUPT_WORD_AUTO_CTXID0, THREAD_TRACE), REG_GET_FIELD(context_id0, SQ_INTERRUPT_WORD_AUTO_CTXID0, WLT), @@ -165,7 +165,7 @@ static void print_sq_intr_info_auto(uint32_t context_id0, uint32_t context_id1)
static void print_sq_intr_info_inst(uint32_t context_id0, uint32_t context_id1) { - pr_debug( + pr_debug_ratelimited( "sq_intr: inst, data 0x%08x, sh %d, priv %d, wave_id %d, simd_id %d, wgp_id %d\n", REG_GET_FIELD(context_id0, SQ_INTERRUPT_WORD_WAVE_CTXID0, DATA), REG_GET_FIELD(context_id0, SQ_INTERRUPT_WORD_WAVE_CTXID0, SH_ID), @@ -177,7 +177,7 @@ static void print_sq_intr_info_inst(uint32_t context_id0, uint32_t context_id1)
static void print_sq_intr_info_error(uint32_t context_id0, uint32_t context_id1) { - pr_warn( + pr_warn_ratelimited( "sq_intr: error, detail 0x%08x, type %d, sh %d, priv %d, wave_id %d, simd_id %d, wgp_id %d\n", REG_GET_FIELD(context_id0, SQ_INTERRUPT_WORD_ERROR_CTXID0, DETAIL), REG_GET_FIELD(context_id0, SQ_INTERRUPT_WORD_ERROR_CTXID0, TYPE), diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_int_process_v9.c b/drivers/gpu/drm/amd/amdkfd/kfd_int_process_v9.c index f0731a6a5306c..02695ccd22d6e 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_int_process_v9.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_int_process_v9.c @@ -333,7 +333,7 @@ static void event_interrupt_wq_v9(struct kfd_node *dev, encoding = REG_GET_FIELD(context_id0, SQ_INTERRUPT_WORD_WAVE_CTXID, ENCODING); switch (encoding) { case SQ_INTERRUPT_WORD_ENCODING_AUTO: - pr_debug( + pr_debug_ratelimited( "sq_intr: auto, se %d, ttrace %d, wlt %d, ttrac_buf_full %d, reg_tms %d, cmd_tms %d, host_cmd_ovf %d, host_reg_ovf %d, immed_ovf %d, ttrace_utc_err %d\n", REG_GET_FIELD(context_id0, SQ_INTERRUPT_WORD_AUTO_CTXID, SE_ID), REG_GET_FIELD(context_id0, SQ_INTERRUPT_WORD_AUTO_CTXID, THREAD_TRACE), @@ -347,7 +347,7 @@ static void event_interrupt_wq_v9(struct kfd_node *dev, REG_GET_FIELD(context_id0, SQ_INTERRUPT_WORD_AUTO_CTXID, THREAD_TRACE_UTC_ERROR)); break; case SQ_INTERRUPT_WORD_ENCODING_INST: - pr_debug("sq_intr: inst, se %d, data 0x%x, sh %d, priv %d, wave_id %d, simd_id %d, cu_id %d, intr_data 0x%x\n", + pr_debug_ratelimited("sq_intr: inst, se %d, data 0x%x, sh %d, priv %d, wave_id %d, simd_id %d, cu_id %d, intr_data 0x%x\n", REG_GET_FIELD(context_id0, SQ_INTERRUPT_WORD_WAVE_CTXID, SE_ID), REG_GET_FIELD(context_id0, SQ_INTERRUPT_WORD_WAVE_CTXID, DATA), REG_GET_FIELD(context_id0, SQ_INTERRUPT_WORD_WAVE_CTXID, SH_ID), @@ -366,7 +366,7 @@ static void event_interrupt_wq_v9(struct kfd_node *dev, break; case SQ_INTERRUPT_WORD_ENCODING_ERROR: sq_intr_err = REG_GET_FIELD(sq_int_data, KFD_SQ_INT_DATA, ERR_TYPE); - pr_warn("sq_intr: error, se %d, data 0x%x, sh %d, priv %d, wave_id %d, simd_id %d, cu_id %d, err_type %d\n", + pr_warn_ratelimited("sq_intr: error, se %d, data 0x%x, sh %d, priv %d, wave_id %d, simd_id %d, cu_id %d, err_type %d\n", REG_GET_FIELD(context_id0, SQ_INTERRUPT_WORD_WAVE_CTXID, SE_ID), REG_GET_FIELD(context_id0, SQ_INTERRUPT_WORD_WAVE_CTXID, DATA), REG_GET_FIELD(context_id0, SQ_INTERRUPT_WORD_WAVE_CTXID, SH_ID),
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: baozhu.liu lucas.liu@siengine.com
[ Upstream commit 19ecbe8325a2a7ffda5ff4790955b84eaccba49f ]
If komeda_pipeline_unbound_components() returns -EDEADLK, it means that a deadlock happened in the locking context. Currently, komeda is not dealing with the deadlock properly,producing the following output when CONFIG_DEBUG_WW_MUTEX_SLOWPATH is enabled:
------------[ cut here ]------------ [ 26.103984] WARNING: CPU: 2 PID: 345 at drivers/gpu/drm/arm/display/komeda/komeda_pipeline_state.c:1248 komeda_release_unclaimed_resources+0x13c/0x170 [ 26.117453] Modules linked in: [ 26.120511] CPU: 2 PID: 345 Comm: composer@2.1-se Kdump: loaded Tainted: G W 5.10.110-SE-SDK1.8-dirty #16 [ 26.131374] Hardware name: Siengine Se1000 Evaluation board (DT) [ 26.137379] pstate: 20400009 (nzCv daif +PAN -UAO -TCO BTYPE=--) [ 26.143385] pc : komeda_release_unclaimed_resources+0x13c/0x170 [ 26.149301] lr : komeda_release_unclaimed_resources+0xbc/0x170 [ 26.155130] sp : ffff800017b8b8d0 [ 26.158442] pmr_save: 000000e0 [ 26.161493] x29: ffff800017b8b8d0 x28: ffff000cf2f96200 [ 26.166805] x27: ffff000c8f5a8800 x26: 0000000000000000 [ 26.172116] x25: 0000000000000038 x24: ffff8000116a0140 [ 26.177428] x23: 0000000000000038 x22: ffff000cf2f96200 [ 26.182739] x21: ffff000cfc300300 x20: ffff000c8ab77080 [ 26.188051] x19: 0000000000000003 x18: 0000000000000000 [ 26.193362] x17: 0000000000000000 x16: 0000000000000000 [ 26.198672] x15: b400e638f738ba38 x14: 0000000000000000 [ 26.203983] x13: 0000000106400a00 x12: 0000000000000000 [ 26.209294] x11: 0000000000000000 x10: 0000000000000000 [ 26.214604] x9 : ffff800012f80000 x8 : ffff000ca3308000 [ 26.219915] x7 : 0000000ff3000000 x6 : ffff80001084034c [ 26.225226] x5 : ffff800017b8bc40 x4 : 000000000000000f [ 26.230536] x3 : ffff000ca3308000 x2 : 0000000000000000 [ 26.235847] x1 : 0000000000000000 x0 : ffffffffffffffdd [ 26.241158] Call trace: [ 26.243604] komeda_release_unclaimed_resources+0x13c/0x170 [ 26.249175] komeda_crtc_atomic_check+0x68/0xf0 [ 26.253706] drm_atomic_helper_check_planes+0x138/0x1f4 [ 26.258929] komeda_kms_check+0x284/0x36c [ 26.262939] drm_atomic_check_only+0x40c/0x714 [ 26.267381] drm_atomic_nonblocking_commit+0x1c/0x60 [ 26.272344] drm_mode_atomic_ioctl+0xa3c/0xb8c [ 26.276787] drm_ioctl_kernel+0xc4/0x120 [ 26.280708] drm_ioctl+0x268/0x534 [ 26.284109] __arm64_sys_ioctl+0xa8/0xf0 [ 26.288030] el0_svc_common.constprop.0+0x80/0x240 [ 26.292817] do_el0_svc+0x24/0x90 [ 26.296132] el0_svc+0x20/0x30 [ 26.299185] el0_sync_handler+0xe8/0xf0 [ 26.303018] el0_sync+0x1a4/0x1c0 [ 26.306330] irq event stamp: 0 [ 26.309384] hardirqs last enabled at (0): [<0000000000000000>] 0x0 [ 26.315650] hardirqs last disabled at (0): [<ffff800010056d34>] copy_process+0x5d0/0x183c [ 26.323825] softirqs last enabled at (0): [<ffff800010056d34>] copy_process+0x5d0/0x183c [ 26.331997] softirqs last disabled at (0): [<0000000000000000>] 0x0 [ 26.338261] ---[ end trace 20ae984fa860184a ]--- [ 26.343021] ------------[ cut here ]------------ [ 26.347646] WARNING: CPU: 3 PID: 345 at drivers/gpu/drm/drm_modeset_lock.c:228 drm_modeset_drop_locks+0x84/0x90 [ 26.357727] Modules linked in: [ 26.360783] CPU: 3 PID: 345 Comm: composer@2.1-se Kdump: loaded Tainted: G W 5.10.110-SE-SDK1.8-dirty #16 [ 26.371645] Hardware name: Siengine Se1000 Evaluation board (DT) [ 26.377647] pstate: 20400009 (nzCv daif +PAN -UAO -TCO BTYPE=--) [ 26.383649] pc : drm_modeset_drop_locks+0x84/0x90 [ 26.388351] lr : drm_mode_atomic_ioctl+0x860/0xb8c [ 26.393137] sp : ffff800017b8bb10 [ 26.396447] pmr_save: 000000e0 [ 26.399497] x29: ffff800017b8bb10 x28: 0000000000000001 [ 26.404807] x27: 0000000000000038 x26: 0000000000000002 [ 26.410115] x25: ffff000cecbefa00 x24: ffff000cf2f96200 [ 26.415423] x23: 0000000000000001 x22: 0000000000000018 [ 26.420731] x21: 0000000000000001 x20: ffff800017b8bc10 [ 26.426039] x19: 0000000000000000 x18: 0000000000000000 [ 26.431347] x17: 0000000002e8bf2c x16: 0000000002e94c6b [ 26.436655] x15: 0000000002ea48b9 x14: ffff8000121f0300 [ 26.441963] x13: 0000000002ee2ca8 x12: ffff80001129cae0 [ 26.447272] x11: ffff800012435000 x10: ffff000ed46b5e88 [ 26.452580] x9 : ffff000c9935e600 x8 : 0000000000000000 [ 26.457888] x7 : 000000008020001e x6 : 000000008020001f [ 26.463196] x5 : ffff80001085fbe0 x4 : fffffe0033a59f20 [ 26.468504] x3 : 000000008020001e x2 : 0000000000000000 [ 26.473813] x1 : 0000000000000000 x0 : ffff000c8f596090 [ 26.479122] Call trace: [ 26.481566] drm_modeset_drop_locks+0x84/0x90 [ 26.485918] drm_mode_atomic_ioctl+0x860/0xb8c [ 26.490359] drm_ioctl_kernel+0xc4/0x120 [ 26.494278] drm_ioctl+0x268/0x534 [ 26.497677] __arm64_sys_ioctl+0xa8/0xf0 [ 26.501598] el0_svc_common.constprop.0+0x80/0x240 [ 26.506384] do_el0_svc+0x24/0x90 [ 26.509697] el0_svc+0x20/0x30 [ 26.512748] el0_sync_handler+0xe8/0xf0 [ 26.516580] el0_sync+0x1a4/0x1c0 [ 26.519891] irq event stamp: 0 [ 26.522943] hardirqs last enabled at (0): [<0000000000000000>] 0x0 [ 26.529207] hardirqs last disabled at (0): [<ffff800010056d34>] copy_process+0x5d0/0x183c [ 26.537379] softirqs last enabled at (0): [<ffff800010056d34>] copy_process+0x5d0/0x183c [ 26.545550] softirqs last disabled at (0): [<0000000000000000>] 0x0 [ 26.551812] ---[ end trace 20ae984fa860184b ]---
According to the call trace information,it can be located to be WARN_ON(IS_ERR(c_st)) in the komeda_pipeline_unbound_components function; Then follow the function. komeda_pipeline_unbound_components -> komeda_component_get_state_and_set_user -> komeda_pipeline_get_state_and_set_crtc -> komeda_pipeline_get_state ->drm_atomic_get_private_obj_state -> drm_atomic_get_private_obj_state -> drm_modeset_lock
komeda_pipeline_unbound_components -> komeda_component_get_state_and_set_user -> komeda_component_get_state -> drm_atomic_get_private_obj_state -> drm_modeset_lock
ret = drm_modeset_lock(&obj->lock, state->acquire_ctx); if (ret) return ERR_PTR(ret); Here it return -EDEADLK.
deal with the deadlock as suggested by [1], using the function drm_modeset_backoff(). [1] https://docs.kernel.org/gpu/drm-kms.html?highlight=kms#kms-locking
Therefore, handling this problem can be solved by adding return -EDEADLK back to the drm_modeset_backoff processing flow in the drm_mode_atomic_ioctl function.
Signed-off-by: baozhu.liu lucas.liu@siengine.com Signed-off-by: menghui.huang menghui.huang@siengine.com Reviewed-by: Liviu Dudau liviu.dudau@arm.com Signed-off-by: Liviu Dudau liviu.dudau@arm.com Link: https://patchwork.freedesktop.org/patch/msgid/20230804013117.6870-1-menghui.... Signed-off-by: Sasha Levin sashal@kernel.org --- .../gpu/drm/arm/display/komeda/komeda_pipeline_state.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/arm/display/komeda/komeda_pipeline_state.c b/drivers/gpu/drm/arm/display/komeda/komeda_pipeline_state.c index 3276a3e82c628..916f2c36bf2f7 100644 --- a/drivers/gpu/drm/arm/display/komeda/komeda_pipeline_state.c +++ b/drivers/gpu/drm/arm/display/komeda/komeda_pipeline_state.c @@ -1223,7 +1223,7 @@ int komeda_build_display_data_flow(struct komeda_crtc *kcrtc, return 0; }
-static void +static int komeda_pipeline_unbound_components(struct komeda_pipeline *pipe, struct komeda_pipeline_state *new) { @@ -1243,8 +1243,12 @@ komeda_pipeline_unbound_components(struct komeda_pipeline *pipe, c = komeda_pipeline_get_component(pipe, id); c_st = komeda_component_get_state_and_set_user(c, drm_st, NULL, new->crtc); + if (PTR_ERR(c_st) == -EDEADLK) + return -EDEADLK; WARN_ON(IS_ERR(c_st)); } + + return 0; }
/* release unclaimed pipeline resource */ @@ -1266,9 +1270,8 @@ int komeda_release_unclaimed_resources(struct komeda_pipeline *pipe, if (WARN_ON(IS_ERR_OR_NULL(st))) return -EINVAL;
- komeda_pipeline_unbound_components(pipe, st); + return komeda_pipeline_unbound_components(pipe, st);
- return 0; }
/* Since standalone disabled components must be disabled separately and in the
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alvin Lee Alvin.Lee2@amd.com
[ Upstream commit e87a6c5b7780b5f423797351eb586ed96cc6d151 ]
[Description] Before enabling the phantom OTG for an update we must enable DPG to avoid underflow.
Reviewed-by: Samson Tam samson.tam@amd.com Acked-by: Stylon Wang stylon.wang@amd.com Signed-off-by: Alvin Lee Alvin.Lee2@amd.com Tested-by: Daniel Wheeler daniel.wheeler@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/display/dc/core/dc.c | 50 +------------------ .../drm/amd/display/dc/dcn20/dcn20_hwseq.c | 10 +++- .../drm/amd/display/dc/dcn32/dcn32_hwseq.c | 46 +++++++++++++++++ .../drm/amd/display/dc/dcn32/dcn32_hwseq.h | 5 ++ .../gpu/drm/amd/display/dc/dcn32/dcn32_init.c | 1 + .../gpu/drm/amd/display/dc/inc/hw_sequencer.h | 5 ++ 6 files changed, 68 insertions(+), 49 deletions(-)
diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c b/drivers/gpu/drm/amd/display/dc/core/dc.c index 609048160aa20..ab79bcd264164 100644 --- a/drivers/gpu/drm/amd/display/dc/core/dc.c +++ b/drivers/gpu/drm/amd/display/dc/core/dc.c @@ -1070,53 +1070,6 @@ static void apply_ctx_interdependent_lock(struct dc *dc, struct dc_state *contex } }
-static void phantom_pipe_blank( - struct dc *dc, - struct timing_generator *tg, - int width, - int height) -{ - struct dce_hwseq *hws = dc->hwseq; - enum dc_color_space color_space; - struct tg_color black_color = {0}; - struct output_pixel_processor *opp = NULL; - uint32_t num_opps, opp_id_src0, opp_id_src1; - uint32_t otg_active_width, otg_active_height; - uint32_t i; - - /* program opp dpg blank color */ - color_space = COLOR_SPACE_SRGB; - color_space_to_black_color(dc, color_space, &black_color); - - otg_active_width = width; - otg_active_height = height; - - /* get the OPTC source */ - tg->funcs->get_optc_source(tg, &num_opps, &opp_id_src0, &opp_id_src1); - ASSERT(opp_id_src0 < dc->res_pool->res_cap->num_opp); - - for (i = 0; i < dc->res_pool->res_cap->num_opp; i++) { - if (dc->res_pool->opps[i] != NULL && dc->res_pool->opps[i]->inst == opp_id_src0) { - opp = dc->res_pool->opps[i]; - break; - } - } - - if (opp && opp->funcs->opp_set_disp_pattern_generator) - opp->funcs->opp_set_disp_pattern_generator( - opp, - CONTROLLER_DP_TEST_PATTERN_SOLID_COLOR, - CONTROLLER_DP_COLOR_SPACE_UDEFINED, - COLOR_DEPTH_UNDEFINED, - &black_color, - otg_active_width, - otg_active_height, - 0); - - if (tg->funcs->is_tg_enabled(tg)) - hws->funcs.wait_for_blank_complete(opp); -} - static void dc_update_viusal_confirm_color(struct dc *dc, struct dc_state *context, struct pipe_ctx *pipe_ctx) { if (dc->ctx->dce_version >= DCN_VERSION_1_0) { @@ -1207,7 +1160,8 @@ static void disable_dangling_plane(struct dc *dc, struct dc_state *context)
main_pipe_width = old_stream->mall_stream_config.paired_stream->dst.width; main_pipe_height = old_stream->mall_stream_config.paired_stream->dst.height; - phantom_pipe_blank(dc, tg, main_pipe_width, main_pipe_height); + if (dc->hwss.blank_phantom) + dc->hwss.blank_phantom(dc, tg, main_pipe_width, main_pipe_height); tg->funcs->enable_crtc(tg); } } diff --git a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c index 62a077adcdbfa..84fe449a2c7ed 100644 --- a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c +++ b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c @@ -1846,8 +1846,16 @@ void dcn20_program_front_end_for_ctx( dc->current_state->res_ctx.pipe_ctx[i].stream->mall_stream_config.type == SUBVP_PHANTOM) { struct timing_generator *tg = dc->current_state->res_ctx.pipe_ctx[i].stream_res.tg;
- if (tg->funcs->enable_crtc) + if (tg->funcs->enable_crtc) { + if (dc->hwss.blank_phantom) { + int main_pipe_width, main_pipe_height; + + main_pipe_width = dc->current_state->res_ctx.pipe_ctx[i].stream->mall_stream_config.paired_stream->dst.width; + main_pipe_height = dc->current_state->res_ctx.pipe_ctx[i].stream->mall_stream_config.paired_stream->dst.height; + dc->hwss.blank_phantom(dc, tg, main_pipe_width, main_pipe_height); + } tg->funcs->enable_crtc(tg); + } } } /* OTG blank before disabling all front ends */ diff --git a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.c b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.c index d52d5feeb311b..b6608d7ab4450 100644 --- a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.c +++ b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.c @@ -1575,3 +1575,49 @@ void dcn32_init_blank( if (opp) hws->funcs.wait_for_blank_complete(opp); } + +void dcn32_blank_phantom(struct dc *dc, + struct timing_generator *tg, + int width, + int height) +{ + struct dce_hwseq *hws = dc->hwseq; + enum dc_color_space color_space; + struct tg_color black_color = {0}; + struct output_pixel_processor *opp = NULL; + uint32_t num_opps, opp_id_src0, opp_id_src1; + uint32_t otg_active_width, otg_active_height; + uint32_t i; + + /* program opp dpg blank color */ + color_space = COLOR_SPACE_SRGB; + color_space_to_black_color(dc, color_space, &black_color); + + otg_active_width = width; + otg_active_height = height; + + /* get the OPTC source */ + tg->funcs->get_optc_source(tg, &num_opps, &opp_id_src0, &opp_id_src1); + ASSERT(opp_id_src0 < dc->res_pool->res_cap->num_opp); + + for (i = 0; i < dc->res_pool->res_cap->num_opp; i++) { + if (dc->res_pool->opps[i] != NULL && dc->res_pool->opps[i]->inst == opp_id_src0) { + opp = dc->res_pool->opps[i]; + break; + } + } + + if (opp && opp->funcs->opp_set_disp_pattern_generator) + opp->funcs->opp_set_disp_pattern_generator( + opp, + CONTROLLER_DP_TEST_PATTERN_SOLID_COLOR, + CONTROLLER_DP_COLOR_SPACE_UDEFINED, + COLOR_DEPTH_UNDEFINED, + &black_color, + otg_active_width, + otg_active_height, + 0); + + if (tg->funcs->is_tg_enabled(tg)) + hws->funcs.wait_for_blank_complete(opp); +} diff --git a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.h b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.h index 2d2628f31bed7..616d5219119e9 100644 --- a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.h +++ b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.h @@ -115,4 +115,9 @@ void dcn32_init_blank( struct dc *dc, struct timing_generator *tg);
+void dcn32_blank_phantom(struct dc *dc, + struct timing_generator *tg, + int width, + int height); + #endif /* __DC_HWSS_DCN32_H__ */ diff --git a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_init.c b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_init.c index 777b2fac20c4e..279f312f74076 100644 --- a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_init.c +++ b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_init.c @@ -115,6 +115,7 @@ static const struct hw_sequencer_funcs dcn32_funcs = { .update_phantom_vp_position = dcn32_update_phantom_vp_position, .update_dsc_pg = dcn32_update_dsc_pg, .apply_update_flags_for_phantom = dcn32_apply_update_flags_for_phantom, + .blank_phantom = dcn32_blank_phantom, };
static const struct hwseq_private_funcs dcn32_private_funcs = { diff --git a/drivers/gpu/drm/amd/display/dc/inc/hw_sequencer.h b/drivers/gpu/drm/amd/display/dc/inc/hw_sequencer.h index 02ff99f7bec2b..7a702e216e530 100644 --- a/drivers/gpu/drm/amd/display/dc/inc/hw_sequencer.h +++ b/drivers/gpu/drm/amd/display/dc/inc/hw_sequencer.h @@ -388,6 +388,11 @@ struct hw_sequencer_funcs { void (*z10_restore)(const struct dc *dc); void (*z10_save_init)(struct dc *dc);
+ void (*blank_phantom)(struct dc *dc, + struct timing_generator *tg, + int width, + int height); + void (*update_visual_confirm_color)(struct dc *dc, struct pipe_ctx *pipe_ctx, int mpcc_id);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alvin Lee alvin.lee2@amd.com
[ Upstream commit cbb4c9bc55427774ca4d819933e1b5fa38a6fb44 ]
[Description] - When disabling a phantom pipe, we first enable the phantom OTG so the double buffer update can successfully take place - However, want to avoid locking the phantom otherwise setting DPG_EN=1 for the phantom pipe is blocked (without this we could hit underflow due to phantom HUBP being blanked by default)
Reviewed-by: Samson Tam samson.tam@amd.com Acked-by: Stylon Wang stylon.wang@amd.com Signed-off-by: Alvin Lee alvin.lee2@amd.com Tested-by: Daniel Wheeler daniel.wheeler@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c index 9834b75f1837b..79befa17bb037 100644 --- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c +++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c @@ -111,7 +111,8 @@ void dcn10_lock_all_pipes(struct dc *dc, if (pipe_ctx->top_pipe || !pipe_ctx->stream || (!pipe_ctx->plane_state && !old_pipe_ctx->plane_state) || - !tg->funcs->is_tg_enabled(tg)) + !tg->funcs->is_tg_enabled(tg) || + pipe_ctx->stream->mall_stream_config.type == SUBVP_PHANTOM) continue;
if (lock)
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Wenjing Liu wenjing.liu@amd.com
[ Upstream commit 15c6798ae26d5c7a7776f4f7d0c1fa8c462688a2 ]
[why] We have a few cases where we need to perform update topology update in dc update interface. However some of the updates are not seamless This could cause user noticible glitches. To enforce seamless transition we are adding a checking condition and error logging so the corruption as result of non seamless transition can be easily spotted.
Reviewed-by: Dillon Varone dillon.varone@amd.com Acked-by: Stylon Wang stylon.wang@amd.com Signed-off-by: Wenjing Liu wenjing.liu@amd.com Tested-by: Daniel Wheeler daniel.wheeler@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/display/dc/core/dc.c | 8 +++ .../drm/amd/display/dc/dcn32/dcn32_hwseq.c | 52 +++++++++++++++++++ .../drm/amd/display/dc/dcn32/dcn32_hwseq.h | 4 ++ .../gpu/drm/amd/display/dc/dcn32/dcn32_init.c | 1 + .../gpu/drm/amd/display/dc/inc/hw_sequencer.h | 3 ++ 5 files changed, 68 insertions(+)
diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c b/drivers/gpu/drm/amd/display/dc/core/dc.c index ab79bcd264164..93e6265e58509 100644 --- a/drivers/gpu/drm/amd/display/dc/core/dc.c +++ b/drivers/gpu/drm/amd/display/dc/core/dc.c @@ -4414,6 +4414,14 @@ bool dc_update_planes_and_stream(struct dc *dc, update_type, context); } else { + if (!stream_update && + dc->hwss.is_pipe_topology_transition_seamless && + !dc->hwss.is_pipe_topology_transition_seamless( + dc, dc->current_state, context)) { + + DC_LOG_ERROR("performing non-seamless pipe topology transition with surface only update!\n"); + BREAK_TO_DEBUGGER(); + } commit_planes_for_stream( dc, srf_updates, diff --git a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.c b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.c index b6608d7ab4450..5b3d0e5b90a3e 100644 --- a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.c +++ b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.c @@ -1621,3 +1621,55 @@ void dcn32_blank_phantom(struct dc *dc, if (tg->funcs->is_tg_enabled(tg)) hws->funcs.wait_for_blank_complete(opp); } + +bool dcn32_is_pipe_topology_transition_seamless(struct dc *dc, + const struct dc_state *cur_ctx, + const struct dc_state *new_ctx) +{ + int i; + const struct pipe_ctx *cur_pipe, *new_pipe; + bool is_seamless = true; + + for (i = 0; i < dc->res_pool->pipe_count; i++) { + cur_pipe = &cur_ctx->res_ctx.pipe_ctx[i]; + new_pipe = &new_ctx->res_ctx.pipe_ctx[i]; + + if (resource_is_pipe_type(cur_pipe, FREE_PIPE) || + resource_is_pipe_type(new_pipe, FREE_PIPE)) + /* adding or removing free pipes is always seamless */ + continue; + else if (resource_is_pipe_type(cur_pipe, OTG_MASTER)) { + if (resource_is_pipe_type(new_pipe, OTG_MASTER)) + if (cur_pipe->stream->stream_id == new_pipe->stream->stream_id) + /* OTG master with the same stream is seamless */ + continue; + } else if (resource_is_pipe_type(cur_pipe, OPP_HEAD)) { + if (resource_is_pipe_type(new_pipe, OPP_HEAD)) { + if (cur_pipe->stream_res.tg == new_pipe->stream_res.tg) + /* + * OPP heads sharing the same timing + * generator is seamless + */ + continue; + } + } else if (resource_is_pipe_type(cur_pipe, DPP_PIPE)) { + if (resource_is_pipe_type(new_pipe, DPP_PIPE)) { + if (cur_pipe->stream_res.opp == new_pipe->stream_res.opp) + /* + * DPP pipes sharing the same OPP head is + * seamless + */ + continue; + } + } + + /* + * This pipe's transition doesn't fall under any seamless + * conditions + */ + is_seamless = false; + break; + } + + return is_seamless; +} diff --git a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.h b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.h index 616d5219119e9..9992e40acd217 100644 --- a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.h +++ b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.h @@ -120,4 +120,8 @@ void dcn32_blank_phantom(struct dc *dc, int width, int height);
+bool dcn32_is_pipe_topology_transition_seamless(struct dc *dc, + const struct dc_state *cur_ctx, + const struct dc_state *new_ctx); + #endif /* __DC_HWSS_DCN32_H__ */ diff --git a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_init.c b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_init.c index 279f312f74076..12e0f48a13e48 100644 --- a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_init.c +++ b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_init.c @@ -116,6 +116,7 @@ static const struct hw_sequencer_funcs dcn32_funcs = { .update_dsc_pg = dcn32_update_dsc_pg, .apply_update_flags_for_phantom = dcn32_apply_update_flags_for_phantom, .blank_phantom = dcn32_blank_phantom, + .is_pipe_topology_transition_seamless = dcn32_is_pipe_topology_transition_seamless, };
static const struct hwseq_private_funcs dcn32_private_funcs = { diff --git a/drivers/gpu/drm/amd/display/dc/inc/hw_sequencer.h b/drivers/gpu/drm/amd/display/dc/inc/hw_sequencer.h index 7a702e216e530..66e680902c95c 100644 --- a/drivers/gpu/drm/amd/display/dc/inc/hw_sequencer.h +++ b/drivers/gpu/drm/amd/display/dc/inc/hw_sequencer.h @@ -401,6 +401,9 @@ struct hw_sequencer_funcs { struct dc_state *context, struct pipe_ctx *phantom_pipe); void (*apply_update_flags_for_phantom)(struct pipe_ctx *phantom_pipe); + bool (*is_pipe_topology_transition_seamless)(struct dc *dc, + const struct dc_state *cur_ctx, + const struct dc_state *new_ctx);
void (*commit_subvp_config)(struct dc *dc, struct dc_state *context); void (*enable_phantom_streams)(struct dc *dc, struct dc_state *context);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ville Syrjälä ville.syrjala@linux.intel.com
[ Upstream commit 2682768bde745b10ae126a322cdcaf532cf88851 ]
There are some weird EDIDs floating around that have the sync pulse extending beyond the end of the blanking period.
On the currently problemtic machine (HP Omni 120) EDID reports the following mode: "1600x900": 60 108000 1600 1780 1860 1800 900 910 913 1000 0x40 0x5 which is then "corrected" to have htotal=1861 by the current drm_edid.c code.
The fixup code was originally added in commit 7064fef56369 ("drm: work around EDIDs with bad htotal/vtotal values"). Googling around we end up in https://bugs.launchpad.net/ubuntu/hardy/+source/xserver-xorg-video-intel/+bu... where we find an EDID for a Dell Studio 15, which reports: (II) VESA(0): clock: 65.0 MHz Image Size: 331 x 207 mm (II) VESA(0): h_active: 1280 h_sync: 1328 h_sync_end 1360 h_blank_end 1337 h_border: 0 (II) VESA(0): v_active: 800 v_sync: 803 v_sync_end 809 v_blanking: 810 v_border: 0
Note that if we use the hblank size (as opposed of the hsync_end) from the DTD to determine htotal we get exactly 60Hz refresh rate in both cases, whereas using hsync_end to determine htotal we get a slightly lower refresh rates. This makes me believe the using the hblank size is what was intended even in those cases.
Also note that in case of the HP Onmi 120 the VBIOS boots with these: crtc timings: 108000 1600 1780 1860 1800 900 910 913 1000, type: 0x40 flags: 0x5 ie. it just blindly stuffs the bogus hsync_end and htotal from the DTD into the transcoder timing registers, and the display works. I believe the (at least more modern) hardware will automagically terminate the hsync pulse when the timing generator reaches htotal, which again points that we should use the hblank size to determine htotal. Unfortunatley the old bug reports for the Dell machines are extremely lacking in useful details so we have no idea what kind of timings the VBIOS programmed into the hardware :(
Let's just flip this quirk around and reduce the length of the sync pulse instead of extending the blanking period. This at least seems to be the correct thing to do on more modern hardware. And if any issues crop up on older hardware we need to debug them properly.
v2: Add debug message breadcrumbs (Jani)
Reviewed-by: Jani Nikula jani.nikula@intel.com Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/8895 Signed-off-by: Ville Syrjälä ville.syrjala@linux.intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20230920211934.14920-1-ville.s... Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/drm_edid.c | 18 +++++++++++++----- 1 file changed, 13 insertions(+), 5 deletions(-)
diff --git a/drivers/gpu/drm/drm_edid.c b/drivers/gpu/drm/drm_edid.c index 69d855123d3e3..f1ceb7d08519e 100644 --- a/drivers/gpu/drm/drm_edid.c +++ b/drivers/gpu/drm/drm_edid.c @@ -3499,11 +3499,19 @@ static struct drm_display_mode *drm_mode_detailed(struct drm_connector *connecto mode->vsync_end = mode->vsync_start + vsync_pulse_width; mode->vtotal = mode->vdisplay + vblank;
- /* Some EDIDs have bogus h/vtotal values */ - if (mode->hsync_end > mode->htotal) - mode->htotal = mode->hsync_end + 1; - if (mode->vsync_end > mode->vtotal) - mode->vtotal = mode->vsync_end + 1; + /* Some EDIDs have bogus h/vsync_end values */ + if (mode->hsync_end > mode->htotal) { + drm_dbg_kms(dev, "[CONNECTOR:%d:%s] reducing hsync_end %d->%d\n", + connector->base.id, connector->name, + mode->hsync_end, mode->htotal); + mode->hsync_end = mode->htotal; + } + if (mode->vsync_end > mode->vtotal) { + drm_dbg_kms(dev, "[CONNECTOR:%d:%s] reducing vsync_end %d->%d\n", + connector->base.id, connector->name, + mode->vsync_end, mode->vtotal); + mode->vsync_end = mode->vtotal; + }
drm_mode_do_interlace_quirk(mode, pt);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yu Kuai yukuai3@huawei.com
[ Upstream commit b721e7885eb242aa2459ee66bb42ceef1bcf0f0c ]
'active_io' used to be initialized while the array is running, and 'mddev->pers' is set while the array is running as well. Hence caller must hold 'reconfig_mutex' and guarantee 'mddev->pers' is set before calling mddev_suspend().
Now that 'active_io' is initialized when mddev is allocated, such restriction doesn't exist anymore. In the meantime, follow up patches will refactor mddev_suspend(), hence add checking for 'mddev->pers' to prevent null-ptr-deref.
Signed-off-by: Yu Kuai yukuai3@huawei.com Signed-off-by: Song Liu song@kernel.org Link: https://lore.kernel.org/r/20230825030956.1527023-4-yukuai1@huaweicloud.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/md/md.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/md/md.c b/drivers/md/md.c index 78d51dddf3a00..34b7196d9634c 100644 --- a/drivers/md/md.c +++ b/drivers/md/md.c @@ -449,7 +449,7 @@ void mddev_suspend(struct mddev *mddev) set_bit(MD_ALLOW_SB_UPDATE, &mddev->flags); percpu_ref_kill(&mddev->active_io);
- if (mddev->pers->prepare_suspend) + if (mddev->pers && mddev->pers->prepare_suspend) mddev->pers->prepare_suspend(mddev);
wait_event(mddev->sb_wait, percpu_ref_is_zero(&mddev->active_io));
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: David (Ming Qiang) Wu David.Wu3@amd.com
[ Upstream commit fa1f1cc09d588a90c8ce3f507c47df257461d148 ]
err_event_athub will corrupt VCPU buffer and not good to be restored in amdgpu_vcn_resume() and in this case the VCPU buffer needs to be cleared for VCN firmware to work properly.
Acked-by: Leo Liu leo.liu@amd.com Signed-off-by: David (Ming Qiang) Wu David.Wu3@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c | 7 +++++++ 1 file changed, 7 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c index ae455aab5d29d..7e54abca45206 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c @@ -292,8 +292,15 @@ int amdgpu_vcn_suspend(struct amdgpu_device *adev) void *ptr; int i, idx;
+ bool in_ras_intr = amdgpu_ras_intr_triggered(); + cancel_delayed_work_sync(&adev->vcn.idle_work);
+ /* err_event_athub will corrupt VCPU buffer, so we need to + * restore fw data and clear buffer in amdgpu_vcn_resume() */ + if (in_ras_intr) + return 0; + for (i = 0; i < adev->vcn.num_vcn_inst; ++i) { if (adev->vcn.harvest_config & (1 << i)) continue;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Xiaogang Chen xiaogang.chen@amd.com
[ Upstream commit 709c348261618da7ed89d6c303e2ceb9e453ba74 ]
prange->svm_bo unref can happen in both mmu callback and a callback after migrate to system ram. Both are async call in different tasks. Sync svm_bo unref operation to avoid random "use-after-free".
Signed-off-by: Xiaogang Chen xiaogang.chen@amd.com Reviewed-by: Philip Yang Philip.Yang@amd.com Reviewed-by: Jesse Zhang Jesse.Zhang@amd.com Tested-by: Jesse Zhang Jesse.Zhang@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c index 50f943e04f8a4..e1d73a7223675 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c @@ -617,8 +617,15 @@ svm_range_vram_node_new(struct kfd_node *node, struct svm_range *prange,
void svm_range_vram_node_free(struct svm_range *prange) { - svm_range_bo_unref(prange->svm_bo); - prange->ttm_res = NULL; + /* serialize prange->svm_bo unref */ + mutex_lock(&prange->lock); + /* prange->svm_bo has not been unref */ + if (prange->ttm_res) { + prange->ttm_res = NULL; + mutex_unlock(&prange->lock); + svm_range_bo_unref(prange->svm_bo); + } else + mutex_unlock(&prange->lock); }
struct kfd_node *
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mario Limonciello mario.limonciello@amd.com
[ Upstream commit 7752ccf85b929a22e658ec145283e8f31232f4bb ]
The matching values for `pcie_gen_cap` and `pcie_width_cap` when fetched from powerplay tables are 1 byte, so narrow the arguments to match to ensure min() and max() comparisons without casts.
Signed-off-by: Mario Limonciello mario.limonciello@amd.com Acked-by: Christian König christian.koenig@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c | 2 +- drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h | 2 +- drivers/gpu/drm/amd/pm/swsmu/inc/smu_v13_0.h | 4 ++-- drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c | 4 ++-- drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c | 8 ++++---- drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c | 4 ++-- 6 files changed, 12 insertions(+), 12 deletions(-)
diff --git a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c index 222af2fae7458..16c03771c1239 100644 --- a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c +++ b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c @@ -1232,7 +1232,7 @@ static int smu_smc_hw_setup(struct smu_context *smu) { struct smu_feature *feature = &smu->smu_feature; struct amdgpu_device *adev = smu->adev; - uint32_t pcie_gen = 0, pcie_width = 0; + uint8_t pcie_gen = 0, pcie_width = 0; uint64_t features_supported; int ret = 0;
diff --git a/drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h b/drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h index 6e2069dcb6b9d..d1d7713b97794 100644 --- a/drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h +++ b/drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h @@ -856,7 +856,7 @@ struct pptable_funcs { * &pcie_gen_cap: Maximum allowed PCIe generation. * &pcie_width_cap: Maximum allowed PCIe width. */ - int (*update_pcie_parameters)(struct smu_context *smu, uint32_t pcie_gen_cap, uint32_t pcie_width_cap); + int (*update_pcie_parameters)(struct smu_context *smu, uint8_t pcie_gen_cap, uint8_t pcie_width_cap);
/** * @i2c_init: Initialize i2c. diff --git a/drivers/gpu/drm/amd/pm/swsmu/inc/smu_v13_0.h b/drivers/gpu/drm/amd/pm/swsmu/inc/smu_v13_0.h index 355c156d871af..cc02f979e9e98 100644 --- a/drivers/gpu/drm/amd/pm/swsmu/inc/smu_v13_0.h +++ b/drivers/gpu/drm/amd/pm/swsmu/inc/smu_v13_0.h @@ -296,8 +296,8 @@ int smu_v13_0_get_pptable_from_firmware(struct smu_context *smu, uint32_t pptable_id);
int smu_v13_0_update_pcie_parameters(struct smu_context *smu, - uint32_t pcie_gen_cap, - uint32_t pcie_width_cap); + uint8_t pcie_gen_cap, + uint8_t pcie_width_cap);
#endif #endif diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c index 95f6d821bacbc..addaa69119b8e 100644 --- a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c +++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c @@ -2375,8 +2375,8 @@ static int navi10_get_power_limit(struct smu_context *smu, }
static int navi10_update_pcie_parameters(struct smu_context *smu, - uint32_t pcie_gen_cap, - uint32_t pcie_width_cap) + uint8_t pcie_gen_cap, + uint8_t pcie_width_cap) { struct smu_11_0_dpm_context *dpm_context = smu->smu_dpm.dpm_context; PPTable_t *pptable = smu->smu_table.driver_pptable; diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c index 9119b0df2419f..9a5f3d31e7780 100644 --- a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c +++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c @@ -2084,14 +2084,14 @@ static int sienna_cichlid_display_disable_memory_clock_switch(struct smu_context #define MAX(a, b) ((a) > (b) ? (a) : (b))
static int sienna_cichlid_update_pcie_parameters(struct smu_context *smu, - uint32_t pcie_gen_cap, - uint32_t pcie_width_cap) + uint8_t pcie_gen_cap, + uint8_t pcie_width_cap) { struct smu_11_0_dpm_context *dpm_context = smu->smu_dpm.dpm_context; struct smu_11_0_pcie_table *pcie_table = &dpm_context->dpm_tables.pcie_table; uint8_t *table_member1, *table_member2; - uint32_t min_gen_speed, max_gen_speed; - uint32_t min_lane_width, max_lane_width; + uint8_t min_gen_speed, max_gen_speed; + uint8_t min_lane_width, max_lane_width; uint32_t smu_pcie_arg; int ret, i;
diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c index 9b62b45ebb7f0..4fa94f583b87c 100644 --- a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c +++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c @@ -2426,8 +2426,8 @@ int smu_v13_0_mode1_reset(struct smu_context *smu) }
int smu_v13_0_update_pcie_parameters(struct smu_context *smu, - uint32_t pcie_gen_cap, - uint32_t pcie_width_cap) + uint8_t pcie_gen_cap, + uint8_t pcie_width_cap) { struct smu_13_0_dpm_context *dpm_context = smu->smu_dpm.dpm_context; struct smu_13_0_pcie_table *pcie_table =
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Wenjing Liu wenjing.liu@amd.com
[ Upstream commit 05b78277ef0efc1deebc8a22384fffec29a3676e ]
[why] Clip size increase will increase viewport, which could cause us to switch to MPC combine. If we skip full update, we are not able to change to MPC combine in fast update. This will cause corruption showing on the video plane.
[how] treat clip size increase of a surface larger than 5k as a full update.
Reviewed-by: Jun Lei jun.lei@amd.com Acked-by: Aurabindo Pillai aurabindo.pillai@amd.com Signed-off-by: Wenjing Liu wenjing.liu@amd.com Tested-by: Daniel Wheeler daniel.wheeler@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/display/dc/core/dc.c | 12 ++++++++++-- drivers/gpu/drm/amd/display/dc/dc.h | 5 +++++ 2 files changed, 15 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c b/drivers/gpu/drm/amd/display/dc/core/dc.c index 93e6265e58509..b386f3b0fd428 100644 --- a/drivers/gpu/drm/amd/display/dc/core/dc.c +++ b/drivers/gpu/drm/amd/display/dc/core/dc.c @@ -993,7 +993,8 @@ static bool dc_construct(struct dc *dc, /* set i2c speed if not done by the respective dcnxxx__resource.c */ if (dc->caps.i2c_speed_in_khz_hdcp == 0) dc->caps.i2c_speed_in_khz_hdcp = dc->caps.i2c_speed_in_khz; - + if (dc->caps.max_optimizable_video_width == 0) + dc->caps.max_optimizable_video_width = 5120; dc->clk_mgr = dc_clk_mgr_create(dc->ctx, dc->res_pool->pp_smu, dc->res_pool->dccg); if (!dc->clk_mgr) goto fail; @@ -2430,6 +2431,7 @@ static enum surface_update_type get_plane_info_update_type(const struct dc_surfa }
static enum surface_update_type get_scaling_info_update_type( + const struct dc *dc, const struct dc_surface_update *u) { union surface_update_flags *update_flags = &u->surface->update_flags; @@ -2464,6 +2466,12 @@ static enum surface_update_type get_scaling_info_update_type( update_flags->bits.clock_change = 1; }
+ if (u->scaling_info->src_rect.width > dc->caps.max_optimizable_video_width && + (u->scaling_info->clip_rect.width > u->surface->clip_rect.width || + u->scaling_info->clip_rect.height > u->surface->clip_rect.height)) + /* Changing clip size of a large surface may result in MPC slice count change */ + update_flags->bits.bandwidth_change = 1; + if (u->scaling_info->src_rect.x != u->surface->src_rect.x || u->scaling_info->src_rect.y != u->surface->src_rect.y || u->scaling_info->clip_rect.x != u->surface->clip_rect.x @@ -2501,7 +2509,7 @@ static enum surface_update_type det_surface_update(const struct dc *dc, type = get_plane_info_update_type(u); elevate_update_type(&overall_type, type);
- type = get_scaling_info_update_type(u); + type = get_scaling_info_update_type(dc, u); elevate_update_type(&overall_type, type);
if (u->flip_addr) { diff --git a/drivers/gpu/drm/amd/display/dc/dc.h b/drivers/gpu/drm/amd/display/dc/dc.h index 81258392d44a1..dc0e0af616506 100644 --- a/drivers/gpu/drm/amd/display/dc/dc.h +++ b/drivers/gpu/drm/amd/display/dc/dc.h @@ -229,6 +229,11 @@ struct dc_caps { uint32_t dmdata_alloc_size; unsigned int max_cursor_size; unsigned int max_video_width; + /* + * max video plane width that can be safely assumed to be always + * supported by single DPP pipe. + */ + unsigned int max_optimizable_video_width; unsigned int min_horizontal_blanking_period; int linear_pitch_alignment; bool dcc_const_color;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Philipp Stanner pstanner@redhat.com
[ Upstream commit 313ebe47d75558511aa1237b6e35c663b5c0ec6f ]
Currently, user array duplications are sometimes done without an overflow check. Sometimes the checks are done manually; sometimes the array size is calculated with array_size() and sometimes by calculating n * size directly in code.
Introduce wrappers for arrays for memdup_user() and vmemdup_user() to provide a standardized and safe way for duplicating user arrays.
This is both for new code as well as replacing usage of (v)memdup_user() in existing code that uses, e.g., n * size to calculate array sizes.
Suggested-by: David Airlie airlied@redhat.com Signed-off-by: Philipp Stanner pstanner@redhat.com Reviewed-by: Andy Shevchenko andy.shevchenko@gmail.com Reviewed-by: Kees Cook keescook@chromium.org Reviewed-by: Zack Rusin zackr@vmware.com Signed-off-by: Dave Airlie airlied@redhat.com Link: https://patchwork.freedesktop.org/patch/msgid/20230920123612.16914-3-pstanne... Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/string.h | 40 ++++++++++++++++++++++++++++++++++++++++ 1 file changed, 40 insertions(+)
diff --git a/include/linux/string.h b/include/linux/string.h index 9e3cb6923b0ef..5077776e995e0 100644 --- a/include/linux/string.h +++ b/include/linux/string.h @@ -5,7 +5,9 @@ #include <linux/compiler.h> /* for inline */ #include <linux/types.h> /* for size_t */ #include <linux/stddef.h> /* for NULL */ +#include <linux/err.h> /* for ERR_PTR() */ #include <linux/errno.h> /* for E2BIG */ +#include <linux/overflow.h> /* for check_mul_overflow() */ #include <linux/stdarg.h> #include <uapi/linux/string.h>
@@ -14,6 +16,44 @@ extern void *memdup_user(const void __user *, size_t); extern void *vmemdup_user(const void __user *, size_t); extern void *memdup_user_nul(const void __user *, size_t);
+/** + * memdup_array_user - duplicate array from user space + * @src: source address in user space + * @n: number of array members to copy + * @size: size of one array member + * + * Return: an ERR_PTR() on failure. Result is physically + * contiguous, to be freed by kfree(). + */ +static inline void *memdup_array_user(const void __user *src, size_t n, size_t size) +{ + size_t nbytes; + + if (check_mul_overflow(n, size, &nbytes)) + return ERR_PTR(-EOVERFLOW); + + return memdup_user(src, nbytes); +} + +/** + * vmemdup_array_user - duplicate array from user space + * @src: source address in user space + * @n: number of array members to copy + * @size: size of one array member + * + * Return: an ERR_PTR() on failure. Result may be not + * physically contiguous. Use kvfree() to free. + */ +static inline void *vmemdup_array_user(const void __user *src, size_t n, size_t size) +{ + size_t nbytes; + + if (check_mul_overflow(n, size, &nbytes)) + return ERR_PTR(-EOVERFLOW); + + return vmemdup_user(src, nbytes); +} + /* * Include machine specific inline routines */
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Philipp Stanner pstanner@redhat.com
[ Upstream commit 569c8d82f95eb5993c84fb61a649a9c4ddd208b3 ]
Currently, there is no overflow-check with memdup_user().
Use the new function memdup_array_user() instead of memdup_user() for duplicating the user-space array safely.
Suggested-by: David Airlie airlied@redhat.com Signed-off-by: Philipp Stanner pstanner@redhat.com Acked-by: Baoquan He bhe@redhat.com Reviewed-by: Kees Cook keescook@chromium.org Reviewed-by: Zack Rusin zackr@vmware.com Signed-off-by: Dave Airlie airlied@redhat.com Link: https://patchwork.freedesktop.org/patch/msgid/20230920123612.16914-4-pstanne... Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/kexec.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/kexec.c b/kernel/kexec.c index 92d301f987766..f6067c1bb0893 100644 --- a/kernel/kexec.c +++ b/kernel/kexec.c @@ -242,7 +242,7 @@ SYSCALL_DEFINE4(kexec_load, unsigned long, entry, unsigned long, nr_segments, ((flags & KEXEC_ARCH_MASK) != KEXEC_ARCH_DEFAULT)) return -EINVAL;
- ksegments = memdup_user(segments, nr_segments * sizeof(ksegments[0])); + ksegments = memdup_array_user(segments, nr_segments, sizeof(ksegments[0])); if (IS_ERR(ksegments)) return PTR_ERR(ksegments);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Philipp Stanner pstanner@redhat.com
[ Upstream commit ca0776571d3163bd03b3e8c9e3da936abfaecbf6 ]
Currently, there is no overflow-check with memdup_user().
Use the new function memdup_array_user() instead of memdup_user() for duplicating the user-space array safely.
Suggested-by: David Airlie airlied@redhat.com Signed-off-by: Philipp Stanner pstanner@redhat.com Reviewed-by: Kees Cook keescook@chromium.org Reviewed-by: Zack Rusin zackr@vmware.com Signed-off-by: Dave Airlie airlied@redhat.com Link: https://patchwork.freedesktop.org/patch/msgid/20230920123612.16914-5-pstanne... Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/watch_queue.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/watch_queue.c b/kernel/watch_queue.c index d0b6b390ee423..778b4056700ff 100644 --- a/kernel/watch_queue.c +++ b/kernel/watch_queue.c @@ -331,7 +331,7 @@ long watch_queue_set_filter(struct pipe_inode_info *pipe, filter.__reserved != 0) return -EINVAL;
- tf = memdup_user(_filter->filters, filter.nr_filters * sizeof(*tf)); + tf = memdup_array_user(_filter->filters, filter.nr_filters, sizeof(*tf)); if (IS_ERR(tf)) return PTR_ERR(tf);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Philipp Stanner pstanner@redhat.com
[ Upstream commit f37d63e219c39199a59b8b8a211412ff27192830 ]
Currently, there is no overflow-check with memdup_user().
Use the new function memdup_array_user() instead of memdup_user() for duplicating the user-space array safely.
Suggested-by: David Airlie airlied@redhat.com Signed-off-by: Philipp Stanner pstanner@redhat.com Reviewed-by: Kees Cook keescook@chromium.org Reviewed-by: Zack Rusin zackr@vmware.com Signed-off-by: Dave Airlie airlied@redhat.com Link: https://patchwork.freedesktop.org/patch/msgid/20230920123612.16914-6-pstanne... Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/drm_lease.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/drm_lease.c b/drivers/gpu/drm/drm_lease.c index 150fe15550680..94375c6a54256 100644 --- a/drivers/gpu/drm/drm_lease.c +++ b/drivers/gpu/drm/drm_lease.c @@ -510,8 +510,8 @@ int drm_mode_create_lease_ioctl(struct drm_device *dev, /* Handle leased objects, if any */ idr_init(&leases); if (object_count != 0) { - object_ids = memdup_user(u64_to_user_ptr(cl->object_ids), - array_size(object_count, sizeof(__u32))); + object_ids = memdup_array_user(u64_to_user_ptr(cl->object_ids), + object_count, sizeof(__u32)); if (IS_ERR(object_ids)) { ret = PTR_ERR(object_ids); idr_destroy(&leases);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Philipp Stanner pstanner@redhat.com
[ Upstream commit 06ab64a0d836ac430c5f94669710a78aa43942cb ]
Currently, there is no overflow-check with memdup_user().
Use the new function memdup_array_user() instead of memdup_user() for duplicating the user-space array safely.
Suggested-by: David Airlie airlied@redhat.com Signed-off-by: Philipp Stanner pstanner@redhat.com Reviewed-by: Kees Cook keescook@chromium.org Reviewed-by: Zack Rusin zackr@vmware.com Signed-off-by: Dave Airlie airlied@redhat.com Link: https://patchwork.freedesktop.org/patch/msgid/20230920123612.16914-7-pstanne... Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/vmwgfx/vmwgfx_surface.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_surface.c b/drivers/gpu/drm/vmwgfx/vmwgfx_surface.c index 3829be282ff00..17463aeeef28f 100644 --- a/drivers/gpu/drm/vmwgfx/vmwgfx_surface.c +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_surface.c @@ -774,9 +774,9 @@ int vmw_surface_define_ioctl(struct drm_device *dev, void *data, sizeof(metadata->mip_levels)); metadata->num_sizes = num_sizes; metadata->sizes = - memdup_user((struct drm_vmw_size __user *)(unsigned long) + memdup_array_user((struct drm_vmw_size __user *)(unsigned long) req->size_addr, - sizeof(*metadata->sizes) * metadata->num_sizes); + metadata->num_sizes, sizeof(*metadata->sizes)); if (IS_ERR(metadata->sizes)) { ret = PTR_ERR(metadata->sizes); goto out_no_sizes;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jani Nikula jani.nikula@intel.com
[ Upstream commit a251c9d8e30833b260101edb9383b176ee2b7cb1 ]
The DP CTS test for EDID last block checksum expects the checksum for the last block, invalid or not. Skip the validity check.
For the most part (*), the EDIDs returned by drm_get_edid() will be valid anyway, and there's the CTS workaround to get the checksum for completely invalid EDIDs. See commit 7948fe12d47a ("drm/msm/dp: return correct edid checksum after corrupted edid checksum read").
This lets us remove one user of drm_edid_block_valid() with hopes the function can be removed altogether in the future.
(*) drm_get_edid() ignores checksum errors on CTA extensions.
Cc: Abhinav Kumar quic_abhinavk@quicinc.com Cc: Dmitry Baryshkov dmitry.baryshkov@linaro.org Cc: Kuogee Hsieh khsieh@codeaurora.org Cc: Marijn Suijten marijn.suijten@somainline.org Cc: Rob Clark robdclark@gmail.com Cc: Sean Paul sean@poorly.run Cc: Stephen Boyd swboyd@chromium.org Cc: linux-arm-msm@vger.kernel.org Cc: freedreno@lists.freedesktop.org Signed-off-by: Jani Nikula jani.nikula@intel.com Reviewed-by: Stephen Boyd swboyd@chromium.org Reviewed-by: Abhinav Kumar quic_abhinavk@quicinc.com Reviewed-by: Kuogee Hsieh quic_khsieh@quicinc.com Patchwork: https://patchwork.freedesktop.org/patch/555361/ Link: https://lore.kernel.org/r/20230901142034.580802-1-jani.nikula@intel.com Signed-off-by: Dmitry Baryshkov dmitry.baryshkov@linaro.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/msm/dp/dp_panel.c | 21 ++------------------- 1 file changed, 2 insertions(+), 19 deletions(-)
diff --git a/drivers/gpu/drm/msm/dp/dp_panel.c b/drivers/gpu/drm/msm/dp/dp_panel.c index 42d52510ffd4a..86a8e06c7a60f 100644 --- a/drivers/gpu/drm/msm/dp/dp_panel.c +++ b/drivers/gpu/drm/msm/dp/dp_panel.c @@ -289,26 +289,9 @@ int dp_panel_get_modes(struct dp_panel *dp_panel,
static u8 dp_panel_get_edid_checksum(struct edid *edid) { - struct edid *last_block; - u8 *raw_edid; - bool is_edid_corrupt = false; + edid += edid->extensions;
- if (!edid) { - DRM_ERROR("invalid edid input\n"); - return 0; - } - - raw_edid = (u8 *)edid; - raw_edid += (edid->extensions * EDID_LENGTH); - last_block = (struct edid *)raw_edid; - - /* block type extension */ - drm_edid_block_valid(raw_edid, 1, false, &is_edid_corrupt); - if (!is_edid_corrupt) - return last_block->checksum; - - DRM_ERROR("Invalid block, no checksum\n"); - return 0; + return edid->checksum; }
void dp_panel_handle_sink_request(struct dp_panel *dp_panel)
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mario Limonciello mario.limonciello@amd.com
[ Upstream commit 760efbca74a405dc439a013a5efaa9fadc95a8c3 ]
For pptable structs that use flexible array sizes, use flexible arrays.
Suggested-by: Felix Held felix.held@amd.com Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2874 Signed-off-by: Mario Limonciello mario.limonciello@amd.com Acked-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/include/pptable.h | 4 ++-- drivers/gpu/drm/amd/pm/powerplay/hwmgr/pptable_v1_0.h | 4 ++-- 2 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/gpu/drm/amd/include/pptable.h b/drivers/gpu/drm/amd/include/pptable.h index 0b6a057e0a4c4..5aac8d545bdc6 100644 --- a/drivers/gpu/drm/amd/include/pptable.h +++ b/drivers/gpu/drm/amd/include/pptable.h @@ -78,7 +78,7 @@ typedef struct _ATOM_PPLIB_THERMALCONTROLLER typedef struct _ATOM_PPLIB_STATE { UCHAR ucNonClockStateIndex; - UCHAR ucClockStateIndices[1]; // variable-sized + UCHAR ucClockStateIndices[]; // variable-sized } ATOM_PPLIB_STATE;
@@ -473,7 +473,7 @@ typedef struct _ATOM_PPLIB_STATE_V2 /** * Driver will read the first ucNumDPMLevels in this array */ - UCHAR clockInfoIndex[1]; + UCHAR clockInfoIndex[]; } ATOM_PPLIB_STATE_V2;
typedef struct _StateArray{ diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/pptable_v1_0.h b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/pptable_v1_0.h index b0ac4d121adca..41444e27bfc0c 100644 --- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/pptable_v1_0.h +++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/pptable_v1_0.h @@ -179,7 +179,7 @@ typedef struct _ATOM_Tonga_MCLK_Dependency_Record { typedef struct _ATOM_Tonga_MCLK_Dependency_Table { UCHAR ucRevId; UCHAR ucNumEntries; /* Number of entries. */ - ATOM_Tonga_MCLK_Dependency_Record entries[1]; /* Dynamically allocate entries. */ + ATOM_Tonga_MCLK_Dependency_Record entries[]; /* Dynamically allocate entries. */ } ATOM_Tonga_MCLK_Dependency_Table;
typedef struct _ATOM_Tonga_SCLK_Dependency_Record { @@ -194,7 +194,7 @@ typedef struct _ATOM_Tonga_SCLK_Dependency_Record { typedef struct _ATOM_Tonga_SCLK_Dependency_Table { UCHAR ucRevId; UCHAR ucNumEntries; /* Number of entries. */ - ATOM_Tonga_SCLK_Dependency_Record entries[1]; /* Dynamically allocate entries. */ + ATOM_Tonga_SCLK_Dependency_Record entries[]; /* Dynamically allocate entries. */ } ATOM_Tonga_SCLK_Dependency_Table;
typedef struct _ATOM_Polaris_SCLK_Dependency_Record {
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mario Limonciello mario.limonciello@amd.com
[ Upstream commit 0f0e59075b5c22f1e871fbd508d6e4f495048356 ]
For pptable structs that use flexible array sizes, use flexible arrays.
Link: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2036742 Signed-off-by: Mario Limonciello mario.limonciello@amd.com Acked-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- .../gpu/drm/amd/pm/powerplay/hwmgr/pptable_v1_0.h | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/pptable_v1_0.h b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/pptable_v1_0.h index 41444e27bfc0c..e0e40b054c08b 100644 --- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/pptable_v1_0.h +++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/pptable_v1_0.h @@ -164,7 +164,7 @@ typedef struct _ATOM_Tonga_State { typedef struct _ATOM_Tonga_State_Array { UCHAR ucRevId; UCHAR ucNumEntries; /* Number of entries. */ - ATOM_Tonga_State entries[1]; /* Dynamically allocate entries. */ + ATOM_Tonga_State entries[]; /* Dynamically allocate entries. */ } ATOM_Tonga_State_Array;
typedef struct _ATOM_Tonga_MCLK_Dependency_Record { @@ -210,7 +210,7 @@ typedef struct _ATOM_Polaris_SCLK_Dependency_Record { typedef struct _ATOM_Polaris_SCLK_Dependency_Table { UCHAR ucRevId; UCHAR ucNumEntries; /* Number of entries. */ - ATOM_Polaris_SCLK_Dependency_Record entries[1]; /* Dynamically allocate entries. */ + ATOM_Polaris_SCLK_Dependency_Record entries[]; /* Dynamically allocate entries. */ } ATOM_Polaris_SCLK_Dependency_Table;
typedef struct _ATOM_Tonga_PCIE_Record { @@ -222,7 +222,7 @@ typedef struct _ATOM_Tonga_PCIE_Record { typedef struct _ATOM_Tonga_PCIE_Table { UCHAR ucRevId; UCHAR ucNumEntries; /* Number of entries. */ - ATOM_Tonga_PCIE_Record entries[1]; /* Dynamically allocate entries. */ + ATOM_Tonga_PCIE_Record entries[]; /* Dynamically allocate entries. */ } ATOM_Tonga_PCIE_Table;
typedef struct _ATOM_Polaris10_PCIE_Record { @@ -235,7 +235,7 @@ typedef struct _ATOM_Polaris10_PCIE_Record { typedef struct _ATOM_Polaris10_PCIE_Table { UCHAR ucRevId; UCHAR ucNumEntries; /* Number of entries. */ - ATOM_Polaris10_PCIE_Record entries[1]; /* Dynamically allocate entries. */ + ATOM_Polaris10_PCIE_Record entries[]; /* Dynamically allocate entries. */ } ATOM_Polaris10_PCIE_Table;
@@ -252,7 +252,7 @@ typedef struct _ATOM_Tonga_MM_Dependency_Record { typedef struct _ATOM_Tonga_MM_Dependency_Table { UCHAR ucRevId; UCHAR ucNumEntries; /* Number of entries. */ - ATOM_Tonga_MM_Dependency_Record entries[1]; /* Dynamically allocate entries. */ + ATOM_Tonga_MM_Dependency_Record entries[]; /* Dynamically allocate entries. */ } ATOM_Tonga_MM_Dependency_Table;
typedef struct _ATOM_Tonga_Voltage_Lookup_Record { @@ -265,7 +265,7 @@ typedef struct _ATOM_Tonga_Voltage_Lookup_Record { typedef struct _ATOM_Tonga_Voltage_Lookup_Table { UCHAR ucRevId; UCHAR ucNumEntries; /* Number of entries. */ - ATOM_Tonga_Voltage_Lookup_Record entries[1]; /* Dynamically allocate entries. */ + ATOM_Tonga_Voltage_Lookup_Record entries[]; /* Dynamically allocate entries. */ } ATOM_Tonga_Voltage_Lookup_Table;
typedef struct _ATOM_Tonga_Fan_Table {
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Stanley.Yang Stanley.Yang@amd.com
[ Upstream commit 80285ae1ec8717b597b20de38866c29d84d321a1 ]
The amdgpu_ras_get_context may return NULL if device not support ras feature, so add check before using.
Signed-off-by: Stanley.Yang Stanley.Yang@amd.com Reviewed-by: Tao Zhou tao.zhou1@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index 8940ee73f2dfe..ecc61a6d13e13 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -5399,7 +5399,8 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev, * Flush RAM to disk so that after reboot * the user can read log and see why the system rebooted. */ - if (need_emergency_restart && amdgpu_ras_get_context(adev)->reboot) { + if (need_emergency_restart && amdgpu_ras_get_context(adev) && + amdgpu_ras_get_context(adev)->reboot) { DRM_WARN("Emergency reboot.");
ksys_sync_helper();
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ma Ke make_ruc2021@163.com
[ Upstream commit 924e5814d1f84e6fa5cb19c6eceb69f066225229 ]
In versatile_panel_get_modes(), the return value of drm_mode_duplicate() is assigned to mode, which will lead to a NULL pointer dereference on failure of drm_mode_duplicate(). Add a check to avoid npd.
Signed-off-by: Ma Ke make_ruc2021@163.com Reviewed-by: Neil Armstrong neil.armstrong@linaro.org Link: https://lore.kernel.org/r/20231007033105.3997998-1-make_ruc2021@163.com Signed-off-by: Neil Armstrong neil.armstrong@linaro.org Link: https://patchwork.freedesktop.org/patch/msgid/20231007033105.3997998-1-make_... Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/panel/panel-arm-versatile.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/gpu/drm/panel/panel-arm-versatile.c b/drivers/gpu/drm/panel/panel-arm-versatile.c index abb0788843c60..503ecea72c5ea 100644 --- a/drivers/gpu/drm/panel/panel-arm-versatile.c +++ b/drivers/gpu/drm/panel/panel-arm-versatile.c @@ -267,6 +267,8 @@ static int versatile_panel_get_modes(struct drm_panel *panel, connector->display_info.bus_flags = vpanel->panel_type->bus_flags;
mode = drm_mode_duplicate(connector->dev, &vpanel->panel_type->mode); + if (!mode) + return -ENOMEM; drm_mode_set_name(mode); mode->type = DRM_MODE_TYPE_DRIVER | DRM_MODE_TYPE_PREFERRED;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ma Ke make_ruc2021@163.com
[ Upstream commit f22def5970c423ea7f87d5247bd0ef91416b0658 ]
In tpg110_get_modes(), the return value of drm_mode_duplicate() is assigned to mode, which will lead to a NULL pointer dereference on failure of drm_mode_duplicate(). Add a check to avoid npd.
Signed-off-by: Ma Ke make_ruc2021@163.com Reviewed-by: Neil Armstrong neil.armstrong@linaro.org Link: https://lore.kernel.org/r/20231009090446.4043798-1-make_ruc2021@163.com Signed-off-by: Neil Armstrong neil.armstrong@linaro.org Link: https://patchwork.freedesktop.org/patch/msgid/20231009090446.4043798-1-make_... Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/panel/panel-tpo-tpg110.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/gpu/drm/panel/panel-tpo-tpg110.c b/drivers/gpu/drm/panel/panel-tpo-tpg110.c index 845304435e235..f6a212e542cb9 100644 --- a/drivers/gpu/drm/panel/panel-tpo-tpg110.c +++ b/drivers/gpu/drm/panel/panel-tpo-tpg110.c @@ -379,6 +379,8 @@ static int tpg110_get_modes(struct drm_panel *panel, connector->display_info.bus_flags = tpg->panel_mode->bus_flags;
mode = drm_mode_duplicate(connector->dev, &tpg->panel_mode->mode); + if (!mode) + return -ENOMEM; drm_mode_set_name(mode); mode->type = DRM_MODE_TYPE_DRIVER | DRM_MODE_TYPE_PREFERRED;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ma Ke make_ruc2021@163.com
[ Upstream commit 2c1fe3c480f9e1deefd50d4b18be4a046011ee1f ]
In radeon_tv_get_modes(), the return value of drm_cvt_mode() is assigned to mode, which will lead to a NULL pointer dereference on failure of drm_cvt_mode(). Add a check to avoid null point dereference.
Signed-off-by: Ma Ke make_ruc2021@163.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/radeon/radeon_connectors.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/gpu/drm/radeon/radeon_connectors.c b/drivers/gpu/drm/radeon/radeon_connectors.c index 07193cd0c4174..4859d965d67e3 100644 --- a/drivers/gpu/drm/radeon/radeon_connectors.c +++ b/drivers/gpu/drm/radeon/radeon_connectors.c @@ -1122,6 +1122,8 @@ static int radeon_tv_get_modes(struct drm_connector *connector) else { /* only 800x600 is supported right now on pre-avivo chips */ tv_mode = drm_cvt_mode(dev, 800, 600, 60, false, false, false); + if (!tv_mode) + return 0; tv_mode->type = DRM_MODE_TYPE_DRIVER | DRM_MODE_TYPE_PREFERRED; drm_mode_probed_add(connector, tv_mode); }
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ma Ke make_ruc2021@163.com
[ Upstream commit cd90511557fdfb394bb4ac4c3b539b007383914c ]
In amdgpu_vkms_conn_get_modes(), the return value of drm_cvt_mode() is assigned to mode, which will lead to a NULL pointer dereference on failure of drm_cvt_mode(). Add a check to avoid null pointer dereference.
Signed-off-by: Ma Ke make_ruc2021@163.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.c index d0748bcfad16b..75d25fba80821 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.c @@ -239,6 +239,8 @@ static int amdgpu_vkms_conn_get_modes(struct drm_connector *connector)
for (i = 0; i < ARRAY_SIZE(common_modes); i++) { mode = drm_cvt_mode(dev, common_modes[i].w, common_modes[i].h, 60, false, false, false); + if (!mode) + continue; drm_mode_probed_add(connector, mode); }
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ondrej Jirman megi@xff.cz
[ Upstream commit d12d635bb03c7cb4830acb641eb176ee9ff2aa89 ]
Switching to a different reset sequence, enabling IOVCC before enabling VCC.
There also needs to be a delay after enabling the supplies and before deasserting the reset. The datasheet specifies 1ms after the supplies reach the required voltage. Use 10-20ms to also give the power supplies some time to reach the required voltage, too.
This fixes intermittent panel initialization failures and screen corruption during resume from sleep on panel xingbangda,xbd599 (e.g. used in PinePhone).
Signed-off-by: Ondrej Jirman megi@xff.cz Signed-off-by: Frank Oltmanns frank@oltmanns.dev Reported-by: Samuel Holland samuel@sholland.org Reviewed-by: Guido Günther agx@sigxcpu.org Tested-by: Guido Günther agx@sigxcpu.org Signed-off-by: Guido Günther agx@sigxcpu.org Link: https://patchwork.freedesktop.org/patch/msgid/20230211171748.36692-2-frank@o... Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/panel/panel-sitronix-st7703.c | 25 ++++++++++--------- 1 file changed, 13 insertions(+), 12 deletions(-)
diff --git a/drivers/gpu/drm/panel/panel-sitronix-st7703.c b/drivers/gpu/drm/panel/panel-sitronix-st7703.c index 3aa31f3d61574..687749bbec62c 100644 --- a/drivers/gpu/drm/panel/panel-sitronix-st7703.c +++ b/drivers/gpu/drm/panel/panel-sitronix-st7703.c @@ -506,29 +506,30 @@ static int st7703_prepare(struct drm_panel *panel) return 0;
dev_dbg(ctx->dev, "Resetting the panel\n"); - ret = regulator_enable(ctx->vcc); + gpiod_set_value_cansleep(ctx->reset_gpio, 1); + + ret = regulator_enable(ctx->iovcc); if (ret < 0) { - dev_err(ctx->dev, "Failed to enable vcc supply: %d\n", ret); + dev_err(ctx->dev, "Failed to enable iovcc supply: %d\n", ret); return ret; } - ret = regulator_enable(ctx->iovcc); + + ret = regulator_enable(ctx->vcc); if (ret < 0) { - dev_err(ctx->dev, "Failed to enable iovcc supply: %d\n", ret); - goto disable_vcc; + dev_err(ctx->dev, "Failed to enable vcc supply: %d\n", ret); + regulator_disable(ctx->iovcc); + return ret; }
- gpiod_set_value_cansleep(ctx->reset_gpio, 1); - usleep_range(20, 40); + /* Give power supplies time to stabilize before deasserting reset. */ + usleep_range(10000, 20000); + gpiod_set_value_cansleep(ctx->reset_gpio, 0); - msleep(20); + usleep_range(15000, 20000);
ctx->prepared = true;
return 0; - -disable_vcc: - regulator_disable(ctx->vcc); - return ret; }
static const u32 mantix_bus_formats[] = {
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jesse Zhang jesse.zhang@amd.com
[ Upstream commit 282c1d793076c2edac6c3db51b7e8ed2b41d60a5 ]
[ 567.613292] shift exponent 255 is too large for 64-bit type 'long unsigned int' [ 567.614498] CPU: 5 PID: 238 Comm: kworker/5:1 Tainted: G OE 6.2.0-34-generic #34~22.04.1-Ubuntu [ 567.614502] Hardware name: AMD Splinter/Splinter-RPL, BIOS WS43927N_871 09/25/2023 [ 567.614504] Workqueue: events send_exception_work_handler [amdgpu] [ 567.614748] Call Trace: [ 567.614750] <TASK> [ 567.614753] dump_stack_lvl+0x48/0x70 [ 567.614761] dump_stack+0x10/0x20 [ 567.614763] __ubsan_handle_shift_out_of_bounds+0x156/0x310 [ 567.614769] ? srso_alias_return_thunk+0x5/0x7f [ 567.614773] ? update_sd_lb_stats.constprop.0+0xf2/0x3c0 [ 567.614780] svm_range_split_by_granularity.cold+0x2b/0x34 [amdgpu] [ 567.615047] ? srso_alias_return_thunk+0x5/0x7f [ 567.615052] svm_migrate_to_ram+0x185/0x4d0 [amdgpu] [ 567.615286] do_swap_page+0x7b6/0xa30 [ 567.615291] ? srso_alias_return_thunk+0x5/0x7f [ 567.615294] ? __free_pages+0x119/0x130 [ 567.615299] handle_pte_fault+0x227/0x280 [ 567.615303] __handle_mm_fault+0x3c0/0x720 [ 567.615311] handle_mm_fault+0x119/0x330 [ 567.615314] ? lock_mm_and_find_vma+0x44/0x250 [ 567.615318] do_user_addr_fault+0x1a9/0x640 [ 567.615323] exc_page_fault+0x81/0x1b0 [ 567.615328] asm_exc_page_fault+0x27/0x30 [ 567.615332] RIP: 0010:__get_user_8+0x1c/0x30
Signed-off-by: Jesse Zhang jesse.zhang@amd.com Suggested-by: Philip Yang Philip.Yang@amd.com Reviewed-by: Yifan Zhang yifan1.zhang@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c index e1d73a7223675..a5c394fcbb350 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c @@ -756,7 +756,7 @@ svm_range_apply_attrs(struct kfd_process *p, struct svm_range *prange, prange->flags &= ~attrs[i].value; break; case KFD_IOCTL_SVM_ATTR_GRANULARITY: - prange->granularity = attrs[i].value; + prange->granularity = min_t(uint32_t, attrs[i].value, 0x3F); break; default: WARN_ONCE(1, "svm_range_check_attrs wasn't called?");
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Qu Huang qu.huang@linux.dev
[ Upstream commit 5104fdf50d326db2c1a994f8b35dcd46e63ae4ad ]
In certain types of chips, such as VEGA20, reading the amdgpu_regs_smc file could result in an abnormal null pointer access when the smc_rreg pointer is NULL. Below are the steps to reproduce this issue and the corresponding exception log:
1. Navigate to the directory: /sys/kernel/debug/dri/0 2. Execute command: cat amdgpu_regs_smc 3. Exception Log:: [4005007.702554] BUG: kernel NULL pointer dereference, address: 0000000000000000 [4005007.702562] #PF: supervisor instruction fetch in kernel mode [4005007.702567] #PF: error_code(0x0010) - not-present page [4005007.702570] PGD 0 P4D 0 [4005007.702576] Oops: 0010 [#1] SMP NOPTI [4005007.702581] CPU: 4 PID: 62563 Comm: cat Tainted: G OE 5.15.0-43-generic #46-Ubunt u [4005007.702590] RIP: 0010:0x0 [4005007.702598] Code: Unable to access opcode bytes at RIP 0xffffffffffffffd6. [4005007.702600] RSP: 0018:ffffa82b46d27da0 EFLAGS: 00010206 [4005007.702605] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffa82b46d27e68 [4005007.702609] RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff9940656e0000 [4005007.702612] RBP: ffffa82b46d27dd8 R08: 0000000000000000 R09: ffff994060c07980 [4005007.702615] R10: 0000000000020000 R11: 0000000000000000 R12: 00007f5e06753000 [4005007.702618] R13: ffff9940656e0000 R14: ffffa82b46d27e68 R15: 00007f5e06753000 [4005007.702622] FS: 00007f5e0755b740(0000) GS:ffff99479d300000(0000) knlGS:0000000000000000 [4005007.702626] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [4005007.702629] CR2: ffffffffffffffd6 CR3: 00000003253fc000 CR4: 00000000003506e0 [4005007.702633] Call Trace: [4005007.702636] <TASK> [4005007.702640] amdgpu_debugfs_regs_smc_read+0xb0/0x120 [amdgpu] [4005007.703002] full_proxy_read+0x5c/0x80 [4005007.703011] vfs_read+0x9f/0x1a0 [4005007.703019] ksys_read+0x67/0xe0 [4005007.703023] __x64_sys_read+0x19/0x20 [4005007.703028] do_syscall_64+0x5c/0xc0 [4005007.703034] ? do_user_addr_fault+0x1e3/0x670 [4005007.703040] ? exit_to_user_mode_prepare+0x37/0xb0 [4005007.703047] ? irqentry_exit_to_user_mode+0x9/0x20 [4005007.703052] ? irqentry_exit+0x19/0x30 [4005007.703057] ? exc_page_fault+0x89/0x160 [4005007.703062] ? asm_exc_page_fault+0x8/0x30 [4005007.703068] entry_SYSCALL_64_after_hwframe+0x44/0xae [4005007.703075] RIP: 0033:0x7f5e07672992 [4005007.703079] Code: c0 e9 b2 fe ff ff 50 48 8d 3d fa b2 0c 00 e8 c5 1d 02 00 0f 1f 44 00 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 0f 05 <48> 3d 00 f0 ff ff 77 56 c3 0f 1f 44 00 00 48 83 e c 28 48 89 54 24 [4005007.703083] RSP: 002b:00007ffe03097898 EFLAGS: 00000246 ORIG_RAX: 0000000000000000 [4005007.703088] RAX: ffffffffffffffda RBX: 0000000000020000 RCX: 00007f5e07672992 [4005007.703091] RDX: 0000000000020000 RSI: 00007f5e06753000 RDI: 0000000000000003 [4005007.703094] RBP: 00007f5e06753000 R08: 00007f5e06752010 R09: 00007f5e06752010 [4005007.703096] R10: 0000000000000022 R11: 0000000000000246 R12: 0000000000022000 [4005007.703099] R13: 0000000000000003 R14: 0000000000020000 R15: 0000000000020000 [4005007.703105] </TASK> [4005007.703107] Modules linked in: nf_tables libcrc32c nfnetlink algif_hash af_alg binfmt_misc nls_ iso8859_1 ipmi_ssif ast intel_rapl_msr intel_rapl_common drm_vram_helper drm_ttm_helper amd64_edac t tm edac_mce_amd kvm_amd ccp mac_hid k10temp kvm acpi_ipmi ipmi_si rapl sch_fq_codel ipmi_devintf ipm i_msghandler msr parport_pc ppdev lp parport mtd pstore_blk efi_pstore ramoops pstore_zone reed_solo mon ip_tables x_tables autofs4 ib_uverbs ib_core amdgpu(OE) amddrm_ttm_helper(OE) amdttm(OE) iommu_v 2 amd_sched(OE) amdkcl(OE) drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops cec rc_core drm igb ahci xhci_pci libahci i2c_piix4 i2c_algo_bit xhci_pci_renesas dca [4005007.703184] CR2: 0000000000000000 [4005007.703188] ---[ end trace ac65a538d240da39 ]--- [4005007.800865] RIP: 0010:0x0 [4005007.800871] Code: Unable to access opcode bytes at RIP 0xffffffffffffffd6. [4005007.800874] RSP: 0018:ffffa82b46d27da0 EFLAGS: 00010206 [4005007.800878] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffa82b46d27e68 [4005007.800881] RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff9940656e0000 [4005007.800883] RBP: ffffa82b46d27dd8 R08: 0000000000000000 R09: ffff994060c07980 [4005007.800886] R10: 0000000000020000 R11: 0000000000000000 R12: 00007f5e06753000 [4005007.800888] R13: ffff9940656e0000 R14: ffffa82b46d27e68 R15: 00007f5e06753000 [4005007.800891] FS: 00007f5e0755b740(0000) GS:ffff99479d300000(0000) knlGS:0000000000000000 [4005007.800895] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [4005007.800898] CR2: ffffffffffffffd6 CR3: 00000003253fc000 CR4: 00000000003506e0
Signed-off-by: Qu Huang qu.huang@linux.dev Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c index 56e89e76ff179..33cada366eeb1 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c @@ -747,6 +747,9 @@ static ssize_t amdgpu_debugfs_regs_smc_read(struct file *f, char __user *buf, ssize_t result = 0; int r;
+ if (!adev->smc_rreg) + return -EPERM; + if (size & 0x3 || *pos & 0x3) return -EINVAL;
@@ -803,6 +806,9 @@ static ssize_t amdgpu_debugfs_regs_smc_write(struct file *f, const char __user * ssize_t result = 0; int r;
+ if (!adev->smc_wreg) + return -EPERM; + if (size & 0x3 || *pos & 0x3) return -EINVAL;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mario Limonciello mario.limonciello@amd.com
[ Upstream commit fbf1035b033a51eee48d5f42e781b02fff272ca0 ]
Rather than individual ASICs checking for the quirk, set the quirk at the driver level.
Signed-off-by: Mario Limonciello mario.limonciello@amd.com Reviewed-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 ++ drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c | 4 +--- drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c | 2 +- drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c | 2 +- 4 files changed, 5 insertions(+), 5 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index ecc61a6d13e13..65779cbdbad13 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -2318,6 +2318,8 @@ static int amdgpu_device_ip_early_init(struct amdgpu_device *adev) adev->pm.pp_feature &= ~PP_GFXOFF_MASK; if (amdgpu_sriov_vf(adev) && adev->asic_type == CHIP_SIENNA_CICHLID) adev->pm.pp_feature &= ~PP_OVERDRIVE_MASK; + if (!amdgpu_device_pcie_dynamic_switching_supported()) + adev->pm.pp_feature &= ~PP_PCIE_DPM_MASK;
total = true; for (i = 0; i < adev->num_ip_blocks; i++) { diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c index 1cb4022644977..a38888176805d 100644 --- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c +++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c @@ -1823,9 +1823,7 @@ static void smu7_init_dpm_defaults(struct pp_hwmgr *hwmgr)
data->mclk_dpm_key_disabled = hwmgr->feature_mask & PP_MCLK_DPM_MASK ? false : true; data->sclk_dpm_key_disabled = hwmgr->feature_mask & PP_SCLK_DPM_MASK ? false : true; - data->pcie_dpm_key_disabled = - !amdgpu_device_pcie_dynamic_switching_supported() || - !(hwmgr->feature_mask & PP_PCIE_DPM_MASK); + data->pcie_dpm_key_disabled = !(hwmgr->feature_mask & PP_PCIE_DPM_MASK); /* need to set voltage control types before EVV patching */ data->voltage_control = SMU7_VOLTAGE_CONTROL_NONE; data->vddci_control = SMU7_VOLTAGE_CONTROL_NONE; diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c index 9a5f3d31e7780..94f22df5ac205 100644 --- a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c +++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c @@ -2107,7 +2107,7 @@ static int sienna_cichlid_update_pcie_parameters(struct smu_context *smu, min_lane_width = min_lane_width > max_lane_width ? max_lane_width : min_lane_width;
- if (!amdgpu_device_pcie_dynamic_switching_supported()) { + if (!(smu->adev->pm.pp_feature & PP_PCIE_DPM_MASK)) { pcie_table->pcie_gen[0] = max_gen_speed; pcie_table->pcie_lane[0] = max_lane_width; } else { diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c index 4fa94f583b87c..223e890575a2b 100644 --- a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c +++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c @@ -2436,7 +2436,7 @@ int smu_v13_0_update_pcie_parameters(struct smu_context *smu, uint32_t smu_pcie_arg; int ret, i;
- if (!amdgpu_device_pcie_dynamic_switching_supported()) { + if (!(smu->adev->pm.pp_feature & PP_PCIE_DPM_MASK)) { if (pcie_table->pcie_gen[num_of_levels - 1] < pcie_gen_cap) pcie_gen_cap = pcie_table->pcie_gen[num_of_levels - 1];
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Samson Tam samson.tam@amd.com
[ Upstream commit 79f3f1b66753b3a3a269d73676bf50987921f267 ]
[Why] Helper function calculates num_ways using 32-bit. But is returned as 8-bit. If num_ways exceeds 8-bit, then it reports back the incorrect num_ways and erroneously uses MALL when it should not
[How] Make returned value 32-bit and convert after it checks against caps.cache_num_ways, which is under 8-bit
Reviewed-by: Alvin Lee alvin.lee2@amd.com Acked-by: Roman Li roman.li@amd.com Signed-off-by: Samson Tam samson.tam@amd.com Tested-by: Daniel Wheeler daniel.wheeler@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.c b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.c index 5b3d0e5b90a3e..ccbcfd6bd6b85 100644 --- a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.c +++ b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.c @@ -216,7 +216,7 @@ static bool dcn32_check_no_memory_request_for_cab(struct dc *dc) static uint32_t dcn32_calculate_cab_allocation(struct dc *dc, struct dc_state *ctx) { int i; - uint8_t num_ways = 0; + uint32_t num_ways = 0; uint32_t mall_ss_size_bytes = 0;
mall_ss_size_bytes = ctx->bw_ctx.bw.dcn.mall_ss_size_bytes; @@ -246,7 +246,8 @@ static uint32_t dcn32_calculate_cab_allocation(struct dc *dc, struct dc_state *c bool dcn32_apply_idle_power_optimizations(struct dc *dc, bool enable) { union dmub_rb_cmd cmd; - uint8_t ways, i; + uint8_t i; + uint32_t ways; int j; bool mall_ss_unsupported = false; struct dc_plane_state *plane = NULL; @@ -306,7 +307,7 @@ bool dcn32_apply_idle_power_optimizations(struct dc *dc, bool enable) cmd.cab.header.type = DMUB_CMD__CAB_FOR_SS; cmd.cab.header.sub_type = DMUB_CMD__CAB_DCN_SS_FIT_IN_CAB; cmd.cab.header.payload_bytes = sizeof(cmd.cab) - sizeof(cmd.cab.header); - cmd.cab.cab_alloc_ways = ways; + cmd.cab.cab_alloc_ways = (uint8_t)ways;
dm_execute_dmub_cmd(dc->ctx, &cmd, DM_DMUB_WAIT_TYPE_NO_WAIT);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Lin.Cao lincao12@amd.com
[ Upstream commit 406e8845356d18bdf3d3a23b347faf67706472ec ]
In SR-IOV environment, the value of pcie_table->num_of_link_levels will be 0, and num_of_levels - 1 will cause array index out of bounds
Signed-off-by: Lin.Cao lincao12@amd.com Acked-by: Jingwen Chen Jingwen.Chen2@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c index 223e890575a2b..3bc60ecc7bfef 100644 --- a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c +++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c @@ -2436,6 +2436,9 @@ int smu_v13_0_update_pcie_parameters(struct smu_context *smu, uint32_t smu_pcie_arg; int ret, i;
+ if (!num_of_levels) + return 0; + if (!(smu->adev->pm.pp_feature & PP_PCIE_DPM_MASK)) { if (pcie_table->pcie_gen[num_of_levels - 1] < pcie_gen_cap) pcie_gen_cap = pcie_table->pcie_gen[num_of_levels - 1];
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Laurentiu Tudor laurentiu.tudor@nxp.com
[ Upstream commit b39d5016456871a88f5cd141914a5043591b46f3 ]
Wrap the usb controllers in an intermediate simple-bus and use it to constrain the dma address size of these usb controllers to the 40b that they generate toward the interconnect. This is required because the SoC uses 48b address sizes and this mismatch would lead to smmu context faults [1] because the usb generates 40b addresses while the smmu page tables are populated with 48b wide addresses.
[1] xhci-hcd xhci-hcd.0.auto: xHCI Host Controller xhci-hcd xhci-hcd.0.auto: new USB bus registered, assigned bus number 1 xhci-hcd xhci-hcd.0.auto: hcc params 0x0220f66d hci version 0x100 quirks 0x0000000002000010 xhci-hcd xhci-hcd.0.auto: irq 108, io mem 0x03100000 xhci-hcd xhci-hcd.0.auto: xHCI Host Controller xhci-hcd xhci-hcd.0.auto: new USB bus registered, assigned bus number 2 xhci-hcd xhci-hcd.0.auto: Host supports USB 3.0 SuperSpeed arm-smmu 5000000.iommu: Unhandled context fault: fsr=0x402, iova=0xffffffb000, fsynr=0x0, cbfrsynra=0xc01, cb=3
Signed-off-by: Laurentiu Tudor laurentiu.tudor@nxp.com Signed-off-by: Shawn Guo shawnguo@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- .../arm64/boot/dts/freescale/fsl-ls208xa.dtsi | 46 +++++++++++-------- 1 file changed, 27 insertions(+), 19 deletions(-)
diff --git a/arch/arm64/boot/dts/freescale/fsl-ls208xa.dtsi b/arch/arm64/boot/dts/freescale/fsl-ls208xa.dtsi index d2f5345d05600..717288bbdb8b6 100644 --- a/arch/arm64/boot/dts/freescale/fsl-ls208xa.dtsi +++ b/arch/arm64/boot/dts/freescale/fsl-ls208xa.dtsi @@ -1186,26 +1186,34 @@ dma-coherent; };
- usb0: usb@3100000 { - status = "disabled"; - compatible = "snps,dwc3"; - reg = <0x0 0x3100000 0x0 0x10000>; - interrupts = <0 80 0x4>; /* Level high type */ - dr_mode = "host"; - snps,quirk-frame-length-adjustment = <0x20>; - snps,dis_rxdet_inp3_quirk; - snps,incr-burst-type-adjustment = <1>, <4>, <8>, <16>; - }; + bus: bus { + #address-cells = <2>; + #size-cells = <2>; + compatible = "simple-bus"; + ranges; + dma-ranges = <0x0 0x0 0x0 0x0 0x100 0x00000000>; + + usb0: usb@3100000 { + compatible = "snps,dwc3"; + reg = <0x0 0x3100000 0x0 0x10000>; + interrupts = <0 80 0x4>; /* Level high type */ + dr_mode = "host"; + snps,quirk-frame-length-adjustment = <0x20>; + snps,dis_rxdet_inp3_quirk; + snps,incr-burst-type-adjustment = <1>, <4>, <8>, <16>; + status = "disabled"; + };
- usb1: usb@3110000 { - status = "disabled"; - compatible = "snps,dwc3"; - reg = <0x0 0x3110000 0x0 0x10000>; - interrupts = <0 81 0x4>; /* Level high type */ - dr_mode = "host"; - snps,quirk-frame-length-adjustment = <0x20>; - snps,dis_rxdet_inp3_quirk; - snps,incr-burst-type-adjustment = <1>, <4>, <8>, <16>; + usb1: usb@3110000 { + compatible = "snps,dwc3"; + reg = <0x0 0x3110000 0x0 0x10000>; + interrupts = <0 81 0x4>; /* Level high type */ + dr_mode = "host"; + snps,quirk-frame-length-adjustment = <0x20>; + snps,dis_rxdet_inp3_quirk; + snps,incr-burst-type-adjustment = <1>, <4>, <8>, <16>; + status = "disabled"; + }; };
ccn@4000000 {
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: zhujun2 zhujun2@cmss.chinamobile.com
[ Upstream commit 3f6f8a8c5e11a9b384a36df4f40f0c9a653b6975 ]
The opened file should be closed in main(), otherwise resource leak will occur that this problem was discovered by code reading
Signed-off-by: zhujun2 zhujun2@cmss.chinamobile.com Signed-off-by: Shuah Khan skhan@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- tools/testing/selftests/efivarfs/create-read.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/tools/testing/selftests/efivarfs/create-read.c b/tools/testing/selftests/efivarfs/create-read.c index 9674a19396a32..7bc7af4eb2c17 100644 --- a/tools/testing/selftests/efivarfs/create-read.c +++ b/tools/testing/selftests/efivarfs/create-read.c @@ -32,8 +32,10 @@ int main(int argc, char **argv) rc = read(fd, buf, sizeof(buf)); if (rc != 0) { fprintf(stderr, "Reading a new var should return EOF\n"); + close(fd); return EXIT_FAILURE; }
+ close(fd); return EXIT_SUCCESS; }
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Trevor Wu trevor.wu@mediatek.com
[ Upstream commit d601bb78f06b9e3cbb52e6b87b88add9920a11b6 ]
To avoid power leakage, it is recommended to replace the default pinctrl state with dynamic pinctrl since certain audio pinmux functions can remain in a HIGH state even when audio is disabled. Linking pinctrl with DAPM using SND_SOC_DAPM_PINCTRL will ensure that audio pins remain in GPIO mode by default and only switch to an audio function when necessary.
Signed-off-by: Trevor Wu trevor.wu@mediatek.com Reviewed-by: AngeloGioacchino Del Regno angelogioacchino.delregno@collabora.com Link: https://lore.kernel.org/r/20230825024935.10878-2-trevor.wu@mediatek.com Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/mediatek/mt8188/mt8188-mt6359.c | 21 +++++++++++++++++++++ 1 file changed, 21 insertions(+)
diff --git a/sound/soc/mediatek/mt8188/mt8188-mt6359.c b/sound/soc/mediatek/mt8188/mt8188-mt6359.c index ac69c23e0da1c..7048ff52ab86a 100644 --- a/sound/soc/mediatek/mt8188/mt8188-mt6359.c +++ b/sound/soc/mediatek/mt8188/mt8188-mt6359.c @@ -246,6 +246,11 @@ static const struct snd_soc_dapm_widget mt8188_mt6359_widgets[] = { SND_SOC_DAPM_MIC("Headset Mic", NULL), SND_SOC_DAPM_SINK("HDMI"), SND_SOC_DAPM_SINK("DP"), + + /* dynamic pinctrl */ + SND_SOC_DAPM_PINCTRL("ETDM_SPK_PIN", "aud_etdm_spk_on", "aud_etdm_spk_off"), + SND_SOC_DAPM_PINCTRL("ETDM_HP_PIN", "aud_etdm_hp_on", "aud_etdm_hp_off"), + SND_SOC_DAPM_PINCTRL("MTKAIF_PIN", "aud_mtkaif_on", "aud_mtkaif_off"), };
static const struct snd_kcontrol_new mt8188_mt6359_controls[] = { @@ -267,6 +272,7 @@ static int mt8188_mt6359_mtkaif_calibration(struct snd_soc_pcm_runtime *rtd) snd_soc_rtdcom_lookup(rtd, AFE_PCM_NAME); struct snd_soc_component *cmpnt_codec = asoc_rtd_to_codec(rtd, 0)->component; + struct snd_soc_dapm_widget *pin_w = NULL, *w; struct mtk_base_afe *afe; struct mt8188_afe_private *afe_priv; struct mtkaif_param *param; @@ -306,6 +312,18 @@ static int mt8188_mt6359_mtkaif_calibration(struct snd_soc_pcm_runtime *rtd) return 0; }
+ for_each_card_widgets(rtd->card, w) { + if (!strcmp(w->name, "MTKAIF_PIN")) { + pin_w = w; + break; + } + } + + if (pin_w) + dapm_pinctrl_event(pin_w, NULL, SND_SOC_DAPM_PRE_PMU); + else + dev_dbg(afe->dev, "%s(), no pinmux widget, please check if default on\n", __func__); + pm_runtime_get_sync(afe->dev); mt6359_mtkaif_calibration_enable(cmpnt_codec);
@@ -403,6 +421,9 @@ static int mt8188_mt6359_mtkaif_calibration(struct snd_soc_pcm_runtime *rtd) for (i = 0; i < MT8188_MTKAIF_MISO_NUM; i++) param->mtkaif_phase_cycle[i] = mtkaif_phase_cycle[i];
+ if (pin_w) + dapm_pinctrl_event(pin_w, NULL, SND_SOC_DAPM_POST_PMD); + dev_dbg(afe->dev, "%s(), end, calibration ok %d\n", __func__, param->mtkaif_calibration_ok);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Richard Fitzgerald rf@opensource.cirrus.com
[ Upstream commit 47f56e38a199bd45514b8e0142399cba4feeaf1a ]
Add members to struct snd_soc_card to store the PCI subsystem ID (SSID) of the soundcard.
The PCI specification provides two registers to store a vendor-specific SSID that can be read by drivers to uniquely identify a particular "soundcard". This is defined in the PCI specification to distinguish products that use the same silicon (and therefore have the same silicon ID) so that product-specific differences can be applied.
PCI only defines 0xFFFF as an invalid value. 0x0000 is not defined as invalid. So the usual pattern of zero-filling the struct and then assuming a zero value unset will not work. A flag is included to indicate when the SSID information has been filled in.
Unlike DMI information, which has a free-format entirely up to the vendor, the PCI SSID has a strictly defined format and a registry of vendor IDs.
It is usual in Windows drivers that the SSID is used as the sole identifier of the specific end-product and the Windows driver contains tables mapping that to information about the hardware setup, rather than using ACPI properties.
This SSID is important information for ASoC components that need to apply hardware-specific configuration on PCI-based systems.
As the SSID is a generic part of the PCI specification and is treated as identifying the "soundcard", it is reasonable to include this information in struct snd_soc_card, instead of components inventing their own custom ways to pass this information around.
Signed-off-by: Richard Fitzgerald rf@opensource.cirrus.com Reviewed-by: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com Link: https://lore.kernel.org/r/20230912163207.3498161-2-rf@opensource.cirrus.com Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- include/sound/soc-card.h | 37 +++++++++++++++++++++++++++++++++++++ include/sound/soc.h | 11 +++++++++++ 2 files changed, 48 insertions(+)
diff --git a/include/sound/soc-card.h b/include/sound/soc-card.h index fc94dfb0021fd..e8ff2e089cd00 100644 --- a/include/sound/soc-card.h +++ b/include/sound/soc-card.h @@ -59,6 +59,43 @@ int snd_soc_card_add_dai_link(struct snd_soc_card *card, void snd_soc_card_remove_dai_link(struct snd_soc_card *card, struct snd_soc_dai_link *dai_link);
+#ifdef CONFIG_PCI +static inline void snd_soc_card_set_pci_ssid(struct snd_soc_card *card, + unsigned short vendor, + unsigned short device) +{ + card->pci_subsystem_vendor = vendor; + card->pci_subsystem_device = device; + card->pci_subsystem_set = true; +} + +static inline int snd_soc_card_get_pci_ssid(struct snd_soc_card *card, + unsigned short *vendor, + unsigned short *device) +{ + if (!card->pci_subsystem_set) + return -ENOENT; + + *vendor = card->pci_subsystem_vendor; + *device = card->pci_subsystem_device; + + return 0; +} +#else /* !CONFIG_PCI */ +static inline void snd_soc_card_set_pci_ssid(struct snd_soc_card *card, + unsigned short vendor, + unsigned short device) +{ +} + +static inline int snd_soc_card_get_pci_ssid(struct snd_soc_card *card, + unsigned short *vendor, + unsigned short *device) +{ + return -ENOENT; +} +#endif /* CONFIG_PCI */ + /* device driver data */ static inline void snd_soc_card_set_drvdata(struct snd_soc_card *card, void *data) diff --git a/include/sound/soc.h b/include/sound/soc.h index cf34810882347..0c54b343d3e5d 100644 --- a/include/sound/soc.h +++ b/include/sound/soc.h @@ -931,6 +931,17 @@ struct snd_soc_card { #ifdef CONFIG_DMI char dmi_longname[80]; #endif /* CONFIG_DMI */ + +#ifdef CONFIG_PCI + /* + * PCI does not define 0 as invalid, so pci_subsystem_set indicates + * whether a value has been written to these fields. + */ + unsigned short pci_subsystem_vendor; + unsigned short pci_subsystem_device; + bool pci_subsystem_set; +#endif /* CONFIG_PCI */ + char topology_shortname[32];
struct device *dev;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Richard Fitzgerald rf@opensource.cirrus.com
[ Upstream commit ba2de401d32625fe538d3f2c00ca73740dd2d516 ]
Pass the PCI SSID of the audio interface through to the machine driver. This allows the machine driver to use the SSID to uniquely identify the specific hardware configuration and apply any platform-specific configuration.
struct snd_sof_pdata is passed around inside the SOF code, but it then passes configuration information to the machine driver through struct snd_soc_acpi_mach and struct snd_soc_acpi_mach_params. So SSID information has been added to both snd_sof_pdata and snd_soc_acpi_mach_params.
PCI does not define 0x0000 as an invalid value so we can't use zero to indicate that the struct member was not written. Instead a flag is included to indicate that a value has been written to the subsystem_vendor and subsystem_device members.
sof_pci_probe() creates the struct snd_sof_pdata. It is passed a struct pci_dev so it can fill in the SSID value.
sof_machine_check() finds the appropriate struct snd_soc_acpi_mach. It copies the SSID information across to the struct snd_soc_acpi_mach_params. This done before calling any custom set_mach_params() so that it could be used by the set_mach_params() callback to apply variant params.
The machine driver receives the struct snd_soc_acpi_mach as its platform_data.
Signed-off-by: Richard Fitzgerald rf@opensource.cirrus.com Reviewed-by: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com Link: https://lore.kernel.org/r/20230912163207.3498161-3-rf@opensource.cirrus.com Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- include/sound/soc-acpi.h | 7 +++++++ include/sound/sof.h | 8 ++++++++ sound/soc/sof/sof-audio.c | 7 +++++++ sound/soc/sof/sof-pci-dev.c | 8 ++++++++ 4 files changed, 30 insertions(+)
diff --git a/include/sound/soc-acpi.h b/include/sound/soc-acpi.h index 528279056b3ab..1a5f90b0a5463 100644 --- a/include/sound/soc-acpi.h +++ b/include/sound/soc-acpi.h @@ -67,6 +67,10 @@ static inline struct snd_soc_acpi_mach *snd_soc_acpi_codec_list(void *arg) * @i2s_link_mask: I2S/TDM links enabled on the board * @num_dai_drivers: number of elements in @dai_drivers * @dai_drivers: pointer to dai_drivers, used e.g. in nocodec mode + * @subsystem_vendor: optional PCI SSID vendor value + * @subsystem_device: optional PCI SSID device value + * @subsystem_id_set: true if a value has been written to + * subsystem_vendor and subsystem_device. */ struct snd_soc_acpi_mach_params { u32 acpi_ipc_irq_index; @@ -79,6 +83,9 @@ struct snd_soc_acpi_mach_params { u32 i2s_link_mask; u32 num_dai_drivers; struct snd_soc_dai_driver *dai_drivers; + unsigned short subsystem_vendor; + unsigned short subsystem_device; + bool subsystem_id_set; };
/** diff --git a/include/sound/sof.h b/include/sound/sof.h index d3c41f87ac319..51294f2ba302c 100644 --- a/include/sound/sof.h +++ b/include/sound/sof.h @@ -64,6 +64,14 @@ struct snd_sof_pdata { const char *name; const char *platform;
+ /* + * PCI SSID. As PCI does not define 0 as invalid, the subsystem_id_set + * flag indicates that a value has been written to these members. + */ + unsigned short subsystem_vendor; + unsigned short subsystem_device; + bool subsystem_id_set; + struct device *dev;
/* diff --git a/sound/soc/sof/sof-audio.c b/sound/soc/sof/sof-audio.c index e5405f854a910..563fe6f7789f7 100644 --- a/sound/soc/sof/sof-audio.c +++ b/sound/soc/sof/sof-audio.c @@ -1032,6 +1032,13 @@ int sof_machine_check(struct snd_sof_dev *sdev) mach = snd_sof_machine_select(sdev); if (mach) { sof_pdata->machine = mach; + + if (sof_pdata->subsystem_id_set) { + mach->mach_params.subsystem_vendor = sof_pdata->subsystem_vendor; + mach->mach_params.subsystem_device = sof_pdata->subsystem_device; + mach->mach_params.subsystem_id_set = true; + } + snd_sof_set_mach_params(mach, sdev); return 0; } diff --git a/sound/soc/sof/sof-pci-dev.c b/sound/soc/sof/sof-pci-dev.c index f42c85df88a80..69a2352f2e1a0 100644 --- a/sound/soc/sof/sof-pci-dev.c +++ b/sound/soc/sof/sof-pci-dev.c @@ -221,6 +221,14 @@ int sof_pci_probe(struct pci_dev *pci, const struct pci_device_id *pci_id) return ret;
sof_pdata->name = pci_name(pci); + + /* PCI defines a vendor ID of 0xFFFF as invalid. */ + if (pci->subsystem_vendor != 0xFFFF) { + sof_pdata->subsystem_vendor = pci->subsystem_vendor; + sof_pdata->subsystem_device = pci->subsystem_device; + sof_pdata->subsystem_id_set = true; + } + sof_pdata->desc = desc; sof_pdata->dev = dev;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Lu Jialin lujialin4@huawei.com
[ Upstream commit 8f4f68e788c3a7a696546291258bfa5fdb215523 ]
We found a hungtask bug in test_aead_vec_cfg as follows:
INFO: task cryptomgr_test:391009 blocked for more than 120 seconds. "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Call trace: __switch_to+0x98/0xe0 __schedule+0x6c4/0xf40 schedule+0xd8/0x1b4 schedule_timeout+0x474/0x560 wait_for_common+0x368/0x4e0 wait_for_completion+0x20/0x30 wait_for_completion+0x20/0x30 test_aead_vec_cfg+0xab4/0xd50 test_aead+0x144/0x1f0 alg_test_aead+0xd8/0x1e0 alg_test+0x634/0x890 cryptomgr_test+0x40/0x70 kthread+0x1e0/0x220 ret_from_fork+0x10/0x18 Kernel panic - not syncing: hung_task: blocked tasks
For padata_do_parallel, when the return err is 0 or -EBUSY, it will call wait_for_completion(&wait->completion) in test_aead_vec_cfg. In normal case, aead_request_complete() will be called in pcrypt_aead_serial and the return err is 0 for padata_do_parallel. But, when pinst->flags is PADATA_RESET, the return err is -EBUSY for padata_do_parallel, and it won't call aead_request_complete(). Therefore, test_aead_vec_cfg will hung at wait_for_completion(&wait->completion), which will cause hungtask.
The problem comes as following: (padata_do_parallel) | rcu_read_lock_bh(); | err = -EINVAL; | (padata_replace) | pinst->flags |= PADATA_RESET; err = -EBUSY | if (pinst->flags & PADATA_RESET) | rcu_read_unlock_bh() | return err
In order to resolve the problem, we replace the return err -EBUSY with -EAGAIN, which means parallel_data is changing, and the caller should call it again.
v3: remove retry and just change the return err. v2: introduce padata_try_do_parallel() in pcrypt_aead_encrypt and pcrypt_aead_decrypt to solve the hungtask.
Signed-off-by: Lu Jialin lujialin4@huawei.com Signed-off-by: Guo Zihua guozihua@huawei.com Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Sasha Levin sashal@kernel.org --- crypto/pcrypt.c | 4 ++++ kernel/padata.c | 2 +- 2 files changed, 5 insertions(+), 1 deletion(-)
diff --git a/crypto/pcrypt.c b/crypto/pcrypt.c index 8c1d0ca412137..d0d954fe9d54f 100644 --- a/crypto/pcrypt.c +++ b/crypto/pcrypt.c @@ -117,6 +117,8 @@ static int pcrypt_aead_encrypt(struct aead_request *req) err = padata_do_parallel(ictx->psenc, padata, &ctx->cb_cpu); if (!err) return -EINPROGRESS; + if (err == -EBUSY) + return -EAGAIN;
return err; } @@ -164,6 +166,8 @@ static int pcrypt_aead_decrypt(struct aead_request *req) err = padata_do_parallel(ictx->psdec, padata, &ctx->cb_cpu); if (!err) return -EINPROGRESS; + if (err == -EBUSY) + return -EAGAIN;
return err; } diff --git a/kernel/padata.c b/kernel/padata.c index ff349e1084c1d..179fb1518070c 100644 --- a/kernel/padata.c +++ b/kernel/padata.c @@ -202,7 +202,7 @@ int padata_do_parallel(struct padata_shell *ps, *cb_cpu = cpu; }
- err = -EBUSY; + err = -EBUSY; if ((pinst->flags & PADATA_RESET)) goto out;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Geoffrey D. Bennett g@b4.vu
[ Upstream commit d98cc489029dba4d99714c2e8ec4f5ba249f6851 ]
By moving the USB IDs from the device_info struct into scarlett2_devices[], that will allow for devices with different USB IDs to share the same device_info.
Tested-by: Philippe Perrot philippe@perrot-net.fr Signed-off-by: Geoffrey D. Bennett g@b4.vu Link: https://lore.kernel.org/r/8263368e8d49e6fcebc709817bd82ab79b404468.169470581... Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Sasha Levin sashal@kernel.org --- sound/usb/mixer_scarlett_gen2.c | 63 ++++++++++++--------------------- 1 file changed, 23 insertions(+), 40 deletions(-)
diff --git a/sound/usb/mixer_scarlett_gen2.c b/sound/usb/mixer_scarlett_gen2.c index 9d11bb08667e7..48f5c9b9790dc 100644 --- a/sound/usb/mixer_scarlett_gen2.c +++ b/sound/usb/mixer_scarlett_gen2.c @@ -317,8 +317,6 @@ struct scarlett2_mux_entry { };
struct scarlett2_device_info { - u32 usb_id; /* USB device identifier */ - /* Gen 3 devices have an internal MSD mode switch that needs * to be disabled in order to access the full functionality of * the device. @@ -440,8 +438,6 @@ struct scarlett2_data { /*** Model-specific data ***/
static const struct scarlett2_device_info s6i6_gen2_info = { - .usb_id = USB_ID(0x1235, 0x8203), - .config_set = SCARLETT2_CONFIG_SET_GEN_2, .level_input_count = 2, .pad_input_count = 2, @@ -486,8 +482,6 @@ static const struct scarlett2_device_info s6i6_gen2_info = { };
static const struct scarlett2_device_info s18i8_gen2_info = { - .usb_id = USB_ID(0x1235, 0x8204), - .config_set = SCARLETT2_CONFIG_SET_GEN_2, .level_input_count = 2, .pad_input_count = 4, @@ -535,8 +529,6 @@ static const struct scarlett2_device_info s18i8_gen2_info = { };
static const struct scarlett2_device_info s18i20_gen2_info = { - .usb_id = USB_ID(0x1235, 0x8201), - .config_set = SCARLETT2_CONFIG_SET_GEN_2, .line_out_hw_vol = 1,
@@ -589,8 +581,6 @@ static const struct scarlett2_device_info s18i20_gen2_info = { };
static const struct scarlett2_device_info solo_gen3_info = { - .usb_id = USB_ID(0x1235, 0x8211), - .has_msd_mode = 1, .config_set = SCARLETT2_CONFIG_SET_NO_MIXER, .level_input_count = 1, @@ -602,8 +592,6 @@ static const struct scarlett2_device_info solo_gen3_info = { };
static const struct scarlett2_device_info s2i2_gen3_info = { - .usb_id = USB_ID(0x1235, 0x8210), - .has_msd_mode = 1, .config_set = SCARLETT2_CONFIG_SET_NO_MIXER, .level_input_count = 2, @@ -614,8 +602,6 @@ static const struct scarlett2_device_info s2i2_gen3_info = { };
static const struct scarlett2_device_info s4i4_gen3_info = { - .usb_id = USB_ID(0x1235, 0x8212), - .has_msd_mode = 1, .config_set = SCARLETT2_CONFIG_SET_GEN_3, .level_input_count = 2, @@ -660,8 +646,6 @@ static const struct scarlett2_device_info s4i4_gen3_info = { };
static const struct scarlett2_device_info s8i6_gen3_info = { - .usb_id = USB_ID(0x1235, 0x8213), - .has_msd_mode = 1, .config_set = SCARLETT2_CONFIG_SET_GEN_3, .level_input_count = 2, @@ -713,8 +697,6 @@ static const struct scarlett2_device_info s8i6_gen3_info = { };
static const struct scarlett2_device_info s18i8_gen3_info = { - .usb_id = USB_ID(0x1235, 0x8214), - .has_msd_mode = 1, .config_set = SCARLETT2_CONFIG_SET_GEN_3, .line_out_hw_vol = 1, @@ -783,8 +765,6 @@ static const struct scarlett2_device_info s18i8_gen3_info = { };
static const struct scarlett2_device_info s18i20_gen3_info = { - .usb_id = USB_ID(0x1235, 0x8215), - .has_msd_mode = 1, .config_set = SCARLETT2_CONFIG_SET_GEN_3, .line_out_hw_vol = 1, @@ -848,8 +828,6 @@ static const struct scarlett2_device_info s18i20_gen3_info = { };
static const struct scarlett2_device_info clarett_8pre_info = { - .usb_id = USB_ID(0x1235, 0x820c), - .config_set = SCARLETT2_CONFIG_SET_CLARETT, .line_out_hw_vol = 1, .level_input_count = 2, @@ -902,25 +880,30 @@ static const struct scarlett2_device_info clarett_8pre_info = { } }, };
-static const struct scarlett2_device_info *scarlett2_devices[] = { +struct scarlett2_device_entry { + const u32 usb_id; /* USB device identifier */ + const struct scarlett2_device_info *info; +}; + +static const struct scarlett2_device_entry scarlett2_devices[] = { /* Supported Gen 2 devices */ - &s6i6_gen2_info, - &s18i8_gen2_info, - &s18i20_gen2_info, + { USB_ID(0x1235, 0x8203), &s6i6_gen2_info }, + { USB_ID(0x1235, 0x8204), &s18i8_gen2_info }, + { USB_ID(0x1235, 0x8201), &s18i20_gen2_info },
/* Supported Gen 3 devices */ - &solo_gen3_info, - &s2i2_gen3_info, - &s4i4_gen3_info, - &s8i6_gen3_info, - &s18i8_gen3_info, - &s18i20_gen3_info, + { USB_ID(0x1235, 0x8211), &solo_gen3_info }, + { USB_ID(0x1235, 0x8210), &s2i2_gen3_info }, + { USB_ID(0x1235, 0x8212), &s4i4_gen3_info }, + { USB_ID(0x1235, 0x8213), &s8i6_gen3_info }, + { USB_ID(0x1235, 0x8214), &s18i8_gen3_info }, + { USB_ID(0x1235, 0x8215), &s18i20_gen3_info },
/* Supported Clarett+ devices */ - &clarett_8pre_info, + { USB_ID(0x1235, 0x820c), &clarett_8pre_info },
/* End of list */ - NULL + { 0, NULL }, };
/* get the starting port index number for a given port type/direction */ @@ -4072,17 +4055,17 @@ static int scarlett2_init_notify(struct usb_mixer_interface *mixer)
static int snd_scarlett_gen2_controls_create(struct usb_mixer_interface *mixer) { - const struct scarlett2_device_info **info = scarlett2_devices; + const struct scarlett2_device_entry *entry = scarlett2_devices; int err;
- /* Find device in scarlett2_devices */ - while (*info && (*info)->usb_id != mixer->chip->usb_id) - info++; - if (!*info) + /* Find entry in scarlett2_devices */ + while (entry->usb_id && entry->usb_id != mixer->chip->usb_id) + entry++; + if (!entry->usb_id) return -EINVAL;
/* Initialise private data */ - err = scarlett2_init_private(mixer, *info); + err = scarlett2_init_private(mixer, entry->info); if (err < 0) return err;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Rander Wang rander.wang@intel.com
[ Upstream commit c1c48fd6bbe788458e3685fea74bdb3cb148ff93 ]
Driver will receive exception IPC message and process it by snd_sof_dsp_panic.
Signed-off-by: Rander Wang rander.wang@intel.com Reviewed-by: Péter Ujfalusi peter.ujfalusi@linux.intel.com Reviewed-by: Kai Vehmanen kai.vehmanen@linux.intel.com Reviewed-by: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com Reviewed-by: Guennadi Liakhovetski guennadi.liakhovetski@linux.intel.com Signed-off-by: Peter Ujfalusi peter.ujfalusi@linux.intel.com Link: https://lore.kernel.org/r/20230919092416.4137-10-peter.ujfalusi@linux.intel.... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/sof/ipc4.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/sound/soc/sof/ipc4.c b/sound/soc/sof/ipc4.c index ab6eddd91bb77..1b09496733fb8 100644 --- a/sound/soc/sof/ipc4.c +++ b/sound/soc/sof/ipc4.c @@ -614,6 +614,9 @@ static void sof_ipc4_rx_msg(struct snd_sof_dev *sdev) case SOF_IPC4_NOTIFY_LOG_BUFFER_STATUS: sof_ipc4_mtrace_update_pos(sdev, SOF_IPC4_LOG_CORE_GET(ipc4_msg->primary)); break; + case SOF_IPC4_NOTIFY_EXCEPTION_CAUGHT: + snd_sof_dsp_panic(sdev, 0, true); + break; default: dev_dbg(sdev->dev, "Unhandled DSP message: %#x|%#x\n", ipc4_msg->primary, ipc4_msg->extension);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ilpo Järvinen ilpo.jarvinen@linux.intel.com
[ Upstream commit 8bf7187d978610b9e327a3d92728c8864a575ebd ]
Use FIELD_GET() to extract PCIe Negotiated Link Width field instead of custom masking and shifting, and remove extract_width() which only wraps that FIELD_GET().
Signed-off-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com Link: https://lore.kernel.org/r/20230919125648.1920-2-ilpo.jarvinen@linux.intel.co... Reviewed-by: Jonathan Cameron Jonathan.Cameron@huawei.com Reviewed-by: Dean Luick dean.luick@cornelisnetworks.com Signed-off-by: Leon Romanovsky leon@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/infiniband/hw/hfi1/pcie.c | 9 ++------- 1 file changed, 2 insertions(+), 7 deletions(-)
diff --git a/drivers/infiniband/hw/hfi1/pcie.c b/drivers/infiniband/hw/hfi1/pcie.c index 08732e1ac9662..c132a9c073bff 100644 --- a/drivers/infiniband/hw/hfi1/pcie.c +++ b/drivers/infiniband/hw/hfi1/pcie.c @@ -3,6 +3,7 @@ * Copyright(c) 2015 - 2019 Intel Corporation. */
+#include <linux/bitfield.h> #include <linux/pci.h> #include <linux/io.h> #include <linux/delay.h> @@ -210,12 +211,6 @@ static u32 extract_speed(u16 linkstat) return speed; }
-/* return the PCIe link speed from the given link status */ -static u32 extract_width(u16 linkstat) -{ - return (linkstat & PCI_EXP_LNKSTA_NLW) >> PCI_EXP_LNKSTA_NLW_SHIFT; -} - /* read the link status and set dd->{lbus_width,lbus_speed,lbus_info} */ static void update_lbus_info(struct hfi1_devdata *dd) { @@ -228,7 +223,7 @@ static void update_lbus_info(struct hfi1_devdata *dd) return; }
- dd->lbus_width = extract_width(linkstat); + dd->lbus_width = FIELD_GET(PCI_EXP_LNKSTA_NLW, linkstat); dd->lbus_speed = extract_speed(linkstat); snprintf(dd->lbus_info, sizeof(dd->lbus_info), "PCIe,%uMHz,x%u", dd->lbus_speed, dd->lbus_width);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yihang Li liyihang9@huawei.com
[ Upstream commit 6de426f9276c448e2db7238911c97fb157cb23be ]
If init debugfs failed during device registration due to memory allocation failure, debugfs_remove_recursive() is called, after which debugfs_dir is not set to NULL. debugfs_remove_recursive() will be called again during device removal. As a result, illegal pointer is accessed.
[ 1665.467244] hisi_sas_v3_hw 0000:b4:02.0: failed to init debugfs! ... [ 1669.836708] Unable to handle kernel NULL pointer dereference at virtual address 00000000000000a0 [ 1669.872669] pc : down_write+0x24/0x70 [ 1669.876315] lr : down_write+0x1c/0x70 [ 1669.879961] sp : ffff000036f53a30 [ 1669.883260] x29: ffff000036f53a30 x28: ffffa027c31549f8 [ 1669.888547] x27: ffffa027c3140000 x26: 0000000000000000 [ 1669.893834] x25: ffffa027bf37c270 x24: ffffa027bf37c270 [ 1669.899122] x23: ffff0000095406b8 x22: ffff0000095406a8 [ 1669.904408] x21: 0000000000000000 x20: ffffa027bf37c310 [ 1669.909695] x19: 00000000000000a0 x18: ffff8027dcd86f10 [ 1669.914982] x17: 0000000000000000 x16: 0000000000000000 [ 1669.920268] x15: 0000000000000000 x14: ffffa0274014f870 [ 1669.925555] x13: 0000000000000040 x12: 0000000000000228 [ 1669.930842] x11: 0000000000000020 x10: 0000000000000bb0 [ 1669.936129] x9 : ffff000036f537f0 x8 : ffff80273088ca10 [ 1669.941416] x7 : 000000000000001d x6 : 00000000ffffffff [ 1669.946702] x5 : ffff000008a36310 x4 : ffff80273088be00 [ 1669.951989] x3 : ffff000009513e90 x2 : 0000000000000000 [ 1669.957276] x1 : 00000000000000a0 x0 : ffffffff00000001 [ 1669.962563] Call trace: [ 1669.965000] down_write+0x24/0x70 [ 1669.968301] debugfs_remove_recursive+0x5c/0x1b0 [ 1669.972905] hisi_sas_debugfs_exit+0x24/0x30 [hisi_sas_main] [ 1669.978541] hisi_sas_v3_remove+0x130/0x150 [hisi_sas_v3_hw] [ 1669.984175] pci_device_remove+0x48/0xd8 [ 1669.988082] device_release_driver_internal+0x1b4/0x250 [ 1669.993282] device_release_driver+0x28/0x38 [ 1669.997534] pci_stop_bus_device+0x84/0xb8 [ 1670.001611] pci_stop_and_remove_bus_device_locked+0x24/0x40 [ 1670.007244] remove_store+0xfc/0x140 [ 1670.010802] dev_attr_store+0x44/0x60 [ 1670.014448] sysfs_kf_write+0x58/0x80 [ 1670.018095] kernfs_fop_write+0xe8/0x1f0 [ 1670.022000] __vfs_write+0x60/0x190 [ 1670.025472] vfs_write+0xac/0x1c0 [ 1670.028771] ksys_write+0x6c/0xd8 [ 1670.032071] __arm64_sys_write+0x24/0x30 [ 1670.035977] el0_svc_common+0x78/0x130 [ 1670.039710] el0_svc_handler+0x38/0x78 [ 1670.043442] el0_svc+0x8/0xc
To fix this, set debugfs_dir to NULL after debugfs_remove_recursive().
Signed-off-by: Yihang Li liyihang9@huawei.com Signed-off-by: Xingui Yang yangxingui@huawei.com Signed-off-by: Xiang Chen chenxiang66@hisilicon.com Link: https://lore.kernel.org/r/1694571327-78697-2-git-send-email-chenxiang66@hisi... Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/hisi_sas/hisi_sas_v3_hw.c | 13 +++++++------ 1 file changed, 7 insertions(+), 6 deletions(-)
diff --git a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c index 2f33e6b4a92fb..9285ae508afa6 100644 --- a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c +++ b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c @@ -4861,6 +4861,12 @@ static void debugfs_bist_init_v3_hw(struct hisi_hba *hisi_hba) hisi_hba->debugfs_bist_linkrate = SAS_LINK_RATE_1_5_GBPS; }
+static void debugfs_exit_v3_hw(struct hisi_hba *hisi_hba) +{ + debugfs_remove_recursive(hisi_hba->debugfs_dir); + hisi_hba->debugfs_dir = NULL; +} + static void debugfs_init_v3_hw(struct hisi_hba *hisi_hba) { struct device *dev = hisi_hba->dev; @@ -4884,18 +4890,13 @@ static void debugfs_init_v3_hw(struct hisi_hba *hisi_hba)
for (i = 0; i < hisi_sas_debugfs_dump_count; i++) { if (debugfs_alloc_v3_hw(hisi_hba, i)) { - debugfs_remove_recursive(hisi_hba->debugfs_dir); + debugfs_exit_v3_hw(hisi_hba); dev_dbg(dev, "failed to init debugfs!\n"); break; } } }
-static void debugfs_exit_v3_hw(struct hisi_hba *hisi_hba) -{ - debugfs_remove_recursive(hisi_hba->debugfs_dir); -} - static int hisi_sas_v3_probe(struct pci_dev *pdev, const struct pci_device_id *id) {
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tyrel Datwyler tyreld@linux.ibm.com
[ Upstream commit b39f2d10b86d0af353ea339e5815820026bca48f ]
In practice the driver should never send more commands than are allocated to a queue's event pool. In the unlikely event that this happens, the code asserts a BUG_ON, and in the case that the kernel is not configured to crash on panic returns a junk event pointer from the empty event list causing things to spiral from there. This BUG_ON is a historical artifact of the ibmvfc driver first being upstreamed, and it is well known now that the use of BUG_ON is bad practice except in the most unrecoverable scenario. There is nothing about this scenario that prevents the driver from recovering and carrying on.
Remove the BUG_ON in question from ibmvfc_get_event() and return a NULL pointer in the case of an empty event pool. Update all call sites to ibmvfc_get_event() to check for a NULL pointer and perfrom the appropriate failure or recovery action.
Signed-off-by: Tyrel Datwyler tyreld@linux.ibm.com Link: https://lore.kernel.org/r/20230921225435.3537728-2-tyreld@linux.ibm.com Reviewed-by: Brian King brking@linux.vnet.ibm.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/ibmvscsi/ibmvfc.c | 124 ++++++++++++++++++++++++++++++++- 1 file changed, 122 insertions(+), 2 deletions(-)
diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c index 470e8e6c41b62..c98346e464b48 100644 --- a/drivers/scsi/ibmvscsi/ibmvfc.c +++ b/drivers/scsi/ibmvscsi/ibmvfc.c @@ -1518,7 +1518,11 @@ static struct ibmvfc_event *ibmvfc_get_event(struct ibmvfc_queue *queue) unsigned long flags;
spin_lock_irqsave(&queue->l_lock, flags); - BUG_ON(list_empty(&queue->free)); + if (list_empty(&queue->free)) { + ibmvfc_log(queue->vhost, 4, "empty event pool on queue:%ld\n", queue->hwq_id); + spin_unlock_irqrestore(&queue->l_lock, flags); + return NULL; + } evt = list_entry(queue->free.next, struct ibmvfc_event, queue_list); atomic_set(&evt->free, 0); list_del(&evt->queue_list); @@ -1947,9 +1951,15 @@ static int ibmvfc_queuecommand(struct Scsi_Host *shost, struct scsi_cmnd *cmnd) if (vhost->using_channels) { scsi_channel = hwq % vhost->scsi_scrqs.active_queues; evt = ibmvfc_get_event(&vhost->scsi_scrqs.scrqs[scsi_channel]); + if (!evt) + return SCSI_MLQUEUE_HOST_BUSY; + evt->hwq = hwq % vhost->scsi_scrqs.active_queues; - } else + } else { evt = ibmvfc_get_event(&vhost->crq); + if (!evt) + return SCSI_MLQUEUE_HOST_BUSY; + }
ibmvfc_init_event(evt, ibmvfc_scsi_done, IBMVFC_CMD_FORMAT); evt->cmnd = cmnd; @@ -2037,6 +2047,11 @@ static int ibmvfc_bsg_timeout(struct bsg_job *job)
vhost->aborting_passthru = 1; evt = ibmvfc_get_event(&vhost->crq); + if (!evt) { + spin_unlock_irqrestore(vhost->host->host_lock, flags); + return -ENOMEM; + } + ibmvfc_init_event(evt, ibmvfc_bsg_timeout_done, IBMVFC_MAD_FORMAT);
tmf = &evt->iu.tmf; @@ -2095,6 +2110,10 @@ static int ibmvfc_bsg_plogi(struct ibmvfc_host *vhost, unsigned int port_id) goto unlock_out;
evt = ibmvfc_get_event(&vhost->crq); + if (!evt) { + rc = -ENOMEM; + goto unlock_out; + } ibmvfc_init_event(evt, ibmvfc_sync_completion, IBMVFC_MAD_FORMAT); plogi = &evt->iu.plogi; memset(plogi, 0, sizeof(*plogi)); @@ -2213,6 +2232,11 @@ static int ibmvfc_bsg_request(struct bsg_job *job) }
evt = ibmvfc_get_event(&vhost->crq); + if (!evt) { + spin_unlock_irqrestore(vhost->host->host_lock, flags); + rc = -ENOMEM; + goto out; + } ibmvfc_init_event(evt, ibmvfc_sync_completion, IBMVFC_MAD_FORMAT); mad = &evt->iu.passthru;
@@ -2301,6 +2325,11 @@ static int ibmvfc_reset_device(struct scsi_device *sdev, int type, char *desc) else evt = ibmvfc_get_event(&vhost->crq);
+ if (!evt) { + spin_unlock_irqrestore(vhost->host->host_lock, flags); + return -ENOMEM; + } + ibmvfc_init_event(evt, ibmvfc_sync_completion, IBMVFC_CMD_FORMAT); tmf = ibmvfc_init_vfc_cmd(evt, sdev); iu = ibmvfc_get_fcp_iu(vhost, tmf); @@ -2504,6 +2533,8 @@ static struct ibmvfc_event *ibmvfc_init_tmf(struct ibmvfc_queue *queue, struct ibmvfc_tmf *tmf;
evt = ibmvfc_get_event(queue); + if (!evt) + return NULL; ibmvfc_init_event(evt, ibmvfc_sync_completion, IBMVFC_MAD_FORMAT);
tmf = &evt->iu.tmf; @@ -2560,6 +2591,11 @@ static int ibmvfc_cancel_all_mq(struct scsi_device *sdev, int type)
if (found_evt && vhost->logged_in) { evt = ibmvfc_init_tmf(&queues[i], sdev, type); + if (!evt) { + spin_unlock(queues[i].q_lock); + spin_unlock_irqrestore(vhost->host->host_lock, flags); + return -ENOMEM; + } evt->sync_iu = &queues[i].cancel_rsp; ibmvfc_send_event(evt, vhost, default_timeout); list_add_tail(&evt->cancel, &cancelq); @@ -2773,6 +2809,10 @@ static int ibmvfc_abort_task_set(struct scsi_device *sdev)
if (vhost->state == IBMVFC_ACTIVE) { evt = ibmvfc_get_event(&vhost->crq); + if (!evt) { + spin_unlock_irqrestore(vhost->host->host_lock, flags); + return -ENOMEM; + } ibmvfc_init_event(evt, ibmvfc_sync_completion, IBMVFC_CMD_FORMAT); tmf = ibmvfc_init_vfc_cmd(evt, sdev); iu = ibmvfc_get_fcp_iu(vhost, tmf); @@ -4031,6 +4071,12 @@ static void ibmvfc_tgt_send_prli(struct ibmvfc_target *tgt)
kref_get(&tgt->kref); evt = ibmvfc_get_event(&vhost->crq); + if (!evt) { + ibmvfc_set_tgt_action(tgt, IBMVFC_TGT_ACTION_NONE); + kref_put(&tgt->kref, ibmvfc_release_tgt); + __ibmvfc_reset_host(vhost); + return; + } vhost->discovery_threads++; ibmvfc_init_event(evt, ibmvfc_tgt_prli_done, IBMVFC_MAD_FORMAT); evt->tgt = tgt; @@ -4138,6 +4184,12 @@ static void ibmvfc_tgt_send_plogi(struct ibmvfc_target *tgt) kref_get(&tgt->kref); tgt->logo_rcvd = 0; evt = ibmvfc_get_event(&vhost->crq); + if (!evt) { + ibmvfc_set_tgt_action(tgt, IBMVFC_TGT_ACTION_NONE); + kref_put(&tgt->kref, ibmvfc_release_tgt); + __ibmvfc_reset_host(vhost); + return; + } vhost->discovery_threads++; ibmvfc_set_tgt_action(tgt, IBMVFC_TGT_ACTION_INIT_WAIT); ibmvfc_init_event(evt, ibmvfc_tgt_plogi_done, IBMVFC_MAD_FORMAT); @@ -4214,6 +4266,8 @@ static struct ibmvfc_event *__ibmvfc_tgt_get_implicit_logout_evt(struct ibmvfc_t
kref_get(&tgt->kref); evt = ibmvfc_get_event(&vhost->crq); + if (!evt) + return NULL; ibmvfc_init_event(evt, done, IBMVFC_MAD_FORMAT); evt->tgt = tgt; mad = &evt->iu.implicit_logout; @@ -4241,6 +4295,13 @@ static void ibmvfc_tgt_implicit_logout(struct ibmvfc_target *tgt) vhost->discovery_threads++; evt = __ibmvfc_tgt_get_implicit_logout_evt(tgt, ibmvfc_tgt_implicit_logout_done); + if (!evt) { + vhost->discovery_threads--; + ibmvfc_set_tgt_action(tgt, IBMVFC_TGT_ACTION_NONE); + kref_put(&tgt->kref, ibmvfc_release_tgt); + __ibmvfc_reset_host(vhost); + return; + }
ibmvfc_set_tgt_action(tgt, IBMVFC_TGT_ACTION_INIT_WAIT); if (ibmvfc_send_event(evt, vhost, default_timeout)) { @@ -4380,6 +4441,12 @@ static void ibmvfc_tgt_move_login(struct ibmvfc_target *tgt)
kref_get(&tgt->kref); evt = ibmvfc_get_event(&vhost->crq); + if (!evt) { + ibmvfc_set_tgt_action(tgt, IBMVFC_TGT_ACTION_DEL_RPORT); + kref_put(&tgt->kref, ibmvfc_release_tgt); + __ibmvfc_reset_host(vhost); + return; + } vhost->discovery_threads++; ibmvfc_set_tgt_action(tgt, IBMVFC_TGT_ACTION_INIT_WAIT); ibmvfc_init_event(evt, ibmvfc_tgt_move_login_done, IBMVFC_MAD_FORMAT); @@ -4546,6 +4613,14 @@ static void ibmvfc_adisc_timeout(struct timer_list *t) vhost->abort_threads++; kref_get(&tgt->kref); evt = ibmvfc_get_event(&vhost->crq); + if (!evt) { + tgt_err(tgt, "Failed to get cancel event for ADISC.\n"); + vhost->abort_threads--; + kref_put(&tgt->kref, ibmvfc_release_tgt); + __ibmvfc_reset_host(vhost); + spin_unlock_irqrestore(vhost->host->host_lock, flags); + return; + } ibmvfc_init_event(evt, ibmvfc_tgt_adisc_cancel_done, IBMVFC_MAD_FORMAT);
evt->tgt = tgt; @@ -4596,6 +4671,12 @@ static void ibmvfc_tgt_adisc(struct ibmvfc_target *tgt)
kref_get(&tgt->kref); evt = ibmvfc_get_event(&vhost->crq); + if (!evt) { + ibmvfc_set_tgt_action(tgt, IBMVFC_TGT_ACTION_NONE); + kref_put(&tgt->kref, ibmvfc_release_tgt); + __ibmvfc_reset_host(vhost); + return; + } vhost->discovery_threads++; ibmvfc_init_event(evt, ibmvfc_tgt_adisc_done, IBMVFC_MAD_FORMAT); evt->tgt = tgt; @@ -4699,6 +4780,12 @@ static void ibmvfc_tgt_query_target(struct ibmvfc_target *tgt)
kref_get(&tgt->kref); evt = ibmvfc_get_event(&vhost->crq); + if (!evt) { + ibmvfc_set_tgt_action(tgt, IBMVFC_TGT_ACTION_NONE); + kref_put(&tgt->kref, ibmvfc_release_tgt); + __ibmvfc_reset_host(vhost); + return; + } vhost->discovery_threads++; evt->tgt = tgt; ibmvfc_init_event(evt, ibmvfc_tgt_query_target_done, IBMVFC_MAD_FORMAT); @@ -4871,6 +4958,13 @@ static void ibmvfc_discover_targets(struct ibmvfc_host *vhost) { struct ibmvfc_discover_targets *mad; struct ibmvfc_event *evt = ibmvfc_get_event(&vhost->crq); + int level = IBMVFC_DEFAULT_LOG_LEVEL; + + if (!evt) { + ibmvfc_log(vhost, level, "Discover Targets failed: no available events\n"); + ibmvfc_hard_reset_host(vhost); + return; + }
ibmvfc_init_event(evt, ibmvfc_discover_targets_done, IBMVFC_MAD_FORMAT); mad = &evt->iu.discover_targets; @@ -4948,8 +5042,15 @@ static void ibmvfc_channel_setup(struct ibmvfc_host *vhost) struct ibmvfc_scsi_channels *scrqs = &vhost->scsi_scrqs; unsigned int num_channels = min(vhost->client_scsi_channels, vhost->max_vios_scsi_channels); + int level = IBMVFC_DEFAULT_LOG_LEVEL; int i;
+ if (!evt) { + ibmvfc_log(vhost, level, "Channel Setup failed: no available events\n"); + ibmvfc_hard_reset_host(vhost); + return; + } + memset(setup_buf, 0, sizeof(*setup_buf)); if (num_channels == 0) setup_buf->flags = cpu_to_be32(IBMVFC_CANCEL_CHANNELS); @@ -5011,6 +5112,13 @@ static void ibmvfc_channel_enquiry(struct ibmvfc_host *vhost) { struct ibmvfc_channel_enquiry *mad; struct ibmvfc_event *evt = ibmvfc_get_event(&vhost->crq); + int level = IBMVFC_DEFAULT_LOG_LEVEL; + + if (!evt) { + ibmvfc_log(vhost, level, "Channel Enquiry failed: no available events\n"); + ibmvfc_hard_reset_host(vhost); + return; + }
ibmvfc_init_event(evt, ibmvfc_channel_enquiry_done, IBMVFC_MAD_FORMAT); mad = &evt->iu.channel_enquiry; @@ -5133,6 +5241,12 @@ static void ibmvfc_npiv_login(struct ibmvfc_host *vhost) struct ibmvfc_npiv_login_mad *mad; struct ibmvfc_event *evt = ibmvfc_get_event(&vhost->crq);
+ if (!evt) { + ibmvfc_dbg(vhost, "NPIV Login failed: no available events\n"); + ibmvfc_hard_reset_host(vhost); + return; + } + ibmvfc_gather_partition_info(vhost); ibmvfc_set_login_info(vhost); ibmvfc_init_event(evt, ibmvfc_npiv_login_done, IBMVFC_MAD_FORMAT); @@ -5197,6 +5311,12 @@ static void ibmvfc_npiv_logout(struct ibmvfc_host *vhost) struct ibmvfc_event *evt;
evt = ibmvfc_get_event(&vhost->crq); + if (!evt) { + ibmvfc_dbg(vhost, "NPIV Logout failed: no available events\n"); + ibmvfc_hard_reset_host(vhost); + return; + } + ibmvfc_init_event(evt, ibmvfc_npiv_logout_done, IBMVFC_MAD_FORMAT);
mad = &evt->iu.npiv_logout;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Juntong Deng juntong.deng@outlook.com
[ Upstream commit 525b861a008143048535011f3816d407940f4bfa ]
l2nbperpage is log2(number of blks per page), and the minimum legal value should be 0, not negative.
In the case of l2nbperpage being negative, an error will occur when subsequently used as shift exponent.
Syzbot reported this bug:
UBSAN: shift-out-of-bounds in fs/jfs/jfs_dmap.c:799:12 shift exponent -16777216 is negative
Reported-by: syzbot+debee9ab7ae2b34b0307@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=debee9ab7ae2b34b0307 Signed-off-by: Juntong Deng juntong.deng@outlook.com Signed-off-by: Dave Kleikamp dave.kleikamp@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/jfs/jfs_dmap.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/fs/jfs/jfs_dmap.c b/fs/jfs/jfs_dmap.c index 88afd108c2dd2..3a1842348112d 100644 --- a/fs/jfs/jfs_dmap.c +++ b/fs/jfs/jfs_dmap.c @@ -180,7 +180,8 @@ int dbMount(struct inode *ipbmap) bmp->db_nfree = le64_to_cpu(dbmp_le->dn_nfree);
bmp->db_l2nbperpage = le32_to_cpu(dbmp_le->dn_l2nbperpage); - if (bmp->db_l2nbperpage > L2PSIZE - L2MINBLOCKSIZE) { + if (bmp->db_l2nbperpage > L2PSIZE - L2MINBLOCKSIZE || + bmp->db_l2nbperpage < 0) { err = -EINVAL; goto err_release_metapage; }
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Juntong Deng juntong.deng@outlook.com
[ Upstream commit 64933ab7b04881c6c18b21ff206c12278341c72e ]
Both db_maxag and db_agpref are used as the index of the db_agfree array, but there is currently no validity check for db_maxag and db_agpref, which can lead to errors.
The following is related bug reported by Syzbot:
UBSAN: array-index-out-of-bounds in fs/jfs/jfs_dmap.c:639:20 index 7936 is out of range for type 'atomic_t[128]'
Add checking that the values of db_maxag and db_agpref are valid indexes for the db_agfree array.
Reported-by: syzbot+38e876a8aa44b7115c76@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=38e876a8aa44b7115c76 Signed-off-by: Juntong Deng juntong.deng@outlook.com Signed-off-by: Dave Kleikamp dave.kleikamp@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/jfs/jfs_dmap.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/fs/jfs/jfs_dmap.c b/fs/jfs/jfs_dmap.c index 3a1842348112d..4d59373f9e6c9 100644 --- a/fs/jfs/jfs_dmap.c +++ b/fs/jfs/jfs_dmap.c @@ -195,6 +195,12 @@ int dbMount(struct inode *ipbmap) bmp->db_maxlevel = le32_to_cpu(dbmp_le->dn_maxlevel); bmp->db_maxag = le32_to_cpu(dbmp_le->dn_maxag); bmp->db_agpref = le32_to_cpu(dbmp_le->dn_agpref); + if (bmp->db_maxag >= MAXAG || bmp->db_maxag < 0 || + bmp->db_agpref >= MAXAG || bmp->db_agpref < 0) { + err = -EINVAL; + goto err_release_metapage; + } + bmp->db_aglevel = le32_to_cpu(dbmp_le->dn_aglevel); bmp->db_agheight = le32_to_cpu(dbmp_le->dn_agheight); bmp->db_agwidth = le32_to_cpu(dbmp_le->dn_agwidth);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Manas Ghandat ghandatmanas@gmail.com
[ Upstream commit 22cad8bc1d36547cdae0eef316c47d917ce3147c ]
Currently while searching for dmtree_t for sufficient free blocks there is an array out of bounds while getting element in tp->dm_stree. To add the required check for out of bound we first need to determine the type of dmtree. Thus added an extra parameter to dbFindLeaf so that the type of tree can be determined and the required check can be applied.
Reported-by: syzbot+aea1ad91e854d0a83e04@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=aea1ad91e854d0a83e04 Signed-off-by: Manas Ghandat ghandatmanas@gmail.com Signed-off-by: Dave Kleikamp dave.kleikamp@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/jfs/jfs_dmap.c | 14 ++++++++++---- 1 file changed, 10 insertions(+), 4 deletions(-)
diff --git a/fs/jfs/jfs_dmap.c b/fs/jfs/jfs_dmap.c index 4d59373f9e6c9..11c77757ead9e 100644 --- a/fs/jfs/jfs_dmap.c +++ b/fs/jfs/jfs_dmap.c @@ -87,7 +87,7 @@ static int dbAllocCtl(struct bmap * bmp, s64 nblocks, int l2nb, s64 blkno, static int dbExtend(struct inode *ip, s64 blkno, s64 nblocks, s64 addnblocks); static int dbFindBits(u32 word, int l2nb); static int dbFindCtl(struct bmap * bmp, int l2nb, int level, s64 * blkno); -static int dbFindLeaf(dmtree_t * tp, int l2nb, int *leafidx); +static int dbFindLeaf(dmtree_t *tp, int l2nb, int *leafidx, bool is_ctl); static int dbFreeBits(struct bmap * bmp, struct dmap * dp, s64 blkno, int nblocks); static int dbFreeDmap(struct bmap * bmp, struct dmap * dp, s64 blkno, @@ -1717,7 +1717,7 @@ static int dbFindCtl(struct bmap * bmp, int l2nb, int level, s64 * blkno) * dbFindLeaf() returns the index of the leaf at which * free space was found. */ - rc = dbFindLeaf((dmtree_t *) dcp, l2nb, &leafidx); + rc = dbFindLeaf((dmtree_t *) dcp, l2nb, &leafidx, true);
/* release the buffer. */ @@ -1964,7 +1964,7 @@ dbAllocDmapLev(struct bmap * bmp, * free space. if sufficient free space is found, dbFindLeaf() * returns the index of the leaf at which free space was found. */ - if (dbFindLeaf((dmtree_t *) & dp->tree, l2nb, &leafidx)) + if (dbFindLeaf((dmtree_t *) &dp->tree, l2nb, &leafidx, false)) return -ENOSPC;
if (leafidx < 0) @@ -2928,14 +2928,18 @@ static void dbAdjTree(dmtree_t * tp, int leafno, int newval) * leafidx - return pointer to be set to the index of the leaf * describing at least l2nb free blocks if sufficient * free blocks are found. + * is_ctl - determines if the tree is of type ctl * * RETURN VALUES: * 0 - success * -ENOSPC - insufficient free blocks. */ -static int dbFindLeaf(dmtree_t * tp, int l2nb, int *leafidx) +static int dbFindLeaf(dmtree_t *tp, int l2nb, int *leafidx, bool is_ctl) { int ti, n = 0, k, x = 0; + int max_size; + + max_size = is_ctl ? CTLTREESIZE : TREESIZE;
/* first check the root of the tree to see if there is * sufficient free space. @@ -2956,6 +2960,8 @@ static int dbFindLeaf(dmtree_t * tp, int l2nb, int *leafidx) /* sufficient free space found. move to the next * level (or quit if this is the last level). */ + if (x + n > max_size) + return -ENOSPC; if (l2nb <= tp->dmt_stree[x + n]) break; }
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Manas Ghandat ghandatmanas@gmail.com
[ Upstream commit 05d9ea1ceb62a55af6727a69269a4fd310edf483 ]
Currently there is not check against the agno of the iag while allocating new inodes to avoid fragmentation problem. Added the check which is required.
Reported-by: syzbot+79d792676d8ac050949f@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=79d792676d8ac050949f Signed-off-by: Manas Ghandat ghandatmanas@gmail.com Signed-off-by: Dave Kleikamp dave.kleikamp@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/jfs/jfs_imap.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/fs/jfs/jfs_imap.c b/fs/jfs/jfs_imap.c index 6fb28572cb2c6..34f1358264e23 100644 --- a/fs/jfs/jfs_imap.c +++ b/fs/jfs/jfs_imap.c @@ -1320,7 +1320,7 @@ diInitInode(struct inode *ip, int iagno, int ino, int extno, struct iag * iagp) int diAlloc(struct inode *pip, bool dir, struct inode *ip) { int rc, ino, iagno, addext, extno, bitno, sword; - int nwords, rem, i, agno; + int nwords, rem, i, agno, dn_numag; u32 mask, inosmap, extsmap; struct inode *ipimap; struct metapage *mp; @@ -1356,6 +1356,9 @@ int diAlloc(struct inode *pip, bool dir, struct inode *ip)
/* get the ag number of this iag */ agno = BLKTOAG(JFS_IP(pip)->agstart, JFS_SBI(pip->i_sb)); + dn_numag = JFS_SBI(pip->i_sb)->bmap->db_numag; + if (agno < 0 || agno > dn_numag) + return -EIO;
if (atomic_read(&JFS_SBI(pip->i_sb)->bmap->db_active[agno])) { /*
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mikhail Khvainitski me@khvoinitsky.org
[ Upstream commit 46a0a2c96f0f47628190f122c2e3d879e590bcbe ]
Built-in firmware of cptkbd handles scrolling by itself (when middle button is pressed) but with issues: it does not support horizontal and hi-res scrolling and upon middle button release it sends middle button click even if there was a scrolling event. Commit 3cb5ff0220e3 ("HID: lenovo: Hide middle-button press until release") workarounds last issue but it's impossible to workaround scrolling-related issues without firmware modification.
Likely, Dennis Schneider has reverse engineered the firmware and provided an instruction on how to patch it [1]. However, aforementioned workaround prevents userspace (libinput) from knowing exact moment when middle button has been pressed down and performing "On-Button scrolling". This commit detects correctly-behaving patched firmware if cursor movement events has been received during middle button being pressed and stops applying workaround for this device.
Link: https://hohlerde.org/rauch/en/elektronik/projekte/tpkbd-fix/ [1]
Signed-off-by: Mikhail Khvainitski me@khvoinitsky.org Signed-off-by: Jiri Kosina jkosina@suse.cz Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/hid/hid-lenovo.c | 68 ++++++++++++++++++++++++++-------------- 1 file changed, 45 insertions(+), 23 deletions(-)
diff --git a/drivers/hid/hid-lenovo.c b/drivers/hid/hid-lenovo.c index 44763c0da4441..9c1181313e44d 100644 --- a/drivers/hid/hid-lenovo.c +++ b/drivers/hid/hid-lenovo.c @@ -51,7 +51,12 @@ struct lenovo_drvdata { int select_right; int sensitivity; int press_speed; - u8 middlebutton_state; /* 0:Up, 1:Down (undecided), 2:Scrolling */ + /* 0: Up + * 1: Down (undecided) + * 2: Scrolling + * 3: Patched firmware, disable workaround + */ + u8 middlebutton_state; bool fn_lock; };
@@ -668,31 +673,48 @@ static int lenovo_event_cptkbd(struct hid_device *hdev, { struct lenovo_drvdata *cptkbd_data = hid_get_drvdata(hdev);
- /* "wheel" scroll events */ - if (usage->type == EV_REL && (usage->code == REL_WHEEL || - usage->code == REL_HWHEEL)) { - /* Scroll events disable middle-click event */ - cptkbd_data->middlebutton_state = 2; - return 0; - } + if (cptkbd_data->middlebutton_state != 3) { + /* REL_X and REL_Y events during middle button pressed + * are only possible on patched, bug-free firmware + * so set middlebutton_state to 3 + * to never apply workaround anymore + */ + if (cptkbd_data->middlebutton_state == 1 && + usage->type == EV_REL && + (usage->code == REL_X || usage->code == REL_Y)) { + cptkbd_data->middlebutton_state = 3; + /* send middle button press which was hold before */ + input_event(field->hidinput->input, + EV_KEY, BTN_MIDDLE, 1); + input_sync(field->hidinput->input); + }
- /* Middle click events */ - if (usage->type == EV_KEY && usage->code == BTN_MIDDLE) { - if (value == 1) { - cptkbd_data->middlebutton_state = 1; - } else if (value == 0) { - if (cptkbd_data->middlebutton_state == 1) { - /* No scrolling inbetween, send middle-click */ - input_event(field->hidinput->input, - EV_KEY, BTN_MIDDLE, 1); - input_sync(field->hidinput->input); - input_event(field->hidinput->input, - EV_KEY, BTN_MIDDLE, 0); - input_sync(field->hidinput->input); + /* "wheel" scroll events */ + if (usage->type == EV_REL && (usage->code == REL_WHEEL || + usage->code == REL_HWHEEL)) { + /* Scroll events disable middle-click event */ + cptkbd_data->middlebutton_state = 2; + return 0; + } + + /* Middle click events */ + if (usage->type == EV_KEY && usage->code == BTN_MIDDLE) { + if (value == 1) { + cptkbd_data->middlebutton_state = 1; + } else if (value == 0) { + if (cptkbd_data->middlebutton_state == 1) { + /* No scrolling inbetween, send middle-click */ + input_event(field->hidinput->input, + EV_KEY, BTN_MIDDLE, 1); + input_sync(field->hidinput->input); + input_event(field->hidinput->input, + EV_KEY, BTN_MIDDLE, 0); + input_sync(field->hidinput->input); + } + cptkbd_data->middlebutton_state = 0; } - cptkbd_data->middlebutton_state = 0; + return 1; } - return 1; }
if (usage->type == EV_KEY && usage->code == KEY_FN_ESC && value == 1) {
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Vincent Whitchurch vincent.whitchurch@axis.com
[ Upstream commit b0150014878c32197cfa66e3e2f79e57f66babc0 ]
Place IRQ handlers such as gic_handle_irq() in the irqentry section even if FUNCTION_GRAPH_TRACER is not enabled. Without this, the stack depot's filter_irq_stacks() does not correctly filter out IRQ stacks in those configurations, which hampers deduplication and eventually leads to "Stack depot reached limit capacity" splats with KASAN.
A similar fix was done for arm64 in commit f6794950f0e5ba37e3bbed ("arm64: set __exception_irq_entry with __irq_entry as a default").
Link: https://lore.kernel.org/r/20230803-arm-irqentry-v1-1-8aad8e260b1c@axis.com
Signed-off-by: Vincent Whitchurch vincent.whitchurch@axis.com Signed-off-by: Russell King (Oracle) rmk+kernel@armlinux.org.uk Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm/include/asm/exception.h | 4 ---- 1 file changed, 4 deletions(-)
diff --git a/arch/arm/include/asm/exception.h b/arch/arm/include/asm/exception.h index 58e039a851af0..3c82975d46db3 100644 --- a/arch/arm/include/asm/exception.h +++ b/arch/arm/include/asm/exception.h @@ -10,10 +10,6 @@
#include <linux/interrupt.h>
-#ifdef CONFIG_FUNCTION_GRAPH_TRACER #define __exception_irq_entry __irq_entry -#else -#define __exception_irq_entry -#endif
#endif /* __ASM_ARM_EXCEPTION_H */
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Cezary Rojewski cezary.rojewski@intel.com
[ Upstream commit f93dc90c2e8ed664985e366aa6459ac83cdab236 ]
While AudioDSP drivers assign streams exclusively of HOST or LINK type, nothing blocks a user to attempt to assign a COUPLED stream. As supplied substream instance may be a stub, what is the case when code-loading, such scenario ends with null-ptr-deref.
Signed-off-by: Cezary Rojewski cezary.rojewski@intel.com Link: https://lore.kernel.org/r/20231006102857.749143-2-cezary.rojewski@intel.com Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Sasha Levin sashal@kernel.org --- sound/hda/hdac_stream.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/sound/hda/hdac_stream.c b/sound/hda/hdac_stream.c index 2633a4bb1d85d..214a0680524b0 100644 --- a/sound/hda/hdac_stream.c +++ b/sound/hda/hdac_stream.c @@ -354,8 +354,10 @@ struct hdac_stream *snd_hdac_stream_assign(struct hdac_bus *bus, struct hdac_stream *res = NULL;
/* make a non-zero unique key for the substream */ - int key = (substream->pcm->device << 16) | (substream->number << 2) | - (substream->stream + 1); + int key = (substream->number << 2) | (substream->stream + 1); + + if (substream->pcm) + key |= (substream->pcm->device << 16);
spin_lock_irq(&bus->reg_lock); list_for_each_entry(azx_dev, &bus->stream_list, list) {
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Linus Walleij linus.walleij@linaro.org
[ Upstream commit 9e189e80dcb68528dea9e061d9704993f98cb84f ]
These gpio names are due to old DT bindings not following the "-gpio"/"-gpios" conventions. Handle it using a quirk so the driver can just look up the GPIOs.
Signed-off-by: Linus Walleij linus.walleij@linaro.org Acked-by: Bartosz Golaszewski bartosz.golaszewski@linaro.org Reviewed-by: AngeloGioacchino Del Regno angelogioacchino.delregno@collabora.com Link: https://lore.kernel.org/r/20231006-descriptors-asoc-mediatek-v1-1-07fe79f337... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpio/gpiolib-of.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/drivers/gpio/gpiolib-of.c b/drivers/gpio/gpiolib-of.c index 1436cdb5fa268..219bf8a82d8f9 100644 --- a/drivers/gpio/gpiolib-of.c +++ b/drivers/gpio/gpiolib-of.c @@ -496,6 +496,10 @@ static struct gpio_desc *of_find_gpio_rename(struct device_node *np, #if IS_ENABLED(CONFIG_SND_SOC_CS42L56) { "reset", "cirrus,gpio-nreset", "cirrus,cs42l56" }, #endif +#if IS_ENABLED(CONFIG_SND_SOC_MT2701_CS42448) + { "i2s1-in-sel-gpio1", NULL, "mediatek,mt2701-cs42448-machine" }, + { "i2s1-in-sel-gpio2", NULL, "mediatek,mt2701-cs42448-machine" }, +#endif #if IS_ENABLED(CONFIG_SND_SOC_TLV320AIC3X) { "reset", "gpio-reset", "ti,tlv320aic3x" }, { "reset", "gpio-reset", "ti,tlv320aic33" },
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ilpo Järvinen ilpo.jarvinen@linux.intel.com
[ Upstream commit 759574abd78e3b47ec45bbd31a64e8832cf73f97 ]
Use FIELD_GET() to extract PCIe Negotiated Link Width field instead of custom masking and shifting.
Similarly, change custom code that misleadingly used PCI_EXP_LNKSTA_NLW_SHIFT to prepare value for PCI_EXP_LNKCAP write to use FIELD_PREP() with correct field define (PCI_EXP_LNKCAP_MLW).
Link: https://lore.kernel.org/r/20230919125648.1920-5-ilpo.jarvinen@linux.intel.co... Signed-off-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com Signed-off-by: Bjorn Helgaas bhelgaas@google.com Reviewed-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pci/controller/dwc/pcie-tegra194.c | 9 ++++----- 1 file changed, 4 insertions(+), 5 deletions(-)
diff --git a/drivers/pci/controller/dwc/pcie-tegra194.c b/drivers/pci/controller/dwc/pcie-tegra194.c index ccff8cde5cff6..07cb0818a5138 100644 --- a/drivers/pci/controller/dwc/pcie-tegra194.c +++ b/drivers/pci/controller/dwc/pcie-tegra194.c @@ -9,6 +9,7 @@ * Author: Vidya Sagar vidyas@nvidia.com */
+#include <linux/bitfield.h> #include <linux/clk.h> #include <linux/debugfs.h> #include <linux/delay.h> @@ -347,8 +348,7 @@ static void apply_bad_link_workaround(struct dw_pcie_rp *pp) */ val = dw_pcie_readw_dbi(pci, pcie->pcie_cap_base + PCI_EXP_LNKSTA); if (val & PCI_EXP_LNKSTA_LBMS) { - current_link_width = (val & PCI_EXP_LNKSTA_NLW) >> - PCI_EXP_LNKSTA_NLW_SHIFT; + current_link_width = FIELD_GET(PCI_EXP_LNKSTA_NLW, val); if (pcie->init_link_width > current_link_width) { dev_warn(pci->dev, "PCIe link is bad, width reduced\n"); val = dw_pcie_readw_dbi(pci, pcie->pcie_cap_base + @@ -761,8 +761,7 @@ static void tegra_pcie_enable_system_interrupts(struct dw_pcie_rp *pp)
val_w = dw_pcie_readw_dbi(&pcie->pci, pcie->pcie_cap_base + PCI_EXP_LNKSTA); - pcie->init_link_width = (val_w & PCI_EXP_LNKSTA_NLW) >> - PCI_EXP_LNKSTA_NLW_SHIFT; + pcie->init_link_width = FIELD_GET(PCI_EXP_LNKSTA_NLW, val_w);
val_w = dw_pcie_readw_dbi(&pcie->pci, pcie->pcie_cap_base + PCI_EXP_LNKCTL); @@ -921,7 +920,7 @@ static int tegra_pcie_dw_host_init(struct dw_pcie_rp *pp) /* Configure Max lane width from DT */ val = dw_pcie_readl_dbi(pci, pcie->pcie_cap_base + PCI_EXP_LNKCAP); val &= ~PCI_EXP_LNKCAP_MLW; - val |= (pcie->num_lanes << PCI_EXP_LNKSTA_NLW_SHIFT); + val |= FIELD_PREP(PCI_EXP_LNKCAP_MLW, pcie->num_lanes); dw_pcie_writel_dbi(pci, pcie->pcie_cap_base + PCI_EXP_LNKCAP, val);
/* Clear Slot Clock Configuration bit if SRNS configuration */
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ilpo Järvinen ilpo.jarvinen@linux.intel.com
[ Upstream commit 408599ec561ad5862cda4f107626009f6fa97a74 ]
mvebu_pcie_setup_hw() setups the Maximum Link Width field in the Link Capabilities registers using an open-coded variant of FIELD_PREP() with a literal in shift. Improve readability by using FIELD_PREP(PCI_EXP_LNKCAP_MLW, ...).
Link: https://lore.kernel.org/r/20230919125648.1920-6-ilpo.jarvinen@linux.intel.co... Signed-off-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com Signed-off-by: Bjorn Helgaas bhelgaas@google.com Reviewed-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pci/controller/pci-mvebu.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/pci/controller/pci-mvebu.c b/drivers/pci/controller/pci-mvebu.c index c931b1b07b1d8..0cacd58f6c05e 100644 --- a/drivers/pci/controller/pci-mvebu.c +++ b/drivers/pci/controller/pci-mvebu.c @@ -265,7 +265,7 @@ static void mvebu_pcie_setup_hw(struct mvebu_pcie_port *port) */ lnkcap = mvebu_readl(port, PCIE_CAP_PCIEXP + PCI_EXP_LNKCAP); lnkcap &= ~PCI_EXP_LNKCAP_MLW; - lnkcap |= (port->is_x4 ? 4 : 1) << 4; + lnkcap |= FIELD_PREP(PCI_EXP_LNKCAP_MLW, port->is_x4 ? 4 : 1); mvebu_writel(port, lnkcap, PCIE_CAP_PCIEXP + PCI_EXP_LNKCAP);
/* Disable Root Bridge I/O space, memory space and bus mastering. */
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ilpo Järvinen ilpo.jarvinen@linux.intel.com
[ Upstream commit c28742447ca9879b52fbaf022ad844f0ffcd749c ]
In get_esi() PCI errors are checked inside line-split "if" conditions (in addition to the file not following the coding style). To make the code in get_esi() more readable, fix the coding style and use the usual error handling pattern with a separate variable.
In addition, initialization of 'error' variable at declaration is not needed.
No functional changes intended.
Link: https://lore.kernel.org/r/20230911125354.25501-4-ilpo.jarvinen@linux.intel.c... Signed-off-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com Signed-off-by: Bjorn Helgaas bhelgaas@google.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/atm/iphase.c | 20 +++++++++++--------- 1 file changed, 11 insertions(+), 9 deletions(-)
diff --git a/drivers/atm/iphase.c b/drivers/atm/iphase.c index 3241486869530..9bba8f280a4d4 100644 --- a/drivers/atm/iphase.c +++ b/drivers/atm/iphase.c @@ -2291,19 +2291,21 @@ static int get_esi(struct atm_dev *dev) static int reset_sar(struct atm_dev *dev) { IADEV *iadev; - int i, error = 1; + int i, error; unsigned int pci[64]; iadev = INPH_IA_DEV(dev); - for(i=0; i<64; i++) - if ((error = pci_read_config_dword(iadev->pci, - i*4, &pci[i])) != PCIBIOS_SUCCESSFUL) - return error; + for (i = 0; i < 64; i++) { + error = pci_read_config_dword(iadev->pci, i * 4, &pci[i]); + if (error != PCIBIOS_SUCCESSFUL) + return error; + } writel(0, iadev->reg+IPHASE5575_EXT_RESET); - for(i=0; i<64; i++) - if ((error = pci_write_config_dword(iadev->pci, - i*4, pci[i])) != PCIBIOS_SUCCESSFUL) - return error; + for (i = 0; i < 64; i++) { + error = pci_write_config_dword(iadev->pci, i * 4, pci[i]); + if (error != PCIBIOS_SUCCESSFUL) + return error; + } udelay(5); return 0; }
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ilpo Järvinen ilpo.jarvinen@linux.intel.com
[ Upstream commit d15f18053e5cc5576af9e7eef0b2a91169b6326d ]
Placing PCI error code check inside "if" condition usually results in need to split lines. Combined with additional conditions the "if" condition becomes messy.
Convert to the usual error handling pattern with an additional variable to improve code readability. In addition, reverse the logic in pci_find_vsec_capability() to get rid of &&.
No functional changes intended.
Link: https://lore.kernel.org/r/20230911125354.25501-5-ilpo.jarvinen@linux.intel.c... Signed-off-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com [bhelgaas: PCI_POSSIBLE_ERROR()] Signed-off-by: Bjorn Helgaas bhelgaas@google.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pci/pci.c | 9 ++++++--- drivers/pci/probe.c | 6 +++--- drivers/pci/quirks.c | 6 +++--- 3 files changed, 12 insertions(+), 9 deletions(-)
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c index 702fe577089b4..3cd907eb67b74 100644 --- a/drivers/pci/pci.c +++ b/drivers/pci/pci.c @@ -732,15 +732,18 @@ u16 pci_find_vsec_capability(struct pci_dev *dev, u16 vendor, int cap) { u16 vsec = 0; u32 header; + int ret;
if (vendor != dev->vendor) return 0;
while ((vsec = pci_find_next_ext_capability(dev, vsec, PCI_EXT_CAP_ID_VNDR))) { - if (pci_read_config_dword(dev, vsec + PCI_VNDR_HEADER, - &header) == PCIBIOS_SUCCESSFUL && - PCI_VNDR_HEADER_ID(header) == cap) + ret = pci_read_config_dword(dev, vsec + PCI_VNDR_HEADER, &header); + if (ret != PCIBIOS_SUCCESSFUL) + continue; + + if (PCI_VNDR_HEADER_ID(header) == cap) return vsec; }
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c index 24a83cf5ace8c..cd08d39fdb1ff 100644 --- a/drivers/pci/probe.c +++ b/drivers/pci/probe.c @@ -1653,15 +1653,15 @@ static void pci_set_removable(struct pci_dev *dev) static bool pci_ext_cfg_is_aliased(struct pci_dev *dev) { #ifdef CONFIG_PCI_QUIRKS - int pos; + int pos, ret; u32 header, tmp;
pci_read_config_dword(dev, PCI_VENDOR_ID, &header);
for (pos = PCI_CFG_SPACE_SIZE; pos < PCI_CFG_SPACE_EXP_SIZE; pos += PCI_CFG_SPACE_SIZE) { - if (pci_read_config_dword(dev, pos, &tmp) != PCIBIOS_SUCCESSFUL - || header != tmp) + ret = pci_read_config_dword(dev, pos, &tmp); + if ((ret != PCIBIOS_SUCCESSFUL) || (header != tmp)) return false; }
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c index eb65170b97ff0..e8a051e2d8140 100644 --- a/drivers/pci/quirks.c +++ b/drivers/pci/quirks.c @@ -5383,7 +5383,7 @@ int pci_dev_specific_disable_acs_redir(struct pci_dev *dev) */ static void quirk_intel_qat_vf_cap(struct pci_dev *pdev) { - int pos, i = 0; + int pos, i = 0, ret; u8 next_cap; u16 reg16, *cap; struct pci_cap_saved_state *state; @@ -5429,8 +5429,8 @@ static void quirk_intel_qat_vf_cap(struct pci_dev *pdev) pdev->pcie_mpss = reg16 & PCI_EXP_DEVCAP_PAYLOAD;
pdev->cfg_size = PCI_CFG_SPACE_EXP_SIZE; - if (pci_read_config_dword(pdev, PCI_CFG_SPACE_SIZE, &status) != - PCIBIOS_SUCCESSFUL || (status == 0xffffffff)) + ret = pci_read_config_dword(pdev, PCI_CFG_SPACE_SIZE, &status); + if ((ret != PCIBIOS_SUCCESSFUL) || (PCI_POSSIBLE_ERROR(status))) pdev->cfg_size = PCI_CFG_SPACE_SIZE;
if (pci_find_saved_cap(pdev, PCI_CAP_ID_EXP))
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Wenchao Hao haowenchao2@huawei.com
[ Upstream commit 4df105f0ce9f6f30cda4e99f577150d23f0c9c5f ]
fc_lport_ptp_setup() did not check the return value of fc_rport_create() which can return NULL and would cause a NULL pointer dereference. Address this issue by checking return value of fc_rport_create() and log error message on fc_rport_create() failed.
Signed-off-by: Wenchao Hao haowenchao2@huawei.com Link: https://lore.kernel.org/r/20231011130350.819571-1-haowenchao2@huawei.com Reviewed-by: Simon Horman horms@kernel.org Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/libfc/fc_lport.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/drivers/scsi/libfc/fc_lport.c b/drivers/scsi/libfc/fc_lport.c index 9c02c9523c4d4..ab06e9aeb613e 100644 --- a/drivers/scsi/libfc/fc_lport.c +++ b/drivers/scsi/libfc/fc_lport.c @@ -241,6 +241,12 @@ static void fc_lport_ptp_setup(struct fc_lport *lport, } mutex_lock(&lport->disc.disc_mutex); lport->ptp_rdata = fc_rport_create(lport, remote_fid); + if (!lport->ptp_rdata) { + printk(KERN_WARNING "libfc: Failed to setup lport 0x%x\n", + lport->port_id); + mutex_unlock(&lport->disc.disc_mutex); + return; + } kref_get(&lport->ptp_rdata->kref); lport->ptp_rdata->ids.port_name = remote_wwpn; lport->ptp_rdata->ids.node_name = remote_wwnn;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ilpo Järvinen ilpo.jarvinen@linux.intel.com
[ Upstream commit d1f9b39da4a5347150246871325190018cda8cb3 ]
Use FIELD_GET() to extract PCIe Negotiated and Maximum Link Width fields instead of custom masking and shifting.
Link: https://lore.kernel.org/r/20230919125648.1920-7-ilpo.jarvinen@linux.intel.co... Signed-off-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com [bhelgaas: drop duplicate include of <linux/bitfield.h>] Signed-off-by: Bjorn Helgaas bhelgaas@google.com Reviewed-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pci/pci-sysfs.c | 5 ++--- drivers/pci/pci.c | 5 ++--- 2 files changed, 4 insertions(+), 6 deletions(-)
diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c index ab32a91f287b4..4473773dc2af4 100644 --- a/drivers/pci/pci-sysfs.c +++ b/drivers/pci/pci-sysfs.c @@ -12,7 +12,7 @@ * Modeled after usb's driverfs.c */
- +#include <linux/bitfield.h> #include <linux/kernel.h> #include <linux/sched.h> #include <linux/pci.h> @@ -230,8 +230,7 @@ static ssize_t current_link_width_show(struct device *dev, if (err) return -EINVAL;
- return sysfs_emit(buf, "%u\n", - (linkstat & PCI_EXP_LNKSTA_NLW) >> PCI_EXP_LNKSTA_NLW_SHIFT); + return sysfs_emit(buf, "%u\n", FIELD_GET(PCI_EXP_LNKSTA_NLW, linkstat)); } static DEVICE_ATTR_RO(current_link_width);
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c index 3cd907eb67b74..2c5662d43df1b 100644 --- a/drivers/pci/pci.c +++ b/drivers/pci/pci.c @@ -6255,8 +6255,7 @@ u32 pcie_bandwidth_available(struct pci_dev *dev, struct pci_dev **limiting_dev, pcie_capability_read_word(dev, PCI_EXP_LNKSTA, &lnksta);
next_speed = pcie_link_speed[lnksta & PCI_EXP_LNKSTA_CLS]; - next_width = (lnksta & PCI_EXP_LNKSTA_NLW) >> - PCI_EXP_LNKSTA_NLW_SHIFT; + next_width = FIELD_GET(PCI_EXP_LNKSTA_NLW, lnksta);
next_bw = next_width * PCIE_SPEED2MBS_ENC(next_speed);
@@ -6328,7 +6327,7 @@ enum pcie_link_width pcie_get_width_cap(struct pci_dev *dev)
pcie_capability_read_dword(dev, PCI_EXP_LNKCAP, &lnkcap); if (lnkcap) - return (lnkcap & PCI_EXP_LNKCAP_MLW) >> 4; + return FIELD_GET(PCI_EXP_LNKCAP_MLW, lnkcap);
return PCIE_LNK_WIDTH_UNKNOWN; }
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Bartosz Pawlowski bartosz.pawlowski@intel.com
[ Upstream commit f18b1137d38c091cc8c16365219f0a1d4a30b3d1 ]
Introduce quirk_no_ats() helper function to provide a standard way to disable ATS capability in PCI quirks.
Suggested-by: Andy Shevchenko andriy.shevchenko@linux.intel.com Link: https://lore.kernel.org/r/20230908143606.685930-2-bartosz.pawlowski@intel.co... Signed-off-by: Bartosz Pawlowski bartosz.pawlowski@intel.com Signed-off-by: Bjorn Helgaas bhelgaas@google.com Reviewed-by: Andy Shevchenko andriy.shevchenko@linux.intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pci/quirks.c | 16 +++++++++------- 1 file changed, 9 insertions(+), 7 deletions(-)
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c index e8a051e2d8140..b82f94af2fae4 100644 --- a/drivers/pci/quirks.c +++ b/drivers/pci/quirks.c @@ -5507,6 +5507,12 @@ DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_SERVERWORKS, 0x0420, quirk_no_ext_tags); DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_SERVERWORKS, 0x0422, quirk_no_ext_tags);
#ifdef CONFIG_PCI_ATS +static void quirk_no_ats(struct pci_dev *pdev) +{ + pci_info(pdev, "disabling ATS\n"); + pdev->ats_cap = 0; +} + /* * Some devices require additional driver setup to enable ATS. Don't use * ATS for those devices as ATS will be enabled before the driver has had a @@ -5520,14 +5526,10 @@ static void quirk_amd_harvest_no_ats(struct pci_dev *pdev) (pdev->subsystem_device == 0xce19 || pdev->subsystem_device == 0xcc10 || pdev->subsystem_device == 0xcc08)) - goto no_ats; - else - return; + quirk_no_ats(pdev); + } else { + quirk_no_ats(pdev); } - -no_ats: - pci_info(pdev, "disabling ATS\n"); - pdev->ats_cap = 0; }
/* AMD Stoney platform GPU */
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Bartosz Pawlowski bartosz.pawlowski@intel.com
[ Upstream commit a18615b1cfc04f00548c60eb9a77e0ce56e848fd ]
Due to a hardware issue in A and B steppings of Intel IPU E2000, it expects wrong endianness in ATS invalidation message body. This problem can lead to outdated translations being returned as valid and finally cause system instability.
To prevent such issues, add quirk_intel_e2000_no_ats() to disable ATS for vulnerable IPU E2000 devices.
Link: https://lore.kernel.org/r/20230908143606.685930-3-bartosz.pawlowski@intel.co... Signed-off-by: Bartosz Pawlowski bartosz.pawlowski@intel.com Signed-off-by: Bjorn Helgaas bhelgaas@google.com Reviewed-by: Andy Shevchenko andriy.shevchenko@linux.intel.com Reviewed-by: Alexander Lobakin aleksander.lobakin@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pci/quirks.c | 19 +++++++++++++++++++ 1 file changed, 19 insertions(+)
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c index b82f94af2fae4..9fa3c9225bb30 100644 --- a/drivers/pci/quirks.c +++ b/drivers/pci/quirks.c @@ -5552,6 +5552,25 @@ DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATI, 0x7347, quirk_amd_harvest_no_ats); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATI, 0x734f, quirk_amd_harvest_no_ats); /* AMD Raven platform iGPU */ DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATI, 0x15d8, quirk_amd_harvest_no_ats); + +/* + * Intel IPU E2000 revisions before C0 implement incorrect endianness + * in ATS Invalidate Request message body. Disable ATS for those devices. + */ +static void quirk_intel_e2000_no_ats(struct pci_dev *pdev) +{ + if (pdev->revision < 0x20) + quirk_no_ats(pdev); +} +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x1451, quirk_intel_e2000_no_ats); +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x1452, quirk_intel_e2000_no_ats); +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x1453, quirk_intel_e2000_no_ats); +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x1454, quirk_intel_e2000_no_ats); +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x1455, quirk_intel_e2000_no_ats); +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x1457, quirk_intel_e2000_no_ats); +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x1459, quirk_intel_e2000_no_ats); +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x145a, quirk_intel_e2000_no_ats); +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x145c, quirk_intel_e2000_no_ats); #endif /* CONFIG_PCI_ATS */
/* Freescale PCIe doesn't support MSI in RC mode */
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yoshihiro Shimoda yoshihiro.shimoda.uh@renesas.com
[ Upstream commit a9a1bcba90254975d4adbcca53f720318cf81c0c ]
This is a preparation before adding the Max-Link-width capability setup which would in its turn complete the max-link-width setup procedure defined by Synopsys in the HW-manual.
Seeing there is a max-link-speed setup method defined in the DW PCIe core driver it would be good to have a similar function for the link width setup.
That's why we need to define a dedicated function first from already implemented but incomplete link-width setting up code.
Link: https://lore.kernel.org/linux-pci/20231018085631.1121289-3-yoshihiro.shimoda... Signed-off-by: Yoshihiro Shimoda yoshihiro.shimoda.uh@renesas.com Signed-off-by: Krzysztof Wilczyński kwilczynski@kernel.org Reviewed-by: Manivannan Sadhasivam mani@kernel.org Reviewed-by: Serge Semin fancer.lancer@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pci/controller/dwc/pcie-designware.c | 86 ++++++++++---------- 1 file changed, 41 insertions(+), 45 deletions(-)
diff --git a/drivers/pci/controller/dwc/pcie-designware.c b/drivers/pci/controller/dwc/pcie-designware.c index 1f2ee71da4da2..d14b4da700eaf 100644 --- a/drivers/pci/controller/dwc/pcie-designware.c +++ b/drivers/pci/controller/dwc/pcie-designware.c @@ -732,6 +732,46 @@ static void dw_pcie_link_set_max_speed(struct dw_pcie *pci, u32 link_gen)
}
+static void dw_pcie_link_set_max_link_width(struct dw_pcie *pci, u32 num_lanes) +{ + u32 lwsc, plc; + + if (!num_lanes) + return; + + /* Set the number of lanes */ + plc = dw_pcie_readl_dbi(pci, PCIE_PORT_LINK_CONTROL); + plc &= ~PORT_LINK_FAST_LINK_MODE; + plc &= ~PORT_LINK_MODE_MASK; + + /* Set link width speed control register */ + lwsc = dw_pcie_readl_dbi(pci, PCIE_LINK_WIDTH_SPEED_CONTROL); + lwsc &= ~PORT_LOGIC_LINK_WIDTH_MASK; + switch (num_lanes) { + case 1: + plc |= PORT_LINK_MODE_1_LANES; + lwsc |= PORT_LOGIC_LINK_WIDTH_1_LANES; + break; + case 2: + plc |= PORT_LINK_MODE_2_LANES; + lwsc |= PORT_LOGIC_LINK_WIDTH_2_LANES; + break; + case 4: + plc |= PORT_LINK_MODE_4_LANES; + lwsc |= PORT_LOGIC_LINK_WIDTH_4_LANES; + break; + case 8: + plc |= PORT_LINK_MODE_8_LANES; + lwsc |= PORT_LOGIC_LINK_WIDTH_8_LANES; + break; + default: + dev_err(pci->dev, "num-lanes %u: invalid value\n", num_lanes); + return; + } + dw_pcie_writel_dbi(pci, PCIE_PORT_LINK_CONTROL, plc); + dw_pcie_writel_dbi(pci, PCIE_LINK_WIDTH_SPEED_CONTROL, lwsc); +} + void dw_pcie_iatu_detect(struct dw_pcie *pci) { int max_region, ob, ib; @@ -1013,49 +1053,5 @@ void dw_pcie_setup(struct dw_pcie *pci) val |= PORT_LINK_DLL_LINK_EN; dw_pcie_writel_dbi(pci, PCIE_PORT_LINK_CONTROL, val);
- if (!pci->num_lanes) { - dev_dbg(pci->dev, "Using h/w default number of lanes\n"); - return; - } - - /* Set the number of lanes */ - val &= ~PORT_LINK_FAST_LINK_MODE; - val &= ~PORT_LINK_MODE_MASK; - switch (pci->num_lanes) { - case 1: - val |= PORT_LINK_MODE_1_LANES; - break; - case 2: - val |= PORT_LINK_MODE_2_LANES; - break; - case 4: - val |= PORT_LINK_MODE_4_LANES; - break; - case 8: - val |= PORT_LINK_MODE_8_LANES; - break; - default: - dev_err(pci->dev, "num-lanes %u: invalid value\n", pci->num_lanes); - return; - } - dw_pcie_writel_dbi(pci, PCIE_PORT_LINK_CONTROL, val); - - /* Set link width speed control register */ - val = dw_pcie_readl_dbi(pci, PCIE_LINK_WIDTH_SPEED_CONTROL); - val &= ~PORT_LOGIC_LINK_WIDTH_MASK; - switch (pci->num_lanes) { - case 1: - val |= PORT_LOGIC_LINK_WIDTH_1_LANES; - break; - case 2: - val |= PORT_LOGIC_LINK_WIDTH_2_LANES; - break; - case 4: - val |= PORT_LOGIC_LINK_WIDTH_4_LANES; - break; - case 8: - val |= PORT_LOGIC_LINK_WIDTH_8_LANES; - break; - } - dw_pcie_writel_dbi(pci, PCIE_LINK_WIDTH_SPEED_CONTROL, val); + dw_pcie_link_set_max_link_width(pci, pci->num_lanes); }
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yoshihiro Shimoda yoshihiro.shimoda.uh@renesas.com
[ Upstream commit 89db0793c9f2da265ecb6c1681f899d9af157f37 ]
Update dw_pcie_link_set_max_link_width() to set PCI_EXP_LNKCAP_MLW.
In accordance with the DW PCIe RC/EP HW manuals [1,2,3,...] aside with the PORT_LINK_CTRL_OFF.LINK_CAPABLE and GEN2_CTRL_OFF.NUM_OF_LANES[8:0] field there is another one which needs to be updated.
It's LINK_CAPABILITIES_REG.PCIE_CAP_MAX_LINK_WIDTH. If it isn't done at the very least the maximum link-width capability CSR won't expose the actual maximum capability.
[1] DesignWare Cores PCI Express Controller Databook - DWC PCIe Root Port, Version 4.60a, March 2015, p.1032 [2] DesignWare Cores PCI Express Controller Databook - DWC PCIe Root Port, Version 4.70a, March 2016, p.1065 [3] DesignWare Cores PCI Express Controller Databook - DWC PCIe Root Port, Version 4.90a, March 2016, p.1057 ... [X] DesignWare Cores PCI Express Controller Databook - DWC PCIe Endpoint, Version 5.40a, March 2019, p.1396 [X+1] DesignWare Cores PCI Express Controller Databook - DWC PCIe Root Port, Version 5.40a, March 2019, p.1266
Suggested-by: Serge Semin fancer.lancer@gmail.com Link: https://lore.kernel.org/linux-pci/20231018085631.1121289-4-yoshihiro.shimoda... Signed-off-by: Yoshihiro Shimoda yoshihiro.shimoda.uh@renesas.com Signed-off-by: Krzysztof Wilczyński kwilczynski@kernel.org Reviewed-by: Manivannan Sadhasivam manivannan.sadhasivam@linaro.org Reviewed-by: Serge Semin fancer.lancer@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pci/controller/dwc/pcie-designware.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/drivers/pci/controller/dwc/pcie-designware.c b/drivers/pci/controller/dwc/pcie-designware.c index d14b4da700eaf..8e6f6ac42dc96 100644 --- a/drivers/pci/controller/dwc/pcie-designware.c +++ b/drivers/pci/controller/dwc/pcie-designware.c @@ -734,7 +734,8 @@ static void dw_pcie_link_set_max_speed(struct dw_pcie *pci, u32 link_gen)
static void dw_pcie_link_set_max_link_width(struct dw_pcie *pci, u32 num_lanes) { - u32 lwsc, plc; + u32 lnkcap, lwsc, plc; + u8 cap;
if (!num_lanes) return; @@ -770,6 +771,12 @@ static void dw_pcie_link_set_max_link_width(struct dw_pcie *pci, u32 num_lanes) } dw_pcie_writel_dbi(pci, PCIE_PORT_LINK_CONTROL, plc); dw_pcie_writel_dbi(pci, PCIE_LINK_WIDTH_SPEED_CONTROL, lwsc); + + cap = dw_pcie_find_capability(pci, PCI_CAP_ID_EXP); + lnkcap = dw_pcie_readl_dbi(pci, cap + PCI_EXP_LNKCAP); + lnkcap &= ~PCI_EXP_LNKCAP_MLW; + lnkcap |= FIELD_PREP(PCI_EXP_LNKCAP_MLW, num_lanes); + dw_pcie_writel_dbi(pci, cap + PCI_EXP_LNKCAP, lnkcap); }
void dw_pcie_iatu_detect(struct dw_pcie *pci)
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yoshihiro Shimoda yoshihiro.shimoda.uh@renesas.com
[ Upstream commit 6c4b39937f4e65688ea294725ae432b2565821ff ]
Add Renesas R8A779F0 in pci_device_id table so that pci-epf-test can be used for testing PCIe EP on R-Car S4-8.
Link: https://lore.kernel.org/linux-pci/20231018085631.1121289-16-yoshihiro.shimod... Signed-off-by: Yoshihiro Shimoda yoshihiro.shimoda.uh@renesas.com Signed-off-by: Krzysztof Wilczyński kwilczynski@kernel.org Acked-by: Manivannan Sadhasivam mani@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/misc/pci_endpoint_test.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/drivers/misc/pci_endpoint_test.c b/drivers/misc/pci_endpoint_test.c index 7e1acc68d4359..af519088732d9 100644 --- a/drivers/misc/pci_endpoint_test.c +++ b/drivers/misc/pci_endpoint_test.c @@ -82,6 +82,7 @@ #define PCI_DEVICE_ID_RENESAS_R8A774B1 0x002b #define PCI_DEVICE_ID_RENESAS_R8A774C0 0x002d #define PCI_DEVICE_ID_RENESAS_R8A774E1 0x0025 +#define PCI_DEVICE_ID_RENESAS_R8A779F0 0x0031
static DEFINE_IDA(pci_endpoint_test_ida);
@@ -991,6 +992,9 @@ static const struct pci_device_id pci_endpoint_test_tbl[] = { { PCI_DEVICE(PCI_VENDOR_ID_RENESAS, PCI_DEVICE_ID_RENESAS_R8A774B1),}, { PCI_DEVICE(PCI_VENDOR_ID_RENESAS, PCI_DEVICE_ID_RENESAS_R8A774C0),}, { PCI_DEVICE(PCI_VENDOR_ID_RENESAS, PCI_DEVICE_ID_RENESAS_R8A774E1),}, + { PCI_DEVICE(PCI_VENDOR_ID_RENESAS, PCI_DEVICE_ID_RENESAS_R8A779F0), + .driver_data = (kernel_ulong_t)&default_data, + }, { PCI_DEVICE(PCI_VENDOR_ID_TI, PCI_DEVICE_ID_TI_J721E), .driver_data = (kernel_ulong_t)&j721e_data, },
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Bjorn Helgaas bhelgaas@google.com
[ Upstream commit 04e82fa5951ca66495d7b05665eff673aa3852b4 ]
Use FIELD_GET() to remove dependences on the field position, i.e., the shift value. No functional change intended.
Separate because this isn't as trivial as the other FIELD_GET() changes.
See 907830b0fc9e ("PCI: Add a REBAR size quirk for Sapphire RX 5600 XT Pulse")
Link: https://lore.kernel.org/r/20231010204436.1000644-3-helgaas@kernel.org Signed-off-by: Bjorn Helgaas bhelgaas@google.com Reviewed-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com Reviewed-by: Jonathan Cameron Jonathan.Cameron@huawei.com Reviewed-by: Kuppuswamy Sathyanarayanan sathyanarayanan.kuppuswamy@linux.intel.com Cc: Nirmoy Das nirmoy.das@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pci/pci.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c index 2c5662d43df1b..a7793abdd74ee 100644 --- a/drivers/pci/pci.c +++ b/drivers/pci/pci.c @@ -3746,14 +3746,14 @@ u32 pci_rebar_get_possible_sizes(struct pci_dev *pdev, int bar) return 0;
pci_read_config_dword(pdev, pos + PCI_REBAR_CAP, &cap); - cap &= PCI_REBAR_CAP_SIZES; + cap = FIELD_GET(PCI_REBAR_CAP_SIZES, cap);
/* Sapphire RX 5600 XT Pulse has an invalid cap dword for BAR 0 */ if (pdev->vendor == PCI_VENDOR_ID_ATI && pdev->device == 0x731f && - bar == 0 && cap == 0x7000) - cap = 0x3f000; + bar == 0 && cap == 0x700) + return 0x3f00;
- return cap >> 4; + return cap; } EXPORT_SYMBOL(pci_rebar_get_possible_sizes);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Hans de Goede hdegoede@redhat.com
[ Upstream commit 2cb54788393134d8174ee594002baae3ce52c61e ]
The Lenovo Yoga Tab 3 Pro YT3-X90 x86 tablet, which ships with Android with a custom kernel as factory OS, does not list the used WM5102 codec inside its DSDT.
Workaround this with a new snd_soc_acpi_intel_baytrail_machines[] entry which matches on the SST id instead of the codec id like nocodec does, combined with using a machine_quirk callback which returns NULL on other machines to skip the new entry on other machines.
Signed-off-by: Hans de Goede hdegoede@redhat.com Link: https://lore.kernel.org/r/20231021211534.114991-1-hdegoede@redhat.com Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- .../intel/common/soc-acpi-intel-cht-match.c | 43 +++++++++++++++++++ 1 file changed, 43 insertions(+)
diff --git a/sound/soc/intel/common/soc-acpi-intel-cht-match.c b/sound/soc/intel/common/soc-acpi-intel-cht-match.c index cdcbf04b8832f..5e2ec60e2954b 100644 --- a/sound/soc/intel/common/soc-acpi-intel-cht-match.c +++ b/sound/soc/intel/common/soc-acpi-intel-cht-match.c @@ -75,6 +75,39 @@ static struct snd_soc_acpi_mach *cht_ess8316_quirk(void *arg) return arg; }
+/* + * The Lenovo Yoga Tab 3 Pro YT3-X90, with Android factory OS has a buggy DSDT + * with the coded not being listed at all. + */ +static const struct dmi_system_id lenovo_yoga_tab3_x90[] = { + { + /* Lenovo Yoga Tab 3 Pro YT3-X90, codec missing from DSDT */ + .matches = { + DMI_MATCH(DMI_SYS_VENDOR, "Intel Corporation"), + DMI_MATCH(DMI_PRODUCT_NAME, "CHERRYVIEW D1 PLATFORM"), + DMI_MATCH(DMI_PRODUCT_VERSION, "Blade3-10A-001"), + }, + }, + { } +}; + +static struct snd_soc_acpi_mach cht_lenovo_yoga_tab3_x90_mach = { + .id = "10WM5102", + .drv_name = "bytcr_wm5102", + .fw_filename = "intel/fw_sst_22a8.bin", + .board = "bytcr_wm5102", + .sof_tplg_filename = "sof-cht-wm5102.tplg", +}; + +static struct snd_soc_acpi_mach *lenovo_yt3_x90_quirk(void *arg) +{ + if (dmi_check_system(lenovo_yoga_tab3_x90)) + return &cht_lenovo_yoga_tab3_x90_mach; + + /* Skip wildcard match snd_soc_acpi_intel_cherrytrail_machines[] entry */ + return NULL; +} + static const struct snd_soc_acpi_codecs rt5640_comp_ids = { .num_codecs = 2, .codecs = { "10EC5640", "10EC3276" }, @@ -175,6 +208,16 @@ struct snd_soc_acpi_mach snd_soc_acpi_intel_cherrytrail_machines[] = { .drv_name = "sof_pcm512x", .sof_tplg_filename = "sof-cht-src-50khz-pcm512x.tplg", }, + /* + * Special case for the Lenovo Yoga Tab 3 Pro YT3-X90 where the DSDT + * misses the codec. Match on the SST id instead, lenovo_yt3_x90_quirk() + * will return a YT3 specific mach or NULL when called on other hw, + * skipping this entry. + */ + { + .id = "808622A8", + .machine_quirk = lenovo_yt3_x90_quirk, + },
#if IS_ENABLED(CONFIG_SND_SOC_INTEL_BYT_CHT_NOCODEC_MACH) /*
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Longfang Liu liulongfang@huawei.com
[ Upstream commit 33fc506d2ac514be1072499a263c3bff8c7c95a0 ]
In the scenario where the accelerator business is fully loaded. When the workqueue receiving messages and performing callback processing, there are a large number of messages that need to be received, and there are continuously messages that have been processed and need to be received. This will cause the receive loop here to be locked for a long time. This scenario will cause watchdog timeout problems on OS with kernel preemption turned off.
The error logs: watchdog: BUG: soft lockup - CPU#23 stuck for 23s! [kworker/u262:1:1407] [ 1461.978428][ C23] Call trace: [ 1461.981890][ C23] complete+0x8c/0xf0 [ 1461.986031][ C23] kcryptd_async_done+0x154/0x1f4 [dm_crypt] [ 1461.992154][ C23] sec_skcipher_callback+0x7c/0xf4 [hisi_sec2] [ 1461.998446][ C23] sec_req_cb+0x104/0x1f4 [hisi_sec2] [ 1462.003950][ C23] qm_poll_req_cb+0xcc/0x150 [hisi_qm] [ 1462.009531][ C23] qm_work_process+0x60/0xc0 [hisi_qm] [ 1462.015101][ C23] process_one_work+0x1c4/0x470 [ 1462.020052][ C23] worker_thread+0x150/0x3c4 [ 1462.024735][ C23] kthread+0x108/0x13c [ 1462.028889][ C23] ret_from_fork+0x10/0x18
Therefore, it is necessary to add an actively scheduled operation in the while loop to prevent this problem. After adding it, no matter whether the OS turns on or off the kernel preemption function. Neither will cause watchdog timeout issues.
Signed-off-by: Longfang Liu liulongfang@huawei.com Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/crypto/hisilicon/qm.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/crypto/hisilicon/qm.c b/drivers/crypto/hisilicon/qm.c index ba4852744c052..2aec118ba6775 100644 --- a/drivers/crypto/hisilicon/qm.c +++ b/drivers/crypto/hisilicon/qm.c @@ -845,6 +845,8 @@ static void qm_poll_req_cb(struct hisi_qp *qp) qm_db(qm, qp->qp_id, QM_DOORBELL_CMD_CQ, qp->qp_status.cq_head, 0); atomic_dec(&qp->qp_status.used); + + cond_resched(); }
/* set c_flag */
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jiri Kosina jkosina@suse.cz
[ Upstream commit 62cc9c3cb3ec1bf31cc116146185ed97b450836a ]
This device needs ALWAYS_POLL quirk, otherwise it keeps reconnecting indefinitely.
Reported-by: Robert Ayrapetyan robert.ayrapetyan@gmail.com Signed-off-by: Jiri Kosina jkosina@suse.cz Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/hid/hid-ids.h | 1 + drivers/hid/hid-quirks.c | 1 + 2 files changed, 2 insertions(+)
diff --git a/drivers/hid/hid-ids.h b/drivers/hid/hid-ids.h index cc0d0186a0d95..fafc40ecfd200 100644 --- a/drivers/hid/hid-ids.h +++ b/drivers/hid/hid-ids.h @@ -366,6 +366,7 @@
#define USB_VENDOR_ID_DELL 0x413c #define USB_DEVICE_ID_DELL_PIXART_USB_OPTICAL_MOUSE 0x301a +#define USB_DEVICE_ID_DELL_PRO_WIRELESS_KM5221W 0x4503
#define USB_VENDOR_ID_DELORME 0x1163 #define USB_DEVICE_ID_DELORME_EARTHMATE 0x0100 diff --git a/drivers/hid/hid-quirks.c b/drivers/hid/hid-quirks.c index 3983b4f282f8f..5a48fcaa32f00 100644 --- a/drivers/hid/hid-quirks.c +++ b/drivers/hid/hid-quirks.c @@ -66,6 +66,7 @@ static const struct hid_device_id hid_quirks[] = { { HID_USB_DEVICE(USB_VENDOR_ID_CORSAIR, USB_DEVICE_ID_CORSAIR_STRAFE), HID_QUIRK_NO_INIT_REPORTS | HID_QUIRK_ALWAYS_POLL }, { HID_USB_DEVICE(USB_VENDOR_ID_CREATIVELABS, USB_DEVICE_ID_CREATIVE_SB_OMNI_SURROUND_51), HID_QUIRK_NOGET }, { HID_USB_DEVICE(USB_VENDOR_ID_DELL, USB_DEVICE_ID_DELL_PIXART_USB_OPTICAL_MOUSE), HID_QUIRK_ALWAYS_POLL }, + { HID_USB_DEVICE(USB_VENDOR_ID_DELL, USB_DEVICE_ID_DELL_PRO_WIRELESS_KM5221W), HID_QUIRK_ALWAYS_POLL }, { HID_USB_DEVICE(USB_VENDOR_ID_DMI, USB_DEVICE_ID_DMI_ENC), HID_QUIRK_NOGET }, { HID_USB_DEVICE(USB_VENDOR_ID_DRACAL_RAPHNET, USB_DEVICE_ID_RAPHNET_2NES2SNES), HID_QUIRK_MULTI_INPUT }, { HID_USB_DEVICE(USB_VENDOR_ID_DRACAL_RAPHNET, USB_DEVICE_ID_RAPHNET_4NES4SNES), HID_QUIRK_MULTI_INPUT },
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yuezhang Mo Yuezhang.Mo@sony.com
[ Upstream commit dab48b8f2fe7264d51ec9eed0adea0fe3c78830a ]
After repairing a corrupted file system with exfatprogs' fsck.exfat, zero-size directories may result. It is also possible to create zero-size directories in other exFAT implementation, such as Paragon ufsd dirver.
As described in the specification, the lower directory size limits is 0 bytes.
Without this commit, sub-directories and files cannot be created under a zero-size directory, and it cannot be removed.
Signed-off-by: Yuezhang Mo Yuezhang.Mo@sony.com Reviewed-by: Andy Wu Andy.Wu@sony.com Reviewed-by: Aoyama Wataru wataru.aoyama@sony.com Signed-off-by: Namjae Jeon linkinjeon@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/exfat/namei.c | 29 ++++++++++++++++++++++------- 1 file changed, 22 insertions(+), 7 deletions(-)
diff --git a/fs/exfat/namei.c b/fs/exfat/namei.c index e0ff9d156f6f5..43774693f65f5 100644 --- a/fs/exfat/namei.c +++ b/fs/exfat/namei.c @@ -351,14 +351,20 @@ static int exfat_find_empty_entry(struct inode *inode, if (exfat_check_max_dentries(inode)) return -ENOSPC;
- /* we trust p_dir->size regardless of FAT type */ - if (exfat_find_last_cluster(sb, p_dir, &last_clu)) - return -EIO; - /* * Allocate new cluster to this directory */ - exfat_chain_set(&clu, last_clu + 1, 0, p_dir->flags); + if (ei->start_clu != EXFAT_EOF_CLUSTER) { + /* we trust p_dir->size regardless of FAT type */ + if (exfat_find_last_cluster(sb, p_dir, &last_clu)) + return -EIO; + + exfat_chain_set(&clu, last_clu + 1, 0, p_dir->flags); + } else { + /* This directory is empty */ + exfat_chain_set(&clu, EXFAT_EOF_CLUSTER, 0, + ALLOC_NO_FAT_CHAIN); + }
/* allocate a cluster */ ret = exfat_alloc_cluster(inode, 1, &clu, IS_DIRSYNC(inode)); @@ -368,6 +374,11 @@ static int exfat_find_empty_entry(struct inode *inode, if (exfat_zeroed_cluster(inode, clu.dir)) return -EIO;
+ if (ei->start_clu == EXFAT_EOF_CLUSTER) { + ei->start_clu = clu.dir; + p_dir->dir = clu.dir; + } + /* append to the FAT chain */ if (clu.flags != p_dir->flags) { /* no-fat-chain bit is disabled, @@ -646,7 +657,7 @@ static int exfat_find(struct inode *dir, struct qstr *qname, info->type = exfat_get_entry_type(ep); info->attr = le16_to_cpu(ep->dentry.file.attr); info->size = le64_to_cpu(ep2->dentry.stream.valid_size); - if ((info->type == TYPE_FILE) && (info->size == 0)) { + if (info->size == 0) { info->flags = ALLOC_NO_FAT_CHAIN; info->start_clu = EXFAT_EOF_CLUSTER; } else { @@ -890,6 +901,9 @@ static int exfat_check_dir_empty(struct super_block *sb,
dentries_per_clu = sbi->dentries_per_clu;
+ if (p_dir->dir == EXFAT_EOF_CLUSTER) + return 0; + exfat_chain_dup(&clu, p_dir);
while (clu.dir != EXFAT_EOF_CLUSTER) { @@ -1257,7 +1271,8 @@ static int __exfat_rename(struct inode *old_parent_inode, }
/* Free the clusters if new_inode is a dir(as if exfat_rmdir) */ - if (new_entry_type == TYPE_DIR) { + if (new_entry_type == TYPE_DIR && + new_ei->start_clu != EXFAT_EOF_CLUSTER) { /* new_ei, new_clu_to_free */ struct exfat_chain new_clu_to_free;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jarkko Nikula jarkko.nikula@linux.intel.com
[ Upstream commit e53b22b10c6e0de0cf2a03a92b18fdad70f266c7 ]
Add Intel Lunar Lake-M SoC PCI IDs.
Signed-off-by: Jarkko Nikula jarkko.nikula@linux.intel.com Link: https://lore.kernel.org/r/20231002083344.75611-1-jarkko.nikula@linux.intel.c... Signed-off-by: Lee Jones lee@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/mfd/intel-lpss-pci.c | 13 +++++++++++++ 1 file changed, 13 insertions(+)
diff --git a/drivers/mfd/intel-lpss-pci.c b/drivers/mfd/intel-lpss-pci.c index 699f44ffff0e4..ae5759200622c 100644 --- a/drivers/mfd/intel-lpss-pci.c +++ b/drivers/mfd/intel-lpss-pci.c @@ -561,6 +561,19 @@ static const struct pci_device_id intel_lpss_pci_ids[] = { { PCI_VDEVICE(INTEL, 0xa3e2), (kernel_ulong_t)&spt_i2c_info }, { PCI_VDEVICE(INTEL, 0xa3e3), (kernel_ulong_t)&spt_i2c_info }, { PCI_VDEVICE(INTEL, 0xa3e6), (kernel_ulong_t)&spt_uart_info }, + /* LNL-M */ + { PCI_VDEVICE(INTEL, 0xa825), (kernel_ulong_t)&bxt_uart_info }, + { PCI_VDEVICE(INTEL, 0xa826), (kernel_ulong_t)&bxt_uart_info }, + { PCI_VDEVICE(INTEL, 0xa827), (kernel_ulong_t)&tgl_info }, + { PCI_VDEVICE(INTEL, 0xa830), (kernel_ulong_t)&tgl_info }, + { PCI_VDEVICE(INTEL, 0xa846), (kernel_ulong_t)&tgl_info }, + { PCI_VDEVICE(INTEL, 0xa850), (kernel_ulong_t)&ehl_i2c_info }, + { PCI_VDEVICE(INTEL, 0xa851), (kernel_ulong_t)&ehl_i2c_info }, + { PCI_VDEVICE(INTEL, 0xa852), (kernel_ulong_t)&bxt_uart_info }, + { PCI_VDEVICE(INTEL, 0xa878), (kernel_ulong_t)&ehl_i2c_info }, + { PCI_VDEVICE(INTEL, 0xa879), (kernel_ulong_t)&ehl_i2c_info }, + { PCI_VDEVICE(INTEL, 0xa87a), (kernel_ulong_t)&ehl_i2c_info }, + { PCI_VDEVICE(INTEL, 0xa87b), (kernel_ulong_t)&ehl_i2c_info }, { } }; MODULE_DEVICE_TABLE(pci, intel_lpss_pci_ids);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Zhang Shurong zhang_shurong@foxmail.com
[ Upstream commit 3a23b384e7e3d64d5587ad10729a34d4f761517e ]
of_match_device() may fail and returns a NULL pointer.
In practice there is no known reasonable way to trigger this, but in case one is added in future, harden the code by adding the check
Signed-off-by: Zhang Shurong zhang_shurong@foxmail.com Link: https://lore.kernel.org/r/tencent_994DA85912C937E3B5405BA960B31ED90A08@qq.co... Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/iio/adc/stm32-adc-core.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/drivers/iio/adc/stm32-adc-core.c b/drivers/iio/adc/stm32-adc-core.c index 48f02dcc81c1b..70011fdbf5f63 100644 --- a/drivers/iio/adc/stm32-adc-core.c +++ b/drivers/iio/adc/stm32-adc-core.c @@ -706,6 +706,8 @@ static int stm32_adc_probe(struct platform_device *pdev) struct stm32_adc_priv *priv; struct device *dev = &pdev->dev; struct device_node *np = pdev->dev.of_node; + const struct of_device_id *of_id; + struct resource *res; u32 max_rate; int ret; @@ -718,8 +720,11 @@ static int stm32_adc_probe(struct platform_device *pdev) return -ENOMEM; platform_set_drvdata(pdev, &priv->common);
- priv->cfg = (const struct stm32_adc_priv_cfg *) - of_match_device(dev->driver->of_match_table, dev)->data; + of_id = of_match_device(dev->driver->of_match_table, dev); + if (!of_id) + return -ENODEV; + + priv->cfg = (const struct stm32_adc_priv_cfg *)of_id->data; priv->nb_adc_max = priv->cfg->num_adcs; spin_lock_init(&priv->common.lock);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mika Westerberg mika.westerberg@linux.intel.com
[ Upstream commit 0c35ac18256942e66d8dab6ca049185812e60c69 ]
This is not needed when firmware connection manager is run so limit this to software connection manager.
Signed-off-by: Mika Westerberg mika.westerberg@linux.intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/thunderbolt/quirks.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/thunderbolt/quirks.c b/drivers/thunderbolt/quirks.c index 488138a28ae13..e6bfa63b40aee 100644 --- a/drivers/thunderbolt/quirks.c +++ b/drivers/thunderbolt/quirks.c @@ -31,6 +31,9 @@ static void quirk_usb3_maximum_bandwidth(struct tb_switch *sw) { struct tb_port *port;
+ if (tb_switch_is_icm(sw)) + return; + tb_switch_for_each_port(sw, port) { if (!tb_port_is_usb3_down(port)) continue;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yi Yang yiyang13@huawei.com
[ Upstream commit d81ffb87aaa75f842cd7aa57091810353755b3e6 ]
Add check for the return value of kstrdup() and return the error, if it fails in order to avoid NULL pointer dereference.
Signed-off-by: Yi Yang yiyang13@huawei.com Reviewed-by: Jiri Slaby jirislaby@kernel.org Link: https://lore.kernel.org/r/20230904035220.48164-1-yiyang13@huawei.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/tty/vcc.c | 16 +++++++++++++--- 1 file changed, 13 insertions(+), 3 deletions(-)
diff --git a/drivers/tty/vcc.c b/drivers/tty/vcc.c index 34ba6e54789a7..b8b832c75b856 100644 --- a/drivers/tty/vcc.c +++ b/drivers/tty/vcc.c @@ -579,18 +579,22 @@ static int vcc_probe(struct vio_dev *vdev, const struct vio_device_id *id) return -ENOMEM;
name = kstrdup(dev_name(&vdev->dev), GFP_KERNEL); + if (!name) { + rv = -ENOMEM; + goto free_port; + }
rv = vio_driver_init(&port->vio, vdev, VDEV_CONSOLE_CON, vcc_versions, ARRAY_SIZE(vcc_versions), NULL, name); if (rv) - goto free_port; + goto free_name;
port->vio.debug = vcc_dbg_vio; vcc_ldc_cfg.debug = vcc_dbg_ldc;
rv = vio_ldc_alloc(&port->vio, &vcc_ldc_cfg, port); if (rv) - goto free_port; + goto free_name;
spin_lock_init(&port->lock);
@@ -624,6 +628,11 @@ static int vcc_probe(struct vio_dev *vdev, const struct vio_device_id *id) goto unreg_tty; } port->domain = kstrdup(domain, GFP_KERNEL); + if (!port->domain) { + rv = -ENOMEM; + goto unreg_tty; + } +
mdesc_release(hp);
@@ -653,8 +662,9 @@ static int vcc_probe(struct vio_dev *vdev, const struct vio_device_id *id) vcc_table_remove(port->index); free_ldc: vio_ldc_free(&port->vio); -free_port: +free_name: kfree(name); +free_port: kfree(port);
return rv;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Konrad Dybcio konrad.dybcio@linaro.org
[ Upstream commit c20b59b2996c89c4f072c3312e6210528a298330 ]
The EUSB2 repeater requires some alterations to its init sequence, depending on board design.
Add support for making the necessary changes to that sequence to make USB functional on SM8550-based Xperia 1 V.
They all have lackluster description due to lack of information.
Signed-off-by: Konrad Dybcio konrad.dybcio@linaro.org Acked-by: Rob Herring robh@kernel.org Link: https://lore.kernel.org/r/20230830-topic-eusb2_override-v2-1-7d8c893d93f6@li... Signed-off-by: Vinod Koul vkoul@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- .../phy/qcom,snps-eusb2-repeater.yaml | 21 +++++++++++++++++++ 1 file changed, 21 insertions(+)
diff --git a/Documentation/devicetree/bindings/phy/qcom,snps-eusb2-repeater.yaml b/Documentation/devicetree/bindings/phy/qcom,snps-eusb2-repeater.yaml index 083fda530b484..828650d4c4b09 100644 --- a/Documentation/devicetree/bindings/phy/qcom,snps-eusb2-repeater.yaml +++ b/Documentation/devicetree/bindings/phy/qcom,snps-eusb2-repeater.yaml @@ -27,6 +27,27 @@ properties:
vdd3-supply: true
+ qcom,tune-usb2-disc-thres: + $ref: /schemas/types.yaml#/definitions/uint8 + description: High-Speed disconnect threshold + minimum: 0 + maximum: 7 + default: 0 + + qcom,tune-usb2-amplitude: + $ref: /schemas/types.yaml#/definitions/uint8 + description: High-Speed trasmit amplitude + minimum: 0 + maximum: 15 + default: 8 + + qcom,tune-usb2-preem: + $ref: /schemas/types.yaml#/definitions/uint8 + description: High-Speed TX pre-emphasis tuning + minimum: 0 + maximum: 7 + default: 5 + required: - compatible - reg
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Konrad Dybcio konrad.dybcio@linaro.org
[ Upstream commit 4ba2e52718c0ce4ece6a269bec84319c355c030f ]
Switch to regmap_fields, so that the values written into registers are sanitized by their explicit sizes and the different registers are structured in an iterable object to make external changes to the init sequence simpler.
Signed-off-by: Konrad Dybcio konrad.dybcio@linaro.org Link: https://lore.kernel.org/r/20230830-topic-eusb2_override-v2-2-7d8c893d93f6@li... Signed-off-by: Vinod Koul vkoul@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- .../phy/qualcomm/phy-qcom-eusb2-repeater.c | 91 +++++++++++++------ 1 file changed, 61 insertions(+), 30 deletions(-)
diff --git a/drivers/phy/qualcomm/phy-qcom-eusb2-repeater.c b/drivers/phy/qualcomm/phy-qcom-eusb2-repeater.c index 90f8543ba265b..b20b805414a2a 100644 --- a/drivers/phy/qualcomm/phy-qcom-eusb2-repeater.c +++ b/drivers/phy/qualcomm/phy-qcom-eusb2-repeater.c @@ -29,14 +29,42 @@ #define EUSB2_TUNE_SQUELCH_U 0x54 #define EUSB2_TUNE_USB2_PREEM 0x57
-#define QCOM_EUSB2_REPEATER_INIT_CFG(o, v) \ +#define QCOM_EUSB2_REPEATER_INIT_CFG(r, v) \ { \ - .offset = o, \ + .reg = r, \ .val = v, \ }
+enum reg_fields { + F_TUNE_USB2_PREEM, + F_TUNE_SQUELCH_U, + F_TUNE_IUSB2, + F_NUM_TUNE_FIELDS, + + F_FORCE_VAL_5 = F_NUM_TUNE_FIELDS, + F_FORCE_EN_5, + + F_EN_CTL1, + + F_RPTR_STATUS, + F_NUM_FIELDS, +}; + +static struct reg_field eusb2_repeater_tune_reg_fields[F_NUM_FIELDS] = { + [F_TUNE_USB2_PREEM] = REG_FIELD(EUSB2_TUNE_USB2_PREEM, 0, 2), + [F_TUNE_SQUELCH_U] = REG_FIELD(EUSB2_TUNE_SQUELCH_U, 0, 2), + [F_TUNE_IUSB2] = REG_FIELD(EUSB2_TUNE_IUSB2, 0, 3), + + [F_FORCE_VAL_5] = REG_FIELD(EUSB2_FORCE_VAL_5, 0, 7), + [F_FORCE_EN_5] = REG_FIELD(EUSB2_FORCE_EN_5, 0, 7), + + [F_EN_CTL1] = REG_FIELD(EUSB2_EN_CTL1, 0, 7), + + [F_RPTR_STATUS] = REG_FIELD(EUSB2_RPTR_STATUS, 0, 7), +}; + struct eusb2_repeater_init_tbl { - unsigned int offset; + unsigned int reg; unsigned int val; };
@@ -49,11 +77,10 @@ struct eusb2_repeater_cfg {
struct eusb2_repeater { struct device *dev; - struct regmap *regmap; + struct regmap_field *regs[F_NUM_FIELDS]; struct phy *phy; struct regulator_bulk_data *vregs; const struct eusb2_repeater_cfg *cfg; - u16 base; enum phy_mode mode; };
@@ -62,9 +89,9 @@ static const char * const pm8550b_vreg_l[] = { };
static const struct eusb2_repeater_init_tbl pm8550b_init_tbl[] = { - QCOM_EUSB2_REPEATER_INIT_CFG(EUSB2_TUNE_IUSB2, 0x8), - QCOM_EUSB2_REPEATER_INIT_CFG(EUSB2_TUNE_SQUELCH_U, 0x3), - QCOM_EUSB2_REPEATER_INIT_CFG(EUSB2_TUNE_USB2_PREEM, 0x5), + QCOM_EUSB2_REPEATER_INIT_CFG(F_TUNE_IUSB2, 0x8), + QCOM_EUSB2_REPEATER_INIT_CFG(F_TUNE_SQUELCH_U, 0x3), + QCOM_EUSB2_REPEATER_INIT_CFG(F_TUNE_USB2_PREEM, 0x5), };
static const struct eusb2_repeater_cfg pm8550b_eusb2_cfg = { @@ -94,7 +121,6 @@ static int eusb2_repeater_init(struct phy *phy) { struct eusb2_repeater *rptr = phy_get_drvdata(phy); const struct eusb2_repeater_init_tbl *init_tbl = rptr->cfg->init_tbl; - int num = rptr->cfg->init_tbl_num; u32 val; int ret; int i; @@ -103,17 +129,14 @@ static int eusb2_repeater_init(struct phy *phy) if (ret) return ret;
- regmap_update_bits(rptr->regmap, rptr->base + EUSB2_EN_CTL1, - EUSB2_RPTR_EN, EUSB2_RPTR_EN); + regmap_field_update_bits(rptr->regs[F_EN_CTL1], EUSB2_RPTR_EN, EUSB2_RPTR_EN);
- for (i = 0; i < num; i++) - regmap_update_bits(rptr->regmap, - rptr->base + init_tbl[i].offset, - init_tbl[i].val, init_tbl[i].val); + for (i = 0; i < rptr->cfg->init_tbl_num; i++) + regmap_field_update_bits(rptr->regs[init_tbl[i].reg], + init_tbl[i].val, init_tbl[i].val);
- ret = regmap_read_poll_timeout(rptr->regmap, - rptr->base + EUSB2_RPTR_STATUS, val, - val & RPTR_OK, 10, 5); + ret = regmap_field_read_poll_timeout(rptr->regs[F_RPTR_STATUS], + val, val & RPTR_OK, 10, 5); if (ret) dev_err(rptr->dev, "initialization timed-out\n");
@@ -132,10 +155,10 @@ static int eusb2_repeater_set_mode(struct phy *phy, * per eUSB 1.2 Spec. Below implement software workaround until * PHY and controller is fixing seen observation. */ - regmap_update_bits(rptr->regmap, rptr->base + EUSB2_FORCE_EN_5, - F_CLK_19P2M_EN, F_CLK_19P2M_EN); - regmap_update_bits(rptr->regmap, rptr->base + EUSB2_FORCE_VAL_5, - V_CLK_19P2M_EN, V_CLK_19P2M_EN); + regmap_field_update_bits(rptr->regs[F_FORCE_EN_5], + F_CLK_19P2M_EN, F_CLK_19P2M_EN); + regmap_field_update_bits(rptr->regs[F_FORCE_VAL_5], + V_CLK_19P2M_EN, V_CLK_19P2M_EN); break; case PHY_MODE_USB_DEVICE: /* @@ -144,10 +167,10 @@ static int eusb2_repeater_set_mode(struct phy *phy, * repeater doesn't clear previous value due to shared * regulators (say host <-> device mode switch). */ - regmap_update_bits(rptr->regmap, rptr->base + EUSB2_FORCE_EN_5, - F_CLK_19P2M_EN, 0); - regmap_update_bits(rptr->regmap, rptr->base + EUSB2_FORCE_VAL_5, - V_CLK_19P2M_EN, 0); + regmap_field_update_bits(rptr->regs[F_FORCE_EN_5], + F_CLK_19P2M_EN, 0); + regmap_field_update_bits(rptr->regs[F_FORCE_VAL_5], + V_CLK_19P2M_EN, 0); break; default: return -EINVAL; @@ -176,8 +199,9 @@ static int eusb2_repeater_probe(struct platform_device *pdev) struct device *dev = &pdev->dev; struct phy_provider *phy_provider; struct device_node *np = dev->of_node; + struct regmap *regmap; + int i, ret; u32 res; - int ret;
rptr = devm_kzalloc(dev, sizeof(*rptr), GFP_KERNEL); if (!rptr) @@ -190,15 +214,22 @@ static int eusb2_repeater_probe(struct platform_device *pdev) if (!rptr->cfg) return -EINVAL;
- rptr->regmap = dev_get_regmap(dev->parent, NULL); - if (!rptr->regmap) + regmap = dev_get_regmap(dev->parent, NULL); + if (!regmap) return -ENODEV;
ret = of_property_read_u32(np, "reg", &res); if (ret < 0) return ret;
- rptr->base = res; + for (i = 0; i < F_NUM_FIELDS; i++) + eusb2_repeater_tune_reg_fields[i].reg += res; + + ret = devm_regmap_field_bulk_alloc(dev, regmap, rptr->regs, + eusb2_repeater_tune_reg_fields, + F_NUM_FIELDS); + if (ret) + return ret;
ret = eusb2_repeater_init_vregs(rptr); if (ret < 0) {
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Konrad Dybcio konrad.dybcio@linaro.org
[ Upstream commit 99a517a582fc1272d1d3cf3b9e671a14d7db77b8 ]
The vendor kernel zeroes out all tuning data outside the init sequence as part of initialization. Follow suit to avoid UB.
Signed-off-by: Konrad Dybcio konrad.dybcio@linaro.org Link: https://lore.kernel.org/r/20230830-topic-eusb2_override-v2-3-7d8c893d93f6@li... Signed-off-by: Vinod Koul vkoul@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- .../phy/qualcomm/phy-qcom-eusb2-repeater.c | 58 ++++++++++++++----- 1 file changed, 44 insertions(+), 14 deletions(-)
diff --git a/drivers/phy/qualcomm/phy-qcom-eusb2-repeater.c b/drivers/phy/qualcomm/phy-qcom-eusb2-repeater.c index b20b805414a2a..6777532dd4dc9 100644 --- a/drivers/phy/qualcomm/phy-qcom-eusb2-repeater.c +++ b/drivers/phy/qualcomm/phy-qcom-eusb2-repeater.c @@ -25,9 +25,18 @@ #define EUSB2_FORCE_VAL_5 0xeD #define V_CLK_19P2M_EN BIT(6)
+#define EUSB2_TUNE_USB2_CROSSOVER 0x50 #define EUSB2_TUNE_IUSB2 0x51 +#define EUSB2_TUNE_RES_FSDIF 0x52 +#define EUSB2_TUNE_HSDISC 0x53 #define EUSB2_TUNE_SQUELCH_U 0x54 +#define EUSB2_TUNE_USB2_SLEW 0x55 +#define EUSB2_TUNE_USB2_EQU 0x56 #define EUSB2_TUNE_USB2_PREEM 0x57 +#define EUSB2_TUNE_USB2_HS_COMP_CUR 0x58 +#define EUSB2_TUNE_EUSB_SLEW 0x59 +#define EUSB2_TUNE_EUSB_EQU 0x5A +#define EUSB2_TUNE_EUSB_HS_COMP_CUR 0x5B
#define QCOM_EUSB2_REPEATER_INIT_CFG(r, v) \ { \ @@ -36,9 +45,18 @@ }
enum reg_fields { + F_TUNE_EUSB_HS_COMP_CUR, + F_TUNE_EUSB_EQU, + F_TUNE_EUSB_SLEW, + F_TUNE_USB2_HS_COMP_CUR, F_TUNE_USB2_PREEM, + F_TUNE_USB2_EQU, + F_TUNE_USB2_SLEW, F_TUNE_SQUELCH_U, + F_TUNE_HSDISC, + F_TUNE_RES_FSDIF, F_TUNE_IUSB2, + F_TUNE_USB2_CROSSOVER, F_NUM_TUNE_FIELDS,
F_FORCE_VAL_5 = F_NUM_TUNE_FIELDS, @@ -51,9 +69,18 @@ enum reg_fields { };
static struct reg_field eusb2_repeater_tune_reg_fields[F_NUM_FIELDS] = { + [F_TUNE_EUSB_HS_COMP_CUR] = REG_FIELD(EUSB2_TUNE_EUSB_HS_COMP_CUR, 0, 1), + [F_TUNE_EUSB_EQU] = REG_FIELD(EUSB2_TUNE_EUSB_EQU, 0, 1), + [F_TUNE_EUSB_SLEW] = REG_FIELD(EUSB2_TUNE_EUSB_SLEW, 0, 1), + [F_TUNE_USB2_HS_COMP_CUR] = REG_FIELD(EUSB2_TUNE_USB2_HS_COMP_CUR, 0, 1), [F_TUNE_USB2_PREEM] = REG_FIELD(EUSB2_TUNE_USB2_PREEM, 0, 2), + [F_TUNE_USB2_EQU] = REG_FIELD(EUSB2_TUNE_USB2_EQU, 0, 1), + [F_TUNE_USB2_SLEW] = REG_FIELD(EUSB2_TUNE_USB2_SLEW, 0, 1), [F_TUNE_SQUELCH_U] = REG_FIELD(EUSB2_TUNE_SQUELCH_U, 0, 2), + [F_TUNE_HSDISC] = REG_FIELD(EUSB2_TUNE_HSDISC, 0, 2), + [F_TUNE_RES_FSDIF] = REG_FIELD(EUSB2_TUNE_RES_FSDIF, 0, 2), [F_TUNE_IUSB2] = REG_FIELD(EUSB2_TUNE_IUSB2, 0, 3), + [F_TUNE_USB2_CROSSOVER] = REG_FIELD(EUSB2_TUNE_USB2_CROSSOVER, 0, 2),
[F_FORCE_VAL_5] = REG_FIELD(EUSB2_FORCE_VAL_5, 0, 7), [F_FORCE_EN_5] = REG_FIELD(EUSB2_FORCE_EN_5, 0, 7), @@ -63,13 +90,8 @@ static struct reg_field eusb2_repeater_tune_reg_fields[F_NUM_FIELDS] = { [F_RPTR_STATUS] = REG_FIELD(EUSB2_RPTR_STATUS, 0, 7), };
-struct eusb2_repeater_init_tbl { - unsigned int reg; - unsigned int val; -}; - struct eusb2_repeater_cfg { - const struct eusb2_repeater_init_tbl *init_tbl; + const u32 *init_tbl; int init_tbl_num; const char * const *vreg_list; int num_vregs; @@ -88,10 +110,10 @@ static const char * const pm8550b_vreg_l[] = { "vdd18", "vdd3", };
-static const struct eusb2_repeater_init_tbl pm8550b_init_tbl[] = { - QCOM_EUSB2_REPEATER_INIT_CFG(F_TUNE_IUSB2, 0x8), - QCOM_EUSB2_REPEATER_INIT_CFG(F_TUNE_SQUELCH_U, 0x3), - QCOM_EUSB2_REPEATER_INIT_CFG(F_TUNE_USB2_PREEM, 0x5), +static const u32 pm8550b_init_tbl[F_NUM_TUNE_FIELDS] = { + [F_TUNE_IUSB2] = 0x8, + [F_TUNE_SQUELCH_U] = 0x3, + [F_TUNE_USB2_PREEM] = 0x5, };
static const struct eusb2_repeater_cfg pm8550b_eusb2_cfg = { @@ -119,8 +141,9 @@ static int eusb2_repeater_init_vregs(struct eusb2_repeater *rptr)
static int eusb2_repeater_init(struct phy *phy) { + struct reg_field *regfields = eusb2_repeater_tune_reg_fields; struct eusb2_repeater *rptr = phy_get_drvdata(phy); - const struct eusb2_repeater_init_tbl *init_tbl = rptr->cfg->init_tbl; + const u32 *init_tbl = rptr->cfg->init_tbl; u32 val; int ret; int i; @@ -131,9 +154,16 @@ static int eusb2_repeater_init(struct phy *phy)
regmap_field_update_bits(rptr->regs[F_EN_CTL1], EUSB2_RPTR_EN, EUSB2_RPTR_EN);
- for (i = 0; i < rptr->cfg->init_tbl_num; i++) - regmap_field_update_bits(rptr->regs[init_tbl[i].reg], - init_tbl[i].val, init_tbl[i].val); + for (i = 0; i < F_NUM_TUNE_FIELDS; i++) { + if (init_tbl[i]) { + regmap_field_update_bits(rptr->regs[i], init_tbl[i], init_tbl[i]); + } else { + /* Write 0 if there's no value set */ + u32 mask = GENMASK(regfields[i].msb, regfields[i].lsb); + + regmap_field_update_bits(rptr->regs[i], mask, 0); + } + }
ret = regmap_field_read_poll_timeout(rptr->regs[F_RPTR_STATUS], val, val & RPTR_OK, 10, 5);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Stanley Chang stanley_chang@realtek.com
[ Upstream commit e72fc8d6a12af7ae8dd1b52cf68ed68569d29f80 ]
In Synopsys's dwc3 data book: To avoid underrun and overrun during the burst, in a high-latency bus system (like USB), threshold and burst size control is provided through GTXTHRCFG and GRXTHRCFG registers.
In Realtek DHC SoC, DWC3 USB 3.0 uses AHB system bus. When dwc3 is connected with USB 2.5G Ethernet, there will be overrun problem. Therefore, setting TX/RX thresholds can avoid this issue.
Signed-off-by: Stanley Chang stanley_chang@realtek.com Acked-by: Thinh Nguyen Thinh.Nguyen@synopsys.com Link: https://lore.kernel.org/r/20230912041904.30721-1-stanley_chang@realtek.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/usb/dwc3/core.c | 160 +++++++++++++++++++++++++++++++--------- drivers/usb/dwc3/core.h | 13 ++++ 2 files changed, 137 insertions(+), 36 deletions(-)
diff --git a/drivers/usb/dwc3/core.c b/drivers/usb/dwc3/core.c index 343d2570189ff..d25490965b27f 100644 --- a/drivers/usb/dwc3/core.c +++ b/drivers/usb/dwc3/core.c @@ -1094,6 +1094,111 @@ static void dwc3_set_power_down_clk_scale(struct dwc3 *dwc) } }
+static void dwc3_config_threshold(struct dwc3 *dwc) +{ + u32 reg; + u8 rx_thr_num; + u8 rx_maxburst; + u8 tx_thr_num; + u8 tx_maxburst; + + /* + * Must config both number of packets and max burst settings to enable + * RX and/or TX threshold. + */ + if (!DWC3_IP_IS(DWC3) && dwc->dr_mode == USB_DR_MODE_HOST) { + rx_thr_num = dwc->rx_thr_num_pkt_prd; + rx_maxburst = dwc->rx_max_burst_prd; + tx_thr_num = dwc->tx_thr_num_pkt_prd; + tx_maxburst = dwc->tx_max_burst_prd; + + if (rx_thr_num && rx_maxburst) { + reg = dwc3_readl(dwc->regs, DWC3_GRXTHRCFG); + reg |= DWC31_RXTHRNUMPKTSEL_PRD; + + reg &= ~DWC31_RXTHRNUMPKT_PRD(~0); + reg |= DWC31_RXTHRNUMPKT_PRD(rx_thr_num); + + reg &= ~DWC31_MAXRXBURSTSIZE_PRD(~0); + reg |= DWC31_MAXRXBURSTSIZE_PRD(rx_maxburst); + + dwc3_writel(dwc->regs, DWC3_GRXTHRCFG, reg); + } + + if (tx_thr_num && tx_maxburst) { + reg = dwc3_readl(dwc->regs, DWC3_GTXTHRCFG); + reg |= DWC31_TXTHRNUMPKTSEL_PRD; + + reg &= ~DWC31_TXTHRNUMPKT_PRD(~0); + reg |= DWC31_TXTHRNUMPKT_PRD(tx_thr_num); + + reg &= ~DWC31_MAXTXBURSTSIZE_PRD(~0); + reg |= DWC31_MAXTXBURSTSIZE_PRD(tx_maxburst); + + dwc3_writel(dwc->regs, DWC3_GTXTHRCFG, reg); + } + } + + rx_thr_num = dwc->rx_thr_num_pkt; + rx_maxburst = dwc->rx_max_burst; + tx_thr_num = dwc->tx_thr_num_pkt; + tx_maxburst = dwc->tx_max_burst; + + if (DWC3_IP_IS(DWC3)) { + if (rx_thr_num && rx_maxburst) { + reg = dwc3_readl(dwc->regs, DWC3_GRXTHRCFG); + reg |= DWC3_GRXTHRCFG_PKTCNTSEL; + + reg &= ~DWC3_GRXTHRCFG_RXPKTCNT(~0); + reg |= DWC3_GRXTHRCFG_RXPKTCNT(rx_thr_num); + + reg &= ~DWC3_GRXTHRCFG_MAXRXBURSTSIZE(~0); + reg |= DWC3_GRXTHRCFG_MAXRXBURSTSIZE(rx_maxburst); + + dwc3_writel(dwc->regs, DWC3_GRXTHRCFG, reg); + } + + if (tx_thr_num && tx_maxburst) { + reg = dwc3_readl(dwc->regs, DWC3_GTXTHRCFG); + reg |= DWC3_GTXTHRCFG_PKTCNTSEL; + + reg &= ~DWC3_GTXTHRCFG_TXPKTCNT(~0); + reg |= DWC3_GTXTHRCFG_TXPKTCNT(tx_thr_num); + + reg &= ~DWC3_GTXTHRCFG_MAXTXBURSTSIZE(~0); + reg |= DWC3_GTXTHRCFG_MAXTXBURSTSIZE(tx_maxburst); + + dwc3_writel(dwc->regs, DWC3_GTXTHRCFG, reg); + } + } else { + if (rx_thr_num && rx_maxburst) { + reg = dwc3_readl(dwc->regs, DWC3_GRXTHRCFG); + reg |= DWC31_GRXTHRCFG_PKTCNTSEL; + + reg &= ~DWC31_GRXTHRCFG_RXPKTCNT(~0); + reg |= DWC31_GRXTHRCFG_RXPKTCNT(rx_thr_num); + + reg &= ~DWC31_GRXTHRCFG_MAXRXBURSTSIZE(~0); + reg |= DWC31_GRXTHRCFG_MAXRXBURSTSIZE(rx_maxburst); + + dwc3_writel(dwc->regs, DWC3_GRXTHRCFG, reg); + } + + if (tx_thr_num && tx_maxburst) { + reg = dwc3_readl(dwc->regs, DWC3_GTXTHRCFG); + reg |= DWC31_GTXTHRCFG_PKTCNTSEL; + + reg &= ~DWC31_GTXTHRCFG_TXPKTCNT(~0); + reg |= DWC31_GTXTHRCFG_TXPKTCNT(tx_thr_num); + + reg &= ~DWC31_GTXTHRCFG_MAXTXBURSTSIZE(~0); + reg |= DWC31_GTXTHRCFG_MAXTXBURSTSIZE(tx_maxburst); + + dwc3_writel(dwc->regs, DWC3_GTXTHRCFG, reg); + } + } +} + /** * dwc3_core_init - Low-level initialization of DWC3 Core * @dwc: Pointer to our controller context structure @@ -1246,42 +1351,7 @@ static int dwc3_core_init(struct dwc3 *dwc) dwc3_writel(dwc->regs, DWC3_GUCTL1, reg); }
- /* - * Must config both number of packets and max burst settings to enable - * RX and/or TX threshold. - */ - if (!DWC3_IP_IS(DWC3) && dwc->dr_mode == USB_DR_MODE_HOST) { - u8 rx_thr_num = dwc->rx_thr_num_pkt_prd; - u8 rx_maxburst = dwc->rx_max_burst_prd; - u8 tx_thr_num = dwc->tx_thr_num_pkt_prd; - u8 tx_maxburst = dwc->tx_max_burst_prd; - - if (rx_thr_num && rx_maxburst) { - reg = dwc3_readl(dwc->regs, DWC3_GRXTHRCFG); - reg |= DWC31_RXTHRNUMPKTSEL_PRD; - - reg &= ~DWC31_RXTHRNUMPKT_PRD(~0); - reg |= DWC31_RXTHRNUMPKT_PRD(rx_thr_num); - - reg &= ~DWC31_MAXRXBURSTSIZE_PRD(~0); - reg |= DWC31_MAXRXBURSTSIZE_PRD(rx_maxburst); - - dwc3_writel(dwc->regs, DWC3_GRXTHRCFG, reg); - } - - if (tx_thr_num && tx_maxburst) { - reg = dwc3_readl(dwc->regs, DWC3_GTXTHRCFG); - reg |= DWC31_TXTHRNUMPKTSEL_PRD; - - reg &= ~DWC31_TXTHRNUMPKT_PRD(~0); - reg |= DWC31_TXTHRNUMPKT_PRD(tx_thr_num); - - reg &= ~DWC31_MAXTXBURSTSIZE_PRD(~0); - reg |= DWC31_MAXTXBURSTSIZE_PRD(tx_maxburst); - - dwc3_writel(dwc->regs, DWC3_GTXTHRCFG, reg); - } - } + dwc3_config_threshold(dwc);
return 0;
@@ -1417,6 +1487,10 @@ static void dwc3_get_properties(struct dwc3 *dwc) u8 lpm_nyet_threshold; u8 tx_de_emphasis; u8 hird_threshold; + u8 rx_thr_num_pkt = 0; + u8 rx_max_burst = 0; + u8 tx_thr_num_pkt = 0; + u8 tx_max_burst = 0; u8 rx_thr_num_pkt_prd = 0; u8 rx_max_burst_prd = 0; u8 tx_thr_num_pkt_prd = 0; @@ -1479,6 +1553,14 @@ static void dwc3_get_properties(struct dwc3 *dwc) "snps,usb2-lpm-disable"); dwc->usb2_gadget_lpm_disable = device_property_read_bool(dev, "snps,usb2-gadget-lpm-disable"); + device_property_read_u8(dev, "snps,rx-thr-num-pkt", + &rx_thr_num_pkt); + device_property_read_u8(dev, "snps,rx-max-burst", + &rx_max_burst); + device_property_read_u8(dev, "snps,tx-thr-num-pkt", + &tx_thr_num_pkt); + device_property_read_u8(dev, "snps,tx-max-burst", + &tx_max_burst); device_property_read_u8(dev, "snps,rx-thr-num-pkt-prd", &rx_thr_num_pkt_prd); device_property_read_u8(dev, "snps,rx-max-burst-prd", @@ -1560,6 +1642,12 @@ static void dwc3_get_properties(struct dwc3 *dwc)
dwc->hird_threshold = hird_threshold;
+ dwc->rx_thr_num_pkt = rx_thr_num_pkt; + dwc->rx_max_burst = rx_max_burst; + + dwc->tx_thr_num_pkt = tx_thr_num_pkt; + dwc->tx_max_burst = tx_max_burst; + dwc->rx_thr_num_pkt_prd = rx_thr_num_pkt_prd; dwc->rx_max_burst_prd = rx_max_burst_prd;
diff --git a/drivers/usb/dwc3/core.h b/drivers/usb/dwc3/core.h index a69ac67d89fe6..6782ec8bfd64c 100644 --- a/drivers/usb/dwc3/core.h +++ b/drivers/usb/dwc3/core.h @@ -211,6 +211,11 @@ #define DWC3_GRXTHRCFG_RXPKTCNT(n) (((n) & 0xf) << 24) #define DWC3_GRXTHRCFG_PKTCNTSEL BIT(29)
+/* Global TX Threshold Configuration Register */ +#define DWC3_GTXTHRCFG_MAXTXBURSTSIZE(n) (((n) & 0xff) << 16) +#define DWC3_GTXTHRCFG_TXPKTCNT(n) (((n) & 0xf) << 24) +#define DWC3_GTXTHRCFG_PKTCNTSEL BIT(29) + /* Global RX Threshold Configuration Register for DWC_usb31 only */ #define DWC31_GRXTHRCFG_MAXRXBURSTSIZE(n) (((n) & 0x1f) << 16) #define DWC31_GRXTHRCFG_RXPKTCNT(n) (((n) & 0x1f) << 21) @@ -1045,6 +1050,10 @@ struct dwc3_scratchpad_array { * @test_mode_nr: test feature selector * @lpm_nyet_threshold: LPM NYET response threshold * @hird_threshold: HIRD threshold + * @rx_thr_num_pkt: USB receive packet count + * @rx_max_burst: max USB receive burst size + * @tx_thr_num_pkt: USB transmit packet count + * @tx_max_burst: max USB transmit burst size * @rx_thr_num_pkt_prd: periodic ESS receive packet count * @rx_max_burst_prd: max periodic ESS receive burst size * @tx_thr_num_pkt_prd: periodic ESS transmit packet count @@ -1273,6 +1282,10 @@ struct dwc3 { u8 test_mode_nr; u8 lpm_nyet_threshold; u8 hird_threshold; + u8 rx_thr_num_pkt; + u8 rx_max_burst; + u8 tx_thr_num_pkt; + u8 tx_max_burst; u8 rx_thr_num_pkt_prd; u8 rx_max_burst_prd; u8 tx_thr_num_pkt_prd;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Neil Armstrong neil.armstrong@linaro.org
[ Upstream commit c6165ed2f425c273244191930a47c8be23bc51bd ]
On SM8550, the non-altmode orientation is not given anymore within altmode events, even with USB SVIDs events.
On the other side, the Type-C connector orientation is correctly reported by a signal from the PMIC.
Take this gpio signal when we detect some Type-C port activity to notify any Type-C switches tied to the Type-C port connectors.
Acked-by: Heikki Krogerus heikki.krogerus@linux.intel.com Signed-off-by: Neil Armstrong neil.armstrong@linaro.org Reviewed-by: Dmitry Baryshkov dmitry.baryshkov@linaro.org Acked-by: Konrad Dybcio konrad.dybcio@linaro.org Link: https://lore.kernel.org/r/20231002-topic-sm8550-upstream-type-c-orientation-... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/usb/typec/ucsi/ucsi_glink.c | 54 ++++++++++++++++++++++++++++- 1 file changed, 53 insertions(+), 1 deletion(-)
diff --git a/drivers/usb/typec/ucsi/ucsi_glink.c b/drivers/usb/typec/ucsi/ucsi_glink.c index 1fe9cb5b6bd96..a2d862eebcecb 100644 --- a/drivers/usb/typec/ucsi/ucsi_glink.c +++ b/drivers/usb/typec/ucsi/ucsi_glink.c @@ -9,9 +9,13 @@ #include <linux/mutex.h> #include <linux/property.h> #include <linux/soc/qcom/pdr.h> +#include <linux/usb/typec_mux.h> +#include <linux/gpio/consumer.h> #include <linux/soc/qcom/pmic_glink.h> #include "ucsi.h"
+#define PMIC_GLINK_MAX_PORTS 2 + #define UCSI_BUF_SIZE 48
#define MSG_TYPE_REQ_RESP 1 @@ -53,6 +57,9 @@ struct ucsi_notify_ind_msg { struct pmic_glink_ucsi { struct device *dev;
+ struct gpio_desc *port_orientation[PMIC_GLINK_MAX_PORTS]; + struct typec_switch *port_switch[PMIC_GLINK_MAX_PORTS]; + struct pmic_glink_client *client;
struct ucsi *ucsi; @@ -221,8 +228,20 @@ static void pmic_glink_ucsi_notify(struct work_struct *work) }
con_num = UCSI_CCI_CONNECTOR(cci); - if (con_num) + if (con_num) { + if (con_num < PMIC_GLINK_MAX_PORTS && + ucsi->port_orientation[con_num - 1]) { + int orientation = gpiod_get_value(ucsi->port_orientation[con_num - 1]); + + if (orientation >= 0) { + typec_switch_set(ucsi->port_switch[con_num - 1], + orientation ? TYPEC_ORIENTATION_REVERSE + : TYPEC_ORIENTATION_NORMAL); + } + } + ucsi_connector_change(ucsi->ucsi, con_num); + }
if (ucsi->sync_pending && cci & UCSI_CCI_BUSY) { ucsi->sync_val = -EBUSY; @@ -283,6 +302,7 @@ static int pmic_glink_ucsi_probe(struct auxiliary_device *adev, { struct pmic_glink_ucsi *ucsi; struct device *dev = &adev->dev; + struct fwnode_handle *fwnode; int ret;
ucsi = devm_kzalloc(dev, sizeof(*ucsi), GFP_KERNEL); @@ -310,6 +330,38 @@ static int pmic_glink_ucsi_probe(struct auxiliary_device *adev,
ucsi_set_drvdata(ucsi->ucsi, ucsi);
+ device_for_each_child_node(dev, fwnode) { + struct gpio_desc *desc; + u32 port; + + ret = fwnode_property_read_u32(fwnode, "reg", &port); + if (ret < 0) { + dev_err(dev, "missing reg property of %pOFn\n", fwnode); + return ret; + } + + if (port >= PMIC_GLINK_MAX_PORTS) { + dev_warn(dev, "invalid connector number, ignoring\n"); + continue; + } + + desc = devm_gpiod_get_index_optional(&adev->dev, "orientation", port, GPIOD_IN); + + /* If GPIO isn't found, continue */ + if (!desc) + continue; + + if (IS_ERR(desc)) + return dev_err_probe(dev, PTR_ERR(desc), + "unable to acquire orientation gpio\n"); + ucsi->port_orientation[port] = desc; + + ucsi->port_switch[port] = fwnode_typec_switch_get(fwnode); + if (IS_ERR(ucsi->port_switch[port])) + return dev_err_probe(dev, PTR_ERR(ucsi->port_switch[port]), + "failed to acquire orientation-switch\n"); + } + ucsi->client = devm_pmic_glink_register_client(dev, PMIC_GLINK_OWNER_USBC, pmic_glink_ucsi_callback,
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com
[ Upstream commit 4ea2b6d3128ea4d502c4015df0dc16b7d1070954 ]
New platforms have a slightly different DMI product name, remove trailing characters/digits to handle all cases
Closes: https://github.com/thesofproject/linux/issues/4611 Signed-off-by: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com Reviewed-by: Rander Wang rander.wang@intel.com Signed-off-by: Bard Liao yung-chuan.liao@linux.intel.com Link: https://lore.kernel.org/r/20231013010833.114271-1-yung-chuan.liao@linux.inte... Signed-off-by: Vinod Koul vkoul@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/soundwire/dmi-quirks.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/soundwire/dmi-quirks.c b/drivers/soundwire/dmi-quirks.c index 2a1096dab63d3..9ebdd0cd0b1cf 100644 --- a/drivers/soundwire/dmi-quirks.c +++ b/drivers/soundwire/dmi-quirks.c @@ -141,7 +141,7 @@ static const struct dmi_system_id adr_remap_quirk_table[] = { { .matches = { DMI_MATCH(DMI_SYS_VENDOR, "HP"), - DMI_MATCH(DMI_PRODUCT_NAME, "OMEN by HP Gaming Laptop 16-k0xxx"), + DMI_MATCH(DMI_PRODUCT_NAME, "OMEN by HP Gaming Laptop 16"), }, .driver_data = (void *)hp_omen_16, },
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Zhiguo Niu zhiguo.niu@unisoc.com
[ Upstream commit a5e80e18f268ea7c7a36bc4159de0deb3b5a2171 ]
If NAT is corrupted, let scan_nat_page() return EFSCORRUPTED, so that, caller can set SBI_NEED_FSCK flag into checkpoint for later repair by fsck.
Also, this patch introduces a new fscorrupted error flag, and in above scenario, it will persist the error flag into superblock synchronously to avoid it has no luck to trigger a checkpoint to record SBI_NEED_FSCK
Signed-off-by: Zhiguo Niu zhiguo.niu@unisoc.com Signed-off-by: Chao Yu chao@kernel.org Signed-off-by: Jaegeuk Kim jaegeuk@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/f2fs/node.c | 11 +++++++++-- include/linux/f2fs_fs.h | 1 + 2 files changed, 10 insertions(+), 2 deletions(-)
diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c index ee2e1dd64f256..248764badcde8 100644 --- a/fs/f2fs/node.c +++ b/fs/f2fs/node.c @@ -2389,7 +2389,7 @@ static int scan_nat_page(struct f2fs_sb_info *sbi, blk_addr = le32_to_cpu(nat_blk->entries[i].block_addr);
if (blk_addr == NEW_ADDR) - return -EINVAL; + return -EFSCORRUPTED;
if (blk_addr == NULL_ADDR) { add_free_nid(sbi, start_nid, true, true); @@ -2504,7 +2504,14 @@ static int __f2fs_build_free_nids(struct f2fs_sb_info *sbi,
if (ret) { f2fs_up_read(&nm_i->nat_tree_lock); - f2fs_err(sbi, "NAT is corrupt, run fsck to fix it"); + + if (ret == -EFSCORRUPTED) { + f2fs_err(sbi, "NAT is corrupt, run fsck to fix it"); + set_sbi_flag(sbi, SBI_NEED_FSCK); + f2fs_handle_error(sbi, + ERROR_INCONSISTENT_NAT); + } + return ret; } } diff --git a/include/linux/f2fs_fs.h b/include/linux/f2fs_fs.h index a82a4bb6ce68b..cf1adceb02697 100644 --- a/include/linux/f2fs_fs.h +++ b/include/linux/f2fs_fs.h @@ -104,6 +104,7 @@ enum f2fs_error { ERROR_CORRUPTED_VERITY_XATTR, ERROR_CORRUPTED_XATTR, ERROR_INVALID_NODE_REFERENCE, + ERROR_INCONSISTENT_NAT, ERROR_MAX, };
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Zhiguo Niu zhiguo.niu@unisoc.com
[ Upstream commit 9b4c8dd99fe48721410741651d426015e03a4b7a ]
Use f2fs_handle_error to record inconsistent node block error and return -EFSCORRUPTED instead of -EINVAL.
Signed-off-by: Zhiguo Niu zhiguo.niu@unisoc.com Signed-off-by: Jaegeuk Kim jaegeuk@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/f2fs/node.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c index 248764badcde8..ed963c56ac32a 100644 --- a/fs/f2fs/node.c +++ b/fs/f2fs/node.c @@ -1467,7 +1467,8 @@ static struct page *__get_node_page(struct f2fs_sb_info *sbi, pgoff_t nid, ofs_of_node(page), cpver_of_node(page), next_blkaddr_of_node(page)); set_sbi_flag(sbi, SBI_NEED_FSCK); - err = -EINVAL; + f2fs_handle_error(sbi, ERROR_INCONSISTENT_FOOTER); + err = -EFSCORRUPTED; out_err: ClearPageUptodate(page); out_put_err:
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Wesley Cheng quic_wcheng@quicinc.com
[ Upstream commit 6add6dd345cb754ce18ff992c7264cabf31e59f6 ]
There is a 120ms delay implemented for allowing the XHCI host controller to detect a U3 wakeup pulse. The intention is to wait for the device to retry the wakeup event if the USB3 PORTSC doesn't reflect the RESUME link status by the time it is checked. As per the USB3 specification:
tU3WakeupRetryDelay ("Table 7-12. LTSSM State Transition Timeouts")
This would allow the XHCI resume sequence to determine if the root hub needs to be also resumed. However, in case there is no device connected, or if there is only a HSUSB device connected, this delay would still affect the overall resume timing.
Since this delay is solely for detecting U3 wake events (USB3 specific) then ignore this delay for the disconnected case and the HSUSB connected only case.
[skip helper function, rename usb3_connected variable -Mathias ]
Signed-off-by: Wesley Cheng quic_wcheng@quicinc.com Signed-off-by: Mathias Nyman mathias.nyman@linux.intel.com Link: https://lore.kernel.org/r/20231019102924.2797346-20-mathias.nyman@linux.inte... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/usb/host/xhci.c | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-)
diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c index fae994f679d45..82aab2f9adbb8 100644 --- a/drivers/usb/host/xhci.c +++ b/drivers/usb/host/xhci.c @@ -968,6 +968,7 @@ int xhci_resume(struct xhci_hcd *xhci, pm_message_t msg) int retval = 0; bool comp_timer_running = false; bool pending_portevent = false; + bool suspended_usb3_devs = false; bool reinit_xhc = false;
if (!hcd->state) @@ -1115,10 +1116,17 @@ int xhci_resume(struct xhci_hcd *xhci, pm_message_t msg) /* * Resume roothubs only if there are pending events. * USB 3 devices resend U3 LFPS wake after a 100ms delay if - * the first wake signalling failed, give it that chance. + * the first wake signalling failed, give it that chance if + * there are suspended USB 3 devices. */ + if (xhci->usb3_rhub.bus_state.suspended_ports || + xhci->usb3_rhub.bus_state.bus_suspended) + suspended_usb3_devs = true; + pending_portevent = xhci_pending_portevent(xhci); - if (!pending_portevent && msg.event == PM_EVENT_AUTO_RESUME) { + + if (suspended_usb3_devs && !pending_portevent && + msg.event == PM_EVENT_AUTO_RESUME) { msleep(120); pending_portevent = xhci_pending_portevent(xhci); }
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Hardik Gajjar hgajjar@de.adit-jv.com
[ Upstream commit a04224da1f3424b2c607b12a3bd1f0e302fb8231 ]
Previously, gadget assignment to the net device occurred exclusively during the initial binding attempt.
Nevertheless, the gadget pointer could change during bind/unbind cycles due to various conditions, including the unloading/loading of the UDC device driver or the detachment/reconnection of an OTG-capable USB hub device.
This patch relocates the gether_set_gadget() function out from ncm_opts->bound condition check, ensuring that the correct gadget is assigned during each bind request.
The provided logs demonstrate the consistency of ncm_opts throughout the power cycle, while the gadget may change.
* OTG hub connected during boot up and assignment of gadget and ncm_opts pointer
[ 2.366301] usb 2-1.5: New USB device found, idVendor=2996, idProduct=0105 [ 2.366304] usb 2-1.5: New USB device strings: Mfr=1, Product=2, SerialNumber=3 [ 2.366306] usb 2-1.5: Product: H2H Bridge [ 2.366308] usb 2-1.5: Manufacturer: Aptiv [ 2.366309] usb 2-1.5: SerialNumber: 13FEB2021 [ 2.427989] usb 2-1.5: New USB device found, VID=2996, PID=0105 [ 2.428959] dabridge 2-1.5:1.0: dabridge 2-4 total endpoints=5, 0000000093a8d681 [ 2.429710] dabridge 2-1.5:1.0: P(0105) D(22.06.22) F(17.3.16) H(1.1) high-speed [ 2.429714] dabridge 2-1.5:1.0: Hub 2-2 P(0151) V(06.87) [ 2.429956] dabridge 2-1.5:1.0: All downstream ports in host mode
[ 2.430093] gadget 000000003c414d59 ------> gadget pointer
* NCM opts and associated gadget pointer during First ncm_bind
[ 34.763929] NCM opts 00000000aa304ac9 [ 34.763930] NCM gadget 000000003c414d59
* OTG capable hub disconnecte or assume driver unload.
[ 97.203114] usb 2-1: USB disconnect, device number 2 [ 97.203118] usb 2-1.1: USB disconnect, device number 3 [ 97.209217] usb 2-1.5: USB disconnect, device number 4 [ 97.230990] dabr_udc deleted
* Reconnect the OTG hub or load driver assaign new gadget pointer.
[ 111.534035] usb 2-1.1: New USB device found, idVendor=2996, idProduct=0120, bcdDevice= 6.87 [ 111.534038] usb 2-1.1: New USB device strings: Mfr=1, Product=2, SerialNumber=3 [ 111.534040] usb 2-1.1: Product: Vendor [ 111.534041] usb 2-1.1: Manufacturer: Aptiv [ 111.534042] usb 2-1.1: SerialNumber: Superior [ 111.535175] usb 2-1.1: New USB device found, VID=2996, PID=0120 [ 111.610995] usb 2-1.5: new high-speed USB device number 8 using xhci-hcd [ 111.630052] usb 2-1.5: New USB device found, idVendor=2996, idProduct=0105, bcdDevice=21.02 [ 111.630055] usb 2-1.5: New USB device strings: Mfr=1, Product=2, SerialNumber=3 [ 111.630057] usb 2-1.5: Product: H2H Bridge [ 111.630058] usb 2-1.5: Manufacturer: Aptiv [ 111.630059] usb 2-1.5: SerialNumber: 13FEB2021 [ 111.687464] usb 2-1.5: New USB device found, VID=2996, PID=0105 [ 111.690375] dabridge 2-1.5:1.0: dabridge 2-8 total endpoints=5, 000000000d87c961 [ 111.691172] dabridge 2-1.5:1.0: P(0105) D(22.06.22) F(17.3.16) H(1.1) high-speed [ 111.691176] dabridge 2-1.5:1.0: Hub 2-6 P(0151) V(06.87) [ 111.691646] dabridge 2-1.5:1.0: All downstream ports in host mode
[ 111.692298] gadget 00000000dc72f7a9 --------> new gadget ptr on connect
* NCM opts and associated gadget pointer during second ncm_bind
[ 113.271786] NCM opts 00000000aa304ac9 -----> same opts ptr used during first bind [ 113.271788] NCM gadget 00000000dc72f7a9 ----> however new gaget ptr, that will not set in net_device due to ncm_opts->bound = true
Signed-off-by: Hardik Gajjar hgajjar@de.adit-jv.com Link: https://lore.kernel.org/r/20231020153324.82794-1-hgajjar@de.adit-jv.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/usb/gadget/function/f_ncm.c | 27 +++++++++++---------------- 1 file changed, 11 insertions(+), 16 deletions(-)
diff --git a/drivers/usb/gadget/function/f_ncm.c b/drivers/usb/gadget/function/f_ncm.c index faf90a2174194..bbb6ff6b11aa1 100644 --- a/drivers/usb/gadget/function/f_ncm.c +++ b/drivers/usb/gadget/function/f_ncm.c @@ -1425,7 +1425,7 @@ static int ncm_bind(struct usb_configuration *c, struct usb_function *f) struct usb_composite_dev *cdev = c->cdev; struct f_ncm *ncm = func_to_ncm(f); struct usb_string *us; - int status; + int status = 0; struct usb_ep *ep; struct f_ncm_opts *ncm_opts;
@@ -1443,22 +1443,17 @@ static int ncm_bind(struct usb_configuration *c, struct usb_function *f) f->os_desc_table[0].os_desc = &ncm_opts->ncm_os_desc; }
- /* - * in drivers/usb/gadget/configfs.c:configfs_composite_bind() - * configurations are bound in sequence with list_for_each_entry, - * in each configuration its functions are bound in sequence - * with list_for_each_entry, so we assume no race condition - * with regard to ncm_opts->bound access - */ - if (!ncm_opts->bound) { - mutex_lock(&ncm_opts->lock); - gether_set_gadget(ncm_opts->net, cdev->gadget); + mutex_lock(&ncm_opts->lock); + gether_set_gadget(ncm_opts->net, cdev->gadget); + if (!ncm_opts->bound) status = gether_register_netdev(ncm_opts->net); - mutex_unlock(&ncm_opts->lock); - if (status) - goto fail; - ncm_opts->bound = true; - } + mutex_unlock(&ncm_opts->lock); + + if (status) + goto fail; + + ncm_opts->bound = true; + us = usb_gstrings_attach(cdev, ncm_strings, ARRAY_SIZE(ncm_string_defs)); if (IS_ERR(us)) {
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Marco Elver elver@google.com
[ Upstream commit 355f074609dbf3042900ea9d30fcd2b0c323a365 ]
syzbot reported:
| BUG: KCSAN: data-race in p9_fd_create / p9_fd_create | | read-write to 0xffff888130fb3d48 of 4 bytes by task 15599 on cpu 0: | p9_fd_open net/9p/trans_fd.c:842 [inline] | p9_fd_create+0x210/0x250 net/9p/trans_fd.c:1092 | p9_client_create+0x595/0xa70 net/9p/client.c:1010 | v9fs_session_init+0xf9/0xd90 fs/9p/v9fs.c:410 | v9fs_mount+0x69/0x630 fs/9p/vfs_super.c:123 | legacy_get_tree+0x74/0xd0 fs/fs_context.c:611 | vfs_get_tree+0x51/0x190 fs/super.c:1519 | do_new_mount+0x203/0x660 fs/namespace.c:3335 | path_mount+0x496/0xb30 fs/namespace.c:3662 | do_mount fs/namespace.c:3675 [inline] | __do_sys_mount fs/namespace.c:3884 [inline] | [...] | | read-write to 0xffff888130fb3d48 of 4 bytes by task 15563 on cpu 1: | p9_fd_open net/9p/trans_fd.c:842 [inline] | p9_fd_create+0x210/0x250 net/9p/trans_fd.c:1092 | p9_client_create+0x595/0xa70 net/9p/client.c:1010 | v9fs_session_init+0xf9/0xd90 fs/9p/v9fs.c:410 | v9fs_mount+0x69/0x630 fs/9p/vfs_super.c:123 | legacy_get_tree+0x74/0xd0 fs/fs_context.c:611 | vfs_get_tree+0x51/0x190 fs/super.c:1519 | do_new_mount+0x203/0x660 fs/namespace.c:3335 | path_mount+0x496/0xb30 fs/namespace.c:3662 | do_mount fs/namespace.c:3675 [inline] | __do_sys_mount fs/namespace.c:3884 [inline] | [...] | | value changed: 0x00008002 -> 0x00008802
Within p9_fd_open(), O_NONBLOCK is added to f_flags of the read and write files. This may happen concurrently if e.g. mounting process modifies the fd in another thread.
Mark the plain read-modify-writes as intentional data-races, with the assumption that the result of executing the accesses concurrently will always result in the same result despite the accesses themselves not being atomic.
Reported-by: syzbot+e441aeeb422763cc5511@syzkaller.appspotmail.com Signed-off-by: Marco Elver elver@google.com Link: https://lore.kernel.org/r/ZO38mqkS0TYUlpFp@elver.google.com Signed-off-by: Dominique Martinet asmadeus@codewreck.org Message-ID: 20231025103445.1248103-1-asmadeus@codewreck.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/9p/trans_fd.c | 13 ++++++++++--- 1 file changed, 10 insertions(+), 3 deletions(-)
diff --git a/net/9p/trans_fd.c b/net/9p/trans_fd.c index 00b684616e8d9..9374790f17ce4 100644 --- a/net/9p/trans_fd.c +++ b/net/9p/trans_fd.c @@ -832,14 +832,21 @@ static int p9_fd_open(struct p9_client *client, int rfd, int wfd) goto out_free_ts; if (!(ts->rd->f_mode & FMODE_READ)) goto out_put_rd; - /* prevent workers from hanging on IO when fd is a pipe */ - ts->rd->f_flags |= O_NONBLOCK; + /* Prevent workers from hanging on IO when fd is a pipe. + * It's technically possible for userspace or concurrent mounts to + * modify this flag concurrently, which will likely result in a + * broken filesystem. However, just having bad flags here should + * not crash the kernel or cause any other sort of bug, so mark this + * particular data race as intentional so that tooling (like KCSAN) + * can allow it and detect further problems. + */ + data_race(ts->rd->f_flags |= O_NONBLOCK); ts->wr = fget(wfd); if (!ts->wr) goto out_put_rd; if (!(ts->wr->f_mode & FMODE_WRITE)) goto out_put_wr; - ts->wr->f_flags |= O_NONBLOCK; + data_race(ts->wr->f_flags |= O_NONBLOCK);
client->trans = ts; client->status = Connected;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dominique Martinet asmadeus@codewreck.org
[ Upstream commit 9b5c6281838fc84683dd99b47302d81fce399918 ]
W=1 warns about null argument to kprintf: In file included from fs/9p/xattr.c:12: In function ‘v9fs_xattr_get’, inlined from ‘v9fs_listxattr’ at fs/9p/xattr.c:142:9: include/net/9p/9p.h:55:2: error: ‘%s’ directive argument is null [-Werror=format-overflow=] 55 | _p9_debug(level, __func__, fmt, ##__VA_ARGS__) | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Use an empty string instead of : - this is ok 9p-wise because p9pdu_vwritef serializes a null string and an empty string the same way (one '0' word for length) - since this degrades the print statements, add new single quotes for xattr's name delimter (Old: "file = (null)", new: "file = ''")
Link: https://lore.kernel.org/r/20231008060138.517057-1-suhui@nfschina.com Suggested-by: Su Hui suhui@nfschina.com Signed-off-by: Dominique Martinet asmadeus@codewreck.org Acked-by: Christian Schoenebeck linux_oss@crudebyte.com Message-ID: 20231025103445.1248103-2-asmadeus@codewreck.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/9p/xattr.c | 5 +++-- net/9p/client.c | 2 +- 2 files changed, 4 insertions(+), 3 deletions(-)
diff --git a/fs/9p/xattr.c b/fs/9p/xattr.c index e00cf8109b3f3..3c4572ef3a488 100644 --- a/fs/9p/xattr.c +++ b/fs/9p/xattr.c @@ -68,7 +68,7 @@ ssize_t v9fs_xattr_get(struct dentry *dentry, const char *name, struct p9_fid *fid; int ret;
- p9_debug(P9_DEBUG_VFS, "name = %s value_len = %zu\n", + p9_debug(P9_DEBUG_VFS, "name = '%s' value_len = %zu\n", name, buffer_size); fid = v9fs_fid_lookup(dentry); if (IS_ERR(fid)) @@ -139,7 +139,8 @@ int v9fs_fid_xattr_set(struct p9_fid *fid, const char *name,
ssize_t v9fs_listxattr(struct dentry *dentry, char *buffer, size_t buffer_size) { - return v9fs_xattr_get(dentry, NULL, buffer, buffer_size); + /* Txattrwalk with an empty string lists xattrs instead */ + return v9fs_xattr_get(dentry, "", buffer, buffer_size); }
static int v9fs_xattr_handler_get(const struct xattr_handler *handler, diff --git a/net/9p/client.c b/net/9p/client.c index b0e7cb7e1a54a..e265a0ca6bddd 100644 --- a/net/9p/client.c +++ b/net/9p/client.c @@ -1981,7 +1981,7 @@ struct p9_fid *p9_client_xattrwalk(struct p9_fid *file_fid, goto error; } p9_debug(P9_DEBUG_9P, - ">>> TXATTRWALK file_fid %d, attr_fid %d name %s\n", + ">>> TXATTRWALK file_fid %d, attr_fid %d name '%s'\n", file_fid->fid, attr_fid->fid, attr_name);
req = p9_client_rpc(clnt, P9_TXATTRWALK, "dds",
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jarkko Nikula jarkko.nikula@linux.intel.com
[ Upstream commit 45a832f989e520095429589d5b01b0c65da9b574 ]
Do not loop over ring headers in hci_dma_irq_handler() that are not allocated and enabled in hci_dma_init(). Otherwise out of bounds access will occur from rings->headers[i] access when i >= number of allocated ring headers.
Signed-off-by: Jarkko Nikula jarkko.nikula@linux.intel.com Link: https://lore.kernel.org/r/20230921055704.1087277-5-jarkko.nikula@linux.intel... Signed-off-by: Alexandre Belloni alexandre.belloni@bootlin.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/i3c/master/mipi-i3c-hci/dma.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/i3c/master/mipi-i3c-hci/dma.c b/drivers/i3c/master/mipi-i3c-hci/dma.c index 2990ac9eaade7..71b5dbe45c45c 100644 --- a/drivers/i3c/master/mipi-i3c-hci/dma.c +++ b/drivers/i3c/master/mipi-i3c-hci/dma.c @@ -734,7 +734,7 @@ static bool hci_dma_irq_handler(struct i3c_hci *hci, unsigned int mask) unsigned int i; bool handled = false;
- for (i = 0; mask && i < 8; i++) { + for (i = 0; mask && i < rings->total; i++) { struct hci_rh_data *rh; u32 status;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jarkko Nikula jarkko.nikula@linux.intel.com
[ Upstream commit 8c56f9ef25a33e51f09a448d25cf863b61c9658d ]
Add SMBus PCI ID on Intel Birch Stream SoC.
Signed-off-by: Jarkko Nikula jarkko.nikula@linux.intel.com Reviewed-by: Andi Shyti andi.shyti@kernel.org Reviewed-by: Jean Delvare jdelvare@suse.de Signed-off-by: Wolfram Sang wsa@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- Documentation/i2c/busses/i2c-i801.rst | 1 + drivers/i2c/busses/Kconfig | 1 + drivers/i2c/busses/i2c-i801.c | 3 +++ 3 files changed, 5 insertions(+)
diff --git a/Documentation/i2c/busses/i2c-i801.rst b/Documentation/i2c/busses/i2c-i801.rst index e76e68ccf7182..10eced6c2e462 100644 --- a/Documentation/i2c/busses/i2c-i801.rst +++ b/Documentation/i2c/busses/i2c-i801.rst @@ -47,6 +47,7 @@ Supported adapters: * Intel Alder Lake (PCH) * Intel Raptor Lake (PCH) * Intel Meteor Lake (SOC and PCH) + * Intel Birch Stream (SOC)
Datasheets: Publicly available at the Intel website
diff --git a/drivers/i2c/busses/Kconfig b/drivers/i2c/busses/Kconfig index 9cfe8fc509d7d..c91d4ea35c9b8 100644 --- a/drivers/i2c/busses/Kconfig +++ b/drivers/i2c/busses/Kconfig @@ -158,6 +158,7 @@ config I2C_I801 Alder Lake (PCH) Raptor Lake (PCH) Meteor Lake (SOC and PCH) + Birch Stream (SOC)
This driver can also be built as a module. If so, the module will be called i2c-i801. diff --git a/drivers/i2c/busses/i2c-i801.c b/drivers/i2c/busses/i2c-i801.c index 2a3215ac01b3a..db4d73456f255 100644 --- a/drivers/i2c/busses/i2c-i801.c +++ b/drivers/i2c/busses/i2c-i801.c @@ -79,6 +79,7 @@ * Meteor Lake-P (SOC) 0x7e22 32 hard yes yes yes * Meteor Lake SoC-S (SOC) 0xae22 32 hard yes yes yes * Meteor Lake PCH-S (PCH) 0x7f23 32 hard yes yes yes + * Birch Stream (SOC) 0x5796 32 hard yes yes yes * * Features supported by this driver: * Software PEC no @@ -231,6 +232,7 @@ #define PCI_DEVICE_ID_INTEL_JASPER_LAKE_SMBUS 0x4da3 #define PCI_DEVICE_ID_INTEL_ALDER_LAKE_P_SMBUS 0x51a3 #define PCI_DEVICE_ID_INTEL_ALDER_LAKE_M_SMBUS 0x54a3 +#define PCI_DEVICE_ID_INTEL_BIRCH_STREAM_SMBUS 0x5796 #define PCI_DEVICE_ID_INTEL_BROXTON_SMBUS 0x5ad4 #define PCI_DEVICE_ID_INTEL_RAPTOR_LAKE_S_SMBUS 0x7a23 #define PCI_DEVICE_ID_INTEL_ALDER_LAKE_S_SMBUS 0x7aa3 @@ -1044,6 +1046,7 @@ static const struct pci_device_id i801_ids[] = { { PCI_DEVICE_DATA(INTEL, METEOR_LAKE_P_SMBUS, FEATURES_ICH5 | FEATURE_TCO_CNL) }, { PCI_DEVICE_DATA(INTEL, METEOR_LAKE_SOC_S_SMBUS, FEATURES_ICH5 | FEATURE_TCO_CNL) }, { PCI_DEVICE_DATA(INTEL, METEOR_LAKE_PCH_S_SMBUS, FEATURES_ICH5 | FEATURE_TCO_CNL) }, + { PCI_DEVICE_DATA(INTEL, BIRCH_STREAM_SMBUS, FEATURES_ICH5 | FEATURE_TCO_CNL) }, { 0, } };
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Wolfram Sang wsa+renesas@sang-engineering.com
[ Upstream commit 6af79f7fe748fe6a3c5c3a63d7f35981a82c2769 ]
Yang Yingliang reported a memleak: ===
I got memory leak as follows when doing fault injection test:
unreferenced object 0xffff888014aec078 (size 8): comm "xrun", pid 356, jiffies 4294910619 (age 16.332s) hex dump (first 8 bytes): 31 2d 30 30 31 63 00 00 1-001c.. backtrace: [<00000000eb56c0a9>] __kmalloc_track_caller+0x1a6/0x300 [<000000000b220ea3>] kvasprintf+0xad/0x140 [<00000000b83203e5>] kvasprintf_const+0x62/0x190 [<000000002a5eab37>] kobject_set_name_vargs+0x56/0x140 [<00000000300ac279>] dev_set_name+0xb0/0xe0 [<00000000b66ebd6f>] i2c_new_client_device+0x7e4/0x9a0
If device_register() returns error in i2c_new_client_device(), the name allocated by i2c_dev_set_name() need be freed. As comment of device_register() says, it should use put_device() to give up the reference in the error path.
=== I think this solution is less intrusive and more robust than he originally proposed solutions, though.
Reported-by: Yang Yingliang yangyingliang@huawei.com Closes: http://patchwork.ozlabs.org/project/linux-i2c/patch/20221124085448.3620240-1... Signed-off-by: Wolfram Sang wsa+renesas@sang-engineering.com Signed-off-by: Wolfram Sang wsa@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/i2c/i2c-core-base.c | 13 +++++++++---- 1 file changed, 9 insertions(+), 4 deletions(-)
diff --git a/drivers/i2c/i2c-core-base.c b/drivers/i2c/i2c-core-base.c index 60746652fd525..7f30bcceebaed 100644 --- a/drivers/i2c/i2c-core-base.c +++ b/drivers/i2c/i2c-core-base.c @@ -931,8 +931,9 @@ int i2c_dev_irq_from_resources(const struct resource *resources, struct i2c_client * i2c_new_client_device(struct i2c_adapter *adap, struct i2c_board_info const *info) { - struct i2c_client *client; - int status; + struct i2c_client *client; + bool need_put = false; + int status;
client = kzalloc(sizeof *client, GFP_KERNEL); if (!client) @@ -970,7 +971,6 @@ i2c_new_client_device(struct i2c_adapter *adap, struct i2c_board_info const *inf client->dev.fwnode = info->fwnode;
device_enable_async_suspend(&client->dev); - i2c_dev_set_name(adap, client, info);
if (info->swnode) { status = device_add_software_node(&client->dev, info->swnode); @@ -982,6 +982,7 @@ i2c_new_client_device(struct i2c_adapter *adap, struct i2c_board_info const *inf } }
+ i2c_dev_set_name(adap, client, info); status = device_register(&client->dev); if (status) goto out_remove_swnode; @@ -993,6 +994,7 @@ i2c_new_client_device(struct i2c_adapter *adap, struct i2c_board_info const *inf
out_remove_swnode: device_remove_software_node(&client->dev); + need_put = true; out_err_put_of_node: of_node_put(info->of_node); out_err: @@ -1000,7 +1002,10 @@ i2c_new_client_device(struct i2c_adapter *adap, struct i2c_board_info const *inf "Failed to register i2c client %s at 0x%02x (%d)\n", client->name, client->addr, status); out_err_silent: - kfree(client); + if (need_put) + put_device(&client->dev); + else + kfree(client); return ERR_PTR(status); } EXPORT_SYMBOL_GPL(i2c_new_client_device);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Axel Lin axel.lin@ingics.com
[ Upstream commit 5ac61d26b8baff5b2e5a9f3dc1ef63297e4b53e7 ]
Make sure we don't OOPS in case clock-frequency is set to 0 in a DT. The variable set here is later used as a divisor.
Signed-off-by: Axel Lin axel.lin@ingics.com Acked-by: Boris Brezillon boris.brezillon@free-electrons.com Signed-off-by: Wolfram Sang wsa@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/i2c/busses/i2c-sun6i-p2wi.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/drivers/i2c/busses/i2c-sun6i-p2wi.c b/drivers/i2c/busses/i2c-sun6i-p2wi.c index fa6020dced595..85e035e7a1d75 100644 --- a/drivers/i2c/busses/i2c-sun6i-p2wi.c +++ b/drivers/i2c/busses/i2c-sun6i-p2wi.c @@ -201,6 +201,11 @@ static int p2wi_probe(struct platform_device *pdev) return -EINVAL; }
+ if (clk_freq == 0) { + dev_err(dev, "clock-frequency is set to 0 in DT\n"); + return -EINVAL; + } + if (of_get_child_count(np) > 1) { dev_err(dev, "P2WI only supports one slave device\n"); return -EINVAL;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: zhenwei pi pizhenwei@bytedance.com
[ Upstream commit fafb51a67fb883eb2dde352539df939a251851be ]
The following codes have an implicit conversion from size_t to u32: (u32)max_size = (size_t)virtio_max_dma_size(vdev);
This may lead overflow, Ex (size_t)4G -> (u32)0. Once virtio_max_dma_size() has a larger size than U32_MAX, use U32_MAX instead.
Signed-off-by: zhenwei pi pizhenwei@bytedance.com Message-Id: 20230904061045.510460-1-pizhenwei@bytedance.com Signed-off-by: Michael S. Tsirkin mst@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/block/virtio_blk.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c index 1fe011676d070..4a4b9bad551e8 100644 --- a/drivers/block/virtio_blk.c +++ b/drivers/block/virtio_blk.c @@ -1313,6 +1313,7 @@ static int virtblk_probe(struct virtio_device *vdev) u16 min_io_size; u8 physical_block_exp, alignment_offset; unsigned int queue_depth; + size_t max_dma_size;
if (!vdev->config->get) { dev_err(&vdev->dev, "%s failure: config access disabled\n", @@ -1411,7 +1412,8 @@ static int virtblk_probe(struct virtio_device *vdev) /* No real sector limit. */ blk_queue_max_hw_sectors(q, UINT_MAX);
- max_size = virtio_max_dma_size(vdev); + max_dma_size = virtio_max_dma_size(vdev); + max_size = max_dma_size > U32_MAX ? U32_MAX : max_dma_size;
/* Host can optionally specify maximum segment size and number of * segments. */
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Billy Tsai billy_tsai@aspeedtech.com
[ Upstream commit b53e9758a31c683fc8615df930262192ed5f034b ]
The `i3c_master_bus_init` function may attach the I2C devices before the I3C bus initialization. In this flow, the DAT `alloc_entry`` will be used before the DAT `init`. Additionally, if the `i3c_master_bus_init` fails, the DAT `cleanup` will execute before the device is detached, which will execue DAT `free_entry` function. The above scenario can cause the driver to use DAT_data when it is NULL.
Signed-off-by: Billy Tsai billy_tsai@aspeedtech.com Link: https://lore.kernel.org/r/20231023080237.560936-1-billy_tsai@aspeedtech.com Signed-off-by: Alexandre Belloni alexandre.belloni@bootlin.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/i3c/master/mipi-i3c-hci/dat_v1.c | 29 ++++++++++++++++-------- 1 file changed, 19 insertions(+), 10 deletions(-)
diff --git a/drivers/i3c/master/mipi-i3c-hci/dat_v1.c b/drivers/i3c/master/mipi-i3c-hci/dat_v1.c index 97bb49ff5b53b..47b9b4d4ed3fc 100644 --- a/drivers/i3c/master/mipi-i3c-hci/dat_v1.c +++ b/drivers/i3c/master/mipi-i3c-hci/dat_v1.c @@ -64,15 +64,17 @@ static int hci_dat_v1_init(struct i3c_hci *hci) return -EOPNOTSUPP; }
- /* use a bitmap for faster free slot search */ - hci->DAT_data = bitmap_zalloc(hci->DAT_entries, GFP_KERNEL); - if (!hci->DAT_data) - return -ENOMEM; - - /* clear them */ - for (dat_idx = 0; dat_idx < hci->DAT_entries; dat_idx++) { - dat_w0_write(dat_idx, 0); - dat_w1_write(dat_idx, 0); + if (!hci->DAT_data) { + /* use a bitmap for faster free slot search */ + hci->DAT_data = bitmap_zalloc(hci->DAT_entries, GFP_KERNEL); + if (!hci->DAT_data) + return -ENOMEM; + + /* clear them */ + for (dat_idx = 0; dat_idx < hci->DAT_entries; dat_idx++) { + dat_w0_write(dat_idx, 0); + dat_w1_write(dat_idx, 0); + } }
return 0; @@ -87,7 +89,13 @@ static void hci_dat_v1_cleanup(struct i3c_hci *hci) static int hci_dat_v1_alloc_entry(struct i3c_hci *hci) { unsigned int dat_idx; + int ret;
+ if (!hci->DAT_data) { + ret = hci_dat_v1_init(hci); + if (ret) + return ret; + } dat_idx = find_first_zero_bit(hci->DAT_data, hci->DAT_entries); if (dat_idx >= hci->DAT_entries) return -ENOENT; @@ -103,7 +111,8 @@ static void hci_dat_v1_free_entry(struct i3c_hci *hci, unsigned int dat_idx) { dat_w0_write(dat_idx, 0); dat_w1_write(dat_idx, 0); - __clear_bit(dat_idx, hci->DAT_data); + if (hci->DAT_data) + __clear_bit(dat_idx, hci->DAT_data); }
static void hci_dat_v1_set_dynamic_addr(struct i3c_hci *hci,
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Rajeshwar R Shinde coolrrsh@gmail.com
[ Upstream commit 099be1822d1f095433f4b08af9cc9d6308ec1953 ]
Syzkaller reported the following issue: UBSAN: shift-out-of-bounds in drivers/media/usb/gspca/cpia1.c:1031:27 shift exponent 245 is too large for 32-bit type 'int'
When the value of the variable "sd->params.exposure.gain" exceeds the number of bits in an integer, a shift-out-of-bounds error is reported. It is triggered because the variable "currentexp" cannot be left-shifted by more than the number of bits in an integer. In order to avoid invalid range during left-shift, the conditional expression is added.
Reported-by: syzbot+e27f3dbdab04e43b9f73@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/20230818164522.12806-1-coolrrsh@gmail.com Link: https://syzkaller.appspot.com/bug?extid=e27f3dbdab04e43b9f73 Signed-off-by: Rajeshwar R Shinde coolrrsh@gmail.com Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/media/usb/gspca/cpia1.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/media/usb/gspca/cpia1.c b/drivers/media/usb/gspca/cpia1.c index 46ed95483e222..5f5fa851ca640 100644 --- a/drivers/media/usb/gspca/cpia1.c +++ b/drivers/media/usb/gspca/cpia1.c @@ -18,6 +18,7 @@
#include <linux/input.h> #include <linux/sched/signal.h> +#include <linux/bitops.h>
#include "gspca.h"
@@ -1028,6 +1029,8 @@ static int set_flicker(struct gspca_dev *gspca_dev, int on, int apply) sd->params.exposure.expMode = 2; sd->exposure_status = EXPOSURE_NORMAL; } + if (sd->params.exposure.gain >= BITS_PER_TYPE(currentexp)) + return -EINVAL; currentexp = currentexp << sd->params.exposure.gain; sd->params.exposure.gain = 0; /* round down current exposure to nearest value */
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Hans Verkuil hverkuil-cisco@xs4all.nl
[ Upstream commit 4567ebf8e8f9546b373e78e3b7d584cc30b62028 ]
Fixes these compiler warnings:
drivers/media/test-drivers/vivid/vivid-rds-gen.c: In function 'vivid_rds_gen_fill': drivers/media/test-drivers/vivid/vivid-rds-gen.c:147:56: warning: '.' directive output may be truncated writing 1 byte into a region of size between 0 and 3 [-Wformat-truncation=] 147 | snprintf(rds->psname, sizeof(rds->psname), "%6d.%1d", | ^ drivers/media/test-drivers/vivid/vivid-rds-gen.c:147:52: note: directive argument in the range [0, 9] 147 | snprintf(rds->psname, sizeof(rds->psname), "%6d.%1d", | ^~~~~~~~~ drivers/media/test-drivers/vivid/vivid-rds-gen.c:147:9: note: 'snprintf' output between 9 and 12 bytes into a destination of size 9 147 | snprintf(rds->psname, sizeof(rds->psname), "%6d.%1d", | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 148 | freq / 16, ((freq & 0xf) * 10) / 16); | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Acked-by: Arnd Bergmann arnd@arndb.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/media/test-drivers/vivid/vivid-rds-gen.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/media/test-drivers/vivid/vivid-rds-gen.c b/drivers/media/test-drivers/vivid/vivid-rds-gen.c index b5b104ee64c99..c57771119a34b 100644 --- a/drivers/media/test-drivers/vivid/vivid-rds-gen.c +++ b/drivers/media/test-drivers/vivid/vivid-rds-gen.c @@ -145,7 +145,7 @@ void vivid_rds_gen_fill(struct vivid_rds_gen *rds, unsigned freq, rds->ta = alt; rds->ms = true; snprintf(rds->psname, sizeof(rds->psname), "%6d.%1d", - freq / 16, ((freq & 0xf) * 10) / 16); + (freq / 16) % 1000000, (((freq & 0xf) * 10) / 16) % 10); if (alt) strscpy(rds->radiotext, " The Radio Data System can switch between different Radio Texts ",
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Hans Verkuil hverkuil-cisco@xs4all.nl
[ Upstream commit 83d0d4cc1423194b580356966107379490edd02e ]
Fixes this compiler warning:
In file included from include/linux/property.h:14, from include/linux/acpi.h:16, from drivers/media/pci/intel/ipu-bridge.c:4: In function 'ipu_bridge_init_swnode_names', inlined from 'ipu_bridge_create_connection_swnodes' at drivers/media/pci/intel/ipu-bridge.c:445:2, inlined from 'ipu_bridge_connect_sensor' at drivers/media/pci/intel/ipu-bridge.c:656:3: include/linux/fwnode.h:81:49: warning: '%u' directive output may be truncated writing between 1 and 3 bytes into a region of size 2 [-Wformat-truncation=] 81 | #define SWNODE_GRAPH_PORT_NAME_FMT "port@%u" | ^~~~~~~~~ drivers/media/pci/intel/ipu-bridge.c:384:18: note: in expansion of macro 'SWNODE_GRAPH_PORT_NAME_FMT' 384 | SWNODE_GRAPH_PORT_NAME_FMT, sensor->link); | ^~~~~~~~~~~~~~~~~~~~~~~~~~ include/linux/fwnode.h: In function 'ipu_bridge_connect_sensor': include/linux/fwnode.h:81:55: note: format string is defined here 81 | #define SWNODE_GRAPH_PORT_NAME_FMT "port@%u" | ^~ In function 'ipu_bridge_init_swnode_names', inlined from 'ipu_bridge_create_connection_swnodes' at drivers/media/pci/intel/ipu-bridge.c:445:2, inlined from 'ipu_bridge_connect_sensor' at drivers/media/pci/intel/ipu-bridge.c:656:3: include/linux/fwnode.h:81:49: note: directive argument in the range [0, 255] 81 | #define SWNODE_GRAPH_PORT_NAME_FMT "port@%u" | ^~~~~~~~~ drivers/media/pci/intel/ipu-bridge.c:384:18: note: in expansion of macro 'SWNODE_GRAPH_PORT_NAME_FMT' 384 | SWNODE_GRAPH_PORT_NAME_FMT, sensor->link); | ^~~~~~~~~~~~~~~~~~~~~~~~~~ drivers/media/pci/intel/ipu-bridge.c:382:9: note: 'snprintf' output between 7 and 9 bytes into a destination of size 7 382 | snprintf(sensor->node_names.remote_port, | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 383 | sizeof(sensor->node_names.remote_port), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 384 | SWNODE_GRAPH_PORT_NAME_FMT, sensor->link); | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Acked-by: Arnd Bergmann arnd@arndb.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/media/pci/intel/ipu-bridge.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/media/pci/intel/ipu-bridge.h b/drivers/media/pci/intel/ipu-bridge.h index 1ff0b2d04d929..1ed53d51e16a1 100644 --- a/drivers/media/pci/intel/ipu-bridge.h +++ b/drivers/media/pci/intel/ipu-bridge.h @@ -103,7 +103,7 @@ struct ipu_property_names { struct ipu_node_names { char port[7]; char endpoint[11]; - char remote_port[7]; + char remote_port[9]; char vcm[16]; };
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Bob Peterson rpeterso@redhat.com
[ Upstream commit 4c6a08125f2249531ec01783a5f4317d7342add5 ]
When lots of quota changes are made, there may be cases in which an inode's quota information is increased and then decreased, such as when blocks are added to a file, then deleted from it. If the timing is right, function do_qc can add pending quota changes to a transaction, then later, another call to do_qc can negate those changes, resulting in a net gain of 0. The quota_change information is recorded in the qc buffer (and qd element of the inode as well). The buffer is added to the transaction by the first call to do_qc, but a subsequent call changes the value from non-zero back to zero. At that point it's too late to remove the buffer_head from the transaction. Later, when the quota sync code is called, the zero-change qd element is discovered and flagged as an assert warning. If the fs is mounted with errors=panic, the kernel will panic.
This is usually seen when files are truncated and the quota changes are negated by punch_hole/truncate which uses gfs2_quota_hold and gfs2_quota_unhold rather than block allocations that use gfs2_quota_lock and gfs2_quota_unlock which automatically do quota sync.
This patch solves the problem by adding a check to qd_check_sync such that net-zero quota changes already added to the transaction are no longer deemed necessary to be synced, and skipped.
In this case references are taken for the qd and the slot from do_qc so those need to be put. The normal sequence of events for a normal non-zero quota change is as follows:
gfs2_quota_change do_qc qd_hold slot_hold
Later, when the changes are to be synced:
gfs2_quota_sync qd_fish qd_check_sync gets qd ref via lockref_get_not_dead do_sync do_qc(QC_SYNC) qd_put lockref_put_or_lock qd_unlock qd_put lockref_put_or_lock
In the net-zero change case, we add a check to qd_check_sync so it puts the qd and slot references acquired in gfs2_quota_change and skip the unneeded sync.
Signed-off-by: Bob Peterson rpeterso@redhat.com Signed-off-by: Andreas Gruenbacher agruenba@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/gfs2/quota.c | 11 +++++++++++ 1 file changed, 11 insertions(+)
diff --git a/fs/gfs2/quota.c b/fs/gfs2/quota.c index 704192b736050..ccecb79eeaf8e 100644 --- a/fs/gfs2/quota.c +++ b/fs/gfs2/quota.c @@ -441,6 +441,17 @@ static int qd_check_sync(struct gfs2_sbd *sdp, struct gfs2_quota_data *qd, (sync_gen && (qd->qd_sync_gen >= *sync_gen))) return 0;
+ /* + * If qd_change is 0 it means a pending quota change was negated. + * We should not sync it, but we still have a qd reference and slot + * reference taken by gfs2_quota_change -> do_qc that need to be put. + */ + if (!qd->qd_change && test_and_clear_bit(QDF_CHANGE, &qd->qd_flags)) { + slot_put(qd); + qd_put(qd); + return 0; + } + if (!lockref_get_not_dead(&qd->qd_lockref)) return 0;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Al Viro viro@zeniv.linux.org.uk
[ Upstream commit 0abd1557e21c617bd13fc18f7725fc6363c05913 ]
In RCU mode, we might race with gfs2_evict_inode(), which zeroes ->i_gl. Freeing of the object it points to is RCU-delayed, so if we manage to fetch the pointer before it's been replaced with NULL, we are fine. Check if we'd fetched NULL and treat that as "bail out and tell the caller to get out of RCU mode".
Signed-off-by: Al Viro viro@zeniv.linux.org.uk Signed-off-by: Andreas Gruenbacher agruenba@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/gfs2/inode.c | 11 +++++++++-- fs/gfs2/super.c | 2 +- 2 files changed, 10 insertions(+), 3 deletions(-)
diff --git a/fs/gfs2/inode.c b/fs/gfs2/inode.c index 17c994a0c0d09..117b4b5a03072 100644 --- a/fs/gfs2/inode.c +++ b/fs/gfs2/inode.c @@ -1862,14 +1862,21 @@ int gfs2_permission(struct mnt_idmap *idmap, struct inode *inode, { struct gfs2_inode *ip; struct gfs2_holder i_gh; + struct gfs2_glock *gl; int error;
gfs2_holder_mark_uninitialized(&i_gh); ip = GFS2_I(inode); - if (gfs2_glock_is_locked_by_me(ip->i_gl) == NULL) { + gl = rcu_dereference(ip->i_gl); + if (unlikely(!gl)) { + /* inode is getting torn down, must be RCU mode */ + WARN_ON_ONCE(!(mask & MAY_NOT_BLOCK)); + return -ECHILD; + } + if (gfs2_glock_is_locked_by_me(gl) == NULL) { if (mask & MAY_NOT_BLOCK) return -ECHILD; - error = gfs2_glock_nq_init(ip->i_gl, LM_ST_SHARED, LM_FLAG_ANY, &i_gh); + error = gfs2_glock_nq_init(gl, LM_ST_SHARED, LM_FLAG_ANY, &i_gh); if (error) return error; } diff --git a/fs/gfs2/super.c b/fs/gfs2/super.c index 9f4d5d6549ee6..f98ddb9d19a21 100644 --- a/fs/gfs2/super.c +++ b/fs/gfs2/super.c @@ -1558,7 +1558,7 @@ static void gfs2_evict_inode(struct inode *inode) wait_on_bit_io(&ip->i_flags, GIF_GLOP_PENDING, TASK_UNINTERRUPTIBLE); gfs2_glock_add_to_lru(ip->i_gl); gfs2_glock_put_eventually(ip->i_gl); - ip->i_gl = NULL; + rcu_assign_pointer(ip->i_gl, NULL); } }
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ilpo Järvinen ilpo.jarvinen@linux.intel.com
[ Upstream commit f301fedbeecfdce91cb898d6fa5e62f269801fee ]
Use FIELD_GET() to extract PCIe Negotiated and Maximum Link Width fields instead of custom masking and shifting.
Signed-off-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com Reviewed-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/media/pci/cobalt/cobalt-driver.c | 11 ++++++----- 1 file changed, 6 insertions(+), 5 deletions(-)
diff --git a/drivers/media/pci/cobalt/cobalt-driver.c b/drivers/media/pci/cobalt/cobalt-driver.c index 74edcc76d12f4..6e1a0614e6d06 100644 --- a/drivers/media/pci/cobalt/cobalt-driver.c +++ b/drivers/media/pci/cobalt/cobalt-driver.c @@ -8,6 +8,7 @@ * All rights reserved. */
+#include <linux/bitfield.h> #include <linux/delay.h> #include <media/i2c/adv7604.h> #include <media/i2c/adv7842.h> @@ -210,17 +211,17 @@ void cobalt_pcie_status_show(struct cobalt *cobalt) pcie_capability_read_word(pci_dev, PCI_EXP_LNKSTA, &stat); cobalt_info("PCIe link capability 0x%08x: %s per lane and %u lanes\n", capa, get_link_speed(capa), - (capa & PCI_EXP_LNKCAP_MLW) >> 4); + FIELD_GET(PCI_EXP_LNKCAP_MLW, capa)); cobalt_info("PCIe link control 0x%04x\n", ctrl); cobalt_info("PCIe link status 0x%04x: %s per lane and %u lanes\n", stat, get_link_speed(stat), - (stat & PCI_EXP_LNKSTA_NLW) >> 4); + FIELD_GET(PCI_EXP_LNKSTA_NLW, stat));
/* Bus */ pcie_capability_read_dword(pci_bus_dev, PCI_EXP_LNKCAP, &capa); cobalt_info("PCIe bus link capability 0x%08x: %s per lane and %u lanes\n", capa, get_link_speed(capa), - (capa & PCI_EXP_LNKCAP_MLW) >> 4); + FIELD_GET(PCI_EXP_LNKCAP_MLW, capa));
/* Slot */ pcie_capability_read_dword(pci_dev, PCI_EXP_SLTCAP, &capa); @@ -239,7 +240,7 @@ static unsigned pcie_link_get_lanes(struct cobalt *cobalt) if (!pci_is_pcie(pci_dev)) return 0; pcie_capability_read_word(pci_dev, PCI_EXP_LNKSTA, &link); - return (link & PCI_EXP_LNKSTA_NLW) >> 4; + return FIELD_GET(PCI_EXP_LNKSTA_NLW, link); }
static unsigned pcie_bus_link_get_lanes(struct cobalt *cobalt) @@ -250,7 +251,7 @@ static unsigned pcie_bus_link_get_lanes(struct cobalt *cobalt) if (!pci_is_pcie(pci_dev)) return 0; pcie_capability_read_dword(pci_dev, PCI_EXP_LNKCAP, &link); - return (link & PCI_EXP_LNKCAP_MLW) >> 4; + return FIELD_GET(PCI_EXP_LNKCAP_MLW, link); }
static void msi_config_show(struct cobalt *cobalt, struct pci_dev *pci_dev)
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sakari Ailus sakari.ailus@linux.intel.com
[ Upstream commit 441b5c63d71ec9ec5453328f7e83384ecc1dddd9 ]
Fix documentation for struct ccs_quirk, a device specific struct for managing deviations from the standard. The flags field was drifted away from where it should have been.
Signed-off-by: Sakari Ailus sakari.ailus@linux.intel.com Reviewed-by: Laurent Pinchart laurent.pinchart@ideasonboard.com Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/media/i2c/ccs/ccs-quirk.h | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/drivers/media/i2c/ccs/ccs-quirk.h b/drivers/media/i2c/ccs/ccs-quirk.h index 5838fcda92fd4..0b1a64958d714 100644 --- a/drivers/media/i2c/ccs/ccs-quirk.h +++ b/drivers/media/i2c/ccs/ccs-quirk.h @@ -32,12 +32,10 @@ struct ccs_sensor; * @reg: Pointer to the register to access * @value: Register value, set by the caller on write, or * by the quirk on read - * - * @flags: Quirk flags - * * @return: 0 on success, -ENOIOCTLCMD if no register * access may be done by the caller (default read * value is zero), else negative error code on error + * @flags: Quirk flags */ struct ccs_quirk { int (*limits)(struct ccs_sensor *sensor);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Takashi Iwai tiwai@suse.de
[ Upstream commit a1766a4fd83befa0b34d932d532e7ebb7fab1fa7 ]
imon driver probes two USB interfaces, and at the probe of the second interface, the driver assumes blindly that the first interface got bound with the same imon driver. It's usually true, but it's still possible that the first interface is bound with another driver via a malformed descriptor. Then it may lead to a memory corruption, as spotted by syzkaller; imon driver accesses the data from drvdata as struct imon_context object although it's a completely different one that was assigned by another driver.
This patch adds a sanity check -- whether the first interface is really bound with the imon driver or not -- for avoiding the problem above at the probe time.
Reported-by: syzbot+59875ffef5cb9c9b29e9@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/000000000000a838aa0603cc74d6@google.com/ Tested-by: Ricardo B. Marliere ricardo@marliere.net Link: https://lore.kernel.org/r/20230922005152.163640-1-ricardo@marliere.net Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Sean Young sean@mess.org Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/media/rc/imon.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/drivers/media/rc/imon.c b/drivers/media/rc/imon.c index 74546f7e34691..5719dda6e0f0e 100644 --- a/drivers/media/rc/imon.c +++ b/drivers/media/rc/imon.c @@ -2427,6 +2427,12 @@ static int imon_probe(struct usb_interface *interface, goto fail; }
+ if (first_if->dev.driver != interface->dev.driver) { + dev_err(&interface->dev, "inconsistent driver matching\n"); + ret = -EINVAL; + goto fail; + } + if (ifnum == 0) { ictx = imon_init_intf0(interface, id); if (!ictx) {
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Wayne Lin wayne.lin@amd.com
[ Upstream commit b1904ed480cee3f9f4036ea0e36d139cb5fee2d6 ]
[Why & How] Check whether assigned timing generator is NULL or not before accessing its funcs to prevent NULL dereference.
Reviewed-by: Jun Lei jun.lei@amd.com Acked-by: Hersen Wu hersenxs.wu@amd.com Signed-off-by: Wayne Lin wayne.lin@amd.com Tested-by: Daniel Wheeler daniel.wheeler@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/display/dc/core/dc_stream.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_stream.c b/drivers/gpu/drm/amd/display/dc/core/dc_stream.c index 6e11d2b701f82..569d40eb7059d 100644 --- a/drivers/gpu/drm/amd/display/dc/core/dc_stream.c +++ b/drivers/gpu/drm/amd/display/dc/core/dc_stream.c @@ -556,7 +556,7 @@ uint32_t dc_stream_get_vblank_counter(const struct dc_stream_state *stream) for (i = 0; i < MAX_PIPES; i++) { struct timing_generator *tg = res_ctx->pipe_ctx[i].stream_res.tg;
- if (res_ctx->pipe_ctx[i].stream != stream) + if (res_ctx->pipe_ctx[i].stream != stream || !tg) continue;
return tg->funcs->get_frame_count(tg); @@ -615,7 +615,7 @@ bool dc_stream_get_scanoutpos(const struct dc_stream_state *stream, for (i = 0; i < MAX_PIPES; i++) { struct timing_generator *tg = res_ctx->pipe_ctx[i].stream_res.tg;
- if (res_ctx->pipe_ctx[i].stream != stream) + if (res_ctx->pipe_ctx[i].stream != stream || !tg) continue;
tg->funcs->get_scanoutpos(tg,
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Douglas Anderson dianders@chromium.org
[ Upstream commit dd712d3d45807db9fcae28a522deee85c1f2fde6 ]
When entering kdb/kgdb on a kernel panic, it was be observed that the console isn't flushed before the `kdb` prompt came up. Specifically, when using the buddy lockup detector on arm64 and running: echo HARDLOCKUP > /sys/kernel/debug/provoke-crash/DIRECT
I could see: [ 26.161099] lkdtm: Performing direct entry HARDLOCKUP [ 32.499881] watchdog: Watchdog detected hard LOCKUP on cpu 6 [ 32.552865] Sending NMI from CPU 5 to CPUs 6: [ 32.557359] NMI backtrace for cpu 6 ... [backtrace for cpu 6] ... [ 32.558353] NMI backtrace for cpu 5 ... [backtrace for cpu 5] ... [ 32.867471] Sending NMI from CPU 5 to CPUs 0-4,7: [ 32.872321] NMI backtrace forP cpuANC: Hard LOCKUP
Entering kdb (current=..., pid 0) on processor 5 due to Keyboard Entry [5]kdb>
As you can see, backtraces for the other CPUs start printing and get interleaved with the kdb PANIC print.
Let's replicate the commands to flush the console in the kdb panic entry point to avoid this.
Signed-off-by: Douglas Anderson dianders@chromium.org Link: https://lore.kernel.org/r/20230822131945.1.I5b460ae8f954e4c4f628a373d6e74713... Signed-off-by: Daniel Thompson daniel.thompson@linaro.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/debug/debug_core.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/kernel/debug/debug_core.c b/kernel/debug/debug_core.c index d5e9ccde3ab8e..3a904d8697c8f 100644 --- a/kernel/debug/debug_core.c +++ b/kernel/debug/debug_core.c @@ -1006,6 +1006,9 @@ void kgdb_panic(const char *msg) if (panic_timeout) return;
+ debug_locks_off(); + console_flush_on_panic(CONSOLE_FLUSH_PENDING); + if (dbg_kdb_mode) kdb_printf("PANIC: %s\n", msg);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Deepak Gupta debug@rivosinc.com
[ Upstream commit be97d0db5f44c0674480cb79ac6f5b0529b84c76 ]
commit 31da94c25aea ("riscv: add VMAP_STACK overflow detection") added support for CONFIG_VMAP_STACK. If overflow is detected, CPU switches to `shadow_stack` temporarily before switching finally to per-cpu `overflow_stack`.
If two CPUs/harts are racing and end up in over flowing kernel stack, one or both will end up corrupting each other state because `shadow_stack` is not per-cpu. This patch optimizes per-cpu overflow stack switch by directly picking per-cpu `overflow_stack` and gets rid of `shadow_stack`.
Following are the changes in this patch
- Defines an asm macro to obtain per-cpu symbols in destination register. - In entry.S, when overflow is detected, per-cpu overflow stack is located using per-cpu asm macro. Computing per-cpu symbol requires a temporary register. x31 is saved away into CSR_SCRATCH (CSR_SCRATCH is anyways zero since we're in kernel).
Please see Links for additional relevant disccussion and alternative solution.
Tested by `echo EXHAUST_STACK > /sys/kernel/debug/provoke-crash/DIRECT` Kernel crash log below
Insufficient stack space to handle exception!/debug/provoke-crash/DIRECT Task stack: [0xff20000010a98000..0xff20000010a9c000] Overflow stack: [0xff600001f7d98370..0xff600001f7d99370] CPU: 1 PID: 205 Comm: bash Not tainted 6.1.0-rc2-00001-g328a1f96f7b9 #34 Hardware name: riscv-virtio,qemu (DT) epc : __memset+0x60/0xfc ra : recursive_loop+0x48/0xc6 [lkdtm] epc : ffffffff808de0e4 ra : ffffffff0163a752 sp : ff20000010a97e80 gp : ffffffff815c0330 tp : ff600000820ea280 t0 : ff20000010a97e88 t1 : 000000000000002e t2 : 3233206874706564 s0 : ff20000010a982b0 s1 : 0000000000000012 a0 : ff20000010a97e88 a1 : 0000000000000000 a2 : 0000000000000400 a3 : ff20000010a98288 a4 : 0000000000000000 a5 : 0000000000000000 a6 : fffffffffffe43f0 a7 : 00007fffffffffff s2 : ff20000010a97e88 s3 : ffffffff01644680 s4 : ff20000010a9be90 s5 : ff600000842ba6c0 s6 : 00aaaaaac29e42b0 s7 : 00fffffff0aa3684 s8 : 00aaaaaac2978040 s9 : 0000000000000065 s10: 00ffffff8a7cad10 s11: 00ffffff8a76a4e0 t3 : ffffffff815dbaf4 t4 : ffffffff815dbaf4 t5 : ffffffff815dbab8 t6 : ff20000010a9bb48 status: 0000000200000120 badaddr: ff20000010a97e88 cause: 000000000000000f Kernel panic - not syncing: Kernel stack overflow CPU: 1 PID: 205 Comm: bash Not tainted 6.1.0-rc2-00001-g328a1f96f7b9 #34 Hardware name: riscv-virtio,qemu (DT) Call Trace: [<ffffffff80006754>] dump_backtrace+0x30/0x38 [<ffffffff808de798>] show_stack+0x40/0x4c [<ffffffff808ea2a8>] dump_stack_lvl+0x44/0x5c [<ffffffff808ea2d8>] dump_stack+0x18/0x20 [<ffffffff808dec06>] panic+0x126/0x2fe [<ffffffff800065ea>] walk_stackframe+0x0/0xf0 [<ffffffff0163a752>] recursive_loop+0x48/0xc6 [lkdtm] SMP: stopping secondary CPUs ---[ end Kernel panic - not syncing: Kernel stack overflow ]---
Cc: Guo Ren guoren@kernel.org Cc: Jisheng Zhang jszhang@kernel.org Link: https://lore.kernel.org/linux-riscv/Y347B0x4VUNOd6V7@xhacker/T/#t Link: https://lore.kernel.org/lkml/20221124094845.1907443-1-debug@rivosinc.com/ Signed-off-by: Deepak Gupta debug@rivosinc.com Co-developed-by: Sami Tolvanen samitolvanen@google.com Signed-off-by: Sami Tolvanen samitolvanen@google.com Acked-by: Guo Ren guoren@kernel.org Tested-by: Nathan Chancellor nathan@kernel.org Link: https://lore.kernel.org/r/20230927224757.1154247-9-samitolvanen@google.com Signed-off-by: Palmer Dabbelt palmer@rivosinc.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/riscv/include/asm/asm-prototypes.h | 1 - arch/riscv/include/asm/asm.h | 22 ++++++++ arch/riscv/include/asm/thread_info.h | 3 -- arch/riscv/kernel/asm-offsets.c | 1 + arch/riscv/kernel/entry.S | 70 ++++--------------------- arch/riscv/kernel/traps.c | 36 +------------ 6 files changed, 34 insertions(+), 99 deletions(-)
diff --git a/arch/riscv/include/asm/asm-prototypes.h b/arch/riscv/include/asm/asm-prototypes.h index 61ba8ed43d8fe..36b955c762ba0 100644 --- a/arch/riscv/include/asm/asm-prototypes.h +++ b/arch/riscv/include/asm/asm-prototypes.h @@ -25,7 +25,6 @@ DECLARE_DO_ERROR_INFO(do_trap_ecall_s); DECLARE_DO_ERROR_INFO(do_trap_ecall_m); DECLARE_DO_ERROR_INFO(do_trap_break);
-asmlinkage unsigned long get_overflow_stack(void); asmlinkage void handle_bad_stack(struct pt_regs *regs); asmlinkage void do_page_fault(struct pt_regs *regs); asmlinkage void do_irq(struct pt_regs *regs); diff --git a/arch/riscv/include/asm/asm.h b/arch/riscv/include/asm/asm.h index 114bbadaef41e..bfb4c26f113c4 100644 --- a/arch/riscv/include/asm/asm.h +++ b/arch/riscv/include/asm/asm.h @@ -82,6 +82,28 @@ .endr .endm
+#ifdef CONFIG_SMP +#ifdef CONFIG_32BIT +#define PER_CPU_OFFSET_SHIFT 2 +#else +#define PER_CPU_OFFSET_SHIFT 3 +#endif + +.macro asm_per_cpu dst sym tmp + REG_L \tmp, TASK_TI_CPU_NUM(tp) + slli \tmp, \tmp, PER_CPU_OFFSET_SHIFT + la \dst, __per_cpu_offset + add \dst, \dst, \tmp + REG_L \tmp, 0(\dst) + la \dst, \sym + add \dst, \dst, \tmp +.endm +#else /* CONFIG_SMP */ +.macro asm_per_cpu dst sym tmp + la \dst, \sym +.endm +#endif /* CONFIG_SMP */ + /* save all GPs except x1 ~ x5 */ .macro save_from_x6_to_x31 REG_S x6, PT_T1(sp) diff --git a/arch/riscv/include/asm/thread_info.h b/arch/riscv/include/asm/thread_info.h index 1833beb00489c..d18ce0113ca1f 100644 --- a/arch/riscv/include/asm/thread_info.h +++ b/arch/riscv/include/asm/thread_info.h @@ -34,9 +34,6 @@
#ifndef __ASSEMBLY__
-extern long shadow_stack[SHADOW_OVERFLOW_STACK_SIZE / sizeof(long)]; -extern unsigned long spin_shadow_stack; - #include <asm/processor.h> #include <asm/csr.h>
diff --git a/arch/riscv/kernel/asm-offsets.c b/arch/riscv/kernel/asm-offsets.c index d6a75aac1d27a..9f535d5de33f9 100644 --- a/arch/riscv/kernel/asm-offsets.c +++ b/arch/riscv/kernel/asm-offsets.c @@ -39,6 +39,7 @@ void asm_offsets(void) OFFSET(TASK_TI_KERNEL_SP, task_struct, thread_info.kernel_sp); OFFSET(TASK_TI_USER_SP, task_struct, thread_info.user_sp);
+ OFFSET(TASK_TI_CPU_NUM, task_struct, thread_info.cpu); OFFSET(TASK_THREAD_F0, task_struct, thread.fstate.f[0]); OFFSET(TASK_THREAD_F1, task_struct, thread.fstate.f[1]); OFFSET(TASK_THREAD_F2, task_struct, thread.fstate.f[2]); diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S index 143a2bb3e6976..3d11aa3af105e 100644 --- a/arch/riscv/kernel/entry.S +++ b/arch/riscv/kernel/entry.S @@ -10,9 +10,11 @@ #include <asm/asm.h> #include <asm/csr.h> #include <asm/unistd.h> +#include <asm/page.h> #include <asm/thread_info.h> #include <asm/asm-offsets.h> #include <asm/errata_list.h> +#include <linux/sizes.h>
SYM_CODE_START(handle_exception) /* @@ -170,67 +172,15 @@ SYM_CODE_END(ret_from_exception)
#ifdef CONFIG_VMAP_STACK SYM_CODE_START_LOCAL(handle_kernel_stack_overflow) - /* - * Takes the psuedo-spinlock for the shadow stack, in case multiple - * harts are concurrently overflowing their kernel stacks. We could - * store any value here, but since we're overflowing the kernel stack - * already we only have SP to use as a scratch register. So we just - * swap in the address of the spinlock, as that's definately non-zero. - * - * Pairs with a store_release in handle_bad_stack(). - */ -1: la sp, spin_shadow_stack - REG_AMOSWAP_AQ sp, sp, (sp) - bnez sp, 1b - - la sp, shadow_stack - addi sp, sp, SHADOW_OVERFLOW_STACK_SIZE - - //save caller register to shadow stack - addi sp, sp, -(PT_SIZE_ON_STACK) - REG_S x1, PT_RA(sp) - REG_S x5, PT_T0(sp) - REG_S x6, PT_T1(sp) - REG_S x7, PT_T2(sp) - REG_S x10, PT_A0(sp) - REG_S x11, PT_A1(sp) - REG_S x12, PT_A2(sp) - REG_S x13, PT_A3(sp) - REG_S x14, PT_A4(sp) - REG_S x15, PT_A5(sp) - REG_S x16, PT_A6(sp) - REG_S x17, PT_A7(sp) - REG_S x28, PT_T3(sp) - REG_S x29, PT_T4(sp) - REG_S x30, PT_T5(sp) - REG_S x31, PT_T6(sp) - - la ra, restore_caller_reg - tail get_overflow_stack - -restore_caller_reg: - //save per-cpu overflow stack - REG_S a0, -8(sp) - //restore caller register from shadow_stack - REG_L x1, PT_RA(sp) - REG_L x5, PT_T0(sp) - REG_L x6, PT_T1(sp) - REG_L x7, PT_T2(sp) - REG_L x10, PT_A0(sp) - REG_L x11, PT_A1(sp) - REG_L x12, PT_A2(sp) - REG_L x13, PT_A3(sp) - REG_L x14, PT_A4(sp) - REG_L x15, PT_A5(sp) - REG_L x16, PT_A6(sp) - REG_L x17, PT_A7(sp) - REG_L x28, PT_T3(sp) - REG_L x29, PT_T4(sp) - REG_L x30, PT_T5(sp) - REG_L x31, PT_T6(sp) + /* we reach here from kernel context, sscratch must be 0 */ + csrrw x31, CSR_SCRATCH, x31 + asm_per_cpu sp, overflow_stack, x31 + li x31, OVERFLOW_STACK_SIZE + add sp, sp, x31 + /* zero out x31 again and restore x31 */ + xor x31, x31, x31 + csrrw x31, CSR_SCRATCH, x31
- //load per-cpu overflow stack - REG_L sp, -8(sp) addi sp, sp, -(PT_SIZE_ON_STACK)
//save context to overflow stack diff --git a/arch/riscv/kernel/traps.c b/arch/riscv/kernel/traps.c index cd6f10c73a163..061117b8a3438 100644 --- a/arch/riscv/kernel/traps.c +++ b/arch/riscv/kernel/traps.c @@ -408,48 +408,14 @@ int is_valid_bugaddr(unsigned long pc) #endif /* CONFIG_GENERIC_BUG */
#ifdef CONFIG_VMAP_STACK -/* - * Extra stack space that allows us to provide panic messages when the kernel - * has overflowed its stack. - */ -static DEFINE_PER_CPU(unsigned long [OVERFLOW_STACK_SIZE/sizeof(long)], +DEFINE_PER_CPU(unsigned long [OVERFLOW_STACK_SIZE/sizeof(long)], overflow_stack)__aligned(16); -/* - * A temporary stack for use by handle_kernel_stack_overflow. This is used so - * we can call into C code to get the per-hart overflow stack. Usage of this - * stack must be protected by spin_shadow_stack. - */ -long shadow_stack[SHADOW_OVERFLOW_STACK_SIZE/sizeof(long)] __aligned(16); - -/* - * A pseudo spinlock to protect the shadow stack from being used by multiple - * harts concurrently. This isn't a real spinlock because the lock side must - * be taken without a valid stack and only a single register, it's only taken - * while in the process of panicing anyway so the performance and error - * checking a proper spinlock gives us doesn't matter. - */ -unsigned long spin_shadow_stack; - -asmlinkage unsigned long get_overflow_stack(void) -{ - return (unsigned long)this_cpu_ptr(overflow_stack) + - OVERFLOW_STACK_SIZE; -}
asmlinkage void handle_bad_stack(struct pt_regs *regs) { unsigned long tsk_stk = (unsigned long)current->stack; unsigned long ovf_stk = (unsigned long)this_cpu_ptr(overflow_stack);
- /* - * We're done with the shadow stack by this point, as we're on the - * overflow stack. Tell any other concurrent overflowing harts that - * they can proceed with panicing by releasing the pseudo-spinlock. - * - * This pairs with an amoswap.aq in handle_kernel_stack_overflow. - */ - smp_store_release(&spin_shadow_stack, 0); - console_verbose();
pr_emerg("Insufficient stack space to handle exception!\n");
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Philipp Stanner pstanner@redhat.com
[ Upstream commit cc9c54232f04aef3a5d7f64a0ece7df00f1aaa3d ]
i2c-dev.c utilizes memdup_user() to copy a userspace array. This is done without an overflow check.
Use the new wrapper memdup_array_user() to copy the array more safely.
Suggested-by: Dave Airlie airlied@redhat.com Signed-off-by: Philipp Stanner pstanner@redhat.com Signed-off-by: Wolfram Sang wsa@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/i2c/i2c-dev.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/i2c/i2c-dev.c b/drivers/i2c/i2c-dev.c index a01b59e3599b5..7d337380a05d9 100644 --- a/drivers/i2c/i2c-dev.c +++ b/drivers/i2c/i2c-dev.c @@ -450,8 +450,8 @@ static long i2cdev_ioctl(struct file *file, unsigned int cmd, unsigned long arg) if (rdwr_arg.nmsgs > I2C_RDWR_IOCTL_MAX_MSGS) return -EINVAL;
- rdwr_pa = memdup_user(rdwr_arg.msgs, - rdwr_arg.nmsgs * sizeof(struct i2c_msg)); + rdwr_pa = memdup_array_user(rdwr_arg.msgs, + rdwr_arg.nmsgs, sizeof(struct i2c_msg)); if (IS_ERR(rdwr_pa)) return PTR_ERR(rdwr_pa);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tony Lindgren tony@atomide.com
[ Upstream commit fbb74e56378d8306f214658e3d525a8b3f000c5a ]
We need to check for an active device as otherwise we get warnings for some mcbsp instances for "Runtime PM usage count underflow!".
Reported-by: Andreas Kemnade andreas@kemnade.info Signed-off-by: Tony Lindgren tony@atomide.com Link: https://lore.kernel.org/r/20231030052340.13415-1-tony@atomide.com Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/ti/omap-mcbsp.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/sound/soc/ti/omap-mcbsp.c b/sound/soc/ti/omap-mcbsp.c index 21fa7b9787997..94c514e57eef9 100644 --- a/sound/soc/ti/omap-mcbsp.c +++ b/sound/soc/ti/omap-mcbsp.c @@ -74,14 +74,16 @@ static int omap2_mcbsp_set_clks_src(struct omap_mcbsp *mcbsp, u8 fck_src_id) return -EINVAL; }
- pm_runtime_put_sync(mcbsp->dev); + if (mcbsp->active) + pm_runtime_put_sync(mcbsp->dev);
r = clk_set_parent(mcbsp->fclk, fck_src); if (r) dev_err(mcbsp->dev, "CLKS: could not clk_set_parent() to %s\n", src);
- pm_runtime_get_sync(mcbsp->dev); + if (mcbsp->active) + pm_runtime_get_sync(mcbsp->dev);
clk_put(fck_src);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Zongmin Zhou zhouzongmin@kylinos.cn
[ Upstream commit 0e8b9f258baed25f1c5672613699247c76b007b5 ]
The allocated memory for qdev->dumb_heads should be released in qxl_destroy_monitors_object before qxl suspend. otherwise,qxl_create_monitors_object will be called to reallocate memory for qdev->dumb_heads after qxl resume, it will cause memory leak.
Signed-off-by: Zongmin Zhou zhouzongmin@kylinos.cn Link: https://lore.kernel.org/r/20230801025309.4049813-1-zhouzongmin@kylinos.cn Reviewed-by: Dave Airlie airlied@redhat.com Signed-off-by: Maxime Ripard mripard@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/qxl/qxl_display.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/gpu/drm/qxl/qxl_display.c b/drivers/gpu/drm/qxl/qxl_display.c index 6492a70e3c396..404b0483bb7cb 100644 --- a/drivers/gpu/drm/qxl/qxl_display.c +++ b/drivers/gpu/drm/qxl/qxl_display.c @@ -1229,6 +1229,9 @@ int qxl_destroy_monitors_object(struct qxl_device *qdev) if (!qdev->monitors_config_bo) return 0;
+ kfree(qdev->dumb_heads); + qdev->dumb_heads = NULL; + qdev->monitors_config = NULL; qdev->ram_header->monitors_config = 0;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alex Spataru alex_spataru@outlook.com
[ Upstream commit 26fd31ef9c02a5e91cdb8eea127b056bd7cf0b3b ]
Enables the SPI-connected CSC35L41 audio amplifier for this laptop model.
As of BIOS version 303 it's still necessary to modify the ACPI table to add the related _DSD properties: https://github.com/alex-spataru/asus_zenbook_ux7602zm_sound/
Signed-off-by: Alex Spataru alex_spataru@outlook.com Link: https://lore.kernel.org/r/DS7PR07MB7621BB5BB14F5473D181624CE3A4A@DS7PR07MB76... Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Sasha Levin sashal@kernel.org --- sound/pci/hda/patch_realtek.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/sound/pci/hda/patch_realtek.c b/sound/pci/hda/patch_realtek.c index 7f1d79f450a2a..fe9a29f358792 100644 --- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -9798,6 +9798,7 @@ static const struct snd_pci_quirk alc269_fixup_tbl[] = { SND_PCI_QUIRK(0x1043, 0x1d4e, "ASUS TM420", ALC256_FIXUP_ASUS_HPE), SND_PCI_QUIRK(0x1043, 0x1e02, "ASUS UX3402ZA", ALC245_FIXUP_CS35L41_SPI_2), SND_PCI_QUIRK(0x1043, 0x16a3, "ASUS UX3402VA", ALC245_FIXUP_CS35L41_SPI_2), + SND_PCI_QUIRK(0x1043, 0x1f62, "ASUS UX7602ZM", ALC245_FIXUP_CS35L41_SPI_2), SND_PCI_QUIRK(0x1043, 0x1e11, "ASUS Zephyrus G15", ALC289_FIXUP_ASUS_GA502), SND_PCI_QUIRK(0x1043, 0x1e12, "ASUS UM3402", ALC287_FIXUP_CS35L41_I2C_2), SND_PCI_QUIRK(0x1043, 0x1e51, "ASUS Zephyrus M15", ALC294_FIXUP_ASUS_GU502_PINS),
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Vitaly Prosyak vitaly.prosyak@amd.com
[ Upstream commit 4638e0c29a3f2294d5de0d052a4b8c9f33ccb957 ]
When software 'pci unplug' using IGT is executed we got a sysfs directory entry is NULL for differant ras blocks like hdp, umc, etc. Before call 'sysfs_remove_file_from_group' and 'sysfs_remove_group' check that 'sd' is not NULL.
[ +0.000001] RIP: 0010:sysfs_remove_group+0x83/0x90 [ +0.000002] Code: 31 c0 31 d2 31 f6 31 ff e9 9a a8 b4 00 4c 89 e7 e8 f2 a2 ff ff eb c2 49 8b 55 00 48 8b 33 48 c7 c7 80 65 94 82 e8 cd 82 bb ff <0f> 0b eb cc 66 0f 1f 84 00 00 00 00 00 90 90 90 90 90 90 90 90 90 [ +0.000001] RSP: 0018:ffffc90002067c90 EFLAGS: 00010246 [ +0.000002] RAX: 0000000000000000 RBX: ffffffff824ea180 RCX: 0000000000000000 [ +0.000001] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 [ +0.000001] RBP: ffffc90002067ca8 R08: 0000000000000000 R09: 0000000000000000 [ +0.000001] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 [ +0.000001] R13: ffff88810a395f48 R14: ffff888101aab0d0 R15: 0000000000000000 [ +0.000001] FS: 00007f5ddaa43a00(0000) GS:ffff88841e800000(0000) knlGS:0000000000000000 [ +0.000002] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ +0.000001] CR2: 00007f8ffa61ba50 CR3: 0000000106432000 CR4: 0000000000350ef0 [ +0.000001] Call Trace: [ +0.000001] <TASK> [ +0.000001] ? show_regs+0x72/0x90 [ +0.000002] ? sysfs_remove_group+0x83/0x90 [ +0.000002] ? __warn+0x8d/0x160 [ +0.000001] ? sysfs_remove_group+0x83/0x90 [ +0.000001] ? report_bug+0x1bb/0x1d0 [ +0.000003] ? handle_bug+0x46/0x90 [ +0.000001] ? exc_invalid_op+0x19/0x80 [ +0.000002] ? asm_exc_invalid_op+0x1b/0x20 [ +0.000003] ? sysfs_remove_group+0x83/0x90 [ +0.000001] dpm_sysfs_remove+0x61/0x70 [ +0.000002] device_del+0xa3/0x3d0 [ +0.000002] ? ktime_get_mono_fast_ns+0x46/0xb0 [ +0.000002] device_unregister+0x18/0x70 [ +0.000001] i2c_del_adapter+0x26d/0x330 [ +0.000002] arcturus_i2c_control_fini+0x25/0x50 [amdgpu] [ +0.000236] smu_sw_fini+0x38/0x260 [amdgpu] [ +0.000241] amdgpu_device_fini_sw+0x116/0x670 [amdgpu] [ +0.000186] ? mutex_lock+0x13/0x50 [ +0.000003] amdgpu_driver_release_kms+0x16/0x40 [amdgpu] [ +0.000192] drm_minor_release+0x4f/0x80 [drm] [ +0.000025] drm_release+0xfe/0x150 [drm] [ +0.000027] __fput+0x9f/0x290 [ +0.000002] ____fput+0xe/0x20 [ +0.000002] task_work_run+0x61/0xa0 [ +0.000002] exit_to_user_mode_prepare+0x150/0x170 [ +0.000002] syscall_exit_to_user_mode+0x2a/0x50
Cc: Hawking Zhang hawking.zhang@amd.com Cc: Luben Tuikov luben.tuikov@amd.com Cc: Alex Deucher alexander.deucher@amd.com Cc: Christian Koenig christian.koenig@amd.com Signed-off-by: Vitaly Prosyak vitaly.prosyak@amd.com Reviewed-by: Luben Tuikov luben.tuikov@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c index 7d5019a884024..2003be3390aab 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c @@ -1380,7 +1380,8 @@ static void amdgpu_ras_sysfs_remove_bad_page_node(struct amdgpu_device *adev) { struct amdgpu_ras *con = amdgpu_ras_get_context(adev);
- sysfs_remove_file_from_group(&adev->dev->kobj, + if (adev->dev->kobj.sd) + sysfs_remove_file_from_group(&adev->dev->kobj, &con->badpages_attr.attr, RAS_FS_NAME); } @@ -1397,7 +1398,8 @@ static int amdgpu_ras_sysfs_remove_feature_node(struct amdgpu_device *adev) .attrs = attrs, };
- sysfs_remove_group(&adev->dev->kobj, &group); + if (adev->dev->kobj.sd) + sysfs_remove_group(&adev->dev->kobj, &group);
return 0; } @@ -1444,7 +1446,8 @@ int amdgpu_ras_sysfs_remove(struct amdgpu_device *adev, if (!obj || !obj->attr_inuse) return -EINVAL;
- sysfs_remove_file_from_group(&adev->dev->kobj, + if (adev->dev->kobj.sd) + sysfs_remove_file_from_group(&adev->dev->kobj, &obj->sysfs_attr.attr, RAS_FS_NAME); obj->attr_inuse = 0;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dan Carpenter dan.carpenter@linaro.org
[ Upstream commit d27abbfd4888d79dd24baf50e774631046ac4732 ]
These enums are passed to set/test_bit(). The set/test_bit() functions take a bit number instead of a shifted value. Passing a shifted value is a double shift bug like doing BIT(BIT(1)). The double shift bug doesn't cause a problem here because we are only checking 0 and 1 but if the value was 5 or above then it can lead to a buffer overflow.
Signed-off-by: Dan Carpenter dan.carpenter@linaro.org Reviewed-by: Uwe Kleine-König u.kleine-koenig@pengutronix.de Reviewed-by: Sam Protsenko semen.protsenko@linaro.org Signed-off-by: Thierry Reding thierry.reding@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/pwm.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/include/linux/pwm.h b/include/linux/pwm.h index 04ae1d9073a74..0755ba9938f74 100644 --- a/include/linux/pwm.h +++ b/include/linux/pwm.h @@ -41,8 +41,8 @@ struct pwm_args { };
enum { - PWMF_REQUESTED = 1 << 0, - PWMF_EXPORTED = 1 << 1, + PWMF_REQUESTED = 0, + PWMF_EXPORTED = 1, };
/*
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yi Yang yiyang13@huawei.com
[ Upstream commit 0a1166c27d4e53186e6bf9147ea6db9cd1d65847 ]
Add the missing check for platform_get_irq() and return error code if it fails.
Fixes: d7d9f8ec77fe ("mtd: rawnand: add NVIDIA Tegra NAND Flash controller driver") Signed-off-by: Yi Yang yiyang13@huawei.com Signed-off-by: Miquel Raynal miquel.raynal@bootlin.com Link: https://lore.kernel.org/linux-mtd/20230821084046.217025-1-yiyang13@huawei.co... Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/mtd/nand/raw/tegra_nand.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/drivers/mtd/nand/raw/tegra_nand.c b/drivers/mtd/nand/raw/tegra_nand.c index eb0b9d16e8dae..a553e3ac8ff41 100644 --- a/drivers/mtd/nand/raw/tegra_nand.c +++ b/drivers/mtd/nand/raw/tegra_nand.c @@ -1197,6 +1197,10 @@ static int tegra_nand_probe(struct platform_device *pdev) init_completion(&ctrl->dma_complete);
ctrl->irq = platform_get_irq(pdev, 0); + if (ctrl->irq < 0) { + err = ctrl->irq; + goto err_put_pm; + } err = devm_request_irq(&pdev->dev, ctrl->irq, tegra_nand_irq, 0, dev_name(&pdev->dev), ctrl); if (err) {
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Miri Korenblit miriam.rachel.korenblit@intel.com
[ Upstream commit 499d02790495958506a64f37ceda7e97345a50a8 ]
Currently we are setting the rate in the tx cmd for mgmt frames (e.g. during connection establishment). This was problematic when sending mgmt frames in eSR mode, as we don't know what link this frame will be sent on (This is decided by the FW), so we don't know what is the lowest rate. Fix this by not setting the rate in tx cmd and rely on FW to choose the right one. Set rate only for injected frames with fixed rate, or when no sta is given. Also set for important frames (EAPOL etc.) the High Priority flag.
Fixes: 055b22e770dd ("iwlwifi: mvm: Set Tx rate and flags when there is not station") Signed-off-by: Miri Korenblit miriam.rachel.korenblit@intel.com Signed-off-by: Gregory Greenman gregory.greenman@intel.com Link: https://lore.kernel.org/r/20230913145231.6c7e59620ee0.I6eaed3ccdd6dd62b9e664... Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/intel/iwlwifi/mvm/tx.c | 14 +++++++++----- 1 file changed, 9 insertions(+), 5 deletions(-)
diff --git a/drivers/net/wireless/intel/iwlwifi/mvm/tx.c b/drivers/net/wireless/intel/iwlwifi/mvm/tx.c index 2ede69132fee9..177a4628a913e 100644 --- a/drivers/net/wireless/intel/iwlwifi/mvm/tx.c +++ b/drivers/net/wireless/intel/iwlwifi/mvm/tx.c @@ -536,16 +536,20 @@ iwl_mvm_set_tx_params(struct iwl_mvm *mvm, struct sk_buff *skb, flags |= IWL_TX_FLAGS_ENCRYPT_DIS;
/* - * For data packets rate info comes from the fw. Only - * set rate/antenna during connection establishment or in case - * no station is given. + * For data and mgmt packets rate info comes from the fw. Only + * set rate/antenna for injected frames with fixed rate, or + * when no sta is given. */ - if (!sta || !ieee80211_is_data(hdr->frame_control) || - mvmsta->sta_state < IEEE80211_STA_AUTHORIZED) { + if (unlikely(!sta || + info->control.flags & IEEE80211_TX_CTRL_RATE_INJECT)) { flags |= IWL_TX_FLAGS_CMD_RATE; rate_n_flags = iwl_mvm_get_tx_rate_n_flags(mvm, info, sta, hdr->frame_control); + } else if (!ieee80211_is_data(hdr->frame_control) || + mvmsta->sta_state < IEEE80211_STA_AUTHORIZED) { + /* These are important frames */ + flags |= IWL_TX_FLAGS_HIGH_PRI; }
if (mvm->trans->trans_cfg->device_family >=
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Finn Thain fthain@linux-m68k.org
[ Upstream commit 87c3a5893e865739ce78aa7192d36011022e0af7 ]
Except on x86, preempt_count is always accessed with READ_ONCE(). Repeated invocations in macros like irq_count() produce repeated loads. These redundant instructions appear in various fast paths. In the one shown below, for example, irq_count() is evaluated during kernel entry if !tick_nohz_full_cpu(smp_processor_id()).
0001ed0a <irq_enter_rcu>: 1ed0a: 4e56 0000 linkw %fp,#0 1ed0e: 200f movel %sp,%d0 1ed10: 0280 ffff e000 andil #-8192,%d0 1ed16: 2040 moveal %d0,%a0 1ed18: 2028 0008 movel %a0@(8),%d0 1ed1c: 0680 0001 0000 addil #65536,%d0 1ed22: 2140 0008 movel %d0,%a0@(8) 1ed26: 082a 0001 000f btst #1,%a2@(15) 1ed2c: 670c beqs 1ed3a <irq_enter_rcu+0x30> 1ed2e: 2028 0008 movel %a0@(8),%d0 1ed32: 2028 0008 movel %a0@(8),%d0 1ed36: 2028 0008 movel %a0@(8),%d0 1ed3a: 4e5e unlk %fp 1ed3c: 4e75 rts
This patch doesn't prevent the pointless btst and beqs instructions above, but it does eliminate 2 of the 3 pointless move instructions here and elsewhere.
On x86, preempt_count is per-cpu data and the problem does not arise presumably because the compiler is free to optimize more effectively.
This patch was tested on m68k and x86. I was expecting no changes to object code for x86 and mostly that's what I saw. However, there were a few places where code generation was perturbed for some reason.
The performance issue addressed here is minor on uniprocessor m68k. I got a 0.01% improvement from this patch for a simple "find /sys -false" benchmark. For architectures and workloads susceptible to cache line bounce the improvement is expected to be larger. The only SMP architecture I have is x86, and as x86 unaffected I have not done any further measurements.
Fixes: 15115830c887 ("preempt: Cleanup the macro maze a bit") Signed-off-by: Finn Thain fthain@linux-m68k.org Signed-off-by: Ingo Molnar mingo@kernel.org Link: https://lore.kernel.org/r/0a403120a682a525e6db2d81d1a3ffcc137c3742.169475683... Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/preempt.h | 15 +++++++++++++-- 1 file changed, 13 insertions(+), 2 deletions(-)
diff --git a/include/linux/preempt.h b/include/linux/preempt.h index 1424670df161d..9aa6358a1a16b 100644 --- a/include/linux/preempt.h +++ b/include/linux/preempt.h @@ -99,14 +99,21 @@ static __always_inline unsigned char interrupt_context_level(void) return level; }
+/* + * These macro definitions avoid redundant invocations of preempt_count() + * because such invocations would result in redundant loads given that + * preempt_count() is commonly implemented with READ_ONCE(). + */ + #define nmi_count() (preempt_count() & NMI_MASK) #define hardirq_count() (preempt_count() & HARDIRQ_MASK) #ifdef CONFIG_PREEMPT_RT # define softirq_count() (current->softirq_disable_cnt & SOFTIRQ_MASK) +# define irq_count() ((preempt_count() & (NMI_MASK | HARDIRQ_MASK)) | softirq_count()) #else # define softirq_count() (preempt_count() & SOFTIRQ_MASK) +# define irq_count() (preempt_count() & (NMI_MASK | HARDIRQ_MASK | SOFTIRQ_MASK)) #endif -#define irq_count() (nmi_count() | hardirq_count() | softirq_count())
/* * Macros to retrieve the current execution context: @@ -119,7 +126,11 @@ static __always_inline unsigned char interrupt_context_level(void) #define in_nmi() (nmi_count()) #define in_hardirq() (hardirq_count()) #define in_serving_softirq() (softirq_count() & SOFTIRQ_OFFSET) -#define in_task() (!(in_nmi() | in_hardirq() | in_serving_softirq())) +#ifdef CONFIG_PREEMPT_RT +# define in_task() (!((preempt_count() & (NMI_MASK | HARDIRQ_MASK)) | in_serving_softirq())) +#else +# define in_task() (!(preempt_count() & (NMI_MASK | HARDIRQ_MASK | SOFTIRQ_OFFSET))) +#endif
/* * The following macros are deprecated and should not be used in new code:
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jinghao Jia jinghao@linux.ibm.com
[ Upstream commit 0ee352fe0d28015cab161b04d202fa3231c0ba3b ]
The variable name num_progs causes confusion because that variable really controls the number of rounds the test should be executed.
Rename num_progs into nr_tests for the sake of clarity.
Signed-off-by: Jinghao Jia jinghao@linux.ibm.com Signed-off-by: Ruowen Qin ruowenq2@illinois.edu Signed-off-by: Jinghao Jia jinghao7@illinois.edu Link: https://lore.kernel.org/r/20230917214220.637721-3-jinghao7@illinois.edu Signed-off-by: Alexei Starovoitov ast@kernel.org Stable-dep-of: 9220c3ef6fef ("samples/bpf: syscall_tp_user: Fix array out-of-bound access") Signed-off-by: Sasha Levin sashal@kernel.org --- samples/bpf/syscall_tp_user.c | 24 ++++++++++++------------ 1 file changed, 12 insertions(+), 12 deletions(-)
diff --git a/samples/bpf/syscall_tp_user.c b/samples/bpf/syscall_tp_user.c index 7a788bb837fc1..18c94c7e8a40e 100644 --- a/samples/bpf/syscall_tp_user.c +++ b/samples/bpf/syscall_tp_user.c @@ -17,9 +17,9 @@
static void usage(const char *cmd) { - printf("USAGE: %s [-i num_progs] [-h]\n", cmd); - printf(" -i num_progs # number of progs of the test\n"); - printf(" -h # help\n"); + printf("USAGE: %s [-i nr_tests] [-h]\n", cmd); + printf(" -i nr_tests # rounds of test to run\n"); + printf(" -h # help\n"); }
static void verify_map(int map_id) @@ -45,14 +45,14 @@ static void verify_map(int map_id) } }
-static int test(char *filename, int num_progs) +static int test(char *filename, int nr_tests) { - int map0_fds[num_progs], map1_fds[num_progs], fd, i, j = 0; - struct bpf_link *links[num_progs * 4]; - struct bpf_object *objs[num_progs]; + int map0_fds[nr_tests], map1_fds[nr_tests], fd, i, j = 0; + struct bpf_link *links[nr_tests * 4]; + struct bpf_object *objs[nr_tests]; struct bpf_program *prog;
- for (i = 0; i < num_progs; i++) { + for (i = 0; i < nr_tests; i++) { objs[i] = bpf_object__open_file(filename, NULL); if (libbpf_get_error(objs[i])) { fprintf(stderr, "opening BPF object file failed\n"); @@ -101,7 +101,7 @@ static int test(char *filename, int num_progs) close(fd);
/* verify the map */ - for (i = 0; i < num_progs; i++) { + for (i = 0; i < nr_tests; i++) { verify_map(map0_fds[i]); verify_map(map1_fds[i]); } @@ -117,13 +117,13 @@ static int test(char *filename, int num_progs)
int main(int argc, char **argv) { - int opt, num_progs = 1; + int opt, nr_tests = 1; char filename[256];
while ((opt = getopt(argc, argv, "i:h")) != -1) { switch (opt) { case 'i': - num_progs = atoi(optarg); + nr_tests = atoi(optarg); break; case 'h': default: @@ -134,5 +134,5 @@ int main(int argc, char **argv)
snprintf(filename, sizeof(filename), "%s_kern.o", argv[0]);
- return test(filename, num_progs); + return test(filename, nr_tests); }
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jinghao Jia jinghao@linux.ibm.com
[ Upstream commit 9220c3ef6fefbf18f24aeedb1142a642b3de0596 ]
Commit 06744f24696e ("samples/bpf: Add openat2() enter/exit tracepoint to syscall_tp sample") added two more eBPF programs to support the openat2() syscall. However, it did not increase the size of the array that holds the corresponding bpf_links. This leads to an out-of-bound access on that array in the bpf_object__for_each_program loop and could corrupt other variables on the stack. On our testing QEMU, it corrupts the map1_fds array and causes the sample to fail:
# ./syscall_tp prog #0: map ids 4 5 verify map:4 val: 5 map_lookup failed: Bad file descriptor
Dynamically allocate the array based on the number of programs reported by libbpf to prevent similar inconsistencies in the future
Fixes: 06744f24696e ("samples/bpf: Add openat2() enter/exit tracepoint to syscall_tp sample") Signed-off-by: Jinghao Jia jinghao@linux.ibm.com Signed-off-by: Ruowen Qin ruowenq2@illinois.edu Signed-off-by: Jinghao Jia jinghao7@illinois.edu Link: https://lore.kernel.org/r/20230917214220.637721-4-jinghao7@illinois.edu Signed-off-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- samples/bpf/syscall_tp_user.c | 23 ++++++++++++++++++++--- 1 file changed, 20 insertions(+), 3 deletions(-)
diff --git a/samples/bpf/syscall_tp_user.c b/samples/bpf/syscall_tp_user.c index 18c94c7e8a40e..7a09ac74fac07 100644 --- a/samples/bpf/syscall_tp_user.c +++ b/samples/bpf/syscall_tp_user.c @@ -48,7 +48,7 @@ static void verify_map(int map_id) static int test(char *filename, int nr_tests) { int map0_fds[nr_tests], map1_fds[nr_tests], fd, i, j = 0; - struct bpf_link *links[nr_tests * 4]; + struct bpf_link **links = NULL; struct bpf_object *objs[nr_tests]; struct bpf_program *prog;
@@ -60,6 +60,19 @@ static int test(char *filename, int nr_tests) goto cleanup; }
+ /* One-time initialization */ + if (!links) { + int nr_progs = 0; + + bpf_object__for_each_program(prog, objs[i]) + nr_progs += 1; + + links = calloc(nr_progs * nr_tests, sizeof(struct bpf_link *)); + + if (!links) + goto cleanup; + } + /* load BPF program */ if (bpf_object__load(objs[i])) { fprintf(stderr, "loading BPF object file failed\n"); @@ -107,8 +120,12 @@ static int test(char *filename, int nr_tests) }
cleanup: - for (j--; j >= 0; j--) - bpf_link__destroy(links[j]); + if (links) { + for (j--; j >= 0; j--) + bpf_link__destroy(links[j]); + + free(links); + }
for (i--; i >= 0; i--) bpf_object__close(objs[i]);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Krzysztof Kozlowski krzysztof.kozlowski@linaro.org
[ Upstream commit 42851dfd4dbe38e34724a00063a9fad5cfc48dcd ]
The regular expression pattern for matching serial node children should accept only nodes starting and ending with the set of words: bluetooth, gnss, gps or mcu. Add missing brackets to enforce such matching.
Fixes: 0c559bc8abfb ("dt-bindings: serial: restrict possible child node names") Reported-by: Andreas Kemnade andreas@kemnade.info Closes: https://lore.kernel.org/all/20231004170021.36b32465@aktux/ Signed-off-by: Krzysztof Kozlowski krzysztof.kozlowski@linaro.org Acked-by: Rob Herring robh@kernel.org Link: https://lore.kernel.org/r/20231005093247.128166-1-krzysztof.kozlowski@linaro... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- Documentation/devicetree/bindings/serial/serial.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/Documentation/devicetree/bindings/serial/serial.yaml b/Documentation/devicetree/bindings/serial/serial.yaml index ea277560a5966..5727bd549deca 100644 --- a/Documentation/devicetree/bindings/serial/serial.yaml +++ b/Documentation/devicetree/bindings/serial/serial.yaml @@ -96,7 +96,7 @@ then: rts-gpios: false
patternProperties: - "^bluetooth|gnss|gps|mcu$": + "^(bluetooth|gnss|gps|mcu)$": if: type: object then:
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Trond Myklebust trond.myklebust@hammerspace.com
[ Upstream commit 4b09ca1508a60be30b2e3940264e93d7aeb5c97e ]
If connect() is returning ECONNRESET, it usually means that nothing is listening on that port. If so, a rebind might be required in order to obtain the new port on which the RPC service is listening.
Fixes: fd01b2597941 ("SUNRPC: ECONNREFUSED should cause a rebind.") Signed-off-by: Trond Myklebust trond.myklebust@hammerspace.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/sunrpc/clnt.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c index 9fb0ccabc1a26..2e3b7b2c1b431 100644 --- a/net/sunrpc/clnt.c +++ b/net/sunrpc/clnt.c @@ -2169,6 +2169,7 @@ call_connect_status(struct rpc_task *task) task->tk_status = 0; switch (status) { case -ECONNREFUSED: + case -ECONNRESET: /* A positive refusal suggests a rebind is needed. */ if (RPC_IS_SOFTCONN(task)) break; @@ -2177,7 +2178,6 @@ call_connect_status(struct rpc_task *task) goto out_retry; } fallthrough; - case -ECONNRESET: case -ECONNABORTED: case -ENETDOWN: case -ENETUNREACH:
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yi Yang yiyang13@huawei.com
[ Upstream commit 74ac5b5e2375f1e8ef797ac7770887e9969f2516 ]
devm_kasprintf() returns a pointer to dynamically allocated memory which can be NULL upon failure. Ensure the allocation was successful by checking the pointer validity.
Fixes: 0b1039f016e8 ("mtd: rawnand: Add NAND controller support on Intel LGM SoC") Signed-off-by: Yi Yang yiyang13@huawei.com Signed-off-by: Miquel Raynal miquel.raynal@bootlin.com Link: https://lore.kernel.org/linux-mtd/20231019065537.318391-1-yiyang13@huawei.co... Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/mtd/nand/raw/intel-nand-controller.c | 10 ++++++++++ 1 file changed, 10 insertions(+)
diff --git a/drivers/mtd/nand/raw/intel-nand-controller.c b/drivers/mtd/nand/raw/intel-nand-controller.c index a9909eb081244..8231e9828dce7 100644 --- a/drivers/mtd/nand/raw/intel-nand-controller.c +++ b/drivers/mtd/nand/raw/intel-nand-controller.c @@ -619,6 +619,11 @@ static int ebu_nand_probe(struct platform_device *pdev) ebu_host->cs_num = cs;
resname = devm_kasprintf(dev, GFP_KERNEL, "nand_cs%d", cs); + if (!resname) { + ret = -ENOMEM; + goto err_of_node_put; + } + ebu_host->cs[cs].chipaddr = devm_platform_ioremap_resource_byname(pdev, resname); if (IS_ERR(ebu_host->cs[cs].chipaddr)) { @@ -655,6 +660,11 @@ static int ebu_nand_probe(struct platform_device *pdev) }
resname = devm_kasprintf(dev, GFP_KERNEL, "addr_sel%d", cs); + if (!resname) { + ret = -ENOMEM; + goto err_cleanup_dma; + } + res = platform_get_resource_byname(pdev, IORESOURCE_MEM, resname); if (!res) { ret = -EINVAL;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yi Yang yiyang13@huawei.com
[ Upstream commit 5a985960a4dd041c21dbe9956958c1633d2da706 ]
devm_kasprintf() returns a pointer to dynamically allocated memory which can be NULL upon failure. Ensure the allocation was successful by checking the pointer validity.
Fixes: 1e4d3ba66888 ("mtd: rawnand: meson: fix the clock") Signed-off-by: Yi Yang yiyang13@huawei.com Signed-off-by: Miquel Raynal miquel.raynal@bootlin.com Link: https://lore.kernel.org/linux-mtd/20231019065548.318443-1-yiyang13@huawei.co... Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/mtd/nand/raw/meson_nand.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/mtd/nand/raw/meson_nand.c b/drivers/mtd/nand/raw/meson_nand.c index b10011dec1e62..b16adc0e92e9d 100644 --- a/drivers/mtd/nand/raw/meson_nand.c +++ b/drivers/mtd/nand/raw/meson_nand.c @@ -1115,6 +1115,9 @@ static int meson_nfc_clk_init(struct meson_nfc *nfc) init.name = devm_kasprintf(nfc->dev, GFP_KERNEL, "%s#div", dev_name(nfc->dev)); + if (!init.name) + return -ENOMEM; + init.ops = &clk_divider_ops; nfc_divider_parent_data[0].fw_name = "device"; init.parent_data = nfc_divider_parent_data;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Arnd Bergmann arnd@arndb.de
[ Upstream commit 390001d648ffa027b750b7dceb5d43f4c1d1a39e ]
The newly added memset() causes a warning for some reason I could not figure out:
In file included from arch/x86/include/asm/string.h:3, from drivers/gpu/drm/i915/gt/intel_rc6.c:6: In function 'rc6_res_reg_init', inlined from 'intel_rc6_init' at drivers/gpu/drm/i915/gt/intel_rc6.c:610:2: arch/x86/include/asm/string_32.h:195:29: error: '__builtin_memset' writing 16 bytes into a region of size 0 overflows the destination [-Werror=stringop-overflow=] 195 | #define memset(s, c, count) __builtin_memset(s, c, count) | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~ drivers/gpu/drm/i915/gt/intel_rc6.c:584:9: note: in expansion of macro 'memset' 584 | memset(rc6->res_reg, INVALID_MMIO_REG.reg, sizeof(rc6->res_reg)); | ^~~~~~ In function 'intel_rc6_init':
Change it to an normal initializer and an added memcpy() that does not have this problem.
Fixes: 4bb9ca7ee074 ("drm/i915/mtl: C6 residency and C state type for MTL SAMedia") Signed-off-by: Arnd Bergmann arnd@arndb.de Reviewed-by: Jani Nikula jani.nikula@intel.com Signed-off-by: Jani Nikula jani.nikula@intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20231016201012.1022812-1-arnd@... (cherry picked from commit 0520b30b219053cd789909bca45b3c486ef3ee09) Signed-off-by: Jani Nikula jani.nikula@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/i915/gt/intel_rc6.c | 16 ++++++++++------ 1 file changed, 10 insertions(+), 6 deletions(-)
diff --git a/drivers/gpu/drm/i915/gt/intel_rc6.c b/drivers/gpu/drm/i915/gt/intel_rc6.c index 58bb1c55294c9..ccdc1afbf11b5 100644 --- a/drivers/gpu/drm/i915/gt/intel_rc6.c +++ b/drivers/gpu/drm/i915/gt/intel_rc6.c @@ -584,19 +584,23 @@ static void __intel_rc6_disable(struct intel_rc6 *rc6)
static void rc6_res_reg_init(struct intel_rc6 *rc6) { - memset(rc6->res_reg, INVALID_MMIO_REG.reg, sizeof(rc6->res_reg)); + i915_reg_t res_reg[INTEL_RC6_RES_MAX] = { + [0 ... INTEL_RC6_RES_MAX - 1] = INVALID_MMIO_REG, + };
switch (rc6_to_gt(rc6)->type) { case GT_MEDIA: - rc6->res_reg[INTEL_RC6_RES_RC6] = MTL_MEDIA_MC6; + res_reg[INTEL_RC6_RES_RC6] = MTL_MEDIA_MC6; break; default: - rc6->res_reg[INTEL_RC6_RES_RC6_LOCKED] = GEN6_GT_GFX_RC6_LOCKED; - rc6->res_reg[INTEL_RC6_RES_RC6] = GEN6_GT_GFX_RC6; - rc6->res_reg[INTEL_RC6_RES_RC6p] = GEN6_GT_GFX_RC6p; - rc6->res_reg[INTEL_RC6_RES_RC6pp] = GEN6_GT_GFX_RC6pp; + res_reg[INTEL_RC6_RES_RC6_LOCKED] = GEN6_GT_GFX_RC6_LOCKED; + res_reg[INTEL_RC6_RES_RC6] = GEN6_GT_GFX_RC6; + res_reg[INTEL_RC6_RES_RC6p] = GEN6_GT_GFX_RC6p; + res_reg[INTEL_RC6_RES_RC6pp] = GEN6_GT_GFX_RC6pp; break; } + + memcpy(rc6->res_reg, res_reg, sizeof(res_reg)); }
void intel_rc6_init(struct intel_rc6 *rc6)
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Olga Kornievskaia kolga@netapp.com
[ Upstream commit 6bd1a77dc72dea0b0d8b6014f231143984d18f6d ]
Currently when client sends an EXCHANGE_ID for a possible trunked connection, for any error that happened, the trunk will be thrown out. However, an NFS4ERR_DELAY is a transient error that should be retried instead.
Fixes: e818bd085baf ("NFSv4.1 remove xprt from xprt_switch if session trunking test fails") Signed-off-by: Olga Kornievskaia kolga@netapp.com Signed-off-by: Trond Myklebust trond.myklebust@hammerspace.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/nfs/nfs4proc.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c index 5f088e3eeca1d..c710fde58be11 100644 --- a/fs/nfs/nfs4proc.c +++ b/fs/nfs/nfs4proc.c @@ -8934,6 +8934,7 @@ void nfs4_test_session_trunk(struct rpc_clnt *clnt, struct rpc_xprt *xprt,
sp4_how = (adata->clp->cl_sp4_flags == 0 ? SP4_NONE : SP4_MACH_CRED);
+try_again: /* Test connection for session trunking. Async exchange_id call */ task = nfs4_run_exchange_id(adata->clp, adata->cred, sp4_how, xprt); if (IS_ERR(task)) @@ -8946,11 +8947,15 @@ void nfs4_test_session_trunk(struct rpc_clnt *clnt, struct rpc_xprt *xprt,
if (status == 0) rpc_clnt_xprt_switch_add_xprt(clnt, xprt); - else if (rpc_clnt_xprt_switch_has_addr(clnt, + else if (status != -NFS4ERR_DELAY && rpc_clnt_xprt_switch_has_addr(clnt, (struct sockaddr *)&xprt->addr)) rpc_clnt_xprt_switch_remove_xprt(clnt, xprt);
rpc_put_task(task); + if (status == -NFS4ERR_DELAY) { + ssleep(1); + goto try_again; + } } EXPORT_SYMBOL_GPL(nfs4_test_session_trunk);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dan Carpenter dan.carpenter@linaro.org
[ Upstream commit 4f3ed837186fc0d2722ba8d2457a594322e9c2ef ]
This IS_ERR() check was deleted during in a cleanup because, at the time, the rpcb_call_async() function could not return an error pointer. That changed in commit 25cf32ad5dba ("SUNRPC: Handle allocation failure in rpc_new_task()") and now it can return an error pointer. Put the check back.
A related revert was done in commit 13bd90141804 ("Revert "SUNRPC: Remove unreachable error condition"").
Fixes: 037e910b52b0 ("SUNRPC: Remove unreachable error condition in rpcb_getport_async()") Signed-off-by: Dan Carpenter dan.carpenter@linaro.org Signed-off-by: Trond Myklebust trond.myklebust@hammerspace.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/sunrpc/rpcb_clnt.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/net/sunrpc/rpcb_clnt.c b/net/sunrpc/rpcb_clnt.c index 5988a5c5ff3f0..102c3818bc54d 100644 --- a/net/sunrpc/rpcb_clnt.c +++ b/net/sunrpc/rpcb_clnt.c @@ -769,6 +769,10 @@ void rpcb_getport_async(struct rpc_task *task)
child = rpcb_call_async(rpcb_clnt, map, proc); rpc_release_client(rpcb_clnt); + if (IS_ERR(child)) { + /* rpcb_map_release() has freed the arguments */ + return; + }
xprt->stat.bind_count++; rpc_put_task(child);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Olga Kornievskaia kolga@netapp.com
[ Upstream commit 5cc7688bae7f0757c39c1d3dfdd827b724061067 ]
If the client is doing pnfs IO and Kerberos is configured and EXCHANGEID successfully negotiated SP4_MACH_CRED and WRITE/COMMIT are on the list of state protected operations, then we need to make sure to choose the DS's rpc_client structure instead of the MDS's one.
Fixes: fb91fb0ee7b2 ("NFS: Move call to nfs4_state_protect_write() to nfs4_write_setup()") Signed-off-by: Olga Kornievskaia kolga@netapp.com Signed-off-by: Trond Myklebust trond.myklebust@hammerspace.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/nfs/nfs4proc.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c index c710fde58be11..8374fa230ba5a 100644 --- a/fs/nfs/nfs4proc.c +++ b/fs/nfs/nfs4proc.c @@ -5622,7 +5622,7 @@ static void nfs4_proc_write_setup(struct nfs_pgio_header *hdr,
msg->rpc_proc = &nfs4_procedures[NFSPROC4_CLNT_WRITE]; nfs4_init_sequence(&hdr->args.seq_args, &hdr->res.seq_res, 0, 0); - nfs4_state_protect_write(server->nfs_client, clnt, msg, hdr); + nfs4_state_protect_write(hdr->ds_clp ? hdr->ds_clp : server->nfs_client, clnt, msg, hdr); }
static void nfs4_proc_commit_rpc_prepare(struct rpc_task *task, struct nfs_commit_data *data) @@ -5663,7 +5663,8 @@ static void nfs4_proc_commit_setup(struct nfs_commit_data *data, struct rpc_mess data->res.server = server; msg->rpc_proc = &nfs4_procedures[NFSPROC4_CLNT_COMMIT]; nfs4_init_sequence(&data->args.seq_args, &data->res.seq_res, 1, 0); - nfs4_state_protect(server->nfs_client, NFS_SP4_MACH_CRED_COMMIT, clnt, msg); + nfs4_state_protect(data->ds_clp ? data->ds_clp : server->nfs_client, + NFS_SP4_MACH_CRED_COMMIT, clnt, msg); }
static int _nfs4_proc_commit(struct file *dst, struct nfs_commitargs *args,
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: felix fuzhen5@huawei.com
[ Upstream commit bfca5fb4e97c46503ddfc582335917b0cc228264 ]
RPC client pipefs dentries cleanup is in separated rpc_remove_pipedir() workqueue,which takes care about pipefs superblock locking. In some special scenarios, when kernel frees the pipefs sb of the current client and immediately alloctes a new pipefs sb, rpc_remove_pipedir function would misjudge the existence of pipefs sb which is not the one it used to hold. As a result, the rpc_remove_pipedir would clean the released freed pipefs dentries.
To fix this issue, rpc_remove_pipedir should check whether the current pipefs sb is consistent with the original pipefs sb.
This error can be catched by KASAN: ========================================================= [ 250.497700] BUG: KASAN: slab-use-after-free in dget_parent+0x195/0x200 [ 250.498315] Read of size 4 at addr ffff88800a2ab804 by task kworker/0:18/106503 [ 250.500549] Workqueue: events rpc_free_client_work [ 250.501001] Call Trace: [ 250.502880] kasan_report+0xb6/0xf0 [ 250.503209] ? dget_parent+0x195/0x200 [ 250.503561] dget_parent+0x195/0x200 [ 250.503897] ? __pfx_rpc_clntdir_depopulate+0x10/0x10 [ 250.504384] rpc_rmdir_depopulate+0x1b/0x90 [ 250.504781] rpc_remove_client_dir+0xf5/0x150 [ 250.505195] rpc_free_client_work+0xe4/0x230 [ 250.505598] process_one_work+0x8ee/0x13b0 ... [ 22.039056] Allocated by task 244: [ 22.039390] kasan_save_stack+0x22/0x50 [ 22.039758] kasan_set_track+0x25/0x30 [ 22.040109] __kasan_slab_alloc+0x59/0x70 [ 22.040487] kmem_cache_alloc_lru+0xf0/0x240 [ 22.040889] __d_alloc+0x31/0x8e0 [ 22.041207] d_alloc+0x44/0x1f0 [ 22.041514] __rpc_lookup_create_exclusive+0x11c/0x140 [ 22.041987] rpc_mkdir_populate.constprop.0+0x5f/0x110 [ 22.042459] rpc_create_client_dir+0x34/0x150 [ 22.042874] rpc_setup_pipedir_sb+0x102/0x1c0 [ 22.043284] rpc_client_register+0x136/0x4e0 [ 22.043689] rpc_new_client+0x911/0x1020 [ 22.044057] rpc_create_xprt+0xcb/0x370 [ 22.044417] rpc_create+0x36b/0x6c0 ... [ 22.049524] Freed by task 0: [ 22.049803] kasan_save_stack+0x22/0x50 [ 22.050165] kasan_set_track+0x25/0x30 [ 22.050520] kasan_save_free_info+0x2b/0x50 [ 22.050921] __kasan_slab_free+0x10e/0x1a0 [ 22.051306] kmem_cache_free+0xa5/0x390 [ 22.051667] rcu_core+0x62c/0x1930 [ 22.051995] __do_softirq+0x165/0x52a [ 22.052347] [ 22.052503] Last potentially related work creation: [ 22.052952] kasan_save_stack+0x22/0x50 [ 22.053313] __kasan_record_aux_stack+0x8e/0xa0 [ 22.053739] __call_rcu_common.constprop.0+0x6b/0x8b0 [ 22.054209] dentry_free+0xb2/0x140 [ 22.054540] __dentry_kill+0x3be/0x540 [ 22.054900] shrink_dentry_list+0x199/0x510 [ 22.055293] shrink_dcache_parent+0x190/0x240 [ 22.055703] do_one_tree+0x11/0x40 [ 22.056028] shrink_dcache_for_umount+0x61/0x140 [ 22.056461] generic_shutdown_super+0x70/0x590 [ 22.056879] kill_anon_super+0x3a/0x60 [ 22.057234] rpc_kill_sb+0x121/0x200
Fixes: 0157d021d23a ("SUNRPC: handle RPC client pipefs dentries by network namespace aware routines") Signed-off-by: felix fuzhen5@huawei.com Signed-off-by: Trond Myklebust trond.myklebust@hammerspace.com Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/sunrpc/clnt.h | 1 + net/sunrpc/clnt.c | 5 ++++- 2 files changed, 5 insertions(+), 1 deletion(-)
diff --git a/include/linux/sunrpc/clnt.h b/include/linux/sunrpc/clnt.h index 4f41d839face4..03722690f2c39 100644 --- a/include/linux/sunrpc/clnt.h +++ b/include/linux/sunrpc/clnt.h @@ -92,6 +92,7 @@ struct rpc_clnt { }; const struct cred *cl_cred; unsigned int cl_max_connect; /* max number of transports not to the same IP */ + struct super_block *pipefs_sb; };
/* diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c index 2e3b7b2c1b431..a148aa8003b88 100644 --- a/net/sunrpc/clnt.c +++ b/net/sunrpc/clnt.c @@ -111,7 +111,8 @@ static void rpc_clnt_remove_pipedir(struct rpc_clnt *clnt)
pipefs_sb = rpc_get_sb_net(net); if (pipefs_sb) { - __rpc_clnt_remove_pipedir(clnt); + if (pipefs_sb == clnt->pipefs_sb) + __rpc_clnt_remove_pipedir(clnt); rpc_put_sb_net(net); } } @@ -151,6 +152,8 @@ rpc_setup_pipedir(struct super_block *pipefs_sb, struct rpc_clnt *clnt) { struct dentry *dentry;
+ clnt->pipefs_sb = pipefs_sb; + if (clnt->cl_program->pipe_dir_name != NULL) { dentry = rpc_setup_pipedir_sb(pipefs_sb, clnt); if (IS_ERR(dentry))
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Andrew Jones ajones@ventanamicro.com
[ Upstream commit e1c05b3bf80f829ced464bdca90f1dfa96e8d251 ]
A hwprobe pair key is signed, but the hwprobe vDSO function was only checking that the upper bound was valid. In order to help avoid this type of problem in the future, and in anticipation of this check becoming more complicated with sparse keys, introduce and use a "key is valid" predicate function for the check.
Fixes: aa5af0aa90ba ("RISC-V: Add hwprobe vDSO function and data") Signed-off-by: Andrew Jones ajones@ventanamicro.com Reviewed-by: Evan Green evan@rivosinc.com Link: https://lore.kernel.org/r/20231010165101.14942-2-ajones@ventanamicro.com Signed-off-by: Palmer Dabbelt palmer@rivosinc.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/riscv/include/asm/hwprobe.h | 5 +++++ arch/riscv/kernel/vdso/hwprobe.c | 2 +- 2 files changed, 6 insertions(+), 1 deletion(-)
diff --git a/arch/riscv/include/asm/hwprobe.h b/arch/riscv/include/asm/hwprobe.h index 78936f4ff5133..7cad513538d8d 100644 --- a/arch/riscv/include/asm/hwprobe.h +++ b/arch/riscv/include/asm/hwprobe.h @@ -10,4 +10,9 @@
#define RISCV_HWPROBE_MAX_KEY 5
+static inline bool riscv_hwprobe_key_is_valid(__s64 key) +{ + return key >= 0 && key <= RISCV_HWPROBE_MAX_KEY; +} + #endif diff --git a/arch/riscv/kernel/vdso/hwprobe.c b/arch/riscv/kernel/vdso/hwprobe.c index d40bec6ac0786..cadf725ef7983 100644 --- a/arch/riscv/kernel/vdso/hwprobe.c +++ b/arch/riscv/kernel/vdso/hwprobe.c @@ -37,7 +37,7 @@ int __vdso_riscv_hwprobe(struct riscv_hwprobe *pairs, size_t pair_count,
/* This is something we can handle, fill out the pairs. */ while (p < end) { - if (p->key <= RISCV_HWPROBE_MAX_KEY) { + if (riscv_hwprobe_key_is_valid(p->key)) { p->value = avd->all_cpu_hwprobe_values[p->key];
} else {
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nam Cao namcaov@gmail.com
[ Upstream commit b701f9e726f0a30a94ea6af596b74c1f07b95b6b ]
uprobes expects is_trap_insn() to return true for any trap instructions, not just the one used for installing uprobe. The current default implementation only returns true for 16-bit c.ebreak if C extension is enabled. This can confuse uprobes if a 32-bit ebreak generates a trap exception from userspace: uprobes asks is_trap_insn() who says there is no trap, so uprobes assume a probe was there before but has been removed, and return to the trap instruction. This causes an infinite loop of entering and exiting trap handler.
Instead of using the default implementation, implement this function speficially for riscv with checks for both ebreak and c.ebreak.
Fixes: 74784081aac8 ("riscv: Add uprobes supported") Signed-off-by: Nam Cao namcaov@gmail.com Tested-by: Björn Töpel bjorn@rivosinc.com Reviewed-by: Guo Ren guoren@kernel.org Link: https://lore.kernel.org/r/20230829083614.117748-1-namcaov@gmail.com Signed-off-by: Palmer Dabbelt palmer@rivosinc.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/riscv/kernel/probes/uprobes.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/arch/riscv/kernel/probes/uprobes.c b/arch/riscv/kernel/probes/uprobes.c index 194f166b2cc40..4b3dc8beaf77d 100644 --- a/arch/riscv/kernel/probes/uprobes.c +++ b/arch/riscv/kernel/probes/uprobes.c @@ -3,6 +3,7 @@ #include <linux/highmem.h> #include <linux/ptrace.h> #include <linux/uprobes.h> +#include <asm/insn.h>
#include "decode-insn.h"
@@ -17,6 +18,11 @@ bool is_swbp_insn(uprobe_opcode_t *insn) #endif }
+bool is_trap_insn(uprobe_opcode_t *insn) +{ + return riscv_insn_is_ebreak(*insn) || riscv_insn_is_c_ebreak(*insn); +} + unsigned long uprobe_get_swbp_addr(struct pt_regs *regs) { return instruction_pointer(regs);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Andreas Gruenbacher agruenba@redhat.com
[ Upstream commit 074d7306a4fe22fcac0b53f699f92757ab1cee99 ]
Commit 0abd1557e21c added rcu_dereference() for dereferencing ip->i_gl in gfs2_permission. This now causes lockdep to complain when gfs2_permission is called in non-RCU context:
WARNING: suspicious RCU usage in gfs2_permission
Switch to rcu_dereference_check() and check for the MAY_NOT_BLOCK flag to shut up lockdep when we know that dereferencing ip->i_gl is safe.
Fixes: 0abd1557e21c ("gfs2: fix an oops in gfs2_permission") Reported-by: syzbot+3e5130844b0c0e2b4948@syzkaller.appspotmail.com Signed-off-by: Andreas Gruenbacher agruenba@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/gfs2/inode.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/fs/gfs2/inode.c b/fs/gfs2/inode.c index 117b4b5a03072..28c3711628805 100644 --- a/fs/gfs2/inode.c +++ b/fs/gfs2/inode.c @@ -1860,6 +1860,7 @@ static const char *gfs2_get_link(struct dentry *dentry, int gfs2_permission(struct mnt_idmap *idmap, struct inode *inode, int mask) { + int may_not_block = mask & MAY_NOT_BLOCK; struct gfs2_inode *ip; struct gfs2_holder i_gh; struct gfs2_glock *gl; @@ -1867,14 +1868,14 @@ int gfs2_permission(struct mnt_idmap *idmap, struct inode *inode,
gfs2_holder_mark_uninitialized(&i_gh); ip = GFS2_I(inode); - gl = rcu_dereference(ip->i_gl); + gl = rcu_dereference_check(ip->i_gl, !may_not_block); if (unlikely(!gl)) { /* inode is getting torn down, must be RCU mode */ - WARN_ON_ONCE(!(mask & MAY_NOT_BLOCK)); + WARN_ON_ONCE(!may_not_block); return -ECHILD; } if (gfs2_glock_is_locked_by_me(gl) == NULL) { - if (mask & MAY_NOT_BLOCK) + if (may_not_block) return -ECHILD; error = gfs2_glock_nq_init(gl, LM_ST_SHARED, LM_FLAG_ANY, &i_gh); if (error)
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nirmoy Das nirmoy.das@intel.com
[ Upstream commit 9506fba463fcbdf8c8b7af3ec9ee34360df843fe ]
Fix below compiler warning:
intel_tc.c:1879:11: error: ‘%d’ directive output may be truncated writing between 1 and 11 bytes into a region of size 3 [-Werror=format-truncation=] "%c/TC#%d", port_name(port), tc_port + 1); ^~ intel_tc.c:1878:2: note: ‘snprintf’ output between 7 and 17 bytes into a destination of size 8 snprintf(tc->port_name, sizeof(tc->port_name), ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ "%c/TC#%d", port_name(port), tc_port + 1); ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
v2: use kasprintf(Imre) v3: use const for port_name, and fix tc mem leak(Imre)
Fixes: 3eafcddf766b ("drm/i915/tc: Move TC port fields to a new intel_tc_port struct") Cc: Mika Kahola mika.kahola@intel.com Cc: Imre Deak imre.deak@intel.com Cc: Jani Nikula jani.nikula@intel.com Signed-off-by: Nirmoy Das nirmoy.das@intel.com Reviewed-by: Andrzej Hajda andrzej.hajda@intel.com Reviewed-by: Imre Deak imre.deak@intel.com Reviewed-by: Mika Kahola mika.kahola@intel.com Signed-off-by: Jani Nikula jani.nikula@intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20231026125636.5080-1-nirmoy.d... (cherry picked from commit 70a3cbbe620ee66afb0c066624196077767e61b2) Signed-off-by: Jani Nikula jani.nikula@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/i915/display/intel_tc.c | 11 ++++++++--- 1 file changed, 8 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/i915/display/intel_tc.c b/drivers/gpu/drm/i915/display/intel_tc.c index 3ebf41859043e..cdf2455440bea 100644 --- a/drivers/gpu/drm/i915/display/intel_tc.c +++ b/drivers/gpu/drm/i915/display/intel_tc.c @@ -58,7 +58,7 @@ struct intel_tc_port { struct delayed_work link_reset_work; int link_refcount; bool legacy_port:1; - char port_name[8]; + const char *port_name; enum tc_port_mode mode; enum tc_port_mode init_mode; enum phy_fia phy_fia; @@ -1841,8 +1841,12 @@ int intel_tc_port_init(struct intel_digital_port *dig_port, bool is_legacy) else tc->phy_ops = &icl_tc_phy_ops;
- snprintf(tc->port_name, sizeof(tc->port_name), - "%c/TC#%d", port_name(port), tc_port + 1); + tc->port_name = kasprintf(GFP_KERNEL, "%c/TC#%d", port_name(port), + tc_port + 1); + if (!tc->port_name) { + kfree(tc); + return -ENOMEM; + }
mutex_init(&tc->lock); /* TODO: Combine the two works */ @@ -1863,6 +1867,7 @@ void intel_tc_port_cleanup(struct intel_digital_port *dig_port) { intel_tc_port_suspend(dig_port);
+ kfree(dig_port->tc->port_name); kfree(dig_port->tc); dig_port->tc = NULL; }
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Stefano Garzarella sgarzare@redhat.com
[ Upstream commit 0d82410252ea324f0064e75b9865bb74cccc1dda ]
Deleting and recreating a device can lead to having the same content as the old device, so let's always allocate buffers completely zeroed out.
Fixes: abebb16254b3 ("vdpa_sim_blk: support shared backend") Suggested-by: Qing Wang qinwang@redhat.com Signed-off-by: Stefano Garzarella sgarzare@redhat.com Message-Id: 20231031144339.121453-1-sgarzare@redhat.com Signed-off-by: Michael S. Tsirkin mst@redhat.com Acked-by: Eugenio Pérez eperezma@redhat.com Acked-by: Jason Wang jasowang@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/vdpa/vdpa_sim/vdpa_sim_blk.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim_blk.c b/drivers/vdpa/vdpa_sim/vdpa_sim_blk.c index b3a3cb1657955..b137f36793439 100644 --- a/drivers/vdpa/vdpa_sim/vdpa_sim_blk.c +++ b/drivers/vdpa/vdpa_sim/vdpa_sim_blk.c @@ -437,7 +437,7 @@ static int vdpasim_blk_dev_add(struct vdpa_mgmt_dev *mdev, const char *name, if (blk->shared_backend) { blk->buffer = shared_buffer; } else { - blk->buffer = kvmalloc(VDPASIM_BLK_CAPACITY << SECTOR_SHIFT, + blk->buffer = kvzalloc(VDPASIM_BLK_CAPACITY << SECTOR_SHIFT, GFP_KERNEL); if (!blk->buffer) { ret = -ENOMEM; @@ -495,7 +495,7 @@ static int __init vdpasim_blk_init(void) goto parent_err;
if (shared_backend) { - shared_buffer = kvmalloc(VDPASIM_BLK_CAPACITY << SECTOR_SHIFT, + shared_buffer = kvzalloc(VDPASIM_BLK_CAPACITY << SECTOR_SHIFT, GFP_KERNEL); if (!shared_buffer) { ret = -ENOMEM;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dan Carpenter dan.carpenter@linaro.org
[ Upstream commit e07754e0a1ea2d63fb29574253d1fd7405607343 ]
The put_device() calls vhost_vdpa_release_dev() which calls ida_simple_remove() and frees "v". So this call to ida_simple_remove() is a use after free and a double free.
Fixes: ebe6a354fa7e ("vhost-vdpa: Call ida_simple_remove() when failed") Signed-off-by: Dan Carpenter dan.carpenter@linaro.org Message-Id: cf53cb61-0699-4e36-a980-94fd4268ff00@moroto.mountain Signed-off-by: Michael S. Tsirkin mst@redhat.com Acked-by: Jason Wang jasowang@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/vhost/vdpa.c | 1 - 1 file changed, 1 deletion(-)
diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c index b43e8680eee8d..48357c403867f 100644 --- a/drivers/vhost/vdpa.c +++ b/drivers/vhost/vdpa.c @@ -1498,7 +1498,6 @@ static int vhost_vdpa_probe(struct vdpa_device *vdpa)
err: put_device(&v->dev); - ida_simple_remove(&vhost_vdpa_ida, v->minor); return r; }
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kees Cook keescook@chromium.org
[ Upstream commit 1ee60356c2dca938362528404af95b8ef3e49b6a ]
The randstruct GCC plugin tried to discover "fake" flexible arrays to issue warnings about them in randomized structs. In the future LSM overhead reduction series, it would be legal to have a randomized struct with a 1-element array, and this should _not_ be treated as a flexible array, especially since commit df8fc4e934c1 ("kbuild: Enable -fstrict-flex-arrays=3"). Disable the 0-sized and 1-element array discovery logic in the plugin, but keep the "true" flexible array check.
Cc: KP Singh kpsingh@kernel.org Cc: linux-hardening@vger.kernel.org Reported-by: kernel test robot lkp@intel.com Closes: https://lore.kernel.org/oe-kbuild-all/202311021532.iBwuZUZ0-lkp@intel.com/ Fixes: df8fc4e934c1 ("kbuild: Enable -fstrict-flex-arrays=3") Reviewed-by: Bill Wendling morbo@google.com Acked-by: "Gustavo A. R. Silva" gustavoars@kernel.org Link: https://lore.kernel.org/r/20231104204334.work.160-kees@kernel.org Signed-off-by: Kees Cook keescook@chromium.org Signed-off-by: Sasha Levin sashal@kernel.org --- scripts/gcc-plugins/randomize_layout_plugin.c | 10 ---------- 1 file changed, 10 deletions(-)
diff --git a/scripts/gcc-plugins/randomize_layout_plugin.c b/scripts/gcc-plugins/randomize_layout_plugin.c index 951b74ba1b242..5e5744b65f8a9 100644 --- a/scripts/gcc-plugins/randomize_layout_plugin.c +++ b/scripts/gcc-plugins/randomize_layout_plugin.c @@ -273,8 +273,6 @@ static bool is_flexible_array(const_tree field) { const_tree fieldtype; const_tree typesize; - const_tree elemtype; - const_tree elemsize;
fieldtype = TREE_TYPE(field); typesize = TYPE_SIZE(fieldtype); @@ -282,20 +280,12 @@ static bool is_flexible_array(const_tree field) if (TREE_CODE(fieldtype) != ARRAY_TYPE) return false;
- elemtype = TREE_TYPE(fieldtype); - elemsize = TYPE_SIZE(elemtype); - /* size of type is represented in bits */
if (typesize == NULL_TREE && TYPE_DOMAIN(fieldtype) != NULL_TREE && TYPE_MAX_VALUE(TYPE_DOMAIN(fieldtype)) == NULL_TREE) return true;
- if (typesize != NULL_TREE && - (TREE_CONSTANT(typesize) && (!tree_to_uhwi(typesize) || - tree_to_uhwi(typesize) == tree_to_uhwi(elemsize)))) - return true; - return false; }
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Andrii Nakryiko andrii@kernel.org
[ Upstream commit 3feb263bb516ee7e1da0acd22b15afbb9a7daa19 ]
ldimm64 instructions are 16-byte long, and so have to be handled appropriately in check_cfg(), just like the rest of BPF verifier does.
This has implications in three places: - when determining next instruction for non-jump instructions; - when determining next instruction for callback address ldimm64 instructions (in visit_func_call_insn()); - when checking for unreachable instructions, where second half of ldimm64 is expected to be unreachable;
We take this also as an opportunity to report jump into the middle of ldimm64. And adjust few test_verifier tests accordingly.
Acked-by: Eduard Zingerman eddyz87@gmail.com Reported-by: Hao Sun sunhao.th@gmail.com Fixes: 475fb78fbf48 ("bpf: verifier (add branch/goto checks)") Signed-off-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/r/20231110002638.4168352-2-andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/bpf.h | 8 ++++-- kernel/bpf/verifier.c | 27 ++++++++++++++----- .../testing/selftests/bpf/verifier/ld_imm64.c | 8 +++--- 3 files changed, 30 insertions(+), 13 deletions(-)
diff --git a/include/linux/bpf.h b/include/linux/bpf.h index 98a7d6fd10360..a8b775e9d4d1a 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -890,10 +890,14 @@ bpf_ctx_record_field_size(struct bpf_insn_access_aux *aux, u32 size) aux->ctx_field_size = size; }
+static bool bpf_is_ldimm64(const struct bpf_insn *insn) +{ + return insn->code == (BPF_LD | BPF_IMM | BPF_DW); +} + static inline bool bpf_pseudo_func(const struct bpf_insn *insn) { - return insn->code == (BPF_LD | BPF_IMM | BPF_DW) && - insn->src_reg == BPF_PSEUDO_FUNC; + return bpf_is_ldimm64(insn) && insn->src_reg == BPF_PSEUDO_FUNC; }
struct bpf_prog_ops { diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index c3b6f282128bf..cb5621000993e 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -14563,15 +14563,16 @@ static int visit_func_call_insn(int t, struct bpf_insn *insns, struct bpf_verifier_env *env, bool visit_callee) { - int ret; + int ret, insn_sz;
- ret = push_insn(t, t + 1, FALLTHROUGH, env, false); + insn_sz = bpf_is_ldimm64(&insns[t]) ? 2 : 1; + ret = push_insn(t, t + insn_sz, FALLTHROUGH, env, false); if (ret) return ret;
- mark_prune_point(env, t + 1); + mark_prune_point(env, t + insn_sz); /* when we exit from subprog, we need to record non-linear history */ - mark_jmp_point(env, t + 1); + mark_jmp_point(env, t + insn_sz);
if (visit_callee) { mark_prune_point(env, t); @@ -14593,15 +14594,17 @@ static int visit_func_call_insn(int t, struct bpf_insn *insns, static int visit_insn(int t, struct bpf_verifier_env *env) { struct bpf_insn *insns = env->prog->insnsi, *insn = &insns[t]; - int ret; + int ret, insn_sz;
if (bpf_pseudo_func(insn)) return visit_func_call_insn(t, insns, env, true);
/* All non-branch instructions have a single fall-through edge. */ if (BPF_CLASS(insn->code) != BPF_JMP && - BPF_CLASS(insn->code) != BPF_JMP32) - return push_insn(t, t + 1, FALLTHROUGH, env, false); + BPF_CLASS(insn->code) != BPF_JMP32) { + insn_sz = bpf_is_ldimm64(insn) ? 2 : 1; + return push_insn(t, t + insn_sz, FALLTHROUGH, env, false); + }
switch (BPF_OP(insn->code)) { case BPF_EXIT: @@ -14715,11 +14718,21 @@ static int check_cfg(struct bpf_verifier_env *env) }
for (i = 0; i < insn_cnt; i++) { + struct bpf_insn *insn = &env->prog->insnsi[i]; + if (insn_state[i] != EXPLORED) { verbose(env, "unreachable insn %d\n", i); ret = -EINVAL; goto err_free; } + if (bpf_is_ldimm64(insn)) { + if (insn_state[i + 1] != 0) { + verbose(env, "jump into the middle of ldimm64 insn %d\n", i); + ret = -EINVAL; + goto err_free; + } + i++; /* skip second half of ldimm64 */ + } } ret = 0; /* cfg looks good */
diff --git a/tools/testing/selftests/bpf/verifier/ld_imm64.c b/tools/testing/selftests/bpf/verifier/ld_imm64.c index f9297900cea6d..78f19c255f20b 100644 --- a/tools/testing/selftests/bpf/verifier/ld_imm64.c +++ b/tools/testing/selftests/bpf/verifier/ld_imm64.c @@ -9,8 +9,8 @@ BPF_MOV64_IMM(BPF_REG_0, 2), BPF_EXIT_INSN(), }, - .errstr = "invalid BPF_LD_IMM insn", - .errstr_unpriv = "R1 pointer comparison", + .errstr = "jump into the middle of ldimm64 insn 1", + .errstr_unpriv = "jump into the middle of ldimm64 insn 1", .result = REJECT, }, { @@ -23,8 +23,8 @@ BPF_LD_IMM64(BPF_REG_0, 1), BPF_EXIT_INSN(), }, - .errstr = "invalid BPF_LD_IMM insn", - .errstr_unpriv = "R1 pointer comparison", + .errstr = "jump into the middle of ldimm64 insn 1", + .errstr_unpriv = "jump into the middle of ldimm64 insn 1", .result = REJECT, }, {
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Andrii Nakryiko andrii@kernel.org
[ Upstream commit 4bb7ea946a370707315ab774432963ce47291946 ]
Fix an edge case in __mark_chain_precision() which prematurely stops backtracking instructions in a state if it happens that state's first and last instruction indexes are the same. This situations doesn't necessarily mean that there were no instructions simulated in a state, but rather that we starting from the instruction, jumped around a bit, and then ended up at the same instruction before checkpointing or marking precision.
To distinguish between these two possible situations, we need to consult jump history. If it's empty or contain a single record "bridging" parent state and first instruction of processed state, then we indeed backtracked all instructions in this state. But if history is not empty, we are definitely not done yet.
Move this logic inside get_prev_insn_idx() to contain it more nicely. Use -ENOENT return code to denote "we are out of instructions" situation.
This bug was exposed by verifier_loop1.c's bounded_recursion subtest, once the next fix in this patch set is applied.
Acked-by: Eduard Zingerman eddyz87@gmail.com Fixes: b5dc0163d8fd ("bpf: precise scalar_value tracking") Signed-off-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/r/20231110002638.4168352-3-andrii@kernel.org Signed-off-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/bpf/verifier.c | 21 +++++++++++++++++++-- 1 file changed, 19 insertions(+), 2 deletions(-)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index cb5621000993e..b8371feabdbdd 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -3193,12 +3193,29 @@ static int push_jmp_history(struct bpf_verifier_env *env,
/* Backtrack one insn at a time. If idx is not at the top of recorded * history then previous instruction came from straight line execution. + * Return -ENOENT if we exhausted all instructions within given state. + * + * It's legal to have a bit of a looping with the same starting and ending + * insn index within the same state, e.g.: 3->4->5->3, so just because current + * instruction index is the same as state's first_idx doesn't mean we are + * done. If there is still some jump history left, we should keep going. We + * need to take into account that we might have a jump history between given + * state's parent and itself, due to checkpointing. In this case, we'll have + * history entry recording a jump from last instruction of parent state and + * first instruction of given state. */ static int get_prev_insn_idx(struct bpf_verifier_state *st, int i, u32 *history) { u32 cnt = *history;
+ if (i == st->first_insn_idx) { + if (cnt == 0) + return -ENOENT; + if (cnt == 1 && st->jmp_history[0].idx == i) + return -ENOENT; + } + if (cnt && st->jmp_history[cnt - 1].idx == i) { i = st->jmp_history[cnt - 1].prev_idx; (*history)--; @@ -4073,10 +4090,10 @@ static int __mark_chain_precision(struct bpf_verifier_env *env, int regno) * Nothing to be tracked further in the parent state. */ return 0; - if (i == first_idx) - break; subseq_idx = i; i = get_prev_insn_idx(st, i, &history); + if (i == -ENOENT) + break; if (i >= env->prog->len) { /* This can happen if backtracking reached insn 0 * and there are still reg_mask or stack_mask
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Stanislav Fomichev sdf@google.com
[ Upstream commit 871019b22d1bcc9fab2d1feba1b9a564acbb6e99 ]
We've started to see the following kernel traces:
WARNING: CPU: 83 PID: 0 at net/core/filter.c:6641 sk_lookup+0x1bd/0x1d0
Call Trace: <IRQ> __bpf_skc_lookup+0x10d/0x120 bpf_sk_lookup+0x48/0xd0 bpf_sk_lookup_tcp+0x19/0x20 bpf_prog_<redacted>+0x37c/0x16a3 cls_bpf_classify+0x205/0x2e0 tcf_classify+0x92/0x160 __netif_receive_skb_core+0xe52/0xf10 __netif_receive_skb_list_core+0x96/0x2b0 napi_complete_done+0x7b5/0xb70 <redacted>_poll+0x94/0xb0 net_rx_action+0x163/0x1d70 __do_softirq+0xdc/0x32e asm_call_irq_on_stack+0x12/0x20 </IRQ> do_softirq_own_stack+0x36/0x50 do_softirq+0x44/0x70
__inet_hash can race with lockless (rcu) readers on the other cpus:
__inet_hash __sk_nulls_add_node_rcu <- (bpf triggers here) sock_set_flag(SOCK_RCU_FREE)
Let's move the SOCK_RCU_FREE part up a bit, before we are inserting the socket into hashtables. Note, that the race is really harmless; the bpf callers are handling this situation (where listener socket doesn't have SOCK_RCU_FREE set) correctly, so the only annoyance is a WARN_ONCE.
More details from Eric regarding SOCK_RCU_FREE timeline:
Commit 3b24d854cb35 ("tcp/dccp: do not touch listener sk_refcnt under synflood") added SOCK_RCU_FREE. At that time, the precise location of sock_set_flag(sk, SOCK_RCU_FREE) did not matter, because the thread calling __inet_hash() owns a reference on sk. SOCK_RCU_FREE was only tested at dismantle time.
Commit 6acc9b432e67 ("bpf: Add helper to retrieve socket in BPF") started checking SOCK_RCU_FREE _after_ the lookup to infer whether the refcount has been taken care of.
Fixes: 6acc9b432e67 ("bpf: Add helper to retrieve socket in BPF") Reviewed-by: Eric Dumazet edumazet@google.com Signed-off-by: Stanislav Fomichev sdf@google.com Reviewed-by: Kuniyuki Iwashima kuniyu@amazon.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/ipv4/inet_hashtables.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/ipv4/inet_hashtables.c b/net/ipv4/inet_hashtables.c index 60cffabfd4788..c8c2704a320f1 100644 --- a/net/ipv4/inet_hashtables.c +++ b/net/ipv4/inet_hashtables.c @@ -731,12 +731,12 @@ int __inet_hash(struct sock *sk, struct sock *osk) if (err) goto unlock; } + sock_set_flag(sk, SOCK_RCU_FREE); if (IS_ENABLED(CONFIG_IPV6) && sk->sk_reuseport && sk->sk_family == AF_INET6) __sk_nulls_add_node_tail_rcu(sk, &ilb2->nulls_head); else __sk_nulls_add_node_rcu(sk, &ilb2->nulls_head); - sock_set_flag(sk, SOCK_RCU_FREE); sock_prot_inuse_add(sock_net(sk), sk->sk_prot, 1); unlock: spin_unlock(&ilb2->lock);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Dumazet edumazet@google.com
[ Upstream commit 18f039428c7df183b09c69ebf10ffd4e521035d2 ]
Inspired by syzbot reports using a stack of multiple ipvlan devices.
Reduce stack size needed in ipvlan_process_v6_outbound() by moving the flowi6 struct used for the route lookup in an non inlined helper. ipvlan_route_v6_outbound() needs 120 bytes on the stack, immediately reclaimed.
Also make sure ipvlan_process_v4_outbound() is not inlined.
We might also have to lower MAX_NEST_DEV, because only syzbot uses setups with more than four stacked devices.
BUG: TASK stack guard page was hit at ffffc9000e803ff8 (stack is ffffc9000e804000..ffffc9000e808000) stack guard page: 0000 [#1] SMP KASAN CPU: 0 PID: 13442 Comm: syz-executor.4 Not tainted 6.1.52-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/09/2023 RIP: 0010:kasan_check_range+0x4/0x2a0 mm/kasan/generic.c:188 Code: 48 01 c6 48 89 c7 e8 db 4e c1 03 31 c0 5d c3 cc 0f 0b eb 02 0f 0b b8 ea ff ff ff 5d c3 cc 00 00 cc cc 00 00 cc cc 55 48 89 e5 <41> 57 41 56 41 55 41 54 53 b0 01 48 85 f6 0f 84 a4 01 00 00 48 89 RSP: 0018:ffffc9000e804000 EFLAGS: 00010246 RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffff817e5bf2 RDX: 0000000000000000 RSI: 0000000000000008 RDI: ffffffff887c6568 RBP: ffffc9000e804000 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: dffffc0000000001 R12: 1ffff92001d0080c R13: dffffc0000000000 R14: ffffffff87e6b100 R15: 0000000000000000 FS: 00007fd0c55826c0(0000) GS:ffff8881f6800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffc9000e803ff8 CR3: 0000000170ef7000 CR4: 00000000003506f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <#DF> </#DF> <TASK> [<ffffffff81f281d1>] __kasan_check_read+0x11/0x20 mm/kasan/shadow.c:31 [<ffffffff817e5bf2>] instrument_atomic_read include/linux/instrumented.h:72 [inline] [<ffffffff817e5bf2>] _test_bit include/asm-generic/bitops/instrumented-non-atomic.h:141 [inline] [<ffffffff817e5bf2>] cpumask_test_cpu include/linux/cpumask.h:506 [inline] [<ffffffff817e5bf2>] cpu_online include/linux/cpumask.h:1092 [inline] [<ffffffff817e5bf2>] trace_lock_acquire include/trace/events/lock.h:24 [inline] [<ffffffff817e5bf2>] lock_acquire+0xe2/0x590 kernel/locking/lockdep.c:5632 [<ffffffff8563221e>] rcu_lock_acquire+0x2e/0x40 include/linux/rcupdate.h:306 [<ffffffff8561464d>] rcu_read_lock include/linux/rcupdate.h:747 [inline] [<ffffffff8561464d>] ip6_pol_route+0x15d/0x1440 net/ipv6/route.c:2221 [<ffffffff85618120>] ip6_pol_route_output+0x50/0x80 net/ipv6/route.c:2606 [<ffffffff856f65b5>] pol_lookup_func include/net/ip6_fib.h:584 [inline] [<ffffffff856f65b5>] fib6_rule_lookup+0x265/0x620 net/ipv6/fib6_rules.c:116 [<ffffffff85618009>] ip6_route_output_flags_noref+0x2d9/0x3a0 net/ipv6/route.c:2638 [<ffffffff8561821a>] ip6_route_output_flags+0xca/0x340 net/ipv6/route.c:2651 [<ffffffff838bd5a3>] ip6_route_output include/net/ip6_route.h:100 [inline] [<ffffffff838bd5a3>] ipvlan_process_v6_outbound drivers/net/ipvlan/ipvlan_core.c:473 [inline] [<ffffffff838bd5a3>] ipvlan_process_outbound drivers/net/ipvlan/ipvlan_core.c:529 [inline] [<ffffffff838bd5a3>] ipvlan_xmit_mode_l3 drivers/net/ipvlan/ipvlan_core.c:602 [inline] [<ffffffff838bd5a3>] ipvlan_queue_xmit+0xc33/0x1be0 drivers/net/ipvlan/ipvlan_core.c:677 [<ffffffff838c2909>] ipvlan_start_xmit+0x49/0x100 drivers/net/ipvlan/ipvlan_main.c:229 [<ffffffff84d03900>] netdev_start_xmit include/linux/netdevice.h:4966 [inline] [<ffffffff84d03900>] xmit_one net/core/dev.c:3644 [inline] [<ffffffff84d03900>] dev_hard_start_xmit+0x320/0x980 net/core/dev.c:3660 [<ffffffff84d080e2>] __dev_queue_xmit+0x16b2/0x3370 net/core/dev.c:4324 [<ffffffff855ce4cd>] dev_queue_xmit include/linux/netdevice.h:3067 [inline] [<ffffffff855ce4cd>] neigh_hh_output include/net/neighbour.h:529 [inline] [<ffffffff855ce4cd>] neigh_output include/net/neighbour.h:543 [inline] [<ffffffff855ce4cd>] ip6_finish_output2+0x160d/0x1ae0 net/ipv6/ip6_output.c:139 [<ffffffff855b8616>] __ip6_finish_output net/ipv6/ip6_output.c:200 [inline] [<ffffffff855b8616>] ip6_finish_output+0x6c6/0xb10 net/ipv6/ip6_output.c:211 [<ffffffff855b7e3c>] NF_HOOK_COND include/linux/netfilter.h:298 [inline] [<ffffffff855b7e3c>] ip6_output+0x2bc/0x3d0 net/ipv6/ip6_output.c:232 [<ffffffff8575d27f>] dst_output include/net/dst.h:444 [inline] [<ffffffff8575d27f>] ip6_local_out+0x10f/0x140 net/ipv6/output_core.c:161 [<ffffffff838bdae4>] ipvlan_process_v6_outbound drivers/net/ipvlan/ipvlan_core.c:483 [inline] [<ffffffff838bdae4>] ipvlan_process_outbound drivers/net/ipvlan/ipvlan_core.c:529 [inline] [<ffffffff838bdae4>] ipvlan_xmit_mode_l3 drivers/net/ipvlan/ipvlan_core.c:602 [inline] [<ffffffff838bdae4>] ipvlan_queue_xmit+0x1174/0x1be0 drivers/net/ipvlan/ipvlan_core.c:677 [<ffffffff838c2909>] ipvlan_start_xmit+0x49/0x100 drivers/net/ipvlan/ipvlan_main.c:229 [<ffffffff84d03900>] netdev_start_xmit include/linux/netdevice.h:4966 [inline] [<ffffffff84d03900>] xmit_one net/core/dev.c:3644 [inline] [<ffffffff84d03900>] dev_hard_start_xmit+0x320/0x980 net/core/dev.c:3660 [<ffffffff84d080e2>] __dev_queue_xmit+0x16b2/0x3370 net/core/dev.c:4324 [<ffffffff855ce4cd>] dev_queue_xmit include/linux/netdevice.h:3067 [inline] [<ffffffff855ce4cd>] neigh_hh_output include/net/neighbour.h:529 [inline] [<ffffffff855ce4cd>] neigh_output include/net/neighbour.h:543 [inline] [<ffffffff855ce4cd>] ip6_finish_output2+0x160d/0x1ae0 net/ipv6/ip6_output.c:139 [<ffffffff855b8616>] __ip6_finish_output net/ipv6/ip6_output.c:200 [inline] [<ffffffff855b8616>] ip6_finish_output+0x6c6/0xb10 net/ipv6/ip6_output.c:211 [<ffffffff855b7e3c>] NF_HOOK_COND include/linux/netfilter.h:298 [inline] [<ffffffff855b7e3c>] ip6_output+0x2bc/0x3d0 net/ipv6/ip6_output.c:232 [<ffffffff8575d27f>] dst_output include/net/dst.h:444 [inline] [<ffffffff8575d27f>] ip6_local_out+0x10f/0x140 net/ipv6/output_core.c:161 [<ffffffff838bdae4>] ipvlan_process_v6_outbound drivers/net/ipvlan/ipvlan_core.c:483 [inline] [<ffffffff838bdae4>] ipvlan_process_outbound drivers/net/ipvlan/ipvlan_core.c:529 [inline] [<ffffffff838bdae4>] ipvlan_xmit_mode_l3 drivers/net/ipvlan/ipvlan_core.c:602 [inline] [<ffffffff838bdae4>] ipvlan_queue_xmit+0x1174/0x1be0 drivers/net/ipvlan/ipvlan_core.c:677 [<ffffffff838c2909>] ipvlan_start_xmit+0x49/0x100 drivers/net/ipvlan/ipvlan_main.c:229 [<ffffffff84d03900>] netdev_start_xmit include/linux/netdevice.h:4966 [inline] [<ffffffff84d03900>] xmit_one net/core/dev.c:3644 [inline] [<ffffffff84d03900>] dev_hard_start_xmit+0x320/0x980 net/core/dev.c:3660 [<ffffffff84d080e2>] __dev_queue_xmit+0x16b2/0x3370 net/core/dev.c:4324 [<ffffffff855ce4cd>] dev_queue_xmit include/linux/netdevice.h:3067 [inline] [<ffffffff855ce4cd>] neigh_hh_output include/net/neighbour.h:529 [inline] [<ffffffff855ce4cd>] neigh_output include/net/neighbour.h:543 [inline] [<ffffffff855ce4cd>] ip6_finish_output2+0x160d/0x1ae0 net/ipv6/ip6_output.c:139 [<ffffffff855b8616>] __ip6_finish_output net/ipv6/ip6_output.c:200 [inline] [<ffffffff855b8616>] ip6_finish_output+0x6c6/0xb10 net/ipv6/ip6_output.c:211 [<ffffffff855b7e3c>] NF_HOOK_COND include/linux/netfilter.h:298 [inline] [<ffffffff855b7e3c>] ip6_output+0x2bc/0x3d0 net/ipv6/ip6_output.c:232 [<ffffffff8575d27f>] dst_output include/net/dst.h:444 [inline] [<ffffffff8575d27f>] ip6_local_out+0x10f/0x140 net/ipv6/output_core.c:161 [<ffffffff838bdae4>] ipvlan_process_v6_outbound drivers/net/ipvlan/ipvlan_core.c:483 [inline] [<ffffffff838bdae4>] ipvlan_process_outbound drivers/net/ipvlan/ipvlan_core.c:529 [inline] [<ffffffff838bdae4>] ipvlan_xmit_mode_l3 drivers/net/ipvlan/ipvlan_core.c:602 [inline] [<ffffffff838bdae4>] ipvlan_queue_xmit+0x1174/0x1be0 drivers/net/ipvlan/ipvlan_core.c:677 [<ffffffff838c2909>] ipvlan_start_xmit+0x49/0x100 drivers/net/ipvlan/ipvlan_main.c:229 [<ffffffff84d03900>] netdev_start_xmit include/linux/netdevice.h:4966 [inline] [<ffffffff84d03900>] xmit_one net/core/dev.c:3644 [inline] [<ffffffff84d03900>] dev_hard_start_xmit+0x320/0x980 net/core/dev.c:3660 [<ffffffff84d080e2>] __dev_queue_xmit+0x16b2/0x3370 net/core/dev.c:4324 [<ffffffff855ce4cd>] dev_queue_xmit include/linux/netdevice.h:3067 [inline] [<ffffffff855ce4cd>] neigh_hh_output include/net/neighbour.h:529 [inline] [<ffffffff855ce4cd>] neigh_output include/net/neighbour.h:543 [inline] [<ffffffff855ce4cd>] ip6_finish_output2+0x160d/0x1ae0 net/ipv6/ip6_output.c:139 [<ffffffff855b8616>] __ip6_finish_output net/ipv6/ip6_output.c:200 [inline] [<ffffffff855b8616>] ip6_finish_output+0x6c6/0xb10 net/ipv6/ip6_output.c:211 [<ffffffff855b7e3c>] NF_HOOK_COND include/linux/netfilter.h:298 [inline] [<ffffffff855b7e3c>] ip6_output+0x2bc/0x3d0 net/ipv6/ip6_output.c:232 [<ffffffff8575d27f>] dst_output include/net/dst.h:444 [inline] [<ffffffff8575d27f>] ip6_local_out+0x10f/0x140 net/ipv6/output_core.c:161 [<ffffffff838bdae4>] ipvlan_process_v6_outbound drivers/net/ipvlan/ipvlan_core.c:483 [inline] [<ffffffff838bdae4>] ipvlan_process_outbound drivers/net/ipvlan/ipvlan_core.c:529 [inline] [<ffffffff838bdae4>] ipvlan_xmit_mode_l3 drivers/net/ipvlan/ipvlan_core.c:602 [inline] [<ffffffff838bdae4>] ipvlan_queue_xmit+0x1174/0x1be0 drivers/net/ipvlan/ipvlan_core.c:677 [<ffffffff838c2909>] ipvlan_start_xmit+0x49/0x100 drivers/net/ipvlan/ipvlan_main.c:229 [<ffffffff84d03900>] netdev_start_xmit include/linux/netdevice.h:4966 [inline] [<ffffffff84d03900>] xmit_one net/core/dev.c:3644 [inline] [<ffffffff84d03900>] dev_hard_start_xmit+0x320/0x980 net/core/dev.c:3660 [<ffffffff84d080e2>] __dev_queue_xmit+0x16b2/0x3370 net/core/dev.c:4324 [<ffffffff84d4a65e>] dev_queue_xmit include/linux/netdevice.h:3067 [inline] [<ffffffff84d4a65e>] neigh_resolve_output+0x64e/0x750 net/core/neighbour.c:1560 [<ffffffff855ce503>] neigh_output include/net/neighbour.h:545 [inline] [<ffffffff855ce503>] ip6_finish_output2+0x1643/0x1ae0 net/ipv6/ip6_output.c:139 [<ffffffff855b8616>] __ip6_finish_output net/ipv6/ip6_output.c:200 [inline] [<ffffffff855b8616>] ip6_finish_output+0x6c6/0xb10 net/ipv6/ip6_output.c:211 [<ffffffff855b7e3c>] NF_HOOK_COND include/linux/netfilter.h:298 [inline] [<ffffffff855b7e3c>] ip6_output+0x2bc/0x3d0 net/ipv6/ip6_output.c:232 [<ffffffff855b9ce4>] dst_output include/net/dst.h:444 [inline] [<ffffffff855b9ce4>] NF_HOOK include/linux/netfilter.h:309 [inline] [<ffffffff855b9ce4>] ip6_xmit+0x11a4/0x1b20 net/ipv6/ip6_output.c:352 [<ffffffff8597984e>] sctp_v6_xmit+0x9ae/0x1230 net/sctp/ipv6.c:250 [<ffffffff8594623e>] sctp_packet_transmit+0x25de/0x2bc0 net/sctp/output.c:653 [<ffffffff858f5142>] sctp_packet_singleton+0x202/0x310 net/sctp/outqueue.c:783 [<ffffffff858ea411>] sctp_outq_flush_ctrl net/sctp/outqueue.c:914 [inline] [<ffffffff858ea411>] sctp_outq_flush+0x661/0x3d40 net/sctp/outqueue.c:1212 [<ffffffff858f02f9>] sctp_outq_uncork+0x79/0xb0 net/sctp/outqueue.c:764 [<ffffffff8589f060>] sctp_side_effects net/sctp/sm_sideeffect.c:1199 [inline] [<ffffffff8589f060>] sctp_do_sm+0x55c0/0x5c30 net/sctp/sm_sideeffect.c:1170 [<ffffffff85941567>] sctp_primitive_ASSOCIATE+0x97/0xc0 net/sctp/primitive.c:73 [<ffffffff859408b2>] sctp_sendmsg_to_asoc+0xf62/0x17b0 net/sctp/socket.c:1839 [<ffffffff85910b5e>] sctp_sendmsg+0x212e/0x33b0 net/sctp/socket.c:2029 [<ffffffff8544d559>] inet_sendmsg+0x149/0x310 net/ipv4/af_inet.c:849 [<ffffffff84c6c4d2>] sock_sendmsg_nosec net/socket.c:716 [inline] [<ffffffff84c6c4d2>] sock_sendmsg net/socket.c:736 [inline] [<ffffffff84c6c4d2>] ____sys_sendmsg+0x572/0x8c0 net/socket.c:2504 [<ffffffff84c6ca91>] ___sys_sendmsg net/socket.c:2558 [inline] [<ffffffff84c6ca91>] __sys_sendmsg+0x271/0x360 net/socket.c:2587 [<ffffffff84c6cbff>] __do_sys_sendmsg net/socket.c:2596 [inline] [<ffffffff84c6cbff>] __se_sys_sendmsg net/socket.c:2594 [inline] [<ffffffff84c6cbff>] __x64_sys_sendmsg+0x7f/0x90 net/socket.c:2594 [<ffffffff85b32553>] do_syscall_x64 arch/x86/entry/common.c:51 [inline] [<ffffffff85b32553>] do_syscall_64+0x53/0x80 arch/x86/entry/common.c:84 [<ffffffff85c00087>] entry_SYSCALL_64_after_hwframe+0x63/0xcd
Fixes: 2ad7bf363841 ("ipvlan: Initial check-in of the IPVLAN driver.") Reported-by: syzbot syzkaller@googlegroups.com Signed-off-by: Eric Dumazet edumazet@google.com Cc: Mahesh Bandewar maheshb@google.com Cc: Willem de Bruijn willemb@google.com Reviewed-by: Willem de Bruijn willemb@google.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ipvlan/ipvlan_core.c | 41 +++++++++++++++++++------------- 1 file changed, 25 insertions(+), 16 deletions(-)
diff --git a/drivers/net/ipvlan/ipvlan_core.c b/drivers/net/ipvlan/ipvlan_core.c index 21e9cac731218..2d5b021b4ea60 100644 --- a/drivers/net/ipvlan/ipvlan_core.c +++ b/drivers/net/ipvlan/ipvlan_core.c @@ -411,7 +411,7 @@ struct ipvl_addr *ipvlan_addr_lookup(struct ipvl_port *port, void *lyr3h, return addr; }
-static int ipvlan_process_v4_outbound(struct sk_buff *skb) +static noinline_for_stack int ipvlan_process_v4_outbound(struct sk_buff *skb) { const struct iphdr *ip4h = ip_hdr(skb); struct net_device *dev = skb->dev; @@ -453,13 +453,11 @@ static int ipvlan_process_v4_outbound(struct sk_buff *skb) }
#if IS_ENABLED(CONFIG_IPV6) -static int ipvlan_process_v6_outbound(struct sk_buff *skb) + +static noinline_for_stack int +ipvlan_route_v6_outbound(struct net_device *dev, struct sk_buff *skb) { const struct ipv6hdr *ip6h = ipv6_hdr(skb); - struct net_device *dev = skb->dev; - struct net *net = dev_net(dev); - struct dst_entry *dst; - int err, ret = NET_XMIT_DROP; struct flowi6 fl6 = { .flowi6_oif = dev->ifindex, .daddr = ip6h->daddr, @@ -469,27 +467,38 @@ static int ipvlan_process_v6_outbound(struct sk_buff *skb) .flowi6_mark = skb->mark, .flowi6_proto = ip6h->nexthdr, }; + struct dst_entry *dst; + int err;
- dst = ip6_route_output(net, NULL, &fl6); - if (dst->error) { - ret = dst->error; + dst = ip6_route_output(dev_net(dev), NULL, &fl6); + err = dst->error; + if (err) { dst_release(dst); - goto err; + return err; } skb_dst_set(skb, dst); + return 0; +} + +static int ipvlan_process_v6_outbound(struct sk_buff *skb) +{ + struct net_device *dev = skb->dev; + int err, ret = NET_XMIT_DROP; + + err = ipvlan_route_v6_outbound(dev, skb); + if (unlikely(err)) { + DEV_STATS_INC(dev, tx_errors); + kfree_skb(skb); + return err; + }
memset(IP6CB(skb), 0, sizeof(*IP6CB(skb)));
- err = ip6_local_out(net, skb->sk, skb); + err = ip6_local_out(dev_net(dev), skb->sk, skb); if (unlikely(net_xmit_eval(err))) DEV_STATS_INC(dev, tx_errors); else ret = NET_XMIT_SUCCESS; - goto out; -err: - DEV_STATS_INC(dev, tx_errors); - kfree_skb(skb); -out: return ret; } #else
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Shigeru Yoshida syoshida@redhat.com
[ Upstream commit 719639853d88071dfdfd8d9971eca9c283ff314c ]
KMSAN reported the following uninit-value access issue:
===================================================== BUG: KMSAN: uninit-value in ppp_sync_input drivers/net/ppp/ppp_synctty.c:690 [inline] BUG: KMSAN: uninit-value in ppp_sync_receive+0xdc9/0xe70 drivers/net/ppp/ppp_synctty.c:334 ppp_sync_input drivers/net/ppp/ppp_synctty.c:690 [inline] ppp_sync_receive+0xdc9/0xe70 drivers/net/ppp/ppp_synctty.c:334 tiocsti+0x328/0x450 drivers/tty/tty_io.c:2295 tty_ioctl+0x808/0x1920 drivers/tty/tty_io.c:2694 vfs_ioctl fs/ioctl.c:51 [inline] __do_sys_ioctl fs/ioctl.c:871 [inline] __se_sys_ioctl+0x211/0x400 fs/ioctl.c:857 __x64_sys_ioctl+0x97/0xe0 fs/ioctl.c:857 do_syscall_x64 arch/x86/entry/common.c:51 [inline] do_syscall_64+0x44/0x110 arch/x86/entry/common.c:82 entry_SYSCALL_64_after_hwframe+0x63/0x6b
Uninit was created at: __alloc_pages+0x75d/0xe80 mm/page_alloc.c:4591 __alloc_pages_node include/linux/gfp.h:238 [inline] alloc_pages_node include/linux/gfp.h:261 [inline] __page_frag_cache_refill+0x9a/0x2c0 mm/page_alloc.c:4691 page_frag_alloc_align+0x91/0x5d0 mm/page_alloc.c:4722 page_frag_alloc include/linux/gfp.h:322 [inline] __netdev_alloc_skb+0x215/0x6d0 net/core/skbuff.c:728 netdev_alloc_skb include/linux/skbuff.h:3225 [inline] dev_alloc_skb include/linux/skbuff.h:3238 [inline] ppp_sync_input drivers/net/ppp/ppp_synctty.c:669 [inline] ppp_sync_receive+0x237/0xe70 drivers/net/ppp/ppp_synctty.c:334 tiocsti+0x328/0x450 drivers/tty/tty_io.c:2295 tty_ioctl+0x808/0x1920 drivers/tty/tty_io.c:2694 vfs_ioctl fs/ioctl.c:51 [inline] __do_sys_ioctl fs/ioctl.c:871 [inline] __se_sys_ioctl+0x211/0x400 fs/ioctl.c:857 __x64_sys_ioctl+0x97/0xe0 fs/ioctl.c:857 do_syscall_x64 arch/x86/entry/common.c:51 [inline] do_syscall_64+0x44/0x110 arch/x86/entry/common.c:82 entry_SYSCALL_64_after_hwframe+0x63/0x6b
CPU: 0 PID: 12950 Comm: syz-executor.1 Not tainted 6.6.0-14500-g1c41041124bd #10 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-1.fc38 04/01/2014 =====================================================
ppp_sync_input() checks the first 2 bytes of the data are PPP_ALLSTATIONS and PPP_UI. However, if the data length is 1 and the first byte is PPP_ALLSTATIONS, an access to an uninitialized value occurs when checking PPP_UI. This patch resolves this issue by checking the data length.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Shigeru Yoshida syoshida@redhat.com Reviewed-by: Simon Horman horms@kernel.org Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ppp/ppp_synctty.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ppp/ppp_synctty.c b/drivers/net/ppp/ppp_synctty.c index 18283b7b94bcd..1ac231408398a 100644 --- a/drivers/net/ppp/ppp_synctty.c +++ b/drivers/net/ppp/ppp_synctty.c @@ -697,7 +697,7 @@ ppp_sync_input(struct syncppp *ap, const unsigned char *buf,
/* strip address/control field if present */ p = skb->data; - if (p[0] == PPP_ALLSTATIONS && p[1] == PPP_UI) { + if (skb->len >= 2 && p[0] == PPP_ALLSTATIONS && p[1] == PPP_UI) { /* chop off address/control */ if (skb->len < 3) goto err;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Juergen Gross jgross@suse.com
[ Upstream commit e64e7c74b99ec9e439abca75f522f4b98f220bd1 ]
xen_send_IPI_one() is being used by cpuhp_report_idle_dead() after it calls rcu_report_dead(), meaning that any RCU usage by xen_send_IPI_one() is a bad idea.
Unfortunately xen_send_IPI_one() is using notify_remote_via_irq() today, which is using irq_get_chip_data() via info_for_irq(). And irq_get_chip_data() in turn is using a maple-tree lookup requiring RCU.
Avoid this problem by caching the ipi event channels in another percpu variable, allowing the use notify_remote_via_evtchn() in xen_send_IPI_one().
Fixes: 721255b9826b ("genirq: Use a maple tree for interrupt descriptor management") Reported-by: David Woodhouse dwmw@amazon.co.uk Signed-off-by: Juergen Gross jgross@suse.com Tested-by: David Woodhouse dwmw@amazon.co.uk Acked-by: Stefano Stabellini sstabellini@kernel.org Signed-off-by: Juergen Gross jgross@suse.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/xen/events/events_base.c | 12 ++++++++---- 1 file changed, 8 insertions(+), 4 deletions(-)
diff --git a/drivers/xen/events/events_base.c b/drivers/xen/events/events_base.c index c803714d0f0d1..5de10d291a1cb 100644 --- a/drivers/xen/events/events_base.c +++ b/drivers/xen/events/events_base.c @@ -164,6 +164,8 @@ static DEFINE_PER_CPU(int [NR_VIRQS], virq_to_irq) = {[0 ... NR_VIRQS-1] = -1};
/* IRQ <-> IPI mapping */ static DEFINE_PER_CPU(int [XEN_NR_IPIS], ipi_to_irq) = {[0 ... XEN_NR_IPIS-1] = -1}; +/* Cache for IPI event channels - needed for hot cpu unplug (avoid RCU usage). */ +static DEFINE_PER_CPU(evtchn_port_t [XEN_NR_IPIS], ipi_to_evtchn) = {[0 ... XEN_NR_IPIS-1] = 0};
/* Event channel distribution data */ static atomic_t channels_on_cpu[NR_CPUS]; @@ -366,6 +368,7 @@ static int xen_irq_info_ipi_setup(unsigned cpu, info->u.ipi = ipi;
per_cpu(ipi_to_irq, cpu)[ipi] = irq; + per_cpu(ipi_to_evtchn, cpu)[ipi] = evtchn;
return xen_irq_info_common_setup(info, irq, IRQT_IPI, evtchn, 0); } @@ -981,6 +984,7 @@ static void __unbind_from_irq(unsigned int irq) break; case IRQT_IPI: per_cpu(ipi_to_irq, cpu)[ipi_from_irq(irq)] = -1; + per_cpu(ipi_to_evtchn, cpu)[ipi_from_irq(irq)] = 0; break; case IRQT_EVTCHN: dev = info->u.interdomain; @@ -1631,7 +1635,7 @@ EXPORT_SYMBOL_GPL(evtchn_put);
void xen_send_IPI_one(unsigned int cpu, enum ipi_vector vector) { - int irq; + evtchn_port_t evtchn;
#ifdef CONFIG_X86 if (unlikely(vector == XEN_NMI_VECTOR)) { @@ -1642,9 +1646,9 @@ void xen_send_IPI_one(unsigned int cpu, enum ipi_vector vector) return; } #endif - irq = per_cpu(ipi_to_irq, cpu)[vector]; - BUG_ON(irq < 0); - notify_remote_via_irq(irq); + evtchn = per_cpu(ipi_to_evtchn, cpu)[vector]; + BUG_ON(evtchn == 0); + notify_remote_via_evtchn(evtchn); }
struct evtchn_loop_ctrl {
On Fri, 2023-11-24 at 17:46 +0000, Greg Kroah-Hartman wrote:
6.5-stable review patch. If anyone has any objections, please let me know.
From: Juergen Gross jgross@suse.com
[ Upstream commit e64e7c74b99ec9e439abca75f522f4b98f220bd1 ]
xen_send_IPI_one() is being used by cpuhp_report_idle_dead() after it calls rcu_report_dead(), meaning that any RCU usage by xen_send_IPI_one() is a bad idea.
Unfortunately xen_send_IPI_one() is using notify_remote_via_irq() today, which is using irq_get_chip_data() via info_for_irq(). And irq_get_chip_data() in turn is using a maple-tree lookup requiring RCU.
Avoid this problem by caching the ipi event channels in another percpu variable, allowing the use notify_remote_via_evtchn() in xen_send_IPI_one().
IIRC, the result of this bug is a mostly harmless lockdep splat.
On the other hand, the result of *fixing* this bug is that lockdep doesn't whine, and lockdep keeps working. Only to later trigger a warning really early during cpu hotplug, causing the whole machine to go down with a triple-fault.
Can we have the backport of https://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git/commit/?... *first* and then this one?
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jian Shen shenjian15@huawei.com
[ Upstream commit 472a2ff63efb30234cbf6b2cdaf8117f21b4f8bc ]
The hclge_sync_vlan_filter is called in periodic task, trying to remove VLAN from vlan_del_fail_bmap. It can be concurrence with VLAN adding operation from user. So once user failed to delete a VLAN id, and add it again soon, it may be removed by the periodic task, which may cause the software configuration being inconsistent with hardware. So add mutex handling to avoid this.
user hns3 driver
periodic task │ add vlan 10 ───── hns3_vlan_rx_add_vid │ │ (suppose success) │ │ │ del vlan 10 ───── hns3_vlan_rx_kill_vid │ │ (suppose fail,add to │ │ vlan_del_fail_bmap) │ │ │ add vlan 10 ───── hns3_vlan_rx_add_vid │ (suppose success) │ foreach vlan_del_fail_bmp del vlan 10
Fixes: fe4144d47eef ("net: hns3: sync VLAN filter entries when kill VLAN ID failed") Signed-off-by: Jian Shen shenjian15@huawei.com Signed-off-by: Jijie Shao shaojijie@huawei.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- .../hisilicon/hns3/hns3pf/hclge_main.c | 28 +++++++++++++------ .../hisilicon/hns3/hns3vf/hclgevf_main.c | 11 ++++++-- 2 files changed, 29 insertions(+), 10 deletions(-)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c index ed6cf59853bf6..71154c0976aff 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c +++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c @@ -10026,8 +10026,6 @@ static void hclge_rm_vport_vlan_table(struct hclge_vport *vport, u16 vlan_id, struct hclge_vport_vlan_cfg *vlan, *tmp; struct hclge_dev *hdev = vport->back;
- mutex_lock(&hdev->vport_lock); - list_for_each_entry_safe(vlan, tmp, &vport->vlan_list, node) { if (vlan->vlan_id == vlan_id) { if (is_write_tbl && vlan->hd_tbl_status) @@ -10042,8 +10040,6 @@ static void hclge_rm_vport_vlan_table(struct hclge_vport *vport, u16 vlan_id, break; } } - - mutex_unlock(&hdev->vport_lock); }
void hclge_rm_vport_all_vlan_table(struct hclge_vport *vport, bool is_del_list) @@ -10452,11 +10448,16 @@ int hclge_set_vlan_filter(struct hnae3_handle *handle, __be16 proto, * handle mailbox. Just record the vlan id, and remove it after * reset finished. */ + mutex_lock(&hdev->vport_lock); if ((test_bit(HCLGE_STATE_RST_HANDLING, &hdev->state) || test_bit(HCLGE_STATE_RST_FAIL, &hdev->state)) && is_kill) { set_bit(vlan_id, vport->vlan_del_fail_bmap); + mutex_unlock(&hdev->vport_lock); return -EBUSY; + } else if (!is_kill && test_bit(vlan_id, vport->vlan_del_fail_bmap)) { + clear_bit(vlan_id, vport->vlan_del_fail_bmap); } + mutex_unlock(&hdev->vport_lock);
/* when port base vlan enabled, we use port base vlan as the vlan * filter entry. In this case, we don't update vlan filter table @@ -10471,17 +10472,22 @@ int hclge_set_vlan_filter(struct hnae3_handle *handle, __be16 proto, }
if (!ret) { - if (!is_kill) + if (!is_kill) { hclge_add_vport_vlan_table(vport, vlan_id, writen_to_tbl); - else if (is_kill && vlan_id != 0) + } else if (is_kill && vlan_id != 0) { + mutex_lock(&hdev->vport_lock); hclge_rm_vport_vlan_table(vport, vlan_id, false); + mutex_unlock(&hdev->vport_lock); + } } else if (is_kill) { /* when remove hw vlan filter failed, record the vlan id, * and try to remove it from hw later, to be consistence * with stack */ + mutex_lock(&hdev->vport_lock); set_bit(vlan_id, vport->vlan_del_fail_bmap); + mutex_unlock(&hdev->vport_lock); }
hclge_set_vport_vlan_fltr_change(vport); @@ -10521,6 +10527,7 @@ static void hclge_sync_vlan_filter(struct hclge_dev *hdev) int i, ret, sync_cnt = 0; u16 vlan_id;
+ mutex_lock(&hdev->vport_lock); /* start from vport 1 for PF is always alive */ for (i = 0; i < hdev->num_alloc_vport; i++) { struct hclge_vport *vport = &hdev->vport[i]; @@ -10531,21 +10538,26 @@ static void hclge_sync_vlan_filter(struct hclge_dev *hdev) ret = hclge_set_vlan_filter_hw(hdev, htons(ETH_P_8021Q), vport->vport_id, vlan_id, true); - if (ret && ret != -EINVAL) + if (ret && ret != -EINVAL) { + mutex_unlock(&hdev->vport_lock); return; + }
clear_bit(vlan_id, vport->vlan_del_fail_bmap); hclge_rm_vport_vlan_table(vport, vlan_id, false); hclge_set_vport_vlan_fltr_change(vport);
sync_cnt++; - if (sync_cnt >= HCLGE_MAX_SYNC_COUNT) + if (sync_cnt >= HCLGE_MAX_SYNC_COUNT) { + mutex_unlock(&hdev->vport_lock); return; + }
vlan_id = find_first_bit(vport->vlan_del_fail_bmap, VLAN_N_VID); } } + mutex_unlock(&hdev->vport_lock);
hclge_sync_vlan_fltr_state(hdev); } diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c index a4d68fb216fb9..1c62e58ff6d89 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c +++ b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c @@ -1206,6 +1206,8 @@ static int hclgevf_set_vlan_filter(struct hnae3_handle *handle, test_bit(HCLGEVF_STATE_RST_FAIL, &hdev->state)) && is_kill) { set_bit(vlan_id, hdev->vlan_del_fail_bmap); return -EBUSY; + } else if (!is_kill && test_bit(vlan_id, hdev->vlan_del_fail_bmap)) { + clear_bit(vlan_id, hdev->vlan_del_fail_bmap); }
hclgevf_build_send_msg(&send_msg, HCLGE_MBX_SET_VLAN, @@ -1233,20 +1235,25 @@ static void hclgevf_sync_vlan_filter(struct hclgevf_dev *hdev) int ret, sync_cnt = 0; u16 vlan_id;
+ if (bitmap_empty(hdev->vlan_del_fail_bmap, VLAN_N_VID)) + return; + + rtnl_lock(); vlan_id = find_first_bit(hdev->vlan_del_fail_bmap, VLAN_N_VID); while (vlan_id != VLAN_N_VID) { ret = hclgevf_set_vlan_filter(handle, htons(ETH_P_8021Q), vlan_id, true); if (ret) - return; + break;
clear_bit(vlan_id, hdev->vlan_del_fail_bmap); sync_cnt++; if (sync_cnt >= HCLGEVF_MAX_SYNC_COUNT) - return; + break;
vlan_id = find_first_bit(hdev->vlan_del_fail_bmap, VLAN_N_VID); } + rtnl_unlock(); }
static int hclgevf_en_hw_strip_rxvtag(struct hnae3_handle *handle, bool enable)
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yonglong Liu liuyonglong@huawei.com
[ Upstream commit ac92c0a9a0603fb448e60f38e63302e4eebb8035 ]
In hclgevf_mbx_handler() and hclgevf_get_mbx_resp() functions, there is a typical store-store and load-load scenario between received_resp and additional_info. This patch adds barrier to fix the problem.
Fixes: 4671042f1ef0 ("net: hns3: add match_id to check mailbox response from PF to VF") Signed-off-by: Yonglong Liu liuyonglong@huawei.com Signed-off-by: Jijie Shao shaojijie@huawei.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_mbx.c | 7 +++++++ 1 file changed, 7 insertions(+)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_mbx.c b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_mbx.c index bbf7b14079de3..85c2a634c8f96 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_mbx.c +++ b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_mbx.c @@ -63,6 +63,9 @@ static int hclgevf_get_mbx_resp(struct hclgevf_dev *hdev, u16 code0, u16 code1, i++; }
+ /* ensure additional_info will be seen after received_resp */ + smp_rmb(); + if (i >= HCLGEVF_MAX_TRY_TIMES) { dev_err(&hdev->pdev->dev, "VF could not get mbx(%u,%u) resp(=%d) from PF in %d tries\n", @@ -178,6 +181,10 @@ static void hclgevf_handle_mbx_response(struct hclgevf_dev *hdev, resp->resp_status = hclgevf_resp_to_errno(resp_status); memcpy(resp->additional_info, req->msg.resp_data, HCLGE_MBX_MAX_RESP_DATA_SIZE * sizeof(u8)); + + /* ensure additional_info will be seen before setting received_resp */ + smp_wmb(); + if (match_id) { /* If match_id is not zero, it means PF support match_id. * if the match_id is right, VF get the right response, or
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jian Shen shenjian15@huawei.com
[ Upstream commit 75b247b57d8b71bcb679e4cb37d0db104848806c ]
Currently, the FEC capability bit is default set for device version V2. It's incorrect for the copper port. Eventhough it doesn't make the nic work abnormal, but the capability information display in debugfs may confuse user. So clear it when driver get the port type inforamtion.
Fixes: 433ccce83504 ("net: hns3: use FEC capability queried from firmware") Signed-off-by: Jian Shen shenjian15@huawei.com Signed-off-by: Jijie Shao shaojijie@huawei.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c index 71154c0976aff..dd2baa05adba0 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c +++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c @@ -11664,6 +11664,7 @@ static int hclge_init_ae_dev(struct hnae3_ae_dev *ae_dev) goto err_msi_irq_uninit;
if (hdev->hw.mac.media_type == HNAE3_MEDIA_TYPE_COPPER) { + clear_bit(HNAE3_DEV_SUPPORT_FEC_B, ae_dev->caps); if (hnae3_dev_phy_imp_supported(hdev)) ret = hclge_update_tp_port_info(hdev); else
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yonglong Liu liuyonglong@huawei.com
[ Upstream commit 53aba458f23846112c0d44239580ff59bc5c36c3 ]
The hns3 driver define an array of string to show the coalesce info, but if the kernel adds a new mode or a new state, out-of-bounds access may occur when coalesce info is read via debugfs, this patch fix the problem.
Fixes: c99fead7cb07 ("net: hns3: add debugfs support for interrupt coalesce") Signed-off-by: Yonglong Liu liuyonglong@huawei.com Signed-off-by: Jijie Shao shaojijie@huawei.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/hisilicon/hns3/hns3_debugfs.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_debugfs.c b/drivers/net/ethernet/hisilicon/hns3/hns3_debugfs.c index 26fb6fefcb9d9..5d1814ed51427 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3_debugfs.c +++ b/drivers/net/ethernet/hisilicon/hns3/hns3_debugfs.c @@ -500,11 +500,14 @@ static void hns3_get_coal_info(struct hns3_enet_tqp_vector *tqp_vector, }
sprintf(result[j++], "%d", i); - sprintf(result[j++], "%s", dim_state_str[dim->state]); + sprintf(result[j++], "%s", dim->state < ARRAY_SIZE(dim_state_str) ? + dim_state_str[dim->state] : "unknown"); sprintf(result[j++], "%u", dim->profile_ix); - sprintf(result[j++], "%s", dim_cqe_mode_str[dim->mode]); + sprintf(result[j++], "%s", dim->mode < ARRAY_SIZE(dim_cqe_mode_str) ? + dim_cqe_mode_str[dim->mode] : "unknown"); sprintf(result[j++], "%s", - dim_tune_stat_str[dim->tune_state]); + dim->tune_state < ARRAY_SIZE(dim_tune_stat_str) ? + dim_tune_stat_str[dim->tune_state] : "unknown"); sprintf(result[j++], "%u", dim->steps_left); sprintf(result[j++], "%u", dim->steps_right); sprintf(result[j++], "%u", dim->tired);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yonglong Liu liuyonglong@huawei.com
[ Upstream commit dbd2f3b20c6ae425665b6975d766e3653d453e73 ]
When a VF is calling hns3_init_mac_addr(), get_mac_addr() may return fail, then the value of mac_addr_temp is not initialized.
Fixes: 76ad4f0ee747 ("net: hns3: Add support of HNS3 Ethernet Driver for hip08 SoC") Signed-off-by: Yonglong Liu liuyonglong@huawei.com Signed-off-by: Jijie Shao shaojijie@huawei.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/hisilicon/hns3/hns3_enet.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c index 71a2ec03f2b38..f644210afb70a 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c +++ b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c @@ -5139,7 +5139,7 @@ static int hns3_init_mac_addr(struct net_device *netdev) struct hns3_nic_priv *priv = netdev_priv(netdev); char format_mac_addr[HNAE3_FORMAT_MAC_ADDR_LEN]; struct hnae3_handle *h = priv->ae_handle; - u8 mac_addr_temp[ETH_ALEN]; + u8 mac_addr_temp[ETH_ALEN] = {0}; int ret = 0;
if (h->ae_algo->ops->get_mac_addr)
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jijie Shao shaojijie@huawei.com
[ Upstream commit 65e98bb56fa3ce2edb400930c05238c9b380500e ]
Currently the reset process in hns3 and firmware watchdog init process is asynchronous. We think firmware watchdog initialization is completed before VF clear the interrupt source. However, firmware initialization may not complete early. So VF will receive multiple reset interrupts and fail to reset.
So we add delay before VF interrupt source and 5 ms delay is enough to avoid second reset interrupt.
Fixes: 427900d27d86 ("net: hns3: fix the timing issue of VF clearing interrupt sources") Signed-off-by: Jijie Shao shaojijie@huawei.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- .../ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c | 14 +++++++++++++- .../ethernet/hisilicon/hns3/hns3vf/hclgevf_main.h | 1 + 2 files changed, 14 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c index 1c62e58ff6d89..0aa9beefd1c7e 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c +++ b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c @@ -1981,8 +1981,18 @@ static enum hclgevf_evt_cause hclgevf_check_evt_cause(struct hclgevf_dev *hdev, return HCLGEVF_VECTOR0_EVENT_OTHER; }
+static void hclgevf_reset_timer(struct timer_list *t) +{ + struct hclgevf_dev *hdev = from_timer(hdev, t, reset_timer); + + hclgevf_clear_event_cause(hdev, HCLGEVF_VECTOR0_EVENT_RST); + hclgevf_reset_task_schedule(hdev); +} + static irqreturn_t hclgevf_misc_irq_handle(int irq, void *data) { +#define HCLGEVF_RESET_DELAY 5 + enum hclgevf_evt_cause event_cause; struct hclgevf_dev *hdev = data; u32 clearval; @@ -1994,7 +2004,8 @@ static irqreturn_t hclgevf_misc_irq_handle(int irq, void *data)
switch (event_cause) { case HCLGEVF_VECTOR0_EVENT_RST: - hclgevf_reset_task_schedule(hdev); + mod_timer(&hdev->reset_timer, + jiffies + msecs_to_jiffies(HCLGEVF_RESET_DELAY)); break; case HCLGEVF_VECTOR0_EVENT_MBX: hclgevf_mbx_handler(hdev); @@ -2937,6 +2948,7 @@ static int hclgevf_init_hdev(struct hclgevf_dev *hdev) HCLGEVF_DRIVER_NAME);
hclgevf_task_schedule(hdev, round_jiffies_relative(HZ)); + timer_setup(&hdev->reset_timer, hclgevf_reset_timer, 0);
return 0;
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.h b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.h index 81c16b8c8da29..a73f2bf3a56a6 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.h +++ b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.h @@ -219,6 +219,7 @@ struct hclgevf_dev { enum hnae3_reset_type reset_level; unsigned long reset_pending; enum hnae3_reset_type reset_type; + struct timer_list reset_timer;
#define HCLGEVF_RESET_REQUESTED 0 #define HCLGEVF_RESET_PENDING 1
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jijie Shao shaojijie@huawei.com
[ Upstream commit dff655e82faffc287d4a72a59f66fa120bf904e4 ]
If PF is down, firmware will returns 10 Mbit/s rate and half-duplex mode when PF queries the port information from firmware.
After imp reset command is executed, PF status changes to down, and PF will query link status and updates port information from firmware in a periodic scheduled task.
However, there is a low probability that port information is updated when PF is down, and then PF link status changes to up. In this case, PF synchronizes incorrect rate and duplex mode to VF.
This patch fixes it by updating port information before PF synchronizes the rate and duplex to the VF when PF changes to up.
Fixes: 18b6e31f8bf4 ("net: hns3: PF add support for pushing link status to VFs") Signed-off-by: Jijie Shao shaojijie@huawei.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c index dd2baa05adba0..0f868605300a2 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c +++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c @@ -61,6 +61,7 @@ static void hclge_sync_fd_table(struct hclge_dev *hdev); static void hclge_update_fec_stats(struct hclge_dev *hdev); static int hclge_mac_link_status_wait(struct hclge_dev *hdev, int link_ret, int wait_cnt); +static int hclge_update_port_info(struct hclge_dev *hdev);
static struct hnae3_ae_algo ae_algo;
@@ -3043,6 +3044,9 @@ static void hclge_update_link_status(struct hclge_dev *hdev)
if (state != hdev->hw.mac.link) { hdev->hw.mac.link = state; + if (state == HCLGE_LINK_STATUS_UP) + hclge_update_port_info(hdev); + client->ops->link_status_change(handle, state); hclge_config_mac_tnl_int(hdev, state); if (rclient && rclient->ops->link_status_change)
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Shigeru Yoshida syoshida@redhat.com
[ Upstream commit fb317eb23b5ee4c37b0656a9a52a3db58d9dd072 ]
KMSAN reported the following kernel-infoleak issue:
===================================================== BUG: KMSAN: kernel-infoleak in instrument_copy_to_user include/linux/instrumented.h:114 [inline] BUG: KMSAN: kernel-infoleak in copy_to_user_iter lib/iov_iter.c:24 [inline] BUG: KMSAN: kernel-infoleak in iterate_ubuf include/linux/iov_iter.h:29 [inline] BUG: KMSAN: kernel-infoleak in iterate_and_advance2 include/linux/iov_iter.h:245 [inline] BUG: KMSAN: kernel-infoleak in iterate_and_advance include/linux/iov_iter.h:271 [inline] BUG: KMSAN: kernel-infoleak in _copy_to_iter+0x4ec/0x2bc0 lib/iov_iter.c:186 instrument_copy_to_user include/linux/instrumented.h:114 [inline] copy_to_user_iter lib/iov_iter.c:24 [inline] iterate_ubuf include/linux/iov_iter.h:29 [inline] iterate_and_advance2 include/linux/iov_iter.h:245 [inline] iterate_and_advance include/linux/iov_iter.h:271 [inline] _copy_to_iter+0x4ec/0x2bc0 lib/iov_iter.c:186 copy_to_iter include/linux/uio.h:197 [inline] simple_copy_to_iter net/core/datagram.c:532 [inline] __skb_datagram_iter.5+0x148/0xe30 net/core/datagram.c:420 skb_copy_datagram_iter+0x52/0x210 net/core/datagram.c:546 skb_copy_datagram_msg include/linux/skbuff.h:3960 [inline] netlink_recvmsg+0x43d/0x1630 net/netlink/af_netlink.c:1967 sock_recvmsg_nosec net/socket.c:1044 [inline] sock_recvmsg net/socket.c:1066 [inline] __sys_recvfrom+0x476/0x860 net/socket.c:2246 __do_sys_recvfrom net/socket.c:2264 [inline] __se_sys_recvfrom net/socket.c:2260 [inline] __x64_sys_recvfrom+0x130/0x200 net/socket.c:2260 do_syscall_x64 arch/x86/entry/common.c:51 [inline] do_syscall_64+0x44/0x110 arch/x86/entry/common.c:82 entry_SYSCALL_64_after_hwframe+0x63/0x6b
Uninit was created at: slab_post_alloc_hook+0x103/0x9e0 mm/slab.h:768 slab_alloc_node mm/slub.c:3478 [inline] kmem_cache_alloc_node+0x5f7/0xb50 mm/slub.c:3523 kmalloc_reserve+0x13c/0x4a0 net/core/skbuff.c:560 __alloc_skb+0x2fd/0x770 net/core/skbuff.c:651 alloc_skb include/linux/skbuff.h:1286 [inline] tipc_tlv_alloc net/tipc/netlink_compat.c:156 [inline] tipc_get_err_tlv+0x90/0x5d0 net/tipc/netlink_compat.c:170 tipc_nl_compat_recv+0x1042/0x15d0 net/tipc/netlink_compat.c:1324 genl_family_rcv_msg_doit net/netlink/genetlink.c:972 [inline] genl_family_rcv_msg net/netlink/genetlink.c:1052 [inline] genl_rcv_msg+0x1220/0x12c0 net/netlink/genetlink.c:1067 netlink_rcv_skb+0x4a4/0x6a0 net/netlink/af_netlink.c:2545 genl_rcv+0x41/0x60 net/netlink/genetlink.c:1076 netlink_unicast_kernel net/netlink/af_netlink.c:1342 [inline] netlink_unicast+0xf4b/0x1230 net/netlink/af_netlink.c:1368 netlink_sendmsg+0x1242/0x1420 net/netlink/af_netlink.c:1910 sock_sendmsg_nosec net/socket.c:730 [inline] __sock_sendmsg net/socket.c:745 [inline] ____sys_sendmsg+0x997/0xd60 net/socket.c:2588 ___sys_sendmsg+0x271/0x3b0 net/socket.c:2642 __sys_sendmsg net/socket.c:2671 [inline] __do_sys_sendmsg net/socket.c:2680 [inline] __se_sys_sendmsg net/socket.c:2678 [inline] __x64_sys_sendmsg+0x2fa/0x4a0 net/socket.c:2678 do_syscall_x64 arch/x86/entry/common.c:51 [inline] do_syscall_64+0x44/0x110 arch/x86/entry/common.c:82 entry_SYSCALL_64_after_hwframe+0x63/0x6b
Bytes 34-35 of 36 are uninitialized Memory access of size 36 starts at ffff88802d464a00 Data copied to user address 00007ff55033c0a0
CPU: 0 PID: 30322 Comm: syz-executor.0 Not tainted 6.6.0-14500-g1c41041124bd #10 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-1.fc38 04/01/2014 =====================================================
tipc_add_tlv() puts TLV descriptor and value onto `skb`. This size is calculated with TLV_SPACE() macro. It adds the size of struct tlv_desc and the length of TLV value passed as an argument, and aligns the result to a multiple of TLV_ALIGNTO, i.e., a multiple of 4 bytes.
If the size of struct tlv_desc plus the length of TLV value is not aligned, the current implementation leaves the remaining bytes uninitialized. This is the cause of the above kernel-infoleak issue.
This patch resolves this issue by clearing data up to an aligned size.
Fixes: d0796d1ef63d ("tipc: convert legacy nl bearer dump to nl compat") Signed-off-by: Shigeru Yoshida syoshida@redhat.com Reviewed-by: Simon Horman horms@kernel.org Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/tipc/netlink_compat.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/net/tipc/netlink_compat.c b/net/tipc/netlink_compat.c index 9b47c84092319..42d9586365ae3 100644 --- a/net/tipc/netlink_compat.c +++ b/net/tipc/netlink_compat.c @@ -102,6 +102,7 @@ static int tipc_add_tlv(struct sk_buff *skb, u16 type, void *data, u16 len) return -EMSGSIZE;
skb_put(skb, TLV_SPACE(len)); + memset(tlv, 0, TLV_SPACE(len)); tlv->tlv_type = htons(type); tlv->tlv_len = htons(TLV_LENGTH(len)); if (len && data)
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sven Auhagen sven.auhagen@voleatech.de
[ Upstream commit ca8add922f9c7f6e2e3c71039da8e0dcc64b87ed ]
Calling page_pool_get_stats in the mvneta driver without checks leads to kernel crashes. First the page pool is only available if the bm is not used. The page pool is also not allocated when the port is stopped. It can also be not allocated in case of errors.
The current implementation leads to the following crash calling ethstats on a port that is down or when calling it at the wrong moment:
ble to handle kernel NULL pointer dereference at virtual address 00000070 [00000070] *pgd=00000000 Internal error: Oops: 5 [#1] SMP ARM Hardware name: Marvell Armada 380/385 (Device Tree) PC is at page_pool_get_stats+0x18/0x1cc LR is at mvneta_ethtool_get_stats+0xa0/0xe0 [mvneta] pc : [<c0b413cc>] lr : [<bf0a98d8>] psr: a0000013 sp : f1439d48 ip : f1439dc0 fp : 0000001d r10: 00000100 r9 : c4816b80 r8 : f0d75150 r7 : bf0b400c r6 : c238f000 r5 : 00000000 r4 : f1439d68 r3 : c2091040 r2 : ffffffd8 r1 : f1439d68 r0 : 00000000 Flags: NzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none Control: 10c5387d Table: 066b004a DAC: 00000051 Register r0 information: NULL pointer Register r1 information: 2-page vmalloc region starting at 0xf1438000 allocated at kernel_clone+0x9c/0x390 Register r2 information: non-paged memory Register r3 information: slab kmalloc-2k start c2091000 pointer offset 64 size 2048 Register r4 information: 2-page vmalloc region starting at 0xf1438000 allocated at kernel_clone+0x9c/0x390 Register r5 information: NULL pointer Register r6 information: slab kmalloc-cg-4k start c238f000 pointer offset 0 size 4096 Register r7 information: 15-page vmalloc region starting at 0xbf0a8000 allocated at load_module+0xa30/0x219c Register r8 information: 1-page vmalloc region starting at 0xf0d75000 allocated at ethtool_get_stats+0x138/0x208 Register r9 information: slab task_struct start c4816b80 pointer offset 0 Register r10 information: non-paged memory Register r11 information: non-paged memory Register r12 information: 2-page vmalloc region starting at 0xf1438000 allocated at kernel_clone+0x9c/0x390 Process snmpd (pid: 733, stack limit = 0x38de3a88) Stack: (0xf1439d48 to 0xf143a000) 9d40: 000000c0 00000001 c238f000 bf0b400c f0d75150 c4816b80 9d60: 00000100 bf0a98d8 00000000 00000000 00000000 00000000 00000000 00000000 9d80: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 9da0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 9dc0: 00000dc0 5335509c 00000035 c238f000 bf0b2214 01067f50 f0d75000 c0b9b9c8 9de0: 0000001d 00000035 c2212094 5335509c c4816b80 c238f000 c5ad6e00 01067f50 9e00: c1b0be80 c4816b80 00014813 c0b9d7f0 00000000 00000000 0000001d 0000001d 9e20: 00000000 00001200 00000000 00000000 c216ed90 c73943b8 00000000 00000000 9e40: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 9e60: 00000000 c0ad9034 00000000 00000000 00000000 00000000 00000000 00000000 9e80: 00000000 00000000 00000000 5335509c c1b0be80 f1439ee4 00008946 c1b0be80 9ea0: 01067f50 f1439ee3 00000000 00000046 b6d77ae0 c0b383f0 00008946 becc83e8 9ec0: c1b0be80 00000051 0000000b c68ca480 c7172d00 c0ad8ff0 f1439ee3 cf600e40 9ee0: 01600e40 32687465 00000000 00000000 00000000 01067f50 00000000 00000000 9f00: 00000000 5335509c 00008946 00008946 00000000 c68ca480 becc83e8 c05e2de0 9f20: f1439fb0 c03002f0 00000006 5ac3c35a c4816b80 00000006 b6d77ae0 c030caf0 9f40: c4817350 00000014 f1439e1c 0000000c 00000000 00000051 01000000 00000014 9f60: 00003fec f1439edc 00000001 c0372abc b6d77ae0 c0372abc cf600e40 5335509c 9f80: c21e6800 01015c9c 0000000b 00008946 00000036 c03002f0 c4816b80 00000036 9fa0: b6d77ae0 c03000c0 01015c9c 0000000b 0000000b 00008946 becc83e8 00000000 9fc0: 01015c9c 0000000b 00008946 00000036 00000035 010678a0 b6d797ec b6d77ae0 9fe0: b6dbf738 becc838c b6d186d7 b6baa858 40000030 0000000b 00000000 00000000 page_pool_get_stats from mvneta_ethtool_get_stats+0xa0/0xe0 [mvneta] mvneta_ethtool_get_stats [mvneta] from ethtool_get_stats+0x154/0x208 ethtool_get_stats from dev_ethtool+0xf48/0x2480 dev_ethtool from dev_ioctl+0x538/0x63c dev_ioctl from sock_ioctl+0x49c/0x53c sock_ioctl from sys_ioctl+0x134/0xbd8 sys_ioctl from ret_fast_syscall+0x0/0x1c Exception stack(0xf1439fa8 to 0xf1439ff0) 9fa0: 01015c9c 0000000b 0000000b 00008946 becc83e8 00000000 9fc0: 01015c9c 0000000b 00008946 00000036 00000035 010678a0 b6d797ec b6d77ae0 9fe0: b6dbf738 becc838c b6d186d7 b6baa858 Code: e28dd004 e1a05000 e2514000 0a00006a (e5902070)
This commit adds the proper checks before calling page_pool_get_stats.
Fixes: b3fc79225f05 ("net: mvneta: add support for page_pool_get_stats") Signed-off-by: Sven Auhagen sven.auhagen@voleatech.de Reported-by: Paulo Da Silva Paulo.DaSilva@kyberna.com Acked-by: Lorenzo Bianconi lorenzo@kernel.org Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/marvell/mvneta.c | 28 +++++++++++++++++++-------- 1 file changed, 20 insertions(+), 8 deletions(-)
diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c index acf4f6ba73a6f..f4692a8726b1c 100644 --- a/drivers/net/ethernet/marvell/mvneta.c +++ b/drivers/net/ethernet/marvell/mvneta.c @@ -4790,14 +4790,17 @@ static void mvneta_ethtool_get_strings(struct net_device *netdev, u32 sset, u8 *data) { if (sset == ETH_SS_STATS) { + struct mvneta_port *pp = netdev_priv(netdev); int i;
for (i = 0; i < ARRAY_SIZE(mvneta_statistics); i++) memcpy(data + i * ETH_GSTRING_LEN, mvneta_statistics[i].name, ETH_GSTRING_LEN);
- data += ETH_GSTRING_LEN * ARRAY_SIZE(mvneta_statistics); - page_pool_ethtool_stats_get_strings(data); + if (!pp->bm_priv) { + data += ETH_GSTRING_LEN * ARRAY_SIZE(mvneta_statistics); + page_pool_ethtool_stats_get_strings(data); + } } }
@@ -4915,8 +4918,10 @@ static void mvneta_ethtool_pp_stats(struct mvneta_port *pp, u64 *data) struct page_pool_stats stats = {}; int i;
- for (i = 0; i < rxq_number; i++) - page_pool_get_stats(pp->rxqs[i].page_pool, &stats); + for (i = 0; i < rxq_number; i++) { + if (pp->rxqs[i].page_pool) + page_pool_get_stats(pp->rxqs[i].page_pool, &stats); + }
page_pool_ethtool_stats_get(data, &stats); } @@ -4932,14 +4937,21 @@ static void mvneta_ethtool_get_stats(struct net_device *dev, for (i = 0; i < ARRAY_SIZE(mvneta_statistics); i++) *data++ = pp->ethtool_stats[i];
- mvneta_ethtool_pp_stats(pp, data); + if (!pp->bm_priv) + mvneta_ethtool_pp_stats(pp, data); }
static int mvneta_ethtool_get_sset_count(struct net_device *dev, int sset) { - if (sset == ETH_SS_STATS) - return ARRAY_SIZE(mvneta_statistics) + - page_pool_ethtool_stats_get_count(); + if (sset == ETH_SS_STATS) { + int count = ARRAY_SIZE(mvneta_statistics); + struct mvneta_port *pp = netdev_priv(dev); + + if (!pp->bm_priv) + count += page_pool_ethtool_stats_get_count(); + + return count; + }
return -EOPNOTSUPP; }
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Willem de Bruijn willemb@google.com
[ Upstream commit c0a2a1b0d631fc460d830f52d06211838874d655 ]
ppp_sync_ioctl allows setting device MRU, but does not sanity check this input.
Limit to a sane upper bound of 64KB.
No implementation I could find generates larger than 64KB frames. RFC 2823 mentions an upper bound of PPP over SDL of 64KB based on the 16-bit length field. Other protocols will be smaller, such as PPPoE (9KB jumbo frame) and PPPoA (18190 maximum CPCS-SDU size, RFC 2364). PPTP and L2TP encapsulate in IP.
Syzbot managed to trigger alloc warning in __alloc_pages:
if (WARN_ON_ONCE_GFP(order > MAX_ORDER, gfp))
WARNING: CPU: 1 PID: 37 at mm/page_alloc.c:4544 __alloc_pages+0x3ab/0x4a0 mm/page_alloc.c:4544
__alloc_skb+0x12b/0x330 net/core/skbuff.c:651 __netdev_alloc_skb+0x72/0x3f0 net/core/skbuff.c:715 netdev_alloc_skb include/linux/skbuff.h:3225 [inline] dev_alloc_skb include/linux/skbuff.h:3238 [inline] ppp_sync_input drivers/net/ppp/ppp_synctty.c:669 [inline] ppp_sync_receive+0xff/0x680 drivers/net/ppp/ppp_synctty.c:334 tty_ldisc_receive_buf+0x14c/0x180 drivers/tty/tty_buffer.c:390 tty_port_default_receive_buf+0x70/0xb0 drivers/tty/tty_port.c:37 receive_buf drivers/tty/tty_buffer.c:444 [inline] flush_to_ldisc+0x261/0x780 drivers/tty/tty_buffer.c:494 process_one_work+0x884/0x15c0 kernel/workqueue.c:2630
With call
ioctl$PPPIOCSMRU1(r1, 0x40047452, &(0x7f0000000100)=0x5e6417a8)
Similar code exists in other drivers that implement ppp_channel_ops ioctl PPPIOCSMRU. Those might also be in scope. Notably excluded from this are pppol2tp_ioctl and pppoe_ioctl.
This code goes back to the start of git history.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Reported-by: syzbot+6177e1f90d92583bcc58@syzkaller.appspotmail.com Signed-off-by: Willem de Bruijn willemb@google.com Reviewed-by: Eric Dumazet edumazet@google.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ppp/ppp_synctty.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/drivers/net/ppp/ppp_synctty.c b/drivers/net/ppp/ppp_synctty.c index 1ac231408398a..94ef6f9ca5103 100644 --- a/drivers/net/ppp/ppp_synctty.c +++ b/drivers/net/ppp/ppp_synctty.c @@ -462,6 +462,10 @@ ppp_sync_ioctl(struct ppp_channel *chan, unsigned int cmd, unsigned long arg) case PPPIOCSMRU: if (get_user(val, (int __user *) argp)) break; + if (val > U16_MAX) { + err = -EINVAL; + break; + } if (val < PPP_MRU) val = PPP_MRU; ap->mru = val;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Juergen Gross jgross@suse.com
[ Upstream commit 47d970204054f859f35a2237baa75c2d84fcf436 ]
When delaying eoi handling of events, the related elements are queued into the percpu lateeoi list. In case the list isn't empty, the elements should be sorted by the time when eoi handling is to happen.
Unfortunately a new element will never be queued at the start of the list, even if it has a handling time lower than all other list elements.
Fix that by handling that case the same way as for an empty list.
Fixes: e99502f76271 ("xen/events: defer eoi in case of excessive number of events") Reported-by: Jan Beulich jbeulich@suse.com Signed-off-by: Juergen Gross jgross@suse.com Reviewed-by: Oleksandr Tyshchenko oleksandr_tyshchenko@epam.com Signed-off-by: Juergen Gross jgross@suse.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/xen/events/events_base.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/xen/events/events_base.c b/drivers/xen/events/events_base.c index 5de10d291a1cb..87482b3428bf6 100644 --- a/drivers/xen/events/events_base.c +++ b/drivers/xen/events/events_base.c @@ -604,7 +604,9 @@ static void lateeoi_list_add(struct irq_info *info)
spin_lock_irqsave(&eoi->eoi_list_lock, flags);
- if (list_empty(&eoi->eoi_list)) { + elem = list_first_entry_or_null(&eoi->eoi_list, struct irq_info, + eoi_list); + if (!elem || info->eoi_time < elem->eoi_time) { list_add(&info->eoi_list, &eoi->eoi_list); mod_delayed_work_on(info->eoi_cpu, system_wq, &eoi->delayed, delay);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Christoph Hellwig hch@infradead.org
[ Upstream commit b0077e269f6c152e807fdac90b58caf012cdbaab ]
blk_integrity_unregister() can come if queue usage counter isn't held for one bio with integrity prepared, so this request may be completed with calling profile->complete_fn, then kernel panic.
Another constraint is that bio_integrity_prep() needs to be called before bio merge.
Fix the issue by:
- call bio_integrity_prep() with one queue usage counter grabbed reliably
- call bio_integrity_prep() before bio merge
Fixes: 900e080752025f00 ("block: move queue enter logic into blk_mq_submit_bio()") Reported-by: Yi Zhang yi.zhang@redhat.com Cc: Christoph Hellwig hch@lst.de Signed-off-by: Ming Lei ming.lei@redhat.com Tested-by: Yi Zhang yi.zhang@redhat.com Link: https://lore.kernel.org/r/20231113035231.2708053-1-ming.lei@redhat.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- block/blk-mq.c | 75 +++++++++++++++++++++++++------------------------- 1 file changed, 38 insertions(+), 37 deletions(-)
diff --git a/block/blk-mq.c b/block/blk-mq.c index c21bc81a790ff..5fb31b9a16403 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -2874,11 +2874,8 @@ static struct request *blk_mq_get_new_requests(struct request_queue *q, }; struct request *rq;
- if (unlikely(bio_queue_enter(bio))) - return NULL; - if (blk_mq_attempt_bio_merge(q, bio, nsegs)) - goto queue_exit; + return NULL;
rq_qos_throttle(q, bio);
@@ -2894,35 +2891,23 @@ static struct request *blk_mq_get_new_requests(struct request_queue *q, rq_qos_cleanup(q, bio); if (bio->bi_opf & REQ_NOWAIT) bio_wouldblock_error(bio); -queue_exit: - blk_queue_exit(q); return NULL; }
-static inline struct request *blk_mq_get_cached_request(struct request_queue *q, - struct blk_plug *plug, struct bio **bio, unsigned int nsegs) +/* return true if this @rq can be used for @bio */ +static bool blk_mq_can_use_cached_rq(struct request *rq, struct blk_plug *plug, + struct bio *bio) { - struct request *rq; - enum hctx_type type, hctx_type; + enum hctx_type type = blk_mq_get_hctx_type(bio->bi_opf); + enum hctx_type hctx_type = rq->mq_hctx->type;
- if (!plug) - return NULL; - rq = rq_list_peek(&plug->cached_rq); - if (!rq || rq->q != q) - return NULL; + WARN_ON_ONCE(rq_list_peek(&plug->cached_rq) != rq);
- if (blk_mq_attempt_bio_merge(q, *bio, nsegs)) { - *bio = NULL; - return NULL; - } - - type = blk_mq_get_hctx_type((*bio)->bi_opf); - hctx_type = rq->mq_hctx->type; if (type != hctx_type && !(type == HCTX_TYPE_READ && hctx_type == HCTX_TYPE_DEFAULT)) - return NULL; - if (op_is_flush(rq->cmd_flags) != op_is_flush((*bio)->bi_opf)) - return NULL; + return false; + if (op_is_flush(rq->cmd_flags) != op_is_flush(bio->bi_opf)) + return false;
/* * If any qos ->throttle() end up blocking, we will have flushed the @@ -2930,12 +2915,12 @@ static inline struct request *blk_mq_get_cached_request(struct request_queue *q, * before we throttle. */ plug->cached_rq = rq_list_next(rq); - rq_qos_throttle(q, *bio); + rq_qos_throttle(rq->q, bio);
blk_mq_rq_time_init(rq, 0); - rq->cmd_flags = (*bio)->bi_opf; + rq->cmd_flags = bio->bi_opf; INIT_LIST_HEAD(&rq->queuelist); - return rq; + return true; }
static void bio_set_ioprio(struct bio *bio) @@ -2965,7 +2950,7 @@ void blk_mq_submit_bio(struct bio *bio) struct blk_plug *plug = blk_mq_plug(bio); const int is_sync = op_is_sync(bio->bi_opf); struct blk_mq_hw_ctx *hctx; - struct request *rq; + struct request *rq = NULL; unsigned int nr_segs = 1; blk_status_t ret;
@@ -2976,20 +2961,36 @@ void blk_mq_submit_bio(struct bio *bio) return; }
- if (!bio_integrity_prep(bio)) - return; - bio_set_ioprio(bio);
- rq = blk_mq_get_cached_request(q, plug, &bio, nr_segs); - if (!rq) { - if (!bio) + if (plug) { + rq = rq_list_peek(&plug->cached_rq); + if (rq && rq->q != q) + rq = NULL; + } + if (rq) { + if (!bio_integrity_prep(bio)) return; - rq = blk_mq_get_new_requests(q, plug, bio, nr_segs); - if (unlikely(!rq)) + if (blk_mq_attempt_bio_merge(q, bio, nr_segs)) return; + if (blk_mq_can_use_cached_rq(rq, plug, bio)) + goto done; + percpu_ref_get(&q->q_usage_counter); + } else { + if (unlikely(bio_queue_enter(bio))) + return; + if (!bio_integrity_prep(bio)) + goto fail; + } + + rq = blk_mq_get_new_requests(q, plug, bio, nr_segs); + if (unlikely(!rq)) { +fail: + blk_queue_exit(q); + return; }
+done: trace_block_getrq(bio);
rq_qos_track(q, rq, bio);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Dumazet edumazet@google.com
[ Upstream commit 73bde5a3294853947252cd9092a3517c7cb0cd2d ]
As I was working on a syzbot report, I found that KCSAN would probably complain that reading q->head or q->tail without barriers could lead to invalid results.
Add corresponding READ_ONCE() and WRITE_ONCE() to avoid load-store tearing.
Fixes: d94ba80ebbea ("ptp: Added a brand new class driver for ptp clocks.") Signed-off-by: Eric Dumazet edumazet@google.com Acked-by: Richard Cochran richardcochran@gmail.com Link: https://lore.kernel.org/r/20231109174859.3995880-1-edumazet@google.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/ptp/ptp_chardev.c | 3 ++- drivers/ptp/ptp_clock.c | 5 +++-- drivers/ptp/ptp_private.h | 8 ++++++-- drivers/ptp/ptp_sysfs.c | 3 ++- 4 files changed, 13 insertions(+), 6 deletions(-)
diff --git a/drivers/ptp/ptp_chardev.c b/drivers/ptp/ptp_chardev.c index 362bf756e6b78..5a3a4cc0bec82 100644 --- a/drivers/ptp/ptp_chardev.c +++ b/drivers/ptp/ptp_chardev.c @@ -490,7 +490,8 @@ ssize_t ptp_read(struct posix_clock *pc,
for (i = 0; i < cnt; i++) { event[i] = queue->buf[queue->head]; - queue->head = (queue->head + 1) % PTP_MAX_TIMESTAMPS; + /* Paired with READ_ONCE() in queue_cnt() */ + WRITE_ONCE(queue->head, (queue->head + 1) % PTP_MAX_TIMESTAMPS); }
spin_unlock_irqrestore(&queue->lock, flags); diff --git a/drivers/ptp/ptp_clock.c b/drivers/ptp/ptp_clock.c index 80f74e38c2da4..9a50bfb56453c 100644 --- a/drivers/ptp/ptp_clock.c +++ b/drivers/ptp/ptp_clock.c @@ -56,10 +56,11 @@ static void enqueue_external_timestamp(struct timestamp_event_queue *queue, dst->t.sec = seconds; dst->t.nsec = remainder;
+ /* Both WRITE_ONCE() are paired with READ_ONCE() in queue_cnt() */ if (!queue_free(queue)) - queue->head = (queue->head + 1) % PTP_MAX_TIMESTAMPS; + WRITE_ONCE(queue->head, (queue->head + 1) % PTP_MAX_TIMESTAMPS);
- queue->tail = (queue->tail + 1) % PTP_MAX_TIMESTAMPS; + WRITE_ONCE(queue->tail, (queue->tail + 1) % PTP_MAX_TIMESTAMPS);
spin_unlock_irqrestore(&queue->lock, flags); } diff --git a/drivers/ptp/ptp_private.h b/drivers/ptp/ptp_private.h index 75f58fc468a71..b8d4f61f14be4 100644 --- a/drivers/ptp/ptp_private.h +++ b/drivers/ptp/ptp_private.h @@ -76,9 +76,13 @@ struct ptp_vclock { * that a writer might concurrently increment the tail does not * matter, since the queue remains nonempty nonetheless. */ -static inline int queue_cnt(struct timestamp_event_queue *q) +static inline int queue_cnt(const struct timestamp_event_queue *q) { - int cnt = q->tail - q->head; + /* + * Paired with WRITE_ONCE() in enqueue_external_timestamp(), + * ptp_read(), extts_fifo_show(). + */ + int cnt = READ_ONCE(q->tail) - READ_ONCE(q->head); return cnt < 0 ? PTP_MAX_TIMESTAMPS + cnt : cnt; }
diff --git a/drivers/ptp/ptp_sysfs.c b/drivers/ptp/ptp_sysfs.c index 6e4d5456a8851..34ea5c16123a1 100644 --- a/drivers/ptp/ptp_sysfs.c +++ b/drivers/ptp/ptp_sysfs.c @@ -90,7 +90,8 @@ static ssize_t extts_fifo_show(struct device *dev, qcnt = queue_cnt(queue); if (qcnt) { event = queue->buf[queue->head]; - queue->head = (queue->head + 1) % PTP_MAX_TIMESTAMPS; + /* Paired with READ_ONCE() in queue_cnt() */ + WRITE_ONCE(queue->head, (queue->head + 1) % PTP_MAX_TIMESTAMPS); } spin_unlock_irqrestore(&queue->lock, flags);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Dumazet edumazet@google.com
[ Upstream commit 3cffa2ddc4d3fcf70cde361236f5a614f81a09b2 ]
Commit 9eed321cde22 ("net: lapbether: only support ethernet devices") has been able to keep syzbot away from net/lapb, until today.
In the following splat [1], the issue is that a lapbether device has been created on a bonding device without members. Then adding a non ARPHRD_ETHER member forced the bonding master to change its type.
The fix is to make sure we call dev_close() in bond_setup_by_slave() so that the potential linked lapbether devices (or any other devices having assumptions on the physical device) are removed.
A similar bug has been addressed in commit 40baec225765 ("bonding: fix panic on non-ARPHRD_ETHER enslave failure")
[1] skbuff: skb_under_panic: text:ffff800089508810 len:44 put:40 head:ffff0000c78e7c00 data:ffff0000c78e7bea tail:0x16 end:0x140 dev:bond0 kernel BUG at net/core/skbuff.c:192 ! Internal error: Oops - BUG: 00000000f2000800 [#1] PREEMPT SMP Modules linked in: CPU: 0 PID: 6007 Comm: syz-executor383 Not tainted 6.6.0-rc3-syzkaller-gbf6547d8715b #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/04/2023 pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : skb_panic net/core/skbuff.c:188 [inline] pc : skb_under_panic+0x13c/0x140 net/core/skbuff.c:202 lr : skb_panic net/core/skbuff.c:188 [inline] lr : skb_under_panic+0x13c/0x140 net/core/skbuff.c:202 sp : ffff800096a06aa0 x29: ffff800096a06ab0 x28: ffff800096a06ba0 x27: dfff800000000000 x26: ffff0000ce9b9b50 x25: 0000000000000016 x24: ffff0000c78e7bea x23: ffff0000c78e7c00 x22: 000000000000002c x21: 0000000000000140 x20: 0000000000000028 x19: ffff800089508810 x18: ffff800096a06100 x17: 0000000000000000 x16: ffff80008a629a3c x15: 0000000000000001 x14: 1fffe00036837a32 x13: 0000000000000000 x12: 0000000000000000 x11: 0000000000000201 x10: 0000000000000000 x9 : cb50b496c519aa00 x8 : cb50b496c519aa00 x7 : 0000000000000001 x6 : 0000000000000001 x5 : ffff800096a063b8 x4 : ffff80008e280f80 x3 : ffff8000805ad11c x2 : 0000000000000001 x1 : 0000000100000201 x0 : 0000000000000086 Call trace: skb_panic net/core/skbuff.c:188 [inline] skb_under_panic+0x13c/0x140 net/core/skbuff.c:202 skb_push+0xf0/0x108 net/core/skbuff.c:2446 ip6gre_header+0xbc/0x738 net/ipv6/ip6_gre.c:1384 dev_hard_header include/linux/netdevice.h:3136 [inline] lapbeth_data_transmit+0x1c4/0x298 drivers/net/wan/lapbether.c:257 lapb_data_transmit+0x8c/0xb0 net/lapb/lapb_iface.c:447 lapb_transmit_buffer+0x178/0x204 net/lapb/lapb_out.c:149 lapb_send_control+0x220/0x320 net/lapb/lapb_subr.c:251 __lapb_disconnect_request+0x9c/0x17c net/lapb/lapb_iface.c:326 lapb_device_event+0x288/0x4e0 net/lapb/lapb_iface.c:492 notifier_call_chain+0x1a4/0x510 kernel/notifier.c:93 raw_notifier_call_chain+0x3c/0x50 kernel/notifier.c:461 call_netdevice_notifiers_info net/core/dev.c:1970 [inline] call_netdevice_notifiers_extack net/core/dev.c:2008 [inline] call_netdevice_notifiers net/core/dev.c:2022 [inline] __dev_close_many+0x1b8/0x3c4 net/core/dev.c:1508 dev_close_many+0x1e0/0x470 net/core/dev.c:1559 dev_close+0x174/0x250 net/core/dev.c:1585 lapbeth_device_event+0x2e4/0x958 drivers/net/wan/lapbether.c:466 notifier_call_chain+0x1a4/0x510 kernel/notifier.c:93 raw_notifier_call_chain+0x3c/0x50 kernel/notifier.c:461 call_netdevice_notifiers_info net/core/dev.c:1970 [inline] call_netdevice_notifiers_extack net/core/dev.c:2008 [inline] call_netdevice_notifiers net/core/dev.c:2022 [inline] __dev_close_many+0x1b8/0x3c4 net/core/dev.c:1508 dev_close_many+0x1e0/0x470 net/core/dev.c:1559 dev_close+0x174/0x250 net/core/dev.c:1585 bond_enslave+0x2298/0x30cc drivers/net/bonding/bond_main.c:2332 bond_do_ioctl+0x268/0xc64 drivers/net/bonding/bond_main.c:4539 dev_ifsioc+0x754/0x9ac dev_ioctl+0x4d8/0xd34 net/core/dev_ioctl.c:786 sock_do_ioctl+0x1d4/0x2d0 net/socket.c:1217 sock_ioctl+0x4e8/0x834 net/socket.c:1322 vfs_ioctl fs/ioctl.c:51 [inline] __do_sys_ioctl fs/ioctl.c:871 [inline] __se_sys_ioctl fs/ioctl.c:857 [inline] __arm64_sys_ioctl+0x14c/0x1c8 fs/ioctl.c:857 __invoke_syscall arch/arm64/kernel/syscall.c:37 [inline] invoke_syscall+0x98/0x2b8 arch/arm64/kernel/syscall.c:51 el0_svc_common+0x130/0x23c arch/arm64/kernel/syscall.c:136 do_el0_svc+0x48/0x58 arch/arm64/kernel/syscall.c:155 el0_svc+0x58/0x16c arch/arm64/kernel/entry-common.c:678 el0t_64_sync_handler+0x84/0xfc arch/arm64/kernel/entry-common.c:696 el0t_64_sync+0x190/0x194 arch/arm64/kernel/entry.S:591 Code: aa1803e6 aa1903e7 a90023f5 94785b8b (d4210000)
Fixes: 872254dd6b1f ("net/bonding: Enable bonding to enslave non ARPHRD_ETHER") Reported-by: syzbot syzkaller@googlegroups.com Signed-off-by: Eric Dumazet edumazet@google.com Acked-by: Jay Vosburgh jay.vosburgh@canonical.com Reviewed-by: Hangbin Liu liuhangbin@gmail.com Link: https://lore.kernel.org/r/20231109180102.4085183-1-edumazet@google.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/bonding/bond_main.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c index a64ebb7f5b712..363b6cb33ae08 100644 --- a/drivers/net/bonding/bond_main.c +++ b/drivers/net/bonding/bond_main.c @@ -1499,6 +1499,10 @@ static void bond_compute_features(struct bonding *bond) static void bond_setup_by_slave(struct net_device *bond_dev, struct net_device *slave_dev) { + bool was_up = !!(bond_dev->flags & IFF_UP); + + dev_close(bond_dev); + bond_dev->header_ops = slave_dev->header_ops;
bond_dev->type = slave_dev->type; @@ -1513,6 +1517,8 @@ static void bond_setup_by_slave(struct net_device *bond_dev, bond_dev->flags &= ~(IFF_BROADCAST | IFF_MULTICAST); bond_dev->flags |= (IFF_POINTOPOINT | IFF_NOARP); } + if (was_up) + dev_open(bond_dev, NULL); }
/* On bonding slaves other than the currently active slave, suppress
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Linus Walleij linus.walleij@linaro.org
[ Upstream commit 510e35fb931ffc3b100e5d5ae4595cd3beca9f1a ]
Enumerator 3 is 1548 bytes according to the datasheet. Not 1542.
Fixes: 4d5ae32f5e1e ("net: ethernet: Add a driver for Gemini gigabit ethernet") Reviewed-by: Andrew Lunn andrew@lunn.ch Signed-off-by: Linus Walleij linus.walleij@linaro.org Reviewed-by: Vladimir Oltean olteanv@gmail.com Link: https://lore.kernel.org/r/20231109-gemini-largeframe-fix-v4-1-6e611528db08@l... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/cortina/gemini.c | 4 ++-- drivers/net/ethernet/cortina/gemini.h | 2 +- 2 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/net/ethernet/cortina/gemini.c b/drivers/net/ethernet/cortina/gemini.c index 5715b9ab2712e..9fe8d22a26719 100644 --- a/drivers/net/ethernet/cortina/gemini.c +++ b/drivers/net/ethernet/cortina/gemini.c @@ -432,8 +432,8 @@ static const struct gmac_max_framelen gmac_maxlens[] = { .val = CONFIG0_MAXLEN_1536, }, { - .max_l3_len = 1542, - .val = CONFIG0_MAXLEN_1542, + .max_l3_len = 1548, + .val = CONFIG0_MAXLEN_1548, }, { .max_l3_len = 9212, diff --git a/drivers/net/ethernet/cortina/gemini.h b/drivers/net/ethernet/cortina/gemini.h index 9fdf77d5eb374..99efb11557436 100644 --- a/drivers/net/ethernet/cortina/gemini.h +++ b/drivers/net/ethernet/cortina/gemini.h @@ -787,7 +787,7 @@ union gmac_config0 { #define CONFIG0_MAXLEN_1536 0 #define CONFIG0_MAXLEN_1518 1 #define CONFIG0_MAXLEN_1522 2 -#define CONFIG0_MAXLEN_1542 3 +#define CONFIG0_MAXLEN_1548 3 #define CONFIG0_MAXLEN_9k 4 /* 9212 */ #define CONFIG0_MAXLEN_10k 5 /* 10236 */ #define CONFIG0_MAXLEN_1518__6 6
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Linus Walleij linus.walleij@linaro.org
[ Upstream commit d4d0c5b4d279bfe3585fbd806efefd3e51c82afa ]
The Gemini ethernet controller provides hardware checksumming for frames up to 1514 bytes including ethernet headers but not FCS.
If we start sending bigger frames (after first bumping up the MTU on both interfaces sending and receiving the frames), truncated packets start to appear on the target such as in this tcpdump resulting from ping -s 1474:
23:34:17.241983 14:d6:4d:a8:3c:4f (oui Unknown) > bc:ae:c5:6b:a8:3d (oui Unknown), ethertype IPv4 (0x0800), length 1514: truncated-ip - 2 bytes missing! (tos 0x0, ttl 64, id 32653, offset 0, flags [DF], proto ICMP (1), length 1502) OpenWrt.lan > Fecusia: ICMP echo request, id 1672, seq 50, length 1482
If we bypass the hardware checksumming and provide a software fallback, everything starts working fine up to the max TX MTU of 2047 bytes, for example ping -s2000 192.168.1.2:
00:44:29.587598 bc:ae:c5:6b:a8:3d (oui Unknown) > 14:d6:4d:a8:3c:4f (oui Unknown), ethertype IPv4 (0x0800), length 2042: (tos 0x0, ttl 64, id 51828, offset 0, flags [none], proto ICMP (1), length 2028) Fecusia > OpenWrt.lan: ICMP echo reply, id 1683, seq 4, length 2008
The bit enabling to bypass hardware checksum (or any of the "TSS" bits) are undocumented in the hardware reference manual. The entire hardware checksum unit appears undocumented. The conclusion that we need to use the "bypass" bit was found by trial-and-error.
Since no hardware checksum will happen, we slot in a software checksum fallback.
Check for the condition where we need to compute checksum on the skb with either hardware or software using == CHECKSUM_PARTIAL instead of != CHECKSUM_NONE which is an incomplete check according to <linux/skbuff.h>.
On the D-Link DIR-685 router this fixes a bug on the conduit interface to the RTL8366RB DSA switch: as the switch needs to add space for its tag it increases the MTU on the conduit interface to 1504 and that means that when the router sends packages of 1500 bytes these get an extra 4 bytes of DSA tag and the transfer fails because of the erroneous hardware checksumming, affecting such basic functionality as the LuCI web interface.
Fixes: 4d5ae32f5e1e ("net: ethernet: Add a driver for Gemini gigabit ethernet") Signed-off-by: Linus Walleij linus.walleij@linaro.org Reviewed-by: Vladimir Oltean olteanv@gmail.com Link: https://lore.kernel.org/r/20231109-gemini-largeframe-fix-v4-2-6e611528db08@l... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/cortina/gemini.c | 24 +++++++++++++++++++++++- 1 file changed, 23 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/cortina/gemini.c b/drivers/net/ethernet/cortina/gemini.c index 9fe8d22a26719..a2d4ace01ed3c 100644 --- a/drivers/net/ethernet/cortina/gemini.c +++ b/drivers/net/ethernet/cortina/gemini.c @@ -1145,6 +1145,7 @@ static int gmac_map_tx_bufs(struct net_device *netdev, struct sk_buff *skb, dma_addr_t mapping; unsigned short mtu; void *buffer; + int ret;
mtu = ETH_HLEN; mtu += netdev->mtu; @@ -1159,9 +1160,30 @@ static int gmac_map_tx_bufs(struct net_device *netdev, struct sk_buff *skb, word3 |= mtu; }
- if (skb->ip_summed != CHECKSUM_NONE) { + if (skb->len >= ETH_FRAME_LEN) { + /* Hardware offloaded checksumming isn't working on frames + * bigger than 1514 bytes. A hypothesis about this is that the + * checksum buffer is only 1518 bytes, so when the frames get + * bigger they get truncated, or the last few bytes get + * overwritten by the FCS. + * + * Just use software checksumming and bypass on bigger frames. + */ + if (skb->ip_summed == CHECKSUM_PARTIAL) { + ret = skb_checksum_help(skb); + if (ret) + return ret; + } + word1 |= TSS_BYPASS_BIT; + } else if (skb->ip_summed == CHECKSUM_PARTIAL) { int tcp = 0;
+ /* We do not switch off the checksumming on non TCP/UDP + * frames: as is shown from tests, the checksumming engine + * is smart enough to see that a frame is not actually TCP + * or UDP and then just pass it through without any changes + * to the frame. + */ if (skb->protocol == htons(ETH_P_IP)) { word1 |= TSS_IP_CHKSUM_BIT; tcp = ip_hdr(skb)->protocol == IPPROTO_TCP;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Linus Walleij linus.walleij@linaro.org
[ Upstream commit dc6c0bfbaa947dd7976e30e8c29b10c868b6fa42 ]
The RX max frame size is over 10000 for the Gemini ethernet, but the TX max frame size is actually just 2047 (0x7ff after checking the datasheet). Reflect this in what we offer to Linux, cap the MTU at the TX max frame minus ethernet headers.
We delete the code disabling the hardware checksum for large MTUs as netdev->mtu can no longer be larger than netdev->max_mtu meaning the if()-clause in gmac_fix_features() is never true.
Fixes: 4d5ae32f5e1e ("net: ethernet: Add a driver for Gemini gigabit ethernet") Reviewed-by: Andrew Lunn andrew@lunn.ch Signed-off-by: Linus Walleij linus.walleij@linaro.org Reviewed-by: Vladimir Oltean olteanv@gmail.com Link: https://lore.kernel.org/r/20231109-gemini-largeframe-fix-v4-3-6e611528db08@l... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/cortina/gemini.c | 17 ++++------------- drivers/net/ethernet/cortina/gemini.h | 2 +- 2 files changed, 5 insertions(+), 14 deletions(-)
diff --git a/drivers/net/ethernet/cortina/gemini.c b/drivers/net/ethernet/cortina/gemini.c index a2d4ace01ed3c..e7137b468f5bc 100644 --- a/drivers/net/ethernet/cortina/gemini.c +++ b/drivers/net/ethernet/cortina/gemini.c @@ -2000,15 +2000,6 @@ static int gmac_change_mtu(struct net_device *netdev, int new_mtu) return 0; }
-static netdev_features_t gmac_fix_features(struct net_device *netdev, - netdev_features_t features) -{ - if (netdev->mtu + ETH_HLEN + VLAN_HLEN > MTU_SIZE_BIT_MASK) - features &= ~GMAC_OFFLOAD_FEATURES; - - return features; -} - static int gmac_set_features(struct net_device *netdev, netdev_features_t features) { @@ -2234,7 +2225,6 @@ static const struct net_device_ops gmac_351x_ops = { .ndo_set_mac_address = gmac_set_mac_address, .ndo_get_stats64 = gmac_get_stats64, .ndo_change_mtu = gmac_change_mtu, - .ndo_fix_features = gmac_fix_features, .ndo_set_features = gmac_set_features, };
@@ -2486,11 +2476,12 @@ static int gemini_ethernet_port_probe(struct platform_device *pdev)
netdev->hw_features = GMAC_OFFLOAD_FEATURES; netdev->features |= GMAC_OFFLOAD_FEATURES | NETIF_F_GRO; - /* We can handle jumbo frames up to 10236 bytes so, let's accept - * payloads of 10236 bytes minus VLAN and ethernet header + /* We can receive jumbo frames up to 10236 bytes but only + * transmit 2047 bytes so, let's accept payloads of 2047 + * bytes minus VLAN and ethernet header */ netdev->min_mtu = ETH_MIN_MTU; - netdev->max_mtu = 10236 - VLAN_ETH_HLEN; + netdev->max_mtu = MTU_SIZE_BIT_MASK - VLAN_ETH_HLEN;
port->freeq_refill = 0; netif_napi_add(netdev, &port->napi, gmac_napi_poll); diff --git a/drivers/net/ethernet/cortina/gemini.h b/drivers/net/ethernet/cortina/gemini.h index 99efb11557436..24bb989981f23 100644 --- a/drivers/net/ethernet/cortina/gemini.h +++ b/drivers/net/ethernet/cortina/gemini.h @@ -502,7 +502,7 @@ union gmac_txdesc_3 { #define SOF_BIT 0x80000000 #define EOF_BIT 0x40000000 #define EOFIE_BIT BIT(29) -#define MTU_SIZE_BIT_MASK 0x1fff +#define MTU_SIZE_BIT_MASK 0x7ff /* Max MTU 2047 bytes */
/* GMAC Tx Descriptor */ struct gmac_txdesc {
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Dumazet edumazet@google.com
[ Upstream commit 4b7b492615cf3017190f55444f7016812b66611d ]
syzbot reported the following crash [1]
After releasing unix socket lock, u->oob_skb can be changed by another thread. We must temporarily increase skb refcount to make sure this other thread will not free the skb under us.
[1]
BUG: KASAN: slab-use-after-free in unix_stream_read_actor+0xa7/0xc0 net/unix/af_unix.c:2866 Read of size 4 at addr ffff88801f3b9cc4 by task syz-executor107/5297
CPU: 1 PID: 5297 Comm: syz-executor107 Not tainted 6.6.0-syzkaller-15910-gb8e3a87a627b #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/09/2023 Call Trace: <TASK> __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0xd9/0x1b0 lib/dump_stack.c:106 print_address_description mm/kasan/report.c:364 [inline] print_report+0xc4/0x620 mm/kasan/report.c:475 kasan_report+0xda/0x110 mm/kasan/report.c:588 unix_stream_read_actor+0xa7/0xc0 net/unix/af_unix.c:2866 unix_stream_recv_urg net/unix/af_unix.c:2587 [inline] unix_stream_read_generic+0x19a5/0x2480 net/unix/af_unix.c:2666 unix_stream_recvmsg+0x189/0x1b0 net/unix/af_unix.c:2903 sock_recvmsg_nosec net/socket.c:1044 [inline] sock_recvmsg+0xe2/0x170 net/socket.c:1066 ____sys_recvmsg+0x21f/0x5c0 net/socket.c:2803 ___sys_recvmsg+0x115/0x1a0 net/socket.c:2845 __sys_recvmsg+0x114/0x1e0 net/socket.c:2875 do_syscall_x64 arch/x86/entry/common.c:51 [inline] do_syscall_64+0x3f/0x110 arch/x86/entry/common.c:82 entry_SYSCALL_64_after_hwframe+0x63/0x6b RIP: 0033:0x7fc67492c559 Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 51 18 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007fc6748ab228 EFLAGS: 00000246 ORIG_RAX: 000000000000002f RAX: ffffffffffffffda RBX: 000000000000001c RCX: 00007fc67492c559 RDX: 0000000040010083 RSI: 0000000020000140 RDI: 0000000000000004 RBP: 00007fc6749b6348 R08: 00007fc6748ab6c0 R09: 00007fc6748ab6c0 R10: 0000000000000000 R11: 0000000000000246 R12: 00007fc6749b6340 R13: 00007fc6749b634c R14: 00007ffe9fac52a0 R15: 00007ffe9fac5388 </TASK>
Allocated by task 5295: kasan_save_stack+0x33/0x50 mm/kasan/common.c:45 kasan_set_track+0x25/0x30 mm/kasan/common.c:52 __kasan_slab_alloc+0x81/0x90 mm/kasan/common.c:328 kasan_slab_alloc include/linux/kasan.h:188 [inline] slab_post_alloc_hook mm/slab.h:763 [inline] slab_alloc_node mm/slub.c:3478 [inline] kmem_cache_alloc_node+0x180/0x3c0 mm/slub.c:3523 __alloc_skb+0x287/0x330 net/core/skbuff.c:641 alloc_skb include/linux/skbuff.h:1286 [inline] alloc_skb_with_frags+0xe4/0x710 net/core/skbuff.c:6331 sock_alloc_send_pskb+0x7e4/0x970 net/core/sock.c:2780 sock_alloc_send_skb include/net/sock.h:1884 [inline] queue_oob net/unix/af_unix.c:2147 [inline] unix_stream_sendmsg+0xb5f/0x10a0 net/unix/af_unix.c:2301 sock_sendmsg_nosec net/socket.c:730 [inline] __sock_sendmsg+0xd5/0x180 net/socket.c:745 ____sys_sendmsg+0x6ac/0x940 net/socket.c:2584 ___sys_sendmsg+0x135/0x1d0 net/socket.c:2638 __sys_sendmsg+0x117/0x1e0 net/socket.c:2667 do_syscall_x64 arch/x86/entry/common.c:51 [inline] do_syscall_64+0x3f/0x110 arch/x86/entry/common.c:82 entry_SYSCALL_64_after_hwframe+0x63/0x6b
Freed by task 5295: kasan_save_stack+0x33/0x50 mm/kasan/common.c:45 kasan_set_track+0x25/0x30 mm/kasan/common.c:52 kasan_save_free_info+0x2b/0x40 mm/kasan/generic.c:522 ____kasan_slab_free mm/kasan/common.c:236 [inline] ____kasan_slab_free+0x15b/0x1b0 mm/kasan/common.c:200 kasan_slab_free include/linux/kasan.h:164 [inline] slab_free_hook mm/slub.c:1800 [inline] slab_free_freelist_hook+0x114/0x1e0 mm/slub.c:1826 slab_free mm/slub.c:3809 [inline] kmem_cache_free+0xf8/0x340 mm/slub.c:3831 kfree_skbmem+0xef/0x1b0 net/core/skbuff.c:1015 __kfree_skb net/core/skbuff.c:1073 [inline] consume_skb net/core/skbuff.c:1288 [inline] consume_skb+0xdf/0x170 net/core/skbuff.c:1282 queue_oob net/unix/af_unix.c:2178 [inline] unix_stream_sendmsg+0xd49/0x10a0 net/unix/af_unix.c:2301 sock_sendmsg_nosec net/socket.c:730 [inline] __sock_sendmsg+0xd5/0x180 net/socket.c:745 ____sys_sendmsg+0x6ac/0x940 net/socket.c:2584 ___sys_sendmsg+0x135/0x1d0 net/socket.c:2638 __sys_sendmsg+0x117/0x1e0 net/socket.c:2667 do_syscall_x64 arch/x86/entry/common.c:51 [inline] do_syscall_64+0x3f/0x110 arch/x86/entry/common.c:82 entry_SYSCALL_64_after_hwframe+0x63/0x6b
The buggy address belongs to the object at ffff88801f3b9c80 which belongs to the cache skbuff_head_cache of size 240 The buggy address is located 68 bytes inside of freed 240-byte region [ffff88801f3b9c80, ffff88801f3b9d70)
The buggy address belongs to the physical page: page:ffffea00007cee40 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x1f3b9 flags: 0xfff00000000800(slab|node=0|zone=1|lastcpupid=0x7ff) page_type: 0xffffffff() raw: 00fff00000000800 ffff888142a60640 dead000000000122 0000000000000000 raw: 0000000000000000 00000000000c000c 00000001ffffffff 0000000000000000 page dumped because: kasan: bad access detected page_owner tracks the page as allocated page last allocated via order 0, migratetype Unmovable, gfp_mask 0x12cc0(GFP_KERNEL|__GFP_NOWARN|__GFP_NORETRY), pid 5299, tgid 5283 (syz-executor107), ts 103803840339, free_ts 103600093431 set_page_owner include/linux/page_owner.h:31 [inline] post_alloc_hook+0x2cf/0x340 mm/page_alloc.c:1537 prep_new_page mm/page_alloc.c:1544 [inline] get_page_from_freelist+0xa25/0x36c0 mm/page_alloc.c:3312 __alloc_pages+0x1d0/0x4a0 mm/page_alloc.c:4568 alloc_pages_mpol+0x258/0x5f0 mm/mempolicy.c:2133 alloc_slab_page mm/slub.c:1870 [inline] allocate_slab+0x251/0x380 mm/slub.c:2017 new_slab mm/slub.c:2070 [inline] ___slab_alloc+0x8c7/0x1580 mm/slub.c:3223 __slab_alloc.constprop.0+0x56/0xa0 mm/slub.c:3322 __slab_alloc_node mm/slub.c:3375 [inline] slab_alloc_node mm/slub.c:3468 [inline] kmem_cache_alloc_node+0x132/0x3c0 mm/slub.c:3523 __alloc_skb+0x287/0x330 net/core/skbuff.c:641 alloc_skb include/linux/skbuff.h:1286 [inline] alloc_skb_with_frags+0xe4/0x710 net/core/skbuff.c:6331 sock_alloc_send_pskb+0x7e4/0x970 net/core/sock.c:2780 sock_alloc_send_skb include/net/sock.h:1884 [inline] queue_oob net/unix/af_unix.c:2147 [inline] unix_stream_sendmsg+0xb5f/0x10a0 net/unix/af_unix.c:2301 sock_sendmsg_nosec net/socket.c:730 [inline] __sock_sendmsg+0xd5/0x180 net/socket.c:745 ____sys_sendmsg+0x6ac/0x940 net/socket.c:2584 ___sys_sendmsg+0x135/0x1d0 net/socket.c:2638 __sys_sendmsg+0x117/0x1e0 net/socket.c:2667 page last free stack trace: reset_page_owner include/linux/page_owner.h:24 [inline] free_pages_prepare mm/page_alloc.c:1137 [inline] free_unref_page_prepare+0x4f8/0xa90 mm/page_alloc.c:2347 free_unref_page+0x33/0x3b0 mm/page_alloc.c:2487 __unfreeze_partials+0x21d/0x240 mm/slub.c:2655 qlink_free mm/kasan/quarantine.c:168 [inline] qlist_free_all+0x6a/0x170 mm/kasan/quarantine.c:187 kasan_quarantine_reduce+0x18e/0x1d0 mm/kasan/quarantine.c:294 __kasan_slab_alloc+0x65/0x90 mm/kasan/common.c:305 kasan_slab_alloc include/linux/kasan.h:188 [inline] slab_post_alloc_hook mm/slab.h:763 [inline] slab_alloc_node mm/slub.c:3478 [inline] slab_alloc mm/slub.c:3486 [inline] __kmem_cache_alloc_lru mm/slub.c:3493 [inline] kmem_cache_alloc+0x15d/0x380 mm/slub.c:3502 vm_area_dup+0x21/0x2f0 kernel/fork.c:500 __split_vma+0x17d/0x1070 mm/mmap.c:2365 split_vma mm/mmap.c:2437 [inline] vma_modify+0x25d/0x450 mm/mmap.c:2472 vma_modify_flags include/linux/mm.h:3271 [inline] mprotect_fixup+0x228/0xc80 mm/mprotect.c:635 do_mprotect_pkey+0x852/0xd60 mm/mprotect.c:809 __do_sys_mprotect mm/mprotect.c:830 [inline] __se_sys_mprotect mm/mprotect.c:827 [inline] __x64_sys_mprotect+0x78/0xb0 mm/mprotect.c:827 do_syscall_x64 arch/x86/entry/common.c:51 [inline] do_syscall_64+0x3f/0x110 arch/x86/entry/common.c:82 entry_SYSCALL_64_after_hwframe+0x63/0x6b
Memory state around the buggy address: ffff88801f3b9b80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff88801f3b9c00: fb fb fb fb fb fb fc fc fc fc fc fc fc fc fc fc
ffff88801f3b9c80: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^ ffff88801f3b9d00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fc fc ffff88801f3b9d80: fc fc fc fc fc fc fc fc fa fb fb fb fb fb fb fb
Fixes: 876c14ad014d ("af_unix: fix holding spinlock in oob handling") Reported-and-tested-by: syzbot+7a2d546fa43e49315ed3@syzkaller.appspotmail.com Signed-off-by: Eric Dumazet edumazet@google.com Cc: Rao Shoaib rao.shoaib@oracle.com Reviewed-by: Rao shoaib rao.shoaib@oracle.com Link: https://lore.kernel.org/r/20231113134938.168151-1-edumazet@google.com Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/unix/af_unix.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c index 3e8a04a136688..3e6eeacb13aec 100644 --- a/net/unix/af_unix.c +++ b/net/unix/af_unix.c @@ -2553,15 +2553,16 @@ static int unix_stream_recv_urg(struct unix_stream_read_state *state)
if (!(state->flags & MSG_PEEK)) WRITE_ONCE(u->oob_skb, NULL); - + else + skb_get(oob_skb); unix_state_unlock(sk);
chunk = state->recv_actor(oob_skb, 0, chunk, state);
- if (!(state->flags & MSG_PEEK)) { + if (!(state->flags & MSG_PEEK)) UNIXCB(oob_skb).consumed += 1; - kfree_skb(oob_skb); - } + + consume_skb(oob_skb);
mutex_unlock(&u->iolock);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Linkui Xiao xiaolinkui@kylinos.cn
[ Upstream commit a44af08e3d4d7566eeea98d7a29fe06e7b9de944 ]
K2CI reported a problem:
consume_skb(skb); return err; [nf_br_ip_fragment() error] uninitialized symbol 'err'.
err is not initialized, because returning 0 is expected, initialize err to 0.
Fixes: 3c171f496ef5 ("netfilter: bridge: add connection tracking system") Reported-by: k2ci kernel-bot@kylinos.cn Signed-off-by: Linkui Xiao xiaolinkui@kylinos.cn Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/bridge/netfilter/nf_conntrack_bridge.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/bridge/netfilter/nf_conntrack_bridge.c b/net/bridge/netfilter/nf_conntrack_bridge.c index 71056ee847736..0fcf357ea7ad3 100644 --- a/net/bridge/netfilter/nf_conntrack_bridge.c +++ b/net/bridge/netfilter/nf_conntrack_bridge.c @@ -37,7 +37,7 @@ static int nf_br_ip_fragment(struct net *net, struct sock *sk, ktime_t tstamp = skb->tstamp; struct ip_frag_state state; struct iphdr *iph; - int err; + int err = 0;
/* for offloaded checksums cleanup checksum before fragmentation */ if (skb->ip_summed == CHECKSUM_PARTIAL &&
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dan Carpenter dan.carpenter@linaro.org
[ Upstream commit c301f0981fdd3fd1ffac6836b423c4d7a8e0eb63 ]
The problem is in nft_byteorder_eval() where we are iterating through a loop and writing to dst[0], dst[1], dst[2] and so on... On each iteration we are writing 8 bytes. But dst[] is an array of u32 so each element only has space for 4 bytes. That means that every iteration overwrites part of the previous element.
I spotted this bug while reviewing commit caf3ef7468f7 ("netfilter: nf_tables: prevent OOB access in nft_byteorder_eval") which is a related issue. I think that the reason we have not detected this bug in testing is that most of time we only write one element.
Fixes: ce1e7989d989 ("netfilter: nft_byteorder: provide 64bit le/be conversion") Signed-off-by: Dan Carpenter dan.carpenter@linaro.org Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Sasha Levin sashal@kernel.org --- include/net/netfilter/nf_tables.h | 4 ++-- net/netfilter/nft_byteorder.c | 5 +++-- net/netfilter/nft_meta.c | 2 +- 3 files changed, 6 insertions(+), 5 deletions(-)
diff --git a/include/net/netfilter/nf_tables.h b/include/net/netfilter/nf_tables.h index 7c816359d5a98..75972e211ba12 100644 --- a/include/net/netfilter/nf_tables.h +++ b/include/net/netfilter/nf_tables.h @@ -178,9 +178,9 @@ static inline __be32 nft_reg_load_be32(const u32 *sreg) return *(__force __be32 *)sreg; }
-static inline void nft_reg_store64(u32 *dreg, u64 val) +static inline void nft_reg_store64(u64 *dreg, u64 val) { - put_unaligned(val, (u64 *)dreg); + put_unaligned(val, dreg); }
static inline u64 nft_reg_load64(const u32 *sreg) diff --git a/net/netfilter/nft_byteorder.c b/net/netfilter/nft_byteorder.c index e596d1a842f70..f6e791a681015 100644 --- a/net/netfilter/nft_byteorder.c +++ b/net/netfilter/nft_byteorder.c @@ -38,13 +38,14 @@ void nft_byteorder_eval(const struct nft_expr *expr,
switch (priv->size) { case 8: { + u64 *dst64 = (void *)dst; u64 src64;
switch (priv->op) { case NFT_BYTEORDER_NTOH: for (i = 0; i < priv->len / 8; i++) { src64 = nft_reg_load64(&src[i]); - nft_reg_store64(&dst[i], + nft_reg_store64(&dst64[i], be64_to_cpu((__force __be64)src64)); } break; @@ -52,7 +53,7 @@ void nft_byteorder_eval(const struct nft_expr *expr, for (i = 0; i < priv->len / 8; i++) { src64 = (__force __u64) cpu_to_be64(nft_reg_load64(&src[i])); - nft_reg_store64(&dst[i], src64); + nft_reg_store64(&dst64[i], src64); } break; } diff --git a/net/netfilter/nft_meta.c b/net/netfilter/nft_meta.c index 8fdc7318c03c7..715484665a907 100644 --- a/net/netfilter/nft_meta.c +++ b/net/netfilter/nft_meta.c @@ -63,7 +63,7 @@ nft_meta_get_eval_time(enum nft_meta_keys key, { switch (key) { case NFT_META_TIME_NS: - nft_reg_store64(dest, ktime_get_real_ns()); + nft_reg_store64((u64 *)dest, ktime_get_real_ns()); break; case NFT_META_TIME_DAY: nft_reg_store8(dest, nft_meta_weekday());
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pablo Neira Ayuso pablo@netfilter.org
[ Upstream commit a7d5a955bfa854ac6b0c53aaf933394b4e6139e4 ]
destroy element command bogusly reports ENOENT in case a set element does not exist. ENOENT errors are skipped, however, err is still set and propagated to userspace.
# nft destroy element ip raw BLACKLIST { 1.2.3.4 } Error: Could not process rule: No such file or directory destroy element ip raw BLACKLIST { 1.2.3.4 } ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Fixes: f80a612dd77c ("netfilter: nf_tables: add support to destroy operation") Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/netfilter/nf_tables_api.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c index 8776266ba1532..398a1bcc6ea61 100644 --- a/net/netfilter/nf_tables_api.c +++ b/net/netfilter/nf_tables_api.c @@ -7202,10 +7202,11 @@ static int nf_tables_delsetelem(struct sk_buff *skb,
if (err < 0) { NL_SET_BAD_ATTR(extack, attr); - break; + return err; } } - return err; + + return 0; }
/*
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Baruch Siach baruch@tkos.co.il
[ Upstream commit fa02de9e75889915b554eda1964a631fd019973b ]
The while loop condition verifies 'count < limit'. Neither value change before the 'count >= limit' check. As is this check is dead code. But code inspection reveals a code path that modifies 'count' and then goto 'drain_data' and back to 'read_again'. So there is a need to verify count value sanity after 'read_again'.
Move 'read_again' up to fix the count limit check.
Fixes: ec222003bd94 ("net: stmmac: Prepare to add Split Header support") Signed-off-by: Baruch Siach baruch@tkos.co.il Reviewed-by: Serge Semin fancer.lancer@gmail.com Link: https://lore.kernel.org/r/d9486296c3b6b12ab3a0515fcd47d56447a07bfc.169989737... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c index e840cadb2d75a..7f9b02bb22a30 100644 --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c @@ -5258,10 +5258,10 @@ static int stmmac_rx(struct stmmac_priv *priv, int limit, u32 queue) len = 0; }
+read_again: if (count >= limit) break;
-read_again: buf1_len = 0; buf2_len = 0; entry = next_entry;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Baruch Siach baruch@tkos.co.il
[ Upstream commit b6cb4541853c7ee512111b0e7ddf3cb66c99c137 ]
dma_rx_size can be set as low as 64. Rx budget might be higher than that. Make sure to not overrun allocated rx buffers when budget is larger.
Leave one descriptor unused to avoid wrap around of 'dirty_rx' vs 'cur_rx'.
Signed-off-by: Baruch Siach baruch@tkos.co.il Reviewed-by: Serge Semin fancer.lancer@gmail.com Fixes: 47dd7a540b8a ("net: add support for STMicroelectronics Ethernet controllers.") Link: https://lore.kernel.org/r/d95413e44c97d4692e72cec13a75f894abeb6998.169989737... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c index 7f9b02bb22a30..86ff015fba354 100644 --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c @@ -5223,6 +5223,7 @@ static int stmmac_rx(struct stmmac_priv *priv, int limit, u32 queue)
dma_dir = page_pool_get_dma_dir(rx_q->page_pool); buf_sz = DIV_ROUND_UP(priv->dma_conf.dma_buf_sz, PAGE_SIZE) * PAGE_SIZE; + limit = min(priv->dma_conf.dma_rx_size - 1, (unsigned int)limit);
if (netif_msg_rx_status(priv)) { void *rx_head;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Shannon Nelson shannon.nelson@amd.com
[ Upstream commit 09d4c14c6c5e6e781a3879fed7f8e116a18b8c65 ]
Use the qcq's interrupt index, not the irq number, to mask the interrupt. Since the irq number can be out of range from the number of possible interrupts, we can end up accessing and potentially scribbling on out-of-range and/or unmapped memory, making the kernel angry.
[ 3116.039364] BUG: unable to handle page fault for address: ffffbeea1c3edf84 [ 3116.047059] #PF: supervisor write access in kernel mode [ 3116.052895] #PF: error_code(0x0002) - not-present page [ 3116.058636] PGD 100000067 P4D 100000067 PUD 1001f2067 PMD 10f82e067 PTE 0 [ 3116.066221] Oops: 0002 [#1] SMP NOPTI [ 3116.092948] RIP: 0010:iowrite32+0x9/0x76 [ 3116.190452] Call Trace: [ 3116.193185] <IRQ> [ 3116.195430] ? show_trace_log_lvl+0x1d6/0x2f9 [ 3116.200298] ? show_trace_log_lvl+0x1d6/0x2f9 [ 3116.205166] ? pdsc_adminq_isr+0x43/0x55 [pds_core] [ 3116.210618] ? __die_body.cold+0x8/0xa [ 3116.214806] ? page_fault_oops+0x16d/0x1ac [ 3116.219382] ? exc_page_fault+0xbe/0x13b [ 3116.223764] ? asm_exc_page_fault+0x22/0x27 [ 3116.228440] ? iowrite32+0x9/0x76 [ 3116.232143] pdsc_adminq_isr+0x43/0x55 [pds_core] [ 3116.237627] __handle_irq_event_percpu+0x3a/0x184 [ 3116.243088] handle_irq_event+0x57/0xb0 [ 3116.247575] handle_edge_irq+0x87/0x225 [ 3116.252062] __common_interrupt+0x3e/0xbc [ 3116.256740] common_interrupt+0x7b/0x98 [ 3116.261216] </IRQ> [ 3116.263745] <TASK> [ 3116.266268] asm_common_interrupt+0x22/0x27
Reported-by: Joao Martins joao.m.martins@oracle.com Fixes: 01ba61b55b20 ("pds_core: Add adminq processing and commands") Signed-off-by: Shannon Nelson shannon.nelson@amd.com Link: https://lore.kernel.org/r/20231113183257.71110-2-shannon.nelson@amd.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/amd/pds_core/adminq.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/amd/pds_core/adminq.c b/drivers/net/ethernet/amd/pds_core/adminq.c index 045fe133f6ee9..5beadabc21361 100644 --- a/drivers/net/ethernet/amd/pds_core/adminq.c +++ b/drivers/net/ethernet/amd/pds_core/adminq.c @@ -146,7 +146,7 @@ irqreturn_t pdsc_adminq_isr(int irq, void *data) }
queue_work(pdsc->wq, &qcq->work); - pds_core_intr_mask(&pdsc->intr_ctrl[irq], PDS_CORE_INTR_MASK_CLEAR); + pds_core_intr_mask(&pdsc->intr_ctrl[qcq->intx], PDS_CORE_INTR_MASK_CLEAR);
return IRQ_HANDLED; }
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Shannon Nelson shannon.nelson@amd.com
[ Upstream commit 7c02f6ae676a954216a192612040f9a0cde3adf7 ]
Our friendly kernel test robot pointed out a couple of potential string truncation issues. None of which were we worried about, but can be relatively easily fixed to quiet the complaints.
Reported-by: kernel test robot lkp@intel.com Closes: https://lore.kernel.org/oe-kbuild-all/202310211736.66syyDpp-lkp@intel.com/ Fixes: 45d76f492938 ("pds_core: set up device and adminq") Signed-off-by: Shannon Nelson shannon.nelson@amd.com Link: https://lore.kernel.org/r/20231113183257.71110-3-shannon.nelson@amd.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/amd/pds_core/core.h | 2 +- drivers/net/ethernet/amd/pds_core/dev.c | 8 ++++++-- drivers/net/ethernet/amd/pds_core/devlink.c | 2 +- 3 files changed, 8 insertions(+), 4 deletions(-)
diff --git a/drivers/net/ethernet/amd/pds_core/core.h b/drivers/net/ethernet/amd/pds_core/core.h index e545fafc48196..b1c1f1007b065 100644 --- a/drivers/net/ethernet/amd/pds_core/core.h +++ b/drivers/net/ethernet/amd/pds_core/core.h @@ -15,7 +15,7 @@ #define PDSC_DRV_DESCRIPTION "AMD/Pensando Core Driver"
#define PDSC_WATCHDOG_SECS 5 -#define PDSC_QUEUE_NAME_MAX_SZ 32 +#define PDSC_QUEUE_NAME_MAX_SZ 16 #define PDSC_ADMINQ_MIN_LENGTH 16 /* must be a power of two */ #define PDSC_NOTIFYQ_LENGTH 64 /* must be a power of two */ #define PDSC_TEARDOWN_RECOVERY false diff --git a/drivers/net/ethernet/amd/pds_core/dev.c b/drivers/net/ethernet/amd/pds_core/dev.c index f77cd9f5a2fda..eb178728edba9 100644 --- a/drivers/net/ethernet/amd/pds_core/dev.c +++ b/drivers/net/ethernet/amd/pds_core/dev.c @@ -254,10 +254,14 @@ static int pdsc_identify(struct pdsc *pdsc) struct pds_core_drv_identity drv = {}; size_t sz; int err; + int n;
drv.drv_type = cpu_to_le32(PDS_DRIVER_LINUX); - snprintf(drv.driver_ver_str, sizeof(drv.driver_ver_str), - "%s %s", PDS_CORE_DRV_NAME, utsname()->release); + /* Catching the return quiets a Wformat-truncation complaint */ + n = snprintf(drv.driver_ver_str, sizeof(drv.driver_ver_str), + "%s %s", PDS_CORE_DRV_NAME, utsname()->release); + if (n > sizeof(drv.driver_ver_str)) + dev_dbg(pdsc->dev, "release name truncated, don't care\n");
/* Next let's get some info about the device * We use the devcmd_lock at this level in order to diff --git a/drivers/net/ethernet/amd/pds_core/devlink.c b/drivers/net/ethernet/amd/pds_core/devlink.c index d9607033bbf21..d2abf32b93fe3 100644 --- a/drivers/net/ethernet/amd/pds_core/devlink.c +++ b/drivers/net/ethernet/amd/pds_core/devlink.c @@ -104,7 +104,7 @@ int pdsc_dl_info_get(struct devlink *dl, struct devlink_info_req *req, struct pds_core_fw_list_info fw_list; struct pdsc *pdsc = devlink_priv(dl); union pds_core_dev_comp comp; - char buf[16]; + char buf[32]; int listlen; int err; int i;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ziwei Xiao ziweixiao@google.com
[ Upstream commit 278a370c1766060d2144d6cf0b06c101e1043b6d ]
Netpoll will explicilty pass the polling call with a budget of 0 to indicate it's clearing the Tx path only. For the gve_rx_poll and gve_xdp_poll, they were mistakenly taking the 0 budget as the indication to do all the work. Add check to avoid the rx path and xdp path being called when budget is 0. And also avoid napi_complete_done being called when budget is 0 for netpoll.
Fixes: f5cedc84a30d ("gve: Add transmit and receive support") Signed-off-by: Ziwei Xiao ziweixiao@google.com Link: https://lore.kernel.org/r/20231114004144.2022268-1-ziweixiao@google.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/google/gve/gve_main.c | 8 +++++++- drivers/net/ethernet/google/gve/gve_rx.c | 4 ---- drivers/net/ethernet/google/gve/gve_tx.c | 4 ---- 3 files changed, 7 insertions(+), 9 deletions(-)
diff --git a/drivers/net/ethernet/google/gve/gve_main.c b/drivers/net/ethernet/google/gve/gve_main.c index 465a6db5a40a8..79bfa2837a0e6 100644 --- a/drivers/net/ethernet/google/gve/gve_main.c +++ b/drivers/net/ethernet/google/gve/gve_main.c @@ -255,10 +255,13 @@ static int gve_napi_poll(struct napi_struct *napi, int budget) if (block->tx) { if (block->tx->q_num < priv->tx_cfg.num_queues) reschedule |= gve_tx_poll(block, budget); - else + else if (budget) reschedule |= gve_xdp_poll(block, budget); }
+ if (!budget) + return 0; + if (block->rx) { work_done = gve_rx_poll(block, budget); reschedule |= work_done == budget; @@ -299,6 +302,9 @@ static int gve_napi_poll_dqo(struct napi_struct *napi, int budget) if (block->tx) reschedule |= gve_tx_poll_dqo(block, /*do_clean=*/true);
+ if (!budget) + return 0; + if (block->rx) { work_done = gve_rx_poll_dqo(block, budget); reschedule |= work_done == budget; diff --git a/drivers/net/ethernet/google/gve/gve_rx.c b/drivers/net/ethernet/google/gve/gve_rx.c index e84a066aa1a40..73655347902d2 100644 --- a/drivers/net/ethernet/google/gve/gve_rx.c +++ b/drivers/net/ethernet/google/gve/gve_rx.c @@ -1007,10 +1007,6 @@ int gve_rx_poll(struct gve_notify_block *block, int budget)
feat = block->napi.dev->features;
- /* If budget is 0, do all the work */ - if (budget == 0) - budget = INT_MAX; - if (budget > 0) work_done = gve_clean_rx_done(rx, budget, feat);
diff --git a/drivers/net/ethernet/google/gve/gve_tx.c b/drivers/net/ethernet/google/gve/gve_tx.c index 6957a865cff37..9f6ffc4a54f0b 100644 --- a/drivers/net/ethernet/google/gve/gve_tx.c +++ b/drivers/net/ethernet/google/gve/gve_tx.c @@ -925,10 +925,6 @@ bool gve_xdp_poll(struct gve_notify_block *block, int budget) bool repoll; u32 to_do;
- /* If budget is 0, do all the work */ - if (budget == 0) - budget = INT_MAX; - /* Find out how much work there is to be done */ nic_done = gve_tx_load_event_counter(priv, tx); to_do = min_t(u32, (nic_done - tx->done), budget);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jens Axboe axboe@kernel.dk
[ Upstream commit a0d45c3f596be53c1bd8822a1984532d14fdcea9 ]
A previous commit added a trylock for getting the SQPOLL thread info via fdinfo, but this introduced a regression where we often fail to get it if the thread is busy. For that case, we end up not printing the current CPU and PID info.
Rather than rely on this lock, just print the pid we already stored in the io_sq_data struct, and ensure we update the current CPU every time we've slept or potentially rescheduled. The latter won't potentially be 100% accurate, but that wasn't the case before either as the task can get migrated at any time unless it has been pinned at creation time.
We retain keeping the io_sq_data dereference inside the ctx->uring_lock, as it has always been, as destruction of the thread and data happen below that. We could make this RCU safe, but there's little point in doing that.
With this, we always print the last valid information we had, rather than have spurious outputs with missing information.
Fixes: 7644b1a1c9a7 ("io_uring/fdinfo: lock SQ thread while retrieving thread cpu/pid") Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- io_uring/fdinfo.c | 9 ++------- io_uring/sqpoll.c | 12 ++++++++++-- 2 files changed, 12 insertions(+), 9 deletions(-)
diff --git a/io_uring/fdinfo.c b/io_uring/fdinfo.c index b603a06f7103d..5fcfe03ed93ec 100644 --- a/io_uring/fdinfo.c +++ b/io_uring/fdinfo.c @@ -139,13 +139,8 @@ static __cold void __io_uring_show_fdinfo(struct io_ring_ctx *ctx, if (has_lock && (ctx->flags & IORING_SETUP_SQPOLL)) { struct io_sq_data *sq = ctx->sq_data;
- if (mutex_trylock(&sq->lock)) { - if (sq->thread) { - sq_pid = task_pid_nr(sq->thread); - sq_cpu = task_cpu(sq->thread); - } - mutex_unlock(&sq->lock); - } + sq_pid = sq->task_pid; + sq_cpu = sq->sq_cpu; }
seq_printf(m, "SqThread:\t%d\n", sq_pid); diff --git a/io_uring/sqpoll.c b/io_uring/sqpoll.c index bd6c2c7959a5b..65b5dbe3c850e 100644 --- a/io_uring/sqpoll.c +++ b/io_uring/sqpoll.c @@ -214,6 +214,7 @@ static bool io_sqd_handle_event(struct io_sq_data *sqd) did_sig = get_signal(&ksig); cond_resched(); mutex_lock(&sqd->lock); + sqd->sq_cpu = raw_smp_processor_id(); } return did_sig || test_bit(IO_SQ_THREAD_SHOULD_STOP, &sqd->state); } @@ -229,10 +230,15 @@ static int io_sq_thread(void *data) snprintf(buf, sizeof(buf), "iou-sqp-%d", sqd->task_pid); set_task_comm(current, buf);
- if (sqd->sq_cpu != -1) + /* reset to our pid after we've set task_comm, for fdinfo */ + sqd->task_pid = current->pid; + + if (sqd->sq_cpu != -1) { set_cpus_allowed_ptr(current, cpumask_of(sqd->sq_cpu)); - else + } else { set_cpus_allowed_ptr(current, cpu_online_mask); + sqd->sq_cpu = raw_smp_processor_id(); + }
mutex_lock(&sqd->lock); while (1) { @@ -261,6 +267,7 @@ static int io_sq_thread(void *data) mutex_unlock(&sqd->lock); cond_resched(); mutex_lock(&sqd->lock); + sqd->sq_cpu = raw_smp_processor_id(); } continue; } @@ -294,6 +301,7 @@ static int io_sq_thread(void *data) mutex_unlock(&sqd->lock); schedule(); mutex_lock(&sqd->lock); + sqd->sq_cpu = raw_smp_processor_id(); } list_for_each_entry(ctx, &sqd->ctx_list, sqd_list) atomic_andnot(IORING_SQ_NEED_WAKEUP,
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Rahul Rameshbabu rrameshbabu@nvidia.com
[ Upstream commit fd64fd13c49a53012ce9170449dcd9eb71c11284 ]
When running a phase adjustment operation, the free running clock should not be modified at all. The phase control keyword is intended to trigger an internal servo on the device that will converge to the provided delta. A free running counter cannot implement phase adjustment.
Fixes: 8e11a68e2e8a ("net/mlx5: Add adjphase function to support hardware-only offset control") Signed-off-by: Rahul Rameshbabu rrameshbabu@nvidia.com Reviewed-by: Tariq Toukan tariqt@nvidia.com Signed-off-by: Saeed Mahameed saeedm@nvidia.com Link: https://lore.kernel.org/r/20231114215846.5902-5-saeed@kernel.org Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/mellanox/mlx5/core/lib/clock.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lib/clock.c b/drivers/net/ethernet/mellanox/mlx5/core/lib/clock.c index aa29f09e83564..0c83ef174275a 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/lib/clock.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/lib/clock.c @@ -384,7 +384,12 @@ static int mlx5_ptp_adjtime(struct ptp_clock_info *ptp, s64 delta)
static int mlx5_ptp_adjphase(struct ptp_clock_info *ptp, s32 delta) { - return mlx5_ptp_adjtime(ptp, delta); + struct mlx5_clock *clock = container_of(ptp, struct mlx5_clock, ptp_info); + struct mlx5_core_dev *mdev; + + mdev = container_of(clock, struct mlx5_core_dev, clock); + + return mlx5_ptp_adjtime_real_time(mdev, delta); }
static int mlx5_ptp_freq_adj_real_time(struct mlx5_core_dev *mdev, long scaled_ppm)
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dust Li dust.li@linux.alibaba.com
[ Upstream commit 6f9b1a0731662648949a1c0587f6acb3b7f8acf1 ]
When mlx5_packet_reformat_alloc() fails, the encap_header allocated in mlx5e_tc_tun_create_header_ipv4{6} will be released within it. However, e->encap_header is already set to the previously freed encap_header before mlx5_packet_reformat_alloc(). As a result, the later mlx5e_encap_put() will free e->encap_header again, causing a double free issue.
mlx5e_encap_put() --> mlx5e_encap_dealloc() --> kfree(e->encap_header)
This happens when cmd: MLX5_CMD_OP_ALLOC_PACKET_REFORMAT_CONTEXT fail.
This patch fix it by not setting e->encap_header until mlx5_packet_reformat_alloc() success.
Fixes: d589e785baf5e ("net/mlx5e: Allow concurrent creation of encap entries") Reported-by: Cruz Zhao cruzzhao@linux.alibaba.com Reported-by: Tianchen Ding dtcccc@linux.alibaba.com Signed-off-by: Dust Li dust.li@linux.alibaba.com Reviewed-by: Wojciech Drewek wojciech.drewek@intel.com Signed-off-by: Saeed Mahameed saeedm@nvidia.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun.c | 10 ++++------ 1 file changed, 4 insertions(+), 6 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun.c b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun.c index 00a04fdd756f5..8bca696b6658c 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun.c @@ -300,9 +300,6 @@ int mlx5e_tc_tun_create_header_ipv4(struct mlx5e_priv *priv, if (err) goto destroy_neigh_entry;
- e->encap_size = ipv4_encap_size; - e->encap_header = encap_header; - if (!(nud_state & NUD_VALID)) { neigh_event_send(attr.n, NULL); /* the encap entry will be made valid on neigh update event @@ -322,6 +319,8 @@ int mlx5e_tc_tun_create_header_ipv4(struct mlx5e_priv *priv, goto destroy_neigh_entry; }
+ e->encap_size = ipv4_encap_size; + e->encap_header = encap_header; e->flags |= MLX5_ENCAP_ENTRY_VALID; mlx5e_rep_queue_neigh_stats_work(netdev_priv(attr.out_dev)); mlx5e_route_lookup_ipv4_put(&attr); @@ -568,9 +567,6 @@ int mlx5e_tc_tun_create_header_ipv6(struct mlx5e_priv *priv, if (err) goto destroy_neigh_entry;
- e->encap_size = ipv6_encap_size; - e->encap_header = encap_header; - if (!(nud_state & NUD_VALID)) { neigh_event_send(attr.n, NULL); /* the encap entry will be made valid on neigh update event @@ -590,6 +586,8 @@ int mlx5e_tc_tun_create_header_ipv6(struct mlx5e_priv *priv, goto destroy_neigh_entry; }
+ e->encap_size = ipv6_encap_size; + e->encap_header = encap_header; e->flags |= MLX5_ENCAP_ENTRY_VALID; mlx5e_rep_queue_neigh_stats_work(netdev_priv(attr.out_dev)); mlx5e_route_lookup_ipv6_put(&attr);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Gavin Li gavinl@nvidia.com
[ Upstream commit 3a4aa3cb83563df942be49d145ee3b7ddf17d6bb ]
Follow up to the previous patch to fix the same issue for mlx5e_tc_tun_update_header_ipv4{6} when mlx5_packet_reformat_alloc() fails.
When mlx5_packet_reformat_alloc() fails, the encap_header allocated in mlx5e_tc_tun_update_header_ipv4{6} will be released within it. However, e->encap_header is already set to the previously freed encap_header before mlx5_packet_reformat_alloc(). As a result, the later mlx5e_encap_put() will free e->encap_header again, causing a double free issue.
mlx5e_encap_put() --> mlx5e_encap_dealloc() --> kfree(e->encap_header)
This patch fix it by not setting e->encap_header until mlx5_packet_reformat_alloc() success.
Fixes: a54e20b4fcae ("net/mlx5e: Add basic TC tunnel set action for SRIOV offloads") Signed-off-by: Gavin Li gavinl@nvidia.com Signed-off-by: Saeed Mahameed saeedm@nvidia.com Link: https://lore.kernel.org/r/20231114215846.5902-7-saeed@kernel.org Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- .../ethernet/mellanox/mlx5/core/en/tc_tun.c | 20 +++++++++---------- 1 file changed, 10 insertions(+), 10 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun.c b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun.c index 8bca696b6658c..668da5c70e63d 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun.c @@ -403,16 +403,12 @@ int mlx5e_tc_tun_update_header_ipv4(struct mlx5e_priv *priv, if (err) goto free_encap;
- e->encap_size = ipv4_encap_size; - kfree(e->encap_header); - e->encap_header = encap_header; - if (!(nud_state & NUD_VALID)) { neigh_event_send(attr.n, NULL); /* the encap entry will be made valid on neigh update event * and not used before that. */ - goto release_neigh; + goto free_encap; }
memset(&reformat_params, 0, sizeof(reformat_params)); @@ -426,6 +422,10 @@ int mlx5e_tc_tun_update_header_ipv4(struct mlx5e_priv *priv, goto free_encap; }
+ e->encap_size = ipv4_encap_size; + kfree(e->encap_header); + e->encap_header = encap_header; + e->flags |= MLX5_ENCAP_ENTRY_VALID; mlx5e_rep_queue_neigh_stats_work(netdev_priv(attr.out_dev)); mlx5e_route_lookup_ipv4_put(&attr); @@ -669,16 +669,12 @@ int mlx5e_tc_tun_update_header_ipv6(struct mlx5e_priv *priv, if (err) goto free_encap;
- e->encap_size = ipv6_encap_size; - kfree(e->encap_header); - e->encap_header = encap_header; - if (!(nud_state & NUD_VALID)) { neigh_event_send(attr.n, NULL); /* the encap entry will be made valid on neigh update event * and not used before that. */ - goto release_neigh; + goto free_encap; }
memset(&reformat_params, 0, sizeof(reformat_params)); @@ -692,6 +688,10 @@ int mlx5e_tc_tun_update_header_ipv6(struct mlx5e_priv *priv, goto free_encap; }
+ e->encap_size = ipv6_encap_size; + kfree(e->encap_header); + e->encap_header = encap_header; + e->flags |= MLX5_ENCAP_ENTRY_VALID; mlx5e_rep_queue_neigh_stats_work(netdev_priv(attr.out_dev)); mlx5e_route_lookup_ipv6_put(&attr);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Vlad Buslov vladbu@nvidia.com
[ Upstream commit 0c101a23ca7eaf00eef1328eefb04b3a93401cc8 ]
Referenced commit addressed endianness issue in mlx5 pedit implementation in ad hoc manner instead of systematically treating integer values according to their types which left pedit fields of sizes not equal to 4 and where the bytes being modified are not least significant ones broken on big endian machines since wrong bits will be consumed during parsing which leads to following example error when applying pedit to source and destination MAC addresses:
[Wed Oct 18 12:52:42 2023] mlx5_core 0001:00:00.1 p1v3_r: attempt to offload an unsupported field (cmd 0) [Wed Oct 18 12:52:42 2023] mask: 00000000330c5b68: 00 00 00 00 ff ff 00 00 00 00 ff ff 00 00 00 00 ................ [Wed Oct 18 12:52:42 2023] mask: 0000000017d22fd9: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ [Wed Oct 18 12:52:42 2023] mask: 000000008186d717: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ [Wed Oct 18 12:52:42 2023] mask: 0000000029eb6149: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ [Wed Oct 18 12:52:42 2023] mask: 000000007ed103e4: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ [Wed Oct 18 12:52:42 2023] mask: 00000000db8101a6: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ [Wed Oct 18 12:52:42 2023] mask: 00000000ec3c08a9: 00 00 00 00 00 00 00 00 00 00 00 00 ............
Treat masks and values of pedit and filter match as network byte order, refactor pointers to them to void pointers instead of confusing u32 pointers and only cast to pointer-to-integer when reading a value from them. Treat pedit mlx5_fields->field_mask as host byte order according to its type u32, change the constants in fields array accordingly.
Fixes: 82198d8bcdef ("net/mlx5e: Fix endianness when calculating pedit mask first bit") Signed-off-by: Vlad Buslov vladbu@nvidia.com Reviewed-by: Gal Pressman gal@nvidia.com Signed-off-by: Saeed Mahameed saeedm@nvidia.com Link: https://lore.kernel.org/r/20231114215846.5902-8-saeed@kernel.org Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- .../net/ethernet/mellanox/mlx5/core/en_tc.c | 60 ++++++++++--------- 1 file changed, 32 insertions(+), 28 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c index 5797d8607633e..fdef505c4b88f 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c @@ -3148,7 +3148,7 @@ static struct mlx5_fields fields[] = { OFFLOAD(DIPV6_31_0, 32, U32_MAX, ip6.daddr.s6_addr32[3], 0, dst_ipv4_dst_ipv6.ipv6_layout.ipv6[12]), OFFLOAD(IPV6_HOPLIMIT, 8, U8_MAX, ip6.hop_limit, 0, ttl_hoplimit), - OFFLOAD(IP_DSCP, 16, 0xc00f, ip6, 0, ip_dscp), + OFFLOAD(IP_DSCP, 16, 0x0fc0, ip6, 0, ip_dscp),
OFFLOAD(TCP_SPORT, 16, U16_MAX, tcp.source, 0, tcp_sport), OFFLOAD(TCP_DPORT, 16, U16_MAX, tcp.dest, 0, tcp_dport), @@ -3159,21 +3159,31 @@ static struct mlx5_fields fields[] = { OFFLOAD(UDP_DPORT, 16, U16_MAX, udp.dest, 0, udp_dport), };
-static unsigned long mask_to_le(unsigned long mask, int size) +static u32 mask_field_get(void *mask, struct mlx5_fields *f) { - __be32 mask_be32; - __be16 mask_be16; - - if (size == 32) { - mask_be32 = (__force __be32)(mask); - mask = (__force unsigned long)cpu_to_le32(be32_to_cpu(mask_be32)); - } else if (size == 16) { - mask_be32 = (__force __be32)(mask); - mask_be16 = *(__be16 *)&mask_be32; - mask = (__force unsigned long)cpu_to_le16(be16_to_cpu(mask_be16)); + switch (f->field_bsize) { + case 32: + return be32_to_cpu(*(__be32 *)mask) & f->field_mask; + case 16: + return be16_to_cpu(*(__be16 *)mask) & (u16)f->field_mask; + default: + return *(u8 *)mask & (u8)f->field_mask; } +}
- return mask; +static void mask_field_clear(void *mask, struct mlx5_fields *f) +{ + switch (f->field_bsize) { + case 32: + *(__be32 *)mask &= ~cpu_to_be32(f->field_mask); + break; + case 16: + *(__be16 *)mask &= ~cpu_to_be16((u16)f->field_mask); + break; + default: + *(u8 *)mask &= ~(u8)f->field_mask; + break; + } }
static int offload_pedit_fields(struct mlx5e_priv *priv, @@ -3185,11 +3195,12 @@ static int offload_pedit_fields(struct mlx5e_priv *priv, struct pedit_headers *set_masks, *add_masks, *set_vals, *add_vals; struct pedit_headers_action *hdrs = parse_attr->hdrs; void *headers_c, *headers_v, *action, *vals_p; - u32 *s_masks_p, *a_masks_p, s_mask, a_mask; struct mlx5e_tc_mod_hdr_acts *mod_acts; - unsigned long mask, field_mask; + void *s_masks_p, *a_masks_p; int i, first, last, next_z; struct mlx5_fields *f; + unsigned long mask; + u32 s_mask, a_mask; u8 cmd;
mod_acts = &parse_attr->mod_hdr_acts; @@ -3205,15 +3216,11 @@ static int offload_pedit_fields(struct mlx5e_priv *priv, bool skip;
f = &fields[i]; - /* avoid seeing bits set from previous iterations */ - s_mask = 0; - a_mask = 0; - s_masks_p = (void *)set_masks + f->offset; a_masks_p = (void *)add_masks + f->offset;
- s_mask = *s_masks_p & f->field_mask; - a_mask = *a_masks_p & f->field_mask; + s_mask = mask_field_get(s_masks_p, f); + a_mask = mask_field_get(a_masks_p, f);
if (!s_mask && !a_mask) /* nothing to offload here */ continue; @@ -3240,22 +3247,20 @@ static int offload_pedit_fields(struct mlx5e_priv *priv, match_mask, f->field_bsize)) skip = true; /* clear to denote we consumed this field */ - *s_masks_p &= ~f->field_mask; + mask_field_clear(s_masks_p, f); } else { cmd = MLX5_ACTION_TYPE_ADD; mask = a_mask; vals_p = (void *)add_vals + f->offset; /* add 0 is no change */ - if ((*(u32 *)vals_p & f->field_mask) == 0) + if (!mask_field_get(vals_p, f)) skip = true; /* clear to denote we consumed this field */ - *a_masks_p &= ~f->field_mask; + mask_field_clear(a_masks_p, f); } if (skip) continue;
- mask = mask_to_le(mask, f->field_bsize); - first = find_first_bit(&mask, f->field_bsize); next_z = find_next_zero_bit(&mask, f->field_bsize, first); last = find_last_bit(&mask, f->field_bsize); @@ -3282,10 +3287,9 @@ static int offload_pedit_fields(struct mlx5e_priv *priv, MLX5_SET(set_action_in, action, field, f->field);
if (cmd == MLX5_ACTION_TYPE_SET) { + unsigned long field_mask = f->field_mask; int start;
- field_mask = mask_to_le(f->field_mask, f->field_bsize); - /* if field is bit sized it can start not from first bit */ start = find_first_bit(&field_mask, f->field_bsize);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Rahul Rameshbabu rrameshbabu@nvidia.com
[ Upstream commit b608dd670bb69c89d5074685899d4c1089f57a16 ]
De-duplicate documentation by removing mellanox/mlx5/devlink.rst. Instead, only use the generic devlink documentation directory to document mlx5 devlink parameters. Avoid providing general devlink tool usage information in mlx5-specific documentation.
Signed-off-by: Rahul Rameshbabu rrameshbabu@nvidia.com Reviewed-by: Jiri Pirko jiri@nvidia.com Reviewed-by: Gal Pressman gal@nvidia.com Signed-off-by: Saeed Mahameed saeedm@nvidia.com Stable-dep-of: 92214be5979c ("net/mlx5e: Update doorbell for port timestamping CQ before the software counter") Signed-off-by: Sasha Levin sashal@kernel.org --- .../ethernet/mellanox/mlx5/devlink.rst | 313 ------------------ .../ethernet/mellanox/mlx5/index.rst | 1 - Documentation/networking/devlink/mlx5.rst | 179 ++++++++++ 3 files changed, 179 insertions(+), 314 deletions(-) delete mode 100644 Documentation/networking/device_drivers/ethernet/mellanox/mlx5/devlink.rst
diff --git a/Documentation/networking/device_drivers/ethernet/mellanox/mlx5/devlink.rst b/Documentation/networking/device_drivers/ethernet/mellanox/mlx5/devlink.rst deleted file mode 100644 index a4edf908b707c..0000000000000 --- a/Documentation/networking/device_drivers/ethernet/mellanox/mlx5/devlink.rst +++ /dev/null @@ -1,313 +0,0 @@ -.. SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB -.. include:: <isonum.txt> - -======= -Devlink -======= - -:Copyright: |copy| 2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved. - -Contents -======== - -- `Info`_ -- `Parameters`_ -- `Health reporters`_ - -Info -==== - -The devlink info reports the running and stored firmware versions on device. -It also prints the device PSID which represents the HCA board type ID. - -User command example:: - - $ devlink dev info pci/0000:00:06.0 - pci/0000:00:06.0: - driver mlx5_core - versions: - fixed: - fw.psid MT_0000000009 - running: - fw.version 16.26.0100 - stored: - fw.version 16.26.0100 - -Parameters -========== - -flow_steering_mode: Device flow steering mode ---------------------------------------------- -The flow steering mode parameter controls the flow steering mode of the driver. -Two modes are supported: - -1. 'dmfs' - Device managed flow steering. -2. 'smfs' - Software/Driver managed flow steering. - -In DMFS mode, the HW steering entities are created and managed through the -Firmware. -In SMFS mode, the HW steering entities are created and managed though by -the driver directly into hardware without firmware intervention. - -SMFS mode is faster and provides better rule insertion rate compared to default DMFS mode. - -User command examples: - -- Set SMFS flow steering mode:: - - $ devlink dev param set pci/0000:06:00.0 name flow_steering_mode value "smfs" cmode runtime - -- Read device flow steering mode:: - - $ devlink dev param show pci/0000:06:00.0 name flow_steering_mode - pci/0000:06:00.0: - name flow_steering_mode type driver-specific - values: - cmode runtime value smfs - -enable_roce: RoCE enablement state ----------------------------------- -If the device supports RoCE disablement, RoCE enablement state controls device -support for RoCE capability. Otherwise, the control occurs in the driver stack. -When RoCE is disabled at the driver level, only raw ethernet QPs are supported. - -To change RoCE enablement state, a user must change the driverinit cmode value -and run devlink reload. - -User command examples: - -- Disable RoCE:: - - $ devlink dev param set pci/0000:06:00.0 name enable_roce value false cmode driverinit - $ devlink dev reload pci/0000:06:00.0 - -- Read RoCE enablement state:: - - $ devlink dev param show pci/0000:06:00.0 name enable_roce - pci/0000:06:00.0: - name enable_roce type generic - values: - cmode driverinit value true - -esw_port_metadata: Eswitch port metadata state ----------------------------------------------- -When applicable, disabling eswitch metadata can increase packet rate -up to 20% depending on the use case and packet sizes. - -Eswitch port metadata state controls whether to internally tag packets with -metadata. Metadata tagging must be enabled for multi-port RoCE, failover -between representors and stacked devices. -By default metadata is enabled on the supported devices in E-switch. -Metadata is applicable only for E-switch in switchdev mode and -users may disable it when NONE of the below use cases will be in use: - -1. HCA is in Dual/multi-port RoCE mode. -2. VF/SF representor bonding (Usually used for Live migration) -3. Stacked devices - -When metadata is disabled, the above use cases will fail to initialize if -users try to enable them. - -- Show eswitch port metadata:: - - $ devlink dev param show pci/0000:06:00.0 name esw_port_metadata - pci/0000:06:00.0: - name esw_port_metadata type driver-specific - values: - cmode runtime value true - -- Disable eswitch port metadata:: - - $ devlink dev param set pci/0000:06:00.0 name esw_port_metadata value false cmode runtime - -- Change eswitch mode to switchdev mode where after choosing the metadata value:: - - $ devlink dev eswitch set pci/0000:06:00.0 mode switchdev - -hairpin_num_queues: Number of hairpin queues --------------------------------------------- -We refer to a TC NIC rule that involves forwarding as "hairpin". - -Hairpin queues are mlx5 hardware specific implementation for hardware -forwarding of such packets. - -- Show the number of hairpin queues:: - - $ devlink dev param show pci/0000:06:00.0 name hairpin_num_queues - pci/0000:06:00.0: - name hairpin_num_queues type driver-specific - values: - cmode driverinit value 2 - -- Change the number of hairpin queues:: - - $ devlink dev param set pci/0000:06:00.0 name hairpin_num_queues value 4 cmode driverinit - -hairpin_queue_size: Size of the hairpin queues ----------------------------------------------- -Control the size of the hairpin queues. - -- Show the size of the hairpin queues:: - - $ devlink dev param show pci/0000:06:00.0 name hairpin_queue_size - pci/0000:06:00.0: - name hairpin_queue_size type driver-specific - values: - cmode driverinit value 1024 - -- Change the size (in packets) of the hairpin queues:: - - $ devlink dev param set pci/0000:06:00.0 name hairpin_queue_size value 512 cmode driverinit - -Health reporters -================ - -tx reporter ------------ -The tx reporter is responsible for reporting and recovering of the following two error scenarios: - -- tx timeout - Report on kernel tx timeout detection. - Recover by searching lost interrupts. -- tx error completion - Report on error tx completion. - Recover by flushing the tx queue and reset it. - -tx reporter also support on demand diagnose callback, on which it provides -real time information of its send queues status. - -User commands examples: - -- Diagnose send queues status:: - - $ devlink health diagnose pci/0000:82:00.0 reporter tx - -.. note:: - This command has valid output only when interface is up, otherwise the command has empty output. - -- Show number of tx errors indicated, number of recover flows ended successfully, - is autorecover enabled and graceful period from last recover:: - - $ devlink health show pci/0000:82:00.0 reporter tx - -rx reporter ------------ -The rx reporter is responsible for reporting and recovering of the following two error scenarios: - -- rx queues' initialization (population) timeout - Population of rx queues' descriptors on ring initialization is done - in napi context via triggering an irq. In case of a failure to get - the minimum amount of descriptors, a timeout would occur, and - descriptors could be recovered by polling the EQ (Event Queue). -- rx completions with errors (reported by HW on interrupt context) - Report on rx completion error. - Recover (if needed) by flushing the related queue and reset it. - -rx reporter also supports on demand diagnose callback, on which it -provides real time information of its receive queues' status. - -- Diagnose rx queues' status and corresponding completion queue:: - - $ devlink health diagnose pci/0000:82:00.0 reporter rx - -NOTE: This command has valid output only when interface is up. Otherwise, the command has empty output. - -- Show number of rx errors indicated, number of recover flows ended successfully, - is autorecover enabled, and graceful period from last recover:: - - $ devlink health show pci/0000:82:00.0 reporter rx - -fw reporter ------------ -The fw reporter implements `diagnose` and `dump` callbacks. -It follows symptoms of fw error such as fw syndrome by triggering -fw core dump and storing it into the dump buffer. -The fw reporter diagnose command can be triggered any time by the user to check -current fw status. - -User commands examples: - -- Check fw heath status:: - - $ devlink health diagnose pci/0000:82:00.0 reporter fw - -- Read FW core dump if already stored or trigger new one:: - - $ devlink health dump show pci/0000:82:00.0 reporter fw - -.. note:: - This command can run only on the PF which has fw tracer ownership, - running it on other PF or any VF will return "Operation not permitted". - -fw fatal reporter ------------------ -The fw fatal reporter implements `dump` and `recover` callbacks. -It follows fatal errors indications by CR-space dump and recover flow. -The CR-space dump uses vsc interface which is valid even if the FW command -interface is not functional, which is the case in most FW fatal errors. -The recover function runs recover flow which reloads the driver and triggers fw -reset if needed. -On firmware error, the health buffer is dumped into the dmesg. The log -level is derived from the error's severity (given in health buffer). - -User commands examples: - -- Run fw recover flow manually:: - - $ devlink health recover pci/0000:82:00.0 reporter fw_fatal - -- Read FW CR-space dump if already stored or trigger new one:: - - $ devlink health dump show pci/0000:82:00.1 reporter fw_fatal - -.. note:: - This command can run only on PF. - -vnic reporter -------------- -The vnic reporter implements only the `diagnose` callback. -It is responsible for querying the vnic diagnostic counters from fw and displaying -them in realtime. - -Description of the vnic counters: - -- total_q_under_processor_handle - number of queues in an error state due to - an async error or errored command. -- send_queue_priority_update_flow - number of QP/SQ priority/SL update events. -- cq_overrun - number of times CQ entered an error state due to an overflow. -- async_eq_overrun - number of times an EQ mapped to async events was overrun. - comp_eq_overrun number of times an EQ mapped to completion events was - overrun. -- quota_exceeded_command - number of commands issued and failed due to quota exceeded. -- invalid_command - number of commands issued and failed dues to any reason other than quota - exceeded. -- nic_receive_steering_discard - number of packets that completed RX flow - steering but were discarded due to a mismatch in flow table. -- generated_pkt_steering_fail - number of packets generated by the VNIC experiencing unexpected steering - failure (at any point in steering flow). -- handled_pkt_steering_fail - number of packets handled by the VNIC experiencing unexpected steering - failure (at any point in steering flow owned by the VNIC, including the FDB - for the eswitch owner). - -User commands examples: - -- Diagnose PF/VF vnic counters:: - - $ devlink health diagnose pci/0000:82:00.1 reporter vnic - -- Diagnose representor vnic counters (performed by supplying devlink port of the - representor, which can be obtained via devlink port command):: - - $ devlink health diagnose pci/0000:82:00.1/65537 reporter vnic - -.. note:: - This command can run over all interfaces such as PF/VF and representor ports. diff --git a/Documentation/networking/device_drivers/ethernet/mellanox/mlx5/index.rst b/Documentation/networking/device_drivers/ethernet/mellanox/mlx5/index.rst index 3fdcd6b61ccfa..581a91caa5795 100644 --- a/Documentation/networking/device_drivers/ethernet/mellanox/mlx5/index.rst +++ b/Documentation/networking/device_drivers/ethernet/mellanox/mlx5/index.rst @@ -13,7 +13,6 @@ Contents: :maxdepth: 2
kconfig - devlink switchdev tracepoints counters diff --git a/Documentation/networking/devlink/mlx5.rst b/Documentation/networking/devlink/mlx5.rst index 202798d6501e7..196a4bb28df1e 100644 --- a/Documentation/networking/devlink/mlx5.rst +++ b/Documentation/networking/devlink/mlx5.rst @@ -18,6 +18,11 @@ Parameters * - ``enable_roce`` - driverinit - Type: Boolean + + If the device supports RoCE disablement, RoCE enablement state controls + device support for RoCE capability. Otherwise, the control occurs in the + driver stack. When RoCE is disabled at the driver level, only raw + ethernet QPs are supported. * - ``io_eq_size`` - driverinit - The range is between 64 and 4096. @@ -48,6 +53,9 @@ parameters. * ``smfs`` Software managed flow steering. In SMFS mode, the HW steering entities are created and manage through the driver without firmware intervention. + + SMFS mode is faster and provides better rule insertion rate compared to + default DMFS mode. * - ``fdb_large_groups`` - u32 - driverinit @@ -71,7 +79,24 @@ parameters. deprecated.
Default: disabled + * - ``esw_port_metadata`` + - Boolean + - runtime + - When applicable, disabling eswitch metadata can increase packet rate up + to 20% depending on the use case and packet sizes. + + Eswitch port metadata state controls whether to internally tag packets + with metadata. Metadata tagging must be enabled for multi-port RoCE, + failover between representors and stacked devices. By default metadata is + enabled on the supported devices in E-switch. Metadata is applicable only + for E-switch in switchdev mode and users may disable it when NONE of the + below use cases will be in use: + 1. HCA is in Dual/multi-port RoCE mode. + 2. VF/SF representor bonding (Usually used for Live migration) + 3. Stacked devices
+ When metadata is disabled, the above use cases will fail to initialize if + users try to enable them. * - ``hairpin_num_queues`` - u32 - driverinit @@ -104,3 +129,157 @@ The ``mlx5`` driver reports the following versions * - ``fw.version`` - stored, running - Three digit major.minor.subminor firmware version number. + +Health reporters +================ + +tx reporter +----------- +The tx reporter is responsible for reporting and recovering of the following two error scenarios: + +- tx timeout + Report on kernel tx timeout detection. + Recover by searching lost interrupts. +- tx error completion + Report on error tx completion. + Recover by flushing the tx queue and reset it. + +tx reporter also support on demand diagnose callback, on which it provides +real time information of its send queues status. + +User commands examples: + +- Diagnose send queues status:: + + $ devlink health diagnose pci/0000:82:00.0 reporter tx + +.. note:: + This command has valid output only when interface is up, otherwise the command has empty output. + +- Show number of tx errors indicated, number of recover flows ended successfully, + is autorecover enabled and graceful period from last recover:: + + $ devlink health show pci/0000:82:00.0 reporter tx + +rx reporter +----------- +The rx reporter is responsible for reporting and recovering of the following two error scenarios: + +- rx queues' initialization (population) timeout + Population of rx queues' descriptors on ring initialization is done + in napi context via triggering an irq. In case of a failure to get + the minimum amount of descriptors, a timeout would occur, and + descriptors could be recovered by polling the EQ (Event Queue). +- rx completions with errors (reported by HW on interrupt context) + Report on rx completion error. + Recover (if needed) by flushing the related queue and reset it. + +rx reporter also supports on demand diagnose callback, on which it +provides real time information of its receive queues' status. + +- Diagnose rx queues' status and corresponding completion queue:: + + $ devlink health diagnose pci/0000:82:00.0 reporter rx + +.. note:: + This command has valid output only when interface is up. Otherwise, the command has empty output. + +- Show number of rx errors indicated, number of recover flows ended successfully, + is autorecover enabled, and graceful period from last recover:: + + $ devlink health show pci/0000:82:00.0 reporter rx + +fw reporter +----------- +The fw reporter implements `diagnose` and `dump` callbacks. +It follows symptoms of fw error such as fw syndrome by triggering +fw core dump and storing it into the dump buffer. +The fw reporter diagnose command can be triggered any time by the user to check +current fw status. + +User commands examples: + +- Check fw heath status:: + + $ devlink health diagnose pci/0000:82:00.0 reporter fw + +- Read FW core dump if already stored or trigger new one:: + + $ devlink health dump show pci/0000:82:00.0 reporter fw + +.. note:: + This command can run only on the PF which has fw tracer ownership, + running it on other PF or any VF will return "Operation not permitted". + +fw fatal reporter +----------------- +The fw fatal reporter implements `dump` and `recover` callbacks. +It follows fatal errors indications by CR-space dump and recover flow. +The CR-space dump uses vsc interface which is valid even if the FW command +interface is not functional, which is the case in most FW fatal errors. +The recover function runs recover flow which reloads the driver and triggers fw +reset if needed. +On firmware error, the health buffer is dumped into the dmesg. The log +level is derived from the error's severity (given in health buffer). + +User commands examples: + +- Run fw recover flow manually:: + + $ devlink health recover pci/0000:82:00.0 reporter fw_fatal + +- Read FW CR-space dump if already stored or trigger new one:: + + $ devlink health dump show pci/0000:82:00.1 reporter fw_fatal + +.. note:: + This command can run only on PF. + +vnic reporter +------------- +The vnic reporter implements only the `diagnose` callback. +It is responsible for querying the vnic diagnostic counters from fw and displaying +them in realtime. + +Description of the vnic counters: + +- total_q_under_processor_handle + number of queues in an error state due to + an async error or errored command. +- send_queue_priority_update_flow + number of QP/SQ priority/SL update events. +- cq_overrun + number of times CQ entered an error state due to an overflow. +- async_eq_overrun + number of times an EQ mapped to async events was overrun. + comp_eq_overrun number of times an EQ mapped to completion events was + overrun. +- quota_exceeded_command + number of commands issued and failed due to quota exceeded. +- invalid_command + number of commands issued and failed dues to any reason other than quota + exceeded. +- nic_receive_steering_discard + number of packets that completed RX flow + steering but were discarded due to a mismatch in flow table. +- generated_pkt_steering_fail + number of packets generated by the VNIC experiencing unexpected steering + failure (at any point in steering flow). +- handled_pkt_steering_fail + number of packets handled by the VNIC experiencing unexpected steering + failure (at any point in steering flow owned by the VNIC, including the FDB + for the eswitch owner). + +User commands examples: + +- Diagnose PF/VF vnic counters:: + + $ devlink health diagnose pci/0000:82:00.1 reporter vnic + +- Diagnose representor vnic counters (performed by supplying devlink port of the + representor, which can be obtained via devlink port command):: + + $ devlink health diagnose pci/0000:82:00.1/65537 reporter vnic + +.. note:: + This command can run over all interfaces such as PF/VF and representor ports.
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Rahul Rameshbabu rrameshbabu@nvidia.com
[ Upstream commit 3178308ad4ca38955cad684d235153d4939f1fcd ]
Use a map structure for associating CQEs containing port timestamping information with the appropriate skb. Track order of WQEs submitted using a FIFO. Check if the corresponding port timestamping CQEs from the lookup values in the FIFO are considered dropped due to time elapsed. Return the lookup value to a freelist after consuming the skb. Reuse the freed lookup in future WQE submission iterations.
The map structure uses an integer identifier for the key and returns an skb corresponding to that identifier. Embed the integer identifier in the WQE submitted to the WQ for the transmit path when the SQ is a PTP (port timestamping) SQ. The embedded identifier can then be queried using a field in the CQE of the corresponding port timestamping CQ. In the port timestamping napi_poll context, the identifier is queried from the CQE polled from CQ and used to lookup the corresponding skb from the WQE submit path. The skb reference is removed from map and then embedded with the port HW timestamp information from the CQE and eventually consumed.
The metadata freelist FIFO is an array containing integer identifiers that can be pushed and popped in the FIFO. The purpose of this structure is bookkeeping what identifier values can safely be used in a subsequent WQE submission and should not contain identifiers that have still not been reaped by processing a corresponding CQE completion on the port timestamping CQ.
The ts_cqe_pending_list structure is a combination of an array and linked list. The array is pre-populated with the nodes that will be added and removed from the head of the linked list. Each node contains the unique identifier value associated with the values submitted in the WQEs and retrieved in the port timestamping CQEs. When a WQE is submitted, the node in the array corresponding to the identifier popped from the metadata freelist is added to the end of the CQE pending list and is marked as "in-use". The node is removed from the linked list under two conditions. The first condition is that the corresponding port timestamping CQE is polled in the PTP napi_poll context. The second condition is that more than a second has elapsed since the DMA timestamp value corresponding to the WQE submission. When the first condition occurs, the "in-use" bit in the linked list node is cleared, and the resources corresponding to the WQE submission are then released. The second condition, however, indicates that the port timestamping CQE will likely never be delivered. It's not impossible for the device to post a CQE after an infinite amount of time though highly improbable. In order to be resilient to this improbable case, resources related to the corresponding WQE submission are still kept, the identifier value is not returned to the freelist, and the "in-use" bit is cleared on the node to indicate that it's no longer part of the linked list of "likely to be delivered" port timestamping CQE identifiers. A count for the number of port timestamping CQEs considered highly likely to never be delivered by the device is maintained. This count gets decremented in the unlikely event a port timestamping CQE considered unlikely to ever be delivered is polled in the PTP napi_poll context.
Signed-off-by: Rahul Rameshbabu rrameshbabu@nvidia.com Reviewed-by: Tariq Toukan tariqt@nvidia.com Signed-off-by: Saeed Mahameed saeedm@nvidia.com Stable-dep-of: 92214be5979c ("net/mlx5e: Update doorbell for port timestamping CQ before the software counter") Signed-off-by: Sasha Levin sashal@kernel.org --- .../ethernet/mellanox/mlx5/counters.rst | 6 + .../net/ethernet/mellanox/mlx5/core/en/ptp.c | 215 +++++++++++++----- .../net/ethernet/mellanox/mlx5/core/en/ptp.h | 57 ++++- .../ethernet/mellanox/mlx5/core/en_ethtool.c | 3 +- .../ethernet/mellanox/mlx5/core/en_stats.c | 4 +- .../ethernet/mellanox/mlx5/core/en_stats.h | 4 +- .../net/ethernet/mellanox/mlx5/core/en_tx.c | 28 ++- 7 files changed, 236 insertions(+), 81 deletions(-)
diff --git a/Documentation/networking/device_drivers/ethernet/mellanox/mlx5/counters.rst b/Documentation/networking/device_drivers/ethernet/mellanox/mlx5/counters.rst index a395df9c27513..008e560e12b58 100644 --- a/Documentation/networking/device_drivers/ethernet/mellanox/mlx5/counters.rst +++ b/Documentation/networking/device_drivers/ethernet/mellanox/mlx5/counters.rst @@ -683,6 +683,12 @@ the software port. time protocol. - Error
+ * - `ptp_cq[i]_late_cqe` + - Number of times a CQE has been delivered on the PTP timestamping CQ when + the CQE was not expected since a certain amount of time had elapsed where + the device typically ensures not posting the CQE. + - Error + .. [#ring_global] The corresponding ring and global counters do not share the same name (i.e. do not follow the common naming scheme).
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/ptp.c b/drivers/net/ethernet/mellanox/mlx5/core/en/ptp.c index b0b429a0321ed..8680d21f3e7b0 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/ptp.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/ptp.c @@ -5,6 +5,8 @@ #include "en/txrx.h" #include "en/params.h" #include "en/fs_tt_redirect.h" +#include <linux/list.h> +#include <linux/spinlock.h>
struct mlx5e_ptp_fs { struct mlx5_flow_handle *l2_rule; @@ -19,6 +21,48 @@ struct mlx5e_ptp_params { struct mlx5e_rq_param rq_param; };
+struct mlx5e_ptp_port_ts_cqe_tracker { + u8 metadata_id; + bool inuse : 1; + struct list_head entry; +}; + +struct mlx5e_ptp_port_ts_cqe_list { + struct mlx5e_ptp_port_ts_cqe_tracker *nodes; + struct list_head tracker_list_head; + /* Sync list operations in xmit and napi_poll contexts */ + spinlock_t tracker_list_lock; +}; + +static inline void +mlx5e_ptp_port_ts_cqe_list_add(struct mlx5e_ptp_port_ts_cqe_list *list, u8 metadata) +{ + struct mlx5e_ptp_port_ts_cqe_tracker *tracker = &list->nodes[metadata]; + + WARN_ON_ONCE(tracker->inuse); + tracker->inuse = true; + spin_lock(&list->tracker_list_lock); + list_add_tail(&tracker->entry, &list->tracker_list_head); + spin_unlock(&list->tracker_list_lock); +} + +static void +mlx5e_ptp_port_ts_cqe_list_remove(struct mlx5e_ptp_port_ts_cqe_list *list, u8 metadata) +{ + struct mlx5e_ptp_port_ts_cqe_tracker *tracker = &list->nodes[metadata]; + + WARN_ON_ONCE(!tracker->inuse); + tracker->inuse = false; + spin_lock(&list->tracker_list_lock); + list_del(&tracker->entry); + spin_unlock(&list->tracker_list_lock); +} + +void mlx5e_ptpsq_track_metadata(struct mlx5e_ptpsq *ptpsq, u8 metadata) +{ + mlx5e_ptp_port_ts_cqe_list_add(ptpsq->ts_cqe_pending_list, metadata); +} + struct mlx5e_skb_cb_hwtstamp { ktime_t cqe_hwtstamp; ktime_t port_hwtstamp; @@ -79,75 +123,88 @@ void mlx5e_skb_cb_hwtstamp_handler(struct sk_buff *skb, int hwtstamp_type, memset(skb->cb, 0, sizeof(struct mlx5e_skb_cb_hwtstamp)); }
-#define PTP_WQE_CTR2IDX(val) ((val) & ptpsq->ts_cqe_ctr_mask) - -static bool mlx5e_ptp_ts_cqe_drop(struct mlx5e_ptpsq *ptpsq, u16 skb_ci, u16 skb_id) +static struct sk_buff * +mlx5e_ptp_metadata_map_lookup(struct mlx5e_ptp_metadata_map *map, u16 metadata) { - return (ptpsq->ts_cqe_ctr_mask && (skb_ci != skb_id)); + return map->data[metadata]; }
-static bool mlx5e_ptp_ts_cqe_ooo(struct mlx5e_ptpsq *ptpsq, u16 skb_id) +static struct sk_buff * +mlx5e_ptp_metadata_map_remove(struct mlx5e_ptp_metadata_map *map, u16 metadata) { - u16 skb_ci = PTP_WQE_CTR2IDX(ptpsq->skb_fifo_cc); - u16 skb_pi = PTP_WQE_CTR2IDX(ptpsq->skb_fifo_pc); + struct sk_buff *skb;
- if (PTP_WQE_CTR2IDX(skb_id - skb_ci) >= PTP_WQE_CTR2IDX(skb_pi - skb_ci)) - return true; + skb = map->data[metadata]; + map->data[metadata] = NULL;
- return false; + return skb; }
-static void mlx5e_ptp_skb_fifo_ts_cqe_resync(struct mlx5e_ptpsq *ptpsq, u16 skb_ci, - u16 skb_id, int budget) +static void mlx5e_ptpsq_mark_ts_cqes_undelivered(struct mlx5e_ptpsq *ptpsq, + ktime_t port_tstamp) { - struct skb_shared_hwtstamps hwts = {}; - struct sk_buff *skb; + struct mlx5e_ptp_port_ts_cqe_list *cqe_list = ptpsq->ts_cqe_pending_list; + ktime_t timeout = ns_to_ktime(MLX5E_PTP_TS_CQE_UNDELIVERED_TIMEOUT); + struct mlx5e_ptp_metadata_map *metadata_map = &ptpsq->metadata_map; + struct mlx5e_ptp_port_ts_cqe_tracker *pos, *n; + + spin_lock(&cqe_list->tracker_list_lock); + list_for_each_entry_safe(pos, n, &cqe_list->tracker_list_head, entry) { + struct sk_buff *skb = + mlx5e_ptp_metadata_map_lookup(metadata_map, pos->metadata_id); + ktime_t dma_tstamp = mlx5e_skb_cb_get_hwts(skb)->cqe_hwtstamp;
- ptpsq->cq_stats->resync_event++; + if (!dma_tstamp || + ktime_after(ktime_add(dma_tstamp, timeout), port_tstamp)) + break;
- while (skb_ci != skb_id) { - skb = mlx5e_skb_fifo_pop(&ptpsq->skb_fifo); - hwts.hwtstamp = mlx5e_skb_cb_get_hwts(skb)->cqe_hwtstamp; - skb_tstamp_tx(skb, &hwts); - ptpsq->cq_stats->resync_cqe++; - napi_consume_skb(skb, budget); - skb_ci = PTP_WQE_CTR2IDX(ptpsq->skb_fifo_cc); + metadata_map->undelivered_counter++; + WARN_ON_ONCE(!pos->inuse); + pos->inuse = false; + list_del(&pos->entry); } + spin_unlock(&cqe_list->tracker_list_lock); }
+#define PTP_WQE_CTR2IDX(val) ((val) & ptpsq->ts_cqe_ctr_mask) + static void mlx5e_ptp_handle_ts_cqe(struct mlx5e_ptpsq *ptpsq, struct mlx5_cqe64 *cqe, int budget) { - u16 skb_id = PTP_WQE_CTR2IDX(be16_to_cpu(cqe->wqe_counter)); - u16 skb_ci = PTP_WQE_CTR2IDX(ptpsq->skb_fifo_cc); + struct mlx5e_ptp_port_ts_cqe_list *pending_cqe_list = ptpsq->ts_cqe_pending_list; + u8 metadata_id = PTP_WQE_CTR2IDX(be16_to_cpu(cqe->wqe_counter)); + bool is_err_cqe = !!MLX5E_RX_ERR_CQE(cqe); struct mlx5e_txqsq *sq = &ptpsq->txqsq; struct sk_buff *skb; ktime_t hwtstamp;
- if (unlikely(MLX5E_RX_ERR_CQE(cqe))) { - skb = mlx5e_skb_fifo_pop(&ptpsq->skb_fifo); - ptpsq->cq_stats->err_cqe++; - goto out; + if (likely(pending_cqe_list->nodes[metadata_id].inuse)) { + mlx5e_ptp_port_ts_cqe_list_remove(pending_cqe_list, metadata_id); + } else { + /* Reclaim space in the unlikely event CQE was delivered after + * marking it late. + */ + ptpsq->metadata_map.undelivered_counter--; + ptpsq->cq_stats->late_cqe++; }
- if (mlx5e_ptp_ts_cqe_drop(ptpsq, skb_ci, skb_id)) { - if (mlx5e_ptp_ts_cqe_ooo(ptpsq, skb_id)) { - /* already handled by a previous resync */ - ptpsq->cq_stats->ooo_cqe_drop++; - return; - } - mlx5e_ptp_skb_fifo_ts_cqe_resync(ptpsq, skb_ci, skb_id, budget); + skb = mlx5e_ptp_metadata_map_remove(&ptpsq->metadata_map, metadata_id); + + if (unlikely(is_err_cqe)) { + ptpsq->cq_stats->err_cqe++; + goto out; }
- skb = mlx5e_skb_fifo_pop(&ptpsq->skb_fifo); hwtstamp = mlx5e_cqe_ts_to_ns(sq->ptp_cyc2time, sq->clock, get_cqe_ts(cqe)); mlx5e_skb_cb_hwtstamp_handler(skb, MLX5E_SKB_CB_PORT_HWTSTAMP, hwtstamp, ptpsq->cq_stats); ptpsq->cq_stats->cqe++;
+ mlx5e_ptpsq_mark_ts_cqes_undelivered(ptpsq, hwtstamp); out: napi_consume_skb(skb, budget); + mlx5e_ptp_metadata_fifo_push(&ptpsq->metadata_freelist, metadata_id); }
static bool mlx5e_ptp_poll_ts_cq(struct mlx5e_cq *cq, int budget) @@ -291,36 +348,78 @@ static void mlx5e_ptp_destroy_sq(struct mlx5_core_dev *mdev, u32 sqn)
static int mlx5e_ptp_alloc_traffic_db(struct mlx5e_ptpsq *ptpsq, int numa) { - int wq_sz = mlx5_wq_cyc_get_size(&ptpsq->txqsq.wq); - struct mlx5_core_dev *mdev = ptpsq->txqsq.mdev; + struct mlx5e_ptp_metadata_fifo *metadata_freelist = &ptpsq->metadata_freelist; + struct mlx5e_ptp_metadata_map *metadata_map = &ptpsq->metadata_map; + struct mlx5e_ptp_port_ts_cqe_list *cqe_list; + int db_sz; + int md;
- ptpsq->skb_fifo.fifo = kvzalloc_node(array_size(wq_sz, sizeof(*ptpsq->skb_fifo.fifo)), - GFP_KERNEL, numa); - if (!ptpsq->skb_fifo.fifo) + cqe_list = kvzalloc_node(sizeof(*ptpsq->ts_cqe_pending_list), GFP_KERNEL, numa); + if (!cqe_list) return -ENOMEM; + ptpsq->ts_cqe_pending_list = cqe_list; + + db_sz = min_t(u32, mlx5_wq_cyc_get_size(&ptpsq->txqsq.wq), + 1 << MLX5_CAP_GEN_2(ptpsq->txqsq.mdev, + ts_cqe_metadata_size2wqe_counter)); + ptpsq->ts_cqe_ctr_mask = db_sz - 1; + + cqe_list->nodes = kvzalloc_node(array_size(db_sz, sizeof(*cqe_list->nodes)), + GFP_KERNEL, numa); + if (!cqe_list->nodes) + goto free_cqe_list; + INIT_LIST_HEAD(&cqe_list->tracker_list_head); + spin_lock_init(&cqe_list->tracker_list_lock); + + metadata_freelist->data = + kvzalloc_node(array_size(db_sz, sizeof(*metadata_freelist->data)), + GFP_KERNEL, numa); + if (!metadata_freelist->data) + goto free_cqe_list_nodes; + metadata_freelist->mask = ptpsq->ts_cqe_ctr_mask; + + for (md = 0; md < db_sz; ++md) { + cqe_list->nodes[md].metadata_id = md; + metadata_freelist->data[md] = md; + } + metadata_freelist->pc = db_sz; + + metadata_map->data = + kvzalloc_node(array_size(db_sz, sizeof(*metadata_map->data)), + GFP_KERNEL, numa); + if (!metadata_map->data) + goto free_metadata_freelist; + metadata_map->capacity = db_sz;
- ptpsq->skb_fifo.pc = &ptpsq->skb_fifo_pc; - ptpsq->skb_fifo.cc = &ptpsq->skb_fifo_cc; - ptpsq->skb_fifo.mask = wq_sz - 1; - if (MLX5_CAP_GEN_2(mdev, ts_cqe_metadata_size2wqe_counter)) - ptpsq->ts_cqe_ctr_mask = - (1 << MLX5_CAP_GEN_2(mdev, ts_cqe_metadata_size2wqe_counter)) - 1; return 0; + +free_metadata_freelist: + kvfree(metadata_freelist->data); +free_cqe_list_nodes: + kvfree(cqe_list->nodes); +free_cqe_list: + kvfree(cqe_list); + return -ENOMEM; }
-static void mlx5e_ptp_drain_skb_fifo(struct mlx5e_skb_fifo *skb_fifo) +static void mlx5e_ptp_drain_metadata_map(struct mlx5e_ptp_metadata_map *map) { - while (*skb_fifo->pc != *skb_fifo->cc) { - struct sk_buff *skb = mlx5e_skb_fifo_pop(skb_fifo); + int idx; + + for (idx = 0; idx < map->capacity; ++idx) { + struct sk_buff *skb = map->data[idx];
dev_kfree_skb_any(skb); } }
-static void mlx5e_ptp_free_traffic_db(struct mlx5e_skb_fifo *skb_fifo) +static void mlx5e_ptp_free_traffic_db(struct mlx5e_ptpsq *ptpsq) { - mlx5e_ptp_drain_skb_fifo(skb_fifo); - kvfree(skb_fifo->fifo); + mlx5e_ptp_drain_metadata_map(&ptpsq->metadata_map); + kvfree(ptpsq->metadata_map.data); + kvfree(ptpsq->metadata_freelist.data); + kvfree(ptpsq->ts_cqe_pending_list->nodes); + kvfree(ptpsq->ts_cqe_pending_list); }
static int mlx5e_ptp_open_txqsq(struct mlx5e_ptp *c, u32 tisn, @@ -348,8 +447,7 @@ static int mlx5e_ptp_open_txqsq(struct mlx5e_ptp *c, u32 tisn, if (err) goto err_free_txqsq;
- err = mlx5e_ptp_alloc_traffic_db(ptpsq, - dev_to_node(mlx5_core_dma_dev(c->mdev))); + err = mlx5e_ptp_alloc_traffic_db(ptpsq, dev_to_node(mlx5_core_dma_dev(c->mdev))); if (err) goto err_free_txqsq;
@@ -366,7 +464,7 @@ static void mlx5e_ptp_close_txqsq(struct mlx5e_ptpsq *ptpsq) struct mlx5e_txqsq *sq = &ptpsq->txqsq; struct mlx5_core_dev *mdev = sq->mdev;
- mlx5e_ptp_free_traffic_db(&ptpsq->skb_fifo); + mlx5e_ptp_free_traffic_db(ptpsq); cancel_work_sync(&sq->recover_work); mlx5e_ptp_destroy_sq(mdev, sq->sqn); mlx5e_free_txqsq_descs(sq); @@ -534,7 +632,10 @@ static void mlx5e_ptp_build_params(struct mlx5e_ptp *c,
/* SQ */ if (test_bit(MLX5E_PTP_STATE_TX, c->state)) { - params->log_sq_size = orig->log_sq_size; + params->log_sq_size = + min(MLX5_CAP_GEN_2(c->mdev, ts_cqe_metadata_size2wqe_counter), + MLX5E_PTP_MAX_LOG_SQ_SIZE); + params->log_sq_size = min(params->log_sq_size, orig->log_sq_size); mlx5e_ptp_build_sq_param(c->mdev, params, &cparams->txq_sq_param); } /* RQ */ diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/ptp.h b/drivers/net/ethernet/mellanox/mlx5/core/en/ptp.h index cc7efde88ac3c..7c5597d4589df 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/ptp.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/ptp.h @@ -7,18 +7,36 @@ #include "en.h" #include "en_stats.h" #include "en/txrx.h" +#include <linux/ktime.h> #include <linux/ptp_classify.h> +#include <linux/time64.h>
#define MLX5E_PTP_CHANNEL_IX 0 +#define MLX5E_PTP_MAX_LOG_SQ_SIZE (8U) +#define MLX5E_PTP_TS_CQE_UNDELIVERED_TIMEOUT (1 * NSEC_PER_SEC) + +struct mlx5e_ptp_metadata_fifo { + u8 cc; + u8 pc; + u8 mask; + u8 *data; +}; + +struct mlx5e_ptp_metadata_map { + u16 undelivered_counter; + u16 capacity; + struct sk_buff **data; +};
struct mlx5e_ptpsq { struct mlx5e_txqsq txqsq; struct mlx5e_cq ts_cq; - u16 skb_fifo_cc; - u16 skb_fifo_pc; - struct mlx5e_skb_fifo skb_fifo; struct mlx5e_ptp_cq_stats *cq_stats; u16 ts_cqe_ctr_mask; + + struct mlx5e_ptp_port_ts_cqe_list *ts_cqe_pending_list; + struct mlx5e_ptp_metadata_fifo metadata_freelist; + struct mlx5e_ptp_metadata_map metadata_map; };
enum { @@ -69,12 +87,35 @@ static inline bool mlx5e_use_ptpsq(struct sk_buff *skb) fk.ports.dst == htons(PTP_EV_PORT)); }
-static inline bool mlx5e_ptpsq_fifo_has_room(struct mlx5e_txqsq *sq) +static inline void mlx5e_ptp_metadata_fifo_push(struct mlx5e_ptp_metadata_fifo *fifo, u8 metadata) { - if (!sq->ptpsq) - return true; + fifo->data[fifo->mask & fifo->pc++] = metadata; +} + +static inline u8 +mlx5e_ptp_metadata_fifo_pop(struct mlx5e_ptp_metadata_fifo *fifo) +{ + return fifo->data[fifo->mask & fifo->cc++]; +}
- return mlx5e_skb_fifo_has_room(&sq->ptpsq->skb_fifo); +static inline void +mlx5e_ptp_metadata_map_put(struct mlx5e_ptp_metadata_map *map, + struct sk_buff *skb, u8 metadata) +{ + WARN_ON_ONCE(map->data[metadata]); + map->data[metadata] = skb; +} + +static inline bool mlx5e_ptpsq_metadata_freelist_empty(struct mlx5e_ptpsq *ptpsq) +{ + struct mlx5e_ptp_metadata_fifo *freelist; + + if (likely(!ptpsq)) + return false; + + freelist = &ptpsq->metadata_freelist; + + return freelist->pc == freelist->cc; }
int mlx5e_ptp_open(struct mlx5e_priv *priv, struct mlx5e_params *params, @@ -89,6 +130,8 @@ void mlx5e_ptp_free_rx_fs(struct mlx5e_flow_steering *fs, const struct mlx5e_profile *profile); int mlx5e_ptp_rx_manage_fs(struct mlx5e_priv *priv, bool set);
+void mlx5e_ptpsq_track_metadata(struct mlx5e_ptpsq *ptpsq, u8 metadata); + enum { MLX5E_SKB_CB_CQE_HWTSTAMP = BIT(0), MLX5E_SKB_CB_PORT_HWTSTAMP = BIT(1), diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c index 27861b68ced57..3d2d5d3b59f0b 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c @@ -2061,7 +2061,8 @@ static int set_pflag_tx_port_ts(struct net_device *netdev, bool enable) struct mlx5e_params new_params; int err;
- if (!MLX5_CAP_GEN(mdev, ts_cqe_to_dest_cqn)) + if (!MLX5_CAP_GEN(mdev, ts_cqe_to_dest_cqn) || + !MLX5_CAP_GEN_2(mdev, ts_cqe_metadata_size2wqe_counter)) return -EOPNOTSUPP;
/* Don't allow changing the PTP state if HTB offload is active, because diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c index 4d77055abd4be..dfdd357974164 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c @@ -2142,9 +2142,7 @@ static const struct counter_desc ptp_cq_stats_desc[] = { { MLX5E_DECLARE_PTP_CQ_STAT(struct mlx5e_ptp_cq_stats, err_cqe) }, { MLX5E_DECLARE_PTP_CQ_STAT(struct mlx5e_ptp_cq_stats, abort) }, { MLX5E_DECLARE_PTP_CQ_STAT(struct mlx5e_ptp_cq_stats, abort_abs_diff_ns) }, - { MLX5E_DECLARE_PTP_CQ_STAT(struct mlx5e_ptp_cq_stats, resync_cqe) }, - { MLX5E_DECLARE_PTP_CQ_STAT(struct mlx5e_ptp_cq_stats, resync_event) }, - { MLX5E_DECLARE_PTP_CQ_STAT(struct mlx5e_ptp_cq_stats, ooo_cqe_drop) }, + { MLX5E_DECLARE_PTP_CQ_STAT(struct mlx5e_ptp_cq_stats, late_cqe) }, };
static const struct counter_desc ptp_rq_stats_desc[] = { diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h index 67938b4ea1b90..13a07e52ae92b 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h @@ -449,9 +449,7 @@ struct mlx5e_ptp_cq_stats { u64 err_cqe; u64 abort; u64 abort_abs_diff_ns; - u64 resync_cqe; - u64 resync_event; - u64 ooo_cqe_drop; + u64 late_cqe; };
struct mlx5e_rep_stats { diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c index c7eb6b238c2ba..d41435c22ce56 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c @@ -372,7 +372,7 @@ mlx5e_txwqe_complete(struct mlx5e_txqsq *sq, struct sk_buff *skb, const struct mlx5e_tx_attr *attr, const struct mlx5e_tx_wqe_attr *wqe_attr, u8 num_dma, struct mlx5e_tx_wqe_info *wi, struct mlx5_wqe_ctrl_seg *cseg, - bool xmit_more) + struct mlx5_wqe_eth_seg *eseg, bool xmit_more) { struct mlx5_wq_cyc *wq = &sq->wq; bool send_doorbell; @@ -394,11 +394,16 @@ mlx5e_txwqe_complete(struct mlx5e_txqsq *sq, struct sk_buff *skb,
mlx5e_tx_check_stop(sq);
- if (unlikely(sq->ptpsq)) { + if (unlikely(sq->ptpsq && + (skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP))) { + u8 metadata_index = be32_to_cpu(eseg->flow_table_metadata); + mlx5e_skb_cb_hwtstamp_init(skb); - mlx5e_skb_fifo_push(&sq->ptpsq->skb_fifo, skb); + mlx5e_ptpsq_track_metadata(sq->ptpsq, metadata_index); + mlx5e_ptp_metadata_map_put(&sq->ptpsq->metadata_map, skb, + metadata_index); if (!netif_tx_queue_stopped(sq->txq) && - !mlx5e_skb_fifo_has_room(&sq->ptpsq->skb_fifo)) { + mlx5e_ptpsq_metadata_freelist_empty(sq->ptpsq)) { netif_tx_stop_queue(sq->txq); sq->stats->stopped++; } @@ -483,13 +488,16 @@ mlx5e_sq_xmit_wqe(struct mlx5e_txqsq *sq, struct sk_buff *skb, if (unlikely(num_dma < 0)) goto err_drop;
- mlx5e_txwqe_complete(sq, skb, attr, wqe_attr, num_dma, wi, cseg, xmit_more); + mlx5e_txwqe_complete(sq, skb, attr, wqe_attr, num_dma, wi, cseg, eseg, xmit_more);
return;
err_drop: stats->dropped++; dev_kfree_skb_any(skb); + if (unlikely(sq->ptpsq && (skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP))) + mlx5e_ptp_metadata_fifo_push(&sq->ptpsq->metadata_freelist, + be32_to_cpu(eseg->flow_table_metadata)); mlx5e_tx_flush(sq); }
@@ -645,9 +653,9 @@ void mlx5e_tx_mpwqe_ensure_complete(struct mlx5e_txqsq *sq) static void mlx5e_cqe_ts_id_eseg(struct mlx5e_ptpsq *ptpsq, struct sk_buff *skb, struct mlx5_wqe_eth_seg *eseg) { - if (ptpsq->ts_cqe_ctr_mask && unlikely(skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP)) - eseg->flow_table_metadata = cpu_to_be32(ptpsq->skb_fifo_pc & - ptpsq->ts_cqe_ctr_mask); + if (unlikely(skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP)) + eseg->flow_table_metadata = + cpu_to_be32(mlx5e_ptp_metadata_fifo_pop(&ptpsq->metadata_freelist)); }
static void mlx5e_txwqe_build_eseg(struct mlx5e_priv *priv, struct mlx5e_txqsq *sq, @@ -766,7 +774,7 @@ void mlx5e_txqsq_wake(struct mlx5e_txqsq *sq) { if (netif_tx_queue_stopped(sq->txq) && mlx5e_wqc_has_room_for(&sq->wq, sq->cc, sq->pc, sq->stop_room) && - mlx5e_ptpsq_fifo_has_room(sq) && + !mlx5e_ptpsq_metadata_freelist_empty(sq->ptpsq) && !test_bit(MLX5E_SQ_STATE_RECOVERING, &sq->state)) { netif_tx_wake_queue(sq->txq); sq->stats->wake++; @@ -1031,7 +1039,7 @@ void mlx5i_sq_xmit(struct mlx5e_txqsq *sq, struct sk_buff *skb, if (unlikely(num_dma < 0)) goto err_drop;
- mlx5e_txwqe_complete(sq, skb, &attr, &wqe_attr, num_dma, wi, cseg, xmit_more); + mlx5e_txwqe_complete(sq, skb, &attr, &wqe_attr, num_dma, wi, cseg, eseg, xmit_more);
return;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Rahul Rameshbabu rrameshbabu@nvidia.com
[ Upstream commit 53b836a44db4259b94ffcfff321fb3d63f976b76 ]
A new check for the tx devlink health reporter is introduced for determining when the PTP port timestamping SQ is considered unhealthy. If there are enough CQEs considered never to be delivered, the space that can be utilized on the SQ decreases significantly, impacting performance and usability of the SQ. The health reporter is triggered when the number of likely never delivered port timestamping CQEs that utilize the space of the PTP SQ is greater than 93.75% of the total capacity of the SQ. A devlink health reporter recover method is also provided for this specific TX error context that restarts the PTP SQ.
Signed-off-by: Rahul Rameshbabu rrameshbabu@nvidia.com Reviewed-by: Tariq Toukan tariqt@nvidia.com Signed-off-by: Saeed Mahameed saeedm@nvidia.com Stable-dep-of: 92214be5979c ("net/mlx5e: Update doorbell for port timestamping CQ before the software counter") Signed-off-by: Sasha Levin sashal@kernel.org --- Documentation/networking/devlink/mlx5.rst | 5 +- .../ethernet/mellanox/mlx5/core/en/health.h | 1 + .../net/ethernet/mellanox/mlx5/core/en/ptp.c | 22 +++++++ .../net/ethernet/mellanox/mlx5/core/en/ptp.h | 2 + .../mellanox/mlx5/core/en/reporter_tx.c | 65 +++++++++++++++++++ 5 files changed, 94 insertions(+), 1 deletion(-)
diff --git a/Documentation/networking/devlink/mlx5.rst b/Documentation/networking/devlink/mlx5.rst index 196a4bb28df1e..702f204a3dbd3 100644 --- a/Documentation/networking/devlink/mlx5.rst +++ b/Documentation/networking/devlink/mlx5.rst @@ -135,7 +135,7 @@ Health reporters
tx reporter ----------- -The tx reporter is responsible for reporting and recovering of the following two error scenarios: +The tx reporter is responsible for reporting and recovering of the following three error scenarios:
- tx timeout Report on kernel tx timeout detection. @@ -143,6 +143,9 @@ The tx reporter is responsible for reporting and recovering of the following two - tx error completion Report on error tx completion. Recover by flushing the tx queue and reset it. +- tx PTP port timestamping CQ unhealthy + Report too many CQEs never delivered on port ts CQ. + Recover by flushing and re-creating all PTP channels.
tx reporter also support on demand diagnose callback, on which it provides real time information of its send queues status. diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/health.h b/drivers/net/ethernet/mellanox/mlx5/core/en/health.h index 0107e4e73bb06..415840c3ef84f 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/health.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/health.h @@ -18,6 +18,7 @@ void mlx5e_reporter_tx_create(struct mlx5e_priv *priv); void mlx5e_reporter_tx_destroy(struct mlx5e_priv *priv); void mlx5e_reporter_tx_err_cqe(struct mlx5e_txqsq *sq); int mlx5e_reporter_tx_timeout(struct mlx5e_txqsq *sq); +void mlx5e_reporter_tx_ptpsq_unhealthy(struct mlx5e_ptpsq *ptpsq);
int mlx5e_health_cq_diag_fmsg(struct mlx5e_cq *cq, struct devlink_fmsg *fmsg); int mlx5e_health_cq_common_diag_fmsg(struct mlx5e_cq *cq, struct devlink_fmsg *fmsg); diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/ptp.c b/drivers/net/ethernet/mellanox/mlx5/core/en/ptp.c index 8680d21f3e7b0..bb11e644d24f7 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/ptp.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/ptp.c @@ -2,6 +2,7 @@ // Copyright (c) 2020 Mellanox Technologies
#include "en/ptp.h" +#include "en/health.h" #include "en/txrx.h" #include "en/params.h" #include "en/fs_tt_redirect.h" @@ -140,6 +141,12 @@ mlx5e_ptp_metadata_map_remove(struct mlx5e_ptp_metadata_map *map, u16 metadata) return skb; }
+static bool mlx5e_ptp_metadata_map_unhealthy(struct mlx5e_ptp_metadata_map *map) +{ + /* Considered beginning unhealthy state if size * 15 / 2^4 cannot be reclaimed. */ + return map->undelivered_counter > (map->capacity >> 4) * 15; +} + static void mlx5e_ptpsq_mark_ts_cqes_undelivered(struct mlx5e_ptpsq *ptpsq, ktime_t port_tstamp) { @@ -205,6 +212,9 @@ static void mlx5e_ptp_handle_ts_cqe(struct mlx5e_ptpsq *ptpsq, out: napi_consume_skb(skb, budget); mlx5e_ptp_metadata_fifo_push(&ptpsq->metadata_freelist, metadata_id); + if (unlikely(mlx5e_ptp_metadata_map_unhealthy(&ptpsq->metadata_map)) && + !test_and_set_bit(MLX5E_SQ_STATE_RECOVERING, &sq->state)) + queue_work(ptpsq->txqsq.priv->wq, &ptpsq->report_unhealthy_work); }
static bool mlx5e_ptp_poll_ts_cq(struct mlx5e_cq *cq, int budget) @@ -422,6 +432,14 @@ static void mlx5e_ptp_free_traffic_db(struct mlx5e_ptpsq *ptpsq) kvfree(ptpsq->ts_cqe_pending_list); }
+static void mlx5e_ptpsq_unhealthy_work(struct work_struct *work) +{ + struct mlx5e_ptpsq *ptpsq = + container_of(work, struct mlx5e_ptpsq, report_unhealthy_work); + + mlx5e_reporter_tx_ptpsq_unhealthy(ptpsq); +} + static int mlx5e_ptp_open_txqsq(struct mlx5e_ptp *c, u32 tisn, int txq_ix, struct mlx5e_ptp_params *cparams, int tc, struct mlx5e_ptpsq *ptpsq) @@ -451,6 +469,8 @@ static int mlx5e_ptp_open_txqsq(struct mlx5e_ptp *c, u32 tisn, if (err) goto err_free_txqsq;
+ INIT_WORK(&ptpsq->report_unhealthy_work, mlx5e_ptpsq_unhealthy_work); + return 0;
err_free_txqsq: @@ -464,6 +484,8 @@ static void mlx5e_ptp_close_txqsq(struct mlx5e_ptpsq *ptpsq) struct mlx5e_txqsq *sq = &ptpsq->txqsq; struct mlx5_core_dev *mdev = sq->mdev;
+ if (current_work() != &ptpsq->report_unhealthy_work) + cancel_work_sync(&ptpsq->report_unhealthy_work); mlx5e_ptp_free_traffic_db(ptpsq); cancel_work_sync(&sq->recover_work); mlx5e_ptp_destroy_sq(mdev, sq->sqn); diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/ptp.h b/drivers/net/ethernet/mellanox/mlx5/core/en/ptp.h index 7c5597d4589df..7b700d0f956a8 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/ptp.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/ptp.h @@ -10,6 +10,7 @@ #include <linux/ktime.h> #include <linux/ptp_classify.h> #include <linux/time64.h> +#include <linux/workqueue.h>
#define MLX5E_PTP_CHANNEL_IX 0 #define MLX5E_PTP_MAX_LOG_SQ_SIZE (8U) @@ -34,6 +35,7 @@ struct mlx5e_ptpsq { struct mlx5e_ptp_cq_stats *cq_stats; u16 ts_cqe_ctr_mask;
+ struct work_struct report_unhealthy_work; struct mlx5e_ptp_port_ts_cqe_list *ts_cqe_pending_list; struct mlx5e_ptp_metadata_fifo metadata_freelist; struct mlx5e_ptp_metadata_map metadata_map; diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_tx.c b/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_tx.c index b35ff289af492..ff8242f67c545 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_tx.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_tx.c @@ -164,6 +164,43 @@ static int mlx5e_tx_reporter_timeout_recover(void *ctx) return err; }
+static int mlx5e_tx_reporter_ptpsq_unhealthy_recover(void *ctx) +{ + struct mlx5e_ptpsq *ptpsq = ctx; + struct mlx5e_channels *chs; + struct net_device *netdev; + struct mlx5e_priv *priv; + int carrier_ok; + int err; + + if (!test_bit(MLX5E_SQ_STATE_RECOVERING, &ptpsq->txqsq.state)) + return 0; + + priv = ptpsq->txqsq.priv; + + mutex_lock(&priv->state_lock); + chs = &priv->channels; + netdev = priv->netdev; + + carrier_ok = netif_carrier_ok(netdev); + netif_carrier_off(netdev); + + mlx5e_deactivate_priv_channels(priv); + + mlx5e_ptp_close(chs->ptp); + err = mlx5e_ptp_open(priv, &chs->params, chs->c[0]->lag_port, &chs->ptp); + + mlx5e_activate_priv_channels(priv); + + /* return carrier back if needed */ + if (carrier_ok) + netif_carrier_on(netdev); + + mutex_unlock(&priv->state_lock); + + return err; +} + /* state lock cannot be grabbed within this function. * It can cause a dead lock or a read-after-free. */ @@ -516,6 +553,15 @@ static int mlx5e_tx_reporter_timeout_dump(struct mlx5e_priv *priv, struct devlin return mlx5e_tx_reporter_dump_sq(priv, fmsg, to_ctx->sq); }
+static int mlx5e_tx_reporter_ptpsq_unhealthy_dump(struct mlx5e_priv *priv, + struct devlink_fmsg *fmsg, + void *ctx) +{ + struct mlx5e_ptpsq *ptpsq = ctx; + + return mlx5e_tx_reporter_dump_sq(priv, fmsg, &ptpsq->txqsq); +} + static int mlx5e_tx_reporter_dump_all_sqs(struct mlx5e_priv *priv, struct devlink_fmsg *fmsg) { @@ -621,6 +667,25 @@ int mlx5e_reporter_tx_timeout(struct mlx5e_txqsq *sq) return to_ctx.status; }
+void mlx5e_reporter_tx_ptpsq_unhealthy(struct mlx5e_ptpsq *ptpsq) +{ + struct mlx5e_ptp_metadata_map *map = &ptpsq->metadata_map; + char err_str[MLX5E_REPORTER_PER_Q_MAX_LEN]; + struct mlx5e_txqsq *txqsq = &ptpsq->txqsq; + struct mlx5e_cq *ts_cq = &ptpsq->ts_cq; + struct mlx5e_priv *priv = txqsq->priv; + struct mlx5e_err_ctx err_ctx = {}; + + err_ctx.ctx = ptpsq; + err_ctx.recover = mlx5e_tx_reporter_ptpsq_unhealthy_recover; + err_ctx.dump = mlx5e_tx_reporter_ptpsq_unhealthy_dump; + snprintf(err_str, sizeof(err_str), + "Unhealthy TX port TS queue: %d, SQ: 0x%x, CQ: 0x%x, Undelivered CQEs: %u Map Capacity: %u", + txqsq->ch_ix, txqsq->sqn, ts_cq->mcq.cqn, map->undelivered_counter, map->capacity); + + mlx5e_health_report(priv, priv->tx_reporter, err_str, &err_ctx); +} + static const struct devlink_health_reporter_ops mlx5_tx_reporter_ops = { .name = "tx", .recover = mlx5e_tx_reporter_recover,
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Rahul Rameshbabu rrameshbabu@nvidia.com
[ Upstream commit 92214be5979c0961a471b7eaaaeacab41bdf456c ]
Previously, mlx5e_ptp_poll_ts_cq would update the device doorbell with the incremented consumer index after the relevant software counters in the kernel were updated. In the mlx5e_sq_xmit_wqe context, this would lead to either overrunning the device CQ or exceeding the expected software buffer size in the device CQ if the device CQ size was greater than the software buffer size. Update the relevant software counter only after updating the device CQ consumer index in the port timestamping napi_poll context.
Log: mlx5_core 0000:08:00.0: cq_err_event_notifier:517:(pid 0): CQ error on CQN 0x487, syndrome 0x1 mlx5_core 0000:08:00.0 eth2: mlx5e_cq_error_event: cqn=0x000487 event=0x04
Fixes: 1880bc4e4a96 ("net/mlx5e: Add TX port timestamp support") Signed-off-by: Rahul Rameshbabu rrameshbabu@nvidia.com Signed-off-by: Saeed Mahameed saeedm@nvidia.com Link: https://lore.kernel.org/r/20231114215846.5902-12-saeed@kernel.org Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- .../net/ethernet/mellanox/mlx5/core/en/ptp.c | 20 +++++++++++++++---- 1 file changed, 16 insertions(+), 4 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/ptp.c b/drivers/net/ethernet/mellanox/mlx5/core/en/ptp.c index bb11e644d24f7..af3928eddafd1 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/ptp.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/ptp.c @@ -177,6 +177,8 @@ static void mlx5e_ptpsq_mark_ts_cqes_undelivered(struct mlx5e_ptpsq *ptpsq,
static void mlx5e_ptp_handle_ts_cqe(struct mlx5e_ptpsq *ptpsq, struct mlx5_cqe64 *cqe, + u8 *md_buff, + u8 *md_buff_sz, int budget) { struct mlx5e_ptp_port_ts_cqe_list *pending_cqe_list = ptpsq->ts_cqe_pending_list; @@ -211,19 +213,24 @@ static void mlx5e_ptp_handle_ts_cqe(struct mlx5e_ptpsq *ptpsq, mlx5e_ptpsq_mark_ts_cqes_undelivered(ptpsq, hwtstamp); out: napi_consume_skb(skb, budget); - mlx5e_ptp_metadata_fifo_push(&ptpsq->metadata_freelist, metadata_id); + md_buff[*md_buff_sz++] = metadata_id; if (unlikely(mlx5e_ptp_metadata_map_unhealthy(&ptpsq->metadata_map)) && !test_and_set_bit(MLX5E_SQ_STATE_RECOVERING, &sq->state)) queue_work(ptpsq->txqsq.priv->wq, &ptpsq->report_unhealthy_work); }
-static bool mlx5e_ptp_poll_ts_cq(struct mlx5e_cq *cq, int budget) +static bool mlx5e_ptp_poll_ts_cq(struct mlx5e_cq *cq, int napi_budget) { struct mlx5e_ptpsq *ptpsq = container_of(cq, struct mlx5e_ptpsq, ts_cq); - struct mlx5_cqwq *cqwq = &cq->wq; + int budget = min(napi_budget, MLX5E_TX_CQ_POLL_BUDGET); + u8 metadata_buff[MLX5E_TX_CQ_POLL_BUDGET]; + u8 metadata_buff_sz = 0; + struct mlx5_cqwq *cqwq; struct mlx5_cqe64 *cqe; int work_done = 0;
+ cqwq = &cq->wq; + if (unlikely(!test_bit(MLX5E_SQ_STATE_ENABLED, &ptpsq->txqsq.state))) return false;
@@ -234,7 +241,8 @@ static bool mlx5e_ptp_poll_ts_cq(struct mlx5e_cq *cq, int budget) do { mlx5_cqwq_pop(cqwq);
- mlx5e_ptp_handle_ts_cqe(ptpsq, cqe, budget); + mlx5e_ptp_handle_ts_cqe(ptpsq, cqe, + metadata_buff, &metadata_buff_sz, napi_budget); } while ((++work_done < budget) && (cqe = mlx5_cqwq_get_cqe(cqwq)));
mlx5_cqwq_update_db_record(cqwq); @@ -242,6 +250,10 @@ static bool mlx5e_ptp_poll_ts_cq(struct mlx5e_cq *cq, int budget) /* ensure cq space is freed before enabling more cqes */ wmb();
+ while (metadata_buff_sz > 0) + mlx5e_ptp_metadata_fifo_push(&ptpsq->metadata_freelist, + metadata_buff[--metadata_buff_sz]); + mlx5e_txqsq_wake(&ptpsq->txqsq);
return work_done == budget;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Rahul Rameshbabu rrameshbabu@nvidia.com
[ Upstream commit 3338bebfc26a1e2cebbba82a1cf12c0159608e73 ]
Without increased buffer size, will trigger -Wformat-truncation with W=1 for the snprintf operation writing to the buffer.
drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c: In function 'mlx5_irq_alloc': drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c:296:7: error: '@pci:' directive output may be truncated writing 5 bytes into a region of size between 1 and 32 [-Werror=format-truncation=] 296 | "%s@pci:%s", name, pci_name(dev->pdev)); | ^~~~~ drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c:295:2: note: 'snprintf' output 6 or more bytes (assuming 37) into a destination of size 32 295 | snprintf(irq->name, MLX5_MAX_IRQ_NAME, | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 296 | "%s@pci:%s", name, pci_name(dev->pdev)); | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Fixes: ada9f5d00797 ("IB/mlx5: Fix eq names to display nicely in /proc/interrupts") Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i... Signed-off-by: Rahul Rameshbabu rrameshbabu@nvidia.com Reviewed-by: Dragos Tatulea dtatulea@nvidia.com Signed-off-by: Saeed Mahameed saeedm@nvidia.com Link: https://lore.kernel.org/r/20231114215846.5902-13-saeed@kernel.org Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c | 6 +++--- drivers/net/ethernet/mellanox/mlx5/core/pci_irq.h | 3 +++ 2 files changed, 6 insertions(+), 3 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c b/drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c index cba2a4afb5fda..235e170c65bb7 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c @@ -28,7 +28,7 @@ struct mlx5_irq { struct atomic_notifier_head nh; cpumask_var_t mask; - char name[MLX5_MAX_IRQ_NAME]; + char name[MLX5_MAX_IRQ_FORMATTED_NAME]; struct mlx5_irq_pool *pool; int refcount; struct msi_map map; @@ -289,8 +289,8 @@ struct mlx5_irq *mlx5_irq_alloc(struct mlx5_irq_pool *pool, int i, else irq_sf_set_name(pool, name, i); ATOMIC_INIT_NOTIFIER_HEAD(&irq->nh); - snprintf(irq->name, MLX5_MAX_IRQ_NAME, - "%s@pci:%s", name, pci_name(dev->pdev)); + snprintf(irq->name, MLX5_MAX_IRQ_FORMATTED_NAME, + MLX5_IRQ_NAME_FORMAT_STR, name, pci_name(dev->pdev)); err = request_irq(irq->map.virq, irq_int_handler, 0, irq->name, &irq->nh); if (err) { diff --git a/drivers/net/ethernet/mellanox/mlx5/core/pci_irq.h b/drivers/net/ethernet/mellanox/mlx5/core/pci_irq.h index d3a77a0ab8488..c4d377f8df308 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/pci_irq.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/pci_irq.h @@ -7,6 +7,9 @@ #include <linux/mlx5/driver.h>
#define MLX5_MAX_IRQ_NAME (32) +#define MLX5_IRQ_NAME_FORMAT_STR ("%s@pci:%s") +#define MLX5_MAX_IRQ_FORMATTED_NAME \ + (MLX5_MAX_IRQ_NAME + sizeof(MLX5_IRQ_NAME_FORMAT_STR)) /* max irq_index is 2047, so four chars */ #define MLX5_MAX_IRQ_IDX_CHARS (4) #define MLX5_EQ_REFS_PER_IRQ (2)
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Saeed Mahameed saeedm@nvidia.com
[ Upstream commit dce94142842e119b982c27c1b62bd20890c7fd21 ]
icosq_str size is unnecessarily too long, and it causes a build warning -Wformat-truncation with W=1. Looking closely, It doesn't need to be 255B, hence this patch reduces the size to 32B which should be more than enough to host the string: "ICOSQ: 0x%x, ".
While here, add a missing space in the formatted string.
This fixes the following build warning:
$ KCFLAGS='-Wall -Werror' $ make O=/tmp/kbuild/linux W=1 -s -j12 drivers/net/ethernet/mellanox/mlx5/core/
drivers/net/ethernet/mellanox/mlx5/core/en/reporter_rx.c: In function 'mlx5e_reporter_rx_timeout': drivers/net/ethernet/mellanox/mlx5/core/en/reporter_rx.c:718:56: error: ', CQ: 0x' directive output may be truncated writing 8 bytes into a region of size between 0 and 255 [-Werror=format-truncation=] 718 | "RX timeout on channel: %d, %sRQ: 0x%x, CQ: 0x%x", | ^~~~~~~~ drivers/net/ethernet/mellanox/mlx5/core/en/reporter_rx.c:717:9: note: 'snprintf' output between 43 and 322 bytes into a destination of size 288 717 | snprintf(err_str, sizeof(err_str), | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 718 | "RX timeout on channel: %d, %sRQ: 0x%x, CQ: 0x%x", | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 719 | rq->ix, icosq_str, rq->rqn, rq->cq.mcq.cqn); | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Fixes: 521f31af004a ("net/mlx5e: Allow RQ outside of channel context") Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i... Signed-off-by: Saeed Mahameed saeedm@nvidia.com Link: https://lore.kernel.org/r/20231114215846.5902-14-saeed@kernel.org Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/mellanox/mlx5/core/en/reporter_rx.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_rx.c index e8eea9ffd5eb6..03b119a434bc9 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_rx.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_rx.c @@ -702,11 +702,11 @@ static int mlx5e_rx_reporter_dump(struct devlink_health_reporter *reporter,
void mlx5e_reporter_rx_timeout(struct mlx5e_rq *rq) { - char icosq_str[MLX5E_REPORTER_PER_Q_MAX_LEN] = {}; char err_str[MLX5E_REPORTER_PER_Q_MAX_LEN]; struct mlx5e_icosq *icosq = rq->icosq; struct mlx5e_priv *priv = rq->priv; struct mlx5e_err_ctx err_ctx = {}; + char icosq_str[32] = {};
err_ctx.ctx = rq; err_ctx.recover = mlx5e_rx_reporter_timeout_recover; @@ -715,7 +715,7 @@ void mlx5e_reporter_rx_timeout(struct mlx5e_rq *rq) if (icosq) snprintf(icosq_str, sizeof(icosq_str), "ICOSQ: 0x%x, ", icosq->sqn); snprintf(err_str, sizeof(err_str), - "RX timeout on channel: %d, %sRQ: 0x%x, CQ: 0x%x", + "RX timeout on channel: %d, %s RQ: 0x%x, CQ: 0x%x", rq->ix, icosq_str, rq->rqn, rq->cq.mcq.cqn);
mlx5e_health_report(priv, priv->rx_reporter, err_str, &err_ctx);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Rahul Rameshbabu rrameshbabu@nvidia.com
[ Upstream commit 41e63c2baa11dc2aa71df5dd27a5bd87d11b6bbb ]
Treat the operation as an error case when the return value is equivalent to the size of the name buffer. Failed to write null terminator to the name buffer, making the string malformed and should not be used. Provide a string with only the firmware version when forming the string with the board id fails.
Without check, will trigger -Wformat-truncation with W=1.
drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c: In function 'mlx5e_ethtool_get_drvinfo': drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c:49:31: warning: '%.16s' directive output may be truncated writing up to 16 bytes into a region of size between 13 and 22 [-Wformat-truncation=] 49 | "%d.%d.%04d (%.16s)", | ^~~~~ drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c:48:9: note: 'snprintf' output between 12 and 37 bytes into a destination of size 32 48 | snprintf(drvinfo->fw_version, sizeof(drvinfo->fw_version), | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 49 | "%d.%d.%04d (%.16s)", | ~~~~~~~~~~~~~~~~~~~~~ 50 | fw_rev_maj(mdev), fw_rev_min(mdev), fw_rev_sub(mdev), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 51 | mdev->board_id); | ~~~~~~~~~~~~~~~
Fixes: 84e11edb71de ("net/mlx5e: Show board id in ethtool driver information") Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i... Signed-off-by: Rahul Rameshbabu rrameshbabu@nvidia.com Reviewed-by: Dragos Tatulea dtatulea@nvidia.com Signed-off-by: Saeed Mahameed saeedm@nvidia.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- .../net/ethernet/mellanox/mlx5/core/en_ethtool.c | 13 +++++++++---- 1 file changed, 9 insertions(+), 4 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c index 3d2d5d3b59f0b..bd3fabb007c94 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c @@ -43,12 +43,17 @@ void mlx5e_ethtool_get_drvinfo(struct mlx5e_priv *priv, struct ethtool_drvinfo *drvinfo) { struct mlx5_core_dev *mdev = priv->mdev; + int count;
strscpy(drvinfo->driver, KBUILD_MODNAME, sizeof(drvinfo->driver)); - snprintf(drvinfo->fw_version, sizeof(drvinfo->fw_version), - "%d.%d.%04d (%.16s)", - fw_rev_maj(mdev), fw_rev_min(mdev), fw_rev_sub(mdev), - mdev->board_id); + count = snprintf(drvinfo->fw_version, sizeof(drvinfo->fw_version), + "%d.%d.%04d (%.16s)", fw_rev_maj(mdev), + fw_rev_min(mdev), fw_rev_sub(mdev), mdev->board_id); + if (count == sizeof(drvinfo->fw_version)) + snprintf(drvinfo->fw_version, sizeof(drvinfo->fw_version), + "%d.%d.%04d", fw_rev_maj(mdev), + fw_rev_min(mdev), fw_rev_sub(mdev)); + strscpy(drvinfo->bus_info, dev_name(mdev->device), sizeof(drvinfo->bus_info)); }
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Rahul Rameshbabu rrameshbabu@nvidia.com
[ Upstream commit 1b2bd0c0264febcd8d47209079a6671c38e6558b ]
Treat the operation as an error case when the return value is equivalent to the size of the name buffer. Failed to write null terminator to the name buffer, making the string malformed and should not be used. Provide a string with only the firmware version when forming the string with the board id fails. This logic for representors is identical to normal flow with ethtool.
Without check, will trigger -Wformat-truncation with W=1.
drivers/net/ethernet/mellanox/mlx5/core/en_rep.c: In function 'mlx5e_rep_get_drvinfo': drivers/net/ethernet/mellanox/mlx5/core/en_rep.c:78:31: warning: '%.16s' directive output may be truncated writing up to 16 bytes into a region of size between 13 and 22 [-Wformat-truncation=] 78 | "%d.%d.%04d (%.16s)", | ^~~~~ drivers/net/ethernet/mellanox/mlx5/core/en_rep.c:77:9: note: 'snprintf' output between 12 and 37 bytes into a destination of size 32 77 | snprintf(drvinfo->fw_version, sizeof(drvinfo->fw_version), | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 78 | "%d.%d.%04d (%.16s)", | ~~~~~~~~~~~~~~~~~~~~~ 79 | fw_rev_maj(mdev), fw_rev_min(mdev), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 80 | fw_rev_sub(mdev), mdev->board_id); | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Fixes: cf83c8fdcd47 ("net/mlx5e: Add missing ethtool driver info for representors") Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i... Signed-off-by: Rahul Rameshbabu rrameshbabu@nvidia.com Reviewed-by: Dragos Tatulea dtatulea@nvidia.com Signed-off-by: Saeed Mahameed saeedm@nvidia.com Link: https://lore.kernel.org/r/20231114215846.5902-16-saeed@kernel.org Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/mellanox/mlx5/core/en_rep.c | 12 ++++++++---- 1 file changed, 8 insertions(+), 4 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c index 0cd44ef190058..87fda65852fb7 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c @@ -71,13 +71,17 @@ static void mlx5e_rep_get_drvinfo(struct net_device *dev, { struct mlx5e_priv *priv = netdev_priv(dev); struct mlx5_core_dev *mdev = priv->mdev; + int count;
strscpy(drvinfo->driver, mlx5e_rep_driver_name, sizeof(drvinfo->driver)); - snprintf(drvinfo->fw_version, sizeof(drvinfo->fw_version), - "%d.%d.%04d (%.16s)", - fw_rev_maj(mdev), fw_rev_min(mdev), - fw_rev_sub(mdev), mdev->board_id); + count = snprintf(drvinfo->fw_version, sizeof(drvinfo->fw_version), + "%d.%d.%04d (%.16s)", fw_rev_maj(mdev), + fw_rev_min(mdev), fw_rev_sub(mdev), mdev->board_id); + if (count == sizeof(drvinfo->fw_version)) + snprintf(drvinfo->fw_version, sizeof(drvinfo->fw_version), + "%d.%d.%04d", fw_rev_maj(mdev), + fw_rev_min(mdev), fw_rev_sub(mdev)); }
static const struct counter_desc sw_rep_stats_desc[] = {
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Xin Long lucien.xin@gmail.com
[ Upstream commit 7cd5af0e937a197295f3aa3721031f0fbae49cff ]
There is no hardware supporting ct helper offload. However, prior to this patch, a flower filter with a helper in the ct action can be successfully set into the HW, for example (eth1 is a bnxt NIC):
# tc qdisc add dev eth1 ingress_block 22 ingress # tc filter add block 22 proto ip flower skip_sw ip_proto tcp \ dst_port 21 ct_state -trk action ct helper ipv4-tcp-ftp # tc filter show dev eth1 ingress
filter block 22 protocol ip pref 49152 flower chain 0 handle 0x1 eth_type ipv4 ip_proto tcp dst_port 21 ct_state -trk skip_sw in_hw in_hw_count 1 <---- action order 1: ct zone 0 helper ipv4-tcp-ftp pipe index 2 ref 1 bind 1 used_hw_stats delayed
This might cause the flower filter not to work as expected in the HW.
This patch avoids this problem by simply returning -EOPNOTSUPP in tcf_ct_offload_act_setup() to not allow to offload flows with a helper in act_ct.
Fixes: a21b06e73191 ("net: sched: add helper support in act_ct") Signed-off-by: Xin Long lucien.xin@gmail.com Reviewed-by: Jamal Hadi Salim jhs@mojatatu.com Link: https://lore.kernel.org/r/f8685ec7702c4a448a1371a8b34b43217b583b9d.169989800... Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- include/net/tc_act/tc_ct.h | 9 +++++++++ net/sched/act_ct.c | 3 +++ 2 files changed, 12 insertions(+)
diff --git a/include/net/tc_act/tc_ct.h b/include/net/tc_act/tc_ct.h index b24ea2d9400ba..1dc2f827d0bcf 100644 --- a/include/net/tc_act/tc_ct.h +++ b/include/net/tc_act/tc_ct.h @@ -57,6 +57,11 @@ static inline struct nf_flowtable *tcf_ct_ft(const struct tc_action *a) return to_ct_params(a)->nf_ft; }
+static inline struct nf_conntrack_helper *tcf_ct_helper(const struct tc_action *a) +{ + return to_ct_params(a)->helper; +} + #else static inline uint16_t tcf_ct_zone(const struct tc_action *a) { return 0; } static inline int tcf_ct_action(const struct tc_action *a) { return 0; } @@ -64,6 +69,10 @@ static inline struct nf_flowtable *tcf_ct_ft(const struct tc_action *a) { return NULL; } +static inline struct nf_conntrack_helper *tcf_ct_helper(const struct tc_action *a) +{ + return NULL; +} #endif /* CONFIG_NF_CONNTRACK */
#if IS_ENABLED(CONFIG_NET_ACT_CT) diff --git a/net/sched/act_ct.c b/net/sched/act_ct.c index d131750663c3c..ea05d0b2df68a 100644 --- a/net/sched/act_ct.c +++ b/net/sched/act_ct.c @@ -1534,6 +1534,9 @@ static int tcf_ct_offload_act_setup(struct tc_action *act, void *entry_data, if (bind) { struct flow_action_entry *entry = entry_data;
+ if (tcf_ct_helper(act)) + return -EOPNOTSUPP; + entry->id = FLOW_ACTION_CT; entry->ct.action = tcf_ct_action(act); entry->ct.zone = tcf_ct_zone(act);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Vlad Buslov vladbu@nvidia.com
[ Upstream commit 7e1caeace0418381f36b3aa8403dfd82fc57fc53 ]
Macvlan device in passthru mode sets its lower device promiscuous mode according to its MACVLAN_FLAG_NOPROMISC flag instead of synchronizing it to its own promiscuity setting. However, macvlan_change_rx_flags() function doesn't check the mode before propagating such changes to the lower device which can cause net_device->promiscuity counter overflow as illustrated by reproduction example [0] and resulting dmesg log [1]. Fix the issue by first verifying the mode in macvlan_change_rx_flags() function before propagating promiscuous mode change to the lower device.
[0]: ip link add macvlan1 link enp8s0f0 type macvlan mode passthru ip link set macvlan1 promisc on ip l set dev macvlan1 up ip link set macvlan1 promisc off ip l set dev macvlan1 down ip l set dev macvlan1 up
[1]: [ 5156.281724] macvlan1: entered promiscuous mode [ 5156.285467] mlx5_core 0000:08:00.0 enp8s0f0: entered promiscuous mode [ 5156.287639] macvlan1: left promiscuous mode [ 5156.288339] mlx5_core 0000:08:00.0 enp8s0f0: left promiscuous mode [ 5156.290907] mlx5_core 0000:08:00.0 enp8s0f0: entered promiscuous mode [ 5156.317197] mlx5_core 0000:08:00.0 enp8s0f0: promiscuity touches roof, set promiscuity failed. promiscuity feature of device might be broken.
Fixes: efdbd2b30caa ("macvlan: Propagate promiscuity setting to lower devices.") Reviewed-by: Gal Pressman gal@nvidia.com Signed-off-by: Vlad Buslov vladbu@nvidia.com Reviewed-by: Jiri Pirko jiri@nvidia.com Link: https://lore.kernel.org/r/20231114175915.1649154-1-vladbu@nvidia.com Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/macvlan.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/macvlan.c b/drivers/net/macvlan.c index ed908165a8b4e..347f288350619 100644 --- a/drivers/net/macvlan.c +++ b/drivers/net/macvlan.c @@ -780,7 +780,7 @@ static void macvlan_change_rx_flags(struct net_device *dev, int change) if (dev->flags & IFF_UP) { if (change & IFF_ALLMULTI) dev_set_allmulti(lowerdev, dev->flags & IFF_ALLMULTI ? 1 : -1); - if (change & IFF_PROMISC) + if (!macvlan_passthru(vlan->port) && change & IFF_PROMISC) dev_set_promiscuity(lowerdev, dev->flags & IFF_PROMISC ? 1 : -1);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Zhang Rui rui.zhang@intel.com
[ Upstream commit 137f01b3529d292a68d22e9681e2f903c768f790 ]
MSR_KNL_CORE_C6_RESIDENCY should be evaluated only if 1. this is KNL platform AND 2. need to get C6 residency or need to calculate C1 residency
Fix the broken logic introduced by commit 1e9042b9c8d4 ("tools/power turbostat: Fix CPU%C1 display value").
Fixes: 1e9042b9c8d4 ("tools/power turbostat: Fix CPU%C1 display value") Signed-off-by: Zhang Rui rui.zhang@intel.com Reviewed-by: Len Brown len.brown@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/power/x86/turbostat/turbostat.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/power/x86/turbostat/turbostat.c b/tools/power/x86/turbostat/turbostat.c index 8a36ba5df9f90..a938699ca2419 100644 --- a/tools/power/x86/turbostat/turbostat.c +++ b/tools/power/x86/turbostat/turbostat.c @@ -2180,7 +2180,7 @@ int get_counters(struct thread_data *t, struct core_data *c, struct pkg_data *p) if ((DO_BIC(BIC_CPU_c6) || soft_c1_residency_display(BIC_CPU_c6)) && !do_knl_cstates) { if (get_msr(cpu, MSR_CORE_C6_RESIDENCY, &c->c6)) return -7; - } else if (do_knl_cstates || soft_c1_residency_display(BIC_CPU_c6)) { + } else if (do_knl_cstates && soft_c1_residency_display(BIC_CPU_c6)) { if (get_msr(cpu, MSR_KNL_CORE_C6_RESIDENCY, &c->c6)) return -7; }
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Chen Yu yu.c.chen@intel.com
[ Upstream commit b61b7d8c4c22c4298a50ae5d0ee88facb85ce665 ]
Currently the C-state Pre-wake will not be printed due to the probe has not been invoked. Invoke the probe function accordingly.
Fixes: aeb01e6d71ff ("tools/power turbostat: Print the C-state Pre-wake settings") Signed-off-by: Chen Yu yu.c.chen@intel.com Reviewed-by: Zhang Rui rui.zhang@intel.com Reviewed-by: Len Brown len.brown@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/power/x86/turbostat/turbostat.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/tools/power/x86/turbostat/turbostat.c b/tools/power/x86/turbostat/turbostat.c index a938699ca2419..ce9860e388bd4 100644 --- a/tools/power/x86/turbostat/turbostat.c +++ b/tools/power/x86/turbostat/turbostat.c @@ -5790,6 +5790,7 @@ void process_cpuid() rapl_probe(family, model); perf_limit_reasons_probe(family, model); automatic_cstate_conversion_probe(family, model); + prewake_cstate_probe(family, model);
check_tcc_offset(model_orig);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Naomi Chu naomi.chu@mediatek.com
[ Upstream commit defde5a50d91c74e1ce71a7f0bce7fb1ae311d84 ]
The UFSHCI 4.0 specification mandates that there should always be at least one empty slot in each queue for distinguishing between full and empty states. Enlarge 'hwq->max_entries' to 'DeviceQueueDepth + 1' to allow UFSHCI 4.0 controllers to fully utilize MCQ queue slots.
Fixes: 4682abfae2eb ("scsi: ufs: core: mcq: Allocate memory for MCQ mode") Signed-off-by: Naomi Chu naomi.chu@mediatek.com Link: https://lore.kernel.org/r/20231102052426.12006-2-naomi.chu@mediatek.com Reviewed-by: Stanley Chu stanley.chu@mediatek.com Reviewed-by: Bart Van Assche bvanassche@acm.org Reviewed-by: Peter Wang peter.wang@mediatek.com Reviewed-by: Chun-Hung chun-hung.wu@mediatek.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/ufs/core/ufs-mcq.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/ufs/core/ufs-mcq.c b/drivers/ufs/core/ufs-mcq.c index 386674ead7f0d..4ecbd30c07827 100644 --- a/drivers/ufs/core/ufs-mcq.c +++ b/drivers/ufs/core/ufs-mcq.c @@ -433,7 +433,7 @@ int ufshcd_mcq_init(struct ufs_hba *hba)
for (i = 0; i < hba->nr_hw_queues; i++) { hwq = &hba->uhq[i]; - hwq->max_entries = hba->nutrs; + hwq->max_entries = hba->nutrs + 1; spin_lock_init(&hwq->sq_lock); spin_lock_init(&hwq->cq_lock); mutex_init(&hwq->sq_mutex);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Anastasia Belova abelova@astralinux.ru
[ Upstream commit ff31ba19d732efb9aca3633935d71085e68d5076 ]
"host=" should start with ';' (as in cifs_get_spnego_key) So its length should be 6.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Reviewed-by: Paulo Alcantara (SUSE) pc@manguebit.com Fixes: 7c9c3760b3a5 ("[CIFS] add constants for string lengths of keynames in SPNEGO upcall string") Signed-off-by: Anastasia Belova abelova@astralinux.ru Co-developed-by: Ekaterina Esina eesina@astralinux.ru Signed-off-by: Ekaterina Esina eesina@astralinux.ru Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/smb/client/cifs_spnego.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/fs/smb/client/cifs_spnego.c b/fs/smb/client/cifs_spnego.c index 6f3285f1dfee5..af7849e5974ff 100644 --- a/fs/smb/client/cifs_spnego.c +++ b/fs/smb/client/cifs_spnego.c @@ -64,8 +64,8 @@ struct key_type cifs_spnego_key_type = { * strlen(";sec=ntlmsspi") */ #define MAX_MECH_STR_LEN 13
-/* strlen of "host=" */ -#define HOST_KEY_LEN 5 +/* strlen of ";host=" */ +#define HOST_KEY_LEN 6
/* strlen of ";ip4=" or ";ip6=" */ #define IP_KEY_LEN 5
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ekaterina Esina eesina@astralinux.ru
[ Upstream commit 181724fc72486dec2bec8803459be05b5162aaa8 ]
Remove extra check after condition, add check after generating key for encryption. The check is needed to return non zero rc before rewriting it with generating key for decryption.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Reviewed-by: Paulo Alcantara (SUSE) pc@manguebit.com Fixes: d70e9fa55884 ("cifs: try opening channels after mounting") Signed-off-by: Ekaterina Esina eesina@astralinux.ru Co-developed-by: Anastasia Belova abelova@astralinux.ru Signed-off-by: Anastasia Belova abelova@astralinux.ru Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/smb/client/smb2transport.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/fs/smb/client/smb2transport.c b/fs/smb/client/smb2transport.c index 7676091b3e77a..21fc6d84e396d 100644 --- a/fs/smb/client/smb2transport.c +++ b/fs/smb/client/smb2transport.c @@ -452,6 +452,8 @@ generate_smb3signingkey(struct cifs_ses *ses, ptriplet->encryption.context, ses->smb3encryptionkey, SMB3_ENC_DEC_KEY_SIZE); + if (rc) + return rc; rc = generate_key(ses, ptriplet->decryption.label, ptriplet->decryption.context, ses->smb3decryptionkey, @@ -460,9 +462,6 @@ generate_smb3signingkey(struct cifs_ses *ses, return rc; }
- if (rc) - return rc; - #ifdef CONFIG_CIFS_DEBUG_DUMP_KEYS cifs_dbg(VFS, "%s: dumping generated AES session keys\n", __func__); /*
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Peter Zijlstra peterz@infradead.org
commit 889c58b3155ff4c8e8671c95daef63d6fabbb6b1 upstream.
Audit of the refcounting turned up that perf_pmu_migrate_context() fails to migrate the ctx refcount.
Fixes: bd2756811766 ("perf: Rewrite core context handling") Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Signed-off-by: Ingo Molnar mingo@kernel.org Link: https://lkml.kernel.org/r/20230612093539.085862001@infradead.org Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/linux/perf_event.h | 13 ++++++++----- kernel/events/core.c | 17 +++++++++++++++++ 2 files changed, 25 insertions(+), 5 deletions(-)
--- a/include/linux/perf_event.h +++ b/include/linux/perf_event.h @@ -843,11 +843,11 @@ struct perf_event { };
/* - * ,-----------------------[1:n]----------------------. - * V V - * perf_event_context <-[1:n]-> perf_event_pmu_context <--- perf_event - * ^ ^ | | - * `--------[1:n]---------' `-[n:1]-> pmu <-[1:n]-' + * ,-----------------------[1:n]------------------------. + * V V + * perf_event_context <-[1:n]-> perf_event_pmu_context <-[1:n]- perf_event + * | | + * `--[n:1]-> pmu <-[1:n]--' * * * struct perf_event_pmu_context lifetime is refcount based and RCU freed @@ -865,6 +865,9 @@ struct perf_event { * ctx->mutex pinning the configuration. Since we hold a reference on * group_leader (through the filedesc) it can't go away, therefore it's * associated pmu_ctx must exist and cannot change due to ctx->mutex. + * + * perf_event holds a refcount on perf_event_context + * perf_event holds a refcount on perf_event_pmu_context */ struct perf_event_pmu_context { struct pmu *pmu; --- a/kernel/events/core.c +++ b/kernel/events/core.c @@ -4816,6 +4816,11 @@ find_get_pmu_context(struct pmu *pmu, st void *task_ctx_data = NULL;
if (!ctx->task) { + /* + * perf_pmu_migrate_context() / __perf_pmu_install_event() + * relies on the fact that find_get_pmu_context() cannot fail + * for CPU contexts. + */ struct perf_cpu_pmu_context *cpc;
cpc = per_cpu_ptr(pmu->cpu_pmu_context, event->cpu); @@ -12888,6 +12893,9 @@ static void __perf_pmu_install_event(str int cpu, struct perf_event *event) { struct perf_event_pmu_context *epc; + struct perf_event_context *old_ctx = event->ctx; + + get_ctx(ctx); /* normally find_get_context() */
event->cpu = cpu; epc = find_get_pmu_context(pmu, ctx, event); @@ -12896,6 +12904,11 @@ static void __perf_pmu_install_event(str if (event->state >= PERF_EVENT_STATE_OFF) event->state = PERF_EVENT_STATE_INACTIVE; perf_install_in_context(ctx, event, cpu); + + /* + * Now that event->ctx is updated and visible, put the old ctx. + */ + put_ctx(old_ctx); }
static void __perf_pmu_install(struct perf_event_context *ctx, @@ -12934,6 +12947,10 @@ void perf_pmu_migrate_context(struct pmu struct perf_event_context *src_ctx, *dst_ctx; LIST_HEAD(events);
+ /* + * Since per-cpu context is persistent, no need to grab an extra + * reference. + */ src_ctx = &per_cpu_ptr(&perf_cpu_context, src_cpu)->ctx; dst_ctx = &per_cpu_ptr(&perf_cpu_context, dst_cpu)->ctx;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Harshit Mogalapalli harshit.m.mogalapalli@oracle.com
commit 471aa951bf1206d3c10d0daa67005b8e4db4ff83 upstream.
When i915 perf interface is not available dereferencing it will lead to NULL dereferences.
As returning -ENOTSUPP is pretty clear return when perf interface is not available.
Fixes: 2fec539112e8 ("i915/perf: Replace DRM_DEBUG with driver specific drm_dbg call") Suggested-by: Tvrtko Ursulin tvrtko.ursulin@intel.com Signed-off-by: Harshit Mogalapalli harshit.m.mogalapalli@oracle.com Reviewed-by: Tvrtko Ursulin tvrtko.ursulin@intel.com Cc: stable@vger.kernel.org # v6.0+ Signed-off-by: Tvrtko Ursulin tvrtko.ursulin@intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20231027172822.2753059-1-harsh... [tursulin: added stable tag] (cherry picked from commit 36f27350ff745bd228ab04d7845dfbffc177a889) Signed-off-by: Jani Nikula jani.nikula@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/i915/i915_perf.c | 15 +++------------ 1 file changed, 3 insertions(+), 12 deletions(-)
--- a/drivers/gpu/drm/i915/i915_perf.c +++ b/drivers/gpu/drm/i915/i915_perf.c @@ -4286,11 +4286,8 @@ int i915_perf_open_ioctl(struct drm_devi u32 known_open_flags; int ret;
- if (!perf->i915) { - drm_dbg(&perf->i915->drm, - "i915 perf interface not available for this system\n"); + if (!perf->i915) return -ENOTSUPP; - }
known_open_flags = I915_PERF_FLAG_FD_CLOEXEC | I915_PERF_FLAG_FD_NONBLOCK | @@ -4666,11 +4663,8 @@ int i915_perf_add_config_ioctl(struct dr struct i915_oa_reg *regs; int err, id;
- if (!perf->i915) { - drm_dbg(&perf->i915->drm, - "i915 perf interface not available for this system\n"); + if (!perf->i915) return -ENOTSUPP; - }
if (!perf->metrics_kobj) { drm_dbg(&perf->i915->drm, @@ -4832,11 +4826,8 @@ int i915_perf_remove_config_ioctl(struct struct i915_oa_config *oa_config; int ret;
- if (!perf->i915) { - drm_dbg(&perf->i915->drm, - "i915 perf interface not available for this system\n"); + if (!perf->i915) return -ENOTSUPP; - }
if (i915_perf_stream_paranoid && !perfmon_capable()) { drm_dbg(&perf->i915->drm,
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ilkka Koskinen ilkka@os.amperecomputing.com
commit 15c7ef7341a2e54cfa12ac502c65d6fd2cce2b62 upstream.
Coresight PMU driver didn't reject events meant for other PMUs. This caused some of the Core PMU events disappearing from the output of "perf list". In addition, trying to run e.g.
$ perf stat -e r2 sleep 1
made Coresight PMU driver to handle the event instead of letting Core PMU driver to deal with it.
Cc: stable@vger.kernel.org Fixes: e37dfd65731d ("perf: arm_cspmu: Add support for ARM CoreSight PMU driver") Signed-off-by: Ilkka Koskinen ilkka@os.amperecomputing.com Acked-by: Will Deacon will@kernel.org Reviewed-by: Besar Wicaksono bwicaksono@nvidia.com Acked-by: Mark Rutland mark.rutland@arm.com Reviewed-by: Anshuman Khandual anshuman.khandual@arm.com Link: https://lore.kernel.org/r/20231103001654.35565-1-ilkka@os.amperecomputing.co... Signed-off-by: Catalin Marinas catalin.marinas@arm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/perf/arm_cspmu/arm_cspmu.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/drivers/perf/arm_cspmu/arm_cspmu.c +++ b/drivers/perf/arm_cspmu/arm_cspmu.c @@ -635,6 +635,9 @@ static int arm_cspmu_event_init(struct p
cspmu = to_arm_cspmu(event->pmu);
+ if (event->attr.type != event->pmu->type) + return -ENOENT; + /* * Following other "uncore" PMUs, we do not support sampling mode or * attach to a task (per-process mode).
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alexandre Ghiti alexghiti@rivosinc.com
commit c6e316ac05532febb0c966fa9b55f5258ed037be upstream.
We must check the return value of find_first_bit() before using the return value as an index array since it happens to overflow the array and then panic:
[ 107.318430] Kernel BUG [#1] [ 107.319434] CPU: 3 PID: 1238 Comm: kill Tainted: G E 6.6.0-rc6ubuntu-defconfig #2 [ 107.319465] Hardware name: riscv-virtio,qemu (DT) [ 107.319551] epc : pmu_sbi_ovf_handler+0x3a4/0x3ae [ 107.319840] ra : pmu_sbi_ovf_handler+0x52/0x3ae [ 107.319868] epc : ffffffff80a0a77c ra : ffffffff80a0a42a sp : ffffaf83fecda350 [ 107.319884] gp : ffffffff823961a8 tp : ffffaf8083db1dc0 t0 : ffffaf83fecda480 [ 107.319899] t1 : ffffffff80cafe62 t2 : 000000000000ff00 s0 : ffffaf83fecda520 [ 107.319921] s1 : ffffaf83fecda380 a0 : 00000018fca29df0 a1 : ffffffffffffffff [ 107.319936] a2 : 0000000001073734 a3 : 0000000000000004 a4 : 0000000000000000 [ 107.319951] a5 : 0000000000000040 a6 : 000000001d1c8774 a7 : 0000000000504d55 [ 107.319965] s2 : ffffffff82451f10 s3 : ffffffff82724e70 s4 : 000000000000003f [ 107.319980] s5 : 0000000000000011 s6 : ffffaf8083db27c0 s7 : 0000000000000000 [ 107.319995] s8 : 0000000000000001 s9 : 00007fffb45d6558 s10: 00007fffb45d81a0 [ 107.320009] s11: ffffaf7ffff60000 t3 : 0000000000000004 t4 : 0000000000000000 [ 107.320023] t5 : ffffaf7f80000000 t6 : ffffaf8000000000 [ 107.320037] status: 0000000200000100 badaddr: 0000000000000000 cause: 0000000000000003 [ 107.320081] [<ffffffff80a0a77c>] pmu_sbi_ovf_handler+0x3a4/0x3ae [ 107.320112] [<ffffffff800b42d0>] handle_percpu_devid_irq+0x9e/0x1a0 [ 107.320131] [<ffffffff800ad92c>] generic_handle_domain_irq+0x28/0x36 [ 107.320148] [<ffffffff8065f9f8>] riscv_intc_irq+0x36/0x4e [ 107.320166] [<ffffffff80caf4a0>] handle_riscv_irq+0x54/0x86 [ 107.320189] [<ffffffff80cb0036>] do_irq+0x64/0x96 [ 107.320271] Code: 85a6 855e b097 ff7f 80e7 9220 b709 9002 4501 bbd9 (9002) 6097 [ 107.320585] ---[ end trace 0000000000000000 ]--- [ 107.320704] Kernel panic - not syncing: Fatal exception in interrupt [ 107.320775] SMP: stopping secondary CPUs [ 107.321219] Kernel Offset: 0x0 from 0xffffffff80000000 [ 107.333051] ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---
Fixes: 4905ec2fb7e6 ("RISC-V: Add sscofpmf extension support") Signed-off-by: Alexandre Ghiti alexghiti@rivosinc.com Link: https://lore.kernel.org/r/20231109082128.40777-1-alexghiti@rivosinc.com Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt palmer@rivosinc.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/perf/riscv_pmu_sbi.c | 5 +++++ 1 file changed, 5 insertions(+)
--- a/drivers/perf/riscv_pmu_sbi.c +++ b/drivers/perf/riscv_pmu_sbi.c @@ -629,6 +629,11 @@ static irqreturn_t pmu_sbi_ovf_handler(i
/* Firmware counter don't support overflow yet */ fidx = find_first_bit(cpu_hw_evt->used_hw_ctrs, RISCV_MAX_COUNTERS); + if (fidx == RISCV_MAX_COUNTERS) { + csr_clear(CSR_SIP, BIT(riscv_pmu_irq_num)); + return IRQ_NONE; + } + event = cpu_hw_evt->events[fidx]; if (!event) { csr_clear(CSR_SIP, BIT(riscv_pmu_irq_num));
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Vikash Garodia quic_vgarodia@quicinc.com
commit 5e538fce33589da6d7cb2de1445b84d3a8a692f7 upstream.
Read and write pointers are used to track the packet index in the memory shared between video driver and firmware. There is a possibility of OOB access if the read or write pointer goes beyond the queue memory size. Add checks for the read and write pointer to avoid OOB access.
Cc: stable@vger.kernel.org Fixes: d96d3f30c0f2 ("[media] media: venus: hfi: add Venus HFI files") Signed-off-by: Vikash Garodia quic_vgarodia@quicinc.com Signed-off-by: Stanimir Varbanov stanimir.k.varbanov@gmail.com Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/media/platform/qcom/venus/hfi_venus.c | 10 ++++++++++ 1 file changed, 10 insertions(+)
--- a/drivers/media/platform/qcom/venus/hfi_venus.c +++ b/drivers/media/platform/qcom/venus/hfi_venus.c @@ -205,6 +205,11 @@ static int venus_write_queue(struct venu
new_wr_idx = wr_idx + dwords; wr_ptr = (u32 *)(queue->qmem.kva + (wr_idx << 2)); + + if (wr_ptr < (u32 *)queue->qmem.kva || + wr_ptr > (u32 *)(queue->qmem.kva + queue->qmem.size - sizeof(*wr_ptr))) + return -EINVAL; + if (new_wr_idx < qsize) { memcpy(wr_ptr, packet, dwords << 2); } else { @@ -272,6 +277,11 @@ static int venus_read_queue(struct venus }
rd_ptr = (u32 *)(queue->qmem.kva + (rd_idx << 2)); + + if (rd_ptr < (u32 *)queue->qmem.kva || + rd_ptr > (u32 *)(queue->qmem.kva + queue->qmem.size - sizeof(*rd_ptr))) + return -EINVAL; + dwords = *rd_ptr >> 2; if (!dwords) return -EINVAL;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Adrian Hunter adrian.hunter@intel.com
commit f2d87895cbc4af80649850dcf5da36de6b2ed3dd upstream.
Ensure PERF_IP_FLAG_ASYNC is set always for asynchronous branches (i.e. interrupts etc).
Fixes: 90e457f7be08 ("perf tools: Add Intel PT support") Cc: stable@vger.kernel.org Signed-off-by: Adrian Hunter adrian.hunter@intel.com Acked-by: Namhyung Kim namhyung@kernel.org Link: https://lore.kernel.org/r/20230928072953.19369-1-adrian.hunter@intel.com Signed-off-by: Namhyung Kim namhyung@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- tools/perf/util/intel-pt.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/tools/perf/util/intel-pt.c +++ b/tools/perf/util/intel-pt.c @@ -1512,9 +1512,11 @@ static void intel_pt_sample_flags(struct } else if (ptq->state->flags & INTEL_PT_ASYNC) { if (!ptq->state->to_ip) ptq->flags = PERF_IP_FLAG_BRANCH | + PERF_IP_FLAG_ASYNC | PERF_IP_FLAG_TRACE_END; else if (ptq->state->from_nr && !ptq->state->to_nr) ptq->flags = PERF_IP_FLAG_BRANCH | PERF_IP_FLAG_CALL | + PERF_IP_FLAG_ASYNC | PERF_IP_FLAG_VMEXIT; else ptq->flags = PERF_IP_FLAG_BRANCH | PERF_IP_FLAG_CALL |
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nicholas Piggin npiggin@gmail.com
commit ea142e590aec55ba40c5affb4d49e68c713c63dc upstream.
When the PMU is disabled, MMCRA is not updated to disable BHRB and instruction sampling. This can lead to those features remaining enabled, which can slow down a real or emulated CPU.
Fixes: 1cade527f6e9 ("powerpc/perf: BHRB control to disable BHRB logic when not used") Cc: stable@vger.kernel.org # v5.9+ Signed-off-by: Nicholas Piggin npiggin@gmail.com Signed-off-by: Michael Ellerman mpe@ellerman.id.au Link: https://msgid.link/20231018153423.298373-1-npiggin@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/powerpc/perf/core-book3s.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-)
--- a/arch/powerpc/perf/core-book3s.c +++ b/arch/powerpc/perf/core-book3s.c @@ -1371,8 +1371,7 @@ static void power_pmu_disable(struct pmu /* * Disable instruction sampling if it was enabled */ - if (cpuhw->mmcr.mmcra & MMCRA_SAMPLE_ENABLE) - val &= ~MMCRA_SAMPLE_ENABLE; + val &= ~MMCRA_SAMPLE_ENABLE;
/* Disable BHRB via mmcra (BHRBRD) for p10 */ if (ppmu->flags & PPMU_ARCH_31) @@ -1383,7 +1382,7 @@ static void power_pmu_disable(struct pmu * instruction sampling or BHRB. */ if (val != mmcra) { - mtspr(SPRN_MMCRA, mmcra); + mtspr(SPRN_MMCRA, val); mb(); isync(); }
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kees Cook keescook@chromium.org
commit 381fdb73d1e2a48244de7260550e453d1003bb8e upstream.
The performance mode of the gcc-plugin randstruct was shuffling struct members outside of the cache-line groups. Limit the range to the specified group indexes.
Cc: linux-hardening@vger.kernel.org Cc: stable@vger.kernel.org Reported-by: Lukas Loidolt e1634039@student.tuwien.ac.at Closes: https://lore.kernel.org/all/f3ca77f0-e414-4065-83a5-ae4c4d25545d@student.tuw... Fixes: 313dd1b62921 ("gcc-plugins: Add the randstruct plugin") Signed-off-by: Kees Cook keescook@chromium.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- scripts/gcc-plugins/randomize_layout_plugin.c | 11 ++++++++--- 1 file changed, 8 insertions(+), 3 deletions(-)
--- a/scripts/gcc-plugins/randomize_layout_plugin.c +++ b/scripts/gcc-plugins/randomize_layout_plugin.c @@ -191,12 +191,14 @@ static void partition_struct(tree *field
static void performance_shuffle(tree *newtree, unsigned long length, ranctx *prng_state) { - unsigned long i, x; + unsigned long i, x, index; struct partition_group size_group[length]; unsigned long num_groups = 0; unsigned long randnum;
partition_struct(newtree, length, (struct partition_group *)&size_group, &num_groups); + + /* FIXME: this group shuffle is currently a no-op. */ for (i = num_groups - 1; i > 0; i--) { struct partition_group tmp; randnum = ranval(prng_state) % (i + 1); @@ -206,11 +208,14 @@ static void performance_shuffle(tree *ne }
for (x = 0; x < num_groups; x++) { - for (i = size_group[x].start + size_group[x].length - 1; i > size_group[x].start; i--) { + for (index = size_group[x].length - 1; index > 0; index--) { tree tmp; + + i = size_group[x].start + index; if (DECL_BIT_FIELD_TYPE(newtree[i])) continue; - randnum = ranval(prng_state) % (i + 1); + randnum = ranval(prng_state) % (index + 1); + randnum += size_group[x].start; // we could handle this case differently if desired if (DECL_BIT_FIELD_TYPE(newtree[randnum])) continue;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Hao Sun sunhao.th@gmail.com
commit 811c363645b33e6e22658634329e95f383dfc705 upstream.
In check_stack_write_fixed_off(), imm value is cast to u32 before being spilled to the stack. Therefore, the sign information is lost, and the range information is incorrect when load from the stack again.
For the following prog: 0: r2 = r10 1: *(u64*)(r2 -40) = -44 2: r0 = *(u64*)(r2 - 40) 3: if r0 s<= 0xa goto +2 4: r0 = 1 5: exit 6: r0 = 0 7: exit
The verifier gives: func#0 @0 0: R1=ctx(off=0,imm=0) R10=fp0 0: (bf) r2 = r10 ; R2_w=fp0 R10=fp0 1: (7a) *(u64 *)(r2 -40) = -44 ; R2_w=fp0 fp-40_w=4294967252 2: (79) r0 = *(u64 *)(r2 -40) ; R0_w=4294967252 R2_w=fp0 fp-40_w=4294967252 3: (c5) if r0 s< 0xa goto pc+2 mark_precise: frame0: last_idx 3 first_idx 0 subseq_idx -1 mark_precise: frame0: regs=r0 stack= before 2: (79) r0 = *(u64 *)(r2 -40) 3: R0_w=4294967252 4: (b7) r0 = 1 ; R0_w=1 5: (95) exit verification time 7971 usec stack depth 40 processed 6 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0
So remove the incorrect cast, since imm field is declared as s32, and __mark_reg_known() takes u64, so imm would be correctly sign extended by compiler.
Fixes: ecdf985d7615 ("bpf: track immediate values written to stack by BPF_ST instruction") Cc: stable@vger.kernel.org Signed-off-by: Hao Sun sunhao.th@gmail.com Acked-by: Shung-Hsi Yu shung-hsi.yu@suse.com Acked-by: Eduard Zingerman eddyz87@gmail.com Link: https://lore.kernel.org/r/20231101-fix-check-stack-write-v3-1-f05c2b1473d5@g... Signed-off-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/bpf/verifier.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -4368,7 +4368,7 @@ static int check_stack_write_fixed_off(s insn->imm != 0 && env->bpf_capable) { struct bpf_reg_state fake_reg = {};
- __mark_reg_known(&fake_reg, (u32)insn->imm); + __mark_reg_known(&fake_reg, insn->imm); fake_reg.type = SCALAR_VALUE; save_register_state(state, spi, &fake_reg, size); } else if (reg && is_spillable_regtype(reg->type)) {
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Shung-Hsi Yu shung-hsi.yu@suse.com
commit 291d044fd51f8484066300ee42afecf8c8db7b3a upstream.
BPF_END and BPF_NEG has a different specification for the source bit in the opcode compared to other ALU/ALU64 instructions, and is either reserved or use to specify the byte swap endianness. In both cases the source bit does not encode source operand location, and src_reg is a reserved field.
backtrack_insn() currently does not differentiate BPF_END and BPF_NEG from other ALU/ALU64 instructions, which leads to r0 being incorrectly marked as precise when processing BPF_ALU | BPF_TO_BE | BPF_END instructions. This commit teaches backtrack_insn() to correctly mark precision for such case.
While precise tracking of BPF_NEG and other BPF_END instructions are correct and does not need fixing, this commit opt to process all BPF_NEG and BPF_END instructions within the same if-clause to better align with current convention used in the verifier (e.g. check_alu_op).
Fixes: b5dc0163d8fd ("bpf: precise scalar_value tracking") Cc: stable@vger.kernel.org Reported-by: Mohamed Mahmoud mmahmoud@redhat.com Closes: https://lore.kernel.org/r/87jzrrwptf.fsf@toke.dk Tested-by: Toke Høiland-Jørgensen toke@redhat.com Tested-by: Tao Lyu tao.lyu@epfl.ch Acked-by: Eduard Zingerman eddyz87@gmail.com Signed-off-by: Shung-Hsi Yu shung-hsi.yu@suse.com Link: https://lore.kernel.org/r/20231102053913.12004-2-shung-hsi.yu@suse.com Signed-off-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/bpf/verifier.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
--- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -3436,7 +3436,12 @@ static int backtrack_insn(struct bpf_ver if (class == BPF_ALU || class == BPF_ALU64) { if (!bt_is_reg_set(bt, dreg)) return 0; - if (opcode == BPF_MOV) { + if (opcode == BPF_END || opcode == BPF_NEG) { + /* sreg is reserved and unused + * dreg still need precision before this insn + */ + return 0; + } else if (opcode == BPF_MOV) { if (BPF_SRC(insn->code) == BPF_X) { /* dreg = sreg * dreg needs precision after this insn
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ranjan Kumar ranjan.kumar@broadcom.com
commit 3c978492c333f0c08248a8d51cecbe5eb5f617c9 upstream.
The retry loop continues to iterate until the count reaches 30, even after receiving the correct value. Exit loop when a non-zero value is read.
Fixes: 4ca10f3e3174 ("scsi: mpt3sas: Perform additional retries if doorbell read returns 0") Cc: stable@vger.kernel.org Signed-off-by: Ranjan Kumar ranjan.kumar@broadcom.com Link: https://lore.kernel.org/r/20231020105849.6350-1-ranjan.kumar@broadcom.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/scsi/mpt3sas/mpt3sas_base.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/scsi/mpt3sas/mpt3sas_base.c +++ b/drivers/scsi/mpt3sas/mpt3sas_base.c @@ -223,8 +223,8 @@ _base_readl_ext_retry(const volatile voi
for (i = 0 ; i < 30 ; i++) { ret_val = readl(addr); - if (ret_val == 0) - continue; + if (ret_val != 0) + break; }
return ret_val;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Chandrakanth patil chandrakanth.patil@broadcom.com
commit 8e3ed9e786511ad800c33605ed904b9de49323cf upstream.
In BMC environments with concurrent access to multiple registers, certain registers occasionally yield a value of 0 even after 3 retries due to hardware errata. As a fix, we have extended the retry count from 3 to 30.
The same errata applies to the mpt3sas driver, and a similar patch has been accepted. Please find more details in the mpt3sas patch reference link.
Link: https://lore.kernel.org/r/20230829090020.5417-2-ranjan.kumar@broadcom.com Fixes: 272652fcbf1a ("scsi: megaraid_sas: add retry logic in megasas_readl") Cc: stable@vger.kernel.org Signed-off-by: Chandrakanth patil chandrakanth.patil@broadcom.com Signed-off-by: Sumit Saxena sumit.saxena@broadcom.com Link: https://lore.kernel.org/r/20231003110021.168862-2-chandrakanth.patil@broadco... Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/scsi/megaraid/megaraid_sas_base.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/scsi/megaraid/megaraid_sas_base.c +++ b/drivers/scsi/megaraid/megaraid_sas_base.c @@ -263,13 +263,13 @@ u32 megasas_readl(struct megasas_instanc * Fusion registers could intermittently return all zeroes. * This behavior is transient in nature and subsequent reads will * return valid value. As a workaround in driver, retry readl for - * upto three times until a non-zero value is read. + * up to thirty times until a non-zero value is read. */ if (instance->adapter_type == AERO_SERIES) { do { ret_val = readl(addr); i++; - } while (ret_val == 0 && i < 3); + } while (ret_val == 0 && i < 30); return ret_val; } else { return readl(addr);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Manivannan Sadhasivam manivannan.sadhasivam@linaro.org
commit fc88ca19ad0989dc0e4d4b126d5d0ba91f6cb616 upstream.
The "hs_gear" variable is used to program the PHY settings (submode) during ufs_qcom_power_up_sequence(). Currently, it is being updated every time the agreed gear changes. Due to this, if the gear got downscaled before suspend (runtime/system), then while resuming, the PHY settings for the lower gear will be applied first and later when scaling to max gear with REINIT, the PHY settings for the max gear will be applied.
This adds a latency while resuming and also really not needed as the PHY gear settings are backwards compatible i.e., we can continue using the PHY settings for max gear with lower gear speed.
So let's update the "hs_gear" variable _only_ when the agreed gear is greater than the current one. This guarantees that the PHY settings will be changed only during probe time and fatal error condition.
Due to this, UFSHCD_QUIRK_REINIT_AFTER_MAX_GEAR_SWITCH can now be skipped when the PM operation is in progress.
Cc: stable@vger.kernel.org Fixes: 96a7141da332 ("scsi: ufs: core: Add support for reinitializing the UFS device") Reported-by: Can Guo quic_cang@quicinc.com Signed-off-by: Manivannan Sadhasivam manivannan.sadhasivam@linaro.org Link: https://lore.kernel.org/r/20230908145329.154024-1-manivannan.sadhasivam@lina... Reviewed-by: Can Guo quic_cang@quicinc.com Tested-by: Can Guo quic_cang@quicinc.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/ufs/core/ufshcd.c | 3 ++- drivers/ufs/host/ufs-qcom.c | 9 +++++++-- 2 files changed, 9 insertions(+), 3 deletions(-)
--- a/drivers/ufs/core/ufshcd.c +++ b/drivers/ufs/core/ufshcd.c @@ -8798,7 +8798,8 @@ static int ufshcd_probe_hba(struct ufs_h if (ret) goto out;
- if (hba->quirks & UFSHCD_QUIRK_REINIT_AFTER_MAX_GEAR_SWITCH) { + if (!hba->pm_op_in_progress && + (hba->quirks & UFSHCD_QUIRK_REINIT_AFTER_MAX_GEAR_SWITCH)) { /* Reset the device and controller before doing reinit */ ufshcd_device_reset(hba); ufshcd_hba_stop(hba); --- a/drivers/ufs/host/ufs-qcom.c +++ b/drivers/ufs/host/ufs-qcom.c @@ -820,8 +820,13 @@ static int ufs_qcom_pwr_change_notify(st return ret; }
- /* Use the agreed gear */ - host->hs_gear = dev_req_params->gear_tx; + /* + * Update hs_gear only when the gears are scaled to a higher value. This is because, + * the PHY gear settings are backwards compatible and we only need to change the PHY + * settings while scaling to higher gears. + */ + if (dev_req_params->gear_tx > host->hs_gear) + host->hs_gear = dev_req_params->gear_tx;
/* enable the device ref clock before changing to HS mode */ if (!ufshcd_is_hs_mode(&hba->pwr_info) &&
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Quinn Tran qutran@marvell.com
commit 19597cad64d608aa8ac2f8aef50a50187a565223 upstream.
User experiences system crash when running AER error injection. The perturbation causes the abort-all-I/O path to trigger. The driver assumes all I/O on this path is FCP only. If there is both NVMe & FCP traffic, a system crash happens. Add additional check to see if I/O is FCP or not before access.
PID: 999019 TASK: ff35d769f24722c0 CPU: 53 COMMAND: "kworker/53:1" 0 [ff3f78b964847b58] machine_kexec at ffffffffae86973d 1 [ff3f78b964847ba8] __crash_kexec at ffffffffae9be29d 2 [ff3f78b964847c70] crash_kexec at ffffffffae9bf528 3 [ff3f78b964847c78] oops_end at ffffffffae8282ab 4 [ff3f78b964847c98] exc_page_fault at ffffffffaf2da502 5 [ff3f78b964847cc0] asm_exc_page_fault at ffffffffaf400b62 [exception RIP: qla2x00_abort_srb+444] RIP: ffffffffc07b5f8c RSP: ff3f78b964847d78 RFLAGS: 00010046 RAX: 0000000000000282 RBX: ff35d74a0195a200 RCX: ff35d76886fd03a0 RDX: 0000000000000001 RSI: ffffffffc07c5ec8 RDI: ff35d74a0195a200 RBP: ff35d76913d22080 R8: ff35d7694d103200 R9: ff35d7694d103200 R10: 0000000100000000 R11: ffffffffb05d6630 R12: 0000000000010000 R13: ff3f78b964847df8 R14: ff35d768d8754000 R15: ff35d768877248e0 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 6 [ff3f78b964847d70] qla2x00_abort_srb at ffffffffc07b5f84 [qla2xxx] 7 [ff3f78b964847de0] __qla2x00_abort_all_cmds at ffffffffc07b6238 [qla2xxx] 8 [ff3f78b964847e38] qla2x00_abort_all_cmds at ffffffffc07ba635 [qla2xxx] 9 [ff3f78b964847e58] qla2x00_terminate_rport_io at ffffffffc08145eb [qla2xxx] 10 [ff3f78b964847e70] fc_terminate_rport_io at ffffffffc045987e [scsi_transport_fc] 11 [ff3f78b964847e88] process_one_work at ffffffffae914f15 12 [ff3f78b964847ed0] worker_thread at ffffffffae9154c0 13 [ff3f78b964847f10] kthread at ffffffffae91c456 14 [ff3f78b964847f50] ret_from_fork at ffffffffae8036ef
Cc: stable@vger.kernel.org Fixes: f45bca8c5052 ("scsi: qla2xxx: Fix double scsi_done for abort path") Signed-off-by: Quinn Tran qutran@marvell.com Signed-off-by: Nilesh Javali njavali@marvell.com Link: https://lore.kernel.org/r/20231030064912.37912-1-njavali@marvell.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/scsi/qla2xxx/qla_os.c | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-)
--- a/drivers/scsi/qla2xxx/qla_os.c +++ b/drivers/scsi/qla2xxx/qla_os.c @@ -1835,8 +1835,16 @@ static void qla2x00_abort_srb(struct qla }
spin_lock_irqsave(qp->qp_lock_ptr, *flags); - if (ret_cmd && blk_mq_request_started(scsi_cmd_to_rq(cmd))) - sp->done(sp, res); + switch (sp->type) { + case SRB_SCSI_CMD: + if (ret_cmd && blk_mq_request_started(scsi_cmd_to_rq(cmd))) + sp->done(sp, res); + break; + default: + if (ret_cmd) + sp->done(sp, res); + break; + } } else { sp->done(sp, res); }
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Peter Wang peter.wang@mediatek.com
commit 27900d7119c464b43cd9eac69c85884d17bae240 upstream.
If command timeout happens and cq complete IRQ is raised at the same time, ufshcd_mcq_abort clears lprb->cmd and a NULL pointer deref happens in the ISR. Error log:
ufshcd_abort: Device abort task at tag 18 Unable to handle kernel NULL pointer dereference at virtual address 0000000000000108 pc : [0xffffffe27ef867ac] scsi_dma_unmap+0xc/0x44 lr : [0xffffffe27f1b898c] ufshcd_release_scsi_cmd+0x24/0x114
Fixes: f1304d442077 ("scsi: ufs: mcq: Added ufshcd_mcq_abort()") Cc: stable@vger.kernel.org Signed-off-by: Peter Wang peter.wang@mediatek.com Link: https://lore.kernel.org/r/20231106075117.8995-1-peter.wang@mediatek.com Reviewed-by: Bart Van Assche bvanassche@acm.org Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/ufs/core/ufs-mcq.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/drivers/ufs/core/ufs-mcq.c +++ b/drivers/ufs/core/ufs-mcq.c @@ -632,6 +632,7 @@ int ufshcd_mcq_abort(struct scsi_cmnd *c int tag = scsi_cmd_to_rq(cmd)->tag; struct ufshcd_lrb *lrbp = &hba->lrb[tag]; struct ufs_hw_queue *hwq; + unsigned long flags; int err = FAILED;
if (!ufshcd_cmd_inflight(lrbp->cmd)) { @@ -672,8 +673,10 @@ int ufshcd_mcq_abort(struct scsi_cmnd *c }
err = SUCCESS; + spin_lock_irqsave(&hwq->cq_lock, flags); if (ufshcd_cmd_inflight(lrbp->cmd)) ufshcd_release_scsi_cmd(hba, lrbp); + spin_unlock_irqrestore(&hwq->cq_lock, flags);
out: return err;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Roxana Nicolescu roxana.nicolescu@canonical.com
commit 1c43c0f1f84aa59dfc98ce66f0a67b2922aa7f9d upstream.
x86 optimized crypto modules are built as modules rather than build-in and they are not loaded when the crypto API is initialized, resulting in the generic builtin module (sha1-generic) being used instead.
It was discovered when creating a sha1/sha256 checksum of a 2Gb file by using kcapi-tools because it would take significantly longer than creating a sha512 checksum of the same file. trace-cmd showed that for sha1/256 the generic module was used, whereas for sha512 the optimized module was used instead.
Add module aliases() for these x86 optimized crypto modules based on CPU feature bits so udev gets a chance to load them later in the boot process. This resulted in ~3x decrease in the real-time execution of kcapi-dsg.
Fix is inspired from commit aa031b8f702e ("crypto: x86/sha512 - load based on CPU features") where a similar fix was done for sha512.
Cc: stable@vger.kernel.org # 5.15+ Suggested-by: Dimitri John Ledkov dimitri.ledkov@canonical.com Suggested-by: Julian Andres Klode julian.klode@canonical.com Signed-off-by: Roxana Nicolescu roxana.nicolescu@canonical.com Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/crypto/sha1_ssse3_glue.c | 12 ++++++++++++ arch/x86/crypto/sha256_ssse3_glue.c | 12 ++++++++++++ 2 files changed, 24 insertions(+)
--- a/arch/x86/crypto/sha1_ssse3_glue.c +++ b/arch/x86/crypto/sha1_ssse3_glue.c @@ -24,8 +24,17 @@ #include <linux/types.h> #include <crypto/sha1.h> #include <crypto/sha1_base.h> +#include <asm/cpu_device_id.h> #include <asm/simd.h>
+static const struct x86_cpu_id module_cpu_ids[] = { + X86_MATCH_FEATURE(X86_FEATURE_AVX2, NULL), + X86_MATCH_FEATURE(X86_FEATURE_AVX, NULL), + X86_MATCH_FEATURE(X86_FEATURE_SSSE3, NULL), + {} +}; +MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids); + static int sha1_update(struct shash_desc *desc, const u8 *data, unsigned int len, sha1_block_fn *sha1_xform) { @@ -301,6 +310,9 @@ static inline void unregister_sha1_ni(vo
static int __init sha1_ssse3_mod_init(void) { + if (!x86_match_cpu(module_cpu_ids)) + return -ENODEV; + if (register_sha1_ssse3()) goto fail;
--- a/arch/x86/crypto/sha256_ssse3_glue.c +++ b/arch/x86/crypto/sha256_ssse3_glue.c @@ -38,11 +38,20 @@ #include <crypto/sha2.h> #include <crypto/sha256_base.h> #include <linux/string.h> +#include <asm/cpu_device_id.h> #include <asm/simd.h>
asmlinkage void sha256_transform_ssse3(struct sha256_state *state, const u8 *data, int blocks);
+static const struct x86_cpu_id module_cpu_ids[] = { + X86_MATCH_FEATURE(X86_FEATURE_AVX2, NULL), + X86_MATCH_FEATURE(X86_FEATURE_AVX, NULL), + X86_MATCH_FEATURE(X86_FEATURE_SSSE3, NULL), + {} +}; +MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids); + static int _sha256_update(struct shash_desc *desc, const u8 *data, unsigned int len, sha256_block_fn *sha256_xform) { @@ -366,6 +375,9 @@ static inline void unregister_sha256_ni(
static int __init sha256_ssse3_mod_init(void) { + if (!x86_match_cpu(module_cpu_ids)) + return -ENODEV; + if (register_sha256_ssse3()) goto fail;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mario Limonciello mario.limonciello@amd.com
commit 7d08f21f8c6307cb05cabb8d86e90ff6ccba57e9 upstream.
Iain reports that USB devices can't be used to wake a Lenovo Z13 from suspend. This occurs because on some AMD platforms, even though the Root Ports advertise PME_Support for D3hot and D3cold, wakeup events from devices on a USB4 controller don't result in wakeup interrupts from the Root Port when amd-pmc has put the platform in a hardware sleep state.
If amd-pmc will be involved in the suspend, remove D3hot and D3cold from the PME_Support mask of Root Ports above USB4 controllers so we avoid those states if we need wakeups.
Restore D3 support at resume so that it can be used by runtime suspend.
This affects both AMD Rembrandt and Phoenix SoCs.
"pm_suspend_target_state == PM_SUSPEND_ON" means we're doing runtime suspend, and amd-pmc will not be involved. In that case PMEs work as advertised in D3hot/D3cold, so we don't need to do anything.
Note that amd-pmc is technically optional, and there's no need for this quirk if it's not present, but we assume it's always present because power consumption is so high without it.
Fixes: 9d26d3a8f1b0 ("PCI: Put PCIe ports into D3 during suspend") Link: https://lore.kernel.org/r/20231004144959.158840-1-mario.limonciello@amd.com Reported-by: Iain Lane iain@orangesquash.org.uk Closes: https://forums.lenovo.com/t5/Ubuntu/Z13-can-t-resume-from-suspend-with-exter... Signed-off-by: Mario Limonciello mario.limonciello@amd.com [bhelgaas: commit log, move to arch/x86/pci/fixup.c, add #includes] Signed-off-by: Bjorn Helgaas bhelgaas@google.com Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/pci/fixup.c | 59 +++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 59 insertions(+)
--- a/arch/x86/pci/fixup.c +++ b/arch/x86/pci/fixup.c @@ -3,9 +3,11 @@ * Exceptions for specific devices. Usually work-arounds for fatal design flaws. */
+#include <linux/bitfield.h> #include <linux/delay.h> #include <linux/dmi.h> #include <linux/pci.h> +#include <linux/suspend.h> #include <linux/vgaarb.h> #include <asm/amd_nb.h> #include <asm/hpet.h> @@ -904,3 +906,60 @@ static void chromeos_fixup_apl_pci_l1ss_ } DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x5ad6, chromeos_save_apl_pci_l1ss_capability); DECLARE_PCI_FIXUP_RESUME(PCI_VENDOR_ID_INTEL, 0x5ad6, chromeos_fixup_apl_pci_l1ss_capability); + +#ifdef CONFIG_SUSPEND +/* + * Root Ports on some AMD SoCs advertise PME_Support for D3hot and D3cold, but + * if the SoC is put into a hardware sleep state by the amd-pmc driver, the + * Root Ports don't generate wakeup interrupts for USB devices. + * + * When suspending, remove D3hot and D3cold from the PME_Support advertised + * by the Root Port so we don't use those states if we're expecting wakeup + * interrupts. Restore the advertised PME_Support when resuming. + */ +static void amd_rp_pme_suspend(struct pci_dev *dev) +{ + struct pci_dev *rp; + + /* + * PM_SUSPEND_ON means we're doing runtime suspend, which means + * amd-pmc will not be involved so PMEs during D3 work as advertised. + * + * The PMEs *do* work if amd-pmc doesn't put the SoC in the hardware + * sleep state, but we assume amd-pmc is always present. + */ + if (pm_suspend_target_state == PM_SUSPEND_ON) + return; + + rp = pcie_find_root_port(dev); + if (!rp->pm_cap) + return; + + rp->pme_support &= ~((PCI_PM_CAP_PME_D3hot|PCI_PM_CAP_PME_D3cold) >> + PCI_PM_CAP_PME_SHIFT); + dev_info_once(&rp->dev, "quirk: disabling D3cold for suspend\n"); +} + +static void amd_rp_pme_resume(struct pci_dev *dev) +{ + struct pci_dev *rp; + u16 pmc; + + rp = pcie_find_root_port(dev); + if (!rp->pm_cap) + return; + + pci_read_config_word(rp, rp->pm_cap + PCI_PM_PMC, &pmc); + rp->pme_support = FIELD_GET(PCI_PM_CAP_PME_MASK, pmc); +} +/* Rembrandt (yellow_carp) */ +DECLARE_PCI_FIXUP_SUSPEND(PCI_VENDOR_ID_AMD, 0x162e, amd_rp_pme_suspend); +DECLARE_PCI_FIXUP_RESUME(PCI_VENDOR_ID_AMD, 0x162e, amd_rp_pme_resume); +DECLARE_PCI_FIXUP_SUSPEND(PCI_VENDOR_ID_AMD, 0x162f, amd_rp_pme_suspend); +DECLARE_PCI_FIXUP_RESUME(PCI_VENDOR_ID_AMD, 0x162f, amd_rp_pme_resume); +/* Phoenix (pink_sardine) */ +DECLARE_PCI_FIXUP_SUSPEND(PCI_VENDOR_ID_AMD, 0x1668, amd_rp_pme_suspend); +DECLARE_PCI_FIXUP_RESUME(PCI_VENDOR_ID_AMD, 0x1668, amd_rp_pme_resume); +DECLARE_PCI_FIXUP_SUSPEND(PCI_VENDOR_ID_AMD, 0x1669, amd_rp_pme_suspend); +DECLARE_PCI_FIXUP_RESUME(PCI_VENDOR_ID_AMD, 0x1669, amd_rp_pme_resume); +#endif /* CONFIG_SUSPEND */
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Koichiro Den den@valinux.co.jp
commit b56ebe7c896dc78b5865ec2c4b1dae3c93537517 upstream.
commit ef8dd01538ea ("genirq/msi: Make interrupt allocation less convoluted"), reworked the code so that the x86 specific quirk for affinity setting of non-maskable PCI/MSI interrupts is not longer activated if necessary.
This could be solved by restoring the original logic in the core MSI code, but after a deeper analysis it turned out that the quirk flag is not required at all.
The quirk is only required when the PCI/MSI device cannot mask the MSI interrupts, which in turn also prevents reservation mode from being enabled for the affected interrupt.
This allows ot remove the NOMASK quirk bit completely as msi_set_affinity() can instead check whether reservation mode is enabled for the interrupt, which gives exactly the same answer.
Even in the momentary non-existing case that the reservation mode would be not set for a maskable MSI interrupt this would not cause any harm as it just would cause msi_set_affinity() to go needlessly through the functionaly equivalent slow path, which works perfectly fine with maskable interrupts as well.
Rework msi_set_affinity() to query the reservation mode and remove all NOMASK quirk logic from the core code.
[ tglx: Massaged changelog ]
Fixes: ef8dd01538ea ("genirq/msi: Make interrupt allocation less convoluted") Suggested-by: Thomas Gleixner tglx@linutronix.de Signed-off-by: Koichiro Den den@valinux.co.jp Signed-off-by: Thomas Gleixner tglx@linutronix.de Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20231026032036.2462428-1-den@valinux.co.jp Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kernel/apic/msi.c | 8 +++----- include/linux/irq.h | 26 ++++---------------------- include/linux/msi.h | 6 ------ kernel/irq/debugfs.c | 1 - kernel/irq/msi.c | 12 +----------- 5 files changed, 8 insertions(+), 45 deletions(-)
--- a/arch/x86/kernel/apic/msi.c +++ b/arch/x86/kernel/apic/msi.c @@ -55,14 +55,14 @@ msi_set_affinity(struct irq_data *irqd, * caused by the non-atomic update of the address/data pair. * * Direct update is possible when: - * - The MSI is maskable (remapped MSI does not use this code path)). - * The quirk bit is not set in this case. + * - The MSI is maskable (remapped MSI does not use this code path). + * The reservation mode bit is set in this case. * - The new vector is the same as the old vector * - The old vector is MANAGED_IRQ_SHUTDOWN_VECTOR (interrupt starts up) * - The interrupt is not yet started up * - The new destination CPU is the same as the old destination CPU */ - if (!irqd_msi_nomask_quirk(irqd) || + if (!irqd_can_reserve(irqd) || cfg->vector == old_cfg.vector || old_cfg.vector == MANAGED_IRQ_SHUTDOWN_VECTOR || !irqd_is_started(irqd) || @@ -215,8 +215,6 @@ static bool x86_init_dev_msi_info(struct if (WARN_ON_ONCE(domain != real_parent)) return false; info->chip->irq_set_affinity = msi_set_affinity; - /* See msi_set_affinity() for the gory details */ - info->flags |= MSI_FLAG_NOMASK_QUIRK; break; case DOMAIN_BUS_DMAR: case DOMAIN_BUS_AMDVI: --- a/include/linux/irq.h +++ b/include/linux/irq.h @@ -215,8 +215,6 @@ struct irq_data { * IRQD_SINGLE_TARGET - IRQ allows only a single affinity target * IRQD_DEFAULT_TRIGGER_SET - Expected trigger already been set * IRQD_CAN_RESERVE - Can use reservation mode - * IRQD_MSI_NOMASK_QUIRK - Non-maskable MSI quirk for affinity change - * required * IRQD_HANDLE_ENFORCE_IRQCTX - Enforce that handle_irq_*() is only invoked * from actual interrupt context. * IRQD_AFFINITY_ON_ACTIVATE - Affinity is set on activation. Don't call @@ -247,11 +245,10 @@ enum { IRQD_SINGLE_TARGET = BIT(24), IRQD_DEFAULT_TRIGGER_SET = BIT(25), IRQD_CAN_RESERVE = BIT(26), - IRQD_MSI_NOMASK_QUIRK = BIT(27), - IRQD_HANDLE_ENFORCE_IRQCTX = BIT(28), - IRQD_AFFINITY_ON_ACTIVATE = BIT(29), - IRQD_IRQ_ENABLED_ON_SUSPEND = BIT(30), - IRQD_RESEND_WHEN_IN_PROGRESS = BIT(31), + IRQD_HANDLE_ENFORCE_IRQCTX = BIT(27), + IRQD_AFFINITY_ON_ACTIVATE = BIT(28), + IRQD_IRQ_ENABLED_ON_SUSPEND = BIT(29), + IRQD_RESEND_WHEN_IN_PROGRESS = BIT(30), };
#define __irqd_to_state(d) ACCESS_PRIVATE((d)->common, state_use_accessors) @@ -426,21 +423,6 @@ static inline bool irqd_can_reserve(stru return __irqd_to_state(d) & IRQD_CAN_RESERVE; }
-static inline void irqd_set_msi_nomask_quirk(struct irq_data *d) -{ - __irqd_to_state(d) |= IRQD_MSI_NOMASK_QUIRK; -} - -static inline void irqd_clr_msi_nomask_quirk(struct irq_data *d) -{ - __irqd_to_state(d) &= ~IRQD_MSI_NOMASK_QUIRK; -} - -static inline bool irqd_msi_nomask_quirk(struct irq_data *d) -{ - return __irqd_to_state(d) & IRQD_MSI_NOMASK_QUIRK; -} - static inline void irqd_set_affinity_on_activate(struct irq_data *d) { __irqd_to_state(d) |= IRQD_AFFINITY_ON_ACTIVATE; --- a/include/linux/msi.h +++ b/include/linux/msi.h @@ -547,12 +547,6 @@ enum { MSI_FLAG_ALLOC_SIMPLE_MSI_DESCS = (1 << 5), /* Free MSI descriptors */ MSI_FLAG_FREE_MSI_DESCS = (1 << 6), - /* - * Quirk to handle MSI implementations which do not provide - * masking. Currently known to affect x86, but has to be partially - * handled in the core MSI code. - */ - MSI_FLAG_NOMASK_QUIRK = (1 << 7),
/* Mask for the generic functionality */ MSI_GENERIC_FLAGS_MASK = GENMASK(15, 0), --- a/kernel/irq/debugfs.c +++ b/kernel/irq/debugfs.c @@ -121,7 +121,6 @@ static const struct irq_bit_descr irqdat BIT_MASK_DESCR(IRQD_AFFINITY_ON_ACTIVATE), BIT_MASK_DESCR(IRQD_MANAGED_SHUTDOWN), BIT_MASK_DESCR(IRQD_CAN_RESERVE), - BIT_MASK_DESCR(IRQD_MSI_NOMASK_QUIRK),
BIT_MASK_DESCR(IRQD_FORWARDED_TO_VCPU),
--- a/kernel/irq/msi.c +++ b/kernel/irq/msi.c @@ -1204,7 +1204,6 @@ static int msi_handle_pci_fail(struct ir
#define VIRQ_CAN_RESERVE 0x01 #define VIRQ_ACTIVATE 0x02 -#define VIRQ_NOMASK_QUIRK 0x04
static int msi_init_virq(struct irq_domain *domain, int virq, unsigned int vflags) { @@ -1213,8 +1212,6 @@ static int msi_init_virq(struct irq_doma
if (!(vflags & VIRQ_CAN_RESERVE)) { irqd_clr_can_reserve(irqd); - if (vflags & VIRQ_NOMASK_QUIRK) - irqd_set_msi_nomask_quirk(irqd);
/* * If the interrupt is managed but no CPU is available to @@ -1275,15 +1272,8 @@ static int __msi_domain_alloc_irqs(struc * Interrupt can use a reserved vector and will not occupy * a real device vector until the interrupt is requested. */ - if (msi_check_reservation_mode(domain, info, dev)) { + if (msi_check_reservation_mode(domain, info, dev)) vflags |= VIRQ_CAN_RESERVE; - /* - * MSI affinity setting requires a special quirk (X86) when - * reservation mode is active. - */ - if (info->flags & MSI_FLAG_NOMASK_QUIRK) - vflags |= VIRQ_NOMASK_QUIRK; - }
xa_for_each_range(xa, idx, desc, ctrl->first, ctrl->last) { if (!msi_desc_match(desc, MSI_DESC_NOTASSOCIATED))
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pu Wen puwen@hygon.cn
commit ee545b94d39a00c93dc98b1dbcbcf731d2eadeb4 upstream.
Hygon processors with a model ID > 3 have CPUID leaf 0xB correctly populated and don't need the fixed package ID shift workaround. The fixup is also incorrect when running in a guest.
Fixes: e0ceeae708ce ("x86/CPU/hygon: Fix phys_proc_id calculation logic for multi-die processors") Signed-off-by: Pu Wen puwen@hygon.cn Signed-off-by: Thomas Gleixner tglx@linutronix.de Acked-by: Peter Zijlstra (Intel) peterz@infradead.org Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/tencent_594804A808BD93A4EBF50A994F228E3A7F07@qq.co... Link: https://lore.kernel.org/r/20230814085112.089607918@linutronix.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kernel/cpu/hygon.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
--- a/arch/x86/kernel/cpu/hygon.c +++ b/arch/x86/kernel/cpu/hygon.c @@ -86,8 +86,12 @@ static void hygon_get_topology(struct cp if (!err) c->x86_coreid_bits = get_count_order(c->x86_max_cores);
- /* Socket ID is ApicId[6] for these processors. */ - c->phys_proc_id = c->apicid >> APICID_SOCKET_ID_BIT; + /* + * Socket ID is ApicId[6] for the processors with model <= 0x3 + * when running on host. + */ + if (!boot_cpu_has(X86_FEATURE_HYPERVISOR) && c->x86_model <= 0x3) + c->phys_proc_id = c->apicid >> APICID_SOCKET_ID_BIT;
cacheinfo_hygon_init_llc_id(c, cpu); } else if (cpu_has(c, X86_FEATURE_NODEID_MSR)) {
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nicolas Saenz Julienne nsaenz@amazon.com
commit d6800af51c76b6dae20e6023bbdc9b3da3ab5121 upstream.
Don't apply the stimer's counter side effects when modifying its value from user-space, as this may trigger spurious interrupts.
For example: - The stimer is configured in auto-enable mode. - The stimer's count is set and the timer enabled. - The stimer expires, an interrupt is injected. - The VM is live migrated. - The stimer config and count are deserialized, auto-enable is ON, the stimer is re-enabled. - The stimer expires right away, and injects an unwarranted interrupt.
Cc: stable@vger.kernel.org Fixes: 1f4b34f825e8 ("kvm/x86: Hyper-V SynIC timers") Signed-off-by: Nicolas Saenz Julienne nsaenz@amazon.com Reviewed-by: Vitaly Kuznetsov vkuznets@redhat.com Link: https://lore.kernel.org/r/20231017155101.40677-1-nsaenz@amazon.com Signed-off-by: Sean Christopherson seanjc@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kvm/hyperv.c | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-)
--- a/arch/x86/kvm/hyperv.c +++ b/arch/x86/kvm/hyperv.c @@ -727,10 +727,12 @@ static int stimer_set_count(struct kvm_v
stimer_cleanup(stimer); stimer->count = count; - if (stimer->count == 0) - stimer->config.enable = 0; - else if (stimer->config.auto_enable) - stimer->config.enable = 1; + if (!host) { + if (stimer->count == 0) + stimer->config.enable = 0; + else if (stimer->config.auto_enable) + stimer->config.enable = 1; + }
if (stimer->config.enable) stimer_mark_pending(stimer, false);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Maciej S. Szmigiero maciej.szmigiero@oracle.com
commit 2770d4722036d6bd24bcb78e9cd7f6e572077d03 upstream.
Hyper-V enabled Windows Server 2022 KVM VM cannot be started on Zen1 Ryzen since it crashes at boot with SYSTEM_THREAD_EXCEPTION_NOT_HANDLED + STATUS_PRIVILEGED_INSTRUCTION (in other words, because of an unexpected #GP in the guest kernel).
This is because Windows tries to set bit 8 in MSR_AMD64_TW_CFG and can't handle receiving a #GP when doing so.
Give this MSR the same treatment that commit 2e32b7190641 ("x86, kvm: Add MSR_AMD64_BU_CFG2 to the list of ignored MSRs") gave MSR_AMD64_BU_CFG2 under justification that this MSR is baremetal-relevant only. Although apparently it was then needed for Linux guests, not Windows as in this case.
With this change, the aforementioned guest setup is able to finish booting successfully.
This issue can be reproduced either on a Summit Ridge Ryzen (with just "-cpu host") or on a Naples EPYC (with "-cpu host,stepping=1" since EPYC is ordinarily stepping 2).
Alternatively, userspace could solve the problem by using MSR filters, but forcing every userspace to define a filter isn't very friendly and doesn't add much, if any, value. The only potential hiccup is if one of these "baremetal-only" MSRs ever requires actual emulation and/or has F/M/S specific behavior. But if that happens, then KVM can still punt *that* handling to userspace since userspace MSR filters "win" over KVM's default handling.
Signed-off-by: Maciej S. Szmigiero maciej.szmigiero@oracle.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/1ce85d9c7c9e9632393816cf19c902e0a3f411f1.169773140... [sean: call out MSR filtering alternative] Signed-off-by: Sean Christopherson seanjc@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/include/asm/msr-index.h | 1 + arch/x86/kvm/x86.c | 2 ++ 2 files changed, 3 insertions(+)
--- a/arch/x86/include/asm/msr-index.h +++ b/arch/x86/include/asm/msr-index.h @@ -553,6 +553,7 @@ #define MSR_AMD64_CPUID_FN_1 0xc0011004 #define MSR_AMD64_LS_CFG 0xc0011020 #define MSR_AMD64_DC_CFG 0xc0011022 +#define MSR_AMD64_TW_CFG 0xc0011023
#define MSR_AMD64_DE_CFG 0xc0011029 #define MSR_AMD64_DE_CFG_LFENCE_SERIALIZE_BIT 1 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -3643,6 +3643,7 @@ int kvm_set_msr_common(struct kvm_vcpu * case MSR_AMD64_PATCH_LOADER: case MSR_AMD64_BU_CFG2: case MSR_AMD64_DC_CFG: + case MSR_AMD64_TW_CFG: case MSR_F15H_EX_CFG: break;
@@ -4067,6 +4068,7 @@ int kvm_get_msr_common(struct kvm_vcpu * case MSR_AMD64_BU_CFG2: case MSR_IA32_PERF_CTL: case MSR_AMD64_DC_CFG: + case MSR_AMD64_TW_CFG: case MSR_F15H_EX_CFG: /* * Intel Sandy Bridge CPUs must support the RAPL (running average power
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tao Su tao1.su@linux.intel.com
commit 629d3698f6958ee6f8131ea324af794f973b12ac upstream.
When IPI virtualization is enabled, a WARN is triggered if bit12 of ICR MSR is set after APIC-write VM-exit. The reason is kvm_apic_send_ipi() thinks the APIC_ICR_BUSY bit should be cleared because KVM has no delay, but kvm_apic_write_nodecode() doesn't clear the APIC_ICR_BUSY bit.
Under the x2APIC section, regarding ICR, the SDM says:
It remains readable only to aid in debugging; however, software should not assume the value returned by reading the ICR is the last written value.
I.e. the guest is allowed to set bit 12. However, the SDM also gives KVM free reign to do whatever it wants with the bit, so long as KVM's behavior doesn't confuse userspace or break KVM's ABI.
Clear bit 12 so that it reads back as '0'. This approach is safer than "do nothing" and is consistent with the case where IPI virtualization is disabled or not supported, i.e.,
handle_fastpath_set_x2apic_icr_irqoff() -> kvm_x2apic_icr_write()
Opportunistically replace the TODO with a comment calling out that eating the write is likely faster than a conditional branch around the busy bit.
Link: https://lore.kernel.org/all/ZPj6iF0Q7iynn62p@google.com/ Fixes: 5413bcba7ed5 ("KVM: x86: Add support for vICR APIC-write VM-Exits in x2APIC mode") Cc: stable@vger.kernel.org Signed-off-by: Tao Su tao1.su@linux.intel.com Tested-by: Yi Lai yi1.lai@intel.com Reviewed-by: Chao Gao chao.gao@intel.com Link: https://lore.kernel.org/r/20230914055504.151365-1-tao1.su@linux.intel.com [sean: tweak changelog, replace TODO with comment, drop local "val"] Signed-off-by: Sean Christopherson seanjc@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kvm/lapic.c | 26 +++++++++++++------------- 1 file changed, 13 insertions(+), 13 deletions(-)
--- a/arch/x86/kvm/lapic.c +++ b/arch/x86/kvm/lapic.c @@ -2423,22 +2423,22 @@ EXPORT_SYMBOL_GPL(kvm_lapic_set_eoi); void kvm_apic_write_nodecode(struct kvm_vcpu *vcpu, u32 offset) { struct kvm_lapic *apic = vcpu->arch.apic; - u64 val;
/* - * ICR is a single 64-bit register when x2APIC is enabled. For legacy - * xAPIC, ICR writes need to go down the common (slightly slower) path - * to get the upper half from ICR2. + * ICR is a single 64-bit register when x2APIC is enabled, all others + * registers hold 32-bit values. For legacy xAPIC, ICR writes need to + * go down the common path to get the upper half from ICR2. + * + * Note, using the write helpers may incur an unnecessary write to the + * virtual APIC state, but KVM needs to conditionally modify the value + * in certain cases, e.g. to clear the ICR busy bit. The cost of extra + * conditional branches is likely a wash relative to the cost of the + * maybe-unecessary write, and both are in the noise anyways. */ - if (apic_x2apic_mode(apic) && offset == APIC_ICR) { - val = kvm_lapic_get_reg64(apic, APIC_ICR); - kvm_apic_send_ipi(apic, (u32)val, (u32)(val >> 32)); - trace_kvm_apic_write(APIC_ICR, val); - } else { - /* TODO: optimize to just emulate side effect w/o one more write */ - val = kvm_lapic_get_reg(apic, offset); - kvm_lapic_reg_write(apic, offset, (u32)val); - } + if (apic_x2apic_mode(apic) && offset == APIC_ICR) + kvm_x2apic_icr_write(apic, kvm_lapic_get_reg64(apic, APIC_ICR)); + else + kvm_lapic_reg_write(apic, offset, kvm_lapic_get_reg(apic, offset)); } EXPORT_SYMBOL_GPL(kvm_apic_write_nodecode);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Haitao Shan hshan@google.com
commit 9cfec6d097c607e36199cf0cfbb8cf5acbd8e9b2 upstream.
When running android emulator (which is based on QEMU 2.12) on certain Intel hosts with kernel version 6.3-rc1 or above, guest will freeze after loading a snapshot. This is almost 100% reproducible. By default, the android emulator will use snapshot to speed up the next launching of the same android guest. So this breaks the android emulator badly.
I tested QEMU 8.0.4 from Debian 12 with an Ubuntu 22.04 guest by running command "loadvm" after "savevm". The same issue is observed. At the same time, none of our AMD platforms is impacted. More experiments show that loading the KVM module with "enable_apicv=false" can workaround it.
The issue started to show up after commit 8e6ed96cdd50 ("KVM: x86: fire timer when it is migrated and expired, and in oneshot mode"). However, as is pointed out by Sean Christopherson, it is introduced by commit 967235d32032 ("KVM: vmx: clear pending interrupts on KVM_SET_LAPIC"). commit 8e6ed96cdd50 ("KVM: x86: fire timer when it is migrated and expired, and in oneshot mode") just makes it easier to hit the issue.
Having both commits, the oneshot lapic timer gets fired immediately inside the KVM_SET_LAPIC call when loading the snapshot. On Intel platforms with APIC virtualization and posted interrupt processing, this eventually leads to setting the corresponding PIR bit. However, the whole PIR bits get cleared later in the same KVM_SET_LAPIC call by apicv_post_state_restore. This leads to timer interrupt lost.
The fix is to move vmx_apicv_post_state_restore to the beginning of the KVM_SET_LAPIC call and rename to vmx_apicv_pre_state_restore. What vmx_apicv_post_state_restore does is actually clearing any former apicv state and this behavior is more suitable to carry out in the beginning.
Fixes: 967235d32032 ("KVM: vmx: clear pending interrupts on KVM_SET_LAPIC") Cc: stable@vger.kernel.org Suggested-by: Sean Christopherson seanjc@google.com Signed-off-by: Haitao Shan hshan@google.com Link: https://lore.kernel.org/r/20230913000215.478387-1-hshan@google.com Signed-off-by: Sean Christopherson seanjc@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/include/asm/kvm-x86-ops.h | 1 + arch/x86/include/asm/kvm_host.h | 1 + arch/x86/kvm/lapic.c | 4 ++++ arch/x86/kvm/vmx/vmx.c | 4 ++-- 4 files changed, 8 insertions(+), 2 deletions(-)
--- a/arch/x86/include/asm/kvm-x86-ops.h +++ b/arch/x86/include/asm/kvm-x86-ops.h @@ -108,6 +108,7 @@ KVM_X86_OP_OPTIONAL(vcpu_blocking) KVM_X86_OP_OPTIONAL(vcpu_unblocking) KVM_X86_OP_OPTIONAL(pi_update_irte) KVM_X86_OP_OPTIONAL(pi_start_assignment) +KVM_X86_OP_OPTIONAL(apicv_pre_state_restore) KVM_X86_OP_OPTIONAL(apicv_post_state_restore) KVM_X86_OP_OPTIONAL_RET0(dy_apicv_has_pending_interrupt) KVM_X86_OP_OPTIONAL(set_hv_timer) --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -1690,6 +1690,7 @@ struct kvm_x86_ops { int (*pi_update_irte)(struct kvm *kvm, unsigned int host_irq, uint32_t guest_irq, bool set); void (*pi_start_assignment)(struct kvm *kvm); + void (*apicv_pre_state_restore)(struct kvm_vcpu *vcpu); void (*apicv_post_state_restore)(struct kvm_vcpu *vcpu); bool (*dy_apicv_has_pending_interrupt)(struct kvm_vcpu *vcpu);
--- a/arch/x86/kvm/lapic.c +++ b/arch/x86/kvm/lapic.c @@ -2649,6 +2649,8 @@ void kvm_lapic_reset(struct kvm_vcpu *vc u64 msr_val; int i;
+ static_call_cond(kvm_x86_apicv_pre_state_restore)(vcpu); + if (!init_event) { msr_val = APIC_DEFAULT_PHYS_BASE | MSR_IA32_APICBASE_ENABLE; if (kvm_vcpu_is_reset_bsp(vcpu)) @@ -2960,6 +2962,8 @@ int kvm_apic_set_state(struct kvm_vcpu * struct kvm_lapic *apic = vcpu->arch.apic; int r;
+ static_call_cond(kvm_x86_apicv_pre_state_restore)(vcpu); + kvm_lapic_set_base(vcpu, vcpu->arch.apic_base); /* set SPIV separately to get count of SW disabled APICs right */ apic_set_spiv(apic, *((u32 *)(s->regs + APIC_SPIV))); --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -6909,7 +6909,7 @@ static void vmx_load_eoi_exitmap(struct vmcs_write64(EOI_EXIT_BITMAP3, eoi_exit_bitmap[3]); }
-static void vmx_apicv_post_state_restore(struct kvm_vcpu *vcpu) +static void vmx_apicv_pre_state_restore(struct kvm_vcpu *vcpu) { struct vcpu_vmx *vmx = to_vmx(vcpu);
@@ -8275,7 +8275,7 @@ static struct kvm_x86_ops vmx_x86_ops __ .set_apic_access_page_addr = vmx_set_apic_access_page_addr, .refresh_apicv_exec_ctrl = vmx_refresh_apicv_exec_ctrl, .load_eoi_exitmap = vmx_load_eoi_exitmap, - .apicv_post_state_restore = vmx_apicv_post_state_restore, + .apicv_pre_state_restore = vmx_apicv_pre_state_restore, .required_apicv_inhibits = VMX_REQUIRED_APICV_INHIBITS, .hwapic_irr_update = vmx_hwapic_irr_update, .hwapic_isr_update = vmx_hwapic_isr_update,
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Victor Shih victor.shih@genesyslogic.com.tw
commit 85dd3af64965c1c0eb7373b340a1b1f7773586b0 upstream.
Due to a flaw in the hardware design, the GL9755 replay timer frequently times out when ASPM is enabled. As a result, the warning messages will often appear in the system log when the system accesses the GL9755 PCI config. Therefore, the replay timer timeout must be masked.
Fixes: 36ed2fd32b2c ("mmc: sdhci-pci-gli: A workaround to allow GL9755 to enter ASPM L1.2") Signed-off-by: Victor Shih victor.shih@genesyslogic.com.tw Acked-by: Adrian Hunter adrian.hunter@intel.com Acked-by: Kai-Heng Feng kai.heng.geng@canonical.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20231107095741.8832-3-victorshihgli@gmail.com Signed-off-by: Ulf Hansson ulf.hansson@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/mmc/host/sdhci-pci-gli.c | 8 ++++++++ 1 file changed, 8 insertions(+)
--- a/drivers/mmc/host/sdhci-pci-gli.c +++ b/drivers/mmc/host/sdhci-pci-gli.c @@ -149,6 +149,9 @@ #define PCI_GLI_9755_PM_CTRL 0xFC #define PCI_GLI_9755_PM_STATE GENMASK(1, 0)
+#define PCI_GLI_9755_CORRERR_MASK 0x214 +#define PCI_GLI_9755_CORRERR_MASK_REPLAY_TIMER_TIMEOUT BIT(12) + #define SDHCI_GLI_9767_GM_BURST_SIZE 0x510 #define SDHCI_GLI_9767_GM_BURST_SIZE_AXI_ALWAYS_SET BIT(8)
@@ -756,6 +759,11 @@ static void gl9755_hw_setting(struct sdh value &= ~PCI_GLI_9755_PM_STATE; pci_write_config_dword(pdev, PCI_GLI_9755_PM_CTRL, value);
+ /* mask the replay timer timeout of AER */ + pci_read_config_dword(pdev, PCI_GLI_9755_CORRERR_MASK, &value); + value |= PCI_GLI_9755_CORRERR_MASK_REPLAY_TIMER_TIMEOUT; + pci_write_config_dword(pdev, PCI_GLI_9755_CORRERR_MASK, value); + gl9755_wt_off(pdev); }
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Johannes Weiner hannes@cmpxchg.org
commit 8b39d20eceeda6c4eb23df1497f9ed2fffdc8f69 upstream.
519fabc7aaba ("psi: remove 500ms min window size limitation for triggers") breaks unprivileged psi polling on cgroups.
Historically, we had a privilege check for polling in the open() of a pressure file in /proc, but were erroneously missing it for the open() of cgroup pressure files.
When unprivileged polling was introduced in d82caa273565 ("sched/psi: Allow unprivileged polling of N*2s period"), it needed to filter privileges depending on the exact polling parameters, and as such moved the CAP_SYS_RESOURCE check from the proc open() callback to psi_trigger_create(). Both the proc files as well as cgroup files go through this during write(). This implicitly added the missing check for privileges required for HT polling for cgroups.
When 519fabc7aaba ("psi: remove 500ms min window size limitation for triggers") followed right after to remove further restrictions on the RT polling window, it incorrectly assumed the cgroup privilege check was still missing and added it to the cgroup open(), mirroring what we used to do for proc files in the past.
As a result, unprivileged poll requests that would be supported now get rejected when opening the cgroup pressure file for writing.
Remove the cgroup open() check. psi_trigger_create() handles it.
Fixes: 519fabc7aaba ("psi: remove 500ms min window size limitation for triggers") Reported-by: Luca Boccassi bluca@debian.org Signed-off-by: Johannes Weiner hannes@cmpxchg.org Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Acked-by: Luca Boccassi bluca@debian.org Acked-by: Suren Baghdasaryan surenb@google.com Cc: stable@vger.kernel.org # 6.5+ Link: https://lore.kernel.org/r/20231026164114.2488682-1-hannes@cmpxchg.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/cgroup/cgroup.c | 12 ------------ 1 file changed, 12 deletions(-)
--- a/kernel/cgroup/cgroup.c +++ b/kernel/cgroup/cgroup.c @@ -3836,14 +3836,6 @@ static __poll_t cgroup_pressure_poll(str return psi_trigger_poll(&ctx->psi.trigger, of->file, pt); }
-static int cgroup_pressure_open(struct kernfs_open_file *of) -{ - if (of->file->f_mode & FMODE_WRITE && !capable(CAP_SYS_RESOURCE)) - return -EPERM; - - return 0; -} - static void cgroup_pressure_release(struct kernfs_open_file *of) { struct cgroup_file_ctx *ctx = of->priv; @@ -5243,7 +5235,6 @@ static struct cftype cgroup_psi_files[] { .name = "io.pressure", .file_offset = offsetof(struct cgroup, psi_files[PSI_IO]), - .open = cgroup_pressure_open, .seq_show = cgroup_io_pressure_show, .write = cgroup_io_pressure_write, .poll = cgroup_pressure_poll, @@ -5252,7 +5243,6 @@ static struct cftype cgroup_psi_files[] { .name = "memory.pressure", .file_offset = offsetof(struct cgroup, psi_files[PSI_MEM]), - .open = cgroup_pressure_open, .seq_show = cgroup_memory_pressure_show, .write = cgroup_memory_pressure_write, .poll = cgroup_pressure_poll, @@ -5261,7 +5251,6 @@ static struct cftype cgroup_psi_files[] { .name = "cpu.pressure", .file_offset = offsetof(struct cgroup, psi_files[PSI_CPU]), - .open = cgroup_pressure_open, .seq_show = cgroup_cpu_pressure_show, .write = cgroup_cpu_pressure_write, .poll = cgroup_pressure_poll, @@ -5271,7 +5260,6 @@ static struct cftype cgroup_psi_files[] { .name = "irq.pressure", .file_offset = offsetof(struct cgroup, psi_files[PSI_IRQ]), - .open = cgroup_pressure_open, .seq_show = cgroup_irq_pressure_show, .write = cgroup_irq_pressure_write, .poll = cgroup_pressure_poll,
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Paul Moore paul@paul-moore.com
commit 47846d51348dd62e5231a83be040981b17c955fa upstream.
The get_task_exe_file() function locks the given task with task_lock() which when used inside audit_exe_compare() can cause deadlocks on systems that generate audit records when the task_lock() is held. We resolve this problem with two changes: ignoring those cases where the task being audited is not the current task, and changing our approach to obtaining the executable file struct to not require task_lock().
With the intent of the audit exe filter being to filter on audit events generated by processes started by the specified executable, it makes sense that we would only want to use the exe filter on audit records associated with the currently executing process, e.g. @current. If we are asked to filter records using a non-@current task_struct we can safely ignore the exe filter without negatively impacting the admin's expectations for the exe filter.
Knowing that we only have to worry about filtering the currently executing task in audit_exe_compare() we can do away with the task_lock() and call get_mm_exe_file() with @current->mm directly.
Cc: stable@vger.kernel.org Fixes: 5efc244346f9 ("audit: fix exe_file access in audit_exe_compare") Reported-by: Andreas Steinmetz anstein99@googlemail.com Reviewed-by: John Johansen john.johanse@canonical.com Reviewed-by: Mateusz Guzik mjguzik@gmail.com Signed-off-by: Paul Moore paul@paul-moore.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/audit_watch.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-)
--- a/kernel/audit_watch.c +++ b/kernel/audit_watch.c @@ -527,11 +527,18 @@ int audit_exe_compare(struct task_struct unsigned long ino; dev_t dev;
- exe_file = get_task_exe_file(tsk); + /* only do exe filtering if we are recording @current events/records */ + if (tsk != current) + return 0; + + if (WARN_ON_ONCE(!current->mm)) + return 0; + exe_file = get_mm_exe_file(current->mm); if (!exe_file) return 0; ino = file_inode(exe_file)->i_ino; dev = file_inode(exe_file)->i_sb->s_dev; fput(exe_file); + return audit_mark_compare(mark, ino, dev); }
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Paul Moore paul@paul-moore.com
commit 969d90ec212bae4b45bf9d21d7daa30aa6cf055e upstream.
eBPF can end up calling into the audit code from some odd places, and some of these places don't have @current set properly so we end up tripping the `WARN_ON_ONCE(!current->mm)` near the top of `audit_exe_compare()`. While the basic `!current->mm` check is good, the `WARN_ON_ONCE()` results in some scary console messages so let's drop that and just do the regular `!current->mm` check to avoid problems.
Cc: stable@vger.kernel.org Fixes: 47846d51348d ("audit: don't take task_lock() in audit_exe_compare() code path") Reported-by: Artem Savkov asavkov@redhat.com Signed-off-by: Paul Moore paul@paul-moore.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/audit_watch.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/kernel/audit_watch.c +++ b/kernel/audit_watch.c @@ -531,7 +531,7 @@ int audit_exe_compare(struct task_struct if (tsk != current) return 0;
- if (WARN_ON_ONCE(!current->mm)) + if (!current->mm) return 0; exe_file = get_mm_exe_file(current->mm); if (!exe_file)
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Krister Johansen kjlx@templeofstupid.com
commit 8001f49394e353f035306a45bcf504f06fca6355 upstream.
The code that checks for unknown boot options is unaware of the sysctl alias facility, which maps bootparams to sysctl values. If a user sets an old value that has a valid alias, a message about an invalid parameter will be printed during boot, and the parameter will get passed to init. Fix by checking for the existence of aliased parameters in the unknown boot parameter code. If an alias exists, don't return an error or pass the value to init.
Signed-off-by: Krister Johansen kjlx@templeofstupid.com Cc: stable@vger.kernel.org Fixes: 0a477e1ae21b ("kernel/sysctl: support handling command line aliases") Signed-off-by: Luis Chamberlain mcgrof@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/proc/proc_sysctl.c | 7 +++++++ include/linux/sysctl.h | 6 ++++++ init/main.c | 4 ++++ 3 files changed, 17 insertions(+)
--- a/fs/proc/proc_sysctl.c +++ b/fs/proc/proc_sysctl.c @@ -1590,6 +1590,13 @@ static const char *sysctl_find_alias(cha return NULL; }
+bool sysctl_is_alias(char *param) +{ + const char *alias = sysctl_find_alias(param); + + return alias != NULL; +} + /* Set sysctl value passed on kernel command line. */ static int process_sysctl_arg(char *param, char *val, const char *unused, void *arg) --- a/include/linux/sysctl.h +++ b/include/linux/sysctl.h @@ -227,6 +227,7 @@ extern void __register_sysctl_init(const extern struct ctl_table_header *register_sysctl_mount_point(const char *path);
void do_sysctl_args(void); +bool sysctl_is_alias(char *param); int do_proc_douintvec(struct ctl_table *table, int write, void *buffer, size_t *lenp, loff_t *ppos, int (*conv)(unsigned long *lvalp, @@ -270,6 +271,11 @@ static inline void setup_sysctl_set(stru static inline void do_sysctl_args(void) { } + +static inline bool sysctl_is_alias(char *param) +{ + return false; +} #endif /* CONFIG_SYSCTL */
int sysctl_max_threads(struct ctl_table *table, int write, void *buffer, --- a/init/main.c +++ b/init/main.c @@ -530,6 +530,10 @@ static int __init unknown_bootoption(cha { size_t len = strlen(param);
+ /* Handle params aliased to sysctls */ + if (sysctl_is_alias(param)) + return 0; + repair_env_string(param, val);
/* Handle obsolete-style parameters */
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Muhammad Usama Anjum usama.anjum@collabora.com
commit dd976a97d15b47656991e185a94ef42a0fa5cfd4 upstream.
The smp_processor_id() shouldn't be called from preemptible code. Instead use get_cpu() and put_cpu() which disables preemption in addition to getting the processor id. Enable preemption back after calling schedule_work() to make sure that the work gets scheduled on all cores other than the current core. We want to avoid a scenario where current core's stack trace is printed multiple times and one core's stack trace isn't printed because of scheduling of current task.
This fixes the following bug:
[ 119.143590] sysrq: Show backtrace of all active CPUs [ 119.143902] BUG: using smp_processor_id() in preemptible [00000000] code: bash/873 [ 119.144586] caller is debug_smp_processor_id+0x20/0x30 [ 119.144827] CPU: 6 PID: 873 Comm: bash Not tainted 5.10.124-dirty #3 [ 119.144861] Hardware name: QEMU QEMU Virtual Machine, BIOS 2023.05-1 07/22/2023 [ 119.145053] Call trace: [ 119.145093] dump_backtrace+0x0/0x1a0 [ 119.145122] show_stack+0x18/0x70 [ 119.145141] dump_stack+0xc4/0x11c [ 119.145159] check_preemption_disabled+0x100/0x110 [ 119.145175] debug_smp_processor_id+0x20/0x30 [ 119.145195] sysrq_handle_showallcpus+0x20/0xc0 [ 119.145211] __handle_sysrq+0x8c/0x1a0 [ 119.145227] write_sysrq_trigger+0x94/0x12c [ 119.145247] proc_reg_write+0xa8/0xe4 [ 119.145266] vfs_write+0xec/0x280 [ 119.145282] ksys_write+0x6c/0x100 [ 119.145298] __arm64_sys_write+0x20/0x30 [ 119.145315] el0_svc_common.constprop.0+0x78/0x1e4 [ 119.145332] do_el0_svc+0x24/0x8c [ 119.145348] el0_svc+0x10/0x20 [ 119.145364] el0_sync_handler+0x134/0x140 [ 119.145381] el0_sync+0x180/0x1c0
Cc: jirislaby@kernel.org Cc: stable@vger.kernel.org Fixes: 47cab6a722d4 ("debug lockups: Improve lockup detection, fix generic arch fallback") Signed-off-by: Muhammad Usama Anjum usama.anjum@collabora.com Link: https://lore.kernel.org/r/20231009162021.3607632-1-usama.anjum@collabora.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/tty/sysrq.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/drivers/tty/sysrq.c +++ b/drivers/tty/sysrq.c @@ -263,13 +263,14 @@ static void sysrq_handle_showallcpus(int if (in_hardirq()) regs = get_irq_regs();
- pr_info("CPU%d:\n", smp_processor_id()); + pr_info("CPU%d:\n", get_cpu()); if (regs) show_regs(regs); else show_stack(NULL, NULL, KERN_INFO);
schedule_work(&sysrq_showallcpus); + put_cpu(); } }
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pavel Krasavin pkrasavin@imaqliq.com
commit 2a1d728f20edeee7f26dc307ed9df4e0d23947ab upstream.
There might be hard lockup if we set crtscts mode on port without RTS/CTS configured:
# stty -F /dev/ttyAML6 crtscts; echo 1 > /dev/ttyAML6; echo 2 > /dev/ttyAML6 [ 95.890386] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks: [ 95.890857] rcu: 3-...0: (201 ticks this GP) idle=e33c/1/0x4000000000000000 softirq=5844/5846 fqs=4984 [ 95.900212] rcu: (detected by 2, t=21016 jiffies, g=7753, q=296 ncpus=4) [ 95.906972] Task dump for CPU 3: [ 95.910178] task:bash state:R running task stack:0 pid:205 ppid:1 flags:0x00000202 [ 95.920059] Call trace: [ 95.922485] __switch_to+0xe4/0x168 [ 95.925951] 0xffffff8003477508 [ 95.974379] watchdog: Watchdog detected hard LOCKUP on cpu 3 [ 95.974424] Modules linked in: 88x2cs(O) rtc_meson_vrtc
Possible solution would be to not allow to setup crtscts on such port.
Tested on S905X3 based board.
Fixes: ff7693d079e5 ("ARM: meson: serial: add MesonX SoC on-chip uart driver") Cc: stable@vger.kernel.org Signed-off-by: Pavel Krasavin pkrasavin@imaqliq.com Reviewed-by: Neil Armstrong neil.armstrong@linaro.org Reviewed-by: Dmitry Rokosov ddrokosov@salutedevices.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/tty/serial/meson_uart.c | 14 +++++++++++--- 1 file changed, 11 insertions(+), 3 deletions(-)
--- a/drivers/tty/serial/meson_uart.c +++ b/drivers/tty/serial/meson_uart.c @@ -379,10 +379,14 @@ static void meson_uart_set_termios(struc else val |= AML_UART_STOP_BIT_1SB;
- if (cflags & CRTSCTS) - val &= ~AML_UART_TWO_WIRE_EN; - else + if (cflags & CRTSCTS) { + if (port->flags & UPF_HARD_FLOW) + val &= ~AML_UART_TWO_WIRE_EN; + else + termios->c_cflag &= ~CRTSCTS; + } else { val |= AML_UART_TWO_WIRE_EN; + }
writel(val, port->membase + AML_UART_CONTROL);
@@ -697,6 +701,7 @@ static int meson_uart_probe(struct platf u32 fifosize = 64; /* Default is 64, 128 for EE UART_0 */ int ret = 0; int irq; + bool has_rtscts;
if (pdev->dev.of_node) pdev->id = of_alias_get_id(pdev->dev.of_node, "serial"); @@ -724,6 +729,7 @@ static int meson_uart_probe(struct platf return irq;
of_property_read_u32(pdev->dev.of_node, "fifo-size", &fifosize); + has_rtscts = of_property_read_bool(pdev->dev.of_node, "uart-has-rtscts");
if (meson_ports[pdev->id]) { dev_err(&pdev->dev, "port %d already allocated\n", pdev->id); @@ -743,6 +749,8 @@ static int meson_uart_probe(struct platf port->mapsize = resource_size(res_mem); port->irq = irq; port->flags = UPF_BOOT_AUTOCONF | UPF_LOW_LATENCY; + if (has_rtscts) + port->flags |= UPF_HARD_FLOW; port->has_sysrq = IS_ENABLED(CONFIG_SERIAL_MESON_CONSOLE); port->dev = &pdev->dev; port->line = pdev->id;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: David Woodhouse dwmw@amazon.co.uk
commit a30badfd7c13fc8763a9e10c5a12ba7f81515a55 upstream.
On unplug of a Xen console, xencons_disconnect_backend() unconditionally calls free_irq() via unbind_from_irqhandler(), causing a warning of freeing an already-free IRQ:
(qemu) device_del con1 [ 32.050919] ------------[ cut here ]------------ [ 32.050942] Trying to free already-free IRQ 33 [ 32.050990] WARNING: CPU: 0 PID: 51 at kernel/irq/manage.c:1895 __free_irq+0x1d4/0x330
It should be using evtchn_put() to tear down the event channel binding, and let the Linux IRQ side of it be handled by notifier_del_irq() through the HVC code.
On which topic... xencons_disconnect_backend() should call hvc_remove() *first*, rather than tearing down the event channel and grant mapping while they are in use. And then the IRQ is guaranteed to be freed by the time it's torn down by evtchn_put().
Since evtchn_put() also closes the actual event channel, avoid calling xenbus_free_evtchn() except in the failure path where the IRQ was not successfully set up.
However, calling hvc_remove() at the start of xencons_disconnect_backend() still isn't early enough. An unplug request is indicated by the backend setting its state to XenbusStateClosing, which triggers a notification to xencons_backend_changed(), which... does nothing except set its own frontend state directly to XenbusStateClosed without *actually* tearing down the HVC device or, you know, making sure it isn't actively in use.
So the backend sees the guest frontend set its state to XenbusStateClosed and stops servicing the interrupt... and the guest spins for ever in the domU_write_console() function waiting for the ring to drain.
Fix that one by calling hvc_remove() from xencons_backend_changed() before signalling to the backend that it's OK to proceed with the removal.
Tested with 'dd if=/dev/zero of=/dev/hvc1' while telling Qemu to remove the console device.
Signed-off-by: David Woodhouse dwmw@amazon.co.uk Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20231020161529.355083-4-dwmw2@infradead.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/tty/hvc/hvc_xen.c | 32 ++++++++++++++++++++++++-------- 1 file changed, 24 insertions(+), 8 deletions(-)
--- a/drivers/tty/hvc/hvc_xen.c +++ b/drivers/tty/hvc/hvc_xen.c @@ -377,18 +377,21 @@ void xen_console_resume(void) #ifdef CONFIG_HVC_XEN_FRONTEND static void xencons_disconnect_backend(struct xencons_info *info) { - if (info->irq > 0) - unbind_from_irqhandler(info->irq, NULL); - info->irq = 0; + if (info->hvc != NULL) + hvc_remove(info->hvc); + info->hvc = NULL; + if (info->irq > 0) { + evtchn_put(info->evtchn); + info->irq = 0; + info->evtchn = 0; + } + /* evtchn_put() will also close it so this is only an error path */ if (info->evtchn > 0) xenbus_free_evtchn(info->xbdev, info->evtchn); info->evtchn = 0; if (info->gntref > 0) gnttab_free_grant_references(info->gntref); info->gntref = 0; - if (info->hvc != NULL) - hvc_remove(info->hvc); - info->hvc = NULL; }
static void xencons_free(struct xencons_info *info) @@ -553,10 +556,23 @@ static void xencons_backend_changed(stru if (dev->state == XenbusStateClosed) break; fallthrough; /* Missed the backend's CLOSING state */ - case XenbusStateClosing: + case XenbusStateClosing: { + struct xencons_info *info = dev_get_drvdata(&dev->dev);; + + /* + * Don't tear down the evtchn and grant ref before the other + * end has disconnected, but do stop userspace from trying + * to use the device before we allow the backend to close. + */ + if (info->hvc) { + hvc_remove(info->hvc); + info->hvc = NULL; + } + xenbus_frontend_closed(dev); break; } + } }
static const struct xenbus_device_id xencons_ids[] = { @@ -616,7 +632,7 @@ static int __init xen_hvc_init(void) list_del(&info->list); spin_unlock_irqrestore(&xencons_lock, flags); if (info->irq) - unbind_from_irqhandler(info->irq, NULL); + evtchn_put(info->evtchn); kfree(info); return r; }
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: David Woodhouse dwmw@amazon.co.uk
commit 2704c9a5593f4a47620c12dad78838ca62b52f48 upstream.
The xen_hvc_init() function should always register the frontend driver, even when there's no primary console — as there may be secondary consoles. (Qemu can always add secondary consoles, but only the toolstack can add the primary because it's special.)
Signed-off-by: David Woodhouse dwmw@amazon.co.uk Reviewed-by: Juergen Gross jgross@suse.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20231020161529.355083-3-dwmw2@infradead.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/tty/hvc/hvc_xen.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
--- a/drivers/tty/hvc/hvc_xen.c +++ b/drivers/tty/hvc/hvc_xen.c @@ -604,7 +604,7 @@ static int __init xen_hvc_init(void) ops = &dom0_hvc_ops; r = xen_initial_domain_console_init(); if (r < 0) - return r; + goto register_fe; info = vtermno_to_xencons(HVC_COOKIE); } else { ops = &domU_hvc_ops; @@ -613,7 +613,7 @@ static int __init xen_hvc_init(void) else r = xen_pv_console_init(); if (r < 0) - return r; + goto register_fe;
info = vtermno_to_xencons(HVC_COOKIE); info->irq = bind_evtchn_to_irq_lateeoi(info->evtchn); @@ -638,6 +638,7 @@ static int __init xen_hvc_init(void) }
r = 0; + register_fe: #ifdef CONFIG_HVC_XEN_FRONTEND r = xenbus_register_frontend(&xencons_driver); #endif
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: David Woodhouse dwmw@amazon.co.uk
commit ef5dd8ec88ac11e8e353164407d55b73c988b369 upstream.
The xencons_connect_backend() function allocates a local interdomain event channel with xenbus_alloc_evtchn(), then calls bind_interdomain_evtchn_to_irq_lateeoi() to bind to that port# on the *remote* domain.
That doesn't work very well:
(qemu) device_add xen-console,id=con1,chardev=pty0 [ 44.323872] xenconsole console-1: 2 xenbus_dev_probe on device/console/1 [ 44.323995] xenconsole: probe of console-1 failed with error -2
Fix it to use bind_evtchn_to_irq_lateeoi(), which does the right thing by just binding that *local* event channel to an irq. The backend will do the interdomain binding.
This didn't affect the primary console because the setup for that is special — the toolstack allocates the guest event channel and the guest discovers it with HVMOP_get_param.
Fixes: fe415186b43d ("xen/console: harden hvc_xen against event channel storms") Signed-off-by: David Woodhouse dwmw@amazon.co.uk Reviewed-by: Juergen Gross jgross@suse.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20231020161529.355083-2-dwmw2@infradead.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/tty/hvc/hvc_xen.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/tty/hvc/hvc_xen.c +++ b/drivers/tty/hvc/hvc_xen.c @@ -436,7 +436,7 @@ static int xencons_connect_backend(struc if (ret) return ret; info->evtchn = evtchn; - irq = bind_interdomain_evtchn_to_irq_lateeoi(dev, evtchn); + irq = bind_evtchn_to_irq_lateeoi(evtchn); if (irq < 0) return irq; info->irq = irq;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Lukas Wunner lukas@wunner.de
commit 70b70a4307cccebe91388337b1c85735ce4de6ff upstream.
struct pci_dev contains two flags which govern whether the device may suspend to D3cold:
* no_d3cold provides an opt-out for drivers (e.g. if a device is known to not wake from D3cold)
* d3cold_allowed provides an opt-out for user space (default is true, user space may set to false)
Since commit 9d26d3a8f1b0 ("PCI: Put PCIe ports into D3 during suspend"), the user space setting overwrites the driver setting. Essentially user space is trusted to know better than the driver whether D3cold is working.
That feels unsafe and wrong. Assume that the change was introduced inadvertently and do not overwrite no_d3cold when d3cold_allowed is modified. Instead, consider d3cold_allowed in addition to no_d3cold when choosing a suspend state for the device.
That way, user space may opt out of D3cold if the driver hasn't, but it may no longer force an opt in if the driver has opted out.
Fixes: 9d26d3a8f1b0 ("PCI: Put PCIe ports into D3 during suspend") Link: https://lore.kernel.org/r/b8a7f4af2b73f6b506ad8ddee59d747cbf834606.169502536... Signed-off-by: Lukas Wunner lukas@wunner.de Signed-off-by: Bjorn Helgaas bhelgaas@google.com Reviewed-by: Mika Westerberg mika.westerberg@linux.intel.com Reviewed-by: Mario Limonciello mario.limonciello@amd.com Cc: stable@vger.kernel.org # v4.8+ Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pci/pci-acpi.c | 2 +- drivers/pci/pci-sysfs.c | 5 +---- 2 files changed, 2 insertions(+), 5 deletions(-)
--- a/drivers/pci/pci-acpi.c +++ b/drivers/pci/pci-acpi.c @@ -911,7 +911,7 @@ pci_power_t acpi_pci_choose_state(struct { int acpi_state, d_max;
- if (pdev->no_d3cold) + if (pdev->no_d3cold || !pdev->d3cold_allowed) d_max = ACPI_STATE_D3_HOT; else d_max = ACPI_STATE_D3_COLD; --- a/drivers/pci/pci-sysfs.c +++ b/drivers/pci/pci-sysfs.c @@ -529,10 +529,7 @@ static ssize_t d3cold_allowed_store(stru return -EINVAL;
pdev->d3cold_allowed = !!val; - if (pdev->d3cold_allowed) - pci_d3cold_enable(pdev); - else - pci_d3cold_disable(pdev); + pci_bridge_d3_update(pdev);
pm_runtime_resume(dev);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: SeongJae Park sj@kernel.org
commit 19467a950b49432a84bf6dbadbbb17bdf89418b7 upstream.
damon_sysfs_set_targets(), which updates the targets of the context for online commitment, do not remove targets that removed from the corresponding sysfs files. As a result, more than intended targets of the context can exist and hence consume memory and monitoring CPU resource more than expected.
Fix it by removing all targets of the context and fill up again using the user input. This could cause unnecessary memory dealloc and realloc operations, but this is not a hot code path. Also, note that damon_target is stateless, and hence no data is lost.
[sj@kernel.org: fix unnecessary monitoring results removal] Link: https://lkml.kernel.org/r/20231028213353.45397-1-sj@kernel.org Link: https://lkml.kernel.org/r/20231022210735.46409-2-sj@kernel.org Fixes: da87878010e5 ("mm/damon/sysfs: support online inputs update") Signed-off-by: SeongJae Park sj@kernel.org Cc: Brendan Higgins brendanhiggins@google.com Cc: stable@vger.kernel.org [5.19.x] Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/damon/sysfs.c | 68 ++++++++++++++++++++++++++++--------------------------- 1 file changed, 35 insertions(+), 33 deletions(-)
--- a/mm/damon/sysfs.c +++ b/mm/damon/sysfs.c @@ -1144,58 +1144,60 @@ destroy_targets_out: return err; }
-/* - * Search a target in a context that corresponds to the sysfs target input. - * - * Return: pointer to the target if found, NULL if not found, or negative - * error code if the search failed. - */ -static struct damon_target *damon_sysfs_existing_target( - struct damon_sysfs_target *sys_target, struct damon_ctx *ctx) +static int damon_sysfs_update_target(struct damon_target *target, + struct damon_ctx *ctx, + struct damon_sysfs_target *sys_target) { struct pid *pid; - struct damon_target *t; + struct damon_region *r, *next;
- if (!damon_target_has_pid(ctx)) { - /* Up to only one target for paddr could exist */ - damon_for_each_target(t, ctx) - return t; - return NULL; - } + if (!damon_target_has_pid(ctx)) + return 0;
- /* ops.id should be DAMON_OPS_VADDR or DAMON_OPS_FVADDR */ pid = find_get_pid(sys_target->pid); if (!pid) - return ERR_PTR(-EINVAL); - damon_for_each_target(t, ctx) { - if (t->pid == pid) { - put_pid(pid); - return t; - } + return -EINVAL; + + /* no change to the target */ + if (pid == target->pid) { + put_pid(pid); + return 0; } - put_pid(pid); - return NULL; + + /* remove old monitoring results and update the target's pid */ + damon_for_each_region_safe(r, next, target) + damon_destroy_region(r, target); + put_pid(target->pid); + target->pid = pid; + return 0; }
static int damon_sysfs_set_targets(struct damon_ctx *ctx, struct damon_sysfs_targets *sysfs_targets) { - int i, err; + struct damon_target *t, *next; + int i = 0, err;
/* Multiple physical address space monitoring targets makes no sense */ if (ctx->ops.id == DAMON_OPS_PADDR && sysfs_targets->nr > 1) return -EINVAL;
- for (i = 0; i < sysfs_targets->nr; i++) { + damon_for_each_target_safe(t, next, ctx) { + if (i < sysfs_targets->nr) { + damon_sysfs_update_target(t, ctx, + sysfs_targets->targets_arr[i]); + } else { + if (damon_target_has_pid(ctx)) + put_pid(t->pid); + damon_destroy_target(t); + } + i++; + } + + for (; i < sysfs_targets->nr; i++) { struct damon_sysfs_target *st = sysfs_targets->targets_arr[i]; - struct damon_target *t = damon_sysfs_existing_target(st, ctx);
- if (IS_ERR(t)) - return PTR_ERR(t); - if (!t) - err = damon_sysfs_add_target(st, ctx); - else - err = damon_sysfs_set_regions(t, st->regions); + err = damon_sysfs_add_target(st, ctx); if (err) return err; }
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: SeongJae Park sj@kernel.org
commit 9732336006764e2ee61225387e3c70eae9139035 upstream.
When user input is committed online, DAMON sysfs interface is ignoring the user input for the monitoring target regions. Such request is valid and useful for fixed monitoring target regions-based monitoring ops like 'paddr' or 'fvaddr'.
Update the region boundaries as user specified, too. Note that the monitoring results of the regions that overlap between the latest monitoring target regions and the new target regions are preserved.
Treat empty monitoring target regions user request as a request to just make no change to the monitoring target regions. Otherwise, users should set the monitoring target regions same to current one for every online input commit, and it could be challenging for dynamic monitoring target regions update DAMON ops like 'vaddr'. If the user really need to remove all monitoring target regions, they can simply remove the target and then create the target again with empty target regions.
Link: https://lkml.kernel.org/r/20231031170131.46972-1-sj@kernel.org Fixes: da87878010e5 ("mm/damon/sysfs: support online inputs update") Signed-off-by: SeongJae Park sj@kernel.org Cc: stable@vger.kernel.org [5.19+] Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/damon/sysfs.c | 47 ++++++++++++++++++++++++++++++----------------- 1 file changed, 30 insertions(+), 17 deletions(-)
--- a/mm/damon/sysfs.c +++ b/mm/damon/sysfs.c @@ -1144,34 +1144,47 @@ destroy_targets_out: return err; }
-static int damon_sysfs_update_target(struct damon_target *target, - struct damon_ctx *ctx, - struct damon_sysfs_target *sys_target) +static int damon_sysfs_update_target_pid(struct damon_target *target, int pid) { - struct pid *pid; - struct damon_region *r, *next; - - if (!damon_target_has_pid(ctx)) - return 0; + struct pid *pid_new;
- pid = find_get_pid(sys_target->pid); - if (!pid) + pid_new = find_get_pid(pid); + if (!pid_new) return -EINVAL;
- /* no change to the target */ - if (pid == target->pid) { - put_pid(pid); + if (pid_new == target->pid) { + put_pid(pid_new); return 0; }
- /* remove old monitoring results and update the target's pid */ - damon_for_each_region_safe(r, next, target) - damon_destroy_region(r, target); put_pid(target->pid); - target->pid = pid; + target->pid = pid_new; return 0; }
+static int damon_sysfs_update_target(struct damon_target *target, + struct damon_ctx *ctx, + struct damon_sysfs_target *sys_target) +{ + int err; + + if (damon_target_has_pid(ctx)) { + err = damon_sysfs_update_target_pid(target, sys_target->pid); + if (err) + return err; + } + + /* + * Do monitoring target region boundary update only if one or more + * regions are set by the user. This is for keeping current monitoring + * target results and range easier, especially for dynamic monitoring + * target regions update ops like 'vaddr'. + */ + if (sys_target->regions->nr) + err = damon_sysfs_set_regions(target, sys_target->regions); + return err; +} + static int damon_sysfs_set_targets(struct damon_ctx *ctx, struct damon_sysfs_targets *sysfs_targets) {
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Krister Johansen kjlx@templeofstupid.com
commit 8b793bcda61f6c3ed4f5b2ded7530ef6749580cb upstream.
Setting softlockup_panic from do_sysctl_args() causes it to take effect later in boot. The lockup detector is enabled before SMP is brought online, but do_sysctl_args runs afterwards. If a user wants to set softlockup_panic on boot and have it trigger should a softlockup occur during onlining of the non-boot processors, they could do this prior to commit f117955a2255 ("kernel/watchdog.c: convert {soft/hard}lockup boot parameters to sysctl aliases"). However, after this commit the value of softlockup_panic is set too late to be of help for this type of problem. Restore the prior behavior.
Signed-off-by: Krister Johansen kjlx@templeofstupid.com Cc: stable@vger.kernel.org Fixes: f117955a2255 ("kernel/watchdog.c: convert {soft/hard}lockup boot parameters to sysctl aliases") Signed-off-by: Luis Chamberlain mcgrof@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/proc/proc_sysctl.c | 1 - kernel/watchdog.c | 7 +++++++ 2 files changed, 7 insertions(+), 1 deletion(-)
--- a/fs/proc/proc_sysctl.c +++ b/fs/proc/proc_sysctl.c @@ -1574,7 +1574,6 @@ static const struct sysctl_alias sysctl_ {"hung_task_panic", "kernel.hung_task_panic" }, {"numa_zonelist_order", "vm.numa_zonelist_order" }, {"softlockup_all_cpu_backtrace", "kernel.softlockup_all_cpu_backtrace" }, - {"softlockup_panic", "kernel.softlockup_panic" }, { } };
--- a/kernel/watchdog.c +++ b/kernel/watchdog.c @@ -283,6 +283,13 @@ static DEFINE_PER_CPU(struct hrtimer, wa static DEFINE_PER_CPU(bool, softlockup_touch_sync); static unsigned long soft_lockup_nmi_warn;
+static int __init softlockup_panic_setup(char *str) +{ + softlockup_panic = simple_strtoul(str, NULL, 0); + return 1; +} +__setup("softlockup_panic=", softlockup_panic_setup); + static int __init nowatchdog_setup(char *str) { watchdog_user_enabled = 0;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Koichiro Den den@valinux.co.jp
commit e7250ab7ca4998fe026f2149805b03e09dc32498 upstream.
In iopt_area_split(), if the original iopt_area has filled a domain and is linked to domains_itree, pages_nodes have to be properly reinserted. Otherwise the domains_itree becomes corrupted and we will UAF.
Fixes: 51fe6141f0f6 ("iommufd: Data structure to provide IOVA to PFN mapping") Link: https://lore.kernel.org/r/20231027162941.2864615-2-den@valinux.co.jp Cc: stable@vger.kernel.org Signed-off-by: Koichiro Den den@valinux.co.jp Signed-off-by: Jason Gunthorpe jgg@nvidia.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/iommu/iommufd/io_pagetable.c | 10 ++++++++++ 1 file changed, 10 insertions(+)
--- a/drivers/iommu/iommufd/io_pagetable.c +++ b/drivers/iommu/iommufd/io_pagetable.c @@ -1060,6 +1060,16 @@ static int iopt_area_split(struct iopt_a if (WARN_ON(rc)) goto err_remove_lhs;
+ /* + * If the original area has filled a domain, domains_itree has to be + * updated. + */ + if (area->storage_domain) { + interval_tree_remove(&area->pages_node, &pages->domains_itree); + interval_tree_insert(&lhs->pages_node, &pages->domains_itree); + interval_tree_insert(&rhs->pages_node, &pages->domains_itree); + } + lhs->storage_domain = area->storage_domain; lhs->pages = area->pages; rhs->storage_domain = area->storage_domain;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Helge Deller deller@gmx.de
commit 8a32aa17c1cd48df1ddaa78e45abcb8c7a2220d6 upstream.
The pointer to the next STI font is actually a signed 32-bit offset. With this change the 64-bit kernel will correctly subract the (signed 32-bit) offset instead of adding a (unsigned 32-bit) offset. It has no effect on 32-bit kernels.
This fixes the stifb driver with a 64-bit kernel on qemu.
Signed-off-by: Helge Deller deller@gmx.de Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/video/sticore.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/include/video/sticore.h +++ b/include/video/sticore.h @@ -232,7 +232,7 @@ struct sti_rom_font { u8 height; u8 font_type; /* language type */ u8 bytes_per_char; - u32 next_font; + s32 next_font; /* note: signed int */ u8 underline_height; u8 underline_pos; u8 res008[2];
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mikulas Patocka mpatocka@redhat.com
commit 9793c269da6cd339757de6ba5b2c8681b54c99af upstream.
The commit 5054e778fcd9c ("dm crypt: allocate compound pages if possible") changed dm-crypt to use compound pages to improve performance. Unfortunately, there was an oversight: the allocation of compound pages was not accounted at all. Normal pages are accounted in a percpu counter cc->n_allocated_pages and dm-crypt is limited to allocate at most 2% of memory. Because compound pages were not accounted at all, dm-crypt could allocate memory over the 2% limit.
Fix this by adding the accounting of compound pages, so that memory consumption of dm-crypt is properly limited.
Signed-off-by: Mikulas Patocka mpatocka@redhat.com Fixes: 5054e778fcd9c ("dm crypt: allocate compound pages if possible") Cc: stable@vger.kernel.org # v6.5+ Signed-off-by: Mike Snitzer snitzer@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/md/dm-crypt.c | 15 ++++++++++++--- 1 file changed, 12 insertions(+), 3 deletions(-)
--- a/drivers/md/dm-crypt.c +++ b/drivers/md/dm-crypt.c @@ -1700,11 +1700,17 @@ retry: order = min(order, remaining_order);
while (order > 0) { + if (unlikely(percpu_counter_read_positive(&cc->n_allocated_pages) + + (1 << order) > dm_crypt_pages_per_client)) + goto decrease_order; pages = alloc_pages(gfp_mask | __GFP_NOMEMALLOC | __GFP_NORETRY | __GFP_NOWARN | __GFP_COMP, order); - if (likely(pages != NULL)) + if (likely(pages != NULL)) { + percpu_counter_add(&cc->n_allocated_pages, 1 << order); goto have_pages; + } +decrease_order: order--; }
@@ -1742,10 +1748,13 @@ static void crypt_free_buffer_pages(stru
if (clone->bi_vcnt > 0) { /* bio_for_each_folio_all crashes with an empty bio */ bio_for_each_folio_all(fi, clone) { - if (folio_test_large(fi.folio)) + if (folio_test_large(fi.folio)) { + percpu_counter_sub(&cc->n_allocated_pages, + 1 << folio_order(fi.folio)); folio_put(fi.folio); - else + } else { mempool_free(&fi.folio->page, &cc->page_pool); + } } } }
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: SeongJae Park sj@kernel.org
commit 44063f125af4bb4efd1d500d8091fa33a98af325 upstream.
When calculating the hotness threshold for lru_prio scheme of DAMON_LRU_SORT, the module divides some values by the maximum nr_accesses. However, due to the type of the related variables, simple division-based calculation of the divisor can return zero. As a result, divide-by-zero is possible. Fix it by using damon_max_nr_accesses(), which handles the case.
Link: https://lkml.kernel.org/r/20231019194924.100347-5-sj@kernel.org Fixes: 40e983cca927 ("mm/damon: introduce DAMON-based LRU-lists Sorting") Signed-off-by: SeongJae Park sj@kernel.org Reported-by: Jakub Acs acsjakub@amazon.de Cc: stable@vger.kernel.org [6.0+] Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/damon/lru_sort.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-)
--- a/mm/damon/lru_sort.c +++ b/mm/damon/lru_sort.c @@ -193,9 +193,7 @@ static int damon_lru_sort_apply_paramete if (err) return err;
- /* aggr_interval / sample_interval is the maximum nr_accesses */ - hot_thres = damon_lru_sort_mon_attrs.aggr_interval / - damon_lru_sort_mon_attrs.sample_interval * + hot_thres = damon_max_nr_accesses(&damon_lru_sort_mon_attrs) * hot_thres_access_freq / 1000; scheme = damon_lru_sort_new_hot_scheme(hot_thres); if (!scheme)
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: SeongJae Park sj@kernel.org
commit 3bafc47d3c4a2fc4d3b382aeb3c087f8fc84d9fd upstream.
When calculating the hotness of each region for the under-quota regions prioritization, DAMON divides some values by the maximum nr_accesses. However, due to the type of the related variables, simple division-based calculation of the divisor can return zero. As a result, divide-by-zero is possible. Fix it by using damon_max_nr_accesses(), which handles the case.
Link: https://lkml.kernel.org/r/20231019194924.100347-4-sj@kernel.org Fixes: 198f0f4c58b9 ("mm/damon/vaddr,paddr: support pageout prioritization") Signed-off-by: SeongJae Park sj@kernel.org Reported-by: Jakub Acs acsjakub@amazon.de Cc: stable@vger.kernel.org [5.16+] Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/damon/ops-common.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-)
--- a/mm/damon/ops-common.c +++ b/mm/damon/ops-common.c @@ -73,7 +73,6 @@ void damon_pmdp_mkold(pmd_t *pmd, struct int damon_hot_score(struct damon_ctx *c, struct damon_region *r, struct damos *s) { - unsigned int max_nr_accesses; int freq_subscore; unsigned int age_in_sec; int age_in_log, age_subscore; @@ -81,8 +80,8 @@ int damon_hot_score(struct damon_ctx *c, unsigned int age_weight = s->quota.weight_age; int hotness;
- max_nr_accesses = c->attrs.aggr_interval / c->attrs.sample_interval; - freq_subscore = r->nr_accesses * DAMON_MAX_SUBSCORE / max_nr_accesses; + freq_subscore = r->nr_accesses * DAMON_MAX_SUBSCORE / + damon_max_nr_accesses(&c->attrs);
age_in_sec = (unsigned long)r->age * c->attrs.aggr_interval / 1000000; for (age_in_log = 0; age_in_log < DAMON_MAX_AGE_IN_LOG && age_in_sec;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: SeongJae Park sj@kernel.org
commit 35f5d94187a6a3a8df2cba54beccca1c2379edb8 upstream.
Patch series "avoid divide-by-zero due to max_nr_accesses overflow".
The maximum nr_accesses of given DAMON context can be calculated by dividing the aggregation interval by the sampling interval. Some logics in DAMON uses the maximum nr_accesses as a divisor. Hence, the value shouldn't be zero. Such case is avoided since DAMON avoids setting the agregation interval as samller than the sampling interval. However, since nr_accesses is unsigned int while the intervals are unsigned long, the maximum nr_accesses could be zero while casting.
Avoid the divide-by-zero by implementing a function that handles the corner case (first patch), and replaces the vulnerable direct max nr_accesses calculations (remaining patches).
Note that the patches for the replacements are divided for broken commits, to make backporting on required tres easier. Especially, the last patch is for a patch that not yet merged into the mainline but in mm tree.
This patch (of 4):
The maximum nr_accesses of given DAMON context can be calculated by dividing the aggregation interval by the sampling interval. Some logics in DAMON uses the maximum nr_accesses as a divisor. Hence, the value shouldn't be zero. Such case is avoided since DAMON avoids setting the agregation interval as samller than the sampling interval. However, since nr_accesses is unsigned int while the intervals are unsigned long, the maximum nr_accesses could be zero while casting. Implement a function that handles the corner case.
Note that this commit is not fixing the real issue since this is only introducing the safe function that will replaces the problematic divisions. The replacements will be made by followup commits, to make backporting on stable series easier.
Link: https://lkml.kernel.org/r/20231019194924.100347-1-sj@kernel.org Link: https://lkml.kernel.org/r/20231019194924.100347-2-sj@kernel.org Fixes: 198f0f4c58b9 ("mm/damon/vaddr,paddr: support pageout prioritization") Signed-off-by: SeongJae Park sj@kernel.org Reported-by: Jakub Acs acsjakub@amazon.de Cc: stable@vger.kernel.org [5.16+] Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/linux/damon.h | 7 +++++++ 1 file changed, 7 insertions(+)
--- a/include/linux/damon.h +++ b/include/linux/damon.h @@ -626,6 +626,13 @@ static inline bool damon_target_has_pid( return ctx->ops.id == DAMON_OPS_VADDR || ctx->ops.id == DAMON_OPS_FVADDR; }
+static inline unsigned int damon_max_nr_accesses(const struct damon_attrs *attrs) +{ + /* {aggr,sample}_interval are unsigned long, hence could overflow */ + return min(attrs->aggr_interval / attrs->sample_interval, + (unsigned long)UINT_MAX); +} +
int damon_start(struct damon_ctx **ctxs, int nr_ctxs, bool exclusive); int damon_stop(struct damon_ctx **ctxs, int nr_ctxs);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: SeongJae Park sj@kernel.org
commit d35963bfb05877455228ecec6b194f624489f96a upstream.
When monitoring attributes are changed, DAMON updates access rate of the monitoring results accordingly. For that, it divides some values by the maximum nr_accesses. However, due to the type of the related variables, simple division-based calculation of the divisor can return zero. As a result, divide-by-zero is possible. Fix it by using damon_max_nr_accesses(), which handles the case.
Link: https://lkml.kernel.org/r/20231019194924.100347-3-sj@kernel.org Fixes: 2f5bef5a590b ("mm/damon/core: update monitoring results for new monitoring attributes") Signed-off-by: SeongJae Park sj@kernel.org Reported-by: Jakub Acs acsjakub@amazon.de Cc: stable@vger.kernel.org [6.3+] Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/damon/core.c | 10 ++-------- 1 file changed, 2 insertions(+), 8 deletions(-)
--- a/mm/damon/core.c +++ b/mm/damon/core.c @@ -476,20 +476,14 @@ static unsigned int damon_age_for_new_at static unsigned int damon_accesses_bp_to_nr_accesses( unsigned int accesses_bp, struct damon_attrs *attrs) { - unsigned int max_nr_accesses = - attrs->aggr_interval / attrs->sample_interval; - - return accesses_bp * max_nr_accesses / 10000; + return accesses_bp * damon_max_nr_accesses(attrs) / 10000; }
/* convert nr_accesses to access ratio in bp (per 10,000) */ static unsigned int damon_nr_accesses_to_accesses_bp( unsigned int nr_accesses, struct damon_attrs *attrs) { - unsigned int max_nr_accesses = - attrs->aggr_interval / attrs->sample_interval; - - return nr_accesses * 10000 / max_nr_accesses; + return nr_accesses * 10000 / damon_max_nr_accesses(attrs); }
static unsigned int damon_nr_accesses_for_new_attrs(unsigned int nr_accesses,
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: SeongJae Park sj@kernel.org
commit ae636ae2bbfd9279f5681dbf320d1da817e52b68 upstream.
DAMON sysfs interface's before_damos_apply callback (damon_sysfs_before_damos_apply()), which creates the DAMOS tried regions for each DAMOS action applied region, is not handling the allocation failure for the sysfs directory data. As a result, NULL pointer derefeence is possible. Fix it by handling the case.
Link: https://lkml.kernel.org/r/20231106233408.51159-4-sj@kernel.org Fixes: f1d13cacabe1 ("mm/damon/sysfs: implement DAMOS tried regions update command") Signed-off-by: SeongJae Park sj@kernel.org Cc: stable@vger.kernel.org [6.2+] Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/damon/sysfs-schemes.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/mm/damon/sysfs-schemes.c +++ b/mm/damon/sysfs-schemes.c @@ -1649,6 +1649,8 @@ static int damon_sysfs_before_damos_appl
sysfs_regions = sysfs_schemes->schemes_arr[schemes_idx]->tried_regions; region = damon_sysfs_scheme_region_alloc(r); + if (!region) + return 0; list_add_tail(®ion->list, &sysfs_regions->regions_list); sysfs_regions->nr_regions++; if (kobject_init_and_add(®ion->kobj,
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: SeongJae Park sj@kernel.org
commit 84055688b6bc075c92a88e2d6c3ad26ab93919f9 upstream.
DAMOS tried regions sysfs directory allocation function (damon_sysfs_scheme_regions_alloc()) is not handling the memory allocation failure. In the case, the code will dereference NULL pointer. Handle the failure to avoid such invalid access.
Link: https://lkml.kernel.org/r/20231106233408.51159-3-sj@kernel.org Fixes: 9277d0367ba1 ("mm/damon/sysfs-schemes: implement scheme region directory") Signed-off-by: SeongJae Park sj@kernel.org Cc: stable@vger.kernel.org [6.2+] Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/damon/sysfs-schemes.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/mm/damon/sysfs-schemes.c +++ b/mm/damon/sysfs-schemes.c @@ -125,6 +125,9 @@ damon_sysfs_scheme_regions_alloc(void) struct damon_sysfs_scheme_regions *regions = kmalloc(sizeof(*regions), GFP_KERNEL);
+ if (!regions) + return NULL; + regions->kobj = (struct kobject){}; INIT_LIST_HEAD(®ions->regions_list); regions->nr_regions = 0;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: SeongJae Park sj@kernel.org
commit b4936b544b08ed44949055b92bd25f77759ebafc upstream.
Patch series "mm/damon/sysfs: fix unhandled return values".
Some of DAMON sysfs interface code is not handling return values from some functions. As a result, confusing user input handling or NULL-dereference is possible. Check those properly.
This patch (of 3):
damon_sysfs_update_target() returns error code for failures, but its caller, damon_sysfs_set_targets() is ignoring that. The update function seems making no critical change in case of such failures, but the behavior will look like DAMON sysfs is silently ignoring or only partially accepting the user input. Fix it.
Link: https://lkml.kernel.org/r/20231106233408.51159-1-sj@kernel.org Link: https://lkml.kernel.org/r/20231106233408.51159-2-sj@kernel.org Fixes: 19467a950b49 ("mm/damon/sysfs: remove requested targets when online-commit inputs") Signed-off-by: SeongJae Park sj@kernel.org Cc: stable@vger.kernel.org [5.19+] Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/damon/sysfs.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
--- a/mm/damon/sysfs.c +++ b/mm/damon/sysfs.c @@ -1197,8 +1197,10 @@ static int damon_sysfs_set_targets(struc
damon_for_each_target_safe(t, next, ctx) { if (i < sysfs_targets->nr) { - damon_sysfs_update_target(t, ctx, + err = damon_sysfs_update_target(t, ctx, sysfs_targets->targets_arr[i]); + if (err) + return err; } else { if (damon_target_has_pid(ctx)) put_pid(t->pid);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: John David Anglin dave@parisc-linux.org
commit ad4aa06e1d92b06ed56c7240252927bd60632efe upstream.
An excerpt from the PA8800 ERS states:
* The PA8800 violates the seven instruction pipeline rule when performing TLB inserts or PxTLBE instructions with the PSW C bit on. The instruction will take effect by the 12th instruction after the insert or purge.
I believe we have a problem with handling TLB misses. We don't fill the pipeline following TLB inserts. As a result, we likely fault again after returning from the interruption.
The above statement indicates that we need at least seven instructions after the insert on pre PA8800 processors and we need 12 instructions on PA8800/PA8900 processors.
Here we add macros and code to provide the required number instructions after a TLB insert.
Signed-off-by: John David Anglin dave.anglin@bell.net Suggested-by: Helge Deller deller@gmx.de Cc: stable@vger.kernel.org Signed-off-by: Helge Deller deller@gmx.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/parisc/kernel/entry.S | 81 ++++++++++++++++++++++++++++----------------- 1 file changed, 52 insertions(+), 29 deletions(-)
--- a/arch/parisc/kernel/entry.S +++ b/arch/parisc/kernel/entry.S @@ -36,6 +36,24 @@ .level 2.0 #endif
+/* + * We need seven instructions after a TLB insert for it to take effect. + * The PA8800/PA8900 processors are an exception and need 12 instructions. + * The RFI changes both IAOQ_Back and IAOQ_Front, so it counts as one. + */ +#ifdef CONFIG_64BIT +#define NUM_PIPELINE_INSNS 12 +#else +#define NUM_PIPELINE_INSNS 7 +#endif + + /* Insert num nops */ + .macro insert_nops num + .rept \num + nop + .endr + .endm + /* Get aligned page_table_lock address for this mm from cr28/tr4 */ .macro get_ptl reg mfctl %cr28,\reg @@ -415,24 +433,20 @@ 3: .endm
- /* Release page_table_lock without reloading lock address. - We use an ordered store to ensure all prior accesses are - performed prior to releasing the lock. */ - .macro ptl_unlock0 spc,tmp,tmp2 + /* Release page_table_lock if for user space. We use an ordered + store to ensure all prior accesses are performed prior to + releasing the lock. Note stw may not be executed, so we + provide one extra nop when CONFIG_TLB_PTLOCK is defined. */ + .macro ptl_unlock spc,tmp,tmp2 #ifdef CONFIG_TLB_PTLOCK -98: ldi __ARCH_SPIN_LOCK_UNLOCKED_VAL, \tmp2 +98: get_ptl \tmp + ldi __ARCH_SPIN_LOCK_UNLOCKED_VAL, \tmp2 or,COND(=) %r0,\spc,%r0 stw,ma \tmp2,0(\tmp) 99: ALTERNATIVE(98b, 99b, ALT_COND_NO_SMP, INSN_NOP) -#endif - .endm - - /* Release page_table_lock. */ - .macro ptl_unlock1 spc,tmp,tmp2 -#ifdef CONFIG_TLB_PTLOCK -98: get_ptl \tmp - ptl_unlock0 \spc,\tmp,\tmp2 -99: ALTERNATIVE(98b, 99b, ALT_COND_NO_SMP, INSN_NOP) + insert_nops NUM_PIPELINE_INSNS - 4 +#else + insert_nops NUM_PIPELINE_INSNS - 1 #endif .endm
@@ -1124,7 +1138,7 @@ dtlb_miss_20w: idtlbt pte,prot
- ptl_unlock1 spc,t0,t1 + ptl_unlock spc,t0,t1 rfir nop
@@ -1133,6 +1147,7 @@ dtlb_check_alias_20w:
idtlbt pte,prot
+ insert_nops NUM_PIPELINE_INSNS - 1 rfir nop
@@ -1150,7 +1165,7 @@ nadtlb_miss_20w:
idtlbt pte,prot
- ptl_unlock1 spc,t0,t1 + ptl_unlock spc,t0,t1 rfir nop
@@ -1159,6 +1174,7 @@ nadtlb_check_alias_20w:
idtlbt pte,prot
+ insert_nops NUM_PIPELINE_INSNS - 1 rfir nop
@@ -1184,7 +1200,7 @@ dtlb_miss_11:
mtsp t1, %sr1 /* Restore sr1 */
- ptl_unlock1 spc,t0,t1 + ptl_unlock spc,t0,t1 rfir nop
@@ -1194,6 +1210,7 @@ dtlb_check_alias_11: idtlba pte,(va) idtlbp prot,(va)
+ insert_nops NUM_PIPELINE_INSNS - 1 rfir nop
@@ -1217,7 +1234,7 @@ nadtlb_miss_11:
mtsp t1, %sr1 /* Restore sr1 */
- ptl_unlock1 spc,t0,t1 + ptl_unlock spc,t0,t1 rfir nop
@@ -1227,6 +1244,7 @@ nadtlb_check_alias_11: idtlba pte,(va) idtlbp prot,(va)
+ insert_nops NUM_PIPELINE_INSNS - 1 rfir nop
@@ -1246,7 +1264,7 @@ dtlb_miss_20:
idtlbt pte,prot
- ptl_unlock1 spc,t0,t1 + ptl_unlock spc,t0,t1 rfir nop
@@ -1255,6 +1273,7 @@ dtlb_check_alias_20: idtlbt pte,prot
+ insert_nops NUM_PIPELINE_INSNS - 1 rfir nop
@@ -1274,7 +1293,7 @@ nadtlb_miss_20: idtlbt pte,prot
- ptl_unlock1 spc,t0,t1 + ptl_unlock spc,t0,t1 rfir nop
@@ -1283,6 +1302,7 @@ nadtlb_check_alias_20:
idtlbt pte,prot
+ insert_nops NUM_PIPELINE_INSNS - 1 rfir nop
@@ -1319,7 +1339,7 @@ itlb_miss_20w: iitlbt pte,prot
- ptl_unlock1 spc,t0,t1 + ptl_unlock spc,t0,t1 rfir nop
@@ -1343,7 +1363,7 @@ naitlb_miss_20w:
iitlbt pte,prot
- ptl_unlock1 spc,t0,t1 + ptl_unlock spc,t0,t1 rfir nop
@@ -1352,6 +1372,7 @@ naitlb_check_alias_20w:
iitlbt pte,prot
+ insert_nops NUM_PIPELINE_INSNS - 1 rfir nop
@@ -1377,7 +1398,7 @@ itlb_miss_11:
mtsp t1, %sr1 /* Restore sr1 */
- ptl_unlock1 spc,t0,t1 + ptl_unlock spc,t0,t1 rfir nop
@@ -1401,7 +1422,7 @@ naitlb_miss_11:
mtsp t1, %sr1 /* Restore sr1 */
- ptl_unlock1 spc,t0,t1 + ptl_unlock spc,t0,t1 rfir nop
@@ -1411,6 +1432,7 @@ naitlb_check_alias_11: iitlba pte,(%sr0, va) iitlbp prot,(%sr0, va)
+ insert_nops NUM_PIPELINE_INSNS - 1 rfir nop
@@ -1431,7 +1453,7 @@ itlb_miss_20:
iitlbt pte,prot
- ptl_unlock1 spc,t0,t1 + ptl_unlock spc,t0,t1 rfir nop
@@ -1451,7 +1473,7 @@ naitlb_miss_20:
iitlbt pte,prot
- ptl_unlock1 spc,t0,t1 + ptl_unlock spc,t0,t1 rfir nop
@@ -1460,6 +1482,7 @@ naitlb_check_alias_20:
iitlbt pte,prot
+ insert_nops NUM_PIPELINE_INSNS - 1 rfir nop
@@ -1481,7 +1504,7 @@ dbit_trap_20w: idtlbt pte,prot
- ptl_unlock0 spc,t0,t1 + ptl_unlock spc,t0,t1 rfir nop #else @@ -1507,7 +1530,7 @@ dbit_trap_11:
mtsp t1, %sr1 /* Restore sr1 */
- ptl_unlock0 spc,t0,t1 + ptl_unlock spc,t0,t1 rfir nop
@@ -1527,7 +1550,7 @@ dbit_trap_20: idtlbt pte,prot
- ptl_unlock0 spc,t0,t1 + ptl_unlock spc,t0,t1 rfir nop #endif
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Werner Sembach wse@tuxedocomputers.com
commit 0da9eccde3270b832c059ad618bf66e510c75d33 upstream.
The TongFang GMxXGxx/TUXEDO Stellaris/Pollaris Gen5 needs IRQ overriding for the keyboard to work.
Adding an entry for this laptop to the override_table makes the internal keyboard functional.
Signed-off-by: Werner Sembach wse@tuxedocomputers.com Cc: All applicable stable@vger.kernel.org Reviewed-by: Hans de Goede hdegoede@redhat.com Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/acpi/resource.c | 12 ++++++++++++ 1 file changed, 12 insertions(+)
--- a/drivers/acpi/resource.c +++ b/drivers/acpi/resource.c @@ -496,6 +496,18 @@ static const struct dmi_system_id mainge } }, { + /* TongFang GMxXGxx/TUXEDO Polaris 15 Gen5 AMD */ + .matches = { + DMI_MATCH(DMI_BOARD_NAME, "GMxXGxx"), + }, + }, + { + /* TongFang GM6XGxX/TUXEDO Stellaris 16 Gen5 AMD */ + .matches = { + DMI_MATCH(DMI_BOARD_NAME, "GM6XGxX"), + }, + }, + { .ident = "MAINGEAR Vector Pro 2 17", .matches = { DMI_MATCH(DMI_SYS_VENDOR, "Micro Electronics Inc"),
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mark Brown broonie@kernel.org
commit 0ec7731655de196bc1e4af99e495b38778109d22 upstream.
When we sync the register cache we do so with the cache bypassed in order to avoid overhead from writing the synced values back into the cache. If the regmap has ranges and the selector register for those ranges is in a register which is cached this has the unfortunate side effect of meaning that the physical and cached copies of the selector register can be out of sync after a cache sync. The cache will have whatever the selector was when the sync started and the hardware will have the selector for the register that was synced last.
Fix this by rewriting all cached selector registers after every sync, ensuring that the hardware and cache have the same content. This will result in extra writes that wouldn't otherwise be needed but is simple so hopefully robust. We don't read from the hardware since not all devices have physical read support.
Given that nobody noticed this until now it is likely that we are rarely if ever hitting this case.
Reported-by: Hector Martin marcan@marcan.st Cc: stable@vger.kernel.org Signed-off-by: Mark Brown broonie@kernel.org Link: https://lore.kernel.org/r/20231026-regmap-fix-selector-sync-v1-1-633ded82770... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/base/regmap/regcache.c | 30 ++++++++++++++++++++++++++++++ 1 file changed, 30 insertions(+)
--- a/drivers/base/regmap/regcache.c +++ b/drivers/base/regmap/regcache.c @@ -334,6 +334,11 @@ static int regcache_default_sync(struct return 0; }
+static int rbtree_all(const void *key, const struct rb_node *node) +{ + return 0; +} + /** * regcache_sync - Sync the register cache with the hardware. * @@ -351,6 +356,7 @@ int regcache_sync(struct regmap *map) unsigned int i; const char *name; bool bypass; + struct rb_node *node;
if (WARN_ON(map->cache_type == REGCACHE_NONE)) return -EINVAL; @@ -392,6 +398,30 @@ out: /* Restore the bypass state */ map->cache_bypass = bypass; map->no_sync_defaults = false; + + /* + * If we did any paging with cache bypassed and a cached + * paging register then the register and cache state might + * have gone out of sync, force writes of all the paging + * registers. + */ + rb_for_each(node, 0, &map->range_tree, rbtree_all) { + struct regmap_range_node *this = + rb_entry(node, struct regmap_range_node, node); + + /* If there's nothing in the cache there's nothing to sync */ + ret = regcache_read(map, this->selector_reg, &i); + if (ret != 0) + continue; + + ret = _regmap_write(map, this->selector_reg, i); + if (ret != 0) { + dev_err(map->dev, "Failed to write %x = %x: %d\n", + this->selector_reg, i, ret); + break; + } + } + map->unlock(map->lock_arg);
regmap_async_complete(map);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Johan Hovold johan+linaro@kernel.org
commit 1a5352a81b4720ba43d9c899974e3bddf7ce0ce8 upstream.
The ath11k active pdevs are protected by RCU but the temperature event handling code calling ath11k_mac_get_ar_by_pdev_id() was not marked as a read-side critical section as reported by RCU lockdep:
============================= WARNING: suspicious RCU usage 6.6.0-rc6 #7 Not tainted ----------------------------- drivers/net/wireless/ath/ath11k/mac.c:638 suspicious rcu_dereference_check() usage!
other info that might help us debug this:
rcu_scheduler_active = 2, debug_locks = 1 no locks held by swapper/0/0. ... Call trace: ... lockdep_rcu_suspicious+0x16c/0x22c ath11k_mac_get_ar_by_pdev_id+0x194/0x1b0 [ath11k] ath11k_wmi_tlv_op_rx+0xa84/0x2c1c [ath11k] ath11k_htc_rx_completion_handler+0x388/0x510 [ath11k]
Mark the code in question as an RCU read-side critical section to avoid any potential use-after-free issues.
Tested-on: WCN6855 hw2.1 PCI WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.23
Fixes: a41d10348b01 ("ath11k: add thermal sensor device support") Cc: stable@vger.kernel.org # 5.7 Signed-off-by: Johan Hovold johan+linaro@kernel.org Acked-by: Jeff Johnson quic_jjohnson@quicinc.com Signed-off-by: Kalle Valo quic_kvalo@quicinc.com Link: https://lore.kernel.org/r/20231019153115.26401-2-johan+linaro@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/wireless/ath/ath11k/wmi.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
--- a/drivers/net/wireless/ath/ath11k/wmi.c +++ b/drivers/net/wireless/ath/ath11k/wmi.c @@ -8383,15 +8383,19 @@ ath11k_wmi_pdev_temperature_event(struct ath11k_dbg(ab, ATH11K_DBG_WMI, "event pdev temperature ev temp %d pdev_id %d\n", ev->temp, ev->pdev_id);
+ rcu_read_lock(); + ar = ath11k_mac_get_ar_by_pdev_id(ab, ev->pdev_id); if (!ar) { ath11k_warn(ab, "invalid pdev id in pdev temperature ev %d", ev->pdev_id); - kfree(tb); - return; + goto exit; }
ath11k_thermal_event_temperature(ar, ev->temp);
+exit: + rcu_read_unlock(); + kfree(tb); }
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Johan Hovold johan+linaro@kernel.org
commit 3b6c14833165f689cc5928574ebafe52bbce5f1e upstream.
The ath11k active pdevs are protected by RCU but the DFS radar event handling code calling ath11k_mac_get_ar_by_pdev_id() was not marked as a read-side critical section.
Mark the code in question as an RCU read-side critical section to avoid any potential use-after-free issues.
Compile tested only.
Fixes: d5c65159f289 ("ath11k: driver for Qualcomm IEEE 802.11ax devices") Cc: stable@vger.kernel.org # 5.6 Acked-by: Jeff Johnson quic_jjohnson@quicinc.com Signed-off-by: Johan Hovold johan+linaro@kernel.org Signed-off-by: Kalle Valo quic_kvalo@quicinc.com Link: https://lore.kernel.org/r/20231019153115.26401-3-johan+linaro@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/wireless/ath/ath11k/wmi.c | 4 ++++ 1 file changed, 4 insertions(+)
--- a/drivers/net/wireless/ath/ath11k/wmi.c +++ b/drivers/net/wireless/ath/ath11k/wmi.c @@ -8337,6 +8337,8 @@ ath11k_wmi_pdev_dfs_radar_detected_event ev->detector_id, ev->segment_id, ev->timestamp, ev->is_chirp, ev->freq_offset, ev->sidx);
+ rcu_read_lock(); + ar = ath11k_mac_get_ar_by_pdev_id(ab, ev->pdev_id);
if (!ar) { @@ -8354,6 +8356,8 @@ ath11k_wmi_pdev_dfs_radar_detected_event ieee80211_radar_detected(ar->hw);
exit: + rcu_read_unlock(); + kfree(tb); }
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Johan Hovold johan+linaro@kernel.org
commit 3f77c7d605b29df277d77e9ee75d96e7ad145d2d upstream.
The ath11k active pdevs are protected by RCU but the htt pktlog handling code calling ath11k_mac_get_ar_by_pdev_id() was not marked as a read-side critical section.
Mark the code in question as an RCU read-side critical section to avoid any potential use-after-free issues.
Compile tested only.
Fixes: d5c65159f289 ("ath11k: driver for Qualcomm IEEE 802.11ax devices") Cc: stable@vger.kernel.org # 5.6 Signed-off-by: Johan Hovold johan+linaro@kernel.org Signed-off-by: Kalle Valo quic_kvalo@quicinc.com Link: https://lore.kernel.org/r/20231019112521.2071-1-johan+linaro@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/wireless/ath/ath11k/dp_rx.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-)
--- a/drivers/net/wireless/ath/ath11k/dp_rx.c +++ b/drivers/net/wireless/ath/ath11k/dp_rx.c @@ -1621,14 +1621,20 @@ static void ath11k_htt_pktlog(struct ath u8 pdev_id;
pdev_id = FIELD_GET(HTT_T2H_PPDU_STATS_INFO_PDEV_ID, data->hdr); + + rcu_read_lock(); + ar = ath11k_mac_get_ar_by_pdev_id(ab, pdev_id); if (!ar) { ath11k_warn(ab, "invalid pdev id %d on htt pktlog\n", pdev_id); - return; + goto out; }
trace_ath11k_htt_pktlog(ar, data->payload, hdr->size, ar->ab->pktlog_defs_checksum); + +out: + rcu_read_unlock(); }
static void ath11k_htt_backpressure_event_handler(struct ath11k_base *ab,
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Johan Hovold johan+linaro@kernel.org
commit 1dea3c0720a146bd7193969f2847ccfed5be2221 upstream.
The ath11k active pdevs are protected by RCU but the gtk offload status event handling code calling ath11k_mac_get_arvif_by_vdev_id() was not marked as a read-side critical section.
Mark the code in question as an RCU read-side critical section to avoid any potential use-after-free issues.
Compile tested only.
Fixes: a16d9b50cfba ("ath11k: support GTK rekey offload") Cc: stable@vger.kernel.org # 5.18 Cc: Carl Huang quic_cjhuang@quicinc.com Signed-off-by: Johan Hovold johan+linaro@kernel.org Acked-by: Jeff Johnson quic_jjohnson@quicinc.com Signed-off-by: Kalle Valo quic_kvalo@quicinc.com Link: https://lore.kernel.org/r/20231019155342.31631-1-johan+linaro@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/wireless/ath/ath11k/wmi.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-)
--- a/drivers/net/wireless/ath/ath11k/wmi.c +++ b/drivers/net/wireless/ath/ath11k/wmi.c @@ -8619,12 +8619,13 @@ static void ath11k_wmi_gtk_offload_statu return; }
+ rcu_read_lock(); + arvif = ath11k_mac_get_arvif_by_vdev_id(ab, ev->vdev_id); if (!arvif) { ath11k_warn(ab, "failed to get arvif for vdev_id:%d\n", ev->vdev_id); - kfree(tb); - return; + goto exit; }
ath11k_dbg(ab, ATH11K_DBG_WMI, "event gtk offload refresh_cnt %d\n", @@ -8641,6 +8642,8 @@ static void ath11k_wmi_gtk_offload_statu
ieee80211_gtk_rekey_notify(arvif->vif, arvif->bssid, (void *)&replay_ctr_be, GFP_ATOMIC); +exit: + rcu_read_unlock();
kfree(tb); }
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Johan Hovold johan+linaro@kernel.org
commit 6afc57ea315e0f660b1f870a681737bb7b71faef upstream.
The ath12k active pdevs are protected by RCU but the htt mlo-offset event handling code calling ath12k_mac_get_ar_by_pdev_id() was not marked as a read-side critical section.
Mark the code in question as an RCU read-side critical section to avoid any potential use-after-free issues.
Compile tested only.
Fixes: d889913205cf ("wifi: ath12k: driver for Qualcomm Wi-Fi 7 devices") Cc: stable@vger.kernel.org # v6.2 Signed-off-by: Johan Hovold johan+linaro@kernel.org Acked-by: Jeff Johnson quic_jjohnson@quicinc.com Signed-off-by: Kalle Valo quic_kvalo@quicinc.com Link: https://lore.kernel.org/r/20231019113650.9060-3-johan+linaro@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/wireless/ath/ath12k/dp_rx.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-)
--- a/drivers/net/wireless/ath/ath12k/dp_rx.c +++ b/drivers/net/wireless/ath/ath12k/dp_rx.c @@ -1658,11 +1658,12 @@ static void ath12k_htt_mlo_offset_event_ msg = (struct ath12k_htt_mlo_offset_msg *)skb->data; pdev_id = u32_get_bits(__le32_to_cpu(msg->info), HTT_T2H_MLO_OFFSET_INFO_PDEV_ID); - ar = ath12k_mac_get_ar_by_pdev_id(ab, pdev_id);
+ rcu_read_lock(); + ar = ath12k_mac_get_ar_by_pdev_id(ab, pdev_id); if (!ar) { ath12k_warn(ab, "invalid pdev id %d on htt mlo offset\n", pdev_id); - return; + goto exit; }
spin_lock_bh(&ar->data_lock); @@ -1678,6 +1679,8 @@ static void ath12k_htt_mlo_offset_event_ pdev->timestamp.mlo_comp_timer = __le32_to_cpu(msg->mlo_comp_timer);
spin_unlock_bh(&ar->data_lock); +exit: + rcu_read_unlock(); }
void ath12k_dp_htt_htc_t2h_msg_handler(struct ath12k_base *ab,
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Johan Hovold johan+linaro@kernel.org
commit 69bd216e049349886405b1c87a55dce3d35d1ba7 upstream.
The ath12k active pdevs are protected by RCU but the DFS-radar and temperature event handling code calling ath12k_mac_get_ar_by_pdev_id() was not marked as a read-side critical section.
Mark the code in question as RCU read-side critical sections to avoid any potential use-after-free issues.
Note that the temperature event handler looks like a place holder currently but would still trigger an RCU lockdep splat.
Compile tested only.
Fixes: d889913205cf ("wifi: ath12k: driver for Qualcomm Wi-Fi 7 devices") Cc: stable@vger.kernel.org # v6.2 Signed-off-by: Johan Hovold johan+linaro@kernel.org Acked-by: Jeff Johnson quic_jjohnson@quicinc.com Signed-off-by: Kalle Valo quic_kvalo@quicinc.com Link: https://lore.kernel.org/r/20231019113650.9060-2-johan+linaro@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/wireless/ath/ath12k/wmi.c | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-)
--- a/drivers/net/wireless/ath/ath12k/wmi.c +++ b/drivers/net/wireless/ath/ath12k/wmi.c @@ -6234,6 +6234,8 @@ ath12k_wmi_pdev_dfs_radar_detected_event ev->detector_id, ev->segment_id, ev->timestamp, ev->is_chirp, ev->freq_offset, ev->sidx);
+ rcu_read_lock(); + ar = ath12k_mac_get_ar_by_pdev_id(ab, le32_to_cpu(ev->pdev_id));
if (!ar) { @@ -6251,6 +6253,8 @@ ath12k_wmi_pdev_dfs_radar_detected_event ieee80211_radar_detected(ar->hw);
exit: + rcu_read_unlock(); + kfree(tb); }
@@ -6269,11 +6273,16 @@ ath12k_wmi_pdev_temperature_event(struct ath12k_dbg(ab, ATH12K_DBG_WMI, "pdev temperature ev temp %d pdev_id %d\n", ev.temp, ev.pdev_id);
+ rcu_read_lock(); + ar = ath12k_mac_get_ar_by_pdev_id(ab, le32_to_cpu(ev.pdev_id)); if (!ar) { ath12k_warn(ab, "invalid pdev id in pdev temperature ev %d", ev.pdev_id); - return; + goto exit; } + +exit: + rcu_read_unlock(); }
static void ath12k_fils_discovery_event(struct ath12k_base *ab,
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Rong Chen rong.chen@amlogic.com
commit 57925e16c9f7d18012bcf45bfa658f92c087981a upstream.
For the t7 and older SoC families, the CMD_CFG_ERROR has no effect. Starting from SoC family C3, setting this bit without SG LINK data address will cause the controller to generate an IRQ and stop working.
To fix it, don't set the bit CMD_CFG_ERROR anymore.
Fixes: 18f92bc02f17 ("mmc: meson-gx: make sure the descriptor is stopped on errors") Signed-off-by: Rong Chen rong.chen@amlogic.com Reviewed-by: Jerome Brunet jbrunet@baylibre.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20231026073156.2868310-1-rong.chen@amlogic.com Signed-off-by: Ulf Hansson ulf.hansson@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/mmc/host/meson-gx-mmc.c | 1 - 1 file changed, 1 deletion(-)
--- a/drivers/mmc/host/meson-gx-mmc.c +++ b/drivers/mmc/host/meson-gx-mmc.c @@ -801,7 +801,6 @@ static void meson_mmc_start_cmd(struct m
cmd_cfg |= FIELD_PREP(CMD_CFG_CMD_INDEX_MASK, cmd->opcode); cmd_cfg |= CMD_CFG_OWNER; /* owned by CPU */ - cmd_cfg |= CMD_CFG_ERROR; /* stop in case of error */
meson_mmc_set_response_bits(cmd, &cmd_cfg);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Herve Codina herve.codina@bootlin.com
commit 5e7afb2eb7b2a7c81e9f608cbdf74a07606fd1b5 upstream.
irq_remove_generic_chip() calculates the Linux interrupt number for removing the handler and interrupt chip based on gc::irq_base as a linear function of the bit positions of set bits in the @msk argument.
When the generic chip is present in an irq domain, i.e. created with a call to irq_alloc_domain_generic_chips(), gc::irq_base contains not the base Linux interrupt number. It contains the base hardware interrupt for this chip. It is set to 0 for the first chip in the domain, 0 + N for the next chip, where $N is the number of hardware interrupts per chip.
That means the Linux interrupt number cannot be calculated based on gc::irq_base for irqdomain based chips without a domain map lookup, which is currently missing.
Rework the code to take the irqdomain case into account and calculate the Linux interrupt number by a irqdomain lookup of the domain specific hardware interrupt number.
[ tglx: Massage changelog. Reshuffle the logic and add a proper comment. ]
Fixes: cfefd21e693d ("genirq: Add chip suspend and resume callbacks") Signed-off-by: Herve Codina herve.codina@bootlin.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20231024150335.322282-1-herve.codina@bootlin.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/irq/generic-chip.c | 25 +++++++++++++++++++------ 1 file changed, 19 insertions(+), 6 deletions(-)
--- a/kernel/irq/generic-chip.c +++ b/kernel/irq/generic-chip.c @@ -544,21 +544,34 @@ EXPORT_SYMBOL_GPL(irq_setup_alt_chip); void irq_remove_generic_chip(struct irq_chip_generic *gc, u32 msk, unsigned int clr, unsigned int set) { - unsigned int i = gc->irq_base; + unsigned int i, virq;
raw_spin_lock(&gc_lock); list_del(&gc->list); raw_spin_unlock(&gc_lock);
- for (; msk; msk >>= 1, i++) { + for (i = 0; msk; msk >>= 1, i++) { if (!(msk & 0x01)) continue;
+ /* + * Interrupt domain based chips store the base hardware + * interrupt number in gc::irq_base. Otherwise gc::irq_base + * contains the base Linux interrupt number. + */ + if (gc->domain) { + virq = irq_find_mapping(gc->domain, gc->irq_base + i); + if (!virq) + continue; + } else { + virq = gc->irq_base + i; + } + /* Remove handler first. That will mask the irq line */ - irq_set_handler(i, NULL); - irq_set_chip(i, &no_irq_chip); - irq_set_chip_data(i, NULL); - irq_modify_status(i, clr, set); + irq_set_handler(virq, NULL); + irq_set_chip(virq, &no_irq_chip); + irq_set_chip_data(virq, NULL); + irq_modify_status(virq, clr, set); } } EXPORT_SYMBOL_GPL(irq_remove_generic_chip);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Hao Jia jiahao.os@bytedance.com
commit 5ebde09d91707a4a9bec1e3d213e3c12ffde348f upstream.
Igor Raits and Bagas Sanjaya report a RQCF_ACT_SKIP leak warning.
This warning may be triggered in the following situations:
CPU0 CPU1
__schedule() *rq->clock_update_flags <<= 1;* unregister_fair_sched_group() pick_next_task_fair+0x4a/0x410 destroy_cfs_bandwidth() newidle_balance+0x115/0x3e0 for_each_possible_cpu(i) *i=0* rq_unpin_lock(this_rq, rf) __cfsb_csd_unthrottle() raw_spin_rq_unlock(this_rq) rq_lock(*CPU0_rq*, &rf) rq_clock_start_loop_update() rq->clock_update_flags & RQCF_ACT_SKIP <-- raw_spin_rq_lock(this_rq)
The purpose of RQCF_ACT_SKIP is to skip the update rq clock, but the update is very early in __schedule(), but we clear RQCF_*_SKIP very late, causing it to span that gap above and triggering this warning.
In __schedule() we can clear the RQCF_*_SKIP flag immediately after update_rq_clock() to avoid this RQCF_ACT_SKIP leak warning. And set rq->clock_update_flags to RQCF_UPDATED to avoid rq->clock_update_flags < RQCF_ACT_SKIP warning that may be triggered later.
Fixes: ebb83d84e49b ("sched/core: Avoid multiple calling update_rq_clock() in __cfsb_csd_unthrottle()") Closes: https://lore.kernel.org/all/20230913082424.73252-1-jiahao.os@bytedance.com Reported-by: Igor Raits igor.raits@gmail.com Reported-by: Bagas Sanjaya bagasdotme@gmail.com Suggested-by: Peter Zijlstra (Intel) peterz@infradead.org Signed-off-by: Hao Jia jiahao.os@bytedance.com Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Cc: stable@vger.kernel.org Link: https://lore.kernel.org/all/a5dd536d-041a-2ce9-f4b7-64d8d85c86dc@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/sched/core.c | 5 +---- 1 file changed, 1 insertion(+), 4 deletions(-)
--- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -5377,8 +5377,6 @@ context_switch(struct rq *rq, struct tas /* switch_mm_cid() requires the memory barriers above. */ switch_mm_cid(rq, prev, next);
- rq->clock_update_flags &= ~(RQCF_ACT_SKIP|RQCF_REQ_SKIP); - prepare_lock_switch(rq, next, rf);
/* Here we just switch the register state and the stack. */ @@ -6634,6 +6632,7 @@ static void __sched notrace __schedule(u /* Promote REQ to ACT */ rq->clock_update_flags <<= 1; update_rq_clock(rq); + rq->clock_update_flags = RQCF_UPDATED;
switch_count = &prev->nivcsw;
@@ -6713,8 +6712,6 @@ static void __sched notrace __schedule(u /* Also unlocks the rq: */ rq = context_switch(rq, prev, next, &rf); } else { - rq->clock_update_flags &= ~(RQCF_ACT_SKIP|RQCF_REQ_SKIP); - rq_unpin_lock(rq, &rf); __balance_callbacks(rq); raw_spin_rq_unlock_irq(rq);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sumit Garg sumit.garg@linaro.org
commit c745cd1718b7825d69315fe7127e2e289e617598 upstream.
The OP-TEE driver using the old SMC based ABI permits overlapping shared buffers, but with the new FF-A based ABI each physical page may only be registered once.
As the key and blob buffer are allocated adjancently, there is no need for redundant register shared memory invocation. Also, it is incompatibile with FF-A based ABI limitation. So refactor register shared memory implementation to use only single invocation to register both key and blob buffers.
[jarkko: Added cc to stable.] Cc: stable@vger.kernel.org # v5.16+ Fixes: 4615e5a34b95 ("optee: add FF-A support") Reported-by: Jens Wiklander jens.wiklander@linaro.org Signed-off-by: Sumit Garg sumit.garg@linaro.org Tested-by: Jens Wiklander jens.wiklander@linaro.org Reviewed-by: Jens Wiklander jens.wiklander@linaro.org Signed-off-by: Jarkko Sakkinen jarkko@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- security/keys/trusted-keys/trusted_tee.c | 64 +++++++++---------------------- 1 file changed, 20 insertions(+), 44 deletions(-)
--- a/security/keys/trusted-keys/trusted_tee.c +++ b/security/keys/trusted-keys/trusted_tee.c @@ -65,24 +65,16 @@ static int trusted_tee_seal(struct trust int ret; struct tee_ioctl_invoke_arg inv_arg; struct tee_param param[4]; - struct tee_shm *reg_shm_in = NULL, *reg_shm_out = NULL; + struct tee_shm *reg_shm = NULL;
memset(&inv_arg, 0, sizeof(inv_arg)); memset(¶m, 0, sizeof(param));
- reg_shm_in = tee_shm_register_kernel_buf(pvt_data.ctx, p->key, - p->key_len); - if (IS_ERR(reg_shm_in)) { - dev_err(pvt_data.dev, "key shm register failed\n"); - return PTR_ERR(reg_shm_in); - } - - reg_shm_out = tee_shm_register_kernel_buf(pvt_data.ctx, p->blob, - sizeof(p->blob)); - if (IS_ERR(reg_shm_out)) { - dev_err(pvt_data.dev, "blob shm register failed\n"); - ret = PTR_ERR(reg_shm_out); - goto out; + reg_shm = tee_shm_register_kernel_buf(pvt_data.ctx, p->key, + sizeof(p->key) + sizeof(p->blob)); + if (IS_ERR(reg_shm)) { + dev_err(pvt_data.dev, "shm register failed\n"); + return PTR_ERR(reg_shm); }
inv_arg.func = TA_CMD_SEAL; @@ -90,13 +82,13 @@ static int trusted_tee_seal(struct trust inv_arg.num_params = 4;
param[0].attr = TEE_IOCTL_PARAM_ATTR_TYPE_MEMREF_INPUT; - param[0].u.memref.shm = reg_shm_in; + param[0].u.memref.shm = reg_shm; param[0].u.memref.size = p->key_len; param[0].u.memref.shm_offs = 0; param[1].attr = TEE_IOCTL_PARAM_ATTR_TYPE_MEMREF_OUTPUT; - param[1].u.memref.shm = reg_shm_out; + param[1].u.memref.shm = reg_shm; param[1].u.memref.size = sizeof(p->blob); - param[1].u.memref.shm_offs = 0; + param[1].u.memref.shm_offs = sizeof(p->key);
ret = tee_client_invoke_func(pvt_data.ctx, &inv_arg, param); if ((ret < 0) || (inv_arg.ret != 0)) { @@ -107,11 +99,7 @@ static int trusted_tee_seal(struct trust p->blob_len = param[1].u.memref.size; }
-out: - if (reg_shm_out) - tee_shm_free(reg_shm_out); - if (reg_shm_in) - tee_shm_free(reg_shm_in); + tee_shm_free(reg_shm);
return ret; } @@ -124,24 +112,16 @@ static int trusted_tee_unseal(struct tru int ret; struct tee_ioctl_invoke_arg inv_arg; struct tee_param param[4]; - struct tee_shm *reg_shm_in = NULL, *reg_shm_out = NULL; + struct tee_shm *reg_shm = NULL;
memset(&inv_arg, 0, sizeof(inv_arg)); memset(¶m, 0, sizeof(param));
- reg_shm_in = tee_shm_register_kernel_buf(pvt_data.ctx, p->blob, - p->blob_len); - if (IS_ERR(reg_shm_in)) { - dev_err(pvt_data.dev, "blob shm register failed\n"); - return PTR_ERR(reg_shm_in); - } - - reg_shm_out = tee_shm_register_kernel_buf(pvt_data.ctx, p->key, - sizeof(p->key)); - if (IS_ERR(reg_shm_out)) { - dev_err(pvt_data.dev, "key shm register failed\n"); - ret = PTR_ERR(reg_shm_out); - goto out; + reg_shm = tee_shm_register_kernel_buf(pvt_data.ctx, p->key, + sizeof(p->key) + sizeof(p->blob)); + if (IS_ERR(reg_shm)) { + dev_err(pvt_data.dev, "shm register failed\n"); + return PTR_ERR(reg_shm); }
inv_arg.func = TA_CMD_UNSEAL; @@ -149,11 +129,11 @@ static int trusted_tee_unseal(struct tru inv_arg.num_params = 4;
param[0].attr = TEE_IOCTL_PARAM_ATTR_TYPE_MEMREF_INPUT; - param[0].u.memref.shm = reg_shm_in; + param[0].u.memref.shm = reg_shm; param[0].u.memref.size = p->blob_len; - param[0].u.memref.shm_offs = 0; + param[0].u.memref.shm_offs = sizeof(p->key); param[1].attr = TEE_IOCTL_PARAM_ATTR_TYPE_MEMREF_OUTPUT; - param[1].u.memref.shm = reg_shm_out; + param[1].u.memref.shm = reg_shm; param[1].u.memref.size = sizeof(p->key); param[1].u.memref.shm_offs = 0;
@@ -166,11 +146,7 @@ static int trusted_tee_unseal(struct tru p->key_len = param[1].u.memref.size; }
-out: - if (reg_shm_out) - tee_shm_free(reg_shm_out); - if (reg_shm_in) - tee_shm_free(reg_shm_in); + tee_shm_free(reg_shm);
return ret; }
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jarkko Sakkinen jarkko@kernel.org
commit 31de287345f41bbfaec36a5c8cbdba035cf76442 upstream.
Do bind neither static calls nor trusted_key_exit() before a successful init, in order to maintain a consistent state. In addition, depart the init_trusted() in the case of a real error (i.e. getting back something else than -ENODEV).
Reported-by: Linus Torvalds torvalds@linux-foundation.org Closes: https://lore.kernel.org/linux-integrity/CAHk-=whOPoLaWM8S8GgoOPT7a2+nMH5h3TL... Cc: stable@vger.kernel.org # v5.13+ Fixes: 5d0682be3189 ("KEYS: trusted: Add generic trusted keys framework") Signed-off-by: Jarkko Sakkinen jarkko@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- security/keys/trusted-keys/trusted_core.c | 20 ++++++++++---------- 1 file changed, 10 insertions(+), 10 deletions(-)
--- a/security/keys/trusted-keys/trusted_core.c +++ b/security/keys/trusted-keys/trusted_core.c @@ -358,17 +358,17 @@ static int __init init_trusted(void) if (!get_random) get_random = kernel_get_random;
- static_call_update(trusted_key_seal, - trusted_key_sources[i].ops->seal); - static_call_update(trusted_key_unseal, - trusted_key_sources[i].ops->unseal); - static_call_update(trusted_key_get_random, - get_random); - trusted_key_exit = trusted_key_sources[i].ops->exit; - migratable = trusted_key_sources[i].ops->migratable; - ret = trusted_key_sources[i].ops->init(); - if (!ret) + if (!ret) { + static_call_update(trusted_key_seal, trusted_key_sources[i].ops->seal); + static_call_update(trusted_key_unseal, trusted_key_sources[i].ops->unseal); + static_call_update(trusted_key_get_random, get_random); + + trusted_key_exit = trusted_key_sources[i].ops->exit; + migratable = trusted_key_sources[i].ops->migratable; + } + + if (!ret || ret != -ENODEV) break; }
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Uwe Kleine-König u.kleine-koenig@pengutronix.de
commit 200bddbb3f5202bbce96444fdc416305de14f547 upstream.
With CONFIG_PCIE_KEYSTONE=y and ks_pcie_remove() marked with __exit, the function is discarded from the driver. In this case a bound device can still get unbound, e.g via sysfs. Then no cleanup code is run resulting in resource leaks or worse.
The right thing to do is do always have the remove callback available. Note that this driver cannot be compiled as a module, so ks_pcie_remove() was always discarded before this change and modpost couldn't warn about this issue. Furthermore the __ref annotation also prevents a warning.
Fixes: 0c4ffcfe1fbc ("PCI: keystone: Add TI Keystone PCIe driver") Link: https://lore.kernel.org/r/20231001170254.2506508-4-u.kleine-koenig@pengutron... Signed-off-by: Uwe Kleine-König u.kleine-koenig@pengutronix.de Signed-off-by: Bjorn Helgaas bhelgaas@google.com Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pci/controller/dwc/pci-keystone.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/pci/controller/dwc/pci-keystone.c +++ b/drivers/pci/controller/dwc/pci-keystone.c @@ -1303,7 +1303,7 @@ err_link: return ret; }
-static int __exit ks_pcie_remove(struct platform_device *pdev) +static int ks_pcie_remove(struct platform_device *pdev) { struct keystone_pcie *ks_pcie = platform_get_drvdata(pdev); struct device_link **link = ks_pcie->link; @@ -1321,7 +1321,7 @@ static int __exit ks_pcie_remove(struct
static struct platform_driver ks_pcie_driver __refdata = { .probe = ks_pcie_probe, - .remove = __exit_p(ks_pcie_remove), + .remove = ks_pcie_remove, .driver = { .name = "keystone-pcie", .of_match_table = ks_pcie_of_match,
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Uwe Kleine-König u.kleine-koenig@pengutronix.de
commit 7994db905c0fd692cf04c527585f08a91b560144 upstream.
The __init annotation makes the ks_pcie_probe() function disappear after booting completes. However a device can also be bound later. In that case, we try to call ks_pcie_probe(), but the backing memory is likely already overwritten.
The right thing to do is do always have the probe callback available. Note that the (wrong) __refdata annotation prevented this issue to be noticed by modpost.
Fixes: 0c4ffcfe1fbc ("PCI: keystone: Add TI Keystone PCIe driver") Link: https://lore.kernel.org/r/20231001170254.2506508-5-u.kleine-koenig@pengutron... Signed-off-by: Uwe Kleine-König u.kleine-koenig@pengutronix.de Signed-off-by: Bjorn Helgaas bhelgaas@google.com Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pci/controller/dwc/pci-keystone.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/pci/controller/dwc/pci-keystone.c +++ b/drivers/pci/controller/dwc/pci-keystone.c @@ -1101,7 +1101,7 @@ static const struct of_device_id ks_pcie { }, };
-static int __init ks_pcie_probe(struct platform_device *pdev) +static int ks_pcie_probe(struct platform_device *pdev) { const struct dw_pcie_host_ops *host_ops; const struct dw_pcie_ep_ops *ep_ops; @@ -1319,7 +1319,7 @@ static int ks_pcie_remove(struct platfor return 0; }
-static struct platform_driver ks_pcie_driver __refdata = { +static struct platform_driver ks_pcie_driver = { .probe = ks_pcie_probe, .remove = ks_pcie_remove, .driver = {
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nathan Chancellor nathan@kernel.org
commit 146a15b873353f8ac28dc281c139ff611a3c4848 upstream.
Prior to LLVM 15.0.0, LLVM's integrated assembler would incorrectly byte-swap NOP when compiling for big-endian, and the resulting series of bytes happened to match the encoding of FNMADD S21, S30, S0, S0.
This went unnoticed until commit:
34f66c4c4d5518c1 ("arm64: Use a positive cpucap for FP/SIMD")
Prior to that commit, the kernel would always enable the use of FPSIMD early in boot when __cpu_setup() initialized CPACR_EL1, and so usage of FNMADD within the kernel was not detected, but could result in the corruption of user or kernel FPSIMD state.
After that commit, the instructions happen to trap during boot prior to FPSIMD being detected and enabled, e.g.
| Unhandled 64-bit el1h sync exception on CPU0, ESR 0x000000001fe00000 -- ASIMD | CPU: 0 PID: 0 Comm: swapper Not tainted 6.6.0-rc3-00013-g34f66c4c4d55 #1 | Hardware name: linux,dummy-virt (DT) | pstate: 400000c9 (nZcv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--) | pc : __pi_strcmp+0x1c/0x150 | lr : populate_properties+0xe4/0x254 | sp : ffffd014173d3ad0 | x29: ffffd014173d3af0 x28: fffffbfffddffcb8 x27: 0000000000000000 | x26: 0000000000000058 x25: fffffbfffddfe054 x24: 0000000000000008 | x23: fffffbfffddfe000 x22: fffffbfffddfe000 x21: fffffbfffddfe044 | x20: ffffd014173d3b70 x19: 0000000000000001 x18: 0000000000000005 | x17: 0000000000000010 x16: 0000000000000000 x15: 00000000413e7000 | x14: 0000000000000000 x13: 0000000000001bcc x12: 0000000000000000 | x11: 00000000d00dfeed x10: ffffd414193f2cd0 x9 : 0000000000000000 | x8 : 0101010101010101 x7 : ffffffffffffffc0 x6 : 0000000000000000 | x5 : 0000000000000000 x4 : 0101010101010101 x3 : 000000000000002a | x2 : 0000000000000001 x1 : ffffd014171f2988 x0 : fffffbfffddffcb8 | Kernel panic - not syncing: Unhandled exception | CPU: 0 PID: 0 Comm: swapper Not tainted 6.6.0-rc3-00013-g34f66c4c4d55 #1 | Hardware name: linux,dummy-virt (DT) | Call trace: | dump_backtrace+0xec/0x108 | show_stack+0x18/0x2c | dump_stack_lvl+0x50/0x68 | dump_stack+0x18/0x24 | panic+0x13c/0x340 | el1t_64_irq_handler+0x0/0x1c | el1_abort+0x0/0x5c | el1h_64_sync+0x64/0x68 | __pi_strcmp+0x1c/0x150 | unflatten_dt_nodes+0x1e8/0x2d8 | __unflatten_device_tree+0x5c/0x15c | unflatten_device_tree+0x38/0x50 | setup_arch+0x164/0x1e0 | start_kernel+0x64/0x38c | __primary_switched+0xbc/0xc4
Restrict CONFIG_CPU_BIG_ENDIAN to a known good assembler, which is either GNU as or LLVM's IAS 15.0.0 and newer, which contains the linked commit.
Closes: https://github.com/ClangBuiltLinux/linux/issues/1948 Link: https://github.com/llvm/llvm-project/commit/1379b150991f70a5782e9a143c2ba530... Signed-off-by: Nathan Chancellor nathan@kernel.org Cc: stable@vger.kernel.org Acked-by: Mark Rutland mark.rutland@arm.com Link: https://lore.kernel.org/r/20231025-disable-arm64-be-ias-b4-llvm-15-v1-1-b252... Signed-off-by: Catalin Marinas catalin.marinas@arm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm64/Kconfig | 2 ++ 1 file changed, 2 insertions(+)
--- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -1369,6 +1369,8 @@ choice config CPU_BIG_ENDIAN bool "Build big-endian kernel" depends on !LD_IS_LLD || LLD_VERSION >= 130000 + # https://github.com/llvm/llvm-project/commit/1379b150991f70a5782e9a143c2ba530... + depends on AS_IS_GNU || AS_VERSION >= 150000 help Say Y if you plan on running a kernel with a big-endian userspace.
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Maria Yu quic_aiquny@quicinc.com
commit d35686444fc80950c731e33a2f6ad4a55822be9b upstream.
The counting of module PLTs has been broken when CONFIG_RANDOMIZE_BASE=n since commit:
3e35d303ab7d22c4 ("arm64: module: rework module VA range selection")
Prior to that commit, when CONFIG_RANDOMIZE_BASE=n, the kernel image and all modules were placed within a 128M region, and no PLTs were necessary for B or BL. Hence count_plts() and partition_branch_plt_relas() skipped handling B and BL when CONFIG_RANDOMIZE_BASE=n.
After that commit, modules can be placed anywhere within a 2G window regardless of CONFIG_RANDOMIZE_BASE, and hence PLTs may be necessary for B and BL even when CONFIG_RANDOMIZE_BASE=n. Unfortunately that commit failed to update count_plts() and partition_branch_plt_relas() accordingly.
Due to this, module_emit_plt_entry() may fail if an insufficient number of PLT entries have been reserved, resulting in modules failing to load with -ENOEXEC.
Fix this by counting PLTs regardless of CONFIG_RANDOMIZE_BASE in count_plts() and partition_branch_plt_relas().
Fixes: 3e35d303ab7d ("arm64: module: rework module VA range selection") Signed-off-by: Maria Yu quic_aiquny@quicinc.com Cc: stable@vger.kernel.org # 6.5.x Acked-by: Ard Biesheuvel ardb@kernel.org Fixes: 3e35d303ab7d ("arm64: module: rework module VA range selection") Reviewed-by: Mark Rutland mark.rutland@arm.com Link: https://lore.kernel.org/r/20231024010954.6768-1-quic_aiquny@quicinc.com Signed-off-by: Catalin Marinas catalin.marinas@arm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm64/kernel/module-plts.c | 6 ------ 1 file changed, 6 deletions(-)
diff --git a/arch/arm64/kernel/module-plts.c b/arch/arm64/kernel/module-plts.c index bd69a4e7cd60..79200f21e123 100644 --- a/arch/arm64/kernel/module-plts.c +++ b/arch/arm64/kernel/module-plts.c @@ -167,9 +167,6 @@ static unsigned int count_plts(Elf64_Sym *syms, Elf64_Rela *rela, int num, switch (ELF64_R_TYPE(rela[i].r_info)) { case R_AARCH64_JUMP26: case R_AARCH64_CALL26: - if (!IS_ENABLED(CONFIG_RANDOMIZE_BASE)) - break; - /* * We only have to consider branch targets that resolve * to symbols that are defined in a different section. @@ -269,9 +266,6 @@ static int partition_branch_plt_relas(Elf64_Sym *syms, Elf64_Rela *rela, { int i = 0, j = numrels - 1;
- if (!IS_ENABLED(CONFIG_RANDOMIZE_BASE)) - return 0; - while (i < j) { if (branch_rela_needs_plt(syms, &rela[i], dstidx)) i++;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Helge Deller deller@gmx.de
commit 86bb854d134f4429feb35d2e05f55c6e036770d2 upstream.
The PDIR table of the System Bus Adapter (SBA) I/O MMU uses 64-bit little-endian pointers.
Signed-off-by: Helge Deller deller@gmx.de Cc: stable@vger.kernel.org # v6.4+ Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/char/agp/parisc-agp.c | 16 +++++++++------- 1 file changed, 9 insertions(+), 7 deletions(-)
diff --git a/drivers/char/agp/parisc-agp.c b/drivers/char/agp/parisc-agp.c index c6f181702b9a..edbc4d338117 100644 --- a/drivers/char/agp/parisc-agp.c +++ b/drivers/char/agp/parisc-agp.c @@ -38,7 +38,7 @@ static struct _parisc_agp_info {
int lba_cap_offset;
- u64 *gatt; + __le64 *gatt; u64 gatt_entries;
u64 gart_base; @@ -104,7 +104,7 @@ parisc_agp_create_gatt_table(struct agp_bridge_data *bridge) int i;
for (i = 0; i < info->gatt_entries; i++) { - info->gatt[i] = (unsigned long)agp_bridge->scratch_page; + info->gatt[i] = cpu_to_le64(agp_bridge->scratch_page); }
return 0; @@ -158,9 +158,9 @@ parisc_agp_insert_memory(struct agp_memory *mem, off_t pg_start, int type) for (k = 0; k < info->io_pages_per_kpage; k++, j++, paddr += info->io_page_size) { - info->gatt[j] = + info->gatt[j] = cpu_to_le64( parisc_agp_mask_memory(agp_bridge, - paddr, type); + paddr, type)); asm_io_fdc(&info->gatt[j]); } } @@ -184,7 +184,7 @@ parisc_agp_remove_memory(struct agp_memory *mem, off_t pg_start, int type) io_pg_start = info->io_pages_per_kpage * pg_start; io_pg_count = info->io_pages_per_kpage * mem->page_count; for (i = io_pg_start; i < io_pg_count + io_pg_start; i++) { - info->gatt[i] = agp_bridge->scratch_page; + info->gatt[i] = cpu_to_le64(agp_bridge->scratch_page); }
agp_bridge->driver->tlb_flush(mem); @@ -204,7 +204,8 @@ parisc_agp_mask_memory(struct agp_bridge_data *bridge, dma_addr_t addr, pa |= (ci >> PAGE_SHIFT) & 0xff;/* move CI (8 bits) into lowest byte */ pa |= SBA_PDIR_VALID_BIT; /* set "valid" bit */
- return cpu_to_le64(pa); + /* return native (big-endian) PDIR entry */ + return pa; }
static void @@ -251,7 +252,8 @@ static int __init agp_ioc_init(void __iomem *ioc_regs) { struct _parisc_agp_info *info = &parisc_agp_info; - u64 iova_base, *io_pdir, io_tlb_ps; + u64 iova_base, io_tlb_ps; + __le64 *io_pdir; int io_tlb_shift;
printk(KERN_INFO DRVPFX "IO PDIR shared with sba_iommu\n");
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Helge Deller deller@gmx.de
commit 6240553b52c475d9fc9674de0521b77e692f3764 upstream.
PDC2.0 specifies the additional PSW-bit field.
Signed-off-by: Helge Deller deller@gmx.de Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/parisc/include/uapi/asm/pdc.h | 1 + 1 file changed, 1 insertion(+)
--- a/arch/parisc/include/uapi/asm/pdc.h +++ b/arch/parisc/include/uapi/asm/pdc.h @@ -472,6 +472,7 @@ struct pdc_model { /* for PDC_MODEL */ unsigned long arch_rev; unsigned long pot_key; unsigned long curr_key; + unsigned long width; /* default of PSW_W bit (1=enabled) */ };
struct pdc_cache_cf { /* for PDC_CACHE (I/D-caches) */
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Helge Deller deller@gmx.de
commit d0c219472980d15f5cbc5c8aec736848bda3f235 upstream.
Signed-off-by: Helge Deller deller@gmx.de Cc: stable@vger.kernel.org # v6.0+ Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/parisc/power.c | 16 +++++++++++++++- 1 file changed, 15 insertions(+), 1 deletion(-)
--- a/drivers/parisc/power.c +++ b/drivers/parisc/power.c @@ -197,6 +197,14 @@ static struct notifier_block parisc_pani .priority = INT_MAX, };
+/* qemu soft power-off function */ +static int qemu_power_off(struct sys_off_data *data) +{ + /* this turns the system off via SeaBIOS */ + *(int *)data->cb_data = 0; + pdc_soft_power_button(1); + return NOTIFY_DONE; +}
static int __init power_init(void) { @@ -226,7 +234,13 @@ static int __init power_init(void) soft_power_reg); }
- power_task = kthread_run(kpowerswd, (void*)soft_power_reg, KTHREAD_NAME); + power_task = NULL; + if (running_on_qemu && soft_power_reg) + register_sys_off_handler(SYS_OFF_MODE_POWER_OFF, SYS_OFF_PRIO_DEFAULT, + qemu_power_off, (void *)soft_power_reg); + else + power_task = kthread_run(kpowerswd, (void*)soft_power_reg, + KTHREAD_NAME); if (IS_ERR(power_task)) { printk(KERN_ERR DRIVER_NAME ": thread creation failed. Driver not loaded.\n"); pdc_soft_power_button(0);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Christian Marangi ansuelsmth@gmail.com
commit ea167a7fc2426f7685c3735e104921c1a20a6d3f upstream.
Commit 3c0897c180c6 ("cpufreq: Use scnprintf() for avoiding potential buffer overflow") switched from snprintf to the more secure scnprintf but never updated the exit condition for PAGE_SIZE.
As the commit say and as scnprintf document, what scnprintf returns what is actually written not counting the '\0' end char. This results in the case of len exceeding the size, len set to PAGE_SIZE - 1, as it can be written at max PAGE_SIZE - 1 (as '\0' is not counted)
Because of len is never set to PAGE_SIZE, the function never break early, never prints the warning and never return -EFBIG.
Fix this by changing the condition to PAGE_SIZE - 1 to correctly trigger the error.
Cc: 5.10+ stable@vger.kernel.org # 5.10+ Fixes: 3c0897c180c6 ("cpufreq: Use scnprintf() for avoiding potential buffer overflow") Signed-off-by: Christian Marangi ansuelsmth@gmail.com [ rjw: Subject and changelog edits ] Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/cpufreq/cpufreq_stats.c | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-)
--- a/drivers/cpufreq/cpufreq_stats.c +++ b/drivers/cpufreq/cpufreq_stats.c @@ -131,23 +131,23 @@ static ssize_t show_trans_table(struct c len += sysfs_emit_at(buf, len, " From : To\n"); len += sysfs_emit_at(buf, len, " : "); for (i = 0; i < stats->state_num; i++) { - if (len >= PAGE_SIZE) + if (len >= PAGE_SIZE - 1) break; len += sysfs_emit_at(buf, len, "%9u ", stats->freq_table[i]); } - if (len >= PAGE_SIZE) - return PAGE_SIZE; + if (len >= PAGE_SIZE - 1) + return PAGE_SIZE - 1;
len += sysfs_emit_at(buf, len, "\n");
for (i = 0; i < stats->state_num; i++) { - if (len >= PAGE_SIZE) + if (len >= PAGE_SIZE - 1) break;
len += sysfs_emit_at(buf, len, "%9u: ", stats->freq_table[i]);
for (j = 0; j < stats->state_num; j++) { - if (len >= PAGE_SIZE) + if (len >= PAGE_SIZE - 1) break;
if (pending) @@ -157,12 +157,12 @@ static ssize_t show_trans_table(struct c
len += sysfs_emit_at(buf, len, "%9u ", count); } - if (len >= PAGE_SIZE) + if (len >= PAGE_SIZE - 1) break; len += sysfs_emit_at(buf, len, "\n"); }
- if (len >= PAGE_SIZE) { + if (len >= PAGE_SIZE - 1) { pr_warn_once("cpufreq transition table exceeds PAGE_SIZE. Disabling\n"); return -EFBIG; }
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ville Syrjälä ville.syrjala@linux.intel.com
commit a60ec4485f1c72dfece365cf95e6de82bdd74300 upstream.
Before the refactoring the pr_warn() only triggered when someone explicitly tried to write to a BIOS locked limit. After the refactoring the warning is also triggering during system resume. The user can't do anything about this so printing scary warnings doesn't make sense
Keep the printk but make it pr_debug() instead of pr_warn() to make it clear it's not a serious issue.
Fixes: 9050a9cd5e4c ("powercap: intel_rapl: Cleanup Power Limits support") Signed-off-by: Ville Syrjälä ville.syrjala@linux.intel.com Cc: 6.5+ stable@vger.kernel.org # 6.5+ Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/powercap/intel_rapl_common.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/powercap/intel_rapl_common.c b/drivers/powercap/intel_rapl_common.c index 40a2cc649c79..2feed036c1cd 100644 --- a/drivers/powercap/intel_rapl_common.c +++ b/drivers/powercap/intel_rapl_common.c @@ -892,7 +892,7 @@ static int rapl_write_pl_data(struct rapl_domain *rd, int pl, return -EINVAL;
if (rd->rpl[pl].locked) { - pr_warn("%s:%s:%s locked by BIOS\n", rd->rp->name, rd->name, pl_names[pl]); + pr_debug("%s:%s:%s locked by BIOS\n", rd->rp->name, rd->name, pl_names[pl]); return -EACCES; }
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Gustavo A. R. Silva gustavoars@kernel.org
commit d761bb01c85b22d5b44abe283eb89019693f6595 upstream.
`struct clk_hw_onecell_data` is a flexible structure, which means that it contains flexible-array member at the bottom, in this case array `hws`:
include/linux/clk-provider.h: 1380 struct clk_hw_onecell_data { 1381 unsigned int num; 1382 struct clk_hw *hws[] __counted_by(num); 1383 };
This could potentially lead to an overwrite of the objects following `clk_data` in `struct stratix10_clock_data`, in this case `void __iomem *base;` at run-time:
drivers/clk/socfpga/stratix10-clk.h: 9 struct stratix10_clock_data { 10 struct clk_hw_onecell_data clk_data; 11 void __iomem *base; 12 };
There are currently three different places where memory is allocated for `struct stratix10_clock_data`, including the flex-array `hws` in `struct clk_hw_onecell_data`:
drivers/clk/socfpga/clk-agilex.c: 469 clk_data = devm_kzalloc(dev, struct_size(clk_data, clk_data.hws, 470 num_clks), GFP_KERNEL);
drivers/clk/socfpga/clk-agilex.c: 509 clk_data = devm_kzalloc(dev, struct_size(clk_data, clk_data.hws, 510 num_clks), GFP_KERNEL);
drivers/clk/socfpga/clk-s10.c: 400 clk_data = devm_kzalloc(dev, struct_size(clk_data, clk_data.hws, 401 num_clks), GFP_KERNEL);
I'll use just one of them to describe the issue. See below.
Notice that a total of 440 bytes are allocated for flexible-array member `hws` at line 469:
include/dt-bindings/clock/agilex-clock.h: 70 #define AGILEX_NUM_CLKS 55
drivers/clk/socfpga/clk-agilex.c: 459 struct stratix10_clock_data *clk_data; 460 void __iomem *base; ... 466 467 num_clks = AGILEX_NUM_CLKS; 468 469 clk_data = devm_kzalloc(dev, struct_size(clk_data, clk_data.hws, 470 num_clks), GFP_KERNEL);
`struct_size(clk_data, clk_data.hws, num_clks)` above translates to sizeof(struct stratix10_clock_data) + sizeof(struct clk_hw *) * 55 == 16 + 8 * 55 == 16 + 440 ^^^ | allocated bytes for flex-array `hws`
474 for (i = 0; i < num_clks; i++) 475 clk_data->clk_data.hws[i] = ERR_PTR(-ENOENT); 476 477 clk_data->base = base;
and then some data is written into both `hws` and `base` objects.
Fix this by placing the declaration of object `clk_data` at the end of `struct stratix10_clock_data`. Also, add a comment to make it clear that this object must always be last in the structure.
-Wflex-array-member-not-at-end is coming in GCC-14, and we are getting ready to enable it globally.
Fixes: ba7e258425ac ("clk: socfpga: Convert to s10/agilex/n5x to use clk_hw") Cc: stable@vger.kernel.org Reviewed-by: Kees Cook keescook@chromium.org Signed-off-by: Gustavo A. R. Silva gustavoars@kernel.org Link: https://lore.kernel.org/r/1da736106d8e0806aeafa6e471a13ced490eae22.169811781... Signed-off-by: Stephen Boyd sboyd@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/clk/socfpga/stratix10-clk.h | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
--- a/drivers/clk/socfpga/stratix10-clk.h +++ b/drivers/clk/socfpga/stratix10-clk.h @@ -7,8 +7,10 @@ #define __STRATIX10_CLK_H
struct stratix10_clock_data { - struct clk_hw_onecell_data clk_data; void __iomem *base; + + /* Must be last */ + struct clk_hw_onecell_data clk_data; };
struct stratix10_pll_clock {
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Gustavo A. R. Silva gustavoars@kernel.org
commit 5ad1e217a2b23aa046b241183bd9452d259d70d0 upstream.
`struct clk_hw_onecell_data` is a flexible structure, which means that it contains flexible-array member at the bottom, in this case array `hws`:
include/linux/clk-provider.h: 1380 struct clk_hw_onecell_data { 1381 unsigned int num; 1382 struct clk_hw *hws[] __counted_by(num); 1383 };
This could potentially lead to an overwrite of the objects following `clk_data` in `struct visconti_pll_provider`, in this case `struct device_node *node;`, at run-time:
drivers/clk/visconti/pll.h: 16 struct visconti_pll_provider { 17 void __iomem *reg_base; 18 struct clk_hw_onecell_data clk_data; 19 struct device_node *node; 20 };
Notice that a total of 56 bytes are allocated for flexible-array `hws` at line 328. See below:
include/dt-bindings/clock/toshiba,tmpv770x.h: 14 #define TMPV770X_NR_PLL 7
drivers/clk/visconti/pll-tmpv770x.c: 69 ctx = visconti_init_pll(np, reg_base, TMPV770X_NR_PLL);
drivers/clk/visconti/pll.c: 321 struct visconti_pll_provider * __init visconti_init_pll(struct device_node *np, 322 void __iomem *base, 323 unsigned long nr_plls) 324 { 325 struct visconti_pll_provider *ctx; ... 328 ctx = kzalloc(struct_size(ctx, clk_data.hws, nr_plls), GFP_KERNEL);
`struct_size(ctx, clk_data.hws, nr_plls)` above translates to sizeof(struct visconti_pll_provider) + sizeof(struct clk_hw *) * 7 == 24 + 8 * 7 == 24 + 56 ^^^^ | allocated bytes for flex array `hws`
$ pahole -C visconti_pll_provider drivers/clk/visconti/pll.o struct visconti_pll_provider { void * reg_base; /* 0 8 */ struct clk_hw_onecell_data clk_data; /* 8 8 */ struct device_node * node; /* 16 8 */
/* size: 24, cachelines: 1, members: 3 */ /* last cacheline: 24 bytes */ };
And then, after the allocation, some data is written into all members of `struct visconti_pll_provider`:
332 for (i = 0; i < nr_plls; ++i) 333 ctx->clk_data.hws[i] = ERR_PTR(-ENOENT); 334 335 ctx->node = np; 336 ctx->reg_base = base; 337 ctx->clk_data.num = nr_plls;
Fix all these by placing the declaration of object `clk_data` at the end of `struct visconti_pll_provider`. Also, add a comment to make it clear that this object must always be last in the structure, and prevent this bug from being introduced again in the future.
-Wflex-array-member-not-at-end is coming in GCC-14, and we are getting ready to enable it globally.
Fixes: b4cbe606dc36 ("clk: visconti: Add support common clock driver and reset driver") Cc: stable@vger.kernel.org Reviewed-by: Kees Cook keescook@chromium.org Acked-by: Nobuhiro Iwamatsu nobuhiro1.iwamatsu@toshiba.co.jp Signed-off-by: Gustavo A. R. Silva gustavoars@kernel.org Link: https://lore.kernel.org/r/57a831d94ee2b3889b11525d4ad500356f89576f.169749289... Signed-off-by: Stephen Boyd sboyd@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/clk/visconti/pll.h | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
--- a/drivers/clk/visconti/pll.h +++ b/drivers/clk/visconti/pll.h @@ -15,8 +15,10 @@
struct visconti_pll_provider { void __iomem *reg_base; - struct clk_hw_onecell_data clk_data; struct device_node *node; + + /* Must be last */ + struct clk_hw_onecell_data clk_data; };
#define VISCONTI_PLL_RATE(_rate, _dacen, _dsmen, \
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kathiravan Thirumoorthy quic_kathirav@quicinc.com
commit e641a070137dd959932c7c222e000d9d941167a2 upstream.
GPLL, NSS crypto PLL clock rates are fixed and shouldn't be scaled based on the request from dependent clocks. Doing so will result in the unexpected behaviour. So drop the CLK_SET_RATE_PARENT flag from the PLL clocks.
Cc: stable@vger.kernel.org Fixes: b8e7e519625f ("clk: qcom: ipq8074: add remaining PLL’s") Signed-off-by: Kathiravan Thirumoorthy quic_kathirav@quicinc.com Link: https://lore.kernel.org/r/20230913-gpll_cleanup-v2-1-c8ceb1a37680@quicinc.co... Signed-off-by: Bjorn Andersson andersson@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/clk/qcom/gcc-ipq8074.c | 6 ------ 1 file changed, 6 deletions(-)
--- a/drivers/clk/qcom/gcc-ipq8074.c +++ b/drivers/clk/qcom/gcc-ipq8074.c @@ -76,7 +76,6 @@ static struct clk_fixed_factor gpll0_out &gpll0_main.clkr.hw }, .num_parents = 1, .ops = &clk_fixed_factor_ops, - .flags = CLK_SET_RATE_PARENT, }, };
@@ -122,7 +121,6 @@ static struct clk_alpha_pll_postdiv gpll &gpll2_main.clkr.hw }, .num_parents = 1, .ops = &clk_alpha_pll_postdiv_ro_ops, - .flags = CLK_SET_RATE_PARENT, }, };
@@ -155,7 +153,6 @@ static struct clk_alpha_pll_postdiv gpll &gpll4_main.clkr.hw }, .num_parents = 1, .ops = &clk_alpha_pll_postdiv_ro_ops, - .flags = CLK_SET_RATE_PARENT, }, };
@@ -189,7 +186,6 @@ static struct clk_alpha_pll_postdiv gpll &gpll6_main.clkr.hw }, .num_parents = 1, .ops = &clk_alpha_pll_postdiv_ro_ops, - .flags = CLK_SET_RATE_PARENT, }, };
@@ -202,7 +198,6 @@ static struct clk_fixed_factor gpll6_out &gpll6_main.clkr.hw }, .num_parents = 1, .ops = &clk_fixed_factor_ops, - .flags = CLK_SET_RATE_PARENT, }, };
@@ -267,7 +262,6 @@ static struct clk_alpha_pll_postdiv nss_ &nss_crypto_pll_main.clkr.hw }, .num_parents = 1, .ops = &clk_alpha_pll_postdiv_ro_ops, - .flags = CLK_SET_RATE_PARENT, }, };
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kathiravan Thirumoorthy quic_kathirav@quicinc.com
commit 99cd4935cb972d0aafb16838bb2aeadbcaf196ce upstream.
GPLL, NSS crypto PLL clock rates are fixed and shouldn't be scaled based on the request from dependent clocks. Doing so will result in the unexpected behaviour. So drop the CLK_SET_RATE_PARENT flag from the PLL clocks.
Cc: stable@vger.kernel.org Fixes: d9db07f088af ("clk: qcom: Add ipq6018 Global Clock Controller support") Signed-off-by: Kathiravan Thirumoorthy quic_kathirav@quicinc.com Reviewed-by: Konrad Dybcio konrad.dybcio@linaro.org Link: https://lore.kernel.org/r/20230913-gpll_cleanup-v2-2-c8ceb1a37680@quicinc.co... Signed-off-by: Bjorn Andersson andersson@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/clk/qcom/gcc-ipq6018.c | 6 ------ 1 file changed, 6 deletions(-)
--- a/drivers/clk/qcom/gcc-ipq6018.c +++ b/drivers/clk/qcom/gcc-ipq6018.c @@ -73,7 +73,6 @@ static struct clk_fixed_factor gpll0_out &gpll0_main.clkr.hw }, .num_parents = 1, .ops = &clk_fixed_factor_ops, - .flags = CLK_SET_RATE_PARENT, }, };
@@ -87,7 +86,6 @@ static struct clk_alpha_pll_postdiv gpll &gpll0_main.clkr.hw }, .num_parents = 1, .ops = &clk_alpha_pll_postdiv_ro_ops, - .flags = CLK_SET_RATE_PARENT, }, };
@@ -162,7 +160,6 @@ static struct clk_alpha_pll_postdiv gpll &gpll6_main.clkr.hw }, .num_parents = 1, .ops = &clk_alpha_pll_postdiv_ro_ops, - .flags = CLK_SET_RATE_PARENT, }, };
@@ -193,7 +190,6 @@ static struct clk_alpha_pll_postdiv gpll &gpll4_main.clkr.hw }, .num_parents = 1, .ops = &clk_alpha_pll_postdiv_ro_ops, - .flags = CLK_SET_RATE_PARENT, }, };
@@ -244,7 +240,6 @@ static struct clk_alpha_pll_postdiv gpll &gpll2_main.clkr.hw }, .num_parents = 1, .ops = &clk_alpha_pll_postdiv_ro_ops, - .flags = CLK_SET_RATE_PARENT, }, };
@@ -275,7 +270,6 @@ static struct clk_alpha_pll_postdiv nss_ &nss_crypto_pll_main.clkr.hw }, .num_parents = 1, .ops = &clk_alpha_pll_postdiv_ro_ops, - .flags = CLK_SET_RATE_PARENT, }, };
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Marios Makassikis mmakassikis@freebox.fr
commit 807252f028c59b9a3bac4d62ad84761548c10f11 upstream.
Running smb2.rename test from Samba smbtorture suite against a kernel built with lockdep triggers a "possible recursive locking detected" warning.
This is because mnt_want_write() is called twice with no mnt_drop_write() in between: -> ksmbd_vfs_mkdir() -> ksmbd_vfs_kern_path_create() -> kern_path_create() -> filename_create() -> mnt_want_write() -> mnt_want_write()
Fix this by removing the mnt_want_write/mnt_drop_write calls from vfs helpers that call kern_path_create().
Full lockdep trace below:
============================================ WARNING: possible recursive locking detected 6.6.0-rc5 #775 Not tainted -------------------------------------------- kworker/1:1/32 is trying to acquire lock: ffff888005ac83f8 (sb_writers#5){.+.+}-{0:0}, at: ksmbd_vfs_mkdir+0xe1/0x410
but task is already holding lock: ffff888005ac83f8 (sb_writers#5){.+.+}-{0:0}, at: filename_create+0xb6/0x260
other info that might help us debug this: Possible unsafe locking scenario:
CPU0 ---- lock(sb_writers#5); lock(sb_writers#5);
*** DEADLOCK ***
May be due to missing lock nesting notation
4 locks held by kworker/1:1/32: #0: ffff8880064e4138 ((wq_completion)ksmbd-io){+.+.}-{0:0}, at: process_one_work+0x40e/0x980 #1: ffff888005b0fdd0 ((work_completion)(&work->work)){+.+.}-{0:0}, at: process_one_work+0x40e/0x980 #2: ffff888005ac83f8 (sb_writers#5){.+.+}-{0:0}, at: filename_create+0xb6/0x260 #3: ffff8880057ce760 (&type->i_mutex_dir_key#3/1){+.+.}-{3:3}, at: filename_create+0x123/0x260
Cc: stable@vger.kernel.org Fixes: 40b268d384a2 ("ksmbd: add mnt_want_write to ksmbd vfs functions") Signed-off-by: Marios Makassikis mmakassikis@freebox.fr Acked-by: Namjae Jeon linkinjeon@kernel.org Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/smb/server/vfs.c | 23 +++-------------------- 1 file changed, 3 insertions(+), 20 deletions(-)
--- a/fs/smb/server/vfs.c +++ b/fs/smb/server/vfs.c @@ -173,10 +173,6 @@ int ksmbd_vfs_create(struct ksmbd_work * return err; }
- err = mnt_want_write(path.mnt); - if (err) - goto out_err; - mode |= S_IFREG; err = vfs_create(mnt_idmap(path.mnt), d_inode(path.dentry), dentry, mode, true); @@ -186,9 +182,7 @@ int ksmbd_vfs_create(struct ksmbd_work * } else { pr_err("File(%s): creation failed (err:%d)\n", name, err); } - mnt_drop_write(path.mnt);
-out_err: done_path_create(&path, dentry); return err; } @@ -219,10 +213,6 @@ int ksmbd_vfs_mkdir(struct ksmbd_work *w return err; }
- err = mnt_want_write(path.mnt); - if (err) - goto out_err2; - idmap = mnt_idmap(path.mnt); mode |= S_IFDIR; err = vfs_mkdir(idmap, d_inode(path.dentry), dentry, mode); @@ -233,21 +223,19 @@ int ksmbd_vfs_mkdir(struct ksmbd_work *w dentry->d_name.len); if (IS_ERR(d)) { err = PTR_ERR(d); - goto out_err1; + goto out_err; } if (unlikely(d_is_negative(d))) { dput(d); err = -ENOENT; - goto out_err1; + goto out_err; }
ksmbd_vfs_inherit_owner(work, d_inode(path.dentry), d_inode(d)); dput(d); }
-out_err1: - mnt_drop_write(path.mnt); -out_err2: +out_err: done_path_create(&path, dentry); if (err) pr_err("mkdir(%s): creation failed (err:%d)\n", name, err); @@ -665,16 +653,11 @@ int ksmbd_vfs_link(struct ksmbd_work *wo goto out3; }
- err = mnt_want_write(newpath.mnt); - if (err) - goto out3; - err = vfs_link(oldpath.dentry, mnt_idmap(newpath.mnt), d_inode(newpath.dentry), dentry, NULL); if (err) ksmbd_debug(VFS, "vfs_link failed err %d\n", err); - mnt_drop_write(newpath.mnt);
out3: done_path_create(&newpath, dentry);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Namjae Jeon linkinjeon@kernel.org
commit 5a5409d90bd05f87fe5623a749ccfbf3f7c7d400 upstream.
If set_smb1_rsp_status() is not implemented, It will cause NULL pointer dereferece error when client send malformed smb1 message. This patch add set_smb1_rsp_status() to ignore malformed smb1 message.
Cc: stable@vger.kernel.org Reported-by: Robert Morris rtm@csail.mit.edu Signed-off-by: Namjae Jeon linkinjeon@kernel.org Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/smb/server/smb_common.c | 11 +++++++++++ 1 file changed, 11 insertions(+)
--- a/fs/smb/server/smb_common.c +++ b/fs/smb/server/smb_common.c @@ -372,11 +372,22 @@ static int smb1_allocate_rsp_buf(struct return 0; }
+/** + * set_smb1_rsp_status() - set error type in smb response header + * @work: smb work containing smb response header + * @err: error code to set in response + */ +static void set_smb1_rsp_status(struct ksmbd_work *work, __le32 err) +{ + work->send_no_response = 1; +} + static struct smb_version_ops smb1_server_ops = { .get_cmd_val = get_smb1_cmd_val, .init_rsp_hdr = init_smb1_rsp_hdr, .allocate_rsp_buf = smb1_allocate_rsp_buf, .check_user_session = smb1_check_user_session, + .set_rsp_status = set_smb1_rsp_status, };
static int smb1_negotiate(struct ksmbd_work *work)
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Namjae Jeon linkinjeon@kernel.org
commit eebff19acaa35820cb09ce2ccb3d21bee2156ffb upstream.
slab out-of-bounds write is caused by that offsets is bigger than pntsd allocation size. This patch add the check to validate 3 offsets using allocation size.
Reported-by: zdi-disclosures@trendmicro.com # ZDI-CAN-22271 Cc: stable@vger.kernel.org Signed-off-by: Namjae Jeon linkinjeon@kernel.org Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/smb/server/smbacl.c | 29 ++++++++++++++++++++++++++--- 1 file changed, 26 insertions(+), 3 deletions(-)
--- a/fs/smb/server/smbacl.c +++ b/fs/smb/server/smbacl.c @@ -1107,6 +1107,7 @@ pass: struct smb_acl *pdacl; struct smb_sid *powner_sid = NULL, *pgroup_sid = NULL; int powner_sid_size = 0, pgroup_sid_size = 0, pntsd_size; + int pntsd_alloc_size;
if (parent_pntsd->osidoffset) { powner_sid = (struct smb_sid *)((char *)parent_pntsd + @@ -1119,9 +1120,10 @@ pass: pgroup_sid_size = 1 + 1 + 6 + (pgroup_sid->num_subauth * 4); }
- pntsd = kzalloc(sizeof(struct smb_ntsd) + powner_sid_size + - pgroup_sid_size + sizeof(struct smb_acl) + - nt_size, GFP_KERNEL); + pntsd_alloc_size = sizeof(struct smb_ntsd) + powner_sid_size + + pgroup_sid_size + sizeof(struct smb_acl) + nt_size; + + pntsd = kzalloc(pntsd_alloc_size, GFP_KERNEL); if (!pntsd) { rc = -ENOMEM; goto free_aces_base; @@ -1136,6 +1138,27 @@ pass: pntsd->gsidoffset = parent_pntsd->gsidoffset; pntsd->dacloffset = parent_pntsd->dacloffset;
+ if ((u64)le32_to_cpu(pntsd->osidoffset) + powner_sid_size > + pntsd_alloc_size) { + rc = -EINVAL; + kfree(pntsd); + goto free_aces_base; + } + + if ((u64)le32_to_cpu(pntsd->gsidoffset) + pgroup_sid_size > + pntsd_alloc_size) { + rc = -EINVAL; + kfree(pntsd); + goto free_aces_base; + } + + if ((u64)le32_to_cpu(pntsd->dacloffset) + sizeof(struct smb_acl) + nt_size > + pntsd_alloc_size) { + rc = -EINVAL; + kfree(pntsd); + goto free_aces_base; + } + if (pntsd->osidoffset) { struct smb_sid *owner_sid = (struct smb_sid *)((char *)pntsd + le32_to_cpu(pntsd->osidoffset));
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dan Carpenter dan.carpenter@linaro.org
commit b44f9da81783fda72632ef9b0d05ea3f3ca447a5 upstream.
This error path should return -EINVAL instead of success.
Fixes: 88095e7b473a ("mmc: Add new VUB300 USB-to-SD/SDIO/MMC driver") Signed-off-by: Dan Carpenter dan.carpenter@linaro.org Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/0769d30c-ad80-421b-bf5d-7d6f5d85604e@moroto.mounta... Signed-off-by: Ulf Hansson ulf.hansson@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/mmc/host/vub300.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/mmc/host/vub300.c +++ b/drivers/mmc/host/vub300.c @@ -2309,6 +2309,7 @@ static int vub300_probe(struct usb_inter vub300->read_only = (0x0010 & vub300->system_port_status.port_flags) ? 1 : 0; } else { + retval = -EINVAL; goto error5; } usb_set_intfdata(interface, vub300);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nitin Yadav n-yadav@ti.com
commit 71956d0cb56c1e5f9feeb4819db87a076418e930 upstream.
ti,otap-del-sel-legacy/ti,itap-del-sel-legacy passed from DT are currently ignored for all SD/MMC and eMMC modes. Fix this by making start loop index to MMC_TIMING_LEGACY.
Fixes: 8ee5fc0e0b3b ("mmc: sdhci_am654: Update OTAPDLY writes") Signed-off-by: Nitin Yadav n-yadav@ti.com Acked-by: Adrian Hunter adrian.hunter@intel.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20231026061458.1116276-1-n-yadav@ti.com Signed-off-by: Ulf Hansson ulf.hansson@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/mmc/host/sdhci_am654.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/mmc/host/sdhci_am654.c +++ b/drivers/mmc/host/sdhci_am654.c @@ -598,7 +598,7 @@ static int sdhci_am654_get_otap_delay(st return 0; }
- for (i = MMC_TIMING_MMC_HS; i <= MMC_TIMING_MMC_HS400; i++) { + for (i = MMC_TIMING_LEGACY; i <= MMC_TIMING_MMC_HS400; i++) {
ret = device_property_read_u32(dev, td[i].otap_binding, &sdhci_am654->otap_del_sel[i]);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Bean Huo beanhuo@micron.com
commit ed9009ad300c0f15a3ecfe9613547b1962bde02c upstream.
Micron MTFC4GACAJCN eMMC supports cache but requires that flush cache operation be allowed only after a write has occurred. Otherwise, the cache flush command or subsequent commands will time out.
Signed-off-by: Bean Huo beanhuo@micron.com Signed-off-by: Rafael Beims rafael.beims@toradex.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20231030224809.59245-1-beanhuo@iokpp.de Signed-off-by: Ulf Hansson ulf.hansson@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/mmc/core/block.c | 4 +++- drivers/mmc/core/card.h | 4 ++++ drivers/mmc/core/mmc.c | 8 ++++++-- drivers/mmc/core/quirks.h | 7 ++++--- include/linux/mmc/card.h | 2 ++ 5 files changed, 19 insertions(+), 6 deletions(-)
--- a/drivers/mmc/core/block.c +++ b/drivers/mmc/core/block.c @@ -2389,8 +2389,10 @@ enum mmc_issued mmc_blk_mq_issue_rq(stru } ret = mmc_blk_cqe_issue_flush(mq, req); break; - case REQ_OP_READ: case REQ_OP_WRITE: + card->written_flag = true; + fallthrough; + case REQ_OP_READ: if (host->cqe_enabled) ret = mmc_blk_cqe_issue_rw_rq(mq, req); else --- a/drivers/mmc/core/card.h +++ b/drivers/mmc/core/card.h @@ -280,4 +280,8 @@ static inline int mmc_card_broken_sd_cac return c->quirks & MMC_QUIRK_BROKEN_SD_CACHE; }
+static inline int mmc_card_broken_cache_flush(const struct mmc_card *c) +{ + return c->quirks & MMC_QUIRK_BROKEN_CACHE_FLUSH; +} #endif --- a/drivers/mmc/core/mmc.c +++ b/drivers/mmc/core/mmc.c @@ -2081,13 +2081,17 @@ static int _mmc_flush_cache(struct mmc_h { int err = 0;
+ if (mmc_card_broken_cache_flush(host->card) && !host->card->written_flag) + return 0; + if (_mmc_cache_enabled(host)) { err = mmc_switch(host->card, EXT_CSD_CMD_SET_NORMAL, EXT_CSD_FLUSH_CACHE, 1, CACHE_FLUSH_TIMEOUT_MS); if (err) - pr_err("%s: cache flush error %d\n", - mmc_hostname(host), err); + pr_err("%s: cache flush error %d\n", mmc_hostname(host), err); + else + host->card->written_flag = false; }
return err; --- a/drivers/mmc/core/quirks.h +++ b/drivers/mmc/core/quirks.h @@ -110,11 +110,12 @@ static const struct mmc_fixup __maybe_un MMC_QUIRK_TRIM_BROKEN),
/* - * Micron MTFC4GACAJCN-1M advertises TRIM but it does not seems to - * support being used to offload WRITE_ZEROES. + * Micron MTFC4GACAJCN-1M supports TRIM but does not appear to support + * WRITE_ZEROES offloading. It also supports caching, but the cache can + * only be flushed after a write has occurred. */ MMC_FIXUP("Q2J54A", CID_MANFID_MICRON, 0x014e, add_quirk_mmc, - MMC_QUIRK_TRIM_BROKEN), + MMC_QUIRK_TRIM_BROKEN | MMC_QUIRK_BROKEN_CACHE_FLUSH),
/* * Kingston EMMC04G-M627 advertises TRIM but it does not seems to --- a/include/linux/mmc/card.h +++ b/include/linux/mmc/card.h @@ -295,7 +295,9 @@ struct mmc_card { #define MMC_QUIRK_BROKEN_HPI (1<<13) /* Disable broken HPI support */ #define MMC_QUIRK_BROKEN_SD_DISCARD (1<<14) /* Disable broken SD discard support */ #define MMC_QUIRK_BROKEN_SD_CACHE (1<<15) /* Disable broken SD cache support */ +#define MMC_QUIRK_BROKEN_CACHE_FLUSH (1<<16) /* Don't flush cache until the write has occurred */
+ bool written_flag; /* Indicates eMMC has been written since power on */ bool reenable_cmdq; /* Re-enable Command Queue */
unsigned int erase_size; /* erase size in sectors */
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Heiner Kallweit hkallweit1@gmail.com
commit 8e37372ad0bea4c9b4712d9943f6ae96cff9491f upstream.
aspm_attr_store_common(), which handles sysfs control of ASPM, has the same problem as fb097dcd5a28 ("PCI/ASPM: Disable only ASPM_STATE_L1 when driver disables L1"): disabling L1 adds only ASPM_L1 (but not any of the L1.x substates) to the "aspm_disable" mask.
Enabling one substate, e.g., L1.1, via sysfs removes ASPM_L1 from the disable mask. Since disabling L1 via sysfs doesn't add any of the substates to the disable mask, enabling L1.1 actually enables *all* the substates.
In this scenario:
- Write 0 to "l1_aspm" to disable L1 - Write 1 to "l1_1_aspm" to enable L1.1
the intention is to disable L1 and all L1.x substates, then enable just L1.1, but in fact, *all* L1.x substates are enabled.
Fix this by explicitly disabling all the L1.x substates when disabling L1.
Fixes: 72ea91afbfb0 ("PCI/ASPM: Add sysfs attributes for controlling ASPM link states") Link: https://lore.kernel.org/r/6ba7dd79-9cfe-4ed0-a002-d99cb842f361@gmail.com Signed-off-by: Heiner Kallweit hkallweit1@gmail.com [bhelgaas: commit log] Signed-off-by: Bjorn Helgaas bhelgaas@google.com Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pci/pcie/aspm.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/drivers/pci/pcie/aspm.c +++ b/drivers/pci/pcie/aspm.c @@ -1248,6 +1248,8 @@ static ssize_t aspm_attr_store_common(st link->aspm_disable &= ~ASPM_STATE_L1; } else { link->aspm_disable |= state; + if (state & ASPM_STATE_L1) + link->aspm_disable |= ASPM_STATE_L1SS; }
pcie_config_aspm_link(link, policy_to_aspm_state(link));
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Uwe Kleine-König u.kleine-koenig@pengutronix.de
commit 3064ef2e88c1629c1e67a77d7bc20020b35846f2 upstream.
With CONFIG_PCIE_KIRIN=y and kirin_pcie_remove() marked with __exit, the function is discarded from the driver. In this case a bound device can still get unbound, e.g via sysfs. Then no cleanup code is run resulting in resource leaks or worse.
The right thing to do is do always have the remove callback available. This fixes the following warning by modpost:
drivers/pci/controller/dwc/pcie-kirin: section mismatch in reference: kirin_pcie_driver+0x8 (section: .data) -> kirin_pcie_remove (section: .exit.text)
(with ARCH=x86_64 W=1 allmodconfig).
Fixes: 000f60db784b ("PCI: kirin: Add support for a PHY layer") Link: https://lore.kernel.org/r/20231001170254.2506508-3-u.kleine-koenig@pengutron... Signed-off-by: Uwe Kleine-König u.kleine-koenig@pengutronix.de Signed-off-by: Bjorn Helgaas bhelgaas@google.com Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pci/controller/dwc/pcie-kirin.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/pci/controller/dwc/pcie-kirin.c +++ b/drivers/pci/controller/dwc/pcie-kirin.c @@ -742,7 +742,7 @@ err: return ret; }
-static int __exit kirin_pcie_remove(struct platform_device *pdev) +static int kirin_pcie_remove(struct platform_device *pdev) { struct kirin_pcie *kirin_pcie = platform_get_drvdata(pdev);
@@ -819,7 +819,7 @@ static int kirin_pcie_probe(struct platf
static struct platform_driver kirin_pcie_driver = { .probe = kirin_pcie_probe, - .remove = __exit_p(kirin_pcie_remove), + .remove = kirin_pcie_remove, .driver = { .name = "kirin-pcie", .of_match_table = kirin_pcie_match,
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Uwe Kleine-König u.kleine-koenig@pengutronix.de
commit 83a939f0fdc208ff3639dd3d42ac9b3c35607fd2 upstream.
With CONFIG_PCI_EXYNOS=y and exynos_pcie_remove() marked with __exit, the function is discarded from the driver. In this case a bound device can still get unbound, e.g via sysfs. Then no cleanup code is run resulting in resource leaks or worse.
The right thing to do is do always have the remove callback available. This fixes the following warning by modpost:
WARNING: modpost: drivers/pci/controller/dwc/pci-exynos: section mismatch in reference: exynos_pcie_driver+0x8 (section: .data) -> exynos_pcie_remove (section: .exit.text)
(with ARCH=x86_64 W=1 allmodconfig).
Fixes: 340cba6092c2 ("pci: Add PCIe driver for Samsung Exynos") Link: https://lore.kernel.org/r/20231001170254.2506508-2-u.kleine-koenig@pengutron... Signed-off-by: Uwe Kleine-König u.kleine-koenig@pengutronix.de Signed-off-by: Bjorn Helgaas bhelgaas@google.com Reviewed-by: Alim Akhtar alim.akhtar@samsung.com Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pci/controller/dwc/pci-exynos.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/pci/controller/dwc/pci-exynos.c +++ b/drivers/pci/controller/dwc/pci-exynos.c @@ -375,7 +375,7 @@ fail_probe: return ret; }
-static int __exit exynos_pcie_remove(struct platform_device *pdev) +static int exynos_pcie_remove(struct platform_device *pdev) { struct exynos_pcie *ep = platform_get_drvdata(pdev);
@@ -431,7 +431,7 @@ static const struct of_device_id exynos_
static struct platform_driver exynos_pcie_driver = { .probe = exynos_pcie_probe, - .remove = __exit_p(exynos_pcie_remove), + .remove = exynos_pcie_remove, .driver = { .name = "exynos-pcie", .of_match_table = exynos_pcie_of_match,
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ajay Singh ajay.kathat@microchip.com
commit 05ac1a198a63ad66bf5ae8b7321407c102d40ef3 upstream.
Enabling KASAN and running some iperf tests raises some memory issues with vmm_table:
BUG: KASAN: slab-out-of-bounds in wilc_wlan_handle_txq+0x6ac/0xdb4 Write of size 4 at addr c3a61540 by task wlan0-tx/95
KASAN detects that we are writing data beyond range allocated to vmm_table. There is indeed a mismatch between the size passed to allocator in wilc_wlan_init, and the range of possible indexes used later: allocation size is missing a multiplication by sizeof(u32)
Fixes: 40b717bfcefa ("wifi: wilc1000: fix DMA on stack objects") Cc: stable@vger.kernel.org Signed-off-by: Ajay Singh ajay.kathat@microchip.com Signed-off-by: Alexis Lothoré alexis.lothore@bootlin.com Reviewed-by: Michael Walle mwalle@kernel.org Reviewed-by: Jeff Johnson quic_jjohnson@quicinc.com Signed-off-by: Kalle Valo kvalo@kernel.org Link: https://lore.kernel.org/r/20231017-wilc1000_tx_oops-v3-1-b2155f1f7bee@bootli... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/wireless/microchip/wilc1000/wlan.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/net/wireless/microchip/wilc1000/wlan.c +++ b/drivers/net/wireless/microchip/wilc1000/wlan.c @@ -1492,7 +1492,7 @@ int wilc_wlan_init(struct net_device *de }
if (!wilc->vmm_table) - wilc->vmm_table = kzalloc(WILC_VMM_TBL_SIZE, GFP_KERNEL); + wilc->vmm_table = kcalloc(WILC_VMM_TBL_SIZE, sizeof(u32), GFP_KERNEL);
if (!wilc->vmm_table) { ret = -ENOBUFS;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Chuck Lever chuck.lever@oracle.com
commit 197115ebf358cb440c73e868b2a0a5ef728decc6 upstream.
When an RPC Call message cannot be pulled from the client, that is a message loss, by definition. Close the connection to trigger the client to resend.
Cc: stable@vger.kernel.org Reviewed-by: Tom Talpey tom@talpey.com Signed-off-by: Chuck Lever chuck.lever@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/sunrpc/xprtrdma/svc_rdma_recvfrom.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c +++ b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c @@ -852,7 +852,8 @@ out_readfail: if (ret == -EINVAL) svc_rdma_send_error(rdma_xprt, ctxt, ret); svc_rdma_recv_ctxt_put(rdma_xprt, ctxt); - return ret; + svc_xprt_deferred_close(xprt); + return -ENOTCONN;
out_backchannel: svc_rdma_handle_bc_reply(rqstp, ctxt);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Joel Fernandes (Google) joel@joelfernandes.org
commit b96e7a5fa0ba9cda32888e04f8f4bac42d49a7f8 upstream.
There are instances where rcu_cpu_stall_reset() is called when jiffies did not get a chance to update for a long time. Before jiffies is updated, the CPU stall detector can go off triggering false-positives where a just-started grace period appears to be ages old. In the past, we disabled stall detection in rcu_cpu_stall_reset() however this got changed [1]. This is resulting in false-positives in KGDB usecase [2].
Fix this by deferring the update of jiffies to the third run of the FQS loop. This is more robust, as, even if rcu_cpu_stall_reset() is called just before jiffies is read, we would end up pushing out the jiffies read by 3 more FQS loops. Meanwhile the CPU stall detection will be delayed and we will not get any false positives.
[1] https://lore.kernel.org/all/20210521155624.174524-2-senozhatsky@chromium.org... [2] https://lore.kernel.org/all/20230814020045.51950-2-chenhuacai@loongson.cn/
Tested with rcutorture.cpu_stall option as well to verify stall behavior with/without patch.
Tested-by: Huacai Chen chenhuacai@loongson.cn Reported-by: Binbin Zhou zhoubinbin@loongson.cn Closes: https://lore.kernel.org/all/20230814020045.51950-2-chenhuacai@loongson.cn/ Suggested-by: Paul McKenney paulmck@kernel.org Cc: Sergey Senozhatsky senozhatsky@chromium.org Cc: Thomas Gleixner tglx@linutronix.de Cc: stable@vger.kernel.org Fixes: a80be428fbc1 ("rcu: Do not disable GP stall detection in rcu_cpu_stall_reset()") Signed-off-by: Joel Fernandes (Google) joel@joelfernandes.org Signed-off-by: Paul E. McKenney paulmck@kernel.org Signed-off-by: Frederic Weisbecker frederic@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/rcu/tree.c | 12 ++++++++++++ kernel/rcu/tree.h | 4 ++++ kernel/rcu/tree_stall.h | 20 ++++++++++++++++++-- 3 files changed, 34 insertions(+), 2 deletions(-)
--- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -1552,10 +1552,22 @@ static bool rcu_gp_fqs_check_wake(int *g */ static void rcu_gp_fqs(bool first_time) { + int nr_fqs = READ_ONCE(rcu_state.nr_fqs_jiffies_stall); struct rcu_node *rnp = rcu_get_root();
WRITE_ONCE(rcu_state.gp_activity, jiffies); WRITE_ONCE(rcu_state.n_force_qs, rcu_state.n_force_qs + 1); + + WARN_ON_ONCE(nr_fqs > 3); + /* Only countdown nr_fqs for stall purposes if jiffies moves. */ + if (nr_fqs) { + if (nr_fqs == 1) { + WRITE_ONCE(rcu_state.jiffies_stall, + jiffies + rcu_jiffies_till_stall_check()); + } + WRITE_ONCE(rcu_state.nr_fqs_jiffies_stall, --nr_fqs); + } + if (first_time) { /* Collect dyntick-idle snapshots. */ force_qs_rnp(dyntick_save_progress_counter); --- a/kernel/rcu/tree.h +++ b/kernel/rcu/tree.h @@ -386,6 +386,10 @@ struct rcu_state { /* in jiffies. */ unsigned long jiffies_stall; /* Time at which to check */ /* for CPU stalls. */ + int nr_fqs_jiffies_stall; /* Number of fqs loops after + * which read jiffies and set + * jiffies_stall. Stall + * warnings disabled if !0. */ unsigned long jiffies_resched; /* Time at which to resched */ /* a reluctant CPU. */ unsigned long n_force_qs_gpstart; /* Snapshot of n_force_qs at */ --- a/kernel/rcu/tree_stall.h +++ b/kernel/rcu/tree_stall.h @@ -149,12 +149,17 @@ static void panic_on_rcu_stall(void) /** * rcu_cpu_stall_reset - restart stall-warning timeout for current grace period * + * To perform the reset request from the caller, disable stall detection until + * 3 fqs loops have passed. This is required to ensure a fresh jiffies is + * loaded. It should be safe to do from the fqs loop as enough timer + * interrupts and context switches should have passed. + * * The caller must disable hard irqs. */ void rcu_cpu_stall_reset(void) { - WRITE_ONCE(rcu_state.jiffies_stall, - jiffies + rcu_jiffies_till_stall_check()); + WRITE_ONCE(rcu_state.nr_fqs_jiffies_stall, 3); + WRITE_ONCE(rcu_state.jiffies_stall, ULONG_MAX); }
////////////////////////////////////////////////////////////////////////////// @@ -170,6 +175,7 @@ static void record_gp_stall_check_time(v WRITE_ONCE(rcu_state.gp_start, j); j1 = rcu_jiffies_till_stall_check(); smp_mb(); // ->gp_start before ->jiffies_stall and caller's ->gp_seq. + WRITE_ONCE(rcu_state.nr_fqs_jiffies_stall, 0); WRITE_ONCE(rcu_state.jiffies_stall, j + j1); rcu_state.jiffies_resched = j + j1 / 2; rcu_state.n_force_qs_gpstart = READ_ONCE(rcu_state.n_force_qs); @@ -725,6 +731,16 @@ static void check_cpu_stall(struct rcu_d !rcu_gp_in_progress()) return; rcu_stall_kick_kthreads(); + + /* + * Check if it was requested (via rcu_cpu_stall_reset()) that the FQS + * loop has to set jiffies to ensure a non-stale jiffies value. This + * is required to have good jiffies value after coming out of long + * breaks of jiffies updates. Not doing so can cause false positives. + */ + if (READ_ONCE(rcu_state.nr_fqs_jiffies_stall) > 0) + return; + j = jiffies;
/*
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Vignesh Viswanathan quic_viswanat@quicinc.com
commit 95d97b111e1e184b0c8656137033ed64f2cf21e4 upstream.
SMEM uses lock index 3 of the TCSR Mutex hwlock for allocations in SMEM region shared by the Host and FW.
Fix the SMEM hwlock index to 3 for IPQ6018.
Cc: stable@vger.kernel.org Fixes: 5bf635621245 ("arm64: dts: ipq6018: Add a few device nodes") Signed-off-by: Vignesh Viswanathan quic_viswanat@quicinc.com Acked-by: Konrad Dybcio konrad.dybcio@linaro.org Link: https://lore.kernel.org/r/20230904172516.479866-3-quic_viswanat@quicinc.com Signed-off-by: Bjorn Andersson andersson@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm64/boot/dts/qcom/ipq6018.dtsi | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/arm64/boot/dts/qcom/ipq6018.dtsi +++ b/arch/arm64/boot/dts/qcom/ipq6018.dtsi @@ -207,7 +207,7 @@ smem { compatible = "qcom,smem"; memory-region = <&smem_region>; - hwlocks = <&tcsr_mutex 0>; + hwlocks = <&tcsr_mutex 3>; };
soc: soc@0 {
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Biju Das biju.das.jz@bp.renesas.com
commit b7a8f1f7a8a25e09aaefebb6251a77f44cda638b upstream.
As per R01UH0914EJ0130 Rev.1.30 HW manual the MTU3 overflow/underflow interrupt names starts with 'tci' instead of 'tgi'.
Fix this documentation issue by replacing below overflow/underflow interrupt names: - tgiv0->tciv0 - tgiv1->tciv1 - tgiu1->tciu1 - tgiv2->tciv2 - tgiu2->tciu2 - tgiv3->tciv3 - tgiv4->tciv4 - tgiv6->tciv6 - tgiv7->tciv7 - tgiv8->tciv8 - tgiu8->tciu8
Fixes: 0a9d6b54297e ("dt-bindings: timer: Document RZ/G2L MTU3a bindings") Cc: stable@kernel.org Signed-off-by: Biju Das biju.das.jz@bp.renesas.com Acked-by: Conor Dooley conor.dooley@microchip.com Reviewed-by: Geert Uytterhoeven geert+renesas@glider.be Signed-off-by: Daniel Lezcano daniel.lezcano@linaro.org Link: https://lore.kernel.org/r/20230727081848.100834-2-biju.das.jz@bp.renesas.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- .../bindings/timer/renesas,rz-mtu3.yaml | 38 +++++++++---------- 1 file changed, 19 insertions(+), 19 deletions(-)
diff --git a/Documentation/devicetree/bindings/timer/renesas,rz-mtu3.yaml b/Documentation/devicetree/bindings/timer/renesas,rz-mtu3.yaml index bffdab0b0185..fbac40b958dd 100644 --- a/Documentation/devicetree/bindings/timer/renesas,rz-mtu3.yaml +++ b/Documentation/devicetree/bindings/timer/renesas,rz-mtu3.yaml @@ -169,27 +169,27 @@ properties: - const: tgib0 - const: tgic0 - const: tgid0 - - const: tgiv0 + - const: tciv0 - const: tgie0 - const: tgif0 - const: tgia1 - const: tgib1 - - const: tgiv1 - - const: tgiu1 + - const: tciv1 + - const: tciu1 - const: tgia2 - const: tgib2 - - const: tgiv2 - - const: tgiu2 + - const: tciv2 + - const: tciu2 - const: tgia3 - const: tgib3 - const: tgic3 - const: tgid3 - - const: tgiv3 + - const: tciv3 - const: tgia4 - const: tgib4 - const: tgic4 - const: tgid4 - - const: tgiv4 + - const: tciv4 - const: tgiu5 - const: tgiv5 - const: tgiw5 @@ -197,18 +197,18 @@ properties: - const: tgib6 - const: tgic6 - const: tgid6 - - const: tgiv6 + - const: tciv6 - const: tgia7 - const: tgib7 - const: tgic7 - const: tgid7 - - const: tgiv7 + - const: tciv7 - const: tgia8 - const: tgib8 - const: tgic8 - const: tgid8 - - const: tgiv8 - - const: tgiu8 + - const: tciv8 + - const: tciu8
clocks: maxItems: 1 @@ -285,16 +285,16 @@ examples: <GIC_SPI 211 IRQ_TYPE_EDGE_RISING>, <GIC_SPI 212 IRQ_TYPE_EDGE_RISING>, <GIC_SPI 213 IRQ_TYPE_EDGE_RISING>; - interrupt-names = "tgia0", "tgib0", "tgic0", "tgid0", "tgiv0", "tgie0", + interrupt-names = "tgia0", "tgib0", "tgic0", "tgid0", "tciv0", "tgie0", "tgif0", - "tgia1", "tgib1", "tgiv1", "tgiu1", - "tgia2", "tgib2", "tgiv2", "tgiu2", - "tgia3", "tgib3", "tgic3", "tgid3", "tgiv3", - "tgia4", "tgib4", "tgic4", "tgid4", "tgiv4", + "tgia1", "tgib1", "tciv1", "tciu1", + "tgia2", "tgib2", "tciv2", "tciu2", + "tgia3", "tgib3", "tgic3", "tgid3", "tciv3", + "tgia4", "tgib4", "tgic4", "tgid4", "tciv4", "tgiu5", "tgiv5", "tgiw5", - "tgia6", "tgib6", "tgic6", "tgid6", "tgiv6", - "tgia7", "tgib7", "tgic7", "tgid7", "tgiv7", - "tgia8", "tgib8", "tgic8", "tgid8", "tgiv8", "tgiu8"; + "tgia6", "tgib6", "tgic6", "tgid6", "tciv6", + "tgia7", "tgib7", "tgic7", "tgid7", "tciv7", + "tgia8", "tgib8", "tgic8", "tgid8", "tciv8", "tciu8"; clocks = <&cpg CPG_MOD R9A07G044_MTU_X_MCK_MTU3>; power-domains = <&cpg>; resets = <&cpg R9A07G044_MTU_X_PRESET_MTU3>;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Brian Geffon bgeffon@google.com
commit f0c7183008b41e92fa676406d87f18773724b48b upstream.
We found at least one situation where the safe pages list was empty and get_buffer() would gladly try to use a NULL pointer.
Signed-off-by: Brian Geffon bgeffon@google.com Fixes: 8357376d3df2 ("[PATCH] swsusp: Improve handling of highmem") Cc: All applicable stable@vger.kernel.org Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/power/snapshot.c | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-)
--- a/kernel/power/snapshot.c +++ b/kernel/power/snapshot.c @@ -2474,8 +2474,9 @@ static void *get_highmem_page_buffer(str pbe->copy_page = tmp; } else { /* Copy of the page will be stored in normal memory */ - kaddr = safe_pages_list; - safe_pages_list = safe_pages_list->next; + kaddr = __get_safe_page(ca->gfp_mask); + if (!kaddr) + return ERR_PTR(-ENOMEM); pbe->copy_page = virt_to_page(kaddr); } pbe->next = highmem_pblist; @@ -2655,8 +2656,9 @@ static void *get_buffer(struct memory_bi return ERR_PTR(-ENOMEM); } pbe->orig_address = page_address(page); - pbe->address = safe_pages_list; - safe_pages_list = safe_pages_list->next; + pbe->address = __get_safe_page(ca->gfp_mask); + if (!pbe->address) + return ERR_PTR(-ENOMEM); pbe->next = restore_pblist; restore_pblist = pbe; return pbe->address;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Brian Geffon bgeffon@google.com
commit d08970df1980476f27936e24d452550f3e9e92e1 upstream.
In snapshot_write_next(), sync_read is set and unset in three different spots unnecessiarly. As a result there is a subtle bug where the first page after the meta data has been loaded unconditionally sets sync_read to 0. If this first PFN was actually a highmem page, then the returned buffer will be the global "buffer," and the page needs to be loaded synchronously.
That is, I'm not sure we can always assume the following to be safe:
handle->buffer = get_buffer(&orig_bm, &ca); handle->sync_read = 0;
Because get_buffer() can call get_highmem_page_buffer() which can return 'buffer'.
The easiest way to address this is just set sync_read before snapshot_write_next() returns if handle->buffer == buffer.
Signed-off-by: Brian Geffon bgeffon@google.com Fixes: 8357376d3df2 ("[PATCH] swsusp: Improve handling of highmem") Cc: All applicable stable@vger.kernel.org [ rjw: Subject and changelog edits ] Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/power/snapshot.c | 6 +----- 1 file changed, 1 insertion(+), 5 deletions(-)
--- a/kernel/power/snapshot.c +++ b/kernel/power/snapshot.c @@ -2689,8 +2689,6 @@ int snapshot_write_next(struct snapshot_ if (handle->cur > 1 && handle->cur > nr_meta_pages + nr_copy_pages) return 0;
- handle->sync_read = 1; - if (!handle->cur) { if (!buffer) /* This makes the buffer be freed by swsusp_free() */ @@ -2726,7 +2724,6 @@ int snapshot_write_next(struct snapshot_ memory_bm_position_reset(&orig_bm); restore_pblist = NULL; handle->buffer = get_buffer(&orig_bm, &ca); - handle->sync_read = 0; if (IS_ERR(handle->buffer)) return PTR_ERR(handle->buffer); } @@ -2736,9 +2733,8 @@ int snapshot_write_next(struct snapshot_ handle->buffer = get_buffer(&orig_bm, &ca); if (IS_ERR(handle->buffer)) return PTR_ERR(handle->buffer); - if (handle->buffer != buffer) - handle->sync_read = 0; } + handle->sync_read = (handle->buffer == buffer); handle->cur++; return PAGE_SIZE; }
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Catalin Marinas catalin.marinas@arm.com
commit 5f98fd034ca6fd1ab8c91a3488968a0e9caaabf6 upstream.
Since the actual slab freeing is deferred when calling kvfree_rcu(), so is the kmemleak_free() callback informing kmemleak of the object deletion. From the perspective of the kvfree_rcu() caller, the object is freed and it may remove any references to it. Since kmemleak does not scan RCU internal data storing the pointer, it will report such objects as leaks during the grace period.
Tell kmemleak to ignore such objects on the kvfree_call_rcu() path. Note that the tiny RCU implementation does not have such issue since the objects can be tracked from the rcu_ctrlblk structure.
Signed-off-by: Catalin Marinas catalin.marinas@arm.com Reported-by: Christoph Paasch cpaasch@apple.com Closes: https://lore.kernel.org/all/F903A825-F05F-4B77-A2B5-7356282FBA2C@apple.com/ Cc: stable@vger.kernel.org Tested-by: Christoph Paasch cpaasch@apple.com Reviewed-by: Paul E. McKenney paulmck@kernel.org Signed-off-by: Joel Fernandes (Google) joel@joelfernandes.org Signed-off-by: Frederic Weisbecker frederic@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/rcu/tree.c | 9 +++++++++ 1 file changed, 9 insertions(+)
--- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -31,6 +31,7 @@ #include <linux/bitops.h> #include <linux/export.h> #include <linux/completion.h> +#include <linux/kmemleak.h> #include <linux/moduleparam.h> #include <linux/panic.h> #include <linux/panic_notifier.h> @@ -3397,6 +3398,14 @@ void kvfree_call_rcu(struct rcu_head *he success = true; }
+ /* + * The kvfree_rcu() caller considers the pointer freed at this point + * and likely removes any references to it. Since the actual slab + * freeing (and kmemleak_free()) is deferred, tell kmemleak to ignore + * this object (no scanning or false positives reporting). + */ + kmemleak_ignore(ptr); + // Set timer to drain after KFREE_DRAIN_JIFFIES. if (rcu_scheduler_active == RCU_SCHEDULER_RUNNING) schedule_delayed_monitor_work(krcp);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Josef Bacik josef@toxicpanda.com
commit 11aeb97b45ad2e0040cbb2a589bc403152526345 upstream.
We have a random schedule_timeout() if the current transaction is committing, which seems to be a holdover from the original delalloc reservation code.
Remove this, we have the proper flushing stuff, we shouldn't be hoping for random timing things to make everything work. This just induces latency for no reason.
CC: stable@vger.kernel.org # 5.4+ Signed-off-by: Josef Bacik josef@toxicpanda.com Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/delalloc-space.c | 3 --- 1 file changed, 3 deletions(-)
--- a/fs/btrfs/delalloc-space.c +++ b/fs/btrfs/delalloc-space.c @@ -322,9 +322,6 @@ int btrfs_delalloc_reserve_metadata(stru } else { if (current->journal_info) flush = BTRFS_RESERVE_FLUSH_LIMIT; - - if (btrfs_transaction_in_commit(fs_info)) - schedule_timeout(1); }
num_bytes = ALIGN(num_bytes, fs_info->sectorsize);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: David Arcari darcari@redhat.com
commit fae633cfb729da2771b5433f6b84ae7e8b4aa5f7 upstream.
KASAN reported this
[ 444.853098] BUG: KASAN: global-out-of-bounds in param_get_int+0x77/0x90 [ 444.853111] Read of size 4 at addr ffffffffc16c9220 by task cat/2105 ... [ 444.853442] The buggy address belongs to the variable: [ 444.853443] max_idle+0x0/0xffffffffffffcde0 [intel_powerclamp]
There is a mismatch between the param_get_int and the definition of max_idle. Replacing param_get_int with param_get_byte resolves this issue.
Fixes: ebf519710218 ("thermal: intel: powerclamp: Add two module parameters") Cc: 6.3+ stable@vger.kernel.org # 6.3+ Signed-off-by: David Arcari darcari@redhat.com Reviewed-by: Srinivas Pandruvada srinivas.pandruvada@linux.intel.com Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/thermal/intel/intel_powerclamp.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/thermal/intel/intel_powerclamp.c b/drivers/thermal/intel/intel_powerclamp.c index 36243a3972fd..5ac5cb60bae6 100644 --- a/drivers/thermal/intel/intel_powerclamp.c +++ b/drivers/thermal/intel/intel_powerclamp.c @@ -256,7 +256,7 @@ static int max_idle_set(const char *arg, const struct kernel_param *kp)
static const struct kernel_param_ops max_idle_ops = { .set = max_idle_set, - .get = param_get_int, + .get = param_get_byte, };
module_param_cb(max_idle, &max_idle_ops, &max_idle, 0644);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Vignesh Viswanathan quic_viswanat@quicinc.com
commit d08afd80158399a081b478a19902364e3dd0f84c upstream.
SMEM uses lock index 3 of the TCSR Mutex hwlock for allocations in SMEM region shared by the Host and FW.
Fix the SMEM hwlock index to 3 for IPQ5332.
Cc: stable@vger.kernel.org Fixes: d56dd7f935e1 ("arm64: dts: qcom: ipq5332: add SMEM support") Signed-off-by: Vignesh Viswanathan quic_viswanat@quicinc.com Acked-by: Konrad Dybcio konrad.dybcio@linaro.org Link: https://lore.kernel.org/r/20230904172516.479866-2-quic_viswanat@quicinc.com Signed-off-by: Bjorn Andersson andersson@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm64/boot/dts/qcom/ipq5332.dtsi | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm64/boot/dts/qcom/ipq5332.dtsi b/arch/arm64/boot/dts/qcom/ipq5332.dtsi index 991b23027805..d3fef2f80a81 100644 --- a/arch/arm64/boot/dts/qcom/ipq5332.dtsi +++ b/arch/arm64/boot/dts/qcom/ipq5332.dtsi @@ -135,7 +135,7 @@ smem@4a800000 { reg = <0x0 0x4a800000 0x0 0x100000>; no-map;
- hwlocks = <&tcsr_mutex 0>; + hwlocks = <&tcsr_mutex 3>; }; };
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Vignesh Viswanathan quic_viswanat@quicinc.com
commit 8a781d04e580705d36f7db07f5c80e748100b69d upstream.
SMEM uses lock index 3 of the TCSR Mutex hwlock for allocations in SMEM region shared by the Host and FW.
Fix the SMEM hwlock index to 3 for IPQ8074.
Cc: stable@vger.kernel.org Fixes: 42124b947e8e ("arm64: dts: qcom: ipq8074: add SMEM support") Signed-off-by: Vignesh Viswanathan quic_viswanat@quicinc.com Acked-by: Konrad Dybcio konrad.dybcio@linaro.org Link: https://lore.kernel.org/r/20230904172516.479866-4-quic_viswanat@quicinc.com Signed-off-by: Bjorn Andersson andersson@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm64/boot/dts/qcom/ipq8074.dtsi | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/arm64/boot/dts/qcom/ipq8074.dtsi +++ b/arch/arm64/boot/dts/qcom/ipq8074.dtsi @@ -101,7 +101,7 @@ reg = <0x0 0x4ab00000 0x0 0x100000>; no-map;
- hwlocks = <&tcsr_mutex 0>; + hwlocks = <&tcsr_mutex 3>; };
memory@4ac00000 {
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kathiravan Thirumoorthy quic_kathirav@quicinc.com
commit 3337a6fea25370d3d244ec6bb38c71ee86fcf837 upstream.
Per the "SMC calling convention specification", the 64-bit calling convention can only be used when the client is 64-bit. Whereas the 32-bit calling convention can be used by either a 32-bit or a 64-bit client.
Currently during SCM probe, irrespective of the client, 64-bit calling convention is made, which is incorrect and may lead to the undefined behaviour when the client is 32-bit. Let's fix it.
Cc: stable@vger.kernel.org Fixes: 9a434cee773a ("firmware: qcom_scm: Dynamically support SMCCC and legacy conventions") Reviewed-By: Elliot Berman quic_eberman@quicinc.com Signed-off-by: Kathiravan Thirumoorthy quic_kathirav@quicinc.com Link: https://lore.kernel.org/r/20230925-scm-v3-1-8790dff6a749@quicinc.com Signed-off-by: Bjorn Andersson andersson@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/firmware/qcom_scm.c | 7 +++++++ 1 file changed, 7 insertions(+)
--- a/drivers/firmware/qcom_scm.c +++ b/drivers/firmware/qcom_scm.c @@ -172,6 +172,12 @@ static enum qcom_scm_convention __get_co return qcom_scm_convention;
/* + * Per the "SMC calling convention specification", the 64-bit calling + * convention can only be used when the client is 64-bit, otherwise + * system will encounter the undefined behaviour. + */ +#if IS_ENABLED(CONFIG_ARM64) + /* * Device isn't required as there is only one argument - no device * needed to dma_map_single to secure world */ @@ -191,6 +197,7 @@ static enum qcom_scm_convention __get_co forced = true; goto found; } +#endif
probed_convention = SMC_CONVENTION_ARM_32; ret = __scm_smc_call(NULL, &desc, probed_convention, &res, true);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Vasily Khoruzhick anarsoul@gmail.com
commit a83c68a3bf7c418c9a46693c63c638852b0c1f4e upstream.
Buggy BIOSes may have invalid FPDT subtables, e.g. on my hardware:
S3PT subtable:
7F20FE30: 53 33 50 54 24 00 00 00-00 00 00 00 00 00 18 01 *S3PT$...........* 7F20FE40: 00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00 *................* 7F20FE50: 00 00 00 00
Here the first record has zero length.
FBPT subtable:
7F20FE50: 46 42 50 54-3C 00 00 00 46 42 50 54 *....FBPT<...FBPT* 7F20FE60: 02 00 30 02 00 00 00 00-00 00 00 00 00 00 00 00 *..0.............* 7F20FE70: 2A A6 BC 6E 0B 00 00 00-1A 44 41 70 0B 00 00 00 **..n.....DAp....* 7F20FE80: 00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00 *................*
And here FBPT table has FBPT signature repeated instead of the first record.
Current code will be looping indefinitely due to zero length records, so break out of the loop if record length is zero.
While we are here, add proper handling for fpdt_process_subtable() failures.
Fixes: d1eb86e59be0 ("ACPI: tables: introduce support for FPDT table") Cc: All applicable stable@vger.kernel.org Signed-off-by: Vasily Khoruzhick anarsoul@gmail.com [ rjw: Comment edit, added empty code lines ] Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/acpi/acpi_fpdt.c | 45 +++++++++++++++++++++++++++++++++++++-------- 1 file changed, 37 insertions(+), 8 deletions(-)
--- a/drivers/acpi/acpi_fpdt.c +++ b/drivers/acpi/acpi_fpdt.c @@ -194,12 +194,19 @@ static int fpdt_process_subtable(u64 add record_header = (void *)subtable_header + offset; offset += record_header->length;
+ if (!record_header->length) { + pr_err(FW_BUG "Zero-length record found in FPTD.\n"); + result = -EINVAL; + goto err; + } + switch (record_header->type) { case RECORD_S3_RESUME: if (subtable_type != SUBTABLE_S3PT) { pr_err(FW_BUG "Invalid record %d for subtable %s\n", record_header->type, signature); - return -EINVAL; + result = -EINVAL; + goto err; } if (record_resume) { pr_err("Duplicate resume performance record found.\n"); @@ -208,7 +215,7 @@ static int fpdt_process_subtable(u64 add record_resume = (struct resume_performance_record *)record_header; result = sysfs_create_group(fpdt_kobj, &resume_attr_group); if (result) - return result; + goto err; break; case RECORD_S3_SUSPEND: if (subtable_type != SUBTABLE_S3PT) { @@ -223,13 +230,14 @@ static int fpdt_process_subtable(u64 add record_suspend = (struct suspend_performance_record *)record_header; result = sysfs_create_group(fpdt_kobj, &suspend_attr_group); if (result) - return result; + goto err; break; case RECORD_BOOT: if (subtable_type != SUBTABLE_FBPT) { pr_err(FW_BUG "Invalid %d for subtable %s\n", record_header->type, signature); - return -EINVAL; + result = -EINVAL; + goto err; } if (record_boot) { pr_err("Duplicate boot performance record found.\n"); @@ -238,7 +246,7 @@ static int fpdt_process_subtable(u64 add record_boot = (struct boot_performance_record *)record_header; result = sysfs_create_group(fpdt_kobj, &boot_attr_group); if (result) - return result; + goto err; break;
default: @@ -247,6 +255,18 @@ static int fpdt_process_subtable(u64 add } } return 0; + +err: + if (record_boot) + sysfs_remove_group(fpdt_kobj, &boot_attr_group); + + if (record_suspend) + sysfs_remove_group(fpdt_kobj, &suspend_attr_group); + + if (record_resume) + sysfs_remove_group(fpdt_kobj, &resume_attr_group); + + return result; }
static int __init acpi_init_fpdt(void) @@ -255,6 +275,7 @@ static int __init acpi_init_fpdt(void) struct acpi_table_header *header; struct fpdt_subtable_entry *subtable; u32 offset = sizeof(*header); + int result;
status = acpi_get_table(ACPI_SIG_FPDT, 0, &header);
@@ -263,8 +284,8 @@ static int __init acpi_init_fpdt(void)
fpdt_kobj = kobject_create_and_add("fpdt", acpi_kobj); if (!fpdt_kobj) { - acpi_put_table(header); - return -ENOMEM; + result = -ENOMEM; + goto err_nomem; }
while (offset < header->length) { @@ -272,8 +293,10 @@ static int __init acpi_init_fpdt(void) switch (subtable->type) { case SUBTABLE_FBPT: case SUBTABLE_S3PT: - fpdt_process_subtable(subtable->address, + result = fpdt_process_subtable(subtable->address, subtable->type); + if (result) + goto err_subtable; break; default: /* Other types are reserved in ACPI 6.4 spec. */ @@ -282,6 +305,12 @@ static int __init acpi_init_fpdt(void) offset += sizeof(*subtable); } return 0; +err_subtable: + kobject_put(fpdt_kobj); + +err_nomem: + acpi_put_table(header); + return result; }
fs_initcall(acpi_init_fpdt);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Vignesh Viswanathan quic_viswanat@quicinc.com
commit 5fe8508e2bc8eb4208b0434b6c1ca306c1519ade upstream.
SMEM uses lock index 3 of the TCSR Mutex hwlock for allocations in SMEM region shared by the Host and FW.
Fix the SMEM hwlock index to 3 for IPQ9574.
Cc: stable@vger.kernel.org Fixes: 46384ac7a618 ("arm64: dts: qcom: ipq9574: Add SMEM support") Signed-off-by: Vignesh Viswanathan quic_viswanat@quicinc.com Acked-by: Konrad Dybcio konrad.dybcio@linaro.org Link: https://lore.kernel.org/r/20230904172516.479866-5-quic_viswanat@quicinc.com Signed-off-by: Bjorn Andersson andersson@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm64/boot/dts/qcom/ipq9574.dtsi | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/arm64/boot/dts/qcom/ipq9574.dtsi +++ b/arch/arm64/boot/dts/qcom/ipq9574.dtsi @@ -174,7 +174,7 @@ smem@4aa00000 { compatible = "qcom,smem"; reg = <0x0 0x4aa00000 0x0 0x100000>; - hwlocks = <&tcsr_mutex 0>; + hwlocks = <&tcsr_mutex 3>; no-map; }; };
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Vignesh Viswanathan quic_viswanat@quicinc.com
commit 72fc3d58b87b0d622039c6299b89024fbb7b420f upstream.
IPQ6018's TCSR Mutex HW lock register has 32 locks of size 4KB each. Total size of the TCSR Mutex registers is 128KB.
Fix size of the tcsr_mutex hwlock register to 0x20000.
Changes in v2: - Drop change to remove qcom,ipq6018-tcsr-mutex compatible string - Added Fixes and stable tags
Cc: stable@vger.kernel.org Fixes: 5bf635621245 ("arm64: dts: ipq6018: Add a few device nodes") Signed-off-by: Vignesh Viswanathan quic_viswanat@quicinc.com Reviewed-by: Konrad Dybcio konrad.dybcio@linaro.org Link: https://lore.kernel.org/r/20230905095535.1263113-2-quic_viswanat@quicinc.com Signed-off-by: Bjorn Andersson andersson@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm64/boot/dts/qcom/ipq6018.dtsi | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/arm64/boot/dts/qcom/ipq6018.dtsi +++ b/arch/arm64/boot/dts/qcom/ipq6018.dtsi @@ -389,7 +389,7 @@
tcsr_mutex: hwlock@1905000 { compatible = "qcom,ipq6018-tcsr-mutex", "qcom,tcsr-mutex"; - reg = <0x0 0x01905000 0x0 0x1000>; + reg = <0x0 0x01905000 0x0 0x20000>; #hwlock-cells = <1>; };
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Christian Marangi ansuelsmth@gmail.com
commit 259e33cbb1712a7dd844fc9757661cc47cb0e39b upstream.
GCC 13.2 complains about array subscript 17 is above array bounds of 'char[16]' with IFNAMSIZ set to 16.
The warning is correct but this scenario is impossible. set_device_name is called by device_name_store (store sysfs entry) and netdev_trig_activate.
device_name_store already check if size is >= of IFNAMSIZ and return -EINVAL. (making the warning scenario impossible)
netdev_trig_activate works on already defined interface, where the name has already been checked and should already follow the condition of strlen() < IFNAMSIZ.
Aside from the scenario being impossible, set_device_name can be improved to both mute the warning and make the function safer. To make it safer, move size check from device_name_store directly to set_device_name and prevent any out of bounds scenario.
Cc: stable@vger.kernel.org Fixes: 28a6a2ef18ad ("leds: trigger: netdev: refactor code setting device name") Reported-by: kernel test robot lkp@intel.com Closes: https://lore.kernel.org/oe-kbuild-all/202309192035.GTJEEbem-lkp@intel.com/ Signed-off-by: Christian Marangi ansuelsmth@gmail.com Link: https://lore.kernel.org/r/20231007131042.15032-1-ansuelsmth@gmail.com Signed-off-by: Lee Jones lee@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/leds/trigger/ledtrig-netdev.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/leds/trigger/ledtrig-netdev.c b/drivers/leds/trigger/ledtrig-netdev.c index 58f3352539e8..e358e77e4b38 100644 --- a/drivers/leds/trigger/ledtrig-netdev.c +++ b/drivers/leds/trigger/ledtrig-netdev.c @@ -221,6 +221,9 @@ static ssize_t device_name_show(struct device *dev, static int set_device_name(struct led_netdev_data *trigger_data, const char *name, size_t size) { + if (size >= IFNAMSIZ) + return -EINVAL; + cancel_delayed_work_sync(&trigger_data->work);
mutex_lock(&trigger_data->lock); @@ -263,9 +266,6 @@ static ssize_t device_name_store(struct device *dev, struct led_netdev_data *trigger_data = led_trigger_get_drvdata(dev); int ret;
- if (size >= IFNAMSIZ) - return -EINVAL; - ret = set_device_name(trigger_data, buf, size);
if (ret < 0)
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Johan Hovold johan+linaro@kernel.org
commit a0fa44c261e448c531f9adb3a5189a3520f3e316 upstream.
The Qualcomm SPMI PMIC revid implementation is broken in multiple ways.
First, it totally ignores struct device_node reference counting and leaks references to the parent bus node as well as each child it iterates over using an open-coded for_each_child_of_node().
Second, it leaks references to each spmi device on the bus that it iterates over by failing to drop the reference taken by the spmi_device_from_of() helper.
Fix the struct device_node leaks by reimplementing the lookup using for_each_child_of_node() and adding the missing reference count decrements. Fix the sibling struct device leaks by dropping the unnecessary lookups of devices with the wrong USID.
Note that this still leaves one struct device reference leak in case a base device is found but it is not the parent of the device used for the lookup. This will be addressed in a follow-on patch.
Fixes: e9c11c6e3a0e ("mfd: qcom-spmi-pmic: expose the PMIC revid information to clients") Cc: stable@vger.kernel.org # 6.0 Signed-off-by: Johan Hovold johan+linaro@kernel.org Acked-by: Caleb Connolly caleb.connolly@linaro.org Link: https://lore.kernel.org/r/20231003152927.15000-2-johan+linaro@kernel.org Signed-off-by: Lee Jones lee@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/mfd/qcom-spmi-pmic.c | 32 +++++++++++++++++++------------- 1 file changed, 19 insertions(+), 13 deletions(-)
--- a/drivers/mfd/qcom-spmi-pmic.c +++ b/drivers/mfd/qcom-spmi-pmic.c @@ -81,7 +81,7 @@ static struct spmi_device *qcom_pmic_get struct spmi_device *sdev; struct qcom_spmi_dev *ctx; struct device_node *spmi_bus; - struct device_node *other_usid = NULL; + struct device_node *child; int function_parent_usid, ret; u32 pmic_addr;
@@ -105,28 +105,34 @@ static struct spmi_device *qcom_pmic_get * device for USID 2. */ spmi_bus = of_get_parent(sdev->dev.of_node); - do { - other_usid = of_get_next_child(spmi_bus, other_usid); - - ret = of_property_read_u32_index(other_usid, "reg", 0, &pmic_addr); - if (ret) - return ERR_PTR(ret); + sdev = ERR_PTR(-ENODATA); + for_each_child_of_node(spmi_bus, child) { + ret = of_property_read_u32_index(child, "reg", 0, &pmic_addr); + if (ret) { + of_node_put(child); + sdev = ERR_PTR(ret); + break; + }
- sdev = spmi_device_from_of(other_usid); if (pmic_addr == function_parent_usid - (ctx->num_usids - 1)) { - if (!sdev) + sdev = spmi_device_from_of(child); + if (!sdev) { /* * If the base USID for this PMIC hasn't probed yet * but the secondary USID has, then we need to defer * the function driver so that it will attempt to * probe again when the base USID is ready. */ - return ERR_PTR(-EPROBE_DEFER); - return sdev; + sdev = ERR_PTR(-EPROBE_DEFER); + } + of_node_put(child); + break; } - } while (other_usid->sibling); + } + + of_node_put(spmi_bus);
- return ERR_PTR(-ENODATA); + return sdev; }
static int pmic_spmi_load_revid(struct regmap *map, struct device *dev,
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Johan Hovold johan+linaro@kernel.org
commit 7b439aaa62fee474a0d84d67a25f4984467e7b95 upstream.
The Qualcomm SPMI PMIC revid implementation is broken in multiple ways.
First, it assumes that just because the sibling base device has been registered that means that it is also bound to a driver, which may not be the case (e.g. due to probe deferral or asynchronous probe). This could trigger a NULL-pointer dereference when attempting to access the driver data of the unbound device.
Second, it accesses driver data of a sibling device directly and without any locking, which means that the driver data may be freed while it is being accessed (e.g. on driver unbind).
Third, it leaks a struct device reference to the sibling device which is looked up using the spmi_device_from_of() every time a function (child) device is calling the revid function (e.g. on probe).
Fix this mess by reimplementing the revid lookup so that it is done only at probe of the PMIC device; the base device fetches the revid info from the hardware, while any secondary SPMI device fetches the information from the base device and caches it so that it can be accessed safely from its children. If the base device has not been probed yet then probe of a secondary device is deferred.
Fixes: e9c11c6e3a0e ("mfd: qcom-spmi-pmic: expose the PMIC revid information to clients") Cc: stable@vger.kernel.org # 6.0 Signed-off-by: Johan Hovold johan+linaro@kernel.org Acked-by: Caleb Connolly caleb.connolly@linaro.org Link: https://lore.kernel.org/r/20231003152927.15000-3-johan+linaro@kernel.org Signed-off-by: Lee Jones lee@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/mfd/qcom-spmi-pmic.c | 69 +++++++++++++++++++++++++++++++++---------- 1 file changed, 53 insertions(+), 16 deletions(-)
--- a/drivers/mfd/qcom-spmi-pmic.c +++ b/drivers/mfd/qcom-spmi-pmic.c @@ -30,6 +30,8 @@ struct qcom_spmi_dev { struct qcom_spmi_pmic pmic; };
+static DEFINE_MUTEX(pmic_spmi_revid_lock); + #define N_USIDS(n) ((void *)n)
static const struct of_device_id pmic_spmi_id_table[] = { @@ -76,24 +78,21 @@ static const struct of_device_id pmic_sp * * This only supports PMICs with 1 or 2 USIDs. */ -static struct spmi_device *qcom_pmic_get_base_usid(struct device *dev) +static struct spmi_device *qcom_pmic_get_base_usid(struct spmi_device *sdev, struct qcom_spmi_dev *ctx) { - struct spmi_device *sdev; - struct qcom_spmi_dev *ctx; struct device_node *spmi_bus; struct device_node *child; int function_parent_usid, ret; u32 pmic_addr;
- sdev = to_spmi_device(dev); - ctx = dev_get_drvdata(&sdev->dev); - /* * Quick return if the function device is already in the base * USID. This will always be hit for PMICs with only 1 USID. */ - if (sdev->usid % ctx->num_usids == 0) + if (sdev->usid % ctx->num_usids == 0) { + get_device(&sdev->dev); return sdev; + }
function_parent_usid = sdev->usid;
@@ -118,10 +117,8 @@ static struct spmi_device *qcom_pmic_get sdev = spmi_device_from_of(child); if (!sdev) { /* - * If the base USID for this PMIC hasn't probed yet - * but the secondary USID has, then we need to defer - * the function driver so that it will attempt to - * probe again when the base USID is ready. + * If the base USID for this PMIC hasn't been + * registered yet then we need to defer. */ sdev = ERR_PTR(-EPROBE_DEFER); } @@ -135,6 +132,35 @@ static struct spmi_device *qcom_pmic_get return sdev; }
+static int pmic_spmi_get_base_revid(struct spmi_device *sdev, struct qcom_spmi_dev *ctx) +{ + struct qcom_spmi_dev *base_ctx; + struct spmi_device *base; + int ret = 0; + + base = qcom_pmic_get_base_usid(sdev, ctx); + if (IS_ERR(base)) + return PTR_ERR(base); + + /* + * Copy revid info from base device if it has probed and is still + * bound to its driver. + */ + mutex_lock(&pmic_spmi_revid_lock); + base_ctx = spmi_device_get_drvdata(base); + if (!base_ctx) { + ret = -EPROBE_DEFER; + goto out_unlock; + } + memcpy(&ctx->pmic, &base_ctx->pmic, sizeof(ctx->pmic)); +out_unlock: + mutex_unlock(&pmic_spmi_revid_lock); + + put_device(&base->dev); + + return ret; +} + static int pmic_spmi_load_revid(struct regmap *map, struct device *dev, struct qcom_spmi_pmic *pmic) { @@ -210,11 +236,7 @@ const struct qcom_spmi_pmic *qcom_pmic_g if (!of_match_device(pmic_spmi_id_table, dev->parent)) return ERR_PTR(-EINVAL);
- sdev = qcom_pmic_get_base_usid(dev->parent); - - if (IS_ERR(sdev)) - return ERR_CAST(sdev); - + sdev = to_spmi_device(dev->parent); spmi = dev_get_drvdata(&sdev->dev);
return &spmi->pmic; @@ -249,16 +271,31 @@ static int pmic_spmi_probe(struct spmi_d ret = pmic_spmi_load_revid(regmap, &sdev->dev, &ctx->pmic); if (ret < 0) return ret; + } else { + ret = pmic_spmi_get_base_revid(sdev, ctx); + if (ret) + return ret; } + + mutex_lock(&pmic_spmi_revid_lock); spmi_device_set_drvdata(sdev, ctx); + mutex_unlock(&pmic_spmi_revid_lock);
return devm_of_platform_populate(&sdev->dev); }
+static void pmic_spmi_remove(struct spmi_device *sdev) +{ + mutex_lock(&pmic_spmi_revid_lock); + spmi_device_set_drvdata(sdev, NULL); + mutex_unlock(&pmic_spmi_revid_lock); +} + MODULE_DEVICE_TABLE(of, pmic_spmi_id_table);
static struct spmi_driver pmic_spmi_driver = { .probe = pmic_spmi_probe, + .remove = pmic_spmi_remove, .driver = { .name = "pmic-spmi", .of_match_table = pmic_spmi_id_table,
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Amir Goldstein amir73il@gmail.com
commit e044374a8a0a99e46f4e6d6751d3042b6d9cc12e upstream.
It is not clear that IMA should be nested at all, but as long is it measures files both on overlayfs and on underlying fs, we need to annotate the iint mutex to avoid lockdep false positives related to IMA + overlayfs, same as overlayfs annotates the inode mutex.
Reported-and-tested-by: syzbot+b42fe626038981fb7bfa@syzkaller.appspotmail.com Signed-off-by: Amir Goldstein amir73il@gmail.com Cc: stable@vger.kernel.org Signed-off-by: Mimi Zohar zohar@linux.ibm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- security/integrity/iint.c | 48 +++++++++++++++++++++++++++++++++++----------- 1 file changed, 37 insertions(+), 11 deletions(-)
--- a/security/integrity/iint.c +++ b/security/integrity/iint.c @@ -66,9 +66,32 @@ struct integrity_iint_cache *integrity_i return iint; }
-static void iint_free(struct integrity_iint_cache *iint) +#define IMA_MAX_NESTING (FILESYSTEM_MAX_STACK_DEPTH+1) + +/* + * It is not clear that IMA should be nested at all, but as long is it measures + * files both on overlayfs and on underlying fs, we need to annotate the iint + * mutex to avoid lockdep false positives related to IMA + overlayfs. + * See ovl_lockdep_annotate_inode_mutex_key() for more details. + */ +static inline void iint_lockdep_annotate(struct integrity_iint_cache *iint, + struct inode *inode) +{ +#ifdef CONFIG_LOCKDEP + static struct lock_class_key iint_mutex_key[IMA_MAX_NESTING]; + + int depth = inode->i_sb->s_stack_depth; + + if (WARN_ON_ONCE(depth < 0 || depth >= IMA_MAX_NESTING)) + depth = 0; + + lockdep_set_class(&iint->mutex, &iint_mutex_key[depth]); +#endif +} + +static void iint_init_always(struct integrity_iint_cache *iint, + struct inode *inode) { - kfree(iint->ima_hash); iint->ima_hash = NULL; iint->version = 0; iint->flags = 0UL; @@ -80,6 +103,14 @@ static void iint_free(struct integrity_i iint->ima_creds_status = INTEGRITY_UNKNOWN; iint->evm_status = INTEGRITY_UNKNOWN; iint->measured_pcrs = 0; + mutex_init(&iint->mutex); + iint_lockdep_annotate(iint, inode); +} + +static void iint_free(struct integrity_iint_cache *iint) +{ + kfree(iint->ima_hash); + mutex_destroy(&iint->mutex); kmem_cache_free(iint_cache, iint); }
@@ -104,6 +135,8 @@ struct integrity_iint_cache *integrity_i if (!iint) return NULL;
+ iint_init_always(iint, inode); + write_lock(&integrity_iint_lock);
p = &integrity_iint_tree.rb_node; @@ -153,25 +186,18 @@ void integrity_inode_free(struct inode * iint_free(iint); }
-static void init_once(void *foo) +static void iint_init_once(void *foo) { struct integrity_iint_cache *iint = (struct integrity_iint_cache *) foo;
memset(iint, 0, sizeof(*iint)); - iint->ima_file_status = INTEGRITY_UNKNOWN; - iint->ima_mmap_status = INTEGRITY_UNKNOWN; - iint->ima_bprm_status = INTEGRITY_UNKNOWN; - iint->ima_read_status = INTEGRITY_UNKNOWN; - iint->ima_creds_status = INTEGRITY_UNKNOWN; - iint->evm_status = INTEGRITY_UNKNOWN; - mutex_init(&iint->mutex); }
static int __init integrity_iintcache_init(void) { iint_cache = kmem_cache_create("iint_cache", sizeof(struct integrity_iint_cache), - 0, SLAB_PANIC, init_once); + 0, SLAB_PANIC, iint_init_once); return 0; } DEFINE_LSM(integrity) = {
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mimi Zohar zohar@linux.ibm.com
commit b836c4d29f2744200b2af41e14bf50758dddc818 upstream.
Commit 18b44bc5a672 ("ovl: Always reevaluate the file signature for IMA") forced signature re-evaulation on every file access.
Instead of always re-evaluating the file's integrity, detect a change to the backing file, by comparing the cached file metadata with the backing file's metadata. Verifying just the i_version has not changed is insufficient. In addition save and compare the i_ino and s_dev as well.
Reviewed-by: Amir Goldstein amir73il@gmail.com Tested-by: Eric Snowberg eric.snowberg@oracle.com Tested-by: Raul E Rangel rrangel@chromium.org Cc: stable@vger.kernel.org Signed-off-by: Mimi Zohar zohar@linux.ibm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/overlayfs/super.c | 2 +- security/integrity/ima/ima_api.c | 5 +++++ security/integrity/ima/ima_main.c | 16 +++++++++++++++- security/integrity/integrity.h | 2 ++ 4 files changed, 23 insertions(+), 2 deletions(-)
--- a/fs/overlayfs/super.c +++ b/fs/overlayfs/super.c @@ -1467,7 +1467,7 @@ int ovl_fill_super(struct super_block *s ovl_trusted_xattr_handlers; sb->s_fs_info = ofs; sb->s_flags |= SB_POSIXACL; - sb->s_iflags |= SB_I_SKIP_SYNC | SB_I_IMA_UNVERIFIABLE_SIGNATURE; + sb->s_iflags |= SB_I_SKIP_SYNC;
err = -ENOMEM; root_dentry = ovl_get_root(sb, ctx->upper.dentry, oe); --- a/security/integrity/ima/ima_api.c +++ b/security/integrity/ima/ima_api.c @@ -243,6 +243,7 @@ int ima_collect_measurement(struct integ { const char *audit_cause = "failed"; struct inode *inode = file_inode(file); + struct inode *real_inode = d_real_inode(file_dentry(file)); const char *filename = file->f_path.dentry->d_name.name; struct ima_max_digest_data hash; struct kstat stat; @@ -302,6 +303,10 @@ int ima_collect_measurement(struct integ iint->ima_hash = tmpbuf; memcpy(iint->ima_hash, &hash, length); iint->version = i_version; + if (real_inode != inode) { + iint->real_ino = real_inode->i_ino; + iint->real_dev = real_inode->i_sb->s_dev; + }
/* Possibly temporary failure due to type of read (eg. O_DIRECT) */ if (!result) --- a/security/integrity/ima/ima_main.c +++ b/security/integrity/ima/ima_main.c @@ -25,6 +25,7 @@ #include <linux/xattr.h> #include <linux/ima.h> #include <linux/fs.h> +#include <linux/iversion.h>
#include "ima.h"
@@ -207,7 +208,7 @@ static int process_measurement(struct fi u32 secid, char *buf, loff_t size, int mask, enum ima_hooks func) { - struct inode *inode = file_inode(file); + struct inode *backing_inode, *inode = file_inode(file); struct integrity_iint_cache *iint = NULL; struct ima_template_desc *template_desc = NULL; char *pathbuf = NULL; @@ -284,6 +285,19 @@ static int process_measurement(struct fi iint->measured_pcrs = 0; }
+ /* Detect and re-evaluate changes made to the backing file. */ + backing_inode = d_real_inode(file_dentry(file)); + if (backing_inode != inode && + (action & IMA_DO_MASK) && (iint->flags & IMA_DONE_MASK)) { + if (!IS_I_VERSION(backing_inode) || + backing_inode->i_sb->s_dev != iint->real_dev || + backing_inode->i_ino != iint->real_ino || + !inode_eq_iversion(backing_inode, iint->version)) { + iint->flags &= ~IMA_DONE_MASK; + iint->measured_pcrs = 0; + } + } + /* Determine if already appraised/measured based on bitmask * (IMA_MEASURE, IMA_MEASURED, IMA_XXXX_APPRAISE, IMA_XXXX_APPRAISED, * IMA_AUDIT, IMA_AUDITED) --- a/security/integrity/integrity.h +++ b/security/integrity/integrity.h @@ -164,6 +164,8 @@ struct integrity_iint_cache { unsigned long flags; unsigned long measured_pcrs; unsigned long atomic_flags; + unsigned long real_ino; + dev_t real_dev; enum integrity_status ima_file_status:4; enum integrity_status ima_mmap_status:4; enum integrity_status ima_bprm_status:4;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pablo Neira Ayuso pablo@netfilter.org
[ Upstream commit 93995bf4af2c5a99e2a87f0cd5ce547d31eb7630 ]
The expired catchall element is not deactivated and removed from GC sync path. This path holds mutex so just call nft_setelem_data_deactivate() and nft_setelem_catchall_remove() before queueing the GC work.
Fixes: 4a9e12ea7e70 ("netfilter: nft_set_pipapo: call nft_trans_gc_queue_sync() in catchall GC") Reported-by: lonial con kongln9170@gmail.com Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Florian Westphal fw@strlen.de Signed-off-by: Sasha Levin sashal@kernel.org --- net/netfilter/nf_tables_api.c | 26 +++++++++++++++++++++----- 1 file changed, 21 insertions(+), 5 deletions(-)
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c index 398a1bcc6ea61..d676c87411dc1 100644 --- a/net/netfilter/nf_tables_api.c +++ b/net/netfilter/nf_tables_api.c @@ -6461,6 +6461,12 @@ static int nft_setelem_deactivate(const struct net *net, return ret; }
+static void nft_setelem_catchall_destroy(struct nft_set_elem_catchall *catchall) +{ + list_del_rcu(&catchall->list); + kfree_rcu(catchall, rcu); +} + static void nft_setelem_catchall_remove(const struct net *net, const struct nft_set *set, const struct nft_set_elem *elem) @@ -6469,8 +6475,7 @@ static void nft_setelem_catchall_remove(const struct net *net,
list_for_each_entry_safe(catchall, next, &set->catchall_list, list) { if (catchall->elem == elem->priv) { - list_del_rcu(&catchall->list); - kfree_rcu(catchall, rcu); + nft_setelem_catchall_destroy(catchall); break; } } @@ -9636,11 +9641,12 @@ static struct nft_trans_gc *nft_trans_gc_catchall(struct nft_trans_gc *gc, unsigned int gc_seq, bool sync) { - struct nft_set_elem_catchall *catchall; + struct nft_set_elem_catchall *catchall, *next; const struct nft_set *set = gc->set; + struct nft_elem_priv *elem_priv; struct nft_set_ext *ext;
- list_for_each_entry_rcu(catchall, &set->catchall_list, list) { + list_for_each_entry_safe(catchall, next, &set->catchall_list, list) { ext = nft_set_elem_ext(set, catchall->elem);
if (!nft_set_elem_expired(ext)) @@ -9658,7 +9664,17 @@ static struct nft_trans_gc *nft_trans_gc_catchall(struct nft_trans_gc *gc, if (!gc) return NULL;
- nft_trans_gc_elem_add(gc, catchall->elem); + elem_priv = catchall->elem; + if (sync) { + struct nft_set_elem elem = { + .priv = elem_priv, + }; + + nft_setelem_data_deactivate(gc->net, gc->set, &elem); + nft_setelem_catchall_destroy(catchall); + } + + nft_trans_gc_elem_add(gc, elem_priv); }
return gc;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pablo Neira Ayuso pablo@netfilter.org
[ Upstream commit 8837ba3e58ea1e3d09ae36db80b1e80853aada95 ]
list_for_each_entry_safe() does not work for the async case which runs under RCU, therefore, split GC logic for catchall in two functions instead, one for each of the sync and async GC variants.
The catchall sync GC variant never sees a _DEAD bit set on ever, thus, this handling is removed in such case, moreover, allocate GC sync batch via GFP_KERNEL.
Fixes: 93995bf4af2c ("netfilter: nf_tables: remove catchall element in GC sync path") Reported-by: Florian Westphal fw@strlen.de Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Florian Westphal fw@strlen.de Signed-off-by: Sasha Levin sashal@kernel.org --- net/netfilter/nf_tables_api.c | 61 ++++++++++++++++++----------------- 1 file changed, 32 insertions(+), 29 deletions(-)
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c index d676c87411dc1..db582c8d25f00 100644 --- a/net/netfilter/nf_tables_api.c +++ b/net/netfilter/nf_tables_api.c @@ -9637,16 +9637,14 @@ void nft_trans_gc_queue_sync_done(struct nft_trans_gc *trans) call_rcu(&trans->rcu, nft_trans_gc_trans_free); }
-static struct nft_trans_gc *nft_trans_gc_catchall(struct nft_trans_gc *gc, - unsigned int gc_seq, - bool sync) +struct nft_trans_gc *nft_trans_gc_catchall_async(struct nft_trans_gc *gc, + unsigned int gc_seq) { - struct nft_set_elem_catchall *catchall, *next; + struct nft_set_elem_catchall *catchall; const struct nft_set *set = gc->set; - struct nft_elem_priv *elem_priv; struct nft_set_ext *ext;
- list_for_each_entry_safe(catchall, next, &set->catchall_list, list) { + list_for_each_entry_rcu(catchall, &set->catchall_list, list) { ext = nft_set_elem_ext(set, catchall->elem);
if (!nft_set_elem_expired(ext)) @@ -9656,39 +9654,44 @@ static struct nft_trans_gc *nft_trans_gc_catchall(struct nft_trans_gc *gc,
nft_set_elem_dead(ext); dead_elem: - if (sync) - gc = nft_trans_gc_queue_sync(gc, GFP_ATOMIC); - else - gc = nft_trans_gc_queue_async(gc, gc_seq, GFP_ATOMIC); - + gc = nft_trans_gc_queue_async(gc, gc_seq, GFP_ATOMIC); if (!gc) return NULL;
- elem_priv = catchall->elem; - if (sync) { - struct nft_set_elem elem = { - .priv = elem_priv, - }; - - nft_setelem_data_deactivate(gc->net, gc->set, &elem); - nft_setelem_catchall_destroy(catchall); - } - - nft_trans_gc_elem_add(gc, elem_priv); + nft_trans_gc_elem_add(gc, catchall->elem); }
return gc; }
-struct nft_trans_gc *nft_trans_gc_catchall_async(struct nft_trans_gc *gc, - unsigned int gc_seq) -{ - return nft_trans_gc_catchall(gc, gc_seq, false); -} - struct nft_trans_gc *nft_trans_gc_catchall_sync(struct nft_trans_gc *gc) { - return nft_trans_gc_catchall(gc, 0, true); + struct nft_set_elem_catchall *catchall, *next; + const struct nft_set *set = gc->set; + struct nft_set_elem elem; + struct nft_set_ext *ext; + + WARN_ON_ONCE(!lockdep_commit_lock_is_held(gc->net)); + + list_for_each_entry_safe(catchall, next, &set->catchall_list, list) { + ext = nft_set_elem_ext(set, catchall->elem); + + if (!nft_set_elem_expired(ext)) + continue; + + gc = nft_trans_gc_queue_sync(gc, GFP_KERNEL); + if (!gc) + return NULL; + + memset(&elem, 0, sizeof(elem)); + elem.priv = catchall->elem; + + nft_setelem_data_deactivate(gc->net, gc->set, &elem); + nft_setelem_catchall_destroy(catchall); + nft_trans_gc_elem_add(gc, elem.priv); + } + + return gc; }
static void nf_tables_module_autoload_cleanup(struct net *net)
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Srinivas Kandagatla srinivas.kandagatla@linaro.org
commit f0220575e65abe09c09cd17826a3cdea76e8d58f upstream.
In some setups like Speaker amps which are very sensitive, ex: keeping them unmute without actual data stream for very short duration results in a static charge and results in pop and clicks. To minimize this, provide a way to mute and unmute such codecs during trigger callbacks.
Signed-off-by: Srinivas Kandagatla srinivas.kandagatla@linaro.org Tested-by: Johan Hovold johan+linaro@kernel.org Link: https://lore.kernel.org/r/20231027105747.32450-2-srinivas.kandagatla@linaro.... Signed-off-by: Mark Brown broonie@kernel.org [ johan: backport to v6.6.2 ] Signed-off-by: Johan Hovold johan+linaro@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/sound/soc-dai.h | 1 + sound/soc/soc-dai.c | 7 +++++++ sound/soc/soc-pcm.c | 12 ++++++++---- 3 files changed, 16 insertions(+), 4 deletions(-)
--- a/include/sound/soc-dai.h +++ b/include/sound/soc-dai.h @@ -355,6 +355,7 @@ struct snd_soc_dai_ops {
/* bit field */ unsigned int no_capture_mute:1; + unsigned int mute_unmute_on_trigger:1; };
struct snd_soc_cdai_ops { --- a/sound/soc/soc-dai.c +++ b/sound/soc/soc-dai.c @@ -641,6 +641,10 @@ int snd_soc_pcm_dai_trigger(struct snd_p ret = soc_dai_trigger(dai, substream, cmd); if (ret < 0) break; + + if (dai->driver->ops && dai->driver->ops->mute_unmute_on_trigger) + snd_soc_dai_digital_mute(dai, 0, substream->stream); + soc_dai_mark_push(dai, substream, trigger); } break; @@ -651,6 +655,9 @@ int snd_soc_pcm_dai_trigger(struct snd_p if (rollback && !soc_dai_mark_match(dai, substream, trigger)) continue;
+ if (dai->driver->ops && dai->driver->ops->mute_unmute_on_trigger) + snd_soc_dai_digital_mute(dai, 1, substream->stream); + r = soc_dai_trigger(dai, substream, cmd); if (r < 0) ret = r; /* use last ret */ --- a/sound/soc/soc-pcm.c +++ b/sound/soc/soc-pcm.c @@ -896,8 +896,10 @@ static int __soc_pcm_prepare(struct snd_ snd_soc_dapm_stream_event(rtd, substream->stream, SND_SOC_DAPM_STREAM_START);
- for_each_rtd_dais(rtd, i, dai) - snd_soc_dai_digital_mute(dai, 0, substream->stream); + for_each_rtd_dais(rtd, i, dai) { + if (dai->driver->ops && !dai->driver->ops->mute_unmute_on_trigger) + snd_soc_dai_digital_mute(dai, 0, substream->stream); + }
out: return soc_pcm_ret(rtd, ret); @@ -939,8 +941,10 @@ static int soc_pcm_hw_clean(struct snd_s if (snd_soc_dai_active(dai) == 1) soc_pcm_set_dai_params(dai, NULL);
- if (snd_soc_dai_stream_active(dai, substream->stream) == 1) - snd_soc_dai_digital_mute(dai, 1, substream->stream); + if (snd_soc_dai_stream_active(dai, substream->stream) == 1) { + if (dai->driver->ops && !dai->driver->ops->mute_unmute_on_trigger) + snd_soc_dai_digital_mute(dai, 1, substream->stream); + } }
/* run the stream event */
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Srinivas Kandagatla srinivas.kandagatla@linaro.org
commit 805ce81826c896dd3c351a32814b28557f9edf54 upstream.
In the current setup the PA is left unmuted even when the Soundwire ports are not started streaming. This can lead to click and pop sounds during start. There is a same issue in the reverse order where in the PA is left unmute even after the data stream is stopped, the time between data stream stopping and port closing is long enough to accumulate DC on the line resulting in Click/Pop noise during end of stream.
making use of new mute_unmute_on_trigger flag is helping a lot with this Click/Pop issues reported on this Codec
Signed-off-by: Srinivas Kandagatla srinivas.kandagatla@linaro.org Tested-by: Johan Hovold johan+linaro@kernel.org Link: https://lore.kernel.org/r/20231027105747.32450-3-srinivas.kandagatla@linaro.... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Johan Hovold johan+linaro@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/soc/codecs/wsa883x.c | 7 +------ 1 file changed, 1 insertion(+), 6 deletions(-)
--- a/sound/soc/codecs/wsa883x.c +++ b/sound/soc/codecs/wsa883x.c @@ -1203,9 +1203,6 @@ static int wsa883x_spkr_event(struct snd break; }
- snd_soc_component_write_field(component, WSA883X_DRE_CTL_1, - WSA883X_DRE_GAIN_EN_MASK, - WSA883X_DRE_GAIN_FROM_CSR); if (wsa883x->port_enable[WSA883X_PORT_COMP]) snd_soc_component_write_field(component, WSA883X_DRE_CTL_0, WSA883X_DRE_OFFSET_MASK, @@ -1218,9 +1215,6 @@ static int wsa883x_spkr_event(struct snd snd_soc_component_write_field(component, WSA883X_PDM_WD_CTL, WSA883X_PDM_EN_MASK, WSA883X_PDM_ENABLE); - snd_soc_component_write_field(component, WSA883X_PA_FSM_CTL, - WSA883X_GLOBAL_PA_EN_MASK, - WSA883X_GLOBAL_PA_ENABLE);
break; case SND_SOC_DAPM_PRE_PMD: @@ -1346,6 +1340,7 @@ static const struct snd_soc_dai_ops wsa8 .hw_free = wsa883x_hw_free, .mute_stream = wsa883x_digital_mute, .set_stream = wsa883x_set_sdw_stream, + .mute_unmute_on_trigger = true, };
static struct snd_soc_dai_driver wsa883x_dais[] = {
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ilpo Järvinen ilpo.jarvinen@linux.intel.com
commit beb7f471847663559bd0fe60af1d70e05a1d7c6c upstream.
signal_handler_unregister() calls sigaction() with uninitializing sa_flags in the struct sigaction.
Make sure sa_flags is always initialized in signal_handler_unregister() by initializing the struct sigaction when declaring it. Also add the initialization to signal_handler_register() even if there are no know bugs in there because correctness is then obvious from the code itself.
Fixes: 73c55fa5ab55 ("selftests/resctrl: Commonize the signal handler register/unregister for all tests") Suggested-by: Reinette Chatre reinette.chatre@intel.com Signed-off-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com Cc: stable@vger.kernel.org Reviewed-by: Reinette Chatre reinette.chatre@intel.com Signed-off-by: Shuah Khan skhan@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- tools/testing/selftests/resctrl/resctrl_val.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/resctrl/resctrl_val.c b/tools/testing/selftests/resctrl/resctrl_val.c index 51963a6f2186..01bbe11a8983 100644 --- a/tools/testing/selftests/resctrl/resctrl_val.c +++ b/tools/testing/selftests/resctrl/resctrl_val.c @@ -482,7 +482,7 @@ void ctrlc_handler(int signum, siginfo_t *info, void *ptr) */ int signal_handler_register(void) { - struct sigaction sigact; + struct sigaction sigact = {}; int ret = 0;
sigact.sa_sigaction = ctrlc_handler; @@ -504,7 +504,7 @@ int signal_handler_register(void) */ void signal_handler_unregister(void) { - struct sigaction sigact; + struct sigaction sigact = {};
sigact.sa_handler = SIG_DFL; sigemptyset(&sigact.sa_mask);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ilpo Järvinen ilpo.jarvinen@linux.intel.com
commit 030b48fb2cf045dead8ee2c5ead560930044c029 upstream.
The test runner run_cmt_test() in resctrl_tests.c checks for CMT feature and does not run cmt_resctrl_val() if CMT is not supported. Then cmt_resctrl_val() also check is CMT is supported.
Remove the duplicated feature check for CMT from cmt_resctrl_val().
Signed-off-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com Tested-by: Shaopeng Tan tan.shaopeng@jp.fujitsu.com Reviewed-by: Reinette Chatre reinette.chatre@intel.com Reviewed-by: Shaopeng Tan tan.shaopeng@jp.fujitsu.com Cc: stable@vger.kernel.org Signed-off-by: Shuah Khan skhan@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- tools/testing/selftests/resctrl/cmt_test.c | 3 --- 1 file changed, 3 deletions(-)
--- a/tools/testing/selftests/resctrl/cmt_test.c +++ b/tools/testing/selftests/resctrl/cmt_test.c @@ -90,9 +90,6 @@ int cmt_resctrl_val(int cpu_no, int n, c if (ret) return ret;
- if (!validate_resctrl_feature_request(CMT_STR)) - return -1; - ret = get_cbm_mask("L3", cbm_mask); if (ret) return ret;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ilpo Järvinen ilpo.jarvinen@linux.intel.com
commit 3a1e4a91aa454a1c589a9824d54179fdbfccde45 upstream.
_GNU_SOURCE is defined in resctrl.h. Defining _GNU_SOURCE has a large impact on what gets defined when including headers either before or after it. This can result in compile failures if .c file decides to include a standard header file before resctrl.h.
It is safer to define _GNU_SOURCE in Makefile so it is always defined regardless of in which order includes are done.
Signed-off-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com Tested-by: Shaopeng Tan tan.shaopeng@jp.fujitsu.com Reviewed-by: Reinette Chatre reinette.chatre@intel.com Reviewed-by: Shaopeng Tan tan.shaopeng@jp.fujitsu.com Cc: stable@vger.kernel.org Signed-off-by: Shuah Khan skhan@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- tools/testing/selftests/resctrl/Makefile | 2 +- tools/testing/selftests/resctrl/resctrl.h | 1 - 2 files changed, 1 insertion(+), 2 deletions(-)
--- a/tools/testing/selftests/resctrl/Makefile +++ b/tools/testing/selftests/resctrl/Makefile @@ -1,6 +1,6 @@ # SPDX-License-Identifier: GPL-2.0
-CFLAGS = -g -Wall -O2 -D_FORTIFY_SOURCE=2 +CFLAGS = -g -Wall -O2 -D_FORTIFY_SOURCE=2 -D_GNU_SOURCE CFLAGS += $(KHDR_INCLUDES)
TEST_GEN_PROGS := resctrl_tests --- a/tools/testing/selftests/resctrl/resctrl.h +++ b/tools/testing/selftests/resctrl/resctrl.h @@ -1,5 +1,4 @@ /* SPDX-License-Identifier: GPL-2.0 */ -#define _GNU_SOURCE #ifndef RESCTRL_H #define RESCTRL_H #include <stdio.h>
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ilpo Järvinen ilpo.jarvinen@linux.intel.com
commit ef43c30858754d99373a63dff33280a9969b49bc upstream.
The initial value of 5% chosen for the maximum allowed percentage difference between resctrl mbm value and IMC mbm value in
commit 06bd03a57f8c ("selftests/resctrl: Fix MBA/MBM results reporting format") was "randomly chosen value" (as admitted by the changelog).
When running tests in our lab across a large number platforms, 5% difference upper bound for success seems a bit on the low side for the MBA and MBM tests. Some platforms produce outliers that are slightly above that, typically 6-7%, which leads MBA/MBM test frequently failing.
Replace the "randomly chosen value" with a success bound that is based on those measurements across large number of platforms by relaxing the MBA/MBM success bound to 8%. The relaxed bound removes the failures due the frequent outliers.
Fixed commit description style error during merge: Shuah Khan skhan@linuxfoundation.org
Fixes: 06bd03a57f8c ("selftests/resctrl: Fix MBA/MBM results reporting format") Signed-off-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com Tested-by: Shaopeng Tan tan.shaopeng@jp.fujitsu.com Reviewed-by: Reinette Chatre reinette.chatre@intel.com Reviewed-by: Shaopeng Tan tan.shaopeng@jp.fujitsu.com Cc: stable@vger.kernel.org Signed-off-by: Shuah Khan skhan@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- tools/testing/selftests/resctrl/mba_test.c | 2 +- tools/testing/selftests/resctrl/mbm_test.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-)
--- a/tools/testing/selftests/resctrl/mba_test.c +++ b/tools/testing/selftests/resctrl/mba_test.c @@ -12,7 +12,7 @@
#define RESULT_FILE_NAME "result_mba" #define NUM_OF_RUNS 5 -#define MAX_DIFF_PERCENT 5 +#define MAX_DIFF_PERCENT 8 #define ALLOCATION_MAX 100 #define ALLOCATION_MIN 10 #define ALLOCATION_STEP 10 --- a/tools/testing/selftests/resctrl/mbm_test.c +++ b/tools/testing/selftests/resctrl/mbm_test.c @@ -11,7 +11,7 @@ #include "resctrl.h"
#define RESULT_FILE_NAME "result_mbm" -#define MAX_DIFF_PERCENT 5 +#define MAX_DIFF_PERCENT 8 #define NUM_OF_RUNS 5
static int
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jamie Lentin jm@lentin.co.uk
commit 2f2bd7cbd1d1548137b351040dc4e037d18cdfdc upstream.
The USB Compact Keyboard variant requires a reset_resume function to restore keyboard configuration after a suspend in some situations. Move configuration normally done on probe to lenovo_features_set_cptkbd(), then recycle this for use on reset_resume.
Without, the keyboard and driver would end up in an inconsistent state, breaking middle-button scrolling amongst other problems, and twiddling sysfs values wouldn't help as the middle-button mode won't be set until the driver is reloaded.
Tested on a USB and Bluetooth Thinkpad Compact Keyboard.
CC: stable@vger.kernel.org Fixes: 94eefa271323 ("HID: lenovo: Use native middle-button mode for compact keyboards") Signed-off-by: Jamie Lentin jm@lentin.co.uk Signed-off-by: Martin Kepplinger martink@posteo.de Link: https://lore.kernel.org/r/20231002150914.22101-1-martink@posteo.de Signed-off-by: Benjamin Tissoires bentiss@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/hid/hid-lenovo.c | 50 +++++++++++++++++++++++++++++++---------------- 1 file changed, 34 insertions(+), 16 deletions(-)
--- a/drivers/hid/hid-lenovo.c +++ b/drivers/hid/hid-lenovo.c @@ -526,6 +526,19 @@ static void lenovo_features_set_cptkbd(s int ret; struct lenovo_drvdata *cptkbd_data = hid_get_drvdata(hdev);
+ /* + * Tell the keyboard a driver understands it, and turn F7, F9, F11 into + * regular keys + */ + ret = lenovo_send_cmd_cptkbd(hdev, 0x01, 0x03); + if (ret) + hid_warn(hdev, "Failed to switch F7/9/11 mode: %d\n", ret); + + /* Switch middle button to native mode */ + ret = lenovo_send_cmd_cptkbd(hdev, 0x09, 0x01); + if (ret) + hid_warn(hdev, "Failed to switch middle button: %d\n", ret); + ret = lenovo_send_cmd_cptkbd(hdev, 0x05, cptkbd_data->fn_lock); if (ret) hid_err(hdev, "Fn-lock setting failed: %d\n", ret); @@ -1148,22 +1161,6 @@ static int lenovo_probe_cptkbd(struct hi } hid_set_drvdata(hdev, cptkbd_data);
- /* - * Tell the keyboard a driver understands it, and turn F7, F9, F11 into - * regular keys (Compact only) - */ - if (hdev->product == USB_DEVICE_ID_LENOVO_CUSBKBD || - hdev->product == USB_DEVICE_ID_LENOVO_CBTKBD) { - ret = lenovo_send_cmd_cptkbd(hdev, 0x01, 0x03); - if (ret) - hid_warn(hdev, "Failed to switch F7/9/11 mode: %d\n", ret); - } - - /* Switch middle button to native mode */ - ret = lenovo_send_cmd_cptkbd(hdev, 0x09, 0x01); - if (ret) - hid_warn(hdev, "Failed to switch middle button: %d\n", ret); - /* Set keyboard settings to known state */ cptkbd_data->middlebutton_state = 0; cptkbd_data->fn_lock = true; @@ -1286,6 +1283,24 @@ err: return ret; }
+#ifdef CONFIG_PM +static int lenovo_reset_resume(struct hid_device *hdev) +{ + switch (hdev->product) { + case USB_DEVICE_ID_LENOVO_CUSBKBD: + case USB_DEVICE_ID_LENOVO_TPIIUSBKBD: + if (hdev->type == HID_TYPE_USBMOUSE) + lenovo_features_set_cptkbd(hdev); + + break; + default: + break; + } + + return 0; +} +#endif + static void lenovo_remove_tpkbd(struct hid_device *hdev) { struct lenovo_drvdata *data_pointer = hid_get_drvdata(hdev); @@ -1402,6 +1417,9 @@ static struct hid_driver lenovo_driver = .raw_event = lenovo_raw_event, .event = lenovo_event, .report_fixup = lenovo_report_fixup, +#ifdef CONFIG_PM + .reset_resume = lenovo_reset_resume, +#endif }; module_hid_driver(lenovo_driver);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Krzysztof Kozlowski krzysztof.kozlowski@linaro.org
commit 72151ad0cba8a07df90130ff62c979520d71f23b upstream.
Driver compares widget name in wsa_macro_spk_boost_event() widget event callback, however it does not handle component's name prefix. This leads to using uninitialized stack variables as registers and register values. Handle gracefully such case.
Fixes: 2c4066e5d428 ("ASoC: codecs: lpass-wsa-macro: add dapm widgets and route") Cc: stable@vger.kernel.org Signed-off-by: Krzysztof Kozlowski krzysztof.kozlowski@linaro.org Link: https://lore.kernel.org/r/20231003155422.801160-1-krzysztof.kozlowski@linaro... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/soc/codecs/lpass-wsa-macro.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/sound/soc/codecs/lpass-wsa-macro.c +++ b/sound/soc/codecs/lpass-wsa-macro.c @@ -1685,6 +1685,9 @@ static int wsa_macro_spk_boost_event(str boost_path_cfg1 = CDC_WSA_RX1_RX_PATH_CFG1; reg = CDC_WSA_RX1_RX_PATH_CTL; reg_mix = CDC_WSA_RX1_RX_PATH_MIX_CTL; + } else { + dev_warn(component->dev, "Incorrect widget name in the driver\n"); + return -EINVAL; }
switch (event) {
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Zhihao Cheng chengzhihao1@huawei.com
commit 61187fce8600e8ef90e601be84f9d0f3222c1206 upstream.
JBD2 makes sure journal data is fallen on fs device by sync_blockdev(), however, other process could intercept the EIO information from bdev's mapping, which leads journal recovering successful even EIO occurs during data written back to fs device.
We found this problem in our product, iscsi + multipath is chosen for block device of ext4. Unstable network may trigger kpartx to rescan partitions in device mapper layer. Detailed process is shown as following:
mount kpartx irq jbd2_journal_recover do_one_pass memcpy(nbh->b_data, obh->b_data) // copy data to fs dev from journal mark_buffer_dirty // mark bh dirty vfs_read generic_file_read_iter // dio filemap_write_and_wait_range __filemap_fdatawrite_range do_writepages block_write_full_folio submit_bh_wbc >> EIO occurs in disk << end_buffer_async_write mark_buffer_write_io_error mapping_set_error set_bit(AS_EIO, &mapping->flags) // set! filemap_check_errors test_and_clear_bit(AS_EIO, &mapping->flags) // clear! err2 = sync_blockdev filemap_write_and_wait filemap_check_errors test_and_clear_bit(AS_EIO, &mapping->flags) // false err2 = 0
Filesystem is mounted successfully even data from journal is failed written into disk, and ext4/ocfs2 could become corrupted.
Fix it by comparing the wb_err state in fs block device before recovering and after recovering.
A reproducer can be found in the kernel bugzilla referenced below.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=217888 Cc: stable@vger.kernel.org Signed-off-by: Zhihao Cheng chengzhihao1@huawei.com Signed-off-by: Zhang Yi yi.zhang@huawei.com Reviewed-by: Jan Kara jack@suse.cz Link: https://lore.kernel.org/r/20230919012525.1783108-1-chengzhihao1@huawei.com Signed-off-by: Theodore Ts'o tytso@mit.edu Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/jbd2/recovery.c | 8 ++++++++ 1 file changed, 8 insertions(+)
--- a/fs/jbd2/recovery.c +++ b/fs/jbd2/recovery.c @@ -289,6 +289,8 @@ int jbd2_journal_recover(journal_t *jour journal_superblock_t * sb;
struct recovery_info info; + errseq_t wb_err; + struct address_space *mapping;
memset(&info, 0, sizeof(info)); sb = journal->j_superblock; @@ -306,6 +308,9 @@ int jbd2_journal_recover(journal_t *jour return 0; }
+ wb_err = 0; + mapping = journal->j_fs_dev->bd_inode->i_mapping; + errseq_check_and_advance(&mapping->wb_err, &wb_err); err = do_one_pass(journal, &info, PASS_SCAN); if (!err) err = do_one_pass(journal, &info, PASS_REVOKE); @@ -329,6 +334,9 @@ int jbd2_journal_recover(journal_t *jour err2 = sync_blockdev(journal->j_fs_dev); if (!err) err = err2; + err2 = errseq_check_and_advance(&mapping->wb_err, &wb_err); + if (!err) + err = err2; /* Make sure all replayed data is on permanent storage */ if (journal->j_flags & JBD2_BARRIER) { err2 = blkdev_issue_flush(journal->j_fs_dev);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Biggers ebiggers@google.com
commit d3cc1b0be258191d6360c82ea158c2972f8d3991 upstream.
Since commit d7e7b9af104c ("fscrypt: stop using keyrings subsystem for fscrypt_master_key"), xfstest generic/270 causes a WARNING when run on f2fs with test_dummy_encryption in the mount options:
$ kvm-xfstests -c f2fs/encrypt generic/270 [...] WARNING: CPU: 1 PID: 2453 at fs/crypto/keyring.c:240 fscrypt_destroy_keyring+0x1f5/0x260
The cause of the WARNING is that not all encrypted inodes have been evicted before fscrypt_destroy_keyring() is called, which violates an assumption. This happens because the test uses an external quota file, which gets automatically encrypted due to test_dummy_encryption.
Encryption of quota files has never really been supported. On ext4, ext4_quota_read() does not decrypt the data, so encrypted quota files are always considered invalid on ext4. On f2fs, f2fs_quota_read() uses the pagecache, so trying to use an encrypted quota file gets farther, resulting in the issue described above being possible. But this was never intended to be possible, and there is no use case for it.
Therefore, make the quota support layer explicitly reject using IS_ENCRYPTED inodes when quotaon is attempted.
Cc: stable@vger.kernel.org Signed-off-by: Eric Biggers ebiggers@google.com Signed-off-by: Jan Kara jack@suse.cz Message-Id: 20230905003227.326998-1-ebiggers@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/quota/dquot.c | 14 ++++++++++++++ 1 file changed, 14 insertions(+)
--- a/fs/quota/dquot.c +++ b/fs/quota/dquot.c @@ -2403,6 +2403,20 @@ static int vfs_setup_quota_inode(struct if (sb_has_quota_loaded(sb, type)) return -EBUSY;
+ /* + * Quota files should never be encrypted. They should be thought of as + * filesystem metadata, not user data. New-style internal quota files + * cannot be encrypted by users anyway, but old-style external quota + * files could potentially be incorrectly created in an encrypted + * directory, hence this explicit check. Some reasons why encrypted + * quota files don't work include: (1) some filesystems that support + * encryption don't handle it in their quota_read and quota_write, and + * (2) cleaning up encrypted quota files at unmount would need special + * consideration, as quota files are cleaned up later than user files. + */ + if (IS_ENCRYPTED(inode)) + return -EINVAL; + dqopt->files[type] = igrab(inode); if (!dqopt->files[type]) return -EIO;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Benjamin Bara benjamin.bara@skidata.com
commit 60466c067927abbcaff299845abd4b7069963139 upstream.
As the emergency restart does not call kernel_restart_prepare(), the system_state stays in SYSTEM_RUNNING.
Since bae1d3a05a8b, this hinders i2c_in_atomic_xfer_mode() from becoming active, and therefore might lead to avoidable warnings in the restart handlers, e.g.:
[ 12.667612] WARNING: CPU: 1 PID: 1 at kernel/rcu/tree_plugin.h:318 rcu_note_context_switch+0x33c/0x6b0 [ 12.676926] Voluntary context switch within RCU read-side critical section! ... [ 12.742376] schedule_timeout from wait_for_completion_timeout+0x90/0x114 [ 12.749179] wait_for_completion_timeout from tegra_i2c_wait_completion+0x40/0x70 ... [ 12.994527] atomic_notifier_call_chain from machine_restart+0x34/0x58 [ 13.001050] machine_restart from panic+0x2a8/0x32c
Avoid these by setting the correct system_state.
Fixes: bae1d3a05a8b ("i2c: core: remove use of in_atomic()") Cc: stable@vger.kernel.org # v5.2+ Reviewed-by: Dmitry Osipenko dmitry.osipenko@collabora.com Tested-by: Nishanth Menon nm@ti.com Signed-off-by: Benjamin Bara benjamin.bara@skidata.com Link: https://lore.kernel.org/r/20230327-tegra-pmic-reboot-v7-1-18699d5dcd76@skida... Signed-off-by: Lee Jones lee@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/reboot.c | 1 + 1 file changed, 1 insertion(+)
--- a/kernel/reboot.c +++ b/kernel/reboot.c @@ -74,6 +74,7 @@ void __weak (*pm_power_off)(void); void emergency_restart(void) { kmsg_dump(KMSG_DUMP_EMERG); + system_state = SYSTEM_RESTART; machine_emergency_restart(); } EXPORT_SYMBOL_GPL(emergency_restart);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Benjamin Bara benjamin.bara@skidata.com
commit aa49c90894d06e18a1ee7c095edbd2f37c232d02 upstream.
Since bae1d3a05a8b, i2c transfers are non-atomic if preemption is disabled. However, non-atomic i2c transfers require preemption (e.g. in wait_for_completion() while waiting for the DMA).
panic() calls preempt_disable_notrace() before calling emergency_restart(). Therefore, if an i2c device is used for the restart, the xfer should be atomic. This avoids warnings like:
[ 12.667612] WARNING: CPU: 1 PID: 1 at kernel/rcu/tree_plugin.h:318 rcu_note_context_switch+0x33c/0x6b0 [ 12.676926] Voluntary context switch within RCU read-side critical section! ... [ 12.742376] schedule_timeout from wait_for_completion_timeout+0x90/0x114 [ 12.749179] wait_for_completion_timeout from tegra_i2c_wait_completion+0x40/0x70 ... [ 12.994527] atomic_notifier_call_chain from machine_restart+0x34/0x58 [ 13.001050] machine_restart from panic+0x2a8/0x32c
Use !preemptible() instead, which is basically the same check as pre-v5.2.
Fixes: bae1d3a05a8b ("i2c: core: remove use of in_atomic()") Cc: stable@vger.kernel.org # v5.2+ Suggested-by: Dmitry Osipenko dmitry.osipenko@collabora.com Acked-by: Wolfram Sang wsa@kernel.org Reviewed-by: Dmitry Osipenko dmitry.osipenko@collabora.com Tested-by: Nishanth Menon nm@ti.com Signed-off-by: Benjamin Bara benjamin.bara@skidata.com Link: https://lore.kernel.org/r/20230327-tegra-pmic-reboot-v7-2-18699d5dcd76@skida... Signed-off-by: Lee Jones lee@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/i2c/i2c-core.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/i2c/i2c-core.h +++ b/drivers/i2c/i2c-core.h @@ -29,7 +29,7 @@ int i2c_dev_irq_from_resources(const str */ static inline bool i2c_in_atomic_xfer_mode(void) { - return system_state > SYSTEM_RUNNING && irqs_disabled(); + return system_state > SYSTEM_RUNNING && !preemptible(); }
static inline int __i2c_lock_bus_helper(struct i2c_adapter *adap)
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tiezhu Yang yangtiezhu@loongson.cn
commit fc7f04dc23db50206bee7891516ed4726c3f64cf upstream.
When execute the following command to test clone3 under !CONFIG_TIME_NS:
# make headers && cd tools/testing/selftests/clone3 && make && ./clone3
we can see the following error info:
# [7538] Trying clone3() with flags 0x80 (size 0) # Invalid argument - Failed to create new process # [7538] clone3() with flags says: -22 expected 0 not ok 18 [7538] Result (-22) is different than expected (0) ... # Totals: pass:18 fail:1 xfail:0 xpass:0 skip:0 error:0
This is because if CONFIG_TIME_NS is not set, but the flag CLONE_NEWTIME (0x80) is used to clone a time namespace, it will return -EINVAL in copy_time_ns().
If kernel does not support CONFIG_TIME_NS, /proc/self/ns/time will be not exist, and then we should skip clone3() test with CLONE_NEWTIME.
With this patch under !CONFIG_TIME_NS:
# make headers && cd tools/testing/selftests/clone3 && make && ./clone3 ... # Time namespaces are not supported ok 18 # SKIP Skipping clone3() with CLONE_NEWTIME ... # Totals: pass:18 fail:0 xfail:0 xpass:0 skip:1 error:0
Link: https://lkml.kernel.org/r/1689066814-13295-1-git-send-email-yangtiezhu@loong... Fixes: 515bddf0ec41 ("selftests/clone3: test clone3 with CLONE_NEWTIME") Signed-off-by: Tiezhu Yang yangtiezhu@loongson.cn Suggested-by: Thomas Gleixner tglx@linutronix.de Cc: Christian Brauner brauner@kernel.org Cc: Shuah Khan shuah@kernel.org Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- tools/testing/selftests/clone3/clone3.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
--- a/tools/testing/selftests/clone3/clone3.c +++ b/tools/testing/selftests/clone3/clone3.c @@ -196,7 +196,12 @@ int main(int argc, char *argv[]) CLONE3_ARGS_NO_TEST);
/* Do a clone3() in a new time namespace */ - test_clone3(CLONE_NEWTIME, 0, 0, CLONE3_ARGS_NO_TEST); + if (access("/proc/self/ns/time", F_OK) == 0) { + test_clone3(CLONE_NEWTIME, 0, 0, CLONE3_ARGS_NO_TEST); + } else { + ksft_print_msg("Time namespaces are not supported\n"); + ksft_test_result_skip("Skipping clone3() with CLONE_NEWTIME\n"); + }
/* Do a clone3() with exit signal (SIGCHLD) in flags */ test_clone3(SIGCHLD, 0, -EINVAL, CLONE3_ARGS_NO_TEST);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Steven Rostedt (Google) rostedt@goodmis.org
commit 4f7969bcd6d33042d62e249b41b5578161e4c868 upstream.
A synthetic event is created by the synthetic event interface that can read both user or kernel address memory. In reality, it reads any arbitrary memory location from within the kernel. If the address space is in USER (where CONFIG_ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE is set) then it uses strncpy_from_user_nofault() to copy strings otherwise it uses strncpy_from_kernel_nofault().
But since both functions use the same variable there's no annotation to what that variable is (ie. __user). This makes sparse complain.
Quiet sparse by typecasting the strncpy_from_user_nofault() variable to a __user pointer.
Link: https://lore.kernel.org/linux-trace-kernel/20231031151033.73c42e23@gandalf.l...
Cc: stable@vger.kernel.org Cc: Masami Hiramatsu mhiramat@kernel.org Cc: Mark Rutland mark.rutland@arm.com Fixes: 0934ae9977c2 ("tracing: Fix reading strings from synthetic events"); Reported-by: kernel test robot lkp@intel.com Closes: https://lore.kernel.org/oe-kbuild-all/202311010013.fm8WTxa5-lkp@intel.com/ Signed-off-by: Steven Rostedt (Google) rostedt@goodmis.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/trace/trace_events_synth.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/kernel/trace/trace_events_synth.c +++ b/kernel/trace/trace_events_synth.c @@ -452,7 +452,7 @@ static unsigned int trace_string(struct
#ifdef CONFIG_ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE if ((unsigned long)str_val < TASK_SIZE) - ret = strncpy_from_user_nofault(str_field, str_val, STR_VAR_LEN_MAX); + ret = strncpy_from_user_nofault(str_field, (const void __user *)str_val, STR_VAR_LEN_MAX); else #endif ret = strncpy_from_kernel_nofault(str_field, str_val, STR_VAR_LEN_MAX);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Saravana Kannan saravanak@google.com
commit 2e84dc37920012b458e9458b19fc4ed33f81bc74 upstream.
This commit fixes a bug in commit 9ed9895370ae ("driver core: Functional dependencies tracking support") where the device link status was incorrectly updated in the driver unbind path before all the device's resources were released.
Fixes: 9ed9895370ae ("driver core: Functional dependencies tracking support") Cc: stable stable@kernel.org Reported-by: Uwe Kleine-König u.kleine-koenig@pengutronix.de Closes: https://lore.kernel.org/all/20231014161721.f4iqyroddkcyoefo@pengutronix.de/ Signed-off-by: Saravana Kannan saravanak@google.com Cc: Thierry Reding thierry.reding@gmail.com Cc: Yang Yingliang yangyingliang@huawei.com Cc: Andy Shevchenko andriy.shevchenko@linux.intel.com Cc: Mark Brown broonie@kernel.org Cc: Matti Vaittinen mazziesaccount@gmail.com Cc: James Clark james.clark@arm.com Acked-by: "Rafael J. Wysocki" rafael@kernel.org Tested-by: Uwe Kleine-König u.kleine-koenig@pengutronix.de Acked-by: Uwe Kleine-König u.kleine-koenig@pengutronix.de Link: https://lore.kernel.org/r/20231018013851.3303928-1-saravanak@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/base/dd.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/base/dd.c +++ b/drivers/base/dd.c @@ -1274,8 +1274,8 @@ static void __device_release_driver(stru if (dev->bus && dev->bus->dma_cleanup) dev->bus->dma_cleanup(dev);
- device_links_driver_cleanup(dev); device_unbind_cleanup(dev); + device_links_driver_cleanup(dev);
klist_remove(&dev->p->knode_driver); device_pm_check_callbacks(dev);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sanjuán García, Jorge Jorge.SanjuanGarcia@duagon.com
commit 63ba2d07b4be72b94216d20561f43e1150b25d98 upstream.
chameleon_parse_gdd() may fail for different reasons and end up in the err tag. Make sure we at least always free the mcb_device allocated with mcb_alloc_dev().
If mcb_device_register() fails, make sure to give up the reference in the same place the device was added.
Fixes: 728ac3389296 ("mcb: mcb-parse: fix error handing in chameleon_parse_gdd()") Cc: stable stable@kernel.org Reviewed-by: Jose Javier Rodriguez Barbarin JoseJavier.Rodriguez@duagon.com Signed-off-by: Jorge Sanjuan Garcia jorge.sanjuangarcia@duagon.com Link: https://lore.kernel.org/r/20231019141434.57971-2-jorge.sanjuangarcia@duagon.... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/mcb/mcb-core.c | 1 + drivers/mcb/mcb-parse.c | 2 +- 2 files changed, 2 insertions(+), 1 deletion(-)
--- a/drivers/mcb/mcb-core.c +++ b/drivers/mcb/mcb-core.c @@ -246,6 +246,7 @@ int mcb_device_register(struct mcb_bus * return 0;
out: + put_device(&dev->dev);
return ret; } --- a/drivers/mcb/mcb-parse.c +++ b/drivers/mcb/mcb-parse.c @@ -106,7 +106,7 @@ static int chameleon_parse_gdd(struct mc return 0;
err: - put_device(&mdev->dev); + mcb_free_dev(mdev);
return ret; }
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Gaurav Batra gbatra@linux.vnet.ibm.com
commit 3bf983e4e93ce8e6d69e9d63f52a66ec0856672e upstream.
When a device is initialized, the driver invokes dma_supported() twice - first for streaming mappings followed by coherent mappings. For an SR-IOV device, default window is deleted and DDW created. With vPMEM enabled, TCE mappings are dynamically created for both vPMEM and SR-IOV device. There are no direct mappings.
First time when dma_supported() is called with 64 bit mask, DDW is created and marked as dynamic window. The second time dma_supported() is called, enable_ddw() finds existing window for the device and incorrectly returns it as "direct mapping".
This only happens when size of DDW is big enough to map max LPAR memory.
This results in streaming TCEs to not get dynamically mapped, since code incorrently assumes these are already pre-mapped. The adapter initially comes up but goes down due to EEH.
Fixes: 381ceda88c4c ("powerpc/pseries/iommu: Make use of DDW for indirect mapping") Cc: stable@vger.kernel.org # v5.15+ Signed-off-by: Gaurav Batra gbatra@linux.vnet.ibm.com Signed-off-by: Michael Ellerman mpe@ellerman.id.au Link: https://msgid.link/20231003030802.47914-1-gbatra@linux.vnet.ibm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/powerpc/platforms/pseries/iommu.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
--- a/arch/powerpc/platforms/pseries/iommu.c +++ b/arch/powerpc/platforms/pseries/iommu.c @@ -916,7 +916,8 @@ static int remove_ddw(struct device_node return 0; }
-static bool find_existing_ddw(struct device_node *pdn, u64 *dma_addr, int *window_shift) +static bool find_existing_ddw(struct device_node *pdn, u64 *dma_addr, int *window_shift, + bool *direct_mapping) { struct dma_win *window; const struct dynamic_dma_window_prop *dma64; @@ -929,6 +930,7 @@ static bool find_existing_ddw(struct dev dma64 = window->prop; *dma_addr = be64_to_cpu(dma64->dma_base); *window_shift = be32_to_cpu(dma64->window_shift); + *direct_mapping = window->direct; found = true; break; } @@ -1272,10 +1274,8 @@ static bool enable_ddw(struct pci_dev *d
mutex_lock(&dma_win_init_mutex);
- if (find_existing_ddw(pdn, &dev->dev.archdata.dma_offset, &len)) { - direct_mapping = (len >= max_ram_len); + if (find_existing_ddw(pdn, &dev->dev.archdata.dma_offset, &len, &direct_mapping)) goto out_unlock; - }
/* * If we already went through this for a previous function of
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alain Volmat alain.volmat@foss.st.com
commit 03f25d53b145bc2f7ccc82fc04e4482ed734f524 upstream.
In case of the prep descriptor while the channel is already running, the CCR register value stored into the channel could already have its EN bit set. This would lead to a bad transfer since, at start transfer time, enabling the channel while other registers aren't yet properly set. To avoid this, ensure to mask the CCR_EN bit when storing the ccr value into the mdma channel structure.
Fixes: a4ffb13c8946 ("dmaengine: Add STM32 MDMA driver") Signed-off-by: Alain Volmat alain.volmat@foss.st.com Signed-off-by: Amelie Delaunay amelie.delaunay@foss.st.com Cc: stable@vger.kernel.org Tested-by: Alain Volmat alain.volmat@foss.st.com Link: https://lore.kernel.org/r/20231009082450.452877-1-amelie.delaunay@foss.st.co... Signed-off-by: Vinod Koul vkoul@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/dma/stm32-mdma.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/dma/stm32-mdma.c +++ b/drivers/dma/stm32-mdma.c @@ -490,7 +490,7 @@ static int stm32_mdma_set_xfer_param(str src_maxburst = chan->dma_config.src_maxburst; dst_maxburst = chan->dma_config.dst_maxburst;
- ccr = stm32_mdma_read(dmadev, STM32_MDMA_CCR(chan->id)); + ccr = stm32_mdma_read(dmadev, STM32_MDMA_CCR(chan->id)) & ~STM32_MDMA_CCR_EN; ctcr = stm32_mdma_read(dmadev, STM32_MDMA_CTCR(chan->id)); ctbr = stm32_mdma_read(dmadev, STM32_MDMA_CTBR(chan->id));
@@ -966,7 +966,7 @@ stm32_mdma_prep_dma_memcpy(struct dma_ch if (!desc) return NULL;
- ccr = stm32_mdma_read(dmadev, STM32_MDMA_CCR(chan->id)); + ccr = stm32_mdma_read(dmadev, STM32_MDMA_CCR(chan->id)) & ~STM32_MDMA_CCR_EN; ctcr = stm32_mdma_read(dmadev, STM32_MDMA_CTCR(chan->id)); ctbr = stm32_mdma_read(dmadev, STM32_MDMA_CTBR(chan->id)); cbndtr = stm32_mdma_read(dmadev, STM32_MDMA_CBNDTR(chan->id));
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Heiko Carstens hca@linux.ibm.com
commit 09cda0a400519b1541591c506e54c9c48e3101bf upstream.
If the cmma no-dat feature is available all pages that are not used for dynamic address translation are marked as "no-dat" with the ESSA instruction. This information is visible to the hypervisor, so that the hypervisor can optimize purging of guest TLB entries. This also means that pages which are used for dynamic address translation must not be marked as "no-dat", since the hypervisor may then incorrectly not purge guest TLB entries.
Region and segment tables allocated via vmem_crst_alloc() are incorrectly marked as "no-dat", as soon as slab_is_available() returns true.
Such tables are allocated e.g. when kernel page tables are split, memory is hotplugged, or a DCSS segment is loaded.
Fix this by adding the missing arch_set_page_dat() call.
Cc: stable@vger.kernel.org Reviewed-by: Claudio Imbrenda imbrenda@linux.ibm.com Signed-off-by: Heiko Carstens hca@linux.ibm.com Signed-off-by: Vasily Gorbik gor@linux.ibm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/s390/mm/vmem.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
--- a/arch/s390/mm/vmem.c +++ b/arch/s390/mm/vmem.c @@ -13,6 +13,7 @@ #include <linux/hugetlb.h> #include <linux/slab.h> #include <linux/sort.h> +#include <asm/page-states.h> #include <asm/cacheflush.h> #include <asm/nospec-branch.h> #include <asm/pgalloc.h> @@ -46,8 +47,11 @@ void *vmem_crst_alloc(unsigned long val) unsigned long *table;
table = vmem_alloc_pages(CRST_ALLOC_ORDER); - if (table) - crst_table_init(table, val); + if (!table) + return NULL; + crst_table_init(table, val); + if (slab_is_available()) + arch_set_page_dat(virt_to_page(table), CRST_ALLOC_ORDER); return table; }
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Heiko Carstens hca@linux.ibm.com
commit 16ba44826a04834d3eeeda4b731c2ea3481062b7 upstream.
If the cmma no-dat feature is available the kernel page tables are walked to identify and mark all pages which are used for address translation (all region, segment, and page tables). In a subsequent loop all other pages are marked as "no-dat" pages with the ESSA instruction.
This information is visible to the hypervisor, so that the hypervisor can optimize purging of guest TLB entries. The initial loop however does not cover the complete kernel address space. This can result in pages being marked as not being used for dynamic address translation, even though they are. In turn guest TLB entries incorrectly may not be purged.
Fix this by adjusting the end address of the kernel address range being walked.
Cc: stable@vger.kernel.org Reviewed-by: Claudio Imbrenda imbrenda@linux.ibm.com Reviewed-by: Alexander Gordeev agordeev@linux.ibm.com Signed-off-by: Heiko Carstens hca@linux.ibm.com Signed-off-by: Vasily Gorbik gor@linux.ibm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/s390/mm/page-states.c | 13 ++++++++++--- 1 file changed, 10 insertions(+), 3 deletions(-)
--- a/arch/s390/mm/page-states.c +++ b/arch/s390/mm/page-states.c @@ -151,15 +151,22 @@ static void mark_kernel_p4d(pgd_t *pgd,
static void mark_kernel_pgd(void) { - unsigned long addr, next; + unsigned long addr, next, max_addr; struct page *page; pgd_t *pgd; int i;
addr = 0; + /* + * Figure out maximum virtual address accessible with the + * kernel ASCE. This is required to keep the page table walker + * from accessing non-existent entries. + */ + max_addr = (S390_lowcore.kernel_asce.val & _ASCE_TYPE_MASK) >> 2; + max_addr = 1UL << (max_addr * 11 + 31); pgd = pgd_offset_k(addr); do { - next = pgd_addr_end(addr, MODULES_END); + next = pgd_addr_end(addr, max_addr); if (pgd_none(*pgd)) continue; if (!pgd_folded(*pgd)) { @@ -168,7 +175,7 @@ static void mark_kernel_pgd(void) set_bit(PG_arch_1, &page[i].flags); } mark_kernel_p4d(pgd, addr, next); - } while (pgd++, addr = next, addr != MODULES_END); + } while (pgd++, addr = next, addr != max_addr); }
void __init cmma_init_nodat(void)
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Heiko Carstens hca@linux.ibm.com
commit 44d93045247661acbd50b1629e62f415f2747577 upstream.
If the cmma no-dat feature is available the kernel page tables are walked to identify and mark all pages which are used for address translation (all region, segment, and page tables). In a subsequent loop all other pages are marked as "no-dat" pages with the ESSA instruction.
This information is visible to the hypervisor, so that the hypervisor can optimize purging of guest TLB entries. The initial loop however is incorrect: only the first three of the four pages which belong to segment and region tables will be marked as being used for DAT. The last page is incorrectly marked as no-dat.
This can result in incorrect guest TLB flushes.
Fix this by simply marking all four pages.
Cc: stable@vger.kernel.org Reviewed-by: Claudio Imbrenda imbrenda@linux.ibm.com Signed-off-by: Heiko Carstens hca@linux.ibm.com Signed-off-by: Vasily Gorbik gor@linux.ibm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/s390/mm/page-states.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
--- a/arch/s390/mm/page-states.c +++ b/arch/s390/mm/page-states.c @@ -121,7 +121,7 @@ static void mark_kernel_pud(p4d_t *p4d, continue; if (!pud_folded(*pud)) { page = phys_to_page(pud_val(*pud)); - for (i = 0; i < 3; i++) + for (i = 0; i < 4; i++) set_bit(PG_arch_1, &page[i].flags); } mark_kernel_pmd(pud, addr, next); @@ -142,7 +142,7 @@ static void mark_kernel_p4d(pgd_t *pgd, continue; if (!p4d_folded(*p4d)) { page = phys_to_page(p4d_val(*p4d)); - for (i = 0; i < 3; i++) + for (i = 0; i < 4; i++) set_bit(PG_arch_1, &page[i].flags); } mark_kernel_pud(p4d, addr, next); @@ -171,7 +171,7 @@ static void mark_kernel_pgd(void) continue; if (!pgd_folded(*pgd)) { page = phys_to_page(pgd_val(*pgd)); - for (i = 0; i < 3; i++) + for (i = 0; i < 4; i++) set_bit(PG_arch_1, &page[i].flags); } mark_kernel_p4d(pgd, addr, next);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Heiko Carstens hca@linux.ibm.com
commit 84bb41d5df48868055d159d9247b80927f1f70f9 upstream.
If the cmma no-dat feature is available the kernel page tables are walked to identify and mark all pages which are used for address translation (all region, segment, and page tables). In a subsequent loop all other pages are marked as "no-dat" pages with the ESSA instruction.
This information is visible to the hypervisor, so that the hypervisor can optimize purging of guest TLB entries. All pages used for swapper_pg_dir and invalid_pg_dir are incorrectly marked as no-dat, which in turn can result in incorrect guest TLB flushes.
Fix this by marking those pages correctly as being used for DAT.
Cc: stable@vger.kernel.org Reviewed-by: Claudio Imbrenda imbrenda@linux.ibm.com Signed-off-by: Heiko Carstens hca@linux.ibm.com Signed-off-by: Vasily Gorbik gor@linux.ibm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/s390/mm/page-states.c | 6 ++++++ 1 file changed, 6 insertions(+)
--- a/arch/s390/mm/page-states.c +++ b/arch/s390/mm/page-states.c @@ -188,6 +188,12 @@ void __init cmma_init_nodat(void) return; /* Mark pages used in kernel page tables */ mark_kernel_pgd(); + page = virt_to_page(&swapper_pg_dir); + for (i = 0; i < 4; i++) + set_bit(PG_arch_1, &page[i].flags); + page = virt_to_page(&invalid_pg_dir); + for (i = 0; i < 4; i++) + set_bit(PG_arch_1, &page[i].flags);
/* Set all kernel pages not used for page tables to stable/no-dat */ for_each_mem_pfn_range(i, MAX_NUMNODES, &start, &end, NULL) {
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Zi Yan ziy@nvidia.com
commit 2e7cfe5cd5b6b0b98abf57a3074885979e187c1c upstream.
Patch series "Use nth_page() in place of direct struct page manipulation", v3.
On SPARSEMEM without VMEMMAP, struct page is not guaranteed to be contiguous, since each memory section's memmap might be allocated independently. hugetlb pages can go beyond a memory section size, thus direct struct page manipulation on hugetlb pages/subpages might give wrong struct page. Kernel provides nth_page() to do the manipulation properly. Use that whenever code can see hugetlb pages.
This patch (of 5):
When dealing with hugetlb pages, manipulating struct page pointers directly can get to wrong struct page, since struct page is not guaranteed to be contiguous on SPARSEMEM without VMEMMAP. Use nth_page() to handle it properly.
Without the fix, page_kasan_tag_reset() could reset wrong page tags, causing a wrong kasan result. No related bug is reported. The fix comes from code inspection.
Link: https://lkml.kernel.org/r/20230913201248.452081-1-zi.yan@sent.com Link: https://lkml.kernel.org/r/20230913201248.452081-2-zi.yan@sent.com Fixes: 2813b9c02962 ("kasan, mm, arm64: tag non slab memory allocated via pagealloc") Signed-off-by: Zi Yan ziy@nvidia.com Reviewed-by: Muchun Song songmuchun@bytedance.com Cc: David Hildenbrand david@redhat.com Cc: Matthew Wilcox (Oracle) willy@infradead.org Cc: Mike Kravetz mike.kravetz@oracle.com Cc: Mike Rapoport (IBM) rppt@kernel.org Cc: Thomas Bogendoerfer tsbogend@alpha.franken.de Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/cma.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/mm/cma.c +++ b/mm/cma.c @@ -501,7 +501,7 @@ struct page *cma_alloc(struct cma *cma, */ if (page) { for (i = 0; i < count; i++) - page_kasan_tag_reset(page + i); + page_kasan_tag_reset(nth_page(page, i)); }
if (ret && !no_warn) {
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Zi Yan ziy@nvidia.com
commit 1640a0ef80f6d572725f5b0330038c18e98ea168 upstream.
When dealing with hugetlb pages, manipulating struct page pointers directly can get to wrong struct page, since struct page is not guaranteed to be contiguous on SPARSEMEM without VMEMMAP. Use pfn calculation to handle it properly.
Without the fix, a wrong number of page might be skipped. Since skip cannot be negative, scan_movable_page() will end early and might miss a movable page with -ENOENT. This might fail offline_pages(). No bug is reported. The fix comes from code inspection.
Link: https://lkml.kernel.org/r/20230913201248.452081-4-zi.yan@sent.com Fixes: eeb0efd071d8 ("mm,memory_hotplug: fix scan_movable_pages() for gigantic hugepages") Signed-off-by: Zi Yan ziy@nvidia.com Reviewed-by: Muchun Song songmuchun@bytedance.com Acked-by: David Hildenbrand david@redhat.com Cc: Matthew Wilcox (Oracle) willy@infradead.org Cc: Mike Kravetz mike.kravetz@oracle.com Cc: Mike Rapoport (IBM) rppt@kernel.org Cc: Thomas Bogendoerfer tsbogend@alpha.franken.de Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/memory_hotplug.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/mm/memory_hotplug.c +++ b/mm/memory_hotplug.c @@ -1586,7 +1586,7 @@ static int scan_movable_pages(unsigned l */ if (HPageMigratable(head)) goto found; - skip = compound_nr(head) - (page - head); + skip = compound_nr(head) - (pfn - page_to_pfn(head)); pfn += skip - 1; } return -ENOENT;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Florent Revest revest@chromium.org
commit 0da668333fb07805c2836d5d50e26eda915b24a1 upstream.
Defining a prctl flag as an int is a footgun because on a 64 bit machine and with a variadic implementation of prctl (like in musl and glibc), when used directly as a prctl argument, it can get casted to long with garbage upper bits which would result in unexpected behaviors.
This patch changes the constant to an unsigned long to eliminate that possibilities. This does not break UAPI.
I think that a stable backport would be "nice to have": to reduce the chances that users build binaries that could end up with garbage bits in their MDWE prctl arguments. We are not aware of anyone having yet encountered this corner case with MDWE prctls but a backport would reduce the likelihood it happens, since this sort of issues has happened with other prctls. But If this is perceived as a backporting burden, I suppose we could also live without a stable backport.
Link: https://lkml.kernel.org/r/20230828150858.393570-5-revest@chromium.org Fixes: b507808ebce2 ("mm: implement memory-deny-write-execute as a prctl") Signed-off-by: Florent Revest revest@chromium.org Suggested-by: Alexey Izbyshev izbyshev@ispras.ru Reviewed-by: David Hildenbrand david@redhat.com Reviewed-by: Kees Cook keescook@chromium.org Acked-by: Catalin Marinas catalin.marinas@arm.com Cc: Anshuman Khandual anshuman.khandual@arm.com Cc: Ayush Jain ayush.jain3@amd.com Cc: Greg Thelen gthelen@google.com Cc: Joey Gouly joey.gouly@arm.com Cc: KP Singh kpsingh@kernel.org Cc: Mark Brown broonie@kernel.org Cc: Michal Hocko mhocko@suse.com Cc: Peter Xu peterx@redhat.com Cc: Ryan Roberts ryan.roberts@arm.com Cc: Szabolcs Nagy Szabolcs.Nagy@arm.com Cc: Topi Miettinen toiwoton@gmail.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/uapi/linux/prctl.h | 2 +- tools/include/uapi/linux/prctl.h | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h index 3c36aeade991..9a85c69782bd 100644 --- a/include/uapi/linux/prctl.h +++ b/include/uapi/linux/prctl.h @@ -283,7 +283,7 @@ struct prctl_mm_map {
/* Memory deny write / execute */ #define PR_SET_MDWE 65 -# define PR_MDWE_REFUSE_EXEC_GAIN 1 +# define PR_MDWE_REFUSE_EXEC_GAIN (1UL << 0)
#define PR_GET_MDWE 66
diff --git a/tools/include/uapi/linux/prctl.h b/tools/include/uapi/linux/prctl.h index 3c36aeade991..9a85c69782bd 100644 --- a/tools/include/uapi/linux/prctl.h +++ b/tools/include/uapi/linux/prctl.h @@ -283,7 +283,7 @@ struct prctl_mm_map {
/* Memory deny write / execute */ #define PR_SET_MDWE 65 -# define PR_MDWE_REFUSE_EXEC_GAIN 1 +# define PR_MDWE_REFUSE_EXEC_GAIN (1UL << 0)
#define PR_GET_MDWE 66
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Linus Walleij linus.walleij@linaro.org
commit 565fe150624ee77dc63a735cc1b3bff5101f38a3 upstream.
Currently the offset into the device when looking for OTP bits can go outside of the address of the MTD NOR devices, and if that memory isn't readable, bad things happen on the IXP4xx (added prints that illustrate the problem before the crash):
cfi_intelext_otp_walk walk OTP on chip 0 start at reg_prot_offset 0x00000100 ixp4xx_copy_from copy from 0x00000100 to 0xc880dd78 cfi_intelext_otp_walk walk OTP on chip 0 start at reg_prot_offset 0x12000000 ixp4xx_copy_from copy from 0x12000000 to 0xc880dd78 8<--- cut here --- Unable to handle kernel paging request at virtual address db000000 [db000000] *pgd=00000000 (...)
This happens in this case because the IXP4xx is big endian and the 32- and 16-bit fields in the struct cfi_intelext_otpinfo are not properly byteswapped. Compare to how the code in read_pri_intelext() byteswaps the fields in struct cfi_pri_intelext.
Adding a small byte swapping loop for the OTP in read_pri_intelext() and the crash goes away.
The problem went unnoticed for many years until I enabled CONFIG_MTD_OTP on the IXP4xx as well, triggering the bug.
Cc: stable@vger.kernel.org Reviewed-by: Nicolas Pitre nico@fluxnic.net Signed-off-by: Linus Walleij linus.walleij@linaro.org Signed-off-by: Miquel Raynal miquel.raynal@bootlin.com Link: https://lore.kernel.org/linux-mtd/20231020-mtd-otp-byteswap-v4-1-0d132c06aa9... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/mtd/chips/cfi_cmdset_0001.c | 20 ++++++++++++++++++-- 1 file changed, 18 insertions(+), 2 deletions(-)
--- a/drivers/mtd/chips/cfi_cmdset_0001.c +++ b/drivers/mtd/chips/cfi_cmdset_0001.c @@ -422,9 +422,25 @@ read_pri_intelext(struct map_info *map, extra_size = 0;
/* Protection Register info */ - if (extp->NumProtectionFields) + if (extp->NumProtectionFields) { + struct cfi_intelext_otpinfo *otp = + (struct cfi_intelext_otpinfo *)&extp->extra[0]; + extra_size += (extp->NumProtectionFields - 1) * - sizeof(struct cfi_intelext_otpinfo); + sizeof(struct cfi_intelext_otpinfo); + + if (extp_size >= sizeof(*extp) + extra_size) { + int i; + + /* Do some byteswapping if necessary */ + for (i = 0; i < extp->NumProtectionFields - 1; i++) { + otp->ProtRegAddr = le32_to_cpu(otp->ProtRegAddr); + otp->FactGroups = le16_to_cpu(otp->FactGroups); + otp->UserGroups = le16_to_cpu(otp->UserGroups); + otp++; + } + } + } }
if (extp->MinorVersion >= '1') {
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jim Harris jim.harris@samsung.com
commit 0718588c7aaa7a1510b4de972370535b61dddd0d upstream.
Commit 5e42bcbc3fef ("cxl/region: decrement ->nr_targets on error in cxl_region_attach()") tried to avoid 'eiw' initialization errors when ->nr_targets exceeded 16, by just decrementing ->nr_targets when cxl_region_setup_targets() failed.
Commit 86987c766276 ("cxl/region: Cleanup target list on attach error") extended that cleanup to also clear cxled->pos and p->targets[pos]. The initialization error was incidentally fixed separately by: Commit 8d4285425714 ("cxl/region: Fix port setup uninitialized variable warnings") which was merged a few days after 5e42bcbc3fef.
But now the original cleanup when cxl_region_setup_targets() fails prevents endpoint and switch decoder resources from being reused:
1) the cleanup does not set the decoder's region to NULL, which results in future dpa_size_store() calls returning -EBUSY 2) the decoder is not properly freed, which results in future commit errors associated with the upstream switch
Now that the initialization errors were fixed separately, the proper cleanup for this case is to just return immediately. Then the resources associated with this target get cleanup up as normal when the failed region is deleted.
The ->nr_targets decrement in the error case also helped prevent a p->targets[] array overflow, so add a new check to prevent against that overflow.
Tested by trying to create an invalid region for a 2 switch * 2 endpoint topology, and then following up with creating a valid region.
Fixes: 5e42bcbc3fef ("cxl/region: decrement ->nr_targets on error in cxl_region_attach()") Cc: stable@vger.kernel.org Signed-off-by: Jim Harris jim.harris@samsung.com Reviewed-by: Jonathan Cameron Jonathan.Cameron@huawei.com Acked-by: Dan Carpenter dan.carpenter@linaro.org Reviewed-by: Dave Jiang dave.jiang@intel.com Link: https://lore.kernel.org/r/169703589120.1202031.14696100866518083806.stgit@bg... Signed-off-by: Dan Williams dan.j.williams@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/cxl/core/region.c | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-)
--- a/drivers/cxl/core/region.c +++ b/drivers/cxl/core/region.c @@ -1676,6 +1676,12 @@ static int cxl_region_attach(struct cxl_ return -ENXIO; }
+ if (p->nr_targets >= p->interleave_ways) { + dev_dbg(&cxlr->dev, "region already has %d endpoints\n", + p->nr_targets); + return -EINVAL; + } + ep_port = cxled_to_port(cxled); root_port = cxlrd_to_port(cxlrd); dport = cxl_find_dport_by_dev(root_port, ep_port->host_bridge); @@ -1768,7 +1774,7 @@ static int cxl_region_attach(struct cxl_ if (p->nr_targets == p->interleave_ways) { rc = cxl_region_setup_targets(cxlr); if (rc) - goto err_decrement; + return rc; p->state = CXL_CONFIG_ACTIVE; }
@@ -1800,12 +1806,6 @@ static int cxl_region_attach(struct cxl_ }
return 0; - -err_decrement: - p->nr_targets--; - cxled->pos = -1; - p->targets[pos] = NULL; - return rc; }
static int cxl_region_detach(struct cxl_endpoint_decoder *cxled)
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Joshua Yeong joshua.yeong@starfivetech.com
commit 4bd8405257da717cd556f99e5fb68693d12c9766 upstream.
IBIR_DEPTH and CMDR_DEPTH should read from status0 instead of status1.
Cc: stable@vger.kernel.org Fixes: 603f2bee2c54 ("i3c: master: Add driver for Cadence IP") Signed-off-by: Joshua Yeong joshua.yeong@starfivetech.com Reviewed-by: Miquel Raynal miquel.raynal@bootlin.com Link: https://lore.kernel.org/r/20230913031743.11439-2-joshua.yeong@starfivetech.c... Signed-off-by: Alexandre Belloni alexandre.belloni@bootlin.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/i3c/master/i3c-master-cdns.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
--- a/drivers/i3c/master/i3c-master-cdns.c +++ b/drivers/i3c/master/i3c-master-cdns.c @@ -192,7 +192,7 @@ #define SLV_STATUS1_HJ_DIS BIT(18) #define SLV_STATUS1_MR_DIS BIT(17) #define SLV_STATUS1_PROT_ERR BIT(16) -#define SLV_STATUS1_DA(x) (((s) & GENMASK(15, 9)) >> 9) +#define SLV_STATUS1_DA(s) (((s) & GENMASK(15, 9)) >> 9) #define SLV_STATUS1_HAS_DA BIT(8) #define SLV_STATUS1_DDR_RX_FULL BIT(7) #define SLV_STATUS1_DDR_TX_FULL BIT(6) @@ -1624,13 +1624,13 @@ static int cdns_i3c_master_probe(struct /* Device ID0 is reserved to describe this master. */ master->maxdevs = CONF_STATUS0_DEVS_NUM(val); master->free_rr_slots = GENMASK(master->maxdevs, 1); + master->caps.ibirfifodepth = CONF_STATUS0_IBIR_DEPTH(val); + master->caps.cmdrfifodepth = CONF_STATUS0_CMDR_DEPTH(val);
val = readl(master->regs + CONF_STATUS1); master->caps.cmdfifodepth = CONF_STATUS1_CMD_DEPTH(val); master->caps.rxfifodepth = CONF_STATUS1_RX_DEPTH(val); master->caps.txfifodepth = CONF_STATUS1_TX_DEPTH(val); - master->caps.ibirfifodepth = CONF_STATUS0_IBIR_DEPTH(val); - master->caps.cmdrfifodepth = CONF_STATUS0_CMDR_DEPTH(val);
spin_lock_init(&master->ibi.lock); master->ibi.num_slots = CONF_STATUS1_IBI_HW_RES(val);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Frank Li Frank.Li@nxp.com
commit 6bf3fc268183816856c96b8794cd66146bc27b35 upstream.
The ibi work thread operates asynchronously with other transfers, such as svc_i3c_master_priv_xfers(). Introduce mutex protection to ensure the completion of the entire i3c/i2c transaction.
Fixes: dd3c52846d59 ("i3c: master: svc: Add Silvaco I3C master driver") Cc: stable@vger.kernel.org Reviewed-by: Miquel Raynal miquel.raynal@bootlin.com Signed-off-by: Frank Li Frank.Li@nxp.com Link: https://lore.kernel.org/r/20231023161658.3890811-2-Frank.Li@nxp.com Signed-off-by: Alexandre Belloni alexandre.belloni@bootlin.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/i3c/master/svc-i3c-master.c | 14 ++++++++++++++ 1 file changed, 14 insertions(+)
--- a/drivers/i3c/master/svc-i3c-master.c +++ b/drivers/i3c/master/svc-i3c-master.c @@ -175,6 +175,7 @@ struct svc_i3c_regs_save { * @ibi.slots: Available IBI slots * @ibi.tbq_slot: To be queued IBI slot * @ibi.lock: IBI lock + * @lock: Transfer lock, protect between IBI work thread and callbacks from master */ struct svc_i3c_master { struct i3c_master_controller base; @@ -203,6 +204,7 @@ struct svc_i3c_master { /* Prevent races within IBI handlers */ spinlock_t lock; } ibi; + struct mutex lock; };
/** @@ -384,6 +386,7 @@ static void svc_i3c_master_ibi_work(stru u32 status, val; int ret;
+ mutex_lock(&master->lock); /* Acknowledge the incoming interrupt with the AUTOIBI mechanism */ writel(SVC_I3C_MCTRL_REQUEST_AUTO_IBI | SVC_I3C_MCTRL_IBIRESP_AUTO, @@ -460,6 +463,7 @@ static void svc_i3c_master_ibi_work(stru
reenable_ibis: svc_i3c_master_enable_interrupts(master, SVC_I3C_MINT_SLVSTART); + mutex_unlock(&master->lock); }
static irqreturn_t svc_i3c_master_irq_handler(int irq, void *dev_id) @@ -1204,9 +1208,11 @@ static int svc_i3c_master_send_bdcast_cc cmd->read_len = 0; cmd->continued = false;
+ mutex_lock(&master->lock); svc_i3c_master_enqueue_xfer(master, xfer); if (!wait_for_completion_timeout(&xfer->comp, msecs_to_jiffies(1000))) svc_i3c_master_dequeue_xfer(master, xfer); + mutex_unlock(&master->lock);
ret = xfer->ret; kfree(buf); @@ -1250,9 +1256,11 @@ static int svc_i3c_master_send_direct_cc cmd->read_len = read_len; cmd->continued = false;
+ mutex_lock(&master->lock); svc_i3c_master_enqueue_xfer(master, xfer); if (!wait_for_completion_timeout(&xfer->comp, msecs_to_jiffies(1000))) svc_i3c_master_dequeue_xfer(master, xfer); + mutex_unlock(&master->lock);
if (cmd->read_len != xfer_len) ccc->dests[0].payload.len = cmd->read_len; @@ -1309,9 +1317,11 @@ static int svc_i3c_master_priv_xfers(str cmd->continued = (i + 1) < nxfers; }
+ mutex_lock(&master->lock); svc_i3c_master_enqueue_xfer(master, xfer); if (!wait_for_completion_timeout(&xfer->comp, msecs_to_jiffies(1000))) svc_i3c_master_dequeue_xfer(master, xfer); + mutex_unlock(&master->lock);
ret = xfer->ret; svc_i3c_master_free_xfer(xfer); @@ -1347,9 +1357,11 @@ static int svc_i3c_master_i2c_xfers(stru cmd->continued = (i + 1 < nxfers); }
+ mutex_lock(&master->lock); svc_i3c_master_enqueue_xfer(master, xfer); if (!wait_for_completion_timeout(&xfer->comp, msecs_to_jiffies(1000))) svc_i3c_master_dequeue_xfer(master, xfer); + mutex_unlock(&master->lock);
ret = xfer->ret; svc_i3c_master_free_xfer(xfer); @@ -1540,6 +1552,8 @@ static int svc_i3c_master_probe(struct p
INIT_WORK(&master->hj_work, svc_i3c_master_hj_work); INIT_WORK(&master->ibi_work, svc_i3c_master_ibi_work); + mutex_init(&master->lock); + ret = devm_request_irq(dev, master->irq, svc_i3c_master_irq_handler, IRQF_NO_SUSPEND, "svc-i3c-irq", master); if (ret)
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Frank Li Frank.Li@nxp.com
commit 5e5e3c92e748a6d859190e123b9193cf4911fcca upstream.
┌─────┐ ┏──┐ ┏──┐ ┏──┐ ┏──┐ ┏──┐ ┏──┐ ┏──┐ ┏──┐ ┌───── SCL: ┘ └─────┛ └──┛ └──┛ └──┛ └──┛ └──┛ └──┛ └──┛ └──┘ ───┐ ┌─────┐ ┌─────┐ ┌───────────┐ SDA: └───────────────────────┘ └─────┘ └─────┘ └───── xxx╱ ╲╱ ╲╱ ╲╱ ╲╱ ╲ : xxx╲IBI ╱╲ Addr(0x0a) ╱╲ RW ╱╲NACK╱╲ S ╱
If an In-Band Interrupt (IBI) occurs and IBI work thread is not immediately scheduled, when svc_i3c_master_priv_xfers() initiates the I3C transfer and attempts to send address 0x7e, the target interprets it as an IBI handler and returns the target address 0x0a.
However, svc_i3c_master_priv_xfers() does not handle this case and proceeds with other transfers, resulting in incorrect data being returned.
Add IBIWON check in svc_i3c_master_xfer(). In case this situation occurs, return a failure to the driver.
Fixes: dd3c52846d59 ("i3c: master: svc: Add Silvaco I3C master driver") Cc: stable@vger.kernel.org Reviewed-by: Miquel Raynal miquel.raynal@bootlin.com Signed-off-by: Frank Li Frank.Li@nxp.com Link: https://lore.kernel.org/r/20231023161658.3890811-3-Frank.Li@nxp.com Signed-off-by: Alexandre Belloni alexandre.belloni@bootlin.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/i3c/master/svc-i3c-master.c | 20 ++++++++++++++++++++ 1 file changed, 20 insertions(+)
--- a/drivers/i3c/master/svc-i3c-master.c +++ b/drivers/i3c/master/svc-i3c-master.c @@ -1011,6 +1011,9 @@ static int svc_i3c_master_xfer(struct sv u32 reg; int ret;
+ /* clean SVC_I3C_MINT_IBIWON w1c bits */ + writel(SVC_I3C_MINT_IBIWON, master->regs + SVC_I3C_MSTATUS); + writel(SVC_I3C_MCTRL_REQUEST_START_ADDR | xfer_type | SVC_I3C_MCTRL_IBIRESP_NACK | @@ -1028,6 +1031,23 @@ static int svc_i3c_master_xfer(struct sv ret = -ENXIO; goto emit_stop; } + + /* + * According to I3C spec ver 1.1.1, 5.1.2.2.3 Consequence of Controller Starting a Frame + * with I3C Target Address. + * + * The I3C Controller normally should start a Frame, the Address may be arbitrated, and so + * the Controller shall monitor to see whether an In-Band Interrupt request, a Controller + * Role Request (i.e., Secondary Controller requests to become the Active Controller), or + * a Hot-Join Request has been made. + * + * If missed IBIWON check, the wrong data will be return. When IBIWON happen, return failure + * and yield the above events handler. + */ + if (SVC_I3C_MSTATUS_IBIWON(reg)) { + ret = -ENXIO; + goto emit_stop; + }
if (rnw) ret = svc_i3c_master_read(master, in, xfer_len);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Frank Li Frank.Li@nxp.com
commit c85e209b799f12d18a90ae6353b997b1bb1274a5 upstream.
MSTATUS[RXPEND] is only updated after the data transfer cycle started. This creates an issue when the I3C clock is slow, and the CPU is running fast enough that MSTATUS[RXPEND] may not be updated when the code reaches checking point. As a result, mandatory data can be missed.
Add a wait for MSTATUS[COMPLETE] to ensure that all mandatory data is already in FIFO. It also works without mandatory data.
Fixes: dd3c52846d59 ("i3c: master: svc: Add Silvaco I3C master driver") Cc: stable@vger.kernel.org Reviewed-by: Miquel Raynal miquel.raynal@bootlin.com Signed-off-by: Frank Li Frank.Li@nxp.com Link: https://lore.kernel.org/r/20231023161658.3890811-4-Frank.Li@nxp.com Signed-off-by: Alexandre Belloni alexandre.belloni@bootlin.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/i3c/master/svc-i3c-master.c | 8 ++++++++ 1 file changed, 8 insertions(+)
--- a/drivers/i3c/master/svc-i3c-master.c +++ b/drivers/i3c/master/svc-i3c-master.c @@ -333,6 +333,7 @@ static int svc_i3c_master_handle_ibi(str struct i3c_ibi_slot *slot; unsigned int count; u32 mdatactrl; + int ret, val; u8 *buf;
slot = i3c_generic_ibi_get_free_slot(data->ibi_pool); @@ -342,6 +343,13 @@ static int svc_i3c_master_handle_ibi(str slot->len = 0; buf = slot->data;
+ ret = readl_relaxed_poll_timeout(master->regs + SVC_I3C_MSTATUS, val, + SVC_I3C_MSTATUS_COMPLETE(val), 0, 1000); + if (ret) { + dev_err(master->dev, "Timeout when polling for COMPLETE\n"); + return ret; + } + while (SVC_I3C_MSTATUS_RXPEND(readl(master->regs + SVC_I3C_MSTATUS)) && slot->len < SVC_I3C_FIFO_SIZE) { mdatactrl = readl(master->regs + SVC_I3C_MDATACTRL);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Frank Li Frank.Li@nxp.com
commit 225d5ef048c4ed01a475c95d94833bd7dd61072d upstream.
svc_i3c_master_irq_handler() wrongly checks register SVC_I3C_MINTMASKED. It should be SVC_I3C_MSTATUS.
Fixes: dd3c52846d59 ("i3c: master: svc: Add Silvaco I3C master driver") Cc: stable@vger.kernel.org Reviewed-by: Miquel Raynal miquel.raynal@bootlin.com Signed-off-by: Frank Li Frank.Li@nxp.com Link: https://lore.kernel.org/r/20231023161658.3890811-5-Frank.Li@nxp.com Signed-off-by: Alexandre Belloni alexandre.belloni@bootlin.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/i3c/master/svc-i3c-master.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/i3c/master/svc-i3c-master.c +++ b/drivers/i3c/master/svc-i3c-master.c @@ -477,7 +477,7 @@ reenable_ibis: static irqreturn_t svc_i3c_master_irq_handler(int irq, void *dev_id) { struct svc_i3c_master *master = (struct svc_i3c_master *)dev_id; - u32 active = readl(master->regs + SVC_I3C_MINTMASKED); + u32 active = readl(master->regs + SVC_I3C_MSTATUS);
if (!SVC_I3C_MSTATUS_SLVSTART(active)) return IRQ_NONE;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Frank Li Frank.Li@nxp.com
commit dfd7cd6aafdb1f5ba93828e97e56b38304b23a05 upstream.
Upon IBIWON timeout, the SDA line will always be kept low if we don't emit a stop. Calling svc_i3c_master_emit_stop() there will let the bus return to idle state.
Fixes: dd3c52846d59 ("i3c: master: svc: Add Silvaco I3C master driver") Cc: stable@vger.kernel.org Reviewed-by: Miquel Raynal miquel.raynal@bootlin.com Signed-off-by: Frank Li Frank.Li@nxp.com Link: https://lore.kernel.org/r/20231023161658.3890811-6-Frank.Li@nxp.com Signed-off-by: Alexandre Belloni alexandre.belloni@bootlin.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/i3c/master/svc-i3c-master.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/i3c/master/svc-i3c-master.c +++ b/drivers/i3c/master/svc-i3c-master.c @@ -405,6 +405,7 @@ static void svc_i3c_master_ibi_work(stru SVC_I3C_MSTATUS_IBIWON(val), 0, 1000); if (ret) { dev_err(master->dev, "Timeout when polling for IBIWON\n"); + svc_i3c_master_emit_stop(master); goto reenable_ibis; }
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Frank Li Frank.Li@nxp.com
commit 9aaeef113c55248ecf3ab941c2e4460aaa8b8b9a upstream.
master side report: silvaco-i3c-master 44330000.i3c-master: Error condition: MSTATUS 0x020090c7, MERRWARN 0x00100000
BIT 20: TIMEOUT error The module has stalled too long in a frame. This happens when: - The TX FIFO or RX FIFO is not handled and the bus is stuck in the middle of a message, - No STOP was issued and between messages, - IBI manual is used and no decision was made. The maximum stall period is 100 μs.
This can be considered as being just a warning as the system IRQ latency can easily be greater than 100us.
Fixes: dd3c52846d59 ("i3c: master: svc: Add Silvaco I3C master driver") Cc: stable@vger.kernel.org Signed-off-by: Frank Li Frank.Li@nxp.com Reviewed-by: Miquel Raynal miquel.raynal@bootlin.com Link: https://lore.kernel.org/r/20231023161658.3890811-7-Frank.Li@nxp.com Signed-off-by: Alexandre Belloni alexandre.belloni@bootlin.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/i3c/master/svc-i3c-master.c | 9 +++++++++ 1 file changed, 9 insertions(+)
--- a/drivers/i3c/master/svc-i3c-master.c +++ b/drivers/i3c/master/svc-i3c-master.c @@ -93,6 +93,7 @@ #define SVC_I3C_MINTMASKED 0x098 #define SVC_I3C_MERRWARN 0x09C #define SVC_I3C_MERRWARN_NACK BIT(2) +#define SVC_I3C_MERRWARN_TIMEOUT BIT(20) #define SVC_I3C_MDMACTRL 0x0A0 #define SVC_I3C_MDATACTRL 0x0AC #define SVC_I3C_MDATACTRL_FLUSHTB BIT(0) @@ -227,6 +228,14 @@ static bool svc_i3c_master_error(struct if (SVC_I3C_MSTATUS_ERRWARN(mstatus)) { merrwarn = readl(master->regs + SVC_I3C_MERRWARN); writel(merrwarn, master->regs + SVC_I3C_MERRWARN); + + /* Ignore timeout error */ + if (merrwarn & SVC_I3C_MERRWARN_TIMEOUT) { + dev_dbg(master->dev, "Warning condition: MSTATUS 0x%08x, MERRWARN 0x%08x\n", + mstatus, merrwarn); + return false; + } + dev_err(master->dev, "Error condition: MSTATUS 0x%08x, MERRWARN 0x%08x\n", mstatus, merrwarn);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jim Harris jim.harris@samsung.com
commit 98a04c7aced2b43b3ac4befe216c4eecc7257d4b upstream.
Root decoder granularity must match value from CFWMS, which may not be the region's granularity for non-interleaved root decoders.
So when calculating granularities for host bridge decoders, use the region's granularity instead of the root decoder's granularity to ensure the correct granularities are set for the host bridge decoders and any downstream switch decoders.
Test configuration is 1 host bridge * 2 switches * 2 endpoints per switch.
Region created with 2048 granularity using following command line:
cxl create-region -m -d decoder0.0 -w 4 mem0 mem2 mem1 mem3 \ -g 2048 -s 2048M
Use "cxl list -PDE | grep granularity" to get a view of the granularity set at each level of the topology.
Before this patch: "interleave_granularity":2048, "interleave_granularity":2048, "interleave_granularity":512, "interleave_granularity":2048, "interleave_granularity":2048, "interleave_granularity":512, "interleave_granularity":256,
After: "interleave_granularity":2048, "interleave_granularity":2048, "interleave_granularity":4096, "interleave_granularity":2048, "interleave_granularity":2048, "interleave_granularity":4096, "interleave_granularity":2048,
Fixes: 27b3f8d13830 ("cxl/region: Program target lists") Cc: stable@vger.kernel.org Signed-off-by: Jim Harris jim.harris@samsung.com Link: https://lore.kernel.org/r/169824893473.1403938.16110924262989774582.stgit@bg... [djbw: fixup the prebuilt cxl_test region] Signed-off-by: Dan Williams dan.j.williams@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/cxl/core/region.c | 9 ++++++++- tools/testing/cxl/test/cxl.c | 2 +- 2 files changed, 9 insertions(+), 2 deletions(-)
--- a/drivers/cxl/core/region.c +++ b/drivers/cxl/core/region.c @@ -1127,7 +1127,14 @@ static int cxl_port_setup_targets(struct }
if (is_cxl_root(parent_port)) { - parent_ig = cxlrd->cxlsd.cxld.interleave_granularity; + /* + * Root decoder IG is always set to value in CFMWS which + * may be different than this region's IG. We can use the + * region's IG here since interleave_granularity_store() + * does not allow interleaved host-bridges with + * root IG != region IG. + */ + parent_ig = p->interleave_granularity; parent_iw = cxlrd->cxlsd.cxld.interleave_ways; /* * For purposes of address bit routing, use power-of-2 math for --- a/tools/testing/cxl/test/cxl.c +++ b/tools/testing/cxl/test/cxl.c @@ -831,7 +831,7 @@ static void mock_init_hdm_decoder(struct cxld->interleave_ways = 2; else cxld->interleave_ways = 1; - cxld->interleave_granularity = 256; + cxld->interleave_granularity = 4096; cxld->hpa_range = (struct range) { .start = base, .end = base + size - 1,
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dan Williams dan.j.williams@intel.com
commit 8d2ad999ca3c64cb08cf6a58d227b9d9e746d708 upstream.
The CXL subsystem, at cxl_mem ->probe() time, establishes a lineage of ports (struct cxl_port objects) between an endpoint and the root of a CXL topology. Each port including the endpoint port is attached to the cxl_port driver.
Given that setup, it follows that when either any port in that lineage goes through a cxl_port ->remove() event, or the memdev goes through a cxl_mem ->remove() event. The hierarchy below the removed port, or the entire hierarchy if the memdev is removed needs to come down.
The delete_endpoint() callback is careful to check whether it is being called to tear down the hierarchy, or if it is only being called to teardown the memdev because an ancestor port is going through ->remove().
That care needs to take the device_lock() of the endpoint's parent. Which requires 2 bugs to be fixed:
1/ A reference on the parent is needed to prevent use-after-free scenarios like this signature:
BUG: spinlock bad magic on CPU#0, kworker/u56:0/11 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS edk2-20230524-3.fc38 05/24/2023 Workqueue: cxl_port detach_memdev [cxl_core] RIP: 0010:spin_bug+0x65/0xa0 Call Trace: do_raw_spin_lock+0x69/0xa0 __mutex_lock+0x695/0xb80 delete_endpoint+0xad/0x150 [cxl_core] devres_release_all+0xb8/0x110 device_unbind_cleanup+0xe/0x70 device_release_driver_internal+0x1d2/0x210 detach_memdev+0x15/0x20 [cxl_core] process_one_work+0x1e3/0x4c0 worker_thread+0x1dd/0x3d0
2/ In the case of RCH topologies, the parent device that needs to be locked is not always @port->dev as returned by cxl_mem_find_port(), use endpoint->dev.parent instead.
Fixes: 8dd2bc0f8e02 ("cxl/mem: Add the cxl_mem driver") Cc: stable@vger.kernel.org Reported-by: Robert Richter rrichter@amd.com Closes: http://lore.kernel.org/r/20231018171713.1883517-2-rrichter@amd.com Signed-off-by: Dan Williams dan.j.williams@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/cxl/core/port.c | 34 +++++++++++++++++++--------------- 1 file changed, 19 insertions(+), 15 deletions(-)
--- a/drivers/cxl/core/port.c +++ b/drivers/cxl/core/port.c @@ -1242,35 +1242,39 @@ static struct device *grandparent(struct return NULL; }
+static struct device *endpoint_host(struct cxl_port *endpoint) +{ + struct cxl_port *port = to_cxl_port(endpoint->dev.parent); + + if (is_cxl_root(port)) + return port->uport_dev; + return &port->dev; +} + static void delete_endpoint(void *data) { struct cxl_memdev *cxlmd = data; struct cxl_port *endpoint = cxlmd->endpoint; - struct cxl_port *parent_port; - struct device *parent; + struct device *host = endpoint_host(endpoint);
- parent_port = cxl_mem_find_port(cxlmd, NULL); - if (!parent_port) - goto out; - parent = &parent_port->dev; - - device_lock(parent); - if (parent->driver && !endpoint->dead) { - devm_release_action(parent, cxl_unlink_parent_dport, endpoint); - devm_release_action(parent, cxl_unlink_uport, endpoint); - devm_release_action(parent, unregister_port, endpoint); + device_lock(host); + if (host->driver && !endpoint->dead) { + devm_release_action(host, cxl_unlink_parent_dport, endpoint); + devm_release_action(host, cxl_unlink_uport, endpoint); + devm_release_action(host, unregister_port, endpoint); } cxlmd->endpoint = NULL; - device_unlock(parent); - put_device(parent); -out: + device_unlock(host); put_device(&endpoint->dev); + put_device(host); }
int cxl_endpoint_autoremove(struct cxl_memdev *cxlmd, struct cxl_port *endpoint) { + struct device *host = endpoint_host(endpoint); struct device *dev = &cxlmd->dev;
+ get_device(host); get_device(&endpoint->dev); cxlmd->endpoint = endpoint; cxlmd->depth = endpoint->depth;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Maíra Canal mcanal@igalia.com
[ Upstream commit 2e75396f1df61e1f1d26d0d703fc7292c4ae4371 ]
The commit c494a447c14e ("soc: bcm: bcm2835-power: Refactor ASB control") refactored the ASB control by using a general function to handle both the enable and disable. But this patch introduced a subtle regression: we need to check if !!(readl(base + reg) & ASB_ACK) == enable, not just check if (readl(base + reg) & ASB_ACK) == true.
Currently, this is causing an invalid register state in V3D when unloading and loading the driver, because `bcm2835_asb_disable()` will return -ETIMEDOUT and `bcm2835_asb_power_off()` will fail to disable the ASB slave for V3D.
Fixes: c494a447c14e ("soc: bcm: bcm2835-power: Refactor ASB control") Signed-off-by: Maíra Canal mcanal@igalia.com Reviewed-by: Florian Fainelli florian.fainelli@broadcom.com Reviewed-by: Stefan Wahren stefan.wahren@i2se.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20231024101251.6357-2-mcanal@igalia.com Signed-off-by: Ulf Hansson ulf.hansson@linaro.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/soc/bcm/bcm2835-power.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/soc/bcm/bcm2835-power.c b/drivers/soc/bcm/bcm2835-power.c index 1a179d4e011cf..d2f0233cb6206 100644 --- a/drivers/soc/bcm/bcm2835-power.c +++ b/drivers/soc/bcm/bcm2835-power.c @@ -175,7 +175,7 @@ static int bcm2835_asb_control(struct bcm2835_power *power, u32 reg, bool enable } writel(PM_PASSWORD | val, base + reg);
- while (readl(base + reg) & ASB_ACK) { + while (!!(readl(base + reg) & ASB_ACK) == enable) { cpu_relax(); if (ktime_get_ns() - start >= 1000) return -ETIMEDOUT;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tomeu Vizoso tomeu@tomeuvizoso.net
[ Upstream commit b131329b9bfbd1b4c0c5e088cb0c6ec03a12930f ]
Without this change, the NPU hangs when the 8th NN core is used.
It matches what the out-of-tree driver does.
Signed-off-by: Tomeu Vizoso tomeu@tomeuvizoso.net Fixes: 9a217b7e8953 ("soc: amlogic: meson-pwrc: Add NNA power domain for A311D") Acked-by: Neil Armstrong neil.armstrong@linaro.org Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20231016080205.41982-2-tomeu@tomeuvizoso.net Signed-off-by: Ulf Hansson ulf.hansson@linaro.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/soc/amlogic/meson-ee-pwrc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/soc/amlogic/meson-ee-pwrc.c b/drivers/soc/amlogic/meson-ee-pwrc.c index f54acffc83f9f..f2b24361c8cac 100644 --- a/drivers/soc/amlogic/meson-ee-pwrc.c +++ b/drivers/soc/amlogic/meson-ee-pwrc.c @@ -229,7 +229,7 @@ static struct meson_ee_pwrc_mem_domain sm1_pwrc_mem_audio[] = {
static struct meson_ee_pwrc_mem_domain g12a_pwrc_mem_nna[] = { { G12A_HHI_NANOQ_MEM_PD_REG0, GENMASK(31, 0) }, - { G12A_HHI_NANOQ_MEM_PD_REG1, GENMASK(23, 0) }, + { G12A_HHI_NANOQ_MEM_PD_REG1, GENMASK(31, 0) }, };
#define VPU_PD(__name, __top_pd, __mem, __is_pwr_off, __resets, __clks) \
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pengfei Li pengfei.li_1@nxp.com
[ Upstream commit 374de39d38f97b0e58cfee88da590b2d056ccf7f ]
Currently, The imx pgc power domain doesn't set the fwnode pointer, which results in supply regulator device can't get consumer imx pgc power domain device from fwnode when creating a link.
This causes the driver core to instead try to create a link between the parent gpc device of imx pgc power domain device and supply regulator device. However, at this point, the gpc device has already been bound, and the link creation will fail. So adding the fwnode pointer to the imx pgc power domain device will fix this issue.
Signed-off-by: Pengfei Li pengfei.li_1@nxp.com Tested-by: Emil Kronborg emil.kronborg@protonmail.com Fixes: 3fb16866b51d ("driver core: fw_devlink: Make cycle detection more robust") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20231020185949.537083-1-pengfei.li_1@nxp.com Signed-off-by: Ulf Hansson ulf.hansson@linaro.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/soc/imx/gpc.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/soc/imx/gpc.c b/drivers/soc/imx/gpc.c index 90a8b2c0676ff..419ed15cc10c4 100644 --- a/drivers/soc/imx/gpc.c +++ b/drivers/soc/imx/gpc.c @@ -498,6 +498,7 @@ static int imx_gpc_probe(struct platform_device *pdev)
pd_pdev->dev.parent = &pdev->dev; pd_pdev->dev.of_node = np; + pd_pdev->dev.fwnode = of_fwnode_handle(np);
ret = platform_device_add(pd_pdev); if (ret) {
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Manivannan Sadhasivam manivannan.sadhasivam@linaro.org
[ Upstream commit a07d2497ed657eb2efeb967af47e22f573dcd1d6 ]
The DWC core driver exposes the write_dbi2() callback for writing to the DBI2 registers in a vendor-specific way.
On the Qcom EP platforms, the DBI_CS2 bit in the ELBI region needs to be asserted before writing to any DBI2 registers and deasserted once done.
So, let's implement the callback for the Qcom PCIe EP driver so that the DBI2 writes are correctly handled in the hardware.
Without this callback, the DBI2 register writes like BAR size won't go through and as a result, the default BAR size is set for all BARs.
[kwilczynski: commit log, renamed function to match the DWC convention] Fixes: f55fee56a631 ("PCI: qcom-ep: Add Qualcomm PCIe Endpoint controller driver") Suggested-by: Serge Semin fancer.lancer@gmail.com Link: https://lore.kernel.org/linux-pci/20231025130029.74693-2-manivannan.sadhasiv... Signed-off-by: Manivannan Sadhasivam manivannan.sadhasivam@linaro.org Signed-off-by: Krzysztof Wilczyński kwilczynski@kernel.org Reviewed-by: Serge Semin fancer.lancer@gmail.com Cc: stable@vger.kernel.org # 5.16+ Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pci/controller/dwc/pcie-qcom-ep.c | 17 +++++++++++++++++ 1 file changed, 17 insertions(+)
diff --git a/drivers/pci/controller/dwc/pcie-qcom-ep.c b/drivers/pci/controller/dwc/pcie-qcom-ep.c index 267e1247d548f..4a9741428619f 100644 --- a/drivers/pci/controller/dwc/pcie-qcom-ep.c +++ b/drivers/pci/controller/dwc/pcie-qcom-ep.c @@ -121,6 +121,7 @@
/* ELBI registers */ #define ELBI_SYS_STTS 0x08 +#define ELBI_CS2_ENABLE 0xa4
/* DBI registers */ #define DBI_CON_STATUS 0x44 @@ -253,6 +254,21 @@ static void qcom_pcie_dw_stop_link(struct dw_pcie *pci) disable_irq(pcie_ep->perst_irq); }
+static void qcom_pcie_dw_write_dbi2(struct dw_pcie *pci, void __iomem *base, + u32 reg, size_t size, u32 val) +{ + struct qcom_pcie_ep *pcie_ep = to_pcie_ep(pci); + int ret; + + writel(1, pcie_ep->elbi + ELBI_CS2_ENABLE); + + ret = dw_pcie_write(pci->dbi_base2 + reg, size, val); + if (ret) + dev_err(pci->dev, "Failed to write DBI2 register (0x%x): %d\n", reg, ret); + + writel(0, pcie_ep->elbi + ELBI_CS2_ENABLE); +} + static int qcom_pcie_enable_resources(struct qcom_pcie_ep *pcie_ep) { int ret; @@ -451,6 +467,7 @@ static const struct dw_pcie_ops pci_ops = { .link_up = qcom_pcie_dw_link_up, .start_link = qcom_pcie_dw_start_link, .stop_link = qcom_pcie_dw_stop_link, + .write_dbi2 = qcom_pcie_dw_write_dbi2, };
static int qcom_pcie_ep_get_io_resources(struct platform_device *pdev,
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Lizhi Hou lizhi.hou@amd.com
[ Upstream commit b544fc2b8606d718d0cc788ff2ea2492871df488 ]
of_changeset_create_node() creates device node dynamically and attaches the newly created node to a changeset.
Expand of_changeset APIs to handle specific types of properties. of_changeset_add_prop_string() of_changeset_add_prop_string_array() of_changeset_add_prop_u32_array()
Signed-off-by: Clément Léger clement.leger@bootlin.com Signed-off-by: Lizhi Hou lizhi.hou@amd.com Link: https://lore.kernel.org/r/1692120000-46900-2-git-send-email-lizhi.hou@amd.co... Signed-off-by: Rob Herring robh@kernel.org Stable-dep-of: c9260693aa0c ("PCI: Lengthen reset delay for VideoPropulsion Torrent QN16e card") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/of/dynamic.c | 164 ++++++++++++++++++++++++++++++++++++++++++ drivers/of/unittest.c | 19 ++++- include/linux/of.h | 23 ++++++ 3 files changed, 205 insertions(+), 1 deletion(-)
diff --git a/drivers/of/dynamic.c b/drivers/of/dynamic.c index f7bb73cf821e6..cdad63ecb9023 100644 --- a/drivers/of/dynamic.c +++ b/drivers/of/dynamic.c @@ -486,6 +486,38 @@ struct device_node *__of_node_dup(const struct device_node *np, return NULL; }
+/** + * of_changeset_create_node - Dynamically create a device node and attach to + * a given changeset. + * + * @ocs: Pointer to changeset + * @parent: Pointer to parent device node + * @full_name: Node full name + * + * Return: Pointer to the created device node or NULL in case of an error. + */ +struct device_node *of_changeset_create_node(struct of_changeset *ocs, + struct device_node *parent, + const char *full_name) +{ + struct device_node *np; + int ret; + + np = __of_node_dup(NULL, full_name); + if (!np) + return NULL; + np->parent = parent; + + ret = of_changeset_attach_node(ocs, np); + if (ret) { + of_node_put(np); + return NULL; + } + + return np; +} +EXPORT_SYMBOL(of_changeset_create_node); + static void __of_changeset_entry_destroy(struct of_changeset_entry *ce) { if (ce->action == OF_RECONFIG_ATTACH_NODE && @@ -947,3 +979,135 @@ int of_changeset_action(struct of_changeset *ocs, unsigned long action, return 0; } EXPORT_SYMBOL_GPL(of_changeset_action); + +static int of_changeset_add_prop_helper(struct of_changeset *ocs, + struct device_node *np, + const struct property *pp) +{ + struct property *new_pp; + int ret; + + new_pp = __of_prop_dup(pp, GFP_KERNEL); + if (!new_pp) + return -ENOMEM; + + ret = of_changeset_add_property(ocs, np, new_pp); + if (ret) { + kfree(new_pp->name); + kfree(new_pp->value); + kfree(new_pp); + } + + return ret; +} + +/** + * of_changeset_add_prop_string - Add a string property to a changeset + * + * @ocs: changeset pointer + * @np: device node pointer + * @prop_name: name of the property to be added + * @str: pointer to null terminated string + * + * Create a string property and add it to a changeset. + * + * Return: 0 on success, a negative error value in case of an error. + */ +int of_changeset_add_prop_string(struct of_changeset *ocs, + struct device_node *np, + const char *prop_name, const char *str) +{ + struct property prop; + + prop.name = (char *)prop_name; + prop.length = strlen(str) + 1; + prop.value = (void *)str; + + return of_changeset_add_prop_helper(ocs, np, &prop); +} +EXPORT_SYMBOL_GPL(of_changeset_add_prop_string); + +/** + * of_changeset_add_prop_string_array - Add a string list property to + * a changeset + * + * @ocs: changeset pointer + * @np: device node pointer + * @prop_name: name of the property to be added + * @str_array: pointer to an array of null terminated strings + * @sz: number of string array elements + * + * Create a string list property and add it to a changeset. + * + * Return: 0 on success, a negative error value in case of an error. + */ +int of_changeset_add_prop_string_array(struct of_changeset *ocs, + struct device_node *np, + const char *prop_name, + const char **str_array, size_t sz) +{ + struct property prop; + int i, ret; + char *vp; + + prop.name = (char *)prop_name; + + prop.length = 0; + for (i = 0; i < sz; i++) + prop.length += strlen(str_array[i]) + 1; + + prop.value = kmalloc(prop.length, GFP_KERNEL); + if (!prop.value) + return -ENOMEM; + + vp = prop.value; + for (i = 0; i < sz; i++) { + vp += snprintf(vp, (char *)prop.value + prop.length - vp, "%s", + str_array[i]) + 1; + } + ret = of_changeset_add_prop_helper(ocs, np, &prop); + kfree(prop.value); + + return ret; +} +EXPORT_SYMBOL_GPL(of_changeset_add_prop_string_array); + +/** + * of_changeset_add_prop_u32_array - Add a property of 32 bit integers + * property to a changeset + * + * @ocs: changeset pointer + * @np: device node pointer + * @prop_name: name of the property to be added + * @array: pointer to an array of 32 bit integers + * @sz: number of array elements + * + * Create a property of 32 bit integers and add it to a changeset. + * + * Return: 0 on success, a negative error value in case of an error. + */ +int of_changeset_add_prop_u32_array(struct of_changeset *ocs, + struct device_node *np, + const char *prop_name, + const u32 *array, size_t sz) +{ + struct property prop; + __be32 *val; + int i, ret; + + val = kcalloc(sz, sizeof(__be32), GFP_KERNEL); + if (!val) + return -ENOMEM; + + for (i = 0; i < sz; i++) + val[i] = cpu_to_be32(array[i]); + prop.name = (char *)prop_name; + prop.length = sizeof(u32) * sz; + prop.value = (void *)val; + + ret = of_changeset_add_prop_helper(ocs, np, &prop); + kfree(val); + + return ret; +} +EXPORT_SYMBOL_GPL(of_changeset_add_prop_u32_array); diff --git a/drivers/of/unittest.c b/drivers/of/unittest.c index f6784cce8369b..68e58b085c3db 100644 --- a/drivers/of/unittest.c +++ b/drivers/of/unittest.c @@ -802,7 +802,9 @@ static void __init of_unittest_changeset(void) struct property *ppname_n21, pname_n21 = { .name = "name", .length = 3, .value = "n21" }; struct property *ppupdate, pupdate = { .name = "prop-update", .length = 5, .value = "abcd" }; struct property *ppremove; - struct device_node *n1, *n2, *n21, *nchangeset, *nremove, *parent, *np; + struct device_node *n1, *n2, *n21, *n22, *nchangeset, *nremove, *parent, *np; + static const char * const str_array[] = { "str1", "str2", "str3" }; + const u32 u32_array[] = { 1, 2, 3 }; struct of_changeset chgset;
n1 = __of_node_dup(NULL, "n1"); @@ -857,6 +859,17 @@ static void __init of_unittest_changeset(void) unittest(!of_changeset_add_property(&chgset, parent, ppadd), "fail add prop prop-add\n"); unittest(!of_changeset_update_property(&chgset, parent, ppupdate), "fail update prop\n"); unittest(!of_changeset_remove_property(&chgset, parent, ppremove), "fail remove prop\n"); + n22 = of_changeset_create_node(&chgset, n2, "n22"); + unittest(n22, "fail create n22\n"); + unittest(!of_changeset_add_prop_string(&chgset, n22, "prop-str", "abcd"), + "fail add prop prop-str"); + unittest(!of_changeset_add_prop_string_array(&chgset, n22, "prop-str-array", + (const char **)str_array, + ARRAY_SIZE(str_array)), + "fail add prop prop-str-array"); + unittest(!of_changeset_add_prop_u32_array(&chgset, n22, "prop-u32-array", + u32_array, ARRAY_SIZE(u32_array)), + "fail add prop prop-u32-array");
unittest(!of_changeset_apply(&chgset), "apply failed\n");
@@ -866,6 +879,9 @@ static void __init of_unittest_changeset(void) unittest((np = of_find_node_by_path("/testcase-data/changeset/n2/n21")), "'%pOF' not added\n", n21); of_node_put(np); + unittest((np = of_find_node_by_path("/testcase-data/changeset/n2/n22")), + "'%pOF' not added\n", n22); + of_node_put(np);
unittest(!of_changeset_revert(&chgset), "revert failed\n");
@@ -874,6 +890,7 @@ static void __init of_unittest_changeset(void) of_node_put(n1); of_node_put(n2); of_node_put(n21); + of_node_put(n22); #endif }
diff --git a/include/linux/of.h b/include/linux/of.h index 6ecde0515677d..9b82a6b0f3f55 100644 --- a/include/linux/of.h +++ b/include/linux/of.h @@ -1580,6 +1580,29 @@ static inline int of_changeset_update_property(struct of_changeset *ocs, { return of_changeset_action(ocs, OF_RECONFIG_UPDATE_PROPERTY, np, prop); } + +struct device_node *of_changeset_create_node(struct of_changeset *ocs, + struct device_node *parent, + const char *full_name); +int of_changeset_add_prop_string(struct of_changeset *ocs, + struct device_node *np, + const char *prop_name, const char *str); +int of_changeset_add_prop_string_array(struct of_changeset *ocs, + struct device_node *np, + const char *prop_name, + const char **str_array, size_t sz); +int of_changeset_add_prop_u32_array(struct of_changeset *ocs, + struct device_node *np, + const char *prop_name, + const u32 *array, size_t sz); +static inline int of_changeset_add_prop_u32(struct of_changeset *ocs, + struct device_node *np, + const char *prop_name, + const u32 val) +{ + return of_changeset_add_prop_u32_array(ocs, np, prop_name, &val, 1); +} + #else /* CONFIG_OF_DYNAMIC */ static inline int of_reconfig_notifier_register(struct notifier_block *nb) {
Hello,
On Fri, 24 Nov 2023 17:50:05 +0000 Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
6.5-stable review patch. If anyone has any objections, please let me know.
Isn't it quite strange to take this patch, which is clearly a feature patch, into a stable branch?
Signed-off-by: Clément Léger clement.leger@bootlin.com Signed-off-by: Lizhi Hou lizhi.hou@amd.com Link: https://lore.kernel.org/r/1692120000-46900-2-git-send-email-lizhi.hou@amd.co... Signed-off-by: Rob Herring robh@kernel.org Stable-dep-of: c9260693aa0c ("PCI: Lengthen reset delay for VideoPropulsion Torrent QN16e card")
This looks very odd. I don't see how c9260693aa0c (which is the trivial addition of a PCIe fixup) can have a dependency on this complex patch that allows to create device nodes in the Device Tree dynamically.
Am I missing something obvious here?
Thomas
On Fri, Nov 24, 2023 at 10:53:41PM +0100, Thomas Petazzoni wrote:
Hello,
On Fri, 24 Nov 2023 17:50:05 +0000 Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
6.5-stable review patch. If anyone has any objections, please let me know.
Isn't it quite strange to take this patch, which is clearly a feature patch, into a stable branch?
Signed-off-by: Clément Léger clement.leger@bootlin.com Signed-off-by: Lizhi Hou lizhi.hou@amd.com Link: https://lore.kernel.org/r/1692120000-46900-2-git-send-email-lizhi.hou@amd.co... Signed-off-by: Rob Herring robh@kernel.org Stable-dep-of: c9260693aa0c ("PCI: Lengthen reset delay for VideoPropulsion Torrent QN16e card")
This looks very odd. I don't see how c9260693aa0c (which is the trivial addition of a PCIe fixup) can have a dependency on this complex patch that allows to create device nodes in the Device Tree dynamically.
Am I missing something obvious here?
No, I thought I caught this already, this is the "stable-dep-of" bot going a bit crazy. I'll go fix this up for the next -rc release, by dropping this, thanks!
greg k-h
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Lizhi Hou lizhi.hou@amd.com
[ Upstream commit 407d1a51921e9f28c1bcec647c2205925bd1fdab ]
The PCI endpoint device such as Xilinx Alveo PCI card maps the register spaces from multiple hardware peripherals to its PCI BAR. Normally, the PCI core discovers devices and BARs using the PCI enumeration process. There is no infrastructure to discover the hardware peripherals that are present in a PCI device, and which can be accessed through the PCI BARs.
Apparently, the device tree framework requires a device tree node for the PCI device. Thus, it can generate the device tree nodes for hardware peripherals underneath. Because PCI is self discoverable bus, there might not be a device tree node created for PCI devices. Furthermore, if the PCI device is hot pluggable, when it is plugged in, the device tree nodes for its parent bridges are required. Add support to generate device tree node for PCI bridges.
Add an of_pci_make_dev_node() interface that can be used to create device tree node for PCI devices.
Add a PCI_DYNAMIC_OF_NODES config option. When the option is turned on, the kernel will generate device tree nodes for PCI bridges unconditionally.
Initially, add the basic properties for the dynamically generated device tree nodes which include #address-cells, #size-cells, device_type, compatible, ranges, reg.
Acked-by: Bjorn Helgaas bhelgaas@google.com Signed-off-by: Lizhi Hou lizhi.hou@amd.com Link: https://lore.kernel.org/r/1692120000-46900-3-git-send-email-lizhi.hou@amd.co... Signed-off-by: Rob Herring robh@kernel.org Stable-dep-of: c9260693aa0c ("PCI: Lengthen reset delay for VideoPropulsion Torrent QN16e card") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pci/Kconfig | 12 ++ drivers/pci/Makefile | 1 + drivers/pci/bus.c | 2 + drivers/pci/of.c | 79 +++++++++ drivers/pci/of_property.c | 355 ++++++++++++++++++++++++++++++++++++++ drivers/pci/pci.h | 12 ++ drivers/pci/remove.c | 1 + 7 files changed, 462 insertions(+) create mode 100644 drivers/pci/of_property.c
diff --git a/drivers/pci/Kconfig b/drivers/pci/Kconfig index 3c07d8d214b38..49bd09c7dd0a1 100644 --- a/drivers/pci/Kconfig +++ b/drivers/pci/Kconfig @@ -194,6 +194,18 @@ config PCI_HYPERV The PCI device frontend driver allows the kernel to import arbitrary PCI devices from a PCI backend to support PCI driver domains.
+config PCI_DYNAMIC_OF_NODES + bool "Create Device tree nodes for PCI devices" + depends on OF + select OF_DYNAMIC + help + This option enables support for generating device tree nodes for some + PCI devices. Thus, the driver of this kind can load and overlay + flattened device tree for its downstream devices. + + Once this option is selected, the device tree nodes will be generated + for all PCI bridges. + choice prompt "PCI Express hierarchy optimization setting" default PCIE_BUS_DEFAULT diff --git a/drivers/pci/Makefile b/drivers/pci/Makefile index 2680e4c92f0ab..cc8b4e01e29de 100644 --- a/drivers/pci/Makefile +++ b/drivers/pci/Makefile @@ -32,6 +32,7 @@ obj-$(CONFIG_PCI_P2PDMA) += p2pdma.o obj-$(CONFIG_XEN_PCIDEV_FRONTEND) += xen-pcifront.o obj-$(CONFIG_VGA_ARB) += vgaarb.o obj-$(CONFIG_PCI_DOE) += doe.o +obj-$(CONFIG_PCI_DYNAMIC_OF_NODES) += of_property.o
# Endpoint library must be initialized before its users obj-$(CONFIG_PCI_ENDPOINT) += endpoint/ diff --git a/drivers/pci/bus.c b/drivers/pci/bus.c index 46b252bbe5000..9c2137dae429a 100644 --- a/drivers/pci/bus.c +++ b/drivers/pci/bus.c @@ -342,6 +342,8 @@ void pci_bus_add_device(struct pci_dev *dev) */ pcibios_bus_add_device(dev); pci_fixup_device(pci_fixup_final, dev); + if (pci_is_bridge(dev)) + of_pci_make_dev_node(dev); pci_create_sysfs_dev_files(dev); pci_proc_attach_device(dev); pci_bridge_d3_update(dev); diff --git a/drivers/pci/of.c b/drivers/pci/of.c index 3c158b17dcb53..2af64bcb7da3a 100644 --- a/drivers/pci/of.c +++ b/drivers/pci/of.c @@ -606,6 +606,85 @@ int devm_of_pci_bridge_init(struct device *dev, struct pci_host_bridge *bridge) return pci_parse_request_of_pci_ranges(dev, bridge); }
+#ifdef CONFIG_PCI_DYNAMIC_OF_NODES + +void of_pci_remove_node(struct pci_dev *pdev) +{ + struct device_node *np; + + np = pci_device_to_OF_node(pdev); + if (!np || !of_node_check_flag(np, OF_DYNAMIC)) + return; + pdev->dev.of_node = NULL; + + of_changeset_revert(np->data); + of_changeset_destroy(np->data); + of_node_put(np); +} + +void of_pci_make_dev_node(struct pci_dev *pdev) +{ + struct device_node *ppnode, *np = NULL; + const char *pci_type; + struct of_changeset *cset; + const char *name; + int ret; + + /* + * If there is already a device tree node linked to this device, + * return immediately. + */ + if (pci_device_to_OF_node(pdev)) + return; + + /* Check if there is device tree node for parent device */ + if (!pdev->bus->self) + ppnode = pdev->bus->dev.of_node; + else + ppnode = pdev->bus->self->dev.of_node; + if (!ppnode) + return; + + if (pci_is_bridge(pdev)) + pci_type = "pci"; + else + pci_type = "dev"; + + name = kasprintf(GFP_KERNEL, "%s@%x,%x", pci_type, + PCI_SLOT(pdev->devfn), PCI_FUNC(pdev->devfn)); + if (!name) + return; + + cset = kmalloc(sizeof(*cset), GFP_KERNEL); + if (!cset) + goto failed; + of_changeset_init(cset); + + np = of_changeset_create_node(cset, ppnode, name); + if (!np) + goto failed; + np->data = cset; + + ret = of_pci_add_properties(pdev, cset, np); + if (ret) + goto failed; + + ret = of_changeset_apply(cset); + if (ret) + goto failed; + + pdev->dev.of_node = np; + kfree(name); + + return; + +failed: + if (np) + of_node_put(np); + kfree(name); +} +#endif + #endif /* CONFIG_PCI */
/** diff --git a/drivers/pci/of_property.c b/drivers/pci/of_property.c new file mode 100644 index 0000000000000..710ec35ba4a17 --- /dev/null +++ b/drivers/pci/of_property.c @@ -0,0 +1,355 @@ +// SPDX-License-Identifier: GPL-2.0+ +/* + * Copyright (C) 2022-2023, Advanced Micro Devices, Inc. + */ + +#include <linux/pci.h> +#include <linux/of.h> +#include <linux/of_irq.h> +#include <linux/bitfield.h> +#include <linux/bits.h> +#include "pci.h" + +#define OF_PCI_ADDRESS_CELLS 3 +#define OF_PCI_SIZE_CELLS 2 +#define OF_PCI_MAX_INT_PIN 4 + +struct of_pci_addr_pair { + u32 phys_addr[OF_PCI_ADDRESS_CELLS]; + u32 size[OF_PCI_SIZE_CELLS]; +}; + +/* + * Each entry in the ranges table is a tuple containing the child address, + * the parent address, and the size of the region in the child address space. + * Thus, for PCI, in each entry parent address is an address on the primary + * side and the child address is the corresponding address on the secondary + * side. + */ +struct of_pci_range { + u32 child_addr[OF_PCI_ADDRESS_CELLS]; + u32 parent_addr[OF_PCI_ADDRESS_CELLS]; + u32 size[OF_PCI_SIZE_CELLS]; +}; + +#define OF_PCI_ADDR_SPACE_IO 0x1 +#define OF_PCI_ADDR_SPACE_MEM32 0x2 +#define OF_PCI_ADDR_SPACE_MEM64 0x3 + +#define OF_PCI_ADDR_FIELD_NONRELOC BIT(31) +#define OF_PCI_ADDR_FIELD_SS GENMASK(25, 24) +#define OF_PCI_ADDR_FIELD_PREFETCH BIT(30) +#define OF_PCI_ADDR_FIELD_BUS GENMASK(23, 16) +#define OF_PCI_ADDR_FIELD_DEV GENMASK(15, 11) +#define OF_PCI_ADDR_FIELD_FUNC GENMASK(10, 8) +#define OF_PCI_ADDR_FIELD_REG GENMASK(7, 0) + +enum of_pci_prop_compatible { + PROP_COMPAT_PCI_VVVV_DDDD, + PROP_COMPAT_PCICLASS_CCSSPP, + PROP_COMPAT_PCICLASS_CCSS, + PROP_COMPAT_NUM, +}; + +static void of_pci_set_address(struct pci_dev *pdev, u32 *prop, u64 addr, + u32 reg_num, u32 flags, bool reloc) +{ + prop[0] = FIELD_PREP(OF_PCI_ADDR_FIELD_BUS, pdev->bus->number) | + FIELD_PREP(OF_PCI_ADDR_FIELD_DEV, PCI_SLOT(pdev->devfn)) | + FIELD_PREP(OF_PCI_ADDR_FIELD_FUNC, PCI_FUNC(pdev->devfn)); + prop[0] |= flags | reg_num; + if (!reloc) { + prop[0] |= OF_PCI_ADDR_FIELD_NONRELOC; + prop[1] = upper_32_bits(addr); + prop[2] = lower_32_bits(addr); + } +} + +static int of_pci_get_addr_flags(struct resource *res, u32 *flags) +{ + u32 ss; + + if (res->flags & IORESOURCE_IO) + ss = OF_PCI_ADDR_SPACE_IO; + else if (res->flags & IORESOURCE_MEM_64) + ss = OF_PCI_ADDR_SPACE_MEM64; + else if (res->flags & IORESOURCE_MEM) + ss = OF_PCI_ADDR_SPACE_MEM32; + else + return -EINVAL; + + *flags = 0; + if (res->flags & IORESOURCE_PREFETCH) + *flags |= OF_PCI_ADDR_FIELD_PREFETCH; + + *flags |= FIELD_PREP(OF_PCI_ADDR_FIELD_SS, ss); + + return 0; +} + +static int of_pci_prop_bus_range(struct pci_dev *pdev, + struct of_changeset *ocs, + struct device_node *np) +{ + u32 bus_range[] = { pdev->subordinate->busn_res.start, + pdev->subordinate->busn_res.end }; + + return of_changeset_add_prop_u32_array(ocs, np, "bus-range", bus_range, + ARRAY_SIZE(bus_range)); +} + +static int of_pci_prop_ranges(struct pci_dev *pdev, struct of_changeset *ocs, + struct device_node *np) +{ + struct of_pci_range *rp; + struct resource *res; + int i, j, ret; + u32 flags, num; + u64 val64; + + if (pci_is_bridge(pdev)) { + num = PCI_BRIDGE_RESOURCE_NUM; + res = &pdev->resource[PCI_BRIDGE_RESOURCES]; + } else { + num = PCI_STD_NUM_BARS; + res = &pdev->resource[PCI_STD_RESOURCES]; + } + + rp = kcalloc(num, sizeof(*rp), GFP_KERNEL); + if (!rp) + return -ENOMEM; + + for (i = 0, j = 0; j < num; j++) { + if (!resource_size(&res[j])) + continue; + + if (of_pci_get_addr_flags(&res[j], &flags)) + continue; + + val64 = res[j].start; + of_pci_set_address(pdev, rp[i].parent_addr, val64, 0, flags, + false); + if (pci_is_bridge(pdev)) { + memcpy(rp[i].child_addr, rp[i].parent_addr, + sizeof(rp[i].child_addr)); + } else { + /* + * For endpoint device, the lower 64-bits of child + * address is always zero. + */ + rp[i].child_addr[0] = j; + } + + val64 = resource_size(&res[j]); + rp[i].size[0] = upper_32_bits(val64); + rp[i].size[1] = lower_32_bits(val64); + + i++; + } + + ret = of_changeset_add_prop_u32_array(ocs, np, "ranges", (u32 *)rp, + i * sizeof(*rp) / sizeof(u32)); + kfree(rp); + + return ret; +} + +static int of_pci_prop_reg(struct pci_dev *pdev, struct of_changeset *ocs, + struct device_node *np) +{ + struct of_pci_addr_pair reg = { 0 }; + + /* configuration space */ + of_pci_set_address(pdev, reg.phys_addr, 0, 0, 0, true); + + return of_changeset_add_prop_u32_array(ocs, np, "reg", (u32 *)®, + sizeof(reg) / sizeof(u32)); +} + +static int of_pci_prop_interrupts(struct pci_dev *pdev, + struct of_changeset *ocs, + struct device_node *np) +{ + int ret; + u8 pin; + + ret = pci_read_config_byte(pdev, PCI_INTERRUPT_PIN, &pin); + if (ret != 0) + return ret; + + if (!pin) + return 0; + + return of_changeset_add_prop_u32(ocs, np, "interrupts", (u32)pin); +} + +static int of_pci_prop_intr_map(struct pci_dev *pdev, struct of_changeset *ocs, + struct device_node *np) +{ + struct of_phandle_args out_irq[OF_PCI_MAX_INT_PIN]; + u32 i, addr_sz[OF_PCI_MAX_INT_PIN], map_sz = 0; + __be32 laddr[OF_PCI_ADDRESS_CELLS] = { 0 }; + u32 int_map_mask[] = { 0xffff00, 0, 0, 7 }; + struct device_node *pnode; + struct pci_dev *child; + u32 *int_map, *mapp; + int ret; + u8 pin; + + pnode = pci_device_to_OF_node(pdev->bus->self); + if (!pnode) + pnode = pci_bus_to_OF_node(pdev->bus); + + if (!pnode) { + pci_err(pdev, "failed to get parent device node"); + return -EINVAL; + } + + laddr[0] = cpu_to_be32((pdev->bus->number << 16) | (pdev->devfn << 8)); + for (pin = 1; pin <= OF_PCI_MAX_INT_PIN; pin++) { + i = pin - 1; + out_irq[i].np = pnode; + out_irq[i].args_count = 1; + out_irq[i].args[0] = pin; + ret = of_irq_parse_raw(laddr, &out_irq[i]); + if (ret) { + pci_err(pdev, "parse irq %d failed, ret %d", pin, ret); + continue; + } + ret = of_property_read_u32(out_irq[i].np, "#address-cells", + &addr_sz[i]); + if (ret) + addr_sz[i] = 0; + } + + list_for_each_entry(child, &pdev->subordinate->devices, bus_list) { + for (pin = 1; pin <= OF_PCI_MAX_INT_PIN; pin++) { + i = pci_swizzle_interrupt_pin(child, pin) - 1; + map_sz += 5 + addr_sz[i] + out_irq[i].args_count; + } + } + + int_map = kcalloc(map_sz, sizeof(u32), GFP_KERNEL); + mapp = int_map; + + list_for_each_entry(child, &pdev->subordinate->devices, bus_list) { + for (pin = 1; pin <= OF_PCI_MAX_INT_PIN; pin++) { + *mapp = (child->bus->number << 16) | + (child->devfn << 8); + mapp += OF_PCI_ADDRESS_CELLS; + *mapp = pin; + mapp++; + i = pci_swizzle_interrupt_pin(child, pin) - 1; + *mapp = out_irq[i].np->phandle; + mapp++; + if (addr_sz[i]) { + ret = of_property_read_u32_array(out_irq[i].np, + "reg", mapp, + addr_sz[i]); + if (ret) + goto failed; + } + mapp += addr_sz[i]; + memcpy(mapp, out_irq[i].args, + out_irq[i].args_count * sizeof(u32)); + mapp += out_irq[i].args_count; + } + } + + ret = of_changeset_add_prop_u32_array(ocs, np, "interrupt-map", int_map, + map_sz); + if (ret) + goto failed; + + ret = of_changeset_add_prop_u32(ocs, np, "#interrupt-cells", 1); + if (ret) + goto failed; + + ret = of_changeset_add_prop_u32_array(ocs, np, "interrupt-map-mask", + int_map_mask, + ARRAY_SIZE(int_map_mask)); + if (ret) + goto failed; + + kfree(int_map); + return 0; + +failed: + kfree(int_map); + return ret; +} + +static int of_pci_prop_compatible(struct pci_dev *pdev, + struct of_changeset *ocs, + struct device_node *np) +{ + const char *compat_strs[PROP_COMPAT_NUM] = { 0 }; + int i, ret; + + compat_strs[PROP_COMPAT_PCI_VVVV_DDDD] = + kasprintf(GFP_KERNEL, "pci%x,%x", pdev->vendor, pdev->device); + compat_strs[PROP_COMPAT_PCICLASS_CCSSPP] = + kasprintf(GFP_KERNEL, "pciclass,%06x", pdev->class); + compat_strs[PROP_COMPAT_PCICLASS_CCSS] = + kasprintf(GFP_KERNEL, "pciclass,%04x", pdev->class >> 8); + + ret = of_changeset_add_prop_string_array(ocs, np, "compatible", + compat_strs, PROP_COMPAT_NUM); + for (i = 0; i < PROP_COMPAT_NUM; i++) + kfree(compat_strs[i]); + + return ret; +} + +int of_pci_add_properties(struct pci_dev *pdev, struct of_changeset *ocs, + struct device_node *np) +{ + int ret; + + /* + * The added properties will be released when the + * changeset is destroyed. + */ + if (pci_is_bridge(pdev)) { + ret = of_changeset_add_prop_string(ocs, np, "device_type", + "pci"); + if (ret) + return ret; + + ret = of_pci_prop_bus_range(pdev, ocs, np); + if (ret) + return ret; + + ret = of_pci_prop_intr_map(pdev, ocs, np); + if (ret) + return ret; + } + + ret = of_pci_prop_ranges(pdev, ocs, np); + if (ret) + return ret; + + ret = of_changeset_add_prop_u32(ocs, np, "#address-cells", + OF_PCI_ADDRESS_CELLS); + if (ret) + return ret; + + ret = of_changeset_add_prop_u32(ocs, np, "#size-cells", + OF_PCI_SIZE_CELLS); + if (ret) + return ret; + + ret = of_pci_prop_reg(pdev, ocs, np); + if (ret) + return ret; + + ret = of_pci_prop_compatible(pdev, ocs, np); + if (ret) + return ret; + + ret = of_pci_prop_interrupts(pdev, ocs, np); + if (ret) + return ret; + + return 0; +} diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h index a4c3974340576..ba717bdd700db 100644 --- a/drivers/pci/pci.h +++ b/drivers/pci/pci.h @@ -679,6 +679,18 @@ static inline int devm_of_pci_bridge_init(struct device *dev, struct pci_host_br
#endif /* CONFIG_OF */
+struct of_changeset; + +#ifdef CONFIG_PCI_DYNAMIC_OF_NODES +void of_pci_make_dev_node(struct pci_dev *pdev); +void of_pci_remove_node(struct pci_dev *pdev); +int of_pci_add_properties(struct pci_dev *pdev, struct of_changeset *ocs, + struct device_node *np); +#else +static inline void of_pci_make_dev_node(struct pci_dev *pdev) { } +static inline void of_pci_remove_node(struct pci_dev *pdev) { } +#endif + #ifdef CONFIG_PCIEAER void pci_no_aer(void); void pci_aer_init(struct pci_dev *dev); diff --git a/drivers/pci/remove.c b/drivers/pci/remove.c index d68aee29386b4..d749ea8250d65 100644 --- a/drivers/pci/remove.c +++ b/drivers/pci/remove.c @@ -22,6 +22,7 @@ static void pci_stop_dev(struct pci_dev *dev) device_release_driver(&dev->dev); pci_proc_detach_device(dev); pci_remove_sysfs_dev_files(dev); + of_pci_remove_node(dev);
pci_dev_assign_added(dev, false); }
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Lizhi Hou lizhi.hou@amd.com
[ Upstream commit ae9813db1dc5ac987a09889791155a7b8c527f8d ]
The Xilinx Alveo U50 PCI card exposes multiple hardware peripherals on its PCI BAR. The card firmware provides a flattened device tree to describe the hardware peripherals on its BARs. This allows U50 driver to load the flattened device tree and generate the device tree node for hardware peripherals underneath.
To generate device tree node for U50 card, add PCI quirks to call of_pci_make_dev_node() for U50.
Acked-by: Bjorn Helgaas bhelgaas@google.com Signed-off-by: Lizhi Hou lizhi.hou@amd.com Link: https://lore.kernel.org/r/1692120000-46900-4-git-send-email-lizhi.hou@amd.co... Signed-off-by: Rob Herring robh@kernel.org Stable-dep-of: c9260693aa0c ("PCI: Lengthen reset delay for VideoPropulsion Torrent QN16e card") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pci/quirks.c | 11 +++++++++++ 1 file changed, 11 insertions(+)
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c index 9fa3c9225bb30..3ec7bcfbf4dc0 100644 --- a/drivers/pci/quirks.c +++ b/drivers/pci/quirks.c @@ -6161,3 +6161,14 @@ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x9a2d, dpc_log_size); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x9a2f, dpc_log_size); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x9a31, dpc_log_size); #endif + +/* + * For a PCI device with multiple downstream devices, its driver may use + * a flattened device tree to describe the downstream devices. + * To overlay the flattened device tree, the PCI device and all its ancestor + * devices need to have device tree nodes on system base device tree. Thus, + * before driver probing, it might need to add a device tree node as the final + * fixup. + */ +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_XILINX, 0x5020, of_pci_make_dev_node); +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_XILINX, 0x5021, of_pci_make_dev_node);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Lukas Wunner lukas@wunner.de
[ Upstream commit c9260693aa0c1e029ed23693cfd4d7814eee6624 ]
Commit ac91e6980563 ("PCI: Unify delay handling for reset and resume") shortened an unconditional 1 sec delay after a Secondary Bus Reset to 100 msec for PCIe (per PCIe r6.1 sec 6.6.1). The 1 sec delay is only required for Conventional PCI.
But it turns out that there are PCIe devices which require a longer delay than prescribed before first config space access after reset recovery or resume from D3cold:
Chad reports that a "VideoPropulsion Torrent QN16e" MPEG QAM Modulator "raises a PCI system error (PERR), as reported by the IPMI event log, and the hardware itself would suffer a catastrophic event, cycling the server" unless the longer delay is observed.
The card is specified to conform to PCIe r1.0 and indeed only supports Gen1 speed (2.5 GT/s) according to lspci. PCIe r1.0 sec 7.6 prescribes the same 100 msec delay as PCIe r6.1 sec 6.6.1:
To allow components to perform internal initialization, system software must wait for at least 100 ms from the end of a reset (cold/warm/hot) before it is permitted to issue Configuration Requests
The behavior of the Torrent QN16e card thus appears to be a quirk. Treat it as such and lengthen the reset delay for this specific device.
Fixes: ac91e6980563 ("PCI: Unify delay handling for reset and resume") Link: https://lore.kernel.org/r/47727e792c7f0282dc144e3ec8ce8eb6e713394e.169530451... Reported-by: Chad Schroeder CSchroeder@sonifi.com Closes: https://lore.kernel.org/linux-pci/DM6PR16MB2844903E34CAB910082DF019B1FAA@DM6... Tested-by: Chad Schroeder CSchroeder@sonifi.com Signed-off-by: Lukas Wunner lukas@wunner.de Signed-off-by: Bjorn Helgaas bhelgaas@google.com Cc: stable@vger.kernel.org # v5.4+ Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pci/quirks.c | 12 ++++++++++++ 1 file changed, 12 insertions(+)
--- a/drivers/pci/quirks.c +++ b/drivers/pci/quirks.c @@ -6172,3 +6172,15 @@ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_I */ DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_XILINX, 0x5020, of_pci_make_dev_node); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_XILINX, 0x5021, of_pci_make_dev_node); + +/* + * Devices known to require a longer delay before first config space access + * after reset recovery or resume from D3cold: + * + * VideoPropulsion (aka Genroco) Torrent QN16e MPEG QAM Modulator + */ +static void pci_fixup_d3cold_delay_1sec(struct pci_dev *pdev) +{ + pdev->d3cold_delay = 1000; +} +DECLARE_PCI_FIXUP_FINAL(0x5555, 0x0004, pci_fixup_d3cold_delay_1sec);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Paul E. McKenney paulmck@kernel.org
[ Upstream commit 67d5404d274376890d6d095a10e6565854918f8e ]
This commit adds a kthread-creation callback to the _torture_create_kthread() function, which allows callers of a new torture_create_kthread_cb() macro to specify a function to be invoked after the kthread is created but before it is awakened for the first time.
Signed-off-by: Paul E. McKenney paulmck@kernel.org Cc: Dietmar Eggemann dietmar.eggemann@arm.com Cc: Josh Triplett josh@joshtriplett.org Cc: Juri Lelli juri.lelli@redhat.com Cc: Valentin Schneider vschneid@redhat.com Cc: Dietmar Eggemann dietmar.eggemann@arm.com Cc: kernel-team@android.com Reviewed-by: Joel Fernandes (Google) joel@joelfernandes.org Acked-by: John Stultz jstultz@google.com Stable-dep-of: cca42bd8eb1b ("rcutorture: Fix stuttering races and other issues") Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/torture.h | 7 +++++-- kernel/torture.c | 6 +++++- 2 files changed, 10 insertions(+), 3 deletions(-)
diff --git a/include/linux/torture.h b/include/linux/torture.h index 7038104463e48..bb466eec01e42 100644 --- a/include/linux/torture.h +++ b/include/linux/torture.h @@ -108,12 +108,15 @@ bool torture_must_stop(void); bool torture_must_stop_irq(void); void torture_kthread_stopping(char *title); int _torture_create_kthread(int (*fn)(void *arg), void *arg, char *s, char *m, - char *f, struct task_struct **tp); + char *f, struct task_struct **tp, void (*cbf)(struct task_struct *tp)); void _torture_stop_kthread(char *m, struct task_struct **tp);
#define torture_create_kthread(n, arg, tp) \ _torture_create_kthread(n, (arg), #n, "Creating " #n " task", \ - "Failed to create " #n, &(tp)) + "Failed to create " #n, &(tp), NULL) +#define torture_create_kthread_cb(n, arg, tp, cbf) \ + _torture_create_kthread(n, (arg), #n, "Creating " #n " task", \ + "Failed to create " #n, &(tp), cbf) #define torture_stop_kthread(n, tp) \ _torture_stop_kthread("Stopping " #n " task", &(tp))
diff --git a/kernel/torture.c b/kernel/torture.c index 1a0519b836ac9..1da48f3816f61 100644 --- a/kernel/torture.c +++ b/kernel/torture.c @@ -926,7 +926,7 @@ EXPORT_SYMBOL_GPL(torture_kthread_stopping); * it starts, you will need to open-code your own. */ int _torture_create_kthread(int (*fn)(void *arg), void *arg, char *s, char *m, - char *f, struct task_struct **tp) + char *f, struct task_struct **tp, void (*cbf)(struct task_struct *tp)) { int ret = 0;
@@ -938,6 +938,10 @@ int _torture_create_kthread(int (*fn)(void *arg), void *arg, char *s, char *m, *tp = NULL; return ret; } + + if (cbf) + cbf(*tp); + wake_up_process(*tp); // Process is sleeping, so ordering provided. torture_shuffle_task_register(*tp); return ret;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dietmar Eggemann dietmar.eggemann@arm.com
[ Upstream commit 5d248bb39fe1388943acb6510f8f48fa5570e0ec ]
This commit adds a module parameter that causes the locktorture writer to run at real-time priority.
To use it: insmod /lib/modules/torture.ko random_shuffle=1 insmod /lib/modules/locktorture.ko torture_type=mutex_lock rt_boost=1 rt_boost_factor=50 nested_locks=3 writer_fifo=1 ^^^^^^^^^^^^^
A predecessor to this patch has been helpful to uncover issues with the proxy-execution series.
[ paulmck: Remove locktorture-specific code from kernel/torture.c. ]
Cc: "Paul E. McKenney" paulmck@kernel.org Cc: Josh Triplett josh@joshtriplett.org Cc: Joel Fernandes joel@joelfernandes.org Cc: Juri Lelli juri.lelli@redhat.com Cc: Valentin Schneider vschneid@redhat.com Cc: kernel-team@android.com Signed-off-by: Dietmar Eggemann dietmar.eggemann@arm.com [jstultz: Include header change to build, reword commit message] Signed-off-by: John Stultz jstultz@google.com Acked-by: Davidlohr Bueso dave@stgolabs.net Signed-off-by: Paul E. McKenney paulmck@kernel.org Stable-dep-of: cca42bd8eb1b ("rcutorture: Fix stuttering races and other issues") Signed-off-by: Sasha Levin sashal@kernel.org --- Documentation/admin-guide/kernel-parameters.txt | 4 ++++ kernel/locking/locktorture.c | 12 +++++++----- kernel/torture.c | 3 ++- 3 files changed, 13 insertions(+), 6 deletions(-)
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt index b951b4c2808f1..5711129686d10 100644 --- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -2938,6 +2938,10 @@ locktorture.torture_type= [KNL] Specify the locking implementation to test.
+ locktorture.writer_fifo= [KNL] + Run the write-side locktorture kthreads at + sched_set_fifo() real-time priority. + locktorture.verbose= [KNL] Enable additional printk() statements.
diff --git a/kernel/locking/locktorture.c b/kernel/locking/locktorture.c index 949d3deae5062..270c7f80ce84c 100644 --- a/kernel/locking/locktorture.c +++ b/kernel/locking/locktorture.c @@ -45,6 +45,7 @@ torture_param(int, stutter, 5, "Number of jiffies to run/halt test, 0=disable"); torture_param(int, rt_boost, 2, "Do periodic rt-boost. 0=Disable, 1=Only for rt_mutex, 2=For all lock types."); torture_param(int, rt_boost_factor, 50, "A factor determining how often rt-boost happens."); +torture_param(int, writer_fifo, 0, "Run writers at sched_set_fifo() priority"); torture_param(int, verbose, 1, "Enable verbose debugging printk()s"); torture_param(int, nested_locks, 0, "Number of nested locks (max = 8)"); /* Going much higher trips "BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!" errors */ @@ -809,7 +810,8 @@ static int lock_torture_writer(void *arg) bool skip_main_lock;
VERBOSE_TOROUT_STRING("lock_torture_writer task started"); - set_user_nice(current, MAX_NICE); + if (!rt_task(current)) + set_user_nice(current, MAX_NICE);
do { if ((torture_random(&rand) & 0xfffff) == 0) @@ -1015,8 +1017,7 @@ static void lock_torture_cleanup(void)
if (writer_tasks) { for (i = 0; i < cxt.nrealwriters_stress; i++) - torture_stop_kthread(lock_torture_writer, - writer_tasks[i]); + torture_stop_kthread(lock_torture_writer, writer_tasks[i]); kfree(writer_tasks); writer_tasks = NULL; } @@ -1244,8 +1245,9 @@ static int __init lock_torture_init(void) goto create_reader;
/* Create writer. */ - firsterr = torture_create_kthread(lock_torture_writer, &cxt.lwsa[i], - writer_tasks[i]); + firsterr = torture_create_kthread_cb(lock_torture_writer, &cxt.lwsa[i], + writer_tasks[i], + writer_fifo ? sched_set_fifo : NULL); if (torture_init_error(firsterr)) goto unwind;
diff --git a/kernel/torture.c b/kernel/torture.c index 1da48f3816f61..e06b03e987c9f 100644 --- a/kernel/torture.c +++ b/kernel/torture.c @@ -37,6 +37,7 @@ #include <linux/ktime.h> #include <asm/byteorder.h> #include <linux/torture.h> +#include <linux/sched/rt.h> #include "rcu/rcu.h"
MODULE_LICENSE("GPL"); @@ -728,7 +729,7 @@ bool stutter_wait(const char *title) cond_resched_tasks_rcu_qs(); spt = READ_ONCE(stutter_pause_test); for (; spt; spt = READ_ONCE(stutter_pause_test)) { - if (!ret) { + if (!ret && !rt_task(current)) { sched_set_normal(current, MAX_NICE); ret = true; }
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Paul E. McKenney paulmck@kernel.org
[ Upstream commit 872948c665f50a1446e8a34b1ed57bb0b3a9ca4a ]
Given that it is expected that more code will use torture_hrtimeout_*(), including for longer timeouts, make it use TASK_IDLE instead of TASK_UNINTERRUPTIBLE.
Signed-off-by: Paul E. McKenney paulmck@kernel.org Stable-dep-of: cca42bd8eb1b ("rcutorture: Fix stuttering races and other issues") Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/torture.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/torture.c b/kernel/torture.c index e06b03e987c9f..4a2e0512f9197 100644 --- a/kernel/torture.c +++ b/kernel/torture.c @@ -90,7 +90,7 @@ int torture_hrtimeout_ns(ktime_t baset_ns, u32 fuzzt_ns, struct torture_random_s
if (trsp) hto += (torture_random(trsp) >> 3) % fuzzt_ns; - set_current_state(TASK_UNINTERRUPTIBLE); + set_current_state(TASK_IDLE); return schedule_hrtimeout(&hto, HRTIMER_MODE_REL); } EXPORT_SYMBOL_GPL(torture_hrtimeout_ns);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Paul E. McKenney paulmck@kernel.org
[ Upstream commit 10af43671e8bf4ac153c4991a17cdf57bc6d2cfe ]
In order to gain better race coverage, move the test start/stop waits in stutter_wait() to torture_hrtimeout_jiffies().
Signed-off-by: Paul E. McKenney paulmck@kernel.org Stable-dep-of: cca42bd8eb1b ("rcutorture: Fix stuttering races and other issues") Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/torture.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/kernel/torture.c b/kernel/torture.c index 4a2e0512f9197..a55cb70b192fc 100644 --- a/kernel/torture.c +++ b/kernel/torture.c @@ -734,7 +734,7 @@ bool stutter_wait(const char *title) ret = true; } if (spt == 1) { - schedule_timeout_interruptible(1); + torture_hrtimeout_jiffies(1, NULL); } else if (spt == 2) { while (READ_ONCE(stutter_pause_test)) { if (!(i++ & 0xffff)) @@ -742,7 +742,7 @@ bool stutter_wait(const char *title) cond_resched(); } } else { - schedule_timeout_interruptible(round_jiffies_relative(HZ)); + torture_hrtimeout_jiffies(round_jiffies_relative(HZ), NULL); } torture_shutdown_absorb(title); }
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Paul E. McKenney paulmck@kernel.org
[ Upstream commit a741deac787f0d2d7068638c067db20af9e63752 ]
The current torture-test sleeps are waiting for a duration, but there are situations where it is better to wait for an absolute time, for example, when ending a stutter interval. This commit therefore adds an hrtimer mode parameter to torture_hrtimeout_ns(). Why not also the other torture_hrtimeout_*() functions? The theory is that most absolute times will be in nanoseconds, especially not (say) jiffies.
Signed-off-by: Paul E. McKenney paulmck@kernel.org Signed-off-by: Frederic Weisbecker frederic@kernel.org Stable-dep-of: cca42bd8eb1b ("rcutorture: Fix stuttering races and other issues") Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/torture.h | 3 ++- kernel/torture.c | 13 +++++++------ 2 files changed, 9 insertions(+), 7 deletions(-)
diff --git a/include/linux/torture.h b/include/linux/torture.h index bb466eec01e42..017f0f710815a 100644 --- a/include/linux/torture.h +++ b/include/linux/torture.h @@ -81,7 +81,8 @@ static inline void torture_random_init(struct torture_random_state *trsp) }
/* Definitions for high-resolution-timer sleeps. */ -int torture_hrtimeout_ns(ktime_t baset_ns, u32 fuzzt_ns, struct torture_random_state *trsp); +int torture_hrtimeout_ns(ktime_t baset_ns, u32 fuzzt_ns, const enum hrtimer_mode mode, + struct torture_random_state *trsp); int torture_hrtimeout_us(u32 baset_us, u32 fuzzt_ns, struct torture_random_state *trsp); int torture_hrtimeout_ms(u32 baset_ms, u32 fuzzt_us, struct torture_random_state *trsp); int torture_hrtimeout_jiffies(u32 baset_j, struct torture_random_state *trsp); diff --git a/kernel/torture.c b/kernel/torture.c index a55cb70b192fc..fd2168058dac8 100644 --- a/kernel/torture.c +++ b/kernel/torture.c @@ -84,14 +84,15 @@ EXPORT_SYMBOL_GPL(verbose_torout_sleep); * nanosecond random fuzz. This function and its friends desynchronize * testing from the timer wheel. */ -int torture_hrtimeout_ns(ktime_t baset_ns, u32 fuzzt_ns, struct torture_random_state *trsp) +int torture_hrtimeout_ns(ktime_t baset_ns, u32 fuzzt_ns, const enum hrtimer_mode mode, + struct torture_random_state *trsp) { ktime_t hto = baset_ns;
if (trsp) hto += (torture_random(trsp) >> 3) % fuzzt_ns; set_current_state(TASK_IDLE); - return schedule_hrtimeout(&hto, HRTIMER_MODE_REL); + return schedule_hrtimeout(&hto, mode); } EXPORT_SYMBOL_GPL(torture_hrtimeout_ns);
@@ -103,7 +104,7 @@ int torture_hrtimeout_us(u32 baset_us, u32 fuzzt_ns, struct torture_random_state { ktime_t baset_ns = baset_us * NSEC_PER_USEC;
- return torture_hrtimeout_ns(baset_ns, fuzzt_ns, trsp); + return torture_hrtimeout_ns(baset_ns, fuzzt_ns, HRTIMER_MODE_REL, trsp); } EXPORT_SYMBOL_GPL(torture_hrtimeout_us);
@@ -120,7 +121,7 @@ int torture_hrtimeout_ms(u32 baset_ms, u32 fuzzt_us, struct torture_random_state fuzzt_ns = (u32)~0U; else fuzzt_ns = fuzzt_us * NSEC_PER_USEC; - return torture_hrtimeout_ns(baset_ns, fuzzt_ns, trsp); + return torture_hrtimeout_ns(baset_ns, fuzzt_ns, HRTIMER_MODE_REL, trsp); } EXPORT_SYMBOL_GPL(torture_hrtimeout_ms);
@@ -133,7 +134,7 @@ int torture_hrtimeout_jiffies(u32 baset_j, struct torture_random_state *trsp) { ktime_t baset_ns = jiffies_to_nsecs(baset_j);
- return torture_hrtimeout_ns(baset_ns, jiffies_to_nsecs(1), trsp); + return torture_hrtimeout_ns(baset_ns, jiffies_to_nsecs(1), HRTIMER_MODE_REL, trsp); } EXPORT_SYMBOL_GPL(torture_hrtimeout_jiffies);
@@ -150,7 +151,7 @@ int torture_hrtimeout_s(u32 baset_s, u32 fuzzt_ms, struct torture_random_state * fuzzt_ns = (u32)~0U; else fuzzt_ns = fuzzt_ms * NSEC_PER_MSEC; - return torture_hrtimeout_ns(baset_ns, fuzzt_ns, trsp); + return torture_hrtimeout_ns(baset_ns, fuzzt_ns, HRTIMER_MODE_REL, trsp); } EXPORT_SYMBOL_GPL(torture_hrtimeout_s);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Joel Fernandes (Google) joel@joelfernandes.org
[ Upstream commit cca42bd8eb1b54a4c9bbf48c79d120e66619a3e4 ]
The stuttering code isn't functioning as expected. Ideally, it should pause the torture threads for a designated period before resuming. Yet, it fails to halt the test for the correct duration. Additionally, a race condition exists, potentially causing the stuttering code to pause for an extended period if the 'spt' variable is non-zero due to the stutter orchestration thread's inadequate CPU time.
Moreover, over-stuttering can hinder RCU's progress on TREE07 kernels. This happens as the stuttering code may run within a softirq due to RCU callbacks. Consequently, ksoftirqd keeps a CPU busy for several seconds, thus obstructing RCU's progress. This situation triggers a warning message in the logs:
[ 2169.481783] rcu_torture_writer: rtort_pipe_count: 9
This warning suggests that an RCU torture object, although invisible to RCU readers, couldn't make it past the pipe array and be freed -- a strong indication that there weren't enough grace periods during the stutter interval.
To address these issues, this patch sets the "stutter end" time to an absolute point in the future set by the main stutter thread. This is then used for waiting in stutter_wait(). While the stutter thread still defines this absolute time, the waiters' waiting logic doesn't rely on the stutter thread receiving sufficient CPU time to halt the stuttering as the halting is now self-controlled.
Cc: stable@vger.kernel.org Signed-off-by: Joel Fernandes (Google) joel@joelfernandes.org Signed-off-by: Paul E. McKenney paulmck@kernel.org Signed-off-by: Frederic Weisbecker frederic@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/torture.c | 45 ++++++++++++--------------------------------- 1 file changed, 12 insertions(+), 33 deletions(-)
diff --git a/kernel/torture.c b/kernel/torture.c index fd2168058dac8..cd299ccc4e5d5 100644 --- a/kernel/torture.c +++ b/kernel/torture.c @@ -713,7 +713,7 @@ static void torture_shutdown_cleanup(void) * suddenly applied to or removed from the system. */ static struct task_struct *stutter_task; -static int stutter_pause_test; +static ktime_t stutter_till_abs_time; static int stutter; static int stutter_gap;
@@ -723,30 +723,16 @@ static int stutter_gap; */ bool stutter_wait(const char *title) { - unsigned int i = 0; bool ret = false; - int spt; + ktime_t till_ns;
cond_resched_tasks_rcu_qs(); - spt = READ_ONCE(stutter_pause_test); - for (; spt; spt = READ_ONCE(stutter_pause_test)) { - if (!ret && !rt_task(current)) { - sched_set_normal(current, MAX_NICE); - ret = true; - } - if (spt == 1) { - torture_hrtimeout_jiffies(1, NULL); - } else if (spt == 2) { - while (READ_ONCE(stutter_pause_test)) { - if (!(i++ & 0xffff)) - torture_hrtimeout_us(10, 0, NULL); - cond_resched(); - } - } else { - torture_hrtimeout_jiffies(round_jiffies_relative(HZ), NULL); - } - torture_shutdown_absorb(title); + till_ns = READ_ONCE(stutter_till_abs_time); + if (till_ns && ktime_before(ktime_get(), till_ns)) { + torture_hrtimeout_ns(till_ns, 0, HRTIMER_MODE_ABS, NULL); + ret = true; } + torture_shutdown_absorb(title); return ret; } EXPORT_SYMBOL_GPL(stutter_wait); @@ -757,23 +743,16 @@ EXPORT_SYMBOL_GPL(stutter_wait); */ static int torture_stutter(void *arg) { - DEFINE_TORTURE_RANDOM(rand); - int wtime; + ktime_t till_ns;
VERBOSE_TOROUT_STRING("torture_stutter task started"); do { if (!torture_must_stop() && stutter > 1) { - wtime = stutter; - if (stutter > 2) { - WRITE_ONCE(stutter_pause_test, 1); - wtime = stutter - 3; - torture_hrtimeout_jiffies(wtime, &rand); - wtime = 2; - } - WRITE_ONCE(stutter_pause_test, 2); - torture_hrtimeout_jiffies(wtime, NULL); + till_ns = ktime_add_ns(ktime_get(), + jiffies_to_nsecs(stutter)); + WRITE_ONCE(stutter_till_abs_time, till_ns); + torture_hrtimeout_jiffies(stutter - 1, NULL); } - WRITE_ONCE(stutter_pause_test, 0); if (!torture_must_stop()) torture_hrtimeout_jiffies(stutter_gap, NULL); torture_shutdown_absorb("torture_stutter");
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Peter Xu peterx@redhat.com
[ Upstream commit 458568c92953dee3716234711f1a2830a35261f3 ]
follow_page() doesn't use FOLL_PIN, meanwhile hugetlb seems to not be the target of FOLL_WRITE either. However add the checks.
Namely, either the need to CoW due to missing write bit, or proper unsharing on !AnonExclusive pages over R/O pins to reject the follow page. That brings this function closer to follow_hugetlb_page().
So we don't care before, and also for now. But we'll care if we switch over slow-gup to use hugetlb_follow_page_mask(). We'll also care when to return -EMLINK properly, as that's the gup internal api to mean "we should unshare". Not really needed for follow page path, though.
When at it, switching the try_grab_page() to use WARN_ON_ONCE(), to be clear that it just should never fail. When error happens, instead of setting page==NULL, capture the errno instead.
Link: https://lkml.kernel.org/r/20230628215310.73782-3-peterx@redhat.com Signed-off-by: Peter Xu peterx@redhat.com Reviewed-by: Mike Kravetz mike.kravetz@oracle.com Reviewed-by: David Hildenbrand david@redhat.com Cc: Andrea Arcangeli aarcange@redhat.com Cc: Hugh Dickins hughd@google.com Cc: James Houghton jthoughton@google.com Cc: Jason Gunthorpe jgg@nvidia.com Cc: John Hubbard jhubbard@nvidia.com Cc: Kirill A . Shutemov kirill@shutemov.name Cc: Lorenzo Stoakes lstoakes@gmail.com Cc: Matthew Wilcox willy@infradead.org Cc: Mike Rapoport (IBM) rppt@kernel.org Cc: Vlastimil Babka vbabka@suse.cz Cc: Yang Shi shy828301@gmail.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Stable-dep-of: 426056efe835 ("mm/hugetlb: use nth_page() in place of direct struct page manipulation") Signed-off-by: Sasha Levin sashal@kernel.org --- mm/hugetlb.c | 33 ++++++++++++++++++++++----------- 1 file changed, 22 insertions(+), 11 deletions(-)
diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 097b81c37597e..d231f23088a77 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -6521,13 +6521,7 @@ struct page *hugetlb_follow_page_mask(struct vm_area_struct *vma, struct page *page = NULL; spinlock_t *ptl; pte_t *pte, entry; - - /* - * FOLL_PIN is not supported for follow_page(). Ordinary GUP goes via - * follow_hugetlb_page(). - */ - if (WARN_ON_ONCE(flags & FOLL_PIN)) - return NULL; + int ret;
hugetlb_vma_lock_read(vma); pte = hugetlb_walk(vma, haddr, huge_page_size(h)); @@ -6537,8 +6531,23 @@ struct page *hugetlb_follow_page_mask(struct vm_area_struct *vma, ptl = huge_pte_lock(h, mm, pte); entry = huge_ptep_get(pte); if (pte_present(entry)) { - page = pte_page(entry) + - ((address & ~huge_page_mask(h)) >> PAGE_SHIFT); + page = pte_page(entry); + + if (!huge_pte_write(entry)) { + if (flags & FOLL_WRITE) { + page = NULL; + goto out; + } + + if (gup_must_unshare(vma, flags, page)) { + /* Tell the caller to do unsharing */ + page = ERR_PTR(-EMLINK); + goto out; + } + } + + page += ((address & ~huge_page_mask(h)) >> PAGE_SHIFT); + /* * Note that page may be a sub-page, and with vmemmap * optimizations the page struct may be read only. @@ -6548,8 +6557,10 @@ struct page *hugetlb_follow_page_mask(struct vm_area_struct *vma, * try_grab_page() should always be able to get the page here, * because we hold the ptl lock and have verified pte_present(). */ - if (try_grab_page(page, flags)) { - page = NULL; + ret = try_grab_page(page, flags); + + if (WARN_ON_ONCE(ret)) { + page = ERR_PTR(ret); goto out; } }
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Zi Yan ziy@nvidia.com
[ Upstream commit 426056efe835cf4864ccf4c328fe3af9146fc539 ]
When dealing with hugetlb pages, manipulating struct page pointers directly can get to wrong struct page, since struct page is not guaranteed to be contiguous on SPARSEMEM without VMEMMAP. Use nth_page() to handle it properly.
A wrong or non-existing page might be tried to be grabbed, either leading to a non freeable page or kernel memory access errors. No bug is reported. It comes from code inspection.
Link: https://lkml.kernel.org/r/20230913201248.452081-3-zi.yan@sent.com Fixes: 57a196a58421 ("hugetlb: simplify hugetlb handling in follow_page_mask") Signed-off-by: Zi Yan ziy@nvidia.com Reviewed-by: Muchun Song songmuchun@bytedance.com Cc: David Hildenbrand david@redhat.com Cc: Matthew Wilcox (Oracle) willy@infradead.org Cc: Mike Kravetz mike.kravetz@oracle.com Cc: Mike Rapoport (IBM) rppt@kernel.org Cc: Thomas Bogendoerfer tsbogend@alpha.franken.de Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- mm/hugetlb.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/hugetlb.c b/mm/hugetlb.c index d231f23088a77..9951fb7412cc7 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -6546,7 +6546,7 @@ struct page *hugetlb_follow_page_mask(struct vm_area_struct *vma, } }
- page += ((address & ~huge_page_mask(h)) >> PAGE_SHIFT); + page = nth_page(page, ((address & ~huge_page_mask(h)) >> PAGE_SHIFT));
/* * Note that page may be a sub-page, and with vmemmap
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Helge Deller deller@gmx.de
commit a406b8b424fa01f244c1aab02ba186258448c36b upstream.
Bail out early with error message when trying to boot a 64-bit kernel on 32-bit machines. This fixes the previous commit to include the check for true 64-bit kernels as well.
Signed-off-by: Helge Deller deller@gmx.de Fixes: 591d2108f3abc ("parisc: Add runtime check to prevent PA2.0 kernels on PA1.x machines") Cc: stable@vger.kernel.org # v6.0+ Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/parisc/kernel/head.S | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-)
--- a/arch/parisc/kernel/head.S +++ b/arch/parisc/kernel/head.S @@ -70,9 +70,8 @@ $bss_loop: stw,ma %arg2,4(%r1) stw,ma %arg3,4(%r1)
-#if !defined(CONFIG_64BIT) && defined(CONFIG_PA20) - /* This 32-bit kernel was compiled for PA2.0 CPUs. Check current CPU - * and halt kernel if we detect a PA1.x CPU. */ +#if defined(CONFIG_PA20) + /* check for 64-bit capable CPU as required by current kernel */ ldi 32,%r10 mtctl %r10,%cr11 .level 2.0
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Helge Deller deller@gmx.de
commit 166b0110d1ee53290bd11618df6e3991c117495a upstream.
When calculating the pfn for the iitlbt/idtlbt instruction, do not drop the upper 5 address bits. This doesn't seem to have an effect on physical hardware which uses less physical address bits, but in qemu the missing bits are visible.
Signed-off-by: Helge Deller deller@gmx.de Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/parisc/kernel/entry.S | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-)
--- a/arch/parisc/kernel/entry.S +++ b/arch/parisc/kernel/entry.S @@ -475,13 +475,13 @@ * to a CPU TLB 4k PFN (4k => 12 bits to shift) */ #define PAGE_ADD_SHIFT (PAGE_SHIFT-12) #define PAGE_ADD_HUGE_SHIFT (REAL_HPAGE_SHIFT-12) + #define PFN_START_BIT (63-ASM_PFN_PTE_SHIFT+(63-58)-PAGE_ADD_SHIFT)
/* Drop prot bits and convert to page addr for iitlbt and idtlbt */ .macro convert_for_tlb_insert20 pte,tmp #ifdef CONFIG_HUGETLB_PAGE copy \pte,\tmp - extrd,u \tmp,(63-ASM_PFN_PTE_SHIFT)+(63-58)+PAGE_ADD_SHIFT,\ - 64-PAGE_SHIFT-PAGE_ADD_SHIFT,\pte + extrd,u \tmp,PFN_START_BIT,PFN_START_BIT+1,\pte
depdi _PAGE_SIZE_ENCODING_DEFAULT,63,\ (63-58)+PAGE_ADD_SHIFT,\pte @@ -489,8 +489,7 @@ depdi _HUGE_PAGE_SIZE_ENCODING_DEFAULT,63,\ (63-58)+PAGE_ADD_HUGE_SHIFT,\pte #else /* Huge pages disabled */ - extrd,u \pte,(63-ASM_PFN_PTE_SHIFT)+(63-58)+PAGE_ADD_SHIFT,\ - 64-PAGE_SHIFT-PAGE_ADD_SHIFT,\pte + extrd,u \pte,PFN_START_BIT,PFN_START_BIT+1,\pte depdi _PAGE_SIZE_ENCODING_DEFAULT,63,\ (63-58)+PAGE_ADD_SHIFT,\pte #endif
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Helge Deller deller@gmx.de
commit 6ad6e15a9c46b8f0932cd99724f26f3db4db1cdf upstream.
Firmware returns the physical address of the power switch, so need to use gsc_writel() instead of direct memory access.
Fixes: d0c219472980 ("parisc/power: Add power soft-off when running on qemu") Signed-off-by: Helge Deller deller@gmx.de Cc: stable@vger.kernel.org # v6.0+ Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/parisc/power.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/parisc/power.c +++ b/drivers/parisc/power.c @@ -201,7 +201,7 @@ static struct notifier_block parisc_pani static int qemu_power_off(struct sys_off_data *data) { /* this turns the system off via SeaBIOS */ - *(int *)data->cb_data = 0; + gsc_writel(0, (unsigned long) data->cb_data); pdc_soft_power_button(1); return NOTIFY_DONE; }
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Basavaraj Natikar Basavaraj.Natikar@amd.com
commit a5d6264b638efeca35eff72177fd28d149e0764b upstream.
Use the low-power states of the underlying platform to enable runtime PM. If the platform doesn't support runtime D3, then enabling default RPM will result in the controller malfunctioning, as in the case of hotplug devices not being detected because of a failed interrupt generation.
Cc: Mario Limonciello mario.limonciello@amd.com Signed-off-by: Basavaraj Natikar Basavaraj.Natikar@amd.com Signed-off-by: Mathias Nyman mathias.nyman@linux.intel.com Link: https://lore.kernel.org/r/20231019102924.2797346-16-mathias.nyman@linux.inte... Cc: Oleksandr Natalenko oleksandr@natalenko.name Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/host/xhci-pci.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/usb/host/xhci-pci.c b/drivers/usb/host/xhci-pci.c index bde43cef8846..95ed9404f6f8 100644 --- a/drivers/usb/host/xhci-pci.c +++ b/drivers/usb/host/xhci-pci.c @@ -695,7 +695,9 @@ static int xhci_pci_probe(struct pci_dev *dev, const struct pci_device_id *id) /* USB-2 and USB-3 roothubs initialized, allow runtime pm suspend */ pm_runtime_put_noidle(&dev->dev);
- if (xhci->quirks & XHCI_DEFAULT_PM_RUNTIME_ALLOW) + if (pci_choose_state(dev, PMSG_SUSPEND) == PCI_D0) + pm_runtime_forbid(&dev->dev); + else if (xhci->quirks & XHCI_DEFAULT_PM_RUNTIME_ALLOW) pm_runtime_allow(&dev->dev);
dma_set_max_seg_size(&dev->dev, UINT_MAX);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jeff Layton jlayton@kernel.org
commit 9b6304c1d53745c300b86f202d0dcff395e2d2db upstream.
struct timespec64 has unused bits in the tv_nsec field that can be used for other purposes. In future patches, we're going to change how the inode->i_ctime is accessed in certain inodes in order to make use of them. In order to do that safely though, we'll need to eradicate raw accesses of the inode->i_ctime field from the kernel.
Add new accessor functions for the ctime that we use to replace them.
Reviewed-by: Jan Kara jack@suse.cz Reviewed-by: Luis Chamberlain mcgrof@kernel.org Signed-off-by: Jeff Layton jlayton@kernel.org Reviewed-by: Damien Le Moal dlemoal@kernel.org Message-Id: 20230705185812.579118-2-jlayton@kernel.org Signed-off-by: Christian Brauner brauner@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/inode.c | 16 ++++++++++++++++ include/linux/fs.h | 45 ++++++++++++++++++++++++++++++++++++++++++++- 2 files changed, 60 insertions(+), 1 deletion(-)
--- a/fs/inode.c +++ b/fs/inode.c @@ -2499,6 +2499,22 @@ struct timespec64 current_time(struct in EXPORT_SYMBOL(current_time);
/** + * inode_set_ctime_current - set the ctime to current_time + * @inode: inode + * + * Set the inode->i_ctime to the current value for the inode. Returns + * the current value that was assigned to i_ctime. + */ +struct timespec64 inode_set_ctime_current(struct inode *inode) +{ + struct timespec64 now = current_time(inode); + + inode_set_ctime(inode, now.tv_sec, now.tv_nsec); + return now; +} +EXPORT_SYMBOL(inode_set_ctime_current); + +/** * in_group_or_capable - check whether caller is CAP_FSETID privileged * @idmap: idmap of the mount @inode was found from * @inode: inode to check --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -1474,7 +1474,50 @@ static inline bool fsuidgid_has_mapping( kgid_has_mapping(fs_userns, kgid); }
-extern struct timespec64 current_time(struct inode *inode); +struct timespec64 current_time(struct inode *inode); +struct timespec64 inode_set_ctime_current(struct inode *inode); + +/** + * inode_get_ctime - fetch the current ctime from the inode + * @inode: inode from which to fetch ctime + * + * Grab the current ctime from the inode and return it. + */ +static inline struct timespec64 inode_get_ctime(const struct inode *inode) +{ + return inode->i_ctime; +} + +/** + * inode_set_ctime_to_ts - set the ctime in the inode + * @inode: inode in which to set the ctime + * @ts: value to set in the ctime field + * + * Set the ctime in @inode to @ts + */ +static inline struct timespec64 inode_set_ctime_to_ts(struct inode *inode, + struct timespec64 ts) +{ + inode->i_ctime = ts; + return ts; +} + +/** + * inode_set_ctime - set the ctime in the inode + * @inode: inode in which to set the ctime + * @sec: tv_sec value to set + * @nsec: tv_nsec value to set + * + * Set the ctime in @inode to { @sec, @nsec } + */ +static inline struct timespec64 inode_set_ctime(struct inode *inode, + time64_t sec, long nsec) +{ + struct timespec64 ts = { .tv_sec = sec, + .tv_nsec = nsec }; + + return inode_set_ctime_to_ts(inode, ts); +}
/* * Snapshotting support.
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Steve French stfrench@microsoft.com
commit 72bc63f5e23a38b65ff2a201bdc11401d4223fa9 upstream.
Fixes some xfstests including generic/564 and generic/157
The "sfu" mount option can be useful for creating special files (character and block devices in particular) but could not create FIFOs. It did recognize existing empty files with the "system" attribute flag as FIFOs but this is too general, so to support creating FIFOs more safely use a new tag (but the same length as those for char and block devices ie "IntxLNK" and "IntxBLK") "LnxFIFO" to indicate that the file should be treated as a FIFO (when mounted with the "sfu"). For some additional context note that "sfu" followed the way that "Services for Unix" on Windows handled these special files (at least for character and block devices and symlinks), which is different than newer Windows which can handle special files as reparse points (which isn't an option to many servers).
Cc: stable@vger.kernel.org Reviewed-by: Paulo Alcantara (SUSE) pc@manguebit.com Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/smb/client/cifspdu.h | 2 +- fs/smb/client/inode.c | 4 ++++ fs/smb/client/smb2ops.c | 8 +++++++- 3 files changed, 12 insertions(+), 2 deletions(-)
--- a/fs/smb/client/cifspdu.h +++ b/fs/smb/client/cifspdu.h @@ -2570,7 +2570,7 @@ typedef struct {
struct win_dev { - unsigned char type[8]; /* IntxCHR or IntxBLK */ + unsigned char type[8]; /* IntxCHR or IntxBLK or LnxFIFO*/ __le64 major; __le64 minor; } __attribute__((packed)); --- a/fs/smb/client/inode.c +++ b/fs/smb/client/inode.c @@ -567,6 +567,10 @@ cifs_sfu_type(struct cifs_fattr *fattr, cifs_dbg(FYI, "Symlink\n"); fattr->cf_mode |= S_IFLNK; fattr->cf_dtype = DT_LNK; + } else if (memcmp("LnxFIFO", pbuf, 8) == 0) { + cifs_dbg(FYI, "FIFO\n"); + fattr->cf_mode |= S_IFIFO; + fattr->cf_dtype = DT_FIFO; } else { fattr->cf_mode |= S_IFREG; /* file? */ fattr->cf_dtype = DT_REG; --- a/fs/smb/client/smb2ops.c +++ b/fs/smb/client/smb2ops.c @@ -5212,7 +5212,7 @@ smb2_make_node(unsigned int xid, struct * over SMB2/SMB3 and Samba will do this with SMB3.1.1 POSIX Extensions */
- if (!S_ISCHR(mode) && !S_ISBLK(mode)) + if (!S_ISCHR(mode) && !S_ISBLK(mode) && !S_ISFIFO(mode)) return rc;
cifs_dbg(FYI, "sfu compat create special file\n"); @@ -5260,6 +5260,12 @@ smb2_make_node(unsigned int xid, struct pdev->minor = cpu_to_le64(MINOR(dev)); rc = tcon->ses->server->ops->sync_write(xid, &fid, &io_parms, &bytes_written, iov, 1); + } else if (S_ISFIFO(mode)) { + memcpy(pdev->type, "LnxFIFO", 8); + pdev->major = 0; + pdev->minor = 0; + rc = tcon->ses->server->ops->sync_write(xid, &fid, &io_parms, + &bytes_written, iov, 1); } tcon->ses->server->ops->close(xid, tcon, &fid); d_drop(dentry);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Steve French stfrench@microsoft.com
commit 475efd9808a3094944a56240b2711349e433fb66 upstream.
For example: touch -h -t 02011200 testfile where testfile is a symlink would not change the timestamp, but touch -t 02011200 testfile does work to change the timestamp of the target
Suggested-by: David Howells dhowells@redhat.com Reported-by: Micah Veilleux micah.veilleux@iba-group.com Closes: https://bugzilla.samba.org/show_bug.cgi?id=14476 Cc: stable@vger.kernel.org Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/smb/client/cifsfs.c | 1 + 1 file changed, 1 insertion(+)
--- a/fs/smb/client/cifsfs.c +++ b/fs/smb/client/cifsfs.c @@ -1191,6 +1191,7 @@ const char *cifs_get_link(struct dentry
const struct inode_operations cifs_symlink_inode_ops = { .get_link = cifs_get_link, + .setattr = cifs_setattr, .permission = cifs_permission, .listxattr = cifs_listxattr, };
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Steve French stfrench@microsoft.com
commit de4eceab578ead12a71e5b5588a57e142bbe8ceb upstream.
When multiple mounts are to the same share from the same client it was not possible to determine which section of /proc/fs/cifs/Stats (and DebugData) correspond to that mount. In some recent examples this turned out to be a significant problem when trying to analyze performance data - since there are many cases where unless we know the tree id and session id we can't figure out which stats (e.g. number of SMB3.1.1 requests by type, the total time they take, which is slowest, how many fail etc.) apply to which mount. The only existing loosely related ioctl CIFS_IOC_GET_MNT_INFO does not return the information needed to uniquely identify which tcon is which mount although it does return various flags and device info.
Add a cifs.ko ioctl CIFS_IOC_GET_TCON_INFO (0x800ccf0c) to return tid, session id, tree connect count.
Cc: stable@vger.kernel.org Reviewed-by: Shyam Prasad N sprasad@microsoft.com Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/smb/client/cifs_ioctl.h | 6 ++++++ fs/smb/client/ioctl.c | 25 +++++++++++++++++++++++++ 2 files changed, 31 insertions(+)
--- a/fs/smb/client/cifs_ioctl.h +++ b/fs/smb/client/cifs_ioctl.h @@ -26,6 +26,11 @@ struct smb_mnt_fs_info { __u64 cifs_posix_caps; } __packed;
+struct smb_mnt_tcon_info { + __u32 tid; + __u64 session_id; +} __packed; + struct smb_snapshot_array { __u32 number_of_snapshots; __u32 number_of_snapshots_returned; @@ -108,6 +113,7 @@ struct smb3_notify_info { #define CIFS_IOC_NOTIFY _IOW(CIFS_IOCTL_MAGIC, 9, struct smb3_notify) #define CIFS_DUMP_FULL_KEY _IOWR(CIFS_IOCTL_MAGIC, 10, struct smb3_full_key_debug_info) #define CIFS_IOC_NOTIFY_INFO _IOWR(CIFS_IOCTL_MAGIC, 11, struct smb3_notify_info) +#define CIFS_IOC_GET_TCON_INFO _IOR(CIFS_IOCTL_MAGIC, 12, struct smb_mnt_tcon_info) #define CIFS_IOC_SHUTDOWN _IOR('X', 125, __u32)
/* --- a/fs/smb/client/ioctl.c +++ b/fs/smb/client/ioctl.c @@ -117,6 +117,20 @@ out_drop_write: return rc; }
+static long smb_mnt_get_tcon_info(struct cifs_tcon *tcon, void __user *arg) +{ + int rc = 0; + struct smb_mnt_tcon_info tcon_inf; + + tcon_inf.tid = tcon->tid; + tcon_inf.session_id = tcon->ses->Suid; + + if (copy_to_user(arg, &tcon_inf, sizeof(struct smb_mnt_tcon_info))) + rc = -EFAULT; + + return rc; +} + static long smb_mnt_get_fsinfo(unsigned int xid, struct cifs_tcon *tcon, void __user *arg) { @@ -414,6 +428,17 @@ long cifs_ioctl(struct file *filep, unsi tcon = tlink_tcon(pSMBFile->tlink); rc = smb_mnt_get_fsinfo(xid, tcon, (void __user *)arg); break; + case CIFS_IOC_GET_TCON_INFO: + cifs_sb = CIFS_SB(inode->i_sb); + tlink = cifs_sb_tlink(cifs_sb); + if (IS_ERR(tlink)) { + rc = PTR_ERR(tlink); + break; + } + tcon = tlink_tcon(tlink); + rc = smb_mnt_get_tcon_info(tcon, (void __user *)arg); + cifs_put_tlink(tlink); + break; case CIFS_ENUMERATE_SNAPSHOTS: if (pSMBFile == NULL) break;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Steve French stfrench@microsoft.com
commit 5923d6686a100c2b4cabd4c2ca9d5a12579c7614 upstream.
Fixes xfstest generic/728 which had been failing due to incorrect ctime after setxattr and removexattr
Update ctime on successful set of xattr
Cc: stable@vger.kernel.org Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/smb/client/xattr.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
--- a/fs/smb/client/xattr.c +++ b/fs/smb/client/xattr.c @@ -150,10 +150,13 @@ static int cifs_xattr_set(const struct x if (cifs_sb->mnt_cifs_flags & CIFS_MOUNT_NO_XATTR) goto out;
- if (pTcon->ses->server->ops->set_EA) + if (pTcon->ses->server->ops->set_EA) { rc = pTcon->ses->server->ops->set_EA(xid, pTcon, full_path, name, value, (__u16)size, cifs_sb->local_nls, cifs_sb); + if (rc == 0) + inode_set_ctime_current(inode); + } break;
case XATTR_CIFS_ACL:
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Paulo Alcantara pc@manguebit.com
commit d328c09ee9f15ee5a26431f5aad7c9239fa85e62 upstream.
Skip SMB sessions that are being teared down (e.g. @ses->ses_status == SES_EXITING) in cifs_debug_data_proc_show() to avoid use-after-free in @ses.
This fixes the following GPF when reading from /proc/fs/cifs/DebugData while mounting and umounting
[ 816.251274] general protection fault, probably for non-canonical address 0x6b6b6b6b6b6b6d81: 0000 [#1] PREEMPT SMP NOPTI ... [ 816.260138] Call Trace: [ 816.260329] <TASK> [ 816.260499] ? die_addr+0x36/0x90 [ 816.260762] ? exc_general_protection+0x1b3/0x410 [ 816.261126] ? asm_exc_general_protection+0x26/0x30 [ 816.261502] ? cifs_debug_tcon+0xbd/0x240 [cifs] [ 816.261878] ? cifs_debug_tcon+0xab/0x240 [cifs] [ 816.262249] cifs_debug_data_proc_show+0x516/0xdb0 [cifs] [ 816.262689] ? seq_read_iter+0x379/0x470 [ 816.262995] seq_read_iter+0x118/0x470 [ 816.263291] proc_reg_read_iter+0x53/0x90 [ 816.263596] ? srso_alias_return_thunk+0x5/0x7f [ 816.263945] vfs_read+0x201/0x350 [ 816.264211] ksys_read+0x75/0x100 [ 816.264472] do_syscall_64+0x3f/0x90 [ 816.264750] entry_SYSCALL_64_after_hwframe+0x6e/0xd8 [ 816.265135] RIP: 0033:0x7fd5e669d381
Cc: stable@vger.kernel.org Signed-off-by: Paulo Alcantara (SUSE) pc@manguebit.com Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/smb/client/cifs_debug.c | 6 ++++++ 1 file changed, 6 insertions(+)
--- a/fs/smb/client/cifs_debug.c +++ b/fs/smb/client/cifs_debug.c @@ -452,6 +452,11 @@ skip_rdma: seq_printf(m, "\n\n\tSessions: "); i = 0; list_for_each_entry(ses, &server->smb_ses_list, smb_ses_list) { + spin_lock(&ses->ses_lock); + if (ses->ses_status == SES_EXITING) { + spin_unlock(&ses->ses_lock); + continue; + } i++; if ((ses->serverDomain == NULL) || (ses->serverOS == NULL) || @@ -472,6 +477,7 @@ skip_rdma: ses->ses_count, ses->serverOS, ses->serverNOS, ses->capabilities, ses->ses_status); } + spin_unlock(&ses->ses_lock);
seq_printf(m, "\n\tSecurity type: %s ", get_security_type_str(server->ops->select_sectype(server, ses->sectype)));
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Paulo Alcantara pc@manguebit.com
commit 5c86919455c1edec99ebd3338ad213b59271a71b upstream.
The following UAF was triggered when running fstests generic/072 with KASAN enabled against Windows Server 2022 and mount options 'multichannel,max_channels=2,vers=3.1.1,mfsymlinks,noperm'
BUG: KASAN: slab-use-after-free in smb2_query_info_compound+0x423/0x6d0 [cifs] Read of size 8 at addr ffff888014941048 by task xfs_io/27534
CPU: 0 PID: 27534 Comm: xfs_io Not tainted 6.6.0-rc7 #1 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.2-3-gd478f380-rebuilt.opensuse.org 04/01/2014 Call Trace: dump_stack_lvl+0x4a/0x80 print_report+0xcf/0x650 ? srso_alias_return_thunk+0x5/0x7f ? srso_alias_return_thunk+0x5/0x7f ? __phys_addr+0x46/0x90 kasan_report+0xda/0x110 ? smb2_query_info_compound+0x423/0x6d0 [cifs] ? smb2_query_info_compound+0x423/0x6d0 [cifs] smb2_query_info_compound+0x423/0x6d0 [cifs] ? __pfx_smb2_query_info_compound+0x10/0x10 [cifs] ? srso_alias_return_thunk+0x5/0x7f ? __stack_depot_save+0x39/0x480 ? kasan_save_stack+0x33/0x60 ? kasan_set_track+0x25/0x30 ? ____kasan_slab_free+0x126/0x170 smb2_queryfs+0xc2/0x2c0 [cifs] ? __pfx_smb2_queryfs+0x10/0x10 [cifs] ? __pfx___lock_acquire+0x10/0x10 smb311_queryfs+0x210/0x220 [cifs] ? __pfx_smb311_queryfs+0x10/0x10 [cifs] ? srso_alias_return_thunk+0x5/0x7f ? __lock_acquire+0x480/0x26c0 ? lock_release+0x1ed/0x640 ? srso_alias_return_thunk+0x5/0x7f ? do_raw_spin_unlock+0x9b/0x100 cifs_statfs+0x18c/0x4b0 [cifs] statfs_by_dentry+0x9b/0xf0 fd_statfs+0x4e/0xb0 __do_sys_fstatfs+0x7f/0xe0 ? __pfx___do_sys_fstatfs+0x10/0x10 ? srso_alias_return_thunk+0x5/0x7f ? lockdep_hardirqs_on_prepare+0x136/0x200 ? srso_alias_return_thunk+0x5/0x7f do_syscall_64+0x3f/0x90 entry_SYSCALL_64_after_hwframe+0x6e/0xd8
Allocated by task 27534: kasan_save_stack+0x33/0x60 kasan_set_track+0x25/0x30 __kasan_kmalloc+0x8f/0xa0 open_cached_dir+0x71b/0x1240 [cifs] smb2_query_info_compound+0x5c3/0x6d0 [cifs] smb2_queryfs+0xc2/0x2c0 [cifs] smb311_queryfs+0x210/0x220 [cifs] cifs_statfs+0x18c/0x4b0 [cifs] statfs_by_dentry+0x9b/0xf0 fd_statfs+0x4e/0xb0 __do_sys_fstatfs+0x7f/0xe0 do_syscall_64+0x3f/0x90 entry_SYSCALL_64_after_hwframe+0x6e/0xd8
Freed by task 27534: kasan_save_stack+0x33/0x60 kasan_set_track+0x25/0x30 kasan_save_free_info+0x2b/0x50 ____kasan_slab_free+0x126/0x170 slab_free_freelist_hook+0xd0/0x1e0 __kmem_cache_free+0x9d/0x1b0 open_cached_dir+0xff5/0x1240 [cifs] smb2_query_info_compound+0x5c3/0x6d0 [cifs] smb2_queryfs+0xc2/0x2c0 [cifs]
This is a race between open_cached_dir() and cached_dir_lease_break() where the cache entry for the open directory handle receives a lease break while creating it. And before returning from open_cached_dir(), we put the last reference of the new @cfid because of !@cfid->has_lease.
Besides the UAF, while running xfstests a lot of missed lease breaks have been noticed in tests that run several concurrent statfs(2) calls on those cached fids
CIFS: VFS: \w22-root1.gandalf.test No task to wake, unknown frame... CIFS: VFS: \w22-root1.gandalf.test Cmd: 18 Err: 0x0 Flags: 0x1... CIFS: VFS: \w22-root1.gandalf.test smb buf 00000000715bfe83 len 108 CIFS: VFS: Dump pending requests: CIFS: VFS: \w22-root1.gandalf.test No task to wake, unknown frame... CIFS: VFS: \w22-root1.gandalf.test Cmd: 18 Err: 0x0 Flags: 0x1... CIFS: VFS: \w22-root1.gandalf.test smb buf 000000005aa7316e len 108 ...
To fix both, in open_cached_dir() ensure that @cfid->has_lease is set right before sending out compounded request so that any potential lease break will be get processed by demultiplex thread while we're still caching @cfid. And, if open failed for some reason, re-check @cfid->has_lease to decide whether or not put lease reference.
Cc: stable@vger.kernel.org Signed-off-by: Paulo Alcantara (SUSE) pc@manguebit.com Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/smb/client/cached_dir.c | 84 ++++++++++++++++++++++++++------------------- 1 file changed, 49 insertions(+), 35 deletions(-)
--- a/fs/smb/client/cached_dir.c +++ b/fs/smb/client/cached_dir.c @@ -32,7 +32,7 @@ static struct cached_fid *find_or_create * fully cached or it may be in the process of * being deleted due to a lease break. */ - if (!cfid->has_lease) { + if (!cfid->time || !cfid->has_lease) { spin_unlock(&cfids->cfid_list_lock); return NULL; } @@ -193,10 +193,20 @@ int open_cached_dir(unsigned int xid, st npath = path_no_prefix(cifs_sb, path); if (IS_ERR(npath)) { rc = PTR_ERR(npath); - kfree(utf16_path); - return rc; + goto out; }
+ if (!npath[0]) { + dentry = dget(cifs_sb->root); + } else { + dentry = path_to_dentry(cifs_sb, npath); + if (IS_ERR(dentry)) { + rc = -ENOENT; + goto out; + } + } + cfid->dentry = dentry; + /* * We do not hold the lock for the open because in case * SMB2_open needs to reconnect. @@ -249,6 +259,15 @@ int open_cached_dir(unsigned int xid, st
smb2_set_related(&rqst[1]);
+ /* + * Set @cfid->has_lease to true before sending out compounded request so + * its lease reference can be put in cached_dir_lease_break() due to a + * potential lease break right after the request is sent or while @cfid + * is still being cached. Concurrent processes won't be to use it yet + * due to @cfid->time being zero. + */ + cfid->has_lease = true; + rc = compound_send_recv(xid, ses, server, flags, 2, rqst, resp_buftype, rsp_iov); @@ -263,6 +282,8 @@ int open_cached_dir(unsigned int xid, st cfid->tcon = tcon; cfid->is_open = true;
+ spin_lock(&cfids->cfid_list_lock); + o_rsp = (struct smb2_create_rsp *)rsp_iov[0].iov_base; oparms.fid->persistent_fid = o_rsp->PersistentFileId; oparms.fid->volatile_fid = o_rsp->VolatileFileId; @@ -270,18 +291,25 @@ int open_cached_dir(unsigned int xid, st oparms.fid->mid = le64_to_cpu(o_rsp->hdr.MessageId); #endif /* CIFS_DEBUG2 */
- if (o_rsp->OplockLevel != SMB2_OPLOCK_LEVEL_LEASE) + rc = -EINVAL; + if (o_rsp->OplockLevel != SMB2_OPLOCK_LEVEL_LEASE) { + spin_unlock(&cfids->cfid_list_lock); goto oshr_free; + }
smb2_parse_contexts(server, o_rsp, &oparms.fid->epoch, oparms.fid->lease_key, &oplock, NULL, NULL); - if (!(oplock & SMB2_LEASE_READ_CACHING_HE)) + if (!(oplock & SMB2_LEASE_READ_CACHING_HE)) { + spin_unlock(&cfids->cfid_list_lock); goto oshr_free; + } qi_rsp = (struct smb2_query_info_rsp *)rsp_iov[1].iov_base; - if (le32_to_cpu(qi_rsp->OutputBufferLength) < sizeof(struct smb2_file_all_info)) + if (le32_to_cpu(qi_rsp->OutputBufferLength) < sizeof(struct smb2_file_all_info)) { + spin_unlock(&cfids->cfid_list_lock); goto oshr_free; + } if (!smb2_validate_and_copy_iov( le16_to_cpu(qi_rsp->OutputBufferOffset), sizeof(struct smb2_file_all_info), @@ -289,37 +317,24 @@ int open_cached_dir(unsigned int xid, st (char *)&cfid->file_all_info)) cfid->file_all_info_is_valid = true;
- if (!npath[0]) - dentry = dget(cifs_sb->root); - else { - dentry = path_to_dentry(cifs_sb, npath); - if (IS_ERR(dentry)) { - rc = -ENOENT; - goto oshr_free; - } - } - spin_lock(&cfids->cfid_list_lock); - cfid->dentry = dentry; cfid->time = jiffies; - cfid->has_lease = true; spin_unlock(&cfids->cfid_list_lock); + /* At this point the directory handle is fully cached */ + rc = 0;
oshr_free: - kfree(utf16_path); SMB2_open_free(&rqst[0]); SMB2_query_info_free(&rqst[1]); free_rsp_buf(resp_buftype[0], rsp_iov[0].iov_base); free_rsp_buf(resp_buftype[1], rsp_iov[1].iov_base); - spin_lock(&cfids->cfid_list_lock); - if (!cfid->has_lease) { - if (rc) { - if (cfid->on_list) { - list_del(&cfid->entry); - cfid->on_list = false; - cfids->num_entries--; - } - rc = -ENOENT; - } else { + if (rc) { + spin_lock(&cfids->cfid_list_lock); + if (cfid->on_list) { + list_del(&cfid->entry); + cfid->on_list = false; + cfids->num_entries--; + } + if (cfid->has_lease) { /* * We are guaranteed to have two references at this * point. One for the caller and one for a potential @@ -327,25 +342,24 @@ oshr_free: * will be closed when the caller closes the cached * handle. */ + cfid->has_lease = false; spin_unlock(&cfids->cfid_list_lock); kref_put(&cfid->refcount, smb2_close_cached_fid); goto out; } + spin_unlock(&cfids->cfid_list_lock); } - spin_unlock(&cfids->cfid_list_lock); +out: if (rc) { if (cfid->is_open) SMB2_close(0, cfid->tcon, cfid->fid.persistent_fid, cfid->fid.volatile_fid); free_cached_dir(cfid); - cfid = NULL; - } -out: - if (rc == 0) { + } else { *ret_cfid = cfid; atomic_inc(&tcon->num_remote_opens); } - + kfree(utf16_path); return rc; }
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Paulo Alcantara pc@manguebit.com
commit e6322fd177c6885a21dd4609dc5e5c973d1a2eb7 upstream.
All release_mid() callers seem to hold a reference of @mid so there is no need to call kref_put(&mid->refcount, __release_mid) under @server->mid_lock spinlock. If they don't, then an use-after-free bug would have occurred anyways.
By getting rid of such spinlock also fixes a potential deadlock as shown below
CPU 0 CPU 1 ------------------------------------------------------------------ cifs_demultiplex_thread() cifs_debug_data_proc_show() release_mid() spin_lock(&server->mid_lock); spin_lock(&cifs_tcp_ses_lock) spin_lock(&server->mid_lock) __release_mid() smb2_find_smb_tcon() spin_lock(&cifs_tcp_ses_lock) *deadlock*
Cc: stable@vger.kernel.org Signed-off-by: Paulo Alcantara (SUSE) pc@manguebit.com Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/smb/client/cifsproto.h | 7 ++++++- fs/smb/client/smb2misc.c | 2 +- fs/smb/client/transport.c | 11 +---------- 3 files changed, 8 insertions(+), 12 deletions(-)
--- a/fs/smb/client/cifsproto.h +++ b/fs/smb/client/cifsproto.h @@ -81,7 +81,7 @@ extern char *cifs_build_path_to_root(str extern char *build_wildcard_path_from_dentry(struct dentry *direntry); char *cifs_build_devname(char *nodename, const char *prepath); extern void delete_mid(struct mid_q_entry *mid); -extern void release_mid(struct mid_q_entry *mid); +void __release_mid(struct kref *refcount); extern void cifs_wake_up_task(struct mid_q_entry *mid); extern int cifs_handle_standard(struct TCP_Server_Info *server, struct mid_q_entry *mid); @@ -741,4 +741,9 @@ static inline bool dfs_src_pathname_equa return true; }
+static inline void release_mid(struct mid_q_entry *mid) +{ + kref_put(&mid->refcount, __release_mid); +} + #endif /* _CIFSPROTO_H */ --- a/fs/smb/client/smb2misc.c +++ b/fs/smb/client/smb2misc.c @@ -787,7 +787,7 @@ __smb2_handle_cancelled_cmd(struct cifs_ { struct close_cancelled_open *cancelled;
- cancelled = kzalloc(sizeof(*cancelled), GFP_ATOMIC); + cancelled = kzalloc(sizeof(*cancelled), GFP_KERNEL); if (!cancelled) return -ENOMEM;
--- a/fs/smb/client/transport.c +++ b/fs/smb/client/transport.c @@ -76,7 +76,7 @@ alloc_mid(const struct smb_hdr *smb_buff return temp; }
-static void __release_mid(struct kref *refcount) +void __release_mid(struct kref *refcount) { struct mid_q_entry *midEntry = container_of(refcount, struct mid_q_entry, refcount); @@ -156,15 +156,6 @@ static void __release_mid(struct kref *r mempool_free(midEntry, cifs_mid_poolp); }
-void release_mid(struct mid_q_entry *mid) -{ - struct TCP_Server_Info *server = mid->server; - - spin_lock(&server->mid_lock); - kref_put(&mid->refcount, __release_mid); - spin_unlock(&server->mid_lock); -} - void delete_mid(struct mid_q_entry *mid) {
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Shyam Prasad N sprasad@microsoft.com
commit c3326a61cdbf3ce1273d9198b6cbf90965d7e029 upstream.
We introduced a helper function to be used by non-cifsd threads to mark the connection for reconnect. For multichannel, when only a particular channel needs to be reconnected, this had a bug.
This change fixes that by marking that particular channel for reconnect.
Fixes: dca65818c80c ("cifs: use a different reconnect helper for non-cifsd threads") Cc: stable@vger.kernel.org Signed-off-by: Shyam Prasad N sprasad@microsoft.com Reviewed-by: Paulo Alcantara (SUSE) pc@manguebit.com Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/smb/client/connect.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-)
--- a/fs/smb/client/connect.c +++ b/fs/smb/client/connect.c @@ -156,13 +156,14 @@ cifs_signal_cifsd_for_reconnect(struct T /* If server is a channel, select the primary channel */ pserver = CIFS_SERVER_IS_CHAN(server) ? server->primary_server : server;
- spin_lock(&pserver->srv_lock); + /* if we need to signal just this channel */ if (!all_channels) { - pserver->tcpStatus = CifsNeedReconnect; - spin_unlock(&pserver->srv_lock); + spin_lock(&server->srv_lock); + if (server->tcpStatus != CifsExiting) + server->tcpStatus = CifsNeedReconnect; + spin_unlock(&server->srv_lock); return; } - spin_unlock(&pserver->srv_lock);
spin_lock(&cifs_tcp_ses_lock); list_for_each_entry(ses, &pserver->smb_ses_list, smb_ses_list) {
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Shyam Prasad N sprasad@microsoft.com
commit d9a6d78096056a3cb5c5f07a730ab92f2f9ac4e6 upstream.
During a session reconnect, it is possible that the server moved to another physical server (happens in case of Azure files). So at this time, force a query of server interfaces again (in case of multichannel session), such that the secondary channels connect to the right IP addresses (possibly updated now).
Cc: stable@vger.kernel.org Signed-off-by: Shyam Prasad N sprasad@microsoft.com Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/smb/client/connect.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
--- a/fs/smb/client/connect.c +++ b/fs/smb/client/connect.c @@ -3850,8 +3850,12 @@ cifs_setup_session(const unsigned int xi is_binding = !CIFS_ALL_CHANS_NEED_RECONNECT(ses); spin_unlock(&ses->chan_lock);
- if (!is_binding) + if (!is_binding) { ses->ses_status = SES_IN_SETUP; + + /* force iface_list refresh */ + ses->iface_last_update = 0; + } spin_unlock(&ses->ses_lock);
/* update ses ip_addr only for primary chan */
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Shyam Prasad N sprasad@microsoft.com
commit 6e5e64c9477d58e73cb1a0e83eacad1f8df247cf upstream.
If the mount command has specified multichannel as a mount option, but multichannel is found to be unsupported by the server at the time of mount, we set chan_max to 1. Which means that the user needs to remount the share if the server starts supporting multichannel.
This change removes this reset. What it means is that if the user specified multichannel or max_channels during mount, and at this time, multichannel is not supported, but the server starts supporting it at a later point, the client will be capable of scaling out the number of channels.
Cc: stable@vger.kernel.org Signed-off-by: Shyam Prasad N sprasad@microsoft.com Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/smb/client/sess.c | 1 - 1 file changed, 1 deletion(-)
--- a/fs/smb/client/sess.c +++ b/fs/smb/client/sess.c @@ -186,7 +186,6 @@ int cifs_try_adding_channels(struct cifs }
if (!(server->capabilities & SMB2_GLOBAL_CAP_MULTI_CHANNEL)) { - ses->chan_max = 1; spin_unlock(&ses->chan_lock); cifs_server_dbg(VFS, "no multichannel support\n"); return 0;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: David Howells dhowells@redhat.com
commit 37de5a80e932f828c34abeaae63170d73930dca3 upstream.
Each smb_rqst struct contains two things: an array of kvecs (rq_iov) that contains the protocol data for an RPC op and an iterator (rq_iter) that contains the data payload of an RPC op. When an smb_rqst is allocated rq_iter is it always cleared, but we don't set it up unless we're going to use it.
The functions that determines the size of the ciphertext buffer that will be needed to encrypt a request, cifs_get_num_sgs(), assumes that rq_iter is always initialised - and employs user_backed_iter() to check that the iterator isn't user-backed. This used to incidentally work, because ->user_backed was set to false because the iterator has never been initialised, but with commit f1b4cb650b9a0eeba206d8f069fcdc532bfbcd74[1] which changes user_backed_iter() to determine this based on the iterator type insted, a warning is now emitted:
WARNING: CPU: 7 PID: 4584 at fs/smb/client/cifsglob.h:2165 smb2_get_aead_req+0x3fc/0x420 [cifs] ... RIP: 0010:smb2_get_aead_req+0x3fc/0x420 [cifs] ... crypt_message+0x33e/0x550 [cifs] smb3_init_transform_rq+0x27d/0x3f0 [cifs] smb_send_rqst+0xc7/0x160 [cifs] compound_send_recv+0x3ca/0x9f0 [cifs] cifs_send_recv+0x25/0x30 [cifs] SMB2_tcon+0x38a/0x820 [cifs] cifs_get_smb_ses+0x69c/0xee0 [cifs] cifs_mount_get_session+0x76/0x1d0 [cifs] dfs_mount_share+0x74/0x9d0 [cifs] cifs_mount+0x6e/0x2e0 [cifs] cifs_smb3_do_mount+0x143/0x300 [cifs] smb3_get_tree+0x15e/0x290 [cifs] vfs_get_tree+0x2d/0xe0 do_new_mount+0x124/0x340 __se_sys_mount+0x143/0x1a0
The problem is that rq_iter was never set, so the type is 0 (ie. ITER_UBUF) which causes user_backed_iter() to return true. The code doesn't malfunction because it checks the size of the iterator - which is 0.
Fix cifs_get_num_sgs() to ignore rq_iter if its count is 0, thereby bypassing the warnings.
It might be better to explicitly initialise rq_iter to a zero-length ITER_BVEC, say, as it can always be reinitialised later.
Fixes: d08089f649a0 ("cifs: Change the I/O paths to use an iterator rather than a page list") Reported-by: Damian Tometzki damian@riscv-rocks.de Closes: https://lore.kernel.org/r/ZUfQo47uo0p2ZsYg@fedora.fritz.box/ Tested-by: Damian Tometzki damian@riscv-rocks.de Cc: stable@vger.kernel.org cc: Eric Biggers ebiggers@kernel.org cc: linux-cifs@vger.kernel.org cc: linux-fsdevel@vger.kernel.org Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i... [1] Reviewed-by: Paulo Alcantara (SUSE) pc@manguebit.com Signed-off-by: David Howells dhowells@redhat.com Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/smb/client/cifsglob.h | 12 +++++++++--- 1 file changed, 9 insertions(+), 3 deletions(-)
--- a/fs/smb/client/cifsglob.h +++ b/fs/smb/client/cifsglob.h @@ -2113,6 +2113,7 @@ static inline int cifs_get_num_sgs(const unsigned int len, skip; unsigned int nents = 0; unsigned long addr; + size_t data_size; int i, j;
/* @@ -2128,17 +2129,21 @@ static inline int cifs_get_num_sgs(const * rqst[1+].rq_iov[0+] data to be encrypted/decrypted */ for (i = 0; i < num_rqst; i++) { + data_size = iov_iter_count(&rqst[i].rq_iter); + /* We really don't want a mixture of pinned and unpinned pages * in the sglist. It's hard to keep track of which is what. * Instead, we convert to a BVEC-type iterator higher up. */ - if (WARN_ON_ONCE(user_backed_iter(&rqst[i].rq_iter))) + if (data_size && + WARN_ON_ONCE(user_backed_iter(&rqst[i].rq_iter))) return -EIO;
/* We also don't want to have any extra refs or pins to clean * up in the sglist. */ - if (WARN_ON_ONCE(iov_iter_extract_will_pin(&rqst[i].rq_iter))) + if (data_size && + WARN_ON_ONCE(iov_iter_extract_will_pin(&rqst[i].rq_iter))) return -EIO;
for (j = 0; j < rqst[i].rq_nvec; j++) { @@ -2154,7 +2159,8 @@ static inline int cifs_get_num_sgs(const } skip = 0; } - nents += iov_iter_npages(&rqst[i].rq_iter, INT_MAX); + if (data_size) + nents += iov_iter_npages(&rqst[i].rq_iter, INT_MAX); } nents += DIV_ROUND_UP(offset_in_page(sig) + SMB2_SIGNATURE_SIZE, PAGE_SIZE); return nents;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dave Chinner dchinner@redhat.com
commit 7930d9e103700cde15833638855b750715c12091 upstream.
Because on v3 inodes, di_flushiter doesn't exist. It overlaps with zero padding in the inode, except when NREXT64=1 configurations are in use and the zero padding is no longer padding but holds the 64 bit extent counter.
This manifests obviously on big endian platforms (e.g. s390) because the log dinode is in host order and the overlap is the LSBs of the extent count field. It is not noticed on little endian machines because the overlap is at the MSB end of the extent count field and we need to get more than 2^^48 extents in the inode before it manifests. i.e. the heat death of the universe will occur before we see the problem in little endian machines.
This is a zero-day issue for NREXT64=1 configuraitons on big endian machines. Fix it by only clearing di_flushiter on v2 inodes during recovery.
Fixes: 9b7d16e34bbe ("xfs: Introduce XFS_DIFLAG2_NREXT64 and associated helpers") cc: stable@kernel.org # 5.19+ Signed-off-by: Dave Chinner dchinner@redhat.com Reviewed-by: "Darrick J. Wong" djwong@kernel.org Signed-off-by: Chandan Babu R chandanbabu@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/xfs/xfs_inode_item_recover.c | 32 +++++++++++++++++--------------- 1 file changed, 17 insertions(+), 15 deletions(-)
--- a/fs/xfs/xfs_inode_item_recover.c +++ b/fs/xfs/xfs_inode_item_recover.c @@ -369,24 +369,26 @@ xlog_recover_inode_commit_pass2( * superblock flag to determine whether we need to look at di_flushiter * to skip replay when the on disk inode is newer than the log one */ - if (!xfs_has_v3inodes(mp) && - ldip->di_flushiter < be16_to_cpu(dip->di_flushiter)) { - /* - * Deal with the wrap case, DI_MAX_FLUSH is less - * than smaller numbers - */ - if (be16_to_cpu(dip->di_flushiter) == DI_MAX_FLUSH && - ldip->di_flushiter < (DI_MAX_FLUSH >> 1)) { - /* do nothing */ - } else { - trace_xfs_log_recover_inode_skip(log, in_f); - error = 0; - goto out_release; + if (!xfs_has_v3inodes(mp)) { + if (ldip->di_flushiter < be16_to_cpu(dip->di_flushiter)) { + /* + * Deal with the wrap case, DI_MAX_FLUSH is less + * than smaller numbers + */ + if (be16_to_cpu(dip->di_flushiter) == DI_MAX_FLUSH && + ldip->di_flushiter < (DI_MAX_FLUSH >> 1)) { + /* do nothing */ + } else { + trace_xfs_log_recover_inode_skip(log, in_f); + error = 0; + goto out_release; + } } + + /* Take the opportunity to reset the flush iteration count */ + ldip->di_flushiter = 0; }
- /* Take the opportunity to reset the flush iteration count */ - ldip->di_flushiter = 0;
if (unlikely(S_ISREG(ldip->di_mode))) { if ((ldip->di_format != XFS_DINODE_FMT_EXTENTS) &&
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Naohiro Aota naohiro.aota@wdc.com
commit 776a838f1fa95670c1c1cf7109a898090b473fa3 upstream.
Running the fio command below on a ZNS device results in "Resource temporarily unavailable" error.
$ sudo fio --name=w --directory=/mnt --filesize=1GB --bs=16MB --numjobs=16 \ --rw=write --ioengine=libaio --iodepth=128 --direct=1
fio: io_u error on file /mnt/w.2.0: Resource temporarily unavailable: write offset=117440512, buflen=16777216 fio: io_u error on file /mnt/w.2.0: Resource temporarily unavailable: write offset=134217728, buflen=16777216 ...
This happens because -EAGAIN error returned from btrfs_reserve_extent() called from btrfs_new_extent_direct() is spilling over to the userland.
btrfs_reserve_extent() returns -EAGAIN when there is no active zone available. Then, the caller should wait for some other on-going IO to finish a zone and retry the allocation.
This logic is already implemented for buffered write in cow_file_range(), but it is missing for the direct IO counterpart. Implement the same logic for it.
Reported-by: Shinichiro Kawasaki shinichiro.kawasaki@wdc.com Fixes: 2ce543f47843 ("btrfs: zoned: wait until zone is finished when allocation didn't progress") CC: stable@vger.kernel.org # 6.1+ Tested-by: Shinichiro Kawasaki shinichiro.kawasaki@wdc.com Reviewed-by: Johannes Thumshirn johannes.thumshirn@wdc.com Signed-off-by: Naohiro Aota naohiro.aota@wdc.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/inode.c | 7 +++++++ 1 file changed, 7 insertions(+)
--- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -7142,8 +7142,15 @@ static struct extent_map *btrfs_new_exte int ret;
alloc_hint = get_extent_allocation_hint(inode, start, len); +again: ret = btrfs_reserve_extent(root, len, len, fs_info->sectorsize, 0, alloc_hint, &ins, 1, 1); + if (ret == -EAGAIN) { + ASSERT(btrfs_is_zoned(fs_info)); + wait_on_bit_io(&inode->root->fs_info->flags, BTRFS_FS_NEED_ZONE_FINISH, + TASK_UNINTERRUPTIBLE); + goto again; + } if (ret) return ERR_PTR(ret);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Takashi Iwai tiwai@suse.de
commit c7a60651953359f98dbf24b43e1bf561e1573ed4 upstream.
As reported recently, ALSA core info helper may cause a deadlock at the forced device disconnection during the procfs operation.
The proc_remove() (that is called from the snd_card_disconnect() helper) has a synchronization of the pending procfs accesses via wait_for_completion(). Meanwhile, ALSA procfs helper takes the global mutex_lock(&info_mutex) at both the proc_open callback and snd_card_info_disconnect() helper. Since the proc_open can't finish due to the mutex lock, wait_for_completion() never returns, either, hence it deadlocks.
TASK#1 TASK#2 proc_reg_open() takes use_pde() snd_info_text_entry_open() snd_card_disconnect() snd_info_card_disconnect() takes mutex_lock(&info_mutex) proc_remove() wait_for_completion(unused_pde) ... waiting task#1 closes mutex_lock(&info_mutex) => DEADLOCK
This patch is a workaround for avoiding the deadlock scenario above.
The basic strategy is to move proc_remove() call outside the mutex lock. proc_remove() can work gracefully without extra locking, and it can delete the tree recursively alone. So, we call proc_remove() at snd_info_card_disconnection() at first, then delete the rest resources recursively within the info_mutex lock.
After the change, the function snd_info_disconnect() doesn't do disconnection by itself any longer, but it merely clears the procfs pointer. So rename the function to snd_info_clear_entries() for avoiding confusion.
The similar change is applied to snd_info_free_entry(), too. Since the proc_remove() is called only conditionally with the non-NULL entry->p, it's skipped after the snd_info_clear_entries() call.
Reported-by: Shinhyung Kang s47.kang@samsung.com Closes: https://lore.kernel.org/r/664457955.21699345385931.JavaMail.epsvc@epcpadp4 Reviewed-by: Jaroslav Kysela perex@perex.cz Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20231109141954.4283-1-tiwai@suse.de Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/core/info.c | 21 +++++++++++++-------- 1 file changed, 13 insertions(+), 8 deletions(-)
--- a/sound/core/info.c +++ b/sound/core/info.c @@ -56,7 +56,7 @@ struct snd_info_private_data { };
static int snd_info_version_init(void); -static void snd_info_disconnect(struct snd_info_entry *entry); +static void snd_info_clear_entries(struct snd_info_entry *entry);
/*
@@ -569,11 +569,16 @@ void snd_info_card_disconnect(struct snd { if (!card) return; - mutex_lock(&info_mutex); + proc_remove(card->proc_root_link); - card->proc_root_link = NULL; if (card->proc_root) - snd_info_disconnect(card->proc_root); + proc_remove(card->proc_root->p); + + mutex_lock(&info_mutex); + if (card->proc_root) + snd_info_clear_entries(card->proc_root); + card->proc_root_link = NULL; + card->proc_root = NULL; mutex_unlock(&info_mutex); }
@@ -745,15 +750,14 @@ struct snd_info_entry *snd_info_create_c } EXPORT_SYMBOL(snd_info_create_card_entry);
-static void snd_info_disconnect(struct snd_info_entry *entry) +static void snd_info_clear_entries(struct snd_info_entry *entry) { struct snd_info_entry *p;
if (!entry->p) return; list_for_each_entry(p, &entry->children, list) - snd_info_disconnect(p); - proc_remove(entry->p); + snd_info_clear_entries(p); entry->p = NULL; }
@@ -770,8 +774,9 @@ void snd_info_free_entry(struct snd_info if (!entry) return; if (entry->p) { + proc_remove(entry->p); mutex_lock(&info_mutex); - snd_info_disconnect(entry); + snd_info_clear_entries(entry); mutex_unlock(&info_mutex); }
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eymen Yigit eymenyg01@gmail.com
commit 8384c0baf223e1c3bc7b1c711d80a4c6106d210e upstream.
This HP Notebook uses ALC236 codec with COEF 0x07 idx 1 controlling the mute LED. Enable already existing quirk for this device.
Signed-off-by: Eymen Yigit eymenyg01@gmail.com Cc: Luka Guzenko l.guzenko@web.de Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20231110150715.5141-1-eymenyg01@gmail.com Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/pci/hda/patch_realtek.c | 1 + 1 file changed, 1 insertion(+)
--- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -9670,6 +9670,7 @@ static const struct snd_pci_quirk alc269 SND_PCI_QUIRK(0x103c, 0x8898, "HP EliteBook 845 G8 Notebook PC", ALC285_FIXUP_HP_LIMIT_INT_MIC_BOOST), SND_PCI_QUIRK(0x103c, 0x88d0, "HP Pavilion 15-eh1xxx (mainboard 88D0)", ALC287_FIXUP_HP_GPIO_LED), SND_PCI_QUIRK(0x103c, 0x8902, "HP OMEN 16", ALC285_FIXUP_HP_MUTE_LED), + SND_PCI_QUIRK(0x103c, 0x890e, "HP 255 G8 Notebook PC", ALC236_FIXUP_HP_MUTE_LED_COEFBIT2), SND_PCI_QUIRK(0x103c, 0x8919, "HP Pavilion Aero Laptop 13-be0xxx", ALC287_FIXUP_HP_GPIO_LED), SND_PCI_QUIRK(0x103c, 0x896d, "HP ZBook Firefly 16 G9", ALC245_FIXUP_CS35L41_SPI_2_HP_GPIO_LED), SND_PCI_QUIRK(0x103c, 0x896e, "HP EliteBook x360 830 G9", ALC245_FIXUP_CS35L41_SPI_2_HP_GPIO_LED),
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kailang Yang kailang@realtek.com
commit 4b21a669ca21ed8f24ef4530b2918be5730114de upstream.
Add ALC295 to pin fall back table. Remove 5 pin quirks for Dell ALC295. ALC295 was only support MIC2 for external MIC function. ALC295 assigned model "ALC269_FIXUP_DELL1_MIC_NO_PRESENCE" for pin fall back table. It was assigned wrong model. So, let's remove it.
Fixes: fbc571290d9f ("ALSA: hda/realtek - Fixed Headphone Mic can't record on Dell platform") Signed-off-by: Kailang Yang kailang@realtek.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/7c1998e873834df98d59bd7e0d08c72e@realtek.com Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/pci/hda/patch_realtek.c | 19 +++---------------- 1 file changed, 3 insertions(+), 16 deletions(-)
--- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -10641,22 +10641,6 @@ static const struct snd_hda_pin_quirk al {0x12, 0x90a60130}, {0x17, 0x90170110}, {0x21, 0x03211020}), - SND_HDA_PIN_QUIRK(0x10ec0295, 0x1028, "Dell", ALC269_FIXUP_DELL4_MIC_NO_PRESENCE, - {0x14, 0x90170110}, - {0x21, 0x04211020}), - SND_HDA_PIN_QUIRK(0x10ec0295, 0x1028, "Dell", ALC269_FIXUP_DELL4_MIC_NO_PRESENCE, - {0x14, 0x90170110}, - {0x21, 0x04211030}), - SND_HDA_PIN_QUIRK(0x10ec0295, 0x1028, "Dell", ALC269_FIXUP_DELL1_MIC_NO_PRESENCE, - ALC295_STANDARD_PINS, - {0x17, 0x21014020}, - {0x18, 0x21a19030}), - SND_HDA_PIN_QUIRK(0x10ec0295, 0x1028, "Dell", ALC269_FIXUP_DELL1_MIC_NO_PRESENCE, - ALC295_STANDARD_PINS, - {0x17, 0x21014040}, - {0x18, 0x21a19050}), - SND_HDA_PIN_QUIRK(0x10ec0295, 0x1028, "Dell", ALC269_FIXUP_DELL1_MIC_NO_PRESENCE, - ALC295_STANDARD_PINS), SND_HDA_PIN_QUIRK(0x10ec0298, 0x1028, "Dell", ALC298_FIXUP_DELL1_MIC_NO_PRESENCE, ALC298_STANDARD_PINS, {0x17, 0x90170110}), @@ -10700,6 +10684,9 @@ static const struct snd_hda_pin_quirk al SND_HDA_PIN_QUIRK(0x10ec0289, 0x1028, "Dell", ALC269_FIXUP_DELL4_MIC_NO_PRESENCE, {0x19, 0x40000000}, {0x1b, 0x40000000}), + SND_HDA_PIN_QUIRK(0x10ec0295, 0x1028, "Dell", ALC269_FIXUP_DELL4_MIC_NO_PRESENCE, + {0x19, 0x40000000}, + {0x1b, 0x40000000}), SND_HDA_PIN_QUIRK(0x10ec0256, 0x1028, "Dell", ALC255_FIXUP_DELL1_MIC_NO_PRESENCE, {0x19, 0x40000000}, {0x1a, 0x40000000}),
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Chandradeep Dey codesigning@chandradeepdey.com
commit 713f040cd22285fcc506f40a0d259566e6758c3c upstream.
Apply the already existing quirk chain ALC294_FIXUP_ASUS_SPK to enable the internal speaker of ASUS K6500ZC.
Signed-off-by: Chandradeep Dey codesigning@chandradeepdey.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/NizcVHQ--3-9@chandradeepdey.com Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/pci/hda/patch_realtek.c | 1 + 1 file changed, 1 insertion(+)
--- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -9745,6 +9745,7 @@ static const struct snd_pci_quirk alc269 SND_PCI_QUIRK(0x1043, 0x10a1, "ASUS UX391UA", ALC294_FIXUP_ASUS_SPK), SND_PCI_QUIRK(0x1043, 0x10c0, "ASUS X540SA", ALC256_FIXUP_ASUS_MIC), SND_PCI_QUIRK(0x1043, 0x10d0, "ASUS X540LA/X540LJ", ALC255_FIXUP_ASUS_MIC_NO_PRESENCE), + SND_PCI_QUIRK(0x1043, 0x10d3, "ASUS K6500ZC", ALC294_FIXUP_ASUS_SPK), SND_PCI_QUIRK(0x1043, 0x115d, "Asus 1015E", ALC269_FIXUP_LIMIT_INT_MIC_BOOST), SND_PCI_QUIRK(0x1043, 0x11c0, "ASUS X556UR", ALC255_FIXUP_ASUS_MIC_NO_PRESENCE), SND_PCI_QUIRK(0x1043, 0x125e, "ASUS Q524UQK", ALC255_FIXUP_ASUS_MIC_NO_PRESENCE),
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Matus Malych matus@malych.org
commit b944aa9d86d5f782bfe5e51336434c960304839c upstream.
HP 255 G10 has a mute LED that can be made to work using quirk ALC236_FIXUP_HP_MUTE_LED_COEFBIT2. Enable already existing quirk - at correct line to keep order
Signed-off-by: Matus Malych matus@malych.org Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20231114133524.11340-1-matus@malych.org Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/pci/hda/patch_realtek.c | 1 + 1 file changed, 1 insertion(+)
--- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -9706,6 +9706,7 @@ static const struct snd_pci_quirk alc269 SND_PCI_QUIRK(0x103c, 0x8abb, "HP ZBook Firefly 14 G9", ALC245_FIXUP_CS35L41_SPI_2_HP_GPIO_LED), SND_PCI_QUIRK(0x103c, 0x8ad1, "HP EliteBook 840 14 inch G9 Notebook PC", ALC245_FIXUP_CS35L41_SPI_2_HP_GPIO_LED), SND_PCI_QUIRK(0x103c, 0x8ad2, "HP EliteBook 860 16 inch G9 Notebook PC", ALC245_FIXUP_CS35L41_SPI_2_HP_GPIO_LED), + SND_PCI_QUIRK(0x103c, 0x8b2f, "HP 255 15.6 inch G10 Notebook PC", ALC236_FIXUP_HP_MUTE_LED_COEFBIT2), SND_PCI_QUIRK(0x103c, 0x8b42, "HP", ALC245_FIXUP_CS35L41_SPI_2_HP_GPIO_LED), SND_PCI_QUIRK(0x103c, 0x8b43, "HP", ALC245_FIXUP_CS35L41_SPI_2_HP_GPIO_LED), SND_PCI_QUIRK(0x103c, 0x8b44, "HP", ALC245_FIXUP_CS35L41_SPI_2_HP_GPIO_LED),
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Stefan Binding sbinding@opensource.cirrus.com
commit 5d639b60971f003d3a9b2b31f8ec73b0718b5d57 upstream.
These HP laptops use Realtek HDA codec combined with 2 or 4 CS35L41 Amplifiers using SPI with Internal Boost.
Signed-off-by: Stefan Binding sbinding@opensource.cirrus.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20231115162116.494968-3-sbinding@opensource.cirrus... Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/pci/hda/patch_realtek.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -9740,6 +9740,9 @@ static const struct snd_pci_quirk alc269 SND_PCI_QUIRK(0x103c, 0x8c70, "HP EliteBook 835 G11", ALC287_FIXUP_CS35L41_I2C_2_HP_GPIO_LED), SND_PCI_QUIRK(0x103c, 0x8c71, "HP EliteBook 845 G11", ALC287_FIXUP_CS35L41_I2C_2_HP_GPIO_LED), SND_PCI_QUIRK(0x103c, 0x8c72, "HP EliteBook 865 G11", ALC287_FIXUP_CS35L41_I2C_2_HP_GPIO_LED), + SND_PCI_QUIRK(0x103c, 0x8ca4, "HP ZBook Fury", ALC245_FIXUP_CS35L41_SPI_2_HP_GPIO_LED), + SND_PCI_QUIRK(0x103c, 0x8ca7, "HP ZBook Fury", ALC245_FIXUP_CS35L41_SPI_2_HP_GPIO_LED), + SND_PCI_QUIRK(0x103c, 0x8cf5, "HP ZBook Studio 16", ALC245_FIXUP_CS35L41_SPI_4_HP_GPIO_LED), SND_PCI_QUIRK(0x1043, 0x103e, "ASUS X540SA", ALC256_FIXUP_ASUS_MIC), SND_PCI_QUIRK(0x1043, 0x103f, "ASUS TX300", ALC282_FIXUP_ASUS_TX300), SND_PCI_QUIRK(0x1043, 0x106d, "Asus K53BE", ALC269_FIXUP_LIMIT_INT_MIC_BOOST),
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Johnathan Mantey johnathanx.mantey@intel.com
commit 9e2e7efbbbff69d8340abb56d375dd79d1f5770f upstream.
This reverts commit 3780bb29311eccb7a1c9641032a112eed237f7e3.
The cited commit introduced unwanted behavior.
The intent for the commit was to be able to detect carrier loss/gain for just the NIC connected to the BMC. The unwanted effect is a carrier loss for auxiliary paths also causes the BMC to lose carrier. The BMC never regains carrier despite the secondary NIC regaining a link.
This change, when merged, needs to be backported to stable kernels. 5.4-stable, 5.10-stable, 5.15-stable, 6.1-stable, 6.5-stable
Fixes: 3780bb29311e ("ncsi: Propagate carrier gain/loss events to the NCSI controller") CC: stable@vger.kernel.org Signed-off-by: Johnathan Mantey johnathanx.mantey@intel.com Reviewed-by: Simon Horman horms@kernel.org Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ncsi/ncsi-aen.c | 5 ----- 1 file changed, 5 deletions(-)
--- a/net/ncsi/ncsi-aen.c +++ b/net/ncsi/ncsi-aen.c @@ -89,11 +89,6 @@ static int ncsi_aen_handler_lsc(struct n if ((had_link == has_link) || chained) return 0;
- if (had_link) - netif_carrier_off(ndp->ndev.dev); - else - netif_carrier_on(ndp->ndev.dev); - if (!ndp->multi_package && !nc->package->multi_channel) { if (had_link) { ndp->flags |= NCSI_DEV_RESHUFFLE;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Robert Marko robert.marko@sartura.hr
commit 7b211c7671212cad0b83603c674838c7e824d845 upstream.
This reverts commit 0b01392c18b9993a584f36ace1d61118772ad0ca.
Conversion of PXA to generic I2C recovery, makes the I2C bus completely lock up if recovery pinctrl is present in the DT and I2C recovery is enabled.
So, until the generic I2C recovery can also work with PXA lets revert to have working I2C and I2C recovery again.
Signed-off-by: Robert Marko robert.marko@sartura.hr Cc: stable@vger.kernel.org # 5.11+ Acked-by: Andi Shyti andi.shyti@kernel.org Acked-by: Russell King (Oracle) rmk+kernel@armlinux.org.uk Acked-by: Linus Walleij linus.walleij@linaro.org Signed-off-by: Wolfram Sang wsa@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/i2c/busses/i2c-pxa.c | 76 ++++++++++++++++++++++++++++++++++++++----- 1 file changed, 68 insertions(+), 8 deletions(-)
--- a/drivers/i2c/busses/i2c-pxa.c +++ b/drivers/i2c/busses/i2c-pxa.c @@ -264,6 +264,9 @@ struct pxa_i2c { u32 hs_mask;
struct i2c_bus_recovery_info recovery; + struct pinctrl *pinctrl; + struct pinctrl_state *pinctrl_default; + struct pinctrl_state *pinctrl_recovery; };
#define _IBMR(i2c) ((i2c)->reg_ibmr) @@ -1300,12 +1303,13 @@ static void i2c_pxa_prepare_recovery(str */ gpiod_set_value(i2c->recovery.scl_gpiod, ibmr & IBMR_SCLS); gpiod_set_value(i2c->recovery.sda_gpiod, ibmr & IBMR_SDAS); + + WARN_ON(pinctrl_select_state(i2c->pinctrl, i2c->pinctrl_recovery)); }
static void i2c_pxa_unprepare_recovery(struct i2c_adapter *adap) { struct pxa_i2c *i2c = adap->algo_data; - struct i2c_bus_recovery_info *bri = adap->bus_recovery_info; u32 isr;
/* @@ -1319,7 +1323,7 @@ static void i2c_pxa_unprepare_recovery(s i2c_pxa_do_reset(i2c); }
- WARN_ON(pinctrl_select_state(bri->pinctrl, bri->pins_default)); + WARN_ON(pinctrl_select_state(i2c->pinctrl, i2c->pinctrl_default));
dev_dbg(&i2c->adap.dev, "recovery: IBMR 0x%08x ISR 0x%08x\n", readl(_IBMR(i2c)), readl(_ISR(i2c))); @@ -1341,20 +1345,76 @@ static int i2c_pxa_init_recovery(struct if (IS_ENABLED(CONFIG_I2C_PXA_SLAVE)) return 0;
- bri->pinctrl = devm_pinctrl_get(dev); - if (PTR_ERR(bri->pinctrl) == -ENODEV) { - bri->pinctrl = NULL; + i2c->pinctrl = devm_pinctrl_get(dev); + if (PTR_ERR(i2c->pinctrl) == -ENODEV) + i2c->pinctrl = NULL; + if (IS_ERR(i2c->pinctrl)) + return PTR_ERR(i2c->pinctrl); + + if (!i2c->pinctrl) + return 0; + + i2c->pinctrl_default = pinctrl_lookup_state(i2c->pinctrl, + PINCTRL_STATE_DEFAULT); + i2c->pinctrl_recovery = pinctrl_lookup_state(i2c->pinctrl, "recovery"); + + if (IS_ERR(i2c->pinctrl_default) || IS_ERR(i2c->pinctrl_recovery)) { + dev_info(dev, "missing pinmux recovery information: %ld %ld\n", + PTR_ERR(i2c->pinctrl_default), + PTR_ERR(i2c->pinctrl_recovery)); + return 0; + } + + /* + * Claiming GPIOs can influence the pinmux state, and may glitch the + * I2C bus. Do this carefully. + */ + bri->scl_gpiod = devm_gpiod_get(dev, "scl", GPIOD_OUT_HIGH_OPEN_DRAIN); + if (bri->scl_gpiod == ERR_PTR(-EPROBE_DEFER)) + return -EPROBE_DEFER; + if (IS_ERR(bri->scl_gpiod)) { + dev_info(dev, "missing scl gpio recovery information: %pe\n", + bri->scl_gpiod); + return 0; + } + + /* + * We have SCL. Pull SCL low and wait a bit so that SDA glitches + * have no effect. + */ + gpiod_direction_output(bri->scl_gpiod, 0); + udelay(10); + bri->sda_gpiod = devm_gpiod_get(dev, "sda", GPIOD_OUT_HIGH_OPEN_DRAIN); + + /* Wait a bit in case of a SDA glitch, and then release SCL. */ + udelay(10); + gpiod_direction_output(bri->scl_gpiod, 1); + + if (bri->sda_gpiod == ERR_PTR(-EPROBE_DEFER)) + return -EPROBE_DEFER; + + if (IS_ERR(bri->sda_gpiod)) { + dev_info(dev, "missing sda gpio recovery information: %pe\n", + bri->sda_gpiod); return 0; } - if (IS_ERR(bri->pinctrl)) - return PTR_ERR(bri->pinctrl);
bri->prepare_recovery = i2c_pxa_prepare_recovery; bri->unprepare_recovery = i2c_pxa_unprepare_recovery; + bri->recover_bus = i2c_generic_scl_recovery;
i2c->adap.bus_recovery_info = bri;
- return 0; + /* + * Claiming GPIOs can change the pinmux state, which confuses the + * pinctrl since pinctrl's idea of the current setting is unaffected + * by the pinmux change caused by claiming the GPIO. Work around that + * by switching pinctrl to the GPIO state here. We do it this way to + * avoid glitching the I2C bus. + */ + pinctrl_select_state(i2c->pinctrl, i2c->pinctrl_recovery); + + return pinctrl_select_state(i2c->pinctrl, i2c->pinctrl_default); }
static int i2c_pxa_probe(struct platform_device *dev)
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ondrej Mosnacek omosnace@redhat.com
commit 866d648059d5faf53f1cd960b43fe8365ad93ea7 upstream.
1 is the return value that implements a "no-op" hook, not 0.
Cc: stable@vger.kernel.org Fixes: 98e828a0650f ("security: Refactor declaration of LSM hooks") Signed-off-by: Ondrej Mosnacek omosnace@redhat.com Signed-off-by: Paul Moore paul@paul-moore.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/linux/lsm_hook_defs.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/include/linux/lsm_hook_defs.h +++ b/include/linux/lsm_hook_defs.h @@ -48,7 +48,7 @@ LSM_HOOK(int, 0, quota_on, struct dentry LSM_HOOK(int, 0, syslog, int type) LSM_HOOK(int, 0, settime, const struct timespec64 *ts, const struct timezone *tz) -LSM_HOOK(int, 0, vm_enough_memory, struct mm_struct *mm, long pages) +LSM_HOOK(int, 1, vm_enough_memory, struct mm_struct *mm, long pages) LSM_HOOK(int, 0, bprm_creds_for_exec, struct linux_binprm *bprm) LSM_HOOK(int, 0, bprm_creds_from_file, struct linux_binprm *bprm, struct file *file) LSM_HOOK(int, 0, bprm_check_security, struct linux_binprm *bprm)
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ondrej Mosnacek omosnace@redhat.com
commit b36995b8609a5a8fe5cf259a1ee768fcaed919f8 upstream.
-EOPNOTSUPP is the return value that implements a "no-op" hook, not 0.
Without this fix having only the BPF LSM enabled (with no programs attached) can cause uninitialized variable reads in nfsd4_encode_fattr(), because the BPF hook returns 0 without touching the 'ctxlen' variable and the corresponding 'contextlen' variable in nfsd4_encode_fattr() remains uninitialized, yet being treated as valid based on the 0 return value.
Cc: stable@vger.kernel.org Fixes: 98e828a0650f ("security: Refactor declaration of LSM hooks") Reported-by: Benjamin Coddington bcodding@redhat.com Signed-off-by: Ondrej Mosnacek omosnace@redhat.com Signed-off-by: Paul Moore paul@paul-moore.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/linux/lsm_hook_defs.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/include/linux/lsm_hook_defs.h +++ b/include/linux/lsm_hook_defs.h @@ -273,7 +273,7 @@ LSM_HOOK(void, LSM_RET_VOID, release_sec LSM_HOOK(void, LSM_RET_VOID, inode_invalidate_secctx, struct inode *inode) LSM_HOOK(int, 0, inode_notifysecctx, struct inode *inode, void *ctx, u32 ctxlen) LSM_HOOK(int, 0, inode_setsecctx, struct dentry *dentry, void *ctx, u32 ctxlen) -LSM_HOOK(int, 0, inode_getsecctx, struct inode *inode, void **ctx, +LSM_HOOK(int, -EOPNOTSUPP, inode_getsecctx, struct inode *inode, void **ctx, u32 *ctxlen)
#if defined(CONFIG_SECURITY) && defined(CONFIG_WATCH_QUEUE)
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Darren Hart darren@os.amperecomputing.com
commit 5d6aa89bba5bd6af2580f872b57f438dab883738 upstream.
Commit abd3ac7902fb ("watchdog: sbsa: Support architecture version 1") introduced new timer math for watchdog revision 1 with the 48 bit offset register.
The gwdt->clk and timeout are u32, but the argument being calculated is u64. Without a cast, the compiler performs u32 operations, truncating intermediate steps, resulting in incorrect values.
A watchdog revision 1 implementation with a gwdt->clk of 1GHz and a timeout of 600s writes 3647256576 to the one shot watchdog instead of 300000000000, resulting in the watchdog firing in 3.6s instead of 600s.
Force u64 math by casting the first argument (gwdt->clk) as a u64. Make the order of operations explicit with parenthesis.
Fixes: abd3ac7902fb ("watchdog: sbsa: Support architecture version 1") Reported-by: Vanshidhar Konda vanshikonda@os.amperecomputing.com Signed-off-by: Darren Hart darren@os.amperecomputing.com Cc: Wim Van Sebroeck wim@linux-watchdog.org Cc: Guenter Roeck linux@roeck-us.net Cc: linux-watchdog@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: linux-arm-kernel@lists.infradead.org Cc: stable@vger.kernel.org # 5.14.x Reviewed-by: Guenter Roeck linux@roeck-us.net Link: https://lore.kernel.org/r/7d1713c5ffab19b0f3de796d82df19e8b1f340de.169528612... Signed-off-by: Guenter Roeck linux@roeck-us.net Signed-off-by: Wim Van Sebroeck wim@linux-watchdog.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/watchdog/sbsa_gwdt.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/watchdog/sbsa_gwdt.c +++ b/drivers/watchdog/sbsa_gwdt.c @@ -153,14 +153,14 @@ static int sbsa_gwdt_set_timeout(struct timeout = clamp_t(unsigned int, timeout, 1, wdd->max_hw_heartbeat_ms / 1000);
if (action) - sbsa_gwdt_reg_write(gwdt->clk * timeout, gwdt); + sbsa_gwdt_reg_write((u64)gwdt->clk * timeout, gwdt); else /* * In the single stage mode, The first signal (WS0) is ignored, * the timeout is (WOR * 2), so the WOR should be configured * to half value of timeout. */ - sbsa_gwdt_reg_write(gwdt->clk / 2 * timeout, gwdt); + sbsa_gwdt_reg_write(((u64)gwdt->clk / 2) * timeout, gwdt);
return 0; }
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tam Nguyen tamnguyenchi@os.amperecomputing.com
commit e8183fa10c25c7b3c20670bf2b430ddcc1ee03c0 upstream.
During SMBus block data read process, we have seen high interrupt rate because of TX_EMPTY irq status while waiting for block length byte (the first data byte after the address phase). The interrupt handler does not do anything because the internal state is kept as STATUS_WRITE_IN_PROGRESS. Hence, we should disable TX_EMPTY IRQ until I2C DesignWare receives first data byte from I2C device, then re-enable it to resume SMBus transaction.
It takes 0.789 ms for host to receive data length from slave. Without the patch, i2c_dw_isr() is called 99 times by TX_EMPTY interrupt. And it is none after applying the patch.
Cc: stable@vger.kernel.org Co-developed-by: Chuong Tran chuong@os.amperecomputing.com Signed-off-by: Chuong Tran chuong@os.amperecomputing.com Signed-off-by: Tam Nguyen tamnguyenchi@os.amperecomputing.com Acked-by: Jarkko Nikula jarkko.nikula@linux.intel.com Reviewed-by: Serge Semin fancer.lancer@gmail.com Signed-off-by: Wolfram Sang wsa@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/i2c/busses/i2c-designware-master.c | 19 ++++++++++++++++--- 1 file changed, 16 insertions(+), 3 deletions(-)
--- a/drivers/i2c/busses/i2c-designware-master.c +++ b/drivers/i2c/busses/i2c-designware-master.c @@ -518,10 +518,16 @@ i2c_dw_xfer_msg(struct dw_i2c_dev *dev)
/* * Because we don't know the buffer length in the - * I2C_FUNC_SMBUS_BLOCK_DATA case, we can't stop - * the transaction here. + * I2C_FUNC_SMBUS_BLOCK_DATA case, we can't stop the + * transaction here. Also disable the TX_EMPTY IRQ + * while waiting for the data length byte to avoid the + * bogus interrupts flood. */ - if (buf_len > 0 || flags & I2C_M_RECV_LEN) { + if (flags & I2C_M_RECV_LEN) { + dev->status |= STATUS_WRITE_IN_PROGRESS; + intr_mask &= ~DW_IC_INTR_TX_EMPTY; + break; + } else if (buf_len > 0) { /* more bytes to be written */ dev->status |= STATUS_WRITE_IN_PROGRESS; break; @@ -557,6 +563,13 @@ i2c_dw_recv_len(struct dw_i2c_dev *dev, msgs[dev->msg_read_idx].len = len; msgs[dev->msg_read_idx].flags &= ~I2C_M_RECV_LEN;
+ /* + * Received buffer length, re-enable TX_EMPTY interrupt + * to resume the SMBUS transaction. + */ + regmap_update_bits(dev->map, DW_IC_INTR_MASK, DW_IC_INTR_TX_EMPTY, + DW_IC_INTR_TX_EMPTY); + return len; }
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Harald Freudenberger freude@linux.ibm.com
commit e14aec23025eeb1f2159ba34dbc1458467c4c347 upstream.
Fix kernel crash in AP bus code caused by very early invocation of the config change callback function via SCLP.
After a fresh IML of the machine the crypto cards are still offline and will get switched online only with activation of any LPAR which has the card in it's configuration. A crypto card coming online is reported to the LPAR via SCLP and the AP bus offers a callback function to get this kind of information. However, it may happen that the callback is invoked before the AP bus init function is complete. As the callback triggers a synchronous AP bus scan, the scan may already run but some internal states are not initialized by the AP bus init function resulting in a crash like this:
[ 11.635859] Unable to handle kernel pointer dereference in virtual kernel address space [ 11.635861] Failing address: 0000000000000000 TEID: 0000000000000887 [ 11.635862] Fault in home space mode while using kernel ASCE. [ 11.635864] AS:00000000894c4007 R3:00000001fece8007 S:00000001fece7800 P:000000000000013d [ 11.635879] Oops: 0004 ilc:1 [#1] SMP [ 11.635882] Modules linked in: [ 11.635884] CPU: 5 PID: 42 Comm: kworker/5:0 Not tainted 6.6.0-rc3-00003-g4dbf7cdc6b42 #12 [ 11.635886] Hardware name: IBM 3931 A01 751 (LPAR) [ 11.635887] Workqueue: events_long ap_scan_bus [ 11.635891] Krnl PSW : 0704c00180000000 0000000000000000 (0x0) [ 11.635895] R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 RI:0 EA:3 [ 11.635897] Krnl GPRS: 0000000001000a00 0000000000000000 0000000000000006 0000000089591940 [ 11.635899] 0000000080000000 0000000000000a00 0000000000000000 0000000000000000 [ 11.635901] 0000000081870c00 0000000089591000 000000008834e4e2 0000000002625a00 [ 11.635903] 0000000081734200 0000038000913c18 000000008834c6d6 0000038000913ac8 [ 11.635906] Krnl Code:>0000000000000000: 0000 illegal [ 11.635906] 0000000000000002: 0000 illegal [ 11.635906] 0000000000000004: 0000 illegal [ 11.635906] 0000000000000006: 0000 illegal [ 11.635906] 0000000000000008: 0000 illegal [ 11.635906] 000000000000000a: 0000 illegal [ 11.635906] 000000000000000c: 0000 illegal [ 11.635906] 000000000000000e: 0000 illegal [ 11.635915] Call Trace: [ 11.635916] [<0000000000000000>] 0x0 [ 11.635918] [<000000008834e4e2>] ap_queue_init_state+0x82/0xb8 [ 11.635921] [<000000008834ba1c>] ap_scan_domains+0x6fc/0x740 [ 11.635923] [<000000008834c092>] ap_scan_adapter+0x632/0x8b0 [ 11.635925] [<000000008834c3e4>] ap_scan_bus+0xd4/0x288 [ 11.635927] [<00000000879a33ba>] process_one_work+0x19a/0x410 [ 11.635930] Discipline DIAG cannot be used without z/VM [ 11.635930] [<00000000879a3a2c>] worker_thread+0x3fc/0x560 [ 11.635933] [<00000000879aea60>] kthread+0x120/0x128 [ 11.635936] [<000000008792afa4>] __ret_from_fork+0x3c/0x58 [ 11.635938] [<00000000885ebe62>] ret_from_fork+0xa/0x30 [ 11.635942] Last Breaking-Event-Address: [ 11.635942] [<000000008834c6d4>] ap_wait+0xcc/0x148
This patch improves the ap_bus_force_rescan() function which is invoked by the config change callback by checking if a first initial AP bus scan has been done. If not, the force rescan request is simple ignored. Anyhow it does not make sense to trigger AP bus re-scans even before the very first bus scan is complete.
Cc: stable@vger.kernel.org Reviewed-by: Holger Dengler dengler@linux.ibm.com Signed-off-by: Harald Freudenberger freude@linux.ibm.com Signed-off-by: Vasily Gorbik gor@linux.ibm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/s390/crypto/ap_bus.c | 4 ++++ 1 file changed, 4 insertions(+)
--- a/drivers/s390/crypto/ap_bus.c +++ b/drivers/s390/crypto/ap_bus.c @@ -1030,6 +1030,10 @@ EXPORT_SYMBOL(ap_driver_unregister);
void ap_bus_force_rescan(void) { + /* Only trigger AP bus scans after the initial scan is done */ + if (atomic64_read(&ap_scan_bus_count) <= 0) + return; + /* processing a asynchronous bus rescan */ del_timer(&ap_config_timer); queue_work(system_long_wq, &ap_scan_work);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Andrew Lunn andrew@lunn.ch
commit f55d8e60f10909dbc5524e261041e1d28d7d20d8 upstream.
This function takes a pointer to a pointer, unlike sprintf() which is passed a plain pointer. Fix up the documentation to make this clear.
Fixes: 7888fe53b706 ("ethtool: Add common function for filling out strings") Cc: Alexander Duyck alexanderduyck@fb.com Cc: Justin Stitt justinstitt@google.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Lunn andrew@lunn.ch Reviewed-by: Justin Stitt justinstitt@google.com Link: https://lore.kernel.org/r/20231028192511.100001-1-andrew@lunn.ch Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/linux/ethtool.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/include/linux/ethtool.h +++ b/include/linux/ethtool.h @@ -1045,10 +1045,10 @@ static inline int ethtool_mm_frag_size_m
/** * ethtool_sprintf - Write formatted string to ethtool string data - * @data: Pointer to start of string to update + * @data: Pointer to a pointer to the start of string to update * @fmt: Format of string to write * - * Write formatted string to data. Update data to point at start of + * Write formatted string to *data. Update *data to point at start of * next string. */ extern __printf(2, 3) void ethtool_sprintf(u8 **data, const char *fmt, ...);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alexander Sverdlin alexander.sverdlin@siemens.com
commit 5a22fbcc10f3f7d94c5d88afbbffa240a3677057 upstream.
When LAN9303 is MDIO-connected two callchains exist into mdio->bus->write():
1. switch ports 1&2 ("physical" PHYs):
virtual (switch-internal) MDIO bus (lan9303_switch_ops->phy_{read|write})-> lan9303_mdio_phy_{read|write} -> mdiobus_{read|write}_nested
2. LAN9303 virtual PHY:
virtual MDIO bus (lan9303_phy_{read|write}) -> lan9303_virt_phy_reg_{read|write} -> regmap -> lan9303_mdio_{read|write}
If the latter functions just take mutex_lock(&sw_dev->device->bus->mdio_lock) it triggers a LOCKDEP false-positive splat. It's false-positive because the first mdio_lock in the second callchain above belongs to virtual MDIO bus, the second mdio_lock belongs to physical MDIO bus.
Consequent annotation in lan9303_mdio_{read|write} as nested lock (similar to lan9303_mdio_phy_{read|write}, it's the same physical MDIO bus) prevents the following splat:
WARNING: possible circular locking dependency detected 5.15.71 #1 Not tainted ------------------------------------------------------ kworker/u4:3/609 is trying to acquire lock: ffff000011531c68 (lan9303_mdio:131:(&lan9303_mdio_regmap_config)->lock){+.+.}-{3:3}, at: regmap_lock_mutex but task is already holding lock: ffff0000114c44d8 (&bus->mdio_lock){+.+.}-{3:3}, at: mdiobus_read which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (&bus->mdio_lock){+.+.}-{3:3}: lock_acquire __mutex_lock mutex_lock_nested lan9303_mdio_read _regmap_read regmap_read lan9303_probe lan9303_mdio_probe mdio_probe really_probe __driver_probe_device driver_probe_device __device_attach_driver bus_for_each_drv __device_attach device_initial_probe bus_probe_device deferred_probe_work_func process_one_work worker_thread kthread ret_from_fork -> #0 (lan9303_mdio:131:(&lan9303_mdio_regmap_config)->lock){+.+.}-{3:3}: __lock_acquire lock_acquire.part.0 lock_acquire __mutex_lock mutex_lock_nested regmap_lock_mutex regmap_read lan9303_phy_read dsa_slave_phy_read __mdiobus_read mdiobus_read get_phy_device mdiobus_scan __mdiobus_register dsa_register_switch lan9303_probe lan9303_mdio_probe mdio_probe really_probe __driver_probe_device driver_probe_device __device_attach_driver bus_for_each_drv __device_attach device_initial_probe bus_probe_device deferred_probe_work_func process_one_work worker_thread kthread ret_from_fork other info that might help us debug this: Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&bus->mdio_lock); lock(lan9303_mdio:131:(&lan9303_mdio_regmap_config)->lock); lock(&bus->mdio_lock); lock(lan9303_mdio:131:(&lan9303_mdio_regmap_config)->lock); *** DEADLOCK *** 5 locks held by kworker/u4:3/609: #0: ffff000002842938 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work #1: ffff80000bacbd60 (deferred_probe_work){+.+.}-{0:0}, at: process_one_work #2: ffff000007645178 (&dev->mutex){....}-{3:3}, at: __device_attach #3: ffff8000096e6e78 (dsa2_mutex){+.+.}-{3:3}, at: dsa_register_switch #4: ffff0000114c44d8 (&bus->mdio_lock){+.+.}-{3:3}, at: mdiobus_read stack backtrace: CPU: 1 PID: 609 Comm: kworker/u4:3 Not tainted 5.15.71 #1 Workqueue: events_unbound deferred_probe_work_func Call trace: dump_backtrace show_stack dump_stack_lvl dump_stack print_circular_bug check_noncircular __lock_acquire lock_acquire.part.0 lock_acquire __mutex_lock mutex_lock_nested regmap_lock_mutex regmap_read lan9303_phy_read dsa_slave_phy_read __mdiobus_read mdiobus_read get_phy_device mdiobus_scan __mdiobus_register dsa_register_switch lan9303_probe lan9303_mdio_probe ...
Cc: stable@vger.kernel.org Fixes: dc7005831523 ("net: dsa: LAN9303: add MDIO managed mode support") Signed-off-by: Alexander Sverdlin alexander.sverdlin@siemens.com Reviewed-by: Andrew Lunn andrew@lunn.ch Link: https://lore.kernel.org/r/20231027065741.534971-1-alexander.sverdlin@siemens... Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/dsa/lan9303_mdio.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/net/dsa/lan9303_mdio.c +++ b/drivers/net/dsa/lan9303_mdio.c @@ -32,7 +32,7 @@ static int lan9303_mdio_write(void *ctx, struct lan9303_mdio *sw_dev = (struct lan9303_mdio *)ctx;
reg <<= 2; /* reg num to offset */ - mutex_lock(&sw_dev->device->bus->mdio_lock); + mutex_lock_nested(&sw_dev->device->bus->mdio_lock, MDIO_MUTEX_NESTED); lan9303_mdio_real_write(sw_dev->device, reg, val & 0xffff); lan9303_mdio_real_write(sw_dev->device, reg + 2, (val >> 16) & 0xffff); mutex_unlock(&sw_dev->device->bus->mdio_lock); @@ -50,7 +50,7 @@ static int lan9303_mdio_read(void *ctx, struct lan9303_mdio *sw_dev = (struct lan9303_mdio *)ctx;
reg <<= 2; /* reg num to offset */ - mutex_lock(&sw_dev->device->bus->mdio_lock); + mutex_lock_nested(&sw_dev->device->bus->mdio_lock, MDIO_MUTEX_NESTED); *val = lan9303_mdio_real_read(sw_dev->device, reg); *val |= (lan9303_mdio_real_read(sw_dev->device, reg + 2) << 16); mutex_unlock(&sw_dev->device->bus->mdio_lock);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Klaus Kudielka klaus.kudielka@gmail.com
commit 02d5fdbf4f2b8c406f7a4c98fa52aa181a11d733 upstream.
Background: Turris Omnia (Armada 385); eth2 (mvneta) connected to SFP bus; SFP module is present, but no fiber connected, so definitely no carrier.
After booting, eth2 is down, but netdev LED trigger surprisingly reports link active. Then, after "ip link set eth2 up", the link indicator goes away - as I would have expected it from the beginning.
It turns out, that the default carrier state after netdev creation is "carrier ok". Some ethernet drivers explicitly call netif_carrier_off during probing, others (like mvneta) don't - which explains the current behaviour: only when the device is brought up, phylink_start calls netif_carrier_off.
Fix this for all drivers using phylink, by calling netif_carrier_off in phylink_create.
Fixes: 089381b27abe ("leds: initial support for Turris Omnia LEDs") Cc: stable@vger.kernel.org Suggested-by: Andrew Lunn andrew@lunn.ch Signed-off-by: Klaus Kudielka klaus.kudielka@gmail.com Reviewed-by: Russell King (Oracle) rmk+kernel@armlinux.org.uk Reviewed-by: Andrew Lunn andrew@lunn.ch Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/phy/phylink.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/net/phy/phylink.c +++ b/drivers/net/phy/phylink.c @@ -1568,6 +1568,7 @@ struct phylink *phylink_create(struct ph pl->config = config; if (config->type == PHYLINK_NETDEV) { pl->netdev = to_net_dev(config->dev); + netif_carrier_off(pl->netdev); } else if (config->type == PHYLINK_DEV) { pl->dev = config->dev; } else {
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Andreas Gruenbacher agruenba@redhat.com
commit 0cdc6f44e9fdc2d20d720145bf99a39f611f6d61 upstream.
In gfs2_fill_super(), when mounting a gfs2 filesystem is interrupted, kthread_create() can return -EINTR. When that happens, we roll back what has already been done and abort the mount.
Since commit 62dd0f98a0e5 ("gfs2: Flag a withdraw if init_threads() fails), we are calling gfs2_withdraw_delayed() in gfs2_fill_super(); first via gfs2_make_fs_rw(), then directly. But gfs2_withdraw_delayed() only marks the filesystem as withdrawing and relies on a caller further up the stack to do the actual withdraw, which doesn't exist in the gfs2_fill_super() case. Because the filesystem is marked as withdrawing / withdrawn, function gfs2_lm_unmount() doesn't release the dlm lockspace, so when we try to mount that filesystem again, we get:
gfs2: fsid=gohan:gohan0: Trying to join cluster "lock_dlm", "gohan:gohan0" gfs2: fsid=gohan:gohan0: dlm_new_lockspace error -17
Since commit b77b4a4815a9 ("gfs2: Rework freeze / thaw logic"), the deadlock this gfs2_withdraw_delayed() call was supposed to work around cannot occur anymore because freeze_go_callback() won't take the sb->s_umount semaphore unconditionally anymore, so we can get rid of the gfs2_withdraw_delayed() in gfs2_fill_super() entirely.
Reported-by: Alexander Aring aahringo@redhat.com Signed-off-by: Andreas Gruenbacher agruenba@redhat.com Cc: stable@vger.kernel.org # v6.5+ Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/gfs2/ops_fstype.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-)
--- a/fs/gfs2/ops_fstype.c +++ b/fs/gfs2/ops_fstype.c @@ -1261,10 +1261,8 @@ static int gfs2_fill_super(struct super_
if (!sb_rdonly(sb)) { error = init_threads(sdp); - if (error) { - gfs2_withdraw_delayed(sdp); + if (error) goto fail_per_node; - } }
error = gfs2_freeze_lock_shared(sdp);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Heiner Kallweit hkallweit1@gmail.com
commit f78ca48a8ba9cdec96e8839351e49eec3233b177 upstream.
Currently we set SMBHSTCNT_LAST_BYTE only after the host has started receiving the last byte. If we get e.g. preempted before setting SMBHSTCNT_LAST_BYTE, the host may be finished with receiving the byte before SMBHSTCNT_LAST_BYTE is set. Therefore change the code to set SMBHSTCNT_LAST_BYTE before writing SMBHSTSTS_BYTE_DONE for the byte before the last byte. Now the code is also consistent with what we do in i801_isr_byte_done().
Reported-by: Jean Delvare jdelvare@suse.com Closes: https://lore.kernel.org/linux-i2c/20230828152747.09444625@endymion.delvare/ Cc: stable@vger.kernel.org Acked-by: Andi Shyti andi.shyti@kernel.org Signed-off-by: Heiner Kallweit hkallweit1@gmail.com Reviewed-by: Jean Delvare jdelvare@suse.de Signed-off-by: Wolfram Sang wsa@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/i2c/busses/i2c-i801.c | 19 +++++++++---------- 1 file changed, 9 insertions(+), 10 deletions(-)
--- a/drivers/i2c/busses/i2c-i801.c +++ b/drivers/i2c/busses/i2c-i801.c @@ -681,15 +681,11 @@ static int i801_block_transaction_byte_b return result ? priv->status : -ETIMEDOUT; }
- for (i = 1; i <= len; i++) { - if (i == len && read_write == I2C_SMBUS_READ) - smbcmd |= SMBHSTCNT_LAST_BYTE; - outb_p(smbcmd, SMBHSTCNT(priv)); - - if (i == 1) - outb_p(inb(SMBHSTCNT(priv)) | SMBHSTCNT_START, - SMBHSTCNT(priv)); + if (len == 1 && read_write == I2C_SMBUS_READ) + smbcmd |= SMBHSTCNT_LAST_BYTE; + outb_p(smbcmd | SMBHSTCNT_START, SMBHSTCNT(priv));
+ for (i = 1; i <= len; i++) { status = i801_wait_byte_done(priv); if (status) return status; @@ -712,9 +708,12 @@ static int i801_block_transaction_byte_b data->block[0] = len; }
- /* Retrieve/store value in SMBBLKDAT */ - if (read_write == I2C_SMBUS_READ) + if (read_write == I2C_SMBUS_READ) { data->block[i] = inb_p(SMBBLKDAT(priv)); + if (i == len - 1) + outb_p(smbcmd | SMBHSTCNT_LAST_BYTE, SMBHSTCNT(priv)); + } + if (read_write == I2C_SMBUS_WRITE && i+1 <= len) outb_p(data->block[i+1], SMBBLKDAT(priv));
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jaegeuk Kim jaegeuk@kernel.org
commit 50a472bbc79ff9d5a88be8019a60e936cadf9f13 upstream.
If we return the error, there's no way to recover the status as of now, since fsck does not fix the xattr boundary issue.
Cc: stable@vger.kernel.org Signed-off-by: Jaegeuk Kim jaegeuk@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/f2fs/node.c | 4 +++- fs/f2fs/xattr.c | 20 +++++++++++++------- 2 files changed, 16 insertions(+), 8 deletions(-)
--- a/fs/f2fs/node.c +++ b/fs/f2fs/node.c @@ -2751,7 +2751,9 @@ recover_xnid: f2fs_update_inode_page(inode);
/* 3: update and set xattr node page dirty */ - memcpy(F2FS_NODE(xpage), F2FS_NODE(page), VALID_XATTR_BLOCK_SIZE); + if (page) + memcpy(F2FS_NODE(xpage), F2FS_NODE(page), + VALID_XATTR_BLOCK_SIZE);
set_page_dirty(xpage); f2fs_put_page(xpage, 1); --- a/fs/f2fs/xattr.c +++ b/fs/f2fs/xattr.c @@ -364,10 +364,10 @@ static int lookup_all_xattrs(struct inod
*xe = __find_xattr(cur_addr, last_txattr_addr, NULL, index, len, name); if (!*xe) { - f2fs_err(F2FS_I_SB(inode), "inode (%lu) has corrupted xattr", + f2fs_err(F2FS_I_SB(inode), "lookup inode (%lu) has corrupted xattr", inode->i_ino); set_sbi_flag(F2FS_I_SB(inode), SBI_NEED_FSCK); - err = -EFSCORRUPTED; + err = -ENODATA; f2fs_handle_error(F2FS_I_SB(inode), ERROR_CORRUPTED_XATTR); goto out; @@ -584,13 +584,12 @@ ssize_t f2fs_listxattr(struct dentry *de
if ((void *)(entry) + sizeof(__u32) > last_base_addr || (void *)XATTR_NEXT_ENTRY(entry) > last_base_addr) { - f2fs_err(F2FS_I_SB(inode), "inode (%lu) has corrupted xattr", + f2fs_err(F2FS_I_SB(inode), "list inode (%lu) has corrupted xattr", inode->i_ino); set_sbi_flag(F2FS_I_SB(inode), SBI_NEED_FSCK); - error = -EFSCORRUPTED; f2fs_handle_error(F2FS_I_SB(inode), ERROR_CORRUPTED_XATTR); - goto cleanup; + break; }
if (!prefix) @@ -650,7 +649,7 @@ static int __f2fs_setxattr(struct inode
if (size > MAX_VALUE_LEN(inode)) return -E2BIG; - +retry: error = read_all_xattrs(inode, ipage, &base_addr); if (error) return error; @@ -660,7 +659,14 @@ static int __f2fs_setxattr(struct inode /* find entry with wanted name. */ here = __find_xattr(base_addr, last_base_addr, NULL, index, len, name); if (!here) { - f2fs_err(F2FS_I_SB(inode), "inode (%lu) has corrupted xattr", + if (!F2FS_I(inode)->i_xattr_nid) { + f2fs_notice(F2FS_I_SB(inode), + "recover xattr in inode (%lu)", inode->i_ino); + f2fs_recover_xattr_data(inode, NULL); + kfree(base_addr); + goto retry; + } + f2fs_err(F2FS_I_SB(inode), "set inode (%lu) has corrupted xattr", inode->i_ino); set_sbi_flag(F2FS_I_SB(inode), SBI_NEED_FSCK); error = -EFSCORRUPTED;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jaegeuk Kim jaegeuk@kernel.org
commit f5f3bd903a5d3e3b2ba89f11e0e29db25e60c048 upstream.
Otherwise, we'll get a broken inode.
# touch $FILE # f2fs_io setflags compression $FILE # f2fs_io set_coption 2 8 $FILE
[ 112.227612] F2FS-fs (dm-51): sanity_check_compress_inode: inode (ino=8d3fe) has unsupported compress level: 0, run fsck to fix
Cc: stable@vger.kernel.org Signed-off-by: Jaegeuk Kim jaegeuk@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/f2fs/file.c | 9 +++++++++ 1 file changed, 9 insertions(+)
--- a/fs/f2fs/file.c +++ b/fs/f2fs/file.c @@ -4006,6 +4006,15 @@ static int f2fs_ioc_set_compress_option( F2FS_I(inode)->i_compress_algorithm = option.algorithm; F2FS_I(inode)->i_log_cluster_size = option.log_cluster_size; F2FS_I(inode)->i_cluster_size = BIT(option.log_cluster_size); + /* Set default level */ + if (F2FS_I(inode)->i_compress_algorithm == COMPRESS_ZSTD) + F2FS_I(inode)->i_compress_level = F2FS_ZSTD_DEFAULT_CLEVEL; + else + F2FS_I(inode)->i_compress_level = 0; + /* Adjust mount option level */ + if (option.algorithm == F2FS_OPTION(sbi).compress_algorithm && + F2FS_OPTION(sbi).compress_level) + F2FS_I(inode)->i_compress_level = F2FS_OPTION(sbi).compress_level; f2fs_mark_inode_dirty_sync(inode, true);
if (!f2fs_is_compress_backend_ready(inode))
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Su Hui suhui@nfschina.com
commit e0d4e8acb3789c5a8651061fbab62ca24a45c063 upstream.
With gcc and W=1 option, there's a warning like this:
fs/f2fs/compress.c: In function ‘f2fs_init_page_array_cache’: fs/f2fs/compress.c:1984:47: error: ‘%u’ directive writing between 1 and 7 bytes into a region of size between 5 and 8 [-Werror=format-overflow=] 1984 | sprintf(slab_name, "f2fs_page_array_entry-%u:%u", MAJOR(dev), MINOR(dev)); | ^~
String "f2fs_page_array_entry-%u:%u" can up to 35. The first "%u" can up to 4 and the second "%u" can up to 7, so total size is "24 + 4 + 7 = 35". slab_name's size should be 35 rather than 32.
Cc: stable@vger.kernel.org Signed-off-by: Su Hui suhui@nfschina.com Reviewed-by: Chao Yu chao@kernel.org Signed-off-by: Jaegeuk Kim jaegeuk@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/f2fs/compress.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/fs/f2fs/compress.c +++ b/fs/f2fs/compress.c @@ -1988,7 +1988,7 @@ void f2fs_destroy_compress_inode(struct int f2fs_init_page_array_cache(struct f2fs_sb_info *sbi) { dev_t dev = sbi->sb->s_bdev->bd_dev; - char slab_name[32]; + char slab_name[35];
if (!f2fs_sb_has_compression(sbi)) return 0;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jaegeuk Kim jaegeuk@kernel.org
commit f803982190f0265fd36cf84670aa6daefc2b0768 upstream.
Let's allocate the extent_cache tree without dynamic conditions to avoid a missing condition causing a panic as below.
# create a file w/ a compressed flag # disable the compression # panic while updating extent_cache
F2FS-fs (dm-64): Swapfile: last extent is not aligned to section F2FS-fs (dm-64): Swapfile (3) is not align to section: 1) creat(), 2) ioctl(F2FS_IOC_SET_PIN_FILE), 3) fallocate(2097152 * N) Adding 124996k swap on ./swap-file. Priority:0 extents:2 across:17179494468k ================================================================== BUG: KASAN: null-ptr-deref in instrument_atomic_read_write out/common/include/linux/instrumented.h:101 [inline] BUG: KASAN: null-ptr-deref in atomic_try_cmpxchg_acquire out/common/include/asm-generic/atomic-instrumented.h:705 [inline] BUG: KASAN: null-ptr-deref in queued_write_lock out/common/include/asm-generic/qrwlock.h:92 [inline] BUG: KASAN: null-ptr-deref in __raw_write_lock out/common/include/linux/rwlock_api_smp.h:211 [inline] BUG: KASAN: null-ptr-deref in _raw_write_lock+0x5a/0x110 out/common/kernel/locking/spinlock.c:295 Write of size 4 at addr 0000000000000030 by task syz-executor154/3327
CPU: 0 PID: 3327 Comm: syz-executor154 Tainted: G O 5.10.185 #1 Hardware name: emulation qemu-x86/qemu-x86, BIOS 2023.01-21885-gb3cc1cd24d 01/01/2023 Call Trace: __dump_stack out/common/lib/dump_stack.c:77 [inline] dump_stack_lvl+0x17e/0x1c4 out/common/lib/dump_stack.c:118 __kasan_report+0x16c/0x260 out/common/mm/kasan/report.c:415 kasan_report+0x51/0x70 out/common/mm/kasan/report.c:428 kasan_check_range+0x2f3/0x340 out/common/mm/kasan/generic.c:186 __kasan_check_write+0x14/0x20 out/common/mm/kasan/shadow.c:37 instrument_atomic_read_write out/common/include/linux/instrumented.h:101 [inline] atomic_try_cmpxchg_acquire out/common/include/asm-generic/atomic-instrumented.h:705 [inline] queued_write_lock out/common/include/asm-generic/qrwlock.h:92 [inline] __raw_write_lock out/common/include/linux/rwlock_api_smp.h:211 [inline] _raw_write_lock+0x5a/0x110 out/common/kernel/locking/spinlock.c:295 __drop_extent_tree+0xdf/0x2f0 out/common/fs/f2fs/extent_cache.c:1155 f2fs_drop_extent_tree+0x17/0x30 out/common/fs/f2fs/extent_cache.c:1172 f2fs_insert_range out/common/fs/f2fs/file.c:1600 [inline] f2fs_fallocate+0x19fd/0x1f40 out/common/fs/f2fs/file.c:1764 vfs_fallocate+0x514/0x9b0 out/common/fs/open.c:310 ksys_fallocate out/common/fs/open.c:333 [inline] __do_sys_fallocate out/common/fs/open.c:341 [inline] __se_sys_fallocate out/common/fs/open.c:339 [inline] __x64_sys_fallocate+0xb8/0x100 out/common/fs/open.c:339 do_syscall_64+0x35/0x50 out/common/arch/x86/entry/common.c:46
Cc: stable@vger.kernel.org Fixes: 72840cccc0a1 ("f2fs: allocate the extent_cache by default") Reported-and-tested-by: syzbot+d342e330a37b48c094b7@syzkaller.appspotmail.com Signed-off-by: Jaegeuk Kim jaegeuk@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/f2fs/extent_cache.c | 53 +++++++++++++++++++------------------------------ 1 file changed, 21 insertions(+), 32 deletions(-)
--- a/fs/f2fs/extent_cache.c +++ b/fs/f2fs/extent_cache.c @@ -74,40 +74,14 @@ static void __set_extent_info(struct ext } }
-static bool __may_read_extent_tree(struct inode *inode) -{ - struct f2fs_sb_info *sbi = F2FS_I_SB(inode); - - if (!test_opt(sbi, READ_EXTENT_CACHE)) - return false; - if (is_inode_flag_set(inode, FI_NO_EXTENT)) - return false; - if (is_inode_flag_set(inode, FI_COMPRESSED_FILE) && - !f2fs_sb_has_readonly(sbi)) - return false; - return S_ISREG(inode->i_mode); -} - -static bool __may_age_extent_tree(struct inode *inode) -{ - struct f2fs_sb_info *sbi = F2FS_I_SB(inode); - - if (!test_opt(sbi, AGE_EXTENT_CACHE)) - return false; - if (is_inode_flag_set(inode, FI_COMPRESSED_FILE)) - return false; - if (file_is_cold(inode)) - return false; - - return S_ISREG(inode->i_mode) || S_ISDIR(inode->i_mode); -} - static bool __init_may_extent_tree(struct inode *inode, enum extent_type type) { if (type == EX_READ) - return __may_read_extent_tree(inode); - else if (type == EX_BLOCK_AGE) - return __may_age_extent_tree(inode); + return test_opt(F2FS_I_SB(inode), READ_EXTENT_CACHE) && + S_ISREG(inode->i_mode); + if (type == EX_BLOCK_AGE) + return test_opt(F2FS_I_SB(inode), AGE_EXTENT_CACHE) && + (S_ISREG(inode->i_mode) || S_ISDIR(inode->i_mode)); return false; }
@@ -120,7 +94,22 @@ static bool __may_extent_tree(struct ino if (list_empty(&F2FS_I_SB(inode)->s_list)) return false;
- return __init_may_extent_tree(inode, type); + if (!__init_may_extent_tree(inode, type)) + return false; + + if (type == EX_READ) { + if (is_inode_flag_set(inode, FI_NO_EXTENT)) + return false; + if (is_inode_flag_set(inode, FI_COMPRESSED_FILE) && + !f2fs_sb_has_readonly(F2FS_I_SB(inode))) + return false; + } else if (type == EX_BLOCK_AGE) { + if (is_inode_flag_set(inode, FI_COMPRESSED_FILE)) + return false; + if (file_is_cold(inode)) + return false; + } + return true; }
static void __try_update_largest_extent(struct extent_tree *et,
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sean Young sean@mess.org
commit c8a489f820179fb12251e262b50303c29de991ac upstream.
When transmitting, infrared drivers expect an odd number of samples; iow without a trailing space. No problems have been observed so far, so this is just belt and braces.
Fixes: 9b6192589be7 ("media: lirc: implement scancode sending") Cc: stable@vger.kernel.org Signed-off-by: Sean Young sean@mess.org Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/media/rc/lirc_dev.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
--- a/drivers/media/rc/lirc_dev.c +++ b/drivers/media/rc/lirc_dev.c @@ -276,7 +276,11 @@ static ssize_t lirc_transmit(struct file if (ret < 0) goto out_kfree_raw;
- count = ret; + /* drop trailing space */ + if (!(ret % 2)) + count = ret - 1; + else + count = ret;
txbuf = kmalloc_array(count, sizeof(unsigned int), GFP_KERNEL); if (!txbuf) {
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sean Young sean@mess.org
commit 4f7efc71891462ab7606da7039f480d7c1584a13 upstream.
The Sharp protocol[1] encoding has incorrect timings for bit space.
[1] https://www.sbprojects.net/knowledge/ir/sharp.php
Fixes: d35afc5fe097 ("[media] rc: ir-sharp-decoder: Add encode capability") Cc: stable@vger.kernel.org Reported-by: Joe Ferner joe.m.ferner@gmail.com Closes: https://sourceforge.net/p/lirc/mailman/message/38604507/ Signed-off-by: Sean Young sean@mess.org Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/media/rc/ir-sharp-decoder.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-)
--- a/drivers/media/rc/ir-sharp-decoder.c +++ b/drivers/media/rc/ir-sharp-decoder.c @@ -15,7 +15,9 @@ #define SHARP_UNIT 40 /* us */ #define SHARP_BIT_PULSE (8 * SHARP_UNIT) /* 320us */ #define SHARP_BIT_0_PERIOD (25 * SHARP_UNIT) /* 1ms (680us space) */ -#define SHARP_BIT_1_PERIOD (50 * SHARP_UNIT) /* 2ms (1680ms space) */ +#define SHARP_BIT_1_PERIOD (50 * SHARP_UNIT) /* 2ms (1680us space) */ +#define SHARP_BIT_0_SPACE (17 * SHARP_UNIT) /* 680us space */ +#define SHARP_BIT_1_SPACE (42 * SHARP_UNIT) /* 1680us space */ #define SHARP_ECHO_SPACE (1000 * SHARP_UNIT) /* 40 ms */ #define SHARP_TRAILER_SPACE (125 * SHARP_UNIT) /* 5 ms (even longer) */
@@ -168,8 +170,8 @@ static const struct ir_raw_timings_pd ir .header_pulse = 0, .header_space = 0, .bit_pulse = SHARP_BIT_PULSE, - .bit_space[0] = SHARP_BIT_0_PERIOD, - .bit_space[1] = SHARP_BIT_1_PERIOD, + .bit_space[0] = SHARP_BIT_0_SPACE, + .bit_space[1] = SHARP_BIT_1_SPACE, .trailer_pulse = SHARP_BIT_PULSE, .trailer_space = SHARP_ECHO_SPACE, .msb_first = 1,
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Vikash Garodia quic_vgarodia@quicinc.com
commit 0768a9dd809ef52440b5df7dce5a1c1c7e97abbd upstream.
Supported codec bitmask is populated from the payload from venus firmware. There is a possible case when all the bits in the codec bitmask is set. In such case, core cap for decoder is filled and MAX_CODEC_NUM is utilized. Now while filling the caps for encoder, it can lead to access the caps array beyong 32 index. Hence leading to OOB write. The fix counts the supported encoder and decoder. If the count is more than max, then it skips accessing the caps.
Cc: stable@vger.kernel.org Fixes: 1a73374a04e5 ("media: venus: hfi_parser: add common capability parser") Signed-off-by: Vikash Garodia quic_vgarodia@quicinc.com Signed-off-by: Stanimir Varbanov stanimir.k.varbanov@gmail.com Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/media/platform/qcom/venus/hfi_parser.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/drivers/media/platform/qcom/venus/hfi_parser.c +++ b/drivers/media/platform/qcom/venus/hfi_parser.c @@ -19,6 +19,9 @@ static void init_codecs(struct venus_cor struct hfi_plat_caps *caps = core->caps, *cap; unsigned long bit;
+ if (hweight_long(core->dec_codecs) + hweight_long(core->enc_codecs) > MAX_CODEC_NUM) + return; + for_each_set_bit(bit, &core->dec_codecs, MAX_CODEC_NUM) { cap = &caps[core->codecs_count++]; cap->codec = BIT(bit);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Vikash Garodia quic_vgarodia@quicinc.com
commit b18e36dfd6c935da60a971310374f3dfec3c82e1 upstream.
Buffer requirement, for different buffer type, comes from video firmware. While copying these requirements, there is an OOB possibility when the payload from firmware is more than expected size. Fix the check to avoid the OOB possibility.
Cc: stable@vger.kernel.org Fixes: 09c2845e8fe4 ("[media] media: venus: hfi: add Host Firmware Interface (HFI)") Reviewed-by: Nathan Hebert nhebert@chromium.org Signed-off-by: Vikash Garodia quic_vgarodia@quicinc.com Signed-off-by: Stanimir Varbanov stanimir.k.varbanov@gmail.com Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/media/platform/qcom/venus/hfi_msgs.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/media/platform/qcom/venus/hfi_msgs.c +++ b/drivers/media/platform/qcom/venus/hfi_msgs.c @@ -398,7 +398,7 @@ session_get_prop_buf_req(struct hfi_msg_ memcpy(&bufreq[idx], buf_req, sizeof(*bufreq)); idx++;
- if (idx > HFI_BUFFER_TYPE_MAX) + if (idx >= HFI_BUFFER_TYPE_MAX) return HFI_ERR_SESSION_INVALID_PARAMETER;
req_bytes -= sizeof(struct hfi_buffer_requirements);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Vikash Garodia quic_vgarodia@quicinc.com
commit 8d0b89398b7ebc52103e055bf36b60b045f5258f upstream.
The hfi parser, parses the capabilities received from venus firmware and copies them to core capabilities. Consider below api, for example, fill_caps - In this api, caps in core structure gets updated with the number of capabilities received in firmware data payload. If the same api is called multiple times, there is a possibility of copying beyond the max allocated size in core caps. Similar possibilities in fill_raw_fmts and fill_profile_level functions.
Cc: stable@vger.kernel.org Fixes: 1a73374a04e5 ("media: venus: hfi_parser: add common capability parser") Signed-off-by: Vikash Garodia quic_vgarodia@quicinc.com Signed-off-by: Stanimir Varbanov stanimir.k.varbanov@gmail.com Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/media/platform/qcom/venus/hfi_parser.c | 12 ++++++++++++ 1 file changed, 12 insertions(+)
--- a/drivers/media/platform/qcom/venus/hfi_parser.c +++ b/drivers/media/platform/qcom/venus/hfi_parser.c @@ -89,6 +89,9 @@ static void fill_profile_level(struct hf { const struct hfi_profile_level *pl = data;
+ if (cap->num_pl + num >= HFI_MAX_PROFILE_COUNT) + return; + memcpy(&cap->pl[cap->num_pl], pl, num * sizeof(*pl)); cap->num_pl += num; } @@ -114,6 +117,9 @@ fill_caps(struct hfi_plat_caps *cap, con { const struct hfi_capability *caps = data;
+ if (cap->num_caps + num >= MAX_CAP_ENTRIES) + return; + memcpy(&cap->caps[cap->num_caps], caps, num * sizeof(*caps)); cap->num_caps += num; } @@ -140,6 +146,9 @@ static void fill_raw_fmts(struct hfi_pla { const struct raw_formats *formats = fmts;
+ if (cap->num_fmts + num_fmts >= MAX_FMT_ENTRIES) + return; + memcpy(&cap->fmts[cap->num_fmts], formats, num_fmts * sizeof(*formats)); cap->num_fmts += num_fmts; } @@ -162,6 +171,9 @@ parse_raw_formats(struct venus_core *cor rawfmts[i].buftype = fmt->buffer_type; i++;
+ if (i >= MAX_FMT_ENTRIES) + return; + if (pinfo->num_planes > MAX_PLANES) break;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sakari Ailus sakari.ailus@linux.intel.com
commit 724ff68e968b19d786870d333f9952bdd6b119cb upstream.
Initialise the try sink compose rectangle size to the sink compose rectangle for binner and scaler sub-devices. This was missed due to the faulty condition that lead to the compose rectangles to be initialised for the pixel array sub-device where it is not relevant.
Fixes: ccfc97bdb5ae ("[media] smiapp: Add driver") Cc: stable@vger.kernel.org Signed-off-by: Sakari Ailus sakari.ailus@linux.intel.com Reviewed-by: Laurent Pinchart laurent.pinchart@ideasonboard.com Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/media/i2c/ccs/ccs-core.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/media/i2c/ccs/ccs-core.c +++ b/drivers/media/i2c/ccs/ccs-core.c @@ -3097,7 +3097,7 @@ static int ccs_open(struct v4l2_subdev * try_fmt->code = sensor->internal_csi_format->code; try_fmt->field = V4L2_FIELD_NONE;
- if (ssd != sensor->pixel_array) + if (ssd == sensor->pixel_array) continue;
try_comp = v4l2_subdev_get_try_compose(sd, fh->state, i);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jani Nikula jani.nikula@intel.com
commit dab12fa8d2bd3868cf2de485ed15a3feef28a13d upstream.
The sads returned by drm_edid_to_sad() needs to be freed.
Fixes: e71a8ebbe086 ("drm/mediatek: dp: Audio support for MT8195") Cc: Guillaume Ranquet granquet@baylibre.com Cc: Bo-Chen Chen rex-bc.chen@mediatek.com Cc: AngeloGioacchino Del Regno angelogioacchino.delregno@collabora.com Cc: Dmitry Osipenko dmitry.osipenko@collabora.com Cc: Chun-Kuang Hu chunkuang.hu@kernel.org Cc: Philipp Zabel p.zabel@pengutronix.de Cc: Matthias Brugger matthias.bgg@gmail.com Cc: dri-devel@lists.freedesktop.org Cc: linux-mediatek@lists.infradead.org Cc: linux-kernel@vger.kernel.org Cc: linux-arm-kernel@lists.infradead.org Cc: stable@vger.kernel.org # v6.1+ Signed-off-by: Jani Nikula jani.nikula@intel.com Reviewed-by: Chen-Yu Tsai wenst@chromium.org Link: https://patchwork.kernel.org/project/dri-devel/patch/20230914155317.2511876-... Signed-off-by: Chun-Kuang Hu chunkuang.hu@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/mediatek/mtk_dp.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
--- a/drivers/gpu/drm/mediatek/mtk_dp.c +++ b/drivers/gpu/drm/mediatek/mtk_dp.c @@ -1983,7 +1983,6 @@ static struct edid *mtk_dp_get_edid(stru bool enabled = mtk_dp->enabled; struct edid *new_edid = NULL; struct mtk_dp_audio_cfg *audio_caps = &mtk_dp->info.audio_cur_cfg; - struct cea_sad *sads;
if (!enabled) { drm_atomic_bridge_chain_pre_enable(bridge, connector->state->state); @@ -2010,7 +2009,11 @@ static struct edid *mtk_dp_get_edid(stru }
if (new_edid) { + struct cea_sad *sads; + audio_caps->sad_count = drm_edid_to_sad(new_edid, &sads); + kfree(sads); + audio_caps->detect_monitor = drm_detect_monitor_audio(new_edid); }
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jani Nikula jani.nikula@intel.com
commit fcaf9761fd5884a64eaac48536f8c27ecfd2e6bc upstream.
Setting new_edid to NULL leaks the buffer.
Fixes: f70ac097a2cf ("drm/mediatek: Add MT8195 Embedded DisplayPort driver") Cc: Markus Schneider-Pargmann msp@baylibre.com Cc: Guillaume Ranquet granquet@baylibre.com Cc: Bo-Chen Chen rex-bc.chen@mediatek.com Cc: CK Hu ck.hu@mediatek.com Cc: AngeloGioacchino Del Regno angelogioacchino.delregno@collabora.com Cc: Dmitry Osipenko dmitry.osipenko@collabora.com Cc: Chun-Kuang Hu chunkuang.hu@kernel.org Cc: Philipp Zabel p.zabel@pengutronix.de Cc: Matthias Brugger matthias.bgg@gmail.com Cc: dri-devel@lists.freedesktop.org Cc: linux-mediatek@lists.infradead.org Cc: linux-kernel@vger.kernel.org Cc: linux-arm-kernel@lists.infradead.org Cc: stable@vger.kernel.org # v6.1+ Signed-off-by: Jani Nikula jani.nikula@intel.com Reviewed-by: Guillaume Ranquet granquet@baylibre.com Link: https://patchwork.kernel.org/project/dri-devel/patch/20230914131058.2472260-... Signed-off-by: Chun-Kuang Hu chunkuang.hu@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/mediatek/mtk_dp.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/gpu/drm/mediatek/mtk_dp.c +++ b/drivers/gpu/drm/mediatek/mtk_dp.c @@ -2005,6 +2005,7 @@ static struct edid *mtk_dp_get_edid(stru */ if (mtk_dp_parse_capabilities(mtk_dp)) { drm_err(mtk_dp->drm_dev, "Can't parse capabilities\n"); + kfree(new_edid); new_edid = NULL; }
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mikulas Patocka mpatocka@redhat.com
commit 2a695062a5a42aead8c539a344168d4806b3fda2 upstream.
dm-bufio has a no-sleep mode. When activated (with the DM_BUFIO_CLIENT_NO_SLEEP flag), the bufio client is read-only and we could call dm_bufio_get from tasklets. This is used by dm-verity.
Unfortunately, commit 450e8dee51aa ("dm bufio: improve concurrent IO performance") broke this and the kernel would warn that cache_get() was calling down_read() from no-sleeping context. The bug can be reproduced by using "veritysetup open" with the "--use-tasklets" flag.
This commit fixes dm-bufio, so that the tasklet mode works again, by expanding use of the 'no_sleep_enabled' static_key to conditionally use either a rw_semaphore or rwlock_t (which are colocated in the buffer_tree structure using a union).
Signed-off-by: Mikulas Patocka mpatocka@redhat.com Cc: stable@vger.kernel.org # v6.4 Fixes: 450e8dee51aa ("dm bufio: improve concurrent IO performance") Signed-off-by: Mike Snitzer snitzer@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/md/dm-bufio.c | 87 ++++++++++++++++++++++++++++++------------- 1 file changed, 62 insertions(+), 25 deletions(-)
diff --git a/drivers/md/dm-bufio.c b/drivers/md/dm-bufio.c index 62eb27639c9b..f03d7dba270c 100644 --- a/drivers/md/dm-bufio.c +++ b/drivers/md/dm-bufio.c @@ -254,7 +254,7 @@ enum evict_result {
typedef enum evict_result (*le_predicate)(struct lru_entry *le, void *context);
-static struct lru_entry *lru_evict(struct lru *lru, le_predicate pred, void *context) +static struct lru_entry *lru_evict(struct lru *lru, le_predicate pred, void *context, bool no_sleep) { unsigned long tested = 0; struct list_head *h = lru->cursor; @@ -295,7 +295,8 @@ static struct lru_entry *lru_evict(struct lru *lru, le_predicate pred, void *con
h = h->next;
- cond_resched(); + if (!no_sleep) + cond_resched(); }
return NULL; @@ -382,7 +383,10 @@ struct dm_buffer { */
struct buffer_tree { - struct rw_semaphore lock; + union { + struct rw_semaphore lock; + rwlock_t spinlock; + } u; struct rb_root root; } ____cacheline_aligned_in_smp;
@@ -393,9 +397,12 @@ struct dm_buffer_cache { * on the locks. */ unsigned int num_locks; + bool no_sleep; struct buffer_tree trees[]; };
+static DEFINE_STATIC_KEY_FALSE(no_sleep_enabled); + static inline unsigned int cache_index(sector_t block, unsigned int num_locks) { return dm_hash_locks_index(block, num_locks); @@ -403,22 +410,34 @@ static inline unsigned int cache_index(sector_t block, unsigned int num_locks)
static inline void cache_read_lock(struct dm_buffer_cache *bc, sector_t block) { - down_read(&bc->trees[cache_index(block, bc->num_locks)].lock); + if (static_branch_unlikely(&no_sleep_enabled) && bc->no_sleep) + read_lock_bh(&bc->trees[cache_index(block, bc->num_locks)].u.spinlock); + else + down_read(&bc->trees[cache_index(block, bc->num_locks)].u.lock); }
static inline void cache_read_unlock(struct dm_buffer_cache *bc, sector_t block) { - up_read(&bc->trees[cache_index(block, bc->num_locks)].lock); + if (static_branch_unlikely(&no_sleep_enabled) && bc->no_sleep) + read_unlock_bh(&bc->trees[cache_index(block, bc->num_locks)].u.spinlock); + else + up_read(&bc->trees[cache_index(block, bc->num_locks)].u.lock); }
static inline void cache_write_lock(struct dm_buffer_cache *bc, sector_t block) { - down_write(&bc->trees[cache_index(block, bc->num_locks)].lock); + if (static_branch_unlikely(&no_sleep_enabled) && bc->no_sleep) + write_lock_bh(&bc->trees[cache_index(block, bc->num_locks)].u.spinlock); + else + down_write(&bc->trees[cache_index(block, bc->num_locks)].u.lock); }
static inline void cache_write_unlock(struct dm_buffer_cache *bc, sector_t block) { - up_write(&bc->trees[cache_index(block, bc->num_locks)].lock); + if (static_branch_unlikely(&no_sleep_enabled) && bc->no_sleep) + write_unlock_bh(&bc->trees[cache_index(block, bc->num_locks)].u.spinlock); + else + up_write(&bc->trees[cache_index(block, bc->num_locks)].u.lock); }
/* @@ -442,18 +461,32 @@ static void lh_init(struct lock_history *lh, struct dm_buffer_cache *cache, bool
static void __lh_lock(struct lock_history *lh, unsigned int index) { - if (lh->write) - down_write(&lh->cache->trees[index].lock); - else - down_read(&lh->cache->trees[index].lock); + if (lh->write) { + if (static_branch_unlikely(&no_sleep_enabled) && lh->cache->no_sleep) + write_lock_bh(&lh->cache->trees[index].u.spinlock); + else + down_write(&lh->cache->trees[index].u.lock); + } else { + if (static_branch_unlikely(&no_sleep_enabled) && lh->cache->no_sleep) + read_lock_bh(&lh->cache->trees[index].u.spinlock); + else + down_read(&lh->cache->trees[index].u.lock); + } }
static void __lh_unlock(struct lock_history *lh, unsigned int index) { - if (lh->write) - up_write(&lh->cache->trees[index].lock); - else - up_read(&lh->cache->trees[index].lock); + if (lh->write) { + if (static_branch_unlikely(&no_sleep_enabled) && lh->cache->no_sleep) + write_unlock_bh(&lh->cache->trees[index].u.spinlock); + else + up_write(&lh->cache->trees[index].u.lock); + } else { + if (static_branch_unlikely(&no_sleep_enabled) && lh->cache->no_sleep) + read_unlock_bh(&lh->cache->trees[index].u.spinlock); + else + up_read(&lh->cache->trees[index].u.lock); + } }
/* @@ -502,14 +535,18 @@ static struct dm_buffer *list_to_buffer(struct list_head *l) return le_to_buffer(le); }
-static void cache_init(struct dm_buffer_cache *bc, unsigned int num_locks) +static void cache_init(struct dm_buffer_cache *bc, unsigned int num_locks, bool no_sleep) { unsigned int i;
bc->num_locks = num_locks; + bc->no_sleep = no_sleep;
for (i = 0; i < bc->num_locks; i++) { - init_rwsem(&bc->trees[i].lock); + if (no_sleep) + rwlock_init(&bc->trees[i].u.spinlock); + else + init_rwsem(&bc->trees[i].u.lock); bc->trees[i].root = RB_ROOT; }
@@ -648,7 +685,7 @@ static struct dm_buffer *__cache_evict(struct dm_buffer_cache *bc, int list_mode struct lru_entry *le; struct dm_buffer *b;
- le = lru_evict(&bc->lru[list_mode], __evict_pred, &w); + le = lru_evict(&bc->lru[list_mode], __evict_pred, &w, bc->no_sleep); if (!le) return NULL;
@@ -702,7 +739,7 @@ static void __cache_mark_many(struct dm_buffer_cache *bc, int old_mode, int new_ struct evict_wrapper w = {.lh = lh, .pred = pred, .context = context};
while (true) { - le = lru_evict(&bc->lru[old_mode], __evict_pred, &w); + le = lru_evict(&bc->lru[old_mode], __evict_pred, &w, bc->no_sleep); if (!le) break;
@@ -915,10 +952,11 @@ static void cache_remove_range(struct dm_buffer_cache *bc, { unsigned int i;
+ BUG_ON(bc->no_sleep); for (i = 0; i < bc->num_locks; i++) { - down_write(&bc->trees[i].lock); + down_write(&bc->trees[i].u.lock); __remove_range(bc, &bc->trees[i].root, begin, end, pred, release); - up_write(&bc->trees[i].lock); + up_write(&bc->trees[i].u.lock); } }
@@ -979,8 +1017,6 @@ struct dm_bufio_client { struct dm_buffer_cache cache; /* must be last member */ };
-static DEFINE_STATIC_KEY_FALSE(no_sleep_enabled); - /*----------------------------------------------------------------*/
#define dm_bufio_in_request() (!!current->bio_list) @@ -1871,7 +1907,8 @@ static void *new_read(struct dm_bufio_client *c, sector_t block, if (need_submit) submit_io(b, REQ_OP_READ, read_endio);
- wait_on_bit_io(&b->state, B_READING, TASK_UNINTERRUPTIBLE); + if (nf != NF_GET) /* we already tested this condition above */ + wait_on_bit_io(&b->state, B_READING, TASK_UNINTERRUPTIBLE);
if (b->read_error) { int error = blk_status_to_errno(b->read_error); @@ -2421,7 +2458,7 @@ struct dm_bufio_client *dm_bufio_client_create(struct block_device *bdev, unsign r = -ENOMEM; goto bad_client; } - cache_init(&c->cache, num_locks); + cache_init(&c->cache, num_locks, (flags & DM_BUFIO_CLIENT_NO_SLEEP) != 0);
c->bdev = bdev; c->block_size = block_size;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mikulas Patocka mpatocka@redhat.com
commit 28f07f2ab4b3a2714f1fefcc58ada4bcc195f806 upstream.
The commit 5721d4e5a9cd enhanced dm-verity, so that it can verify blocks from tasklets rather than from workqueues. This reportedly improves performance significantly.
However, dm-verity was using the flag CRYPTO_TFM_REQ_MAY_SLEEP from tasklets which resulted in warnings about sleeping function being called from non-sleeping context.
BUG: sleeping function called from invalid context at crypto/internal.h:206 in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 14, name: ksoftirqd/0 preempt_count: 100, expected: 0 RCU nest depth: 0, expected: 0 CPU: 0 PID: 14 Comm: ksoftirqd/0 Tainted: G W 6.7.0-rc1 #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014 Call Trace: <TASK> dump_stack_lvl+0x32/0x50 __might_resched+0x110/0x160 crypto_hash_walk_done+0x54/0xb0 shash_ahash_update+0x51/0x60 verity_hash_update.isra.0+0x4a/0x130 [dm_verity] verity_verify_io+0x165/0x550 [dm_verity] ? free_unref_page+0xdf/0x170 ? psi_group_change+0x113/0x390 verity_tasklet+0xd/0x70 [dm_verity] tasklet_action_common.isra.0+0xb3/0xc0 __do_softirq+0xaf/0x1ec ? smpboot_thread_fn+0x1d/0x200 ? sort_range+0x20/0x20 run_ksoftirqd+0x15/0x30 smpboot_thread_fn+0xed/0x200 kthread+0xdc/0x110 ? kthread_complete_and_exit+0x20/0x20 ret_from_fork+0x28/0x40 ? kthread_complete_and_exit+0x20/0x20 ret_from_fork_asm+0x11/0x20 </TASK>
This commit fixes dm-verity so that it doesn't use the flags CRYPTO_TFM_REQ_MAY_SLEEP and CRYPTO_TFM_REQ_MAY_BACKLOG from tasklets. The crypto API would do GFP_ATOMIC allocation instead, it could return -ENOMEM and we catch -ENOMEM in verity_tasklet and requeue the request to the workqueue.
Signed-off-by: Mikulas Patocka mpatocka@redhat.com Cc: stable@vger.kernel.org # v6.0+ Fixes: 5721d4e5a9cd ("dm verity: Add optional "try_verify_in_tasklet" feature") Signed-off-by: Mike Snitzer snitzer@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/md/dm-verity-fec.c | 4 ++-- drivers/md/dm-verity-target.c | 23 ++++++++++++----------- drivers/md/dm-verity.h | 2 +- 3 files changed, 15 insertions(+), 14 deletions(-)
--- a/drivers/md/dm-verity-fec.c +++ b/drivers/md/dm-verity-fec.c @@ -185,7 +185,7 @@ static int fec_is_erasure(struct dm_veri { if (unlikely(verity_hash(v, verity_io_hash_req(v, io), data, 1 << v->data_dev_block_bits, - verity_io_real_digest(v, io)))) + verity_io_real_digest(v, io), true))) return 0;
return memcmp(verity_io_real_digest(v, io), want_digest, @@ -386,7 +386,7 @@ static int fec_decode_rsb(struct dm_veri /* Always re-validate the corrected block against the expected hash */ r = verity_hash(v, verity_io_hash_req(v, io), fio->output, 1 << v->data_dev_block_bits, - verity_io_real_digest(v, io)); + verity_io_real_digest(v, io), true); if (unlikely(r < 0)) return r;
--- a/drivers/md/dm-verity-target.c +++ b/drivers/md/dm-verity-target.c @@ -135,20 +135,21 @@ static int verity_hash_update(struct dm_ * Wrapper for crypto_ahash_init, which handles verity salting. */ static int verity_hash_init(struct dm_verity *v, struct ahash_request *req, - struct crypto_wait *wait) + struct crypto_wait *wait, bool may_sleep) { int r;
ahash_request_set_tfm(req, v->tfm); - ahash_request_set_callback(req, CRYPTO_TFM_REQ_MAY_SLEEP | - CRYPTO_TFM_REQ_MAY_BACKLOG, - crypto_req_done, (void *)wait); + ahash_request_set_callback(req, + may_sleep ? CRYPTO_TFM_REQ_MAY_SLEEP | CRYPTO_TFM_REQ_MAY_BACKLOG : 0, + crypto_req_done, (void *)wait); crypto_init_wait(wait);
r = crypto_wait_req(crypto_ahash_init(req), wait);
if (unlikely(r < 0)) { - DMERR("crypto_ahash_init failed: %d", r); + if (r != -ENOMEM) + DMERR("crypto_ahash_init failed: %d", r); return r; }
@@ -179,12 +180,12 @@ out: }
int verity_hash(struct dm_verity *v, struct ahash_request *req, - const u8 *data, size_t len, u8 *digest) + const u8 *data, size_t len, u8 *digest, bool may_sleep) { int r; struct crypto_wait wait;
- r = verity_hash_init(v, req, &wait); + r = verity_hash_init(v, req, &wait, may_sleep); if (unlikely(r < 0)) goto out;
@@ -322,7 +323,7 @@ static int verity_verify_level(struct dm
r = verity_hash(v, verity_io_hash_req(v, io), data, 1 << v->hash_dev_block_bits, - verity_io_real_digest(v, io)); + verity_io_real_digest(v, io), !io->in_tasklet); if (unlikely(r < 0)) goto release_ret_r;
@@ -556,7 +557,7 @@ static int verity_verify_io(struct dm_ve continue; }
- r = verity_hash_init(v, req, &wait); + r = verity_hash_init(v, req, &wait, !io->in_tasklet); if (unlikely(r < 0)) return r;
@@ -652,7 +653,7 @@ static void verity_tasklet(unsigned long
io->in_tasklet = true; err = verity_verify_io(io); - if (err == -EAGAIN) { + if (err == -EAGAIN || err == -ENOMEM) { /* fallback to retrying with work-queue */ INIT_WORK(&io->work, verity_work); queue_work(io->v->verify_wq, &io->work); @@ -1033,7 +1034,7 @@ static int verity_alloc_zero_digest(stru goto out;
r = verity_hash(v, req, zero_data, 1 << v->data_dev_block_bits, - v->zero_digest); + v->zero_digest, true);
out: kfree(req); --- a/drivers/md/dm-verity.h +++ b/drivers/md/dm-verity.h @@ -128,7 +128,7 @@ extern int verity_for_bv_block(struct dm u8 *data, size_t len));
extern int verity_hash(struct dm_verity *v, struct ahash_request *req, - const u8 *data, size_t len, u8 *digest); + const u8 *data, size_t len, u8 *digest, bool may_sleep);
extern int verity_hash_for_block(struct dm_verity *v, struct dm_verity_io *io, sector_t block, u8 *digest, bool *is_zero);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mahmoud Adam mngyadam@amazon.com
commit bc1b5acb40201a0746d68a7d7cfc141899937f4f upstream.
seq_release should be called to free the allocated seq_file
Cc: stable@vger.kernel.org # v5.3+ Signed-off-by: Mahmoud Adam mngyadam@amazon.com Reviewed-by: Jeff Layton jlayton@kernel.org Fixes: 78599c42ae3c ("nfsd4: add file to display list of client's opens") Reviewed-by: NeilBrown neilb@suse.de Tested-by: Jeff Layton jlayton@kernel.org Signed-off-by: Chuck Lever chuck.lever@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/nfsd/nfs4state.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/fs/nfsd/nfs4state.c +++ b/fs/nfsd/nfs4state.c @@ -2785,7 +2785,7 @@ static int client_opens_release(struct i
/* XXX: alternatively, we could get/drop in seq start/stop */ drop_client(clp); - return 0; + return seq_release(inode, file); }
static const struct file_operations client_states_fops = {
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Chuck Lever chuck.lever@oracle.com
commit 49cecd8628a9855cd993792a0377559ea32d5e7c upstream.
When inserting a DRC-cached response into the reply buffer, ensure that the reply buffer's xdr_stream is updated properly. Otherwise the server will send a garbage response.
Cc: stable@vger.kernel.org # v6.3+ Reviewed-by: Jeff Layton jlayton@kernel.org Tested-by: Jeff Layton jlayton@kernel.org Signed-off-by: Chuck Lever chuck.lever@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/nfsd/nfscache.c | 21 +++++++-------------- 1 file changed, 7 insertions(+), 14 deletions(-)
--- a/fs/nfsd/nfscache.c +++ b/fs/nfsd/nfscache.c @@ -582,24 +582,17 @@ void nfsd_cache_update(struct svc_rqst * return; }
-/* - * Copy cached reply to current reply buffer. Should always fit. - * FIXME as reply is in a page, we should just attach the page, and - * keep a refcount.... - */ static int nfsd_cache_append(struct svc_rqst *rqstp, struct kvec *data) { - struct kvec *vec = &rqstp->rq_res.head[0]; + __be32 *p;
- if (vec->iov_len + data->iov_len > PAGE_SIZE) { - printk(KERN_WARNING "nfsd: cached reply too large (%zd).\n", - data->iov_len); - return 0; - } - memcpy((char*)vec->iov_base + vec->iov_len, data->iov_base, data->iov_len); - vec->iov_len += data->iov_len; - return 1; + p = xdr_reserve_space(&rqstp->rq_res_stream, data->iov_len); + if (unlikely(!p)) + return false; + memcpy(p, data->iov_base, data->iov_len); + xdr_commit_encode(&rqstp->rq_res_stream); + return true; }
/*
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nathan Chancellor nathan@kernel.org
commit 71945968d8b128c955204baa33ec03bdd91bdc26 upstream.
A recent change to the optimization pipeline in LLVM reveals some fragility around the inlining of LoongArch's __percpu functions, which manifests as a BUILD_BUG() failure:
In file included from kernel/sched/build_policy.c:17: In file included from include/linux/sched/cputime.h:5: In file included from include/linux/sched/signal.h:5: In file included from include/linux/rculist.h:11: In file included from include/linux/rcupdate.h:26: In file included from include/linux/irqflags.h:18: arch/loongarch/include/asm/percpu.h:97:3: error: call to '__compiletime_assert_51' declared with 'error' attribute: BUILD_BUG failed 97 | BUILD_BUG(); | ^ include/linux/build_bug.h:59:21: note: expanded from macro 'BUILD_BUG' 59 | #define BUILD_BUG() BUILD_BUG_ON_MSG(1, "BUILD_BUG failed") | ^ include/linux/build_bug.h:39:37: note: expanded from macro 'BUILD_BUG_ON_MSG' 39 | #define BUILD_BUG_ON_MSG(cond, msg) compiletime_assert(!(cond), msg) | ^ include/linux/compiler_types.h:425:2: note: expanded from macro 'compiletime_assert' 425 | _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__) | ^ include/linux/compiler_types.h:413:2: note: expanded from macro '_compiletime_assert' 413 | __compiletime_assert(condition, msg, prefix, suffix) | ^ include/linux/compiler_types.h:406:4: note: expanded from macro '__compiletime_assert' 406 | prefix ## suffix(); \ | ^ <scratch space>:86:1: note: expanded from here 86 | __compiletime_assert_51 | ^ 1 error generated.
If these functions are not inlined (which the compiler is free to do even with functions marked with the standard 'inline' keyword), the BUILD_BUG() in the default case cannot be eliminated since the compiler cannot prove it is never used, resulting in a build failure due to the error attribute.
Mark these functions as __always_inline to guarantee inlining so that the BUILD_BUG() only triggers when the default case genuinely cannot be eliminated due to an unexpected size.
Cc: stable@vger.kernel.org Closes: https://github.com/ClangBuiltLinux/linux/issues/1955 Fixes: 46859ac8af52 ("LoongArch: Add multi-processor (SMP) support") Link: https://github.com/llvm/llvm-project/commit/1a2e77cf9e11dbf56b5720c607313a56... Suggested-by: Nick Desaulniers ndesaulniers@google.com Signed-off-by: Nathan Chancellor nathan@kernel.org Signed-off-by: Huacai Chen chenhuacai@loongson.cn Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/loongarch/include/asm/percpu.h | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-)
--- a/arch/loongarch/include/asm/percpu.h +++ b/arch/loongarch/include/asm/percpu.h @@ -32,7 +32,7 @@ static inline void set_my_cpu_offset(uns #define __my_cpu_offset __my_cpu_offset
#define PERCPU_OP(op, asm_op, c_op) \ -static inline unsigned long __percpu_##op(void *ptr, \ +static __always_inline unsigned long __percpu_##op(void *ptr, \ unsigned long val, int size) \ { \ unsigned long ret; \ @@ -63,7 +63,7 @@ PERCPU_OP(and, and, &) PERCPU_OP(or, or, |) #undef PERCPU_OP
-static inline unsigned long __percpu_read(void *ptr, int size) +static __always_inline unsigned long __percpu_read(void *ptr, int size) { unsigned long ret;
@@ -100,7 +100,7 @@ static inline unsigned long __percpu_rea return ret; }
-static inline void __percpu_write(void *ptr, unsigned long val, int size) +static __always_inline void __percpu_write(void *ptr, unsigned long val, int size) { switch (size) { case 1: @@ -132,8 +132,8 @@ static inline void __percpu_write(void * } }
-static inline unsigned long __percpu_xchg(void *ptr, unsigned long val, - int size) +static __always_inline unsigned long __percpu_xchg(void *ptr, unsigned long val, + int size) { switch (size) { case 1:
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Minda Chen minda.chen@starfivetech.com
commit dd16ac404a685cce07e67261a94c6225d90ea7ba upstream.
Actually it is a part of Conor's commit aae538cd03bc ("riscv: fix detection of toolchain Zihintpause support"). It is looks like a merge issue. Samuel's commit 0b1d60d6dd9e ("riscv: Fix build with CONFIG_CC_OPTIMIZE_FOR_SIZE=y") do not base on Conor's commit and revert to __riscv_zihintpause. So this patch can fix it.
Signed-off-by: Minda Chen minda.chen@starfivetech.com Fixes: 3c349eacc559 ("Merge patch "riscv: Fix build with CONFIG_CC_OPTIMIZE_FOR_SIZE=y"") Reviewed-by: Conor Dooley conor.dooley@microchip.com Link: https://lore.kernel.org/r/20230802064215.31111-1-minda.chen@starfivetech.com Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt palmer@rivosinc.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/riscv/include/asm/vdso/processor.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/riscv/include/asm/vdso/processor.h b/arch/riscv/include/asm/vdso/processor.h index 14f5d27783b8..96b65a5396df 100644 --- a/arch/riscv/include/asm/vdso/processor.h +++ b/arch/riscv/include/asm/vdso/processor.h @@ -14,7 +14,7 @@ static inline void cpu_relax(void) __asm__ __volatile__ ("div %0, %0, zero" : "=r" (dummy)); #endif
-#ifdef __riscv_zihintpause +#ifdef CONFIG_TOOLCHAIN_HAS_ZIHINTPAUSE /* * Reduce instruction retirement. * This assumes the PC changes.
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nam Cao namcaov@gmail.com
commit 87615e95f6f9ccd36d4a3905a2d87f91967ea9d2 upstream.
The interrupt entries are expected to be in the .irqentry.text section. For example, for kprobes to work properly, exception code cannot be probed; this is ensured by blacklisting addresses in the .irqentry.text section.
Fixes: 7db91e57a0ac ("RISC-V: Task implementation") Signed-off-by: Nam Cao namcaov@gmail.com Link: https://lore.kernel.org/r/20230821145708.21270-1-namcaov@gmail.com Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt palmer@rivosinc.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/riscv/kernel/entry.S | 2 ++ 1 file changed, 2 insertions(+)
--- a/arch/riscv/kernel/entry.S +++ b/arch/riscv/kernel/entry.S @@ -16,6 +16,8 @@ #include <asm/errata_list.h> #include <linux/sizes.h>
+ .section .irqentry.text, "ax" + SYM_CODE_START(handle_exception) /* * If coming from userspace, preserve the user thread pointer and load
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Song Shuai suagrfillet@gmail.com
commit 559fe94a449cba5b50a7cffea60474b385598c00 upstream.
Since the commit 011f09d12052 set sv57 as default for CONFIG_64BIT, the comment of CONFIG_PAGE_OFFSET should be updated too.
Fixes: 011f09d12052 ("riscv: mm: Set sv57 on defaultly") Signed-off-by: Song Shuai suagrfillet@gmail.com Link: https://lore.kernel.org/r/20230809031023.3575407-1-songshuaishuai@tinylab.or... Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt palmer@rivosinc.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/riscv/include/asm/page.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/arch/riscv/include/asm/page.h +++ b/arch/riscv/include/asm/page.h @@ -33,8 +33,8 @@ #define PAGE_OFFSET _AC(CONFIG_PAGE_OFFSET, UL) #endif /* - * By default, CONFIG_PAGE_OFFSET value corresponds to SV48 address space so - * define the PAGE_OFFSET value for SV39. + * By default, CONFIG_PAGE_OFFSET value corresponds to SV57 address space so + * define the PAGE_OFFSET value for SV48 and SV39. */ #define PAGE_OFFSET_L4 _AC(0xffffaf8000000000, UL) #define PAGE_OFFSET_L3 _AC(0xffffffd800000000, UL)
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Song Shuai suagrfillet@gmail.com
commit e59e5e2754bf983fc58ad18f99b5eec01f1a0745 upstream.
The pt_level uses CONFIG_PGTABLE_LEVELS to display page table names. But if page mode is downgraded from kernel cmdline or restricted by the hardware in 64BIT, it will give a wrong name.
Like, using no4lvl for sv39, ptdump named the 1G-mapping as "PUD" that should be "PGD":
0xffffffd840000000-0xffffffd900000000 0x00000000c0000000 3G PUD D A G . . W R V
So select "P4D/PUD" or "PGD" via pgtable_l5/4_enabled to correct it.
Fixes: e8a62cc26ddf ("riscv: Implement sv48 support") Reviewed-by: Alexandre Ghiti alexghiti@rivosinc.com Signed-off-by: Song Shuai suagrfillet@gmail.com Link: https://lore.kernel.org/r/20230712115740.943324-1-suagrfillet@gmail.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230830044129.11481-3-palmer@rivosinc.com Signed-off-by: Palmer Dabbelt palmer@rivosinc.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/riscv/mm/ptdump.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/arch/riscv/mm/ptdump.c +++ b/arch/riscv/mm/ptdump.c @@ -384,6 +384,9 @@ static int __init ptdump_init(void)
kernel_ptd_info.base_addr = KERN_VIRT_START;
+ pg_level[1].name = pgtable_l5_enabled ? "P4D" : "PGD"; + pg_level[2].name = pgtable_l4_enabled ? "PUD" : "PGD"; + for (i = 0; i < ARRAY_SIZE(pg_level); i++) for (j = 0; j < ARRAY_SIZE(pte_bits); j++) pg_level[i].mask |= pte_bits[j].mask;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nam Cao namcaov@gmail.com
commit 8cb22bec142624d21bc85ff96b7bad10b6220e6a upstream.
Instructions can write to x0, so we should simulate these instructions normally.
Currently, the kernel hangs if an instruction who writes to x0 is simulated.
Fixes: c22b0bcb1dd0 ("riscv: Add kprobes supported") Cc: stable@vger.kernel.org Signed-off-by: Nam Cao namcaov@gmail.com Reviewed-by: Charlie Jenkins charlie@rivosinc.com Acked-by: Guo Ren guoren@kernel.org Link: https://lore.kernel.org/r/20230829182500.61875-1-namcaov@gmail.com Signed-off-by: Palmer Dabbelt palmer@rivosinc.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/riscv/kernel/probes/simulate-insn.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/riscv/kernel/probes/simulate-insn.c +++ b/arch/riscv/kernel/probes/simulate-insn.c @@ -24,7 +24,7 @@ static inline bool rv_insn_reg_set_val(s unsigned long val) { if (index == 0) - return false; + return true; else if (index <= 31) *((unsigned long *)regs + index) = val; else
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Victor Shih victor.shih@genesyslogic.com.tw
commit d7133797e9e1b72fd89237f68cb36d745599ed86 upstream.
When GL9750 enters ASPM L1 sub-states, it will stay at L1.1 and will not enter L1.2. The workaround is to toggle PM state to allow GL9750 to enter ASPM L1.2.
Signed-off-by: Victor Shih victor.shih@genesyslogic.com.tw Link: https://lore.kernel.org/r/20230912091710.7797-1-victorshihgli@gmail.com Signed-off-by: Ulf Hansson ulf.hansson@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/mmc/host/sdhci-pci-gli.c | 14 ++++++++++++++ 1 file changed, 14 insertions(+)
--- a/drivers/mmc/host/sdhci-pci-gli.c +++ b/drivers/mmc/host/sdhci-pci-gli.c @@ -25,6 +25,9 @@ #define GLI_9750_WT_EN_ON 0x1 #define GLI_9750_WT_EN_OFF 0x0
+#define PCI_GLI_9750_PM_CTRL 0xFC +#define PCI_GLI_9750_PM_STATE GENMASK(1, 0) + #define SDHCI_GLI_9750_CFG2 0x848 #define SDHCI_GLI_9750_CFG2_L1DLY GENMASK(28, 24) #define GLI_9750_CFG2_L1DLY_VALUE 0x1F @@ -539,8 +542,12 @@ static void sdhci_gl9750_set_clock(struc
static void gl9750_hw_setting(struct sdhci_host *host) { + struct sdhci_pci_slot *slot = sdhci_priv(host); + struct pci_dev *pdev; u32 value;
+ pdev = slot->chip->pdev; + gl9750_wt_on(host);
value = sdhci_readl(host, SDHCI_GLI_9750_CFG2); @@ -550,6 +557,13 @@ static void gl9750_hw_setting(struct sdh GLI_9750_CFG2_L1DLY_VALUE); sdhci_writel(host, value, SDHCI_GLI_9750_CFG2);
+ /* toggle PM state to allow GL9750 to enter ASPM L1.2 */ + pci_read_config_dword(pdev, PCI_GLI_9750_PM_CTRL, &value); + value |= PCI_GLI_9750_PM_STATE; + pci_write_config_dword(pdev, PCI_GLI_9750_PM_CTRL, value); + value &= ~PCI_GLI_9750_PM_STATE; + pci_write_config_dword(pdev, PCI_GLI_9750_PM_CTRL, value); + gl9750_wt_off(host); }
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Stefan Roesch shr@devkernel.io
commit a48d5bdc877b85201e42cef9c2fdf5378164c23a upstream.
While qualifiying the 6.4 release, the following warning was detected in messages:
vmstat_refresh: nr_file_hugepages -15664
The warning is caused by the incorrect updating of the NR_FILE_THPS counter in the function split_huge_page_to_list. The if case is checking for folio_test_swapbacked, but the else case is missing the check for folio_test_pmd_mappable. The other functions that manipulate the counter like __filemap_add_folio and filemap_unaccount_folio have the corresponding check.
I have a test case, which reproduces the problem. It can be found here: https://github.com/sroeschus/testcase/blob/main/vmstat_refresh/madv.c
The test case reproduces on an XFS filesystem. Running the same test case on a BTRFS filesystem does not reproduce the problem.
AFAIK version 6.1 until 6.6 are affected by this problem.
[akpm@linux-foundation.org: whitespace fix] [shr@devkernel.io: test for folio_test_pmd_mappable()] Link: https://lkml.kernel.org/r/20231108171517.2436103-1-shr@devkernel.io Link: https://lkml.kernel.org/r/20231106181918.1091043-1-shr@devkernel.io Signed-off-by: Stefan Roesch shr@devkernel.io Co-debugged-by: Johannes Weiner hannes@cmpxchg.org Acked-by: Johannes Weiner hannes@cmpxchg.org Reviewed-by: Matthew Wilcox (Oracle) willy@infradead.org Reviewed-by: David Hildenbrand david@redhat.com Reviewed-by: Yang Shi shy828301@gmail.com Cc: Rik van Riel riel@surriel.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/huge_memory.c | 16 +++++++++------- 1 file changed, 9 insertions(+), 7 deletions(-)
--- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -2759,13 +2759,15 @@ int split_huge_page_to_list(struct page int nr = folio_nr_pages(folio);
xas_split(&xas, folio, folio_order(folio)); - if (folio_test_swapbacked(folio)) { - __lruvec_stat_mod_folio(folio, NR_SHMEM_THPS, - -nr); - } else { - __lruvec_stat_mod_folio(folio, NR_FILE_THPS, - -nr); - filemap_nr_thps_dec(mapping); + if (folio_test_pmd_mappable(folio)) { + if (folio_test_swapbacked(folio)) { + __lruvec_stat_mod_folio(folio, + NR_SHMEM_THPS, -nr); + } else { + __lruvec_stat_mod_folio(folio, + NR_FILE_THPS, -nr); + filemap_nr_thps_dec(mapping); + } } }
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Roman Gushchin roman.gushchin@linux.dev
commit 24948e3b7b12e0031a6edb4f49bbb9fb2ad1e4e9 upstream.
Objcg vectors attached to slab pages to store slab object ownership information are allocated using gfp flags for the original slab allocation. Depending on slab page order and the size of slab objects, objcg vector can take several pages.
If the original allocation was done with the __GFP_NOFAIL flag, it triggered a warning in the page allocation code. Indeed, order > 1 pages should not been allocated with the __GFP_NOFAIL flag.
Fix this by simply dropping the __GFP_NOFAIL flag when allocating the objcg vector. It effectively allows to skip the accounting of a single slab object under a heavy memory pressure.
An alternative would be to implement the mechanism to fallback to order-0 allocations for accounting metadata, which is also not perfect because it will increase performance penalty and memory footprint of the kernel memory accounting under memory pressure.
Link: https://lkml.kernel.org/r/ZUp8ZFGxwmCx4ZFr@P9FQF9L96D.corp.robot.car Signed-off-by: Roman Gushchin roman.gushchin@linux.dev Reported-by: Christoph Lameter cl@linux.com Closes: https://lkml.kernel.org/r/6b42243e-f197-600a-5d22-56bd728a5ad8@gentwo.org Acked-by: Shakeel Butt shakeelb@google.com Cc: Matthew Wilcox willy@infradead.org Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/memcontrol.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -2868,7 +2868,8 @@ static void commit_charge(struct folio * * Moreover, it should not come from DMA buffer and is not readily * reclaimable. So those GFP bits should be masked off. */ -#define OBJCGS_CLEAR_MASK (__GFP_DMA | __GFP_RECLAIMABLE | __GFP_ACCOUNT) +#define OBJCGS_CLEAR_MASK (__GFP_DMA | __GFP_RECLAIMABLE | \ + __GFP_ACCOUNT | __GFP_NOFAIL)
/* * mod_objcg_mlstate() may be called with irq enabled, so
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Paolo Abeni pabeni@redhat.com
commit 9fce92f050f448a0d1ddd9083ef967d9930f1e52 upstream.
After the blamed commit below, the TCP sockets (and the MPTCP subflows) can build egress packets larger than 64K. That exceeds the maximum DSS data size, the length being misrepresent on the wire and the stream being corrupted, as later observed on the receiver:
WARNING: CPU: 0 PID: 9696 at net/mptcp/protocol.c:705 __mptcp_move_skbs_from_subflow+0x2604/0x26e0 CPU: 0 PID: 9696 Comm: syz-executor.7 Not tainted 6.6.0-rc5-gcd8bdf563d46 #45 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014 netlink: 8 bytes leftover after parsing attributes in process `syz-executor.4'. RIP: 0010:__mptcp_move_skbs_from_subflow+0x2604/0x26e0 net/mptcp/protocol.c:705 RSP: 0018:ffffc90000006e80 EFLAGS: 00010246 RAX: ffffffff83e9f674 RBX: ffff88802f45d870 RCX: ffff888102ad0000 netlink: 8 bytes leftover after parsing attributes in process `syz-executor.4'. RDX: 0000000080000303 RSI: 0000000000013908 RDI: 0000000000003908 RBP: ffffc90000007110 R08: ffffffff83e9e078 R09: 1ffff1100e548c8a R10: dffffc0000000000 R11: ffffed100e548c8b R12: 0000000000013908 R13: dffffc0000000000 R14: 0000000000003908 R15: 000000000031cf29 FS: 00007f239c47e700(0000) GS:ffff88811b200000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f239c45cd78 CR3: 000000006a66c006 CR4: 0000000000770ef0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600 PKRU: 55555554 Call Trace: <IRQ> mptcp_data_ready+0x263/0xac0 net/mptcp/protocol.c:819 subflow_data_ready+0x268/0x6d0 net/mptcp/subflow.c:1409 tcp_data_queue+0x21a1/0x7a60 net/ipv4/tcp_input.c:5151 tcp_rcv_established+0x950/0x1d90 net/ipv4/tcp_input.c:6098 tcp_v6_do_rcv+0x554/0x12f0 net/ipv6/tcp_ipv6.c:1483 tcp_v6_rcv+0x2e26/0x3810 net/ipv6/tcp_ipv6.c:1749 ip6_protocol_deliver_rcu+0xd6b/0x1ae0 net/ipv6/ip6_input.c:438 ip6_input+0x1c5/0x470 net/ipv6/ip6_input.c:483 ipv6_rcv+0xef/0x2c0 include/linux/netfilter.h:304 __netif_receive_skb+0x1ea/0x6a0 net/core/dev.c:5532 process_backlog+0x353/0x660 net/core/dev.c:5974 __napi_poll+0xc6/0x5a0 net/core/dev.c:6536 net_rx_action+0x6a0/0xfd0 net/core/dev.c:6603 __do_softirq+0x184/0x524 kernel/softirq.c:553 do_softirq+0xdd/0x130 kernel/softirq.c:454
Address the issue explicitly bounding the maximum GSO size to what MPTCP actually allows.
Reported-by: Christoph Paasch cpaasch@apple.com Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/450 Fixes: 7c4e983c4f3c ("net: allow gso_max_size to exceed 65536") Cc: stable@vger.kernel.org Signed-off-by: Paolo Abeni pabeni@redhat.com Reviewed-by: Mat Martineau martineau@kernel.org Signed-off-by: Matthieu Baerts matttbe@kernel.org Link: https://lore.kernel.org/r/20231114-upstream-net-20231113-mptcp-misc-fixes-6-... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/mptcp/protocol.c | 4 ++++ 1 file changed, 4 insertions(+)
--- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -1233,6 +1233,8 @@ static void mptcp_update_infinite_map(st mptcp_do_fallback(ssk); }
+#define MPTCP_MAX_GSO_SIZE (GSO_LEGACY_MAX_SIZE - (MAX_TCP_HEADER + 1)) + static int mptcp_sendmsg_frag(struct sock *sk, struct sock *ssk, struct mptcp_data_frag *dfrag, struct mptcp_sendmsg_info *info) @@ -1259,6 +1261,8 @@ static int mptcp_sendmsg_frag(struct soc return -EAGAIN;
/* compute send limit */ + if (unlikely(ssk->sk_gso_max_size > MPTCP_MAX_GSO_SIZE)) + ssk->sk_gso_max_size = MPTCP_MAX_GSO_SIZE; info->mss_now = tcp_send_mss(ssk, &info->size_goal, info->flags); copy = info->size_goal;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Geliang Tang geliang.tang@suse.com
commit 8df220b29282e8b450ea57be62e1eccd4996837c upstream.
This patch adds the validity check for sending RM_ADDRs for userspace PM in mptcp_pm_remove_addrs(), only send a RM_ADDR when the address is in the anno_list or conn_list.
Fixes: 8b1c94da1e48 ("mptcp: only send RM_ADDR in nl_cmd_remove") Cc: stable@vger.kernel.org Signed-off-by: Geliang Tang geliang.tang@suse.com Reviewed-by: Mat Martineau martineau@kernel.org Signed-off-by: Matthieu Baerts matttbe@kernel.org Link: https://lore.kernel.org/r/20231114-upstream-net-20231113-mptcp-misc-fixes-6-... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/mptcp/pm_netlink.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
--- a/net/mptcp/pm_netlink.c +++ b/net/mptcp/pm_netlink.c @@ -1537,8 +1537,9 @@ void mptcp_pm_remove_addrs(struct mptcp_ struct mptcp_pm_addr_entry *entry;
list_for_each_entry(entry, rm_list, list) { - remove_anno_list_by_saddr(msk, &entry->addr); - if (alist.nr < MPTCP_RM_IDS_MAX) + if ((remove_anno_list_by_saddr(msk, &entry->addr) || + lookup_subflow_by_saddr(&msk->conn_list, &entry->addr)) && + alist.nr < MPTCP_RM_IDS_MAX) alist.ids[alist.nr++] = entry->addr.id; }
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Paolo Abeni pabeni@redhat.com
commit 7679d34f97b7a09fd565f5729f79fd61b7c55329 upstream.
The MPTCP implementation of the IP_TOS socket option uses the lockless variant of the TOS manipulation helper and does not hold such lock at the helper invocation time.
Add the required locking.
Fixes: ffcacff87cd6 ("mptcp: Support for IP_TOS for MPTCP setsockopt()") Cc: stable@vger.kernel.org Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/457 Signed-off-by: Paolo Abeni pabeni@redhat.com Reviewed-by: Mat Martineau martineau@kernel.org Signed-off-by: Matthieu Baerts matttbe@kernel.org Link: https://lore.kernel.org/r/20231114-upstream-net-20231113-mptcp-misc-fixes-6-... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/mptcp/sockopt.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/net/mptcp/sockopt.c +++ b/net/mptcp/sockopt.c @@ -737,8 +737,11 @@ static int mptcp_setsockopt_v4_set_tos(s val = inet_sk(sk)->tos; mptcp_for_each_subflow(msk, subflow) { struct sock *ssk = mptcp_subflow_tcp_sock(subflow); + bool slow;
+ slow = lock_sock_fast(ssk); __ip_sock_set_tos(ssk, val); + unlock_sock_fast(ssk, slow); } release_sock(sk);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Paolo Abeni pabeni@redhat.com
commit 7cefbe5e1dacc7236caa77e9d072423f21422fe2 upstream.
Running the mp_join selftest manually with the following command line:
./mptcp_join.sh -z -C
leads to some failures:
002 fastclose server test # ... rtx [fail] got 1 MP_RST[s] TX expected 0 # ... rstrx [fail] got 1 MP_RST[s] RX expected 0
The problem is really in the wrong expectations for the RST checks implied by the csum validation. Note that the same check is repeated explicitly in the same test-case, with the correct expectation and pass successfully.
Address the issue explicitly setting the correct expectation for the failing checks.
Reported-by: Xiumei Mu xmu@redhat.com Fixes: 6bf41020b72b ("selftests: mptcp: update and extend fastclose test-cases") Cc: stable@vger.kernel.org Signed-off-by: Paolo Abeni pabeni@redhat.com Reviewed-by: Matthieu Baerts matttbe@kernel.org Signed-off-by: Matthieu Baerts matttbe@kernel.org Link: https://lore.kernel.org/r/20231114-upstream-net-20231113-mptcp-misc-fixes-6-... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- tools/testing/selftests/net/mptcp/mptcp_join.sh | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/tools/testing/selftests/net/mptcp/mptcp_join.sh +++ b/tools/testing/selftests/net/mptcp/mptcp_join.sh @@ -3207,7 +3207,7 @@ fastclose_tests() if reset_check_counter "fastclose server test" "MPTcpExtMPFastcloseRx"; then test_linkfail=1024 addr_nr_ns2=fastclose_server \ run_tests $ns1 $ns2 10.0.1.1 - chk_join_nr 0 0 0 + chk_join_nr 0 0 0 0 0 0 1 chk_fclose_nr 1 1 invert chk_rst_nr 1 1 fi
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: ChunHao Lin hau@realtek.com
commit 868c3b95afef4883bfb66c9397482da6840b5baf upstream.
Device that support DASH may be reseted or powered off during suspend. So driver needs to handle DASH during system suspend and resume. Or DASH firmware will influence device behavior and causes network lost.
Fixes: b646d90053f8 ("r8169: magic.") Cc: stable@vger.kernel.org Reviewed-by: Heiner Kallweit hkallweit1@gmail.com Signed-off-by: ChunHao Lin hau@realtek.com Link: https://lore.kernel.org/r/20231109173400.4573-3-hau@realtek.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/realtek/r8169_main.c | 6 ++++++ 1 file changed, 6 insertions(+)
--- a/drivers/net/ethernet/realtek/r8169_main.c +++ b/drivers/net/ethernet/realtek/r8169_main.c @@ -4648,10 +4648,16 @@ static void rtl8169_down(struct rtl8169_ rtl8169_cleanup(tp); rtl_disable_exit_l1(tp); rtl_prepare_power_down(tp); + + if (tp->dash_type != RTL_DASH_NONE) + rtl8168_driver_stop(tp); }
static void rtl8169_up(struct rtl8169_private *tp) { + if (tp->dash_type != RTL_DASH_NONE) + rtl8168_driver_start(tp); + pci_set_master(tp->pci_dev); phy_init_hw(tp->phydev); phy_resume(tp->phydev);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: ChunHao Lin hau@realtek.com
commit 0ab0c45d8aaea5192328bfa6989673aceafc767c upstream.
For devices that support DASH, even DASH is disabled, there may still exist a default firmware that will influence device behavior. So driver needs to handle DASH for devices that support DASH, no matter the DASH status is.
This patch also prepares for "fix network lost after resume on DASH systems".
Fixes: ee7a1beb9759 ("r8169:call "rtl8168_driver_start" "rtl8168_driver_stop" only when hardware dash function is enabled") Cc: stable@vger.kernel.org Signed-off-by: ChunHao Lin hau@realtek.com Reviewed-by: Heiner Kallweit hkallweit1@gmail.com Link: https://lore.kernel.org/r/20231109173400.4573-2-hau@realtek.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/realtek/r8169_main.c | 36 ++++++++++++++++++++---------- 1 file changed, 25 insertions(+), 11 deletions(-)
--- a/drivers/net/ethernet/realtek/r8169_main.c +++ b/drivers/net/ethernet/realtek/r8169_main.c @@ -624,6 +624,7 @@ struct rtl8169_private {
unsigned supports_gmii:1; unsigned aspm_manageable:1; + unsigned dash_enabled:1; dma_addr_t counters_phys_addr; struct rtl8169_counters *counters; struct rtl8169_tc_offsets tc_offset; @@ -1253,14 +1254,26 @@ static bool r8168ep_check_dash(struct rt return r8168ep_ocp_read(tp, 0x128) & BIT(0); }
-static enum rtl_dash_type rtl_check_dash(struct rtl8169_private *tp) +static bool rtl_dash_is_enabled(struct rtl8169_private *tp) +{ + switch (tp->dash_type) { + case RTL_DASH_DP: + return r8168dp_check_dash(tp); + case RTL_DASH_EP: + return r8168ep_check_dash(tp); + default: + return false; + } +} + +static enum rtl_dash_type rtl_get_dash_type(struct rtl8169_private *tp) { switch (tp->mac_version) { case RTL_GIGA_MAC_VER_28: case RTL_GIGA_MAC_VER_31: - return r8168dp_check_dash(tp) ? RTL_DASH_DP : RTL_DASH_NONE; + return RTL_DASH_DP; case RTL_GIGA_MAC_VER_51 ... RTL_GIGA_MAC_VER_53: - return r8168ep_check_dash(tp) ? RTL_DASH_EP : RTL_DASH_NONE; + return RTL_DASH_EP; default: return RTL_DASH_NONE; } @@ -1453,7 +1466,7 @@ static void __rtl8169_set_wol(struct rtl
device_set_wakeup_enable(tp_to_dev(tp), wolopts);
- if (tp->dash_type == RTL_DASH_NONE) { + if (!tp->dash_enabled) { rtl_set_d3_pll_down(tp, !wolopts); tp->dev->wol_enabled = wolopts ? 1 : 0; } @@ -2512,7 +2525,7 @@ static void rtl_wol_enable_rx(struct rtl
static void rtl_prepare_power_down(struct rtl8169_private *tp) { - if (tp->dash_type != RTL_DASH_NONE) + if (tp->dash_enabled) return;
if (tp->mac_version == RTL_GIGA_MAC_VER_32 || @@ -4875,7 +4888,7 @@ static int rtl8169_runtime_idle(struct d { struct rtl8169_private *tp = dev_get_drvdata(device);
- if (tp->dash_type != RTL_DASH_NONE) + if (tp->dash_enabled) return -EBUSY;
if (!netif_running(tp->dev) || !netif_carrier_ok(tp->dev)) @@ -4901,8 +4914,7 @@ static void rtl_shutdown(struct pci_dev /* Restore original MAC address */ rtl_rar_set(tp, tp->dev->perm_addr);
- if (system_state == SYSTEM_POWER_OFF && - tp->dash_type == RTL_DASH_NONE) { + if (system_state == SYSTEM_POWER_OFF && !tp->dash_enabled) { pci_wake_from_d3(pdev, tp->saved_wolopts); pci_set_power_state(pdev, PCI_D3hot); } @@ -5260,7 +5272,8 @@ static int rtl_init_one(struct pci_dev * rc = pci_disable_link_state(pdev, PCIE_LINK_STATE_L1); tp->aspm_manageable = !rc;
- tp->dash_type = rtl_check_dash(tp); + tp->dash_type = rtl_get_dash_type(tp); + tp->dash_enabled = rtl_dash_is_enabled(tp);
tp->cp_cmd = RTL_R16(tp, CPlusCmd) & CPCMD_MASK;
@@ -5331,7 +5344,7 @@ static int rtl_init_one(struct pci_dev * /* configure chip for default features */ rtl8169_set_features(dev, dev->features);
- if (tp->dash_type == RTL_DASH_NONE) { + if (!tp->dash_enabled) { rtl_set_d3_pll_down(tp, true); } else { rtl_set_d3_pll_down(tp, false); @@ -5371,7 +5384,8 @@ static int rtl_init_one(struct pci_dev * "ok" : "ko");
if (tp->dash_type != RTL_DASH_NONE) { - netdev_info(dev, "DASH enabled\n"); + netdev_info(dev, "DASH %s\n", + tp->dash_enabled ? "enabled" : "disabled"); rtl8168_driver_start(tp); }
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Victor Shih victor.shih@genesyslogic.com.tw
commit 015c9cbcf0ad709079117d27c2094a46e0eadcdb upstream.
Due to a flaw in the hardware design, the GL9750 replay timer frequently times out when ASPM is enabled. As a result, the warning messages will often appear in the system log when the system accesses the GL9750 PCI config. Therefore, the replay timer timeout must be masked.
Fixes: d7133797e9e1 ("mmc: sdhci-pci-gli: A workaround to allow GL9750 to enter ASPM L1.2") Signed-off-by: Victor Shih victor.shih@genesyslogic.com.tw Acked-by: Adrian Hunter adrian.hunter@intel.com Acked-by: Kai-Heng Feng kai.heng.geng@canonical.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20231107095741.8832-2-victorshihgli@gmail.com Signed-off-by: Ulf Hansson ulf.hansson@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/mmc/host/sdhci-pci-gli.c | 8 ++++++++ 1 file changed, 8 insertions(+)
--- a/drivers/mmc/host/sdhci-pci-gli.c +++ b/drivers/mmc/host/sdhci-pci-gli.c @@ -28,6 +28,9 @@ #define PCI_GLI_9750_PM_CTRL 0xFC #define PCI_GLI_9750_PM_STATE GENMASK(1, 0)
+#define PCI_GLI_9750_CORRERR_MASK 0x214 +#define PCI_GLI_9750_CORRERR_MASK_REPLAY_TIMER_TIMEOUT BIT(12) + #define SDHCI_GLI_9750_CFG2 0x848 #define SDHCI_GLI_9750_CFG2_L1DLY GENMASK(28, 24) #define GLI_9750_CFG2_L1DLY_VALUE 0x1F @@ -564,6 +567,11 @@ static void gl9750_hw_setting(struct sdh value &= ~PCI_GLI_9750_PM_STATE; pci_write_config_dword(pdev, PCI_GLI_9750_PM_CTRL, value);
+ /* mask the replay timer timeout of AER */ + pci_read_config_dword(pdev, PCI_GLI_9750_CORRERR_MASK, &value); + value |= PCI_GLI_9750_CORRERR_MASK_REPLAY_TIMER_TIMEOUT; + pci_write_config_dword(pdev, PCI_GLI_9750_CORRERR_MASK, value); + gl9750_wt_off(host); }
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Bryan O'Donoghue bryan.odonoghue@linaro.org
commit 7405116519ad70b8c7340359bfac8db8279e7ce4 upstream.
We need to make sure camss_configure_pd() happens before camss_register_entities() as the vfe_get() path relies on the pointer provided by camss_configure_pd().
Fix the ordering sequence in probe to ensure the pointers vfe_get() demands are present by the time camss_register_entities() runs.
In order to facilitate backporting to stable kernels I've moved the configure_pd() call pretty early on the probe() function so that irrespective of the existence of the old error handling jump labels this patch should still apply to -next circa Aug 2023 to v5.13 inclusive.
Fixes: 2f6f8af67203 ("media: camss: Refactor VFE power domain toggling") Cc: stable@vger.kernel.org Signed-off-by: Bryan O'Donoghue bryan.odonoghue@linaro.org Reviewed-by: Laurent Pinchart laurent.pinchart@ideasonboard.com Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/media/platform/qcom/camss/camss.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-)
--- a/drivers/media/platform/qcom/camss/camss.c +++ b/drivers/media/platform/qcom/camss/camss.c @@ -1627,6 +1627,12 @@ static int camss_probe(struct platform_d if (ret < 0) goto err_cleanup;
+ ret = camss_configure_pd(camss); + if (ret < 0) { + dev_err(dev, "Failed to configure power domains: %d\n", ret); + goto err_cleanup; + } + ret = camss_init_subdevices(camss); if (ret < 0) goto err_cleanup; @@ -1679,12 +1685,6 @@ static int camss_probe(struct platform_d } }
- ret = camss_configure_pd(camss); - if (ret < 0) { - dev_err(dev, "Failed to configure power domains: %d\n", ret); - return ret; - } - pm_runtime_enable(dev);
return 0;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Bryan O'Donoghue bryan.odonoghue@linaro.org
commit 26bda3da00c3edef727a6acb00ed2eb4b22f8723 upstream.
Right now it is possible to do a vfe_get() with the internal reference count at 1. If vfe_check_clock_rates() returns non-zero then we will leave the reference count as-is and
run: - pm_runtime_put_sync() - vfe->ops->pm_domain_off()
skip: - camss_disable_clocks()
Subsequent vfe_put() calls will when the ref-count is non-zero unconditionally run:
- pm_runtime_put_sync() - vfe->ops->pm_domain_off() - camss_disable_clocks()
vfe_get() should not attempt to roll-back on error when the ref-count is non-zero as the upper layers will still do their own vfe_put() operations.
vfe_put() will drop the reference count and do the necessary power domain release, the cleanup jumps in vfe_get() should only be run when the ref-count is zero.
[ 50.095796] CPU: 7 PID: 3075 Comm: cam Not tainted 6.3.2+ #80 [ 50.095798] Hardware name: LENOVO 21BXCTO1WW/21BXCTO1WW, BIOS N3HET82W (1.54 ) 05/26/2023 [ 50.095799] pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 50.095802] pc : refcount_warn_saturate+0xf4/0x148 [ 50.095804] lr : refcount_warn_saturate+0xf4/0x148 [ 50.095805] sp : ffff80000c7cb8b0 [ 50.095806] x29: ffff80000c7cb8b0 x28: ffff16ecc0e3fc10 x27: 0000000000000000 [ 50.095810] x26: 0000000000000000 x25: 0000000000020802 x24: 0000000000000000 [ 50.095813] x23: ffff16ecc7360640 x22: 00000000ffffffff x21: 0000000000000005 [ 50.095815] x20: ffff16ed175f4400 x19: ffffb4d9852942a8 x18: ffffffffffffffff [ 50.095818] x17: ffffb4d9852d4a48 x16: ffffb4d983da5db8 x15: ffff80000c7cb320 [ 50.095821] x14: 0000000000000001 x13: 2e656572662d7265 x12: 7466612d65737520 [ 50.095823] x11: 00000000ffffefff x10: ffffb4d9850cebf0 x9 : ffffb4d9835cf954 [ 50.095826] x8 : 0000000000017fe8 x7 : c0000000ffffefff x6 : 0000000000057fa8 [ 50.095829] x5 : ffff16f813fe3d08 x4 : 0000000000000000 x3 : ffff621e8f4d2000 [ 50.095832] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff16ed32119040 [ 50.095835] Call trace: [ 50.095836] refcount_warn_saturate+0xf4/0x148 [ 50.095838] device_link_put_kref+0x84/0xc8 [ 50.095843] device_link_del+0x38/0x58 [ 50.095846] vfe_pm_domain_off+0x3c/0x50 [qcom_camss] [ 50.095860] vfe_put+0x114/0x140 [qcom_camss] [ 50.095869] csid_set_power+0x2c8/0x408 [qcom_camss] [ 50.095878] pipeline_pm_power_one+0x164/0x170 [videodev] [ 50.095896] pipeline_pm_power+0xc4/0x110 [videodev] [ 50.095909] v4l2_pipeline_pm_use+0x5c/0xa0 [videodev] [ 50.095923] v4l2_pipeline_pm_get+0x1c/0x30 [videodev] [ 50.095937] video_open+0x7c/0x100 [qcom_camss] [ 50.095945] v4l2_open+0x84/0x130 [videodev] [ 50.095960] chrdev_open+0xc8/0x250 [ 50.095964] do_dentry_open+0x1bc/0x498 [ 50.095966] vfs_open+0x34/0x40 [ 50.095968] path_openat+0xb44/0xf20 [ 50.095971] do_filp_open+0xa4/0x160 [ 50.095974] do_sys_openat2+0xc8/0x188 [ 50.095975] __arm64_sys_openat+0x6c/0xb8 [ 50.095977] invoke_syscall+0x50/0x128 [ 50.095982] el0_svc_common.constprop.0+0x4c/0x100 [ 50.095985] do_el0_svc+0x40/0xa8 [ 50.095988] el0_svc+0x2c/0x88 [ 50.095991] el0t_64_sync_handler+0xf4/0x120 [ 50.095994] el0t_64_sync+0x190/0x198 [ 50.095996] ---[ end trace 0000000000000000 ]---
Fixes: 779096916dae ("media: camss: vfe: Fix runtime PM imbalance on error") Cc: stable@vger.kernel.org Signed-off-by: Bryan O'Donoghue bryan.odonoghue@linaro.org Reviewed-by: Laurent Pinchart laurent.pinchart@ideasonboard.com Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/media/platform/qcom/camss/camss-vfe.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/media/platform/qcom/camss/camss-vfe.c +++ b/drivers/media/platform/qcom/camss/camss-vfe.c @@ -611,7 +611,7 @@ int vfe_get(struct vfe_device *vfe) } else { ret = vfe_check_clock_rates(vfe); if (ret < 0) - goto error_pm_runtime_get; + goto error_pm_domain; } vfe->power_count++;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Bryan O'Donoghue bryan.odonoghue@linaro.org
commit 3143ad282fc08bf995ee73e32a9e40c527bf265d upstream.
There are two problems with the current vfe_disable_output() routine.
Firstly we rightly use a spinlock to protect output->gen2.active_num everywhere except for in the IDLE timeout path of vfe_disable_output(). Even if that is not racy "in practice" somehow it is by happenstance not by design.
Secondly we do not get consistent behaviour from this routine. On sc8280xp 50% of the time I get "VFE idle timeout - resetting". In this case the subsequent capture will succeed. The other 50% of the time, we don't hit the idle timeout, never do the VFE reset and subsequent captures stall indefinitely.
Rewrite the vfe_disable_output() routine to
- Quiesce write masters with vfe_wm_stop() - Set active_num = 0
remembering to hold the spinlock when we do so followed by
- Reset the VFE
Testing on sc8280xp and sdm845 shows this to be a valid fix.
Fixes: 7319cdf189bb ("media: camss: Add support for VFE hardware version Titan 170") Cc: stable@vger.kernel.org Signed-off-by: Bryan O'Donoghue bryan.odonoghue@linaro.org Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/media/platform/qcom/camss/camss-vfe-170.c | 22 +++------------------- 1 file changed, 3 insertions(+), 19 deletions(-)
--- a/drivers/media/platform/qcom/camss/camss-vfe-170.c +++ b/drivers/media/platform/qcom/camss/camss-vfe-170.c @@ -7,7 +7,6 @@ * Copyright (C) 2020-2021 Linaro Ltd. */
-#include <linux/delay.h> #include <linux/interrupt.h> #include <linux/io.h> #include <linux/iopoll.h> @@ -494,35 +493,20 @@ static int vfe_enable_output(struct vfe_ return 0; }
-static int vfe_disable_output(struct vfe_line *line) +static void vfe_disable_output(struct vfe_line *line) { struct vfe_device *vfe = to_vfe(line); struct vfe_output *output = &line->output; unsigned long flags; unsigned int i; - bool done; - int timeout = 0; - - do { - spin_lock_irqsave(&vfe->output_lock, flags); - done = !output->gen2.active_num; - spin_unlock_irqrestore(&vfe->output_lock, flags); - usleep_range(10000, 20000); - - if (timeout++ == 100) { - dev_err(vfe->camss->dev, "VFE idle timeout - resetting\n"); - vfe_reset(vfe); - output->gen2.active_num = 0; - return 0; - } - } while (!done);
spin_lock_irqsave(&vfe->output_lock, flags); for (i = 0; i < output->wm_num; i++) vfe_wm_stop(vfe, output->wm_idx[i]); + output->gen2.active_num = 0; spin_unlock_irqrestore(&vfe->output_lock, flags);
- return 0; + vfe_reset(vfe); }
/*
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Bryan O'Donoghue bryan.odonoghue@linaro.org
commit 7f24d291350426d40b36dfbe6b3090617cdfd37a upstream.
vfe-480 is copied from vfe-17x and has the same racy idle timeout bug as in 17x.
Fix the vfe_disable_output() logic to no longer be racy and to conform to the 17x way of quiescing and then resetting the VFE.
Fixes: 4edc8eae715c ("media: camss: Add initial support for VFE hardware version Titan 480") Cc: stable@vger.kernel.org Signed-off-by: Bryan O'Donoghue bryan.odonoghue@linaro.org Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/media/platform/qcom/camss/camss-vfe-480.c | 22 +++------------------- 1 file changed, 3 insertions(+), 19 deletions(-)
--- a/drivers/media/platform/qcom/camss/camss-vfe-480.c +++ b/drivers/media/platform/qcom/camss/camss-vfe-480.c @@ -8,7 +8,6 @@ * Copyright (C) 2021 Jonathan Marek */
-#include <linux/delay.h> #include <linux/interrupt.h> #include <linux/io.h> #include <linux/iopoll.h> @@ -328,35 +327,20 @@ static int vfe_enable_output(struct vfe_ return 0; }
-static int vfe_disable_output(struct vfe_line *line) +static void vfe_disable_output(struct vfe_line *line) { struct vfe_device *vfe = to_vfe(line); struct vfe_output *output = &line->output; unsigned long flags; unsigned int i; - bool done; - int timeout = 0; - - do { - spin_lock_irqsave(&vfe->output_lock, flags); - done = !output->gen2.active_num; - spin_unlock_irqrestore(&vfe->output_lock, flags); - usleep_range(10000, 20000); - - if (timeout++ == 100) { - dev_err(vfe->camss->dev, "VFE idle timeout - resetting\n"); - vfe_reset(vfe); - output->gen2.active_num = 0; - return 0; - } - } while (!done);
spin_lock_irqsave(&vfe->output_lock, flags); for (i = 0; i < output->wm_num; i++) vfe_wm_stop(vfe, output->wm_idx[i]); + output->gen2.active_num = 0; spin_unlock_irqrestore(&vfe->output_lock, flags);
- return 0; + vfe_reset(vfe); }
/*
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Bryan O'Donoghue bryan.odonoghue@linaro.org
commit b6e1bdca463a932c1ac02caa7d3e14bf39288e0c upstream.
check_clock doesn't account for vfe_lite which means that vfe_lite will never get validated by this routine. Add the clock name to the expected set to remediate.
Fixes: 7319cdf189bb ("media: camss: Add support for VFE hardware version Titan 170") Cc: stable@vger.kernel.org Signed-off-by: Bryan O'Donoghue bryan.odonoghue@linaro.org Reviewed-by: Laurent Pinchart laurent.pinchart@ideasonboard.com Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/media/platform/qcom/camss/camss-vfe.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/drivers/media/platform/qcom/camss/camss-vfe.c +++ b/drivers/media/platform/qcom/camss/camss-vfe.c @@ -535,7 +535,8 @@ static int vfe_check_clock_rates(struct struct camss_clock *clock = &vfe->clock[i];
if (!strcmp(clock->name, "vfe0") || - !strcmp(clock->name, "vfe1")) { + !strcmp(clock->name, "vfe1") || + !strcmp(clock->name, "vfe_lite")) { u64 min_rate = 0; unsigned long rate;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Bryan O'Donoghue bryan.odonoghue@linaro.org
commit e655d1ae9703286cef7fda8675cad62f649dc183 upstream.
VC_MODE = 0 implies a two bit VC address. VC_MODE = 1 is required for VCs with a larger address than two bits.
Fixes: eebe6d00e9bf ("media: camss: Add support for CSID hardware version Titan 170") Cc: stable@vger.kernel.org Signed-off-by: Bryan O'Donoghue bryan.odonoghue@linaro.org Reviewed-by: Laurent Pinchart laurent.pinchart@ideasonboard.com Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/media/platform/qcom/camss/camss-csid-gen2.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/drivers/media/platform/qcom/camss/camss-csid-gen2.c +++ b/drivers/media/platform/qcom/camss/camss-csid-gen2.c @@ -449,6 +449,8 @@ static void __csid_configure_stream(stru writel_relaxed(val, csid->base + CSID_CSI2_RX_CFG0);
val = 1 << CSI2_RX_CFG1_PACKET_ECC_CORRECTION_EN; + if (vc > 3) + val |= 1 << CSI2_RX_CFG1_VC_MODE; val |= 1 << CSI2_RX_CFG1_MISR_EN; writel_relaxed(val, csid->base + CSID_CSI2_RX_CFG1);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Bryan O'Donoghue bryan.odonoghue@linaro.org
commit d8f7e1a60d01739a1d78db2b08603089c6cf7c8e upstream.
define CSIPHY_3PH_CMN_CSI_COMMON_CTRL5_CLK_ENABLE BIT(7)
disjunction for gen2 ? BIT(7) : is a nop we are setting the same bit either way.
Fixes: 4abb21309fda ("media: camss: csiphy: Move to hardcode CSI Clock Lane number") Cc: stable@vger.kernel.org Signed-off-by: Bryan O'Donoghue bryan.odonoghue@linaro.org Reviewed-by: Konrad Dybcio konrad.dybcio@linaro.org Reviewed-by: Laurent Pinchart laurent.pinchart@ideasonboard.com Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/media/platform/qcom/camss/camss-csiphy-3ph-1-0.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/media/platform/qcom/camss/camss-csiphy-3ph-1-0.c +++ b/drivers/media/platform/qcom/camss/camss-csiphy-3ph-1-0.c @@ -476,7 +476,7 @@ static void csiphy_lanes_enable(struct c
settle_cnt = csiphy_settle_cnt_calc(link_freq, csiphy->timer_clk_rate);
- val = is_gen2 ? BIT(7) : CSIPHY_3PH_CMN_CSI_COMMON_CTRL5_CLK_ENABLE; + val = CSIPHY_3PH_CMN_CSI_COMMON_CTRL5_CLK_ENABLE; for (i = 0; i < c->num_data; i++) val |= BIT(c->data[i].pos * 2);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Andrey Konovalov andrey.konovalov@linaro.org
commit 87889f1b7ea40d2544b49c62092e6ef2792dced7 upstream.
In the current driver csid Test Pattern Generator (TPG) doesn't work. This change: - fixes writing frame width and height values into CSID_TPG_DT_n_CFG_0 - fixes the shift by one between test_pattern control value and the actual pattern. - drops fixed VC of 0x0a which testing showed prohibited some test patterns in the CSID to produce output. So that TPG starts working, but with the below limitations: - only test_pattern=9 works as it should - test_pattern=8 and test_pattern=7 produce black frame (all zeroes) - the rest of test_pattern's don't work (yavta doesn't get the data) - regardless of the CFA pattern set by 'media-ctl -V' the actual pixel order is always the same (RGGB for any RAW8 or RAW10P format in 4608x2592 resolution).
Tested with:
RAW10P format, VC0: media-ctl -V '"msm_csid0":0[fmt:SRGGB10/4608x2592 field:none]' media-ctl -V '"msm_vfe0_rdi0":0[fmt:SRGGB10/4608x2592 field:none]' media-ctl -l '"msm_csid0":1->"msm_vfe0_rdi0":0[1]' v4l2-ctl -d /dev/v4l-subdev6 -c test_pattern=9 yavta -B capture-mplane --capture=3 -n 3 -f SRGGB10P -s 4608x2592 /dev/video0
RAW10P format, VC1: media-ctl -V '"msm_csid0":2[fmt:SRGGB10/4608x2592 field:none]' media-ctl -V '"msm_vfe0_rdi1":0[fmt:SRGGB10/4608x2592 field:none]' media-ctl -l '"msm_csid0":2->"msm_vfe0_rdi1":0[1]' v4l2-ctl -d /dev/v4l-subdev6 -c test_pattern=9 yavta -B capture-mplane --capture=3 -n 3 -f SRGGB10P -s 4608x2592 /dev/video1
RAW8 format, VC0: media-ctl --reset media-ctl -V '"msm_csid0":0[fmt:SRGGB8/4608x2592 field:none]' media-ctl -V '"msm_vfe0_rdi0":0[fmt:SRGGB8/4608x2592 field:none]' media-ctl -l '"msm_csid0":1->"msm_vfe0_rdi0":0[1]' yavta -B capture-mplane --capture=3 -n 3 -f SRGGB8 -s 4608x2592 /dev/video0
Fixes: eebe6d00e9bf ("media: camss: Add support for CSID hardware version Titan 170") Cc: stable@vger.kernel.org Signed-off-by: Andrey Konovalov andrey.konovalov@linaro.org Signed-off-by: Bryan O'Donoghue bryan.odonoghue@linaro.org Reviewed-by: Laurent Pinchart laurent.pinchart@ideasonboard.com Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/media/platform/qcom/camss/camss-csid-gen2.c | 9 +++------ 1 file changed, 3 insertions(+), 6 deletions(-)
--- a/drivers/media/platform/qcom/camss/camss-csid-gen2.c +++ b/drivers/media/platform/qcom/camss/camss-csid-gen2.c @@ -355,9 +355,6 @@ static void __csid_configure_stream(stru u8 dt_id = vc;
if (tg->enabled) { - /* Config Test Generator */ - vc = 0xa; - /* configure one DT, infinite frames */ val = vc << TPG_VC_CFG0_VC_NUM; val |= INTELEAVING_MODE_ONE_SHOT << TPG_VC_CFG0_LINE_INTERLEAVING_MODE; @@ -370,14 +367,14 @@ static void __csid_configure_stream(stru
writel_relaxed(0x12345678, csid->base + CSID_TPG_LFSR_SEED);
- val = input_format->height & 0x1fff << TPG_DT_n_CFG_0_FRAME_HEIGHT; - val |= input_format->width & 0x1fff << TPG_DT_n_CFG_0_FRAME_WIDTH; + val = (input_format->height & 0x1fff) << TPG_DT_n_CFG_0_FRAME_HEIGHT; + val |= (input_format->width & 0x1fff) << TPG_DT_n_CFG_0_FRAME_WIDTH; writel_relaxed(val, csid->base + CSID_TPG_DT_n_CFG_0(0));
val = format->data_type << TPG_DT_n_CFG_1_DATA_TYPE; writel_relaxed(val, csid->base + CSID_TPG_DT_n_CFG_1(0));
- val = tg->mode << TPG_DT_n_CFG_2_PAYLOAD_MODE; + val = (tg->mode - 1) << TPG_DT_n_CFG_2_PAYLOAD_MODE; val |= 0xBE << TPG_DT_n_CFG_2_USER_SPECIFIED_PAYLOAD; val |= format->decode_format << TPG_DT_n_CFG_2_ENCODE_FORMAT; writel_relaxed(val, csid->base + CSID_TPG_DT_n_CFG_2(0));
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Heiner Kallweit hkallweit1@gmail.com
commit 6a26310273c323380da21eb23fcfd50e31140913 upstream.
This reverts commit efa5f1311c4998e9e6317c52bc5ee93b3a0f36df.
I couldn't reproduce the reported issue. What I did, based on a pcap packet log provided by the reporter: - Used same chip version (RTL8168h) - Set MAC address to the one used on the reporters system - Replayed the EAPOL unicast packet that, according to the reporter, was filtered out by the mc filter. The packet was properly received.
Therefore the root cause of the reported issue seems to be somewhere else. Disabling mc filtering completely for the most common chip version is a quite big hammer. Therefore revert the change and wait for further analysis results from the reporter.
Cc: stable@vger.kernel.org Signed-off-by: Heiner Kallweit hkallweit1@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/realtek/r8169_main.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-)
--- a/drivers/net/ethernet/realtek/r8169_main.c +++ b/drivers/net/ethernet/realtek/r8169_main.c @@ -2599,9 +2599,7 @@ static void rtl_set_rx_mode(struct net_d rx_mode &= ~AcceptMulticast; } else if (netdev_mc_count(dev) > MC_FILTER_LIMIT || dev->flags & IFF_ALLMULTI || - tp->mac_version == RTL_GIGA_MAC_VER_35 || - tp->mac_version == RTL_GIGA_MAC_VER_46 || - tp->mac_version == RTL_GIGA_MAC_VER_48) { + tp->mac_version == RTL_GIGA_MAC_VER_35) { /* accept all multicasts */ } else if (netdev_mc_empty(dev)) { rx_mode &= ~AcceptMulticast;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Baokun Li libaokun1@huawei.com
commit 745f17a4166e79315e4b7f33ce89d03e75a76983 upstream.
We got a WARNING in ext4_add_complete_io: ================================================================== WARNING: at fs/ext4/page-io.c:231 ext4_put_io_end_defer+0x182/0x250 CPU: 10 PID: 77 Comm: ksoftirqd/10 Tainted: 6.3.0-rc2 #85 RIP: 0010:ext4_put_io_end_defer+0x182/0x250 [ext4] [...] Call Trace: <TASK> ext4_end_bio+0xa8/0x240 [ext4] bio_endio+0x195/0x310 blk_update_request+0x184/0x770 scsi_end_request+0x2f/0x240 scsi_io_completion+0x75/0x450 scsi_finish_command+0xef/0x160 scsi_complete+0xa3/0x180 blk_complete_reqs+0x60/0x80 blk_done_softirq+0x25/0x40 __do_softirq+0x119/0x4c8 run_ksoftirqd+0x42/0x70 smpboot_thread_fn+0x136/0x3c0 kthread+0x140/0x1a0 ret_from_fork+0x2c/0x50 ==================================================================
Above issue may happen as follows:
cpu1 cpu2 ----------------------------|---------------------------- mount -o dioread_lock ext4_writepages ext4_do_writepages *if (ext4_should_dioread_nolock(inode))* // rsv_blocks is not assigned here mount -o remount,dioread_nolock ext4_journal_start_with_reserve __ext4_journal_start __ext4_journal_start_sb jbd2__journal_start *if (rsv_blocks)* // h_rsv_handle is not initialized here mpage_map_and_submit_extent mpage_map_one_extent dioread_nolock = ext4_should_dioread_nolock(inode) if (dioread_nolock && (map->m_flags & EXT4_MAP_UNWRITTEN)) mpd->io_submit.io_end->handle = handle->h_rsv_handle ext4_set_io_unwritten_flag io_end->flag |= EXT4_IO_END_UNWRITTEN // now io_end->handle is NULL but has EXT4_IO_END_UNWRITTEN flag
scsi_finish_command scsi_io_completion scsi_io_completion_action scsi_end_request blk_update_request req_bio_endio bio_endio bio->bi_end_io > ext4_end_bio ext4_put_io_end_defer ext4_add_complete_io // trigger WARN_ON(!io_end->handle && sbi->s_journal);
The immediate cause of this problem is that ext4_should_dioread_nolock() function returns inconsistent values in the ext4_do_writepages() and mpage_map_one_extent(). There are four conditions in this function that can be changed at mount time to cause this problem. These four conditions can be divided into two categories:
(1) journal_data and EXT4_EXTENTS_FL, which can be changed by ioctl (2) DELALLOC and DIOREAD_NOLOCK, which can be changed by remount
The two in the first category have been fixed by commit c8585c6fcaf2 ("ext4: fix races between changing inode journal mode and ext4_writepages") and commit cb85f4d23f79 ("ext4: fix race between writepages and enabling EXT4_EXTENTS_FL") respectively.
Two cases in the other category have not yet been fixed, and the above issue is caused by this situation. We refer to the fix for the first category, when applying options during remount, we grab s_writepages_rwsem to avoid racing with writepages ops to trigger this problem.
Fixes: 6b523df4fb5a ("ext4: use transaction reservation for extent conversion in ext4_end_io") Cc: stable@vger.kernel.org Signed-off-by: Baokun Li libaokun1@huawei.com Reviewed-by: Jan Kara jack@suse.cz Link: https://lore.kernel.org/r/20230524072538.2883391-1-libaokun1@huawei.com Signed-off-by: Theodore Ts'o tytso@mit.edu Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ext4/ext4.h | 3 ++- fs/ext4/super.c | 14 ++++++++++++++ 2 files changed, 16 insertions(+), 1 deletion(-)
--- a/fs/ext4/ext4.h +++ b/fs/ext4/ext4.h @@ -1674,7 +1674,8 @@ struct ext4_sb_info {
/* * Barrier between writepages ops and changing any inode's JOURNAL_DATA - * or EXTENTS flag. + * or EXTENTS flag or between writepages ops and changing DELALLOC or + * DIOREAD_NOLOCK mount options on remount. */ struct percpu_rw_semaphore s_writepages_rwsem; struct dax_device *s_daxdev; --- a/fs/ext4/super.c +++ b/fs/ext4/super.c @@ -6425,6 +6425,7 @@ static int __ext4_remount(struct fs_cont struct ext4_mount_options old_opts; ext4_group_t g; int err = 0; + int alloc_ctx; #ifdef CONFIG_QUOTA int enable_quota = 0; int i, j; @@ -6465,7 +6466,16 @@ static int __ext4_remount(struct fs_cont
}
+ /* + * Changing the DIOREAD_NOLOCK or DELALLOC mount options may cause + * two calls to ext4_should_dioread_nolock() to return inconsistent + * values, triggering WARN_ON in ext4_add_complete_io(). we grab + * here s_writepages_rwsem to avoid race between writepages ops and + * remount. + */ + alloc_ctx = ext4_writepages_down_write(sb); ext4_apply_options(fc, sb); + ext4_writepages_up_write(sb, alloc_ctx);
if ((old_opts.s_mount_opt & EXT4_MOUNT_JOURNAL_CHECKSUM) ^ test_opt(sb, JOURNAL_CHECKSUM)) { @@ -6683,6 +6693,8 @@ restore_opts: if ((sb->s_flags & SB_RDONLY) && !(old_sb_flags & SB_RDONLY) && sb_any_quota_suspended(sb)) dquot_resume(sb, -1); + + alloc_ctx = ext4_writepages_down_write(sb); sb->s_flags = old_sb_flags; sbi->s_mount_opt = old_opts.s_mount_opt; sbi->s_mount_opt2 = old_opts.s_mount_opt2; @@ -6691,6 +6703,8 @@ restore_opts: sbi->s_commit_interval = old_opts.s_commit_interval; sbi->s_min_batch_time = old_opts.s_min_batch_time; sbi->s_max_batch_time = old_opts.s_max_batch_time; + ext4_writepages_up_write(sb, alloc_ctx); + if (!test_opt(sb, BLOCK_VALIDITY) && sbi->s_system_blks) ext4_release_system_zone(sb); #ifdef CONFIG_QUOTA
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Zhang Yi yi.zhang@huawei.com
commit 8e387c89e96b9543a339f84043cf9df15fed2632 upstream.
__insert_pending() allocate memory in atomic context, so the allocation could fail, but we are not handling that failure now. It could lead ext4_es_remove_extent() to get wrong reserved clusters, and the global data blocks reservation count will be incorrect. The same to extents_status entry preallocation, preallocate pending entry out of the i_es_lock with __GFP_NOFAIL, make sure __insert_pending() and __revise_pending() always succeeds.
Signed-off-by: Zhang Yi yi.zhang@huawei.com Cc: stable@kernel.org Link: https://lore.kernel.org/r/20230824092619.1327976-3-yi.zhang@huaweicloud.com Reviewed-by: Jan Kara jack@suse.cz Signed-off-by: Theodore Ts'o tytso@mit.edu Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ext4/extents_status.c | 123 ++++++++++++++++++++++++++++++++++------------- 1 file changed, 89 insertions(+), 34 deletions(-)
--- a/fs/ext4/extents_status.c +++ b/fs/ext4/extents_status.c @@ -152,8 +152,9 @@ static int __es_remove_extent(struct ino static int es_reclaim_extents(struct ext4_inode_info *ei, int *nr_to_scan); static int __es_shrink(struct ext4_sb_info *sbi, int nr_to_scan, struct ext4_inode_info *locked_ei); -static void __revise_pending(struct inode *inode, ext4_lblk_t lblk, - ext4_lblk_t len); +static int __revise_pending(struct inode *inode, ext4_lblk_t lblk, + ext4_lblk_t len, + struct pending_reservation **prealloc);
int __init ext4_init_es(void) { @@ -448,6 +449,19 @@ static void ext4_es_list_del(struct inod spin_unlock(&sbi->s_es_lock); }
+static inline struct pending_reservation *__alloc_pending(bool nofail) +{ + if (!nofail) + return kmem_cache_alloc(ext4_pending_cachep, GFP_ATOMIC); + + return kmem_cache_zalloc(ext4_pending_cachep, GFP_KERNEL | __GFP_NOFAIL); +} + +static inline void __free_pending(struct pending_reservation *pr) +{ + kmem_cache_free(ext4_pending_cachep, pr); +} + /* * Returns true if we cannot fail to allocate memory for this extent_status * entry and cannot reclaim it until its status changes. @@ -836,11 +850,12 @@ void ext4_es_insert_extent(struct inode { struct extent_status newes; ext4_lblk_t end = lblk + len - 1; - int err1 = 0; - int err2 = 0; + int err1 = 0, err2 = 0, err3 = 0; struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb); struct extent_status *es1 = NULL; struct extent_status *es2 = NULL; + struct pending_reservation *pr = NULL; + bool revise_pending = false;
if (EXT4_SB(inode->i_sb)->s_mount_state & EXT4_FC_REPLAY) return; @@ -868,11 +883,17 @@ void ext4_es_insert_extent(struct inode
ext4_es_insert_extent_check(inode, &newes);
+ revise_pending = sbi->s_cluster_ratio > 1 && + test_opt(inode->i_sb, DELALLOC) && + (status & (EXTENT_STATUS_WRITTEN | + EXTENT_STATUS_UNWRITTEN)); retry: if (err1 && !es1) es1 = __es_alloc_extent(true); if ((err1 || err2) && !es2) es2 = __es_alloc_extent(true); + if ((err1 || err2 || err3) && revise_pending && !pr) + pr = __alloc_pending(true); write_lock(&EXT4_I(inode)->i_es_lock);
err1 = __es_remove_extent(inode, lblk, end, NULL, es1); @@ -897,13 +918,18 @@ retry: es2 = NULL; }
- if (sbi->s_cluster_ratio > 1 && test_opt(inode->i_sb, DELALLOC) && - (status & EXTENT_STATUS_WRITTEN || - status & EXTENT_STATUS_UNWRITTEN)) - __revise_pending(inode, lblk, len); + if (revise_pending) { + err3 = __revise_pending(inode, lblk, len, &pr); + if (err3 != 0) + goto error; + if (pr) { + __free_pending(pr); + pr = NULL; + } + } error: write_unlock(&EXT4_I(inode)->i_es_lock); - if (err1 || err2) + if (err1 || err2 || err3) goto retry;
ext4_es_print_tree(inode); @@ -1311,7 +1337,7 @@ static unsigned int get_rsvd(struct inod rc->ndelonly--; node = rb_next(&pr->rb_node); rb_erase(&pr->rb_node, &tree->root); - kmem_cache_free(ext4_pending_cachep, pr); + __free_pending(pr); if (!node) break; pr = rb_entry(node, struct pending_reservation, @@ -1907,11 +1933,13 @@ static struct pending_reservation *__get * * @inode - file containing the cluster * @lblk - logical block in the cluster to be added + * @prealloc - preallocated pending entry * * Returns 0 on successful insertion and -ENOMEM on failure. If the * pending reservation is already in the set, returns successfully. */ -static int __insert_pending(struct inode *inode, ext4_lblk_t lblk) +static int __insert_pending(struct inode *inode, ext4_lblk_t lblk, + struct pending_reservation **prealloc) { struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb); struct ext4_pending_tree *tree = &EXT4_I(inode)->i_pending_tree; @@ -1937,10 +1965,15 @@ static int __insert_pending(struct inode } }
- pr = kmem_cache_alloc(ext4_pending_cachep, GFP_ATOMIC); - if (pr == NULL) { - ret = -ENOMEM; - goto out; + if (likely(*prealloc == NULL)) { + pr = __alloc_pending(false); + if (!pr) { + ret = -ENOMEM; + goto out; + } + } else { + pr = *prealloc; + *prealloc = NULL; } pr->lclu = lclu;
@@ -1970,7 +2003,7 @@ static void __remove_pending(struct inod if (pr != NULL) { tree = &EXT4_I(inode)->i_pending_tree; rb_erase(&pr->rb_node, &tree->root); - kmem_cache_free(ext4_pending_cachep, pr); + __free_pending(pr); } }
@@ -2029,10 +2062,10 @@ void ext4_es_insert_delayed_block(struct bool allocated) { struct extent_status newes; - int err1 = 0; - int err2 = 0; + int err1 = 0, err2 = 0, err3 = 0; struct extent_status *es1 = NULL; struct extent_status *es2 = NULL; + struct pending_reservation *pr = NULL;
if (EXT4_SB(inode->i_sb)->s_mount_state & EXT4_FC_REPLAY) return; @@ -2052,6 +2085,8 @@ retry: es1 = __es_alloc_extent(true); if ((err1 || err2) && !es2) es2 = __es_alloc_extent(true); + if ((err1 || err2 || err3) && allocated && !pr) + pr = __alloc_pending(true); write_lock(&EXT4_I(inode)->i_es_lock);
err1 = __es_remove_extent(inode, lblk, lblk, NULL, es1); @@ -2074,11 +2109,18 @@ retry: es2 = NULL; }
- if (allocated) - __insert_pending(inode, lblk); + if (allocated) { + err3 = __insert_pending(inode, lblk, &pr); + if (err3 != 0) + goto error; + if (pr) { + __free_pending(pr); + pr = NULL; + } + } error: write_unlock(&EXT4_I(inode)->i_es_lock); - if (err1 || err2) + if (err1 || err2 || err3) goto retry;
ext4_es_print_tree(inode); @@ -2184,21 +2226,24 @@ unsigned int ext4_es_delayed_clu(struct * @inode - file containing the range * @lblk - logical block defining the start of range * @len - length of range in blocks + * @prealloc - preallocated pending entry * * Used after a newly allocated extent is added to the extents status tree. * Requires that the extents in the range have either written or unwritten * status. Must be called while holding i_es_lock. */ -static void __revise_pending(struct inode *inode, ext4_lblk_t lblk, - ext4_lblk_t len) +static int __revise_pending(struct inode *inode, ext4_lblk_t lblk, + ext4_lblk_t len, + struct pending_reservation **prealloc) { struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb); ext4_lblk_t end = lblk + len - 1; ext4_lblk_t first, last; bool f_del = false, l_del = false; + int ret = 0;
if (len == 0) - return; + return 0;
/* * Two cases - block range within single cluster and block range @@ -2219,7 +2264,9 @@ static void __revise_pending(struct inod f_del = __es_scan_range(inode, &ext4_es_is_delonly, first, lblk - 1); if (f_del) { - __insert_pending(inode, first); + ret = __insert_pending(inode, first, prealloc); + if (ret < 0) + goto out; } else { last = EXT4_LBLK_CMASK(sbi, end) + sbi->s_cluster_ratio - 1; @@ -2227,9 +2274,11 @@ static void __revise_pending(struct inod l_del = __es_scan_range(inode, &ext4_es_is_delonly, end + 1, last); - if (l_del) - __insert_pending(inode, last); - else + if (l_del) { + ret = __insert_pending(inode, last, prealloc); + if (ret < 0) + goto out; + } else __remove_pending(inode, last); } } else { @@ -2237,18 +2286,24 @@ static void __revise_pending(struct inod if (first != lblk) f_del = __es_scan_range(inode, &ext4_es_is_delonly, first, lblk - 1); - if (f_del) - __insert_pending(inode, first); - else + if (f_del) { + ret = __insert_pending(inode, first, prealloc); + if (ret < 0) + goto out; + } else __remove_pending(inode, first);
last = EXT4_LBLK_CMASK(sbi, end) + sbi->s_cluster_ratio - 1; if (last != end) l_del = __es_scan_range(inode, &ext4_es_is_delonly, end + 1, last); - if (l_del) - __insert_pending(inode, last); - else + if (l_del) { + ret = __insert_pending(inode, last, prealloc); + if (ret < 0) + goto out; + } else __remove_pending(inode, last); } +out: + return ret; }
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Max Kellermann max.kellermann@ionos.com
commit 484fd6c1de13b336806a967908a927cc0356e312 upstream.
The function ext4_init_acl() calls posix_acl_create() which is responsible for applying the umask. But without CONFIG_EXT4_FS_POSIX_ACL, ext4_init_acl() is an empty inline function, and nobody applies the umask.
This fixes a bug which causes the umask to be ignored with O_TMPFILE on ext4:
https://github.com/MusicPlayerDaemon/MPD/issues/558 https://bugs.gentoo.org/show_bug.cgi?id=686142#c3 https://bugzilla.kernel.org/show_bug.cgi?id=203625
Reviewed-by: "J. Bruce Fields" bfields@redhat.com Cc: stable@vger.kernel.org Signed-off-by: Max Kellermann max.kellermann@ionos.com Link: https://lore.kernel.org/r/20230919081824.1096619-1-max.kellermann@ionos.com Signed-off-by: Theodore Ts'o tytso@mit.edu Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ext4/acl.h | 5 +++++ 1 file changed, 5 insertions(+)
--- a/fs/ext4/acl.h +++ b/fs/ext4/acl.h @@ -68,6 +68,11 @@ extern int ext4_init_acl(handle_t *, str static inline int ext4_init_acl(handle_t *handle, struct inode *inode, struct inode *dir) { + /* usually, the umask is applied by posix_acl_create(), but if + ext4 ACL support is disabled at compile time, we need to do + it here, because posix_acl_create() will never be called */ + inode->i_mode &= ~current_umask(); + return 0; } #endif /* CONFIG_EXT4_FS_POSIX_ACL */
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kemeng Shi shikemeng@huaweicloud.com
commit 31f13421c004a420c0e9d288859c9ea9259ea0cc upstream.
Commit 0aeaa2559d6d5 ("ext4: fix corruption when online resizing a 1K bigalloc fs") found that primary superblock's offset in its group is not equal to offset of backup superblock in its group when block size is 1K and bigalloc is enabled. As group descriptor blocks are right after superblock, we can't pass block number of gdb to update_backups for the same reason.
The root casue of the issue above is that leading 1K padding block is count as data block offset for primary block while backup block has no padding block offset in its group.
Remove padding data block count to fix the issue for gdb backups.
For meta_bg case, update_backups treat blk_off as block number, do no conversion in this case.
Signed-off-by: Kemeng Shi shikemeng@huaweicloud.com Reviewed-by: Theodore Ts'o tytso@mit.edu Link: https://lore.kernel.org/r/20230826174712.4059355-2-shikemeng@huaweicloud.com Signed-off-by: Theodore Ts'o tytso@mit.edu Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ext4/resize.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
--- a/fs/ext4/resize.c +++ b/fs/ext4/resize.c @@ -1601,6 +1601,8 @@ exit_journal: int gdb_num_end = ((group + flex_gd->count - 1) / EXT4_DESC_PER_BLOCK(sb)); int meta_bg = ext4_has_feature_meta_bg(sb); + sector_t padding_blocks = meta_bg ? 0 : sbi->s_sbh->b_blocknr - + ext4_group_first_block_no(sb, 0); sector_t old_gdb = 0;
update_backups(sb, ext4_group_first_block_no(sb, 0), @@ -1612,8 +1614,8 @@ exit_journal: gdb_num); if (old_gdb == gdb_bh->b_blocknr) continue; - update_backups(sb, gdb_bh->b_blocknr, gdb_bh->b_data, - gdb_bh->b_size, meta_bg); + update_backups(sb, gdb_bh->b_blocknr - padding_blocks, + gdb_bh->b_data, gdb_bh->b_size, meta_bg); old_gdb = gdb_bh->b_blocknr; } }
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ojaswin Mujoo ojaswin@linux.ibm.com
commit 2cd8bdb5efc1e0d5b11a4b7ba6b922fd2736a87f upstream.
** Short Version **
In ext4 with dioread_nolock, we could have a scenario where the bh returned by get_blocks (ext4_get_block_unwritten()) in __block_write_begin_int() has UNWRITTEN and MAPPED flag set. Since such a bh does not have NEW flag set we never zero out the range of bh that is not under write, causing whatever stale data is present in the folio at that time to be written out to disk. To fix this mark the buffer as new, in case it is unwritten, in ext4_get_block_unwritten().
** Long Version **
The issue mentioned above was resulting in two different bugs:
1. On block size < page size case in ext4, generic/269 was reliably failing with dioread_nolock. The state of the write was as follows:
* The write was extending i_size. * The last block of the file was fallocated and had an unwritten extent * We were near ENOSPC and hence we were switching to non-delayed alloc allocation.
In this case, the back trace that triggers the bug is as follows:
ext4_da_write_begin() /* switch to nodelalloc due to low space */ ext4_write_begin() ext4_should_dioread_nolock() // true since mount flags still have delalloc __block_write_begin(..., ext4_get_block_unwritten) __block_write_begin_int() for(each buffer head in page) { /* first iteration, this is bh1 which contains i_size */ if (!buffer_mapped) get_block() /* returns bh with only UNWRITTEN and MAPPED */ /* second iteration, bh2 */ if (!buffer_mapped) get_block() /* we fail here, could be ENOSPC */ } if (err) /* * this would zero out all new buffers and mark them uptodate. * Since bh1 was never marked new, we skip it here which causes * the bug later. */ folio_zero_new_buffers(); /* ext4_wrte_begin() error handling */ ext4_truncate_failed_write() ext4_truncate() ext4_block_truncate_page() __ext4_block_zero_page_range() if(!buffer_uptodate()) ext4_read_bh_lock() ext4_read_bh() -> ... ext4_submit_bh_wbc() BUG_ON(buffer_unwritten(bh)); /* !!! */
2. The second issue is stale data exposure with page size >= blocksize with dioread_nolock. The conditions needed for it to happen are same as the previous issue ie dioread_nolock around ENOSPC condition. The issue is also similar where in __block_write_begin_int() when we call ext4_get_block_unwritten() on the buffer_head and the underlying extent is unwritten, we get an unwritten and mapped buffer head. Since it is not new, we never zero out the partial range which is not under write, thus writing stale data to disk. This can be easily observed with the following reproducer:
fallocate -l 4k testfile xfs_io -c "pwrite 2k 2k" testfile # hexdump output will have stale data in from byte 0 to 2k in testfile hexdump -C testfile
NOTE: To trigger this, we need dioread_nolock enabled and write happening via ext4_write_begin(), which is usually used when we have -o nodealloc. Since dioread_nolock is disabled with nodelalloc, the only alternate way to call ext4_write_begin() is to ensure that delayed alloc switches to nodelalloc ie ext4_da_write_begin() calls ext4_write_begin(). This will usually happen when ext4 is almost full like the way generic/269 was triggering it in Issue 1 above. This might make the issue harder to hit. Hence, for reliable replication, I used the below patch to temporarily allow dioread_nolock with nodelalloc and then mount the disk with -o nodealloc,dioread_nolock. With this you can hit the stale data issue 100% of times:
@@ -508,8 +508,8 @@ static inline int ext4_should_dioread_nolock(struct inode *inode) if (ext4_should_journal_data(inode)) return 0; /* temporary fix to prevent generic/422 test failures */ - if (!test_opt(inode->i_sb, DELALLOC)) - return 0; + // if (!test_opt(inode->i_sb, DELALLOC)) + // return 0; return 1; }
After applying this patch to mark buffer as NEW, both the above issues are fixed.
Signed-off-by: Ojaswin Mujoo ojaswin@linux.ibm.com Cc: stable@kernel.org Reviewed-by: Jan Kara jack@suse.cz Reviewed-by: "Ritesh Harjani (IBM)" ritesh.list@gmail.com Link: https://lore.kernel.org/r/d0ed09d70a9733fbb5349c5c7b125caac186ecdf.169503364... Signed-off-by: Theodore Ts'o tytso@mit.edu Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ext4/inode.c | 14 +++++++++++++- 1 file changed, 13 insertions(+), 1 deletion(-)
--- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -789,10 +789,22 @@ int ext4_get_block(struct inode *inode, int ext4_get_block_unwritten(struct inode *inode, sector_t iblock, struct buffer_head *bh_result, int create) { + int ret = 0; + ext4_debug("ext4_get_block_unwritten: inode %lu, create flag %d\n", inode->i_ino, create); - return _ext4_get_block(inode, iblock, bh_result, + ret = _ext4_get_block(inode, iblock, bh_result, EXT4_GET_BLOCKS_CREATE_UNWRIT_EXT); + + /* + * If the buffer is marked unwritten, mark it as new to make sure it is + * zeroed out correctly in case of partial writes. Otherwise, there is + * a chance of stale data getting exposed. + */ + if (ret == 0 && buffer_unwritten(bh_result)) + set_buffer_new(bh_result); + + return ret; }
/* Maximum number of blocks we map for direct IO at once. */
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kemeng Shi shikemeng@huaweicloud.com
commit 48f1551592c54f7d8e2befc72a99ff4e47f7dca0 upstream.
Avoid to ignore error in "err".
Signed-off-by: Kemeng Shi shikemeng@huaweicloud.com Link: https://lore.kernel.org/r/20230826174712.4059355-4-shikemeng@huaweicloud.com Signed-off-by: Theodore Ts'o tytso@mit.edu Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ext4/resize.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-)
--- a/fs/ext4/resize.c +++ b/fs/ext4/resize.c @@ -1982,9 +1982,7 @@ static int ext4_convert_meta_bg(struct s
errout: ret = ext4_journal_stop(handle); - if (!err) - err = ret; - return ret; + return err ? err : ret;
invalid_resize_inode: ext4_error(sb, "corrupted/inconsistent resize inode");
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Zhang Yi yi.zhang@huawei.com
commit 40ea98396a3659062267d1fe5f99af4f7e4f05e3 upstream.
When big allocate feature is enabled, we need to count and update reserved clusters before removing a delayed only extent_status entry. {init|count|get}_rsvd() have already done this, but the start block number of this counting isn't correct in the following case.
lblk end | | v v ------------------------- | | orig_es ------------------------- ^ ^ len1 is 0 | len2 |
If the start block of the orig_es entry founded is bigger than lblk, we passed lblk as start block to count_rsvd(), but the length is correct, finally, the range to be counted is offset. This patch fix this by passing the start blocks to 'orig_es->lblk + len1'.
Signed-off-by: Zhang Yi yi.zhang@huawei.com Cc: stable@kernel.org Link: https://lore.kernel.org/r/20230824092619.1327976-2-yi.zhang@huaweicloud.com Signed-off-by: Theodore Ts'o tytso@mit.edu Reviewed-by: Jan Kara jack@suse.cz Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ext4/extents_status.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/fs/ext4/extents_status.c +++ b/fs/ext4/extents_status.c @@ -1431,8 +1431,8 @@ static int __es_remove_extent(struct ino } } if (count_reserved) - count_rsvd(inode, lblk, orig_es.es_len - len1 - len2, - &orig_es, &rc); + count_rsvd(inode, orig_es.es_lblk + len1, + orig_es.es_len - len1 - len2, &orig_es, &rc); goto out_get_reserved; }
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kemeng Shi shikemeng@huaweicloud.com
commit 40dd7953f4d606c280074f10d23046b6812708ce upstream.
Wrong check of gdb backup in meta bg as following: first_group is the first group of meta_bg which contains target group, so target group is always >= first_group. We check if target group has gdb backup by comparing first_group with [group + 1] and [group + EXT4_DESC_PER_BLOCK(sb) - 1]. As group >= first_group, then [group + N] is
first_group. So no copy of gdb backup in meta bg is done in
setup_new_flex_group_blocks.
No need to do gdb backup copy in meta bg from setup_new_flex_group_blocks as we always copy updated gdb block to backups at end of ext4_flex_group_add as following:
ext4_flex_group_add /* no gdb backup copy for meta bg any more */ setup_new_flex_group_blocks
/* update current group number */ ext4_update_super sbi->s_groups_count += flex_gd->count;
/* * if group in meta bg contains backup is added, the primary gdb block * of the meta bg will be copy to backup in new added group here. */ for (; gdb_num <= gdb_num_end; gdb_num++) update_backups(...)
In summary, we can remove wrong gdb backup copy code in setup_new_flex_group_blocks.
Signed-off-by: Kemeng Shi shikemeng@huaweicloud.com Reviewed-by: Theodore Ts'o tytso@mit.edu Link: https://lore.kernel.org/r/20230826174712.4059355-5-shikemeng@huaweicloud.com Signed-off-by: Theodore Ts'o tytso@mit.edu Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ext4/resize.c | 9 ++------- 1 file changed, 2 insertions(+), 7 deletions(-)
--- a/fs/ext4/resize.c +++ b/fs/ext4/resize.c @@ -560,13 +560,8 @@ static int setup_new_flex_group_blocks(s if (meta_bg == 0 && !ext4_bg_has_super(sb, group)) goto handle_itb;
- if (meta_bg == 1) { - ext4_group_t first_group; - first_group = ext4_meta_bg_first_group(sb, group); - if (first_group != group + 1 && - first_group != group + EXT4_DESC_PER_BLOCK(sb) - 1) - goto handle_itb; - } + if (meta_bg == 1) + goto handle_itb;
block = start + ext4_bg_has_super(sb, group); /* Copy all of the GDT blocks into the backup in this group */
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kemeng Shi shikemeng@huaweicloud.com
commit 9adac8b01f4be28acd5838aade42b8daa4f0b642 upstream.
add missed brelse in update_backups
Signed-off-by: Kemeng Shi shikemeng@huaweicloud.com Reviewed-by: Theodore Ts'o tytso@mit.edu Link: https://lore.kernel.org/r/20230826174712.4059355-3-shikemeng@huaweicloud.com Signed-off-by: Theodore Ts'o tytso@mit.edu Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ext4/resize.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
--- a/fs/ext4/resize.c +++ b/fs/ext4/resize.c @@ -1186,8 +1186,10 @@ static void update_backups(struct super_ ext4_group_first_block_no(sb, group)); BUFFER_TRACE(bh, "get_write_access"); if ((err = ext4_journal_get_write_access(handle, sb, bh, - EXT4_JTR_NONE))) + EXT4_JTR_NONE))) { + brelse(bh); break; + } lock_buffer(bh); memcpy(bh->b_data, data, size); if (rest)
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jan Kara jack@suse.cz
commit 91562895f8030cb9a0470b1db49de79346a69f91 upstream.
Gao Xiang has reported that on ext4 O_SYNC direct IO does not properly sync file size update and thus if we crash at unfortunate moment, the file can have smaller size although O_SYNC IO has reported successful completion. The problem happens because update of on-disk inode size is handled in ext4_dio_write_iter() *after* iomap_dio_rw() (and thus dio_complete() in particular) has returned and generic_file_sync() gets called by dio_complete(). Fix the problem by handling on-disk inode size update directly in our ->end_io completion handler.
References: https://lore.kernel.org/all/02d18236-26ef-09b0-90ad-030c4fe3ee20@linux.aliba... Reported-by: Gao Xiang hsiangkao@linux.alibaba.com CC: stable@vger.kernel.org Fixes: 378f32bab371 ("ext4: introduce direct I/O write using iomap infrastructure") Signed-off-by: Jan Kara jack@suse.cz Tested-by: Joseph Qi joseph.qi@linux.alibaba.com Reviewed-by: "Ritesh Harjani (IBM)" ritesh.list@gmail.com Link: https://lore.kernel.org/r/20231013121350.26872-1-jack@suse.cz Signed-off-by: Theodore Ts'o tytso@mit.edu Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ext4/file.c | 153 ++++++++++++++++++++++++--------------------------------- 1 file changed, 65 insertions(+), 88 deletions(-)
--- a/fs/ext4/file.c +++ b/fs/ext4/file.c @@ -306,80 +306,38 @@ out: }
static ssize_t ext4_handle_inode_extension(struct inode *inode, loff_t offset, - ssize_t written, size_t count) + ssize_t count) { handle_t *handle; - bool truncate = false; - u8 blkbits = inode->i_blkbits; - ext4_lblk_t written_blk, end_blk; - int ret; - - /* - * Note that EXT4_I(inode)->i_disksize can get extended up to - * inode->i_size while the I/O was running due to writeback of delalloc - * blocks. But, the code in ext4_iomap_alloc() is careful to use - * zeroed/unwritten extents if this is possible; thus we won't leave - * uninitialized blocks in a file even if we didn't succeed in writing - * as much as we intended. - */ - WARN_ON_ONCE(i_size_read(inode) < EXT4_I(inode)->i_disksize); - if (offset + count <= EXT4_I(inode)->i_disksize) { - /* - * We need to ensure that the inode is removed from the orphan - * list if it has been added prematurely, due to writeback of - * delalloc blocks. - */ - if (!list_empty(&EXT4_I(inode)->i_orphan) && inode->i_nlink) { - handle = ext4_journal_start(inode, EXT4_HT_INODE, 2); - - if (IS_ERR(handle)) { - ext4_orphan_del(NULL, inode); - return PTR_ERR(handle); - } - - ext4_orphan_del(handle, inode); - ext4_journal_stop(handle); - } - - return written; - } - - if (written < 0) - goto truncate;
+ lockdep_assert_held_write(&inode->i_rwsem); handle = ext4_journal_start(inode, EXT4_HT_INODE, 2); - if (IS_ERR(handle)) { - written = PTR_ERR(handle); - goto truncate; - } + if (IS_ERR(handle)) + return PTR_ERR(handle);
- if (ext4_update_inode_size(inode, offset + written)) { - ret = ext4_mark_inode_dirty(handle, inode); + if (ext4_update_inode_size(inode, offset + count)) { + int ret = ext4_mark_inode_dirty(handle, inode); if (unlikely(ret)) { - written = ret; ext4_journal_stop(handle); - goto truncate; + return ret; } }
- /* - * We may need to truncate allocated but not written blocks beyond EOF. - */ - written_blk = ALIGN(offset + written, 1 << blkbits); - end_blk = ALIGN(offset + count, 1 << blkbits); - if (written_blk < end_blk && ext4_can_truncate(inode)) - truncate = true; - - /* - * Remove the inode from the orphan list if it has been extended and - * everything went OK. - */ - if (!truncate && inode->i_nlink) + if (inode->i_nlink) ext4_orphan_del(handle, inode); ext4_journal_stop(handle);
- if (truncate) { -truncate: + return count; +} + +/* + * Clean up the inode after DIO or DAX extending write has completed and the + * inode size has been updated using ext4_handle_inode_extension(). + */ +static void ext4_inode_extension_cleanup(struct inode *inode, ssize_t count) +{ + lockdep_assert_held_write(&inode->i_rwsem); + if (count < 0) { ext4_truncate_failed_write(inode); /* * If the truncate operation failed early, then the inode may @@ -388,9 +346,28 @@ truncate: */ if (inode->i_nlink) ext4_orphan_del(NULL, inode); + return; } + /* + * If i_disksize got extended due to writeback of delalloc blocks while + * the DIO was running we could fail to cleanup the orphan list in + * ext4_handle_inode_extension(). Do it now. + */ + if (!list_empty(&EXT4_I(inode)->i_orphan) && inode->i_nlink) { + handle_t *handle = ext4_journal_start(inode, EXT4_HT_INODE, 2);
- return written; + if (IS_ERR(handle)) { + /* + * The write has successfully completed. Not much to + * do with the error here so just cleanup the orphan + * list and hope for the best. + */ + ext4_orphan_del(NULL, inode); + return; + } + ext4_orphan_del(handle, inode); + ext4_journal_stop(handle); + } }
static int ext4_dio_write_end_io(struct kiocb *iocb, ssize_t size, @@ -399,31 +376,22 @@ static int ext4_dio_write_end_io(struct loff_t pos = iocb->ki_pos; struct inode *inode = file_inode(iocb->ki_filp);
+ if (!error && size && flags & IOMAP_DIO_UNWRITTEN) + error = ext4_convert_unwritten_extents(NULL, inode, pos, size); if (error) return error; - - if (size && flags & IOMAP_DIO_UNWRITTEN) { - error = ext4_convert_unwritten_extents(NULL, inode, pos, size); - if (error < 0) - return error; - } /* - * If we are extending the file, we have to update i_size here before - * page cache gets invalidated in iomap_dio_rw(). Otherwise racing - * buffered reads could zero out too much from page cache pages. Update - * of on-disk size will happen later in ext4_dio_write_iter() where - * we have enough information to also perform orphan list handling etc. - * Note that we perform all extending writes synchronously under - * i_rwsem held exclusively so i_size update is safe here in that case. - * If the write was not extending, we cannot see pos > i_size here - * because operations reducing i_size like truncate wait for all - * outstanding DIO before updating i_size. + * Note that EXT4_I(inode)->i_disksize can get extended up to + * inode->i_size while the I/O was running due to writeback of delalloc + * blocks. But the code in ext4_iomap_alloc() is careful to use + * zeroed/unwritten extents if this is possible; thus we won't leave + * uninitialized blocks in a file even if we didn't succeed in writing + * as much as we intended. */ - pos += size; - if (pos > i_size_read(inode)) - i_size_write(inode, pos); - - return 0; + WARN_ON_ONCE(i_size_read(inode) < READ_ONCE(EXT4_I(inode)->i_disksize)); + if (pos + size <= READ_ONCE(EXT4_I(inode)->i_disksize)) + return size; + return ext4_handle_inode_extension(inode, pos, size); }
static const struct iomap_dio_ops ext4_dio_write_ops = { @@ -606,9 +574,16 @@ static ssize_t ext4_dio_write_iter(struc dio_flags, NULL, 0); if (ret == -ENOTBLK) ret = 0; - - if (extend) - ret = ext4_handle_inode_extension(inode, offset, ret, count); + if (extend) { + /* + * We always perform extending DIO write synchronously so by + * now the IO is completed and ext4_handle_inode_extension() + * was called. Cleanup the inode in case of error or race with + * writeback of delalloc blocks. + */ + WARN_ON_ONCE(ret == -EIOCBQUEUED); + ext4_inode_extension_cleanup(inode, ret); + }
out: if (ilock_shared) @@ -689,8 +664,10 @@ ext4_dax_write_iter(struct kiocb *iocb,
ret = dax_iomap_rw(iocb, from, &ext4_iomap_ops);
- if (extend) - ret = ext4_handle_inode_extension(inode, offset, ret, count); + if (extend) { + ret = ext4_handle_inode_extension(inode, offset, ret); + ext4_inode_extension_cleanup(inode, ret); + } out: inode_unlock(inode); if (ret > 0)
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Brian Foster bfoster@redhat.com
commit ce56d21355cd6f6937aca32f1f44ca749d1e4808 upstream.
syzbot reports that the following warning from ext4_iomap_begin() triggers as of the commit referenced below:
if (WARN_ON_ONCE(ext4_has_inline_data(inode))) return -ERANGE;
This occurs during a dio write, which is never expected to encounter an inode with inline data. To enforce this behavior, ext4_dio_write_iter() checks the current inline state of the inode and clears the MAY_INLINE_DATA state flag to either fall back to buffered writes, or enforce that any other writers in progress on the inode are not allowed to create inline data.
The problem is that the check for existing inline data and the state flag can span a lock cycle. For example, if the ilock is originally locked shared and subsequently upgraded to exclusive, another writer may have reacquired the lock and created inline data before the dio write task acquires the lock and proceeds.
The commit referenced below loosens the lock requirements to allow some forms of unaligned dio writes to occur under shared lock, but AFAICT the inline data check was technically already racy for any dio write that would have involved a lock cycle. Regardless, lift clearing of the state bit to the same lock critical section that checks for preexisting inline data on the inode to close the race.
Cc: stable@kernel.org Reported-by: syzbot+307da6ca5cb0d01d581a@syzkaller.appspotmail.com Fixes: 310ee0902b8d ("ext4: allow concurrent unaligned dio overwrites") Signed-off-by: Brian Foster bfoster@redhat.com Link: https://lore.kernel.org/r/20231002185020.531537-1-bfoster@redhat.com Signed-off-by: Theodore Ts'o tytso@mit.edu Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ext4/file.c | 16 +++++++++------- 1 file changed, 9 insertions(+), 7 deletions(-)
--- a/fs/ext4/file.c +++ b/fs/ext4/file.c @@ -537,18 +537,20 @@ static ssize_t ext4_dio_write_iter(struc return ext4_buffered_write_iter(iocb, from); }
+ /* + * Prevent inline data from being created since we are going to allocate + * blocks for DIO. We know the inode does not currently have inline data + * because ext4_should_use_dio() checked for it, but we have to clear + * the state flag before the write checks because a lock cycle could + * introduce races with other writers. + */ + ext4_clear_inode_state(inode, EXT4_STATE_MAY_INLINE_DATA); + ret = ext4_dio_write_checks(iocb, from, &ilock_shared, &extend, &unwritten, &dio_flags); if (ret <= 0) return ret;
- /* - * Make sure inline data cannot be created anymore since we are going - * to allocate blocks for DIO. We know the inode does not have any - * inline data now because ext4_dio_supported() checked for that. - */ - ext4_clear_inode_state(inode, EXT4_STATE_MAY_INLINE_DATA); - offset = iocb->ki_pos; count = ret;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Bas Nieuwenhuizen bas@basnieuwenhuizen.nl
commit 08e9ebc75b5bcfec9d226f9e16bab2ab7b25a39a upstream.
The incoming strings might not be terminated by a newline or a 0.
(found while testing a program that just wrote the string itself, causing a crash)
Cc: stable@vger.kernel.org Fixes: e3933f26b657 ("drm/amd/pp: Add edit/commit/show OD clock/voltage support in sysfs") Signed-off-by: Bas Nieuwenhuizen bas@basnieuwenhuizen.nl Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/pm/amdgpu_pm.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
--- a/drivers/gpu/drm/amd/pm/amdgpu_pm.c +++ b/drivers/gpu/drm/amd/pm/amdgpu_pm.c @@ -734,7 +734,7 @@ static ssize_t amdgpu_set_pp_od_clk_volt if (adev->in_suspend && !adev->in_runpm) return -EPERM;
- if (count > 127) + if (count > 127 || count == 0) return -EINVAL;
if (*buf == 's') @@ -754,7 +754,8 @@ static ssize_t amdgpu_set_pp_od_clk_volt else return -EINVAL;
- memcpy(buf_cpy, buf, count+1); + memcpy(buf_cpy, buf, count); + buf_cpy[count] = 0;
tmp_str = buf_cpy;
@@ -771,6 +772,9 @@ static ssize_t amdgpu_set_pp_od_clk_volt return -EINVAL; parameter_size++;
+ if (!tmp_str) + break; + while (isspace(*tmp_str)) tmp_str++; }
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jani Nikula jani.nikula@intel.com
commit 81995ee1620318b4c7bbeb02bcc372da2c078c76 upstream.
The drm stack does not expect error valued pointers for EDID anywhere.
Fixes: e66856508746 ("drm: bridge: it66121: Set DDC preamble only once before reading EDID") Cc: Paul Cercueil paul@crapouillou.net Cc: Robert Foss robert.foss@linaro.org Cc: Phong LE ple@baylibre.com Cc: Neil Armstrong neil.armstrong@linaro.org Cc: Andrzej Hajda andrzej.hajda@intel.com Cc: Robert Foss rfoss@kernel.org Cc: Laurent Pinchart Laurent.pinchart@ideasonboard.com Cc: Jonas Karlman jonas@kwiboo.se Cc: Jernej Skrabec jernej.skrabec@gmail.com Cc: stable@vger.kernel.org # v6.3+ Signed-off-by: Jani Nikula jani.nikula@intel.com Reviewed-by: Neil Armstrong neil.armstrong@linaro.org Signed-off-by: Paul Cercueil paul@crapouillou.net Link: https://patchwork.freedesktop.org/patch/msgid/20230914131159.2472513-1-jani.... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/bridge/ite-it66121.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/bridge/ite-it66121.c b/drivers/gpu/drm/bridge/ite-it66121.c index 3c9b42c9d2ee..1cf3fb1f13dc 100644 --- a/drivers/gpu/drm/bridge/ite-it66121.c +++ b/drivers/gpu/drm/bridge/ite-it66121.c @@ -884,14 +884,14 @@ static struct edid *it66121_bridge_get_edid(struct drm_bridge *bridge, mutex_lock(&ctx->lock); ret = it66121_preamble_ddc(ctx); if (ret) { - edid = ERR_PTR(ret); + edid = NULL; goto out_unlock; }
ret = regmap_write(ctx->regmap, IT66121_DDC_HEADER_REG, IT66121_DDC_HEADER_EDID); if (ret) { - edid = ERR_PTR(ret); + edid = NULL; goto out_unlock; }
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Chaitanya Kumar Borah chaitanya.kumar.borah@intel.com
commit ce4941c2d6459664761c9854701015d8e99414fb upstream.
eDP specification supports HBR3 link rate since v1.4a. Moreover, C10 phy can support HBR3 link rate for both DP and eDP. Therefore, do not clamp the supported rates for eDP at 6.75Gbps.
BSpec: 70073 74224
Signed-off-by: Chaitanya Kumar Borah chaitanya.kumar.borah@intel.com Reviewed-by: Mika Kahola mika.kahola@intel.com Signed-off-by: Mika Kahola mika.kahola@intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20231018113622.2761997-1-chait... (cherry picked from commit a3431650f30a94b179d419ef87c21213655c28cd) Signed-off-by: Jani Nikula jani.nikula@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/i915/display/intel_dp.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/gpu/drm/i915/display/intel_dp.c +++ b/drivers/gpu/drm/i915/display/intel_dp.c @@ -430,7 +430,7 @@ static int mtl_max_source_rate(struct in enum phy phy = intel_port_to_phy(i915, dig_port->base.port);
if (intel_is_c10phy(i915, phy)) - return intel_dp_is_edp(intel_dp) ? 675000 : 810000; + return 810000;
return 2000000; }
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ville Syrjälä ville.syrjala@linux.intel.com
commit 0cb89cd42fd22bbdec0b046c48f35775f5b88bdb upstream.
On GLK CDCLK frequency needs to be at least 2*96 MHz when accessing the audio hardware. Currently we bump the CDCLK frequency up temporarily (if not high enough already) whenever audio hardware is being accessed, and drop it back down afterwards.
With a single active pipe this works just fine as we can switch between all the valid CDCLK frequencies by changing the cd2x divider, which doesn't require a full modeset. However with multiple active pipes the cd2x divider trick no longer works, and thus we end up blinking all displays off and back on.
To avoid this let's just bump the CDCLK frequency to >=2*96MHz whenever multiple pipes are active. The downside is slightly higher power consumption, but that seems like an acceptable tradeoff. With a single active pipe we can stick to the current more optiomal (from power comsumption POV) behaviour.
Cc: stable@vger.kernel.org Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/9599 Signed-off-by: Ville Syrjälä ville.syrjala@linux.intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20231031160800.18371-1-ville.s... Reviewed-by: Jani Nikula jani.nikula@intel.com (cherry picked from commit 451eaa1a614c911f5a51078dcb68022874e4cb12) Signed-off-by: Jani Nikula jani.nikula@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/i915/display/intel_cdclk.c | 12 ++++++++++++ 1 file changed, 12 insertions(+)
--- a/drivers/gpu/drm/i915/display/intel_cdclk.c +++ b/drivers/gpu/drm/i915/display/intel_cdclk.c @@ -2680,6 +2680,18 @@ static int intel_compute_min_cdclk(struc for_each_pipe(dev_priv, pipe) min_cdclk = max(cdclk_state->min_cdclk[pipe], min_cdclk);
+ /* + * Avoid glk_force_audio_cdclk() causing excessive screen + * blinking when multiple pipes are active by making sure + * CDCLK frequency is always high enough for audio. With a + * single active pipe we can always change CDCLK frequency + * by changing the cd2x divider (see glk_cdclk_table[]) and + * thus a full modeset won't be needed then. + */ + if (IS_GEMINILAKE(dev_priv) && cdclk_state->active_pipes && + !is_power_of_2(cdclk_state->active_pipes)) + min_cdclk = max(2 * 96000, min_cdclk); + if (min_cdclk > dev_priv->display.cdclk.max_cdclk_freq) { drm_dbg_kms(&dev_priv->drm, "required cdclk (%d kHz) exceeds max (%d kHz)\n",
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kunwu Chan chentao@kylinos.cn
commit 1a8e9bad6ef563c28ab0f8619628d5511be55431 upstream.
Fix smatch warning: drivers/gpu/drm/i915/gem/i915_gem_context.c:847 set_proto_ctx_sseu() warn: potential spectre issue 'pc->user_engines' [r] (local cap)
Fixes: d4433c7600f7 ("drm/i915/gem: Use the proto-context to handle create parameters (v5)") Cc: stable@vger.kernel.org # v5.15+ Signed-off-by: Kunwu Chan chentao@kylinos.cn Reviewed-by: Tvrtko Ursulin tvrtko.ursulin@linux.intel.com Signed-off-by: Tvrtko Ursulin tvrtko.ursulin@intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20231103110922.430122-1-tvrtko... (cherry picked from commit 27b086382c22efb7e0a16442f7bdc2e120108ef3) Signed-off-by: Jani Nikula jani.nikula@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/i915/gem/i915_gem_context.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c @@ -844,6 +844,7 @@ static int set_proto_ctx_sseu(struct drm if (idx >= pc->num_user_engines) return -EINVAL;
+ idx = array_index_nospec(idx, pc->num_user_engines); pe = &pc->user_engines[idx];
/* Only render engine supports RPCS configuration. */
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nirmoy Das nirmoy.das@intel.com
commit 7d7a328d0e8d6edefb7b0d665185d468667588d0 upstream.
gen8_ggtt_invalidate() is only needed for limited set of platforms where GGTT is mapped as WC. This was added as way to fix WC based GGTT in commit 0f9b91c754b7 ("drm/i915: flush system agent TLBs on SNB") and there are no reference in HW docs that forces us to use this on non-WC backed GGTT.
This can also cause unwanted side-effects on XE_HP platforms where GFX_FLSH_CNTL_GEN6 is not valid anymore.
v2: Add a func to detect wc ggtt detection (Ville) v3: Improve commit log and add reference commit (Daniel)
Fixes: d2eae8e98d59 ("drm/i915/dg2: Drop force_probe requirement") Cc: Rodrigo Vivi rodrigo.vivi@intel.com Cc: Tvrtko Ursulin tvrtko.ursulin@linux.intel.com Cc: Joonas Lahtinen joonas.lahtinen@linux.intel.com Cc: Jani Nikula jani.nikula@linux.intel.com Cc: Jonathan Cavitt jonathan.cavitt@intel.com Cc: John Harrison john.c.harrison@intel.com Cc: Andi Shyti andi.shyti@linux.intel.com Cc: Ville Syrjälä ville.syrjala@linux.intel.com Cc: Daniel Vetter daniel@ffwll.ch Cc: stable@vger.kernel.org # v6.2+ Suggested-by: Matt Roper matthew.d.roper@intel.com Signed-off-by: Nirmoy Das nirmoy.das@intel.com Reviewed-by: Matt Roper matthew.d.roper@intel.com Reviewed-by: Andi Shyti andi.shyti@linux.intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20231018093815.1349-1-nirmoy.d... (cherry picked from commit 81de3e296b10a13e5c9f13172825b0d8d9495c68) Signed-off-by: Jani Nikula jani.nikula@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/i915/gt/intel_ggtt.c | 35 ++++++++++++++++++++++++----------- 1 file changed, 24 insertions(+), 11 deletions(-)
--- a/drivers/gpu/drm/i915/gt/intel_ggtt.c +++ b/drivers/gpu/drm/i915/gt/intel_ggtt.c @@ -190,6 +190,21 @@ void gen6_ggtt_invalidate(struct i915_gg spin_unlock_irq(&uncore->lock); }
+static bool needs_wc_ggtt_mapping(struct drm_i915_private *i915) +{ + /* + * On BXT+/ICL+ writes larger than 64 bit to the GTT pagetable range + * will be dropped. For WC mappings in general we have 64 byte burst + * writes when the WC buffer is flushed, so we can't use it, but have to + * resort to an uncached mapping. The WC issue is easily caught by the + * readback check when writing GTT PTE entries. + */ + if (!IS_GEN9_LP(i915) && GRAPHICS_VER(i915) < 11) + return true; + + return false; +} + static void gen8_ggtt_invalidate(struct i915_ggtt *ggtt) { struct intel_uncore *uncore = ggtt->vm.gt->uncore; @@ -197,8 +212,12 @@ static void gen8_ggtt_invalidate(struct /* * Note that as an uncached mmio write, this will flush the * WCB of the writes into the GGTT before it triggers the invalidate. + * + * Only perform this when GGTT is mapped as WC, see ggtt_probe_common(). */ - intel_uncore_write_fw(uncore, GFX_FLSH_CNTL_GEN6, GFX_FLSH_CNTL_EN); + if (needs_wc_ggtt_mapping(ggtt->vm.i915)) + intel_uncore_write_fw(uncore, GFX_FLSH_CNTL_GEN6, + GFX_FLSH_CNTL_EN); }
static void guc_ggtt_invalidate(struct i915_ggtt *ggtt) @@ -902,17 +921,11 @@ static int ggtt_probe_common(struct i915 GEM_WARN_ON(pci_resource_len(pdev, GEN4_GTTMMADR_BAR) != gen6_gttmmadr_size(i915)); phys_addr = pci_resource_start(pdev, GEN4_GTTMMADR_BAR) + gen6_gttadr_offset(i915);
- /* - * On BXT+/ICL+ writes larger than 64 bit to the GTT pagetable range - * will be dropped. For WC mappings in general we have 64 byte burst - * writes when the WC buffer is flushed, so we can't use it, but have to - * resort to an uncached mapping. The WC issue is easily caught by the - * readback check when writing GTT PTE entries. - */ - if (IS_GEN9_LP(i915) || GRAPHICS_VER(i915) >= 11) - ggtt->gsm = ioremap(phys_addr, size); - else + if (needs_wc_ggtt_mapping(i915)) ggtt->gsm = ioremap_wc(phys_addr, size); + else + ggtt->gsm = ioremap(phys_addr, size); + if (!ggtt->gsm) { drm_err(&i915->drm, "Failed to map the ggtt page table\n"); return -ENOMEM;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ma Jun Jun.Ma2@amd.com
commit 7f3e6b840fa8b0889d776639310a5dc672c1e9e1 upstream.
MACO only works if BACO is supported
Signed-off-by: Ma Jun Jun.Ma2@amd.com Reviewed-by: Kenneth Feng kenneth.feng@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Cc: stable@vger.kernel.org # 6.1.x Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c | 8 ++++---- drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_7_ppt.c | 9 +++++---- 2 files changed, 9 insertions(+), 8 deletions(-)
--- a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c +++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c @@ -343,12 +343,12 @@ static int smu_v13_0_0_check_powerplay_t if (powerplay_table->platform_caps & SMU_13_0_0_PP_PLATFORM_CAP_HARDWAREDC) smu->dc_controlled_by_gpio = true;
- if (powerplay_table->platform_caps & SMU_13_0_0_PP_PLATFORM_CAP_BACO || - powerplay_table->platform_caps & SMU_13_0_0_PP_PLATFORM_CAP_MACO) + if (powerplay_table->platform_caps & SMU_13_0_0_PP_PLATFORM_CAP_BACO) { smu_baco->platform_support = true;
- if (powerplay_table->platform_caps & SMU_13_0_0_PP_PLATFORM_CAP_MACO) - smu_baco->maco_support = true; + if (powerplay_table->platform_caps & SMU_13_0_0_PP_PLATFORM_CAP_MACO) + smu_baco->maco_support = true; + }
/* * We are in the transition to a new OD mechanism. --- a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_7_ppt.c +++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_7_ppt.c @@ -333,12 +333,13 @@ static int smu_v13_0_7_check_powerplay_t if (powerplay_table->platform_caps & SMU_13_0_7_PP_PLATFORM_CAP_HARDWAREDC) smu->dc_controlled_by_gpio = true;
- if (powerplay_table->platform_caps & SMU_13_0_7_PP_PLATFORM_CAP_BACO || - powerplay_table->platform_caps & SMU_13_0_7_PP_PLATFORM_CAP_MACO) + if (powerplay_table->platform_caps & SMU_13_0_7_PP_PLATFORM_CAP_BACO) { smu_baco->platform_support = true;
- if (smu_baco->platform_support && (BoardTable->HsrEnabled || BoardTable->VddqOffEnabled)) - smu_baco->maco_support = true; + if ((powerplay_table->platform_caps & SMU_13_0_7_PP_PLATFORM_CAP_MACO) + && (BoardTable->HsrEnabled || BoardTable->VddqOffEnabled)) + smu_baco->maco_support = true; + }
#if 0 if (!overdrive_lowerlimits->FeatureCtrlMask ||
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alex Deucher alexander.deucher@amd.com
commit 23170863ea0a0965d224342c0eb2ad8303b1f267 upstream.
This was fixed in PMFW before launch and is no longer required.
Reviewed-by: Yang Wang kevinyang.wang@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Cc: stable@vger.kernel.org # 6.1.x Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c | 32 +------------------ 1 file changed, 2 insertions(+), 30 deletions(-)
--- a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c +++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c @@ -2162,38 +2162,10 @@ static int smu_v13_0_0_set_power_profile } }
- if (smu->power_profile_mode == PP_SMC_POWER_PROFILE_COMPUTE && - (((smu->adev->pdev->device == 0x744C) && (smu->adev->pdev->revision == 0xC8)) || - ((smu->adev->pdev->device == 0x744C) && (smu->adev->pdev->revision == 0xCC)))) { - ret = smu_cmn_update_table(smu, - SMU_TABLE_ACTIVITY_MONITOR_COEFF, - WORKLOAD_PPLIB_COMPUTE_BIT, - (void *)(&activity_monitor_external), - false); - if (ret) { - dev_err(smu->adev->dev, "[%s] Failed to get activity monitor!", __func__); - return ret; - } - - ret = smu_cmn_update_table(smu, - SMU_TABLE_ACTIVITY_MONITOR_COEFF, - WORKLOAD_PPLIB_CUSTOM_BIT, - (void *)(&activity_monitor_external), - true); - if (ret) { - dev_err(smu->adev->dev, "[%s] Failed to set activity monitor!", __func__); - return ret; - } - - workload_type = smu_cmn_to_asic_specific_index(smu, - CMN2ASIC_MAPPING_WORKLOAD, - PP_SMC_POWER_PROFILE_CUSTOM); - } else { - /* conv PP_SMC_POWER_PROFILE* to WORKLOAD_PPLIB_*_BIT */ - workload_type = smu_cmn_to_asic_specific_index(smu, + /* conv PP_SMC_POWER_PROFILE* to WORKLOAD_PPLIB_*_BIT */ + workload_type = smu_cmn_to_asic_specific_index(smu, CMN2ASIC_MAPPING_WORKLOAD, smu->power_profile_mode); - }
if (workload_type < 0) return -EINVAL;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alex Deucher alexander.deucher@amd.com
commit 7b1c6263eaf4fd64ffe1cafdc504a42ee4bfbb33 upstream.
It's only valid on Intel systems with the Intel VSEC. Use dev_is_removable() instead. This should do the right thing regardless of the platform.
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2925 Reviewed-by: Mario Limonciello mario.limonciello@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 8 ++++---- drivers/gpu/drm/amd/amdgpu/nbio_v2_3.c | 5 +++-- 2 files changed, 7 insertions(+), 6 deletions(-)
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -43,6 +43,7 @@ #include <drm/drm_fb_helper.h> #include <drm/drm_probe_helper.h> #include <drm/amdgpu_drm.h> +#include <linux/device.h> #include <linux/vgaarb.h> #include <linux/vga_switcheroo.h> #include <linux/efi.h> @@ -2233,7 +2234,6 @@ out: */ static int amdgpu_device_ip_early_init(struct amdgpu_device *adev) { - struct drm_device *dev = adev_to_drm(adev); struct pci_dev *parent; int i, r; bool total; @@ -2304,7 +2304,7 @@ static int amdgpu_device_ip_early_init(s (amdgpu_is_atpx_hybrid() || amdgpu_has_atpx_dgpu_power_cntl()) && ((adev->flags & AMD_IS_APU) == 0) && - !pci_is_thunderbolt_attached(to_pci_dev(dev->dev))) + !dev_is_removable(&adev->pdev->dev)) adev->flags |= AMD_IS_PX;
if (!(adev->flags & AMD_IS_APU)) { @@ -4132,7 +4132,7 @@ fence_driver_init:
px = amdgpu_device_supports_px(ddev);
- if (px || (!pci_is_thunderbolt_attached(adev->pdev) && + if (px || (!dev_is_removable(&adev->pdev->dev) && apple_gmux_detect(NULL, NULL))) vga_switcheroo_register_client(adev->pdev, &amdgpu_switcheroo_ops, px); @@ -4278,7 +4278,7 @@ void amdgpu_device_fini_sw(struct amdgpu
px = amdgpu_device_supports_px(adev_to_drm(adev));
- if (px || (!pci_is_thunderbolt_attached(adev->pdev) && + if (px || (!dev_is_removable(&adev->pdev->dev) && apple_gmux_detect(NULL, NULL))) vga_switcheroo_unregister_client(adev->pdev);
--- a/drivers/gpu/drm/amd/amdgpu/nbio_v2_3.c +++ b/drivers/gpu/drm/amd/amdgpu/nbio_v2_3.c @@ -28,6 +28,7 @@ #include "nbio/nbio_2_3_offset.h" #include "nbio/nbio_2_3_sh_mask.h" #include <uapi/linux/kfd_ioctl.h> +#include <linux/device.h> #include <linux/pci.h>
#define smnPCIE_CONFIG_CNTL 0x11180044 @@ -361,7 +362,7 @@ static void nbio_v2_3_enable_aspm(struct
data |= NAVI10_PCIE__LC_L0S_INACTIVITY_DEFAULT << PCIE_LC_CNTL__LC_L0S_INACTIVITY__SHIFT;
- if (pci_is_thunderbolt_attached(adev->pdev)) + if (dev_is_removable(&adev->pdev->dev)) data |= NAVI10_PCIE__LC_L1_INACTIVITY_TBT_DEFAULT << PCIE_LC_CNTL__LC_L1_INACTIVITY__SHIFT; else data |= NAVI10_PCIE__LC_L1_INACTIVITY_DEFAULT << PCIE_LC_CNTL__LC_L1_INACTIVITY__SHIFT; @@ -480,7 +481,7 @@ static void nbio_v2_3_program_aspm(struc
def = data = RREG32_PCIE(smnPCIE_LC_CNTL); data |= NAVI10_PCIE__LC_L0S_INACTIVITY_DEFAULT << PCIE_LC_CNTL__LC_L0S_INACTIVITY__SHIFT; - if (pci_is_thunderbolt_attached(adev->pdev)) + if (dev_is_removable(&adev->pdev->dev)) data |= NAVI10_PCIE__LC_L1_INACTIVITY_TBT_DEFAULT << PCIE_LC_CNTL__LC_L1_INACTIVITY__SHIFT; else data |= NAVI10_PCIE__LC_L1_INACTIVITY_DEFAULT << PCIE_LC_CNTL__LC_L1_INACTIVITY__SHIFT;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tim Huang Tim.Huang@amd.com
commit 36e7ff5c13cb15cb7b06c76d42bb76cbf6b7ea75 upstream.
Use a proper MEID to make sure the CP_HQD_* and CP_GFX_HQD_* registers can be touched when initialize the compute and gfx mqd in mes_self_test. Otherwise, we expect no response from CP and an GRBM eventual timeout.
Signed-off-by: Tim Huang Tim.Huang@amd.com Acked-by: Alex Deucher alexander.deucher@amd.com Reviewed-by: Yifan Zhang yifan1.zhang@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c | 16 ++++++++++++++++ 1 file changed, 16 insertions(+)
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c @@ -627,8 +627,20 @@ static void amdgpu_mes_queue_init_mqd(st mqd_prop.hqd_queue_priority = p->hqd_queue_priority; mqd_prop.hqd_active = false;
+ if (p->queue_type == AMDGPU_RING_TYPE_GFX || + p->queue_type == AMDGPU_RING_TYPE_COMPUTE) { + mutex_lock(&adev->srbm_mutex); + amdgpu_gfx_select_me_pipe_q(adev, p->ring->me, p->ring->pipe, 0, 0, 0); + } + mqd_mgr->init_mqd(adev, q->mqd_cpu_ptr, &mqd_prop);
+ if (p->queue_type == AMDGPU_RING_TYPE_GFX || + p->queue_type == AMDGPU_RING_TYPE_COMPUTE) { + amdgpu_gfx_select_me_pipe_q(adev, 0, 0, 0, 0, 0); + mutex_unlock(&adev->srbm_mutex); + } + amdgpu_bo_unreserve(q->mqd_obj); }
@@ -1062,9 +1074,13 @@ int amdgpu_mes_add_ring(struct amdgpu_de switch (queue_type) { case AMDGPU_RING_TYPE_GFX: ring->funcs = adev->gfx.gfx_ring[0].funcs; + ring->me = adev->gfx.gfx_ring[0].me; + ring->pipe = adev->gfx.gfx_ring[0].pipe; break; case AMDGPU_RING_TYPE_COMPUTE: ring->funcs = adev->gfx.compute_ring[0].funcs; + ring->me = adev->gfx.compute_ring[0].me; + ring->pipe = adev->gfx.compute_ring[0].pipe; break; case AMDGPU_RING_TYPE_SDMA: ring->funcs = adev->sdma.instance[0].ring.funcs;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alex Deucher alexander.deucher@amd.com
commit 3938eb956e383ef88b8fc7d556492336ebee52df upstream.
AMD dGPUs have integrated FW that runs as soon as the device gets power and initializes the board (determines the amount of memory, provides configuration details to the driver, etc.). For direct PCIe attached cards this happens as soon as power is applied and normally completes well before the OS has even started loading. However, with hotpluggable ports like USB4, the driver needs to wait for this to complete before initializing the device.
This normally takes 60-100ms, but could take longer on some older boards periodically due to memory training.
Retry for up to a second. In the non-hotplug case, there should be no change in behavior and this should complete on the first try.
v2: adjust test criteria v3: adjust checks for the masks, only enable on removable devices v4: skip bif_fb_en check
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2925 Reviewed-by: Mario Limonciello mario.limonciello@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c | 23 +++++++++++++++++++++-- 1 file changed, 21 insertions(+), 2 deletions(-)
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c @@ -92,6 +92,7 @@ MODULE_FIRMWARE(FIRMWARE_IP_DISCOVERY);
#define mmRCC_CONFIG_MEMSIZE 0xde3 +#define mmMP0_SMN_C2PMSG_33 0x16061 #define mmMM_INDEX 0x0 #define mmMM_INDEX_HI 0x6 #define mmMM_DATA 0x1 @@ -230,8 +231,26 @@ static int amdgpu_discovery_read_binary_ static int amdgpu_discovery_read_binary_from_mem(struct amdgpu_device *adev, uint8_t *binary) { - uint64_t vram_size = (uint64_t)RREG32(mmRCC_CONFIG_MEMSIZE) << 20; - int ret = 0; + uint64_t vram_size; + u32 msg; + int i, ret = 0; + + /* It can take up to a second for IFWI init to complete on some dGPUs, + * but generally it should be in the 60-100ms range. Normally this starts + * as soon as the device gets power so by the time the OS loads this has long + * completed. However, when a card is hotplugged via e.g., USB4, we need to + * wait for this to complete. Once the C2PMSG is updated, we can + * continue. + */ + if (dev_is_removable(&adev->pdev->dev)) { + for (i = 0; i < 1000; i++) { + msg = RREG32(mmMP0_SMN_C2PMSG_33); + if (msg & 0x80000000) + break; + msleep(1); + } + } + vram_size = (uint64_t)RREG32(mmRCC_CONFIG_MEMSIZE) << 20;
if (vram_size) { uint64_t pos = vram_size - DISCOVERY_TMR_OFFSET;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alex Deucher alexander.deucher@amd.com
commit 432e664e7c98c243fab4c3c95bd463bea3aeed28 upstream.
The ATRM ACPI method is for fetching the dGPU vbios rom image on laptops and all-in-one systems. It should not be used for external add in cards. If the dGPU is thunderbolt connected, don't try ATRM.
v2: pci_is_thunderbolt_attached only works for Intel. Use pdev->external_facing instead. v3: dev_is_removable() seems to be what we want
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2925 Reviewed-by: Mario Limonciello mario.limonciello@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/amdgpu/amdgpu_bios.c | 5 +++++ 1 file changed, 5 insertions(+)
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_bios.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_bios.c @@ -29,6 +29,7 @@ #include "amdgpu.h" #include "atom.h"
+#include <linux/device.h> #include <linux/pci.h> #include <linux/slab.h> #include <linux/acpi.h> @@ -287,6 +288,10 @@ static bool amdgpu_atrm_get_bios(struct if (adev->flags & AMD_IS_APU) return false;
+ /* ATRM is for on-platform devices only */ + if (dev_is_removable(&adev->pdev->dev)) + return false; + while ((pdev = pci_get_class(PCI_CLASS_DISPLAY_VGA << 8, pdev)) != NULL) { dhandle = ACPI_HANDLE(&pdev->dev); if (!dhandle)
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Christian König christian.koenig@amd.com
commit 8473bfdcb5b1a32fd05629c4535ccacd73bc5567 upstream.
When clearing the root PD fails we need to properly release it again.
Signed-off-by: Christian König christian.koenig@amd.com Acked-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 31 ++++++++++++++++--------------- 1 file changed, 16 insertions(+), 15 deletions(-)
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c @@ -2129,7 +2129,8 @@ long amdgpu_vm_wait_idle(struct amdgpu_v * Returns: * 0 for success, error for failure. */ -int amdgpu_vm_init(struct amdgpu_device *adev, struct amdgpu_vm *vm, int32_t xcp_id) +int amdgpu_vm_init(struct amdgpu_device *adev, struct amdgpu_vm *vm, + int32_t xcp_id) { struct amdgpu_bo *root_bo; struct amdgpu_bo_vm *root; @@ -2148,6 +2149,7 @@ int amdgpu_vm_init(struct amdgpu_device INIT_LIST_HEAD(&vm->done); INIT_LIST_HEAD(&vm->pt_freed); INIT_WORK(&vm->pt_free_work, amdgpu_vm_pt_free_work); + INIT_KFIFO(vm->faults);
r = amdgpu_vm_init_entities(adev, vm); if (r) @@ -2182,34 +2184,33 @@ int amdgpu_vm_init(struct amdgpu_device false, &root, xcp_id); if (r) goto error_free_delayed; - root_bo = &root->bo; + + root_bo = amdgpu_bo_ref(&root->bo); r = amdgpu_bo_reserve(root_bo, true); - if (r) - goto error_free_root; + if (r) { + amdgpu_bo_unref(&root->shadow); + amdgpu_bo_unref(&root_bo); + goto error_free_delayed; + }
+ amdgpu_vm_bo_base_init(&vm->root, vm, root_bo); r = dma_resv_reserve_fences(root_bo->tbo.base.resv, 1); if (r) - goto error_unreserve; - - amdgpu_vm_bo_base_init(&vm->root, vm, root_bo); + goto error_free_root;
r = amdgpu_vm_pt_clear(adev, vm, root, false); if (r) - goto error_unreserve; + goto error_free_root;
amdgpu_bo_unreserve(vm->root.bo); - - INIT_KFIFO(vm->faults); + amdgpu_bo_unref(&root_bo);
return 0;
-error_unreserve: - amdgpu_bo_unreserve(vm->root.bo); - error_free_root: - amdgpu_bo_unref(&root->shadow); + amdgpu_vm_pt_free_root(adev, vm); + amdgpu_bo_unreserve(vm->root.bo); amdgpu_bo_unref(&root_bo); - vm->root.bo = NULL;
error_free_delayed: dma_fence_put(vm->last_tlb_flush);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Christian König christian.koenig@amd.com
commit 12f76050d8d4d10dab96333656b821bd4620d103 upstream.
We should not leak the pointer where we couldn't grab the reference on to the caller because it can be that the error handling still tries to put the reference then.
Signed-off-by: Christian König christian.koenig@amd.com Reviewed-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c @@ -179,6 +179,7 @@ int amdgpu_bo_list_get(struct amdgpu_fpr }
rcu_read_unlock(); + *result = NULL; return -ENOENT; }
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Christian König christian.koenig@amd.com
commit 17daf01ab4e3e5a5929747aa05cc15eb2bad5438 upstream.
Otherwise userspace can spam the logs by using incorrect input values.
Signed-off-by: Christian König christian.koenig@amd.com Reviewed-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c @@ -1438,7 +1438,7 @@ int amdgpu_cs_ioctl(struct drm_device *d if (r == -ENOMEM) DRM_ERROR("Not enough memory for command submission!\n"); else if (r != -ERESTARTSYS && r != -EAGAIN) - DRM_ERROR("Failed to process the buffer list %d!\n", r); + DRM_DEBUG("Failed to process the buffer list %d!\n", r); goto error_fini; }
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Felix Kuehling Felix.Kuehling@amd.com
commit 256503071c2de2b5b5c20e06654aa9a44f13aa62 upstream.
mem = bo->tbo.resource may be NULL in amdgpu_vm_bo_update.
Fixes: 180253782038 ("drm/ttm: stop allocating dummy resources during BO creation") Signed-off-by: Felix Kuehling Felix.Kuehling@amd.com Reviewed-by: Christian König christian.koenig@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c @@ -1099,8 +1099,8 @@ int amdgpu_vm_bo_update(struct amdgpu_de bo = gem_to_amdgpu_bo(gobj); } mem = bo->tbo.resource; - if (mem->mem_type == TTM_PL_TT || - mem->mem_type == AMDGPU_PL_PREEMPT) + if (mem && (mem->mem_type == TTM_PL_TT || + mem->mem_type == AMDGPU_PL_PREEMPT)) pages_addr = bo->tbo.ttm->dma_address; }
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nicholas Kazlauskas nicholas.kazlauskas@amd.com
commit 1ffa8602e39b89469dc703ebab7a7e44c33da0f7 upstream.
[WHY] HW can return invalid values on register read, guard against these being set and causing us to access memory out of range and page fault.
[HOW] Guard at sync_inbox1 and guard at pushing commands.
Cc: Mario Limonciello mario.limonciello@amd.com Cc: Alex Deucher alexander.deucher@amd.com Cc: stable@vger.kernel.org Reviewed-by: Hansen Dsouza hansen.dsouza@amd.com Acked-by: Alex Hung alex.hung@amd.com Signed-off-by: Nicholas Kazlauskas nicholas.kazlauskas@amd.com Tested-by: Daniel Wheeler daniel.wheeler@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/display/dmub/src/dmub_srv.c | 18 +++++++++++++++--- 1 file changed, 15 insertions(+), 3 deletions(-)
--- a/drivers/gpu/drm/amd/display/dmub/src/dmub_srv.c +++ b/drivers/gpu/drm/amd/display/dmub/src/dmub_srv.c @@ -657,9 +657,16 @@ enum dmub_status dmub_srv_sync_inbox1(st return DMUB_STATUS_INVALID;
if (dmub->hw_funcs.get_inbox1_rptr && dmub->hw_funcs.get_inbox1_wptr) { - dmub->inbox1_rb.rptr = dmub->hw_funcs.get_inbox1_rptr(dmub); - dmub->inbox1_rb.wrpt = dmub->hw_funcs.get_inbox1_wptr(dmub); - dmub->inbox1_last_wptr = dmub->inbox1_rb.wrpt; + uint32_t rptr = dmub->hw_funcs.get_inbox1_rptr(dmub); + uint32_t wptr = dmub->hw_funcs.get_inbox1_wptr(dmub); + + if (rptr > dmub->inbox1_rb.capacity || wptr > dmub->inbox1_rb.capacity) { + return DMUB_STATUS_HW_FAILURE; + } else { + dmub->inbox1_rb.rptr = rptr; + dmub->inbox1_rb.wrpt = wptr; + dmub->inbox1_last_wptr = dmub->inbox1_rb.wrpt; + } }
return DMUB_STATUS_OK; @@ -693,6 +700,11 @@ enum dmub_status dmub_srv_cmd_queue(stru if (!dmub->hw_init) return DMUB_STATUS_INVALID;
+ if (dmub->inbox1_rb.rptr > dmub->inbox1_rb.capacity || + dmub->inbox1_rb.wrpt > dmub->inbox1_rb.capacity) { + return DMUB_STATUS_HW_FAILURE; + } + if (dmub_rb_push_front(&dmub->inbox1_rb, cmd)) return DMUB_STATUS_OK;
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Fangzhi Zuo jerry.zuo@amd.com
commit a58555359a9f870543aaddef277c3396159895ce upstream.
[WHY & HOW] For the scenario when a dsc capable MST sink device is directly connected, it needs to use max dsc compression as the link bw constraint.
Cc: Mario Limonciello mario.limonciello@amd.com Cc: Alex Deucher alexander.deucher@amd.com Cc: stable@vger.kernel.org Reviewed-by: Roman Li roman.li@amd.com Acked-by: Alex Hung alex.hung@amd.com Signed-off-by: Fangzhi Zuo jerry.zuo@amd.com Tested-by: Daniel Wheeler daniel.wheeler@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c | 29 +++++------- 1 file changed, 14 insertions(+), 15 deletions(-)
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c @@ -1591,31 +1591,31 @@ enum dc_status dm_dp_mst_is_port_support unsigned int upper_link_bw_in_kbps = 0, down_link_bw_in_kbps = 0; unsigned int max_compressed_bw_in_kbps = 0; struct dc_dsc_bw_range bw_range = {0}; - struct drm_dp_mst_topology_mgr *mst_mgr; + uint16_t full_pbn = aconnector->mst_output_port->full_pbn;
/* - * check if the mode could be supported if DSC pass-through is supported - * AND check if there enough bandwidth available to support the mode - * with DSC enabled. + * Consider the case with the depth of the mst topology tree is equal or less than 2 + * A. When dsc bitstream can be transmitted along the entire path + * 1. dsc is possible between source and branch/leaf device (common dsc params is possible), AND + * 2. dsc passthrough supported at MST branch, or + * 3. dsc decoding supported at leaf MST device + * Use maximum dsc compression as bw constraint + * B. When dsc bitstream cannot be transmitted along the entire path + * Use native bw as bw constraint */ if (is_dsc_common_config_possible(stream, &bw_range) && - aconnector->mst_output_port->passthrough_aux) { - mst_mgr = aconnector->mst_output_port->mgr; - mutex_lock(&mst_mgr->lock); - + (aconnector->mst_output_port->passthrough_aux || + aconnector->dsc_aux == &aconnector->mst_output_port->aux)) { cur_link_settings = stream->link->verified_link_cap;
upper_link_bw_in_kbps = dc_link_bandwidth_kbps(aconnector->dc_link, - &cur_link_settings - ); - down_link_bw_in_kbps = kbps_from_pbn(aconnector->mst_output_port->full_pbn); + &cur_link_settings); + down_link_bw_in_kbps = kbps_from_pbn(full_pbn);
/* pick the bottleneck */ end_to_end_bw_in_kbps = min(upper_link_bw_in_kbps, down_link_bw_in_kbps);
- mutex_unlock(&mst_mgr->lock); - /* * use the maximum dsc compression bandwidth as the required * bandwidth for the mode @@ -1630,8 +1630,7 @@ enum dc_status dm_dp_mst_is_port_support /* check if mode could be supported within full_pbn */ bpp = convert_dc_color_depth_into_bpc(stream->timing.display_color_depth) * 3; pbn = drm_dp_calc_pbn_mode(stream->timing.pix_clk_100hz / 10, bpp, false); - - if (pbn > aconnector->mst_output_port->full_pbn) + if (pbn > full_pbn) return DC_FAIL_BANDWIDTH_VALIDATE; }
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mario Limonciello mario.limonciello@amd.com
commit b71f4ade1b8900d30c661d6c27f87c35214c398c upstream.
When ddc_service_construct() is called, it explicitly checks both the link type and whether there is something on the link which will dictate whether the pin is marked as hw_supported.
If the pin isn't set or the link is not set (such as from unloading/reloading amdgpu in an IGT test) then fail the amdgpu_dm_i2c_xfer() call.
Cc: stable@vger.kernel.org Fixes: 22676bc500c2 ("drm/amd/display: Fix dmub soft hang for PSR 1") Link: https://github.com/fwupd/fwupd/issues/6327 Signed-off-by: Mario Limonciello mario.limonciello@amd.com Reviewed-by: Harry Wentland harry.wentland@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c @@ -7394,6 +7394,9 @@ static int amdgpu_dm_i2c_xfer(struct i2c int i; int result = -EIO;
+ if (!ddc_service->ddc_pin || !ddc_service->ddc_pin->hw_info.hw_supported) + return result; + cmd.payloads = kcalloc(num, sizeof(struct i2c_payload), GFP_KERNEL);
if (!cmd.payloads)
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tianci Yin tianci.yin@amd.com
commit 435f5b369657cffee4b04db1f5805b48599f4dbe upstream.
[WHY] When cursor moves across screen boarder, lag cursor observed, since subvp settings need to sync up with vblank that causes cursor updates being delayed.
[HOW] Enable fast plane updates on DCN3.2 to fix it.
Cc: Mario Limonciello mario.limonciello@amd.com Cc: Alex Deucher alexander.deucher@amd.com Cc: stable@vger.kernel.org Reviewed-by: Aurabindo Pillai aurabindo.pillai@amd.com Acked-by: Alex Hung alex.hung@amd.com Signed-off-by: Tianci Yin tianci.yin@amd.com Tested-by: Daniel Wheeler daniel.wheeler@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c @@ -9507,14 +9507,14 @@ static bool should_reset_plane(struct dr struct drm_plane *other; struct drm_plane_state *old_other_state, *new_other_state; struct drm_crtc_state *new_crtc_state; + struct amdgpu_device *adev = drm_to_adev(plane->dev); int i;
/* - * TODO: Remove this hack once the checks below are sufficient - * enough to determine when we need to reset all the planes on - * the stream. + * TODO: Remove this hack for all asics once it proves that the + * fast updates works fine on DCN3.2+. */ - if (state->allow_modeset) + if (adev->ip_versions[DCE_HWIP][0] < IP_VERSION(3, 2, 0) && state->allow_modeset) return true;
/* Exit early if we know that we're adding or removing the plane. */
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Lewis Huang lewis.huang@amd.com
commit 5911d02cac70d7fb52009fbd37423e63f8f6f9bc upstream.
[WHY] Flush command sent to DMCUB spends more time for execution on a dGPU than on an APU. This causes cursor lag when using high refresh rate mouses.
[HOW] 1. Change the DMCUB mailbox memory location from FB to inbox. 2. Only change windows memory to inbox.
Cc: Mario Limonciello mario.limonciello@amd.com Cc: Alex Deucher alexander.deucher@amd.com Cc: stable@vger.kernel.org Reviewed-by: Nicholas Kazlauskas nicholas.kazlauskas@amd.com Acked-by: Alex Hung alex.hung@amd.com Signed-off-by: Lewis Huang lewis.huang@amd.com Tested-by: Daniel Wheeler daniel.wheeler@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 13 ++++---- drivers/gpu/drm/amd/display/dmub/dmub_srv.h | 22 +++++++++------ drivers/gpu/drm/amd/display/dmub/src/dmub_srv.c | 32 ++++++++++++++++------ 3 files changed, 45 insertions(+), 22 deletions(-)
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c @@ -2077,7 +2077,7 @@ static int dm_dmub_sw_init(struct amdgpu struct dmub_srv_create_params create_params; struct dmub_srv_region_params region_params; struct dmub_srv_region_info region_info; - struct dmub_srv_fb_params fb_params; + struct dmub_srv_memory_params memory_params; struct dmub_srv_fb_info *fb_info; struct dmub_srv *dmub_srv; const struct dmcub_firmware_header_v1_0 *hdr; @@ -2177,6 +2177,7 @@ static int dm_dmub_sw_init(struct amdgpu adev->dm.dmub_fw->data + le32_to_cpu(hdr->header.ucode_array_offset_bytes) + PSP_HEADER_BYTES; + region_params.is_mailbox_in_inbox = false;
status = dmub_srv_calc_region_info(dmub_srv, ®ion_params, ®ion_info); @@ -2200,10 +2201,10 @@ static int dm_dmub_sw_init(struct amdgpu return r;
/* Rebase the regions on the framebuffer address. */ - memset(&fb_params, 0, sizeof(fb_params)); - fb_params.cpu_addr = adev->dm.dmub_bo_cpu_addr; - fb_params.gpu_addr = adev->dm.dmub_bo_gpu_addr; - fb_params.region_info = ®ion_info; + memset(&memory_params, 0, sizeof(memory_params)); + memory_params.cpu_fb_addr = adev->dm.dmub_bo_cpu_addr; + memory_params.gpu_fb_addr = adev->dm.dmub_bo_gpu_addr; + memory_params.region_info = ®ion_info;
adev->dm.dmub_fb_info = kzalloc(sizeof(*adev->dm.dmub_fb_info), GFP_KERNEL); @@ -2215,7 +2216,7 @@ static int dm_dmub_sw_init(struct amdgpu return -ENOMEM; }
- status = dmub_srv_calc_fb_info(dmub_srv, &fb_params, fb_info); + status = dmub_srv_calc_mem_info(dmub_srv, &memory_params, fb_info); if (status != DMUB_STATUS_OK) { DRM_ERROR("Error calculating DMUB FB info: %d\n", status); return -EINVAL; --- a/drivers/gpu/drm/amd/display/dmub/dmub_srv.h +++ b/drivers/gpu/drm/amd/display/dmub/dmub_srv.h @@ -186,6 +186,7 @@ struct dmub_srv_region_params { uint32_t vbios_size; const uint8_t *fw_inst_const; const uint8_t *fw_bss_data; + bool is_mailbox_in_inbox; };
/** @@ -205,20 +206,25 @@ struct dmub_srv_region_params { */ struct dmub_srv_region_info { uint32_t fb_size; + uint32_t inbox_size; uint8_t num_regions; struct dmub_region regions[DMUB_WINDOW_TOTAL]; };
/** - * struct dmub_srv_fb_params - parameters used for driver fb setup + * struct dmub_srv_memory_params - parameters used for driver fb setup * @region_info: region info calculated by dmub service - * @cpu_addr: base cpu address for the framebuffer - * @gpu_addr: base gpu virtual address for the framebuffer + * @cpu_fb_addr: base cpu address for the framebuffer + * @cpu_inbox_addr: base cpu address for the gart + * @gpu_fb_addr: base gpu virtual address for the framebuffer + * @gpu_inbox_addr: base gpu virtual address for the gart */ -struct dmub_srv_fb_params { +struct dmub_srv_memory_params { const struct dmub_srv_region_info *region_info; - void *cpu_addr; - uint64_t gpu_addr; + void *cpu_fb_addr; + void *cpu_inbox_addr; + uint64_t gpu_fb_addr; + uint64_t gpu_inbox_addr; };
/** @@ -545,8 +551,8 @@ dmub_srv_calc_region_info(struct dmub_sr * DMUB_STATUS_OK - success * DMUB_STATUS_INVALID - unspecified error */ -enum dmub_status dmub_srv_calc_fb_info(struct dmub_srv *dmub, - const struct dmub_srv_fb_params *params, +enum dmub_status dmub_srv_calc_mem_info(struct dmub_srv *dmub, + const struct dmub_srv_memory_params *params, struct dmub_srv_fb_info *out);
/** --- a/drivers/gpu/drm/amd/display/dmub/src/dmub_srv.c +++ b/drivers/gpu/drm/amd/display/dmub/src/dmub_srv.c @@ -385,7 +385,7 @@ dmub_srv_calc_region_info(struct dmub_sr uint32_t fw_state_size = DMUB_FW_STATE_SIZE; uint32_t trace_buffer_size = DMUB_TRACE_BUFFER_SIZE; uint32_t scratch_mem_size = DMUB_SCRATCH_MEM_SIZE; - + uint32_t previous_top = 0; if (!dmub->sw_init) return DMUB_STATUS_INVALID;
@@ -410,8 +410,15 @@ dmub_srv_calc_region_info(struct dmub_sr bios->base = dmub_align(stack->top, 256); bios->top = bios->base + params->vbios_size;
- mail->base = dmub_align(bios->top, 256); - mail->top = mail->base + DMUB_MAILBOX_SIZE; + if (params->is_mailbox_in_inbox) { + mail->base = 0; + mail->top = mail->base + DMUB_MAILBOX_SIZE; + previous_top = bios->top; + } else { + mail->base = dmub_align(bios->top, 256); + mail->top = mail->base + DMUB_MAILBOX_SIZE; + previous_top = mail->top; + }
fw_info = dmub_get_fw_meta_info(params);
@@ -430,7 +437,7 @@ dmub_srv_calc_region_info(struct dmub_sr dmub->fw_version = fw_info->fw_version; }
- trace_buff->base = dmub_align(mail->top, 256); + trace_buff->base = dmub_align(previous_top, 256); trace_buff->top = trace_buff->base + dmub_align(trace_buffer_size, 64);
fw_state->base = dmub_align(trace_buff->top, 256); @@ -441,11 +448,14 @@ dmub_srv_calc_region_info(struct dmub_sr
out->fb_size = dmub_align(scratch_mem->top, 4096);
+ if (params->is_mailbox_in_inbox) + out->inbox_size = dmub_align(mail->top, 4096); + return DMUB_STATUS_OK; }
-enum dmub_status dmub_srv_calc_fb_info(struct dmub_srv *dmub, - const struct dmub_srv_fb_params *params, +enum dmub_status dmub_srv_calc_mem_info(struct dmub_srv *dmub, + const struct dmub_srv_memory_params *params, struct dmub_srv_fb_info *out) { uint8_t *cpu_base; @@ -460,8 +470,8 @@ enum dmub_status dmub_srv_calc_fb_info(s if (params->region_info->num_regions != DMUB_NUM_WINDOWS) return DMUB_STATUS_INVALID;
- cpu_base = (uint8_t *)params->cpu_addr; - gpu_base = params->gpu_addr; + cpu_base = (uint8_t *)params->cpu_fb_addr; + gpu_base = params->gpu_fb_addr;
for (i = 0; i < DMUB_NUM_WINDOWS; ++i) { const struct dmub_region *reg = @@ -469,6 +479,12 @@ enum dmub_status dmub_srv_calc_fb_info(s
out->fb[i].cpu_addr = cpu_base + reg->base; out->fb[i].gpu_addr = gpu_base + reg->base; + + if (i == DMUB_WINDOW_4_MAILBOX && params->cpu_inbox_addr != 0) { + out->fb[i].cpu_addr = (uint8_t *)params->cpu_inbox_addr + reg->base; + out->fb[i].gpu_addr = params->gpu_inbox_addr + reg->base; + } + out->fb[i].size = reg->top - reg->base; }
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Michael Ellerman mpe@ellerman.id.au
commit feea65a338e52297b68ceb688eaf0ffc50310a83 upstream.
As reported by Mahesh & Aneesh, opal_prd_msg_notifier() triggers a FORTIFY_SOURCE warning:
memcpy: detected field-spanning write (size 32) of single field "&item->msg" at arch/powerpc/platforms/powernv/opal-prd.c:355 (size 4) WARNING: CPU: 9 PID: 660 at arch/powerpc/platforms/powernv/opal-prd.c:355 opal_prd_msg_notifier+0x174/0x188 [opal_prd] NIP opal_prd_msg_notifier+0x174/0x188 [opal_prd] LR opal_prd_msg_notifier+0x170/0x188 [opal_prd] Call Trace: opal_prd_msg_notifier+0x170/0x188 [opal_prd] (unreliable) notifier_call_chain+0xc0/0x1b0 atomic_notifier_call_chain+0x2c/0x40 opal_message_notify+0xf4/0x2c0
This happens because the copy is targeting item->msg, which is only 4 bytes in size, even though the enclosing item was allocated with extra space following the msg.
To fix the warning define struct opal_prd_msg with a union of the header and a flex array, and have the memcpy target the flex array.
Reported-by: "Aneesh Kumar K.V" aneesh.kumar@linux.ibm.com Reported-by: Mahesh Salgaonkar mahesh@linux.ibm.com Tested-by: Mahesh Salgaonkar mahesh@linux.ibm.com Reviewed-by: Mahesh Salgaonkar mahesh@linux.ibm.com Signed-off-by: Michael Ellerman mpe@ellerman.id.au Link: https://msgid.link/20230821142820.497107-1-mpe@ellerman.id.au Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/powerpc/platforms/powernv/opal-prd.c | 17 ++++++++++++----- 1 file changed, 12 insertions(+), 5 deletions(-)
--- a/arch/powerpc/platforms/powernv/opal-prd.c +++ b/arch/powerpc/platforms/powernv/opal-prd.c @@ -24,13 +24,20 @@ #include <linux/uaccess.h>
+struct opal_prd_msg { + union { + struct opal_prd_msg_header header; + DECLARE_FLEX_ARRAY(u8, data); + }; +}; + /* * The msg member must be at the end of the struct, as it's followed by the * message data. */ struct opal_prd_msg_queue_item { - struct list_head list; - struct opal_prd_msg_header msg; + struct list_head list; + struct opal_prd_msg msg; };
static struct device_node *prd_node; @@ -156,7 +163,7 @@ static ssize_t opal_prd_read(struct file int rc;
/* we need at least a header's worth of data */ - if (count < sizeof(item->msg)) + if (count < sizeof(item->msg.header)) return -EINVAL;
if (*ppos) @@ -186,7 +193,7 @@ static ssize_t opal_prd_read(struct file return -EINTR; }
- size = be16_to_cpu(item->msg.size); + size = be16_to_cpu(item->msg.header.size); if (size > count) { err = -EINVAL; goto err_requeue; @@ -352,7 +359,7 @@ static int opal_prd_msg_notifier(struct if (!item) return -ENOMEM;
- memcpy(&item->msg, msg->params, msg_size); + memcpy(&item->msg.data, msg->params, msg_size);
spin_lock_irqsave(&opal_prd_msg_queue_lock, flags); list_add_tail(&item->list, &opal_prd_msg_queue);
6.5-stable review patch. If anyone has any objections, please let me know.
------------------
From: Steven Rostedt (Google) rostedt@goodmis.org
commit bb32500fb9b78215e4ef6ee8b4345c5f5d7eafb4 upstream.
The following can crash the kernel:
# cd /sys/kernel/tracing # echo 'p:sched schedule' > kprobe_events # exec 5>>events/kprobes/sched/enable # > kprobe_events # exec 5>&-
The above commands:
1. Change directory to the tracefs directory 2. Create a kprobe event (doesn't matter what one) 3. Open bash file descriptor 5 on the enable file of the kprobe event 4. Delete the kprobe event (removes the files too) 5. Close the bash file descriptor 5
The above causes a crash!
BUG: kernel NULL pointer dereference, address: 0000000000000028 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: 0000 [#1] PREEMPT SMP PTI CPU: 6 PID: 877 Comm: bash Not tainted 6.5.0-rc4-test-00008-g2c6b6b1029d4-dirty #186 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014 RIP: 0010:tracing_release_file_tr+0xc/0x50
What happens here is that the kprobe event creates a trace_event_file "file" descriptor that represents the file in tracefs to the event. It maintains state of the event (is it enabled for the given instance?). Opening the "enable" file gets a reference to the event "file" descriptor via the open file descriptor. When the kprobe event is deleted, the file is also deleted from the tracefs system which also frees the event "file" descriptor.
But as the tracefs file is still opened by user space, it will not be totally removed until the final dput() is called on it. But this is not true with the event "file" descriptor that is already freed. If the user does a write to or simply closes the file descriptor it will reference the event "file" descriptor that was just freed, causing a use-after-free bug.
To solve this, add a ref count to the event "file" descriptor as well as a new flag called "FREED". The "file" will not be freed until the last reference is released. But the FREE flag will be set when the event is removed to prevent any more modifications to that event from happening, even if there's still a reference to the event "file" descriptor.
Link: https://lore.kernel.org/linux-trace-kernel/20231031000031.1e705592@gandalf.l... Link: https://lore.kernel.org/linux-trace-kernel/20231031122453.7a48b923@gandalf.l...
Cc: stable@vger.kernel.org Cc: Mark Rutland mark.rutland@arm.com Fixes: f5ca233e2e66d ("tracing: Increase trace array ref count on enable and filter files") Reported-by: Beau Belgrave beaub@linux.microsoft.com Tested-by: Beau Belgrave beaub@linux.microsoft.com Reviewed-by: Masami Hiramatsu (Google) mhiramat@kernel.org Signed-off-by: Steven Rostedt (Google) rostedt@goodmis.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/linux/trace_events.h | 4 +++ kernel/trace/trace.c | 15 ++++++++++++ kernel/trace/trace.h | 3 ++ kernel/trace/trace_events.c | 43 ++++++++++++++++++++++++------------- kernel/trace/trace_events_filter.c | 3 ++ 5 files changed, 53 insertions(+), 15 deletions(-)
--- a/include/linux/trace_events.h +++ b/include/linux/trace_events.h @@ -492,6 +492,7 @@ enum { EVENT_FILE_FL_TRIGGER_COND_BIT, EVENT_FILE_FL_PID_FILTER_BIT, EVENT_FILE_FL_WAS_ENABLED_BIT, + EVENT_FILE_FL_FREED_BIT, };
extern struct trace_event_file *trace_get_event_file(const char *instance, @@ -630,6 +631,7 @@ extern int __kprobe_event_add_fields(str * TRIGGER_COND - When set, one or more triggers has an associated filter * PID_FILTER - When set, the event is filtered based on pid * WAS_ENABLED - Set when enabled to know to clear trace on module removal + * FREED - File descriptor is freed, all fields should be considered invalid */ enum { EVENT_FILE_FL_ENABLED = (1 << EVENT_FILE_FL_ENABLED_BIT), @@ -643,6 +645,7 @@ enum { EVENT_FILE_FL_TRIGGER_COND = (1 << EVENT_FILE_FL_TRIGGER_COND_BIT), EVENT_FILE_FL_PID_FILTER = (1 << EVENT_FILE_FL_PID_FILTER_BIT), EVENT_FILE_FL_WAS_ENABLED = (1 << EVENT_FILE_FL_WAS_ENABLED_BIT), + EVENT_FILE_FL_FREED = (1 << EVENT_FILE_FL_FREED_BIT), };
struct trace_event_file { @@ -671,6 +674,7 @@ struct trace_event_file { * caching and such. Which is mostly OK ;-) */ unsigned long flags; + atomic_t ref; /* ref count for opened files */ atomic_t sm_ref; /* soft-mode reference counter */ atomic_t tm_ref; /* trigger-mode reference counter */ }; --- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c @@ -5000,6 +5000,20 @@ int tracing_open_file_tr(struct inode *i if (ret) return ret;
+ mutex_lock(&event_mutex); + + /* Fail if the file is marked for removal */ + if (file->flags & EVENT_FILE_FL_FREED) { + trace_array_put(file->tr); + ret = -ENODEV; + } else { + event_file_get(file); + } + + mutex_unlock(&event_mutex); + if (ret) + return ret; + filp->private_data = inode->i_private;
return 0; @@ -5010,6 +5024,7 @@ int tracing_release_file_tr(struct inode struct trace_event_file *file = inode->i_private;
trace_array_put(file->tr); + event_file_put(file);
return 0; } --- a/kernel/trace/trace.h +++ b/kernel/trace/trace.h @@ -1656,6 +1656,9 @@ extern void event_trigger_unregister(str char *glob, struct event_trigger_data *trigger_data);
+extern void event_file_get(struct trace_event_file *file); +extern void event_file_put(struct trace_event_file *file); + /** * struct event_trigger_ops - callbacks for trace event triggers * --- a/kernel/trace/trace_events.c +++ b/kernel/trace/trace_events.c @@ -990,26 +990,38 @@ static void remove_subsystem(struct trac } }
-static void remove_event_file_dir(struct trace_event_file *file) +void event_file_get(struct trace_event_file *file) { - struct dentry *dir = file->dir; - struct dentry *child; + atomic_inc(&file->ref); +}
- if (dir) { - spin_lock(&dir->d_lock); /* probably unneeded */ - list_for_each_entry(child, &dir->d_subdirs, d_child) { - if (d_really_is_positive(child)) /* probably unneeded */ - d_inode(child)->i_private = NULL; - } - spin_unlock(&dir->d_lock); +void event_file_put(struct trace_event_file *file) +{ + if (WARN_ON_ONCE(!atomic_read(&file->ref))) { + if (file->flags & EVENT_FILE_FL_FREED) + kmem_cache_free(file_cachep, file); + return; + }
- tracefs_remove(dir); + if (atomic_dec_and_test(&file->ref)) { + /* Count should only go to zero when it is freed */ + if (WARN_ON_ONCE(!(file->flags & EVENT_FILE_FL_FREED))) + return; + kmem_cache_free(file_cachep, file); } +} + +static void remove_event_file_dir(struct trace_event_file *file) +{ + struct dentry *dir = file->dir; + + tracefs_remove(dir);
list_del(&file->list); remove_subsystem(file->system); free_event_filter(file->filter); - kmem_cache_free(file_cachep, file); + file->flags |= EVENT_FILE_FL_FREED; + event_file_put(file); }
/* @@ -1382,7 +1394,7 @@ event_enable_read(struct file *filp, cha flags = file->flags; mutex_unlock(&event_mutex);
- if (!file) + if (!file || flags & EVENT_FILE_FL_FREED) return -ENODEV;
if (flags & EVENT_FILE_FL_ENABLED && @@ -1420,7 +1432,7 @@ event_enable_write(struct file *filp, co ret = -ENODEV; mutex_lock(&event_mutex); file = event_file_data(filp); - if (likely(file)) + if (likely(file && !(file->flags & EVENT_FILE_FL_FREED))) ret = ftrace_event_enable_disable(file, val); mutex_unlock(&event_mutex); break; @@ -1694,7 +1706,7 @@ event_filter_read(struct file *filp, cha
mutex_lock(&event_mutex); file = event_file_data(filp); - if (file) + if (file && !(file->flags & EVENT_FILE_FL_FREED)) print_event_filter(file, s); mutex_unlock(&event_mutex);
@@ -2810,6 +2822,7 @@ trace_create_new_event(struct trace_even atomic_set(&file->tm_ref, 0); INIT_LIST_HEAD(&file->triggers); list_add(&file->list, &tr->events); + event_file_get(file);
return file; } --- a/kernel/trace/trace_events_filter.c +++ b/kernel/trace/trace_events_filter.c @@ -2088,6 +2088,9 @@ int apply_event_filter(struct trace_even struct event_filter *filter = NULL; int err;
+ if (file->flags & EVENT_FILE_FL_FREED) + return -ENODEV; + if (!strcmp(strstrip(filter_string), "0")) { filter_disable(file); filter = event_filter(file);
On Fri, 24 Nov 2023 at 23:56, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 6.5.13 release. There are 491 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sun, 26 Nov 2023 17:19:17 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.5.13-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.5.y and the diffstat can be found below.
thanks,
greg k-h
Following build warnings / errors noticed while building the arm64 tinyconfig on stable-rc linux-6.1.y, linux-6.5.y and linux-6.6.y.
Zhen Lei thunder.leizhen@huawei.com rcu: Dump memory object info if callback function is invalid
Reported-by: Linux Kernel Functional Testing lkft@linaro.org
Build log:
In file included from kernel/rcu/update.c:49: kernel/rcu/rcu.h: In function 'debug_rcu_head_callback': kernel/rcu/rcu.h:255:17: error: implicit declaration of function 'kmem_dump_obj'; did you mean 'mem_dump_obj'? [-Werror=implicit-function-declaration] 255 | kmem_dump_obj(rhp); | ^~~~~~~~~~~~~ | mem_dump_obj cc1: some warnings being treated as errors
Links, - https://storage.tuxsuite.com/public/linaro/lkft/builds/2YdNC69sBYUq3cp4iEDo4...
-- Linaro LKFT https://lkft.linaro.org
On Fri, Nov 24, 2023 at 05:43:56PM +0000, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 6.5.13 release. There are 491 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sun, 26 Nov 2023 17:19:17 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.5.13-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.5.y and the diffstat can be found below.
thanks,
greg k-h
Noticed no problem with qemu-system-riscv64:
Tested-by: Nam Cao namcao@linutronix.de
Best regards, Nam
On 11/24/23 9:43 AM, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 6.5.13 release. There are 491 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sun, 26 Nov 2023 17:19:17 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.5.13-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.5.y and the diffstat can be found below.
thanks,
greg k-h
Built and booted successfully on RISC-V RV64 (HiFive Unmatched).
Tested-by: Ron Economos re@w6rz.net
linux-stable-mirror@lists.linaro.org