This is the start of the stable review cycle for the 6.1.95 release. There are 217 patches in this series, all of which will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri, 21 Jun 2024 12:55:11 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.1.95-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.1.y and the diffstat can be found below.
thanks,
greg k-h
------------- Pseudo-Shortlog of commits:
Greg Kroah-Hartman gregkh@linuxfoundation.org Linux 6.1.95-rc1
Oleg Nesterov oleg@redhat.com zap_pid_ns_processes: clear TIF_NOTIFY_SIGNAL along with TIF_SIGPENDING
Jean Delvare jdelvare@suse.de i2c: designware: Fix the functionality flags of the slave-only interface
Jean Delvare jdelvare@suse.de i2c: at91: Fix the functionality flags of the slave-only interface
Yongzhi Liu hyperlyzcs@gmail.com misc: microchip: pci1xxxx: Fix a memory leak in the error handling of gp_aux_bus_probe()
Shichao Lai shichaorai@gmail.com usb-storage: alauda: Check whether the media is initialized
Andy Shevchenko andriy.shevchenko@linux.intel.com serial: core: Add UPIO_UNKNOWN constant for unknown port type
Jisheng Zhang jszhang@kernel.org serial: 8250_dw: fall back to poll if there's no interrupt
Sicong Huang congei42@163.com greybus: Fix use-after-free bug in gb_interface_release due to race condition.
Johan Hovold johan+linaro@kernel.org Bluetooth: qca: generalise device address check
Johan Hovold johan+linaro@kernel.org Bluetooth: qca: fix wcn3991 device address check
David Howells dhowells@redhat.com cachefiles, erofs: Fix NULL deref in when cachefiles is not doing ondemand-mode
Beleswar Padhi b-padhi@ti.com remoteproc: k3-r5: Jump to error handling labels in start/stop errors
Sam James sam@gentoo.org Revert "fork: defer linking file vma until vma is fully initialized"
YonglongLi liyonglong@chinatelecom.cn mptcp: pm: update add_addr counters after connect
Doug Brown doug@schmorgal.com serial: 8250_pxa: Configure tx_loadsz to match FIFO IRQ level
Miaohe Lin linmiaohe@huawei.com mm/memory-failure: fix handling of dissolved but not taken off from buddy pages
Miaohe Lin linmiaohe@huawei.com mm/huge_memory: don't unpoison huge_zero_folio
Oleg Nesterov oleg@redhat.com tick/nohz_full: Don't abuse smp_call_function_single() in tick_setup_device()
Ryusuke Konishi konishi.ryusuke@gmail.com nilfs2: fix potential kernel bug due to lack of writeback flag waiting
Filipe Manana fdmanana@suse.com btrfs: zoned: fix use-after-free due to race with dev replace
Christoph Hellwig hch@lst.de btrfs: zoned: factor out DUP bg handling from btrfs_load_block_group_zone_info
Christoph Hellwig hch@lst.de btrfs: zoned: factor out single bg handling from btrfs_load_block_group_zone_info
Christoph Hellwig hch@lst.de btrfs: zoned: factor out per-zone logic from btrfs_load_block_group_zone_info
Christoph Hellwig hch@lst.de btrfs: zoned: introduce a zone_info struct in btrfs_load_block_group_zone_info
Alexander Shishkin alexander.shishkin@linux.intel.com intel_th: pci: Add Lunar Lake support
Alexander Shishkin alexander.shishkin@linux.intel.com intel_th: pci: Add Meteor Lake-S support
Alexander Shishkin alexander.shishkin@linux.intel.com intel_th: pci: Add Sapphire Rapids SOC support
Alexander Shishkin alexander.shishkin@linux.intel.com intel_th: pci: Add Granite Rapids SOC support
Alexander Shishkin alexander.shishkin@linux.intel.com intel_th: pci: Add Granite Rapids support
Vidya Srinivas vidya.srinivas@intel.com drm/i915/dpt: Make DPT object unshrinkable
Chris Wilson chris@chris-wilson.co.uk drm/i915/gt: Disarm breadcrumbs if engines are already idle
Nam Cao namcao@linutronix.de riscv: rewrite __kernel_map_pages() to fix sleeping in invalid context
Beleswar Padhi b-padhi@ti.com remoteproc: k3-r5: Do not allow core1 to power up before core0 via sysfs
Apurva Nandan a-nandan@ti.com remoteproc: k3-r5: Wait for core0 power-up before powering up core1
Nuno Sa nuno.sa@analog.com dmaengine: axi-dmac: fix possible race in remove()
Rick Wertenbroek rick.wertenbroek@gmail.com PCI: rockchip-ep: Remove wrong mask on subsys_vendor_id
Su Yue glass.su@suse.com ocfs2: fix races between hole punching and AIO+DIO
Su Yue glass.su@suse.com ocfs2: use coarse time for new created files
Rik van Riel riel@surriel.com fs/proc: fix softlockup in __read_vmcore
Trond Myklebust trond.myklebust@hammerspace.com knfsd: LOOKUP can return an illegal error value
Vamshi Gajjela vamshigajjela@google.com spmi: hisi-spmi-controller: Do not override device identifier
Hagar Gamal Halim Hemdan hagarhem@amazon.com vmci: prevent speculation leaks by sanitizing event in event_deliver()
Thadeu Lima de Souza Cascardo cascardo@igalia.com sock_map: avoid race between sock_map_close and sk_psock_put
Damien Le Moal dlemoal@kernel.org null_blk: Print correct max open zones limit in null_init_zoned_dev()
Steven Rostedt (Google) rostedt@goodmis.org tracing/selftests: Fix kprobe event name test for .isra. functions
Nam Cao namcao@linutronix.de riscv: fix overlap of allocated page and PTR_ERR
Haifeng Xu haifeng.xu@shopee.com perf/core: Fix missing wakeup when waiting for context reference
Yazen Ghannam yazen.ghannam@amd.com x86/amd_nb: Check for invalid SMN reads
Hagar Hemdan hagarhem@amazon.com irqchip/gic-v3-its: Fix potential race condition in its_vlpi_prop_update()
YonglongLi liyonglong@chinatelecom.cn mptcp: pm: inc RmAddr MIB counter once per RM_ADDR ID
Paolo Abeni pabeni@redhat.com mptcp: ensure snd_una is properly initialized on connect
Marek Szyprowski m.szyprowski@samsung.com drm/exynos: hdmi: report safe 640x480 mode as a fallback when no EDID found
Jani Nikula jani.nikula@intel.com drm/exynos/vidi: fix memory leak in .get_modes()
Dirk Behme dirk.behme@de.bosch.com drivers: core: synchronize really_probe() and dev_uevent()
Jean-Baptiste Maneyrol jean-baptiste.maneyrol@tdk.com iio: imu: inv_icm42600: delete unneeded update watermark call
Marc Ferland marc.ferland@sonatest.com iio: dac: ad5592r: fix temperature channel scaling value
David Lechner dlechner@baylibre.com iio: adc: ad9467: fix scan type sign
Benjamin Segall bsegall@google.com x86/boot: Don't add the EFI stub to targets, again
Yongzhi Liu hyperlyzcs@gmail.com misc: microchip: pci1xxxx: fix double free in the error handling of gp_aux_bus_probe()
Aleksandr Mishin amishin@t-argos.ru bnxt_en: Adjust logging of firmware messages in case of released token in __hwrm_send()
Rao Shoaib Rao.Shoaib@oracle.com af_unix: Read with MSG_PEEK loops if the first unread byte is OOB
Taehee Yoo ap420073@gmail.com ionic: fix use after netif_napi_del()
Nikolay Aleksandrov razor@blackwall.org net: bridge: mst: fix suspicious rcu usage in br_mst_set_state
Nikolay Aleksandrov razor@blackwall.org net: bridge: mst: pass vlan group directly to br_mst_vlan_set_state
Petr Pavlu petr.pavlu@suse.com net/ipv6: Fix the RT cache flush via sysctl using a previous delay
Daniel Wagner dwagner@suse.de nvmet-passthru: propagate status from id override functions
Xiaolei Wang xiaolei.wang@windriver.com net: stmmac: replace priv->speed with the portTransmitRate from the tc-cbs parameters
Joshua Washington joshwash@google.com gve: ignore nonrelevant GSO type bits when processing TSO headers
Kory Maincent kory.maincent@bootlin.com net: pse-pd: Use EOPNOTSUPP error code instead of ENOTSUPP
Jozsef Kadlecsik kadlec@netfilter.org netfilter: ipset: Fix race between namespace cleanup and gc in the list:set type
Luiz Augusto von Dentz luiz.von.dentz@intel.com Bluetooth: L2CAP: Fix rejecting L2CAP_CONN_PARAM_UPDATE_REQ
Gal Pressman gal@nvidia.com net/mlx5e: Fix features validation check for tunneled UDP (non-VXLAN) packets
Gal Pressman gal@nvidia.com geneve: Fix incorrect inner network header offset when innerprotoinherit is set
Eric Dumazet edumazet@google.com tcp: fix race in tcp_v6_syn_recv_sock()
Adam Miotk adam.miotk@arm.com drm/bridge/panel: Fix runtime warning on panel bridge release
Amjad Ouled-Ameur amjad.ouled-ameur@arm.com drm/komeda: check for error-valued pointer
Aleksandr Mishin amishin@t-argos.ru liquidio: Adjust a NULL pointer handling path in lio_vf_rep_copy_packet
Jie Wang wangjie125@huawei.com net: hns3: add cond_resched() to hns3 ring buffer init process
Yonglong Liu liuyonglong@huawei.com net: hns3: fix kernel crash problem in concurrent scenario
Csókás, Bence csokas.bence@prolan.hu net: sfp: Always call `sfp_sm_mod_remove()` on remove
Ian Forbes ian.forbes@broadcom.com drm/vmwgfx: Remove STDU logic from generic mode_valid function
Ian Forbes ian.forbes@broadcom.com drm/vmwgfx: 3D disabled should not effect STDU memory limits
Ian Forbes ian.forbes@broadcom.com drm/vmwgfx: Filter modes which exceed graphics memory
Martin Krastev martin.krastev@broadcom.com drm/vmwgfx: Refactor drm connector probing for display modes
Zack Rusin zackr@vmware.com drm/vmwgfx: Port the framebuffer code to drm fb helpers
José Expósito jose.exposito89@gmail.com HID: logitech-dj: Fix memory leak in logi_dj_recv_switch_to_dj_mode()
Kun(llfl) llfl@linux.alibaba.com iommu/amd: Fix sysfs leak in iommu init
Nikita Zhandarovich n.zhandarovich@fintech.ru HID: core: remove unnecessary WARN_ON() in implement()
Matthias Schiffer matthias.schiffer@ew.tq-group.com gpio: tqmx86: fix broken IRQ_TYPE_EDGE_BOTH interrupt type
Matthias Schiffer matthias.schiffer@ew.tq-group.com gpio: tqmx86: store IRQ trigger type and unmask status separately
Linus Walleij linus.walleij@linaro.org gpio: tqmx86: Convert to immutable irq_chip
Matthias Schiffer matthias.schiffer@ew.tq-group.com gpio: tqmx86: introduce shadow register for GPIO output value
Andrei Coardos aboutphysycs@gmail.com gpio: tqmx86: remove unneeded call to platform_set_drvdata()
Gregor Herburger gregor.herburger@tq-group.com gpio: tqmx86: fix typo in Kconfig label
Armin Wolf W_Armin@gmx.de platform/x86: dell-smbios: Fix wrong token data in sysfs
NeilBrown neilb@suse.de NFS: add barriers when testing for NFS_FSDATA_BLOCKED
Chen Hanxiao chenhx.fnst@fujitsu.com SUNRPC: return proper error from gss_wrap_req_priv
Olga Kornievskaia kolga@netapp.com NFSv4.1 enforce rootpath check in fs_location query
Samuel Holland samuel.holland@sifive.com clk: sifive: Do not register clkdevs for PRCI clocks
Masami Hiramatsu (Google) mhiramat@kernel.org selftests/ftrace: Fix to check required event file
Baokun Li libaokun1@huawei.com cachefiles: flush all requests after setting CACHEFILES_DEAD
Baokun Li libaokun1@huawei.com cachefiles: defer exposing anon_fd until after copy_to_user() succeeds
Baokun Li libaokun1@huawei.com cachefiles: never get a new anonymous fd if ondemand_id is valid
Baokun Li libaokun1@huawei.com cachefiles: remove err_put_fd label in cachefiles_ondemand_daemon_read()
Baokun Li libaokun1@huawei.com cachefiles: fix slab-use-after-free in cachefiles_ondemand_daemon_read()
Baokun Li libaokun1@huawei.com cachefiles: fix slab-use-after-free in cachefiles_ondemand_get_fd()
Jia Zhu zhujia.zj@bytedance.com cachefiles: add restore command to recover inflight ondemand read requests
Baokun Li libaokun1@huawei.com cachefiles: add spin_lock for cachefiles_ondemand_info
Jia Zhu zhujia.zj@bytedance.com cachefiles: resend an open request if the read request's object is closed
Jia Zhu zhujia.zj@bytedance.com cachefiles: extract ondemand info field from cachefiles_object
Jia Zhu zhujia.zj@bytedance.com cachefiles: introduce object ondemand state
Baokun Li libaokun1@huawei.com cachefiles: remove requests from xarray during flushing requests
Baokun Li libaokun1@huawei.com cachefiles: add output string to cachefiles_obj_[get|put]_ondemand_fd
Dave Jiang dave.jiang@intel.com cxl/test: Add missing vmalloc.h for tools/testing/cxl/test/mem.c
Dmitry Torokhov dmitry.torokhov@gmail.com Input: try trimming too long modalias strings
Michael Ellerman mpe@ellerman.id.au powerpc/uaccess: Fix build errors seen with GCC 13/14
Ziwei Xiao ziweixiao@google.com gve: Clear napi->skb before dev_kfree_skb_any()
Martin K. Petersen martin.petersen@oracle.com scsi: sd: Use READ(16) when reading block zero on large capacity disks
Breno Leitao leitao@debian.org scsi: mpt3sas: Avoid test/set_bit() operating in non-allocated memory
Damien Le Moal dlemoal@kernel.org scsi: mpi3mr: Fix ATA NCQ priority support
Aapo Vienamo aapo.vienamo@linux.intel.com thunderbolt: debugfs: Fix margin debugfs node creation condition
Kuangyi Chiang ki.chiang65@gmail.com xhci: Apply broken streams quirk to Etron EJ188 xHCI host
Hector Martin marcan@marcan.st xhci: Handle TD clearing for multiple streams case
Kuangyi Chiang ki.chiang65@gmail.com xhci: Apply reset resume quirk to Etron EJ188 xHCI host
Mathias Nyman mathias.nyman@linux.intel.com xhci: Set correct transferred length for cancelled bulk transfers
Greg Kroah-Hartman gregkh@linuxfoundation.org jfs: xattr: fix buffer overflow for invalid xattr
Mickaël Salaün mic@digikod.net landlock: Fix d_parent walk
Ilpo Järvinen ilpo.jarvinen@linux.intel.com tty: n_tty: Fix buffer offsets when lookahead is used
Tomas Winkler tomas.winkler@intel.com mei: me: release irq in mei_me_pci_resume error path
Kyle Tso kyletso@google.com usb: typec: tcpm: Ignore received Hard Reset in TOGGLING state
Amit Sunil Dhamne amitsd@google.com usb: typec: tcpm: fix use-after-free case in tcpm_register_source_caps
John Ernberg john.ernberg@actia.se USB: xen-hcd: Traverse host/ when CONFIG_USB_XEN_HCD is selected
Alan Stern stern@rowland.harvard.edu USB: class: cdc-wdm: Fix CPU lockup caused by excessive log messages
Jens Axboe axboe@kernel.dk io_uring: check for non-NULL file pointer in io_file_can_poll()
Ryusuke Konishi konishi.ryusuke@gmail.com nilfs2: fix nilfs_empty_dir() misjudgment and long loop on I/O errors
Matthew Wilcox (Oracle) willy@infradead.org nilfs2: return the mapped address from nilfs_get_page()
Filipe Manana fdmanana@suse.com btrfs: fix leak of qgroup extent records after transaction abort
Filipe Manana fdmanana@suse.com btrfs: make btrfs_destroy_delayed_refs() return void
Filipe Manana fdmanana@suse.com btrfs: remove unnecessary prototype declarations at disk-io.c
Dmitry Baryshkov dmitry.baryshkov@linaro.org wifi: ath10k: fix QCOM_RPROC_COMMON dependency
Dev Jain dev.jain@arm.com selftests/mm: compaction_test: fix bogus test success on Aarch64
Mark Brown broonie@kernel.org selftests/mm: log a consistent test name for check_compaction
Muhammad Usama Anjum usama.anjum@collabora.com selftests/mm: conform test to TAP format output
Dev Jain dev.jain@arm.com selftests/mm: compaction_test: fix incorrect write of zero to nr_hugepages
Hailong.Liu hailong.liu@oppo.com mm/vmalloc: fix vmalloc which may return null if called with __GFP_NOFAIL
Michal Hocko mhocko@suse.com mm, vmalloc: fix high order __GFP_NOFAIL allocations
Hamish Martin hamish.martin@alliedtelesis.co.nz i2c: acpi: Unbind mux adapters before delete
Russell King (Oracle) rmk+kernel@armlinux.org.uk i2c: add fwnode APIs
Johan Hovold johan+linaro@kernel.org HID: i2c-hid: elan: fix reset suspend current leakage
Cong Yang yangcong5@huaqin.corp-partner.google.com HID: i2c-hid: elan: Add ili9882t timing
Gabor Juhos j4g8y7@gmail.com firmware: qcom_scm: disable clocks if qcom_scm_bw_enable() fails
Uwe Kleine-König u.kleine-koenig@pengutronix.de mmc: davinci: Don't strip remove function when driver is builtin
Hugo Villeneuve hvilleneuve@dimonoff.com serial: sc16is7xx: fix bug in sc16is7xx_set_baud() when using prescaler
Hugo Villeneuve hvilleneuve@dimonoff.com serial: sc16is7xx: replace hardcoded divisor value with BIT() macro
Thomas Weißschuh linux@weissschuh.net misc/pvpanic-pci: register attributes via pci_driver
Thomas Weißschuh linux@weissschuh.net misc/pvpanic: deduplicate common code
Volodymyr Babchuk Volodymyr_Babchuk@epam.com arm64: dts: qcom: sa8155p-adp: fix SDHC2 CD pin configuration
Krzysztof Kozlowski krzysztof.kozlowski@linaro.org arm64: dts: qcom: sm8150: align TLMM pin configuration with DT schema
Hersen Wu hersenxs.wu@amd.com drm/amd/display: Fix incorrect DSC instance for MST
Alexey Kodanev aleksei.kodanev@bell-sw.com drm/amd/display: drop unnecessary NULL checks in debugfs
Max Filippov jcmvbkbc@gmail.com xtensa: fix MAKE_PC_FROM_RA second argument
Randy Dunlap rdunlap@infradead.org xtensa: stacktrace: include <asm/ftrace.h> for prototype
Hans de Goede hdegoede@redhat.com iio: accel: mxc4005: Reset chip on probe() and resume()
Luca Ceresoli luca.ceresoli@bootlin.com iio: accel: mxc4005: allow module autoloading via OF compatible
Wesley Cheng quic_wcheng@quicinc.com usb: gadget: f_fs: Fix race between aio_cancel() and AIO request complete
John Keeping john@metanate.com usb: gadget: f_fs: use io_data->status consistently
Qu Wenruo wqu@suse.com btrfs: fix wrong block_start calculation for btrfs_drop_extent_map_range()
Johan Hovold johan+linaro@kernel.org Bluetooth: qca: fix invalid device address check
Eric Dumazet edumazet@google.com ipv6: fix possible race in __fib6_drop_pcpu_from()
Kuniyuki Iwashima kuniyu@amazon.com af_unix: Annotate data-race of sk->sk_shutdown in sk_diag_fill().
Kuniyuki Iwashima kuniyu@amazon.com af_unix: Use skb_queue_len_lockless() in sk_diag_show_rqlen().
Kuniyuki Iwashima kuniyu@amazon.com af_unix: Use skb_queue_empty_lockless() in unix_release_sock().
Eric Dumazet edumazet@google.com af_unix: annotate lockless accesses to sk->sk_err
Kuniyuki Iwashima kuniyu@amazon.com af_unix: Use unix_recvq_full_lockless() in unix_stream_connect().
Kuniyuki Iwashima kuniyu@amazon.com af_unix: Annotate data-race of net->unx.sysctl_max_dgram_qlen.
Kuniyuki Iwashima kuniyu@amazon.com af_unix: Annotate data-races around sk->sk_state in UNIX_DIAG.
Kuniyuki Iwashima kuniyu@amazon.com af_unix: Annotate data-race of sk->sk_state in unix_stream_read_skb().
Kuniyuki Iwashima kuniyu@amazon.com af_unix: Annotate data-races around sk->sk_state in sendmsg() and recvmsg().
Kuniyuki Iwashima kuniyu@amazon.com af_unix: Annotate data-race of sk->sk_state in unix_stream_connect().
Kuniyuki Iwashima kuniyu@amazon.com af_unix: Annotate data-races around sk->sk_state in unix_write_space() and poll().
Kuniyuki Iwashima kuniyu@amazon.com af_unix: Annotate data-race of sk->sk_state in unix_inq_len().
Kuniyuki Iwashima kuniyu@amazon.com af_unix: Annodate data-races around sk->sk_state for writers.
Kuniyuki Iwashima kuniyu@amazon.com af_unix: Set sk->sk_state under unix_state_lock() for truly disconencted peer.
Aleksandr Mishin amishin@t-argos.ru net: wwan: iosm: Fix tainted pointer delete is case of region creation fail
Larysa Zaremba larysa.zaremba@intel.com ice: remove af_xdp_zc_qps bitmap
Przemek Kitszel przemyslaw.kitszel@intel.com ice: remove null checks before devm_kfree() calls
Michal Wilczynski michal.wilczynski@intel.com ice: Introduce new parameters in ice_sched_node
Jacob Keller jacob.e.keller@intel.com ice: fix iteration of TLVs in Preserved Fields Area
Karol Kolacinski karol.kolacinski@intel.com ptp: Fix error message on failed pin verification
Eric Dumazet edumazet@google.com net/sched: taprio: always validate TCA_TAPRIO_ATTR_PRIOMAP
Aleksandr Mishin amishin@t-argos.ru net/mlx5: Fix tainted pointer delete is case of flow rules creation fail
Shay Drory shayd@nvidia.com net/mlx5: Always stop health timer during driver removal
Shay Drory shayd@nvidia.com net/mlx5: Split function_setup() to enable and open functions
Moshe Shemesh moshe@nvidia.com net/mlx5: Stop waiting for PCI if pci channel is offline
Moshe Shemesh moshe@mellanox.com net/mlx5: Stop waiting for PCI up if teardown was triggered
Jason Xing kernelxing@tencent.com tcp: count CLOSE-WAIT sockets for TCP_MIB_CURRESTAB
Daniel Borkmann daniel@iogearbox.net vxlan: Fix regression when dropping packets due to invalid src addresses
Hangyu Hua hbh25y@gmail.com net: sched: sch_multiq: fix possible OOB write in multiq_tune()
Wen Gu guwen@linux.alibaba.com net/smc: avoid overwriting when adjusting sock bufsizes
Subbaraya Sundeep sbhatta@marvell.com octeontx2-af: Always allocate PF entries from low prioriy zone
Jiri Olsa jolsa@kernel.org bpf: Set run context for rawtp test_run callback
Eric Dumazet edumazet@google.com ipv6: sr: block BH in seg6_output_core() and seg6_input_core()
Eric Dumazet edumazet@google.com ipv6: ioam: block BH from ioam6_output()
DelphineCCChiu delphine_cc_chiu@wiwynn.com net/ncsi: Fix the multi thread manner of NCSI driver
Peter Delevoryas peter@pjd.dev net/ncsi: Simplify Kconfig/dts control flow
Duoming Zhou duoming@zju.edu.cn ax25: Replace kfree() in ax25_dev_free() with ax25_dev_put()
Lars Kellogg-Stedman lars@oddbit.com ax25: Fix refcount imbalance on inbound connections
Lingbo Kong quic_lingbok@quicinc.com wifi: mac80211: correctly parse Spatial Reuse Parameter Set element
Emmanuel Grumbach emmanuel.grumbach@intel.com wifi: iwlwifi: mvm: don't read past the mfuart notifcation
Miri Korenblit miriam.rachel.korenblit@intel.com wifi: iwlwifi: mvm: check n_ssids before accessing the ssids
Shahar S Matityahu shahar.s.matityahu@intel.com wifi: iwlwifi: dbg_ini: move iwl_dbg_tlv_free outside of debugfs ifdef
Johannes Berg johannes.berg@intel.com wifi: iwlwifi: mvm: revert gen2 TX A-MPDU size to 64
Lin Ma linma@zju.edu.cn wifi: cfg80211: pmsr: use correct nla_get_uX functions
Remi Pommarel repk@triplefau.lt wifi: cfg80211: Lock wiphy in cfg80211_get_station
Johannes Berg johannes.berg@intel.com wifi: cfg80211: fully move wiphy work to unbound workqueue
Remi Pommarel repk@triplefau.lt wifi: mac80211: Fix deadlock in ieee80211_sta_ps_deliver_wakeup()
Nicolas Escande nico.escande@gmail.com wifi: mac80211: mesh: Fix leak of mesh_preq_queue objects
-------------
Diffstat:
Makefile | 4 +- arch/arm64/boot/dts/qcom/sa8155p-adp.dts | 86 +-- .../boot/dts/qcom/sm8150-microsoft-surface-duo.dts | 2 +- arch/arm64/boot/dts/qcom/sm8150.dtsi | 376 ++++------ arch/powerpc/include/asm/uaccess.h | 15 +- arch/riscv/mm/init.c | 21 +- arch/riscv/mm/pageattr.c | 28 +- arch/x86/boot/compressed/Makefile | 4 +- arch/x86/kernel/amd_nb.c | 9 +- arch/xtensa/include/asm/processor.h | 8 +- arch/xtensa/include/asm/ptrace.h | 2 +- arch/xtensa/kernel/process.c | 5 +- arch/xtensa/kernel/stacktrace.c | 4 +- drivers/base/core.c | 3 + drivers/block/null_blk/zoned.c | 2 +- drivers/bluetooth/btqca.c | 46 +- drivers/bluetooth/btqca.h | 2 + drivers/bluetooth/hci_qca.c | 2 - drivers/clk/sifive/sifive-prci.c | 8 - drivers/dma/dma-axi-dmac.c | 2 +- drivers/firmware/qcom_scm.c | 18 +- drivers/gpio/Kconfig | 2 +- drivers/gpio/gpio-tqmx86.c | 136 +++- .../drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c | 120 ++- .../drm/arm/display/komeda/komeda_pipeline_state.c | 2 +- drivers/gpu/drm/bridge/panel.c | 7 +- drivers/gpu/drm/exynos/exynos_drm_vidi.c | 7 +- drivers/gpu/drm/exynos/exynos_hdmi.c | 7 +- drivers/gpu/drm/i915/gem/i915_gem_object.h | 4 +- drivers/gpu/drm/i915/gt/intel_breadcrumbs.c | 15 +- drivers/gpu/drm/vmwgfx/Kconfig | 7 - drivers/gpu/drm/vmwgfx/Makefile | 2 - drivers/gpu/drm/vmwgfx/vmwgfx_drv.c | 65 +- drivers/gpu/drm/vmwgfx/vmwgfx_drv.h | 38 +- drivers/gpu/drm/vmwgfx/vmwgfx_fb.c | 831 --------------------- drivers/gpu/drm/vmwgfx/vmwgfx_kms.c | 353 +++------ drivers/gpu/drm/vmwgfx/vmwgfx_kms.h | 13 +- drivers/gpu/drm/vmwgfx/vmwgfx_ldu.c | 5 +- drivers/gpu/drm/vmwgfx/vmwgfx_scrn.c | 5 +- drivers/gpu/drm/vmwgfx/vmwgfx_stdu.c | 47 +- drivers/greybus/interface.c | 1 + drivers/hid/hid-core.c | 1 - drivers/hid/hid-logitech-dj.c | 4 +- drivers/hid/i2c-hid/i2c-hid-of-elan.c | 103 ++- drivers/hwtracing/intel_th/pci.c | 25 + drivers/i2c/busses/i2c-at91-slave.c | 3 +- drivers/i2c/busses/i2c-designware-slave.c | 2 +- drivers/i2c/i2c-core-acpi.c | 30 +- drivers/i2c/i2c-core-base.c | 98 +++ drivers/i2c/i2c-core-of.c | 66 -- drivers/iio/accel/mxc4005.c | 76 ++ drivers/iio/adc/ad9467.c | 4 +- drivers/iio/dac/ad5592r-base.c | 2 +- drivers/iio/imu/inv_icm42600/inv_icm42600_accel.c | 4 - drivers/iio/imu/inv_icm42600/inv_icm42600_gyro.c | 4 - drivers/input/input.c | 104 ++- drivers/iommu/amd/init.c | 9 + drivers/irqchip/irq-gic-v3-its.c | 44 +- drivers/misc/mchp_pci1xxxx/mchp_pci1xxxx_gp.c | 9 +- drivers/misc/mei/pci-me.c | 4 +- drivers/misc/pvpanic/pvpanic-mmio.c | 58 +- drivers/misc/pvpanic/pvpanic-pci.c | 60 +- drivers/misc/pvpanic/pvpanic.c | 76 +- drivers/misc/pvpanic/pvpanic.h | 10 +- drivers/misc/vmw_vmci/vmci_event.c | 6 +- drivers/mmc/host/davinci_mmc.c | 4 +- drivers/net/ethernet/broadcom/bnxt/bnxt_hwrm.c | 2 +- drivers/net/ethernet/cavium/liquidio/lio_vf_rep.c | 11 +- drivers/net/ethernet/google/gve/gve_rx_dqo.c | 8 +- drivers/net/ethernet/google/gve/gve_tx_dqo.c | 20 +- drivers/net/ethernet/hisilicon/hns3/hns3_enet.c | 4 + drivers/net/ethernet/hisilicon/hns3/hns3_enet.h | 2 + .../ethernet/hisilicon/hns3/hns3pf/hclge_main.c | 21 +- drivers/net/ethernet/intel/ice/ice.h | 32 +- drivers/net/ethernet/intel/ice/ice_adminq_cmd.h | 4 +- drivers/net/ethernet/intel/ice/ice_common.c | 9 +- drivers/net/ethernet/intel/ice/ice_controlq.c | 3 +- drivers/net/ethernet/intel/ice/ice_flow.c | 23 +- drivers/net/ethernet/intel/ice/ice_lib.c | 46 +- drivers/net/ethernet/intel/ice/ice_nvm.c | 28 +- drivers/net/ethernet/intel/ice/ice_sched.c | 90 ++- drivers/net/ethernet/intel/ice/ice_sched.h | 27 + 
drivers/net/ethernet/intel/ice/ice_switch.c | 19 +- drivers/net/ethernet/intel/ice/ice_type.h | 8 + drivers/net/ethernet/intel/ice/ice_xsk.c | 13 +- .../net/ethernet/marvell/octeontx2/af/rvu_npc.c | 33 +- drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 3 +- drivers/net/ethernet/mellanox/mlx5/core/fw.c | 4 + drivers/net/ethernet/mellanox/mlx5/core/health.c | 12 + .../net/ethernet/mellanox/mlx5/core/lag/port_sel.c | 8 +- .../net/ethernet/mellanox/mlx5/core/lib/pci_vsc.c | 4 + drivers/net/ethernet/mellanox/mlx5/core/main.c | 116 ++- drivers/net/ethernet/pensando/ionic/ionic_lif.c | 4 +- drivers/net/ethernet/stmicro/stmmac/stmmac_tc.c | 25 +- drivers/net/geneve.c | 10 +- drivers/net/phy/sfp.c | 3 +- drivers/net/vxlan/vxlan_core.c | 4 + drivers/net/wireless/ath/ath10k/Kconfig | 1 + drivers/net/wireless/intel/iwlwifi/iwl-drv.c | 2 +- drivers/net/wireless/intel/iwlwifi/mvm/fw.c | 10 - drivers/net/wireless/intel/iwlwifi/mvm/rs.h | 9 +- drivers/net/wireless/intel/iwlwifi/mvm/scan.c | 4 +- drivers/net/wwan/iosm/iosm_ipc_devlink.c | 2 +- drivers/nvme/target/passthru.c | 6 +- drivers/pci/controller/pcie-rockchip-ep.c | 6 +- drivers/platform/x86/dell/dell-smbios-base.c | 92 +-- drivers/ptp/ptp_chardev.c | 3 +- drivers/remoteproc/ti_k3_r5_remoteproc.c | 58 +- drivers/scsi/mpi3mr/mpi3mr_app.c | 62 ++ drivers/scsi/mpt3sas/mpt3sas_base.c | 19 + drivers/scsi/mpt3sas/mpt3sas_base.h | 3 - drivers/scsi/mpt3sas/mpt3sas_ctl.c | 4 +- drivers/scsi/mpt3sas/mpt3sas_scsih.c | 23 - drivers/scsi/scsi_transport_sas.c | 23 + drivers/scsi/sd.c | 17 +- drivers/spmi/hisi-spmi-controller.c | 1 - drivers/thunderbolt/debugfs.c | 5 +- drivers/tty/n_tty.c | 22 +- drivers/tty/serial/8250/8250_dw.c | 5 +- drivers/tty/serial/8250/8250_pxa.c | 1 + drivers/tty/serial/sc16is7xx.c | 25 +- drivers/usb/Makefile | 1 + drivers/usb/class/cdc-wdm.c | 4 +- drivers/usb/gadget/function/f_fs.c | 9 +- drivers/usb/host/xhci-pci.c | 7 + drivers/usb/host/xhci-ring.c | 59 +- drivers/usb/host/xhci.h | 1 + drivers/usb/storage/alauda.c | 9 +- drivers/usb/typec/tcpm/tcpm.c | 5 +- fs/btrfs/disk-io.c | 26 +- fs/btrfs/extent_map.c | 2 +- fs/btrfs/zoned.c | 330 ++++---- fs/cachefiles/daemon.c | 4 +- fs/cachefiles/interface.c | 7 +- fs/cachefiles/internal.h | 52 +- fs/cachefiles/ondemand.c | 324 ++++++-- fs/jfs/xattr.c | 4 +- fs/nfs/dir.c | 47 +- fs/nfs/nfs4proc.c | 23 +- fs/nfsd/nfsfh.c | 4 +- fs/nilfs2/dir.c | 59 +- fs/nilfs2/segment.c | 3 + fs/ocfs2/file.c | 2 + fs/ocfs2/namei.c | 2 +- fs/proc/vmcore.c | 2 + include/linux/i2c.h | 26 +- include/linux/pse-pd/pse.h | 4 +- include/linux/serial_core.h | 1 + include/net/bluetooth/hci_core.h | 36 +- include/net/ip_tunnels.h | 5 +- include/scsi/scsi_transport_sas.h | 2 + include/trace/events/cachefiles.h | 8 +- io_uring/kbuf.c | 3 +- kernel/events/core.c | 13 + kernel/fork.c | 18 +- kernel/pid_namespace.c | 1 + kernel/time/tick-common.c | 42 +- mm/memory-failure.c | 11 +- mm/vmalloc.c | 29 +- net/ax25/af_ax25.c | 6 + net/ax25/ax25_dev.c | 2 +- net/bluetooth/l2cap_core.c | 8 +- net/bpf/test_run.c | 6 + net/bridge/br_mst.c | 13 +- net/core/sock_map.c | 16 +- net/ipv4/tcp.c | 6 +- net/ipv6/ioam6_iptunnel.c | 8 +- net/ipv6/ip6_fib.c | 6 +- net/ipv6/route.c | 5 +- net/ipv6/seg6_iptunnel.c | 14 +- net/ipv6/tcp_ipv6.c | 3 +- net/mac80211/he.c | 10 +- net/mac80211/mesh_pathtbl.c | 13 + net/mac80211/sta_info.c | 4 +- net/mptcp/pm_netlink.c | 21 +- net/mptcp/protocol.c | 1 + net/ncsi/internal.h | 2 + net/ncsi/ncsi-manage.c | 95 +-- net/ncsi/ncsi-rsp.c | 4 +- net/netfilter/ipset/ip_set_core.c | 93 ++- 
net/netfilter/ipset/ip_set_list_set.c | 30 +- net/sched/sch_multiq.c | 2 +- net/sched/sch_taprio.c | 15 +- net/smc/af_smc.c | 22 +- net/sunrpc/auth_gss/auth_gss.c | 4 +- net/unix/af_unix.c | 109 ++- net/unix/diag.c | 12 +- net/wireless/core.c | 2 +- net/wireless/pmsr.c | 8 +- net/wireless/sysfs.c | 4 +- net/wireless/util.c | 7 +- security/landlock/fs.c | 13 +- tools/testing/cxl/test/mem.c | 1 + .../ftrace/test.d/dynevent/test_duplicates.tc | 2 +- .../ftrace/test.d/kprobe/kprobe_eventname.tc | 3 +- tools/testing/selftests/net/mptcp/mptcp_join.sh | 5 +- tools/testing/selftests/vm/compaction_test.c | 108 +-- 197 files changed, 2950 insertions(+), 3037 deletions(-)
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nicolas Escande nico.escande@gmail.com
[ Upstream commit b7d7f11a291830fdf69d3301075dd0fb347ced84 ]
The hwmp code uses objects of type mesh_preq_queue, added to a list in ieee80211_if_mesh, to keep track of the mpaths we need to resolve. If the mpath gets deleted, e.g. because the mesh interface is removed, the entries in that list are never cleaned up. Fix this by flushing all corresponding items of the preq_queue in mesh_path_flush_pending().
This should take care of kmemleak reports like the following:
unreferenced object 0xffff00000668d800 (size 128): comm "kworker/u8:4", pid 67, jiffies 4295419552 (age 1836.444s) hex dump (first 32 bytes): 00 1f 05 09 00 00 ff ff 00 d5 68 06 00 00 ff ff ..........h..... 8e 97 ea eb 3e b8 01 00 00 00 00 00 00 00 00 00 ....>........... backtrace: [<000000007302a0b6>] __kmem_cache_alloc_node+0x1e0/0x35c [<00000000049bd418>] kmalloc_trace+0x34/0x80 [<0000000000d792bb>] mesh_queue_preq+0x44/0x2a8 [<00000000c99c3696>] mesh_nexthop_resolve+0x198/0x19c [<00000000926bf598>] ieee80211_xmit+0x1d0/0x1f4 [<00000000fc8c2284>] __ieee80211_subif_start_xmit+0x30c/0x764 [<000000005926ee38>] ieee80211_subif_start_xmit+0x9c/0x7a4 [<000000004c86e916>] dev_hard_start_xmit+0x174/0x440 [<0000000023495647>] __dev_queue_xmit+0xe24/0x111c [<00000000cfe9ca78>] batadv_send_skb_packet+0x180/0x1e4 [<000000007bacc5d5>] batadv_v_elp_periodic_work+0x2f4/0x508 [<00000000adc3cd94>] process_one_work+0x4b8/0xa1c [<00000000b36425d1>] worker_thread+0x9c/0x634 [<0000000005852dd5>] kthread+0x1bc/0x1c4 [<000000005fccd770>] ret_from_fork+0x10/0x20 unreferenced object 0xffff000009051f00 (size 128): comm "kworker/u8:4", pid 67, jiffies 4295419553 (age 1836.440s) hex dump (first 32 bytes): 90 d6 92 0d 00 00 ff ff 00 d8 68 06 00 00 ff ff ..........h..... 36 27 92 e4 02 e0 01 00 00 58 79 06 00 00 ff ff 6'.......Xy..... backtrace: [<000000007302a0b6>] __kmem_cache_alloc_node+0x1e0/0x35c [<00000000049bd418>] kmalloc_trace+0x34/0x80 [<0000000000d792bb>] mesh_queue_preq+0x44/0x2a8 [<00000000c99c3696>] mesh_nexthop_resolve+0x198/0x19c [<00000000926bf598>] ieee80211_xmit+0x1d0/0x1f4 [<00000000fc8c2284>] __ieee80211_subif_start_xmit+0x30c/0x764 [<000000005926ee38>] ieee80211_subif_start_xmit+0x9c/0x7a4 [<000000004c86e916>] dev_hard_start_xmit+0x174/0x440 [<0000000023495647>] __dev_queue_xmit+0xe24/0x111c [<00000000cfe9ca78>] batadv_send_skb_packet+0x180/0x1e4 [<000000007bacc5d5>] batadv_v_elp_periodic_work+0x2f4/0x508 [<00000000adc3cd94>] process_one_work+0x4b8/0xa1c [<00000000b36425d1>] worker_thread+0x9c/0x634 [<0000000005852dd5>] kthread+0x1bc/0x1c4 [<000000005fccd770>] ret_from_fork+0x10/0x20
Fixes: 050ac52cbe1f ("mac80211: code for on-demand Hybrid Wireless Mesh Protocol") Signed-off-by: Nicolas Escande nico.escande@gmail.com Link: https://msgid.link/20240528142605.1060566-1-nico.escande@gmail.com Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/mac80211/mesh_pathtbl.c | 13 +++++++++++++ 1 file changed, 13 insertions(+)
diff --git a/net/mac80211/mesh_pathtbl.c b/net/mac80211/mesh_pathtbl.c
index 69d5e1ec6edef..e7b9dcf30adc9 100644
--- a/net/mac80211/mesh_pathtbl.c
+++ b/net/mac80211/mesh_pathtbl.c
@@ -723,10 +723,23 @@ void mesh_path_discard_frame(struct ieee80211_sub_if_data *sdata,
  */
 void mesh_path_flush_pending(struct mesh_path *mpath)
 {
+        struct ieee80211_sub_if_data *sdata = mpath->sdata;
+        struct ieee80211_if_mesh *ifmsh = &sdata->u.mesh;
+        struct mesh_preq_queue *preq, *tmp;
         struct sk_buff *skb;
 
         while ((skb = skb_dequeue(&mpath->frame_queue)) != NULL)
                 mesh_path_discard_frame(mpath->sdata, skb);
+
+        spin_lock_bh(&ifmsh->mesh_preq_queue_lock);
+        list_for_each_entry_safe(preq, tmp, &ifmsh->preq_queue.list, list) {
+                if (ether_addr_equal(mpath->dst, preq->dst)) {
+                        list_del(&preq->list);
+                        kfree(preq);
+                        --ifmsh->preq_queue_len;
+                }
+        }
+        spin_unlock_bh(&ifmsh->mesh_preq_queue_lock);
 }
 
 /**
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Remi Pommarel repk@triplefau.lt
[ Upstream commit 44c06bbde6443de206b30f513100b5670b23fc5e ]
The ieee80211_sta_ps_deliver_wakeup() function takes sta->ps_lock to synchronize with ieee80211_tx_h_unicast_ps_buf(), which is called from softirq context. However, using plain spin_lock() to take sta->ps_lock in ieee80211_sta_ps_deliver_wakeup() does not prevent a softirq from running on the same CPU, executing ieee80211_tx_h_unicast_ps_buf() and trying to take the same lock, which ends in a deadlock. Below is an example of the RCU stall that arises in such a situation.
rcu: INFO: rcu_sched self-detected stall on CPU rcu: 2-....: (42413413 ticks this GP) idle=b154/1/0x4000000000000000 softirq=1763/1765 fqs=21206996 rcu: (t=42586894 jiffies g=2057 q=362405 ncpus=4) CPU: 2 PID: 719 Comm: wpa_supplicant Tainted: G W 6.4.0-02158-g1b062f552873 #742 Hardware name: RPT (r1) (DT) pstate: 00000005 (nzcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : queued_spin_lock_slowpath+0x58/0x2d0 lr : invoke_tx_handlers_early+0x5b4/0x5c0 sp : ffff00001ef64660 x29: ffff00001ef64660 x28: ffff000009bc1070 x27: ffff000009bc0ad8 x26: ffff000009bc0900 x25: ffff00001ef647a8 x24: 0000000000000000 x23: ffff000009bc0900 x22: ffff000009bc0900 x21: ffff00000ac0e000 x20: ffff00000a279e00 x19: ffff00001ef646e8 x18: 0000000000000000 x17: ffff800016468000 x16: ffff00001ef608c0 x15: 0010533c93f64f80 x14: 0010395c9faa3946 x13: 0000000000000000 x12: 00000000fa83b2da x11: 000000012edeceea x10: ffff0000010fbe00 x9 : 0000000000895440 x8 : 000000000010533c x7 : ffff00000ad8b740 x6 : ffff00000c350880 x5 : 0000000000000007 x4 : 0000000000000001 x3 : 0000000000000000 x2 : 0000000000000000 x1 : 0000000000000001 x0 : ffff00000ac0e0e8 Call trace: queued_spin_lock_slowpath+0x58/0x2d0 ieee80211_tx+0x80/0x12c ieee80211_tx_pending+0x110/0x278 tasklet_action_common.constprop.0+0x10c/0x144 tasklet_action+0x20/0x28 _stext+0x11c/0x284 ____do_softirq+0xc/0x14 call_on_irq_stack+0x24/0x34 do_softirq_own_stack+0x18/0x20 do_softirq+0x74/0x7c __local_bh_enable_ip+0xa0/0xa4 _ieee80211_wake_txqs+0x3b0/0x4b8 __ieee80211_wake_queue+0x12c/0x168 ieee80211_add_pending_skbs+0xec/0x138 ieee80211_sta_ps_deliver_wakeup+0x2a4/0x480 ieee80211_mps_sta_status_update.part.0+0xd8/0x11c ieee80211_mps_sta_status_update+0x18/0x24 sta_apply_parameters+0x3bc/0x4c0 ieee80211_change_station+0x1b8/0x2dc nl80211_set_station+0x444/0x49c genl_family_rcv_msg_doit.isra.0+0xa4/0xfc genl_rcv_msg+0x1b0/0x244 netlink_rcv_skb+0x38/0x10c genl_rcv+0x34/0x48 netlink_unicast+0x254/0x2bc netlink_sendmsg+0x190/0x3b4 ____sys_sendmsg+0x1e8/0x218 ___sys_sendmsg+0x68/0x8c __sys_sendmsg+0x44/0x84 __arm64_sys_sendmsg+0x20/0x28 do_el0_svc+0x6c/0xe8 el0_svc+0x14/0x48 el0t_64_sync_handler+0xb0/0xb4 el0t_64_sync+0x14c/0x150
Using spin_lock_bh()/spin_unlock_bh() instead prevents softirqs from running on the same CPU while the lock is held.
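As a general illustration of the pattern (a minimal sketch, not taken from the patch; the lock and function names below are made up), a lock that is shared with softirq context must be taken with the _bh variant in process context:

/* Sketch only: why a lock shared with softirq context needs spin_lock_bh(). */
#include <linux/spinlock.h>

static DEFINE_SPINLOCK(shared_lock);

/* Runs in softirq context, e.g. from a TX tasklet. */
static void softirq_side(void)
{
        spin_lock(&shared_lock);
        /* ... touch state shared with process context ... */
        spin_unlock(&shared_lock);
}

/* Runs in process context. */
static void process_side(void)
{
        /*
         * Plain spin_lock() would leave softirqs enabled on this CPU; if one
         * fires while the lock is held and calls softirq_side(), it spins on
         * a lock its own CPU already owns and the CPU deadlocks.
         */
        spin_lock_bh(&shared_lock);
        /* ... touch the shared state ... */
        spin_unlock_bh(&shared_lock);
}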
Fixes: 1d147bfa6429 ("mac80211: fix AP powersave TX vs. wakeup race") Signed-off-by: Remi Pommarel repk@triplefau.lt Link: https://msgid.link/8e36fe07d0fbc146f89196cd47a53c8a0afe84aa.1716910344.git.r... Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/mac80211/sta_info.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/net/mac80211/sta_info.c b/net/mac80211/sta_info.c
index bd56015b29258..f388b39531748 100644
--- a/net/mac80211/sta_info.c
+++ b/net/mac80211/sta_info.c
@@ -1555,7 +1555,7 @@ void ieee80211_sta_ps_deliver_wakeup(struct sta_info *sta)
         skb_queue_head_init(&pending);
 
         /* sync with ieee80211_tx_h_unicast_ps_buf */
-        spin_lock(&sta->ps_lock);
+        spin_lock_bh(&sta->ps_lock);
         /* Send all buffered frames to the station */
         for (ac = 0; ac < IEEE80211_NUM_ACS; ac++) {
                 int count = skb_queue_len(&pending), tmp;
@@ -1584,7 +1584,7 @@ void ieee80211_sta_ps_deliver_wakeup(struct sta_info *sta)
          */
         clear_sta_flag(sta, WLAN_STA_PSPOLL);
         clear_sta_flag(sta, WLAN_STA_UAPSD);
-        spin_unlock(&sta->ps_lock);
+        spin_unlock_bh(&sta->ps_lock);
 
         atomic_dec(&ps->num_sta_ps);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Johannes Berg johannes.berg@intel.com
[ Upstream commit e296c95eac655008d5a709b8cf54d0018da1c916 ]
Previously I had moved the wiphy work to the unbound system workqueue, but missed that it was still being queued on the normal system workqueue when it restarts itself and during resume. Fix that.
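For context (an illustrative sketch only; more_to_do() is a hypothetical helper): schedule_work() always targets system_wq, so work that is meant to run on system_unbound_wq has to be queued on that workqueue explicitly, including when it re-arms itself:

/* Sketch: keep self-requeuing work on the workqueue it was designed for. */
#include <linux/workqueue.h>

static bool more_to_do(void);   /* hypothetical helper */
static void my_work_fn(struct work_struct *work);
static DECLARE_WORK(my_work, my_work_fn);

static void my_work_fn(struct work_struct *work)
{
        /* ... process one batch ... */

        if (more_to_do())
                /* schedule_work(work) would silently move this to system_wq */
                queue_work(system_unbound_wq, work);
}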
Fixes: 91d20ab9d9ca ("wifi: cfg80211: use system_unbound_wq for wiphy work") Reviewed-by: Miriam Rachel Korenblit miriam.rachel.korenblit@intel.com Link: https://msgid.link/20240522124126.7ca959f2cbd3.I3e2a71ef445d167b84000ccf934e... Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/wireless/core.c | 2 +- net/wireless/sysfs.c | 4 ++-- 2 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/net/wireless/core.c b/net/wireless/core.c
index 3fcddc8687ed4..22f67b64135d2 100644
--- a/net/wireless/core.c
+++ b/net/wireless/core.c
@@ -427,7 +427,7 @@ static void cfg80211_wiphy_work(struct work_struct *work)
         if (wk) {
                 list_del_init(&wk->entry);
                 if (!list_empty(&rdev->wiphy_work_list))
-                        schedule_work(work);
+                        queue_work(system_unbound_wq, work);
                 spin_unlock_irq(&rdev->wiphy_work_lock);
 
                 wk->func(&rdev->wiphy, wk);
diff --git a/net/wireless/sysfs.c b/net/wireless/sysfs.c
index a88f338c61d31..17ccb9c6091e8 100644
--- a/net/wireless/sysfs.c
+++ b/net/wireless/sysfs.c
@@ -5,7 +5,7 @@
  *
  * Copyright 2005-2006 Jiri Benc jbenc@suse.cz
  * Copyright 2006 Johannes Berg johannes@sipsolutions.net
- * Copyright (C) 2020-2021, 2023 Intel Corporation
+ * Copyright (C) 2020-2021, 2023-2024 Intel Corporation
  */
 
 #include <linux/device.h>
@@ -137,7 +137,7 @@ static int wiphy_resume(struct device *dev)
         if (rdev->wiphy.registered && rdev->ops->resume)
                 ret = rdev_resume(rdev);
         rdev->suspended = false;
-        schedule_work(&rdev->wiphy_work);
+        queue_work(system_unbound_wq, &rdev->wiphy_work);
         wiphy_unlock(&rdev->wiphy);
 
         if (ret)
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Remi Pommarel repk@triplefau.lt
[ Upstream commit 642f89daa34567d02f312d03e41523a894906dae ]
Wiphy should be locked before calling rdev_get_station() (see lockdep assert in ieee80211_get_station()).
This fixes the following kernel NULL dereference:
Unable to handle kernel NULL pointer dereference at virtual address 0000000000000050 Mem abort info: ESR = 0x0000000096000006 EC = 0x25: DABT (current EL), IL = 32 bits SET = 0, FnV = 0 EA = 0, S1PTW = 0 FSC = 0x06: level 2 translation fault Data abort info: ISV = 0, ISS = 0x00000006 CM = 0, WnR = 0 user pgtable: 4k pages, 48-bit VAs, pgdp=0000000003001000 [0000000000000050] pgd=0800000002dca003, p4d=0800000002dca003, pud=08000000028e9003, pmd=0000000000000000 Internal error: Oops: 0000000096000006 [#1] SMP Modules linked in: netconsole dwc3_meson_g12a dwc3_of_simple dwc3 ip_gre gre ath10k_pci ath10k_core ath9k ath9k_common ath9k_hw ath CPU: 0 PID: 1091 Comm: kworker/u8:0 Not tainted 6.4.0-02144-g565f9a3a7911-dirty #705 Hardware name: RPT (r1) (DT) Workqueue: bat_events batadv_v_elp_throughput_metric_update pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : ath10k_sta_statistics+0x10/0x2dc [ath10k_core] lr : sta_set_sinfo+0xcc/0xbd4 sp : ffff000007b43ad0 x29: ffff000007b43ad0 x28: ffff0000071fa900 x27: ffff00000294ca98 x26: ffff000006830880 x25: ffff000006830880 x24: ffff00000294c000 x23: 0000000000000001 x22: ffff000007b43c90 x21: ffff800008898acc x20: ffff00000294c6e8 x19: ffff000007b43c90 x18: 0000000000000000 x17: 445946354d552d78 x16: 62661f7200000000 x15: 57464f445946354d x14: 0000000000000000 x13: 00000000000000e3 x12: d5f0acbcebea978e x11: 00000000000000e3 x10: 000000010048fe41 x9 : 0000000000000000 x8 : ffff000007b43d90 x7 : 000000007a1e2125 x6 : 0000000000000000 x5 : ffff0000024e0900 x4 : ffff800000a0250c x3 : ffff000007b43c90 x2 : ffff00000294ca98 x1 : ffff000006831920 x0 : 0000000000000000 Call trace: ath10k_sta_statistics+0x10/0x2dc [ath10k_core] sta_set_sinfo+0xcc/0xbd4 ieee80211_get_station+0x2c/0x44 cfg80211_get_station+0x80/0x154 batadv_v_elp_get_throughput+0x138/0x1fc batadv_v_elp_throughput_metric_update+0x1c/0xa4 process_one_work+0x1ec/0x414 worker_thread+0x70/0x46c kthread+0xdc/0xe0 ret_from_fork+0x10/0x20 Code: a9bb7bfd 910003fd a90153f3 f9411c40 (f9402814)
This happens because the STA has time to disconnect and reconnect before the batadv_v_elp_throughput_metric_update() delayed work gets scheduled. In this situation, ath10k_sta_state() can be in the middle of resetting the arsta data when the work gets a chance to run and ends up accessing it. Locking the wiphy prevents that.
Fixes: 7406353d43c8 ("cfg80211: implement cfg80211_get_station cfg80211 API") Signed-off-by: Remi Pommarel repk@triplefau.lt Reviewed-by: Nicolas Escande nico.escande@gmail.com Acked-by: Antonio Quartulli a@unstable.cc Link: https://msgid.link/983b24a6a176e0800c01aedcd74480d9b551cb13.1716046653.git.r... Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/wireless/util.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/net/wireless/util.c b/net/wireless/util.c
index f433f3fdd9e94..73b3648e1b4c3 100644
--- a/net/wireless/util.c
+++ b/net/wireless/util.c
@@ -2202,6 +2202,7 @@ int cfg80211_get_station(struct net_device *dev, const u8 *mac_addr,
 {
         struct cfg80211_registered_device *rdev;
         struct wireless_dev *wdev;
+        int ret;
 
         wdev = dev->ieee80211_ptr;
         if (!wdev)
@@ -2213,7 +2214,11 @@ int cfg80211_get_station(struct net_device *dev, const u8 *mac_addr,
 
         memset(sinfo, 0, sizeof(*sinfo));
 
-        return rdev_get_station(rdev, dev, mac_addr, sinfo);
+        wiphy_lock(&rdev->wiphy);
+        ret = rdev_get_station(rdev, dev, mac_addr, sinfo);
+        wiphy_unlock(&rdev->wiphy);
+
+        return ret;
 }
 EXPORT_SYMBOL(cfg80211_get_station);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Lin Ma linma@zju.edu.cn
[ Upstream commit ab904521f4de52fef4f179d2dfc1877645ef5f5c ]
The commit 9bb7e0f24e7e ("cfg80211: add peer measurement with FTM initiator API") defines the four attributes NL80211_PMSR_FTM_REQ_ATTR_{NUM_BURSTS_EXP}/{BURST_PERIOD}/{BURST_DURATION}/{FTMS_PER_BURST} as follows:
static const struct nla_policy
nl80211_pmsr_ftm_req_attr_policy[NL80211_PMSR_FTM_REQ_ATTR_MAX + 1] = {
	...
	[NL80211_PMSR_FTM_REQ_ATTR_NUM_BURSTS_EXP] = NLA_POLICY_MAX(NLA_U8, 15),
	[NL80211_PMSR_FTM_REQ_ATTR_BURST_PERIOD] = { .type = NLA_U16 },
	[NL80211_PMSR_FTM_REQ_ATTR_BURST_DURATION] = NLA_POLICY_MAX(NLA_U8, 15),
	[NL80211_PMSR_FTM_REQ_ATTR_FTMS_PER_BURST] = NLA_POLICY_MAX(NLA_U8, 31),
	...
};
That is, those attributes are expected to be of NLA_U8 and NLA_U16 types. However, the consumers of these attributes in `pmsr_parse_ftm` all blindly use `nla_get_u32`, which is incorrect and causes functionality issues on big-endian platforms. Hence, fix them by using the correct `nla_get_u8` and `nla_get_u16` functions.
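As an illustrative sketch (the attribute names below are hypothetical, not the nl80211 ones), the getter has to match the width declared in the nla_policy:

/* Sketch: the nla_get_*() helper must match the declared attribute width. */
#include <net/netlink.h>

enum { MY_ATTR_UNSPEC, MY_ATTR_BYTE, MY_ATTR_WORD, __MY_ATTR_MAX };
#define MY_ATTR_MAX (__MY_ATTR_MAX - 1)

static const struct nla_policy my_policy[MY_ATTR_MAX + 1] = {
        [MY_ATTR_BYTE] = NLA_POLICY_MAX(NLA_U8, 15),   /* 1-byte payload */
        [MY_ATTR_WORD] = { .type = NLA_U16 },          /* 2-byte payload */
};

static void my_parse(struct nlattr **tb)
{
        u8 byte_val = 0;
        u16 word_val = 0;

        if (tb[MY_ATTR_BYTE])
                byte_val = nla_get_u8(tb[MY_ATTR_BYTE]);   /* not nla_get_u32() */
        if (tb[MY_ATTR_WORD])
                word_val = nla_get_u16(tb[MY_ATTR_WORD]);
        (void)byte_val;
        (void)word_val;
}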
Fixes: 9bb7e0f24e7e ("cfg80211: add peer measurement with FTM initiator API") Signed-off-by: Lin Ma linma@zju.edu.cn Link: https://msgid.link/20240521075059.47999-1-linma@zju.edu.cn Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/wireless/pmsr.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/net/wireless/pmsr.c b/net/wireless/pmsr.c
index 2bc647720cda5..d26daa0370e71 100644
--- a/net/wireless/pmsr.c
+++ b/net/wireless/pmsr.c
@@ -56,7 +56,7 @@ static int pmsr_parse_ftm(struct cfg80211_registered_device *rdev,
         out->ftm.burst_period = 0;
         if (tb[NL80211_PMSR_FTM_REQ_ATTR_BURST_PERIOD])
                 out->ftm.burst_period =
-                        nla_get_u32(tb[NL80211_PMSR_FTM_REQ_ATTR_BURST_PERIOD]);
+                        nla_get_u16(tb[NL80211_PMSR_FTM_REQ_ATTR_BURST_PERIOD]);
 
         out->ftm.asap = !!tb[NL80211_PMSR_FTM_REQ_ATTR_ASAP];
         if (out->ftm.asap && !capa->ftm.asap) {
@@ -75,7 +75,7 @@ static int pmsr_parse_ftm(struct cfg80211_registered_device *rdev,
         out->ftm.num_bursts_exp = 0;
         if (tb[NL80211_PMSR_FTM_REQ_ATTR_NUM_BURSTS_EXP])
                 out->ftm.num_bursts_exp =
-                        nla_get_u32(tb[NL80211_PMSR_FTM_REQ_ATTR_NUM_BURSTS_EXP]);
+                        nla_get_u8(tb[NL80211_PMSR_FTM_REQ_ATTR_NUM_BURSTS_EXP]);
 
         if (capa->ftm.max_bursts_exponent >= 0 &&
             out->ftm.num_bursts_exp > capa->ftm.max_bursts_exponent) {
@@ -88,7 +88,7 @@ static int pmsr_parse_ftm(struct cfg80211_registered_device *rdev,
         out->ftm.burst_duration = 15;
         if (tb[NL80211_PMSR_FTM_REQ_ATTR_BURST_DURATION])
                 out->ftm.burst_duration =
-                        nla_get_u32(tb[NL80211_PMSR_FTM_REQ_ATTR_BURST_DURATION]);
+                        nla_get_u8(tb[NL80211_PMSR_FTM_REQ_ATTR_BURST_DURATION]);
 
         out->ftm.ftms_per_burst = 0;
         if (tb[NL80211_PMSR_FTM_REQ_ATTR_FTMS_PER_BURST])
@@ -107,7 +107,7 @@ static int pmsr_parse_ftm(struct cfg80211_registered_device *rdev,
         out->ftm.ftmr_retries = 3;
         if (tb[NL80211_PMSR_FTM_REQ_ATTR_NUM_FTMR_RETRIES])
                 out->ftm.ftmr_retries =
-                        nla_get_u32(tb[NL80211_PMSR_FTM_REQ_ATTR_NUM_FTMR_RETRIES]);
+                        nla_get_u8(tb[NL80211_PMSR_FTM_REQ_ATTR_NUM_FTMR_RETRIES]);
 
         out->ftm.request_lci = !!tb[NL80211_PMSR_FTM_REQ_ATTR_REQUEST_LCI];
         if (out->ftm.request_lci && !capa->ftm.request_lci) {
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Johannes Berg johannes.berg@intel.com
[ Upstream commit 4a7aace2899711592327463c1a29ffee44fcc66e ]
We don't actually support more than 64 even for HE devices, so revert to 64. This fixes an issue where the session is refused because the queue is configured differently from the actual session later.
Fixes: 514c30696fbc ("iwlwifi: add support for IEEE802.11ax") Signed-off-by: Johannes Berg johannes.berg@intel.com Reviewed-by: Liad Kaufman liad.kaufman@intel.com Reviewed-by: Luciano Coelho luciano.coelho@intel.com Signed-off-by: Miri Korenblit miriam.rachel.korenblit@intel.com Link: https://msgid.link/20240510170500.52f7b4cf83aa.If47e43adddf7fe250ed7f5571fbb... Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/intel/iwlwifi/mvm/rs.h | 9 ++------- 1 file changed, 2 insertions(+), 7 deletions(-)
diff --git a/drivers/net/wireless/intel/iwlwifi/mvm/rs.h b/drivers/net/wireless/intel/iwlwifi/mvm/rs.h
index b7bc8c1b2ddae..00f04f675cbbb 100644
--- a/drivers/net/wireless/intel/iwlwifi/mvm/rs.h
+++ b/drivers/net/wireless/intel/iwlwifi/mvm/rs.h
@@ -123,13 +123,8 @@ enum {
 
 #define LINK_QUAL_AGG_FRAME_LIMIT_DEF	(63)
 #define LINK_QUAL_AGG_FRAME_LIMIT_MAX	(63)
-/*
- * FIXME - various places in firmware API still use u8,
- * e.g. LQ command and SCD config command.
- * This should be 256 instead.
- */
-#define LINK_QUAL_AGG_FRAME_LIMIT_GEN2_DEF	(255)
-#define LINK_QUAL_AGG_FRAME_LIMIT_GEN2_MAX	(255)
+#define LINK_QUAL_AGG_FRAME_LIMIT_GEN2_DEF	(64)
+#define LINK_QUAL_AGG_FRAME_LIMIT_GEN2_MAX	(64)
 #define LINK_QUAL_AGG_FRAME_LIMIT_MIN	(0)
 
 #define LQ_SIZE	2	/* 2 mode tables: "Active" and "Search" */
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Shahar S Matityahu shahar.s.matityahu@intel.com
[ Upstream commit 87821b67dea87addbc4ab093ba752753b002176a ]
The driver should call iwl_dbg_tlv_free() even if debugfs is not enabled, since ini mode does not depend on the debugfs ifdef.
Fixes: 68f6f492c4fa ("iwlwifi: trans: support loading ini TLVs from external file") Signed-off-by: Shahar S Matityahu shahar.s.matityahu@intel.com Reviewed-by: Luciano Coelho luciano.coelho@intel.com Signed-off-by: Miri Korenblit miriam.rachel.korenblit@intel.com Link: https://msgid.link/20240510170500.c8e3723f55b0.I5e805732b0be31ee6b83c642ec65... Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/intel/iwlwifi/iwl-drv.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/wireless/intel/iwlwifi/iwl-drv.c b/drivers/net/wireless/intel/iwlwifi/iwl-drv.c
index 5eba1a355f043..024c37062a60b 100644
--- a/drivers/net/wireless/intel/iwlwifi/iwl-drv.c
+++ b/drivers/net/wireless/intel/iwlwifi/iwl-drv.c
@@ -1750,8 +1750,8 @@ struct iwl_drv *iwl_drv_start(struct iwl_trans *trans)
 err_fw:
 #ifdef CONFIG_IWLWIFI_DEBUGFS
         debugfs_remove_recursive(drv->dbgfs_drv);
-        iwl_dbg_tlv_free(drv->trans);
 #endif
+        iwl_dbg_tlv_free(drv->trans);
         kfree(drv);
 err:
         return ERR_PTR(ret);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Miri Korenblit miriam.rachel.korenblit@intel.com
[ Upstream commit 60d62757df30b74bf397a2847a6db7385c6ee281 ]
In some versions of cfg80211, the ssids pointer might be valid even though n_ssids is 0. Accessing the pointer in this case will cause an out-of-bounds access. Fix this by checking n_ssids first.
Fixes: c1a7515393e4 ("iwlwifi: mvm: add adaptive dwell support") Signed-off-by: Miri Korenblit miriam.rachel.korenblit@intel.com Reviewed-by: Ilan Peer ilan.peer@intel.com Reviewed-by: Johannes Berg johannes.berg@intel.com Link: https://msgid.link/20240513132416.6e4d1762bf0d.I5a0e6cc8f02050a766db704d1559... Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/intel/iwlwifi/mvm/scan.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/net/wireless/intel/iwlwifi/mvm/scan.c b/drivers/net/wireless/intel/iwlwifi/mvm/scan.c
index b20d64dbba1ad..a7a29f1659ea6 100644
--- a/drivers/net/wireless/intel/iwlwifi/mvm/scan.c
+++ b/drivers/net/wireless/intel/iwlwifi/mvm/scan.c
@@ -1298,7 +1298,7 @@ static void iwl_mvm_scan_umac_dwell(struct iwl_mvm *mvm,
         if (IWL_MVM_ADWELL_MAX_BUDGET)
                 cmd->v7.adwell_max_budget =
                         cpu_to_le16(IWL_MVM_ADWELL_MAX_BUDGET);
-        else if (params->ssids && params->ssids[0].ssid_len)
+        else if (params->n_ssids && params->ssids[0].ssid_len)
                 cmd->v7.adwell_max_budget =
                         cpu_to_le16(IWL_SCAN_ADWELL_MAX_BUDGET_DIRECTED_SCAN);
         else
@@ -1400,7 +1400,7 @@ iwl_mvm_scan_umac_dwell_v11(struct iwl_mvm *mvm,
         if (IWL_MVM_ADWELL_MAX_BUDGET)
                 general_params->adwell_max_budget =
                         cpu_to_le16(IWL_MVM_ADWELL_MAX_BUDGET);
-        else if (params->ssids && params->ssids[0].ssid_len)
+        else if (params->n_ssids && params->ssids[0].ssid_len)
                 general_params->adwell_max_budget =
                         cpu_to_le16(IWL_SCAN_ADWELL_MAX_BUDGET_DIRECTED_SCAN);
         else
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Emmanuel Grumbach emmanuel.grumbach@intel.com
[ Upstream commit 4bb95f4535489ed830cf9b34b0a891e384d1aee4 ]
In case the firmware sends a notification that claims it has more data than it actually has, we will read past what was allocated for the notification. Remove the print of the buffer; we won't see it by default anyway. If needed, we can see the content with tracing.
This was reported by KFENCE.
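The stable patch below simply drops the dump loop. As a general illustration of handling a device-controlled length field (a sketch with a hypothetical structure, not the iwlwifi layout), the claimed size would otherwise need to be clamped to what was actually received before iterating:

/* Sketch: never trust a length the firmware reports; clamp it first. */
#include <linux/kernel.h>

struct fake_notif {
        __le32 data_size;               /* length claimed by the device */
        __le32 data[];
};

static void parse_notif(const struct fake_notif *notif, size_t pkt_len)
{
        size_t claimed, avail, n_words, i;

        if (pkt_len < sizeof(*notif))
                return;

        claimed = le32_to_cpu(notif->data_size);
        avail = pkt_len - sizeof(*notif);
        n_words = min(claimed, avail) / sizeof(__le32);

        for (i = 0; i < n_words; i++)
                pr_debug("word %zu: 0x%08x\n", i, le32_to_cpu(notif->data[i]));
}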
Fixes: bdccdb854f2f ("iwlwifi: mvm: support MFUART dump in case of MFUART assert") Signed-off-by: Emmanuel Grumbach emmanuel.grumbach@intel.com Reviewed-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Miri Korenblit miriam.rachel.korenblit@intel.com Link: https://msgid.link/20240513132416.ba82a01a559e.Ia91dd20f5e1ca1ad380b95e68aeb... Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/intel/iwlwifi/mvm/fw.c | 10 ---------- 1 file changed, 10 deletions(-)
diff --git a/drivers/net/wireless/intel/iwlwifi/mvm/fw.c b/drivers/net/wireless/intel/iwlwifi/mvm/fw.c
index 2e3c98eaa400c..668bb9ce293db 100644
--- a/drivers/net/wireless/intel/iwlwifi/mvm/fw.c
+++ b/drivers/net/wireless/intel/iwlwifi/mvm/fw.c
@@ -91,20 +91,10 @@ void iwl_mvm_mfu_assert_dump_notif(struct iwl_mvm *mvm,
 {
         struct iwl_rx_packet *pkt = rxb_addr(rxb);
         struct iwl_mfu_assert_dump_notif *mfu_dump_notif = (void *)pkt->data;
-        __le32 *dump_data = mfu_dump_notif->data;
-        int n_words = le32_to_cpu(mfu_dump_notif->data_size) / sizeof(__le32);
-        int i;
 
         if (mfu_dump_notif->index_num == 0)
                 IWL_INFO(mvm, "MFUART assert id 0x%x occurred\n",
                          le32_to_cpu(mfu_dump_notif->assert_id));
-
-        for (i = 0; i < n_words; i++)
-                IWL_DEBUG_INFO(mvm,
-                               "MFUART assert dump, dword %u: 0x%08x\n",
-                               le16_to_cpu(mfu_dump_notif->index_num) *
-                               n_words + i,
-                               le32_to_cpu(dump_data[i]));
 }
 
 static bool iwl_alive_fn(struct iwl_notif_wait_data *notif_wait,
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Lingbo Kong quic_lingbok@quicinc.com
[ Upstream commit a26d8dc5227f449a54518a8b40733a54c6600a8b ]
Currently, the Spatial Reuse Parameter Set element is parsed incorrectly and some members of struct ieee80211_he_obss_pd are not assigned.
To address this issue, parse the element in the order of the fields of the Spatial Reuse Parameter Set defined in the IEEE Std 802.11ax specification.
The diagram of the Spatial Reuse Parameter Set element (IEEE Std 802.11ax-2021, 9.4.2.252):
 -------------------------------------------------------------------------
|       |      |         |       |Non-SRG|  SRG  |  SRG  | SRG  |  SRG  |
|Element|Length| Element |  SR   |OBSS PD|OBSS PD|OBSS PD| BSS  |Partial|
|  ID   |      |   ID    |Control|  Max  |  Min  |  Max  |Color | BSSID |
|       |      |Extension|       | Offset| Offset|Offset |Bitmap|Bitmap |
 -------------------------------------------------------------------------
Fixes: 1ced169cc1c2 ("mac80211: allow setting spatial reuse parameters from bss_conf") Signed-off-by: Lingbo Kong quic_lingbok@quicinc.com Link: https://msgid.link/20240516021854.5682-3-quic_lingbok@quicinc.com Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/mac80211/he.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/net/mac80211/he.c b/net/mac80211/he.c
index 0322abae08250..147ff0f71b9bb 100644
--- a/net/mac80211/he.c
+++ b/net/mac80211/he.c
@@ -231,15 +231,21 @@ ieee80211_he_spr_ie_to_bss_conf(struct ieee80211_vif *vif,
 
         if (!he_spr_ie_elem)
                 return;
+
+        he_obss_pd->sr_ctrl = he_spr_ie_elem->he_sr_control;
         data = he_spr_ie_elem->optional;
 
         if (he_spr_ie_elem->he_sr_control &
             IEEE80211_HE_SPR_NON_SRG_OFFSET_PRESENT)
-                data++;
+                he_obss_pd->non_srg_max_offset = *data++;
+
         if (he_spr_ie_elem->he_sr_control &
             IEEE80211_HE_SPR_SRG_INFORMATION_PRESENT) {
-                he_obss_pd->max_offset = *data++;
                 he_obss_pd->min_offset = *data++;
+                he_obss_pd->max_offset = *data++;
+                memcpy(he_obss_pd->bss_color_bitmap, data, 8);
+                data += 8;
+                memcpy(he_obss_pd->partial_bssid_bitmap, data, 8);
                 he_obss_pd->enable = true;
         }
 }
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Lars Kellogg-Stedman lars@oddbit.com
[ Upstream commit 3c34fb0bd4a4237592c5ecb5b2e2531900c55774 ]
When releasing a socket in ax25_release(), we call netdev_put() to decrease the refcount on the associated ax.25 device. However, the execution path for accepting an incoming connection never calls netdev_hold(). This imbalance leads to refcount errors, and ultimately to kernel crashes.
A typical call trace for the above situation will start with one of the following errors:
refcount_t: decrement hit 0; leaking memory.
refcount_t: underflow; use-after-free.
And will then have a trace like:
Call Trace:
<TASK>
? show_regs+0x64/0x70
? __warn+0x83/0x120
? refcount_warn_saturate+0xb2/0x100
? report_bug+0x158/0x190
? prb_read_valid+0x20/0x30
? handle_bug+0x3e/0x70
? exc_invalid_op+0x1c/0x70
? asm_exc_invalid_op+0x1f/0x30
? refcount_warn_saturate+0xb2/0x100
? refcount_warn_saturate+0xb2/0x100
ax25_release+0x2ad/0x360
__sock_release+0x35/0xa0
sock_close+0x19/0x20
[...]
On reboot (or any attempt to remove the interface), the kernel gets stuck in an infinite loop:
unregister_netdevice: waiting for ax0 to become free. Usage count = 0
This patch corrects these issues by ensuring that we call netdev_hold() and ax25_dev_hold() for new connections in ax25_accept(). This makes the logic leading to ax25_accept() match the logic for ax25_bind(): in both cases we increment the refcount, which is ultimately decremented in ax25_release().
Fixes: 9fd75b66b8f6 ("ax25: Fix refcount leaks caused by ax25_cb_del()") Signed-off-by: Lars Kellogg-Stedman lars@oddbit.com Tested-by: Duoming Zhou duoming@zju.edu.cn Tested-by: Dan Cross crossd@gmail.com Tested-by: Chris Maness christopher.maness@gmail.com Reviewed-by: Dan Carpenter dan.carpenter@linaro.org Link: https://lore.kernel.org/r/20240529210242.3346844-2-lars@oddbit.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/ax25/af_ax25.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/net/ax25/af_ax25.c b/net/ax25/af_ax25.c index 0bffac238b615..a1e0be8716870 100644 --- a/net/ax25/af_ax25.c +++ b/net/ax25/af_ax25.c @@ -1378,8 +1378,10 @@ static int ax25_accept(struct socket *sock, struct socket *newsock, int flags, { struct sk_buff *skb; struct sock *newsk; + ax25_dev *ax25_dev; DEFINE_WAIT(wait); struct sock *sk; + ax25_cb *ax25; int err = 0;
if (sock->state != SS_UNCONNECTED) @@ -1434,6 +1436,10 @@ static int ax25_accept(struct socket *sock, struct socket *newsock, int flags, kfree_skb(skb); sk_acceptq_removed(sk); newsock->state = SS_CONNECTED; + ax25 = sk_to_ax25(newsk); + ax25_dev = ax25->ax25_dev; + netdev_hold(ax25_dev->dev, &ax25->dev_tracker, GFP_ATOMIC); + ax25_dev_hold(ax25_dev);
out: release_sock(sk);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Duoming Zhou duoming@zju.edu.cn
[ Upstream commit 166fcf86cd34e15c7f383eda4642d7a212393008 ]
The object "ax25_dev" is managed by reference counting, so it should not be released directly with kfree(); use ax25_dev_put() instead.
Fixes: d01ffb9eee4a ("ax25: add refcount in ax25_dev to avoid UAF bugs") Suggested-by: Dan Carpenter dan.carpenter@linaro.org Signed-off-by: Duoming Zhou duoming@zju.edu.cn Reviewed-by: Dan Carpenter dan.carpenter@linaro.org Link: https://lore.kernel.org/r/20240530051733.11416-1-duoming@zju.edu.cn Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/ax25/ax25_dev.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/ax25/ax25_dev.c b/net/ax25/ax25_dev.c index fcc64645bbf5e..e165fe108bb00 100644 --- a/net/ax25/ax25_dev.c +++ b/net/ax25/ax25_dev.c @@ -193,7 +193,7 @@ void __exit ax25_dev_free(void) list_for_each_entry_safe(s, n, &ax25_dev_list, list) { netdev_put(s->dev, &s->dev_tracker); list_del(&s->list); - kfree(s); + ax25_dev_put(s); } spin_unlock_bh(&ax25_dev_lock); }
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Peter Delevoryas peter@pjd.dev
[ Upstream commit c797ce168930ce3d62a9b7fc4d7040963ee6a01e ]
Background:
1. CONFIG_NCSI_OEM_CMD_KEEP_PHY
If this is enabled, we send an extra OEM Intel command in the probe sequence immediately after discovering a channel (e.g. after "Clear Initial State").
2. CONFIG_NCSI_OEM_CMD_GET_MAC
If this is enabled, we send one of 3 OEM "Get MAC Address" commands from Broadcom, Mellanox (Nvidia), and Intel in the *configuration* sequence for a channel.
3. mellanox,multi-host (or mlx,multi-host)
Introduced by this patch:
https://lore.kernel.org/all/20200108234341.2590674-1-vijaykhemka@fb.com/
Which was actually originally from cosmo.chou@quantatw.com:
https://github.com/facebook/openbmc-linux/commit/9f132a10ec48db84613519258cd...
Cosmo claimed that the Nvidia ConnectX-4 and ConnectX-6 NICs don't respond to Get Version ID, et al. in the probe sequence unless you send the Set MC Affinity command first.
Problem Statement:
We've been using a combination of #ifdef code blocks and IS_ENABLED() conditions to conditionally send these OEM commands.
It makes adding any new code around these commands hard to understand.
Solution:
In this patch, I just want to remove the conditionally compiled blocks of code, and always use IS_ENABLED(...) to do dynamic control flow.
I don't think the small amount of code this adds to non-users of the OEM Kconfigs is a big deal.
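As a rough before/after sketch of the pattern (the handler call is the one in the diff below, surrounding context elided):

	/* before: only compiled when the Kconfig option is set */
#if IS_ENABLED(CONFIG_NCSI_OEM_CMD_GET_MAC)
	ret = ncsi_gma_handler(&nca, nc->version.mf_id);
#endif

	/* after: always compiled and type-checked; IS_ENABLED() is a
	 * compile-time 0/1, so the dead branch is optimized away */
	if (IS_ENABLED(CONFIG_NCSI_OEM_CMD_GET_MAC))
		ret = ncsi_gma_handler(&nca, nc->version.mf_id);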
Signed-off-by: Peter Delevoryas peter@pjd.dev Signed-off-by: David S. Miller davem@davemloft.net Stable-dep-of: e85e271dec02 ("net/ncsi: Fix the multi thread manner of NCSI driver") Signed-off-by: Sasha Levin sashal@kernel.org --- net/ncsi/ncsi-manage.c | 20 +++----------------- 1 file changed, 3 insertions(+), 17 deletions(-)
diff --git a/net/ncsi/ncsi-manage.c b/net/ncsi/ncsi-manage.c index 80713febfac6d..f567957698935 100644 --- a/net/ncsi/ncsi-manage.c +++ b/net/ncsi/ncsi-manage.c @@ -689,8 +689,6 @@ static int set_one_vid(struct ncsi_dev_priv *ndp, struct ncsi_channel *nc, return 0; }
-#if IS_ENABLED(CONFIG_NCSI_OEM_CMD_KEEP_PHY) - static int ncsi_oem_keep_phy_intel(struct ncsi_cmd_arg *nca) { unsigned char data[NCSI_OEM_INTEL_CMD_KEEP_PHY_LEN]; @@ -716,10 +714,6 @@ static int ncsi_oem_keep_phy_intel(struct ncsi_cmd_arg *nca) return ret; }
-#endif - -#if IS_ENABLED(CONFIG_NCSI_OEM_CMD_GET_MAC) - /* NCSI OEM Command APIs */ static int ncsi_oem_gma_handler_bcm(struct ncsi_cmd_arg *nca) { @@ -856,8 +850,6 @@ static int ncsi_gma_handler(struct ncsi_cmd_arg *nca, unsigned int mf_id) return nch->handler(nca); }
-#endif /* CONFIG_NCSI_OEM_CMD_GET_MAC */ - /* Determine if a given channel from the channel_queue should be used for Tx */ static bool ncsi_channel_is_tx(struct ncsi_dev_priv *ndp, struct ncsi_channel *nc) @@ -1039,20 +1031,18 @@ static void ncsi_configure_channel(struct ncsi_dev_priv *ndp) goto error; }
- nd->state = ncsi_dev_state_config_oem_gma; + nd->state = IS_ENABLED(CONFIG_NCSI_OEM_CMD_GET_MAC) + ? ncsi_dev_state_config_oem_gma + : ncsi_dev_state_config_clear_vids; break; case ncsi_dev_state_config_oem_gma: nd->state = ncsi_dev_state_config_clear_vids; - ret = -1;
-#if IS_ENABLED(CONFIG_NCSI_OEM_CMD_GET_MAC) nca.type = NCSI_PKT_CMD_OEM; nca.package = np->id; nca.channel = nc->id; ndp->pending_req_num = 1; ret = ncsi_gma_handler(&nca, nc->version.mf_id); -#endif /* CONFIG_NCSI_OEM_CMD_GET_MAC */ - if (ret < 0) schedule_work(&ndp->work);
@@ -1404,7 +1394,6 @@ static void ncsi_probe_channel(struct ncsi_dev_priv *ndp)
schedule_work(&ndp->work); break; -#if IS_ENABLED(CONFIG_NCSI_OEM_CMD_GET_MAC) case ncsi_dev_state_probe_mlx_gma: ndp->pending_req_num = 1;
@@ -1429,7 +1418,6 @@ static void ncsi_probe_channel(struct ncsi_dev_priv *ndp)
nd->state = ncsi_dev_state_probe_cis; break; -#endif /* CONFIG_NCSI_OEM_CMD_GET_MAC */ case ncsi_dev_state_probe_cis: ndp->pending_req_num = NCSI_RESERVED_CHANNEL;
@@ -1447,7 +1435,6 @@ static void ncsi_probe_channel(struct ncsi_dev_priv *ndp) if (IS_ENABLED(CONFIG_NCSI_OEM_CMD_KEEP_PHY)) nd->state = ncsi_dev_state_probe_keep_phy; break; -#if IS_ENABLED(CONFIG_NCSI_OEM_CMD_KEEP_PHY) case ncsi_dev_state_probe_keep_phy: ndp->pending_req_num = 1;
@@ -1460,7 +1447,6 @@ static void ncsi_probe_channel(struct ncsi_dev_priv *ndp)
nd->state = ncsi_dev_state_probe_gvi; break; -#endif /* CONFIG_NCSI_OEM_CMD_KEEP_PHY */ case ncsi_dev_state_probe_gvi: case ncsi_dev_state_probe_gc: case ncsi_dev_state_probe_gls:
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: DelphineCCChiu delphine_cc_chiu@wiwynn.com
[ Upstream commit e85e271dec0270982afed84f70dc37703fcc1d52 ]
Currently the NCSI driver will send several NCSI commands back to back, without waiting for the response to the previous NCSI command or a timeout, in some states when the NIC has multiple channels. This goes against the single-thread manner defined by the NCSI spec (section 6.3.2.3 in DSP0222_1.1.1).
According to the NCSI spec (section 6.2.13.1 in DSP0222_1.1.1), we should probe one channel at a time by sending NCSI commands (Clear initial state, Get version ID, Get capabilities...), then repeat these steps until the max number of channels, which we got from the NCSI command (Get capabilities), has been probed.
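A condensed sketch of the per-channel probe step this introduces (fields as in the diff below; the state-machine plumbing is elided):

	/* issue exactly one command per iteration ... */
	ndp->pending_req_num = 1;
	nca.channel = ndp->channel_probe_id;
	ret = ncsi_xmit_cmd(&nca);
	if (ret)
		goto error;

	/* ... and only move on once the current channel has finished its
	 * CIS/GVI/GC/GLS sequence, wrapping up after channel_count channels */
	ndp->channel_probe_id++;
	if (ndp->channel_probe_id == ndp->channel_count) {
		ndp->channel_probe_id = 0;
		nd->state = ncsi_dev_state_probe_dp;
	}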
Fixes: e6f44ed6d04d ("net/ncsi: Package and channel management") Signed-off-by: DelphineCCChiu delphine_cc_chiu@wiwynn.com Link: https://lore.kernel.org/r/20240529065856.825241-1-delphine_cc_chiu@wiwynn.co... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/ncsi/internal.h | 2 ++ net/ncsi/ncsi-manage.c | 73 +++++++++++++++++++++--------------------- net/ncsi/ncsi-rsp.c | 4 ++- 3 files changed, 41 insertions(+), 38 deletions(-)
diff --git a/net/ncsi/internal.h b/net/ncsi/internal.h index 374412ed780b6..ef0f8f73826f5 100644 --- a/net/ncsi/internal.h +++ b/net/ncsi/internal.h @@ -325,6 +325,7 @@ struct ncsi_dev_priv { spinlock_t lock; /* Protect the NCSI device */ unsigned int package_probe_id;/* Current ID during probe */ unsigned int package_num; /* Number of packages */ + unsigned int channel_probe_id;/* Current cahnnel ID during probe */ struct list_head packages; /* List of packages */ struct ncsi_channel *hot_channel; /* Channel was ever active */ struct ncsi_request requests[256]; /* Request table */ @@ -343,6 +344,7 @@ struct ncsi_dev_priv { bool multi_package; /* Enable multiple packages */ bool mlx_multi_host; /* Enable multi host Mellanox */ u32 package_whitelist; /* Packages to configure */ + unsigned char channel_count; /* Num of channels to probe */ };
struct ncsi_cmd_arg { diff --git a/net/ncsi/ncsi-manage.c b/net/ncsi/ncsi-manage.c index f567957698935..760b33fa03a8b 100644 --- a/net/ncsi/ncsi-manage.c +++ b/net/ncsi/ncsi-manage.c @@ -510,17 +510,19 @@ static void ncsi_suspend_channel(struct ncsi_dev_priv *ndp)
break; case ncsi_dev_state_suspend_gls: - ndp->pending_req_num = np->channel_num; + ndp->pending_req_num = 1;
nca.type = NCSI_PKT_CMD_GLS; nca.package = np->id; + nca.channel = ndp->channel_probe_id; + ret = ncsi_xmit_cmd(&nca); + if (ret) + goto error; + ndp->channel_probe_id++;
- nd->state = ncsi_dev_state_suspend_dcnt; - NCSI_FOR_EACH_CHANNEL(np, nc) { - nca.channel = nc->id; - ret = ncsi_xmit_cmd(&nca); - if (ret) - goto error; + if (ndp->channel_probe_id == ndp->channel_count) { + ndp->channel_probe_id = 0; + nd->state = ncsi_dev_state_suspend_dcnt; }
break; @@ -1340,7 +1342,6 @@ static void ncsi_probe_channel(struct ncsi_dev_priv *ndp) { struct ncsi_dev *nd = &ndp->ndev; struct ncsi_package *np; - struct ncsi_channel *nc; struct ncsi_cmd_arg nca; unsigned char index; int ret; @@ -1418,23 +1419,6 @@ static void ncsi_probe_channel(struct ncsi_dev_priv *ndp)
nd->state = ncsi_dev_state_probe_cis; break; - case ncsi_dev_state_probe_cis: - ndp->pending_req_num = NCSI_RESERVED_CHANNEL; - - /* Clear initial state */ - nca.type = NCSI_PKT_CMD_CIS; - nca.package = ndp->active_package->id; - for (index = 0; index < NCSI_RESERVED_CHANNEL; index++) { - nca.channel = index; - ret = ncsi_xmit_cmd(&nca); - if (ret) - goto error; - } - - nd->state = ncsi_dev_state_probe_gvi; - if (IS_ENABLED(CONFIG_NCSI_OEM_CMD_KEEP_PHY)) - nd->state = ncsi_dev_state_probe_keep_phy; - break; case ncsi_dev_state_probe_keep_phy: ndp->pending_req_num = 1;
@@ -1447,14 +1431,17 @@ static void ncsi_probe_channel(struct ncsi_dev_priv *ndp)
nd->state = ncsi_dev_state_probe_gvi; break; + case ncsi_dev_state_probe_cis: case ncsi_dev_state_probe_gvi: case ncsi_dev_state_probe_gc: case ncsi_dev_state_probe_gls: np = ndp->active_package; - ndp->pending_req_num = np->channel_num; + ndp->pending_req_num = 1;
- /* Retrieve version, capability or link status */ - if (nd->state == ncsi_dev_state_probe_gvi) + /* Clear initial state Retrieve version, capability or link status */ + if (nd->state == ncsi_dev_state_probe_cis) + nca.type = NCSI_PKT_CMD_CIS; + else if (nd->state == ncsi_dev_state_probe_gvi) nca.type = NCSI_PKT_CMD_GVI; else if (nd->state == ncsi_dev_state_probe_gc) nca.type = NCSI_PKT_CMD_GC; @@ -1462,19 +1449,29 @@ static void ncsi_probe_channel(struct ncsi_dev_priv *ndp) nca.type = NCSI_PKT_CMD_GLS;
nca.package = np->id; - NCSI_FOR_EACH_CHANNEL(np, nc) { - nca.channel = nc->id; - ret = ncsi_xmit_cmd(&nca); - if (ret) - goto error; - } + nca.channel = ndp->channel_probe_id;
- if (nd->state == ncsi_dev_state_probe_gvi) + ret = ncsi_xmit_cmd(&nca); + if (ret) + goto error; + + if (nd->state == ncsi_dev_state_probe_cis) { + nd->state = ncsi_dev_state_probe_gvi; + if (IS_ENABLED(CONFIG_NCSI_OEM_CMD_KEEP_PHY) && ndp->channel_probe_id == 0) + nd->state = ncsi_dev_state_probe_keep_phy; + } else if (nd->state == ncsi_dev_state_probe_gvi) { nd->state = ncsi_dev_state_probe_gc; - else if (nd->state == ncsi_dev_state_probe_gc) + } else if (nd->state == ncsi_dev_state_probe_gc) { nd->state = ncsi_dev_state_probe_gls; - else + } else { + nd->state = ncsi_dev_state_probe_cis; + ndp->channel_probe_id++; + } + + if (ndp->channel_probe_id == ndp->channel_count) { + ndp->channel_probe_id = 0; nd->state = ncsi_dev_state_probe_dp; + } break; case ncsi_dev_state_probe_dp: ndp->pending_req_num = 1; @@ -1775,6 +1772,7 @@ struct ncsi_dev *ncsi_register_dev(struct net_device *dev, ndp->requests[i].ndp = ndp; timer_setup(&ndp->requests[i].timer, ncsi_request_timeout, 0); } + ndp->channel_count = NCSI_RESERVED_CHANNEL;
spin_lock_irqsave(&ncsi_dev_lock, flags); list_add_tail_rcu(&ndp->node, &ncsi_dev_list); @@ -1808,6 +1806,7 @@ int ncsi_start_dev(struct ncsi_dev *nd)
if (!(ndp->flags & NCSI_DEV_PROBED)) { ndp->package_probe_id = 0; + ndp->channel_probe_id = 0; nd->state = ncsi_dev_state_probe; schedule_work(&ndp->work); return 0; diff --git a/net/ncsi/ncsi-rsp.c b/net/ncsi/ncsi-rsp.c index 480e80e3c2836..f22d67cb04d37 100644 --- a/net/ncsi/ncsi-rsp.c +++ b/net/ncsi/ncsi-rsp.c @@ -795,12 +795,13 @@ static int ncsi_rsp_handler_gc(struct ncsi_request *nr) struct ncsi_rsp_gc_pkt *rsp; struct ncsi_dev_priv *ndp = nr->ndp; struct ncsi_channel *nc; + struct ncsi_package *np; size_t size;
/* Find the channel */ rsp = (struct ncsi_rsp_gc_pkt *)skb_network_header(nr->rsp); ncsi_find_package_and_channel(ndp, rsp->rsp.common.channel, - NULL, &nc); + &np, &nc); if (!nc) return -ENODEV;
@@ -835,6 +836,7 @@ static int ncsi_rsp_handler_gc(struct ncsi_request *nr) */ nc->vlan_filter.bitmap = U64_MAX; nc->vlan_filter.n_vids = rsp->vlan_cnt; + np->ndp->channel_count = rsp->channel_cnt;
return 0; }
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Dumazet edumazet@google.com
[ Upstream commit 2fe40483ec257de2a0d819ef88e3e76c7e261319 ]
As explained in commit 1378817486d6 ("tipc: block BH before using dst_cache"), net/core/dst_cache.c helpers need to be called with BH disabled.
Disabling preemption in ioam6_output() is not good enough, because ioam6_output() is called from process context and lwtunnel_output() only uses rcu_read_lock().
We might be interrupted by a softirq, re-enter ioam6_output() and corrupt dst_cache data structures.
Fix the race by using local_bh_disable() instead of preempt_disable().
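The resulting pattern around the dst_cache calls (as in the diff below):

	local_bh_disable();		/* dst_cache helpers need BH off */
	dst = dst_cache_get(&ilwt->cache);
	local_bh_enable();

	/* ... route lookup when the cache missed ... */

	local_bh_disable();
	dst_cache_set_ip6(&ilwt->cache, dst, &fl6.saddr);
	local_bh_enable();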
Fixes: 8cb3bf8bff3c ("ipv6: ioam: Add support for the ip6ip6 encapsulation") Signed-off-by: Eric Dumazet edumazet@google.com Cc: Justin Iurman justin.iurman@uliege.be Acked-by: Paolo Abeni pabeni@redhat.com Link: https://lore.kernel.org/r/20240531132636.2637995-2-edumazet@google.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/ipv6/ioam6_iptunnel.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/net/ipv6/ioam6_iptunnel.c b/net/ipv6/ioam6_iptunnel.c index f6f5b83dd954d..a5cfc5b0b206b 100644 --- a/net/ipv6/ioam6_iptunnel.c +++ b/net/ipv6/ioam6_iptunnel.c @@ -351,9 +351,9 @@ static int ioam6_output(struct net *net, struct sock *sk, struct sk_buff *skb) goto drop;
if (!ipv6_addr_equal(&orig_daddr, &ipv6_hdr(skb)->daddr)) { - preempt_disable(); + local_bh_disable(); dst = dst_cache_get(&ilwt->cache); - preempt_enable(); + local_bh_enable();
if (unlikely(!dst)) { struct ipv6hdr *hdr = ipv6_hdr(skb); @@ -373,9 +373,9 @@ static int ioam6_output(struct net *net, struct sock *sk, struct sk_buff *skb) goto drop; }
- preempt_disable(); + local_bh_disable(); dst_cache_set_ip6(&ilwt->cache, dst, &fl6.saddr); - preempt_enable(); + local_bh_enable(); }
skb_dst_drop(skb);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Dumazet edumazet@google.com
[ Upstream commit c0b98ac1cc104f48763cdb27b1e9ac25fd81fc90 ]
As explained in commit 1378817486d6 ("tipc: block BH before using dst_cache"), net/core/dst_cache.c helpers need to be called with BH disabled.
Disabling preemption in seg6_output_core() is not good enough, because seg6_output_core() is called from process context and lwtunnel_output() only uses rcu_read_lock().
We might be interrupted by a softirq, re-enter seg6_output_core() and corrupt dst_cache data structures.
Fix the race by using local_bh_disable() instead of preempt_disable().
Apply a similar change in seg6_input_core().
Fixes: fa79581ea66c ("ipv6: sr: fix several BUGs when preemption is enabled") Fixes: 6c8702c60b88 ("ipv6: sr: add support for SRH encapsulation and injection with lwtunnels") Signed-off-by: Eric Dumazet edumazet@google.com Cc: David Lebrun dlebrun@google.com Acked-by: Paolo Abeni pabeni@redhat.com Link: https://lore.kernel.org/r/20240531132636.2637995-4-edumazet@google.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/ipv6/seg6_iptunnel.c | 14 ++++++-------- 1 file changed, 6 insertions(+), 8 deletions(-)
diff --git a/net/ipv6/seg6_iptunnel.c b/net/ipv6/seg6_iptunnel.c index 5924407b87b07..ae5299c277bcf 100644 --- a/net/ipv6/seg6_iptunnel.c +++ b/net/ipv6/seg6_iptunnel.c @@ -464,9 +464,8 @@ static int seg6_input_core(struct net *net, struct sock *sk,
slwt = seg6_lwt_lwtunnel(orig_dst->lwtstate);
- preempt_disable(); + local_bh_disable(); dst = dst_cache_get(&slwt->cache); - preempt_enable();
skb_dst_drop(skb);
@@ -474,14 +473,13 @@ static int seg6_input_core(struct net *net, struct sock *sk, ip6_route_input(skb); dst = skb_dst(skb); if (!dst->error) { - preempt_disable(); dst_cache_set_ip6(&slwt->cache, dst, &ipv6_hdr(skb)->saddr); - preempt_enable(); } } else { skb_dst_set(skb, dst); } + local_bh_enable();
err = skb_cow_head(skb, LL_RESERVED_SPACE(dst->dev)); if (unlikely(err)) @@ -537,9 +535,9 @@ static int seg6_output_core(struct net *net, struct sock *sk,
slwt = seg6_lwt_lwtunnel(orig_dst->lwtstate);
- preempt_disable(); + local_bh_disable(); dst = dst_cache_get(&slwt->cache); - preempt_enable(); + local_bh_enable();
if (unlikely(!dst)) { struct ipv6hdr *hdr = ipv6_hdr(skb); @@ -559,9 +557,9 @@ static int seg6_output_core(struct net *net, struct sock *sk, goto drop; }
- preempt_disable(); + local_bh_disable(); dst_cache_set_ip6(&slwt->cache, dst, &fl6.saddr); - preempt_enable(); + local_bh_enable(); }
skb_dst_drop(skb);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jiri Olsa jolsa@kernel.org
[ Upstream commit d0d1df8ba18abc57f28fb3bc053b2bf319367f2c ]
syzbot reported crash when rawtp program executed through the test_run interface calls bpf_get_attach_cookie helper or any other helper that touches task->bpf_ctx pointer.
Set the run context (task->bpf_ctx pointer) for the test_run callback.
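In short, the fix wraps the program invocation with a trace run context (as in the diff below), so helpers such as bpf_get_attach_cookie() find a valid task->bpf_ctx:

	struct bpf_trace_run_ctx run_ctx = {};
	struct bpf_run_ctx *old_run_ctx;

	old_run_ctx = bpf_set_run_ctx(&run_ctx.run_ctx);

	rcu_read_lock();
	info->retval = bpf_prog_run(info->prog, info->ctx);
	rcu_read_unlock();

	bpf_reset_run_ctx(old_run_ctx);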
Fixes: 7adfc6c9b315 ("bpf: Add bpf_get_attach_cookie() BPF helper to access bpf_cookie value") Reported-by: syzbot+3ab78ff125b7979e45f9@syzkaller.appspotmail.com Signed-off-by: Jiri Olsa jolsa@kernel.org Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Daniel Borkmann daniel@iogearbox.net Closes: https://syzkaller.appspot.com/bug?extid=3ab78ff125b7979e45f9 Link: https://lore.kernel.org/bpf/20240604150024.359247-1-jolsa@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/bpf/test_run.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c index 6094ef7cffcd2..64be562f0fe32 100644 --- a/net/bpf/test_run.c +++ b/net/bpf/test_run.c @@ -841,10 +841,16 @@ static void __bpf_prog_test_run_raw_tp(void *data) { struct bpf_raw_tp_test_run_info *info = data; + struct bpf_trace_run_ctx run_ctx = {}; + struct bpf_run_ctx *old_run_ctx; + + old_run_ctx = bpf_set_run_ctx(&run_ctx.run_ctx);
rcu_read_lock(); info->retval = bpf_prog_run(info->prog, info->ctx); rcu_read_unlock(); + + bpf_reset_run_ctx(old_run_ctx); }
int bpf_prog_test_run_raw_tp(struct bpf_prog *prog,
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Subbaraya Sundeep sbhatta@marvell.com
[ Upstream commit 8b0f7410942cdc420c4557eda02bfcdf60ccec17 ]
PF MCAM entries always have to be at low priority so that a VF can install longest prefix match rules at higher priority. This is currently taken care of, but when a priority allocation relative to a reference entry is requested, entries are allocated from the mid zone instead of the low-priority zone. Fix this and always allocate entries from the low-priority zone for PFs.
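In outline, the allocation now picks the search window like this (constants and fields as in the diff below; intermediate logic elided):

	/* PF requests always fall back to the low-priority zone ... */
	if (!(pcifunc & RVU_PFVF_FUNC_MASK))
		goto lprio_alloc;
	/* ... */
lprio_alloc:
	reverse = true;
	start = 0;
	end = mcam->bmap_entries;
	/* ... while still honouring a higher/lower-priority request
	 * relative to the reference entry by clamping the window */
	if (!(pcifunc & RVU_PFVF_FUNC_MASK) &&
	    req->priority == NPC_MCAM_HIGHER_PRIO)
		end = req->ref_entry;
	if (!(pcifunc & RVU_PFVF_FUNC_MASK) &&
	    req->priority == NPC_MCAM_LOWER_PRIO)
		start = req->ref_entry;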
Fixes: 7df5b4b260dd ("octeontx2-af: Allocate low priority entries for PF") Signed-off-by: Subbaraya Sundeep sbhatta@marvell.com Reviewed-by: Jacob Keller jacob.e.keller@intel.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- .../ethernet/marvell/octeontx2/af/rvu_npc.c | 33 ++++++++++++------- 1 file changed, 22 insertions(+), 11 deletions(-)
diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc.c b/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc.c index 91a4ea529d077..00ef6d201b973 100644 --- a/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc.c +++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc.c @@ -2506,7 +2506,17 @@ static int npc_mcam_alloc_entries(struct npc_mcam *mcam, u16 pcifunc, * - when available free entries are less. * Lower priority ones out of avaialble free entries are always * chosen when 'high vs low' question arises. + * + * For a VF base MCAM match rule is set by its PF. And all the + * further MCAM rules installed by VF on its own are + * concatenated with the base rule set by its PF. Hence PF entries + * should be at lower priority compared to VF entries. Otherwise + * base rule is hit always and rules installed by VF will be of + * no use. Hence if the request is from PF then allocate low + * priority entries. */ + if (!(pcifunc & RVU_PFVF_FUNC_MASK)) + goto lprio_alloc;
/* Get the search range for priority allocation request */ if (req->priority) { @@ -2515,17 +2525,6 @@ static int npc_mcam_alloc_entries(struct npc_mcam *mcam, u16 pcifunc, goto alloc; }
- /* For a VF base MCAM match rule is set by its PF. And all the - * further MCAM rules installed by VF on its own are - * concatenated with the base rule set by its PF. Hence PF entries - * should be at lower priority compared to VF entries. Otherwise - * base rule is hit always and rules installed by VF will be of - * no use. Hence if the request is from PF and NOT a priority - * allocation request then allocate low priority entries. - */ - if (!(pcifunc & RVU_PFVF_FUNC_MASK)) - goto lprio_alloc; - /* Find out the search range for non-priority allocation request * * Get MCAM free entry count in middle zone. @@ -2555,6 +2554,18 @@ static int npc_mcam_alloc_entries(struct npc_mcam *mcam, u16 pcifunc, reverse = true; start = 0; end = mcam->bmap_entries; + /* Ensure PF requests are always at bottom and if PF requests + * for higher/lower priority entry wrt reference entry then + * honour that criteria and start search for entries from bottom + * and not in mid zone. + */ + if (!(pcifunc & RVU_PFVF_FUNC_MASK) && + req->priority == NPC_MCAM_HIGHER_PRIO) + end = req->ref_entry; + + if (!(pcifunc & RVU_PFVF_FUNC_MASK) && + req->priority == NPC_MCAM_LOWER_PRIO) + start = req->ref_entry; }
alloc:
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Wen Gu guwen@linux.alibaba.com
[ Upstream commit fb0aa0781a5f457e3864da68af52c3b1f4f7fd8f ]
When copying smc settings to clcsock, avoid setting clcsock's sk_sndbuf to sysctl_tcp_wmem[1], since this may overwrite the value set by tcp_sndbuf_expand() in TCP connection establishment.
And the other setting sk_{snd|rcv}buf to sysctl value in smc_adjust_sock_bufsizes() can also be omitted since the initialization of smc sock and clcsock has set sk_{snd|rcv}buf to smc.sysctl_{w|r}mem or ipv4_sysctl_tcp_{w|r}mem[1].
Fixes: 30c3c4a4497c ("net/smc: Use correct buffer sizes when switching between TCP and SMC") Link: https://lore.kernel.org/r/5eaf3858-e7fd-4db8-83e8-3d7a3e0e9ae2@linux.alibaba... Signed-off-by: Wen Gu guwen@linux.alibaba.com Reviewed-by: Wenjia Zhang wenjia@linux.ibm.com Reviewed-by: Gerd Bayer gbayer@linux.ibm.com, too. Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/smc/af_smc.c | 22 ++-------------------- 1 file changed, 2 insertions(+), 20 deletions(-)
diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c index b6609527dff62..e86db21fef6e5 100644 --- a/net/smc/af_smc.c +++ b/net/smc/af_smc.c @@ -462,29 +462,11 @@ static int smc_bind(struct socket *sock, struct sockaddr *uaddr, static void smc_adjust_sock_bufsizes(struct sock *nsk, struct sock *osk, unsigned long mask) { - struct net *nnet = sock_net(nsk); - nsk->sk_userlocks = osk->sk_userlocks; - if (osk->sk_userlocks & SOCK_SNDBUF_LOCK) { + if (osk->sk_userlocks & SOCK_SNDBUF_LOCK) nsk->sk_sndbuf = osk->sk_sndbuf; - } else { - if (mask == SK_FLAGS_SMC_TO_CLC) - WRITE_ONCE(nsk->sk_sndbuf, - READ_ONCE(nnet->ipv4.sysctl_tcp_wmem[1])); - else - WRITE_ONCE(nsk->sk_sndbuf, - 2 * READ_ONCE(nnet->smc.sysctl_wmem)); - } - if (osk->sk_userlocks & SOCK_RCVBUF_LOCK) { + if (osk->sk_userlocks & SOCK_RCVBUF_LOCK) nsk->sk_rcvbuf = osk->sk_rcvbuf; - } else { - if (mask == SK_FLAGS_SMC_TO_CLC) - WRITE_ONCE(nsk->sk_rcvbuf, - READ_ONCE(nnet->ipv4.sysctl_tcp_rmem[1])); - else - WRITE_ONCE(nsk->sk_rcvbuf, - 2 * READ_ONCE(nnet->smc.sysctl_rmem)); - } }
static void smc_copy_sock_settings(struct sock *nsk, struct sock *osk,
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Hangyu Hua hbh25y@gmail.com
[ Upstream commit affc18fdc694190ca7575b9a86632a73b9fe043d ]
qopt->bands is assigned to q->bands after the kmalloc() and drives the subsequent code logic, so the old q->bands should not be used to size the kmalloc(). Otherwise, an out-of-bounds write will occur.
Fixes: c2999f7fb05b ("net: sched: multiq: don't call qdisc_put() while holding tree lock") Signed-off-by: Hangyu Hua hbh25y@gmail.com Acked-by: Cong Wang cong.wang@bytedance.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/sched/sch_multiq.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/sched/sch_multiq.c b/net/sched/sch_multiq.c index 75c9c860182b4..0d6649d937c9f 100644 --- a/net/sched/sch_multiq.c +++ b/net/sched/sch_multiq.c @@ -185,7 +185,7 @@ static int multiq_tune(struct Qdisc *sch, struct nlattr *opt,
qopt->bands = qdisc_dev(sch)->real_num_tx_queues;
- removed = kmalloc(sizeof(*removed) * (q->max_bands - q->bands), + removed = kmalloc(sizeof(*removed) * (q->max_bands - qopt->bands), GFP_KERNEL); if (!removed) return -ENOMEM;
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Daniel Borkmann daniel@iogearbox.net
[ Upstream commit 1cd4bc987abb2823836cbb8f887026011ccddc8a ]
Commit f58f45c1e5b9 ("vxlan: drop packets from invalid src-address") has recently been added to vxlan mainly in the context of source address snooping/learning so that when it is enabled, an entry in the FDB is not being created for an invalid address for the corresponding tunnel endpoint.
Before commit f58f45c1e5b9, vxlan behaved like geneve in that it passed through whichever MACs were set in the L2 header. It turns out that this change in behavior breaks setups: for example, Cilium with netkit in L3 mode for Pods, as well as tunnel mode, had been passing before the change in f58f45c1e5b9 for both vxlan and geneve. After the mentioned change it only passes for geneve, since in the vxlan case packets are dropped due to vxlan_set_mac() returning false, as the source and destination MACs are zero, which for E/W traffic via the tunnel is totally fine.
Fix it by only opting into the is_valid_ether_addr() check in vxlan_set_mac() when in fact source address snooping/learning is actually enabled in vxlan. This is done by moving the check into vxlan_snoop(). With this change, the Cilium connectivity test suite passes again for both tunnel flavors.
Fixes: f58f45c1e5b9 ("vxlan: drop packets from invalid src-address") Signed-off-by: Daniel Borkmann daniel@iogearbox.net Cc: David Bauer mail@david-bauer.net Cc: Ido Schimmel idosch@nvidia.com Cc: Nikolay Aleksandrov razor@blackwall.org Cc: Martin KaFai Lau martin.lau@kernel.org Reviewed-by: Ido Schimmel idosch@nvidia.com Reviewed-by: Nikolay Aleksandrov razor@blackwall.org Reviewed-by: David Bauer mail@david-bauer.net Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/vxlan/vxlan_core.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/drivers/net/vxlan/vxlan_core.c b/drivers/net/vxlan/vxlan_core.c index a7ae68f490c4c..61224a5a877cb 100644 --- a/drivers/net/vxlan/vxlan_core.c +++ b/drivers/net/vxlan/vxlan_core.c @@ -1493,6 +1493,10 @@ static bool vxlan_snoop(struct net_device *dev, struct vxlan_fdb *f; u32 ifindex = 0;
+ /* Ignore packets from invalid src-address */ + if (!is_valid_ether_addr(src_mac)) + return true; + #if IS_ENABLED(CONFIG_IPV6) if (src_ip->sa.sa_family == AF_INET6 && (ipv6_addr_type(&src_ip->sin6.sin6_addr) & IPV6_ADDR_LINKLOCAL))
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jason Xing kernelxing@tencent.com
[ Upstream commit a46d0ea5c94205f40ecf912d1bb7806a8a64704f ]
According to RFC 1213, we should also take CLOSE-WAIT sockets into consideration:
"tcpCurrEstab OBJECT-TYPE ... The number of TCP connections for which the current state is either ESTABLISHED or CLOSE- WAIT."
After this, CurrEstab counter will display the total number of ESTABLISHED and CLOSE-WAIT sockets.
The logic of counting:
When do we increment the counter?
a) if we change the state to ESTABLISHED.
b) if we change the state from SYN-RECEIVED to CLOSE-WAIT.
When do we decrement the counter?
a) if the socket leaves ESTABLISHED and will never go into CLOSE-WAIT; say, on the client side, changing from ESTABLISHED to FIN-WAIT-1.
b) if the socket leaves CLOSE-WAIT; say, on the server side, changing from CLOSE-WAIT to LAST-ACK.
Please note: there are two chances that old state of socket can be changed to CLOSE-WAIT in tcp_fin(). One is SYN-RECV, the other is ESTABLISHED. So we have to take care of the former case.
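A condensed view of the resulting accounting in tcp_set_state() (as in the diff below; other states elided):

	switch (state) {
	case TCP_ESTABLISHED:
		if (oldstate != TCP_ESTABLISHED)
			TCP_INC_STATS(sock_net(sk), TCP_MIB_CURRESTAB);
		break;
	case TCP_CLOSE_WAIT:
		/* ESTABLISHED -> CLOSE-WAIT is already counted; only the
		 * SYN-RECV -> CLOSE-WAIT transition needs an increment */
		if (oldstate == TCP_SYN_RECV)
			TCP_INC_STATS(sock_net(sk), TCP_MIB_CURRESTAB);
		break;
	/* TCP_CLOSE and the remaining states elided */
	default:
		if (oldstate == TCP_ESTABLISHED || oldstate == TCP_CLOSE_WAIT)
			TCP_DEC_STATS(sock_net(sk), TCP_MIB_CURRESTAB);
	}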
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Jason Xing kernelxing@tencent.com Reviewed-by: Eric Dumazet edumazet@google.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/ipv4/tcp.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index 3447a09ee83a2..2d4f697d338f5 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -2758,6 +2758,10 @@ void tcp_set_state(struct sock *sk, int state) if (oldstate != TCP_ESTABLISHED) TCP_INC_STATS(sock_net(sk), TCP_MIB_CURRESTAB); break; + case TCP_CLOSE_WAIT: + if (oldstate == TCP_SYN_RECV) + TCP_INC_STATS(sock_net(sk), TCP_MIB_CURRESTAB); + break;
case TCP_CLOSE: if (oldstate == TCP_CLOSE_WAIT || oldstate == TCP_ESTABLISHED) @@ -2769,7 +2773,7 @@ void tcp_set_state(struct sock *sk, int state) inet_put_port(sk); fallthrough; default: - if (oldstate == TCP_ESTABLISHED) + if (oldstate == TCP_ESTABLISHED || oldstate == TCP_CLOSE_WAIT) TCP_DEC_STATS(sock_net(sk), TCP_MIB_CURRESTAB); }
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Moshe Shemesh moshe@mellanox.com
[ Upstream commit 8ff38e730c3f5ee717f25365ef8aa4739562d567 ]
If driver teardown is called while PCI is turned off, there is a race between health recovery and teardown. If health recovery has already started, it will wait up to 60 seconds to see if PCI comes back so it can recover, but there is actually no need to keep waiting once teardown has been called.
Use the MLX5_BREAK_FW_WAIT flag which is set on driver teardown to break waiting for PCI up.
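In outline, mlx5_health_wait_pci_up() now bails out once teardown has begun (as in the diff below):

	while (sensor_pci_not_working(dev)) {
		if (time_after(jiffies, end))
			return -ETIMEDOUT;
		if (test_bit(MLX5_BREAK_FW_WAIT, &dev->intf_state)) {
			mlx5_core_warn(dev, "device is being removed, stop waiting for PCI\n");
			return -ENODEV;
		}
		msleep(100);
	}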
Signed-off-by: Moshe Shemesh moshe@mellanox.com Signed-off-by: Saeed Mahameed saeedm@nvidia.com Link: https://lore.kernel.org/r/20230314054234.267365-3-saeed@kernel.org Signed-off-by: Jakub Kicinski kuba@kernel.org Stable-dep-of: 33afbfcc105a ("net/mlx5: Stop waiting for PCI if pci channel is offline") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/mellanox/mlx5/core/health.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/health.c b/drivers/net/ethernet/mellanox/mlx5/core/health.c index e42e4ac231c64..e9462de771fd3 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/health.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/health.c @@ -325,6 +325,10 @@ int mlx5_health_wait_pci_up(struct mlx5_core_dev *dev) while (sensor_pci_not_working(dev)) { if (time_after(jiffies, end)) return -ETIMEDOUT; + if (test_bit(MLX5_BREAK_FW_WAIT, &dev->intf_state)) { + mlx5_core_warn(dev, "device is being removed, stop waiting for PCI\n"); + return -ENODEV; + } msleep(100); } return 0;
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Moshe Shemesh moshe@nvidia.com
[ Upstream commit 33afbfcc105a572159750f2ebee834a8a70fdd96 ]
If the PCI channel becomes offline, the driver should not wait for PCI reads during the health dump and recovery flow. The driver has a timeout for each of these loops trying to read PCI, so it would fail anyway. However, in the recovery case, waiting until the timeout may cause the PCI error_detected() callback to fail to meet the pci_dpc_recovered() wait timeout.
Fixes: b3bd076f7501 ("net/mlx5: Report devlink health on FW fatal issues") Signed-off-by: Moshe Shemesh moshe@nvidia.com Reviewed-by: Shay Drori shayd@nvidia.com Signed-off-by: Tariq Toukan tariqt@nvidia.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/mellanox/mlx5/core/fw.c | 4 ++++ drivers/net/ethernet/mellanox/mlx5/core/health.c | 8 ++++++++ drivers/net/ethernet/mellanox/mlx5/core/lib/pci_vsc.c | 4 ++++ 3 files changed, 16 insertions(+)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fw.c b/drivers/net/ethernet/mellanox/mlx5/core/fw.c index f34e758a2f1f6..9e26dda93f8ee 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/fw.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/fw.c @@ -379,6 +379,10 @@ int mlx5_cmd_fast_teardown_hca(struct mlx5_core_dev *dev) do { if (mlx5_get_nic_state(dev) == MLX5_NIC_IFC_DISABLED) break; + if (pci_channel_offline(dev->pdev)) { + mlx5_core_err(dev, "PCI channel offline, stop waiting for NIC IFC\n"); + return -EACCES; + }
cond_resched(); } while (!time_after(jiffies, end)); diff --git a/drivers/net/ethernet/mellanox/mlx5/core/health.c b/drivers/net/ethernet/mellanox/mlx5/core/health.c index e9462de771fd3..65483dab90573 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/health.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/health.c @@ -260,6 +260,10 @@ void mlx5_error_sw_reset(struct mlx5_core_dev *dev) do { if (mlx5_get_nic_state(dev) == MLX5_NIC_IFC_DISABLED) break; + if (pci_channel_offline(dev->pdev)) { + mlx5_core_err(dev, "PCI channel offline, stop waiting for NIC IFC\n"); + goto unlock; + }
msleep(20); } while (!time_after(jiffies, end)); @@ -329,6 +333,10 @@ int mlx5_health_wait_pci_up(struct mlx5_core_dev *dev) mlx5_core_warn(dev, "device is being removed, stop waiting for PCI\n"); return -ENODEV; } + if (pci_channel_offline(dev->pdev)) { + mlx5_core_err(dev, "PCI channel offline, stop waiting for PCI\n"); + return -EACCES; + } msleep(100); } return 0; diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lib/pci_vsc.c b/drivers/net/ethernet/mellanox/mlx5/core/lib/pci_vsc.c index 6b774e0c27665..d0b595ba61101 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/lib/pci_vsc.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/lib/pci_vsc.c @@ -74,6 +74,10 @@ int mlx5_vsc_gw_lock(struct mlx5_core_dev *dev) ret = -EBUSY; goto pci_unlock; } + if (pci_channel_offline(dev->pdev)) { + ret = -EACCES; + goto pci_unlock; + }
/* Check if semaphore is already locked */ ret = vsc_read(dev, VSC_SEMAPHORE_OFFSET, &lock_val);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Shay Drory shayd@nvidia.com
[ Upstream commit 2059cf51f318681a4cdd3eb1a01a2d62b6a9c442 ]
mlx5_cmd_init_hca() takes ~0.2 seconds. A user who desires to disable some of the SF aux devices, at a large scale (1K SFs, for example), will waste more than 3 minutes on mlx5_cmd_init_hca(), which isn't needed at this stage.
A downstream patch will change SFs which are probed over the E-switch (local SFs) to be probed without any aux dev. In order to support this, split function_setup() to avoid executing mlx5_cmd_init_hca().
Signed-off-by: Shay Drory shayd@nvidia.com Reviewed-by: Moshe Shemesh moshe@nvidia.com Signed-off-by: Saeed Mahameed saeedm@nvidia.com Stable-dep-of: c8b3f38d2dae ("net/mlx5: Always stop health timer during driver removal") Signed-off-by: Sasha Levin sashal@kernel.org --- .../net/ethernet/mellanox/mlx5/core/main.c | 83 +++++++++++++------ 1 file changed, 58 insertions(+), 25 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c index 6ab0642e9de78..fe0a78c29438b 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/main.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c @@ -1093,7 +1093,7 @@ static void mlx5_cleanup_once(struct mlx5_core_dev *dev) mlx5_devcom_unregister_device(dev->priv.devcom); }
-static int mlx5_function_setup(struct mlx5_core_dev *dev, bool boot, u64 timeout) +static int mlx5_function_enable(struct mlx5_core_dev *dev, bool boot, u64 timeout) { int err;
@@ -1158,28 +1158,56 @@ static int mlx5_function_setup(struct mlx5_core_dev *dev, bool boot, u64 timeout goto reclaim_boot_pages; }
+ return 0; + +reclaim_boot_pages: + mlx5_reclaim_startup_pages(dev); +err_disable_hca: + mlx5_core_disable_hca(dev, 0); +stop_health_poll: + mlx5_stop_health_poll(dev, boot); +err_cmd_cleanup: + mlx5_cmd_set_state(dev, MLX5_CMDIF_STATE_DOWN); + mlx5_cmd_cleanup(dev); + + return err; +} + +static void mlx5_function_disable(struct mlx5_core_dev *dev, bool boot) +{ + mlx5_reclaim_startup_pages(dev); + mlx5_core_disable_hca(dev, 0); + mlx5_stop_health_poll(dev, boot); + mlx5_cmd_set_state(dev, MLX5_CMDIF_STATE_DOWN); + mlx5_cmd_cleanup(dev); +} + +static int mlx5_function_open(struct mlx5_core_dev *dev) +{ + int err; + err = set_hca_ctrl(dev); if (err) { mlx5_core_err(dev, "set_hca_ctrl failed\n"); - goto reclaim_boot_pages; + return err; }
err = set_hca_cap(dev); if (err) { mlx5_core_err(dev, "set_hca_cap failed\n"); - goto reclaim_boot_pages; + return err; }
err = mlx5_satisfy_startup_pages(dev, 0); if (err) { mlx5_core_err(dev, "failed to allocate init pages\n"); - goto reclaim_boot_pages; + return err; }
err = mlx5_cmd_init_hca(dev, sw_owner_id); if (err) { mlx5_core_err(dev, "init hca failed\n"); - goto reclaim_boot_pages; + return err; }
mlx5_set_driver_version(dev); @@ -1187,26 +1215,13 @@ static int mlx5_function_setup(struct mlx5_core_dev *dev, bool boot, u64 timeout err = mlx5_query_hca_caps(dev); if (err) { mlx5_core_err(dev, "query hca failed\n"); - goto reclaim_boot_pages; + return err; } mlx5_start_health_fw_log_up(dev); - return 0; - -reclaim_boot_pages: - mlx5_reclaim_startup_pages(dev); -err_disable_hca: - mlx5_core_disable_hca(dev, 0); -stop_health_poll: - mlx5_stop_health_poll(dev, boot); -err_cmd_cleanup: - mlx5_cmd_set_state(dev, MLX5_CMDIF_STATE_DOWN); - mlx5_cmd_cleanup(dev); - - return err; }
-static int mlx5_function_teardown(struct mlx5_core_dev *dev, bool boot) +static int mlx5_function_close(struct mlx5_core_dev *dev) { int err;
@@ -1215,15 +1230,33 @@ static int mlx5_function_teardown(struct mlx5_core_dev *dev, bool boot) mlx5_core_err(dev, "tear_down_hca failed, skip cleanup\n"); return err; } - mlx5_reclaim_startup_pages(dev); - mlx5_core_disable_hca(dev, 0); - mlx5_stop_health_poll(dev, boot); - mlx5_cmd_set_state(dev, MLX5_CMDIF_STATE_DOWN); - mlx5_cmd_cleanup(dev);
return 0; }
+static int mlx5_function_setup(struct mlx5_core_dev *dev, bool boot, u64 timeout) +{ + int err; + + err = mlx5_function_enable(dev, boot, timeout); + if (err) + return err; + + err = mlx5_function_open(dev); + if (err) + mlx5_function_disable(dev, boot); + return err; +} + +static int mlx5_function_teardown(struct mlx5_core_dev *dev, bool boot) +{ + int err = mlx5_function_close(dev); + + if (!err) + mlx5_function_disable(dev, boot); + return err; +} + static int mlx5_load(struct mlx5_core_dev *dev) { int err;
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Shay Drory shayd@nvidia.com
[ Upstream commit c8b3f38d2dae0397944814d691a419c451f9906f ]
Currently, if teardown_hca fails to execute during driver removal, mlx5 does not stop the health timer. Afterwards, mlx5 continues with driver teardown. This may lead to a UAF bug, which results in a page fault Oops [1], since the health timer fires after the resources were freed.
Hence, stop the health monitor even if teardown_hca fails.
[1] mlx5_core 0000:18:00.0: E-Switch: Unload vfs: mode(LEGACY), nvfs(0), necvfs(0), active vports(0) mlx5_core 0000:18:00.0: E-Switch: Disable: mode(LEGACY), nvfs(0), necvfs(0), active vports(0) mlx5_core 0000:18:00.0: E-Switch: Disable: mode(LEGACY), nvfs(0), necvfs(0), active vports(0) mlx5_core 0000:18:00.0: E-Switch: cleanup mlx5_core 0000:18:00.0: wait_func:1155:(pid 1967079): TEARDOWN_HCA(0x103) timeout. Will cause a leak of a command resource mlx5_core 0000:18:00.0: mlx5_function_close:1288:(pid 1967079): tear_down_hca failed, skip cleanup BUG: unable to handle page fault for address: ffffa26487064230 PGD 100c00067 P4D 100c00067 PUD 100e5a067 PMD 105ed7067 PTE 0 Oops: 0000 [#1] PREEMPT SMP PTI CPU: 0 PID: 0 Comm: swapper/0 Tainted: G OE ------- --- 6.7.0-68.fc38.x86_64 #1 Hardware name: Intel Corporation S2600WFT/S2600WFT, BIOS SE5C620.86B.02.01.0013.121520200651 12/15/2020 RIP: 0010:ioread32be+0x34/0x60 RSP: 0018:ffffa26480003e58 EFLAGS: 00010292 RAX: ffffa26487064200 RBX: ffff9042d08161a0 RCX: ffff904c108222c0 RDX: 000000010bbf1b80 RSI: ffffffffc055ddb0 RDI: ffffa26487064230 RBP: ffff9042d08161a0 R08: 0000000000000022 R09: ffff904c108222e8 R10: 0000000000000004 R11: 0000000000000441 R12: ffffffffc055ddb0 R13: ffffa26487064200 R14: ffffa26480003f00 R15: ffff904c108222c0 FS: 0000000000000000(0000) GS:ffff904c10800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffa26487064230 CR3: 00000002c4420006 CR4: 00000000007706f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: <IRQ> ? __die+0x23/0x70 ? page_fault_oops+0x171/0x4e0 ? exc_page_fault+0x175/0x180 ? asm_exc_page_fault+0x26/0x30 ? __pfx_poll_health+0x10/0x10 [mlx5_core] ? __pfx_poll_health+0x10/0x10 [mlx5_core] ? ioread32be+0x34/0x60 mlx5_health_check_fatal_sensors+0x20/0x100 [mlx5_core] ? __pfx_poll_health+0x10/0x10 [mlx5_core] poll_health+0x42/0x230 [mlx5_core] ? __next_timer_interrupt+0xbc/0x110 ? __pfx_poll_health+0x10/0x10 [mlx5_core] call_timer_fn+0x21/0x130 ? __pfx_poll_health+0x10/0x10 [mlx5_core] __run_timers+0x222/0x2c0 run_timer_softirq+0x1d/0x40 __do_softirq+0xc9/0x2c8 __irq_exit_rcu+0xa6/0xc0 sysvec_apic_timer_interrupt+0x72/0x90 </IRQ> <TASK> asm_sysvec_apic_timer_interrupt+0x1a/0x20 RIP: 0010:cpuidle_enter_state+0xcc/0x440 ? cpuidle_enter_state+0xbd/0x440 cpuidle_enter+0x2d/0x40 do_idle+0x20d/0x270 cpu_startup_entry+0x2a/0x30 rest_init+0xd0/0xd0 arch_call_rest_init+0xe/0x30 start_kernel+0x709/0xa90 x86_64_start_reservations+0x18/0x30 x86_64_start_kernel+0x96/0xa0 secondary_startup_64_no_verify+0x18f/0x19b ---[ end trace 0000000000000000 ]---
Fixes: 9b98d395b85d ("net/mlx5: Start health poll at earlier stage of driver load") Signed-off-by: Shay Drory shayd@nvidia.com Reviewed-by: Moshe Shemesh moshe@nvidia.com Signed-off-by: Tariq Toukan tariqt@nvidia.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/mellanox/mlx5/core/main.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c index fe0a78c29438b..67849b1c0bb71 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/main.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c @@ -1254,6 +1254,9 @@ static int mlx5_function_teardown(struct mlx5_core_dev *dev, bool boot)
if (!err) mlx5_function_disable(dev, boot); + else + mlx5_stop_health_poll(dev, boot); + return err; }
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Aleksandr Mishin amishin@t-argos.ru
[ Upstream commit 229bedbf62b13af5aba6525ad10b62ad38d9ccb5 ]
In case of flow rule creation failure in mlx5_lag_create_port_sel_table(), the tainted pointer is deleted several times instead of the previously created rules. Fix this bug by using the correct flow rule pointers.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
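The corrected unwind, in outline (as in the diff below), recomputes the flat rule index for every bucket of every earlier port instead of reusing the index of the rule that failed:

	if (IS_ERR(lag_definer->rules[idx])) {
		err = PTR_ERR(lag_definer->rules[idx]);
		/* delete the buckets created so far for port i, then all
		 * buckets of every earlier port */
		do {
			while (j--) {
				idx = i * ldev->buckets + j;
				mlx5_del_flow_rules(lag_definer->rules[idx]);
			}
			j = ldev->buckets;
		} while (i--);
		goto destroy_fg;
	}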
Fixes: 352899f384d4 ("net/mlx5: Lag, use buckets in hash mode") Signed-off-by: Aleksandr Mishin amishin@t-argos.ru Reviewed-by: Jacob Keller jacob.e.keller@intel.com Reviewed-by: Tariq Toukan tariqt@nvidia.com Link: https://lore.kernel.org/r/20240604100552.25201-1-amishin@t-argos.ru Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/mellanox/mlx5/core/lag/port_sel.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lag/port_sel.c b/drivers/net/ethernet/mellanox/mlx5/core/lag/port_sel.c index 7d9bbb494d95b..005661248c7e9 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/lag/port_sel.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/lag/port_sel.c @@ -88,9 +88,13 @@ static int mlx5_lag_create_port_sel_table(struct mlx5_lag *ldev, &dest, 1); if (IS_ERR(lag_definer->rules[idx])) { err = PTR_ERR(lag_definer->rules[idx]); - while (i--) - while (j--) + do { + while (j--) { + idx = i * ldev->buckets + j; mlx5_del_flow_rules(lag_definer->rules[idx]); + } + j = ldev->buckets; + } while (i--); goto destroy_fg; } }
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Dumazet edumazet@google.com
[ Upstream commit f921a58ae20852d188f70842431ce6519c4fdc36 ]
If one TCA_TAPRIO_ATTR_PRIOMAP attribute has been provided, taprio_parse_mqprio_opt() must validate it, or userspace can inject arbitrary data to the kernel, the second time taprio_change() is called.
First call (with valid attributes) sets dev->num_tc to a non-zero value.
Second call (with arbitrary mqprio attributes) returns early from taprio_parse_mqprio_opt() and bad things can happen.
Fixes: a3d43c0d56f1 ("taprio: Add support adding an admin schedule") Reported-by: Noam Rathaus noamr@ssd-disclosure.com Signed-off-by: Eric Dumazet edumazet@google.com Acked-by: Vinicius Costa Gomes vinicius.gomes@intel.com Reviewed-by: Vladimir Oltean vladimir.oltean@nxp.com Link: https://lore.kernel.org/r/20240604181511.769870-1-edumazet@google.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/sched/sch_taprio.c | 15 ++++++--------- 1 file changed, 6 insertions(+), 9 deletions(-)
diff --git a/net/sched/sch_taprio.c b/net/sched/sch_taprio.c index 1d4638aa4254f..41187bbd25ee9 100644 --- a/net/sched/sch_taprio.c +++ b/net/sched/sch_taprio.c @@ -938,16 +938,13 @@ static int taprio_parse_mqprio_opt(struct net_device *dev, { int i, j;
- if (!qopt && !dev->num_tc) { - NL_SET_ERR_MSG(extack, "'mqprio' configuration is necessary"); - return -EINVAL; - } - - /* If num_tc is already set, it means that the user already - * configured the mqprio part - */ - if (dev->num_tc) + if (!qopt) { + if (!dev->num_tc) { + NL_SET_ERR_MSG(extack, "'mqprio' configuration is necessary"); + return -EINVAL; + } return 0; + }
/* Verify num_tc is not out of max range */ if (qopt->num_tc > TC_MAX_QUEUE) {
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Karol Kolacinski karol.kolacinski@intel.com
[ Upstream commit 323a359f9b077f382f4483023d096a4d316fd135 ]
On failed verification of a PTP clock pin, the error message prints the channel number instead of the pin index after "pin", which is incorrect.
Fix error message by adding channel number to the message and printing pin number instead of channel number.
Fixes: 6092315dfdec ("ptp: introduce programmable pins.") Signed-off-by: Karol Kolacinski karol.kolacinski@intel.com Acked-by: Richard Cochran richardcochran@gmail.com Link: https://lore.kernel.org/r/20240604120555.16643-1-karol.kolacinski@intel.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/ptp/ptp_chardev.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/ptp/ptp_chardev.c b/drivers/ptp/ptp_chardev.c index 9311f3d09c8fc..8eb902fe73a98 100644 --- a/drivers/ptp/ptp_chardev.c +++ b/drivers/ptp/ptp_chardev.c @@ -84,7 +84,8 @@ int ptp_set_pinfunc(struct ptp_clock *ptp, unsigned int pin, }
if (info->verify(info, pin, func, chan)) { - pr_err("driver cannot use function %u on pin %u\n", func, chan); + pr_err("driver cannot use function %u and channel %u on pin %u\n", + func, chan, pin); return -EOPNOTSUPP; }
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jacob Keller jacob.e.keller@intel.com
[ Upstream commit 03e4a092be8ce3de7c1baa7ae14e68b64e3ea644 ]
The ice_get_pfa_module_tlv() function iterates over the Type-Length-Value structures in the Preserved Fields Area (PFA) of the NVM. This is used by the driver to access data such as the Part Board Assembly identifier.
The function uses simple logic to iterate over the PFA. First, the pointer to the PFA in the NVM is read. Then the total length of the PFA is read from the first word.
A pointer to the first TLV is initialized, and a simple loop iterates over each TLV. The pointer is moved forward through the NVM until it exceeds the PFA area.
The logic seems sound, but it is missing a key detail. The Preserved Fields Area length includes one additional final word. This is documented in the device data sheet as a dummy word which contains 0xFFFF. All NVMs have this extra word.
If the driver tries to scan for a TLV that is not in the PFA, it will read past the size of the PFA. It reads and interprets the last dummy word of the PFA as a TLV with type 0xFFFF. It then reads the word following the PFA as a length.
The PFA resides within the Shadow RAM portion of the NVM, which is relatively small. All of its offsets are within a 16-bit size. The PFA pointer and TLV pointer are stored by the driver as 16-bit values.
In almost all cases, the word following the PFA will be such that interpreting it as a length will result in 16-bit arithmetic overflow. Once overflowed, the new next_tlv value is now below the maximum offset of the PFA. Thus, the driver will continue to iterate the data as TLVs. In the worst case, the driver hits on a sequence of reads which loop back to reading the same offsets in an endless loop.
To fix this, we need to correct the loop iteration check to account for this extra word at the end of the PFA. This alone is sufficient to resolve the known cases of this issue in the field. However, it is plausible that an NVM could be misconfigured or have corrupt data which results in the same kind of overflow. Protect against this by using check_add_overflow when calculating both the maximum offset of the TLVs, and when calculating the next_tlv offset at the end of each loop iteration. This ensures that the driver will not get stuck in an infinite loop when scanning the PFA.
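Schematically, the fixed iteration looks like this (types and helpers as in the diff below; the TLV reads themselves are elided):

	u16 pfa_len, pfa_ptr, next_tlv, max_tlv, tlv_len;

	/* pfa_len covers the length word, every TLV, and one trailing
	 * dummy word, so the last word of the PFA is pfa_ptr + pfa_len - 1 */
	if (check_add_overflow(pfa_ptr, pfa_len - 1, &max_tlv))
		return -EINVAL;

	next_tlv = pfa_ptr + 1;
	while (next_tlv < max_tlv) {
		/* ... read tlv_sub_module_type and tlv_len at next_tlv ... */

		/* advance past the type and length words plus the value,
		 * refusing to wrap around 16 bits */
		if (check_add_overflow(next_tlv, 2, &next_tlv) ||
		    check_add_overflow(next_tlv, tlv_len, &next_tlv))
			return -EINVAL;
	}
	return -ENOENT;	/* module does not exist */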
Fixes: e961b679fb0b ("ice: add board identifier info to devlink .info_get") Co-developed-by: Paul Greenwalt paul.greenwalt@intel.com Signed-off-by: Paul Greenwalt paul.greenwalt@intel.com Reviewed-by: Przemek Kitszel przemyslaw.kitszel@intel.com Tested-by: Pucha Himasekhar Reddy himasekharx.reddy.pucha@intel.com Signed-off-by: Jacob Keller jacob.e.keller@intel.com Link: https://lore.kernel.org/r/20240603-net-2024-05-30-intel-net-fixes-v2-1-e3563... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/ice/ice_nvm.c | 28 ++++++++++++++++++------ 1 file changed, 21 insertions(+), 7 deletions(-)
diff --git a/drivers/net/ethernet/intel/ice/ice_nvm.c b/drivers/net/ethernet/intel/ice/ice_nvm.c index c262dc886e6a6..07ef6b1f00884 100644 --- a/drivers/net/ethernet/intel/ice/ice_nvm.c +++ b/drivers/net/ethernet/intel/ice/ice_nvm.c @@ -441,8 +441,7 @@ int ice_get_pfa_module_tlv(struct ice_hw *hw, u16 *module_tlv, u16 *module_tlv_len, u16 module_type) { - u16 pfa_len, pfa_ptr; - u16 next_tlv; + u16 pfa_len, pfa_ptr, next_tlv, max_tlv; int status;
status = ice_read_sr_word(hw, ICE_SR_PFA_PTR, &pfa_ptr); @@ -455,11 +454,23 @@ ice_get_pfa_module_tlv(struct ice_hw *hw, u16 *module_tlv, u16 *module_tlv_len, ice_debug(hw, ICE_DBG_INIT, "Failed to read PFA length.\n"); return status; } + + /* The Preserved Fields Area contains a sequence of Type-Length-Value + * structures which define its contents. The PFA length includes all + * of the TLVs, plus the initial length word itself, *and* one final + * word at the end after all of the TLVs. + */ + if (check_add_overflow(pfa_ptr, pfa_len - 1, &max_tlv)) { + dev_warn(ice_hw_to_dev(hw), "PFA starts at offset %u. PFA length of %u caused 16-bit arithmetic overflow.\n", + pfa_ptr, pfa_len); + return -EINVAL; + } + /* Starting with first TLV after PFA length, iterate through the list * of TLVs to find the requested one. */ next_tlv = pfa_ptr + 1; - while (next_tlv < pfa_ptr + pfa_len) { + while (next_tlv < max_tlv) { u16 tlv_sub_module_type; u16 tlv_len;
@@ -483,10 +494,13 @@ ice_get_pfa_module_tlv(struct ice_hw *hw, u16 *module_tlv, u16 *module_tlv_len, } return -EINVAL; } - /* Check next TLV, i.e. current TLV pointer + length + 2 words - * (for current TLV's type and length) - */ - next_tlv = next_tlv + tlv_len + 2; + + if (check_add_overflow(next_tlv, 2, &next_tlv) || + check_add_overflow(next_tlv, tlv_len, &next_tlv)) { + dev_warn(ice_hw_to_dev(hw), "TLV of type %u and length 0x%04x caused 16-bit arithmetic overflow. The PFA starts at 0x%04x and has length of 0x%04x\n", + tlv_sub_module_type, tlv_len, pfa_ptr, pfa_len); + return -EINVAL; + } } /* Module does not exist */ return -ENOENT;
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Michal Wilczynski michal.wilczynski@intel.com
[ Upstream commit 16dfa49406bc5e1f4cbb115027cbd719d7e6c930 ]
To support new devlink-rate API ice_sched_node struct needs to store a number of additional parameters. This includes tx_max, tx_share, tx_weight, and tx_priority.
Add new fields to ice_sched_node struct. Add new functions to configure the hardware with new parameters. Introduce new xarray to identify nodes uniquely.
Signed-off-by: Michal Wilczynski michal.wilczynski@intel.com Signed-off-by: Jakub Kicinski kuba@kernel.org Stable-dep-of: adbf5a42341f ("ice: remove af_xdp_zc_qps bitmap") Signed-off-by: Sasha Levin sashal@kernel.org --- .../net/ethernet/intel/ice/ice_adminq_cmd.h | 4 +- drivers/net/ethernet/intel/ice/ice_common.c | 3 + drivers/net/ethernet/intel/ice/ice_sched.c | 81 +++++++++++++++++-- drivers/net/ethernet/intel/ice/ice_sched.h | 27 +++++++ drivers/net/ethernet/intel/ice/ice_type.h | 8 ++ 5 files changed, 116 insertions(+), 7 deletions(-)
diff --git a/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h b/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h index fe48164dce1e1..4d53c40a9de27 100644 --- a/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h +++ b/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h @@ -848,9 +848,9 @@ struct ice_aqc_txsched_elem { u8 generic; #define ICE_AQC_ELEM_GENERIC_MODE_M 0x1 #define ICE_AQC_ELEM_GENERIC_PRIO_S 0x1 -#define ICE_AQC_ELEM_GENERIC_PRIO_M (0x7 << ICE_AQC_ELEM_GENERIC_PRIO_S) +#define ICE_AQC_ELEM_GENERIC_PRIO_M GENMASK(3, 1) #define ICE_AQC_ELEM_GENERIC_SP_S 0x4 -#define ICE_AQC_ELEM_GENERIC_SP_M (0x1 << ICE_AQC_ELEM_GENERIC_SP_S) +#define ICE_AQC_ELEM_GENERIC_SP_M GENMASK(4, 4) #define ICE_AQC_ELEM_GENERIC_ADJUST_VAL_S 0x5 #define ICE_AQC_ELEM_GENERIC_ADJUST_VAL_M \ (0x3 << ICE_AQC_ELEM_GENERIC_ADJUST_VAL_S) diff --git a/drivers/net/ethernet/intel/ice/ice_common.c b/drivers/net/ethernet/intel/ice/ice_common.c index 039342a0ed15a..e2e661010176c 100644 --- a/drivers/net/ethernet/intel/ice/ice_common.c +++ b/drivers/net/ethernet/intel/ice/ice_common.c @@ -1105,6 +1105,9 @@ int ice_init_hw(struct ice_hw *hw)
hw->evb_veb = true;
+ /* init xarray for identifying scheduling nodes uniquely */ + xa_init_flags(&hw->port_info->sched_node_ids, XA_FLAGS_ALLOC); + /* Query the allocated resources for Tx scheduler */ status = ice_sched_query_res_alloc(hw); if (status) { diff --git a/drivers/net/ethernet/intel/ice/ice_sched.c b/drivers/net/ethernet/intel/ice/ice_sched.c index 2c62c1763ee0d..88e74835d0274 100644 --- a/drivers/net/ethernet/intel/ice/ice_sched.c +++ b/drivers/net/ethernet/intel/ice/ice_sched.c @@ -1,6 +1,7 @@ // SPDX-License-Identifier: GPL-2.0 /* Copyright (c) 2018, Intel Corporation. */
+#include <net/devlink.h> #include "ice_sched.h"
/** @@ -355,6 +356,9 @@ void ice_free_sched_node(struct ice_port_info *pi, struct ice_sched_node *node) /* leaf nodes have no children */ if (node->children) devm_kfree(ice_hw_to_dev(hw), node->children); + + kfree(node->name); + xa_erase(&pi->sched_node_ids, node->id); devm_kfree(ice_hw_to_dev(hw), node); }
@@ -875,7 +879,7 @@ void ice_sched_cleanup_all(struct ice_hw *hw) * * This function add nodes to HW as well as to SW DB for a given layer */ -static int +int ice_sched_add_elems(struct ice_port_info *pi, struct ice_sched_node *tc_node, struct ice_sched_node *parent, u8 layer, u16 num_nodes, u16 *num_nodes_added, u32 *first_node_teid) @@ -940,6 +944,22 @@ ice_sched_add_elems(struct ice_port_info *pi, struct ice_sched_node *tc_node,
new_node->sibling = NULL; new_node->tc_num = tc_node->tc_num; + new_node->tx_weight = ICE_SCHED_DFLT_BW_WT; + new_node->tx_share = ICE_SCHED_DFLT_BW; + new_node->tx_max = ICE_SCHED_DFLT_BW; + new_node->name = kzalloc(SCHED_NODE_NAME_MAX_LEN, GFP_KERNEL); + if (!new_node->name) + return -ENOMEM; + + status = xa_alloc(&pi->sched_node_ids, &new_node->id, NULL, XA_LIMIT(0, UINT_MAX), + GFP_KERNEL); + if (status) { + ice_debug(hw, ICE_DBG_SCHED, "xa_alloc failed for sched node status =%d\n", + status); + break; + } + + snprintf(new_node->name, SCHED_NODE_NAME_MAX_LEN, "node_%u", new_node->id);
/* add it to previous node sibling pointer */ /* Note: siblings are not linked across branches */ @@ -2154,7 +2174,7 @@ ice_sched_get_free_vsi_parent(struct ice_hw *hw, struct ice_sched_node *node, * This function removes the child from the old parent and adds it to a new * parent */ -static void +void ice_sched_update_parent(struct ice_sched_node *new_parent, struct ice_sched_node *node) { @@ -2188,7 +2208,7 @@ ice_sched_update_parent(struct ice_sched_node *new_parent, * * This function move the child nodes to a given parent. */ -static int +int ice_sched_move_nodes(struct ice_port_info *pi, struct ice_sched_node *parent, u16 num_items, u32 *list) { @@ -3562,7 +3582,7 @@ ice_sched_set_eir_srl_excl(struct ice_port_info *pi, * node's RL profile ID of type CIR, EIR, or SRL, and removes old profile * ID from local database. The caller needs to hold scheduler lock. */ -static int +int ice_sched_set_node_bw(struct ice_port_info *pi, struct ice_sched_node *node, enum ice_rl_type rl_type, u32 bw, u8 layer_num) { @@ -3598,6 +3618,57 @@ ice_sched_set_node_bw(struct ice_port_info *pi, struct ice_sched_node *node, ICE_AQC_RL_PROFILE_TYPE_M, old_id); }
+/** + * ice_sched_set_node_priority - set node's priority + * @pi: port information structure + * @node: tree node + * @priority: number 0-7 representing priority among siblings + * + * This function sets priority of a node among it's siblings. + */ +int +ice_sched_set_node_priority(struct ice_port_info *pi, struct ice_sched_node *node, + u16 priority) +{ + struct ice_aqc_txsched_elem_data buf; + struct ice_aqc_txsched_elem *data; + + buf = node->info; + data = &buf.data; + + data->valid_sections |= ICE_AQC_ELEM_VALID_GENERIC; + data->generic |= FIELD_PREP(ICE_AQC_ELEM_GENERIC_PRIO_M, priority); + + return ice_sched_update_elem(pi->hw, node, &buf); +} + +/** + * ice_sched_set_node_weight - set node's weight + * @pi: port information structure + * @node: tree node + * @weight: number 1-200 representing weight for WFQ + * + * This function sets weight of the node for WFQ algorithm. + */ +int +ice_sched_set_node_weight(struct ice_port_info *pi, struct ice_sched_node *node, u16 weight) +{ + struct ice_aqc_txsched_elem_data buf; + struct ice_aqc_txsched_elem *data; + + buf = node->info; + data = &buf.data; + + data->valid_sections = ICE_AQC_ELEM_VALID_CIR | ICE_AQC_ELEM_VALID_EIR | + ICE_AQC_ELEM_VALID_GENERIC; + data->cir_bw.bw_alloc = cpu_to_le16(weight); + data->eir_bw.bw_alloc = cpu_to_le16(weight); + + data->generic |= FIELD_PREP(ICE_AQC_ELEM_GENERIC_SP_M, 0x0); + + return ice_sched_update_elem(pi->hw, node, &buf); +} + /** * ice_sched_set_node_bw_lmt - set node's BW limit * @pi: port information structure @@ -3608,7 +3679,7 @@ ice_sched_set_node_bw(struct ice_port_info *pi, struct ice_sched_node *node, * It updates node's BW limit parameters like BW RL profile ID of type CIR, * EIR, or SRL. The caller needs to hold scheduler lock. */ -static int +int ice_sched_set_node_bw_lmt(struct ice_port_info *pi, struct ice_sched_node *node, enum ice_rl_type rl_type, u32 bw) { diff --git a/drivers/net/ethernet/intel/ice/ice_sched.h b/drivers/net/ethernet/intel/ice/ice_sched.h index 4f91577fed56b..920db43ed4fa6 100644 --- a/drivers/net/ethernet/intel/ice/ice_sched.h +++ b/drivers/net/ethernet/intel/ice/ice_sched.h @@ -6,6 +6,8 @@
#include "ice_common.h"
+#define SCHED_NODE_NAME_MAX_LEN 32 + #define ICE_QGRP_LAYER_OFFSET 2 #define ICE_VSI_LAYER_OFFSET 4 #define ICE_AGG_LAYER_OFFSET 6 @@ -69,6 +71,28 @@ int ice_aq_query_sched_elems(struct ice_hw *hw, u16 elems_req, struct ice_aqc_txsched_elem_data *buf, u16 buf_size, u16 *elems_ret, struct ice_sq_cd *cd); + +int +ice_sched_set_node_bw_lmt(struct ice_port_info *pi, struct ice_sched_node *node, + enum ice_rl_type rl_type, u32 bw); + +int +ice_sched_set_node_bw(struct ice_port_info *pi, struct ice_sched_node *node, + enum ice_rl_type rl_type, u32 bw, u8 layer_num); + +int +ice_sched_add_elems(struct ice_port_info *pi, struct ice_sched_node *tc_node, + struct ice_sched_node *parent, u8 layer, u16 num_nodes, + u16 *num_nodes_added, u32 *first_node_teid); + +int +ice_sched_move_nodes(struct ice_port_info *pi, struct ice_sched_node *parent, + u16 num_items, u32 *list); + +int ice_sched_set_node_priority(struct ice_port_info *pi, struct ice_sched_node *node, + u16 priority); +int ice_sched_set_node_weight(struct ice_port_info *pi, struct ice_sched_node *node, u16 weight); + int ice_sched_init_port(struct ice_port_info *pi); int ice_sched_query_res_alloc(struct ice_hw *hw); void ice_sched_get_psm_clk_freq(struct ice_hw *hw); @@ -82,6 +106,9 @@ ice_sched_find_node_by_teid(struct ice_sched_node *start_node, u32 teid); int ice_sched_add_node(struct ice_port_info *pi, u8 layer, struct ice_aqc_txsched_elem_data *info); +void +ice_sched_update_parent(struct ice_sched_node *new_parent, + struct ice_sched_node *node); void ice_free_sched_node(struct ice_port_info *pi, struct ice_sched_node *node); struct ice_sched_node *ice_sched_get_tc_node(struct ice_port_info *pi, u8 tc); struct ice_sched_node * diff --git a/drivers/net/ethernet/intel/ice/ice_type.h b/drivers/net/ethernet/intel/ice/ice_type.h index e1abfcee96dcd..daf86cf561bc7 100644 --- a/drivers/net/ethernet/intel/ice/ice_type.h +++ b/drivers/net/ethernet/intel/ice/ice_type.h @@ -524,7 +524,14 @@ struct ice_sched_node { struct ice_sched_node *sibling; /* next sibling in the same layer */ struct ice_sched_node **children; struct ice_aqc_txsched_elem_data info; + char *name; + struct devlink_rate *rate_node; + u64 tx_max; + u64 tx_share; u32 agg_id; /* aggregator group ID */ + u32 id; + u32 tx_priority; + u32 tx_weight; u16 vsi_handle; u8 in_use; /* suspended or in use */ u8 tx_sched_layer; /* Logical Layer (1-9) */ @@ -706,6 +713,7 @@ struct ice_port_info { /* List contain profile ID(s) and other params per layer */ struct list_head rl_prof_list[ICE_AQC_TOPO_MAX_LEVEL_NUM]; struct ice_qos_cfg qos_cfg; + struct xarray sched_node_ids; u8 is_vf:1; };
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Przemek Kitszel przemyslaw.kitszel@intel.com
[ Upstream commit ad667d626825383b626ad6ed38d6205618abb115 ]
We all know they are redundant: devm_kfree() and bitmap_free() both tolerate a NULL pointer, so checking for NULL before calling them serves no purpose.
Reviewed-by: Michal Swiatkowski michal.swiatkowski@linux.intel.com Reviewed-by: Michal Wilczynski michal.wilczynski@intel.com Reviewed-by: Simon Horman simon.horman@corigine.com Signed-off-by: Przemek Kitszel przemyslaw.kitszel@intel.com Tested-by: Arpana Arland arpanax.arland@intel.com (A Contingent worker at Intel) Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Stable-dep-of: adbf5a42341f ("ice: remove af_xdp_zc_qps bitmap") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/ice/ice_common.c | 6 +-- drivers/net/ethernet/intel/ice/ice_controlq.c | 3 +- drivers/net/ethernet/intel/ice/ice_flow.c | 23 ++-------- drivers/net/ethernet/intel/ice/ice_lib.c | 42 +++++++------------ drivers/net/ethernet/intel/ice/ice_sched.c | 11 ++--- drivers/net/ethernet/intel/ice/ice_switch.c | 19 +++------ 6 files changed, 29 insertions(+), 75 deletions(-)
diff --git a/drivers/net/ethernet/intel/ice/ice_common.c b/drivers/net/ethernet/intel/ice/ice_common.c index e2e661010176c..419052ebc3ae7 100644 --- a/drivers/net/ethernet/intel/ice/ice_common.c +++ b/drivers/net/ethernet/intel/ice/ice_common.c @@ -789,8 +789,7 @@ static void ice_cleanup_fltr_mgmt_struct(struct ice_hw *hw) devm_kfree(ice_hw_to_dev(hw), lst_itr); } } - if (recps[i].root_buf) - devm_kfree(ice_hw_to_dev(hw), recps[i].root_buf); + devm_kfree(ice_hw_to_dev(hw), recps[i].root_buf); } ice_rm_all_sw_replay_rule_info(hw); devm_kfree(ice_hw_to_dev(hw), sw->recp_list); @@ -986,8 +985,7 @@ static int ice_cfg_fw_log(struct ice_hw *hw, bool enable) }
out: - if (data) - devm_kfree(ice_hw_to_dev(hw), data); + devm_kfree(ice_hw_to_dev(hw), data);
return status; } diff --git a/drivers/net/ethernet/intel/ice/ice_controlq.c b/drivers/net/ethernet/intel/ice/ice_controlq.c index 6bcfee2959915..f68df8e05b18e 100644 --- a/drivers/net/ethernet/intel/ice/ice_controlq.c +++ b/drivers/net/ethernet/intel/ice/ice_controlq.c @@ -339,8 +339,7 @@ do { \ } \ } \ /* free the buffer info list */ \ - if ((qi)->ring.cmd_buf) \ - devm_kfree(ice_hw_to_dev(hw), (qi)->ring.cmd_buf); \ + devm_kfree(ice_hw_to_dev(hw), (qi)->ring.cmd_buf); \ /* free DMA head */ \ devm_kfree(ice_hw_to_dev(hw), (qi)->ring.dma_head); \ } while (0) diff --git a/drivers/net/ethernet/intel/ice/ice_flow.c b/drivers/net/ethernet/intel/ice/ice_flow.c index ef103e47a8dc2..85cca572c22a5 100644 --- a/drivers/net/ethernet/intel/ice/ice_flow.c +++ b/drivers/net/ethernet/intel/ice/ice_flow.c @@ -1303,23 +1303,6 @@ ice_flow_find_prof_id(struct ice_hw *hw, enum ice_block blk, u64 prof_id) return NULL; }
-/** - * ice_dealloc_flow_entry - Deallocate flow entry memory - * @hw: pointer to the HW struct - * @entry: flow entry to be removed - */ -static void -ice_dealloc_flow_entry(struct ice_hw *hw, struct ice_flow_entry *entry) -{ - if (!entry) - return; - - if (entry->entry) - devm_kfree(ice_hw_to_dev(hw), entry->entry); - - devm_kfree(ice_hw_to_dev(hw), entry); -} - /** * ice_flow_rem_entry_sync - Remove a flow entry * @hw: pointer to the HW struct @@ -1335,7 +1318,8 @@ ice_flow_rem_entry_sync(struct ice_hw *hw, enum ice_block __always_unused blk,
list_del(&entry->l_entry);
- ice_dealloc_flow_entry(hw, entry); + devm_kfree(ice_hw_to_dev(hw), entry->entry); + devm_kfree(ice_hw_to_dev(hw), entry);
return 0; } @@ -1662,8 +1646,7 @@ ice_flow_add_entry(struct ice_hw *hw, enum ice_block blk, u64 prof_id,
out: if (status && e) { - if (e->entry) - devm_kfree(ice_hw_to_dev(hw), e->entry); + devm_kfree(ice_hw_to_dev(hw), e->entry); devm_kfree(ice_hw_to_dev(hw), e); }
diff --git a/drivers/net/ethernet/intel/ice/ice_lib.c b/drivers/net/ethernet/intel/ice/ice_lib.c index cc6c04a69b285..cd161c03c5e39 100644 --- a/drivers/net/ethernet/intel/ice/ice_lib.c +++ b/drivers/net/ethernet/intel/ice/ice_lib.c @@ -320,31 +320,19 @@ static void ice_vsi_free_arrays(struct ice_vsi *vsi)
dev = ice_pf_to_dev(pf);
- if (vsi->af_xdp_zc_qps) { - bitmap_free(vsi->af_xdp_zc_qps); - vsi->af_xdp_zc_qps = NULL; - } + bitmap_free(vsi->af_xdp_zc_qps); + vsi->af_xdp_zc_qps = NULL; /* free the ring and vector containers */ - if (vsi->q_vectors) { - devm_kfree(dev, vsi->q_vectors); - vsi->q_vectors = NULL; - } - if (vsi->tx_rings) { - devm_kfree(dev, vsi->tx_rings); - vsi->tx_rings = NULL; - } - if (vsi->rx_rings) { - devm_kfree(dev, vsi->rx_rings); - vsi->rx_rings = NULL; - } - if (vsi->txq_map) { - devm_kfree(dev, vsi->txq_map); - vsi->txq_map = NULL; - } - if (vsi->rxq_map) { - devm_kfree(dev, vsi->rxq_map); - vsi->rxq_map = NULL; - } + devm_kfree(dev, vsi->q_vectors); + vsi->q_vectors = NULL; + devm_kfree(dev, vsi->tx_rings); + vsi->tx_rings = NULL; + devm_kfree(dev, vsi->rx_rings); + vsi->rx_rings = NULL; + devm_kfree(dev, vsi->txq_map); + vsi->txq_map = NULL; + devm_kfree(dev, vsi->rxq_map); + vsi->rxq_map = NULL; }
/** @@ -787,10 +775,8 @@ static void ice_rss_clean(struct ice_vsi *vsi)
dev = ice_pf_to_dev(pf);
- if (vsi->rss_hkey_user) - devm_kfree(dev, vsi->rss_hkey_user); - if (vsi->rss_lut_user) - devm_kfree(dev, vsi->rss_lut_user); + devm_kfree(dev, vsi->rss_hkey_user); + devm_kfree(dev, vsi->rss_lut_user);
ice_vsi_clean_rss_flow_fld(vsi); /* remove RSS replay list */ diff --git a/drivers/net/ethernet/intel/ice/ice_sched.c b/drivers/net/ethernet/intel/ice/ice_sched.c index 88e74835d0274..849b6c7f0506b 100644 --- a/drivers/net/ethernet/intel/ice/ice_sched.c +++ b/drivers/net/ethernet/intel/ice/ice_sched.c @@ -353,10 +353,7 @@ void ice_free_sched_node(struct ice_port_info *pi, struct ice_sched_node *node) node->sibling; }
- /* leaf nodes have no children */ - if (node->children) - devm_kfree(ice_hw_to_dev(hw), node->children); - + devm_kfree(ice_hw_to_dev(hw), node->children); kfree(node->name); xa_erase(&pi->sched_node_ids, node->id); devm_kfree(ice_hw_to_dev(hw), node); @@ -854,10 +851,8 @@ void ice_sched_cleanup_all(struct ice_hw *hw) if (!hw) return;
- if (hw->layer_info) { - devm_kfree(ice_hw_to_dev(hw), hw->layer_info); - hw->layer_info = NULL; - } + devm_kfree(ice_hw_to_dev(hw), hw->layer_info); + hw->layer_info = NULL;
ice_sched_clear_port(hw->port_info);
diff --git a/drivers/net/ethernet/intel/ice/ice_switch.c b/drivers/net/ethernet/intel/ice/ice_switch.c index 46b36851af460..5ea6365872571 100644 --- a/drivers/net/ethernet/intel/ice/ice_switch.c +++ b/drivers/net/ethernet/intel/ice/ice_switch.c @@ -1636,21 +1636,16 @@ ice_save_vsi_ctx(struct ice_hw *hw, u16 vsi_handle, struct ice_vsi_ctx *vsi) */ static void ice_clear_vsi_q_ctx(struct ice_hw *hw, u16 vsi_handle) { - struct ice_vsi_ctx *vsi; + struct ice_vsi_ctx *vsi = ice_get_vsi_ctx(hw, vsi_handle); u8 i;
- vsi = ice_get_vsi_ctx(hw, vsi_handle); if (!vsi) return; ice_for_each_traffic_class(i) { - if (vsi->lan_q_ctx[i]) { - devm_kfree(ice_hw_to_dev(hw), vsi->lan_q_ctx[i]); - vsi->lan_q_ctx[i] = NULL; - } - if (vsi->rdma_q_ctx[i]) { - devm_kfree(ice_hw_to_dev(hw), vsi->rdma_q_ctx[i]); - vsi->rdma_q_ctx[i] = NULL; - } + devm_kfree(ice_hw_to_dev(hw), vsi->lan_q_ctx[i]); + vsi->lan_q_ctx[i] = NULL; + devm_kfree(ice_hw_to_dev(hw), vsi->rdma_q_ctx[i]); + vsi->rdma_q_ctx[i] = NULL; } }
@@ -5525,9 +5520,7 @@ ice_add_adv_recipe(struct ice_hw *hw, struct ice_adv_lkup_elem *lkups, devm_kfree(ice_hw_to_dev(hw), fvit); }
- if (rm->root_buf) - devm_kfree(ice_hw_to_dev(hw), rm->root_buf); - + devm_kfree(ice_hw_to_dev(hw), rm->root_buf); kfree(rm);
err_free_lkup_exts:
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Larysa Zaremba larysa.zaremba@intel.com
[ Upstream commit adbf5a42341f6ea038d3626cd4437d9f0ad0b2dd ]
The referenced commit introduced a bitmap to distinguish between ZC and copy-mode AF_XDP queues, because xsk_get_pool_from_qid() does not do this for us.
The bitmap would be especially useful when restoring the previous state after a rebuild, if only it were not reallocated in the process. This leads to e.g. xdpsock dying after changing the number of queues.
Instead of preserving the bitmap during the rebuild, remove it completely and distinguish between ZC and copy-mode queues based on the presence of a device associated with the pool.
Fixes: e102db780e1c ("ice: track AF_XDP ZC enabled queues in bitmap") Reviewed-by: Przemek Kitszel przemyslaw.kitszel@intel.com Signed-off-by: Larysa Zaremba larysa.zaremba@intel.com Reviewed-by: Simon Horman horms@kernel.org Tested-by: Chandan Kumar Rout chandanx.rout@intel.com Signed-off-by: Jacob Keller jacob.e.keller@intel.com Link: https://lore.kernel.org/r/20240603-net-2024-05-30-intel-net-fixes-v2-3-e3563... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/ice/ice.h | 32 ++++++++++++++++-------- drivers/net/ethernet/intel/ice/ice_lib.c | 8 ------ drivers/net/ethernet/intel/ice/ice_xsk.c | 13 +++++----- 3 files changed, 27 insertions(+), 26 deletions(-)
diff --git a/drivers/net/ethernet/intel/ice/ice.h b/drivers/net/ethernet/intel/ice/ice.h index f2be383d97df5..6d75e5638f665 100644 --- a/drivers/net/ethernet/intel/ice/ice.h +++ b/drivers/net/ethernet/intel/ice/ice.h @@ -388,7 +388,6 @@ struct ice_vsi { struct ice_tc_cfg tc_cfg; struct bpf_prog *xdp_prog; struct ice_tx_ring **xdp_rings; /* XDP ring array */ - unsigned long *af_xdp_zc_qps; /* tracks AF_XDP ZC enabled qps */ u16 num_xdp_txq; /* Used XDP queues */ u8 xdp_mapping_mode; /* ICE_MAP_MODE_[CONTIG|SCATTER] */
@@ -688,6 +687,25 @@ static inline void ice_set_ring_xdp(struct ice_tx_ring *ring) ring->flags |= ICE_TX_FLAGS_RING_XDP; }
+/** + * ice_get_xp_from_qid - get ZC XSK buffer pool bound to a queue ID + * @vsi: pointer to VSI + * @qid: index of a queue to look at XSK buff pool presence + * + * Return: A pointer to xsk_buff_pool structure if there is a buffer pool + * attached and configured as zero-copy, NULL otherwise. + */ +static inline struct xsk_buff_pool *ice_get_xp_from_qid(struct ice_vsi *vsi, + u16 qid) +{ + struct xsk_buff_pool *pool = xsk_get_pool_from_qid(vsi->netdev, qid); + + if (!ice_is_xdp_ena_vsi(vsi)) + return NULL; + + return (pool && pool->dev) ? pool : NULL; +} + /** * ice_xsk_pool - get XSK buffer pool bound to a ring * @ring: Rx ring to use @@ -700,10 +718,7 @@ static inline struct xsk_buff_pool *ice_xsk_pool(struct ice_rx_ring *ring) struct ice_vsi *vsi = ring->vsi; u16 qid = ring->q_index;
- if (!ice_is_xdp_ena_vsi(vsi) || !test_bit(qid, vsi->af_xdp_zc_qps)) - return NULL; - - return xsk_get_pool_from_qid(vsi->netdev, qid); + return ice_get_xp_from_qid(vsi, qid); }
/** @@ -728,12 +743,7 @@ static inline void ice_tx_xsk_pool(struct ice_vsi *vsi, u16 qid) if (!ring) return;
- if (!ice_is_xdp_ena_vsi(vsi) || !test_bit(qid, vsi->af_xdp_zc_qps)) { - ring->xsk_pool = NULL; - return; - } - - ring->xsk_pool = xsk_get_pool_from_qid(vsi->netdev, qid); + ring->xsk_pool = ice_get_xp_from_qid(vsi, qid); }
/** diff --git a/drivers/net/ethernet/intel/ice/ice_lib.c b/drivers/net/ethernet/intel/ice/ice_lib.c index cd161c03c5e39..7661e735d0992 100644 --- a/drivers/net/ethernet/intel/ice/ice_lib.c +++ b/drivers/net/ethernet/intel/ice/ice_lib.c @@ -117,14 +117,8 @@ static int ice_vsi_alloc_arrays(struct ice_vsi *vsi) if (!vsi->q_vectors) goto err_vectors;
- vsi->af_xdp_zc_qps = bitmap_zalloc(max_t(int, vsi->alloc_txq, vsi->alloc_rxq), GFP_KERNEL); - if (!vsi->af_xdp_zc_qps) - goto err_zc_qps; - return 0;
-err_zc_qps: - devm_kfree(dev, vsi->q_vectors); err_vectors: devm_kfree(dev, vsi->rxq_map); err_rxq_map: @@ -320,8 +314,6 @@ static void ice_vsi_free_arrays(struct ice_vsi *vsi)
dev = ice_pf_to_dev(pf);
- bitmap_free(vsi->af_xdp_zc_qps); - vsi->af_xdp_zc_qps = NULL; /* free the ring and vector containers */ devm_kfree(dev, vsi->q_vectors); vsi->q_vectors = NULL; diff --git a/drivers/net/ethernet/intel/ice/ice_xsk.c b/drivers/net/ethernet/intel/ice/ice_xsk.c index 48cf24709fe32..b917f271cdac1 100644 --- a/drivers/net/ethernet/intel/ice/ice_xsk.c +++ b/drivers/net/ethernet/intel/ice/ice_xsk.c @@ -281,7 +281,6 @@ static int ice_xsk_pool_disable(struct ice_vsi *vsi, u16 qid) if (!pool) return -EINVAL;
- clear_bit(qid, vsi->af_xdp_zc_qps); xsk_pool_dma_unmap(pool, ICE_RX_DMA_ATTR);
return 0; @@ -312,8 +311,6 @@ ice_xsk_pool_enable(struct ice_vsi *vsi, struct xsk_buff_pool *pool, u16 qid) if (err) return err;
- set_bit(qid, vsi->af_xdp_zc_qps); - return 0; }
@@ -361,11 +358,13 @@ ice_realloc_rx_xdp_bufs(struct ice_rx_ring *rx_ring, bool pool_present) int ice_realloc_zc_buf(struct ice_vsi *vsi, bool zc) { struct ice_rx_ring *rx_ring; - unsigned long q; + uint i; + + ice_for_each_rxq(vsi, i) { + rx_ring = vsi->rx_rings[i]; + if (!rx_ring->xsk_pool) + continue;
- for_each_set_bit(q, vsi->af_xdp_zc_qps, - max_t(int, vsi->alloc_txq, vsi->alloc_rxq)) { - rx_ring = vsi->rx_rings[q]; if (ice_realloc_rx_xdp_bufs(rx_ring, zc)) return -ENOMEM; }
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Aleksandr Mishin amishin@t-argos.ru
[ Upstream commit b0c9a26435413b81799047a7be53255640432547 ]
In case region creation fails in ipc_devlink_create_region(), the deletion of previously created regions starts from a tainted pointer which actually holds an error code value. Fix this bug by decreasing the region index before deleting.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
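The one-line fix restores the standard partial-unwind idiom: on failure, index i points at the slot that now holds an error pointer, so tear-down has to start one slot earlier. A hedged sketch of the pattern with hypothetical helper names (create_region()/destroy_region() are not real APIs):

#include <linux/err.h>

struct region;
struct region *create_region(int idx);		/* hypothetical */
void destroy_region(struct region *r);		/* hypothetical */

static int example_create_all(struct region **regions, int num)
{
	int i, rc;

	for (i = 0; i < num; i++) {
		regions[i] = create_region(i);
		if (IS_ERR(regions[i])) {
			rc = PTR_ERR(regions[i]);
			/* regions[i] holds an error code, not a region, so
			 * unwinding must begin at i - 1. */
			for (i--; i >= 0; i--)
				destroy_region(regions[i]);
			return rc;
		}
	}

	return 0;
}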
Fixes: 4dcd183fbd67 ("net: wwan: iosm: devlink registration") Signed-off-by: Aleksandr Mishin amishin@t-argos.ru Acked-by: Sergey Ryazanov ryazanov.s.a@gmail.com Reviewed-by: Simon Horman horms@kernel.org Link: https://lore.kernel.org/r/20240604082500.20769-1-amishin@t-argos.ru Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wwan/iosm/iosm_ipc_devlink.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/wwan/iosm/iosm_ipc_devlink.c b/drivers/net/wwan/iosm/iosm_ipc_devlink.c index 2fe724d623c06..33c5a46f1b922 100644 --- a/drivers/net/wwan/iosm/iosm_ipc_devlink.c +++ b/drivers/net/wwan/iosm/iosm_ipc_devlink.c @@ -210,7 +210,7 @@ static int ipc_devlink_create_region(struct iosm_devlink *devlink) rc = PTR_ERR(devlink->cd_regions[i]); dev_err(devlink->dev, "Devlink region fail,err %d", rc); /* Delete previously created regions */ - for ( ; i >= 0; i--) + for (i--; i >= 0; i--) devlink_region_destroy(devlink->cd_regions[i]); goto region_create_fail; }
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kuniyuki Iwashima kuniyu@amazon.com
[ Upstream commit 26bfb8b57063f52b867f9b6c8d1742fcb5bd656c ]
When a SOCK_DGRAM socket connect()s to another socket, both sockets' sk->sk_state is changed to TCP_ESTABLISHED so that we can register them to BPF SOCKMAP.
When the socket disconnects from the peer by connect(AF_UNSPEC), the state is set back to TCP_CLOSE.
Then, the peer's state is also set to TCP_CLOSE, but the update is done locklessly and unconditionally.
Let's say socket A connect()ed to B, B connect()ed to C, and A disconnects from B.
After the first two connect()s, all three sockets' sk->sk_state are TCP_ESTABLISHED:
$ ss -xa
Netid State Recv-Q Send-Q Local Address:Port  Peer Address:PortProcess
u_dgr ESTAB 0      0                  @A 641             * 642
u_dgr ESTAB 0      0                  @B 642             * 643
u_dgr ESTAB 0      0                  @C 643             * 0
And after the disconnect, B's state is TCP_CLOSE even though it's still connected to C and C's state is TCP_ESTABLISHED.
$ ss -xa
Netid State  Recv-Q Send-Q Local Address:Port  Peer Address:PortProcess
u_dgr UNCONN 0      0                  @A 641             * 0
u_dgr UNCONN 0      0                  @B 642             * 643
u_dgr ESTAB  0      0                  @C 643             * 0
In this case, we cannot register B to SOCKMAP.
So, when a socket disconnects from its peer, we should not set the peer's state to TCP_CLOSE if the peer is connected to yet another socket, and this check must be done under unix_state_lock().
Note that we use WRITE_ONCE() for sk->sk_state as there are many lockless readers. These data-races will be fixed in the following patches.
Fixes: 83301b5367a9 ("af_unix: Set TCP_ESTABLISHED for datagram sockets too") Signed-off-by: Kuniyuki Iwashima kuniyu@amazon.com Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/unix/af_unix.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c index 7d2a3b42b456a..5d6203b6e25c3 100644 --- a/net/unix/af_unix.c +++ b/net/unix/af_unix.c @@ -558,7 +558,6 @@ static void unix_dgram_disconnected(struct sock *sk, struct sock *other) sk_error_report(other); } } - other->sk_state = TCP_CLOSE; }
static void unix_sock_destructor(struct sock *sk) @@ -1412,8 +1411,15 @@ static int unix_dgram_connect(struct socket *sock, struct sockaddr *addr,
unix_state_double_unlock(sk, other);
- if (other != old_peer) + if (other != old_peer) { unix_dgram_disconnected(sk, old_peer); + + unix_state_lock(old_peer); + if (!unix_peer(old_peer)) + WRITE_ONCE(old_peer->sk_state, TCP_CLOSE); + unix_state_unlock(old_peer); + } + sock_put(old_peer); } else { unix_peer(sk) = other;
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kuniyuki Iwashima kuniyu@amazon.com
[ Upstream commit 942238f9735a4a4ebf8274b218d9a910158941d1 ]
sk->sk_state is changed under unix_state_lock(), but it's read locklessly in many places.
This patch adds WRITE_ONCE() on the writer side.
We will add READ_ONCE() to the lockless readers in the following patches.
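A hedged sketch of the resulting pattern, using a made-up structure in place of struct sock: the writer still serialises state changes with a lock, while the _ONCE() accessors keep the compiler from tearing or re-reading the field for lockless readers:

#include <linux/compiler.h>
#include <linux/spinlock.h>

struct example_sock {
	spinlock_t	lock;		/* plays the role of unix_state_lock() */
	unsigned char	state;
};

static void example_set_state(struct example_sock *esk, unsigned char state)
{
	spin_lock(&esk->lock);
	WRITE_ONCE(esk->state, state);	/* paired with READ_ONCE() below */
	spin_unlock(&esk->lock);
}

static unsigned char example_peek_state(const struct example_sock *esk)
{
	return READ_ONCE(esk->state);	/* lockless reader */
}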
Fixes: 83301b5367a9 ("af_unix: Set TCP_ESTABLISHED for datagram sockets too") Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Kuniyuki Iwashima kuniyu@amazon.com Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/unix/af_unix.c | 14 ++++++++------ 1 file changed, 8 insertions(+), 6 deletions(-)
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c index 5d6203b6e25c3..358e80956cb7b 100644 --- a/net/unix/af_unix.c +++ b/net/unix/af_unix.c @@ -604,7 +604,7 @@ static void unix_release_sock(struct sock *sk, int embrion) u->path.dentry = NULL; u->path.mnt = NULL; state = sk->sk_state; - sk->sk_state = TCP_CLOSE; + WRITE_ONCE(sk->sk_state, TCP_CLOSE);
skpair = unix_peer(sk); unix_peer(sk) = NULL; @@ -726,7 +726,8 @@ static int unix_listen(struct socket *sock, int backlog) if (backlog > sk->sk_max_ack_backlog) wake_up_interruptible_all(&u->peer_wait); sk->sk_max_ack_backlog = backlog; - sk->sk_state = TCP_LISTEN; + WRITE_ONCE(sk->sk_state, TCP_LISTEN); + /* set credentials so connect can copy them */ init_peercred(sk); err = 0; @@ -1389,7 +1390,8 @@ static int unix_dgram_connect(struct socket *sock, struct sockaddr *addr, if (err) goto out_unlock;
- sk->sk_state = other->sk_state = TCP_ESTABLISHED; + WRITE_ONCE(sk->sk_state, TCP_ESTABLISHED); + WRITE_ONCE(other->sk_state, TCP_ESTABLISHED); } else { /* * 1003.1g breaking connected state with AF_UNSPEC @@ -1406,7 +1408,7 @@ static int unix_dgram_connect(struct socket *sock, struct sockaddr *addr,
unix_peer(sk) = other; if (!other) - sk->sk_state = TCP_CLOSE; + WRITE_ONCE(sk->sk_state, TCP_CLOSE); unix_dgram_peer_wake_disconnect_wakeup(sk, old_peer);
unix_state_double_unlock(sk, other); @@ -1620,7 +1622,7 @@ static int unix_stream_connect(struct socket *sock, struct sockaddr *uaddr, copy_peercred(sk, other);
sock->state = SS_CONNECTED; - sk->sk_state = TCP_ESTABLISHED; + WRITE_ONCE(sk->sk_state, TCP_ESTABLISHED); sock_hold(newsk);
smp_mb__after_atomic(); /* sock_hold() does an atomic_inc() */ @@ -2015,7 +2017,7 @@ static int unix_dgram_sendmsg(struct socket *sock, struct msghdr *msg, unix_peer(sk) = NULL; unix_dgram_peer_wake_disconnect_wakeup(sk, other);
- sk->sk_state = TCP_CLOSE; + WRITE_ONCE(sk->sk_state, TCP_CLOSE); unix_state_unlock(sk);
unix_dgram_disconnected(sk, other);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kuniyuki Iwashima kuniyu@amazon.com
[ Upstream commit 3a0f38eb285c8c2eead4b3230c7ac2983707599d ]
ioctl(SIOCINQ) calls unix_inq_len() that checks sk->sk_state first and returns -EINVAL if it's TCP_LISTEN.
Then, for SOCK_STREAM sockets, unix_inq_len() returns the number of bytes in recvq.
However, unix_inq_len() does not hold unix_state_lock(), and a concurrent listen() might change the state after checking sk->sk_state.
If the race occurs, 0 is returned for the listener, instead of -EINVAL, because the length of skb with embryo is 0.
We could hold unix_state_lock() in unix_inq_len(), but it's overkill given the same result also holds for the pre-listen() TCP_CLOSE state.
So, let's use READ_ONCE() for sk->sk_state in unix_inq_len().
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Kuniyuki Iwashima kuniyu@amazon.com Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/unix/af_unix.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c index 358e80956cb7b..6d7e1cd97c52e 100644 --- a/net/unix/af_unix.c +++ b/net/unix/af_unix.c @@ -3068,7 +3068,7 @@ long unix_inq_len(struct sock *sk) struct sk_buff *skb; long amount = 0;
- if (sk->sk_state == TCP_LISTEN) + if (READ_ONCE(sk->sk_state) == TCP_LISTEN) return -EINVAL;
spin_lock(&sk->sk_receive_queue.lock);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kuniyuki Iwashima kuniyu@amazon.com
[ Upstream commit eb0718fb3e97ad0d6f4529b810103451c90adf94 ]
unix_poll() and unix_dgram_poll() read sk->sk_state locklessly and call unix_writable(), which also reads sk->sk_state without holding unix_state_lock().
Let's use READ_ONCE() in unix_poll() and unix_dgram_poll() and pass it to unix_writable().
While at it, we remove the TCP_SYN_SENT check in unix_dgram_poll() as that state has never existed for AF_UNIX sockets since the code was added.
Fixes: 1586a5877db9 ("af_unix: do not report POLLOUT on listeners") Fixes: 3c73419c09a5 ("af_unix: fix 'poll for write'/ connected DGRAM sockets") Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Kuniyuki Iwashima kuniyu@amazon.com Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/unix/af_unix.c | 25 ++++++++++++------------- 1 file changed, 12 insertions(+), 13 deletions(-)
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c index 6d7e1cd97c52e..67a2a0e842e4f 100644 --- a/net/unix/af_unix.c +++ b/net/unix/af_unix.c @@ -518,9 +518,9 @@ static int unix_dgram_peer_wake_me(struct sock *sk, struct sock *other) return 0; }
-static int unix_writable(const struct sock *sk) +static int unix_writable(const struct sock *sk, unsigned char state) { - return sk->sk_state != TCP_LISTEN && + return state != TCP_LISTEN && (refcount_read(&sk->sk_wmem_alloc) << 2) <= sk->sk_sndbuf; }
@@ -529,7 +529,7 @@ static void unix_write_space(struct sock *sk) struct socket_wq *wq;
rcu_read_lock(); - if (unix_writable(sk)) { + if (unix_writable(sk, READ_ONCE(sk->sk_state))) { wq = rcu_dereference(sk->sk_wq); if (skwq_has_sleeper(wq)) wake_up_interruptible_sync_poll(&wq->wait, @@ -3180,12 +3180,14 @@ static int unix_compat_ioctl(struct socket *sock, unsigned int cmd, unsigned lon static __poll_t unix_poll(struct file *file, struct socket *sock, poll_table *wait) { struct sock *sk = sock->sk; + unsigned char state; __poll_t mask; u8 shutdown;
sock_poll_wait(file, sock, wait); mask = 0; shutdown = READ_ONCE(sk->sk_shutdown); + state = READ_ONCE(sk->sk_state);
/* exceptional events? */ if (sk->sk_err) @@ -3207,14 +3209,14 @@ static __poll_t unix_poll(struct file *file, struct socket *sock, poll_table *wa
/* Connection-based need to check for termination and startup */ if ((sk->sk_type == SOCK_STREAM || sk->sk_type == SOCK_SEQPACKET) && - sk->sk_state == TCP_CLOSE) + state == TCP_CLOSE) mask |= EPOLLHUP;
/* * we set writable also when the other side has shut down the * connection. This prevents stuck sockets. */ - if (unix_writable(sk)) + if (unix_writable(sk, state)) mask |= EPOLLOUT | EPOLLWRNORM | EPOLLWRBAND;
return mask; @@ -3225,12 +3227,14 @@ static __poll_t unix_dgram_poll(struct file *file, struct socket *sock, { struct sock *sk = sock->sk, *other; unsigned int writable; + unsigned char state; __poll_t mask; u8 shutdown;
sock_poll_wait(file, sock, wait); mask = 0; shutdown = READ_ONCE(sk->sk_shutdown); + state = READ_ONCE(sk->sk_state);
/* exceptional events? */ if (sk->sk_err || !skb_queue_empty_lockless(&sk->sk_error_queue)) @@ -3249,19 +3253,14 @@ static __poll_t unix_dgram_poll(struct file *file, struct socket *sock, mask |= EPOLLIN | EPOLLRDNORM;
/* Connection-based need to check for termination and startup */ - if (sk->sk_type == SOCK_SEQPACKET) { - if (sk->sk_state == TCP_CLOSE) - mask |= EPOLLHUP; - /* connection hasn't started yet? */ - if (sk->sk_state == TCP_SYN_SENT) - return mask; - } + if (sk->sk_type == SOCK_SEQPACKET && state == TCP_CLOSE) + mask |= EPOLLHUP;
/* No write status requested, avoid expensive OUT tests. */ if (!(poll_requested_events(wait) & (EPOLLWRBAND|EPOLLWRNORM|EPOLLOUT))) return mask;
- writable = unix_writable(sk); + writable = unix_writable(sk, state); if (writable) { unix_state_lock(sk);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kuniyuki Iwashima kuniyu@amazon.com
[ Upstream commit a9bf9c7dc6a5899c01cb8f6e773a66315a5cd4b7 ]
As a small optimisation, unix_stream_connect() prefetches the client's sk->sk_state without unix_state_lock() and checks if it's TCP_CLOSE.
Later, sk->sk_state is checked again under unix_state_lock().
Let's use READ_ONCE() for the first check and TCP_CLOSE directly for the second check.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Kuniyuki Iwashima kuniyu@amazon.com Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/unix/af_unix.c | 7 ++----- 1 file changed, 2 insertions(+), 5 deletions(-)
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c index 67a2a0e842e4f..6e03b364bb727 100644 --- a/net/unix/af_unix.c +++ b/net/unix/af_unix.c @@ -1469,7 +1469,6 @@ static int unix_stream_connect(struct socket *sock, struct sockaddr *uaddr, struct sk_buff *skb = NULL; long timeo; int err; - int st;
err = unix_validate_addr(sunaddr, addr_len); if (err) @@ -1553,9 +1552,7 @@ static int unix_stream_connect(struct socket *sock, struct sockaddr *uaddr,
Well, and we have to recheck the state after socket locked. */ - st = sk->sk_state; - - switch (st) { + switch (READ_ONCE(sk->sk_state)) { case TCP_CLOSE: /* This is ok... continue with connect */ break; @@ -1570,7 +1567,7 @@ static int unix_stream_connect(struct socket *sock, struct sockaddr *uaddr,
unix_state_lock_nested(sk, U_LOCK_SECOND);
- if (sk->sk_state != st) { + if (sk->sk_state != TCP_CLOSE) { unix_state_unlock(sk); unix_state_unlock(other); sock_put(other);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kuniyuki Iwashima kuniyu@amazon.com
[ Upstream commit 8a34d4e8d9742a24f74998f45a6a98edd923319b ]
The following functions read sk->sk_state locklessly and proceed only if the state is TCP_ESTABLISHED.
* unix_stream_sendmsg * unix_stream_read_generic * unix_seqpacket_sendmsg * unix_seqpacket_recvmsg
Let's use READ_ONCE() there.
Fixes: a05d2ad1c1f3 ("af_unix: Only allow recv on connected seqpacket sockets.") Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Kuniyuki Iwashima kuniyu@amazon.com Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/unix/af_unix.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c index 6e03b364bb727..2e25d9eaa82ea 100644 --- a/net/unix/af_unix.c +++ b/net/unix/af_unix.c @@ -2190,7 +2190,7 @@ static int unix_stream_sendmsg(struct socket *sock, struct msghdr *msg, }
if (msg->msg_namelen) { - err = sk->sk_state == TCP_ESTABLISHED ? -EISCONN : -EOPNOTSUPP; + err = READ_ONCE(sk->sk_state) == TCP_ESTABLISHED ? -EISCONN : -EOPNOTSUPP; goto out_err; } else { err = -ENOTCONN; @@ -2402,7 +2402,7 @@ static int unix_seqpacket_sendmsg(struct socket *sock, struct msghdr *msg, if (err) return err;
- if (sk->sk_state != TCP_ESTABLISHED) + if (READ_ONCE(sk->sk_state) != TCP_ESTABLISHED) return -ENOTCONN;
if (msg->msg_namelen) @@ -2416,7 +2416,7 @@ static int unix_seqpacket_recvmsg(struct socket *sock, struct msghdr *msg, { struct sock *sk = sock->sk;
- if (sk->sk_state != TCP_ESTABLISHED) + if (READ_ONCE(sk->sk_state) != TCP_ESTABLISHED) return -ENOTCONN;
return unix_dgram_recvmsg(sock, msg, size, flags); @@ -2740,7 +2740,7 @@ static int unix_stream_read_generic(struct unix_stream_read_state *state, size_t size = state->size; unsigned int last_len;
- if (unlikely(sk->sk_state != TCP_ESTABLISHED)) { + if (unlikely(READ_ONCE(sk->sk_state) != TCP_ESTABLISHED)) { err = -EINVAL; goto out; }
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kuniyuki Iwashima kuniyu@amazon.com
[ Upstream commit af4c733b6b1aded4dc808fafece7dfe6e9d2ebb3 ]
unix_stream_read_skb() is called from sk->sk_data_ready() context where unix_state_lock() is not held.
Let's use READ_ONCE() there.
Fixes: 77462de14a43 ("af_unix: Add read_sock for stream socket types") Signed-off-by: Kuniyuki Iwashima kuniyu@amazon.com Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/unix/af_unix.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c index 2e25d9eaa82ea..f6ba015fffd2f 100644 --- a/net/unix/af_unix.c +++ b/net/unix/af_unix.c @@ -2716,7 +2716,7 @@ static struct sk_buff *manage_oob(struct sk_buff *skb, struct sock *sk,
static int unix_stream_read_skb(struct sock *sk, skb_read_actor_t recv_actor) { - if (unlikely(sk->sk_state != TCP_ESTABLISHED)) + if (unlikely(READ_ONCE(sk->sk_state) != TCP_ESTABLISHED)) return -ENOTCONN;
return unix_read_skb(sk, recv_actor);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kuniyuki Iwashima kuniyu@amazon.com
[ Upstream commit 0aa3be7b3e1f8f997312cc4705f8165e02806f8f ]
While dumping AF_UNIX sockets via UNIX_DIAG, sk->sk_state is read locklessly.
Let's use READ_ONCE() there.
Note that the result could be inconsistent if the socket is dumped during the state change. This is common for other SOCK_DIAG and similar interfaces.
Fixes: c9da99e6475f ("unix_diag: Fixup RQLEN extension report") Fixes: 2aac7a2cb0d9 ("unix_diag: Pending connections IDs NLA") Fixes: 45a96b9be6ec ("unix_diag: Dumping all sockets core") Signed-off-by: Kuniyuki Iwashima kuniyu@amazon.com Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/unix/diag.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/net/unix/diag.c b/net/unix/diag.c index 3438b7af09af5..9151c72e742fc 100644 --- a/net/unix/diag.c +++ b/net/unix/diag.c @@ -65,7 +65,7 @@ static int sk_diag_dump_icons(struct sock *sk, struct sk_buff *nlskb) u32 *buf; int i;
- if (sk->sk_state == TCP_LISTEN) { + if (READ_ONCE(sk->sk_state) == TCP_LISTEN) { spin_lock(&sk->sk_receive_queue.lock);
attr = nla_reserve(nlskb, UNIX_DIAG_ICONS, @@ -103,7 +103,7 @@ static int sk_diag_show_rqlen(struct sock *sk, struct sk_buff *nlskb) { struct unix_diag_rqlen rql;
- if (sk->sk_state == TCP_LISTEN) { + if (READ_ONCE(sk->sk_state) == TCP_LISTEN) { rql.udiag_rqueue = sk->sk_receive_queue.qlen; rql.udiag_wqueue = sk->sk_max_ack_backlog; } else { @@ -136,7 +136,7 @@ static int sk_diag_fill(struct sock *sk, struct sk_buff *skb, struct unix_diag_r rep = nlmsg_data(nlh); rep->udiag_family = AF_UNIX; rep->udiag_type = sk->sk_type; - rep->udiag_state = sk->sk_state; + rep->udiag_state = READ_ONCE(sk->sk_state); rep->pad = 0; rep->udiag_ino = sk_ino; sock_diag_save_cookie(sk, rep->udiag_cookie); @@ -215,7 +215,7 @@ static int unix_diag_dump(struct sk_buff *skb, struct netlink_callback *cb) sk_for_each(sk, &net->unx.table.buckets[slot]) { if (num < s_num) goto next; - if (!(req->udiag_states & (1 << sk->sk_state))) + if (!(req->udiag_states & (1 << READ_ONCE(sk->sk_state)))) goto next; if (sk_diag_dump(sk, skb, req, sk_user_ns(skb->sk), NETLINK_CB(cb->skb).portid,
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kuniyuki Iwashima kuniyu@amazon.com
[ Upstream commit bd9f2d05731f6a112d0c7391a0d537bfc588dbe6 ]
net->unx.sysctl_max_dgram_qlen is exposed as a sysctl knob and can be changed concurrently.
Let's use READ_ONCE() in unix_create1().
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Kuniyuki Iwashima kuniyu@amazon.com Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/unix/af_unix.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c index f6ba015fffd2f..5cffbd0661406 100644 --- a/net/unix/af_unix.c +++ b/net/unix/af_unix.c @@ -966,7 +966,7 @@ static struct sock *unix_create1(struct net *net, struct socket *sock, int kern, sk->sk_hash = unix_unbound_hash(sk); sk->sk_allocation = GFP_KERNEL_ACCOUNT; sk->sk_write_space = unix_write_space; - sk->sk_max_ack_backlog = net->unx.sysctl_max_dgram_qlen; + sk->sk_max_ack_backlog = READ_ONCE(net->unx.sysctl_max_dgram_qlen); sk->sk_destruct = unix_sock_destructor; u = unix_sk(sk); u->inflight = 0;
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kuniyuki Iwashima kuniyu@amazon.com
[ Upstream commit 45d872f0e65593176d880ec148f41ad7c02e40a7 ]
Once sk->sk_state is changed to TCP_LISTEN, it never changes.
unix_accept() takes advantage of this characteristic; it does not hold the listener's unix_state_lock() and only acquires the recvq lock to pop one skb.
It means unix_state_lock() does not prevent the queue length from changing in unix_stream_connect().
Thus, we need to use unix_recvq_full_lockless() to avoid a data race.
Now we remove unix_recvq_full() as no one uses it.
Note that we can remove READ_ONCE() for sk->sk_max_ack_backlog in unix_recvq_full_lockless() because of the following reasons:
(1) For SOCK_DGRAM, it is a written-once field in unix_create1()
(2) For SOCK_STREAM and SOCK_SEQPACKET, it is changed under the listener's unix_state_lock() in unix_listen(), and we hold the lock in unix_stream_connect()
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Kuniyuki Iwashima kuniyu@amazon.com Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/unix/af_unix.c | 10 ++-------- 1 file changed, 2 insertions(+), 8 deletions(-)
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c index 5cffbd0661406..359d4f604ebda 100644 --- a/net/unix/af_unix.c +++ b/net/unix/af_unix.c @@ -221,15 +221,9 @@ static inline int unix_may_send(struct sock *sk, struct sock *osk) return unix_peer(osk) == NULL || unix_our_peer(sk, osk); }
-static inline int unix_recvq_full(const struct sock *sk) -{ - return skb_queue_len(&sk->sk_receive_queue) > sk->sk_max_ack_backlog; -} - static inline int unix_recvq_full_lockless(const struct sock *sk) { - return skb_queue_len_lockless(&sk->sk_receive_queue) > - READ_ONCE(sk->sk_max_ack_backlog); + return skb_queue_len_lockless(&sk->sk_receive_queue) > sk->sk_max_ack_backlog; }
struct sock *unix_peer_get(struct sock *s) @@ -1527,7 +1521,7 @@ static int unix_stream_connect(struct socket *sock, struct sockaddr *uaddr, if (other->sk_shutdown & RCV_SHUTDOWN) goto out_unlock;
- if (unix_recvq_full(other)) { + if (unix_recvq_full_lockless(other)) { err = -EAGAIN; if (!timeo) goto out_unlock;
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Dumazet edumazet@google.com
[ Upstream commit cc04410af7de348234ac36a5f50c4ce416efdb4b ]
unix_poll() and unix_dgram_poll() read sk->sk_err without any lock held.
Add relevant READ_ONCE()/WRITE_ONCE() annotations.
Signed-off-by: Eric Dumazet edumazet@google.com Signed-off-by: David S. Miller davem@davemloft.net Stable-dep-of: 83690b82d228 ("af_unix: Use skb_queue_empty_lockless() in unix_release_sock().") Signed-off-by: Sasha Levin sashal@kernel.org --- net/unix/af_unix.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c index 359d4f604ebda..02d8612385bd9 100644 --- a/net/unix/af_unix.c +++ b/net/unix/af_unix.c @@ -548,7 +548,7 @@ static void unix_dgram_disconnected(struct sock *sk, struct sock *other) * when peer was not connected to us. */ if (!sock_flag(other, SOCK_DEAD) && unix_peer(other) == sk) { - other->sk_err = ECONNRESET; + WRITE_ONCE(other->sk_err, ECONNRESET); sk_error_report(other); } } @@ -620,7 +620,7 @@ static void unix_release_sock(struct sock *sk, int embrion) /* No more writes */ WRITE_ONCE(skpair->sk_shutdown, SHUTDOWN_MASK); if (!skb_queue_empty(&sk->sk_receive_queue) || embrion) - skpair->sk_err = ECONNRESET; + WRITE_ONCE(skpair->sk_err, ECONNRESET); unix_state_unlock(skpair); skpair->sk_state_change(skpair); sk_wake_async(skpair, SOCK_WAKE_WAITD, POLL_HUP); @@ -3181,7 +3181,7 @@ static __poll_t unix_poll(struct file *file, struct socket *sock, poll_table *wa state = READ_ONCE(sk->sk_state);
/* exceptional events? */ - if (sk->sk_err) + if (READ_ONCE(sk->sk_err)) mask |= EPOLLERR; if (shutdown == SHUTDOWN_MASK) mask |= EPOLLHUP; @@ -3228,7 +3228,8 @@ static __poll_t unix_dgram_poll(struct file *file, struct socket *sock, state = READ_ONCE(sk->sk_state);
/* exceptional events? */ - if (sk->sk_err || !skb_queue_empty_lockless(&sk->sk_error_queue)) + if (READ_ONCE(sk->sk_err) || + !skb_queue_empty_lockless(&sk->sk_error_queue)) mask |= EPOLLERR | (sock_flag(sk, SOCK_SELECT_ERR_QUEUE) ? EPOLLPRI : 0);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kuniyuki Iwashima kuniyu@amazon.com
[ Upstream commit 83690b82d228b3570565ebd0b41873933238b97f ]
If the socket type is SOCK_STREAM or SOCK_SEQPACKET, unix_release_sock() checks the length of the peer socket's recvq under unix_state_lock().
However, unix_stream_read_generic() calls skb_unlink() after releasing the lock. Also, for SOCK_SEQPACKET, __skb_try_recv_datagram() unlinks skb without unix_state_lock().
Thus, unix_state_lock() does not protect qlen.
Let's use skb_queue_empty_lockless() in unix_release_sock().
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Kuniyuki Iwashima kuniyu@amazon.com Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/unix/af_unix.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c index 02d8612385bd9..bb94a67229aa3 100644 --- a/net/unix/af_unix.c +++ b/net/unix/af_unix.c @@ -619,7 +619,7 @@ static void unix_release_sock(struct sock *sk, int embrion) unix_state_lock(skpair); /* No more writes */ WRITE_ONCE(skpair->sk_shutdown, SHUTDOWN_MASK); - if (!skb_queue_empty(&sk->sk_receive_queue) || embrion) + if (!skb_queue_empty_lockless(&sk->sk_receive_queue) || embrion) WRITE_ONCE(skpair->sk_err, ECONNRESET); unix_state_unlock(skpair); skpair->sk_state_change(skpair);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kuniyuki Iwashima kuniyu@amazon.com
[ Upstream commit 5d915e584d8408211d4567c22685aae8820bfc55 ]
We can dump the socket queue length via UNIX_DIAG by specifying UDIAG_SHOW_RQLEN.
If sk->sk_state is TCP_LISTEN, we return the recv queue length, but here we do not hold recvq lock.
Let's use skb_queue_len_lockless() in sk_diag_show_rqlen().
Fixes: c9da99e6475f ("unix_diag: Fixup RQLEN extension report") Signed-off-by: Kuniyuki Iwashima kuniyu@amazon.com Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/unix/diag.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/unix/diag.c b/net/unix/diag.c index 9151c72e742fc..fc56244214c30 100644 --- a/net/unix/diag.c +++ b/net/unix/diag.c @@ -104,7 +104,7 @@ static int sk_diag_show_rqlen(struct sock *sk, struct sk_buff *nlskb) struct unix_diag_rqlen rql;
if (READ_ONCE(sk->sk_state) == TCP_LISTEN) { - rql.udiag_rqueue = sk->sk_receive_queue.qlen; + rql.udiag_rqueue = skb_queue_len_lockless(&sk->sk_receive_queue); rql.udiag_wqueue = sk->sk_max_ack_backlog; } else { rql.udiag_rqueue = (u32) unix_inq_len(sk);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kuniyuki Iwashima kuniyu@amazon.com
[ Upstream commit efaf24e30ec39ebbea9112227485805a48b0ceb1 ]
While dumping sockets via UNIX_DIAG, we do not hold unix_state_lock().
Let's use READ_ONCE() to read sk->sk_shutdown.
Fixes: e4e541a84863 ("sock-diag: Report shutdown for inet and unix sockets (v2)") Signed-off-by: Kuniyuki Iwashima kuniyu@amazon.com Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/unix/diag.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/unix/diag.c b/net/unix/diag.c index fc56244214c30..1de7500b41b61 100644 --- a/net/unix/diag.c +++ b/net/unix/diag.c @@ -165,7 +165,7 @@ static int sk_diag_fill(struct sock *sk, struct sk_buff *skb, struct unix_diag_r sock_diag_put_meminfo(sk, skb, UNIX_DIAG_MEMINFO)) goto out_nlmsg_trim;
- if (nla_put_u8(skb, UNIX_DIAG_SHUTDOWN, sk->sk_shutdown)) + if (nla_put_u8(skb, UNIX_DIAG_SHUTDOWN, READ_ONCE(sk->sk_shutdown))) goto out_nlmsg_trim;
if ((req->udiag_show & UDIAG_SHOW_UID) &&
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Dumazet edumazet@google.com
[ Upstream commit b01e1c030770ff3b4fe37fc7cc6bca03f594133f ]
syzbot found a race in __fib6_drop_pcpu_from() [1]
If the compiler reads (*ppcpu_rt) more than once, the second read could return NULL if another cpu clears the value in rt6_get_pcpu_route().
Add a READ_ONCE() to prevent this race.
Also add rcu_read_lock()/rcu_read_unlock() because we rely on RCU protection while dereferencing pcpu_rt.
[1]
Oops: general protection fault, probably for non-canonical address 0xdffffc0000000012: 0000 [#1] PREEMPT SMP KASAN PTI KASAN: null-ptr-deref in range [0x0000000000000090-0x0000000000000097] CPU: 0 PID: 7543 Comm: kworker/u8:17 Not tainted 6.10.0-rc1-syzkaller-00013-g2bfcfd584ff5 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 04/02/2024 Workqueue: netns cleanup_net RIP: 0010:__fib6_drop_pcpu_from.part.0+0x10a/0x370 net/ipv6/ip6_fib.c:984 Code: f8 48 c1 e8 03 80 3c 28 00 0f 85 16 02 00 00 4d 8b 3f 4d 85 ff 74 31 e8 74 a7 fa f7 49 8d bf 90 00 00 00 48 89 f8 48 c1 e8 03 <80> 3c 28 00 0f 85 1e 02 00 00 49 8b 87 90 00 00 00 48 8b 0c 24 48 RSP: 0018:ffffc900040df070 EFLAGS: 00010206 RAX: 0000000000000012 RBX: 0000000000000001 RCX: ffffffff89932e16 RDX: ffff888049dd1e00 RSI: ffffffff89932d7c RDI: 0000000000000091 RBP: dffffc0000000000 R08: 0000000000000005 R09: 0000000000000007 R10: 0000000000000001 R11: 0000000000000006 R12: ffff88807fa080b8 R13: fffffbfff1a9a07d R14: ffffed100ff41022 R15: 0000000000000001 FS: 0000000000000000(0000) GS:ffff8880b9200000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000001b32c26000 CR3: 000000005d56e000 CR4: 00000000003526f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> __fib6_drop_pcpu_from net/ipv6/ip6_fib.c:966 [inline] fib6_drop_pcpu_from net/ipv6/ip6_fib.c:1027 [inline] fib6_purge_rt+0x7f2/0x9f0 net/ipv6/ip6_fib.c:1038 fib6_del_route net/ipv6/ip6_fib.c:1998 [inline] fib6_del+0xa70/0x17b0 net/ipv6/ip6_fib.c:2043 fib6_clean_node+0x426/0x5b0 net/ipv6/ip6_fib.c:2205 fib6_walk_continue+0x44f/0x8d0 net/ipv6/ip6_fib.c:2127 fib6_walk+0x182/0x370 net/ipv6/ip6_fib.c:2175 fib6_clean_tree+0xd7/0x120 net/ipv6/ip6_fib.c:2255 __fib6_clean_all+0x100/0x2d0 net/ipv6/ip6_fib.c:2271 rt6_sync_down_dev net/ipv6/route.c:4906 [inline] rt6_disable_ip+0x7ed/0xa00 net/ipv6/route.c:4911 addrconf_ifdown.isra.0+0x117/0x1b40 net/ipv6/addrconf.c:3855 addrconf_notify+0x223/0x19e0 net/ipv6/addrconf.c:3778 notifier_call_chain+0xb9/0x410 kernel/notifier.c:93 call_netdevice_notifiers_info+0xbe/0x140 net/core/dev.c:1992 call_netdevice_notifiers_extack net/core/dev.c:2030 [inline] call_netdevice_notifiers net/core/dev.c:2044 [inline] dev_close_many+0x333/0x6a0 net/core/dev.c:1585 unregister_netdevice_many_notify+0x46d/0x19f0 net/core/dev.c:11193 unregister_netdevice_many net/core/dev.c:11276 [inline] default_device_exit_batch+0x85b/0xae0 net/core/dev.c:11759 ops_exit_list+0x128/0x180 net/core/net_namespace.c:178 cleanup_net+0x5b7/0xbf0 net/core/net_namespace.c:640 process_one_work+0x9fb/0x1b60 kernel/workqueue.c:3231 process_scheduled_works kernel/workqueue.c:3312 [inline] worker_thread+0x6c8/0xf70 kernel/workqueue.c:3393 kthread+0x2c1/0x3a0 kernel/kthread.c:389 ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
Fixes: d52d3997f843 ("ipv6: Create percpu rt6_info") Signed-off-by: Eric Dumazet edumazet@google.com Cc: Martin KaFai Lau kafai@fb.com Link: https://lore.kernel.org/r/20240604193549.981839-1-edumazet@google.com Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/ipv6/ip6_fib.c | 6 +++++- net/ipv6/route.c | 1 + 2 files changed, 6 insertions(+), 1 deletion(-)
diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c index 8213626434b91..1123594ad2be7 100644 --- a/net/ipv6/ip6_fib.c +++ b/net/ipv6/ip6_fib.c @@ -962,6 +962,7 @@ static void __fib6_drop_pcpu_from(struct fib6_nh *fib6_nh, if (!fib6_nh->rt6i_pcpu) return;
+ rcu_read_lock(); /* release the reference to this fib entry from * all of its cached pcpu routes */ @@ -970,7 +971,9 @@ static void __fib6_drop_pcpu_from(struct fib6_nh *fib6_nh, struct rt6_info *pcpu_rt;
ppcpu_rt = per_cpu_ptr(fib6_nh->rt6i_pcpu, cpu); - pcpu_rt = *ppcpu_rt; + + /* Paired with xchg() in rt6_get_pcpu_route() */ + pcpu_rt = READ_ONCE(*ppcpu_rt);
/* only dropping the 'from' reference if the cached route * is using 'match'. The cached pcpu_rt->from only changes @@ -984,6 +987,7 @@ static void __fib6_drop_pcpu_from(struct fib6_nh *fib6_nh, fib6_info_release(from); } } + rcu_read_unlock(); }
struct fib6_nh_pcpu_arg { diff --git a/net/ipv6/route.c b/net/ipv6/route.c index 258e87055836f..627431722f9d6 100644 --- a/net/ipv6/route.c +++ b/net/ipv6/route.c @@ -1401,6 +1401,7 @@ static struct rt6_info *rt6_get_pcpu_route(const struct fib6_result *res) struct rt6_info *prev, **p;
p = this_cpu_ptr(res->nh->rt6i_pcpu); + /* Paired with READ_ONCE() in __fib6_drop_pcpu_from() */ prev = xchg(p, NULL); if (prev) { dst_dev_put(&prev->dst);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Johan Hovold johan+linaro@kernel.org
[ Upstream commit 32868e126c78876a8a5ddfcb6ac8cb2fffcf4d27 ]
Qualcomm Bluetooth controllers may not have been provisioned with a valid device address and instead end up using the default address 00:00:00:00:5a:ad.
This was previously believed to be due to lack of persistent storage for the address but it may also be due to integrators opting to not use the on-chip OTP memory and instead store the address elsewhere (e.g. in storage managed by secure world firmware).
According to Qualcomm, at least WCN6750, WCN6855 and WCN7850 have on-chip OTP storage for the address.
As the device type alone cannot be used to determine when the address is valid, instead read back the address during setup() and only set the HCI_QUIRK_USE_BDADDR_PROPERTY flag when needed.
This specifically makes sure that controllers that have been provisioned with an address do not start as unconfigured.
Reported-by: Janaki Ramaiah Thota quic_janathot@quicinc.com Link: https://lore.kernel.org/r/124a7d54-5a18-4be7-9a76-a12017f6cce5@quicinc.com/ Fixes: 5971752de44c ("Bluetooth: hci_qca: Set HCI_QUIRK_USE_BDADDR_PROPERTY for wcn3990") Fixes: e668eb1e1578 ("Bluetooth: hci_core: Don't stop BT if the BD address missing in dts") Fixes: 6945795bc81a ("Bluetooth: fix use-bdaddr-property quirk") Cc: stable@vger.kernel.org # 6.5 Cc: Matthias Kaehlcke mka@chromium.org Signed-off-by: Johan Hovold johan+linaro@kernel.org Reported-by: Janaki Ramaiah Thota quic_janathot@quicinc.com Signed-off-by: Luiz Augusto von Dentz luiz.von.dentz@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/bluetooth/btqca.c | 38 +++++++++++++++++++++++++++++++++++++ drivers/bluetooth/hci_qca.c | 2 -- 2 files changed, 38 insertions(+), 2 deletions(-)
diff --git a/drivers/bluetooth/btqca.c b/drivers/bluetooth/btqca.c index 2dda94a0875a6..8df2e53dcd63c 100644 --- a/drivers/bluetooth/btqca.c +++ b/drivers/bluetooth/btqca.c @@ -15,6 +15,8 @@
#define VERSION "0.1"
+#define QCA_BDADDR_DEFAULT (&(bdaddr_t) {{ 0xad, 0x5a, 0x00, 0x00, 0x00, 0x00 }}) + int qca_read_soc_version(struct hci_dev *hdev, struct qca_btsoc_version *ver, enum qca_btsoc_type soc_type) { @@ -682,6 +684,38 @@ int qca_set_bdaddr_rome(struct hci_dev *hdev, const bdaddr_t *bdaddr) } EXPORT_SYMBOL_GPL(qca_set_bdaddr_rome);
+static int qca_check_bdaddr(struct hci_dev *hdev) +{ + struct hci_rp_read_bd_addr *bda; + struct sk_buff *skb; + int err; + + if (bacmp(&hdev->public_addr, BDADDR_ANY)) + return 0; + + skb = __hci_cmd_sync(hdev, HCI_OP_READ_BD_ADDR, 0, NULL, + HCI_INIT_TIMEOUT); + if (IS_ERR(skb)) { + err = PTR_ERR(skb); + bt_dev_err(hdev, "Failed to read device address (%d)", err); + return err; + } + + if (skb->len != sizeof(*bda)) { + bt_dev_err(hdev, "Device address length mismatch"); + kfree_skb(skb); + return -EIO; + } + + bda = (struct hci_rp_read_bd_addr *)skb->data; + if (!bacmp(&bda->bdaddr, QCA_BDADDR_DEFAULT)) + set_bit(HCI_QUIRK_USE_BDADDR_PROPERTY, &hdev->quirks); + + kfree_skb(skb); + + return 0; +} + static void qca_generate_hsp_nvm_name(char *fwname, size_t max_size, struct qca_btsoc_version ver, u8 rom_ver, u16 bid) { @@ -888,6 +922,10 @@ int qca_uart_setup(struct hci_dev *hdev, uint8_t baudrate, break; }
+ err = qca_check_bdaddr(hdev); + if (err) + return err; + bt_dev_info(hdev, "QCA setup on UART is completed");
return 0; diff --git a/drivers/bluetooth/hci_qca.c b/drivers/bluetooth/hci_qca.c index a0e2b5d992695..070014d0fc994 100644 --- a/drivers/bluetooth/hci_qca.c +++ b/drivers/bluetooth/hci_qca.c @@ -1853,8 +1853,6 @@ static int qca_setup(struct hci_uart *hu) case QCA_WCN6750: case QCA_WCN6855: case QCA_WCN7850: - set_bit(HCI_QUIRK_USE_BDADDR_PROPERTY, &hdev->quirks); - qcadev = serdev_device_get_drvdata(hu->serdev); if (qcadev->bdaddr_property_broken) set_bit(HCI_QUIRK_BDADDR_PROPERTY_BROKEN, &hdev->quirks);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Qu Wenruo wqu@suse.com
[ Upstream commit fe1c6c7acce10baf9521d6dccc17268d91ee2305 ]
[BUG] During my extent_map cleanup/refactor, with extra sanity checks, extent-map-tests::test_case_7() would not pass the checks.
The problem is that, after btrfs_drop_extent_map_range(), the resulting extent_map has a @block_start that is way too large. Meanwhile my btrfs_file_extent_item based members return the correct @disk_bytenr/@offset combination.
The extent map layout looks like this:
0        16K    32K       48K
| PINNED |      | Regular  |
The regular em at [32K, 48K) also has 32K @block_start.
Then drop the range [0, 36K), which should shrink the regular one to [36K, 48K). However the @block_start is incorrect: we expect 32K + 4K = 36K, but get 52K.
[CAUSE] Inside the btrfs_drop_extent_map_range() function, if we hit an extent_map that overlaps the target range but also extends beyond it, we need to split that extent map in half:
|<-- drop range -->|
        |<----- existing extent_map --->|
And if the extent map is not compressed, we need to forward extent_map::block_start by the difference between the end of the drop range and the extent map start.
However in that particular case, the difference is calculated using (start + len - em->start).
The problem is that @start can be modified if the drop range covers any pinned extent.
This leads to a wrong calculation, which would be caught by my later extent_map sanity checks, which compare em::block_start against btrfs_file_extent_item::disk_bytenr + btrfs_file_extent_item::offset.
This is a regression caused by commit c962098ca4af ("btrfs: fix incorrect splitting in btrfs_drop_extent_map_range"), which removed the @len update for pinned extents.
[FIX] Fix it by avoiding @start completely and using @end - em->start instead, where @end is the exclusive end bytenr of the drop range.
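For clarity, the arithmetic behind the numbers above can be checked with a small standalone C sketch (illustrative only, not part of the patch); the values mirror the layout diagram, with @start already advanced past the pinned extent:

  /* Worked example of the split offset calculation (illustrative only). */
  #include <stdio.h>

  int main(void)
  {
          unsigned long em_start = 32 << 10; /* regular em covers [32K, 48K) */
          unsigned long start = 16 << 10;    /* advanced past the pinned em */
          unsigned long len = 36 << 10;      /* original drop range length */
          unsigned long end = 36 << 10;      /* exclusive end of the drop range */

          /* Old calculation: 16K + 36K - 32K = 20K, so block_start becomes 52K */
          printf("old diff = %luK\n", (start + len - em_start) >> 10);

          /* Fixed calculation: 36K - 32K = 4K, so block_start becomes 36K */
          printf("new diff = %luK\n", (end - em_start) >> 10);
          return 0;
  }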
Also update the test case to verify @block_start, to prevent such a problem from happening again.
Thankfully this is not going to lead to any data corruption, as the IO path does not utilize btrfs_drop_extent_map_range() with @skip_pinned set.
So this fix is only here for the sake of consistency/correctness.
CC: stable@vger.kernel.org # 6.5+ Fixes: c962098ca4af ("btrfs: fix incorrect splitting in btrfs_drop_extent_map_range") Reviewed-by: Filipe Manana fdmanana@suse.com Signed-off-by: Qu Wenruo wqu@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/btrfs/extent_map.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/btrfs/extent_map.c b/fs/btrfs/extent_map.c index 56d7580fdc3c4..3518e638374ea 100644 --- a/fs/btrfs/extent_map.c +++ b/fs/btrfs/extent_map.c @@ -867,7 +867,7 @@ void btrfs_drop_extent_map_range(struct btrfs_inode *inode, u64 start, u64 end, split->block_len = em->block_len; split->orig_start = em->orig_start; } else { - const u64 diff = start + len - em->start; + const u64 diff = end - em->start;
split->block_len = split->len; split->block_start += diff;
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: John Keeping john@metanate.com
[ Upstream commit b566d38857fcb6777f25b674b90a831eec0817a2 ]
Commit fb1f16d74e26 ("usb: gadget: f_fs: change ep->status safe in ffs_epfile_io()") added a new ffs_io_data::status field to fix lifetime issues in synchronous requests.
While there are no similar lifetime issues for asynchronous requests (the separate ep member in ffs_io_data avoids them), using the status field means the USB request can be freed earlier and that there is more consistency between the synchronous and asynchronous I/O paths.
Cc: Linyu Yuan quic_linyyuan@quicinc.com Signed-off-by: John Keeping john@metanate.com Reviewed-by: Linyu Yuan quic_linyyuan@quicinc.com Link: https://lore.kernel.org/r/20221124170430.3998755-1-john@metanate.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Stable-dep-of: 24729b307eef ("usb: gadget: f_fs: Fix race between aio_cancel() and AIO request complete") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/usb/gadget/function/f_fs.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/usb/gadget/function/f_fs.c b/drivers/usb/gadget/function/f_fs.c index b2da74bb107af..d32e1ece3e0a1 100644 --- a/drivers/usb/gadget/function/f_fs.c +++ b/drivers/usb/gadget/function/f_fs.c @@ -830,8 +830,7 @@ static void ffs_user_copy_worker(struct work_struct *work) { struct ffs_io_data *io_data = container_of(work, struct ffs_io_data, work); - int ret = io_data->req->status ? io_data->req->status : - io_data->req->actual; + int ret = io_data->status; bool kiocb_has_eventfd = io_data->kiocb->ki_flags & IOCB_EVENTFD;
if (io_data->read && ret > 0) { @@ -845,8 +844,6 @@ static void ffs_user_copy_worker(struct work_struct *work) if (io_data->ffs->ffs_eventfd && !kiocb_has_eventfd) eventfd_signal(io_data->ffs->ffs_eventfd, 1);
- usb_ep_free_request(io_data->ep, io_data->req); - if (io_data->read) kfree(io_data->to_free); ffs_free_buffer(io_data); @@ -861,6 +858,9 @@ static void ffs_epfile_async_io_complete(struct usb_ep *_ep,
ENTER();
+ io_data->status = req->status ? req->status : req->actual; + usb_ep_free_request(_ep, req); + INIT_WORK(&io_data->work, ffs_user_copy_worker); queue_work(ffs->io_completion_wq, &io_data->work); }
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Wesley Cheng quic_wcheng@quicinc.com
[ Upstream commit 24729b307eefcd7c476065cd7351c1a018082c19 ]
FFS based applications can utilize the aio_cancel() callback to dequeue pending USB requests submitted to the UDC. There is a scenario where the FFS application issues an AIO cancel call, while the UDC is handling a soft disconnect. For a DWC3 based implementation, the callstack looks like the following:
DWC3 Gadget                               FFS Application
dwc3_gadget_soft_disconnect()             ...
 --> dwc3_stop_active_transfers()
   --> dwc3_gadget_giveback(-ESHUTDOWN)
     --> ffs_epfile_async_io_complete()   ffs_aio_cancel()
       --> usb_ep_free_request()            --> usb_ep_dequeue()
There is currently no locking implemented between the AIO completion handler and AIO cancel, so the issue occurs if the completion routine runs in parallel with an AIO cancel call coming from the FFS application. As the completion call frees the USB request (io_data->req) while the FFS application is still referencing it for the usb_ep_dequeue() call, this can lead to accessing a stale/dangling pointer.
commit b566d38857fc ("usb: gadget: f_fs: use io_data->status consistently") relocated the usb_ep_free_request() into ffs_epfile_async_io_complete(). However, in order to properly implement locking to mitigate this issue, the spinlock can't be added to ffs_epfile_async_io_complete(), as usb_ep_dequeue() (if successfully dequeuing a USB request) will call the function driver's completion handler in the same context, hence leading to a deadlock.
Fix this issue by moving the usb_ep_free_request() back to ffs_user_copy_worker(), and ensuring that it explicitly sets io_data->req to NULL after freeing it within the ffs->eps_lock. This resolves the race condition above, as the ffs_aio_cancel() routine will not continue attempting to dequeue a request that has already been freed, and ffs_user_copy_worker() will not free the USB request until the AIO cancel is done referencing it.
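To illustrate why clearing io_data->req under ffs->eps_lock closes the race, here is a minimal sketch of the cancel side; it only mirrors the behaviour described above, and the struct layout and field accesses are assumptions for illustration rather than a verbatim copy of the driver:

  static int ffs_aio_cancel_sketch(struct kiocb *kiocb)
  {
          struct ffs_io_data *io_data = kiocb->private;
          struct ffs_epfile *epfile = kiocb->ki_filp->private_data;
          unsigned long flags;
          int value;

          spin_lock_irqsave(&epfile->ffs->eps_lock, flags);

          /* io_data->req is NULL once ffs_user_copy_worker() has freed it */
          if (io_data && io_data->ep && io_data->req)
                  value = usb_ep_dequeue(io_data->ep, io_data->req);
          else
                  value = -EINVAL;

          spin_unlock_irqrestore(&epfile->ffs->eps_lock, flags);

          return value;
  }

With both the free-and-clear in the worker and the dequeue in the cancel path serialized by eps_lock, the cancel side can never see a freed request.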
This fix depends on commit b566d38857fc ("usb: gadget: f_fs: use io_data->status consistently")
Fixes: 2e4c7553cd6f ("usb: gadget: f_fs: add aio support") Cc: stable stable@kernel.org # b566d38857fc ("usb: gadget: f_fs: use io_data->status consistently") Signed-off-by: Wesley Cheng quic_wcheng@quicinc.com Link: https://lore.kernel.org/r/20240409014059.6740-1-quic_wcheng@quicinc.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/usb/gadget/function/f_fs.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/drivers/usb/gadget/function/f_fs.c b/drivers/usb/gadget/function/f_fs.c index d32e1ece3e0a1..698bf24ba44c7 100644 --- a/drivers/usb/gadget/function/f_fs.c +++ b/drivers/usb/gadget/function/f_fs.c @@ -832,6 +832,7 @@ static void ffs_user_copy_worker(struct work_struct *work) work); int ret = io_data->status; bool kiocb_has_eventfd = io_data->kiocb->ki_flags & IOCB_EVENTFD; + unsigned long flags;
if (io_data->read && ret > 0) { kthread_use_mm(io_data->mm); @@ -844,6 +845,11 @@ static void ffs_user_copy_worker(struct work_struct *work) if (io_data->ffs->ffs_eventfd && !kiocb_has_eventfd) eventfd_signal(io_data->ffs->ffs_eventfd, 1);
+ spin_lock_irqsave(&io_data->ffs->eps_lock, flags); + usb_ep_free_request(io_data->ep, io_data->req); + io_data->req = NULL; + spin_unlock_irqrestore(&io_data->ffs->eps_lock, flags); + if (io_data->read) kfree(io_data->to_free); ffs_free_buffer(io_data); @@ -859,7 +865,6 @@ static void ffs_epfile_async_io_complete(struct usb_ep *_ep, ENTER();
io_data->status = req->status ? req->status : req->actual; - usb_ep_free_request(_ep, req);
INIT_WORK(&io_data->work, ffs_user_copy_worker); queue_work(ffs->io_completion_wq, &io_data->work);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Luca Ceresoli luca.ceresoli@bootlin.com
[ Upstream commit 4d7c16d08d248952c116f2eb9b7b5abc43a19688 ]
Add an OF device table with compatible strings to allow automatic module loading.
Signed-off-by: Luca Ceresoli luca.ceresoli@bootlin.com Reviewed-by: Krzysztof Kozlowski krzysztof.kozlowski@linaro.org Link: https://lore.kernel.org/r/20231004-mxc4005-device-tree-support-v1-2-e7c0faea... Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Stable-dep-of: 6b8cffdc4a31 ("iio: accel: mxc4005: Reset chip on probe() and resume()") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/iio/accel/mxc4005.c | 8 ++++++++ 1 file changed, 8 insertions(+)
diff --git a/drivers/iio/accel/mxc4005.c b/drivers/iio/accel/mxc4005.c index ffae30e5eb5be..b8dfdb571bf1f 100644 --- a/drivers/iio/accel/mxc4005.c +++ b/drivers/iio/accel/mxc4005.c @@ -487,6 +487,13 @@ static const struct acpi_device_id mxc4005_acpi_match[] = { }; MODULE_DEVICE_TABLE(acpi, mxc4005_acpi_match);
+static const struct of_device_id mxc4005_of_match[] = { + { .compatible = "memsic,mxc4005", }, + { .compatible = "memsic,mxc6655", }, + { }, +}; +MODULE_DEVICE_TABLE(of, mxc4005_of_match); + static const struct i2c_device_id mxc4005_id[] = { {"mxc4005", 0}, {"mxc6655", 0}, @@ -498,6 +505,7 @@ static struct i2c_driver mxc4005_driver = { .driver = { .name = MXC4005_DRV_NAME, .acpi_match_table = ACPI_PTR(mxc4005_acpi_match), + .of_match_table = mxc4005_of_match, }, .probe = mxc4005_probe, .id_table = mxc4005_id,
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Hans de Goede hdegoede@redhat.com
[ Upstream commit 6b8cffdc4a31e4a72f75ecd1bc13fbf0dafee390 ]
On some designs the chip is not properly reset when powered up at boot or after a suspend/resume cycle.
Use the sw-reset feature to ensure that the chip is in a clean state after probe() / resume() and in the case of resume() restore the settings (scale, trigger-enabled).
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218578 Signed-off-by: Hans de Goede hdegoede@redhat.com Link: https://lore.kernel.org/r/20240326113700.56725-3-hdegoede@redhat.com Cc: Stable@vger.kernel.org Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/iio/accel/mxc4005.c | 68 +++++++++++++++++++++++++++++++++++++ 1 file changed, 68 insertions(+)
diff --git a/drivers/iio/accel/mxc4005.c b/drivers/iio/accel/mxc4005.c index b8dfdb571bf1f..0ae544aaff0cc 100644 --- a/drivers/iio/accel/mxc4005.c +++ b/drivers/iio/accel/mxc4005.c @@ -5,6 +5,7 @@ * Copyright (c) 2014, Intel Corporation. */
+#include <linux/delay.h> #include <linux/module.h> #include <linux/i2c.h> #include <linux/iio/iio.h> @@ -36,6 +37,7 @@
#define MXC4005_REG_INT_CLR1 0x01 #define MXC4005_REG_INT_CLR1_BIT_DRDYC 0x01 +#define MXC4005_REG_INT_CLR1_SW_RST 0x10
#define MXC4005_REG_CONTROL 0x0D #define MXC4005_REG_CONTROL_MASK_FSR GENMASK(6, 5) @@ -43,6 +45,9 @@
#define MXC4005_REG_DEVICE_ID 0x0E
+/* Datasheet does not specify a reset time, this is a conservative guess */ +#define MXC4005_RESET_TIME_US 2000 + enum mxc4005_axis { AXIS_X, AXIS_Y, @@ -66,6 +71,8 @@ struct mxc4005_data { s64 timestamp __aligned(8); } scan; bool trigger_enabled; + unsigned int control; + unsigned int int_mask1; };
/* @@ -349,6 +356,7 @@ static int mxc4005_set_trigger_state(struct iio_trigger *trig, return ret; }
+ data->int_mask1 = val; data->trigger_enabled = state; mutex_unlock(&data->mutex);
@@ -384,6 +392,13 @@ static int mxc4005_chip_init(struct mxc4005_data *data)
dev_dbg(data->dev, "MXC4005 chip id %02x\n", reg);
+ ret = regmap_write(data->regmap, MXC4005_REG_INT_CLR1, + MXC4005_REG_INT_CLR1_SW_RST); + if (ret < 0) + return dev_err_probe(data->dev, ret, "resetting chip\n"); + + fsleep(MXC4005_RESET_TIME_US); + ret = regmap_write(data->regmap, MXC4005_REG_INT_MASK0, 0); if (ret < 0) return dev_err_probe(data->dev, ret, "writing INT_MASK0\n"); @@ -480,6 +495,58 @@ static int mxc4005_probe(struct i2c_client *client, return devm_iio_device_register(&client->dev, indio_dev); }
+static int mxc4005_suspend(struct device *dev) +{ + struct iio_dev *indio_dev = dev_get_drvdata(dev); + struct mxc4005_data *data = iio_priv(indio_dev); + int ret; + + /* Save control to restore it on resume */ + ret = regmap_read(data->regmap, MXC4005_REG_CONTROL, &data->control); + if (ret < 0) + dev_err(data->dev, "failed to read reg_control\n"); + + return ret; +} + +static int mxc4005_resume(struct device *dev) +{ + struct iio_dev *indio_dev = dev_get_drvdata(dev); + struct mxc4005_data *data = iio_priv(indio_dev); + int ret; + + ret = regmap_write(data->regmap, MXC4005_REG_INT_CLR1, + MXC4005_REG_INT_CLR1_SW_RST); + if (ret) { + dev_err(data->dev, "failed to reset chip: %d\n", ret); + return ret; + } + + fsleep(MXC4005_RESET_TIME_US); + + ret = regmap_write(data->regmap, MXC4005_REG_CONTROL, data->control); + if (ret) { + dev_err(data->dev, "failed to restore control register\n"); + return ret; + } + + ret = regmap_write(data->regmap, MXC4005_REG_INT_MASK0, 0); + if (ret) { + dev_err(data->dev, "failed to restore interrupt 0 mask\n"); + return ret; + } + + ret = regmap_write(data->regmap, MXC4005_REG_INT_MASK1, data->int_mask1); + if (ret) { + dev_err(data->dev, "failed to restore interrupt 1 mask\n"); + return ret; + } + + return 0; +} + +static DEFINE_SIMPLE_DEV_PM_OPS(mxc4005_pm_ops, mxc4005_suspend, mxc4005_resume); + static const struct acpi_device_id mxc4005_acpi_match[] = { {"MXC4005", 0}, {"MXC6655", 0}, @@ -506,6 +573,7 @@ static struct i2c_driver mxc4005_driver = { .name = MXC4005_DRV_NAME, .acpi_match_table = ACPI_PTR(mxc4005_acpi_match), .of_match_table = mxc4005_of_match, + .pm = pm_sleep_ptr(&mxc4005_pm_ops), }, .probe = mxc4005_probe, .id_table = mxc4005_id,
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Randy Dunlap rdunlap@infradead.org
[ Upstream commit 1b6ceeb99ee05eb2c62a9e5512623e63cf8490ba ]
Use <asm/ftrace.h> to prevent a build warning:
arch/xtensa/kernel/stacktrace.c:263:15: warning: no previous prototype for 'return_address' [-Wmissing-prototypes]
  263 | unsigned long return_address(unsigned level)
Signed-off-by: Randy Dunlap rdunlap@infradead.org Cc: Chris Zankel chris@zankel.net Cc: Max Filippov jcmvbkbc@gmail.com Message-Id: 20230920052139.10570-8-rdunlap@infradead.org Signed-off-by: Max Filippov jcmvbkbc@gmail.com Stable-dep-of: 0e60f0b75884 ("xtensa: fix MAKE_PC_FROM_RA second argument") Signed-off-by: Sasha Levin sashal@kernel.org --- arch/xtensa/kernel/stacktrace.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/arch/xtensa/kernel/stacktrace.c b/arch/xtensa/kernel/stacktrace.c index 7f7755cd28f07..dcba743305efe 100644 --- a/arch/xtensa/kernel/stacktrace.c +++ b/arch/xtensa/kernel/stacktrace.c @@ -12,6 +12,7 @@ #include <linux/sched.h> #include <linux/stacktrace.h>
+#include <asm/ftrace.h> #include <asm/stacktrace.h> #include <asm/traps.h> #include <linux/uaccess.h>
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Max Filippov jcmvbkbc@gmail.com
[ Upstream commit 0e60f0b75884677fb9f4f2ad40d52b43451564d5 ]
Xtensa has a two-argument MAKE_PC_FROM_RA macro to convert a0 to an actual return address because, when the windowed ABI is used, the call{,x}{4,8,12} opcodes stuff the encoded window size into the top 2 bits of the register that becomes the return address in the called function. The second argument of that macro is supposed to be an address that has these 2 topmost bits set correctly, but the comment suggested that this could be the stack address. However the stack doesn't have to be in the same 1GByte region as the code, especially in noMMU XIP configurations.
Fix the comment and use either _text or regs->pc as the second argument for the MAKE_PC_FROM_RA macro.
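As a worked example of the bit arithmetic (illustrative only; the address values are assumptions), the macro keeps the low 30 bits of the return address and takes the 1GB region bits from a code address rather than the stack:

  /* Worked example of the windowed-ABI conversion (illustrative only). */
  #include <stdio.h>

  #define MAKE_PC_FROM_RA(ra, text) \
          (((ra) & 0x3fffffff) | ((unsigned long)(text) & 0xc0000000))

  int main(void)
  {
          /* call8 encodes window size 2 in the top bits; the real pc is 0xd0001234 */
          unsigned long ra = 0x90001234UL;
          unsigned long text = 0xd0000000UL; /* assumed placement of _text */

          /* 0x10001234 | 0xc0000000 -> 0xd0001234 */
          printf("pc = 0x%08lx\n", MAKE_PC_FROM_RA(ra, text));
          return 0;
  }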
Cc: stable@vger.kernel.org Signed-off-by: Max Filippov jcmvbkbc@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/xtensa/include/asm/processor.h | 8 ++++---- arch/xtensa/include/asm/ptrace.h | 2 +- arch/xtensa/kernel/process.c | 5 +++-- arch/xtensa/kernel/stacktrace.c | 3 ++- 4 files changed, 10 insertions(+), 8 deletions(-)
diff --git a/arch/xtensa/include/asm/processor.h b/arch/xtensa/include/asm/processor.h index 228e4dff5fb2d..ab555a980147b 100644 --- a/arch/xtensa/include/asm/processor.h +++ b/arch/xtensa/include/asm/processor.h @@ -113,9 +113,9 @@ #define MAKE_RA_FOR_CALL(ra,ws) (((ra) & 0x3fffffff) | (ws) << 30)
/* Convert return address to a valid pc - * Note: We assume that the stack pointer is in the same 1GB ranges as the ra + * Note: 'text' is the address within the same 1GB range as the ra */ -#define MAKE_PC_FROM_RA(ra,sp) (((ra) & 0x3fffffff) | ((sp) & 0xc0000000)) +#define MAKE_PC_FROM_RA(ra, text) (((ra) & 0x3fffffff) | ((unsigned long)(text) & 0xc0000000))
#elif defined(__XTENSA_CALL0_ABI__)
@@ -125,9 +125,9 @@ #define MAKE_RA_FOR_CALL(ra, ws) (ra)
/* Convert return address to a valid pc - * Note: We assume that the stack pointer is in the same 1GB ranges as the ra + * Note: 'text' is not used as 'ra' is always the full address */ -#define MAKE_PC_FROM_RA(ra, sp) (ra) +#define MAKE_PC_FROM_RA(ra, text) (ra)
#else #error Unsupported Xtensa ABI diff --git a/arch/xtensa/include/asm/ptrace.h b/arch/xtensa/include/asm/ptrace.h index 308f209a47407..17c5cbd1832e7 100644 --- a/arch/xtensa/include/asm/ptrace.h +++ b/arch/xtensa/include/asm/ptrace.h @@ -87,7 +87,7 @@ struct pt_regs { # define user_mode(regs) (((regs)->ps & 0x00000020)!=0) # define instruction_pointer(regs) ((regs)->pc) # define return_pointer(regs) (MAKE_PC_FROM_RA((regs)->areg[0], \ - (regs)->areg[1])) + (regs)->pc))
# ifndef CONFIG_SMP # define profile_pc(regs) instruction_pointer(regs) diff --git a/arch/xtensa/kernel/process.c b/arch/xtensa/kernel/process.c index 68e0e2f06d660..3138f72dcbe2e 100644 --- a/arch/xtensa/kernel/process.c +++ b/arch/xtensa/kernel/process.c @@ -47,6 +47,7 @@ #include <asm/asm-offsets.h> #include <asm/regs.h> #include <asm/hw_breakpoint.h> +#include <asm/sections.h> #include <asm/traps.h>
extern void ret_from_fork(void); @@ -379,7 +380,7 @@ unsigned long __get_wchan(struct task_struct *p) int count = 0;
sp = p->thread.sp; - pc = MAKE_PC_FROM_RA(p->thread.ra, p->thread.sp); + pc = MAKE_PC_FROM_RA(p->thread.ra, _text);
do { if (sp < stack_page + sizeof(struct task_struct) || @@ -391,7 +392,7 @@ unsigned long __get_wchan(struct task_struct *p)
/* Stack layout: sp-4: ra, sp-3: sp' */
- pc = MAKE_PC_FROM_RA(SPILL_SLOT(sp, 0), sp); + pc = MAKE_PC_FROM_RA(SPILL_SLOT(sp, 0), _text); sp = SPILL_SLOT(sp, 1); } while (count++ < 16); return 0; diff --git a/arch/xtensa/kernel/stacktrace.c b/arch/xtensa/kernel/stacktrace.c index dcba743305efe..b69044893287f 100644 --- a/arch/xtensa/kernel/stacktrace.c +++ b/arch/xtensa/kernel/stacktrace.c @@ -13,6 +13,7 @@ #include <linux/stacktrace.h>
#include <asm/ftrace.h> +#include <asm/sections.h> #include <asm/stacktrace.h> #include <asm/traps.h> #include <linux/uaccess.h> @@ -189,7 +190,7 @@ void walk_stackframe(unsigned long *sp, if (a1 <= (unsigned long)sp) break;
- frame.pc = MAKE_PC_FROM_RA(a0, a1); + frame.pc = MAKE_PC_FROM_RA(a0, _text); frame.sp = a1;
if (fn(&frame, data))
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alexey Kodanev aleksei.kodanev@bell-sw.com
[ Upstream commit f8e12e770e8049917f82387033b3cf44bc43b915 ]
The pipe_ctx pointer cannot be NULL when it is assigned the address of an element of the pipe_ctx array. Moreover, MAX_PIPES is defined as 6, so pipe_ctx is not NULL after the loop either.
Detected using the Svace static analysis tool.
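As a minimal standalone illustration of this reasoning (not driver code; the struct here is a stand-in), taking the address of an array element always yields a valid, non-NULL pointer, so the removed NULL checks were dead code:

  #include <stdio.h>

  #define MAX_PIPES 6

  struct pipe_ctx_stub { void *stream; };

  int main(void)
  {
          struct pipe_ctx_stub pipes[MAX_PIPES] = { { 0 } };
          struct pipe_ctx_stub *pipe_ctx = NULL;
          int i;

          for (i = 0; i < MAX_PIPES; i++) {
                  pipe_ctx = &pipes[i]; /* always a valid, non-NULL address */
                  if (pipe_ctx->stream)
                          break;
          }

          /* Even when nothing matched, pipe_ctx points at the last element */
          printf("pipe_ctx is %s\n", pipe_ctx ? "non-NULL" : "NULL");
          return 0;
  }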
Signed-off-by: Alexey Kodanev aleksei.kodanev@bell-sw.com Signed-off-by: Hamza Mahfooz hamza.mahfooz@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Stable-dep-of: 892b41b16f61 ("drm/amd/display: Fix incorrect DSC instance for MST") Signed-off-by: Sasha Levin sashal@kernel.org --- .../amd/display/amdgpu_dm/amdgpu_dm_debugfs.c | 72 +++++-------------- 1 file changed, 16 insertions(+), 56 deletions(-)
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c index ff7dd17ad0763..35ea58fbc1d9d 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c @@ -1369,16 +1369,11 @@ static ssize_t dp_dsc_clock_en_read(struct file *f, char __user *buf,
for (i = 0; i < MAX_PIPES; i++) { pipe_ctx = &aconnector->dc_link->dc->current_state->res_ctx.pipe_ctx[i]; - if (pipe_ctx && pipe_ctx->stream && + if (pipe_ctx->stream && pipe_ctx->stream->link == aconnector->dc_link) break; }
- if (!pipe_ctx) { - kfree(rd_buf); - return -ENXIO; - } - dsc = pipe_ctx->stream_res.dsc; if (dsc) dsc->funcs->dsc_read_state(dsc, &dsc_state); @@ -1475,12 +1470,12 @@ static ssize_t dp_dsc_clock_en_write(struct file *f, const char __user *buf,
for (i = 0; i < MAX_PIPES; i++) { pipe_ctx = &aconnector->dc_link->dc->current_state->res_ctx.pipe_ctx[i]; - if (pipe_ctx && pipe_ctx->stream && + if (pipe_ctx->stream && pipe_ctx->stream->link == aconnector->dc_link) break; }
- if (!pipe_ctx || !pipe_ctx->stream) + if (!pipe_ctx->stream) goto done;
// Get CRTC state @@ -1560,16 +1555,11 @@ static ssize_t dp_dsc_slice_width_read(struct file *f, char __user *buf,
for (i = 0; i < MAX_PIPES; i++) { pipe_ctx = &aconnector->dc_link->dc->current_state->res_ctx.pipe_ctx[i]; - if (pipe_ctx && pipe_ctx->stream && + if (pipe_ctx->stream && pipe_ctx->stream->link == aconnector->dc_link) break; }
- if (!pipe_ctx) { - kfree(rd_buf); - return -ENXIO; - } - dsc = pipe_ctx->stream_res.dsc; if (dsc) dsc->funcs->dsc_read_state(dsc, &dsc_state); @@ -1664,12 +1654,12 @@ static ssize_t dp_dsc_slice_width_write(struct file *f, const char __user *buf,
for (i = 0; i < MAX_PIPES; i++) { pipe_ctx = &aconnector->dc_link->dc->current_state->res_ctx.pipe_ctx[i]; - if (pipe_ctx && pipe_ctx->stream && + if (pipe_ctx->stream && pipe_ctx->stream->link == aconnector->dc_link) break; }
- if (!pipe_ctx || !pipe_ctx->stream) + if (!pipe_ctx->stream) goto done;
// Safely get CRTC state @@ -1749,16 +1739,11 @@ static ssize_t dp_dsc_slice_height_read(struct file *f, char __user *buf,
for (i = 0; i < MAX_PIPES; i++) { pipe_ctx = &aconnector->dc_link->dc->current_state->res_ctx.pipe_ctx[i]; - if (pipe_ctx && pipe_ctx->stream && + if (pipe_ctx->stream && pipe_ctx->stream->link == aconnector->dc_link) break; }
- if (!pipe_ctx) { - kfree(rd_buf); - return -ENXIO; - } - dsc = pipe_ctx->stream_res.dsc; if (dsc) dsc->funcs->dsc_read_state(dsc, &dsc_state); @@ -1853,12 +1838,12 @@ static ssize_t dp_dsc_slice_height_write(struct file *f, const char __user *buf,
for (i = 0; i < MAX_PIPES; i++) { pipe_ctx = &aconnector->dc_link->dc->current_state->res_ctx.pipe_ctx[i]; - if (pipe_ctx && pipe_ctx->stream && + if (pipe_ctx->stream && pipe_ctx->stream->link == aconnector->dc_link) break; }
- if (!pipe_ctx || !pipe_ctx->stream) + if (!pipe_ctx->stream) goto done;
// Get CRTC state @@ -1934,16 +1919,11 @@ static ssize_t dp_dsc_bits_per_pixel_read(struct file *f, char __user *buf,
for (i = 0; i < MAX_PIPES; i++) { pipe_ctx = &aconnector->dc_link->dc->current_state->res_ctx.pipe_ctx[i]; - if (pipe_ctx && pipe_ctx->stream && + if (pipe_ctx->stream && pipe_ctx->stream->link == aconnector->dc_link) break; }
- if (!pipe_ctx) { - kfree(rd_buf); - return -ENXIO; - } - dsc = pipe_ctx->stream_res.dsc; if (dsc) dsc->funcs->dsc_read_state(dsc, &dsc_state); @@ -2035,12 +2015,12 @@ static ssize_t dp_dsc_bits_per_pixel_write(struct file *f, const char __user *bu
for (i = 0; i < MAX_PIPES; i++) { pipe_ctx = &aconnector->dc_link->dc->current_state->res_ctx.pipe_ctx[i]; - if (pipe_ctx && pipe_ctx->stream && + if (pipe_ctx->stream && pipe_ctx->stream->link == aconnector->dc_link) break; }
- if (!pipe_ctx || !pipe_ctx->stream) + if (!pipe_ctx->stream) goto done;
// Get CRTC state @@ -2114,16 +2094,11 @@ static ssize_t dp_dsc_pic_width_read(struct file *f, char __user *buf,
for (i = 0; i < MAX_PIPES; i++) { pipe_ctx = &aconnector->dc_link->dc->current_state->res_ctx.pipe_ctx[i]; - if (pipe_ctx && pipe_ctx->stream && + if (pipe_ctx->stream && pipe_ctx->stream->link == aconnector->dc_link) break; }
- if (!pipe_ctx) { - kfree(rd_buf); - return -ENXIO; - } - dsc = pipe_ctx->stream_res.dsc; if (dsc) dsc->funcs->dsc_read_state(dsc, &dsc_state); @@ -2175,16 +2150,11 @@ static ssize_t dp_dsc_pic_height_read(struct file *f, char __user *buf,
for (i = 0; i < MAX_PIPES; i++) { pipe_ctx = &aconnector->dc_link->dc->current_state->res_ctx.pipe_ctx[i]; - if (pipe_ctx && pipe_ctx->stream && + if (pipe_ctx->stream && pipe_ctx->stream->link == aconnector->dc_link) break; }
- if (!pipe_ctx) { - kfree(rd_buf); - return -ENXIO; - } - dsc = pipe_ctx->stream_res.dsc; if (dsc) dsc->funcs->dsc_read_state(dsc, &dsc_state); @@ -2251,16 +2221,11 @@ static ssize_t dp_dsc_chunk_size_read(struct file *f, char __user *buf,
for (i = 0; i < MAX_PIPES; i++) { pipe_ctx = &aconnector->dc_link->dc->current_state->res_ctx.pipe_ctx[i]; - if (pipe_ctx && pipe_ctx->stream && + if (pipe_ctx->stream && pipe_ctx->stream->link == aconnector->dc_link) break; }
- if (!pipe_ctx) { - kfree(rd_buf); - return -ENXIO; - } - dsc = pipe_ctx->stream_res.dsc; if (dsc) dsc->funcs->dsc_read_state(dsc, &dsc_state); @@ -2327,16 +2292,11 @@ static ssize_t dp_dsc_slice_bpg_offset_read(struct file *f, char __user *buf,
for (i = 0; i < MAX_PIPES; i++) { pipe_ctx = &aconnector->dc_link->dc->current_state->res_ctx.pipe_ctx[i]; - if (pipe_ctx && pipe_ctx->stream && + if (pipe_ctx->stream && pipe_ctx->stream->link == aconnector->dc_link) break; }
- if (!pipe_ctx) { - kfree(rd_buf); - return -ENXIO; - } - dsc = pipe_ctx->stream_res.dsc; if (dsc) dsc->funcs->dsc_read_state(dsc, &dsc_state);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Hersen Wu hersenxs.wu@amd.com
[ Upstream commit 892b41b16f6163e6556545835abba668fcab4eea ]
[Why] DSC debugfs entries, such as dp_dsc_clock_en_read, use aconnector->dc_link to find the pipe_ctx for a display. Displays connected to an MST hub share the same dc_link, and the DSC instance comes from the pipe_ctx. This results in the wrong DSC instance being used for a display connected to an MST hub.
[How] Add a check that the stream's sink matches aconnector->dc_sink when finding the pipe_ctx.
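For readability, the corrected lookup is shown once here; this is an excerpt of the loop from the diff below (which applies the same change to every DSC debugfs helper), not additional code:

  for (i = 0; i < MAX_PIPES; i++) {
          pipe_ctx = &aconnector->dc_link->dc->current_state->res_ctx.pipe_ctx[i];
          if (pipe_ctx->stream &&
              pipe_ctx->stream->link == aconnector->dc_link &&
              pipe_ctx->stream->sink &&
              pipe_ctx->stream->sink == aconnector->dc_sink)
                  break;
  }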
CC: stable@vger.kernel.org Reviewed-by: Aurabindo Pillai aurabindo.pillai@amd.com Signed-off-by: Hersen Wu hersenxs.wu@amd.com Tested-by: Daniel Wheeler daniel.wheeler@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- .../amd/display/amdgpu_dm/amdgpu_dm_debugfs.c | 48 ++++++++++++++----- 1 file changed, 36 insertions(+), 12 deletions(-)
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c index 35ea58fbc1d9d..dd34dfcd5af76 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c @@ -1370,7 +1370,9 @@ static ssize_t dp_dsc_clock_en_read(struct file *f, char __user *buf, for (i = 0; i < MAX_PIPES; i++) { pipe_ctx = &aconnector->dc_link->dc->current_state->res_ctx.pipe_ctx[i]; if (pipe_ctx->stream && - pipe_ctx->stream->link == aconnector->dc_link) + pipe_ctx->stream->link == aconnector->dc_link && + pipe_ctx->stream->sink && + pipe_ctx->stream->sink == aconnector->dc_sink) break; }
@@ -1471,7 +1473,9 @@ static ssize_t dp_dsc_clock_en_write(struct file *f, const char __user *buf, for (i = 0; i < MAX_PIPES; i++) { pipe_ctx = &aconnector->dc_link->dc->current_state->res_ctx.pipe_ctx[i]; if (pipe_ctx->stream && - pipe_ctx->stream->link == aconnector->dc_link) + pipe_ctx->stream->link == aconnector->dc_link && + pipe_ctx->stream->sink && + pipe_ctx->stream->sink == aconnector->dc_sink) break; }
@@ -1556,7 +1560,9 @@ static ssize_t dp_dsc_slice_width_read(struct file *f, char __user *buf, for (i = 0; i < MAX_PIPES; i++) { pipe_ctx = &aconnector->dc_link->dc->current_state->res_ctx.pipe_ctx[i]; if (pipe_ctx->stream && - pipe_ctx->stream->link == aconnector->dc_link) + pipe_ctx->stream->link == aconnector->dc_link && + pipe_ctx->stream->sink && + pipe_ctx->stream->sink == aconnector->dc_sink) break; }
@@ -1655,7 +1661,9 @@ static ssize_t dp_dsc_slice_width_write(struct file *f, const char __user *buf, for (i = 0; i < MAX_PIPES; i++) { pipe_ctx = &aconnector->dc_link->dc->current_state->res_ctx.pipe_ctx[i]; if (pipe_ctx->stream && - pipe_ctx->stream->link == aconnector->dc_link) + pipe_ctx->stream->link == aconnector->dc_link && + pipe_ctx->stream->sink && + pipe_ctx->stream->sink == aconnector->dc_sink) break; }
@@ -1740,7 +1748,9 @@ static ssize_t dp_dsc_slice_height_read(struct file *f, char __user *buf, for (i = 0; i < MAX_PIPES; i++) { pipe_ctx = &aconnector->dc_link->dc->current_state->res_ctx.pipe_ctx[i]; if (pipe_ctx->stream && - pipe_ctx->stream->link == aconnector->dc_link) + pipe_ctx->stream->link == aconnector->dc_link && + pipe_ctx->stream->sink && + pipe_ctx->stream->sink == aconnector->dc_sink) break; }
@@ -1839,7 +1849,9 @@ static ssize_t dp_dsc_slice_height_write(struct file *f, const char __user *buf, for (i = 0; i < MAX_PIPES; i++) { pipe_ctx = &aconnector->dc_link->dc->current_state->res_ctx.pipe_ctx[i]; if (pipe_ctx->stream && - pipe_ctx->stream->link == aconnector->dc_link) + pipe_ctx->stream->link == aconnector->dc_link && + pipe_ctx->stream->sink && + pipe_ctx->stream->sink == aconnector->dc_sink) break; }
@@ -1920,7 +1932,9 @@ static ssize_t dp_dsc_bits_per_pixel_read(struct file *f, char __user *buf, for (i = 0; i < MAX_PIPES; i++) { pipe_ctx = &aconnector->dc_link->dc->current_state->res_ctx.pipe_ctx[i]; if (pipe_ctx->stream && - pipe_ctx->stream->link == aconnector->dc_link) + pipe_ctx->stream->link == aconnector->dc_link && + pipe_ctx->stream->sink && + pipe_ctx->stream->sink == aconnector->dc_sink) break; }
@@ -2016,7 +2030,9 @@ static ssize_t dp_dsc_bits_per_pixel_write(struct file *f, const char __user *bu for (i = 0; i < MAX_PIPES; i++) { pipe_ctx = &aconnector->dc_link->dc->current_state->res_ctx.pipe_ctx[i]; if (pipe_ctx->stream && - pipe_ctx->stream->link == aconnector->dc_link) + pipe_ctx->stream->link == aconnector->dc_link && + pipe_ctx->stream->sink && + pipe_ctx->stream->sink == aconnector->dc_sink) break; }
@@ -2095,7 +2111,9 @@ static ssize_t dp_dsc_pic_width_read(struct file *f, char __user *buf, for (i = 0; i < MAX_PIPES; i++) { pipe_ctx = &aconnector->dc_link->dc->current_state->res_ctx.pipe_ctx[i]; if (pipe_ctx->stream && - pipe_ctx->stream->link == aconnector->dc_link) + pipe_ctx->stream->link == aconnector->dc_link && + pipe_ctx->stream->sink && + pipe_ctx->stream->sink == aconnector->dc_sink) break; }
@@ -2151,7 +2169,9 @@ static ssize_t dp_dsc_pic_height_read(struct file *f, char __user *buf, for (i = 0; i < MAX_PIPES; i++) { pipe_ctx = &aconnector->dc_link->dc->current_state->res_ctx.pipe_ctx[i]; if (pipe_ctx->stream && - pipe_ctx->stream->link == aconnector->dc_link) + pipe_ctx->stream->link == aconnector->dc_link && + pipe_ctx->stream->sink && + pipe_ctx->stream->sink == aconnector->dc_sink) break; }
@@ -2222,7 +2242,9 @@ static ssize_t dp_dsc_chunk_size_read(struct file *f, char __user *buf, for (i = 0; i < MAX_PIPES; i++) { pipe_ctx = &aconnector->dc_link->dc->current_state->res_ctx.pipe_ctx[i]; if (pipe_ctx->stream && - pipe_ctx->stream->link == aconnector->dc_link) + pipe_ctx->stream->link == aconnector->dc_link && + pipe_ctx->stream->sink && + pipe_ctx->stream->sink == aconnector->dc_sink) break; }
@@ -2293,7 +2315,9 @@ static ssize_t dp_dsc_slice_bpg_offset_read(struct file *f, char __user *buf, for (i = 0; i < MAX_PIPES; i++) { pipe_ctx = &aconnector->dc_link->dc->current_state->res_ctx.pipe_ctx[i]; if (pipe_ctx->stream && - pipe_ctx->stream->link == aconnector->dc_link) + pipe_ctx->stream->link == aconnector->dc_link && + pipe_ctx->stream->sink && + pipe_ctx->stream->sink == aconnector->dc_sink) break; }
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Krzysztof Kozlowski krzysztof.kozlowski@linaro.org
[ Upstream commit 028fe09cda0a0d568e6a7d65b0336d32600b480c ]
The DT schema expects TLMM pin configuration nodes to be named with a '-state' suffix and their optional children with a '-pins' suffix.
Signed-off-by: Krzysztof Kozlowski krzysztof.kozlowski@linaro.org Reviewed-by: Bjorn Andersson andersson@kernel.org Signed-off-by: Bjorn Andersson andersson@kernel.org Link: https://lore.kernel.org/r/20221006144518.256956-1-krzysztof.kozlowski@linaro... Stable-dep-of: 819fe8c96a51 ("arm64: dts: qcom: sa8155p-adp: fix SDHC2 CD pin configuration") Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm64/boot/dts/qcom/sa8155p-adp.dts | 60 ++- .../dts/qcom/sm8150-microsoft-surface-duo.dts | 2 +- arch/arm64/boot/dts/qcom/sm8150.dtsi | 376 ++++++------------ 3 files changed, 157 insertions(+), 281 deletions(-)
diff --git a/arch/arm64/boot/dts/qcom/sa8155p-adp.dts b/arch/arm64/boot/dts/qcom/sa8155p-adp.dts index 4dee790f1049d..01ac460d910ec 100644 --- a/arch/arm64/boot/dts/qcom/sa8155p-adp.dts +++ b/arch/arm64/boot/dts/qcom/sa8155p-adp.dts @@ -488,26 +488,26 @@ &pcie1_phy { &tlmm { gpio-reserved-ranges = <0 4>;
- sdc2_on: sdc2_on { - clk { + sdc2_on: sdc2-on-state { + clk-pins { pins = "sdc2_clk"; bias-disable; /* No pull */ drive-strength = <16>; /* 16 MA */ };
- cmd { + cmd-pins { pins = "sdc2_cmd"; bias-pull-up; /* pull up */ drive-strength = <16>; /* 16 MA */ };
- data { + data-pins { pins = "sdc2_data"; bias-pull-up; /* pull up */ drive-strength = <16>; /* 16 MA */ };
- sd-cd { + sd-cd-pins { pins = "gpio96"; function = "gpio"; bias-pull-up; /* pull up */ @@ -515,26 +515,26 @@ sd-cd { }; };
- sdc2_off: sdc2_off { - clk { + sdc2_off: sdc2-off-state { + clk-pins { pins = "sdc2_clk"; bias-disable; /* No pull */ drive-strength = <2>; /* 2 MA */ };
- cmd { + cmd-pins { pins = "sdc2_cmd"; bias-pull-up; /* pull up */ drive-strength = <2>; /* 2 MA */ };
- data { + data-pins { pins = "sdc2_data"; bias-pull-up; /* pull up */ drive-strength = <2>; /* 2 MA */ };
- sd-cd { + sd-cd-pins { pins = "gpio96"; function = "gpio"; bias-pull-up; /* pull up */ @@ -542,66 +542,62 @@ sd-cd { }; };
- usb2phy_ac_en1_default: usb2phy_ac_en1_default { - mux { - pins = "gpio113"; - function = "usb2phy_ac"; - bias-disable; - drive-strength = <2>; - }; + usb2phy_ac_en1_default: usb2phy-ac-en1-default-state { + pins = "gpio113"; + function = "usb2phy_ac"; + bias-disable; + drive-strength = <2>; };
- usb2phy_ac_en2_default: usb2phy_ac_en2_default { - mux { - pins = "gpio123"; - function = "usb2phy_ac"; - bias-disable; - drive-strength = <2>; - }; + usb2phy_ac_en2_default: usb2phy-ac-en2-default-state { + pins = "gpio123"; + function = "usb2phy_ac"; + bias-disable; + drive-strength = <2>; };
- ethernet_defaults: ethernet-defaults { - mdc { + ethernet_defaults: ethernet-defaults-state { + mdc-pins { pins = "gpio7"; function = "rgmii"; bias-pull-up; };
- mdio { + mdio-pins { pins = "gpio59"; function = "rgmii"; bias-pull-up; };
- rgmii-rx { + rgmii-rx-pins { pins = "gpio117", "gpio118", "gpio119", "gpio120", "gpio115", "gpio116"; function = "rgmii"; bias-disable; drive-strength = <2>; };
- rgmii-tx { + rgmii-tx-pins { pins = "gpio122", "gpio4", "gpio5", "gpio6", "gpio114", "gpio121"; function = "rgmii"; bias-pull-up; drive-strength = <16>; };
- phy-intr { + phy-intr-pins { pins = "gpio124"; function = "emac_phy"; bias-disable; drive-strength = <8>; };
- pps { + pps-pins { pins = "gpio81"; function = "emac_pps"; bias-disable; drive-strength = <8>; };
- phy-reset { + phy-reset-pins { pins = "gpio79"; function = "gpio"; bias-pull-up; diff --git a/arch/arm64/boot/dts/qcom/sm8150-microsoft-surface-duo.dts b/arch/arm64/boot/dts/qcom/sm8150-microsoft-surface-duo.dts index bb278ecac3faf..5397fba9417bb 100644 --- a/arch/arm64/boot/dts/qcom/sm8150-microsoft-surface-duo.dts +++ b/arch/arm64/boot/dts/qcom/sm8150-microsoft-surface-duo.dts @@ -475,7 +475,7 @@ &pon_resin { &tlmm { gpio-reserved-ranges = <126 4>;
- da7280_intr_default: da7280-intr-default { + da7280_intr_default: da7280-intr-default-state { pins = "gpio42"; function = "gpio"; bias-pull-up; diff --git a/arch/arm64/boot/dts/qcom/sm8150.dtsi b/arch/arm64/boot/dts/qcom/sm8150.dtsi index 9dccecd9fcaef..bbd322fc56460 100644 --- a/arch/arm64/boot/dts/qcom/sm8150.dtsi +++ b/arch/arm64/boot/dts/qcom/sm8150.dtsi @@ -2284,422 +2284,302 @@ tlmm: pinctrl@3100000 { #interrupt-cells = <2>; wakeup-parent = <&pdc>;
- qup_i2c0_default: qup-i2c0-default { - mux { - pins = "gpio0", "gpio1"; - function = "qup0"; - }; - - config { - pins = "gpio0", "gpio1"; - drive-strength = <0x02>; - bias-disable; - }; + qup_i2c0_default: qup-i2c0-default-state { + pins = "gpio0", "gpio1"; + function = "qup0"; + drive-strength = <0x02>; + bias-disable; };
- qup_spi0_default: qup-spi0-default { + qup_spi0_default: qup-spi0-default-state { pins = "gpio0", "gpio1", "gpio2", "gpio3"; function = "qup0"; drive-strength = <6>; bias-disable; };
- qup_i2c1_default: qup-i2c1-default { - mux { - pins = "gpio114", "gpio115"; - function = "qup1"; - }; - - config { - pins = "gpio114", "gpio115"; - drive-strength = <0x02>; - bias-disable; - }; + qup_i2c1_default: qup-i2c1-default-state { + pins = "gpio114", "gpio115"; + function = "qup1"; + drive-strength = <2>; + bias-disable; };
- qup_spi1_default: qup-spi1-default { + qup_spi1_default: qup-spi1-default-state { pins = "gpio114", "gpio115", "gpio116", "gpio117"; function = "qup1"; drive-strength = <6>; bias-disable; };
- qup_i2c2_default: qup-i2c2-default { - mux { - pins = "gpio126", "gpio127"; - function = "qup2"; - }; - - config { - pins = "gpio126", "gpio127"; - drive-strength = <0x02>; - bias-disable; - }; + qup_i2c2_default: qup-i2c2-default-state { + pins = "gpio126", "gpio127"; + function = "qup2"; + drive-strength = <2>; + bias-disable; };
- qup_spi2_default: qup-spi2-default { + qup_spi2_default: qup-spi2-default-state { pins = "gpio126", "gpio127", "gpio128", "gpio129"; function = "qup2"; drive-strength = <6>; bias-disable; };
- qup_i2c3_default: qup-i2c3-default { - mux { - pins = "gpio144", "gpio145"; - function = "qup3"; - }; - - config { - pins = "gpio144", "gpio145"; - drive-strength = <0x02>; - bias-disable; - }; + qup_i2c3_default: qup-i2c3-default-state { + pins = "gpio144", "gpio145"; + function = "qup3"; + drive-strength = <2>; + bias-disable; };
- qup_spi3_default: qup-spi3-default { + qup_spi3_default: qup-spi3-default-state { pins = "gpio144", "gpio145", "gpio146", "gpio147"; function = "qup3"; drive-strength = <6>; bias-disable; };
- qup_i2c4_default: qup-i2c4-default { - mux { - pins = "gpio51", "gpio52"; - function = "qup4"; - }; - - config { - pins = "gpio51", "gpio52"; - drive-strength = <0x02>; - bias-disable; - }; + qup_i2c4_default: qup-i2c4-default-state { + pins = "gpio51", "gpio52"; + function = "qup4"; + drive-strength = <2>; + bias-disable; };
- qup_spi4_default: qup-spi4-default { + qup_spi4_default: qup-spi4-default-state { pins = "gpio51", "gpio52", "gpio53", "gpio54"; function = "qup4"; drive-strength = <6>; bias-disable; };
- qup_i2c5_default: qup-i2c5-default { - mux { - pins = "gpio121", "gpio122"; - function = "qup5"; - }; - - config { - pins = "gpio121", "gpio122"; - drive-strength = <0x02>; - bias-disable; - }; + qup_i2c5_default: qup-i2c5-default-state { + pins = "gpio121", "gpio122"; + function = "qup5"; + drive-strength = <2>; + bias-disable; };
- qup_spi5_default: qup-spi5-default { + qup_spi5_default: qup-spi5-default-state { pins = "gpio119", "gpio120", "gpio121", "gpio122"; function = "qup5"; drive-strength = <6>; bias-disable; };
- qup_i2c6_default: qup-i2c6-default { - mux { - pins = "gpio6", "gpio7"; - function = "qup6"; - }; - - config { - pins = "gpio6", "gpio7"; - drive-strength = <0x02>; - bias-disable; - }; + qup_i2c6_default: qup-i2c6-default-state { + pins = "gpio6", "gpio7"; + function = "qup6"; + drive-strength = <2>; + bias-disable; };
- qup_spi6_default: qup-spi6_default { + qup_spi6_default: qup-spi6_default-state { pins = "gpio4", "gpio5", "gpio6", "gpio7"; function = "qup6"; drive-strength = <6>; bias-disable; };
- qup_i2c7_default: qup-i2c7-default { - mux { - pins = "gpio98", "gpio99"; - function = "qup7"; - }; - - config { - pins = "gpio98", "gpio99"; - drive-strength = <0x02>; - bias-disable; - }; + qup_i2c7_default: qup-i2c7-default-state { + pins = "gpio98", "gpio99"; + function = "qup7"; + drive-strength = <2>; + bias-disable; };
- qup_spi7_default: qup-spi7_default { + qup_spi7_default: qup-spi7_default-state { pins = "gpio98", "gpio99", "gpio100", "gpio101"; function = "qup7"; drive-strength = <6>; bias-disable; };
- qup_i2c8_default: qup-i2c8-default { - mux { - pins = "gpio88", "gpio89"; - function = "qup8"; - }; - - config { - pins = "gpio88", "gpio89"; - drive-strength = <0x02>; - bias-disable; - }; + qup_i2c8_default: qup-i2c8-default-state { + pins = "gpio88", "gpio89"; + function = "qup8"; + drive-strength = <2>; + bias-disable; };
- qup_spi8_default: qup-spi8-default { + qup_spi8_default: qup-spi8-default-state { pins = "gpio88", "gpio89", "gpio90", "gpio91"; function = "qup8"; drive-strength = <6>; bias-disable; };
- qup_i2c9_default: qup-i2c9-default { - mux { - pins = "gpio39", "gpio40"; - function = "qup9"; - }; - - config { - pins = "gpio39", "gpio40"; - drive-strength = <0x02>; - bias-disable; - }; + qup_i2c9_default: qup-i2c9-default-state { + pins = "gpio39", "gpio40"; + function = "qup9"; + drive-strength = <2>; + bias-disable; };
- qup_spi9_default: qup-spi9-default { + qup_spi9_default: qup-spi9-default-state { pins = "gpio39", "gpio40", "gpio41", "gpio42"; function = "qup9"; drive-strength = <6>; bias-disable; };
- qup_i2c10_default: qup-i2c10-default { - mux { - pins = "gpio9", "gpio10"; - function = "qup10"; - }; - - config { - pins = "gpio9", "gpio10"; - drive-strength = <0x02>; - bias-disable; - }; + qup_i2c10_default: qup-i2c10-default-state { + pins = "gpio9", "gpio10"; + function = "qup10"; + drive-strength = <2>; + bias-disable; };
- qup_spi10_default: qup-spi10-default { + qup_spi10_default: qup-spi10-default-state { pins = "gpio9", "gpio10", "gpio11", "gpio12"; function = "qup10"; drive-strength = <6>; bias-disable; };
- qup_i2c11_default: qup-i2c11-default { - mux { - pins = "gpio94", "gpio95"; - function = "qup11"; - }; - - config { - pins = "gpio94", "gpio95"; - drive-strength = <0x02>; - bias-disable; - }; + qup_i2c11_default: qup-i2c11-default-state { + pins = "gpio94", "gpio95"; + function = "qup11"; + drive-strength = <2>; + bias-disable; };
- qup_spi11_default: qup-spi11-default { + qup_spi11_default: qup-spi11-default-state { pins = "gpio92", "gpio93", "gpio94", "gpio95"; function = "qup11"; drive-strength = <6>; bias-disable; };
- qup_i2c12_default: qup-i2c12-default { - mux { - pins = "gpio83", "gpio84"; - function = "qup12"; - }; - - config { - pins = "gpio83", "gpio84"; - drive-strength = <0x02>; - bias-disable; - }; + qup_i2c12_default: qup-i2c12-default-state { + pins = "gpio83", "gpio84"; + function = "qup12"; + drive-strength = <2>; + bias-disable; };
- qup_spi12_default: qup-spi12-default { + qup_spi12_default: qup-spi12-default-state { pins = "gpio83", "gpio84", "gpio85", "gpio86"; function = "qup12"; drive-strength = <6>; bias-disable; };
- qup_i2c13_default: qup-i2c13-default { - mux { - pins = "gpio43", "gpio44"; - function = "qup13"; - }; - - config { - pins = "gpio43", "gpio44"; - drive-strength = <0x02>; - bias-disable; - }; + qup_i2c13_default: qup-i2c13-default-state { + pins = "gpio43", "gpio44"; + function = "qup13"; + drive-strength = <2>; + bias-disable; };
- qup_spi13_default: qup-spi13-default { + qup_spi13_default: qup-spi13-default-state { pins = "gpio43", "gpio44", "gpio45", "gpio46"; function = "qup13"; drive-strength = <6>; bias-disable; };
- qup_i2c14_default: qup-i2c14-default { - mux { - pins = "gpio47", "gpio48"; - function = "qup14"; - }; - - config { - pins = "gpio47", "gpio48"; - drive-strength = <0x02>; - bias-disable; - }; + qup_i2c14_default: qup-i2c14-default-state { + pins = "gpio47", "gpio48"; + function = "qup14"; + drive-strength = <2>; + bias-disable; };
- qup_spi14_default: qup-spi14-default { + qup_spi14_default: qup-spi14-default-state { pins = "gpio47", "gpio48", "gpio49", "gpio50"; function = "qup14"; drive-strength = <6>; bias-disable; };
- qup_i2c15_default: qup-i2c15-default { - mux { - pins = "gpio27", "gpio28"; - function = "qup15"; - }; - - config { - pins = "gpio27", "gpio28"; - drive-strength = <0x02>; - bias-disable; - }; + qup_i2c15_default: qup-i2c15-default-state { + pins = "gpio27", "gpio28"; + function = "qup15"; + drive-strength = <2>; + bias-disable; };
- qup_spi15_default: qup-spi15-default { + qup_spi15_default: qup-spi15-default-state { pins = "gpio27", "gpio28", "gpio29", "gpio30"; function = "qup15"; drive-strength = <6>; bias-disable; };
- qup_i2c16_default: qup-i2c16-default { - mux { - pins = "gpio86", "gpio85"; - function = "qup16"; - }; - - config { - pins = "gpio86", "gpio85"; - drive-strength = <0x02>; - bias-disable; - }; + qup_i2c16_default: qup-i2c16-default-state { + pins = "gpio86", "gpio85"; + function = "qup16"; + drive-strength = <2>; + bias-disable; };
- qup_spi16_default: qup-spi16-default { + qup_spi16_default: qup-spi16-default-state { pins = "gpio83", "gpio84", "gpio85", "gpio86"; function = "qup16"; drive-strength = <6>; bias-disable; };
- qup_i2c17_default: qup-i2c17-default { - mux { - pins = "gpio55", "gpio56"; - function = "qup17"; - }; - - config { - pins = "gpio55", "gpio56"; - drive-strength = <0x02>; - bias-disable; - }; + qup_i2c17_default: qup-i2c17-default-state { + pins = "gpio55", "gpio56"; + function = "qup17"; + drive-strength = <2>; + bias-disable; };
- qup_spi17_default: qup-spi17-default { + qup_spi17_default: qup-spi17-default-state { pins = "gpio55", "gpio56", "gpio57", "gpio58"; function = "qup17"; drive-strength = <6>; bias-disable; };
- qup_i2c18_default: qup-i2c18-default { - mux { - pins = "gpio23", "gpio24"; - function = "qup18"; - }; - - config { - pins = "gpio23", "gpio24"; - drive-strength = <0x02>; - bias-disable; - }; + qup_i2c18_default: qup-i2c18-default-state { + pins = "gpio23", "gpio24"; + function = "qup18"; + drive-strength = <2>; + bias-disable; };
- qup_spi18_default: qup-spi18-default { + qup_spi18_default: qup-spi18-default-state { pins = "gpio23", "gpio24", "gpio25", "gpio26"; function = "qup18"; drive-strength = <6>; bias-disable; };
- qup_i2c19_default: qup-i2c19-default { - mux { - pins = "gpio57", "gpio58"; - function = "qup19"; - }; - - config { - pins = "gpio57", "gpio58"; - drive-strength = <0x02>; - bias-disable; - }; + qup_i2c19_default: qup-i2c19-default-state { + pins = "gpio57", "gpio58"; + function = "qup19"; + drive-strength = <2>; + bias-disable; };
- qup_spi19_default: qup-spi19-default { + qup_spi19_default: qup-spi19-default-state { pins = "gpio55", "gpio56", "gpio57", "gpio58"; function = "qup19"; drive-strength = <6>; bias-disable; };
- pcie0_default_state: pcie0-default { - perst { + pcie0_default_state: pcie0-default-state { + perst-pins { pins = "gpio35"; function = "gpio"; drive-strength = <2>; bias-pull-down; };
- clkreq { + clkreq-pins { pins = "gpio36"; function = "pci_e0"; drive-strength = <2>; bias-pull-up; };
- wake { + wake-pins { pins = "gpio37"; function = "gpio"; drive-strength = <2>; @@ -2707,22 +2587,22 @@ wake { }; };
- pcie1_default_state: pcie1-default { - perst { + pcie1_default_state: pcie1-default-state { + perst-pins { pins = "gpio102"; function = "gpio"; drive-strength = <2>; bias-pull-down; };
- clkreq { + clkreq-pins { pins = "gpio103"; function = "pci_e1"; drive-strength = <2>; bias-pull-up; };
- wake { + wake-pins { pins = "gpio104"; function = "gpio"; drive-strength = <2>;
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Volodymyr Babchuk Volodymyr_Babchuk@epam.com
[ Upstream commit 819fe8c96a5172dfd960e5945e8f00f8fed32953 ]
There are two issues with the SDHC2 configuration for SA8155P-ADP, which prevent use of SDHC2 and cause problems with Ethernet:
- The Card Detect pin for SDHC2 on SA8155P-ADP is connected to gpio4 of PMM8155AU_1, not to the SoC itself. The SoC's gpio4 is used for DWMAC TX. If the sdhc driver probes after the dwmac driver, it reconfigures gpio4 and this breaks the Ethernet MAC.
- The pinctrl configuration mentions gpio96 as the CD pin. It seems this was copied from some SM8150 example because, as mentioned above, the correct CD pin is gpio4 on PMM8155AU_1.
This patch fixes both issues by providing the correct pin handle and pinctrl configuration.
Fixes: 0deb2624e2d0 ("arm64: dts: qcom: sa8155p-adp: Add support for uSD card") Cc: stable@vger.kernel.org Signed-off-by: Volodymyr Babchuk volodymyr_babchuk@epam.com Reviewed-by: Stephan Gerhold stephan@gerhold.net Link: https://lore.kernel.org/r/20240412190310.1647893-1-volodymyr_babchuk@epam.co... Signed-off-by: Bjorn Andersson andersson@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm64/boot/dts/qcom/sa8155p-adp.dts | 30 ++++++++++-------------- 1 file changed, 13 insertions(+), 17 deletions(-)
diff --git a/arch/arm64/boot/dts/qcom/sa8155p-adp.dts b/arch/arm64/boot/dts/qcom/sa8155p-adp.dts index 01ac460d910ec..cbec4c9f31025 100644 --- a/arch/arm64/boot/dts/qcom/sa8155p-adp.dts +++ b/arch/arm64/boot/dts/qcom/sa8155p-adp.dts @@ -372,6 +372,16 @@ rgmii_phy: phy@7 { }; };
+&pmm8155au_1_gpios { + pmm8155au_1_sdc2_cd: sdc2-cd-default-state { + pins = "gpio4"; + function = "normal"; + input-enable; + bias-pull-up; + power-source = <0>; + }; +}; + &qupv3_id_1 { status = "okay"; }; @@ -389,10 +399,10 @@ &remoteproc_cdsp { &sdhc_2 { status = "okay";
- cd-gpios = <&tlmm 4 GPIO_ACTIVE_LOW>; + cd-gpios = <&pmm8155au_1_gpios 4 GPIO_ACTIVE_LOW>; pinctrl-names = "default", "sleep"; - pinctrl-0 = <&sdc2_on>; - pinctrl-1 = <&sdc2_off>; + pinctrl-0 = <&sdc2_on &pmm8155au_1_sdc2_cd>; + pinctrl-1 = <&sdc2_off &pmm8155au_1_sdc2_cd>; vqmmc-supply = <&vreg_l13c_2p96>; /* IO line power */ vmmc-supply = <&vreg_l17a_2p96>; /* Card power line */ bus-width = <4>; @@ -506,13 +516,6 @@ data-pins { bias-pull-up; /* pull up */ drive-strength = <16>; /* 16 MA */ }; - - sd-cd-pins { - pins = "gpio96"; - function = "gpio"; - bias-pull-up; /* pull up */ - drive-strength = <2>; /* 2 MA */ - }; };
sdc2_off: sdc2-off-state { @@ -533,13 +536,6 @@ data-pins { bias-pull-up; /* pull up */ drive-strength = <2>; /* 2 MA */ }; - - sd-cd-pins { - pins = "gpio96"; - function = "gpio"; - bias-pull-up; /* pull up */ - drive-strength = <2>; /* 2 MA */ - }; };
usb2phy_ac_en1_default: usb2phy-ac-en1-default-state {
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Thomas Weißschuh linux@weissschuh.net
[ Upstream commit c1426d392aebc51da4944d950d89e483e43f6f14 ]
pvpanic-mmio.c and pvpanic-pci.c share a lot of code. Refactor it into pvpanic.c where it doesn't have to be kept in sync manually and where the core logic can be understood more easily.
No functional change.
Signed-off-by: Thomas Weißschuh linux@weissschuh.net Link: https://lore.kernel.org/r/20231011-pvpanic-cleanup-v2-1-4b21d56f779f@weisssc... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Stable-dep-of: ee59be35d7a8 ("misc/pvpanic-pci: register attributes via pci_driver") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/misc/pvpanic/pvpanic-mmio.c | 58 +--------------------- drivers/misc/pvpanic/pvpanic-pci.c | 58 +--------------------- drivers/misc/pvpanic/pvpanic.c | 76 ++++++++++++++++++++++++++++- drivers/misc/pvpanic/pvpanic.h | 10 +--- 4 files changed, 80 insertions(+), 122 deletions(-)
diff --git a/drivers/misc/pvpanic/pvpanic-mmio.c b/drivers/misc/pvpanic/pvpanic-mmio.c index eb97167c03fb4..9715798acce3d 100644 --- a/drivers/misc/pvpanic/pvpanic-mmio.c +++ b/drivers/misc/pvpanic/pvpanic-mmio.c @@ -24,52 +24,9 @@ MODULE_AUTHOR("Hu Tao hutao@cn.fujitsu.com"); MODULE_DESCRIPTION("pvpanic-mmio device driver"); MODULE_LICENSE("GPL");
-static ssize_t capability_show(struct device *dev, struct device_attribute *attr, char *buf) -{ - struct pvpanic_instance *pi = dev_get_drvdata(dev); - - return sysfs_emit(buf, "%x\n", pi->capability); -} -static DEVICE_ATTR_RO(capability); - -static ssize_t events_show(struct device *dev, struct device_attribute *attr, char *buf) -{ - struct pvpanic_instance *pi = dev_get_drvdata(dev); - - return sysfs_emit(buf, "%x\n", pi->events); -} - -static ssize_t events_store(struct device *dev, struct device_attribute *attr, - const char *buf, size_t count) -{ - struct pvpanic_instance *pi = dev_get_drvdata(dev); - unsigned int tmp; - int err; - - err = kstrtouint(buf, 16, &tmp); - if (err) - return err; - - if ((tmp & pi->capability) != tmp) - return -EINVAL; - - pi->events = tmp; - - return count; -} -static DEVICE_ATTR_RW(events); - -static struct attribute *pvpanic_mmio_dev_attrs[] = { - &dev_attr_capability.attr, - &dev_attr_events.attr, - NULL -}; -ATTRIBUTE_GROUPS(pvpanic_mmio_dev); - static int pvpanic_mmio_probe(struct platform_device *pdev) { struct device *dev = &pdev->dev; - struct pvpanic_instance *pi; struct resource *res; void __iomem *base;
@@ -92,18 +49,7 @@ static int pvpanic_mmio_probe(struct platform_device *pdev) return -EINVAL; }
- pi = devm_kmalloc(dev, sizeof(*pi), GFP_KERNEL); - if (!pi) - return -ENOMEM; - - pi->base = base; - pi->capability = PVPANIC_PANICKED | PVPANIC_CRASH_LOADED; - - /* initialize capability by RDPT */ - pi->capability &= ioread8(base); - pi->events = pi->capability; - - return devm_pvpanic_probe(dev, pi); + return devm_pvpanic_probe(dev, base); }
static const struct of_device_id pvpanic_mmio_match[] = { @@ -123,7 +69,7 @@ static struct platform_driver pvpanic_mmio_driver = { .name = "pvpanic-mmio", .of_match_table = pvpanic_mmio_match, .acpi_match_table = pvpanic_device_ids, - .dev_groups = pvpanic_mmio_dev_groups, + .dev_groups = pvpanic_dev_groups, }, .probe = pvpanic_mmio_probe, }; diff --git a/drivers/misc/pvpanic/pvpanic-pci.c b/drivers/misc/pvpanic/pvpanic-pci.c index 07eddb5ea30fa..689af4c28c2a9 100644 --- a/drivers/misc/pvpanic/pvpanic-pci.c +++ b/drivers/misc/pvpanic/pvpanic-pci.c @@ -22,51 +22,8 @@ MODULE_AUTHOR("Mihai Carabas mihai.carabas@oracle.com"); MODULE_DESCRIPTION("pvpanic device driver"); MODULE_LICENSE("GPL");
-static ssize_t capability_show(struct device *dev, struct device_attribute *attr, char *buf) -{ - struct pvpanic_instance *pi = dev_get_drvdata(dev); - - return sysfs_emit(buf, "%x\n", pi->capability); -} -static DEVICE_ATTR_RO(capability); - -static ssize_t events_show(struct device *dev, struct device_attribute *attr, char *buf) -{ - struct pvpanic_instance *pi = dev_get_drvdata(dev); - - return sysfs_emit(buf, "%x\n", pi->events); -} - -static ssize_t events_store(struct device *dev, struct device_attribute *attr, - const char *buf, size_t count) -{ - struct pvpanic_instance *pi = dev_get_drvdata(dev); - unsigned int tmp; - int err; - - err = kstrtouint(buf, 16, &tmp); - if (err) - return err; - - if ((tmp & pi->capability) != tmp) - return -EINVAL; - - pi->events = tmp; - - return count; -} -static DEVICE_ATTR_RW(events); - -static struct attribute *pvpanic_pci_dev_attrs[] = { - &dev_attr_capability.attr, - &dev_attr_events.attr, - NULL -}; -ATTRIBUTE_GROUPS(pvpanic_pci_dev); - static int pvpanic_pci_probe(struct pci_dev *pdev, const struct pci_device_id *ent) { - struct pvpanic_instance *pi; void __iomem *base; int ret;
@@ -78,18 +35,7 @@ static int pvpanic_pci_probe(struct pci_dev *pdev, const struct pci_device_id *e if (!base) return -ENOMEM;
- pi = devm_kmalloc(&pdev->dev, sizeof(*pi), GFP_KERNEL); - if (!pi) - return -ENOMEM; - - pi->base = base; - pi->capability = PVPANIC_PANICKED | PVPANIC_CRASH_LOADED; - - /* initlize capability by RDPT */ - pi->capability &= ioread8(base); - pi->events = pi->capability; - - return devm_pvpanic_probe(&pdev->dev, pi); + return devm_pvpanic_probe(&pdev->dev, base); }
static const struct pci_device_id pvpanic_pci_id_tbl[] = { @@ -103,7 +49,7 @@ static struct pci_driver pvpanic_pci_driver = { .id_table = pvpanic_pci_id_tbl, .probe = pvpanic_pci_probe, .driver = { - .dev_groups = pvpanic_pci_dev_groups, + .dev_groups = pvpanic_dev_groups, }, }; module_pci_driver(pvpanic_pci_driver); diff --git a/drivers/misc/pvpanic/pvpanic.c b/drivers/misc/pvpanic/pvpanic.c index 049a120063489..305b367e0ce34 100644 --- a/drivers/misc/pvpanic/pvpanic.c +++ b/drivers/misc/pvpanic/pvpanic.c @@ -7,6 +7,7 @@ * Copyright (C) 2021 Oracle. */
+#include <linux/device.h> #include <linux/io.h> #include <linux/kernel.h> #include <linux/kexec.h> @@ -26,6 +27,13 @@ MODULE_AUTHOR("Mihai Carabas mihai.carabas@oracle.com"); MODULE_DESCRIPTION("pvpanic device driver"); MODULE_LICENSE("GPL");
+struct pvpanic_instance { + void __iomem *base; + unsigned int capability; + unsigned int events; + struct list_head list; +}; + static struct list_head pvpanic_list; static spinlock_t pvpanic_lock;
@@ -81,11 +89,75 @@ static void pvpanic_remove(void *param) spin_unlock(&pvpanic_lock); }
-int devm_pvpanic_probe(struct device *dev, struct pvpanic_instance *pi) +static ssize_t capability_show(struct device *dev, struct device_attribute *attr, char *buf) +{ + struct pvpanic_instance *pi = dev_get_drvdata(dev); + + return sysfs_emit(buf, "%x\n", pi->capability); +} +static DEVICE_ATTR_RO(capability); + +static ssize_t events_show(struct device *dev, struct device_attribute *attr, char *buf) +{ + struct pvpanic_instance *pi = dev_get_drvdata(dev); + + return sysfs_emit(buf, "%x\n", pi->events); +} + +static ssize_t events_store(struct device *dev, struct device_attribute *attr, + const char *buf, size_t count) +{ + struct pvpanic_instance *pi = dev_get_drvdata(dev); + unsigned int tmp; + int err; + + err = kstrtouint(buf, 16, &tmp); + if (err) + return err; + + if ((tmp & pi->capability) != tmp) + return -EINVAL; + + pi->events = tmp; + + return count; +} +static DEVICE_ATTR_RW(events); + +static struct attribute *pvpanic_dev_attrs[] = { + &dev_attr_capability.attr, + &dev_attr_events.attr, + NULL +}; + +static const struct attribute_group pvpanic_dev_group = { + .attrs = pvpanic_dev_attrs, +}; + +const struct attribute_group *pvpanic_dev_groups[] = { + &pvpanic_dev_group, + NULL +}; +EXPORT_SYMBOL_GPL(pvpanic_dev_groups); + +int devm_pvpanic_probe(struct device *dev, void __iomem *base) { - if (!pi || !pi->base) + struct pvpanic_instance *pi; + + if (!base) return -EINVAL;
+ pi = devm_kmalloc(dev, sizeof(*pi), GFP_KERNEL); + if (!pi) + return -ENOMEM; + + pi->base = base; + pi->capability = PVPANIC_PANICKED | PVPANIC_CRASH_LOADED; + + /* initlize capability by RDPT */ + pi->capability &= ioread8(base); + pi->events = pi->capability; + spin_lock(&pvpanic_lock); list_add(&pi->list, &pvpanic_list); spin_unlock(&pvpanic_lock); diff --git a/drivers/misc/pvpanic/pvpanic.h b/drivers/misc/pvpanic/pvpanic.h index 4935459517548..46ffb10438adf 100644 --- a/drivers/misc/pvpanic/pvpanic.h +++ b/drivers/misc/pvpanic/pvpanic.h @@ -8,13 +8,7 @@ #ifndef PVPANIC_H_ #define PVPANIC_H_
-struct pvpanic_instance { - void __iomem *base; - unsigned int capability; - unsigned int events; - struct list_head list; -}; - -int devm_pvpanic_probe(struct device *dev, struct pvpanic_instance *pi); +int devm_pvpanic_probe(struct device *dev, void __iomem *base); +extern const struct attribute_group *pvpanic_dev_groups[];
#endif /* PVPANIC_H_ */
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Thomas Weißschuh linux@weissschuh.net
[ Upstream commit ee59be35d7a8be7fcaa2d61fb89734ab5c25e4ee ]
In __pci_register_driver(), the pci core overwrites the dev_groups field of the embedded struct device_driver with the dev_groups from the outer struct pci_driver unconditionally.
Set dev_groups in the pci_driver to make sure it is used.
This was broken since the introduction of pvpanic-pci.
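For context, the PCI core behaviour the fix relies on looks roughly like this (paraphrased sketch, not a verbatim copy of __pci_register_driver()):

int __pci_register_driver(struct pci_driver *drv, struct module *owner,
			  const char *mod_name)
{
	/* ... */
	/* the embedded device_driver's dev_groups is taken from the outer
	 * pci_driver unconditionally, so a value set only on
	 * drv->driver.dev_groups is overwritten (with NULL here) */
	drv->driver.dev_groups = drv->dev_groups;

	return driver_register(&drv->driver);
}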
Fixes: db3a4f0abefd ("misc/pvpanic: add PCI driver") Cc: stable@vger.kernel.org Signed-off-by: Thomas Weißschuh linux@weissschuh.net Fixes: ded13b9cfd59 ("PCI: Add support for dev_groups to struct pci_driver") Link: https://lore.kernel.org/r/20240411-pvpanic-pci-dev-groups-v1-1-db8cb69f1b09@... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/misc/pvpanic/pvpanic-pci.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/drivers/misc/pvpanic/pvpanic-pci.c b/drivers/misc/pvpanic/pvpanic-pci.c index 689af4c28c2a9..2494725dfacfa 100644 --- a/drivers/misc/pvpanic/pvpanic-pci.c +++ b/drivers/misc/pvpanic/pvpanic-pci.c @@ -48,8 +48,6 @@ static struct pci_driver pvpanic_pci_driver = { .name = "pvpanic-pci", .id_table = pvpanic_pci_id_tbl, .probe = pvpanic_pci_probe, - .driver = { - .dev_groups = pvpanic_dev_groups, - }, + .dev_groups = pvpanic_dev_groups, }; module_pci_driver(pvpanic_pci_driver);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Hugo Villeneuve hvilleneuve@dimonoff.com
[ Upstream commit 2e57cefc4477659527f7adab1f87cdbf60ef1ae6 ]
Replace the hardcoded 0xffff limit with BIT(16) to better show why the limit is what it is: we have only 16 bits for the divisor.
Reviewed-by: Andy Shevchenko andy.shevchenko@gmail.com Suggested-by: Andy Shevchenko andy.shevchenko@gmail.com Signed-off-by: Hugo Villeneuve hvilleneuve@dimonoff.com Link: https://lore.kernel.org/r/20231221231823.2327894-13-hugo@hugovil.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Stable-dep-of: 8492bd91aa05 ("serial: sc16is7xx: fix bug in sc16is7xx_set_baud() when using prescaler") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/tty/serial/sc16is7xx.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/tty/serial/sc16is7xx.c b/drivers/tty/serial/sc16is7xx.c index e6eedebf67765..08da7cc221d0e 100644 --- a/drivers/tty/serial/sc16is7xx.c +++ b/drivers/tty/serial/sc16is7xx.c @@ -489,7 +489,7 @@ static int sc16is7xx_set_baud(struct uart_port *port, int baud) u8 prescaler = 0; unsigned long clk = port->uartclk, div = clk / 16 / baud;
- if (div > 0xffff) { + if (div >= BIT(16)) { prescaler = SC16IS7XX_MCR_CLKSEL_BIT; div /= 4; }
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Hugo Villeneuve hvilleneuve@dimonoff.com
[ Upstream commit 8492bd91aa055907c67ef04f2b56f6dadd1f44bf ]
When using a high speed clock with a low baud rate, the 4x prescaler is automatically selected if required. In that case, sc16is7xx_set_baud() properly configures the chip registers, but returns an incorrect baud rate by not taking into account the prescaler value. This incorrect baud rate is then fed to uart_update_timeout().
For example, with an input clock of 80MHz, and a selected baud rate of 50, sc16is7xx_set_baud() will return 200 instead of 50.
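As a quick sanity check of that example (arithmetic sketch only, not the driver code):

	unsigned long clk = 80000000, baud = 50;
	unsigned long div = clk / 16 / baud;		/* 100000, does not fit in 16 bits */
	unsigned int prescaler = (div >= 1UL << 16) ? 4 : 1;

	div /= prescaler;				/* 25000 */
	/* before the fix:  clk / 16 / div               = 200 */
	/* after the fix:  (clk / prescaler) / 16 / div  =  50 */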
Fix this by first changing the prescaler variable to hold the selected prescaler value instead of the MCR bitfield. Then properly take into account the selected prescaler value in the return value computation.
Also add better documentation about the divisor value computation.
Fixes: dfeae619d781 ("serial: sc16is7xx") Cc: stable@vger.kernel.org Signed-off-by: Hugo Villeneuve hvilleneuve@dimonoff.com Reviewed-by: Jiri Slaby jirislaby@kernel.org Link: https://lore.kernel.org/r/20240430200431.4102923-1-hugo@hugovil.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/tty/serial/sc16is7xx.c | 23 ++++++++++++++++++----- 1 file changed, 18 insertions(+), 5 deletions(-)
diff --git a/drivers/tty/serial/sc16is7xx.c b/drivers/tty/serial/sc16is7xx.c index 08da7cc221d0e..a723df9b37dd9 100644 --- a/drivers/tty/serial/sc16is7xx.c +++ b/drivers/tty/serial/sc16is7xx.c @@ -482,16 +482,28 @@ static bool sc16is7xx_regmap_noinc(struct device *dev, unsigned int reg) return reg == SC16IS7XX_RHR_REG; }
+/* + * Configure programmable baud rate generator (divisor) according to the + * desired baud rate. + * + * From the datasheet, the divisor is computed according to: + * + * XTAL1 input frequency + * ----------------------- + * prescaler + * divisor = --------------------------- + * baud-rate x sampling-rate + */ static int sc16is7xx_set_baud(struct uart_port *port, int baud) { struct sc16is7xx_one *one = to_sc16is7xx_one(port, port); u8 lcr; - u8 prescaler = 0; + unsigned int prescaler = 1; unsigned long clk = port->uartclk, div = clk / 16 / baud;
if (div >= BIT(16)) { - prescaler = SC16IS7XX_MCR_CLKSEL_BIT; - div /= 4; + prescaler = 4; + div /= prescaler; }
/* In an amazing feat of design, the Enhanced Features Register shares @@ -528,9 +540,10 @@ static int sc16is7xx_set_baud(struct uart_port *port, int baud)
mutex_unlock(&one->efr_lock);
+ /* If bit MCR_CLKSEL is set, the divide by 4 prescaler is activated. */ sc16is7xx_port_update(port, SC16IS7XX_MCR_REG, SC16IS7XX_MCR_CLKSEL_BIT, - prescaler); + prescaler == 1 ? 0 : SC16IS7XX_MCR_CLKSEL_BIT);
/* Open the LCR divisors for configuration */ sc16is7xx_port_write(port, SC16IS7XX_LCR_REG, @@ -545,7 +558,7 @@ static int sc16is7xx_set_baud(struct uart_port *port, int baud) /* Put LCR back to the normal mode */ sc16is7xx_port_write(port, SC16IS7XX_LCR_REG, lcr);
- return DIV_ROUND_CLOSEST(clk / 16, div); + return DIV_ROUND_CLOSEST((clk / prescaler) / 16, div); }
static void sc16is7xx_handle_rx(struct uart_port *port, unsigned int rxlen,
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Uwe Kleine-König u.kleine-koenig@pengutronix.de
[ Upstream commit 55c421b364482b61c4c45313a535e61ed5ae4ea3 ]
Using __exit for the remove function results in the remove callback being discarded with CONFIG_MMC_DAVINCI=y. When such a device gets unbound (e.g. using sysfs or hotplug), the driver is just removed without the cleanup being performed. This results in resource leaks. Fix it by compiling in the remove callback unconditionally.
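The problematic pattern, in generic form (illustrative sketch with made-up foo names, not the davinci code):

/* With the driver built in, .exit.text is discarded and __exit_p()
 * expands to NULL, so the remove callback silently becomes a no-op
 * on unbind and everything acquired in probe is leaked. */
static int foo_remove(struct platform_device *pdev)	/* no __exit */
{
	/* undo probe: free irq, disable clocks, unmap registers, ... */
	return 0;
}

static struct platform_driver foo_driver = {
	.probe	= foo_probe,
	.remove	= foo_remove,		/* not __exit_p(foo_remove) */
};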
This also fixes a W=1 modpost warning:
WARNING: modpost: drivers/mmc/host/davinci_mmc: section mismatch in reference: davinci_mmcsd_driver+0x10 (section: .data) -> davinci_mmcsd_remove (section: .exit.text)
Fixes: b4cff4549b7a ("DaVinci: MMC: MMC/SD controller driver for DaVinci family") Signed-off-by: Uwe Kleine-König u.kleine-koenig@pengutronix.de Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20240324114017.231936-2-u.kleine-koenig@pengutroni... Signed-off-by: Ulf Hansson ulf.hansson@linaro.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/mmc/host/davinci_mmc.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/mmc/host/davinci_mmc.c b/drivers/mmc/host/davinci_mmc.c index 7138dfa065bfa..e89a97b415154 100644 --- a/drivers/mmc/host/davinci_mmc.c +++ b/drivers/mmc/host/davinci_mmc.c @@ -1345,7 +1345,7 @@ static int davinci_mmcsd_probe(struct platform_device *pdev) return ret; }
-static int __exit davinci_mmcsd_remove(struct platform_device *pdev) +static int davinci_mmcsd_remove(struct platform_device *pdev) { struct mmc_davinci_host *host = platform_get_drvdata(pdev);
@@ -1402,7 +1402,7 @@ static struct platform_driver davinci_mmcsd_driver = { .of_match_table = davinci_mmc_dt_ids, }, .probe = davinci_mmcsd_probe, - .remove = __exit_p(davinci_mmcsd_remove), + .remove = davinci_mmcsd_remove, .id_table = davinci_mmc_devtype, };
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Gabor Juhos j4g8y7@gmail.com
[ Upstream commit 0c50b7fcf2773b4853e83fc15aba1a196ba95966 ]
There are several functions which call qcom_scm_bw_enable() and then return immediately if the call fails, leaving the clocks enabled.
Change the code of these functions to disable clocks when the qcom_scm_bw_enable() call fails. This also fixes a possible dma buffer leak in the qcom_scm_pas_init_image() function.
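Condensed, the error handling pattern being applied looks roughly like this (simplified sketch; per-function details such as descriptor setup and the init_image metadata cleanup are elided):

	ret = qcom_scm_clk_enable();
	if (ret)
		return ret;

	ret = qcom_scm_bw_enable();
	if (ret)
		goto disable_clk;	/* previously: return ret, clocks left on */

	ret = qcom_scm_call(__scm->dev, &desc, &res);
	qcom_scm_bw_disable();

disable_clk:
	qcom_scm_clk_disable();

	return ret ? : res.result[0];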
Compile tested only due to lack of hardware with interconnect support.
Cc: stable@vger.kernel.org Fixes: 65b7ebda5028 ("firmware: qcom_scm: Add bw voting support to the SCM interface") Signed-off-by: Gabor Juhos j4g8y7@gmail.com Reviewed-by: Mukesh Ojha quic_mojha@quicinc.com Link: https://lore.kernel.org/r/20240304-qcom-scm-disable-clk-v1-1-b36e51577ca1@gm... Signed-off-by: Bjorn Andersson andersson@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/firmware/qcom_scm.c | 18 ++++++++++++------ 1 file changed, 12 insertions(+), 6 deletions(-)
diff --git a/drivers/firmware/qcom_scm.c b/drivers/firmware/qcom_scm.c index 58f1a86065dc9..619cd6548cf64 100644 --- a/drivers/firmware/qcom_scm.c +++ b/drivers/firmware/qcom_scm.c @@ -495,13 +495,14 @@ int qcom_scm_pas_init_image(u32 peripheral, const void *metadata, size_t size,
ret = qcom_scm_bw_enable(); if (ret) - return ret; + goto disable_clk;
desc.args[1] = mdata_phys;
ret = qcom_scm_call(__scm->dev, &desc, &res); - qcom_scm_bw_disable(); + +disable_clk: qcom_scm_clk_disable();
out: @@ -563,10 +564,12 @@ int qcom_scm_pas_mem_setup(u32 peripheral, phys_addr_t addr, phys_addr_t size)
ret = qcom_scm_bw_enable(); if (ret) - return ret; + goto disable_clk;
ret = qcom_scm_call(__scm->dev, &desc, &res); qcom_scm_bw_disable(); + +disable_clk: qcom_scm_clk_disable();
return ret ? : res.result[0]; @@ -598,10 +601,12 @@ int qcom_scm_pas_auth_and_reset(u32 peripheral)
ret = qcom_scm_bw_enable(); if (ret) - return ret; + goto disable_clk;
ret = qcom_scm_call(__scm->dev, &desc, &res); qcom_scm_bw_disable(); + +disable_clk: qcom_scm_clk_disable();
return ret ? : res.result[0]; @@ -632,11 +637,12 @@ int qcom_scm_pas_shutdown(u32 peripheral)
ret = qcom_scm_bw_enable(); if (ret) - return ret; + goto disable_clk;
ret = qcom_scm_call(__scm->dev, &desc, &res); - qcom_scm_bw_disable(); + +disable_clk: qcom_scm_clk_disable();
return ret ? : res.result[0];
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Cong Yang yangcong5@huaqin.corp-partner.google.com
[ Upstream commit f2f43bf15d7aa3286eced18d5199ee579e2c2614 ]
The ili9882t is a TDDI IC (Touch with Display Driver). The datasheet specifies there should be 60ms between touch SDA sleep and panel RESX. Doug's series[1] allows panels and touchscreens to power on/off together, so we can add the 65 ms delay in i2c_hid_core_suspend before panel_unprepare.
Because the ili9882t touchscreen is a panel follower and needs to use vccio-supply instead of vcc33-supply, set the main supply name to NULL in ili9882t_chip_data so that the vcc33 regulator is not used.
[1]: https://lore.kernel.org/all/20230727171750.633410-1-dianders@chromium.org
Reviewed-by: Douglas Anderson dianders@chromium.org Signed-off-by: Cong Yang yangcong5@huaqin.corp-partner.google.com Acked-by: Benjamin Tissoires bentiss@kernel.org Link: https://lore.kernel.org/r/20230802071947.1683318-3-yangcong5@huaqin.corp-par... Signed-off-by: Benjamin Tissoires bentiss@kernel.org Stable-dep-of: 0eafc58f2194 ("HID: i2c-hid: elan: fix reset suspend current leakage") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/hid/i2c-hid/i2c-hid-of-elan.c | 50 ++++++++++++++++++++------- 1 file changed, 38 insertions(+), 12 deletions(-)
diff --git a/drivers/hid/i2c-hid/i2c-hid-of-elan.c b/drivers/hid/i2c-hid/i2c-hid-of-elan.c index 2d991325e734c..35986e8297095 100644 --- a/drivers/hid/i2c-hid/i2c-hid-of-elan.c +++ b/drivers/hid/i2c-hid/i2c-hid-of-elan.c @@ -18,9 +18,11 @@ #include "i2c-hid.h"
struct elan_i2c_hid_chip_data { - unsigned int post_gpio_reset_delay_ms; + unsigned int post_gpio_reset_on_delay_ms; + unsigned int post_gpio_reset_off_delay_ms; unsigned int post_power_delay_ms; u16 hid_descriptor_address; + const char *main_supply_name; };
struct i2c_hid_of_elan { @@ -38,9 +40,11 @@ static int elan_i2c_hid_power_up(struct i2chid_ops *ops) container_of(ops, struct i2c_hid_of_elan, ops); int ret;
- ret = regulator_enable(ihid_elan->vcc33); - if (ret) - return ret; + if (ihid_elan->vcc33) { + ret = regulator_enable(ihid_elan->vcc33); + if (ret) + return ret; + }
ret = regulator_enable(ihid_elan->vccio); if (ret) { @@ -52,8 +56,8 @@ static int elan_i2c_hid_power_up(struct i2chid_ops *ops) msleep(ihid_elan->chip_data->post_power_delay_ms);
gpiod_set_value_cansleep(ihid_elan->reset_gpio, 0); - if (ihid_elan->chip_data->post_gpio_reset_delay_ms) - msleep(ihid_elan->chip_data->post_gpio_reset_delay_ms); + if (ihid_elan->chip_data->post_gpio_reset_on_delay_ms) + msleep(ihid_elan->chip_data->post_gpio_reset_on_delay_ms);
return 0; } @@ -64,8 +68,12 @@ static void elan_i2c_hid_power_down(struct i2chid_ops *ops) container_of(ops, struct i2c_hid_of_elan, ops);
gpiod_set_value_cansleep(ihid_elan->reset_gpio, 1); + if (ihid_elan->chip_data->post_gpio_reset_off_delay_ms) + msleep(ihid_elan->chip_data->post_gpio_reset_off_delay_ms); + regulator_disable(ihid_elan->vccio); - regulator_disable(ihid_elan->vcc33); + if (ihid_elan->vcc33) + regulator_disable(ihid_elan->vcc33); }
static int i2c_hid_of_elan_probe(struct i2c_client *client, @@ -90,24 +98,42 @@ static int i2c_hid_of_elan_probe(struct i2c_client *client, if (IS_ERR(ihid_elan->vccio)) return PTR_ERR(ihid_elan->vccio);
- ihid_elan->vcc33 = devm_regulator_get(&client->dev, "vcc33"); - if (IS_ERR(ihid_elan->vcc33)) - return PTR_ERR(ihid_elan->vcc33); - ihid_elan->chip_data = device_get_match_data(&client->dev);
+ if (ihid_elan->chip_data->main_supply_name) { + ihid_elan->vcc33 = devm_regulator_get(&client->dev, + ihid_elan->chip_data->main_supply_name); + if (IS_ERR(ihid_elan->vcc33)) + return PTR_ERR(ihid_elan->vcc33); + } + return i2c_hid_core_probe(client, &ihid_elan->ops, ihid_elan->chip_data->hid_descriptor_address, 0); }
static const struct elan_i2c_hid_chip_data elan_ekth6915_chip_data = { .post_power_delay_ms = 1, - .post_gpio_reset_delay_ms = 300, + .post_gpio_reset_on_delay_ms = 300, + .hid_descriptor_address = 0x0001, + .main_supply_name = "vcc33", +}; + +static const struct elan_i2c_hid_chip_data ilitek_ili9882t_chip_data = { + .post_power_delay_ms = 1, + .post_gpio_reset_on_delay_ms = 200, + .post_gpio_reset_off_delay_ms = 65, .hid_descriptor_address = 0x0001, + /* + * this touchscreen is tightly integrated with the panel and assumes + * that the relevant power rails (other than the IO rail) have already + * been turned on by the panel driver because we're a panel follower. + */ + .main_supply_name = NULL, };
static const struct of_device_id elan_i2c_hid_of_match[] = { { .compatible = "elan,ekth6915", .data = &elan_ekth6915_chip_data }, + { .compatible = "ilitek,ili9882t", .data = &ilitek_ili9882t_chip_data }, { } }; MODULE_DEVICE_TABLE(of, elan_i2c_hid_of_match);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Johan Hovold johan+linaro@kernel.org
[ Upstream commit 0eafc58f2194dbd01d4be40f99a697681171995b ]
The Elan eKTH5015M touch controller found on the Lenovo ThinkPad X13s shares the VCC33 supply with other peripherals that may remain powered during suspend (e.g. when enabled as wakeup sources).
The reset line is also wired so that it can be left deasserted when the supply is off.
This is important as it avoids holding the controller in reset for extended periods of time when it remains powered, which can lead to increased power consumption, and also avoids leaking current through the X13s reset circuitry during suspend (and after driver unbind).
Use the new 'no-reset-on-power-off' devicetree property to determine when reset needs to be asserted on power down.
Notably this also avoids wasting power on machine variants without a touchscreen for which the driver would otherwise exit probe with reset asserted.
Fixes: bd3cba00dcc6 ("HID: i2c-hid: elan: Add support for Elan eKTH6915 i2c-hid touchscreens") Cc: stable@vger.kernel.org # 6.0 Cc: Douglas Anderson dianders@chromium.org Tested-by: Steev Klimaszewski steev@kali.org Signed-off-by: Johan Hovold johan+linaro@kernel.org Reviewed-by: Douglas Anderson dianders@chromium.org Link: https://lore.kernel.org/r/20240507144821.12275-5-johan+linaro@kernel.org Signed-off-by: Benjamin Tissoires bentiss@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/hid/i2c-hid/i2c-hid-of-elan.c | 59 +++++++++++++++++++++------ 1 file changed, 47 insertions(+), 12 deletions(-)
diff --git a/drivers/hid/i2c-hid/i2c-hid-of-elan.c b/drivers/hid/i2c-hid/i2c-hid-of-elan.c index 35986e8297095..8d4deb2def97b 100644 --- a/drivers/hid/i2c-hid/i2c-hid-of-elan.c +++ b/drivers/hid/i2c-hid/i2c-hid-of-elan.c @@ -31,6 +31,7 @@ struct i2c_hid_of_elan { struct regulator *vcc33; struct regulator *vccio; struct gpio_desc *reset_gpio; + bool no_reset_on_power_off; const struct elan_i2c_hid_chip_data *chip_data; };
@@ -40,17 +41,17 @@ static int elan_i2c_hid_power_up(struct i2chid_ops *ops) container_of(ops, struct i2c_hid_of_elan, ops); int ret;
+ gpiod_set_value_cansleep(ihid_elan->reset_gpio, 1); + if (ihid_elan->vcc33) { ret = regulator_enable(ihid_elan->vcc33); if (ret) - return ret; + goto err_deassert_reset; }
ret = regulator_enable(ihid_elan->vccio); - if (ret) { - regulator_disable(ihid_elan->vcc33); - return ret; - } + if (ret) + goto err_disable_vcc33;
if (ihid_elan->chip_data->post_power_delay_ms) msleep(ihid_elan->chip_data->post_power_delay_ms); @@ -60,6 +61,15 @@ static int elan_i2c_hid_power_up(struct i2chid_ops *ops) msleep(ihid_elan->chip_data->post_gpio_reset_on_delay_ms);
return 0; + +err_disable_vcc33: + if (ihid_elan->vcc33) + regulator_disable(ihid_elan->vcc33); +err_deassert_reset: + if (ihid_elan->no_reset_on_power_off) + gpiod_set_value_cansleep(ihid_elan->reset_gpio, 0); + + return ret; }
static void elan_i2c_hid_power_down(struct i2chid_ops *ops) @@ -67,7 +77,14 @@ static void elan_i2c_hid_power_down(struct i2chid_ops *ops) struct i2c_hid_of_elan *ihid_elan = container_of(ops, struct i2c_hid_of_elan, ops);
- gpiod_set_value_cansleep(ihid_elan->reset_gpio, 1); + /* + * Do not assert reset when the hardware allows for it to remain + * deasserted regardless of the state of the (shared) power supply to + * avoid wasting power when the supply is left on. + */ + if (!ihid_elan->no_reset_on_power_off) + gpiod_set_value_cansleep(ihid_elan->reset_gpio, 1); + if (ihid_elan->chip_data->post_gpio_reset_off_delay_ms) msleep(ihid_elan->chip_data->post_gpio_reset_off_delay_ms);
@@ -80,6 +97,7 @@ static int i2c_hid_of_elan_probe(struct i2c_client *client, const struct i2c_device_id *id) { struct i2c_hid_of_elan *ihid_elan; + int ret;
ihid_elan = devm_kzalloc(&client->dev, sizeof(*ihid_elan), GFP_KERNEL); if (!ihid_elan) @@ -94,21 +112,38 @@ static int i2c_hid_of_elan_probe(struct i2c_client *client, if (IS_ERR(ihid_elan->reset_gpio)) return PTR_ERR(ihid_elan->reset_gpio);
+ ihid_elan->no_reset_on_power_off = of_property_read_bool(client->dev.of_node, + "no-reset-on-power-off"); + ihid_elan->vccio = devm_regulator_get(&client->dev, "vccio"); - if (IS_ERR(ihid_elan->vccio)) - return PTR_ERR(ihid_elan->vccio); + if (IS_ERR(ihid_elan->vccio)) { + ret = PTR_ERR(ihid_elan->vccio); + goto err_deassert_reset; + }
ihid_elan->chip_data = device_get_match_data(&client->dev);
if (ihid_elan->chip_data->main_supply_name) { ihid_elan->vcc33 = devm_regulator_get(&client->dev, ihid_elan->chip_data->main_supply_name); - if (IS_ERR(ihid_elan->vcc33)) - return PTR_ERR(ihid_elan->vcc33); + if (IS_ERR(ihid_elan->vcc33)) { + ret = PTR_ERR(ihid_elan->vcc33); + goto err_deassert_reset; + } }
- return i2c_hid_core_probe(client, &ihid_elan->ops, - ihid_elan->chip_data->hid_descriptor_address, 0); + ret = i2c_hid_core_probe(client, &ihid_elan->ops, + ihid_elan->chip_data->hid_descriptor_address, 0); + if (ret) + goto err_deassert_reset; + + return 0; + +err_deassert_reset: + if (ihid_elan->no_reset_on_power_off) + gpiod_set_value_cansleep(ihid_elan->reset_gpio, 0); + + return ret; }
static const struct elan_i2c_hid_chip_data elan_ekth6915_chip_data = {
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Russell King (Oracle) rmk+kernel@armlinux.org.uk
[ Upstream commit 373c612d72461ddaea223592df31e62c934aae61 ]
Add fwnode APIs for finding and getting I2C adapters, which will be used by the SFP code. These are passed the fwnode corresponding to the adapter, and return the I2C adapter. It is the responsibility of the caller to find the appropriate fwnode.
We keep the DT and ACPI interfaces, but where appropriate, recode them to use the fwnode interfaces internally.
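A minimal caller-side sketch of the new getter (the fwnode variable and the -EPROBE_DEFER policy are just placeholders for whatever the caller, e.g. the SFP code mentioned above, actually does):

	struct i2c_adapter *adap;

	adap = i2c_get_adapter_by_fwnode(fwnode);	/* takes device + module refs */
	if (!adap)
		return -EPROBE_DEFER;			/* adapter not registered (yet) */

	/* ... perform transfers on adap ... */

	i2c_put_adapter(adap);				/* drops both references */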
Reviewed-by: Mika Westerberg mika.westerberg@linux.intel.com Signed-off-by: Russell King (Oracle) rmk+kernel@armlinux.org.uk Signed-off-by: Wolfram Sang wsa@kernel.org Stable-dep-of: 3f858bbf04db ("i2c: acpi: Unbind mux adapters before delete") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/i2c/i2c-core-acpi.c | 13 +---- drivers/i2c/i2c-core-base.c | 98 +++++++++++++++++++++++++++++++++++++ drivers/i2c/i2c-core-of.c | 66 ------------------------- include/linux/i2c.h | 24 +++++++-- 4 files changed, 120 insertions(+), 81 deletions(-)
diff --git a/drivers/i2c/i2c-core-acpi.c b/drivers/i2c/i2c-core-acpi.c index 4dd777cc0c89f..d6037a3286690 100644 --- a/drivers/i2c/i2c-core-acpi.c +++ b/drivers/i2c/i2c-core-acpi.c @@ -442,18 +442,7 @@ EXPORT_SYMBOL_GPL(i2c_acpi_find_adapter_by_handle);
static struct i2c_client *i2c_acpi_find_client_by_adev(struct acpi_device *adev) { - struct device *dev; - struct i2c_client *client; - - dev = bus_find_device_by_acpi_dev(&i2c_bus_type, adev); - if (!dev) - return NULL; - - client = i2c_verify_client(dev); - if (!client) - put_device(dev); - - return client; + return i2c_find_device_by_fwnode(acpi_fwnode_handle(adev)); }
static int i2c_acpi_notify(struct notifier_block *nb, unsigned long value, diff --git a/drivers/i2c/i2c-core-base.c b/drivers/i2c/i2c-core-base.c index 1ebc953799149..8af82f42af30b 100644 --- a/drivers/i2c/i2c-core-base.c +++ b/drivers/i2c/i2c-core-base.c @@ -1017,6 +1017,35 @@ void i2c_unregister_device(struct i2c_client *client) } EXPORT_SYMBOL_GPL(i2c_unregister_device);
+/** + * i2c_find_device_by_fwnode() - find an i2c_client for the fwnode + * @fwnode: &struct fwnode_handle corresponding to the &struct i2c_client + * + * Look up and return the &struct i2c_client corresponding to the @fwnode. + * If no client can be found, or @fwnode is NULL, this returns NULL. + * + * The user must call put_device(&client->dev) once done with the i2c client. + */ +struct i2c_client *i2c_find_device_by_fwnode(struct fwnode_handle *fwnode) +{ + struct i2c_client *client; + struct device *dev; + + if (!fwnode) + return NULL; + + dev = bus_find_device_by_fwnode(&i2c_bus_type, fwnode); + if (!dev) + return NULL; + + client = i2c_verify_client(dev); + if (!client) + put_device(dev); + + return client; +} +EXPORT_SYMBOL(i2c_find_device_by_fwnode); +
static const struct i2c_device_id dummy_id[] = { { "dummy", 0 }, @@ -1767,6 +1796,75 @@ int devm_i2c_add_adapter(struct device *dev, struct i2c_adapter *adapter) } EXPORT_SYMBOL_GPL(devm_i2c_add_adapter);
+static int i2c_dev_or_parent_fwnode_match(struct device *dev, const void *data) +{ + if (dev_fwnode(dev) == data) + return 1; + + if (dev->parent && dev_fwnode(dev->parent) == data) + return 1; + + return 0; +} + +/** + * i2c_find_adapter_by_fwnode() - find an i2c_adapter for the fwnode + * @fwnode: &struct fwnode_handle corresponding to the &struct i2c_adapter + * + * Look up and return the &struct i2c_adapter corresponding to the @fwnode. + * If no adapter can be found, or @fwnode is NULL, this returns NULL. + * + * The user must call put_device(&adapter->dev) once done with the i2c adapter. + */ +struct i2c_adapter *i2c_find_adapter_by_fwnode(struct fwnode_handle *fwnode) +{ + struct i2c_adapter *adapter; + struct device *dev; + + if (!fwnode) + return NULL; + + dev = bus_find_device(&i2c_bus_type, NULL, fwnode, + i2c_dev_or_parent_fwnode_match); + if (!dev) + return NULL; + + adapter = i2c_verify_adapter(dev); + if (!adapter) + put_device(dev); + + return adapter; +} +EXPORT_SYMBOL(i2c_find_adapter_by_fwnode); + +/** + * i2c_get_adapter_by_fwnode() - find an i2c_adapter for the fwnode + * @fwnode: &struct fwnode_handle corresponding to the &struct i2c_adapter + * + * Look up and return the &struct i2c_adapter corresponding to the @fwnode, + * and increment the adapter module's use count. If no adapter can be found, + * or @fwnode is NULL, this returns NULL. + * + * The user must call i2c_put_adapter(adapter) once done with the i2c adapter. + * Note that this is different from i2c_find_adapter_by_node(). + */ +struct i2c_adapter *i2c_get_adapter_by_fwnode(struct fwnode_handle *fwnode) +{ + struct i2c_adapter *adapter; + + adapter = i2c_find_adapter_by_fwnode(fwnode); + if (!adapter) + return NULL; + + if (!try_module_get(adapter->owner)) { + put_device(&adapter->dev); + adapter = NULL; + } + + return adapter; +} +EXPORT_SYMBOL(i2c_get_adapter_by_fwnode); + static void i2c_parse_timing(struct device *dev, char *prop_name, u32 *cur_val_p, u32 def_val, bool use_def) { diff --git a/drivers/i2c/i2c-core-of.c b/drivers/i2c/i2c-core-of.c index 1073f82d5dd47..545436b7dd535 100644 --- a/drivers/i2c/i2c-core-of.c +++ b/drivers/i2c/i2c-core-of.c @@ -113,72 +113,6 @@ void of_i2c_register_devices(struct i2c_adapter *adap) of_node_put(bus); }
-static int of_dev_or_parent_node_match(struct device *dev, const void *data) -{ - if (dev->of_node == data) - return 1; - - if (dev->parent) - return dev->parent->of_node == data; - - return 0; -} - -/* must call put_device() when done with returned i2c_client device */ -struct i2c_client *of_find_i2c_device_by_node(struct device_node *node) -{ - struct device *dev; - struct i2c_client *client; - - dev = bus_find_device_by_of_node(&i2c_bus_type, node); - if (!dev) - return NULL; - - client = i2c_verify_client(dev); - if (!client) - put_device(dev); - - return client; -} -EXPORT_SYMBOL(of_find_i2c_device_by_node); - -/* must call put_device() when done with returned i2c_adapter device */ -struct i2c_adapter *of_find_i2c_adapter_by_node(struct device_node *node) -{ - struct device *dev; - struct i2c_adapter *adapter; - - dev = bus_find_device(&i2c_bus_type, NULL, node, - of_dev_or_parent_node_match); - if (!dev) - return NULL; - - adapter = i2c_verify_adapter(dev); - if (!adapter) - put_device(dev); - - return adapter; -} -EXPORT_SYMBOL(of_find_i2c_adapter_by_node); - -/* must call i2c_put_adapter() when done with returned i2c_adapter device */ -struct i2c_adapter *of_get_i2c_adapter_by_node(struct device_node *node) -{ - struct i2c_adapter *adapter; - - adapter = of_find_i2c_adapter_by_node(node); - if (!adapter) - return NULL; - - if (!try_module_get(adapter->owner)) { - put_device(&adapter->dev); - adapter = NULL; - } - - return adapter; -} -EXPORT_SYMBOL(of_get_i2c_adapter_by_node); - static const struct of_device_id* i2c_of_match_device_sysfs(const struct of_device_id *matches, struct i2c_client *client) diff --git a/include/linux/i2c.h b/include/linux/i2c.h index f7c49bbdb8a18..cfc59c3371cb2 100644 --- a/include/linux/i2c.h +++ b/include/linux/i2c.h @@ -964,15 +964,33 @@ int i2c_handle_smbus_host_notify(struct i2c_adapter *adap, unsigned short addr);
#endif /* I2C */
+/* must call put_device() when done with returned i2c_client device */ +struct i2c_client *i2c_find_device_by_fwnode(struct fwnode_handle *fwnode); + +/* must call put_device() when done with returned i2c_adapter device */ +struct i2c_adapter *i2c_find_adapter_by_fwnode(struct fwnode_handle *fwnode); + +/* must call i2c_put_adapter() when done with returned i2c_adapter device */ +struct i2c_adapter *i2c_get_adapter_by_fwnode(struct fwnode_handle *fwnode); + #if IS_ENABLED(CONFIG_OF) /* must call put_device() when done with returned i2c_client device */ -struct i2c_client *of_find_i2c_device_by_node(struct device_node *node); +static inline struct i2c_client *of_find_i2c_device_by_node(struct device_node *node) +{ + return i2c_find_device_by_fwnode(of_fwnode_handle(node)); +}
/* must call put_device() when done with returned i2c_adapter device */ -struct i2c_adapter *of_find_i2c_adapter_by_node(struct device_node *node); +static inline struct i2c_adapter *of_find_i2c_adapter_by_node(struct device_node *node) +{ + return i2c_find_adapter_by_fwnode(of_fwnode_handle(node)); +}
/* must call i2c_put_adapter() when done with returned i2c_adapter device */ -struct i2c_adapter *of_get_i2c_adapter_by_node(struct device_node *node); +static inline struct i2c_adapter *of_get_i2c_adapter_by_node(struct device_node *node) +{ + return i2c_get_adapter_by_fwnode(of_fwnode_handle(node)); +}
const struct of_device_id *i2c_of_match_device(const struct of_device_id *matches,
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Hamish Martin hamish.martin@alliedtelesis.co.nz
[ Upstream commit 3f858bbf04dbac934ac279aaee05d49eb9910051 ]
There is an issue with ACPI overlay table removal specifically related to I2C multiplexers.
Consider an ACPI SSDT Overlay that defines a PCA9548 I2C mux on an existing I2C bus. When this table is loaded we see the creation of a device for the overall PCA9548 chip and 8 further devices - one i2c_adapter each for the mux channels. These are all bound to their ACPI equivalents via an eventual invocation of acpi_bind_one().
When we unload the SSDT overlay we run into the problem. The ACPI devices are deleted as normal via acpi_device_del_work_fn() and the acpi_device_del_list.
However, the following warning and stack trace is output as the deletion does not go smoothly: ------------[ cut here ]------------ kernfs: can not remove 'physical_node', no directory WARNING: CPU: 1 PID: 11 at fs/kernfs/dir.c:1674 kernfs_remove_by_name_ns+0xb9/0xc0 Modules linked in: CPU: 1 PID: 11 Comm: kworker/u128:0 Not tainted 6.8.0-rc6+ #1 Hardware name: congatec AG conga-B7E3/conga-B7E3, BIOS 5.13 05/16/2023 Workqueue: kacpi_hotplug acpi_device_del_work_fn RIP: 0010:kernfs_remove_by_name_ns+0xb9/0xc0 Code: e4 00 48 89 ef e8 07 71 db ff 5b b8 fe ff ff ff 5d 41 5c 41 5d e9 a7 55 e4 00 0f 0b eb a6 48 c7 c7 f0 38 0d 9d e8 97 0a d5 ff <0f> 0b eb dc 0f 1f 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 RSP: 0018:ffff9f864008fb28 EFLAGS: 00010286 RAX: 0000000000000000 RBX: ffff8ef90a8d4940 RCX: 0000000000000000 RDX: ffff8f000e267d10 RSI: ffff8f000e25c780 RDI: ffff8f000e25c780 RBP: ffff8ef9186f9870 R08: 0000000000013ffb R09: 00000000ffffbfff R10: 00000000ffffbfff R11: ffff8f000e0a0000 R12: ffff9f864008fb50 R13: ffff8ef90c93dd60 R14: ffff8ef9010d0958 R15: ffff8ef9186f98c8 FS: 0000000000000000(0000) GS:ffff8f000e240000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f48f5253a08 CR3: 00000003cb82e000 CR4: 00000000003506f0 Call Trace: <TASK> ? kernfs_remove_by_name_ns+0xb9/0xc0 ? __warn+0x7c/0x130 ? kernfs_remove_by_name_ns+0xb9/0xc0 ? report_bug+0x171/0x1a0 ? handle_bug+0x3c/0x70 ? exc_invalid_op+0x17/0x70 ? asm_exc_invalid_op+0x1a/0x20 ? kernfs_remove_by_name_ns+0xb9/0xc0 ? kernfs_remove_by_name_ns+0xb9/0xc0 acpi_unbind_one+0x108/0x180 device_del+0x18b/0x490 ? srso_return_thunk+0x5/0x5f ? srso_return_thunk+0x5/0x5f device_unregister+0xd/0x30 i2c_del_adapter.part.0+0x1bf/0x250 i2c_mux_del_adapters+0xa1/0xe0 i2c_device_remove+0x1e/0x80 device_release_driver_internal+0x19a/0x200 bus_remove_device+0xbf/0x100 device_del+0x157/0x490 ? __pfx_device_match_fwnode+0x10/0x10 ? srso_return_thunk+0x5/0x5f device_unregister+0xd/0x30 i2c_acpi_notify+0x10f/0x140 notifier_call_chain+0x58/0xd0 blocking_notifier_call_chain+0x3a/0x60 acpi_device_del_work_fn+0x85/0x1d0 process_one_work+0x134/0x2f0 worker_thread+0x2f0/0x410 ? __pfx_worker_thread+0x10/0x10 kthread+0xe3/0x110 ? __pfx_kthread+0x10/0x10 ret_from_fork+0x2f/0x50 ? __pfx_kthread+0x10/0x10 ret_from_fork_asm+0x1b/0x30 </TASK> ---[ end trace 0000000000000000 ]--- ... repeated 7 more times, 1 for each channel of the mux ...
The issue is that the binding of the ACPI devices to their peer I2C adapters is not correctly cleaned up. Digging deeper into the issue we see that the deletion order is such that the ACPI devices matching the mux channel i2c adapters are deleted first during the SSDT overlay removal. For each of the channels we see a call to i2c_acpi_notify() with ACPI_RECONFIG_DEVICE_REMOVE but, because these devices are not actually i2c_clients, nothing is done for them.
Later on, after each of the mux channels has been dealt with, we come to delete the i2c_client representing the PCA9548 device. This is the call stack we see above, whereby the kernel cleans up the i2c_client including destruction of the mux and its channel adapters. At this point we do attempt to unbind from the ACPI peers but those peers no longer exist and so we hit the kernfs errors.
The fix is to augment i2c_acpi_notify() to handle i2c_adapters. But, given that the life cycle of the adapters is linked to the i2c_client, instead of deleting the i2c_adapters during i2c_acpi_notify(), we just trigger unbinding of the ACPI device from the adapter device and allow the cleanup of the adapter to continue in the way it always has.
Signed-off-by: Hamish Martin hamish.martin@alliedtelesis.co.nz Reviewed-by: Mika Westerberg mika.westerberg@linux.intel.com Reviewed-by: Andi Shyti andi.shyti@kernel.org Fixes: 525e6fabeae2 ("i2c / ACPI: add support for ACPI reconfigure notifications") Cc: stable@vger.kernel.org # v4.8+ Signed-off-by: Wolfram Sang wsa+renesas@sang-engineering.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/i2c/i2c-core-acpi.c | 19 +++++++++++++++---- 1 file changed, 15 insertions(+), 4 deletions(-)
diff --git a/drivers/i2c/i2c-core-acpi.c b/drivers/i2c/i2c-core-acpi.c index d6037a3286690..14ae0cfc325ef 100644 --- a/drivers/i2c/i2c-core-acpi.c +++ b/drivers/i2c/i2c-core-acpi.c @@ -445,6 +445,11 @@ static struct i2c_client *i2c_acpi_find_client_by_adev(struct acpi_device *adev) return i2c_find_device_by_fwnode(acpi_fwnode_handle(adev)); }
+static struct i2c_adapter *i2c_acpi_find_adapter_by_adev(struct acpi_device *adev) +{ + return i2c_find_adapter_by_fwnode(acpi_fwnode_handle(adev)); +} + static int i2c_acpi_notify(struct notifier_block *nb, unsigned long value, void *arg) { @@ -471,11 +476,17 @@ static int i2c_acpi_notify(struct notifier_block *nb, unsigned long value, break;
client = i2c_acpi_find_client_by_adev(adev); - if (!client) - break; + if (client) { + i2c_unregister_device(client); + put_device(&client->dev); + } + + adapter = i2c_acpi_find_adapter_by_adev(adev); + if (adapter) { + acpi_unbind_one(&adapter->dev); + put_device(&adapter->dev); + }
- i2c_unregister_device(client); - put_device(&client->dev); break; }
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Michal Hocko mhocko@suse.com
[ Upstream commit e9c3cda4d86e56bf7fe403729f38c4f0f65d3860 ]
Gao Xiang has reported that the page allocator complains about high order __GFP_NOFAIL request coming from the vmalloc core:
__alloc_pages+0x1cb/0x5b0 mm/page_alloc.c:5549 alloc_pages+0x1aa/0x270 mm/mempolicy.c:2286 vm_area_alloc_pages mm/vmalloc.c:2989 [inline] __vmalloc_area_node mm/vmalloc.c:3057 [inline] __vmalloc_node_range+0x978/0x13c0 mm/vmalloc.c:3227 kvmalloc_node+0x156/0x1a0 mm/util.c:606 kvmalloc include/linux/slab.h:737 [inline] kvmalloc_array include/linux/slab.h:755 [inline] kvcalloc include/linux/slab.h:760 [inline]
It seems that I completely missed the case of high order allocations backing vmalloc areas when implementing __GFP_NOFAIL support. This means that [k]vmalloc et al. can perform higher order allocations with __GFP_NOFAIL, which can easily trigger the OOM killer for non-costly orders or cause a lot of reclaim/compaction activity if those requests cannot be satisfied.
Fix the issue by falling back to zero order allocations for __GFP_NOFAIL requests if the high order request fails.
Link: https://lkml.kernel.org/r/ZAXynvdNqcI0f6Us@dhcp22.suse.cz Fixes: 9376130c390a ("mm/vmalloc: add support for __GFP_NOFAIL") Reported-by: Gao Xiang hsiangkao@linux.alibaba.com Link: https://lkml.kernel.org/r/20230305053035.1911-1-hsiangkao@linux.alibaba.com Signed-off-by: Michal Hocko mhocko@suse.com Reviewed-by: Uladzislau Rezki (Sony) urezki@gmail.com Acked-by: Vlastimil Babka vbabka@suse.cz Cc: Baoquan He bhe@redhat.com Cc: Christoph Hellwig hch@lst.de Cc: Mel Gorman mgorman@techsingularity.net Signed-off-by: Andrew Morton akpm@linux-foundation.org Stable-dep-of: 8e0545c83d67 ("mm/vmalloc: fix vmalloc which may return null if called with __GFP_NOFAIL") Signed-off-by: Sasha Levin sashal@kernel.org --- mm/vmalloc.c | 28 +++++++++++++++++++++++----- 1 file changed, 23 insertions(+), 5 deletions(-)
diff --git a/mm/vmalloc.c b/mm/vmalloc.c index 67a10a04df041..cab30d9497e6b 100644 --- a/mm/vmalloc.c +++ b/mm/vmalloc.c @@ -2923,6 +2923,8 @@ vm_area_alloc_pages(gfp_t gfp, int nid, unsigned int order, unsigned int nr_pages, struct page **pages) { unsigned int nr_allocated = 0; + gfp_t alloc_gfp = gfp; + bool nofail = false; struct page *page; int i;
@@ -2933,6 +2935,7 @@ vm_area_alloc_pages(gfp_t gfp, int nid, * more permissive. */ if (!order) { + /* bulk allocator doesn't support nofail req. officially */ gfp_t bulk_gfp = gfp & ~__GFP_NOFAIL;
while (nr_allocated < nr_pages) { @@ -2971,20 +2974,35 @@ vm_area_alloc_pages(gfp_t gfp, int nid, if (nr != nr_pages_request) break; } + } else if (gfp & __GFP_NOFAIL) { + /* + * Higher order nofail allocations are really expensive and + * potentially dangerous (pre-mature OOM, disruptive reclaim + * and compaction etc. + */ + alloc_gfp &= ~__GFP_NOFAIL; + nofail = true; }
/* High-order pages or fallback path if "bulk" fails. */ - while (nr_allocated < nr_pages) { if (fatal_signal_pending(current)) break;
if (nid == NUMA_NO_NODE) - page = alloc_pages(gfp, order); + page = alloc_pages(alloc_gfp, order); else - page = alloc_pages_node(nid, gfp, order); - if (unlikely(!page)) - break; + page = alloc_pages_node(nid, alloc_gfp, order); + if (unlikely(!page)) { + if (!nofail) + break; + + /* fall back to the zero order allocations */ + alloc_gfp |= __GFP_NOFAIL; + order = 0; + continue; + } + /* * Higher order allocations must be able to be treated as * indepdenent small pages by callers (as they can with
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Hailong.Liu hailong.liu@oppo.com
[ Upstream commit 8e0545c83d672750632f46e3f9ad95c48c91a0fc ]
commit a421ef303008 ("mm: allow !GFP_KERNEL allocations for kvmalloc") includes support for __GFP_NOFAIL, but it presents a conflict with commit dd544141b9eb ("vmalloc: back off when the current task is OOM-killed"). A possible scenario is as follows:
process-a __vmalloc_node_range(GFP_KERNEL | __GFP_NOFAIL) __vmalloc_area_node() vm_area_alloc_pages() --> oom-killer send SIGKILL to process-a if (fatal_signal_pending(current)) break; --> return NULL;
To fix this, do not check fatal_signal_pending() in vm_area_alloc_pages() if __GFP_NOFAIL set.
This issue occurred during OPLUS KASAN TEST. Below is part of the log -> oom-killer sends signal to process [65731.222840] [ T1308] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/apps/uid_10198,task=gs.intelligence,pid=32454,uid=10198
[65731.259685] [T32454] Call trace: [65731.259698] [T32454] dump_backtrace+0xf4/0x118 [65731.259734] [T32454] show_stack+0x18/0x24 [65731.259756] [T32454] dump_stack_lvl+0x60/0x7c [65731.259781] [T32454] dump_stack+0x18/0x38 [65731.259800] [T32454] mrdump_common_die+0x250/0x39c [mrdump] [65731.259936] [T32454] ipanic_die+0x20/0x34 [mrdump] [65731.260019] [T32454] atomic_notifier_call_chain+0xb4/0xfc [65731.260047] [T32454] notify_die+0x114/0x198 [65731.260073] [T32454] die+0xf4/0x5b4 [65731.260098] [T32454] die_kernel_fault+0x80/0x98 [65731.260124] [T32454] __do_kernel_fault+0x160/0x2a8 [65731.260146] [T32454] do_bad_area+0x68/0x148 [65731.260174] [T32454] do_mem_abort+0x151c/0x1b34 [65731.260204] [T32454] el1_abort+0x3c/0x5c [65731.260227] [T32454] el1h_64_sync_handler+0x54/0x90 [65731.260248] [T32454] el1h_64_sync+0x68/0x6c
[65731.260269] [T32454] z_erofs_decompress_queue+0x7f0/0x2258 --> be->decompressed_pages = kvcalloc(be->nr_pages, sizeof(struct page *), GFP_KERNEL | __GFP_NOFAIL); kernel panic by NULL pointer dereference. erofs assume kvmalloc with __GFP_NOFAIL never return NULL. [65731.260293] [T32454] z_erofs_runqueue+0xf30/0x104c [65731.260314] [T32454] z_erofs_readahead+0x4f0/0x968 [65731.260339] [T32454] read_pages+0x170/0xadc [65731.260364] [T32454] page_cache_ra_unbounded+0x874/0xf30 [65731.260388] [T32454] page_cache_ra_order+0x24c/0x714 [65731.260411] [T32454] filemap_fault+0xbf0/0x1a74 [65731.260437] [T32454] __do_fault+0xd0/0x33c [65731.260462] [T32454] handle_mm_fault+0xf74/0x3fe0 [65731.260486] [T32454] do_mem_abort+0x54c/0x1b34 [65731.260509] [T32454] el0_da+0x44/0x94 [65731.260531] [T32454] el0t_64_sync_handler+0x98/0xb4 [65731.260553] [T32454] el0t_64_sync+0x198/0x19c
Link: https://lkml.kernel.org/r/20240510100131.1865-1-hailong.liu@oppo.com Fixes: 9376130c390a ("mm/vmalloc: add support for __GFP_NOFAIL") Signed-off-by: Hailong.Liu hailong.liu@oppo.com Acked-by: Michal Hocko mhocko@suse.com Suggested-by: Barry Song 21cnbao@gmail.com Reported-by: Oven liyangouwen1@oppo.com Reviewed-by: Barry Song baohua@kernel.org Reviewed-by: Uladzislau Rezki (Sony) urezki@gmail.com Cc: Chao Yu chao@kernel.org Cc: Christoph Hellwig hch@infradead.org Cc: Gao Xiang xiang@kernel.org Cc: Lorenzo Stoakes lstoakes@gmail.com Cc: Michal Hocko mhocko@suse.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- mm/vmalloc.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/mm/vmalloc.c b/mm/vmalloc.c index cab30d9497e6b..c5e30b52844c8 100644 --- a/mm/vmalloc.c +++ b/mm/vmalloc.c @@ -2924,7 +2924,7 @@ vm_area_alloc_pages(gfp_t gfp, int nid, { unsigned int nr_allocated = 0; gfp_t alloc_gfp = gfp; - bool nofail = false; + bool nofail = gfp & __GFP_NOFAIL; struct page *page; int i;
@@ -2981,12 +2981,11 @@ vm_area_alloc_pages(gfp_t gfp, int nid, * and compaction etc. */ alloc_gfp &= ~__GFP_NOFAIL; - nofail = true; }
/* High-order pages or fallback path if "bulk" fails. */ while (nr_allocated < nr_pages) { - if (fatal_signal_pending(current)) + if (!nofail && fatal_signal_pending(current)) break;
if (nid == NUMA_NO_NODE)
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dev Jain dev.jain@arm.com
[ Upstream commit 9ad665ef55eaad1ead1406a58a34f615a7c18b5e ]
Currently, the test tries to set nr_hugepages to zero, but that is not actually done because the file offset is not reset after read(). Fix that using lseek().
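The underlying file-offset behaviour can be reproduced with a few lines of userspace code (standalone sketch; needs root to actually write the sysctl):

#include <fcntl.h>
#include <unistd.h>

int main(void)
{
	char buf[16];
	int fd = open("/proc/sys/vm/nr_hugepages", O_RDWR | O_NONBLOCK);

	if (fd < 0)
		return 1;
	read(fd, buf, sizeof(buf));	/* offset now points past the stored value */
	lseek(fd, 0, SEEK_SET);		/* without this, the write below is misplaced */
	write(fd, "0", 1);
	close(fd);
	return 0;
}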
Link: https://lkml.kernel.org/r/20240521074358.675031-3-dev.jain@arm.com Fixes: bd67d5c15cc1 ("Test compaction of mlocked memory") Signed-off-by: Dev Jain dev.jain@arm.com Cc: stable@vger.kernel.org Cc: Anshuman Khandual anshuman.khandual@arm.com Cc: Shuah Khan shuah@kernel.org Cc: Sri Jayaramappa sjayaram@akamai.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- tools/testing/selftests/vm/compaction_test.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/tools/testing/selftests/vm/compaction_test.c b/tools/testing/selftests/vm/compaction_test.c index 9b420140ba2ba..55dec92e1e58c 100644 --- a/tools/testing/selftests/vm/compaction_test.c +++ b/tools/testing/selftests/vm/compaction_test.c @@ -103,6 +103,8 @@ int check_compaction(unsigned long mem_free, unsigned int hugepage_size) goto close_fd; }
+ lseek(fd, 0, SEEK_SET); + /* Start with the initial condition of 0 huge pages*/ if (write(fd, "0", sizeof(char)) != sizeof(char)) { perror("Failed to write 0 to /proc/sys/vm/nr_hugepages\n");
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Muhammad Usama Anjum usama.anjum@collabora.com
[ Upstream commit 9a21701edc41465de56f97914741bfb7bfc2517d ]
Conform the layout, informational and status messages to TAP. No functional change is intended other than the layout of output messages.
Link: https://lkml.kernel.org/r/20240101083614.1076768-1-usama.anjum@collabora.com Signed-off-by: Muhammad Usama Anjum usama.anjum@collabora.com Cc: Shuah Khan shuah@kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Stable-dep-of: d4202e66a4b1 ("selftests/mm: compaction_test: fix bogus test success on Aarch64") Signed-off-by: Sasha Levin sashal@kernel.org --- tools/testing/selftests/vm/compaction_test.c | 91 ++++++++++---------- 1 file changed, 44 insertions(+), 47 deletions(-)
diff --git a/tools/testing/selftests/vm/compaction_test.c b/tools/testing/selftests/vm/compaction_test.c index 55dec92e1e58c..f81931c1f8386 100644 --- a/tools/testing/selftests/vm/compaction_test.c +++ b/tools/testing/selftests/vm/compaction_test.c @@ -33,7 +33,7 @@ int read_memory_info(unsigned long *memfree, unsigned long *hugepagesize) FILE *cmdfile = popen(cmd, "r");
if (!(fgets(buffer, sizeof(buffer), cmdfile))) { - perror("Failed to read meminfo\n"); + ksft_print_msg("Failed to read meminfo: %s\n", strerror(errno)); return -1; }
@@ -44,7 +44,7 @@ int read_memory_info(unsigned long *memfree, unsigned long *hugepagesize) cmdfile = popen(cmd, "r");
if (!(fgets(buffer, sizeof(buffer), cmdfile))) { - perror("Failed to read meminfo\n"); + ksft_print_msg("Failed to read meminfo: %s\n", strerror(errno)); return -1; }
@@ -62,14 +62,14 @@ int prereq(void) fd = open("/proc/sys/vm/compact_unevictable_allowed", O_RDONLY | O_NONBLOCK); if (fd < 0) { - perror("Failed to open\n" - "/proc/sys/vm/compact_unevictable_allowed\n"); + ksft_print_msg("Failed to open /proc/sys/vm/compact_unevictable_allowed: %s\n", + strerror(errno)); return -1; }
if (read(fd, &allowed, sizeof(char)) != sizeof(char)) { - perror("Failed to read from\n" - "/proc/sys/vm/compact_unevictable_allowed\n"); + ksft_print_msg("Failed to read from /proc/sys/vm/compact_unevictable_allowed: %s\n", + strerror(errno)); close(fd); return -1; } @@ -78,12 +78,13 @@ int prereq(void) if (allowed == '1') return 0;
+ ksft_print_msg("Compaction isn't allowed\n"); return -1; }
int check_compaction(unsigned long mem_free, unsigned int hugepage_size) { - int fd; + int fd, ret = -1; int compaction_index = 0; char initial_nr_hugepages[10] = {0}; char nr_hugepages[10] = {0}; @@ -94,12 +95,14 @@ int check_compaction(unsigned long mem_free, unsigned int hugepage_size)
fd = open("/proc/sys/vm/nr_hugepages", O_RDWR | O_NONBLOCK); if (fd < 0) { - perror("Failed to open /proc/sys/vm/nr_hugepages"); + ksft_test_result_fail("Failed to open /proc/sys/vm/nr_hugepages: %s\n", + strerror(errno)); return -1; }
if (read(fd, initial_nr_hugepages, sizeof(initial_nr_hugepages)) <= 0) { - perror("Failed to read from /proc/sys/vm/nr_hugepages"); + ksft_test_result_fail("Failed to read from /proc/sys/vm/nr_hugepages: %s\n", + strerror(errno)); goto close_fd; }
@@ -107,7 +110,8 @@ int check_compaction(unsigned long mem_free, unsigned int hugepage_size)
/* Start with the initial condition of 0 huge pages*/ if (write(fd, "0", sizeof(char)) != sizeof(char)) { - perror("Failed to write 0 to /proc/sys/vm/nr_hugepages\n"); + ksft_test_result_fail("Failed to write 0 to /proc/sys/vm/nr_hugepages: %s\n", + strerror(errno)); goto close_fd; }
@@ -116,14 +120,16 @@ int check_compaction(unsigned long mem_free, unsigned int hugepage_size) /* Request a large number of huge pages. The Kernel will allocate as much as it can */ if (write(fd, "100000", (6*sizeof(char))) != (6*sizeof(char))) { - perror("Failed to write 100000 to /proc/sys/vm/nr_hugepages\n"); + ksft_test_result_fail("Failed to write 100000 to /proc/sys/vm/nr_hugepages: %s\n", + strerror(errno)); goto close_fd; }
lseek(fd, 0, SEEK_SET);
if (read(fd, nr_hugepages, sizeof(nr_hugepages)) <= 0) { - perror("Failed to re-read from /proc/sys/vm/nr_hugepages\n"); + ksft_test_result_fail("Failed to re-read from /proc/sys/vm/nr_hugepages: %s\n", + strerror(errno)); goto close_fd; }
@@ -131,67 +137,58 @@ int check_compaction(unsigned long mem_free, unsigned int hugepage_size) huge pages */ compaction_index = mem_free/(atoi(nr_hugepages) * hugepage_size);
- if (compaction_index > 3) { - printf("No of huge pages allocated = %d\n", - (atoi(nr_hugepages))); - fprintf(stderr, "ERROR: Less that 1/%d of memory is available\n" - "as huge pages\n", compaction_index); - goto close_fd; - } - - printf("No of huge pages allocated = %d\n", - (atoi(nr_hugepages))); - lseek(fd, 0, SEEK_SET);
if (write(fd, initial_nr_hugepages, strlen(initial_nr_hugepages)) != strlen(initial_nr_hugepages)) { - perror("Failed to write value to /proc/sys/vm/nr_hugepages\n"); + ksft_test_result_fail("Failed to write value to /proc/sys/vm/nr_hugepages: %s\n", + strerror(errno)); goto close_fd; }
- close(fd); - return 0; + if (compaction_index > 3) { + ksft_print_msg("ERROR: Less that 1/%d of memory is available\n" + "as huge pages\n", compaction_index); + ksft_test_result_fail("No of huge pages allocated = %d\n", (atoi(nr_hugepages))); + goto close_fd; + } + + ksft_test_result_pass("Memory compaction succeeded. No of huge pages allocated = %d\n", + (atoi(nr_hugepages))); + ret = 0;
close_fd: close(fd); - printf("Not OK. Compaction test failed."); - return -1; + return ret; }
int main(int argc, char **argv) { struct rlimit lim; - struct map_list *list, *entry; + struct map_list *list = NULL, *entry; size_t page_size, i; void *map = NULL; unsigned long mem_free = 0; unsigned long hugepage_size = 0; long mem_fragmentable_MB = 0;
- if (prereq() != 0) { - printf("Either the sysctl compact_unevictable_allowed is not\n" - "set to 1 or couldn't read the proc file.\n" - "Skipping the test\n"); - return KSFT_SKIP; - } + ksft_print_header(); + + if (prereq() != 0) + return ksft_exit_pass(); + + ksft_set_plan(1);
lim.rlim_cur = RLIM_INFINITY; lim.rlim_max = RLIM_INFINITY; - if (setrlimit(RLIMIT_MEMLOCK, &lim)) { - perror("Failed to set rlimit:\n"); - return -1; - } + if (setrlimit(RLIMIT_MEMLOCK, &lim)) + ksft_exit_fail_msg("Failed to set rlimit: %s\n", strerror(errno));
page_size = getpagesize();
- list = NULL; - - if (read_memory_info(&mem_free, &hugepage_size) != 0) { - printf("ERROR: Cannot read meminfo\n"); - return -1; - } + if (read_memory_info(&mem_free, &hugepage_size) != 0) + ksft_exit_fail_msg("Failed to get meminfo\n");
mem_fragmentable_MB = mem_free * 0.8 / 1024;
@@ -227,7 +224,7 @@ int main(int argc, char **argv) }
if (check_compaction(mem_free, hugepage_size) == 0) - return 0; + return ksft_exit_pass();
- return -1; + return ksft_exit_fail(); }
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mark Brown broonie@kernel.org
[ Upstream commit f3b7568c49420d2dcd251032c9ca1e069ec8a6c9 ]
Every test result report in the compaction test prints a distinct log message, and some of the reports print a name that varies at runtime. This causes problems for automation since a lot of automation software uses the printed string as the name of the test; if the name varies from run to run and from pass to fail, then the automation software can't identify that a test changed result or that the same tests are being run.
Refactor the logging to use a consistent name when printing the result of the test, printing the existing messages as diagnostic information instead so they are still available for people trying to interpret the results.
Link: https://lkml.kernel.org/r/20240209-kselftest-mm-cleanup-v1-2-a3c0386496b5@ke... Signed-off-by: Mark Brown broonie@kernel.org Cc: Muhammad Usama Anjum usama.anjum@collabora.com Cc: Ryan Roberts ryan.roberts@arm.com Cc: Shuah Khan shuah@kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Stable-dep-of: d4202e66a4b1 ("selftests/mm: compaction_test: fix bogus test success on Aarch64") Signed-off-by: Sasha Levin sashal@kernel.org --- tools/testing/selftests/vm/compaction_test.c | 35 +++++++++++--------- 1 file changed, 19 insertions(+), 16 deletions(-)
diff --git a/tools/testing/selftests/vm/compaction_test.c b/tools/testing/selftests/vm/compaction_test.c index f81931c1f8386..6aa6460b854ea 100644 --- a/tools/testing/selftests/vm/compaction_test.c +++ b/tools/testing/selftests/vm/compaction_test.c @@ -95,14 +95,15 @@ int check_compaction(unsigned long mem_free, unsigned int hugepage_size)
fd = open("/proc/sys/vm/nr_hugepages", O_RDWR | O_NONBLOCK); if (fd < 0) { - ksft_test_result_fail("Failed to open /proc/sys/vm/nr_hugepages: %s\n", - strerror(errno)); - return -1; + ksft_print_msg("Failed to open /proc/sys/vm/nr_hugepages: %s\n", + strerror(errno)); + ret = -1; + goto out; }
if (read(fd, initial_nr_hugepages, sizeof(initial_nr_hugepages)) <= 0) { - ksft_test_result_fail("Failed to read from /proc/sys/vm/nr_hugepages: %s\n", - strerror(errno)); + ksft_print_msg("Failed to read from /proc/sys/vm/nr_hugepages: %s\n", + strerror(errno)); goto close_fd; }
@@ -110,8 +111,8 @@ int check_compaction(unsigned long mem_free, unsigned int hugepage_size)
/* Start with the initial condition of 0 huge pages*/ if (write(fd, "0", sizeof(char)) != sizeof(char)) { - ksft_test_result_fail("Failed to write 0 to /proc/sys/vm/nr_hugepages: %s\n", - strerror(errno)); + ksft_print_msg("Failed to write 0 to /proc/sys/vm/nr_hugepages: %s\n", + strerror(errno)); goto close_fd; }
@@ -120,16 +121,16 @@ int check_compaction(unsigned long mem_free, unsigned int hugepage_size) /* Request a large number of huge pages. The Kernel will allocate as much as it can */ if (write(fd, "100000", (6*sizeof(char))) != (6*sizeof(char))) { - ksft_test_result_fail("Failed to write 100000 to /proc/sys/vm/nr_hugepages: %s\n", - strerror(errno)); + ksft_print_msg("Failed to write 100000 to /proc/sys/vm/nr_hugepages: %s\n", + strerror(errno)); goto close_fd; }
lseek(fd, 0, SEEK_SET);
if (read(fd, nr_hugepages, sizeof(nr_hugepages)) <= 0) { - ksft_test_result_fail("Failed to re-read from /proc/sys/vm/nr_hugepages: %s\n", - strerror(errno)); + ksft_print_msg("Failed to re-read from /proc/sys/vm/nr_hugepages: %s\n", + strerror(errno)); goto close_fd; }
@@ -141,24 +142,26 @@ int check_compaction(unsigned long mem_free, unsigned int hugepage_size)
if (write(fd, initial_nr_hugepages, strlen(initial_nr_hugepages)) != strlen(initial_nr_hugepages)) { - ksft_test_result_fail("Failed to write value to /proc/sys/vm/nr_hugepages: %s\n", - strerror(errno)); + ksft_print_msg("Failed to write value to /proc/sys/vm/nr_hugepages: %s\n", + strerror(errno)); goto close_fd; }
+ ksft_print_msg("Number of huge pages allocated = %d\n", + atoi(nr_hugepages)); + if (compaction_index > 3) { ksft_print_msg("ERROR: Less that 1/%d of memory is available\n" "as huge pages\n", compaction_index); - ksft_test_result_fail("No of huge pages allocated = %d\n", (atoi(nr_hugepages))); goto close_fd; }
- ksft_test_result_pass("Memory compaction succeeded. No of huge pages allocated = %d\n", - (atoi(nr_hugepages))); ret = 0;
close_fd: close(fd); + out: + ksft_test_result(ret == 0, "check_compaction\n"); return ret; }
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dev Jain dev.jain@arm.com
[ Upstream commit d4202e66a4b1fe6968f17f9f09bbc30d08f028a1 ]
Patch series "Fixes for compaction_test", v2.
The compaction_test memory selftest introduces fragmentation in memory and then tries to allocate as many hugepages as possible. This series addresses some problems.
On Aarch64, if nr_hugepages == 0, then the test trivially succeeds since compaction_index becomes 0, which is less than 3, due to no division by zero exception being raised. We fix that by checking for division by zero.
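A minimal standalone C sketch of that failure mode (the values are invented and this is not part of the selftest; on arm64 the udiv instruction returns 0 for a zero divisor instead of trapping, while x86 raises SIGFPE):

  #include <stdio.h>

  int main(void)
  {
  	/* Hypothetical values mirroring the math in check_compaction() */
  	volatile unsigned long nr_hugepages = 0;	/* nothing could be allocated */
  	unsigned long mem_free = 8UL * 1024 * 1024;	/* KB of free memory */
  	unsigned long hugepage_size = 2048;		/* KB per huge page */

  	/* Division by zero is undefined behaviour in C; on arm64 the
  	 * hardware divide silently yields 0, so compaction_index stays
  	 * below 3 and the old failure check never triggers. */
  	int compaction_index = mem_free / (nr_hugepages * hugepage_size);

  	printf("compaction_index = %d\n", compaction_index);
  	return 0;
  }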
Secondly, correctly set the number of hugepages to zero before trying to set a large number of them.
Now, consider a situation in which, at the start of the test, a non-zero number of hugepages have been already set (while running the entire selftests/mm suite, or manually by the admin). The test operates on 80% of memory to avoid OOM-killer invocation, and because some memory is already blocked by hugepages, it would increase the chance of OOM-killing. Also, since mem_free used in check_compaction() is the value before we set nr_hugepages to zero, the chance that the compaction_index will be small is very high if the preset nr_hugepages was high, leading to a bogus test success.
This patch (of 3):
Currently, if at runtime we are not able to allocate a huge page, the test will trivially pass on Aarch64 due to no exception being raised on division by zero while computing compaction_index. Fix that by checking for nr_hugepages == 0. Anyway, in general, avoid a division by zero by exiting the program beforehand. While at it, fix a typo, and handle the case where the number of hugepages may overflow an integer.
Link: https://lkml.kernel.org/r/20240521074358.675031-1-dev.jain@arm.com Link: https://lkml.kernel.org/r/20240521074358.675031-2-dev.jain@arm.com Fixes: bd67d5c15cc1 ("Test compaction of mlocked memory") Signed-off-by: Dev Jain dev.jain@arm.com Cc: Anshuman Khandual anshuman.khandual@arm.com Cc: Shuah Khan shuah@kernel.org Cc: Sri Jayaramappa sjayaram@akamai.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- tools/testing/selftests/vm/compaction_test.c | 20 +++++++++++++------- 1 file changed, 13 insertions(+), 7 deletions(-)
diff --git a/tools/testing/selftests/vm/compaction_test.c b/tools/testing/selftests/vm/compaction_test.c index 6aa6460b854ea..309b3750e57e1 100644 --- a/tools/testing/selftests/vm/compaction_test.c +++ b/tools/testing/selftests/vm/compaction_test.c @@ -82,12 +82,13 @@ int prereq(void) return -1; }
-int check_compaction(unsigned long mem_free, unsigned int hugepage_size) +int check_compaction(unsigned long mem_free, unsigned long hugepage_size) { + unsigned long nr_hugepages_ul; int fd, ret = -1; int compaction_index = 0; - char initial_nr_hugepages[10] = {0}; - char nr_hugepages[10] = {0}; + char initial_nr_hugepages[20] = {0}; + char nr_hugepages[20] = {0};
/* We want to test with 80% of available memory. Else, OOM killer comes in to play */ @@ -136,7 +137,12 @@ int check_compaction(unsigned long mem_free, unsigned int hugepage_size)
/* We should have been able to request at least 1/3 rd of the memory in huge pages */ - compaction_index = mem_free/(atoi(nr_hugepages) * hugepage_size); + nr_hugepages_ul = strtoul(nr_hugepages, NULL, 10); + if (!nr_hugepages_ul) { + ksft_print_msg("ERROR: No memory is available as huge pages\n"); + goto close_fd; + } + compaction_index = mem_free/(nr_hugepages_ul * hugepage_size);
lseek(fd, 0, SEEK_SET);
@@ -147,11 +153,11 @@ int check_compaction(unsigned long mem_free, unsigned int hugepage_size) goto close_fd; }
- ksft_print_msg("Number of huge pages allocated = %d\n", - atoi(nr_hugepages)); + ksft_print_msg("Number of huge pages allocated = %lu\n", + nr_hugepages_ul);
if (compaction_index > 3) { - ksft_print_msg("ERROR: Less that 1/%d of memory is available\n" + ksft_print_msg("ERROR: Less than 1/%d of memory is available\n" "as huge pages\n", compaction_index); goto close_fd; }
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dmitry Baryshkov dmitry.baryshkov@linaro.org
[ Upstream commit 21ae74e1bf18331ae5e279bd96304b3630828009 ]
If ath10k_snoc is built-in, while Qualcomm remoteprocs are built as modules, compilation fails with:
/usr/bin/aarch64-linux-gnu-ld: drivers/net/wireless/ath/ath10k/snoc.o: in function `ath10k_modem_init': drivers/net/wireless/ath/ath10k/snoc.c:1534: undefined reference to `qcom_register_ssr_notifier' /usr/bin/aarch64-linux-gnu-ld: drivers/net/wireless/ath/ath10k/snoc.o: in function `ath10k_modem_deinit': drivers/net/wireless/ath/ath10k/snoc.c:1551: undefined reference to `qcom_unregister_ssr_notifier'
Add the corresponding dependency to the ATH10K_SNOC Kconfig entry so that it's built as a module if QCOM_RPROC_COMMON is built as a module too.
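For context, "FOO || FOO=n" is the standard Kconfig idiom for this kind of dependency: it evaluates to y when FOO is y or n, but only to m when FOO is m, so the dependent symbol can no longer be built-in while its dependency is modular. A rough sketch of the resulting entry (abbreviated from the hunk below):

  config ATH10K_SNOC
  	tristate "Qualcomm ath10k SNOC support"
  	depends on ATH10K
  	# restricts ATH10K_SNOC to =m whenever QCOM_RPROC_COMMON is =m
  	depends on QCOM_RPROC_COMMON || QCOM_RPROC_COMMON=n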
Fixes: 747ff7d3d742 ("ath10k: Don't always treat modem stop events as crashes") Cc: stable@vger.kernel.org Signed-off-by: Dmitry Baryshkov dmitry.baryshkov@linaro.org Signed-off-by: Kalle Valo quic_kvalo@quicinc.com Link: https://msgid.link/20240511-ath10k-snoc-dep-v1-1-9666e3af5c27@linaro.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/ath/ath10k/Kconfig | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/net/wireless/ath/ath10k/Kconfig +++ b/drivers/net/wireless/ath/ath10k/Kconfig @@ -44,6 +44,7 @@ config ATH10K_SNOC tristate "Qualcomm ath10k SNOC support" depends on ATH10K depends on ARCH_QCOM || COMPILE_TEST + depends on QCOM_RPROC_COMMON || QCOM_RPROC_COMMON=n select QCOM_SCM select QCOM_QMI_HELPERS help
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Filipe Manana fdmanana@suse.com
[ Upstream commit 184533e3618f4d0b382c1ef3de0ce34e849005d7 ]
We have a few static functions in disk-io.c for which we have forward declarations of their prototypes, but they're not needed because all those functions are defined before they are called, so remove them.
Signed-off-by: Filipe Manana fdmanana@suse.com Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Stable-dep-of: fb33eb2ef0d8 ("btrfs: fix leak of qgroup extent records after transaction abort") Signed-off-by: Sasha Levin sashal@kernel.org --- fs/btrfs/disk-io.c | 9 --------- 1 file changed, 9 deletions(-)
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index 5756edb37c61e..0111eda33aa9c 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -51,15 +51,6 @@ BTRFS_SUPER_FLAG_METADUMP |\ BTRFS_SUPER_FLAG_METADUMP_V2)
-static void btrfs_destroy_ordered_extents(struct btrfs_root *root); -static int btrfs_destroy_delayed_refs(struct btrfs_transaction *trans, - struct btrfs_fs_info *fs_info); -static void btrfs_destroy_delalloc_inodes(struct btrfs_root *root); -static int btrfs_destroy_marked_extents(struct btrfs_fs_info *fs_info, - struct extent_io_tree *dirty_pages, - int mark); -static int btrfs_destroy_pinned_extent(struct btrfs_fs_info *fs_info, - struct extent_io_tree *pinned_extents); static int btrfs_cleanup_transaction(struct btrfs_fs_info *fs_info); static void btrfs_error_commit_super(struct btrfs_fs_info *fs_info);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Filipe Manana fdmanana@suse.com
[ Upstream commit 99f09ce309b8307ce8dca209f936e99a7c332214 ]
btrfs_destroy_delayed_refs() always returns 0 and its single caller does not check its return value, as it also returns void, and so does the callers' caller and so on. This is because we are in the transaction abort path, where we have no way to deal with errors (we are in a critical situation) and all cleanup of resources works in a best effort fashion. So make btrfs_destroy_delayed_refs() return void.
Reviewed-by: Qu Wenruo wqu@suse.com Signed-off-by: Filipe Manana fdmanana@suse.com Signed-off-by: David Sterba dsterba@suse.com Stable-dep-of: fb33eb2ef0d8 ("btrfs: fix leak of qgroup extent records after transaction abort") Signed-off-by: Sasha Levin sashal@kernel.org --- fs/btrfs/disk-io.c | 9 +++------ 1 file changed, 3 insertions(+), 6 deletions(-)
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index 0111eda33aa9c..5eac900f5d168 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -4939,13 +4939,12 @@ static void btrfs_destroy_all_ordered_extents(struct btrfs_fs_info *fs_info) btrfs_wait_ordered_roots(fs_info, U64_MAX, 0, (u64)-1); }
-static int btrfs_destroy_delayed_refs(struct btrfs_transaction *trans, - struct btrfs_fs_info *fs_info) +static void btrfs_destroy_delayed_refs(struct btrfs_transaction *trans, + struct btrfs_fs_info *fs_info) { struct rb_node *node; struct btrfs_delayed_ref_root *delayed_refs; struct btrfs_delayed_ref_node *ref; - int ret = 0;
delayed_refs = &trans->delayed_refs;
@@ -4953,7 +4952,7 @@ static int btrfs_destroy_delayed_refs(struct btrfs_transaction *trans, if (atomic_read(&delayed_refs->num_entries) == 0) { spin_unlock(&delayed_refs->lock); btrfs_debug(fs_info, "delayed_refs has NO entry"); - return ret; + return; }
while ((node = rb_first_cached(&delayed_refs->href_root)) != NULL) { @@ -5015,8 +5014,6 @@ static int btrfs_destroy_delayed_refs(struct btrfs_transaction *trans, btrfs_qgroup_destroy_extent_records(trans);
spin_unlock(&delayed_refs->lock); - - return ret; }
static void btrfs_destroy_delalloc_inodes(struct btrfs_root *root)
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Filipe Manana fdmanana@suse.com
[ Upstream commit fb33eb2ef0d88e75564983ef057b44c5b7e4fded ]
Qgroup extent records are created when delayed ref heads are created and then released after accounting extents at btrfs_qgroup_account_extents(), called during the transaction commit path.
If a transaction is aborted, we free the qgroup records by calling btrfs_qgroup_destroy_extent_records() at btrfs_destroy_delayed_refs(), unless we don't have delayed references. We are incorrectly assuming that having no delayed references means we don't have qgroup extent records.
We can currently have no delayed references because we ran them all during a transaction commit and the transaction was aborted after that due to some error in the commit path.
So fix this by ensuring we call btrfs_qgroup_destroy_extent_records() at btrfs_destroy_delayed_refs() even if we don't have any delayed references.
Reported-by: syzbot+0fecc032fa134afd49df@syzkaller.appspotmail.com Link: https://lore.kernel.org/linux-btrfs/0000000000004e7f980619f91835@google.com/ Fixes: 81f7eb00ff5b ("btrfs: destroy qgroup extent records on transaction abort") CC: stable@vger.kernel.org # 6.1+ Reviewed-by: Josef Bacik josef@toxicpanda.com Reviewed-by: Qu Wenruo wqu@suse.com Signed-off-by: Filipe Manana fdmanana@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/btrfs/disk-io.c | 10 +--------- 1 file changed, 1 insertion(+), 9 deletions(-)
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index 5eac900f5d168..c17232659942d 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -4943,18 +4943,10 @@ static void btrfs_destroy_delayed_refs(struct btrfs_transaction *trans, struct btrfs_fs_info *fs_info) { struct rb_node *node; - struct btrfs_delayed_ref_root *delayed_refs; + struct btrfs_delayed_ref_root *delayed_refs = &trans->delayed_refs; struct btrfs_delayed_ref_node *ref;
- delayed_refs = &trans->delayed_refs; - spin_lock(&delayed_refs->lock); - if (atomic_read(&delayed_refs->num_entries) == 0) { - spin_unlock(&delayed_refs->lock); - btrfs_debug(fs_info, "delayed_refs has NO entry"); - return; - } - while ((node = rb_first_cached(&delayed_refs->href_root)) != NULL) { struct btrfs_delayed_ref_head *head; struct rb_node *n;
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Matthew Wilcox (Oracle) willy@infradead.org
[ Upstream commit 09a46acb3697e50548bb265afa1d79163659dd85 ]
In preparation for switching from kmap() to kmap_local(), return the kmap address from nilfs_get_page() instead of having the caller look up page_address().
[konishi.ryusuke: fixed a missing blank line after declaration] Link: https://lkml.kernel.org/r/20231127143036.2425-7-konishi.ryusuke@gmail.com Signed-off-by: Matthew Wilcox (Oracle) willy@infradead.org Signed-off-by: Ryusuke Konishi konishi.ryusuke@gmail.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Stable-dep-of: 7373a51e7998 ("nilfs2: fix nilfs_empty_dir() misjudgment and long loop on I/O errors") Signed-off-by: Sasha Levin sashal@kernel.org --- fs/nilfs2/dir.c | 57 +++++++++++++++++++++++-------------------------- 1 file changed, 27 insertions(+), 30 deletions(-)
diff --git a/fs/nilfs2/dir.c b/fs/nilfs2/dir.c index 760405da852f6..4911f09eb68b0 100644 --- a/fs/nilfs2/dir.c +++ b/fs/nilfs2/dir.c @@ -186,19 +186,24 @@ static bool nilfs_check_page(struct page *page) return false; }
-static struct page *nilfs_get_page(struct inode *dir, unsigned long n) +static void *nilfs_get_page(struct inode *dir, unsigned long n, + struct page **pagep) { struct address_space *mapping = dir->i_mapping; struct page *page = read_mapping_page(mapping, n, NULL); + void *kaddr;
- if (!IS_ERR(page)) { - kmap(page); - if (unlikely(!PageChecked(page))) { - if (!nilfs_check_page(page)) - goto fail; - } + if (IS_ERR(page)) + return page; + + kaddr = kmap(page); + if (unlikely(!PageChecked(page))) { + if (!nilfs_check_page(page)) + goto fail; } - return page; + + *pagep = page; + return kaddr;
fail: nilfs_put_page(page); @@ -275,14 +280,14 @@ static int nilfs_readdir(struct file *file, struct dir_context *ctx) for ( ; n < npages; n++, offset = 0) { char *kaddr, *limit; struct nilfs_dir_entry *de; - struct page *page = nilfs_get_page(inode, n); + struct page *page;
- if (IS_ERR(page)) { + kaddr = nilfs_get_page(inode, n, &page); + if (IS_ERR(kaddr)) { nilfs_error(sb, "bad page in #%lu", inode->i_ino); ctx->pos += PAGE_SIZE - offset; return -EIO; } - kaddr = page_address(page); de = (struct nilfs_dir_entry *)(kaddr + offset); limit = kaddr + nilfs_last_byte(inode, n) - NILFS_DIR_REC_LEN(1); @@ -345,11 +350,9 @@ nilfs_find_entry(struct inode *dir, const struct qstr *qstr, start = 0; n = start; do { - char *kaddr; + char *kaddr = nilfs_get_page(dir, n, &page);
- page = nilfs_get_page(dir, n); - if (!IS_ERR(page)) { - kaddr = page_address(page); + if (!IS_ERR(kaddr)) { de = (struct nilfs_dir_entry *)kaddr; kaddr += nilfs_last_byte(dir, n) - reclen; while ((char *) de <= kaddr) { @@ -387,15 +390,11 @@ nilfs_find_entry(struct inode *dir, const struct qstr *qstr,
struct nilfs_dir_entry *nilfs_dotdot(struct inode *dir, struct page **p) { - struct page *page = nilfs_get_page(dir, 0); - struct nilfs_dir_entry *de = NULL; + struct nilfs_dir_entry *de = nilfs_get_page(dir, 0, p);
- if (!IS_ERR(page)) { - de = nilfs_next_entry( - (struct nilfs_dir_entry *)page_address(page)); - *p = page; - } - return de; + if (IS_ERR(de)) + return NULL; + return nilfs_next_entry(de); }
ino_t nilfs_inode_by_name(struct inode *dir, const struct qstr *qstr) @@ -459,12 +458,11 @@ int nilfs_add_link(struct dentry *dentry, struct inode *inode) for (n = 0; n <= npages; n++) { char *dir_end;
- page = nilfs_get_page(dir, n); - err = PTR_ERR(page); - if (IS_ERR(page)) + kaddr = nilfs_get_page(dir, n, &page); + err = PTR_ERR(kaddr); + if (IS_ERR(kaddr)) goto out; lock_page(page); - kaddr = page_address(page); dir_end = kaddr + nilfs_last_byte(dir, n); de = (struct nilfs_dir_entry *)kaddr; kaddr += PAGE_SIZE - reclen; @@ -627,11 +625,10 @@ int nilfs_empty_dir(struct inode *inode) char *kaddr; struct nilfs_dir_entry *de;
- page = nilfs_get_page(inode, i); - if (IS_ERR(page)) + kaddr = nilfs_get_page(inode, i, &page); + if (IS_ERR(kaddr)) continue;
- kaddr = page_address(page); de = (struct nilfs_dir_entry *)kaddr; kaddr += nilfs_last_byte(inode, i) - NILFS_DIR_REC_LEN(1);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ryusuke Konishi konishi.ryusuke@gmail.com
[ Upstream commit 7373a51e7998b508af7136530f3a997b286ce81c ]
The error handling in nilfs_empty_dir() when a directory folio/page read fails is incorrect, as in the old ext2 implementation, and if the folio/page cannot be read or nilfs_check_folio() fails, it will falsely determine the directory as empty and corrupt the file system.
In addition, since nilfs_empty_dir() does not immediately return on a failed folio/page read, but continues to loop, this can cause a long loop with I/O if i_size of the directory's inode is also corrupted, causing the log writer thread to wait and hang, as reported by syzbot.
Fix these issues by making nilfs_empty_dir() immediately return a false value (0) if it fails to get a directory folio/page.
Link: https://lkml.kernel.org/r/20240604134255.7165-1-konishi.ryusuke@gmail.com Signed-off-by: Ryusuke Konishi konishi.ryusuke@gmail.com Reported-by: syzbot+c8166c541d3971bf6c87@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=c8166c541d3971bf6c87 Fixes: 2ba466d74ed7 ("nilfs2: directory entry operations") Tested-by: Ryusuke Konishi konishi.ryusuke@gmail.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/nilfs2/dir.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/nilfs2/dir.c b/fs/nilfs2/dir.c index 4911f09eb68b0..e9668e455a35e 100644 --- a/fs/nilfs2/dir.c +++ b/fs/nilfs2/dir.c @@ -627,7 +627,7 @@ int nilfs_empty_dir(struct inode *inode)
kaddr = nilfs_get_page(inode, i, &page); if (IS_ERR(kaddr)) - continue; + return 0;
de = (struct nilfs_dir_entry *)kaddr; kaddr += nilfs_last_byte(inode, i) - NILFS_DIR_REC_LEN(1);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jens Axboe axboe@kernel.dk
commit 5fc16fa5f13b3c06fdb959ef262050bd810416a2 upstream.
In earlier kernels, it was possible to trigger a NULL pointer dereference off the forced async preparation path, if no file had been assigned. The trace leading to that looks as follows:
BUG: kernel NULL pointer dereference, address: 00000000000000b0 PGD 0 P4D 0 Oops: 0000 [#1] PREEMPT SMP CPU: 67 PID: 1633 Comm: buf-ring-invali Not tainted 6.8.0-rc3+ #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS unknown 2/2/2022 RIP: 0010:io_buffer_select+0xc3/0x210 Code: 00 00 48 39 d1 0f 82 ae 00 00 00 48 81 4b 48 00 00 01 00 48 89 73 70 0f b7 50 0c 66 89 53 42 85 ed 0f 85 d2 00 00 00 48 8b 13 <48> 8b 92 b0 00 00 00 48 83 7a 40 00 0f 84 21 01 00 00 4c 8b 20 5b RSP: 0018:ffffb7bec38c7d88 EFLAGS: 00010246 RAX: ffff97af2be61000 RBX: ffff97af234f1700 RCX: 0000000000000040 RDX: 0000000000000000 RSI: ffff97aecfb04820 RDI: ffff97af234f1700 RBP: 0000000000000000 R08: 0000000000200030 R09: 0000000000000020 R10: ffffb7bec38c7dc8 R11: 000000000000c000 R12: ffffb7bec38c7db8 R13: ffff97aecfb05800 R14: ffff97aecfb05800 R15: ffff97af2be5e000 FS: 00007f852f74b740(0000) GS:ffff97b1eeec0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000000000b0 CR3: 000000016deab005 CR4: 0000000000370ef0 Call Trace: <TASK> ? __die+0x1f/0x60 ? page_fault_oops+0x14d/0x420 ? do_user_addr_fault+0x61/0x6a0 ? exc_page_fault+0x6c/0x150 ? asm_exc_page_fault+0x22/0x30 ? io_buffer_select+0xc3/0x210 __io_import_iovec+0xb5/0x120 io_readv_prep_async+0x36/0x70 io_queue_sqe_fallback+0x20/0x260 io_submit_sqes+0x314/0x630 __do_sys_io_uring_enter+0x339/0xbc0 ? __do_sys_io_uring_register+0x11b/0xc50 ? vm_mmap_pgoff+0xce/0x160 do_syscall_64+0x5f/0x180 entry_SYSCALL_64_after_hwframe+0x46/0x4e RIP: 0033:0x55e0a110a67e Code: ba cc 00 00 00 45 31 c0 44 0f b6 92 d0 00 00 00 31 d2 41 b9 08 00 00 00 41 83 e2 01 41 c1 e2 04 41 09 c2 b8 aa 01 00 00 0f 05 <c3> 90 89 30 eb a9 0f 1f 40 00 48 8b 42 20 8b 00 a8 06 75 af 85 f6
because the request is marked forced ASYNC and has a bad file fd, and hence takes the forced async prep path.
Current kernels with the request async prep cleaned up can no longer hit this issue, but for ease of backporting, let's add this safety check in here too as it really doesn't hurt. For both cases, this will inevitably end with a CQE posted with -EBADF.
Cc: stable@vger.kernel.org Fixes: a76c0b31eef5 ("io_uring: commit non-pollable provided mapped buffers upfront") Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- io_uring/kbuf.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/io_uring/kbuf.c +++ b/io_uring/kbuf.c @@ -154,7 +154,8 @@ static void __user *io_ring_buffer_selec req->buf_list = bl; req->buf_index = buf->bid;
- if (issue_flags & IO_URING_F_UNLOCKED || !file_can_poll(req->file)) { + if (issue_flags & IO_URING_F_UNLOCKED || + (req->file && !file_can_poll(req->file))) { /* * If we came in unlocked, we have no choice but to consume the * buffer here, otherwise nothing ensures that the buffer won't
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alan Stern stern@rowland.harvard.edu
commit 22f00812862564b314784167a89f27b444f82a46 upstream.
The syzbot fuzzer found that the interrupt-URB completion callback in the cdc-wdm driver was taking too long, and the driver's immediate resubmission of interrupt URBs with -EPROTO status combined with the dummy-hcd emulation to cause a CPU lockup:
cdc_wdm 1-1:1.0: nonzero urb status received: -71 cdc_wdm 1-1:1.0: wdm_int_callback - 0 bytes watchdog: BUG: soft lockup - CPU#0 stuck for 26s! [syz-executor782:6625] CPU#0 Utilization every 4s during lockup: #1: 98% system, 0% softirq, 3% hardirq, 0% idle #2: 98% system, 0% softirq, 3% hardirq, 0% idle #3: 98% system, 0% softirq, 3% hardirq, 0% idle #4: 98% system, 0% softirq, 3% hardirq, 0% idle #5: 98% system, 1% softirq, 3% hardirq, 0% idle Modules linked in: irq event stamp: 73096 hardirqs last enabled at (73095): [<ffff80008037bc00>] console_emit_next_record kernel/printk/printk.c:2935 [inline] hardirqs last enabled at (73095): [<ffff80008037bc00>] console_flush_all+0x650/0xb74 kernel/printk/printk.c:2994 hardirqs last disabled at (73096): [<ffff80008af10b00>] __el1_irq arch/arm64/kernel/entry-common.c:533 [inline] hardirqs last disabled at (73096): [<ffff80008af10b00>] el1_interrupt+0x24/0x68 arch/arm64/kernel/entry-common.c:551 softirqs last enabled at (73048): [<ffff8000801ea530>] softirq_handle_end kernel/softirq.c:400 [inline] softirqs last enabled at (73048): [<ffff8000801ea530>] handle_softirqs+0xa60/0xc34 kernel/softirq.c:582 softirqs last disabled at (73043): [<ffff800080020de8>] __do_softirq+0x14/0x20 kernel/softirq.c:588 CPU: 0 PID: 6625 Comm: syz-executor782 Tainted: G W 6.10.0-rc2-syzkaller-g8867bbd4a056 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 04/02/2024
Testing showed that the problem did not occur if the two error messages -- the first two lines above -- were removed; apparently adding material to the kernel log takes a surprisingly large amount of time.
In any case, the best approach for preventing these lockups and avoiding spamming the log with thousands of error messages per second is to ratelimit the two dev_err() calls. Therefore we replace them with dev_err_ratelimited().
Signed-off-by: Alan Stern stern@rowland.harvard.edu Suggested-by: Greg KH gregkh@linuxfoundation.org Reported-and-tested-by: syzbot+5f996b83575ef4058638@syzkaller.appspotmail.com Closes: https://lore.kernel.org/linux-usb/00000000000073d54b061a6a1c65@google.com/ Reported-and-tested-by: syzbot+1b2abad17596ad03dcff@syzkaller.appspotmail.com Closes: https://lore.kernel.org/linux-usb/000000000000f45085061aa9b37e@google.com/ Fixes: 9908a32e94de ("USB: remove err() macro from usb class drivers") Link: https://lore.kernel.org/linux-usb/40dfa45b-5f21-4eef-a8c1-51a2f320e267@rowla... Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/29855215-52f5-4385-b058-91f42c2bee18@rowland.harva... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/class/cdc-wdm.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/usb/class/cdc-wdm.c +++ b/drivers/usb/class/cdc-wdm.c @@ -266,14 +266,14 @@ static void wdm_int_callback(struct urb dev_err(&desc->intf->dev, "Stall on int endpoint\n"); goto sw; /* halt is cleared in work */ default: - dev_err(&desc->intf->dev, + dev_err_ratelimited(&desc->intf->dev, "nonzero urb status received: %d\n", status); break; } }
if (urb->actual_length < sizeof(struct usb_cdc_notification)) { - dev_err(&desc->intf->dev, "wdm_int_callback - %d bytes\n", + dev_err_ratelimited(&desc->intf->dev, "wdm_int_callback - %d bytes\n", urb->actual_length); goto exit; }
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: John Ernberg john.ernberg@actia.se
commit 8475ffcfb381a77075562207ce08552414a80326 upstream.
If no other USB HCDs are selected when compiling a small pure virtual machine, the Xen HCD driver cannot be built.
Fix it by traversing down host/ if CONFIG_USB_XEN_HCD is selected.
Fixes: 494ed3997d75 ("usb: Introduce Xen pvUSB frontend (xen hcd)") Cc: stable@vger.kernel.org # v5.17+ Signed-off-by: John Ernberg john.ernberg@actia.se Link: https://lore.kernel.org/r/20240517114345.1190755-1-john.ernberg@actia.se Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/Makefile | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/usb/Makefile +++ b/drivers/usb/Makefile @@ -34,6 +34,7 @@ obj-$(CONFIG_USB_R8A66597_HCD) += host/ obj-$(CONFIG_USB_FSL_USB2) += host/ obj-$(CONFIG_USB_FOTG210_HCD) += host/ obj-$(CONFIG_USB_MAX3421_HCD) += host/ +obj-$(CONFIG_USB_XEN_HCD) += host/
obj-$(CONFIG_USB_C67X00_HCD) += c67x00/
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Amit Sunil Dhamne amitsd@google.com
commit e7e921918d905544500ca7a95889f898121ba886 upstream.
There could be a potential use-after-free case in tcpm_register_source_caps(). This could happen when:
* new (say invalid) source caps are advertised
* the existing source caps are unregistered
* tcpm_register_source_caps() returns with an error as usb_power_delivery_register_capabilities() fails
This causes port->partner_source_caps to hold on to the now freed source caps.
Reset port->partner_source_caps value to NULL after unregistering existing source caps.
Fixes: 230ecdf71a64 ("usb: typec: tcpm: unregister existing source caps before re-registration") Cc: stable@vger.kernel.org Signed-off-by: Amit Sunil Dhamne amitsd@google.com Reviewed-by: Ondrej Jirman megi@xff.cz Reviewed-by: Heikki Krogerus heikki.krogerus@linux.intel.com Reviewed-by: Dmitry Baryshkov dmitry.baryshkov@linaro.org Link: https://lore.kernel.org/r/20240514220134.2143181-1-amitsd@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/typec/tcpm/tcpm.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
--- a/drivers/usb/typec/tcpm/tcpm.c +++ b/drivers/usb/typec/tcpm/tcpm.c @@ -2430,8 +2430,10 @@ static int tcpm_register_sink_caps(struc memcpy(caps.pdo, port->sink_caps, sizeof(u32) * port->nr_sink_caps); caps.role = TYPEC_SINK;
- if (cap) + if (cap) { usb_power_delivery_unregister_capabilities(cap); + port->partner_source_caps = NULL; + }
cap = usb_power_delivery_register_capabilities(port->partner_pd, &caps); if (IS_ERR(cap))
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kyle Tso kyletso@google.com
commit fc8fb9eea94d8f476e15f3a4a7addeb16b3b99d6 upstream.
Similar to what was fixed in commit a6fe37f428c1 ("usb: typec: tcpm: Skip hard reset when in error recovery"), the handling of the received Hard Reset has to be skipped during the TOGGLING state.
[ 4086.021288] VBUS off [ 4086.021295] pending state change SNK_READY -> SNK_UNATTACHED @ 650 ms [rev2 NONE_AMS] [ 4086.022113] VBUS VSAFE0V [ 4086.022117] state change SNK_READY -> SNK_UNATTACHED [rev2 NONE_AMS] [ 4086.022447] VBUS off [ 4086.022450] state change SNK_UNATTACHED -> SNK_UNATTACHED [rev2 NONE_AMS] [ 4086.023060] VBUS VSAFE0V [ 4086.023064] state change SNK_UNATTACHED -> SNK_UNATTACHED [rev2 NONE_AMS] [ 4086.023070] disable BIST MODE TESTDATA [ 4086.023766] disable vbus discharge ret:0 [ 4086.023911] Setting usb_comm capable false [ 4086.028874] Setting voltage/current limit 0 mV 0 mA [ 4086.028888] polarity 0 [ 4086.030305] Requesting mux state 0, usb-role 0, orientation 0 [ 4086.033539] Start toggling [ 4086.038496] state change SNK_UNATTACHED -> TOGGLING [rev2 NONE_AMS]
// This Hard Reset is unexpected [ 4086.038499] Received hard reset [ 4086.038501] state change TOGGLING -> HARD_RESET_START [rev2 HARD_RESET]
Fixes: f0690a25a140 ("staging: typec: USB Type-C Port Manager (tcpm)") Cc: stable@vger.kernel.org Signed-off-by: Kyle Tso kyletso@google.com Reviewed-by: Heikki Krogerus heikki.krogerus@linux.intel.com Link: https://lore.kernel.org/r/20240520154858.1072347-1-kyletso@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/typec/tcpm/tcpm.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/usb/typec/tcpm/tcpm.c +++ b/drivers/usb/typec/tcpm/tcpm.c @@ -5448,6 +5448,7 @@ static void _tcpm_pd_hard_reset(struct t port->tcpc->set_bist_data(port->tcpc, false);
switch (port->state) { + case TOGGLING: case ERROR_RECOVERY: case PORT_RESET: case PORT_RESET_WAIT_OFF:
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tomas Winkler tomas.winkler@intel.com
commit 283cb234ef95d94c61f59e1cd070cd9499b51292 upstream.
The mei_me_pci_resume() function doesn't release the irq on the error path, in case mei_start() fails.
Cc: stable@kernel.org Fixes: 33ec08263147 ("mei: revamp mei reset state machine") Signed-off-by: Tomas Winkler tomas.winkler@intel.com Link: https://lore.kernel.org/r/20240604090728.1027307-1-tomas.winkler@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/misc/mei/pci-me.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
--- a/drivers/misc/mei/pci-me.c +++ b/drivers/misc/mei/pci-me.c @@ -394,8 +394,10 @@ static int mei_me_pci_resume(struct devi }
err = mei_restart(dev); - if (err) + if (err) { + free_irq(pdev->irq, dev); return err; + }
/* Start timer if stopped in suspend */ schedule_delayed_work(&dev->timer_work, HZ);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ilpo Järvinen ilpo.jarvinen@linux.intel.com
commit b19ab7ee2c4c1ec5f27c18413c3ab63907f7d55c upstream.
When lookahead has "consumed" some characters (la_count > 0), n_tty_receive_buf_standard() and n_tty_receive_buf_closing() are given wrong cp/fp offsets for the characters beyond la_count, which leads to some characters being duplicated and others lost.
If la_count > 0, correct buffer pointers and make count consistent too (the latter is not strictly necessary to fix the issue but seems more logical to adjust all variables immediately to keep state consistent).
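A hedged worked example (byte counts invented for illustration): with count = 10 and la_count = 4,

  before: [0..3] with lookahead, then [0..5] again  -> bytes 0..3 duplicated, 6..9 lost
  after:  [0..3] with lookahead, then [4..9]        -> every byte handled exactly once

since the second call previously reused the original cp/fp instead of starting la_count characters further in.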
Reported-by: Vadym Krevs vkrevs@yahoo.com Fixes: 6bb6fa6908eb ("tty: Implement lookahead to process XON/XOFF timely") Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218834 Tested-by: Vadym Krevs vkrevs@yahoo.com Cc: stable@vger.kernel.org Signed-off-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com Link: https://lore.kernel.org/r/20240514140429.12087-1-ilpo.jarvinen@linux.intel.c... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/tty/n_tty.c | 22 ++++++++++++++++------ 1 file changed, 16 insertions(+), 6 deletions(-)
--- a/drivers/tty/n_tty.c +++ b/drivers/tty/n_tty.c @@ -1602,15 +1602,25 @@ static void __receive_buf(struct tty_str else if (ldata->raw || (L_EXTPROC(tty) && !preops)) n_tty_receive_buf_raw(tty, cp, fp, count); else if (tty->closing && !L_EXTPROC(tty)) { - if (la_count > 0) + if (la_count > 0) { n_tty_receive_buf_closing(tty, cp, fp, la_count, true); - if (count > la_count) - n_tty_receive_buf_closing(tty, cp, fp, count - la_count, false); + cp += la_count; + if (fp) + fp += la_count; + count -= la_count; + } + if (count > 0) + n_tty_receive_buf_closing(tty, cp, fp, count, false); } else { - if (la_count > 0) + if (la_count > 0) { n_tty_receive_buf_standard(tty, cp, fp, la_count, true); - if (count > la_count) - n_tty_receive_buf_standard(tty, cp, fp, count - la_count, false); + cp += la_count; + if (fp) + fp += la_count; + count -= la_count; + } + if (count > 0) + n_tty_receive_buf_standard(tty, cp, fp, count, false);
flush_echoes(tty); if (tty->ops->flush_chars)
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mickaël Salaün mic@digikod.net
commit 88da52ccd66e65f2e63a6c35c9dff55d448ef4dc upstream.
The WARN_ON_ONCE() in collect_domain_accesses() can be triggered when trying to link a root mount point. This cannot work in practice because this directory is mounted, but the VFS check is done after the call to security_path_link().
Do not use source directory's d_parent when the source directory is the mount point.
Cc: Günther Noack gnoack@google.com Cc: Paul Moore paul@paul-moore.com Cc: stable@vger.kernel.org Reported-by: syzbot+bf4903dc7e12b18ebc87@syzkaller.appspotmail.com Fixes: b91c3e4ea756 ("landlock: Add support for file reparenting with LANDLOCK_ACCESS_FS_REFER") Closes: https://lore.kernel.org/r/000000000000553d3f0618198200@google.com Link: https://lore.kernel.org/r/20240516181935.1645983-2-mic@digikod.net [mic: Fix commit message] Signed-off-by: Mickaël Salaün mic@digikod.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- security/landlock/fs.c | 13 +++++++++++-- 1 file changed, 11 insertions(+), 2 deletions(-)
--- a/security/landlock/fs.c +++ b/security/landlock/fs.c @@ -824,6 +824,7 @@ static int current_check_refer_path(stru bool allow_parent1, allow_parent2; access_mask_t access_request_parent1, access_request_parent2; struct path mnt_dir; + struct dentry *old_parent; layer_mask_t layer_masks_parent1[LANDLOCK_NUM_ACCESS_FS] = {}, layer_masks_parent2[LANDLOCK_NUM_ACCESS_FS] = {};
@@ -870,9 +871,17 @@ static int current_check_refer_path(stru mnt_dir.mnt = new_dir->mnt; mnt_dir.dentry = new_dir->mnt->mnt_root;
+ /* + * old_dentry may be the root of the common mount point and + * !IS_ROOT(old_dentry) at the same time (e.g. with open_tree() and + * OPEN_TREE_CLONE). We do not need to call dget(old_parent) because + * we keep a reference to old_dentry. + */ + old_parent = (old_dentry == mnt_dir.dentry) ? old_dentry : + old_dentry->d_parent; + /* new_dir->dentry is equal to new_dentry->d_parent */ - allow_parent1 = collect_domain_accesses(dom, mnt_dir.dentry, - old_dentry->d_parent, + allow_parent1 = collect_domain_accesses(dom, mnt_dir.dentry, old_parent, &layer_masks_parent1); allow_parent2 = collect_domain_accesses( dom, mnt_dir.dentry, new_dir->dentry, &layer_masks_parent2);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
commit 7c55b78818cfb732680c4a72ab270cc2d2ee3d0f upstream.
When an xattr size is not what is expected, it is printed out to the kernel log in hex format as a form of debugging. But when that xattr size is bigger than the expected size, printing it out can cause an access off the end of the buffer.
Fix this all up by properly restricting the size of the debug hex dump in the kernel log.
Reported-by: syzbot+9dfe490c8176301c1d06@syzkaller.appspotmail.com Cc: Dave Kleikamp shaggy@kernel.org Link: https://lore.kernel.org/r/2024051433-slider-cloning-98f9@gregkh Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/jfs/xattr.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
--- a/fs/jfs/xattr.c +++ b/fs/jfs/xattr.c @@ -557,9 +557,11 @@ static int ea_get(struct inode *inode, s
size_check: if (EALIST_SIZE(ea_buf->xattr) != ea_size) { + int size = min_t(int, EALIST_SIZE(ea_buf->xattr), ea_size); + printk(KERN_ERR "ea_get: invalid extended attribute\n"); print_hex_dump(KERN_ERR, "", DUMP_PREFIX_ADDRESS, 16, 1, - ea_buf->xattr, ea_size, 1); + ea_buf->xattr, size, 1); ea_release(inode, ea_buf); rc = -EIO; goto clean_up;
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mathias Nyman mathias.nyman@linux.intel.com
commit f0260589b439e2637ad54a2b25f00a516ef28a57 upstream.
The transferred length is set incorrectly for cancelled bulk transfer TDs in case the bulk transfer ring stops on the last transfer block with a 'Stop - Length Invalid' completion code.
The length essentially ends up being set to the requested length: urb->actual_length = urb->transfer_buffer_length
Length for 'Stop - Length Invalid' cases should be the sum of all TRB transfer block lengths up to the one the ring stopped on, _excluding_ the one stopped on.
Fix this by always summing up TRB lengths for 'Stop - Length Invalid' bulk cases.
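As a hypothetical illustration (TRB sizes invented): for a three-TRB TD of 1024 + 1024 + 512 bytes that stops on the last TRB with 'Stop - Length Invalid', urb->actual_length should be 1024 + 1024 = 2048 bytes, whereas before this change it ended up as the full 2560-byte transfer_buffer_length.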
This issue was discovered by Alan Stern while debugging https://bugzilla.kernel.org/show_bug.cgi?id=218890, but this change does not solve that bug. The issue is older than the 4.10 kernel, but the fix won't apply to older kernels due to major reworks in that area.
Tested-by: Pierre Tomon pierretom+12@ik.me Cc: stable@vger.kernel.org # v4.10+ Cc: Alan Stern stern@rowland.harvard.edu Signed-off-by: Mathias Nyman mathias.nyman@linux.intel.com Link: https://lore.kernel.org/r/20240611120610.3264502-2-mathias.nyman@linux.intel... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/host/xhci-ring.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-)
--- a/drivers/usb/host/xhci-ring.c +++ b/drivers/usb/host/xhci-ring.c @@ -2506,9 +2506,8 @@ static int process_bulk_intr_td(struct x goto finish_td; case COMP_STOPPED_LENGTH_INVALID: /* stopped on ep trb with invalid length, exclude it */ - ep_trb_len = 0; - remaining = 0; - break; + td->urb->actual_length = sum_trb_lengths(xhci, ep_ring, ep_trb); + goto finish_td; case COMP_USB_TRANSACTION_ERROR: if (xhci->quirks & XHCI_NO_SOFT_RETRY || (ep->err_count++ > MAX_SOFT_RETRY) ||
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kuangyi Chiang ki.chiang65@gmail.com
commit 17bd54555c2aaecfdb38e2734149f684a73fa584 upstream.
As described in commit c877b3b2ad5c ("xhci: Add reset on resume quirk for asrock p67 host"), EJ188 has the same issue as EJ168, where the controller completely dies on resume. So apply the XHCI_RESET_ON_RESUME quirk to EJ188 as well.
Cc: stable@vger.kernel.org Signed-off-by: Kuangyi Chiang ki.chiang65@gmail.com Signed-off-by: Mathias Nyman mathias.nyman@linux.intel.com Link: https://lore.kernel.org/r/20240611120610.3264502-3-mathias.nyman@linux.intel... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/host/xhci-pci.c | 5 +++++ 1 file changed, 5 insertions(+)
--- a/drivers/usb/host/xhci-pci.c +++ b/drivers/usb/host/xhci-pci.c @@ -36,6 +36,7 @@
#define PCI_VENDOR_ID_ETRON 0x1b6f #define PCI_DEVICE_ID_EJ168 0x7023 +#define PCI_DEVICE_ID_EJ188 0x7052
#define PCI_DEVICE_ID_INTEL_LYNXPOINT_XHCI 0x8c31 #define PCI_DEVICE_ID_INTEL_LYNXPOINT_LP_XHCI 0x9c31 @@ -275,6 +276,10 @@ static void xhci_pci_quirks(struct devic xhci->quirks |= XHCI_TRUST_TX_LENGTH; xhci->quirks |= XHCI_BROKEN_STREAMS; } + if (pdev->vendor == PCI_VENDOR_ID_ETRON && + pdev->device == PCI_DEVICE_ID_EJ188) + xhci->quirks |= XHCI_RESET_ON_RESUME; + if (pdev->vendor == PCI_VENDOR_ID_RENESAS && pdev->device == 0x0014) { xhci->quirks |= XHCI_TRUST_TX_LENGTH;
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Hector Martin marcan@marcan.st
commit 5ceac4402f5d975e5a01c806438eb4e554771577 upstream.
When multiple streams are in use, multiple TDs might be in flight when an endpoint is stopped. We need to issue a Set TR Dequeue Pointer for each, to ensure everything is reset properly and the caches cleared. Change the logic so that any N>1 TDs found active for different streams are deferred until after the first one is processed, calling xhci_invalidate_cancelled_tds() again from xhci_handle_cmd_set_deq() to queue another command until we are done with all of them. Also change the error/"should never happen" paths to ensure we at least clear any affected TDs, even if we can't issue a command to clear the hardware cache, and complain loudly with an xhci_warn() if this ever happens.
This problem case dates back to commit e9df17eb1408 ("USB: xhci: Correct assumptions about number of rings per endpoint.") early on in the XHCI driver's life, when stream support was first added. It was then identified but not fixed nor made into a warning in commit 674f8438c121 ("xhci: split handling halted endpoints into two steps"), which added a FIXME comment for the problem case (without materially changing the behavior as far as I can tell, though the new logic made the problem more obvious).
Then later, in commit 94f339147fc3 ("xhci: Fix failure to give back some cached cancelled URBs."), it was acknowledged again.
[Mathias: commit 94f339147fc3 ("xhci: Fix failure to give back some cached cancelled URBs.") was a targeted regression fix to the previously mentioned patch. Users reported issues with usb stuck after unmounting/disconnecting UAS devices. This rolled back the TD clearing of multiple streams to its original state.]
Apparently the commit author was aware of the problem (yet still chose to submit it): It was still mentioned as a FIXME, an xhci_dbg() was added to log the problem condition, and the remaining issue was mentioned in the commit description. The choice of making the log type xhci_dbg() for what is, at this point, a completely unhandled and known broken condition is puzzling and unfortunate, as it guarantees that no actual users would see the log in production, thereby making it nigh undebuggable (indeed, even if you turn on DEBUG, the message doesn't really hint at there being a problem at all).
It took me *months* of random xHC crashes to finally find a reliable repro and be able to do a deep dive debug session, which could all have been avoided had this unhandled, broken condition been actually reported with a warning, as it should have been as a bug intentionally left in unfixed (never mind that it shouldn't have been left in at all).
Another fix to solve clearing the caches of all stream rings with cancelled TDs is needed, but not as urgent.
3 years after that statement and 14 years after the original bug was introduced, I think it's finally time to fix it. And maybe next time let's not leave bugs unfixed (that are actually worse than the original bug), and let's actually get people to review kernel commits please.
Fixes xHC crashes and IOMMU faults with UAS devices when handling errors/faults. Easiest repro is to use `hdparm` to mark an early sector (e.g. 1024) on a disk as bad, then `cat /dev/sdX > /dev/null` in a loop. At least in the case of JMicron controllers, the read errors end up having to cancel two TDs (for two queued requests to different streams) and the one that didn't get cleared properly ends up faulting the xHC entirely when it tries to access DMA pages that have since been unmapped, referred to by the stale TDs. This normally happens quickly (after two or three loops). After this fix, I left the `cat` in a loop running overnight and experienced no xHC failures, with all read errors recovered properly. Repro'd and tested on an Apple M1 Mac Mini (dwc3 host).
On systems without an IOMMU, this bug would instead silently corrupt freed memory, making this a security bug (even on systems with IOMMUs this could silently corrupt memory belonging to other USB devices on the same controller, so it's still a security bug). Given that the kernel autoprobes partition tables, I'm pretty sure a malicious USB device pretending to be a UAS device and reporting an error with the right timing could deliberately trigger a UAF and write to freed memory, with no user action.
[Mathias: Commit message and code comment edit, original at:] https://lore.kernel.org/linux-usb/20240524-xhci-streams-v1-1-6b1f13819bea@ma...
Fixes: e9df17eb1408 ("USB: xhci: Correct assumptions about number of rings per endpoint.") Fixes: 94f339147fc3 ("xhci: Fix failure to give back some cached cancelled URBs.") Fixes: 674f8438c121 ("xhci: split handling halted endpoints into two steps") Cc: stable@vger.kernel.org Cc: security@kernel.org Reviewed-by: Neal Gompa neal@gompa.dev Signed-off-by: Hector Martin marcan@marcan.st Signed-off-by: Mathias Nyman mathias.nyman@linux.intel.com Link: https://lore.kernel.org/r/20240611120610.3264502-5-mathias.nyman@linux.intel... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/host/xhci-ring.c | 54 ++++++++++++++++++++++++++++++++++--------- drivers/usb/host/xhci.h | 1 2 files changed, 44 insertions(+), 11 deletions(-)
--- a/drivers/usb/host/xhci-ring.c +++ b/drivers/usb/host/xhci-ring.c @@ -987,13 +987,27 @@ static int xhci_invalidate_cancelled_tds break; case TD_DIRTY: /* TD is cached, clear it */ case TD_HALTED: + case TD_CLEARING_CACHE_DEFERRED: + if (cached_td) { + if (cached_td->urb->stream_id != td->urb->stream_id) { + /* Multiple streams case, defer move dq */ + xhci_dbg(xhci, + "Move dq deferred: stream %u URB %p\n", + td->urb->stream_id, td->urb); + td->cancel_status = TD_CLEARING_CACHE_DEFERRED; + break; + } + + /* Should never happen, but clear the TD if it does */ + xhci_warn(xhci, + "Found multiple active URBs %p and %p in stream %u?\n", + td->urb, cached_td->urb, + td->urb->stream_id); + td_to_noop(xhci, ring, cached_td, false); + cached_td->cancel_status = TD_CLEARED; + } + td->cancel_status = TD_CLEARING_CACHE; - if (cached_td) - /* FIXME stream case, several stopped rings */ - xhci_dbg(xhci, - "Move dq past stream %u URB %p instead of stream %u URB %p\n", - td->urb->stream_id, td->urb, - cached_td->urb->stream_id, cached_td->urb); cached_td = td; break; } @@ -1013,10 +1027,16 @@ static int xhci_invalidate_cancelled_tds if (err) { /* Failed to move past cached td, just set cached TDs to no-op */ list_for_each_entry_safe(td, tmp_td, &ep->cancelled_td_list, cancelled_td_list) { - if (td->cancel_status != TD_CLEARING_CACHE) + /* + * Deferred TDs need to have the deq pointer set after the above command + * completes, so if that failed we just give up on all of them (and + * complain loudly since this could cause issues due to caching). + */ + if (td->cancel_status != TD_CLEARING_CACHE && + td->cancel_status != TD_CLEARING_CACHE_DEFERRED) continue; - xhci_dbg(xhci, "Failed to clear cancelled cached URB %p, mark clear anyway\n", - td->urb); + xhci_warn(xhci, "Failed to clear cancelled cached URB %p, mark clear anyway\n", + td->urb); td_to_noop(xhci, ring, td, false); td->cancel_status = TD_CLEARED; } @@ -1304,6 +1324,7 @@ static void xhci_handle_cmd_set_deq(stru struct xhci_ep_ctx *ep_ctx; struct xhci_slot_ctx *slot_ctx; struct xhci_td *td, *tmp_td; + bool deferred = false;
ep_index = TRB_TO_EP_INDEX(le32_to_cpu(trb->generic.field[3])); stream_id = TRB_TO_STREAM_ID(le32_to_cpu(trb->generic.field[2])); @@ -1390,6 +1411,8 @@ static void xhci_handle_cmd_set_deq(stru xhci_dbg(ep->xhci, "%s: Giveback cancelled URB %p TD\n", __func__, td->urb); xhci_td_cleanup(ep->xhci, td, ep_ring, td->status); + } else if (td->cancel_status == TD_CLEARING_CACHE_DEFERRED) { + deferred = true; } else { xhci_dbg(ep->xhci, "%s: Keep cancelled URB %p TD as cancel_status is %d\n", __func__, td->urb, td->cancel_status); @@ -1399,8 +1422,17 @@ cleanup: ep->ep_state &= ~SET_DEQ_PENDING; ep->queued_deq_seg = NULL; ep->queued_deq_ptr = NULL; - /* Restart any rings with pending URBs */ - ring_doorbell_for_active_rings(xhci, slot_id, ep_index); + + if (deferred) { + /* We have more streams to clear */ + xhci_dbg(ep->xhci, "%s: Pending TDs to clear, continuing with invalidation\n", + __func__); + xhci_invalidate_cancelled_tds(ep); + } else { + /* Restart any rings with pending URBs */ + xhci_dbg(ep->xhci, "%s: All TDs cleared, ring doorbell\n", __func__); + ring_doorbell_for_active_rings(xhci, slot_id, ep_index); + } }
static void xhci_handle_cmd_reset_ep(struct xhci_hcd *xhci, int slot_id, --- a/drivers/usb/host/xhci.h +++ b/drivers/usb/host/xhci.h @@ -1556,6 +1556,7 @@ enum xhci_cancelled_td_status { TD_DIRTY = 0, TD_HALTED, TD_CLEARING_CACHE, + TD_CLEARING_CACHE_DEFERRED, TD_CLEARED, };
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kuangyi Chiang ki.chiang65@gmail.com
commit 91f7a1524a92c70ffe264db8bdfa075f15bbbeb9 upstream.
As described in commit 8f873c1ff4ca ("xhci: Blacklist using streams on the Etron EJ168 controller"), EJ188 has the same issue as EJ168: streams do not work reliably on EJ188. So apply the XHCI_BROKEN_STREAMS quirk to EJ188 as well.
Cc: stable@vger.kernel.org Signed-off-by: Kuangyi Chiang ki.chiang65@gmail.com Signed-off-by: Mathias Nyman mathias.nyman@linux.intel.com Link: https://lore.kernel.org/r/20240611120610.3264502-4-mathias.nyman@linux.intel... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/host/xhci-pci.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
--- a/drivers/usb/host/xhci-pci.c +++ b/drivers/usb/host/xhci-pci.c @@ -277,8 +277,10 @@ static void xhci_pci_quirks(struct devic xhci->quirks |= XHCI_BROKEN_STREAMS; } if (pdev->vendor == PCI_VENDOR_ID_ETRON && - pdev->device == PCI_DEVICE_ID_EJ188) + pdev->device == PCI_DEVICE_ID_EJ188) { xhci->quirks |= XHCI_RESET_ON_RESUME; + xhci->quirks |= XHCI_BROKEN_STREAMS; + }
if (pdev->vendor == PCI_VENDOR_ID_RENESAS && pdev->device == 0x0014) {
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Aapo Vienamo aapo.vienamo@linux.intel.com
commit 985cfe501b74f214905ab4817acee0df24627268 upstream.
The margin debugfs node controls the "Enable Margin Test" field of the lane margining operations. This field selects between either low or high voltage margin values for voltage margin test or left or right timing margin values for timing margin test.
According to the USB4 specification, whether or not the "Enable Margin Test" control applies depends on the values of the "Independent High/Low Voltage Margin" or "Independent Left/Right Timing Margin" capability fields for voltage and timing margin tests respectively. The pre-existing condition enabled the debugfs node also in the case where both low/high or left/right margins are returned, which is incorrect. This change only enables the debugfs node in question if the specific required capability values are met.
Signed-off-by: Aapo Vienamo aapo.vienamo@linux.intel.com Fixes: d0f1e0c2a699 ("thunderbolt: Add support for receiver lane margining") Cc: stable@vger.kernel.org Signed-off-by: Mika Westerberg mika.westerberg@linux.intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/thunderbolt/debugfs.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
--- a/drivers/thunderbolt/debugfs.c +++ b/drivers/thunderbolt/debugfs.c @@ -927,8 +927,9 @@ static void margining_port_init(struct t debugfs_create_file("run", 0600, dir, port, &margining_run_fops); debugfs_create_file("results", 0600, dir, port, &margining_results_fops); debugfs_create_file("test", 0600, dir, port, &margining_test_fops); - if (independent_voltage_margins(usb4) || - (supports_time(usb4) && independent_time_margins(usb4))) + if (independent_voltage_margins(usb4) == USB4_MARGIN_CAP_0_VOLTAGE_HL || + (supports_time(usb4) && + independent_time_margins(usb4) == USB4_MARGIN_CAP_1_TIME_LR)) debugfs_create_file("margin", 0600, dir, port, &margining_margin_fops); }
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Damien Le Moal dlemoal@kernel.org
commit 90e6f08915ec6efe46570420412a65050ec826b2 upstream.
The function mpi3mr_qcmd() of the mpi3mr driver is able to indicate to the HBA if a read or write command directed at an ATA device should be translated to an NCQ read/write command with the high priority bit set when the request uses the RT priority class and the user has enabled NCQ priority through sysfs.
However, unlike the mpt3sas driver, the mpi3mr driver does not define the sas_ncq_prio_supported and sas_ncq_prio_enable sysfs attributes, so the ncq_prio_enable field of struct mpi3mr_sdev_priv_data is never actually set and NCQ Priority cannot ever be used.
Fix this by defining these missing attributes to allow a user to check if an ATA device supports NCQ priority and to enable/disable the use of NCQ priority. To do this, lift the function scsih_ncq_prio_supp() out of the mpt3sas driver and make it the generic SCSI SAS transport function sas_ata_ncq_prio_supported(). Nothing in that function is hardware specific, so this function can be used in both the mpt3sas driver and the mpi3mr driver.
Reported-by: Scott McCoy scott.mccoy@wdc.com Fixes: 023ab2a9b4ed ("scsi: mpi3mr: Add support for queue command processing") Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal dlemoal@kernel.org Link: https://lore.kernel.org/r/20240611083435.92961-1-dlemoal@kernel.org Reviewed-by: Niklas Cassel cassel@kernel.org Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/scsi/mpi3mr/mpi3mr_app.c | 62 +++++++++++++++++++++++++++++++++++ drivers/scsi/mpt3sas/mpt3sas_base.h | 3 - drivers/scsi/mpt3sas/mpt3sas_ctl.c | 4 +- drivers/scsi/mpt3sas/mpt3sas_scsih.c | 23 ------------ drivers/scsi/scsi_transport_sas.c | 23 ++++++++++++ include/scsi/scsi_transport_sas.h | 2 + 6 files changed, 89 insertions(+), 28 deletions(-)
--- a/drivers/scsi/mpi3mr/mpi3mr_app.c +++ b/drivers/scsi/mpi3mr/mpi3mr_app.c @@ -1851,10 +1851,72 @@ persistent_id_show(struct device *dev, s } static DEVICE_ATTR_RO(persistent_id);
+/** + * sas_ncq_prio_supported_show - Indicate if device supports NCQ priority + * @dev: pointer to embedded device + * @attr: sas_ncq_prio_supported attribute descriptor + * @buf: the buffer returned + * + * A sysfs 'read-only' sdev attribute, only works with SATA devices + */ +static ssize_t +sas_ncq_prio_supported_show(struct device *dev, + struct device_attribute *attr, char *buf) +{ + struct scsi_device *sdev = to_scsi_device(dev); + + return sysfs_emit(buf, "%d\n", sas_ata_ncq_prio_supported(sdev)); +} +static DEVICE_ATTR_RO(sas_ncq_prio_supported); + +/** + * sas_ncq_prio_enable_show - send prioritized io commands to device + * @dev: pointer to embedded device + * @attr: sas_ncq_prio_enable attribute descriptor + * @buf: the buffer returned + * + * A sysfs 'read/write' sdev attribute, only works with SATA devices + */ +static ssize_t +sas_ncq_prio_enable_show(struct device *dev, + struct device_attribute *attr, char *buf) +{ + struct scsi_device *sdev = to_scsi_device(dev); + struct mpi3mr_sdev_priv_data *sdev_priv_data = sdev->hostdata; + + if (!sdev_priv_data) + return 0; + + return sysfs_emit(buf, "%d\n", sdev_priv_data->ncq_prio_enable); +} + +static ssize_t +sas_ncq_prio_enable_store(struct device *dev, + struct device_attribute *attr, + const char *buf, size_t count) +{ + struct scsi_device *sdev = to_scsi_device(dev); + struct mpi3mr_sdev_priv_data *sdev_priv_data = sdev->hostdata; + bool ncq_prio_enable = 0; + + if (kstrtobool(buf, &ncq_prio_enable)) + return -EINVAL; + + if (!sas_ata_ncq_prio_supported(sdev)) + return -EINVAL; + + sdev_priv_data->ncq_prio_enable = ncq_prio_enable; + + return strlen(buf); +} +static DEVICE_ATTR_RW(sas_ncq_prio_enable); + static struct attribute *mpi3mr_dev_attrs[] = { &dev_attr_sas_address.attr, &dev_attr_device_handle.attr, &dev_attr_persistent_id.attr, + &dev_attr_sas_ncq_prio_supported.attr, + &dev_attr_sas_ncq_prio_enable.attr, NULL, };
--- a/drivers/scsi/mpt3sas/mpt3sas_base.h +++ b/drivers/scsi/mpt3sas/mpt3sas_base.h @@ -2045,9 +2045,6 @@ void mpt3sas_setup_direct_io(struct MPT3SAS_ADAPTER *ioc, struct scsi_cmnd *scmd, struct _raid_device *raid_device, Mpi25SCSIIORequest_t *mpi_request);
-/* NCQ Prio Handling Check */ -bool scsih_ncq_prio_supp(struct scsi_device *sdev); - void mpt3sas_setup_debugfs(struct MPT3SAS_ADAPTER *ioc); void mpt3sas_destroy_debugfs(struct MPT3SAS_ADAPTER *ioc); void mpt3sas_init_debugfs(void); --- a/drivers/scsi/mpt3sas/mpt3sas_ctl.c +++ b/drivers/scsi/mpt3sas/mpt3sas_ctl.c @@ -4034,7 +4034,7 @@ sas_ncq_prio_supported_show(struct devic { struct scsi_device *sdev = to_scsi_device(dev);
- return sysfs_emit(buf, "%d\n", scsih_ncq_prio_supp(sdev)); + return sysfs_emit(buf, "%d\n", sas_ata_ncq_prio_supported(sdev)); } static DEVICE_ATTR_RO(sas_ncq_prio_supported);
@@ -4069,7 +4069,7 @@ sas_ncq_prio_enable_store(struct device if (kstrtobool(buf, &ncq_prio_enable)) return -EINVAL;
- if (!scsih_ncq_prio_supp(sdev)) + if (!sas_ata_ncq_prio_supported(sdev)) return -EINVAL;
sas_device_priv_data->ncq_prio_enable = ncq_prio_enable; --- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c +++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c @@ -12591,29 +12591,6 @@ scsih_pci_mmio_enabled(struct pci_dev *p return PCI_ERS_RESULT_RECOVERED; }
-/** - * scsih_ncq_prio_supp - Check for NCQ command priority support - * @sdev: scsi device struct - * - * This is called when a user indicates they would like to enable - * ncq command priorities. This works only on SATA devices. - */ -bool scsih_ncq_prio_supp(struct scsi_device *sdev) -{ - struct scsi_vpd *vpd; - bool ncq_prio_supp = false; - - rcu_read_lock(); - vpd = rcu_dereference(sdev->vpd_pg89); - if (!vpd || vpd->len < 214) - goto out; - - ncq_prio_supp = (vpd->data[213] >> 4) & 1; -out: - rcu_read_unlock(); - - return ncq_prio_supp; -} /* * The pci device ids are defined in mpi/mpi2_cnfg.h. */ --- a/drivers/scsi/scsi_transport_sas.c +++ b/drivers/scsi/scsi_transport_sas.c @@ -416,6 +416,29 @@ unsigned int sas_is_tlr_enabled(struct s } EXPORT_SYMBOL_GPL(sas_is_tlr_enabled);
+/** + * sas_ata_ncq_prio_supported - Check for ATA NCQ command priority support + * @sdev: SCSI device + * + * Check if an ATA device supports NCQ priority using VPD page 89h (ATA + * Information). Since this VPD page is implemented only for ATA devices, + * this function always returns false for SCSI devices. + */ +bool sas_ata_ncq_prio_supported(struct scsi_device *sdev) +{ + struct scsi_vpd *vpd; + bool ncq_prio_supported = false; + + rcu_read_lock(); + vpd = rcu_dereference(sdev->vpd_pg89); + if (vpd && vpd->len >= 214) + ncq_prio_supported = (vpd->data[213] >> 4) & 1; + rcu_read_unlock(); + + return ncq_prio_supported; +} +EXPORT_SYMBOL_GPL(sas_ata_ncq_prio_supported); + /* * SAS Phy attributes */ --- a/include/scsi/scsi_transport_sas.h +++ b/include/scsi/scsi_transport_sas.h @@ -200,6 +200,8 @@ unsigned int sas_is_tlr_enabled(struct s void sas_disable_tlr(struct scsi_device *); void sas_enable_tlr(struct scsi_device *);
+bool sas_ata_ncq_prio_supported(struct scsi_device *sdev); + extern struct sas_rphy *sas_end_device_alloc(struct sas_port *); extern struct sas_rphy *sas_expander_alloc(struct sas_port *, enum sas_device_type); void sas_rphy_free(struct sas_rphy *);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Breno Leitao leitao@debian.org
commit 4254dfeda82f20844299dca6c38cbffcfd499f41 upstream.
There is a potential out-of-bounds access when using test_bit() on a single word. The test_bit() and set_bit() functions operate on long values, and when testing or setting a single word, they can exceed the word boundary. KASAN detects this issue and produces a dump:
BUG: KASAN: slab-out-of-bounds in _scsih_add_device.constprop.0 (./arch/x86/include/asm/bitops.h:60 ./include/asm-generic/bitops/instrumented-atomic.h:29 drivers/scsi/mpt3sas/mpt3sas_scsih.c:7331) mpt3sas
Write of size 8 at addr ffff8881d26e3c60 by task kworker/u1536:2/2965
For full log, please look at [1].
Make the allocation at least sizeof(unsigned long) bytes so that set_bit() and test_bit() have sufficient room for their read/write operations without overwriting unallocated memory.
[1] Link: https://lore.kernel.org/all/ZkNcALr3W3KGYYJG@gmail.com/
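As a rough sketch of the sizing rule the fix applies (illustrative only; the helper and variable names below are made up, not taken from the driver):

static void *alloc_handle_bitmap(u16 max_handle, size_t *out_sz)
{
	size_t sz = max_handle / 8;	/* one bit per device handle */

	if (max_handle % 8)
		sz++;
	/*
	 * test_bit()/set_bit() read and write whole unsigned longs, so
	 * round the byte count up to a long boundary; otherwise the access
	 * to the last word can run past the end of the allocation.
	 */
	sz = ALIGN(sz, sizeof(unsigned long));
	*out_sz = sz;
	return kzalloc(sz, GFP_KERNEL);
}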
Fixes: c696f7b83ede ("scsi: mpt3sas: Implement device_remove_in_progress check in IOCTL path") Cc: stable@vger.kernel.org Suggested-by: Keith Busch kbusch@kernel.org Signed-off-by: Breno Leitao leitao@debian.org Link: https://lore.kernel.org/r/20240605085530.499432-1-leitao@debian.org Reviewed-by: Keith Busch kbusch@kernel.org Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/scsi/mpt3sas/mpt3sas_base.c | 19 +++++++++++++++++++ 1 file changed, 19 insertions(+)
--- a/drivers/scsi/mpt3sas/mpt3sas_base.c +++ b/drivers/scsi/mpt3sas/mpt3sas_base.c @@ -8497,6 +8497,12 @@ mpt3sas_base_attach(struct MPT3SAS_ADAPT ioc->pd_handles_sz = (ioc->facts.MaxDevHandle / 8); if (ioc->facts.MaxDevHandle % 8) ioc->pd_handles_sz++; + /* + * pd_handles_sz should have, at least, the minimal room for + * set_bit()/test_bit(), otherwise out-of-memory touch may occur. + */ + ioc->pd_handles_sz = ALIGN(ioc->pd_handles_sz, sizeof(unsigned long)); + ioc->pd_handles = kzalloc(ioc->pd_handles_sz, GFP_KERNEL); if (!ioc->pd_handles) { @@ -8514,6 +8520,13 @@ mpt3sas_base_attach(struct MPT3SAS_ADAPT ioc->pend_os_device_add_sz = (ioc->facts.MaxDevHandle / 8); if (ioc->facts.MaxDevHandle % 8) ioc->pend_os_device_add_sz++; + + /* + * pend_os_device_add_sz should have, at least, the minimal room for + * set_bit()/test_bit(), otherwise out-of-memory may occur. + */ + ioc->pend_os_device_add_sz = ALIGN(ioc->pend_os_device_add_sz, + sizeof(unsigned long)); ioc->pend_os_device_add = kzalloc(ioc->pend_os_device_add_sz, GFP_KERNEL); if (!ioc->pend_os_device_add) { @@ -8805,6 +8818,12 @@ _base_check_ioc_facts_changes(struct MPT if (ioc->facts.MaxDevHandle % 8) pd_handles_sz++;
+ /* + * pd_handles should have, at least, the minimal room for + * set_bit()/test_bit(), otherwise out-of-memory touch may + * occur. + */ + pd_handles_sz = ALIGN(pd_handles_sz, sizeof(unsigned long)); pd_handles = krealloc(ioc->pd_handles, pd_handles_sz, GFP_KERNEL); if (!pd_handles) {
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Martin K. Petersen martin.petersen@oracle.com
commit 7926d51f73e0434a6250c2fd1a0555f98d9a62da upstream.
Commit 321da3dc1f3c ("scsi: sd: usb_storage: uas: Access media prior to querying device properties") triggered a read to LBA 0 before attempting to inquire about device characteristics. This was done because some protocol bridge devices will return generic values until an attached storage device's media has been accessed.
Pierre Tomon reported that this change caused problems on a large capacity external drive connected via a bridge device. The bridge in question does not appear to implement the READ(10) command.
Issue a READ(16) instead of READ(10) when a device has been identified as preferring 16-byte commands (use_16_for_rw heuristic).
Link: https://bugzilla.kernel.org/show_bug.cgi?id=218890 Link: https://lore.kernel.org/r/70dd7ae0-b6b1-48e1-bb59-53b7c7f18274@rowland.harva... Link: https://lore.kernel.org/r/20240605022521.3960956-1-martin.petersen@oracle.co... Fixes: 321da3dc1f3c ("scsi: sd: usb_storage: uas: Access media prior to querying device properties") Cc: stable@vger.kernel.org Reported-by: Pierre Tomon pierretom+12@ik.me Suggested-by: Alan Stern stern@rowland.harvard.edu Tested-by: Pierre Tomon pierretom+12@ik.me Reviewed-by: Bart Van Assche bvanassche@acm.org Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/scsi/sd.c | 17 ++++++++++++----- 1 file changed, 12 insertions(+), 5 deletions(-)
--- a/drivers/scsi/sd.c +++ b/drivers/scsi/sd.c @@ -3288,16 +3288,23 @@ static bool sd_validate_opt_xfer_size(st
static void sd_read_block_zero(struct scsi_disk *sdkp) { - unsigned int buf_len = sdkp->device->sector_size; - char *buffer, cmd[10] = { }; + struct scsi_device *sdev = sdkp->device; + unsigned int buf_len = sdev->sector_size; + u8 *buffer, cmd[16] = { };
buffer = kmalloc(buf_len, GFP_KERNEL); if (!buffer) return;
- cmd[0] = READ_10; - put_unaligned_be32(0, &cmd[2]); /* Logical block address 0 */ - put_unaligned_be16(1, &cmd[7]); /* Transfer 1 logical block */ + if (sdev->use_16_for_rw) { + cmd[0] = READ_16; + put_unaligned_be64(0, &cmd[2]); /* Logical block address 0 */ + put_unaligned_be32(1, &cmd[10]);/* Transfer 1 logical block */ + } else { + cmd[0] = READ_10; + put_unaligned_be32(0, &cmd[2]); /* Logical block address 0 */ + put_unaligned_be16(1, &cmd[7]); /* Transfer 1 logical block */ + }
scsi_execute_req(sdkp->device, cmd, DMA_FROM_DEVICE, buffer, buf_len, NULL, SD_TIMEOUT, sdkp->max_retries, NULL);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ziwei Xiao ziweixiao@google.com
commit 6f4d93b78ade0a4c2cafd587f7b429ce95abb02e upstream.
gve_rx_free_skb incorrectly leaves napi->skb referencing an skb after it is freed with dev_kfree_skb_any(). This can result in a subsequent call to napi_get_frags returning a dangling pointer.
Fix this by clearing napi->skb before the skb is freed.
Fixes: 9b8dd5e5ea48 ("gve: DQO: Add RX path") Cc: stable@vger.kernel.org Reported-by: Shailend Chand shailend@google.com Signed-off-by: Ziwei Xiao ziweixiao@google.com Reviewed-by: Harshitha Ramamurthy hramamurthy@google.com Reviewed-by: Shailend Chand shailend@google.com Reviewed-by: Praveen Kaligineedi pkaligineedi@google.com Link: https://lore.kernel.org/r/20240612001654.923887-1-ziweixiao@google.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/google/gve/gve_rx_dqo.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-)
--- a/drivers/net/ethernet/google/gve/gve_rx_dqo.c +++ b/drivers/net/ethernet/google/gve/gve_rx_dqo.c @@ -465,11 +465,13 @@ static void gve_rx_skb_hash(struct sk_bu skb_set_hash(skb, le32_to_cpu(compl_desc->hash), hash_type); }
-static void gve_rx_free_skb(struct gve_rx_ring *rx) +static void gve_rx_free_skb(struct napi_struct *napi, struct gve_rx_ring *rx) { if (!rx->ctx.skb_head) return;
+ if (rx->ctx.skb_head == napi->skb) + napi->skb = NULL; dev_kfree_skb_any(rx->ctx.skb_head); rx->ctx.skb_head = NULL; rx->ctx.skb_tail = NULL; @@ -693,7 +695,7 @@ int gve_rx_poll_dqo(struct gve_notify_bl
err = gve_rx_dqo(napi, rx, compl_desc, rx->q_num); if (err < 0) { - gve_rx_free_skb(rx); + gve_rx_free_skb(napi, rx); u64_stats_update_begin(&rx->statss); if (err == -ENOMEM) rx->rx_skb_alloc_fail++; @@ -736,7 +738,7 @@ int gve_rx_poll_dqo(struct gve_notify_bl
/* gve_rx_complete_skb() will consume skb if successful */ if (gve_rx_complete_skb(rx, napi, compl_desc, feat) != 0) { - gve_rx_free_skb(rx); + gve_rx_free_skb(napi, rx); u64_stats_update_begin(&rx->statss); rx->rx_desc_err_dropped_pkt++; u64_stats_update_end(&rx->statss);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Michael Ellerman mpe@ellerman.id.au
commit 2d43cc701b96f910f50915ac4c2a0cae5deb734c upstream.
Building ppc64le_defconfig with GCC 14 fails with assembler errors:
  CC      fs/readdir.o
/tmp/ccdQn0mD.s: Assembler messages:
/tmp/ccdQn0mD.s:212: Error: operand out of domain (18 is not a multiple of 4)
/tmp/ccdQn0mD.s:226: Error: operand out of domain (18 is not a multiple of 4)
... [6 lines]
/tmp/ccdQn0mD.s:1699: Error: operand out of domain (18 is not a multiple of 4)
A snippet of the asm shows:
# ../fs/readdir.c:210: unsafe_copy_dirent_name(dirent->d_name, name, namlen, efault_end);
	ld 9,0(29)	 # MEM[(u64 *)name_38(D) + _88 * 1], MEM[(u64 *)name_38(D) + _88 * 1]
# 210 "../fs/readdir.c" 1
1:	std 9,18(8)	# put_user
# *__pus_addr_52, MEM[(u64 *)name_38(D) + _88 * 1]
The 'std' instruction requires a 4-byte aligned displacement because it is a DS-form instruction, and as the assembler says, 18 is not a multiple of 4.
A similar error is seen with GCC 13 and CONFIG_UBSAN_SIGNED_WRAP=y.
The fix is to change the constraint on the memory operand to put_user(), from "m", which is a general memory reference, to "YZ".
The "Z" constraint is documented in the GCC manual PowerPC machine constraints, and specifies a "memory operand accessed with indexed or indirect addressing". "Y" is not documented in the manual but specifies a "memory operand for a DS-form instruction". Using both allows the compiler to generate a DS-form "std" or X-form "stdx" as appropriate.
The change has to be conditional on CONFIG_PPC_KERNEL_PREFIXED because the "Y" constraint does not guarantee 4-byte alignment when prefixed instructions are enabled.
Unfortunately clang doesn't support the "Y" constraint so that has to be behind an ifdef.
Although the build error is only seen with GCC 13/14, that appears to just be luck. The constraint has been incorrect since it was first added.
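For illustration only (the operands below are made up, not taken from the patch), the distinction between the two store forms is:

	std  r9,16(r8)      # DS-form: the displacement must be a multiple of 4
	std  r9,18(r8)      # rejected: "operand out of domain", 18 is not a multiple of 4
	stdx r9,r8,r10      # X-form: address comes from two registers, no displacement

With the "YZ" constraint plus the %X operand modifier, the compiler either presents a suitably aligned displacement for the DS-form or switches to the indexed X-form; plain "m" guaranteed neither.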
Fixes: c20beffeec3c ("powerpc/uaccess: Use flexible addressing with __put_user()/__get_user()") Cc: stable@vger.kernel.org # v5.10+ Suggested-by: Kewen Lin linkw@gcc.gnu.org Signed-off-by: Michael Ellerman mpe@ellerman.id.au Link: https://msgid.link/20240529123029.146953-1-mpe@ellerman.id.au Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/powerpc/include/asm/uaccess.h | 15 +++++++++++++-- 1 file changed, 13 insertions(+), 2 deletions(-)
--- a/arch/powerpc/include/asm/uaccess.h +++ b/arch/powerpc/include/asm/uaccess.h @@ -80,9 +80,20 @@ __pu_failed: \ : \ : label)
+#ifdef CONFIG_CC_IS_CLANG +#define DS_FORM_CONSTRAINT "Z<>" +#else +#define DS_FORM_CONSTRAINT "YZ<>" +#endif + #ifdef __powerpc64__ -#define __put_user_asm2_goto(x, ptr, label) \ - __put_user_asm_goto(x, ptr, label, "std") +#define __put_user_asm2_goto(x, addr, label) \ + asm goto ("1: std%U1%X1 %0,%1 # put_user\n" \ + EX_TABLE(1b, %l2) \ + : \ + : "r" (x), DS_FORM_CONSTRAINT (*addr) \ + : \ + : label) #else /* __powerpc64__ */ #define __put_user_asm2_goto(x, addr, label) \ asm goto( \
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dmitry Torokhov dmitry.torokhov@gmail.com
commit 0774d19038c496f0c3602fb505c43e1b2d8eed85 upstream.
If an input device declares too many capability bits then the modalias string for such a device may become too long and not fit into the uevent buffer, resulting in a failure to send said uevent. This, in turn, may prevent userspace from recognizing the existence of such devices.
This is typically not a concern for real hardware devices, as they have a limited number of keys, but it happens with synthetic devices such as the ones created by the xen-kbdfront driver, which creates devices capable of delivering all possible keys since it doesn't know what keys the backend may produce.
To deal with such devices the input core will attempt to trim key data, in the hope that the rest of the modalias string will fit in the given buffer. When trimming key data it will indicate that the data is incomplete by placing a "+," sign at the end, resulting in conversions like this:
old: k71,72,73,74,78,7A,7B,7C,7D,8E,9E,A4,AD,E0,E1,E4,F8,174,
new: k71,72,73,74,78,7A,7B,7C,+,
This should allow existing udev rules to continue to work with existing devices, and will also allow writing more complex rules that recognize a trimmed modalias and check input device characteristics by other means (for example by parsing KEY= data in the uevent or parsing input device sysfs attributes).
Note that the driver core may try adding more uevent environment variables once the input core is done adding its own, so when forming the modalias we cannot use the entire available buffer; we therefore reduce it by a somewhat arbitrary amount (96 bytes).
Reported-by: Jason Andryuk jandryuk@gmail.com Reviewed-by: Peter Hutterer peter.hutterer@who-t.net Tested-by: Jason Andryuk jandryuk@gmail.com Link: https://lore.kernel.org/r/ZjAWMQCJdrxZkvkB@google.com Cc: stable@vger.kernel.org Signed-off-by: Dmitry Torokhov dmitry.torokhov@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Jason Andryuk jason.andryuk@amd.com --- drivers/input/input.c | 104 ++++++++++++++++++++++++++++++++++++++++++-------- 1 file changed, 89 insertions(+), 15 deletions(-)
--- a/drivers/input/input.c +++ b/drivers/input/input.c @@ -1374,19 +1374,19 @@ static int input_print_modalias_bits(cha char name, unsigned long *bm, unsigned int min_bit, unsigned int max_bit) { - int len = 0, i; + int bit = min_bit; + int len = 0;
len += snprintf(buf, max(size, 0), "%c", name); - for (i = min_bit; i < max_bit; i++) - if (bm[BIT_WORD(i)] & BIT_MASK(i)) - len += snprintf(buf + len, max(size - len, 0), "%X,", i); + for_each_set_bit_from(bit, bm, max_bit) + len += snprintf(buf + len, max(size - len, 0), "%X,", bit); return len; }
-static int input_print_modalias(char *buf, int size, struct input_dev *id, - int add_cr) +static int input_print_modalias_parts(char *buf, int size, int full_len, + struct input_dev *id) { - int len; + int len, klen, remainder, space;
len = snprintf(buf, max(size, 0), "input:b%04Xv%04Xp%04Xe%04X-", @@ -1395,8 +1395,48 @@ static int input_print_modalias(char *bu
len += input_print_modalias_bits(buf + len, size - len, 'e', id->evbit, 0, EV_MAX); - len += input_print_modalias_bits(buf + len, size - len, + + /* + * Calculate the remaining space in the buffer making sure we + * have place for the terminating 0. + */ + space = max(size - (len + 1), 0); + + klen = input_print_modalias_bits(buf + len, size - len, 'k', id->keybit, KEY_MIN_INTERESTING, KEY_MAX); + len += klen; + + /* + * If we have more data than we can fit in the buffer, check + * if we can trim key data to fit in the rest. We will indicate + * that key data is incomplete by adding "+" sign at the end, like + * this: * "k1,2,3,45,+,". + * + * Note that we shortest key info (if present) is "k+," so we + * can only try to trim if key data is longer than that. + */ + if (full_len && size < full_len + 1 && klen > 3) { + remainder = full_len - len; + /* + * We can only trim if we have space for the remainder + * and also for at least "k+," which is 3 more characters. + */ + if (remainder <= space - 3) { + /* + * We are guaranteed to have 'k' in the buffer, so + * we need at least 3 additional bytes for storing + * "+," in addition to the remainder. + */ + for (int i = size - 1 - remainder - 3; i >= 0; i--) { + if (buf[i] == 'k' || buf[i] == ',') { + strcpy(buf + i + 1, "+,"); + len = i + 3; /* Not counting '\0' */ + break; + } + } + } + } + len += input_print_modalias_bits(buf + len, size - len, 'r', id->relbit, 0, REL_MAX); len += input_print_modalias_bits(buf + len, size - len, @@ -1412,12 +1452,25 @@ static int input_print_modalias(char *bu len += input_print_modalias_bits(buf + len, size - len, 'w', id->swbit, 0, SW_MAX);
- if (add_cr) - len += snprintf(buf + len, max(size - len, 0), "\n"); - return len; }
+static int input_print_modalias(char *buf, int size, struct input_dev *id) +{ + int full_len; + + /* + * Printing is done in 2 passes: first one figures out total length + * needed for the modalias string, second one will try to trim key + * data in case when buffer is too small for the entire modalias. + * If the buffer is too small regardless, it will fill as much as it + * can (without trimming key data) into the buffer and leave it to + * the caller to figure out what to do with the result. + */ + full_len = input_print_modalias_parts(NULL, 0, 0, id); + return input_print_modalias_parts(buf, size, full_len, id); +} + static ssize_t input_dev_show_modalias(struct device *dev, struct device_attribute *attr, char *buf) @@ -1425,7 +1478,9 @@ static ssize_t input_dev_show_modalias(s struct input_dev *id = to_input_dev(dev); ssize_t len;
- len = input_print_modalias(buf, PAGE_SIZE, id, 1); + len = input_print_modalias(buf, PAGE_SIZE, id); + if (len < PAGE_SIZE - 2) + len += snprintf(buf + len, PAGE_SIZE - len, "\n");
return min_t(int, len, PAGE_SIZE); } @@ -1637,6 +1692,23 @@ static int input_add_uevent_bm_var(struc return 0; }
+/* + * This is a pretty gross hack. When building uevent data the driver core + * may try adding more environment variables to kobj_uevent_env without + * telling us, so we have no idea how much of the buffer we can use to + * avoid overflows/-ENOMEM elsewhere. To work around this let's artificially + * reduce amount of memory we will use for the modalias environment variable. + * + * The potential additions are: + * + * SEQNUM=18446744073709551615 - (%llu - 28 bytes) + * HOME=/ (6 bytes) + * PATH=/sbin:/bin:/usr/sbin:/usr/bin (34 bytes) + * + * 68 bytes total. Allow extra buffer - 96 bytes + */ +#define UEVENT_ENV_EXTRA_LEN 96 + static int input_add_uevent_modalias_var(struct kobj_uevent_env *env, struct input_dev *dev) { @@ -1646,9 +1718,11 @@ static int input_add_uevent_modalias_var return -ENOMEM;
len = input_print_modalias(&env->buf[env->buflen - 1], - sizeof(env->buf) - env->buflen, - dev, 0); - if (len >= (sizeof(env->buf) - env->buflen)) + (int)sizeof(env->buf) - env->buflen - + UEVENT_ENV_EXTRA_LEN, + dev); + if (len >= ((int)sizeof(env->buf) - env->buflen - + UEVENT_ENV_EXTRA_LEN)) return -ENOMEM;
env->buflen += len;
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dave Jiang dave.jiang@intel.com
[ Upstream commit d55510527153d17a3af8cc2df69c04f95ae1350d ]
tools/testing/cxl/test/mem.c uses vmalloc() and vfree() but does not include linux/vmalloc.h. Kernel v6.10 made changes such that the currently included headers no longer pull in vmalloc.h, and therefore mem.c can no longer compile. Add linux/vmalloc.h to fix the compile issue.
  CC [M]  tools/testing/cxl/test/mem.o
tools/testing/cxl/test/mem.c: In function ‘label_area_release’:
tools/testing/cxl/test/mem.c:1428:9: error: implicit declaration of function ‘vfree’; did you mean ‘kvfree’? [-Werror=implicit-function-declaration]
 1428 |         vfree(lsa);
      |         ^~~~~
      |         kvfree
tools/testing/cxl/test/mem.c: In function ‘cxl_mock_mem_probe’:
tools/testing/cxl/test/mem.c:1466:22: error: implicit declaration of function ‘vmalloc’; did you mean ‘kmalloc’? [-Werror=implicit-function-declaration]
 1466 |         mdata->lsa = vmalloc(LSA_SIZE);
      |                      ^~~~~~~
      |                      kmalloc
Fixes: 7d3eb23c4ccf ("tools/testing/cxl: Introduce a mock memory device + driver") Reviewed-by: Dan Williams dan.j.williams@intel.com Reviewed-by: Alison Schofield alison.schofield@intel.com Link: https://lore.kernel.org/r/20240528225551.1025977-1-dave.jiang@intel.com Signed-off-by: Dave Jiang dave.jiang@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/testing/cxl/test/mem.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/tools/testing/cxl/test/mem.c b/tools/testing/cxl/test/mem.c index aa2df3a150518..6e9a89f54d94f 100644 --- a/tools/testing/cxl/test/mem.c +++ b/tools/testing/cxl/test/mem.c @@ -3,6 +3,7 @@
#include <linux/platform_device.h> #include <linux/mod_devicetable.h> +#include <linux/vmalloc.h> #include <linux/module.h> #include <linux/delay.h> #include <linux/sizes.h>
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Baokun Li libaokun1@huawei.com
[ Upstream commit cc5ac966f26193ab185cc43d64d9f1ae998ccb6e ]
Add the missing trace string mappings for the cachefiles_obj_get_ondemand_fd and cachefiles_obj_put_ondemand_fd events; this lets us see the correct trace output.
Fixes: c8383054506c ("cachefiles: notify the user daemon when looking up cookie") Signed-off-by: Baokun Li libaokun1@huawei.com Link: https://lore.kernel.org/r/20240522114308.2402121-2-libaokun@huaweicloud.com Acked-by: Jeff Layton jlayton@kernel.org Reviewed-by: Jingbo Xu jefflexu@linux.alibaba.com Signed-off-by: Christian Brauner brauner@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- include/trace/events/cachefiles.h | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/include/trace/events/cachefiles.h b/include/trace/events/cachefiles.h index d8d4d73fe7b6a..b4939097f5116 100644 --- a/include/trace/events/cachefiles.h +++ b/include/trace/events/cachefiles.h @@ -127,7 +127,9 @@ enum cachefiles_error_trace { EM(cachefiles_obj_see_lookup_cookie, "SEE lookup_cookie") \ EM(cachefiles_obj_see_lookup_failed, "SEE lookup_failed") \ EM(cachefiles_obj_see_withdraw_cookie, "SEE withdraw_cookie") \ - E_(cachefiles_obj_see_withdrawal, "SEE withdrawal") + EM(cachefiles_obj_see_withdrawal, "SEE withdrawal") \ + EM(cachefiles_obj_get_ondemand_fd, "GET ondemand_fd") \ + E_(cachefiles_obj_put_ondemand_fd, "PUT ondemand_fd")
#define cachefiles_coherency_traces \ EM(cachefiles_coherency_check_aux, "BAD aux ") \
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Baokun Li libaokun1@huawei.com
[ Upstream commit 0fc75c5940fa634d84e64c93bfc388e1274ed013 ]
Even with CACHEFILES_DEAD set, we can still read the requests, so in the following concurrent scenario a request may be used after it has been freed:
mount | daemon_thread1 | daemon_thread2
------------------------------------------------------------
cachefiles_ondemand_init_object
 cachefiles_ondemand_send_req
  REQ_A = kzalloc(sizeof(*req) + data_len)
  wait_for_completion(&REQ_A->done)

             cachefiles_daemon_read
              cachefiles_ondemand_daemon_read

                                        // close dev fd
                                        cachefiles_flush_reqs
                                         complete(&REQ_A->done)
  kfree(REQ_A)
              xa_lock(&cache->reqs);
              cachefiles_ondemand_select_req
               req->msg.opcode != CACHEFILES_OP_READ
               // req use-after-free !!!
              xa_unlock(&cache->reqs);
                                        xa_destroy(&cache->reqs)
Hence remove requests from cache->reqs when flushing them to avoid accessing freed requests.
Fixes: c8383054506c ("cachefiles: notify the user daemon when looking up cookie") Signed-off-by: Baokun Li libaokun1@huawei.com Link: https://lore.kernel.org/r/20240522114308.2402121-3-libaokun@huaweicloud.com Acked-by: Jeff Layton jlayton@kernel.org Reviewed-by: Jia Zhu zhujia.zj@bytedance.com Reviewed-by: Gao Xiang hsiangkao@linux.alibaba.com Reviewed-by: Jingbo Xu jefflexu@linux.alibaba.com Signed-off-by: Christian Brauner brauner@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/cachefiles/daemon.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/fs/cachefiles/daemon.c b/fs/cachefiles/daemon.c index 5f4df9588620f..7d1f456e376dd 100644 --- a/fs/cachefiles/daemon.c +++ b/fs/cachefiles/daemon.c @@ -158,6 +158,7 @@ static void cachefiles_flush_reqs(struct cachefiles_cache *cache) xa_for_each(xa, index, req) { req->error = -EIO; complete(&req->done); + __xa_erase(xa, index); } xa_unlock(xa);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jia Zhu zhujia.zj@bytedance.com
[ Upstream commit 357a18d033143617e9c7d420c8f0dd4cbab5f34d ]
Previously, the @ondemand_id field was used not only to identify the ondemand state of the object, but also to represent the index of the xarray. This commit introduces a @state field to decouple the roles of @ondemand_id and adds helpers to access it.
Signed-off-by: Jia Zhu zhujia.zj@bytedance.com Link: https://lore.kernel.org/r/20231120041422.75170-2-zhujia.zj@bytedance.com Reviewed-by: Jingbo Xu jefflexu@linux.alibaba.com Reviewed-by: David Howells dhowells@redhat.com Signed-off-by: Christian Brauner brauner@kernel.org Stable-dep-of: 0a790040838c ("cachefiles: add spin_lock for cachefiles_ondemand_info") Signed-off-by: Sasha Levin sashal@kernel.org --- fs/cachefiles/internal.h | 21 +++++++++++++++++++++ fs/cachefiles/ondemand.c | 21 +++++++++------------ 2 files changed, 30 insertions(+), 12 deletions(-)
diff --git a/fs/cachefiles/internal.h b/fs/cachefiles/internal.h index 2ad58c4652084..00beedeaec183 100644 --- a/fs/cachefiles/internal.h +++ b/fs/cachefiles/internal.h @@ -44,6 +44,11 @@ struct cachefiles_volume { struct dentry *fanout[256]; /* Fanout subdirs */ };
+enum cachefiles_object_state { + CACHEFILES_ONDEMAND_OBJSTATE_CLOSE, /* Anonymous fd closed by daemon or initial state */ + CACHEFILES_ONDEMAND_OBJSTATE_OPEN, /* Anonymous fd associated with object is available */ +}; + /* * Backing file state. */ @@ -62,6 +67,7 @@ struct cachefiles_object { #define CACHEFILES_OBJECT_USING_TMPFILE 0 /* Have an unlinked tmpfile */ #ifdef CONFIG_CACHEFILES_ONDEMAND int ondemand_id; + enum cachefiles_object_state state; #endif };
@@ -296,6 +302,21 @@ extern void cachefiles_ondemand_clean_object(struct cachefiles_object *object); extern int cachefiles_ondemand_read(struct cachefiles_object *object, loff_t pos, size_t len);
+#define CACHEFILES_OBJECT_STATE_FUNCS(_state, _STATE) \ +static inline bool \ +cachefiles_ondemand_object_is_##_state(const struct cachefiles_object *object) \ +{ \ + return object->state == CACHEFILES_ONDEMAND_OBJSTATE_##_STATE; \ +} \ + \ +static inline void \ +cachefiles_ondemand_set_object_##_state(struct cachefiles_object *object) \ +{ \ + object->state = CACHEFILES_ONDEMAND_OBJSTATE_##_STATE; \ +} + +CACHEFILES_OBJECT_STATE_FUNCS(open, OPEN); +CACHEFILES_OBJECT_STATE_FUNCS(close, CLOSE); #else static inline ssize_t cachefiles_ondemand_daemon_read(struct cachefiles_cache *cache, char __user *_buffer, size_t buflen) diff --git a/fs/cachefiles/ondemand.c b/fs/cachefiles/ondemand.c index 0254ed39f68ce..90456b8a4b3e0 100644 --- a/fs/cachefiles/ondemand.c +++ b/fs/cachefiles/ondemand.c @@ -15,6 +15,7 @@ static int cachefiles_ondemand_fd_release(struct inode *inode,
xa_lock(&cache->reqs); object->ondemand_id = CACHEFILES_ONDEMAND_ID_CLOSED; + cachefiles_ondemand_set_object_close(object);
/* * Flush all pending READ requests since their completion depends on @@ -176,6 +177,8 @@ int cachefiles_ondemand_copen(struct cachefiles_cache *cache, char *args) set_bit(FSCACHE_COOKIE_NO_DATA_TO_READ, &cookie->flags); trace_cachefiles_ondemand_copen(req->object, id, size);
+ cachefiles_ondemand_set_object_open(req->object); + out: complete(&req->done); return ret; @@ -363,7 +366,8 @@ static int cachefiles_ondemand_send_req(struct cachefiles_object *object, /* coupled with the barrier in cachefiles_flush_reqs() */ smp_mb();
- if (opcode != CACHEFILES_OP_OPEN && object->ondemand_id <= 0) { + if (opcode != CACHEFILES_OP_OPEN && + !cachefiles_ondemand_object_is_open(object)) { WARN_ON_ONCE(object->ondemand_id == 0); xas_unlock(&xas); ret = -EIO; @@ -430,18 +434,11 @@ static int cachefiles_ondemand_init_close_req(struct cachefiles_req *req, void *private) { struct cachefiles_object *object = req->object; - int object_id = object->ondemand_id;
- /* - * It's possible that object id is still 0 if the cookie looking up - * phase failed before OPEN request has ever been sent. Also avoid - * sending CLOSE request for CACHEFILES_ONDEMAND_ID_CLOSED, which means - * anon_fd has already been closed. - */ - if (object_id <= 0) + if (!cachefiles_ondemand_object_is_open(object)) return -ENOENT;
- req->msg.object_id = object_id; + req->msg.object_id = object->ondemand_id; trace_cachefiles_ondemand_close(object, &req->msg); return 0; } @@ -460,7 +457,7 @@ static int cachefiles_ondemand_init_read_req(struct cachefiles_req *req, int object_id = object->ondemand_id;
/* Stop enqueuing requests when daemon has closed anon_fd. */ - if (object_id <= 0) { + if (!cachefiles_ondemand_object_is_open(object)) { WARN_ON_ONCE(object_id == 0); pr_info_once("READ: anonymous fd closed prematurely.\n"); return -EIO; @@ -485,7 +482,7 @@ int cachefiles_ondemand_init_object(struct cachefiles_object *object) * creating a new tmpfile as the cache file. Reuse the previously * allocated object ID if any. */ - if (object->ondemand_id > 0) + if (cachefiles_ondemand_object_is_open(object)) return 0;
volume_key_size = volume->key[0] + 1;
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jia Zhu zhujia.zj@bytedance.com
[ Upstream commit 3c5ecfe16e7699011c12c2d44e55437415331fa3 ]
We'll introduce a @work_struct field for @object in subsequent patches, which will enlarge the size of @object. As a result, this commit extracts the ondemand info fields from @object into a separate structure.
Signed-off-by: Jia Zhu zhujia.zj@bytedance.com Link: https://lore.kernel.org/r/20231120041422.75170-3-zhujia.zj@bytedance.com Reviewed-by: Jingbo Xu jefflexu@linux.alibaba.com Reviewed-by: David Howells dhowells@redhat.com Signed-off-by: Christian Brauner brauner@kernel.org Stable-dep-of: 0a790040838c ("cachefiles: add spin_lock for cachefiles_ondemand_info") Signed-off-by: Sasha Levin sashal@kernel.org --- fs/cachefiles/interface.c | 7 ++++++- fs/cachefiles/internal.h | 26 ++++++++++++++++++++++---- fs/cachefiles/ondemand.c | 34 ++++++++++++++++++++++++++++------ 3 files changed, 56 insertions(+), 11 deletions(-)
diff --git a/fs/cachefiles/interface.c b/fs/cachefiles/interface.c index a69073a1d3f06..bde23e156a63c 100644 --- a/fs/cachefiles/interface.c +++ b/fs/cachefiles/interface.c @@ -31,6 +31,11 @@ struct cachefiles_object *cachefiles_alloc_object(struct fscache_cookie *cookie) if (!object) return NULL;
+ if (cachefiles_ondemand_init_obj_info(object, volume)) { + kmem_cache_free(cachefiles_object_jar, object); + return NULL; + } + refcount_set(&object->ref, 1);
spin_lock_init(&object->lock); @@ -88,7 +93,7 @@ void cachefiles_put_object(struct cachefiles_object *object, ASSERTCMP(object->file, ==, NULL);
kfree(object->d_name); - + cachefiles_ondemand_deinit_obj_info(object); cache = object->volume->cache->cache; fscache_put_cookie(object->cookie, fscache_cookie_put_object); object->cookie = NULL; diff --git a/fs/cachefiles/internal.h b/fs/cachefiles/internal.h index 00beedeaec183..b0fe76964bc0d 100644 --- a/fs/cachefiles/internal.h +++ b/fs/cachefiles/internal.h @@ -49,6 +49,12 @@ enum cachefiles_object_state { CACHEFILES_ONDEMAND_OBJSTATE_OPEN, /* Anonymous fd associated with object is available */ };
+struct cachefiles_ondemand_info { + int ondemand_id; + enum cachefiles_object_state state; + struct cachefiles_object *object; +}; + /* * Backing file state. */ @@ -66,8 +72,7 @@ struct cachefiles_object { unsigned long flags; #define CACHEFILES_OBJECT_USING_TMPFILE 0 /* Have an unlinked tmpfile */ #ifdef CONFIG_CACHEFILES_ONDEMAND - int ondemand_id; - enum cachefiles_object_state state; + struct cachefiles_ondemand_info *ondemand; #endif };
@@ -302,17 +307,21 @@ extern void cachefiles_ondemand_clean_object(struct cachefiles_object *object); extern int cachefiles_ondemand_read(struct cachefiles_object *object, loff_t pos, size_t len);
+extern int cachefiles_ondemand_init_obj_info(struct cachefiles_object *obj, + struct cachefiles_volume *volume); +extern void cachefiles_ondemand_deinit_obj_info(struct cachefiles_object *obj); + #define CACHEFILES_OBJECT_STATE_FUNCS(_state, _STATE) \ static inline bool \ cachefiles_ondemand_object_is_##_state(const struct cachefiles_object *object) \ { \ - return object->state == CACHEFILES_ONDEMAND_OBJSTATE_##_STATE; \ + return object->ondemand->state == CACHEFILES_ONDEMAND_OBJSTATE_##_STATE; \ } \ \ static inline void \ cachefiles_ondemand_set_object_##_state(struct cachefiles_object *object) \ { \ - object->state = CACHEFILES_ONDEMAND_OBJSTATE_##_STATE; \ + object->ondemand->state = CACHEFILES_ONDEMAND_OBJSTATE_##_STATE; \ }
CACHEFILES_OBJECT_STATE_FUNCS(open, OPEN); @@ -338,6 +347,15 @@ static inline int cachefiles_ondemand_read(struct cachefiles_object *object, { return -EOPNOTSUPP; } + +static inline int cachefiles_ondemand_init_obj_info(struct cachefiles_object *obj, + struct cachefiles_volume *volume) +{ + return 0; +} +static inline void cachefiles_ondemand_deinit_obj_info(struct cachefiles_object *obj) +{ +} #endif
/* diff --git a/fs/cachefiles/ondemand.c b/fs/cachefiles/ondemand.c index 90456b8a4b3e0..deb7e3007aa1d 100644 --- a/fs/cachefiles/ondemand.c +++ b/fs/cachefiles/ondemand.c @@ -9,12 +9,13 @@ static int cachefiles_ondemand_fd_release(struct inode *inode, { struct cachefiles_object *object = file->private_data; struct cachefiles_cache *cache = object->volume->cache; - int object_id = object->ondemand_id; + struct cachefiles_ondemand_info *info = object->ondemand; + int object_id = info->ondemand_id; struct cachefiles_req *req; XA_STATE(xas, &cache->reqs, 0);
xa_lock(&cache->reqs); - object->ondemand_id = CACHEFILES_ONDEMAND_ID_CLOSED; + info->ondemand_id = CACHEFILES_ONDEMAND_ID_CLOSED; cachefiles_ondemand_set_object_close(object);
/* @@ -222,7 +223,7 @@ static int cachefiles_ondemand_get_fd(struct cachefiles_req *req) load = (void *)req->msg.data; load->fd = fd; req->msg.object_id = object_id; - object->ondemand_id = object_id; + object->ondemand->ondemand_id = object_id;
cachefiles_get_unbind_pincount(cache); trace_cachefiles_ondemand_open(object, &req->msg, load); @@ -368,7 +369,7 @@ static int cachefiles_ondemand_send_req(struct cachefiles_object *object,
if (opcode != CACHEFILES_OP_OPEN && !cachefiles_ondemand_object_is_open(object)) { - WARN_ON_ONCE(object->ondemand_id == 0); + WARN_ON_ONCE(object->ondemand->ondemand_id == 0); xas_unlock(&xas); ret = -EIO; goto out; @@ -438,7 +439,7 @@ static int cachefiles_ondemand_init_close_req(struct cachefiles_req *req, if (!cachefiles_ondemand_object_is_open(object)) return -ENOENT;
- req->msg.object_id = object->ondemand_id; + req->msg.object_id = object->ondemand->ondemand_id; trace_cachefiles_ondemand_close(object, &req->msg); return 0; } @@ -454,7 +455,7 @@ static int cachefiles_ondemand_init_read_req(struct cachefiles_req *req, struct cachefiles_object *object = req->object; struct cachefiles_read *load = (void *)req->msg.data; struct cachefiles_read_ctx *read_ctx = private; - int object_id = object->ondemand_id; + int object_id = object->ondemand->ondemand_id;
/* Stop enqueuing requests when daemon has closed anon_fd. */ if (!cachefiles_ondemand_object_is_open(object)) { @@ -500,6 +501,27 @@ void cachefiles_ondemand_clean_object(struct cachefiles_object *object) cachefiles_ondemand_init_close_req, NULL); }
+int cachefiles_ondemand_init_obj_info(struct cachefiles_object *object, + struct cachefiles_volume *volume) +{ + if (!cachefiles_in_ondemand_mode(volume->cache)) + return 0; + + object->ondemand = kzalloc(sizeof(struct cachefiles_ondemand_info), + GFP_KERNEL); + if (!object->ondemand) + return -ENOMEM; + + object->ondemand->object = object; + return 0; +} + +void cachefiles_ondemand_deinit_obj_info(struct cachefiles_object *object) +{ + kfree(object->ondemand); + object->ondemand = NULL; +} + int cachefiles_ondemand_read(struct cachefiles_object *object, loff_t pos, size_t len) {
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jia Zhu zhujia.zj@bytedance.com
[ Upstream commit 0a7e54c1959c0feb2de23397ec09c7692364313e ]
When an anonymous fd is closed by the user daemon, if a new read request for this file comes up, the anonymous fd should be re-opened to handle that read request rather than failing it directly.
1. Introduce a reopening state for objects that are closed but have inflight/subsequent read requests.
2. No longer flush READ requests but only CLOSE requests when the anonymous fd is closed.
3. Enqueue the reopen work to a workqueue, so that the user daemon can get out of the daemon_read context and handle that request smoothly. Otherwise, the user daemon would send a reopen request and wait for itself to process the request.
Signed-off-by: Jia Zhu zhujia.zj@bytedance.com Link: https://lore.kernel.org/r/20231120041422.75170-4-zhujia.zj@bytedance.com Reviewed-by: Jingbo Xu jefflexu@linux.alibaba.com Reviewed-by: David Howells dhowells@redhat.com Signed-off-by: Christian Brauner brauner@kernel.org Stable-dep-of: 0a790040838c ("cachefiles: add spin_lock for cachefiles_ondemand_info") Signed-off-by: Sasha Levin sashal@kernel.org --- fs/cachefiles/internal.h | 3 ++ fs/cachefiles/ondemand.c | 98 ++++++++++++++++++++++++++++------------ 2 files changed, 72 insertions(+), 29 deletions(-)
diff --git a/fs/cachefiles/internal.h b/fs/cachefiles/internal.h index b0fe76964bc0d..b9a90f1a0c015 100644 --- a/fs/cachefiles/internal.h +++ b/fs/cachefiles/internal.h @@ -47,9 +47,11 @@ struct cachefiles_volume { enum cachefiles_object_state { CACHEFILES_ONDEMAND_OBJSTATE_CLOSE, /* Anonymous fd closed by daemon or initial state */ CACHEFILES_ONDEMAND_OBJSTATE_OPEN, /* Anonymous fd associated with object is available */ + CACHEFILES_ONDEMAND_OBJSTATE_REOPENING, /* Object that was closed and is being reopened. */ };
struct cachefiles_ondemand_info { + struct work_struct ondemand_work; int ondemand_id; enum cachefiles_object_state state; struct cachefiles_object *object; @@ -326,6 +328,7 @@ cachefiles_ondemand_set_object_##_state(struct cachefiles_object *object) \
CACHEFILES_OBJECT_STATE_FUNCS(open, OPEN); CACHEFILES_OBJECT_STATE_FUNCS(close, CLOSE); +CACHEFILES_OBJECT_STATE_FUNCS(reopening, REOPENING); #else static inline ssize_t cachefiles_ondemand_daemon_read(struct cachefiles_cache *cache, char __user *_buffer, size_t buflen) diff --git a/fs/cachefiles/ondemand.c b/fs/cachefiles/ondemand.c index deb7e3007aa1d..8e130de952f7d 100644 --- a/fs/cachefiles/ondemand.c +++ b/fs/cachefiles/ondemand.c @@ -18,14 +18,10 @@ static int cachefiles_ondemand_fd_release(struct inode *inode, info->ondemand_id = CACHEFILES_ONDEMAND_ID_CLOSED; cachefiles_ondemand_set_object_close(object);
- /* - * Flush all pending READ requests since their completion depends on - * anon_fd. - */ - xas_for_each(&xas, req, ULONG_MAX) { + /* Only flush CACHEFILES_REQ_NEW marked req to avoid race with daemon_read */ + xas_for_each_marked(&xas, req, ULONG_MAX, CACHEFILES_REQ_NEW) { if (req->msg.object_id == object_id && - req->msg.opcode == CACHEFILES_OP_READ) { - req->error = -EIO; + req->msg.opcode == CACHEFILES_OP_CLOSE) { complete(&req->done); xas_store(&xas, NULL); } @@ -179,6 +175,7 @@ int cachefiles_ondemand_copen(struct cachefiles_cache *cache, char *args) trace_cachefiles_ondemand_copen(req->object, id, size);
cachefiles_ondemand_set_object_open(req->object); + wake_up_all(&cache->daemon_pollwq);
out: complete(&req->done); @@ -222,7 +219,6 @@ static int cachefiles_ondemand_get_fd(struct cachefiles_req *req)
load = (void *)req->msg.data; load->fd = fd; - req->msg.object_id = object_id; object->ondemand->ondemand_id = object_id;
cachefiles_get_unbind_pincount(cache); @@ -238,6 +234,43 @@ static int cachefiles_ondemand_get_fd(struct cachefiles_req *req) return ret; }
+static void ondemand_object_worker(struct work_struct *work) +{ + struct cachefiles_ondemand_info *info = + container_of(work, struct cachefiles_ondemand_info, ondemand_work); + + cachefiles_ondemand_init_object(info->object); +} + +/* + * If there are any inflight or subsequent READ requests on the + * closed object, reopen it. + * Skip read requests whose related object is reopening. + */ +static struct cachefiles_req *cachefiles_ondemand_select_req(struct xa_state *xas, + unsigned long xa_max) +{ + struct cachefiles_req *req; + struct cachefiles_object *object; + struct cachefiles_ondemand_info *info; + + xas_for_each_marked(xas, req, xa_max, CACHEFILES_REQ_NEW) { + if (req->msg.opcode != CACHEFILES_OP_READ) + return req; + object = req->object; + info = object->ondemand; + if (cachefiles_ondemand_object_is_close(object)) { + cachefiles_ondemand_set_object_reopening(object); + queue_work(fscache_wq, &info->ondemand_work); + continue; + } + if (cachefiles_ondemand_object_is_reopening(object)) + continue; + return req; + } + return NULL; +} + ssize_t cachefiles_ondemand_daemon_read(struct cachefiles_cache *cache, char __user *_buffer, size_t buflen) { @@ -248,16 +281,16 @@ ssize_t cachefiles_ondemand_daemon_read(struct cachefiles_cache *cache, int ret = 0; XA_STATE(xas, &cache->reqs, cache->req_id_next);
+ xa_lock(&cache->reqs); /* * Cyclically search for a request that has not ever been processed, * to prevent requests from being processed repeatedly, and make * request distribution fair. */ - xa_lock(&cache->reqs); - req = xas_find_marked(&xas, UINT_MAX, CACHEFILES_REQ_NEW); + req = cachefiles_ondemand_select_req(&xas, ULONG_MAX); if (!req && cache->req_id_next > 0) { xas_set(&xas, 0); - req = xas_find_marked(&xas, cache->req_id_next - 1, CACHEFILES_REQ_NEW); + req = cachefiles_ondemand_select_req(&xas, cache->req_id_next - 1); } if (!req) { xa_unlock(&cache->reqs); @@ -277,14 +310,18 @@ ssize_t cachefiles_ondemand_daemon_read(struct cachefiles_cache *cache, xa_unlock(&cache->reqs);
id = xas.xa_index; - msg->msg_id = id;
if (msg->opcode == CACHEFILES_OP_OPEN) { ret = cachefiles_ondemand_get_fd(req); - if (ret) + if (ret) { + cachefiles_ondemand_set_object_close(req->object); goto error; + } }
+ msg->msg_id = id; + msg->object_id = req->object->ondemand->ondemand_id; + if (copy_to_user(_buffer, msg, n) != 0) { ret = -EFAULT; goto err_put_fd; @@ -317,19 +354,23 @@ static int cachefiles_ondemand_send_req(struct cachefiles_object *object, void *private) { struct cachefiles_cache *cache = object->volume->cache; - struct cachefiles_req *req; + struct cachefiles_req *req = NULL; XA_STATE(xas, &cache->reqs, 0); int ret;
if (!test_bit(CACHEFILES_ONDEMAND_MODE, &cache->flags)) return 0;
- if (test_bit(CACHEFILES_DEAD, &cache->flags)) - return -EIO; + if (test_bit(CACHEFILES_DEAD, &cache->flags)) { + ret = -EIO; + goto out; + }
req = kzalloc(sizeof(*req) + data_len, GFP_KERNEL); - if (!req) - return -ENOMEM; + if (!req) { + ret = -ENOMEM; + goto out; + }
req->object = object; init_completion(&req->done); @@ -367,7 +408,7 @@ static int cachefiles_ondemand_send_req(struct cachefiles_object *object, /* coupled with the barrier in cachefiles_flush_reqs() */ smp_mb();
- if (opcode != CACHEFILES_OP_OPEN && + if (opcode == CACHEFILES_OP_CLOSE && !cachefiles_ondemand_object_is_open(object)) { WARN_ON_ONCE(object->ondemand->ondemand_id == 0); xas_unlock(&xas); @@ -392,7 +433,15 @@ static int cachefiles_ondemand_send_req(struct cachefiles_object *object, wake_up_all(&cache->daemon_pollwq); wait_for_completion(&req->done); ret = req->error; + kfree(req); + return ret; out: + /* Reset the object to close state in error handling path. + * If error occurs after creating the anonymous fd, + * cachefiles_ondemand_fd_release() will set object to close. + */ + if (opcode == CACHEFILES_OP_OPEN) + cachefiles_ondemand_set_object_close(object); kfree(req); return ret; } @@ -439,7 +488,6 @@ static int cachefiles_ondemand_init_close_req(struct cachefiles_req *req, if (!cachefiles_ondemand_object_is_open(object)) return -ENOENT;
- req->msg.object_id = object->ondemand->ondemand_id; trace_cachefiles_ondemand_close(object, &req->msg); return 0; } @@ -455,16 +503,7 @@ static int cachefiles_ondemand_init_read_req(struct cachefiles_req *req, struct cachefiles_object *object = req->object; struct cachefiles_read *load = (void *)req->msg.data; struct cachefiles_read_ctx *read_ctx = private; - int object_id = object->ondemand->ondemand_id; - - /* Stop enqueuing requests when daemon has closed anon_fd. */ - if (!cachefiles_ondemand_object_is_open(object)) { - WARN_ON_ONCE(object_id == 0); - pr_info_once("READ: anonymous fd closed prematurely.\n"); - return -EIO; - }
- req->msg.object_id = object_id; load->off = read_ctx->off; load->len = read_ctx->len; trace_cachefiles_ondemand_read(object, &req->msg, load); @@ -513,6 +552,7 @@ int cachefiles_ondemand_init_obj_info(struct cachefiles_object *object, return -ENOMEM;
object->ondemand->object = object; + INIT_WORK(&object->ondemand->ondemand_work, ondemand_object_worker); return 0; }
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Baokun Li libaokun1@huawei.com
[ Upstream commit 0a790040838c736495d5afd6b2d636f159f817f1 ]
The following concurrency may cause a read request to fail to be completed and result in a hang:
t1 | t2
---------------------------------------------------------
cachefiles_ondemand_copen
 req = xa_erase(&cache->reqs, id)
                    // Anon fd is maliciously closed.
                    cachefiles_ondemand_fd_release
                     xa_lock(&cache->reqs)
                     cachefiles_ondemand_set_object_close(object)
                     xa_unlock(&cache->reqs)
 cachefiles_ondemand_set_object_open
 // No one will ever close it again.
                    cachefiles_ondemand_daemon_read
                     cachefiles_ondemand_select_req
// Get a read req but its fd is already closed.
// The daemon can't issue a cread ioctl with an closed fd, then hung.
So add a spinlock to cachefiles_ondemand_info to protect ondemand_id and state; this way we can avoid the above problem in cachefiles_ondemand_copen() by using ondemand_id to determine whether the fd has been closed.
Fixes: c8383054506c ("cachefiles: notify the user daemon when looking up cookie") Signed-off-by: Baokun Li libaokun1@huawei.com Link: https://lore.kernel.org/r/20240522114308.2402121-8-libaokun@huaweicloud.com Acked-by: Jeff Layton jlayton@kernel.org Signed-off-by: Christian Brauner brauner@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/cachefiles/internal.h | 1 + fs/cachefiles/ondemand.c | 35 ++++++++++++++++++++++++++++++++++- 2 files changed, 35 insertions(+), 1 deletion(-)
diff --git a/fs/cachefiles/internal.h b/fs/cachefiles/internal.h index b9a90f1a0c015..33fe418aca770 100644 --- a/fs/cachefiles/internal.h +++ b/fs/cachefiles/internal.h @@ -55,6 +55,7 @@ struct cachefiles_ondemand_info { int ondemand_id; enum cachefiles_object_state state; struct cachefiles_object *object; + spinlock_t lock; };
/* diff --git a/fs/cachefiles/ondemand.c b/fs/cachefiles/ondemand.c index 8e130de952f7d..8118649d30727 100644 --- a/fs/cachefiles/ondemand.c +++ b/fs/cachefiles/ondemand.c @@ -10,13 +10,16 @@ static int cachefiles_ondemand_fd_release(struct inode *inode, struct cachefiles_object *object = file->private_data; struct cachefiles_cache *cache = object->volume->cache; struct cachefiles_ondemand_info *info = object->ondemand; - int object_id = info->ondemand_id; + int object_id; struct cachefiles_req *req; XA_STATE(xas, &cache->reqs, 0);
xa_lock(&cache->reqs); + spin_lock(&info->lock); + object_id = info->ondemand_id; info->ondemand_id = CACHEFILES_ONDEMAND_ID_CLOSED; cachefiles_ondemand_set_object_close(object); + spin_unlock(&info->lock);
/* Only flush CACHEFILES_REQ_NEW marked req to avoid race with daemon_read */ xas_for_each_marked(&xas, req, ULONG_MAX, CACHEFILES_REQ_NEW) { @@ -116,6 +119,7 @@ int cachefiles_ondemand_copen(struct cachefiles_cache *cache, char *args) { struct cachefiles_req *req; struct fscache_cookie *cookie; + struct cachefiles_ondemand_info *info; char *pid, *psize; unsigned long id; long size; @@ -166,6 +170,33 @@ int cachefiles_ondemand_copen(struct cachefiles_cache *cache, char *args) goto out; }
+ info = req->object->ondemand; + spin_lock(&info->lock); + /* + * The anonymous fd was closed before copen ? Fail the request. + * + * t1 | t2 + * --------------------------------------------------------- + * cachefiles_ondemand_copen + * req = xa_erase(&cache->reqs, id) + * // Anon fd is maliciously closed. + * cachefiles_ondemand_fd_release + * xa_lock(&cache->reqs) + * cachefiles_ondemand_set_object_close(object) + * xa_unlock(&cache->reqs) + * cachefiles_ondemand_set_object_open + * // No one will ever close it again. + * cachefiles_ondemand_daemon_read + * cachefiles_ondemand_select_req + * + * Get a read req but its fd is already closed. The daemon can't + * issue a cread ioctl with an closed fd, then hung. + */ + if (info->ondemand_id == CACHEFILES_ONDEMAND_ID_CLOSED) { + spin_unlock(&info->lock); + req->error = -EBADFD; + goto out; + } cookie = req->object->cookie; cookie->object_size = size; if (size) @@ -175,6 +206,7 @@ int cachefiles_ondemand_copen(struct cachefiles_cache *cache, char *args) trace_cachefiles_ondemand_copen(req->object, id, size);
cachefiles_ondemand_set_object_open(req->object); + spin_unlock(&info->lock); wake_up_all(&cache->daemon_pollwq);
out: @@ -552,6 +584,7 @@ int cachefiles_ondemand_init_obj_info(struct cachefiles_object *object, return -ENOMEM;
object->ondemand->object = object; + spin_lock_init(&object->ondemand->lock); INIT_WORK(&object->ondemand->ondemand_work, ondemand_object_worker); return 0; }
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jia Zhu zhujia.zj@bytedance.com
[ Upstream commit e73fa11a356ca0905c3cc648eaacc6f0f2d2c8b3 ]
Previously, in the ondemand read scenario, if the anonymous fd was closed by the user daemon, inflight and subsequent read requests would return EIO. As long as the device connection is not released, the user daemon can hold and restore inflight requests by setting the request flag to CACHEFILES_REQ_NEW.
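From the daemon's point of view, recovery then amounts to a single command written to the cachefiles device (illustrative; devfd is assumed to be the daemon's open /dev/cachefiles descriptor):

	write(devfd, "restore", strlen("restore"));

after which the re-marked requests become readable again through the normal daemon_read path.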
Suggested-by: Gao Xiang hsiangkao@linux.alibaba.com Signed-off-by: Jia Zhu zhujia.zj@bytedance.com Signed-off-by: Xin Yin yinxin.x@bytedance.com Link: https://lore.kernel.org/r/20231120041422.75170-6-zhujia.zj@bytedance.com Reviewed-by: Jingbo Xu jefflexu@linux.alibaba.com Reviewed-by: David Howells dhowells@redhat.com Signed-off-by: Christian Brauner brauner@kernel.org Stable-dep-of: 4b4391e77a6b ("cachefiles: defer exposing anon_fd until after copy_to_user() succeeds") Signed-off-by: Sasha Levin sashal@kernel.org --- fs/cachefiles/daemon.c | 1 + fs/cachefiles/internal.h | 3 +++ fs/cachefiles/ondemand.c | 23 +++++++++++++++++++++++ 3 files changed, 27 insertions(+)
diff --git a/fs/cachefiles/daemon.c b/fs/cachefiles/daemon.c index 7d1f456e376dd..26b487e112596 100644 --- a/fs/cachefiles/daemon.c +++ b/fs/cachefiles/daemon.c @@ -77,6 +77,7 @@ static const struct cachefiles_daemon_cmd cachefiles_daemon_cmds[] = { { "tag", cachefiles_daemon_tag }, #ifdef CONFIG_CACHEFILES_ONDEMAND { "copen", cachefiles_ondemand_copen }, + { "restore", cachefiles_ondemand_restore }, #endif { "", NULL } }; diff --git a/fs/cachefiles/internal.h b/fs/cachefiles/internal.h index 33fe418aca770..361356d0e866a 100644 --- a/fs/cachefiles/internal.h +++ b/fs/cachefiles/internal.h @@ -304,6 +304,9 @@ extern ssize_t cachefiles_ondemand_daemon_read(struct cachefiles_cache *cache, extern int cachefiles_ondemand_copen(struct cachefiles_cache *cache, char *args);
+extern int cachefiles_ondemand_restore(struct cachefiles_cache *cache, + char *args); + extern int cachefiles_ondemand_init_object(struct cachefiles_object *object); extern void cachefiles_ondemand_clean_object(struct cachefiles_object *object);
diff --git a/fs/cachefiles/ondemand.c b/fs/cachefiles/ondemand.c index 8118649d30727..6d8f7f01a73ac 100644 --- a/fs/cachefiles/ondemand.c +++ b/fs/cachefiles/ondemand.c @@ -214,6 +214,29 @@ int cachefiles_ondemand_copen(struct cachefiles_cache *cache, char *args) return ret; }
+int cachefiles_ondemand_restore(struct cachefiles_cache *cache, char *args) +{ + struct cachefiles_req *req; + + XA_STATE(xas, &cache->reqs, 0); + + if (!test_bit(CACHEFILES_ONDEMAND_MODE, &cache->flags)) + return -EOPNOTSUPP; + + /* + * Reset the requests to CACHEFILES_REQ_NEW state, so that the + * requests have been processed halfway before the crash of the + * user daemon could be reprocessed after the recovery. + */ + xas_lock(&xas); + xas_for_each(&xas, req, ULONG_MAX) + xas_set_mark(&xas, CACHEFILES_REQ_NEW); + xas_unlock(&xas); + + wake_up_all(&cache->daemon_pollwq); + return 0; +} + static int cachefiles_ondemand_get_fd(struct cachefiles_req *req) { struct cachefiles_object *object;
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Baokun Li libaokun1@huawei.com
[ Upstream commit de3e26f9e5b76fc628077578c001c4a51bf54d06 ]
We got the following issue in a fuzz test of randomly issuing the restore command:
==================================================================
BUG: KASAN: slab-use-after-free in cachefiles_ondemand_daemon_read+0x609/0xab0
Write of size 4 at addr ffff888109164a80 by task ondemand-04-dae/4962

CPU: 11 PID: 4962 Comm: ondemand-04-dae Not tainted 6.8.0-rc7-dirty #542
Call Trace:
 kasan_report+0x94/0xc0
 cachefiles_ondemand_daemon_read+0x609/0xab0
 vfs_read+0x169/0xb50
 ksys_read+0xf5/0x1e0

Allocated by task 626:
 __kmalloc+0x1df/0x4b0
 cachefiles_ondemand_send_req+0x24d/0x690
 cachefiles_create_tmpfile+0x249/0xb30
 cachefiles_create_file+0x6f/0x140
 cachefiles_look_up_object+0x29c/0xa60
 cachefiles_lookup_cookie+0x37d/0xca0
 fscache_cookie_state_machine+0x43c/0x1230
 [...]

Freed by task 626:
 kfree+0xf1/0x2c0
 cachefiles_ondemand_send_req+0x568/0x690
 cachefiles_create_tmpfile+0x249/0xb30
 cachefiles_create_file+0x6f/0x140
 cachefiles_look_up_object+0x29c/0xa60
 cachefiles_lookup_cookie+0x37d/0xca0
 fscache_cookie_state_machine+0x43c/0x1230
 [...]
==================================================================
Following is the process that triggers the issue:
mount  |   daemon_thread1    |    daemon_thread2
------------------------------------------------------------
cachefiles_ondemand_init_object
 cachefiles_ondemand_send_req
  REQ_A = kzalloc(sizeof(*req) + data_len)
  wait_for_completion(&REQ_A->done)

            cachefiles_daemon_read
             cachefiles_ondemand_daemon_read
              REQ_A = cachefiles_ondemand_select_req
              cachefiles_ondemand_get_fd
              copy_to_user(_buffer, msg, n)
            process_open_req(REQ_A)
                                          ------ restore ------
                                          cachefiles_ondemand_restore
                                           xas_for_each(&xas, req, ULONG_MAX)
                                            xas_set_mark(&xas, CACHEFILES_REQ_NEW);

                                          cachefiles_daemon_read
                                           cachefiles_ondemand_daemon_read
                                            REQ_A = cachefiles_ondemand_select_req

            write(devfd, ("copen %u,%llu", msg->msg_id, size));
            cachefiles_ondemand_copen
             xa_erase(&cache->reqs, id)
             complete(&REQ_A->done)
             kfree(REQ_A)
                                            cachefiles_ondemand_get_fd(REQ_A)
                                             fd = get_unused_fd_flags
                                             file = anon_inode_getfile
                                             fd_install(fd, file)
                                             load = (void *)REQ_A->msg.data;
                                             load->fd = fd;
                                             // load UAF !!!
This issue is caused by issuing a restore command while the daemon is still alive, which results in a request being processed multiple times and thus triggers a UAF. To avoid this problem, add an additional reference count to cachefiles_req, which is held while waiting and reading, and released when the waiting and reading are over.
Note that since there is only one reference count for waiting, we need to avoid the same request being completed multiple times, so we can only complete the request if it is successfully removed from the xarray.
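A compressed sketch of the resulting lifetime rule (illustrative only; the diff below open-codes the same check under xas_lock() and uses the cachefiles_req_put() helper it introduces): only the context that actually removes the request from the xarray completes it, and every holder drops its own reference afterwards, so the final put is what frees the request.

static void cachefiles_req_finish(struct xarray *reqs, unsigned long id,
				  struct cachefiles_req *req, int error)
{
	/* Only one of the racing contexts can swap the entry out. */
	if (xa_cmpxchg(reqs, id, req, NULL, 0) == req) {
		req->error = error;
		complete(&req->done);	/* wake the mount-side waiter exactly once */
	}

	cachefiles_req_put(req);	/* frees the request on the last reference */
}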
Fixes: e73fa11a356c ("cachefiles: add restore command to recover inflight ondemand read requests") Suggested-by: Hou Tao houtao1@huawei.com Signed-off-by: Baokun Li libaokun1@huawei.com Link: https://lore.kernel.org/r/20240522114308.2402121-4-libaokun@huaweicloud.com Acked-by: Jeff Layton jlayton@kernel.org Reviewed-by: Jia Zhu zhujia.zj@bytedance.com Reviewed-by: Jingbo Xu jefflexu@linux.alibaba.com Signed-off-by: Christian Brauner brauner@kernel.org Stable-dep-of: 4b4391e77a6b ("cachefiles: defer exposing anon_fd until after copy_to_user() succeeds") Signed-off-by: Sasha Levin sashal@kernel.org --- fs/cachefiles/internal.h | 1 + fs/cachefiles/ondemand.c | 23 +++++++++++++++++++---- 2 files changed, 20 insertions(+), 4 deletions(-)
diff --git a/fs/cachefiles/internal.h b/fs/cachefiles/internal.h index 361356d0e866a..28799c8e2c6f6 100644 --- a/fs/cachefiles/internal.h +++ b/fs/cachefiles/internal.h @@ -139,6 +139,7 @@ static inline bool cachefiles_in_ondemand_mode(struct cachefiles_cache *cache) struct cachefiles_req { struct cachefiles_object *object; struct completion done; + refcount_t ref; int error; struct cachefiles_msg msg; }; diff --git a/fs/cachefiles/ondemand.c b/fs/cachefiles/ondemand.c index 6d8f7f01a73ac..f8d0a01795702 100644 --- a/fs/cachefiles/ondemand.c +++ b/fs/cachefiles/ondemand.c @@ -4,6 +4,12 @@ #include <linux/uio.h> #include "internal.h"
+static inline void cachefiles_req_put(struct cachefiles_req *req) +{ + if (refcount_dec_and_test(&req->ref)) + kfree(req); +} + static int cachefiles_ondemand_fd_release(struct inode *inode, struct file *file) { @@ -362,6 +368,7 @@ ssize_t cachefiles_ondemand_daemon_read(struct cachefiles_cache *cache,
xas_clear_mark(&xas, CACHEFILES_REQ_NEW); cache->req_id_next = xas.xa_index + 1; + refcount_inc(&req->ref); xa_unlock(&cache->reqs);
id = xas.xa_index; @@ -388,15 +395,22 @@ ssize_t cachefiles_ondemand_daemon_read(struct cachefiles_cache *cache, complete(&req->done); }
+ cachefiles_req_put(req); return n;
err_put_fd: if (msg->opcode == CACHEFILES_OP_OPEN) close_fd(((struct cachefiles_open *)msg->data)->fd); error: - xa_erase(&cache->reqs, id); - req->error = ret; - complete(&req->done); + xas_reset(&xas); + xas_lock(&xas); + if (xas_load(&xas) == req) { + req->error = ret; + complete(&req->done); + xas_store(&xas, NULL); + } + xas_unlock(&xas); + cachefiles_req_put(req); return ret; }
@@ -427,6 +441,7 @@ static int cachefiles_ondemand_send_req(struct cachefiles_object *object, goto out; }
+ refcount_set(&req->ref, 1); req->object = object; init_completion(&req->done); req->msg.opcode = opcode; @@ -488,7 +503,7 @@ static int cachefiles_ondemand_send_req(struct cachefiles_object *object, wake_up_all(&cache->daemon_pollwq); wait_for_completion(&req->done); ret = req->error; - kfree(req); + cachefiles_req_put(req); return ret; out: /* Reset the object to close state in error handling path.
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Baokun Li libaokun1@huawei.com
[ Upstream commit da4a827416066191aafeeccee50a8836a826ba10 ]
We got the following issue in a fuzz test of randomly issuing the restore command:
==================================================================
BUG: KASAN: slab-use-after-free in cachefiles_ondemand_daemon_read+0xb41/0xb60
Read of size 8 at addr ffff888122e84088 by task ondemand-04-dae/963

CPU: 13 PID: 963 Comm: ondemand-04-dae Not tainted 6.8.0-dirty #564
Call Trace:
 kasan_report+0x93/0xc0
 cachefiles_ondemand_daemon_read+0xb41/0xb60
 vfs_read+0x169/0xb50
 ksys_read+0xf5/0x1e0

Allocated by task 116:
 kmem_cache_alloc+0x140/0x3a0
 cachefiles_lookup_cookie+0x140/0xcd0
 fscache_cookie_state_machine+0x43c/0x1230
 [...]

Freed by task 792:
 kmem_cache_free+0xfe/0x390
 cachefiles_put_object+0x241/0x480
 fscache_cookie_state_machine+0x5c8/0x1230
 [...]
==================================================================
Following is the process that triggers the issue:
mount  |   daemon_thread1    |    daemon_thread2
------------------------------------------------------------
cachefiles_withdraw_cookie
 cachefiles_ondemand_clean_object(object)
  cachefiles_ondemand_send_req
   REQ_A = kzalloc(sizeof(*req) + data_len)
   wait_for_completion(&REQ_A->done)

            cachefiles_daemon_read
             cachefiles_ondemand_daemon_read
              REQ_A = cachefiles_ondemand_select_req
              msg->object_id = req->object->ondemand->ondemand_id
                                          ------ restore ------
                                          cachefiles_ondemand_restore
                                           xas_for_each(&xas, req, ULONG_MAX)
                                            xas_set_mark(&xas, CACHEFILES_REQ_NEW)

                                          cachefiles_daemon_read
                                           cachefiles_ondemand_daemon_read
                                            REQ_A = cachefiles_ondemand_select_req
              copy_to_user(_buffer, msg, n)
               xa_erase(&cache->reqs, id)
               complete(&REQ_A->done)
              ------ close(fd) ------
              cachefiles_ondemand_fd_release
               cachefiles_put_object
 cachefiles_put_object
  kmem_cache_free(cachefiles_object_jar, object)
                                            REQ_A->object->ondemand->ondemand_id
                                            // object UAF !!!
While we hold xa_lock and can see the request, req->object cannot have been freed yet, so grab a reference count on the object before xa_unlock to avoid the above issue.
Fixes: 0a7e54c1959c ("cachefiles: resend an open request if the read request's object is closed") Signed-off-by: Baokun Li libaokun1@huawei.com Link: https://lore.kernel.org/r/20240522114308.2402121-5-libaokun@huaweicloud.com Acked-by: Jeff Layton jlayton@kernel.org Reviewed-by: Jia Zhu zhujia.zj@bytedance.com Reviewed-by: Jingbo Xu jefflexu@linux.alibaba.com Signed-off-by: Christian Brauner brauner@kernel.org Stable-dep-of: 4b4391e77a6b ("cachefiles: defer exposing anon_fd until after copy_to_user() succeeds") Signed-off-by: Sasha Levin sashal@kernel.org --- fs/cachefiles/ondemand.c | 3 +++ include/trace/events/cachefiles.h | 6 +++++- 2 files changed, 8 insertions(+), 1 deletion(-)
diff --git a/fs/cachefiles/ondemand.c b/fs/cachefiles/ondemand.c index f8d0a01795702..fd73811c7ce4f 100644 --- a/fs/cachefiles/ondemand.c +++ b/fs/cachefiles/ondemand.c @@ -369,6 +369,7 @@ ssize_t cachefiles_ondemand_daemon_read(struct cachefiles_cache *cache, xas_clear_mark(&xas, CACHEFILES_REQ_NEW); cache->req_id_next = xas.xa_index + 1; refcount_inc(&req->ref); + cachefiles_grab_object(req->object, cachefiles_obj_get_read_req); xa_unlock(&cache->reqs);
id = xas.xa_index; @@ -389,6 +390,7 @@ ssize_t cachefiles_ondemand_daemon_read(struct cachefiles_cache *cache, goto err_put_fd; }
+ cachefiles_put_object(req->object, cachefiles_obj_put_read_req); /* CLOSE request has no reply */ if (msg->opcode == CACHEFILES_OP_CLOSE) { xa_erase(&cache->reqs, id); @@ -402,6 +404,7 @@ ssize_t cachefiles_ondemand_daemon_read(struct cachefiles_cache *cache, if (msg->opcode == CACHEFILES_OP_OPEN) close_fd(((struct cachefiles_open *)msg->data)->fd); error: + cachefiles_put_object(req->object, cachefiles_obj_put_read_req); xas_reset(&xas); xas_lock(&xas); if (xas_load(&xas) == req) { diff --git a/include/trace/events/cachefiles.h b/include/trace/events/cachefiles.h index b4939097f5116..dff9a48502247 100644 --- a/include/trace/events/cachefiles.h +++ b/include/trace/events/cachefiles.h @@ -33,6 +33,8 @@ enum cachefiles_obj_ref_trace { cachefiles_obj_see_withdrawal, cachefiles_obj_get_ondemand_fd, cachefiles_obj_put_ondemand_fd, + cachefiles_obj_get_read_req, + cachefiles_obj_put_read_req, };
enum fscache_why_object_killed { @@ -129,7 +131,9 @@ enum cachefiles_error_trace { EM(cachefiles_obj_see_withdraw_cookie, "SEE withdraw_cookie") \ EM(cachefiles_obj_see_withdrawal, "SEE withdrawal") \ EM(cachefiles_obj_get_ondemand_fd, "GET ondemand_fd") \ - E_(cachefiles_obj_put_ondemand_fd, "PUT ondemand_fd") + EM(cachefiles_obj_put_ondemand_fd, "PUT ondemand_fd") \ + EM(cachefiles_obj_get_read_req, "GET read_req") \ + E_(cachefiles_obj_put_read_req, "PUT read_req")
#define cachefiles_coherency_traces \ EM(cachefiles_coherency_check_aux, "BAD aux ") \
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Baokun Li libaokun1@huawei.com
[ Upstream commit 3e6d704f02aa4c50c7bc5fe91a4401df249a137b ]
The err_put_fd label is only used once, so remove it to make the code more readable. In addition, the logic for deleting an error request and for deleting a CLOSE request is merged to simplify the code.
Signed-off-by: Baokun Li libaokun1@huawei.com Link: https://lore.kernel.org/r/20240522114308.2402121-6-libaokun@huaweicloud.com Acked-by: Jeff Layton jlayton@kernel.org Reviewed-by: Jia Zhu zhujia.zj@bytedance.com Reviewed-by: Gao Xiang hsiangkao@linux.alibaba.com Reviewed-by: Jingbo Xu jefflexu@linux.alibaba.com Signed-off-by: Christian Brauner brauner@kernel.org Stable-dep-of: 4b4391e77a6b ("cachefiles: defer exposing anon_fd until after copy_to_user() succeeds") Signed-off-by: Sasha Levin sashal@kernel.org --- fs/cachefiles/ondemand.c | 45 ++++++++++++++-------------------------- 1 file changed, 16 insertions(+), 29 deletions(-)
diff --git a/fs/cachefiles/ondemand.c b/fs/cachefiles/ondemand.c index fd73811c7ce4f..99b4bffad4a4f 100644 --- a/fs/cachefiles/ondemand.c +++ b/fs/cachefiles/ondemand.c @@ -337,7 +337,6 @@ ssize_t cachefiles_ondemand_daemon_read(struct cachefiles_cache *cache, { struct cachefiles_req *req; struct cachefiles_msg *msg; - unsigned long id = 0; size_t n; int ret = 0; XA_STATE(xas, &cache->reqs, cache->req_id_next); @@ -372,49 +371,37 @@ ssize_t cachefiles_ondemand_daemon_read(struct cachefiles_cache *cache, cachefiles_grab_object(req->object, cachefiles_obj_get_read_req); xa_unlock(&cache->reqs);
- id = xas.xa_index; - if (msg->opcode == CACHEFILES_OP_OPEN) { ret = cachefiles_ondemand_get_fd(req); if (ret) { cachefiles_ondemand_set_object_close(req->object); - goto error; + goto out; } }
- msg->msg_id = id; + msg->msg_id = xas.xa_index; msg->object_id = req->object->ondemand->ondemand_id;
if (copy_to_user(_buffer, msg, n) != 0) { ret = -EFAULT; - goto err_put_fd; - } - - cachefiles_put_object(req->object, cachefiles_obj_put_read_req); - /* CLOSE request has no reply */ - if (msg->opcode == CACHEFILES_OP_CLOSE) { - xa_erase(&cache->reqs, id); - complete(&req->done); + if (msg->opcode == CACHEFILES_OP_OPEN) + close_fd(((struct cachefiles_open *)msg->data)->fd); } - - cachefiles_req_put(req); - return n; - -err_put_fd: - if (msg->opcode == CACHEFILES_OP_OPEN) - close_fd(((struct cachefiles_open *)msg->data)->fd); -error: +out: cachefiles_put_object(req->object, cachefiles_obj_put_read_req); - xas_reset(&xas); - xas_lock(&xas); - if (xas_load(&xas) == req) { - req->error = ret; - complete(&req->done); - xas_store(&xas, NULL); + /* Remove error request and CLOSE request has no reply */ + if (ret || msg->opcode == CACHEFILES_OP_CLOSE) { + xas_reset(&xas); + xas_lock(&xas); + if (xas_load(&xas) == req) { + req->error = ret; + complete(&req->done); + xas_store(&xas, NULL); + } + xas_unlock(&xas); } - xas_unlock(&xas); cachefiles_req_put(req); - return ret; + return ret ? ret : n; }
typedef int (*init_req_fn)(struct cachefiles_req *req, void *private);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Baokun Li libaokun1@huawei.com
[ Upstream commit 4988e35e95fc938bdde0e15880fe72042fc86acf ]
Now every time the daemon reads an open request, it gets a new anonymous fd and ondemand_id. With the introduction of "restore", it is possible to read the same open request more than once, and therefore an object can have more than one anonymous fd.
If the anonymous fd is not unique, the following concurrent sequence will result in an fd leak:
t1     |     t2     |    t3
------------------------------------------------------------
cachefiles_ondemand_init_object
 cachefiles_ondemand_send_req
  REQ_A = kzalloc(sizeof(*req) + data_len)
  wait_for_completion(&REQ_A->done)
        cachefiles_daemon_read
         cachefiles_ondemand_daemon_read
          REQ_A = cachefiles_ondemand_select_req
          cachefiles_ondemand_get_fd
           load->fd = fd0
           ondemand_id = object_id0
                     ------ restore ------
                     cachefiles_ondemand_restore
                     // restore REQ_A
                     cachefiles_daemon_read
                      cachefiles_ondemand_daemon_read
                       REQ_A = cachefiles_ondemand_select_req
                       cachefiles_ondemand_get_fd
                        load->fd = fd1
                        ondemand_id = object_id1
        process_open_req(REQ_A)
         write(devfd, ("copen %u,%llu", msg->msg_id, size))
         cachefiles_ondemand_copen
          xa_erase(&cache->reqs, id)
          complete(&REQ_A->done)
          kfree(REQ_A)
                     process_open_req(REQ_A)
                     // copen fails due to no req
                     // daemon close(fd1)
                     cachefiles_ondemand_fd_release
                     // set object closed
-- umount --
cachefiles_withdraw_cookie
 cachefiles_ondemand_clean_object
  cachefiles_ondemand_init_close_req
   if (!cachefiles_ondemand_object_is_open(object))
     return -ENOENT;
   // The fd0 is not closed until the daemon exits.
However, the anonymous fd holds the reference count of the object and the object holds the reference count of the cookie. So even though the cookie has been relinquished, it will not be unhashed and freed until the daemon exits.
In fscache_hash_cookie(), when the same cookie is found in the hash list, if the cookie is set with the FSCACHE_COOKIE_RELINQUISHED bit, then the new cookie waits for the old cookie to be unhashed, while the old cookie is waiting for the leaked fd to be closed, if the daemon does not exit in time it will trigger a hung task.
To avoid this, allocate a new anonymous fd only if no anonymous fd has been allocated yet (ondemand_id == 0) or if the previously allocated anonymous fd has been closed (ondemand_id == -1). Moreover, return an error if ondemand_id is still valid, letting the daemon know that its current userland restore logic is abnormal and needs to be checked.
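Sketched out, the guard looks roughly like this (illustrative; the diff below open-codes the same check in cachefiles_ondemand_get_fd(), and the helper name here is hypothetical):

static int claim_ondemand_id(struct cachefiles_ondemand_info *info, int new_id)
{
	int ret = 0;

	spin_lock(&info->lock);
	if (info->ondemand_id > 0)		/* a live anon fd already exists */
		ret = -EEXIST;			/* daemon's restore flow is broken */
	else
		info->ondemand_id = new_id;	/* was 0 (never opened) or -1 (closed) */
	spin_unlock(&info->lock);

	return ret;
}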
Fixes: c8383054506c ("cachefiles: notify the user daemon when looking up cookie") Signed-off-by: Baokun Li libaokun1@huawei.com Link: https://lore.kernel.org/r/20240522114308.2402121-9-libaokun@huaweicloud.com Acked-by: Jeff Layton jlayton@kernel.org Signed-off-by: Christian Brauner brauner@kernel.org Stable-dep-of: 4b4391e77a6b ("cachefiles: defer exposing anon_fd until after copy_to_user() succeeds") Signed-off-by: Sasha Levin sashal@kernel.org --- fs/cachefiles/ondemand.c | 34 ++++++++++++++++++++++++++++------ 1 file changed, 28 insertions(+), 6 deletions(-)
diff --git a/fs/cachefiles/ondemand.c b/fs/cachefiles/ondemand.c index 99b4bffad4a4f..773c3b407a33b 100644 --- a/fs/cachefiles/ondemand.c +++ b/fs/cachefiles/ondemand.c @@ -14,11 +14,18 @@ static int cachefiles_ondemand_fd_release(struct inode *inode, struct file *file) { struct cachefiles_object *object = file->private_data; - struct cachefiles_cache *cache = object->volume->cache; - struct cachefiles_ondemand_info *info = object->ondemand; + struct cachefiles_cache *cache; + struct cachefiles_ondemand_info *info; int object_id; struct cachefiles_req *req; - XA_STATE(xas, &cache->reqs, 0); + XA_STATE(xas, NULL, 0); + + if (!object) + return 0; + + info = object->ondemand; + cache = object->volume->cache; + xas.xa = &cache->reqs;
xa_lock(&cache->reqs); spin_lock(&info->lock); @@ -275,22 +282,39 @@ static int cachefiles_ondemand_get_fd(struct cachefiles_req *req) goto err_put_fd; }
+ spin_lock(&object->ondemand->lock); + if (object->ondemand->ondemand_id > 0) { + spin_unlock(&object->ondemand->lock); + /* Pair with check in cachefiles_ondemand_fd_release(). */ + file->private_data = NULL; + ret = -EEXIST; + goto err_put_file; + } + file->f_mode |= FMODE_PWRITE | FMODE_LSEEK; fd_install(fd, file);
load = (void *)req->msg.data; load->fd = fd; object->ondemand->ondemand_id = object_id; + spin_unlock(&object->ondemand->lock);
cachefiles_get_unbind_pincount(cache); trace_cachefiles_ondemand_open(object, &req->msg, load); return 0;
+err_put_file: + fput(file); err_put_fd: put_unused_fd(fd); err_free_id: xa_erase(&cache->ondemand_ids, object_id); err: + spin_lock(&object->ondemand->lock); + /* Avoid marking an opened object as closed. */ + if (object->ondemand->ondemand_id <= 0) + cachefiles_ondemand_set_object_close(object); + spin_unlock(&object->ondemand->lock); cachefiles_put_object(object, cachefiles_obj_put_ondemand_fd); return ret; } @@ -373,10 +397,8 @@ ssize_t cachefiles_ondemand_daemon_read(struct cachefiles_cache *cache,
if (msg->opcode == CACHEFILES_OP_OPEN) { ret = cachefiles_ondemand_get_fd(req); - if (ret) { - cachefiles_ondemand_set_object_close(req->object); + if (ret) goto out; - } }
msg->msg_id = xas.xa_index;
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Baokun Li libaokun1@huawei.com
[ Upstream commit 4b4391e77a6bf24cba2ef1590e113d9b73b11039 ]
After installing the anonymous fd, userland can immediately see it and close it. However, at this point we may not yet have taken the reference count of the cache, but we will put it when the fd is closed, so this may cause a cache UAF.
So grab the cache reference count before fd_install(). In addition, by kernel convention the fd is owned by userland after fd_install(), and the kernel should not call close_fd() on it after that; in other words, fd_install() should only be called once everything is ready, so it is now called after copy_to_user() succeeds.
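A rough sketch of the ordering convention being adopted (illustrative names, not the cachefiles code): reserve the fd and the file first, make fd_install() the very last step, and on any earlier failure release both halves separately.

#include <linux/anon_inodes.h>
#include <linux/file.h>
#include <linux/fs.h>
#include <linux/uaccess.h>

static int publish_anon_fd(void *priv, const struct file_operations *fops,
			   int __user *ufd)
{
	struct file *file;
	int fd, ret;

	fd = get_unused_fd_flags(O_WRONLY);
	if (fd < 0)
		return fd;

	file = anon_inode_getfile("[example]", fops, priv, O_WRONLY);
	if (IS_ERR(file)) {
		ret = PTR_ERR(file);
		goto err_put_fd;
	}

	/* Anything that can still fail must happen before fd_install(). */
	if (copy_to_user(ufd, &fd, sizeof(fd))) {
		ret = -EFAULT;
		goto err_put_file;
	}

	fd_install(fd, file);	/* point of no return: the fd now belongs to userspace */
	return 0;

err_put_file:
	fput(file);
err_put_fd:
	put_unused_fd(fd);
	return ret;
}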
Fixes: c8383054506c ("cachefiles: notify the user daemon when looking up cookie") Suggested-by: Hou Tao houtao1@huawei.com Signed-off-by: Baokun Li libaokun1@huawei.com Link: https://lore.kernel.org/r/20240522114308.2402121-10-libaokun@huaweicloud.com Acked-by: Jeff Layton jlayton@kernel.org Signed-off-by: Christian Brauner brauner@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/cachefiles/ondemand.c | 53 +++++++++++++++++++++++++--------------- 1 file changed, 33 insertions(+), 20 deletions(-)
diff --git a/fs/cachefiles/ondemand.c b/fs/cachefiles/ondemand.c index 773c3b407a33b..a8cfa5047aaf8 100644 --- a/fs/cachefiles/ondemand.c +++ b/fs/cachefiles/ondemand.c @@ -4,6 +4,11 @@ #include <linux/uio.h> #include "internal.h"
+struct ondemand_anon_file { + struct file *file; + int fd; +}; + static inline void cachefiles_req_put(struct cachefiles_req *req) { if (refcount_dec_and_test(&req->ref)) @@ -250,14 +255,14 @@ int cachefiles_ondemand_restore(struct cachefiles_cache *cache, char *args) return 0; }
-static int cachefiles_ondemand_get_fd(struct cachefiles_req *req) +static int cachefiles_ondemand_get_fd(struct cachefiles_req *req, + struct ondemand_anon_file *anon_file) { struct cachefiles_object *object; struct cachefiles_cache *cache; struct cachefiles_open *load; - struct file *file; u32 object_id; - int ret, fd; + int ret;
object = cachefiles_grab_object(req->object, cachefiles_obj_get_ondemand_fd); @@ -269,16 +274,16 @@ static int cachefiles_ondemand_get_fd(struct cachefiles_req *req) if (ret < 0) goto err;
- fd = get_unused_fd_flags(O_WRONLY); - if (fd < 0) { - ret = fd; + anon_file->fd = get_unused_fd_flags(O_WRONLY); + if (anon_file->fd < 0) { + ret = anon_file->fd; goto err_free_id; }
- file = anon_inode_getfile("[cachefiles]", &cachefiles_ondemand_fd_fops, - object, O_WRONLY); - if (IS_ERR(file)) { - ret = PTR_ERR(file); + anon_file->file = anon_inode_getfile("[cachefiles]", + &cachefiles_ondemand_fd_fops, object, O_WRONLY); + if (IS_ERR(anon_file->file)) { + ret = PTR_ERR(anon_file->file); goto err_put_fd; }
@@ -286,16 +291,15 @@ static int cachefiles_ondemand_get_fd(struct cachefiles_req *req) if (object->ondemand->ondemand_id > 0) { spin_unlock(&object->ondemand->lock); /* Pair with check in cachefiles_ondemand_fd_release(). */ - file->private_data = NULL; + anon_file->file->private_data = NULL; ret = -EEXIST; goto err_put_file; }
- file->f_mode |= FMODE_PWRITE | FMODE_LSEEK; - fd_install(fd, file); + anon_file->file->f_mode |= FMODE_PWRITE | FMODE_LSEEK;
load = (void *)req->msg.data; - load->fd = fd; + load->fd = anon_file->fd; object->ondemand->ondemand_id = object_id; spin_unlock(&object->ondemand->lock);
@@ -304,9 +308,11 @@ static int cachefiles_ondemand_get_fd(struct cachefiles_req *req) return 0;
err_put_file: - fput(file); + fput(anon_file->file); + anon_file->file = NULL; err_put_fd: - put_unused_fd(fd); + put_unused_fd(anon_file->fd); + anon_file->fd = ret; err_free_id: xa_erase(&cache->ondemand_ids, object_id); err: @@ -363,6 +369,7 @@ ssize_t cachefiles_ondemand_daemon_read(struct cachefiles_cache *cache, struct cachefiles_msg *msg; size_t n; int ret = 0; + struct ondemand_anon_file anon_file; XA_STATE(xas, &cache->reqs, cache->req_id_next);
xa_lock(&cache->reqs); @@ -396,7 +403,7 @@ ssize_t cachefiles_ondemand_daemon_read(struct cachefiles_cache *cache, xa_unlock(&cache->reqs);
if (msg->opcode == CACHEFILES_OP_OPEN) { - ret = cachefiles_ondemand_get_fd(req); + ret = cachefiles_ondemand_get_fd(req, &anon_file); if (ret) goto out; } @@ -404,10 +411,16 @@ ssize_t cachefiles_ondemand_daemon_read(struct cachefiles_cache *cache, msg->msg_id = xas.xa_index; msg->object_id = req->object->ondemand->ondemand_id;
- if (copy_to_user(_buffer, msg, n) != 0) { + if (copy_to_user(_buffer, msg, n) != 0) ret = -EFAULT; - if (msg->opcode == CACHEFILES_OP_OPEN) - close_fd(((struct cachefiles_open *)msg->data)->fd); + + if (msg->opcode == CACHEFILES_OP_OPEN) { + if (ret < 0) { + fput(anon_file.file); + put_unused_fd(anon_file.fd); + goto out; + } + fd_install(anon_file.fd, anon_file.file); } out: cachefiles_put_object(req->object, cachefiles_obj_put_read_req);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Baokun Li libaokun1@huawei.com
[ Upstream commit 85e833cd7243bda7285492b0653c3abb1e2e757b ]
In ondemand mode, when the daemon is processing an open request, if the kernel flags the cache as CACHEFILES_DEAD, cachefiles_daemon_write() will always return -EIO, so the daemon can't pass the copen to the kernel. The kernel process that is waiting for the copen then triggers a hung task.
Since the DEAD state is irreversible, it can only be exited by closing /dev/cachefiles. Therefore, after calling cachefiles_io_error() to mark the cache as CACHEFILES_DEAD, if we are in ondemand mode, flush all requests to avoid the above hung task. We may still be able to read some of the cached data before closing the fd of /dev/cachefiles.
Note that this relies on the patch that adds reference counting to the req, otherwise it may UAF.
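For orientation, flushing amounts to something like the sketch below (assumed and simplified; the body of cachefiles_flush_reqs() is not part of this excerpt): fail every queued request so that no mount-side waiter blocks forever on a reply the dead daemon can no longer deliver.

#include <linux/completion.h>
#include <linux/errno.h>
#include <linux/xarray.h>

static void flush_reqs_sketch(struct xarray *reqs)
{
	struct cachefiles_req *req;
	unsigned long index;

	xa_lock(reqs);
	xa_for_each(reqs, index, req) {
		__xa_erase(reqs, index);
		req->error = -EIO;
		complete(&req->done);	/* the waiter sees -EIO instead of hanging */
	}
	xa_unlock(reqs);
}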
Fixes: c8383054506c ("cachefiles: notify the user daemon when looking up cookie") Signed-off-by: Baokun Li libaokun1@huawei.com Link: https://lore.kernel.org/r/20240522114308.2402121-12-libaokun@huaweicloud.com Acked-by: Jeff Layton jlayton@kernel.org Signed-off-by: Christian Brauner brauner@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/cachefiles/daemon.c | 2 +- fs/cachefiles/internal.h | 3 +++ 2 files changed, 4 insertions(+), 1 deletion(-)
diff --git a/fs/cachefiles/daemon.c b/fs/cachefiles/daemon.c index 26b487e112596..b9945e4f697be 100644 --- a/fs/cachefiles/daemon.c +++ b/fs/cachefiles/daemon.c @@ -133,7 +133,7 @@ static int cachefiles_daemon_open(struct inode *inode, struct file *file) return 0; }
-static void cachefiles_flush_reqs(struct cachefiles_cache *cache) +void cachefiles_flush_reqs(struct cachefiles_cache *cache) { struct xarray *xa = &cache->reqs; struct cachefiles_req *req; diff --git a/fs/cachefiles/internal.h b/fs/cachefiles/internal.h index 28799c8e2c6f6..3eea52462fc87 100644 --- a/fs/cachefiles/internal.h +++ b/fs/cachefiles/internal.h @@ -188,6 +188,7 @@ extern int cachefiles_has_space(struct cachefiles_cache *cache, * daemon.c */ extern const struct file_operations cachefiles_daemon_fops; +extern void cachefiles_flush_reqs(struct cachefiles_cache *cache); extern void cachefiles_get_unbind_pincount(struct cachefiles_cache *cache); extern void cachefiles_put_unbind_pincount(struct cachefiles_cache *cache);
@@ -414,6 +415,8 @@ do { \ pr_err("I/O Error: " FMT"\n", ##__VA_ARGS__); \ fscache_io_error((___cache)->cache); \ set_bit(CACHEFILES_DEAD, &(___cache)->flags); \ + if (cachefiles_in_ondemand_mode(___cache)) \ + cachefiles_flush_reqs(___cache); \ } while (0)
#define cachefiles_io_error_obj(object, FMT, ...) \
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Masami Hiramatsu (Google) mhiramat@kernel.org
[ Upstream commit f6c3c83db1d939ebdb8c8922748ae647d8126d91 ]
The dynevent/test_duplicates.tc test case uses the `syscalls/sys_enter_openat` event to define an eprobe on it. Since these `syscalls` events depend on CONFIG_FTRACE_SYSCALLS=y, the test will fail if that option is not set.
Add the event file to the `required` line so that the test returns an `unsupported` result instead.
Fixes: 297e1dcdca3d ("selftests/ftrace: Add selftest for testing duplicate eprobes and kprobes") Signed-off-by: Masami Hiramatsu (Google) mhiramat@kernel.org Signed-off-by: Shuah Khan skhan@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- .../testing/selftests/ftrace/test.d/dynevent/test_duplicates.tc | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/ftrace/test.d/dynevent/test_duplicates.tc b/tools/testing/selftests/ftrace/test.d/dynevent/test_duplicates.tc index d3a79da215c8b..5f72abe6fa79b 100644 --- a/tools/testing/selftests/ftrace/test.d/dynevent/test_duplicates.tc +++ b/tools/testing/selftests/ftrace/test.d/dynevent/test_duplicates.tc @@ -1,7 +1,7 @@ #!/bin/sh # SPDX-License-Identifier: GPL-2.0 # description: Generic dynamic event - check if duplicate events are caught -# requires: dynamic_events "e[:[<group>/][<event>]] <attached-group>.<attached-event> [<args>]":README +# requires: dynamic_events "e[:[<group>/][<event>]] <attached-group>.<attached-event> [<args>]":README events/syscalls/sys_enter_openat
echo 0 > events/enable
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Samuel Holland samuel.holland@sifive.com
[ Upstream commit 2607133196c35f31892ee199ce7ffa717bea4ad1 ]
These clkdevs were unnecessary, because systems using this driver always look up clocks using the devicetree. And as Russell King points out[1], since the provided device name was truncated, lookups via clkdev would never match.
Recently, commit 8d532528ff6a ("clkdev: report over-sized strings when creating clkdev entries") caused clkdev registration to fail due to the truncation, and this now prevents the driver from probing. Fix the driver by removing the clkdev registration.
Link: https://lore.kernel.org/linux-clk/ZkfYqj+OcAxd9O2t@shell.armlinux.org.uk/ [1] Fixes: 30b8e27e3b58 ("clk: sifive: add a driver for the SiFive FU540 PRCI IP block") Fixes: 8d532528ff6a ("clkdev: report over-sized strings when creating clkdev entries") Reported-by: Guenter Roeck linux@roeck-us.net Closes: https://lore.kernel.org/linux-clk/7eda7621-0dde-4153-89e4-172e4c095d01@roeck... Suggested-by: Russell King linux@armlinux.org.uk Signed-off-by: Samuel Holland samuel.holland@sifive.com Link: https://lore.kernel.org/r/20240528001432.1200403-1-samuel.holland@sifive.com Signed-off-by: Stephen Boyd sboyd@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/clk/sifive/sifive-prci.c | 8 -------- 1 file changed, 8 deletions(-)
diff --git a/drivers/clk/sifive/sifive-prci.c b/drivers/clk/sifive/sifive-prci.c index 916d2fc28b9c1..39bfbd120e0bc 100644 --- a/drivers/clk/sifive/sifive-prci.c +++ b/drivers/clk/sifive/sifive-prci.c @@ -4,7 +4,6 @@ * Copyright (C) 2020 Zong Li */
-#include <linux/clkdev.h> #include <linux/delay.h> #include <linux/io.h> #include <linux/of_device.h> @@ -536,13 +535,6 @@ static int __prci_register_clocks(struct device *dev, struct __prci_data *pd, return r; }
- r = clk_hw_register_clkdev(&pic->hw, pic->name, dev_name(dev)); - if (r) { - dev_warn(dev, "Failed to register clkdev for %s: %d\n", - init.name, r); - return r; - } - pd->hw_clks.hws[i] = &pic->hw; }
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Olga Kornievskaia kolga@netapp.com
[ Upstream commit 28568c906c1bb5f7560e18082ed7d6295860f1c2 ]
In commit 4ca9f31a2be66 ("NFSv4.1 test and add 4.1 trunking transport"), we introduced the ability to query the NFS server for possible trunking locations of the existing filesystem. However, we never checked the returned file system path for these alternative locations. According to the RFC, the server can say that the filesystem currently known under the "fs_root" of fs_location also resides at these server locations under the given "rootpath" pathname. The client cannot handle trunking of a filesystem that resides at a different location under a path other than the main path. This patch enforces the check that the fs_root path and the rootpath in the fs_location reply are the same.
Fixes: 4ca9f31a2be6 ("NFSv4.1 test and add 4.1 trunking transport") Signed-off-by: Olga Kornievskaia kolga@netapp.com Signed-off-by: Trond Myklebust trond.myklebust@hammerspace.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/nfs/nfs4proc.c | 23 ++++++++++++++++++++++- 1 file changed, 22 insertions(+), 1 deletion(-)
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c index bda3050817c90..ec641a8f6604b 100644 --- a/fs/nfs/nfs4proc.c +++ b/fs/nfs/nfs4proc.c @@ -4009,6 +4009,23 @@ static void test_fs_location_for_trunking(struct nfs4_fs_location *location, } }
+static bool _is_same_nfs4_pathname(struct nfs4_pathname *path1, + struct nfs4_pathname *path2) +{ + int i; + + if (path1->ncomponents != path2->ncomponents) + return false; + for (i = 0; i < path1->ncomponents; i++) { + if (path1->components[i].len != path2->components[i].len) + return false; + if (memcmp(path1->components[i].data, path2->components[i].data, + path1->components[i].len)) + return false; + } + return true; +} + static int _nfs4_discover_trunking(struct nfs_server *server, struct nfs_fh *fhandle) { @@ -4042,9 +4059,13 @@ static int _nfs4_discover_trunking(struct nfs_server *server, if (status) goto out_free_3;
- for (i = 0; i < locations->nlocations; i++) + for (i = 0; i < locations->nlocations; i++) { + if (!_is_same_nfs4_pathname(&locations->fs_path, + &locations->locations[i].rootpath)) + continue; test_fs_location_for_trunking(&locations->locations[i], clp, server); + } out_free_3: kfree(locations->fattr); out_free_2:
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Chen Hanxiao chenhx.fnst@fujitsu.com
[ Upstream commit 33c94d7e3cb84f6d130678d6d59ba475a6c489cf ]
Don't return 0 if snd_buf->len is really greater than snd_buf->buflen.
Signed-off-by: Chen Hanxiao chenhx.fnst@fujitsu.com Fixes: 0c77668ddb4e ("SUNRPC: Introduce trace points in rpc_auth_gss.ko") Reviewed-by: Benjamin Coddington bcodding@redhat.com Reviewed-by: Chuck Lever chuck.lever@oracle.com Signed-off-by: Trond Myklebust trond.myklebust@hammerspace.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/sunrpc/auth_gss/auth_gss.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/net/sunrpc/auth_gss/auth_gss.c b/net/sunrpc/auth_gss/auth_gss.c index 2d7b1e03110ae..3ef511d7af190 100644 --- a/net/sunrpc/auth_gss/auth_gss.c +++ b/net/sunrpc/auth_gss/auth_gss.c @@ -1858,8 +1858,10 @@ gss_wrap_req_priv(struct rpc_cred *cred, struct gss_cl_ctx *ctx, offset = (u8 *)p - (u8 *)snd_buf->head[0].iov_base; maj_stat = gss_wrap(ctx->gc_gss_ctx, offset, snd_buf, inpages); /* slack space should prevent this ever happening: */ - if (unlikely(snd_buf->len > snd_buf->buflen)) + if (unlikely(snd_buf->len > snd_buf->buflen)) { + status = -EIO; goto wrap_failed; + } /* We're assuming that when GSS_S_CONTEXT_EXPIRED, the encryption was * done anyway, so it's safe to put the request on the wire: */ if (maj_stat == GSS_S_CONTEXT_EXPIRED)
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: NeilBrown neilb@suse.de
[ Upstream commit 99bc9f2eb3f79a2b4296d9bf43153e1d10ca50d3 ]
dentry->d_fsdata is set to NFS_FSDATA_BLOCKED while unlinking or renaming-over a file to ensure that no open succeeds while the NFS operation is in progress on the server.
Setting dentry->d_fsdata to NFS_FSDATA_BLOCKED is done under ->d_lock after checking that the refcount is not elevated. Any attempt to open the file (through that name) will go through lookup_open(), which takes ->d_lock while incrementing the refcount, so we can be sure that once the new value is set, __nfs_lookup_revalidate() *will* see the new value and will block.
We don't have any locking guarantee that when we set ->d_fsdata to NULL, the wait_var_event() in __nfs_lookup_revalidate() will notice. wait/wake primitives do NOT provide barriers to guarantee order. We must use smp_load_acquire() in wait_var_event() to ensure we look at an up-to-date value, and must use smp_store_release() before wake_up_var().
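In short, the pairing looks like this (a minimal sketch of the idiom, matching the helpers the diff below adds):

/* Unblock side: publish the NULL value before waking any waiter. */
smp_store_release(&dentry->d_fsdata, NULL);
wake_up_var(&dentry->d_fsdata);

/* Revalidate side: re-read the value with acquire semantics on every wakeup. */
wait_var_event(&dentry->d_fsdata,
	       smp_load_acquire(&dentry->d_fsdata) != NFS_FSDATA_BLOCKED);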
This patch adds those barrier functions and factors out block_revalidate() and unblock_revalidate() for clarity.
There is also a hypothetical bug in that if memory allocation fails (which never happens in practice) we might leave ->d_fsdata locked. This patch adds the missing call to unblock_revalidate().
Reported-and-tested-by: Richard Kojedzinszky richard+debian+bugreport@kojedz.in Closes: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1071501 Fixes: 3c59366c207e ("NFS: don't unhash dentry during unlink/rename") Signed-off-by: NeilBrown neilb@suse.de Signed-off-by: Trond Myklebust trond.myklebust@hammerspace.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/nfs/dir.c | 47 ++++++++++++++++++++++++++++++++--------------- 1 file changed, 32 insertions(+), 15 deletions(-)
diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c index f594dac436a7e..a5a4d9422d6ed 100644 --- a/fs/nfs/dir.c +++ b/fs/nfs/dir.c @@ -1792,9 +1792,10 @@ __nfs_lookup_revalidate(struct dentry *dentry, unsigned int flags, if (parent != READ_ONCE(dentry->d_parent)) return -ECHILD; } else { - /* Wait for unlink to complete */ + /* Wait for unlink to complete - see unblock_revalidate() */ wait_var_event(&dentry->d_fsdata, - dentry->d_fsdata != NFS_FSDATA_BLOCKED); + smp_load_acquire(&dentry->d_fsdata) + != NFS_FSDATA_BLOCKED); parent = dget_parent(dentry); ret = reval(d_inode(parent), dentry, flags); dput(parent); @@ -1807,6 +1808,29 @@ static int nfs_lookup_revalidate(struct dentry *dentry, unsigned int flags) return __nfs_lookup_revalidate(dentry, flags, nfs_do_lookup_revalidate); }
+static void block_revalidate(struct dentry *dentry) +{ + /* old devname - just in case */ + kfree(dentry->d_fsdata); + + /* Any new reference that could lead to an open + * will take ->d_lock in lookup_open() -> d_lookup(). + * Holding this lock ensures we cannot race with + * __nfs_lookup_revalidate() and removes and need + * for further barriers. + */ + lockdep_assert_held(&dentry->d_lock); + + dentry->d_fsdata = NFS_FSDATA_BLOCKED; +} + +static void unblock_revalidate(struct dentry *dentry) +{ + /* store_release ensures wait_var_event() sees the update */ + smp_store_release(&dentry->d_fsdata, NULL); + wake_up_var(&dentry->d_fsdata); +} + /* * A weaker form of d_revalidate for revalidating just the d_inode(dentry) * when we don't really care about the dentry name. This is called when a @@ -2489,15 +2513,12 @@ int nfs_unlink(struct inode *dir, struct dentry *dentry) spin_unlock(&dentry->d_lock); goto out; } - /* old devname */ - kfree(dentry->d_fsdata); - dentry->d_fsdata = NFS_FSDATA_BLOCKED; + block_revalidate(dentry);
spin_unlock(&dentry->d_lock); error = nfs_safe_remove(dentry); nfs_dentry_remove_handle_error(dir, dentry, error); - dentry->d_fsdata = NULL; - wake_up_var(&dentry->d_fsdata); + unblock_revalidate(dentry); out: trace_nfs_unlink_exit(dir, dentry, error); return error; @@ -2609,8 +2630,7 @@ nfs_unblock_rename(struct rpc_task *task, struct nfs_renamedata *data) { struct dentry *new_dentry = data->new_dentry;
- new_dentry->d_fsdata = NULL; - wake_up_var(&new_dentry->d_fsdata); + unblock_revalidate(new_dentry); }
/* @@ -2672,11 +2692,6 @@ int nfs_rename(struct user_namespace *mnt_userns, struct inode *old_dir, if (WARN_ON(new_dentry->d_flags & DCACHE_NFSFS_RENAMED) || WARN_ON(new_dentry->d_fsdata == NFS_FSDATA_BLOCKED)) goto out; - if (new_dentry->d_fsdata) { - /* old devname */ - kfree(new_dentry->d_fsdata); - new_dentry->d_fsdata = NULL; - }
spin_lock(&new_dentry->d_lock); if (d_count(new_dentry) > 2) { @@ -2698,7 +2713,7 @@ int nfs_rename(struct user_namespace *mnt_userns, struct inode *old_dir, new_dentry = dentry; new_inode = NULL; } else { - new_dentry->d_fsdata = NFS_FSDATA_BLOCKED; + block_revalidate(new_dentry); must_unblock = true; spin_unlock(&new_dentry->d_lock); } @@ -2710,6 +2725,8 @@ int nfs_rename(struct user_namespace *mnt_userns, struct inode *old_dir, task = nfs_async_rename(old_dir, new_dir, old_dentry, new_dentry, must_unblock ? nfs_unblock_rename : NULL); if (IS_ERR(task)) { + if (must_unblock) + unblock_revalidate(new_dentry); error = PTR_ERR(task); goto out; }
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Armin Wolf W_Armin@gmx.de
[ Upstream commit 1981b296f858010eae409548fd297659b2cc570e ]
When reading token data from sysfs on my Inspiron 3505, the token locations and values are wrong. This happens because match_attribute() blindly assumes that all entries in da_tokens have an associated entry in token_attrs.
This however is not true as soon as da_tokens[] contains zeroed token entries. Those entries are being skipped when initialising token_attrs, breaking the core assumption of match_attribute().
Fix this by defining an extra struct for each pair of token attributes and use container_of() to retrieve token information.
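The container_of() idiom being switched to looks roughly like this (illustrative names, not the dell-smbios code): embed the attribute in a per-token struct so the show() callback can recover its token directly, without searching any array by attribute name.

#include <linux/container_of.h>
#include <linux/device.h>
#include <linux/sysfs.h>
#include <linux/types.h>

struct token_entry {
	struct device_attribute value_attr;
	u32 value;
};

static ssize_t value_show(struct device *dev, struct device_attribute *attr,
			  char *buf)
{
	/* Recover the containing entry from the embedded attribute. */
	struct token_entry *entry = container_of(attr, struct token_entry, value_attr);

	return sysfs_emit(buf, "%08x", entry->value);
}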
Tested on a Dell Inspiron 3050.
Fixes: 33b9ca1e53b4 ("platform/x86: dell-smbios: Add a sysfs interface for SMBIOS tokens") Signed-off-by: Armin Wolf W_Armin@gmx.de Reviewed-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com Link: https://lore.kernel.org/r/20240528204903.445546-1-W_Armin@gmx.de Reviewed-by: Hans de Goede hdegoede@redhat.com Signed-off-by: Hans de Goede hdegoede@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/platform/x86/dell/dell-smbios-base.c | 92 ++++++++------------ 1 file changed, 36 insertions(+), 56 deletions(-)
diff --git a/drivers/platform/x86/dell/dell-smbios-base.c b/drivers/platform/x86/dell/dell-smbios-base.c index e61bfaf8b5c48..86b95206cb1bd 100644 --- a/drivers/platform/x86/dell/dell-smbios-base.c +++ b/drivers/platform/x86/dell/dell-smbios-base.c @@ -11,6 +11,7 @@ */ #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+#include <linux/container_of.h> #include <linux/kernel.h> #include <linux/module.h> #include <linux/capability.h> @@ -25,11 +26,16 @@ static u32 da_supported_commands; static int da_num_tokens; static struct platform_device *platform_device; static struct calling_interface_token *da_tokens; -static struct device_attribute *token_location_attrs; -static struct device_attribute *token_value_attrs; +static struct token_sysfs_data *token_entries; static struct attribute **token_attrs; static DEFINE_MUTEX(smbios_mutex);
+struct token_sysfs_data { + struct device_attribute location_attr; + struct device_attribute value_attr; + struct calling_interface_token *token; +}; + struct smbios_device { struct list_head list; struct device *device; @@ -416,47 +422,26 @@ static void __init find_tokens(const struct dmi_header *dm, void *dummy) } }
-static int match_attribute(struct device *dev, - struct device_attribute *attr) -{ - int i; - - for (i = 0; i < da_num_tokens * 2; i++) { - if (!token_attrs[i]) - continue; - if (strcmp(token_attrs[i]->name, attr->attr.name) == 0) - return i/2; - } - dev_dbg(dev, "couldn't match: %s\n", attr->attr.name); - return -EINVAL; -} - static ssize_t location_show(struct device *dev, struct device_attribute *attr, char *buf) { - int i; + struct token_sysfs_data *data = container_of(attr, struct token_sysfs_data, location_attr);
if (!capable(CAP_SYS_ADMIN)) return -EPERM;
- i = match_attribute(dev, attr); - if (i > 0) - return sysfs_emit(buf, "%08x", da_tokens[i].location); - return 0; + return sysfs_emit(buf, "%08x", data->token->location); }
static ssize_t value_show(struct device *dev, struct device_attribute *attr, char *buf) { - int i; + struct token_sysfs_data *data = container_of(attr, struct token_sysfs_data, value_attr);
if (!capable(CAP_SYS_ADMIN)) return -EPERM;
- i = match_attribute(dev, attr); - if (i > 0) - return sysfs_emit(buf, "%08x", da_tokens[i].value); - return 0; + return sysfs_emit(buf, "%08x", data->token->value); }
static struct attribute_group smbios_attribute_group = { @@ -473,22 +458,15 @@ static int build_tokens_sysfs(struct platform_device *dev) { char *location_name; char *value_name; - size_t size; int ret; int i, j;
- /* (number of tokens + 1 for null terminated */ - size = sizeof(struct device_attribute) * (da_num_tokens + 1); - token_location_attrs = kzalloc(size, GFP_KERNEL); - if (!token_location_attrs) + token_entries = kcalloc(da_num_tokens, sizeof(*token_entries), GFP_KERNEL); + if (!token_entries) return -ENOMEM; - token_value_attrs = kzalloc(size, GFP_KERNEL); - if (!token_value_attrs) - goto out_allocate_value;
/* need to store both location and value + terminator*/ - size = sizeof(struct attribute *) * ((2 * da_num_tokens) + 1); - token_attrs = kzalloc(size, GFP_KERNEL); + token_attrs = kcalloc((2 * da_num_tokens) + 1, sizeof(*token_attrs), GFP_KERNEL); if (!token_attrs) goto out_allocate_attrs;
@@ -496,27 +474,32 @@ static int build_tokens_sysfs(struct platform_device *dev) /* skip empty */ if (da_tokens[i].tokenID == 0) continue; + + token_entries[i].token = &da_tokens[i]; + /* add location */ location_name = kasprintf(GFP_KERNEL, "%04x_location", da_tokens[i].tokenID); if (location_name == NULL) goto out_unwind_strings; - sysfs_attr_init(&token_location_attrs[i].attr); - token_location_attrs[i].attr.name = location_name; - token_location_attrs[i].attr.mode = 0444; - token_location_attrs[i].show = location_show; - token_attrs[j++] = &token_location_attrs[i].attr; + + sysfs_attr_init(&token_entries[i].location_attr.attr); + token_entries[i].location_attr.attr.name = location_name; + token_entries[i].location_attr.attr.mode = 0444; + token_entries[i].location_attr.show = location_show; + token_attrs[j++] = &token_entries[i].location_attr.attr;
/* add value */ value_name = kasprintf(GFP_KERNEL, "%04x_value", da_tokens[i].tokenID); if (value_name == NULL) goto loop_fail_create_value; - sysfs_attr_init(&token_value_attrs[i].attr); - token_value_attrs[i].attr.name = value_name; - token_value_attrs[i].attr.mode = 0444; - token_value_attrs[i].show = value_show; - token_attrs[j++] = &token_value_attrs[i].attr; + + sysfs_attr_init(&token_entries[i].value_attr.attr); + token_entries[i].value_attr.attr.name = value_name; + token_entries[i].value_attr.attr.mode = 0444; + token_entries[i].value_attr.show = value_show; + token_attrs[j++] = &token_entries[i].value_attr.attr; continue;
loop_fail_create_value: @@ -532,14 +515,12 @@ static int build_tokens_sysfs(struct platform_device *dev)
out_unwind_strings: while (i--) { - kfree(token_location_attrs[i].attr.name); - kfree(token_value_attrs[i].attr.name); + kfree(token_entries[i].location_attr.attr.name); + kfree(token_entries[i].value_attr.attr.name); } kfree(token_attrs); out_allocate_attrs: - kfree(token_value_attrs); -out_allocate_value: - kfree(token_location_attrs); + kfree(token_entries);
return -ENOMEM; } @@ -551,12 +532,11 @@ static void free_group(struct platform_device *pdev) sysfs_remove_group(&pdev->dev.kobj, &smbios_attribute_group); for (i = 0; i < da_num_tokens; i++) { - kfree(token_location_attrs[i].attr.name); - kfree(token_value_attrs[i].attr.name); + kfree(token_entries[i].location_attr.attr.name); + kfree(token_entries[i].value_attr.attr.name); } kfree(token_attrs); - kfree(token_value_attrs); - kfree(token_location_attrs); + kfree(token_entries); }
static int __init dell_smbios_init(void)
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Gregor Herburger gregor.herburger@tq-group.com
[ Upstream commit 8c219e52ca4d9a67cd6a7074e91bf29b55edc075 ]
Fix description for GPIO_TQMX86 from QTMX86 to TQMx86.
Fixes: b868db94a6a7 ("gpio: tqmx86: Add GPIO from for this IO controller") Signed-off-by: Gregor Herburger gregor.herburger@tq-group.com Signed-off-by: Matthias Schiffer matthias.schiffer@ew.tq-group.com Reviewed-by: Andrew Lunn andrew@lunn.ch Link: https://lore.kernel.org/r/e0e38c9944ad6d281d9a662a45d289b88edc808e.171706399... Signed-off-by: Bartosz Golaszewski bartosz.golaszewski@linaro.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpio/Kconfig | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpio/Kconfig b/drivers/gpio/Kconfig index 700f71c954956..b23ef29f56020 100644 --- a/drivers/gpio/Kconfig +++ b/drivers/gpio/Kconfig @@ -1416,7 +1416,7 @@ config GPIO_TPS68470 are "output only" GPIOs.
config GPIO_TQMX86 - tristate "TQ-Systems QTMX86 GPIO" + tristate "TQ-Systems TQMx86 GPIO" depends on MFD_TQMX86 || COMPILE_TEST depends on HAS_IOPORT_MAP select GPIOLIB_IRQCHIP
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Andrei Coardos aboutphysycs@gmail.com
[ Upstream commit 0a5e9306b812fe3517548fab92b3d3d6ce7576e5 ]
This function call was found to be unnecessary as there is no equivalent platform_get_drvdata() call to access the private data of the driver. Also, the private data is defined in this driver, so there is no risk of it being accessed outside of this driver file.
Reviewed-by: Alexandru Ardelean alex@shruggie.ro Signed-off-by: Andrei Coardos aboutphysycs@gmail.com Reviewed-by: Andy Shevchenko andy.shevchenko@gmail.com Signed-off-by: Bartosz Golaszewski bartosz.golaszewski@linaro.org Stable-dep-of: 9d6a811b522b ("gpio: tqmx86: introduce shadow register for GPIO output value") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpio/gpio-tqmx86.c | 2 -- 1 file changed, 2 deletions(-)
diff --git a/drivers/gpio/gpio-tqmx86.c b/drivers/gpio/gpio-tqmx86.c index e739dcea61b23..f0a2cf4b06796 100644 --- a/drivers/gpio/gpio-tqmx86.c +++ b/drivers/gpio/gpio-tqmx86.c @@ -259,8 +259,6 @@ static int tqmx86_gpio_probe(struct platform_device *pdev)
tqmx86_gpio_write(gpio, (u8)~TQMX86_DIR_INPUT_MASK, TQMX86_GPIODD);
- platform_set_drvdata(pdev, gpio); - chip = &gpio->chip; chip->label = "gpio-tqmx86"; chip->owner = THIS_MODULE;
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Matthias Schiffer matthias.schiffer@ew.tq-group.com
[ Upstream commit 9d6a811b522ba558bcb4ec01d12e72a0af8e9f6e ]
The TQMx86 GPIO controller uses the same register address for input and output data. Reading the register will always return current inputs rather than the previously set outputs (regardless of the current direction setting). Therefore, using a RMW pattern does not make sense when setting output values. Instead, the previously set output register value needs to be stored as a shadow register.
As there is no reliable way to get the current output values from the hardware, also initialize all channels to 0, to ensure that the stored and actual output values match. This should usually not have any effect in practice, as the TQMx86 UEFI sets all outputs to 0 during boot.
Also prepare for extension of the driver to more than 8 GPIOs by using DECLARE_BITMAP.
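The shadow-register idea in general terms (a simplified sketch, not the driver code): because the data register reads back the input pins, the last written output byte has to be remembered in software and rewritten in full on every update.

#include <linux/bitmap.h>
#include <linux/bitops.h>
#include <linux/io.h>
#include <linux/spinlock.h>

struct out_shadow {
	raw_spinlock_t lock;
	DECLARE_BITMAP(output, 8);	/* last value written to the output register */
};

static void set_output(struct out_shadow *s, void __iomem *reg,
		       unsigned int offset, int value)
{
	unsigned long flags;

	raw_spin_lock_irqsave(&s->lock, flags);
	__assign_bit(offset, s->output, value);
	/* Rewrite the whole byte from the shadow copy, never read-modify-write. */
	iowrite8(bitmap_get_value8(s->output, 0), reg);
	raw_spin_unlock_irqrestore(&s->lock, flags);
}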
Fixes: b868db94a6a7 ("gpio: tqmx86: Add GPIO from for this IO controller") Signed-off-by: Matthias Schiffer matthias.schiffer@ew.tq-group.com Reviewed-by: Andrew Lunn andrew@lunn.ch Link: https://lore.kernel.org/r/d0555933becd45fa92a85675d26e4d59343ddc01.171706399... Signed-off-by: Bartosz Golaszewski bartosz.golaszewski@linaro.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpio/gpio-tqmx86.c | 18 +++++++++++------- 1 file changed, 11 insertions(+), 7 deletions(-)
diff --git a/drivers/gpio/gpio-tqmx86.c b/drivers/gpio/gpio-tqmx86.c index f0a2cf4b06796..da689b5b3fad2 100644 --- a/drivers/gpio/gpio-tqmx86.c +++ b/drivers/gpio/gpio-tqmx86.c @@ -6,6 +6,7 @@ * Vadim V.Vlasov vvlasov@dev.rtsoft.ru */
+#include <linux/bitmap.h> #include <linux/bitops.h> #include <linux/errno.h> #include <linux/gpio/driver.h> @@ -38,6 +39,7 @@ struct tqmx86_gpio_data { void __iomem *io_base; int irq; raw_spinlock_t spinlock; + DECLARE_BITMAP(output, TQMX86_NGPIO); u8 irq_type[TQMX86_NGPI]; };
@@ -64,15 +66,10 @@ static void tqmx86_gpio_set(struct gpio_chip *chip, unsigned int offset, { struct tqmx86_gpio_data *gpio = gpiochip_get_data(chip); unsigned long flags; - u8 val;
raw_spin_lock_irqsave(&gpio->spinlock, flags); - val = tqmx86_gpio_read(gpio, TQMX86_GPIOD); - if (value) - val |= BIT(offset); - else - val &= ~BIT(offset); - tqmx86_gpio_write(gpio, val, TQMX86_GPIOD); + __assign_bit(offset, gpio->output, value); + tqmx86_gpio_write(gpio, bitmap_get_value8(gpio->output, 0), TQMX86_GPIOD); raw_spin_unlock_irqrestore(&gpio->spinlock, flags); }
@@ -259,6 +256,13 @@ static int tqmx86_gpio_probe(struct platform_device *pdev)
tqmx86_gpio_write(gpio, (u8)~TQMX86_DIR_INPUT_MASK, TQMX86_GPIODD);
+ /* + * Reading the previous output state is not possible with TQMx86 hardware. + * Initialize all outputs to 0 to have a defined state that matches the + * shadow register. + */ + tqmx86_gpio_write(gpio, 0, TQMX86_GPIOD); + chip = &gpio->chip; chip->label = "gpio-tqmx86"; chip->owner = THIS_MODULE;
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Linus Walleij linus.walleij@linaro.org
[ Upstream commit 8e43827b6ae727a745ce7a8cc19184b28905a965 ]
Convert the driver to immutable irq-chip with a bit of intuition.
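For reference, the shape of an immutable GPIO irqchip as the diff below applies it (a generic sketch, not additional driver code): a static const struct irq_chip, mask/unmask callbacks that call the gpiochip helpers, and gpio_irq_chip_set_chip() instead of filling in a writable chip at probe time.

#include <linux/gpio/driver.h>
#include <linux/irq.h>

static void example_irq_mask(struct irq_data *d)
{
	struct gpio_chip *gc = irq_data_get_irq_chip_data(d);

	/* ...mask the line in hardware... */
	gpiochip_disable_irq(gc, irqd_to_hwirq(d));
}

static void example_irq_unmask(struct irq_data *d)
{
	struct gpio_chip *gc = irq_data_get_irq_chip_data(d);

	gpiochip_enable_irq(gc, irqd_to_hwirq(d));
	/* ...unmask the line in hardware... */
}

static const struct irq_chip example_irq_chip = {
	.name		= "example",
	.irq_mask	= example_irq_mask,
	.irq_unmask	= example_irq_unmask,
	.flags		= IRQCHIP_IMMUTABLE,
	GPIOCHIP_IRQ_RESOURCE_HELPERS,
};

/* In probe: gpio_irq_chip_set_chip(&chip->irq, &example_irq_chip); */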
Cc: Marc Zyngier maz@kernel.org Signed-off-by: Linus Walleij linus.walleij@linaro.org Reviewed-by: Marc Zyngier maz@kernel.org Signed-off-by: Bartosz Golaszewski bartosz.golaszewski@linaro.org Stable-dep-of: 08af509efdf8 ("gpio: tqmx86: store IRQ trigger type and unmask status separately") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpio/gpio-tqmx86.c | 28 ++++++++++++++++++++-------- 1 file changed, 20 insertions(+), 8 deletions(-)
diff --git a/drivers/gpio/gpio-tqmx86.c b/drivers/gpio/gpio-tqmx86.c index da689b5b3fad2..b7e2dbbdc4ebe 100644 --- a/drivers/gpio/gpio-tqmx86.c +++ b/drivers/gpio/gpio-tqmx86.c @@ -16,6 +16,7 @@ #include <linux/module.h> #include <linux/platform_device.h> #include <linux/pm_runtime.h> +#include <linux/seq_file.h> #include <linux/slab.h>
#define TQMX86_NGPIO 8 @@ -35,7 +36,6 @@
struct tqmx86_gpio_data { struct gpio_chip chip; - struct irq_chip irq_chip; void __iomem *io_base; int irq; raw_spinlock_t spinlock; @@ -119,6 +119,7 @@ static void tqmx86_gpio_irq_mask(struct irq_data *data) gpiic &= ~mask; tqmx86_gpio_write(gpio, gpiic, TQMX86_GPIIC); raw_spin_unlock_irqrestore(&gpio->spinlock, flags); + gpiochip_disable_irq(&gpio->chip, irqd_to_hwirq(data)); }
static void tqmx86_gpio_irq_unmask(struct irq_data *data) @@ -131,6 +132,7 @@ static void tqmx86_gpio_irq_unmask(struct irq_data *data)
mask = TQMX86_GPII_MASK << (offset * TQMX86_GPII_BITS);
+ gpiochip_enable_irq(&gpio->chip, irqd_to_hwirq(data)); raw_spin_lock_irqsave(&gpio->spinlock, flags); gpiic = tqmx86_gpio_read(gpio, TQMX86_GPIIC); gpiic &= ~mask; @@ -223,6 +225,22 @@ static void tqmx86_init_irq_valid_mask(struct gpio_chip *chip, clear_bit(3, valid_mask); }
+static void tqmx86_gpio_irq_print_chip(struct irq_data *d, struct seq_file *p) +{ + struct gpio_chip *gc = irq_data_get_irq_chip_data(d); + + seq_printf(p, gc->label); +} + +static const struct irq_chip tqmx86_gpio_irq_chip = { + .irq_mask = tqmx86_gpio_irq_mask, + .irq_unmask = tqmx86_gpio_irq_unmask, + .irq_set_type = tqmx86_gpio_irq_set_type, + .irq_print_chip = tqmx86_gpio_irq_print_chip, + .flags = IRQCHIP_IMMUTABLE, + GPIOCHIP_IRQ_RESOURCE_HELPERS, +}; + static int tqmx86_gpio_probe(struct platform_device *pdev) { struct device *dev = &pdev->dev; @@ -279,14 +297,8 @@ static int tqmx86_gpio_probe(struct platform_device *pdev) pm_runtime_enable(&pdev->dev);
if (irq > 0) { - struct irq_chip *irq_chip = &gpio->irq_chip; u8 irq_status;
- irq_chip->name = chip->label; - irq_chip->irq_mask = tqmx86_gpio_irq_mask; - irq_chip->irq_unmask = tqmx86_gpio_irq_unmask; - irq_chip->irq_set_type = tqmx86_gpio_irq_set_type; - /* Mask all interrupts */ tqmx86_gpio_write(gpio, 0, TQMX86_GPIIC);
@@ -295,7 +307,7 @@ static int tqmx86_gpio_probe(struct platform_device *pdev) tqmx86_gpio_write(gpio, irq_status, TQMX86_GPIIS);
girq = &chip->irq; - girq->chip = irq_chip; + gpio_irq_chip_set_chip(girq, &tqmx86_gpio_irq_chip); girq->parent_handler = tqmx86_gpio_irq_handler; girq->num_parents = 1; girq->parents = devm_kcalloc(&pdev->dev, 1,
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Matthias Schiffer matthias.schiffer@ew.tq-group.com
[ Upstream commit 08af509efdf8dad08e972b48de0e2c2a7919ea8b ]
irq_set_type() should not implicitly unmask the IRQ.
All accesses to the interrupt configuration register are moved to a new helper tqmx86_gpio_irq_config(). We also introduce the new rule that accessing irq_type must happen while locked, which will become significant for fixing EDGE_BOTH handling.
Fixes: b868db94a6a7 ("gpio: tqmx86: Add GPIO from for this IO controller") Signed-off-by: Matthias Schiffer matthias.schiffer@ew.tq-group.com Link: https://lore.kernel.org/r/6aa4f207f77cb58ef64ffb947e91949b0f753ccd.171706399... Signed-off-by: Bartosz Golaszewski bartosz.golaszewski@linaro.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpio/gpio-tqmx86.c | 48 ++++++++++++++++++++++---------------- 1 file changed, 28 insertions(+), 20 deletions(-)
diff --git a/drivers/gpio/gpio-tqmx86.c b/drivers/gpio/gpio-tqmx86.c index b7e2dbbdc4ebe..7e428c872a257 100644 --- a/drivers/gpio/gpio-tqmx86.c +++ b/drivers/gpio/gpio-tqmx86.c @@ -29,15 +29,19 @@ #define TQMX86_GPIIC 3 /* GPI Interrupt Configuration Register */ #define TQMX86_GPIIS 4 /* GPI Interrupt Status Register */
+#define TQMX86_GPII_NONE 0 #define TQMX86_GPII_FALLING BIT(0) #define TQMX86_GPII_RISING BIT(1) #define TQMX86_GPII_MASK (BIT(0) | BIT(1)) #define TQMX86_GPII_BITS 2 +/* Stored in irq_type with GPII bits */ +#define TQMX86_INT_UNMASKED BIT(2)
struct tqmx86_gpio_data { struct gpio_chip chip; void __iomem *io_base; int irq; + /* Lock must be held for accessing output and irq_type fields */ raw_spinlock_t spinlock; DECLARE_BITMAP(output, TQMX86_NGPIO); u8 irq_type[TQMX86_NGPI]; @@ -104,21 +108,32 @@ static int tqmx86_gpio_get_direction(struct gpio_chip *chip, return GPIO_LINE_DIRECTION_OUT; }
+static void tqmx86_gpio_irq_config(struct tqmx86_gpio_data *gpio, int offset) + __must_hold(&gpio->spinlock) +{ + u8 type = TQMX86_GPII_NONE, gpiic; + + if (gpio->irq_type[offset] & TQMX86_INT_UNMASKED) + type = gpio->irq_type[offset] & TQMX86_GPII_MASK; + + gpiic = tqmx86_gpio_read(gpio, TQMX86_GPIIC); + gpiic &= ~(TQMX86_GPII_MASK << (offset * TQMX86_GPII_BITS)); + gpiic |= type << (offset * TQMX86_GPII_BITS); + tqmx86_gpio_write(gpio, gpiic, TQMX86_GPIIC); +} + static void tqmx86_gpio_irq_mask(struct irq_data *data) { unsigned int offset = (data->hwirq - TQMX86_NGPO); struct tqmx86_gpio_data *gpio = gpiochip_get_data( irq_data_get_irq_chip_data(data)); unsigned long flags; - u8 gpiic, mask; - - mask = TQMX86_GPII_MASK << (offset * TQMX86_GPII_BITS);
raw_spin_lock_irqsave(&gpio->spinlock, flags); - gpiic = tqmx86_gpio_read(gpio, TQMX86_GPIIC); - gpiic &= ~mask; - tqmx86_gpio_write(gpio, gpiic, TQMX86_GPIIC); + gpio->irq_type[offset] &= ~TQMX86_INT_UNMASKED; + tqmx86_gpio_irq_config(gpio, offset); raw_spin_unlock_irqrestore(&gpio->spinlock, flags); + gpiochip_disable_irq(&gpio->chip, irqd_to_hwirq(data)); }
@@ -128,16 +143,12 @@ static void tqmx86_gpio_irq_unmask(struct irq_data *data) struct tqmx86_gpio_data *gpio = gpiochip_get_data( irq_data_get_irq_chip_data(data)); unsigned long flags; - u8 gpiic, mask; - - mask = TQMX86_GPII_MASK << (offset * TQMX86_GPII_BITS);
gpiochip_enable_irq(&gpio->chip, irqd_to_hwirq(data)); + raw_spin_lock_irqsave(&gpio->spinlock, flags); - gpiic = tqmx86_gpio_read(gpio, TQMX86_GPIIC); - gpiic &= ~mask; - gpiic |= gpio->irq_type[offset] << (offset * TQMX86_GPII_BITS); - tqmx86_gpio_write(gpio, gpiic, TQMX86_GPIIC); + gpio->irq_type[offset] |= TQMX86_INT_UNMASKED; + tqmx86_gpio_irq_config(gpio, offset); raw_spin_unlock_irqrestore(&gpio->spinlock, flags); }
@@ -148,7 +159,7 @@ static int tqmx86_gpio_irq_set_type(struct irq_data *data, unsigned int type) unsigned int offset = (data->hwirq - TQMX86_NGPO); unsigned int edge_type = type & IRQF_TRIGGER_MASK; unsigned long flags; - u8 new_type, gpiic; + u8 new_type;
switch (edge_type) { case IRQ_TYPE_EDGE_RISING: @@ -164,13 +175,10 @@ static int tqmx86_gpio_irq_set_type(struct irq_data *data, unsigned int type) return -EINVAL; /* not supported */ }
- gpio->irq_type[offset] = new_type; - raw_spin_lock_irqsave(&gpio->spinlock, flags); - gpiic = tqmx86_gpio_read(gpio, TQMX86_GPIIC); - gpiic &= ~((TQMX86_GPII_MASK) << (offset * TQMX86_GPII_BITS)); - gpiic |= new_type << (offset * TQMX86_GPII_BITS); - tqmx86_gpio_write(gpio, gpiic, TQMX86_GPIIC); + gpio->irq_type[offset] &= ~TQMX86_GPII_MASK; + gpio->irq_type[offset] |= new_type; + tqmx86_gpio_irq_config(gpio, offset); raw_spin_unlock_irqrestore(&gpio->spinlock, flags);
return 0;
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Matthias Schiffer matthias.schiffer@ew.tq-group.com
[ Upstream commit 90dd7de4ef7ba584823dfbeba834c2919a4bb55b ]
The TQMx86 GPIO controller only supports either falling or rising edge triggers, but not both at once. Fix this by implementing a software both-edge mode that toggles the edge type after every interrupt.
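A sketch of the resulting selection logic in tqmx86_gpio_irq_config(), mirroring the hunk in the diff below: for an EDGE_BOTH line the hardware is armed for the edge opposite to the current input level.

	if (gpio->irq_type[offset] & TQMX86_INT_UNMASKED) {
		type = gpio->irq_type[offset] & TQMX86_GPII_MASK;

		/* EDGE_BOTH: arm the edge opposite to the current level. */
		if (type == TQMX86_INT_BOTH)
			type = tqmx86_gpio_get(&gpio->chip, offset + TQMX86_NGPO)
				? TQMX86_GPII_FALLING
				: TQMX86_GPII_RISING;
	}

The chained handler re-runs tqmx86_gpio_irq_config() for every both-edge line that fired, after clearing GPIIS; since writing GPIIS clears both internal edge flags, an edge arriving between the GPIO read and the trigger update still leaves a new interrupt pending (see the comment block in the diff).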
Fixes: b868db94a6a7 ("gpio: tqmx86: Add GPIO from for this IO controller")
Co-developed-by: Gregor Herburger gregor.herburger@tq-group.com
Signed-off-by: Gregor Herburger gregor.herburger@tq-group.com
Signed-off-by: Matthias Schiffer matthias.schiffer@ew.tq-group.com
Link: https://lore.kernel.org/r/515324f0491c4d44f4ef49f170354aca002d81ef.171706399...
Signed-off-by: Bartosz Golaszewski bartosz.golaszewski@linaro.org
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/gpio/gpio-tqmx86.c | 46 ++++++++++++++++++++++++++++++++++----
 1 file changed, 42 insertions(+), 4 deletions(-)
diff --git a/drivers/gpio/gpio-tqmx86.c b/drivers/gpio/gpio-tqmx86.c index 7e428c872a257..f2e7e8754d95d 100644 --- a/drivers/gpio/gpio-tqmx86.c +++ b/drivers/gpio/gpio-tqmx86.c @@ -32,6 +32,10 @@ #define TQMX86_GPII_NONE 0 #define TQMX86_GPII_FALLING BIT(0) #define TQMX86_GPII_RISING BIT(1) +/* Stored in irq_type as a trigger type, but not actually valid as a register + * value, so the name doesn't use "GPII" + */ +#define TQMX86_INT_BOTH (BIT(0) | BIT(1)) #define TQMX86_GPII_MASK (BIT(0) | BIT(1)) #define TQMX86_GPII_BITS 2 /* Stored in irq_type with GPII bits */ @@ -113,9 +117,15 @@ static void tqmx86_gpio_irq_config(struct tqmx86_gpio_data *gpio, int offset) { u8 type = TQMX86_GPII_NONE, gpiic;
- if (gpio->irq_type[offset] & TQMX86_INT_UNMASKED) + if (gpio->irq_type[offset] & TQMX86_INT_UNMASKED) { type = gpio->irq_type[offset] & TQMX86_GPII_MASK;
+ if (type == TQMX86_INT_BOTH) + type = tqmx86_gpio_get(&gpio->chip, offset + TQMX86_NGPO) + ? TQMX86_GPII_FALLING + : TQMX86_GPII_RISING; + } + gpiic = tqmx86_gpio_read(gpio, TQMX86_GPIIC); gpiic &= ~(TQMX86_GPII_MASK << (offset * TQMX86_GPII_BITS)); gpiic |= type << (offset * TQMX86_GPII_BITS); @@ -169,7 +179,7 @@ static int tqmx86_gpio_irq_set_type(struct irq_data *data, unsigned int type) new_type = TQMX86_GPII_FALLING; break; case IRQ_TYPE_EDGE_BOTH: - new_type = TQMX86_GPII_FALLING | TQMX86_GPII_RISING; + new_type = TQMX86_INT_BOTH; break; default: return -EINVAL; /* not supported */ @@ -189,8 +199,8 @@ static void tqmx86_gpio_irq_handler(struct irq_desc *desc) struct gpio_chip *chip = irq_desc_get_handler_data(desc); struct tqmx86_gpio_data *gpio = gpiochip_get_data(chip); struct irq_chip *irq_chip = irq_desc_get_chip(desc); - unsigned long irq_bits; - int i = 0; + unsigned long irq_bits, flags; + int i; u8 irq_status;
chained_irq_enter(irq_chip, desc); @@ -199,6 +209,34 @@ static void tqmx86_gpio_irq_handler(struct irq_desc *desc) tqmx86_gpio_write(gpio, irq_status, TQMX86_GPIIS);
irq_bits = irq_status; + + raw_spin_lock_irqsave(&gpio->spinlock, flags); + for_each_set_bit(i, &irq_bits, TQMX86_NGPI) { + /* + * Edge-both triggers are implemented by flipping the edge + * trigger after each interrupt, as the controller only supports + * either rising or falling edge triggers, but not both. + * + * Internally, the TQMx86 GPIO controller has separate status + * registers for rising and falling edge interrupts. GPIIC + * configures which bits from which register are visible in the + * interrupt status register GPIIS and defines what triggers the + * parent IRQ line. Writing to GPIIS always clears both rising + * and falling interrupt flags internally, regardless of the + * currently configured trigger. + * + * In consequence, we can cleanly implement the edge-both + * trigger in software by first clearing the interrupt and then + * setting the new trigger based on the current GPIO input in + * tqmx86_gpio_irq_config() - even if an edge arrives between + * reading the input and setting the trigger, we will have a new + * interrupt pending. + */ + if ((gpio->irq_type[i] & TQMX86_GPII_MASK) == TQMX86_INT_BOTH) + tqmx86_gpio_irq_config(gpio, i); + } + raw_spin_unlock_irqrestore(&gpio->spinlock, flags); + for_each_set_bit(i, &irq_bits, TQMX86_NGPI) generic_handle_domain_irq(gpio->chip.irq.domain, i + TQMX86_NGPO);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nikita Zhandarovich n.zhandarovich@fintech.ru
[ Upstream commit 4aa2dcfbad538adf7becd0034a3754e1bd01b2b5 ]
Syzkaller hit a warning [1] in a call to implement() when trying to write a value into a field of smaller size in an output report.
Since implement() already prints a warning via hid_warn() and the value in question gets trimmed with: ... value &= m; ... the WARN_ON may be considered superfluous. Remove it to suppress future syzkaller triggers.
[1] WARNING: CPU: 0 PID: 5084 at drivers/hid/hid-core.c:1451 implement drivers/hid/hid-core.c:1451 [inline]
WARNING: CPU: 0 PID: 5084 at drivers/hid/hid-core.c:1451 hid_output_report+0x548/0x760 drivers/hid/hid-core.c:1863
Modules linked in:
CPU: 0 PID: 5084 Comm: syz-executor424 Not tainted 6.9.0-rc7-syzkaller-00183-gcf87f46fd34d #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 04/02/2024
RIP: 0010:implement drivers/hid/hid-core.c:1451 [inline]
RIP: 0010:hid_output_report+0x548/0x760 drivers/hid/hid-core.c:1863
...
Call Trace:
 <TASK>
 __usbhid_submit_report drivers/hid/usbhid/hid-core.c:591 [inline]
 usbhid_submit_report+0x43d/0x9e0 drivers/hid/usbhid/hid-core.c:636
 hiddev_ioctl+0x138b/0x1f00 drivers/hid/usbhid/hiddev.c:726
 vfs_ioctl fs/ioctl.c:51 [inline]
 __do_sys_ioctl fs/ioctl.c:904 [inline]
 __se_sys_ioctl+0xfc/0x170 fs/ioctl.c:890
 do_syscall_x64 arch/x86/entry/common.c:52 [inline]
 do_syscall_64+0xf5/0x240 arch/x86/entry/common.c:83
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
...
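For reference, the remaining path after the change still reports and clamps the oversized value; only the backtrace-producing WARN_ON(1) is dropped (a sketch matching the diff below):

	hid_warn(hid, "%s() called with too large value %d (n: %d)! (%s)\n",
		 __func__, value, n, current->comm);
	value &= m;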
Fixes: 95d1c8951e5b ("HID: simplify implement() a bit")
Reported-by: syzbot+5186630949e3c55f0799@syzkaller.appspotmail.com
Suggested-by: Alan Stern stern@rowland.harvard.edu
Signed-off-by: Nikita Zhandarovich n.zhandarovich@fintech.ru
Signed-off-by: Jiri Kosina jkosina@suse.com
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/hid/hid-core.c | 1 -
 1 file changed, 1 deletion(-)
diff --git a/drivers/hid/hid-core.c b/drivers/hid/hid-core.c
index cdad3a0662876..e2e52aa0eeba9 100644
--- a/drivers/hid/hid-core.c
+++ b/drivers/hid/hid-core.c
@@ -1451,7 +1451,6 @@ static void implement(const struct hid_device *hid, u8 *report,
 			hid_warn(hid,
 				 "%s() called with too large value %d (n: %d)! (%s)\n",
 				 __func__, value, n, current->comm);
-			WARN_ON(1);
 			value &= m;
 		}
 	}
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kun(llfl) llfl@linux.alibaba.com
[ Upstream commit a295ec52c8624883885396fde7b4df1a179627c3 ]
During iommu initialization, iommu_init_pci() adds sysfs nodes. However, these nodes aren't subsequently removed in free_iommu_resources().
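The fix adds a small teardown helper, called from free_iommu_one() before the other per-IOMMU resources are released (sketch taken from the diff below):

static void __init free_sysfs(struct amd_iommu *iommu)
{
	if (iommu->iommu.dev) {
		iommu_device_unregister(&iommu->iommu);
		iommu_device_sysfs_remove(&iommu->iommu);
	}
}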
Fixes: 39ab9555c241 ("iommu: Add sysfs bindings for struct iommu_device")
Signed-off-by: Kun(llfl) llfl@linux.alibaba.com
Reviewed-by: Suravee Suthikulpanit suravee.suthikulpanit@amd.com
Link: https://lore.kernel.org/r/c8e0d11c6ab1ee48299c288009cf9c5dae07b42d.171521500...
Signed-off-by: Joerg Roedel jroedel@suse.de
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/iommu/amd/init.c | 9 +++++++++
 1 file changed, 9 insertions(+)
diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index cc94ac6662339..c9598c506ff94 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -1655,8 +1655,17 @@ static void __init free_pci_segments(void)
 	}
 }
 
+static void __init free_sysfs(struct amd_iommu *iommu)
+{
+	if (iommu->iommu.dev) {
+		iommu_device_unregister(&iommu->iommu);
+		iommu_device_sysfs_remove(&iommu->iommu);
+	}
+}
+
 static void __init free_iommu_one(struct amd_iommu *iommu)
 {
+	free_sysfs(iommu);
 	free_cwwb_sem(iommu);
 	free_command_buffer(iommu);
 	free_event_buffer(iommu);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: José Expósito jose.exposito89@gmail.com
[ Upstream commit ce3af2ee95170b7d9e15fff6e500d67deab1e7b3 ]
Fix a memory leak on the logi_dj_recv_send_report() error path.
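The error path inside the retry loop now frees the previously allocated dj_report buffer before returning (as in the diff below):

		if (retval) {
			kfree(dj_report);
			return retval;
		}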
Fixes: 6f20d3261265 ("HID: logitech-dj: Fix error handling in logi_dj_recv_switch_to_dj_mode()")
Signed-off-by: José Expósito jose.exposito89@gmail.com
Signed-off-by: Jiri Kosina jkosina@suse.com
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/hid/hid-logitech-dj.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/hid/hid-logitech-dj.c b/drivers/hid/hid-logitech-dj.c
index 57697605b2e24..dc7b0fe83478e 100644
--- a/drivers/hid/hid-logitech-dj.c
+++ b/drivers/hid/hid-logitech-dj.c
@@ -1284,8 +1284,10 @@ static int logi_dj_recv_switch_to_dj_mode(struct dj_receiver_dev *djrcv_dev,
 		 */
 		msleep(50);
 
-		if (retval)
+		if (retval) {
+			kfree(dj_report);
 			return retval;
+		}
 	}
 
 	/*
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Zack Rusin zackr@vmware.com
[ Upstream commit df42523c12f8d58a41f547f471b46deffd18c203 ]
Instead of using the vmwgfx-specific framebuffer implementation, use the drm fb helpers. There's no change in functionality; the only difference is a reduction in the amount of code inside the vmwgfx module.
drm fb helpers do not deal correctly with changes in crtc preferred mode at runtime, but the old fb code wasn't dealing with it either. The same situation applies to high-res fb consoles: the old code was limited to 1176x885 because it was checking legacy/deprecated memory limits, while the drm fb helpers are limited to the initial resolution set on the fb due to the first problem (drm fb helpers being unable to handle hotplug crtc preferred mode changes).
This also removes the kernel config option for disabling fb support, which hasn't been used or supported in a very long time.
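With the custom fb code gone, fbdev emulation is handed to the generic DRM helper at probe time; in essence the probe path now does (per the diff below):

	vmw_fifo_resource_inc(vmw);
	vmw_svga_enable(vmw);
	drm_fbdev_generic_setup(&vmw->drm, 0);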
Signed-off-by: Zack Rusin zackr@vmware.com
Reviewed-by: Maaz Mombasawala mombasawalam@vmware.com
Reviewed-by: Martin Krastev krastevm@vmware.com
Reviewed-by: Thomas Zimmermann tzimmermann@suse.de
Link: https://patchwork.freedesktop.org/patch/msgid/20221022040236.616490-14-zack@...
Stable-dep-of: 426826933109 ("drm/vmwgfx: Filter modes which exceed graphics memory")
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/gpu/drm/vmwgfx/Kconfig      |   7 -
 drivers/gpu/drm/vmwgfx/Makefile     |   2 -
 drivers/gpu/drm/vmwgfx/vmwgfx_drv.c |  58 +-
 drivers/gpu/drm/vmwgfx/vmwgfx_drv.h |  35 +-
 drivers/gpu/drm/vmwgfx/vmwgfx_fb.c  | 831 ----------------------------
 drivers/gpu/drm/vmwgfx/vmwgfx_kms.c |  79 +--
 drivers/gpu/drm/vmwgfx/vmwgfx_kms.h |   7 -
 7 files changed, 28 insertions(+), 991 deletions(-)
 delete mode 100644 drivers/gpu/drm/vmwgfx/vmwgfx_fb.c
diff --git a/drivers/gpu/drm/vmwgfx/Kconfig b/drivers/gpu/drm/vmwgfx/Kconfig index a4fabe208d9f0..faddae3d6ac2e 100644 --- a/drivers/gpu/drm/vmwgfx/Kconfig +++ b/drivers/gpu/drm/vmwgfx/Kconfig @@ -16,13 +16,6 @@ config DRM_VMWGFX virtual hardware. The compiled module will be called "vmwgfx.ko".
-config DRM_VMWGFX_FBCON - depends on DRM_VMWGFX && DRM_FBDEV_EMULATION - bool "Enable framebuffer console under vmwgfx by default" - help - Choose this option if you are shipping a new vmwgfx - userspace driver that supports using the kernel driver. - config DRM_VMWGFX_MKSSTATS bool "Enable mksGuestStats instrumentation of vmwgfx by default" depends on DRM_VMWGFX diff --git a/drivers/gpu/drm/vmwgfx/Makefile b/drivers/gpu/drm/vmwgfx/Makefile index 68e350f410ad3..2a644f035597f 100644 --- a/drivers/gpu/drm/vmwgfx/Makefile +++ b/drivers/gpu/drm/vmwgfx/Makefile @@ -12,6 +12,4 @@ vmwgfx-y := vmwgfx_execbuf.o vmwgfx_gmr.o vmwgfx_kms.o vmwgfx_drv.o \ vmwgfx_devcaps.o ttm_object.o vmwgfx_system_manager.o \ vmwgfx_gem.o
-vmwgfx-$(CONFIG_DRM_FBDEV_EMULATION) += vmwgfx_fb.o - obj-$(CONFIG_DRM_VMWGFX) := vmwgfx.o diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c index 53f63ad656a41..0a75084cd32a2 100644 --- a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c @@ -35,6 +35,7 @@
#include <drm/drm_aperture.h> #include <drm/drm_drv.h> +#include <drm/drm_fb_helper.h> #include <drm/drm_gem_ttm_helper.h> #include <drm/drm_ioctl.h> #include <drm/drm_module.h> @@ -52,9 +53,6 @@
#define VMWGFX_DRIVER_DESC "Linux drm driver for VMware graphics devices"
-#define VMW_MIN_INITIAL_WIDTH 800 -#define VMW_MIN_INITIAL_HEIGHT 600 - /* * Fully encoded drm commands. Might move to vmw_drm.h */ @@ -265,7 +263,6 @@ static const struct pci_device_id vmw_pci_id_list[] = { }; MODULE_DEVICE_TABLE(pci, vmw_pci_id_list);
-static int enable_fbdev = IS_ENABLED(CONFIG_DRM_VMWGFX_FBCON); static int vmw_restrict_iommu; static int vmw_force_coherent; static int vmw_restrict_dma_mask; @@ -275,8 +272,6 @@ static int vmw_probe(struct pci_dev *, const struct pci_device_id *); static int vmwgfx_pm_notifier(struct notifier_block *nb, unsigned long val, void *ptr);
-MODULE_PARM_DESC(enable_fbdev, "Enable vmwgfx fbdev"); -module_param_named(enable_fbdev, enable_fbdev, int, 0600); MODULE_PARM_DESC(restrict_iommu, "Try to limit IOMMU usage for TTM pages"); module_param_named(restrict_iommu, vmw_restrict_iommu, int, 0600); MODULE_PARM_DESC(force_coherent, "Force coherent TTM pages"); @@ -626,8 +621,8 @@ static void vmw_get_initial_size(struct vmw_private *dev_priv) width = vmw_read(dev_priv, SVGA_REG_WIDTH); height = vmw_read(dev_priv, SVGA_REG_HEIGHT);
- width = max_t(uint32_t, width, VMW_MIN_INITIAL_WIDTH); - height = max_t(uint32_t, height, VMW_MIN_INITIAL_HEIGHT); + width = max_t(uint32_t, width, VMWGFX_MIN_INITIAL_WIDTH); + height = max_t(uint32_t, height, VMWGFX_MIN_INITIAL_HEIGHT);
if (width > dev_priv->fb_max_width || height > dev_priv->fb_max_height) { @@ -636,8 +631,8 @@ static void vmw_get_initial_size(struct vmw_private *dev_priv) * This is a host error and shouldn't occur. */
- width = VMW_MIN_INITIAL_WIDTH; - height = VMW_MIN_INITIAL_HEIGHT; + width = VMWGFX_MIN_INITIAL_WIDTH; + height = VMWGFX_MIN_INITIAL_HEIGHT; }
dev_priv->initial_width = width; @@ -887,9 +882,6 @@ static int vmw_driver_load(struct vmw_private *dev_priv, u32 pci_id)
dev_priv->assume_16bpp = !!vmw_assume_16bpp;
- dev_priv->enable_fb = enable_fbdev; - - dev_priv->capabilities = vmw_read(dev_priv, SVGA_REG_CAPABILITIES); vmw_print_bitmap(&dev_priv->drm, "Capabilities", dev_priv->capabilities, @@ -1136,12 +1128,6 @@ static int vmw_driver_load(struct vmw_private *dev_priv, u32 pci_id) VMWGFX_DRIVER_PATCHLEVEL, UTS_RELEASE); vmw_write_driver_id(dev_priv);
- if (dev_priv->enable_fb) { - vmw_fifo_resource_inc(dev_priv); - vmw_svga_enable(dev_priv); - vmw_fb_init(dev_priv); - } - dev_priv->pm_nb.notifier_call = vmwgfx_pm_notifier; register_pm_notifier(&dev_priv->pm_nb);
@@ -1188,12 +1174,9 @@ static void vmw_driver_unload(struct drm_device *dev) unregister_pm_notifier(&dev_priv->pm_nb);
vmw_sw_context_fini(dev_priv); - if (dev_priv->enable_fb) { - vmw_fb_off(dev_priv); - vmw_fb_close(dev_priv); - vmw_fifo_resource_dec(dev_priv); - vmw_svga_disable(dev_priv); - } + vmw_fifo_resource_dec(dev_priv); + + vmw_svga_disable(dev_priv);
vmw_kms_close(dev_priv); vmw_overlay_close(dev_priv); @@ -1331,8 +1314,6 @@ static void vmw_master_drop(struct drm_device *dev, struct vmw_private *dev_priv = vmw_priv(dev);
vmw_kms_legacy_hotspot_clear(dev_priv); - if (!dev_priv->enable_fb) - vmw_svga_disable(dev_priv); }
/** @@ -1528,25 +1509,19 @@ static int vmw_pm_freeze(struct device *kdev) DRM_ERROR("Failed to freeze modesetting.\n"); return ret; } - if (dev_priv->enable_fb) - vmw_fb_off(dev_priv);
vmw_execbuf_release_pinned_bo(dev_priv); vmw_resource_evict_all(dev_priv); vmw_release_device_early(dev_priv); while (ttm_device_swapout(&dev_priv->bdev, &ctx, GFP_KERNEL) > 0); - if (dev_priv->enable_fb) - vmw_fifo_resource_dec(dev_priv); + vmw_fifo_resource_dec(dev_priv); if (atomic_read(&dev_priv->num_fifo_resources) != 0) { DRM_ERROR("Can't hibernate while 3D resources are active.\n"); - if (dev_priv->enable_fb) - vmw_fifo_resource_inc(dev_priv); + vmw_fifo_resource_inc(dev_priv); WARN_ON(vmw_request_device_late(dev_priv)); dev_priv->suspend_locked = false; if (dev_priv->suspend_state) vmw_kms_resume(dev); - if (dev_priv->enable_fb) - vmw_fb_on(dev_priv); return -EBUSY; }
@@ -1566,24 +1541,19 @@ static int vmw_pm_restore(struct device *kdev)
vmw_detect_version(dev_priv);
- if (dev_priv->enable_fb) - vmw_fifo_resource_inc(dev_priv); + vmw_fifo_resource_inc(dev_priv);
ret = vmw_request_device(dev_priv); if (ret) return ret;
- if (dev_priv->enable_fb) - __vmw_svga_enable(dev_priv); + __vmw_svga_enable(dev_priv);
vmw_fence_fifo_up(dev_priv->fman); dev_priv->suspend_locked = false; if (dev_priv->suspend_state) vmw_kms_resume(&dev_priv->drm);
- if (dev_priv->enable_fb) - vmw_fb_on(dev_priv); - return 0; }
@@ -1674,6 +1644,10 @@ static int vmw_probe(struct pci_dev *pdev, const struct pci_device_id *ent) if (ret) goto out_unload;
+ vmw_fifo_resource_inc(vmw); + vmw_svga_enable(vmw); + drm_fbdev_generic_setup(&vmw->drm, 0); + vmw_debugfs_gem_init(vmw); vmw_debugfs_resource_managers_init(vmw);
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h index 136f1cdcf8cdf..00d9e58b7e149 100644 --- a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h @@ -62,6 +62,9 @@ #define VMWGFX_MAX_DISPLAYS 16 #define VMWGFX_CMD_BOUNCE_INIT_SIZE 32768
+#define VMWGFX_MIN_INITIAL_WIDTH 1280 +#define VMWGFX_MIN_INITIAL_HEIGHT 800 + #define VMWGFX_PCI_ID_SVGA2 0x0405 #define VMWGFX_PCI_ID_SVGA3 0x0406
@@ -551,7 +554,6 @@ struct vmw_private { * Framebuffer info. */
- void *fb_info; enum vmw_display_unit_type active_display_unit; struct vmw_legacy_display *ldu_priv; struct vmw_overlay *overlay_priv; @@ -610,8 +612,6 @@ struct vmw_private { struct mutex cmdbuf_mutex; struct mutex binding_mutex;
- bool enable_fb; - /** * PM management. */ @@ -1178,35 +1178,6 @@ extern void vmw_generic_waiter_add(struct vmw_private *dev_priv, u32 flag, extern void vmw_generic_waiter_remove(struct vmw_private *dev_priv, u32 flag, int *waiter_count);
- -/** - * Kernel framebuffer - vmwgfx_fb.c - */ - -#ifdef CONFIG_DRM_FBDEV_EMULATION -int vmw_fb_init(struct vmw_private *vmw_priv); -int vmw_fb_close(struct vmw_private *dev_priv); -int vmw_fb_off(struct vmw_private *vmw_priv); -int vmw_fb_on(struct vmw_private *vmw_priv); -#else -static inline int vmw_fb_init(struct vmw_private *vmw_priv) -{ - return 0; -} -static inline int vmw_fb_close(struct vmw_private *dev_priv) -{ - return 0; -} -static inline int vmw_fb_off(struct vmw_private *vmw_priv) -{ - return 0; -} -static inline int vmw_fb_on(struct vmw_private *vmw_priv) -{ - return 0; -} -#endif - /** * Kernel modesetting - vmwgfx_kms.c */ diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_fb.c b/drivers/gpu/drm/vmwgfx/vmwgfx_fb.c deleted file mode 100644 index 5b85b477e4c69..0000000000000 --- a/drivers/gpu/drm/vmwgfx/vmwgfx_fb.c +++ /dev/null @@ -1,831 +0,0 @@ -/************************************************************************** - * - * Copyright © 2007 David Airlie - * Copyright © 2009-2015 VMware, Inc., Palo Alto, CA., USA - * All Rights Reserved. - * - * Permission is hereby granted, free of charge, to any person obtaining a - * copy of this software and associated documentation files (the - * "Software"), to deal in the Software without restriction, including - * without limitation the rights to use, copy, modify, merge, publish, - * distribute, sub license, and/or sell copies of the Software, and to - * permit persons to whom the Software is furnished to do so, subject to - * the following conditions: - * - * The above copyright notice and this permission notice (including the - * next paragraph) shall be included in all copies or substantial portions - * of the Software. - * - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR - * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, - * FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT. IN NO EVENT SHALL - * THE COPYRIGHT HOLDERS, AUTHORS AND/OR ITS SUPPLIERS BE LIABLE FOR ANY CLAIM, - * DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR - * OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE - * USE OR OTHER DEALINGS IN THE SOFTWARE. 
- * - **************************************************************************/ - -#include <linux/fb.h> -#include <linux/pci.h> - -#include <drm/drm_fourcc.h> -#include <drm/ttm/ttm_placement.h> - -#include "vmwgfx_drv.h" -#include "vmwgfx_kms.h" - -#define VMW_DIRTY_DELAY (HZ / 30) - -struct vmw_fb_par { - struct vmw_private *vmw_priv; - - void *vmalloc; - - struct mutex bo_mutex; - struct vmw_buffer_object *vmw_bo; - unsigned bo_size; - struct drm_framebuffer *set_fb; - struct drm_display_mode *set_mode; - u32 fb_x; - u32 fb_y; - bool bo_iowrite; - - u32 pseudo_palette[17]; - - unsigned max_width; - unsigned max_height; - - struct { - spinlock_t lock; - bool active; - unsigned x1; - unsigned y1; - unsigned x2; - unsigned y2; - } dirty; - - struct drm_crtc *crtc; - struct drm_connector *con; - struct delayed_work local_work; -}; - -static int vmw_fb_setcolreg(unsigned regno, unsigned red, unsigned green, - unsigned blue, unsigned transp, - struct fb_info *info) -{ - struct vmw_fb_par *par = info->par; - u32 *pal = par->pseudo_palette; - - if (regno > 15) { - DRM_ERROR("Bad regno %u.\n", regno); - return 1; - } - - switch (par->set_fb->format->depth) { - case 24: - case 32: - pal[regno] = ((red & 0xff00) << 8) | - (green & 0xff00) | - ((blue & 0xff00) >> 8); - break; - default: - DRM_ERROR("Bad depth %u, bpp %u.\n", - par->set_fb->format->depth, - par->set_fb->format->cpp[0] * 8); - return 1; - } - - return 0; -} - -static int vmw_fb_check_var(struct fb_var_screeninfo *var, - struct fb_info *info) -{ - int depth = var->bits_per_pixel; - struct vmw_fb_par *par = info->par; - struct vmw_private *vmw_priv = par->vmw_priv; - - switch (var->bits_per_pixel) { - case 32: - depth = (var->transp.length > 0) ? 32 : 24; - break; - default: - DRM_ERROR("Bad bpp %u.\n", var->bits_per_pixel); - return -EINVAL; - } - - switch (depth) { - case 24: - var->red.offset = 16; - var->green.offset = 8; - var->blue.offset = 0; - var->red.length = 8; - var->green.length = 8; - var->blue.length = 8; - var->transp.length = 0; - var->transp.offset = 0; - break; - case 32: - var->red.offset = 16; - var->green.offset = 8; - var->blue.offset = 0; - var->red.length = 8; - var->green.length = 8; - var->blue.length = 8; - var->transp.length = 8; - var->transp.offset = 24; - break; - default: - DRM_ERROR("Bad depth %u.\n", depth); - return -EINVAL; - } - - if ((var->xoffset + var->xres) > par->max_width || - (var->yoffset + var->yres) > par->max_height) { - DRM_ERROR("Requested geom can not fit in framebuffer\n"); - return -EINVAL; - } - - if (!vmw_kms_validate_mode_vram(vmw_priv, - var->xres * var->bits_per_pixel/8, - var->yoffset + var->yres)) { - DRM_ERROR("Requested geom can not fit in framebuffer\n"); - return -EINVAL; - } - - return 0; -} - -static int vmw_fb_blank(int blank, struct fb_info *info) -{ - return 0; -} - -/** - * vmw_fb_dirty_flush - flush dirty regions to the kms framebuffer - * - * @work: The struct work_struct associated with this task. - * - * This function flushes the dirty regions of the vmalloc framebuffer to the - * kms framebuffer, and if the kms framebuffer is visible, also updated the - * corresponding displays. Note that this function runs even if the kms - * framebuffer is not bound to a crtc and thus not visible, but it's turned - * off during hibernation using the par->dirty.active bool. 
- */ -static void vmw_fb_dirty_flush(struct work_struct *work) -{ - struct vmw_fb_par *par = container_of(work, struct vmw_fb_par, - local_work.work); - struct vmw_private *vmw_priv = par->vmw_priv; - struct fb_info *info = vmw_priv->fb_info; - unsigned long irq_flags; - s32 dst_x1, dst_x2, dst_y1, dst_y2, w = 0, h = 0; - u32 cpp, max_x, max_y; - struct drm_clip_rect clip; - struct drm_framebuffer *cur_fb; - u8 *src_ptr, *dst_ptr; - struct vmw_buffer_object *vbo = par->vmw_bo; - void *virtual; - - if (!READ_ONCE(par->dirty.active)) - return; - - mutex_lock(&par->bo_mutex); - cur_fb = par->set_fb; - if (!cur_fb) - goto out_unlock; - - (void) ttm_bo_reserve(&vbo->base, false, false, NULL); - virtual = vmw_bo_map_and_cache(vbo); - if (!virtual) - goto out_unreserve; - - spin_lock_irqsave(&par->dirty.lock, irq_flags); - if (!par->dirty.active) { - spin_unlock_irqrestore(&par->dirty.lock, irq_flags); - goto out_unreserve; - } - - /* - * Handle panning when copying from vmalloc to framebuffer. - * Clip dirty area to framebuffer. - */ - cpp = cur_fb->format->cpp[0]; - max_x = par->fb_x + cur_fb->width; - max_y = par->fb_y + cur_fb->height; - - dst_x1 = par->dirty.x1 - par->fb_x; - dst_y1 = par->dirty.y1 - par->fb_y; - dst_x1 = max_t(s32, dst_x1, 0); - dst_y1 = max_t(s32, dst_y1, 0); - - dst_x2 = par->dirty.x2 - par->fb_x; - dst_y2 = par->dirty.y2 - par->fb_y; - dst_x2 = min_t(s32, dst_x2, max_x); - dst_y2 = min_t(s32, dst_y2, max_y); - w = dst_x2 - dst_x1; - h = dst_y2 - dst_y1; - w = max_t(s32, 0, w); - h = max_t(s32, 0, h); - - par->dirty.x1 = par->dirty.x2 = 0; - par->dirty.y1 = par->dirty.y2 = 0; - spin_unlock_irqrestore(&par->dirty.lock, irq_flags); - - if (w && h) { - dst_ptr = (u8 *)virtual + - (dst_y1 * par->set_fb->pitches[0] + dst_x1 * cpp); - src_ptr = (u8 *)par->vmalloc + - ((dst_y1 + par->fb_y) * info->fix.line_length + - (dst_x1 + par->fb_x) * cpp); - - while (h-- > 0) { - memcpy(dst_ptr, src_ptr, w*cpp); - dst_ptr += par->set_fb->pitches[0]; - src_ptr += info->fix.line_length; - } - - clip.x1 = dst_x1; - clip.x2 = dst_x2; - clip.y1 = dst_y1; - clip.y2 = dst_y2; - } - -out_unreserve: - ttm_bo_unreserve(&vbo->base); - if (w && h) { - WARN_ON_ONCE(par->set_fb->funcs->dirty(cur_fb, NULL, 0, 0, - &clip, 1)); - vmw_cmd_flush(vmw_priv, false); - } -out_unlock: - mutex_unlock(&par->bo_mutex); -} - -static void vmw_fb_dirty_mark(struct vmw_fb_par *par, - unsigned x1, unsigned y1, - unsigned width, unsigned height) -{ - unsigned long flags; - unsigned x2 = x1 + width; - unsigned y2 = y1 + height; - - spin_lock_irqsave(&par->dirty.lock, flags); - if (par->dirty.x1 == par->dirty.x2) { - par->dirty.x1 = x1; - par->dirty.y1 = y1; - par->dirty.x2 = x2; - par->dirty.y2 = y2; - /* if we are active start the dirty work - * we share the work with the defio system */ - if (par->dirty.active) - schedule_delayed_work(&par->local_work, - VMW_DIRTY_DELAY); - } else { - if (x1 < par->dirty.x1) - par->dirty.x1 = x1; - if (y1 < par->dirty.y1) - par->dirty.y1 = y1; - if (x2 > par->dirty.x2) - par->dirty.x2 = x2; - if (y2 > par->dirty.y2) - par->dirty.y2 = y2; - } - spin_unlock_irqrestore(&par->dirty.lock, flags); -} - -static int vmw_fb_pan_display(struct fb_var_screeninfo *var, - struct fb_info *info) -{ - struct vmw_fb_par *par = info->par; - - if ((var->xoffset + var->xres) > var->xres_virtual || - (var->yoffset + var->yres) > var->yres_virtual) { - DRM_ERROR("Requested panning can not fit in framebuffer\n"); - return -EINVAL; - } - - mutex_lock(&par->bo_mutex); - par->fb_x = var->xoffset; - par->fb_y = 
var->yoffset; - if (par->set_fb) - vmw_fb_dirty_mark(par, par->fb_x, par->fb_y, par->set_fb->width, - par->set_fb->height); - mutex_unlock(&par->bo_mutex); - - return 0; -} - -static void vmw_deferred_io(struct fb_info *info, struct list_head *pagereflist) -{ - struct vmw_fb_par *par = info->par; - unsigned long start, end, min, max; - unsigned long flags; - struct fb_deferred_io_pageref *pageref; - int y1, y2; - - min = ULONG_MAX; - max = 0; - list_for_each_entry(pageref, pagereflist, list) { - start = pageref->offset; - end = start + PAGE_SIZE - 1; - min = min(min, start); - max = max(max, end); - } - - if (min < max) { - y1 = min / info->fix.line_length; - y2 = (max / info->fix.line_length) + 1; - - spin_lock_irqsave(&par->dirty.lock, flags); - par->dirty.x1 = 0; - par->dirty.y1 = y1; - par->dirty.x2 = info->var.xres; - par->dirty.y2 = y2; - spin_unlock_irqrestore(&par->dirty.lock, flags); - - /* - * Since we've already waited on this work once, try to - * execute asap. - */ - cancel_delayed_work(&par->local_work); - schedule_delayed_work(&par->local_work, 0); - } -}; - -static struct fb_deferred_io vmw_defio = { - .delay = VMW_DIRTY_DELAY, - .deferred_io = vmw_deferred_io, -}; - -/* - * Draw code - */ - -static void vmw_fb_fillrect(struct fb_info *info, const struct fb_fillrect *rect) -{ - cfb_fillrect(info, rect); - vmw_fb_dirty_mark(info->par, rect->dx, rect->dy, - rect->width, rect->height); -} - -static void vmw_fb_copyarea(struct fb_info *info, const struct fb_copyarea *region) -{ - cfb_copyarea(info, region); - vmw_fb_dirty_mark(info->par, region->dx, region->dy, - region->width, region->height); -} - -static void vmw_fb_imageblit(struct fb_info *info, const struct fb_image *image) -{ - cfb_imageblit(info, image); - vmw_fb_dirty_mark(info->par, image->dx, image->dy, - image->width, image->height); -} - -/* - * Bring up code - */ - -static int vmw_fb_create_bo(struct vmw_private *vmw_priv, - size_t size, struct vmw_buffer_object **out) -{ - struct vmw_buffer_object *vmw_bo; - int ret; - - ret = vmw_bo_create(vmw_priv, size, - &vmw_sys_placement, - false, false, - &vmw_bo_bo_free, &vmw_bo); - if (unlikely(ret != 0)) - return ret; - - *out = vmw_bo; - - return ret; -} - -static int vmw_fb_compute_depth(struct fb_var_screeninfo *var, - int *depth) -{ - switch (var->bits_per_pixel) { - case 32: - *depth = (var->transp.length > 0) ? 
32 : 24; - break; - default: - DRM_ERROR("Bad bpp %u.\n", var->bits_per_pixel); - return -EINVAL; - } - - return 0; -} - -static int vmwgfx_set_config_internal(struct drm_mode_set *set) -{ - struct drm_crtc *crtc = set->crtc; - struct drm_modeset_acquire_ctx ctx; - int ret; - - drm_modeset_acquire_init(&ctx, 0); - -restart: - ret = crtc->funcs->set_config(set, &ctx); - - if (ret == -EDEADLK) { - drm_modeset_backoff(&ctx); - goto restart; - } - - drm_modeset_drop_locks(&ctx); - drm_modeset_acquire_fini(&ctx); - - return ret; -} - -static int vmw_fb_kms_detach(struct vmw_fb_par *par, - bool detach_bo, - bool unref_bo) -{ - struct drm_framebuffer *cur_fb = par->set_fb; - int ret; - - /* Detach the KMS framebuffer from crtcs */ - if (par->set_mode) { - struct drm_mode_set set; - - set.crtc = par->crtc; - set.x = 0; - set.y = 0; - set.mode = NULL; - set.fb = NULL; - set.num_connectors = 0; - set.connectors = &par->con; - ret = vmwgfx_set_config_internal(&set); - if (ret) { - DRM_ERROR("Could not unset a mode.\n"); - return ret; - } - drm_mode_destroy(&par->vmw_priv->drm, par->set_mode); - par->set_mode = NULL; - } - - if (cur_fb) { - drm_framebuffer_put(cur_fb); - par->set_fb = NULL; - } - - if (par->vmw_bo && detach_bo && unref_bo) - vmw_bo_unreference(&par->vmw_bo); - - return 0; -} - -static int vmw_fb_kms_framebuffer(struct fb_info *info) -{ - struct drm_mode_fb_cmd2 mode_cmd = {0}; - struct vmw_fb_par *par = info->par; - struct fb_var_screeninfo *var = &info->var; - struct drm_framebuffer *cur_fb; - struct vmw_framebuffer *vfb; - int ret = 0, depth; - size_t new_bo_size; - - ret = vmw_fb_compute_depth(var, &depth); - if (ret) - return ret; - - mode_cmd.width = var->xres; - mode_cmd.height = var->yres; - mode_cmd.pitches[0] = ((var->bits_per_pixel + 7) / 8) * mode_cmd.width; - mode_cmd.pixel_format = - drm_mode_legacy_fb_format(var->bits_per_pixel, depth); - - cur_fb = par->set_fb; - if (cur_fb && cur_fb->width == mode_cmd.width && - cur_fb->height == mode_cmd.height && - cur_fb->format->format == mode_cmd.pixel_format && - cur_fb->pitches[0] == mode_cmd.pitches[0]) - return 0; - - /* Need new buffer object ? 
*/ - new_bo_size = (size_t) mode_cmd.pitches[0] * (size_t) mode_cmd.height; - ret = vmw_fb_kms_detach(par, - par->bo_size < new_bo_size || - par->bo_size > 2*new_bo_size, - true); - if (ret) - return ret; - - if (!par->vmw_bo) { - ret = vmw_fb_create_bo(par->vmw_priv, new_bo_size, - &par->vmw_bo); - if (ret) { - DRM_ERROR("Failed creating a buffer object for " - "fbdev.\n"); - return ret; - } - par->bo_size = new_bo_size; - } - - vfb = vmw_kms_new_framebuffer(par->vmw_priv, par->vmw_bo, NULL, - true, &mode_cmd); - if (IS_ERR(vfb)) - return PTR_ERR(vfb); - - par->set_fb = &vfb->base; - - return 0; -} - -static int vmw_fb_set_par(struct fb_info *info) -{ - struct vmw_fb_par *par = info->par; - struct vmw_private *vmw_priv = par->vmw_priv; - struct drm_mode_set set; - struct fb_var_screeninfo *var = &info->var; - struct drm_display_mode new_mode = { DRM_MODE("fb_mode", - DRM_MODE_TYPE_DRIVER, - 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, - DRM_MODE_FLAG_NHSYNC | DRM_MODE_FLAG_PVSYNC) - }; - struct drm_display_mode *mode; - int ret; - - mode = drm_mode_duplicate(&vmw_priv->drm, &new_mode); - if (!mode) { - DRM_ERROR("Could not create new fb mode.\n"); - return -ENOMEM; - } - - mode->hdisplay = var->xres; - mode->vdisplay = var->yres; - vmw_guess_mode_timing(mode); - - if (!vmw_kms_validate_mode_vram(vmw_priv, - mode->hdisplay * - DIV_ROUND_UP(var->bits_per_pixel, 8), - mode->vdisplay)) { - drm_mode_destroy(&vmw_priv->drm, mode); - return -EINVAL; - } - - mutex_lock(&par->bo_mutex); - ret = vmw_fb_kms_framebuffer(info); - if (ret) - goto out_unlock; - - par->fb_x = var->xoffset; - par->fb_y = var->yoffset; - - set.crtc = par->crtc; - set.x = 0; - set.y = 0; - set.mode = mode; - set.fb = par->set_fb; - set.num_connectors = 1; - set.connectors = &par->con; - - ret = vmwgfx_set_config_internal(&set); - if (ret) - goto out_unlock; - - vmw_fb_dirty_mark(par, par->fb_x, par->fb_y, - par->set_fb->width, par->set_fb->height); - - /* If there already was stuff dirty we wont - * schedule a new work, so lets do it now */ - - schedule_delayed_work(&par->local_work, 0); - -out_unlock: - if (par->set_mode) - drm_mode_destroy(&vmw_priv->drm, par->set_mode); - par->set_mode = mode; - - mutex_unlock(&par->bo_mutex); - - return ret; -} - - -static const struct fb_ops vmw_fb_ops = { - .owner = THIS_MODULE, - .fb_check_var = vmw_fb_check_var, - .fb_set_par = vmw_fb_set_par, - .fb_setcolreg = vmw_fb_setcolreg, - .fb_fillrect = vmw_fb_fillrect, - .fb_copyarea = vmw_fb_copyarea, - .fb_imageblit = vmw_fb_imageblit, - .fb_pan_display = vmw_fb_pan_display, - .fb_blank = vmw_fb_blank, - .fb_mmap = fb_deferred_io_mmap, -}; - -int vmw_fb_init(struct vmw_private *vmw_priv) -{ - struct device *device = vmw_priv->drm.dev; - struct vmw_fb_par *par; - struct fb_info *info; - unsigned fb_width, fb_height; - unsigned int fb_bpp, fb_pitch, fb_size; - struct drm_display_mode *init_mode; - int ret; - - fb_bpp = 32; - - /* XXX As shouldn't these be as well. 
*/ - fb_width = min(vmw_priv->fb_max_width, (unsigned)2048); - fb_height = min(vmw_priv->fb_max_height, (unsigned)2048); - - fb_pitch = fb_width * fb_bpp / 8; - fb_size = fb_pitch * fb_height; - - info = framebuffer_alloc(sizeof(*par), device); - if (!info) - return -ENOMEM; - - /* - * Par - */ - vmw_priv->fb_info = info; - par = info->par; - memset(par, 0, sizeof(*par)); - INIT_DELAYED_WORK(&par->local_work, &vmw_fb_dirty_flush); - par->vmw_priv = vmw_priv; - par->vmalloc = NULL; - par->max_width = fb_width; - par->max_height = fb_height; - - ret = vmw_kms_fbdev_init_data(vmw_priv, 0, par->max_width, - par->max_height, &par->con, - &par->crtc, &init_mode); - if (ret) - goto err_kms; - - info->var.xres = init_mode->hdisplay; - info->var.yres = init_mode->vdisplay; - - /* - * Create buffers and alloc memory - */ - par->vmalloc = vzalloc(fb_size); - if (unlikely(par->vmalloc == NULL)) { - ret = -ENOMEM; - goto err_free; - } - - /* - * Fixed and var - */ - strcpy(info->fix.id, "svgadrmfb"); - info->fix.type = FB_TYPE_PACKED_PIXELS; - info->fix.visual = FB_VISUAL_TRUECOLOR; - info->fix.type_aux = 0; - info->fix.xpanstep = 1; /* doing it in hw */ - info->fix.ypanstep = 1; /* doing it in hw */ - info->fix.ywrapstep = 0; - info->fix.accel = FB_ACCEL_NONE; - info->fix.line_length = fb_pitch; - - info->fix.smem_start = 0; - info->fix.smem_len = fb_size; - - info->pseudo_palette = par->pseudo_palette; - info->screen_base = (char __iomem *)par->vmalloc; - info->screen_size = fb_size; - - info->fbops = &vmw_fb_ops; - - /* 24 depth per default */ - info->var.red.offset = 16; - info->var.green.offset = 8; - info->var.blue.offset = 0; - info->var.red.length = 8; - info->var.green.length = 8; - info->var.blue.length = 8; - info->var.transp.offset = 0; - info->var.transp.length = 0; - - info->var.xres_virtual = fb_width; - info->var.yres_virtual = fb_height; - info->var.bits_per_pixel = fb_bpp; - info->var.xoffset = 0; - info->var.yoffset = 0; - info->var.activate = FB_ACTIVATE_NOW; - info->var.height = -1; - info->var.width = -1; - - /* Use default scratch pixmap (info->pixmap.flags = FB_PIXMAP_SYSTEM) */ - info->apertures = alloc_apertures(1); - if (!info->apertures) { - ret = -ENOMEM; - goto err_aper; - } - info->apertures->ranges[0].base = vmw_priv->vram_start; - info->apertures->ranges[0].size = vmw_priv->vram_size; - - /* - * Dirty & Deferred IO - */ - par->dirty.x1 = par->dirty.x2 = 0; - par->dirty.y1 = par->dirty.y2 = 0; - par->dirty.active = true; - spin_lock_init(&par->dirty.lock); - mutex_init(&par->bo_mutex); - info->fbdefio = &vmw_defio; - fb_deferred_io_init(info); - - ret = register_framebuffer(info); - if (unlikely(ret != 0)) - goto err_defio; - - vmw_fb_set_par(info); - - return 0; - -err_defio: - fb_deferred_io_cleanup(info); -err_aper: -err_free: - vfree(par->vmalloc); -err_kms: - framebuffer_release(info); - vmw_priv->fb_info = NULL; - - return ret; -} - -int vmw_fb_close(struct vmw_private *vmw_priv) -{ - struct fb_info *info; - struct vmw_fb_par *par; - - if (!vmw_priv->fb_info) - return 0; - - info = vmw_priv->fb_info; - par = info->par; - - /* ??? 
order */ - fb_deferred_io_cleanup(info); - cancel_delayed_work_sync(&par->local_work); - unregister_framebuffer(info); - - mutex_lock(&par->bo_mutex); - (void) vmw_fb_kms_detach(par, true, true); - mutex_unlock(&par->bo_mutex); - - vfree(par->vmalloc); - framebuffer_release(info); - - return 0; -} - -int vmw_fb_off(struct vmw_private *vmw_priv) -{ - struct fb_info *info; - struct vmw_fb_par *par; - unsigned long flags; - - if (!vmw_priv->fb_info) - return -EINVAL; - - info = vmw_priv->fb_info; - par = info->par; - - spin_lock_irqsave(&par->dirty.lock, flags); - par->dirty.active = false; - spin_unlock_irqrestore(&par->dirty.lock, flags); - - flush_delayed_work(&info->deferred_work); - flush_delayed_work(&par->local_work); - - return 0; -} - -int vmw_fb_on(struct vmw_private *vmw_priv) -{ - struct fb_info *info; - struct vmw_fb_par *par; - unsigned long flags; - - if (!vmw_priv->fb_info) - return -EINVAL; - - info = vmw_priv->fb_info; - par = info->par; - - spin_lock_irqsave(&par->dirty.lock, flags); - par->dirty.active = true; - spin_unlock_irqrestore(&par->dirty.lock, flags); - - /* - * Need to reschedule a dirty update, because otherwise that's - * only done in dirty_mark() if the previous coalesced - * dirty region was empty. - */ - schedule_delayed_work(&par->local_work, 0); - - return 0; -} diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c b/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c index b1aed051b41ab..77c6700da0087 100644 --- a/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c @@ -1988,6 +1988,8 @@ int vmw_kms_init(struct vmw_private *dev_priv) dev->mode_config.min_height = 1; dev->mode_config.max_width = dev_priv->texture_max_width; dev->mode_config.max_height = dev_priv->texture_max_height; + dev->mode_config.preferred_depth = dev_priv->assume_16bpp ? 16 : 32; + dev->mode_config.prefer_shadow_fbdev = !dev_priv->has_mob;
drm_mode_create_suggested_offset_properties(dev); vmw_kms_create_hotplug_mode_update_property(dev_priv); @@ -2134,8 +2136,8 @@ static int vmw_du_update_layout(struct vmw_private *dev_priv, du->gui_x = rects[du->unit].x1; du->gui_y = rects[du->unit].y1; } else { - du->pref_width = 800; - du->pref_height = 600; + du->pref_width = VMWGFX_MIN_INITIAL_WIDTH; + du->pref_height = VMWGFX_MIN_INITIAL_HEIGHT; du->pref_active = false; du->gui_x = 0; du->gui_y = 0; @@ -2162,13 +2164,13 @@ static int vmw_du_update_layout(struct vmw_private *dev_priv, } con->status = vmw_du_connector_detect(con, true); } - - drm_sysfs_hotplug_event(dev); out_fini: drm_modeset_drop_locks(&ctx); drm_modeset_acquire_fini(&ctx); mutex_unlock(&dev->mode_config.mutex);
+ drm_sysfs_hotplug_event(dev); + return 0; }
@@ -2448,10 +2450,9 @@ int vmw_kms_update_layout_ioctl(struct drm_device *dev, void *data, int ret, i;
if (!arg->num_outputs) { - struct drm_rect def_rect = {0, 0, 800, 600}; - VMW_DEBUG_KMS("Default layout x1 = %d y1 = %d x2 = %d y2 = %d\n", - def_rect.x1, def_rect.y1, - def_rect.x2, def_rect.y2); + struct drm_rect def_rect = {0, 0, + VMWGFX_MIN_INITIAL_WIDTH, + VMWGFX_MIN_INITIAL_HEIGHT}; vmw_du_update_layout(dev_priv, 1, &def_rect); return 0; } @@ -2746,68 +2747,6 @@ int vmw_kms_update_proxy(struct vmw_resource *res, return 0; }
-int vmw_kms_fbdev_init_data(struct vmw_private *dev_priv, - unsigned unit, - u32 max_width, - u32 max_height, - struct drm_connector **p_con, - struct drm_crtc **p_crtc, - struct drm_display_mode **p_mode) -{ - struct drm_connector *con; - struct vmw_display_unit *du; - struct drm_display_mode *mode; - int i = 0; - int ret = 0; - - mutex_lock(&dev_priv->drm.mode_config.mutex); - list_for_each_entry(con, &dev_priv->drm.mode_config.connector_list, - head) { - if (i == unit) - break; - - ++i; - } - - if (&con->head == &dev_priv->drm.mode_config.connector_list) { - DRM_ERROR("Could not find initial display unit.\n"); - ret = -EINVAL; - goto out_unlock; - } - - if (list_empty(&con->modes)) - (void) vmw_du_connector_fill_modes(con, max_width, max_height); - - if (list_empty(&con->modes)) { - DRM_ERROR("Could not find initial display mode.\n"); - ret = -EINVAL; - goto out_unlock; - } - - du = vmw_connector_to_du(con); - *p_con = con; - *p_crtc = &du->crtc; - - list_for_each_entry(mode, &con->modes, head) { - if (mode->type & DRM_MODE_TYPE_PREFERRED) - break; - } - - if (&mode->head == &con->modes) { - WARN_ONCE(true, "Could not find initial preferred mode.\n"); - *p_mode = list_first_entry(&con->modes, - struct drm_display_mode, - head); - } else { - *p_mode = mode; - } - - out_unlock: - mutex_unlock(&dev_priv->drm.mode_config.mutex); - - return ret; -} - /** * vmw_kms_create_implicit_placement_property - Set up the implicit placement * property. diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_kms.h b/drivers/gpu/drm/vmwgfx/vmwgfx_kms.h index b116600b343a8..7e15c851871f2 100644 --- a/drivers/gpu/drm/vmwgfx/vmwgfx_kms.h +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_kms.h @@ -458,13 +458,6 @@ vmw_kms_new_framebuffer(struct vmw_private *dev_priv, struct vmw_surface *surface, bool only_2d, const struct drm_mode_fb_cmd2 *mode_cmd); -int vmw_kms_fbdev_init_data(struct vmw_private *dev_priv, - unsigned unit, - u32 max_width, - u32 max_height, - struct drm_connector **p_con, - struct drm_crtc **p_crtc, - struct drm_display_mode **p_mode); void vmw_guess_mode_timing(struct drm_display_mode *mode); void vmw_kms_update_implicit_fb(struct vmw_private *dev_priv); void vmw_kms_create_implicit_placement_property(struct vmw_private *dev_priv);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Martin Krastev martin.krastev@broadcom.com
[ Upstream commit 935f795045a6f9b13d28d46ebdad04bfea8750dd ]
Implement drm_connector_helper_funcs .mode_valid and .get_modes for the STDU, LDU and SOU display units, replacing the custom drm_connector_funcs.fill_modes code with drm_helper_probe_single_connector_modes.
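For each display unit the connector's .fill_modes is switched to drm_helper_probe_single_connector_modes and the shared vmwgfx callbacks are supplied through the connector helper funcs, e.g. for the LDU (mirroring the diff below):

static const struct drm_connector_helper_funcs vmw_ldu_connector_helper_funcs = {
	.get_modes = vmw_connector_get_modes,
	.mode_valid = vmw_connector_mode_valid,
};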
Signed-off-by: Martin Krastev martin.krastev@broadcom.com
Reviewed-by: Zack Rusin zack.rusin@broadcom.com
Signed-off-by: Zack Rusin zack.rusin@broadcom.com
Link: https://patchwork.freedesktop.org/patch/msgid/20240126200804.732454-2-zack.r...
Stable-dep-of: 426826933109 ("drm/vmwgfx: Filter modes which exceed graphics memory")
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/gpu/drm/vmwgfx/vmwgfx_kms.c  | 272 +++++++++------------------
 drivers/gpu/drm/vmwgfx/vmwgfx_kms.h  |   6 +-
 drivers/gpu/drm/vmwgfx/vmwgfx_ldu.c  |   5 +-
 drivers/gpu/drm/vmwgfx/vmwgfx_scrn.c |   5 +-
 drivers/gpu/drm/vmwgfx/vmwgfx_stdu.c |   4 +-
 5 files changed, 101 insertions(+), 191 deletions(-)
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c b/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c index 77c6700da0087..7a2d29370a534 100644 --- a/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c @@ -31,6 +31,7 @@ #include <drm/drm_fourcc.h> #include <drm/drm_rect.h> #include <drm/drm_sysfs.h> +#include <drm/drm_edid.h>
#include "vmwgfx_kms.h"
@@ -2213,107 +2214,6 @@ vmw_du_connector_detect(struct drm_connector *connector, bool force) connector_status_connected : connector_status_disconnected); }
-static struct drm_display_mode vmw_kms_connector_builtin[] = { - /* 640x480@60Hz */ - { DRM_MODE("640x480", DRM_MODE_TYPE_DRIVER, 25175, 640, 656, - 752, 800, 0, 480, 489, 492, 525, 0, - DRM_MODE_FLAG_NHSYNC | DRM_MODE_FLAG_NVSYNC) }, - /* 800x600@60Hz */ - { DRM_MODE("800x600", DRM_MODE_TYPE_DRIVER, 40000, 800, 840, - 968, 1056, 0, 600, 601, 605, 628, 0, - DRM_MODE_FLAG_PHSYNC | DRM_MODE_FLAG_PVSYNC) }, - /* 1024x768@60Hz */ - { DRM_MODE("1024x768", DRM_MODE_TYPE_DRIVER, 65000, 1024, 1048, - 1184, 1344, 0, 768, 771, 777, 806, 0, - DRM_MODE_FLAG_NHSYNC | DRM_MODE_FLAG_NVSYNC) }, - /* 1152x864@75Hz */ - { DRM_MODE("1152x864", DRM_MODE_TYPE_DRIVER, 108000, 1152, 1216, - 1344, 1600, 0, 864, 865, 868, 900, 0, - DRM_MODE_FLAG_PHSYNC | DRM_MODE_FLAG_PVSYNC) }, - /* 1280x720@60Hz */ - { DRM_MODE("1280x720", DRM_MODE_TYPE_DRIVER, 74500, 1280, 1344, - 1472, 1664, 0, 720, 723, 728, 748, 0, - DRM_MODE_FLAG_NHSYNC | DRM_MODE_FLAG_PVSYNC) }, - /* 1280x768@60Hz */ - { DRM_MODE("1280x768", DRM_MODE_TYPE_DRIVER, 79500, 1280, 1344, - 1472, 1664, 0, 768, 771, 778, 798, 0, - DRM_MODE_FLAG_NHSYNC | DRM_MODE_FLAG_PVSYNC) }, - /* 1280x800@60Hz */ - { DRM_MODE("1280x800", DRM_MODE_TYPE_DRIVER, 83500, 1280, 1352, - 1480, 1680, 0, 800, 803, 809, 831, 0, - DRM_MODE_FLAG_PHSYNC | DRM_MODE_FLAG_NVSYNC) }, - /* 1280x960@60Hz */ - { DRM_MODE("1280x960", DRM_MODE_TYPE_DRIVER, 108000, 1280, 1376, - 1488, 1800, 0, 960, 961, 964, 1000, 0, - DRM_MODE_FLAG_PHSYNC | DRM_MODE_FLAG_PVSYNC) }, - /* 1280x1024@60Hz */ - { DRM_MODE("1280x1024", DRM_MODE_TYPE_DRIVER, 108000, 1280, 1328, - 1440, 1688, 0, 1024, 1025, 1028, 1066, 0, - DRM_MODE_FLAG_PHSYNC | DRM_MODE_FLAG_PVSYNC) }, - /* 1360x768@60Hz */ - { DRM_MODE("1360x768", DRM_MODE_TYPE_DRIVER, 85500, 1360, 1424, - 1536, 1792, 0, 768, 771, 777, 795, 0, - DRM_MODE_FLAG_PHSYNC | DRM_MODE_FLAG_PVSYNC) }, - /* 1440x1050@60Hz */ - { DRM_MODE("1400x1050", DRM_MODE_TYPE_DRIVER, 121750, 1400, 1488, - 1632, 1864, 0, 1050, 1053, 1057, 1089, 0, - DRM_MODE_FLAG_NHSYNC | DRM_MODE_FLAG_PVSYNC) }, - /* 1440x900@60Hz */ - { DRM_MODE("1440x900", DRM_MODE_TYPE_DRIVER, 106500, 1440, 1520, - 1672, 1904, 0, 900, 903, 909, 934, 0, - DRM_MODE_FLAG_NHSYNC | DRM_MODE_FLAG_PVSYNC) }, - /* 1600x1200@60Hz */ - { DRM_MODE("1600x1200", DRM_MODE_TYPE_DRIVER, 162000, 1600, 1664, - 1856, 2160, 0, 1200, 1201, 1204, 1250, 0, - DRM_MODE_FLAG_PHSYNC | DRM_MODE_FLAG_PVSYNC) }, - /* 1680x1050@60Hz */ - { DRM_MODE("1680x1050", DRM_MODE_TYPE_DRIVER, 146250, 1680, 1784, - 1960, 2240, 0, 1050, 1053, 1059, 1089, 0, - DRM_MODE_FLAG_NHSYNC | DRM_MODE_FLAG_PVSYNC) }, - /* 1792x1344@60Hz */ - { DRM_MODE("1792x1344", DRM_MODE_TYPE_DRIVER, 204750, 1792, 1920, - 2120, 2448, 0, 1344, 1345, 1348, 1394, 0, - DRM_MODE_FLAG_NHSYNC | DRM_MODE_FLAG_PVSYNC) }, - /* 1853x1392@60Hz */ - { DRM_MODE("1856x1392", DRM_MODE_TYPE_DRIVER, 218250, 1856, 1952, - 2176, 2528, 0, 1392, 1393, 1396, 1439, 0, - DRM_MODE_FLAG_NHSYNC | DRM_MODE_FLAG_PVSYNC) }, - /* 1920x1080@60Hz */ - { DRM_MODE("1920x1080", DRM_MODE_TYPE_DRIVER, 173000, 1920, 2048, - 2248, 2576, 0, 1080, 1083, 1088, 1120, 0, - DRM_MODE_FLAG_NHSYNC | DRM_MODE_FLAG_PVSYNC) }, - /* 1920x1200@60Hz */ - { DRM_MODE("1920x1200", DRM_MODE_TYPE_DRIVER, 193250, 1920, 2056, - 2256, 2592, 0, 1200, 1203, 1209, 1245, 0, - DRM_MODE_FLAG_NHSYNC | DRM_MODE_FLAG_PVSYNC) }, - /* 1920x1440@60Hz */ - { DRM_MODE("1920x1440", DRM_MODE_TYPE_DRIVER, 234000, 1920, 2048, - 2256, 2600, 0, 1440, 1441, 1444, 1500, 0, - DRM_MODE_FLAG_NHSYNC | DRM_MODE_FLAG_PVSYNC) }, - /* 2560x1440@60Hz */ - { DRM_MODE("2560x1440", 
DRM_MODE_TYPE_DRIVER, 241500, 2560, 2608, - 2640, 2720, 0, 1440, 1443, 1448, 1481, 0, - DRM_MODE_FLAG_PHSYNC | DRM_MODE_FLAG_NVSYNC) }, - /* 2560x1600@60Hz */ - { DRM_MODE("2560x1600", DRM_MODE_TYPE_DRIVER, 348500, 2560, 2752, - 3032, 3504, 0, 1600, 1603, 1609, 1658, 0, - DRM_MODE_FLAG_NHSYNC | DRM_MODE_FLAG_PVSYNC) }, - /* 2880x1800@60Hz */ - { DRM_MODE("2880x1800", DRM_MODE_TYPE_DRIVER, 337500, 2880, 2928, - 2960, 3040, 0, 1800, 1803, 1809, 1852, 0, - DRM_MODE_FLAG_PHSYNC | DRM_MODE_FLAG_NVSYNC) }, - /* 3840x2160@60Hz */ - { DRM_MODE("3840x2160", DRM_MODE_TYPE_DRIVER, 533000, 3840, 3888, - 3920, 4000, 0, 2160, 2163, 2168, 2222, 0, - DRM_MODE_FLAG_PHSYNC | DRM_MODE_FLAG_NVSYNC) }, - /* 3840x2400@60Hz */ - { DRM_MODE("3840x2400", DRM_MODE_TYPE_DRIVER, 592250, 3840, 3888, - 3920, 4000, 0, 2400, 2403, 2409, 2469, 0, - DRM_MODE_FLAG_PHSYNC | DRM_MODE_FLAG_NVSYNC) }, - /* Terminate */ - { DRM_MODE("", 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) }, -}; - /** * vmw_guess_mode_timing - Provide fake timings for a * 60Hz vrefresh mode. @@ -2335,88 +2235,6 @@ void vmw_guess_mode_timing(struct drm_display_mode *mode) }
-int vmw_du_connector_fill_modes(struct drm_connector *connector, - uint32_t max_width, uint32_t max_height) -{ - struct vmw_display_unit *du = vmw_connector_to_du(connector); - struct drm_device *dev = connector->dev; - struct vmw_private *dev_priv = vmw_priv(dev); - struct drm_display_mode *mode = NULL; - struct drm_display_mode *bmode; - struct drm_display_mode prefmode = { DRM_MODE("preferred", - DRM_MODE_TYPE_DRIVER | DRM_MODE_TYPE_PREFERRED, - 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, - DRM_MODE_FLAG_NHSYNC | DRM_MODE_FLAG_PVSYNC) - }; - int i; - u32 assumed_bpp = 4; - - if (dev_priv->assume_16bpp) - assumed_bpp = 2; - - max_width = min(max_width, dev_priv->texture_max_width); - max_height = min(max_height, dev_priv->texture_max_height); - - /* - * For STDU extra limit for a mode on SVGA_REG_SCREENTARGET_MAX_WIDTH/ - * HEIGHT registers. - */ - if (dev_priv->active_display_unit == vmw_du_screen_target) { - max_width = min(max_width, dev_priv->stdu_max_width); - max_height = min(max_height, dev_priv->stdu_max_height); - } - - /* Add preferred mode */ - mode = drm_mode_duplicate(dev, &prefmode); - if (!mode) - return 0; - mode->hdisplay = du->pref_width; - mode->vdisplay = du->pref_height; - vmw_guess_mode_timing(mode); - drm_mode_set_name(mode); - - if (vmw_kms_validate_mode_vram(dev_priv, - mode->hdisplay * assumed_bpp, - mode->vdisplay)) { - drm_mode_probed_add(connector, mode); - } else { - drm_mode_destroy(dev, mode); - mode = NULL; - } - - if (du->pref_mode) { - list_del_init(&du->pref_mode->head); - drm_mode_destroy(dev, du->pref_mode); - } - - /* mode might be null here, this is intended */ - du->pref_mode = mode; - - for (i = 0; vmw_kms_connector_builtin[i].type != 0; i++) { - bmode = &vmw_kms_connector_builtin[i]; - if (bmode->hdisplay > max_width || - bmode->vdisplay > max_height) - continue; - - if (!vmw_kms_validate_mode_vram(dev_priv, - bmode->hdisplay * assumed_bpp, - bmode->vdisplay)) - continue; - - mode = drm_mode_duplicate(dev, bmode); - if (!mode) - return 0; - - drm_mode_probed_add(connector, mode); - } - - drm_connector_list_update(connector); - /* Move the prefered mode first, help apps pick the right mode. */ - drm_mode_sort(&connector->modes); - - return 1; -} - /** * vmw_kms_update_layout_ioctl - Handler for DRM_VMW_UPDATE_LAYOUT ioctl * @dev: drm device for the ioctl @@ -2945,3 +2763,91 @@ int vmw_du_helper_plane_update(struct vmw_du_update_plane *update) vmw_validation_unref_lists(&val_ctx); return ret; } + +/** + * vmw_connector_mode_valid - implements drm_connector_helper_funcs.mode_valid callback + * + * @connector: the drm connector, part of a DU container + * @mode: drm mode to check + * + * Returns MODE_OK on success, or a drm_mode_status error code. 
+ */ +enum drm_mode_status vmw_connector_mode_valid(struct drm_connector *connector, + struct drm_display_mode *mode) +{ + struct drm_device *dev = connector->dev; + struct vmw_private *dev_priv = vmw_priv(dev); + u32 max_width = dev_priv->texture_max_width; + u32 max_height = dev_priv->texture_max_height; + u32 assumed_cpp = 4; + + if (dev_priv->assume_16bpp) + assumed_cpp = 2; + + if (dev_priv->active_display_unit == vmw_du_screen_target) { + max_width = min(dev_priv->stdu_max_width, max_width); + max_height = min(dev_priv->stdu_max_height, max_height); + } + + if (max_width < mode->hdisplay) + return MODE_BAD_HVALUE; + + if (max_height < mode->vdisplay) + return MODE_BAD_VVALUE; + + if (!vmw_kms_validate_mode_vram(dev_priv, + mode->hdisplay * assumed_cpp, + mode->vdisplay)) + return MODE_MEM; + + return MODE_OK; +} + +/** + * vmw_connector_get_modes - implements drm_connector_helper_funcs.get_modes callback + * + * @connector: the drm connector, part of a DU container + * + * Returns the number of added modes. + */ +int vmw_connector_get_modes(struct drm_connector *connector) +{ + struct vmw_display_unit *du = vmw_connector_to_du(connector); + struct drm_device *dev = connector->dev; + struct vmw_private *dev_priv = vmw_priv(dev); + struct drm_display_mode *mode = NULL; + struct drm_display_mode prefmode = { DRM_MODE("preferred", + DRM_MODE_TYPE_DRIVER | DRM_MODE_TYPE_PREFERRED, + 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, + DRM_MODE_FLAG_NHSYNC | DRM_MODE_FLAG_PVSYNC) + }; + u32 max_width; + u32 max_height; + u32 num_modes; + + /* Add preferred mode */ + mode = drm_mode_duplicate(dev, &prefmode); + if (!mode) + return 0; + + mode->hdisplay = du->pref_width; + mode->vdisplay = du->pref_height; + vmw_guess_mode_timing(mode); + drm_mode_set_name(mode); + + drm_mode_probed_add(connector, mode); + drm_dbg_kms(dev, "preferred mode " DRM_MODE_FMT "\n", DRM_MODE_ARG(mode)); + + /* Probe connector for all modes not exceeding our geom limits */ + max_width = dev_priv->texture_max_width; + max_height = dev_priv->texture_max_height; + + if (dev_priv->active_display_unit == vmw_du_screen_target) { + max_width = min(dev_priv->stdu_max_width, max_width); + max_height = min(dev_priv->stdu_max_height, max_height); + } + + num_modes = 1 + drm_add_modes_noedid(connector, max_width, max_height); + + return num_modes; +} diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_kms.h b/drivers/gpu/drm/vmwgfx/vmwgfx_kms.h index 7e15c851871f2..1099de1ece4b3 100644 --- a/drivers/gpu/drm/vmwgfx/vmwgfx_kms.h +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_kms.h @@ -379,7 +379,6 @@ struct vmw_display_unit { unsigned pref_width; unsigned pref_height; bool pref_active; - struct drm_display_mode *pref_mode;
/* * Gui positioning @@ -429,8 +428,6 @@ void vmw_du_connector_save(struct drm_connector *connector); void vmw_du_connector_restore(struct drm_connector *connector); enum drm_connector_status vmw_du_connector_detect(struct drm_connector *connector, bool force); -int vmw_du_connector_fill_modes(struct drm_connector *connector, - uint32_t max_width, uint32_t max_height); int vmw_kms_helper_dirty(struct vmw_private *dev_priv, struct vmw_framebuffer *framebuffer, const struct drm_clip_rect *clips, @@ -439,6 +436,9 @@ int vmw_kms_helper_dirty(struct vmw_private *dev_priv, int num_clips, int increment, struct vmw_kms_dirty *dirty); +enum drm_mode_status vmw_connector_mode_valid(struct drm_connector *connector, + struct drm_display_mode *mode); +int vmw_connector_get_modes(struct drm_connector *connector);
void vmw_kms_helper_validation_finish(struct vmw_private *dev_priv, struct drm_file *file_priv, diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_ldu.c b/drivers/gpu/drm/vmwgfx/vmwgfx_ldu.c index ac72c20715f32..fdaf7d28cb211 100644 --- a/drivers/gpu/drm/vmwgfx/vmwgfx_ldu.c +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_ldu.c @@ -263,7 +263,7 @@ static void vmw_ldu_connector_destroy(struct drm_connector *connector) static const struct drm_connector_funcs vmw_legacy_connector_funcs = { .dpms = vmw_du_connector_dpms, .detect = vmw_du_connector_detect, - .fill_modes = vmw_du_connector_fill_modes, + .fill_modes = drm_helper_probe_single_connector_modes, .destroy = vmw_ldu_connector_destroy, .reset = vmw_du_connector_reset, .atomic_duplicate_state = vmw_du_connector_duplicate_state, @@ -272,6 +272,8 @@ static const struct drm_connector_funcs vmw_legacy_connector_funcs = {
static const struct drm_connector_helper_funcs vmw_ldu_connector_helper_funcs = { + .get_modes = vmw_connector_get_modes, + .mode_valid = vmw_connector_mode_valid };
static int vmw_kms_ldu_do_bo_dirty(struct vmw_private *dev_priv, @@ -408,7 +410,6 @@ static int vmw_ldu_init(struct vmw_private *dev_priv, unsigned unit) ldu->base.pref_active = (unit == 0); ldu->base.pref_width = dev_priv->initial_width; ldu->base.pref_height = dev_priv->initial_height; - ldu->base.pref_mode = NULL;
/* * Remove this after enabling atomic because property values can diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_scrn.c b/drivers/gpu/drm/vmwgfx/vmwgfx_scrn.c index e1f36a09c59c1..e33684f56eda8 100644 --- a/drivers/gpu/drm/vmwgfx/vmwgfx_scrn.c +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_scrn.c @@ -346,7 +346,7 @@ static void vmw_sou_connector_destroy(struct drm_connector *connector) static const struct drm_connector_funcs vmw_sou_connector_funcs = { .dpms = vmw_du_connector_dpms, .detect = vmw_du_connector_detect, - .fill_modes = vmw_du_connector_fill_modes, + .fill_modes = drm_helper_probe_single_connector_modes, .destroy = vmw_sou_connector_destroy, .reset = vmw_du_connector_reset, .atomic_duplicate_state = vmw_du_connector_duplicate_state, @@ -356,6 +356,8 @@ static const struct drm_connector_funcs vmw_sou_connector_funcs = {
static const struct drm_connector_helper_funcs vmw_sou_connector_helper_funcs = { + .get_modes = vmw_connector_get_modes, + .mode_valid = vmw_connector_mode_valid };
@@ -827,7 +829,6 @@ static int vmw_sou_init(struct vmw_private *dev_priv, unsigned unit) sou->base.pref_active = (unit == 0); sou->base.pref_width = dev_priv->initial_width; sou->base.pref_height = dev_priv->initial_height; - sou->base.pref_mode = NULL;
/* * Remove this after enabling atomic because property values can diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_stdu.c b/drivers/gpu/drm/vmwgfx/vmwgfx_stdu.c index 0090abe892548..b3e70ace6d9b0 100644 --- a/drivers/gpu/drm/vmwgfx/vmwgfx_stdu.c +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_stdu.c @@ -977,7 +977,7 @@ static void vmw_stdu_connector_destroy(struct drm_connector *connector) static const struct drm_connector_funcs vmw_stdu_connector_funcs = { .dpms = vmw_du_connector_dpms, .detect = vmw_du_connector_detect, - .fill_modes = vmw_du_connector_fill_modes, + .fill_modes = drm_helper_probe_single_connector_modes, .destroy = vmw_stdu_connector_destroy, .reset = vmw_du_connector_reset, .atomic_duplicate_state = vmw_du_connector_duplicate_state, @@ -987,6 +987,8 @@ static const struct drm_connector_funcs vmw_stdu_connector_funcs = {
static const struct drm_connector_helper_funcs vmw_stdu_connector_helper_funcs = { + .get_modes = vmw_connector_get_modes, + .mode_valid = vmw_connector_mode_valid };
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ian Forbes ian.forbes@broadcom.com
[ Upstream commit 426826933109093503e7ef15d49348fc5ab505fe ]
SVGA requires individual surfaces to fit within graphics memory (max_mob_pages), which means that modes whose final buffer size would exceed graphics memory must be pruned, otherwise surface creation will fail.
Additionally, llvmpipe requires its buffer height and width to be multiples of its tile size, which is 64. As a result we have to anticipate that llvmpipe will round up the mode size passed to it by the compositor when it creates buffers, and filter out modes where this rounding exceeds graphics memory.
This fixes an issue where VMs with low graphics memory (< 64MiB) configured with a high-resolution mode boot to a black screen because surface creation fails.
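As a rough worked example of the bound the new mode_valid callback enforces (the helper macro and the mode numbers below are illustrative assumptions, not driver code):

/* Illustrative sketch only: the shape of the memory bound the STDU
 * mode_valid patch below enforces. Names and values are assumptions. */
#include <stdint.h>
#include <stdio.h>

#define GPU_TILE_SIZE 64
#define ALIGN_UP(x, a) ((((x) + (a) - 1) / (a)) * (a))

int main(void)
{
	uint64_t w = 3840, h = 2160, cpp = 4;   /* assumed 4K 32bpp mode */
	uint64_t bytes = ALIGN_UP(w, GPU_TILE_SIZE) *
			 ALIGN_UP(h, GPU_TILE_SIZE) * cpp;

	/* 3840x2160 rounds up to 3840x2176 -> ~31.9 MiB, which a 64 MiB VM
	 * can still hold, while larger modes would be pruned as MODE_MEM. */
	printf("required ~%llu MiB\n", (unsigned long long)(bytes >> 20));
	return 0;
}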
Fixes: d947d1b71deb ("drm/vmwgfx: Add and connect connector helper function") Signed-off-by: Ian Forbes ian.forbes@broadcom.com Signed-off-by: Zack Rusin zack.rusin@broadcom.com Link: https://patchwork.freedesktop.org/patch/msgid/20240521184720.767-2-ian.forbe... Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/vmwgfx/vmwgfx_stdu.c | 45 ++++++++++++++++++++++++++-- 1 file changed, 43 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_stdu.c b/drivers/gpu/drm/vmwgfx/vmwgfx_stdu.c index b3e70ace6d9b0..6dd33d1258d11 100644 --- a/drivers/gpu/drm/vmwgfx/vmwgfx_stdu.c +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_stdu.c @@ -40,7 +40,14 @@ #define vmw_connector_to_stdu(x) \ container_of(x, struct vmw_screen_target_display_unit, base.connector)
- +/* + * Some renderers such as llvmpipe will align the width and height of their + * buffers to match their tile size. We need to keep this in mind when exposing + * modes to userspace so that this possible over-allocation will not exceed + * graphics memory. 64x64 pixels seems to be a reasonable upper bound for the + * tile size of current renderers. + */ +#define GPU_TILE_SIZE 64
enum stdu_content_type { SAME_AS_DISPLAY = 0, @@ -972,7 +979,41 @@ static void vmw_stdu_connector_destroy(struct drm_connector *connector) vmw_stdu_destroy(vmw_connector_to_stdu(connector)); }
+static enum drm_mode_status +vmw_stdu_connector_mode_valid(struct drm_connector *connector, + struct drm_display_mode *mode) +{ + enum drm_mode_status ret; + struct drm_device *dev = connector->dev; + struct vmw_private *dev_priv = vmw_priv(dev); + u64 assumed_cpp = dev_priv->assume_16bpp ? 2 : 4; + /* Align width and height to account for GPU tile over-alignment */ + u64 required_mem = ALIGN(mode->hdisplay, GPU_TILE_SIZE) * + ALIGN(mode->vdisplay, GPU_TILE_SIZE) * + assumed_cpp; + required_mem = ALIGN(required_mem, PAGE_SIZE); + + ret = drm_mode_validate_size(mode, dev_priv->stdu_max_width, + dev_priv->stdu_max_height); + if (ret != MODE_OK) + return ret;
+ ret = drm_mode_validate_size(mode, dev_priv->texture_max_width, + dev_priv->texture_max_height); + if (ret != MODE_OK) + return ret; + + if (required_mem > dev_priv->max_primary_mem) + return MODE_MEM; + + if (required_mem > dev_priv->max_mob_pages * PAGE_SIZE) + return MODE_MEM; + + if (required_mem > dev_priv->max_mob_size) + return MODE_MEM; + + return MODE_OK; +}
static const struct drm_connector_funcs vmw_stdu_connector_funcs = { .dpms = vmw_du_connector_dpms, @@ -988,7 +1029,7 @@ static const struct drm_connector_funcs vmw_stdu_connector_funcs = { static const struct drm_connector_helper_funcs vmw_stdu_connector_helper_funcs = { .get_modes = vmw_connector_get_modes, - .mode_valid = vmw_connector_mode_valid + .mode_valid = vmw_stdu_connector_mode_valid };
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ian Forbes ian.forbes@broadcom.com
[ Upstream commit fb5e19d2dd03eb995ccd468d599b2337f7f66555 ]
This limit became a hard cap starting with the change referenced below. Surface creation on the device will fail if the requested size is larger than this limit, so altering the value arbitrarily will expose modes that are too large for the device's hard limits.
Fixes: 7ebb47c9f9ab ("drm/vmwgfx: Read new register for GB memory when available")
Signed-off-by: Ian Forbes ian.forbes@broadcom.com Signed-off-by: Zack Rusin zack.rusin@broadcom.com Link: https://patchwork.freedesktop.org/patch/msgid/20240521184720.767-3-ian.forbe... Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/vmwgfx/vmwgfx_drv.c | 7 ------- 1 file changed, 7 deletions(-)
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c index 0a75084cd32a2..be27f9a3bf67b 100644 --- a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c @@ -938,13 +938,6 @@ static int vmw_driver_load(struct vmw_private *dev_priv, u32 pci_id) vmw_read(dev_priv, SVGA_REG_SUGGESTED_GBOBJECT_MEM_SIZE_KB);
- /* - * Workaround for low memory 2D VMs to compensate for the - * allocation taken by fbdev - */ - if (!(dev_priv->capabilities & SVGA_CAP_3D)) - mem_size *= 3; - dev_priv->max_mob_pages = mem_size * 1024 / PAGE_SIZE; dev_priv->max_primary_mem = vmw_read(dev_priv, SVGA_REG_MAX_PRIMARY_MEM);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ian Forbes ian.forbes@broadcom.com
[ Upstream commit dde1de06bd7248fd83c4ce5cf0dbe9e4e95bbb91 ]
STDU has its own mode_valid function now, so this logic can be removed from the generic version.
Fixes: 935f795045a6 ("drm/vmwgfx: Refactor drm connector probing for display modes")
Signed-off-by: Ian Forbes ian.forbes@broadcom.com Signed-off-by: Zack Rusin zack.rusin@broadcom.com Link: https://patchwork.freedesktop.org/patch/msgid/20240521184720.767-4-ian.forbe... Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/vmwgfx/vmwgfx_drv.h | 3 --- drivers/gpu/drm/vmwgfx/vmwgfx_kms.c | 26 +++++++++----------------- 2 files changed, 9 insertions(+), 20 deletions(-)
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h index 00d9e58b7e149..b0c23559511a1 100644 --- a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h @@ -1194,9 +1194,6 @@ void vmw_kms_cursor_snoop(struct vmw_surface *srf, int vmw_kms_write_svga(struct vmw_private *vmw_priv, unsigned width, unsigned height, unsigned pitch, unsigned bpp, unsigned depth); -bool vmw_kms_validate_mode_vram(struct vmw_private *dev_priv, - uint32_t pitch, - uint32_t height); int vmw_kms_present(struct vmw_private *dev_priv, struct drm_file *file_priv, struct vmw_framebuffer *vfb, diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c b/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c index 7a2d29370a534..5b30e4ba2811a 100644 --- a/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c @@ -2085,13 +2085,12 @@ int vmw_kms_write_svga(struct vmw_private *vmw_priv, return 0; }
+static bool vmw_kms_validate_mode_vram(struct vmw_private *dev_priv, - uint32_t pitch, - uint32_t height) + u64 pitch, + u64 height) { - return ((u64) pitch * (u64) height) < (u64) - ((dev_priv->active_display_unit == vmw_du_screen_target) ? - dev_priv->max_primary_mem : dev_priv->vram_size); + return (pitch * height) < (u64)dev_priv->vram_size; }
/** @@ -2775,25 +2774,18 @@ int vmw_du_helper_plane_update(struct vmw_du_update_plane *update) enum drm_mode_status vmw_connector_mode_valid(struct drm_connector *connector, struct drm_display_mode *mode) { + enum drm_mode_status ret; struct drm_device *dev = connector->dev; struct vmw_private *dev_priv = vmw_priv(dev); - u32 max_width = dev_priv->texture_max_width; - u32 max_height = dev_priv->texture_max_height; u32 assumed_cpp = 4;
if (dev_priv->assume_16bpp) assumed_cpp = 2;
- if (dev_priv->active_display_unit == vmw_du_screen_target) { - max_width = min(dev_priv->stdu_max_width, max_width); - max_height = min(dev_priv->stdu_max_height, max_height); - } - - if (max_width < mode->hdisplay) - return MODE_BAD_HVALUE; - - if (max_height < mode->vdisplay) - return MODE_BAD_VVALUE; + ret = drm_mode_validate_size(mode, dev_priv->texture_max_width, + dev_priv->texture_max_height); + if (ret != MODE_OK) + return ret;
if (!vmw_kms_validate_mode_vram(dev_priv, mode->hdisplay * assumed_cpp,
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Csókás, Bence csokas.bence@prolan.hu
[ Upstream commit e96b2933152fd87b6a41765b2f58b158fde855b6 ]
If the module is in SFP_MOD_ERROR, `sfp_sm_mod_remove()` will not be run. As a consequence, `sfp_hwmon_remove()` is not run either, leaving a stale `hwmon` device behind. `sfp_sm_mod_remove()` itself checks `sfp->sm_mod_state` anyway, so this check was not really needed in the first place.
Fixes: d2e816c0293f ("net: sfp: handle module remove outside state machine") Signed-off-by: "Csókás, Bence" csokas.bence@prolan.hu Reviewed-by: Andrew Lunn andrew@lunn.ch Link: https://lore.kernel.org/r/20240605084251.63502-1-csokas.bence@prolan.hu Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/phy/sfp.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/net/phy/sfp.c b/drivers/net/phy/sfp.c index 9b1403291d921..06dce78d7b0c9 100644 --- a/drivers/net/phy/sfp.c +++ b/drivers/net/phy/sfp.c @@ -2150,8 +2150,7 @@ static void sfp_sm_module(struct sfp *sfp, unsigned int event)
/* Handle remove event globally, it resets this state machine */ if (event == SFP_E_REMOVE) { - if (sfp->sm_mod_state > SFP_MOD_PROBE) - sfp_sm_mod_remove(sfp); + sfp_sm_mod_remove(sfp); sfp_sm_mod_next(sfp, SFP_MOD_EMPTY, 0); return; }
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yonglong Liu liuyonglong@huawei.com
[ Upstream commit 12cda920212a49fa22d9e8b9492ac4ea013310a4 ]
When the link status changes, the NIC driver needs to notify the RoCE driver to handle the event, but at that point the RoCE driver may already be uninitialized, which causes a kernel crash.
To fix the problem, check whether the RoCE client is registered when the link status changes, and, on uninit, wait for the link update to finish.
Fixes: 45e92b7e4e27 ("net: hns3: add calling roce callback function when link status change") Signed-off-by: Yonglong Liu liuyonglong@huawei.com Signed-off-by: Jijie Shao shaojijie@huawei.com Reviewed-by: Simon Horman horms@kernel.org Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- .../hisilicon/hns3/hns3pf/hclge_main.c | 21 ++++++++++++++----- 1 file changed, 16 insertions(+), 5 deletions(-)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c index a2655adc764cd..01e24b69e9203 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c +++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c @@ -3129,9 +3129,7 @@ static void hclge_push_link_status(struct hclge_dev *hdev)
static void hclge_update_link_status(struct hclge_dev *hdev) { - struct hnae3_handle *rhandle = &hdev->vport[0].roce; struct hnae3_handle *handle = &hdev->vport[0].nic; - struct hnae3_client *rclient = hdev->roce_client; struct hnae3_client *client = hdev->nic_client; int state; int ret; @@ -3155,8 +3153,15 @@ static void hclge_update_link_status(struct hclge_dev *hdev)
client->ops->link_status_change(handle, state); hclge_config_mac_tnl_int(hdev, state); - if (rclient && rclient->ops->link_status_change) - rclient->ops->link_status_change(rhandle, state); + + if (test_bit(HCLGE_STATE_ROCE_REGISTERED, &hdev->state)) { + struct hnae3_handle *rhandle = &hdev->vport[0].roce; + struct hnae3_client *rclient = hdev->roce_client; + + if (rclient && rclient->ops->link_status_change) + rclient->ops->link_status_change(rhandle, + state); + }
hclge_push_link_status(hdev); } @@ -11339,6 +11344,12 @@ static int hclge_init_client_instance(struct hnae3_client *client, return ret; }
+static bool hclge_uninit_need_wait(struct hclge_dev *hdev) +{ + return test_bit(HCLGE_STATE_RST_HANDLING, &hdev->state) || + test_bit(HCLGE_STATE_LINK_UPDATING, &hdev->state); +} + static void hclge_uninit_client_instance(struct hnae3_client *client, struct hnae3_ae_dev *ae_dev) { @@ -11347,7 +11358,7 @@ static void hclge_uninit_client_instance(struct hnae3_client *client,
if (hdev->roce_client) { clear_bit(HCLGE_STATE_ROCE_REGISTERED, &hdev->state); - while (test_bit(HCLGE_STATE_RST_HANDLING, &hdev->state)) + while (hclge_uninit_need_wait(hdev)) msleep(HCLGE_WAIT_RESET_DONE);
hdev->roce_client->ops->uninit_instance(&vport->roce, 0);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jie Wang wangjie125@huawei.com
[ Upstream commit 968fde83841a8c23558dfbd0a0c69d636db52b55 ]
Currently the hns3 ring buffer init process can hold the CPU for too long with big Tx/Rx ring depths, which could cause a soft lockup.
This patch therefore adds cond_resched() to the process, so the CPU can break out to run other tasks instead of busy looping.
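A user-space analogue of the batched-yield pattern applied below (the ring depth and loop body are assumptions; sched_yield() merely stands in for cond_resched()):

/* Sketch: yield the CPU every N iterations of a long setup loop. */
#include <sched.h>
#include <stdio.h>

#define RESCHED_BATCH 1024	/* mirrors HNS3_RESCHED_BD_NUM below */

int main(void)
{
	unsigned int ring_depth = 32768;	/* assumed large ring depth */
	unsigned int i;

	for (i = 0; i < ring_depth; i++) {
		/* ... allocate and attach one buffer here ... */
		if (!(i % RESCHED_BATCH))
			sched_yield();		/* stand-in for cond_resched() */
	}
	printf("initialized %u descriptors\n", ring_depth);
	return 0;
}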
Fixes: a723fb8efe29 ("net: hns3: refine for set ring parameters") Signed-off-by: Jie Wang wangjie125@huawei.com Signed-off-by: Jijie Shao shaojijie@huawei.com Reviewed-by: Simon Horman horms@kernel.org Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/hisilicon/hns3/hns3_enet.c | 4 ++++ drivers/net/ethernet/hisilicon/hns3/hns3_enet.h | 2 ++ 2 files changed, 6 insertions(+)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c index 78d6752fe0519..4ce43c3a00a37 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c +++ b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c @@ -3538,6 +3538,9 @@ static int hns3_alloc_ring_buffers(struct hns3_enet_ring *ring) ret = hns3_alloc_and_attach_buffer(ring, i); if (ret) goto out_buffer_fail; + + if (!(i % HNS3_RESCHED_BD_NUM)) + cond_resched(); }
return 0; @@ -5111,6 +5114,7 @@ int hns3_init_all_ring(struct hns3_nic_priv *priv) }
u64_stats_init(&priv->ring[i].syncp); + cond_resched(); }
return 0; diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.h b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.h index 294a14b4fdefb..1aac93f9aaa15 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.h +++ b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.h @@ -214,6 +214,8 @@ enum hns3_nic_state { #define HNS3_CQ_MODE_EQE 1U #define HNS3_CQ_MODE_CQE 0U
+#define HNS3_RESCHED_BD_NUM 1024 + enum hns3_pkt_l2t_type { HNS3_L2_TYPE_UNICAST, HNS3_L2_TYPE_MULTICAST,
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Aleksandr Mishin amishin@t-argos.ru
[ Upstream commit c44711b78608c98a3e6b49ce91678cd0917d5349 ]
In lio_vf_rep_copy_packet(), pg_info->page is compared against NULL, but then it is unconditionally passed to skb_add_rx_frag(), which looks strange and could lead to a NULL pointer dereference.
lio_vf_rep_copy_packet() call trace looks like: octeon_droq_process_packets octeon_droq_fast_process_packets octeon_droq_dispatch_pkt octeon_create_recv_info ...search in the dispatch_list... ->disp_fn(rdisp->rinfo, ...) lio_vf_rep_pkt_recv(struct octeon_recv_info *recv_info, ...) In this path there is no code which sets pg_info->page to NULL, so the check looks unneeded and does not solve a potential problem. But I guess the author had a reason to add the check, and I have no such card and cannot do a real test. In addition, the code in liquidio_push_packet() in liquidio/lio_core.c does exactly the same.
Based on this, the most acceptable compromise is to address the issue by moving skb_add_rx_frag() into the conditional scope.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Fixes: 1f233f327913 ("liquidio: switchdev support for LiquidIO NIC") Signed-off-by: Aleksandr Mishin amishin@t-argos.ru Reviewed-by: Simon Horman horms@kernel.org Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/cavium/liquidio/lio_vf_rep.c | 11 +++++------ 1 file changed, 5 insertions(+), 6 deletions(-)
diff --git a/drivers/net/ethernet/cavium/liquidio/lio_vf_rep.c b/drivers/net/ethernet/cavium/liquidio/lio_vf_rep.c index 600de587d7a98..e70b9ccca380e 100644 --- a/drivers/net/ethernet/cavium/liquidio/lio_vf_rep.c +++ b/drivers/net/ethernet/cavium/liquidio/lio_vf_rep.c @@ -272,13 +272,12 @@ lio_vf_rep_copy_packet(struct octeon_device *oct, pg_info->page_offset; memcpy(skb->data, va, MIN_SKB_SIZE); skb_put(skb, MIN_SKB_SIZE); + skb_add_rx_frag(skb, skb_shinfo(skb)->nr_frags, + pg_info->page, + pg_info->page_offset + MIN_SKB_SIZE, + len - MIN_SKB_SIZE, + LIO_RXBUFFER_SZ); } - - skb_add_rx_frag(skb, skb_shinfo(skb)->nr_frags, - pg_info->page, - pg_info->page_offset + MIN_SKB_SIZE, - len - MIN_SKB_SIZE, - LIO_RXBUFFER_SZ); } else { struct octeon_skb_page_info *pg_info = ((struct octeon_skb_page_info *)(skb->cb));
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Amjad Ouled-Ameur amjad.ouled-ameur@arm.com
[ Upstream commit b880018edd3a577e50366338194dee9b899947e0 ]
komeda_pipeline_get_state() may return an error-valued pointer, so check the pointer for an error or NULL value before dereferencing it.
Fixes: 502932a03fce ("drm/komeda: Add the initial scaler support for CORE") Signed-off-by: Amjad Ouled-Ameur amjad.ouled-ameur@arm.com Signed-off-by: Maxime Ripard mripard@kernel.org Link: https://patchwork.freedesktop.org/patch/msgid/20240610102056.40406-1-amjad.o... Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/arm/display/komeda/komeda_pipeline_state.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/arm/display/komeda/komeda_pipeline_state.c b/drivers/gpu/drm/arm/display/komeda/komeda_pipeline_state.c index 916f2c36bf2f7..e200decd00c6d 100644 --- a/drivers/gpu/drm/arm/display/komeda/komeda_pipeline_state.c +++ b/drivers/gpu/drm/arm/display/komeda/komeda_pipeline_state.c @@ -259,7 +259,7 @@ komeda_component_get_avail_scaler(struct komeda_component *c, u32 avail_scalers;
pipe_st = komeda_pipeline_get_state(c->pipeline, state); - if (!pipe_st) + if (IS_ERR_OR_NULL(pipe_st)) return NULL;
avail_scalers = (pipe_st->active_comps & KOMEDA_PIPELINE_SCALERS) ^
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Adam Miotk adam.miotk@arm.com
[ Upstream commit ce62600c4dbee8d43b02277669dd91785a9b81d9 ]
Device managed panel bridge wrappers are created by calling to drm_panel_bridge_add_typed() and registering a release handler for clean-up when the device gets unbound.
Since the memory for this bridge is also managed and linked to the panel device, the release function should not try to free that memory. Moreover, the call to devm_kfree() inside drm_panel_bridge_remove() will fail in this case and emit a warning because the panel bridge resource is no longer on the device resources list (it has been removed from there before the call to release handlers).
Fixes: 67022227ffb1 ("drm/bridge: Add a devm_ allocator for panel bridge.") Signed-off-by: Adam Miotk adam.miotk@arm.com Signed-off-by: Maxime Ripard mripard@kernel.org Link: https://patchwork.freedesktop.org/patch/msgid/20240610102739.139852-1-adam.m... Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/bridge/panel.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/bridge/panel.c b/drivers/gpu/drm/bridge/panel.c index 216af76d00427..cdfcdbecd4c80 100644 --- a/drivers/gpu/drm/bridge/panel.c +++ b/drivers/gpu/drm/bridge/panel.c @@ -306,9 +306,12 @@ EXPORT_SYMBOL(drm_panel_bridge_set_orientation);
static void devm_drm_panel_bridge_release(struct device *dev, void *res) { - struct drm_bridge **bridge = res; + struct drm_bridge *bridge = *(struct drm_bridge **)res;
- drm_panel_bridge_remove(*bridge); + if (!bridge) + return; + + drm_bridge_remove(bridge); }
/**
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Dumazet edumazet@google.com
[ Upstream commit d37fe4255abe8e7b419b90c5847e8ec2b8debb08 ]
tcp_v6_syn_recv_sock() calls ip6_dst_store() before inet_sk(newsk)->pinet6 has been set up.
This means ip6_dst_store() writes over the parent (listener) np->dst_cookie.
This is racy because multiple threads could share the same parent and their final np->dst_cookie could be wrong.
Move ip6_dst_store() call after inet_sk(newsk)->pinet6 has been changed and after the copy of parent ipv6_pinfo.
Fixes: e994b2f0fb92 ("tcp: do not lock listener to process SYN packets") Signed-off-by: Eric Dumazet edumazet@google.com Reviewed-by: Simon Horman horms@kernel.org Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/ipv6/tcp_ipv6.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c index ba9a22db5805c..4b0e05349862d 100644 --- a/net/ipv6/tcp_ipv6.c +++ b/net/ipv6/tcp_ipv6.c @@ -1291,7 +1291,6 @@ static struct sock *tcp_v6_syn_recv_sock(const struct sock *sk, struct sk_buff * */
newsk->sk_gso_type = SKB_GSO_TCPV6; - ip6_dst_store(newsk, dst, NULL, NULL); inet6_sk_rx_dst_set(newsk, skb);
inet_sk(newsk)->pinet6 = tcp_inet6_sk(newsk); @@ -1302,6 +1301,8 @@ static struct sock *tcp_v6_syn_recv_sock(const struct sock *sk, struct sk_buff *
memcpy(newnp, np, sizeof(struct ipv6_pinfo));
+ ip6_dst_store(newsk, dst, NULL, NULL); + newsk->sk_v6_daddr = ireq->ir_v6_rmt_addr; newnp->saddr = ireq->ir_v6_loc_addr; newsk->sk_v6_rcv_saddr = ireq->ir_v6_loc_addr;
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Gal Pressman gal@nvidia.com
[ Upstream commit c6ae073f5903f6c6439d0ac855836a4da5c0a701 ]
When innerprotoinherit is set, the tunneled packets do not have an inner Ethernet header. Change 'maclen' to not always assume the header length is ETH_HLEN, as there might not be a MAC header.
This resolves issues with drivers (e.g. mlx5, in mlx5e_tx_tunnel_accel()) who rely on the skb inner network header offset to be correct, and use it for TX offloads.
Fixes: d8a6213d70ac ("geneve: fix header validation in geneve[6]_xmit_skb") Signed-off-by: Gal Pressman gal@nvidia.com Signed-off-by: Tariq Toukan tariqt@nvidia.com Reviewed-by: Wojciech Drewek wojciech.drewek@intel.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/geneve.c | 10 ++++++---- include/net/ip_tunnels.h | 5 +++-- 2 files changed, 9 insertions(+), 6 deletions(-)
diff --git a/drivers/net/geneve.c b/drivers/net/geneve.c index 488ca1c854962..c4a49a75250e3 100644 --- a/drivers/net/geneve.c +++ b/drivers/net/geneve.c @@ -919,6 +919,7 @@ static int geneve_xmit_skb(struct sk_buff *skb, struct net_device *dev, struct geneve_dev *geneve, const struct ip_tunnel_info *info) { + bool inner_proto_inherit = geneve->cfg.inner_proto_inherit; bool xnet = !net_eq(geneve->net, dev_net(geneve->dev)); struct geneve_sock *gs4 = rcu_dereference(geneve->sock4); const struct ip_tunnel_key *key = &info->key; @@ -930,7 +931,7 @@ static int geneve_xmit_skb(struct sk_buff *skb, struct net_device *dev, __be16 sport; int err;
- if (!skb_vlan_inet_prepare(skb)) + if (!skb_vlan_inet_prepare(skb, inner_proto_inherit)) return -EINVAL;
sport = udp_flow_src_port(geneve->net, skb, 1, USHRT_MAX, true); @@ -1003,7 +1004,7 @@ static int geneve_xmit_skb(struct sk_buff *skb, struct net_device *dev, }
err = geneve_build_skb(&rt->dst, skb, info, xnet, sizeof(struct iphdr), - geneve->cfg.inner_proto_inherit); + inner_proto_inherit); if (unlikely(err)) return err;
@@ -1019,6 +1020,7 @@ static int geneve6_xmit_skb(struct sk_buff *skb, struct net_device *dev, struct geneve_dev *geneve, const struct ip_tunnel_info *info) { + bool inner_proto_inherit = geneve->cfg.inner_proto_inherit; bool xnet = !net_eq(geneve->net, dev_net(geneve->dev)); struct geneve_sock *gs6 = rcu_dereference(geneve->sock6); const struct ip_tunnel_key *key = &info->key; @@ -1028,7 +1030,7 @@ static int geneve6_xmit_skb(struct sk_buff *skb, struct net_device *dev, __be16 sport; int err;
- if (!skb_vlan_inet_prepare(skb)) + if (!skb_vlan_inet_prepare(skb, inner_proto_inherit)) return -EINVAL;
sport = udp_flow_src_port(geneve->net, skb, 1, USHRT_MAX, true); @@ -1083,7 +1085,7 @@ static int geneve6_xmit_skb(struct sk_buff *skb, struct net_device *dev, ttl = ttl ? : ip6_dst_hoplimit(dst); } err = geneve_build_skb(dst, skb, info, xnet, sizeof(struct ipv6hdr), - geneve->cfg.inner_proto_inherit); + inner_proto_inherit); if (unlikely(err)) return err;
diff --git a/include/net/ip_tunnels.h b/include/net/ip_tunnels.h index f9906b73e7ff4..0cc077c3dda30 100644 --- a/include/net/ip_tunnels.h +++ b/include/net/ip_tunnels.h @@ -353,9 +353,10 @@ static inline bool pskb_inet_may_pull(struct sk_buff *skb)
/* Variant of pskb_inet_may_pull(). */ -static inline bool skb_vlan_inet_prepare(struct sk_buff *skb) +static inline bool skb_vlan_inet_prepare(struct sk_buff *skb, + bool inner_proto_inherit) { - int nhlen = 0, maclen = ETH_HLEN; + int nhlen = 0, maclen = inner_proto_inherit ? 0 : ETH_HLEN; __be16 type = skb->protocol;
/* Essentially this is skb_protocol(skb, true)
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Gal Pressman gal@nvidia.com
[ Upstream commit 791b4089e326271424b78f2fae778b20e53d071b ]
Move the vxlan_features_check() call to after we verified the packet is a tunneled VXLAN packet.
Without this, tunneled UDP non-VXLAN packets (e.g. GENEVE) might wrongly not get offloaded. In some cases it worked by chance, as the GENEVE header is the same size as the VXLAN header, but it is obviously incorrect.
Fixes: e3cfc7e6b7bd ("net/mlx5e: TX, Add geneve tunnel stateless offload support") Signed-off-by: Gal Pressman gal@nvidia.com Reviewed-by: Dragos Tatulea dtatulea@nvidia.com Signed-off-by: Tariq Toukan tariqt@nvidia.com Reviewed-by: Wojciech Drewek wojciech.drewek@intel.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c index e2f134e1d9fcf..4c0eac83546de 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c @@ -4587,7 +4587,7 @@ static netdev_features_t mlx5e_tunnel_features_check(struct mlx5e_priv *priv,
/* Verify if UDP port is being offloaded by HW */ if (mlx5_vxlan_lookup_port(priv->mdev->vxlan, port)) - return features; + return vxlan_features_check(skb, features);
#if IS_ENABLED(CONFIG_GENEVE) /* Support Geneve offload for default UDP port */ @@ -4613,7 +4613,6 @@ netdev_features_t mlx5e_features_check(struct sk_buff *skb, struct mlx5e_priv *priv = netdev_priv(netdev);
features = vlan_features_check(skb, features); - features = vxlan_features_check(skb, features);
/* Validate if the tunneled packet is being offloaded by HW */ if (skb->encapsulation &&
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Luiz Augusto von Dentz luiz.von.dentz@intel.com
[ Upstream commit 806a5198c05987b748b50f3d0c0cfb3d417381a4 ]
This removes the bogus check for max > hcon->le_conn_max_interval, since the latter is just the initial maximum connection interval, not the maximum the stack can support, which is really 3200 (4000 ms).
In order to pass GAP/CONN/CPUP/BV-05-C one shall probably enter values of the following fields in IXIT that would cause hci_check_conn_params to fail:
TSPX_conn_update_int_min
TSPX_conn_update_int_max
TSPX_conn_update_peripheral_latency
TSPX_conn_update_supervision_timeout
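For reference, a minimal user-space sketch that mirrors the checks hci_check_conn_params() performs after this patch (see the hunk below). The example parameter values are only assumptions; intervals are in 1.25 ms units and the supervision timeout in 10 ms units:

#include <stdio.h>

static int check_conn_params(unsigned int min, unsigned int max,
			     unsigned int latency, unsigned int to_multiplier)
{
	unsigned int max_latency;

	if (min > max || min < 6 || max > 3200)
		return -1;
	if (to_multiplier < 10 || to_multiplier > 3200)
		return -1;
	if (max >= to_multiplier * 8)
		return -1;
	max_latency = (to_multiplier * 4 / max) - 1;
	if (latency > 499 || latency > max_latency)
		return -1;
	return 0;
}

int main(void)
{
	/* 30-50 ms interval, no latency, 420 ms timeout: accepted (0) */
	printf("%d\n", check_conn_params(24, 40, 0, 42));
	/* 4 s interval with a 32 s timeout: also accepted here, even though
	 * the old l2cap check would have rejected it against the initial
	 * le_conn_max_interval */
	printf("%d\n", check_conn_params(3200, 3200, 0, 3200));
	return 0;
}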
Link: https://github.com/bluez/bluez/issues/847 Fixes: e4b019515f95 ("Bluetooth: Enforce validation on max value of connection interval") Signed-off-by: Luiz Augusto von Dentz luiz.von.dentz@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- include/net/bluetooth/hci_core.h | 36 ++++++++++++++++++++++++++++---- net/bluetooth/l2cap_core.c | 8 +------ 2 files changed, 33 insertions(+), 11 deletions(-)
diff --git a/include/net/bluetooth/hci_core.h b/include/net/bluetooth/hci_core.h index c50a41f1782a4..9df7e29386bcc 100644 --- a/include/net/bluetooth/hci_core.h +++ b/include/net/bluetooth/hci_core.h @@ -1936,18 +1936,46 @@ static inline int hci_check_conn_params(u16 min, u16 max, u16 latency, { u16 max_latency;
- if (min > max || min < 6 || max > 3200) + if (min > max) { + BT_WARN("min %d > max %d", min, max); return -EINVAL; + } + + if (min < 6) { + BT_WARN("min %d < 6", min); + return -EINVAL; + } + + if (max > 3200) { + BT_WARN("max %d > 3200", max); + return -EINVAL; + } + + if (to_multiplier < 10) { + BT_WARN("to_multiplier %d < 10", to_multiplier); + return -EINVAL; + }
- if (to_multiplier < 10 || to_multiplier > 3200) + if (to_multiplier > 3200) { + BT_WARN("to_multiplier %d > 3200", to_multiplier); return -EINVAL; + }
- if (max >= to_multiplier * 8) + if (max >= to_multiplier * 8) { + BT_WARN("max %d >= to_multiplier %d * 8", max, to_multiplier); return -EINVAL; + }
max_latency = (to_multiplier * 4 / max) - 1; - if (latency > 499 || latency > max_latency) + if (latency > 499) { + BT_WARN("latency %d > 499", latency); return -EINVAL; + } + + if (latency > max_latency) { + BT_WARN("latency %d > max_latency %d", latency, max_latency); + return -EINVAL; + }
return 0; } diff --git a/net/bluetooth/l2cap_core.c b/net/bluetooth/l2cap_core.c index 5f9a599baa34d..a204488a21759 100644 --- a/net/bluetooth/l2cap_core.c +++ b/net/bluetooth/l2cap_core.c @@ -5641,13 +5641,7 @@ static inline int l2cap_conn_param_update_req(struct l2cap_conn *conn,
memset(&rsp, 0, sizeof(rsp));
- if (max > hcon->le_conn_max_interval) { - BT_DBG("requested connection interval exceeds current bounds."); - err = -EINVAL; - } else { - err = hci_check_conn_params(min, max, latency, to_multiplier); - } - + err = hci_check_conn_params(min, max, latency, to_multiplier); if (err) rsp.result = cpu_to_le16(L2CAP_CONN_PARAM_REJECTED); else
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jozsef Kadlecsik kadlec@netfilter.org
[ Upstream commit 4e7aaa6b82d63e8ddcbfb56b4fd3d014ca586f10 ]
Lion Ackermann reported that there is a race condition between namespace cleanup in ipset and the garbage collection of the list:set type. The namespace cleanup can destroy the list:set type of sets while the gc of the set type is still waiting to run in rcu cleanup. The latter uses data from the destroyed set, which thus leads to a use after free. The patch contains the following parts:
- When destroying all sets, first remove the garbage collectors, then wait if needed and then destroy the sets.
- Fix the badly ordered "wait then remove gc" for the destroy-a-single-set case.
- Fix the missing rcu locking in the list:set type in the userspace test case.
- Use proper RCU list handling in the list:set type.
The patch depends on c1193d9bbbd3 (netfilter: ipset: Add list flush to cancel_gc).
Fixes: 97f7cf1cd80e (netfilter: ipset: fix performance regression in swap operation) Reported-by: Lion Ackermann nnamrec@gmail.com Tested-by: Lion Ackermann nnamrec@gmail.com Signed-off-by: Jozsef Kadlecsik kadlec@netfilter.org Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/netfilter/ipset/ip_set_core.c | 81 +++++++++++++++------------ net/netfilter/ipset/ip_set_list_set.c | 30 +++++----- 2 files changed, 60 insertions(+), 51 deletions(-)
diff --git a/net/netfilter/ipset/ip_set_core.c b/net/netfilter/ipset/ip_set_core.c index f645da82d826e..649b8a5901e33 100644 --- a/net/netfilter/ipset/ip_set_core.c +++ b/net/netfilter/ipset/ip_set_core.c @@ -1174,23 +1174,50 @@ ip_set_setname_policy[IPSET_ATTR_CMD_MAX + 1] = { .len = IPSET_MAXNAMELEN - 1 }, };
+/* In order to return quickly when destroying a single set, it is split + * into two stages: + * - Cancel garbage collector + * - Destroy the set itself via call_rcu() + */ + static void -ip_set_destroy_set(struct ip_set *set) +ip_set_destroy_set_rcu(struct rcu_head *head) { - pr_debug("set: %s\n", set->name); + struct ip_set *set = container_of(head, struct ip_set, rcu);
- /* Must call it without holding any lock */ set->variant->destroy(set); module_put(set->type->me); kfree(set); }
static void -ip_set_destroy_set_rcu(struct rcu_head *head) +_destroy_all_sets(struct ip_set_net *inst) { - struct ip_set *set = container_of(head, struct ip_set, rcu); + struct ip_set *set; + ip_set_id_t i; + bool need_wait = false;
- ip_set_destroy_set(set); + /* First cancel gc's: set:list sets are flushed as well */ + for (i = 0; i < inst->ip_set_max; i++) { + set = ip_set(inst, i); + if (set) { + set->variant->cancel_gc(set); + if (set->type->features & IPSET_TYPE_NAME) + need_wait = true; + } + } + /* Must wait for flush to be really finished */ + if (need_wait) + rcu_barrier(); + for (i = 0; i < inst->ip_set_max; i++) { + set = ip_set(inst, i); + if (set) { + ip_set(inst, i) = NULL; + set->variant->destroy(set); + module_put(set->type->me); + kfree(set); + } + } }
static int ip_set_destroy(struct sk_buff *skb, const struct nfnl_info *info, @@ -1204,11 +1231,10 @@ static int ip_set_destroy(struct sk_buff *skb, const struct nfnl_info *info, if (unlikely(protocol_min_failed(attr))) return -IPSET_ERR_PROTOCOL;
- /* Commands are serialized and references are * protected by the ip_set_ref_lock. * External systems (i.e. xt_set) must call - * ip_set_put|get_nfnl_* functions, that way we + * ip_set_nfnl_get_* functions, that way we * can safely check references here. * * list:set timer can only decrement the reference @@ -1216,8 +1242,6 @@ static int ip_set_destroy(struct sk_buff *skb, const struct nfnl_info *info, * without holding the lock. */ if (!attr[IPSET_ATTR_SETNAME]) { - /* Must wait for flush to be really finished in list:set */ - rcu_barrier(); read_lock_bh(&ip_set_ref_lock); for (i = 0; i < inst->ip_set_max; i++) { s = ip_set(inst, i); @@ -1228,15 +1252,7 @@ static int ip_set_destroy(struct sk_buff *skb, const struct nfnl_info *info, } inst->is_destroyed = true; read_unlock_bh(&ip_set_ref_lock); - for (i = 0; i < inst->ip_set_max; i++) { - s = ip_set(inst, i); - if (s) { - ip_set(inst, i) = NULL; - /* Must cancel garbage collectors */ - s->variant->cancel_gc(s); - ip_set_destroy_set(s); - } - } + _destroy_all_sets(inst); /* Modified by ip_set_destroy() only, which is serialized */ inst->is_destroyed = false; } else { @@ -1257,12 +1273,12 @@ static int ip_set_destroy(struct sk_buff *skb, const struct nfnl_info *info, features = s->type->features; ip_set(inst, i) = NULL; read_unlock_bh(&ip_set_ref_lock); + /* Must cancel garbage collectors */ + s->variant->cancel_gc(s); if (features & IPSET_TYPE_NAME) { /* Must wait for flush to be really finished */ rcu_barrier(); } - /* Must cancel garbage collectors */ - s->variant->cancel_gc(s); call_rcu(&s->rcu, ip_set_destroy_set_rcu); } return 0; @@ -2367,30 +2383,25 @@ ip_set_net_init(struct net *net) }
static void __net_exit -ip_set_net_exit(struct net *net) +ip_set_net_pre_exit(struct net *net) { struct ip_set_net *inst = ip_set_pernet(net);
- struct ip_set *set = NULL; - ip_set_id_t i; - inst->is_deleted = true; /* flag for ip_set_nfnl_put */ +}
- nfnl_lock(NFNL_SUBSYS_IPSET); - for (i = 0; i < inst->ip_set_max; i++) { - set = ip_set(inst, i); - if (set) { - ip_set(inst, i) = NULL; - set->variant->cancel_gc(set); - ip_set_destroy_set(set); - } - } - nfnl_unlock(NFNL_SUBSYS_IPSET); +static void __net_exit +ip_set_net_exit(struct net *net) +{ + struct ip_set_net *inst = ip_set_pernet(net); + + _destroy_all_sets(inst); kvfree(rcu_dereference_protected(inst->ip_set_list, 1)); }
static struct pernet_operations ip_set_net_ops = { .init = ip_set_net_init, + .pre_exit = ip_set_net_pre_exit, .exit = ip_set_net_exit, .id = &ip_set_net_id, .size = sizeof(struct ip_set_net), diff --git a/net/netfilter/ipset/ip_set_list_set.c b/net/netfilter/ipset/ip_set_list_set.c index 6bc7019982b05..e839c356bcb56 100644 --- a/net/netfilter/ipset/ip_set_list_set.c +++ b/net/netfilter/ipset/ip_set_list_set.c @@ -79,7 +79,7 @@ list_set_kadd(struct ip_set *set, const struct sk_buff *skb, struct set_elem *e; int ret;
- list_for_each_entry(e, &map->members, list) { + list_for_each_entry_rcu(e, &map->members, list) { if (SET_WITH_TIMEOUT(set) && ip_set_timeout_expired(ext_timeout(e, set))) continue; @@ -99,7 +99,7 @@ list_set_kdel(struct ip_set *set, const struct sk_buff *skb, struct set_elem *e; int ret;
- list_for_each_entry(e, &map->members, list) { + list_for_each_entry_rcu(e, &map->members, list) { if (SET_WITH_TIMEOUT(set) && ip_set_timeout_expired(ext_timeout(e, set))) continue; @@ -188,9 +188,10 @@ list_set_utest(struct ip_set *set, void *value, const struct ip_set_ext *ext, struct list_set *map = set->data; struct set_adt_elem *d = value; struct set_elem *e, *next, *prev = NULL; - int ret; + int ret = 0;
- list_for_each_entry(e, &map->members, list) { + rcu_read_lock(); + list_for_each_entry_rcu(e, &map->members, list) { if (SET_WITH_TIMEOUT(set) && ip_set_timeout_expired(ext_timeout(e, set))) continue; @@ -201,6 +202,7 @@ list_set_utest(struct ip_set *set, void *value, const struct ip_set_ext *ext,
if (d->before == 0) { ret = 1; + goto out; } else if (d->before > 0) { next = list_next_entry(e, list); ret = !list_is_last(&e->list, &map->members) && @@ -208,9 +210,11 @@ list_set_utest(struct ip_set *set, void *value, const struct ip_set_ext *ext, } else { ret = prev && prev->id == d->refid; } - return ret; + goto out; } - return 0; +out: + rcu_read_unlock(); + return ret; }
static void @@ -239,7 +243,7 @@ list_set_uadd(struct ip_set *set, void *value, const struct ip_set_ext *ext,
/* Find where to add the new entry */ n = prev = next = NULL; - list_for_each_entry(e, &map->members, list) { + list_for_each_entry_rcu(e, &map->members, list) { if (SET_WITH_TIMEOUT(set) && ip_set_timeout_expired(ext_timeout(e, set))) continue; @@ -316,9 +320,9 @@ list_set_udel(struct ip_set *set, void *value, const struct ip_set_ext *ext, { struct list_set *map = set->data; struct set_adt_elem *d = value; - struct set_elem *e, *next, *prev = NULL; + struct set_elem *e, *n, *next, *prev = NULL;
- list_for_each_entry(e, &map->members, list) { + list_for_each_entry_safe(e, n, &map->members, list) { if (SET_WITH_TIMEOUT(set) && ip_set_timeout_expired(ext_timeout(e, set))) continue; @@ -424,14 +428,8 @@ static void list_set_destroy(struct ip_set *set) { struct list_set *map = set->data; - struct set_elem *e, *n;
- list_for_each_entry_safe(e, n, &map->members, list) { - list_del(&e->list); - ip_set_put_byindex(map->net, e->id); - ip_set_ext_destroy(set, e); - kfree(e); - } + WARN_ON_ONCE(!list_empty(&map->members)); kfree(map);
set->data = NULL;
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kory Maincent kory.maincent@bootlin.com
[ Upstream commit 144ba8580bcb82b2686c3d1a043299d844b9a682 ]
ENOTSUPP is not a SUSV4 error code; prefer EOPNOTSUPP, as reported by the checkpatch script.
Fixes: 18ff0bcda6d1 ("ethtool: add interface to interact with Ethernet Power Equipment") Reviewed-by: Andrew Lunn andrew@lunn.ch Acked-by: Oleksij Rempel o.rempel@pengutronix.de Signed-off-by: Kory Maincent kory.maincent@bootlin.com Link: https://lore.kernel.org/r/20240610083426.740660-1-kory.maincent@bootlin.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/pse-pd/pse.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/include/linux/pse-pd/pse.h b/include/linux/pse-pd/pse.h index fb724c65c77bc..5ce0cd76956e0 100644 --- a/include/linux/pse-pd/pse.h +++ b/include/linux/pse-pd/pse.h @@ -114,14 +114,14 @@ static inline int pse_ethtool_get_status(struct pse_control *psec, struct netlink_ext_ack *extack, struct pse_control_status *status) { - return -ENOTSUPP; + return -EOPNOTSUPP; }
static inline int pse_ethtool_set_config(struct pse_control *psec, struct netlink_ext_ack *extack, const struct pse_control_config *config) { - return -ENOTSUPP; + return -EOPNOTSUPP; }
#endif
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Joshua Washington joshwash@google.com
[ Upstream commit 1b9f756344416e02b41439bf2324b26aa25e141c ]
TSO currently fails when the skb's gso_type field has more than one bit set.
TSO packets can be passed from userspace using PF_PACKET, TUNTAP and a few others, using virtio_net_hdr (e.g., PACKET_VNET_HDR). This includes virtualization, such as QEMU, a real use-case.
The gso_type and gso_size fields as passed from userspace in virtio_net_hdr are not trusted blindly by the kernel. It adds gso_type |= SKB_GSO_DODGY to force the packet to enter the software GSO stack for verification.
This issue might similarly come up when the CWR bit is set in the TCP header for congestion control, causing the SKB_GSO_TCP_ECN gso_type bit to be set.
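To make the failure mode concrete, here is a small stand-alone sketch (bit values and helper names are illustrative, not the kernel's) of why an exact match on gso_type stops working once a second flag such as SKB_GSO_DODGY is set, while the bitmask test used by the fix below still recognizes the TCP GSO types:

#include <stdio.h>

#define GSO_TCPV4 0x01	/* illustrative bit values only */
#define GSO_DODGY 0x04
#define GSO_TCPV6 0x10

static int is_tcp_gso_switch(unsigned int gso_type)
{
	switch (gso_type) {	/* old logic: exact match on the whole field */
	case GSO_TCPV4:
	case GSO_TCPV6:
		return 1;
	default:
		return 0;
	}
}

static int is_tcp_gso_mask(unsigned int gso_type)
{
	/* new logic: test only the TCP bits, ignore extra flags */
	return !!(gso_type & (GSO_TCPV4 | GSO_TCPV6));
}

int main(void)
{
	unsigned int t = GSO_TCPV4 | GSO_DODGY;

	/* prints "switch: 0, mask: 1" */
	printf("switch: %d, mask: %d\n", is_tcp_gso_switch(t),
	       is_tcp_gso_mask(t));
	return 0;
}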
Fixes: a57e5de476be ("gve: DQO: Add TX path") Signed-off-by: Joshua Washington joshwash@google.com Reviewed-by: Praveen Kaligineedi pkaligineedi@google.com Reviewed-by: Harshitha Ramamurthy hramamurthy@google.com Reviewed-by: Willem de Bruijn willemb@google.com Suggested-by: Eric Dumazet edumazet@google.com Acked-by: Andrei Vagin avagin@gmail.com
v2 - Remove unnecessary comments, remove line break between fixes tag and signoffs.
v3 - Add back unrelated empty line removal.
Link: https://lore.kernel.org/r/20240610225729.2985343-1-joshwash@google.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/google/gve/gve_tx_dqo.c | 20 +++++--------------- 1 file changed, 5 insertions(+), 15 deletions(-)
diff --git a/drivers/net/ethernet/google/gve/gve_tx_dqo.c b/drivers/net/ethernet/google/gve/gve_tx_dqo.c index e84e944d751d2..5147fb37929e0 100644 --- a/drivers/net/ethernet/google/gve/gve_tx_dqo.c +++ b/drivers/net/ethernet/google/gve/gve_tx_dqo.c @@ -370,28 +370,18 @@ static int gve_prep_tso(struct sk_buff *skb) if (unlikely(skb_shinfo(skb)->gso_size < GVE_TX_MIN_TSO_MSS_DQO)) return -1;
+ if (!(skb_shinfo(skb)->gso_type & (SKB_GSO_TCPV4 | SKB_GSO_TCPV6))) + return -EINVAL; + /* Needed because we will modify header. */ err = skb_cow_head(skb, 0); if (err < 0) return err;
tcp = tcp_hdr(skb); - - /* Remove payload length from checksum. */ paylen = skb->len - skb_transport_offset(skb); - - switch (skb_shinfo(skb)->gso_type) { - case SKB_GSO_TCPV4: - case SKB_GSO_TCPV6: - csum_replace_by_diff(&tcp->check, - (__force __wsum)htonl(paylen)); - - /* Compute length of segmentation header. */ - header_len = skb_tcp_all_headers(skb); - break; - default: - return -EINVAL; - } + csum_replace_by_diff(&tcp->check, (__force __wsum)htonl(paylen)); + header_len = skb_tcp_all_headers(skb);
if (unlikely(header_len > GVE_TX_MAX_HDR_SIZE_DQO)) return -EINVAL;
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Xiaolei Wang xiaolei.wang@windriver.com
[ Upstream commit be27b896529787e23a35ae4befb6337ce73fcca0 ]
The current CBS parameter calculation depends on the port speed after link-up, which is not needed and will report a configuration error if the port is not initially connected. The UAPI exposed by tc-cbs requires userspace to recalculate the send slope anyway, because the formula depends on port_transmit_rate (see man tc-cbs), which is not an invariant from tc's perspective. Therefore, we use offload->sendslope and offload->idleslope to derive the original port_transmit_rate from the CBS formula.
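A small worked example (with assumed slope values) of how the port rate falls out of the CBS slopes, which is what the patch below relies on instead of priv->speed:

#include <stdio.h>

int main(void)
{
	long long idleslope = 200000;	/* kbit/s reserved for the AVB queue */
	long long sendslope = -800000;	/* kbit/s, negative by definition   */
	long long rate_kbps = idleslope - sendslope;

	/* 200000 - (-800000) = 1000000 kbit/s, i.e. a 1 Gb/s port, which the
	 * patched switch statement maps to a speed divider (ptr) of 8. */
	printf("portTransmitRate = %lld kbit/s\n", rate_kbps);
	return 0;
}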
Fixes: 1f705bc61aee ("net: stmmac: Add support for CBS QDISC") Signed-off-by: Xiaolei Wang xiaolei.wang@windriver.com Reviewed-by: Wojciech Drewek wojciech.drewek@intel.com Reviewed-by: Vladimir Oltean olteanv@gmail.com Link: https://lore.kernel.org/r/20240608143524.2065736-1-xiaolei.wang@windriver.co... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- .../net/ethernet/stmicro/stmmac/stmmac_tc.c | 25 ++++++++----------- 1 file changed, 11 insertions(+), 14 deletions(-)
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_tc.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_tc.c index 390c900832cd2..074ff289eaf25 100644 --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_tc.c +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_tc.c @@ -343,10 +343,11 @@ static int tc_setup_cbs(struct stmmac_priv *priv, struct tc_cbs_qopt_offload *qopt) { u32 tx_queues_count = priv->plat->tx_queues_to_use; + s64 port_transmit_rate_kbps; u32 queue = qopt->queue; - u32 ptr, speed_div; u32 mode_to_use; u64 value; + u32 ptr; int ret;
/* Queue 0 is not AVB capable */ @@ -355,30 +356,26 @@ static int tc_setup_cbs(struct stmmac_priv *priv, if (!priv->dma_cap.av) return -EOPNOTSUPP;
+ port_transmit_rate_kbps = qopt->idleslope - qopt->sendslope; + /* Port Transmit Rate and Speed Divider */ - switch (priv->speed) { + switch (div_s64(port_transmit_rate_kbps, 1000)) { case SPEED_10000: - ptr = 32; - speed_div = 10000000; - break; case SPEED_5000: ptr = 32; - speed_div = 5000000; break; case SPEED_2500: - ptr = 8; - speed_div = 2500000; - break; case SPEED_1000: ptr = 8; - speed_div = 1000000; break; case SPEED_100: ptr = 4; - speed_div = 100000; break; default: - return -EOPNOTSUPP; + netdev_err(priv->dev, + "Invalid portTransmitRate %lld (idleSlope - sendSlope)\n", + port_transmit_rate_kbps); + return -EINVAL; }
mode_to_use = priv->plat->tx_queues_cfg[queue].mode_to_use; @@ -398,10 +395,10 @@ static int tc_setup_cbs(struct stmmac_priv *priv, }
/* Final adjustments for HW */ - value = div_s64(qopt->idleslope * 1024ll * ptr, speed_div); + value = div_s64(qopt->idleslope * 1024ll * ptr, port_transmit_rate_kbps); priv->plat->tx_queues_cfg[queue].idle_slope = value & GENMASK(31, 0);
- value = div_s64(-qopt->sendslope * 1024ll * ptr, speed_div); + value = div_s64(-qopt->sendslope * 1024ll * ptr, port_transmit_rate_kbps); priv->plat->tx_queues_cfg[queue].send_slope = value & GENMASK(31, 0);
value = qopt->hicredit * 1024ll * 8;
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Daniel Wagner dwagner@suse.de
[ Upstream commit d76584e53f4244dbc154bec447c3852600acc914 ]
The id override functions return a status which is not propagated to the caller.
Fixes: c1fef73f793b ("nvmet: add passthru code to process commands") Signed-off-by: Daniel Wagner dwagner@suse.de Reviewed-by: Chaitanya Kulkarni kch@nvidia.com Reviewed-by: Christoph Hellwig hch@lst.de Signed-off-by: Keith Busch kbusch@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/nvme/target/passthru.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/nvme/target/passthru.c b/drivers/nvme/target/passthru.c index a0a292d49588c..dc756a1c9d0e3 100644 --- a/drivers/nvme/target/passthru.c +++ b/drivers/nvme/target/passthru.c @@ -226,13 +226,13 @@ static void nvmet_passthru_execute_cmd_work(struct work_struct *w) req->cmd->common.opcode == nvme_admin_identify) { switch (req->cmd->identify.cns) { case NVME_ID_CNS_CTRL: - nvmet_passthru_override_id_ctrl(req); + status = nvmet_passthru_override_id_ctrl(req); break; case NVME_ID_CNS_NS: - nvmet_passthru_override_id_ns(req); + status = nvmet_passthru_override_id_ns(req); break; case NVME_ID_CNS_NS_DESC_LIST: - nvmet_passthru_override_id_descs(req); + status = nvmet_passthru_override_id_descs(req); break; } } else if (status < 0)
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Petr Pavlu petr.pavlu@suse.com
[ Upstream commit 14a20e5b4ad998793c5f43b0330d9e1388446cf3 ]
The net.ipv6.route.flush system parameter takes a value which specifies the delay used during the flush operation for aging exception routes. However, the written value is not used in the currently requested flush; it only takes effect in the next one.
A problem is that ipv6_sysctl_rtcache_flush() first reads the old value of net->ipv6.sysctl.flush_delay into a local delay variable and then calls proc_dointvec() which actually updates the sysctl based on the provided input.
Fix the problem by switching the order of the two operations.
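A toy user-space illustration of the ordering problem (all names here are assumptions, not kernel code):

#include <stdio.h>

static int flush_delay;	/* stands in for net->ipv6.sysctl.flush_delay */

static void handle_write(int new_value, int buggy_order)
{
	int delay;

	if (buggy_order) {
		delay = flush_delay;		/* stale read */
		flush_delay = new_value;	/* proc_dointvec() equivalent */
	} else {
		flush_delay = new_value;	/* update first ... */
		delay = flush_delay;		/* ... then read for this flush */
	}
	printf("flush runs with delay=%d (value written: %d)\n",
	       delay, new_value);
}

int main(void)
{
	handle_write(10, 1);	/* buggy: flush runs with delay=0  */
	handle_write(20, 0);	/* fixed: flush runs with delay=20 */
	return 0;
}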
Fixes: 4990509f19e8 ("[NETNS][IPV6]: Make sysctls route per namespace.") Signed-off-by: Petr Pavlu petr.pavlu@suse.com Reviewed-by: David Ahern dsahern@kernel.org Link: https://lore.kernel.org/r/20240607112828.30285-1-petr.pavlu@suse.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/ipv6/route.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/net/ipv6/route.c b/net/ipv6/route.c index 627431722f9d6..d305051e8ab5f 100644 --- a/net/ipv6/route.c +++ b/net/ipv6/route.c @@ -6335,12 +6335,12 @@ static int ipv6_sysctl_rtcache_flush(struct ctl_table *ctl, int write, if (!write) return -EINVAL;
- net = (struct net *)ctl->extra1; - delay = net->ipv6.sysctl.flush_delay; ret = proc_dointvec(ctl, write, buffer, lenp, ppos); if (ret) return ret;
+ net = (struct net *)ctl->extra1; + delay = net->ipv6.sysctl.flush_delay; fib6_run_gc(delay <= 0 ? 0 : (unsigned long)delay, net, delay > 0); return 0; }
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nikolay Aleksandrov razor@blackwall.org
[ Upstream commit 36c92936e868601fa1f43da6758cf55805043509 ]
Pass the already obtained vlan group pointer to br_mst_vlan_set_state() instead of dereferencing it again. Each caller has already correctly dereferenced it for their context. This change is required for the following suspicious RCU dereference fix. No functional changes intended.
Fixes: 3a7c1661ae13 ("net: bridge: mst: fix vlan use-after-free") Reported-by: syzbot+9bbe2de1bc9d470eb5fe@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=9bbe2de1bc9d470eb5fe Signed-off-by: Nikolay Aleksandrov razor@blackwall.org Link: https://lore.kernel.org/r/20240609103654.914987-2-razor@blackwall.org Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/bridge/br_mst.c | 11 +++++------ 1 file changed, 5 insertions(+), 6 deletions(-)
diff --git a/net/bridge/br_mst.c b/net/bridge/br_mst.c index 3c66141d34d62..1de72816b0fb2 100644 --- a/net/bridge/br_mst.c +++ b/net/bridge/br_mst.c @@ -73,11 +73,10 @@ int br_mst_get_state(const struct net_device *dev, u16 msti, u8 *state) } EXPORT_SYMBOL_GPL(br_mst_get_state);
-static void br_mst_vlan_set_state(struct net_bridge_port *p, struct net_bridge_vlan *v, +static void br_mst_vlan_set_state(struct net_bridge_vlan_group *vg, + struct net_bridge_vlan *v, u8 state) { - struct net_bridge_vlan_group *vg = nbp_vlan_group(p); - if (br_vlan_get_state(v) == state) return;
@@ -121,7 +120,7 @@ int br_mst_set_state(struct net_bridge_port *p, u16 msti, u8 state, if (v->brvlan->msti != msti) continue;
- br_mst_vlan_set_state(p, v, state); + br_mst_vlan_set_state(vg, v, state); }
out: @@ -140,13 +139,13 @@ static void br_mst_vlan_sync_state(struct net_bridge_vlan *pv, u16 msti) * it. */ if (v != pv && v->brvlan->msti == msti) { - br_mst_vlan_set_state(pv->port, pv, v->state); + br_mst_vlan_set_state(vg, pv, v->state); return; } }
/* Otherwise, start out in a new MSTI with all ports disabled. */ - return br_mst_vlan_set_state(pv->port, pv, BR_STATE_DISABLED); + return br_mst_vlan_set_state(vg, pv, BR_STATE_DISABLED); }
int br_mst_vlan_set_msti(struct net_bridge_vlan *mv, u16 msti)
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nikolay Aleksandrov razor@blackwall.org
[ Upstream commit 546ceb1dfdac866648ec959cbc71d9525bd73462 ]
I converted br_mst_set_state to RCU to avoid a vlan use-after-free but forgot to change the vlan group dereference helper. Switch to vlan group RCU deref helper to fix the suspicious rcu usage warning.
Fixes: 3a7c1661ae13 ("net: bridge: mst: fix vlan use-after-free") Reported-by: syzbot+9bbe2de1bc9d470eb5fe@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=9bbe2de1bc9d470eb5fe Signed-off-by: Nikolay Aleksandrov razor@blackwall.org Link: https://lore.kernel.org/r/20240609103654.914987-3-razor@blackwall.org Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/bridge/br_mst.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/bridge/br_mst.c b/net/bridge/br_mst.c index 1de72816b0fb2..1820f09ff59ce 100644 --- a/net/bridge/br_mst.c +++ b/net/bridge/br_mst.c @@ -102,7 +102,7 @@ int br_mst_set_state(struct net_bridge_port *p, u16 msti, u8 state, int err = 0;
rcu_read_lock(); - vg = nbp_vlan_group(p); + vg = nbp_vlan_group_rcu(p); if (!vg) goto out;
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Taehee Yoo ap420073@gmail.com
[ Upstream commit 79f18a41dd056115d685f3b0a419c7cd40055e13 ]
When queues are started, netif_napi_add() and napi_enable() are called. If there are 4 queues but only 3 are used by the current configuration, only those 3 queues' napi should be registered and enabled. ionic_qcq_enable() checks whether the .poll pointer is non-NULL so that it only enables napi for the queues in use. Unused queues' napi are never registered by netif_napi_add(), so their .poll pointer is NULL. But this check cannot tell whether a napi was later unregistered, because netif_napi_del() doesn't reset the .poll pointer to NULL. So ionic_qcq_enable() may call napi_enable() for a queue whose napi was already unregistered by netif_napi_del().
Reproducer:
  ethtool -L <interface name> rx 1 tx 1 combined 0
  ethtool -L <interface name> rx 0 tx 0 combined 1
  ethtool -L <interface name> rx 0 tx 0 combined 4
Splat looks like: kernel BUG at net/core/dev.c:6666! Oops: invalid opcode: 0000 [#1] PREEMPT SMP NOPTI CPU: 3 PID: 1057 Comm: kworker/3:3 Not tainted 6.10.0-rc2+ #16 Workqueue: events ionic_lif_deferred_work [ionic] RIP: 0010:napi_enable+0x3b/0x40 Code: 48 89 c2 48 83 e2 f6 80 b9 61 09 00 00 00 74 0d 48 83 bf 60 01 00 00 00 74 03 80 ce 01 f0 4f RSP: 0018:ffffb6ed83227d48 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff97560cda0828 RCX: 0000000000000029 RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff97560cda0a28 RBP: ffffb6ed83227d50 R08: 0000000000000400 R09: 0000000000000001 R10: 0000000000000001 R11: 0000000000000001 R12: 0000000000000000 R13: ffff97560ce3c1a0 R14: 0000000000000000 R15: ffff975613ba0a20 FS: 0000000000000000(0000) GS:ffff975d5f780000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f8f734ee200 CR3: 0000000103e50000 CR4: 00000000007506f0 PKRU: 55555554 Call Trace: <TASK> ? die+0x33/0x90 ? do_trap+0xd9/0x100 ? napi_enable+0x3b/0x40 ? do_error_trap+0x83/0xb0 ? napi_enable+0x3b/0x40 ? napi_enable+0x3b/0x40 ? exc_invalid_op+0x4e/0x70 ? napi_enable+0x3b/0x40 ? asm_exc_invalid_op+0x16/0x20 ? napi_enable+0x3b/0x40 ionic_qcq_enable+0xb7/0x180 [ionic 59bdfc8a035436e1c4224ff7d10789e3f14643f8] ionic_start_queues+0xc4/0x290 [ionic 59bdfc8a035436e1c4224ff7d10789e3f14643f8] ionic_link_status_check+0x11c/0x170 [ionic 59bdfc8a035436e1c4224ff7d10789e3f14643f8] ionic_lif_deferred_work+0x129/0x280 [ionic 59bdfc8a035436e1c4224ff7d10789e3f14643f8] process_one_work+0x145/0x360 worker_thread+0x2bb/0x3d0 ? __pfx_worker_thread+0x10/0x10 kthread+0xcc/0x100 ? __pfx_kthread+0x10/0x10 ret_from_fork+0x2d/0x50 ? __pfx_kthread+0x10/0x10 ret_from_fork_asm+0x1a/0x30
Fixes: 0f3154e6bcb3 ("ionic: Add Tx and Rx handling") Signed-off-by: Taehee Yoo ap420073@gmail.com Reviewed-by: Brett Creeley brett.creeley@amd.com Reviewed-by: Shannon Nelson shannon.nelson@amd.com Link: https://lore.kernel.org/r/20240612060446.1754392-1-ap420073@gmail.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/pensando/ionic/ionic_lif.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/drivers/net/ethernet/pensando/ionic/ionic_lif.c b/drivers/net/ethernet/pensando/ionic/ionic_lif.c index d33cf8ee7c336..d34aea85f8a69 100644 --- a/drivers/net/ethernet/pensando/ionic/ionic_lif.c +++ b/drivers/net/ethernet/pensando/ionic/ionic_lif.c @@ -292,10 +292,8 @@ static int ionic_qcq_enable(struct ionic_qcq *qcq) if (ret) return ret;
- if (qcq->napi.poll) - napi_enable(&qcq->napi); - if (qcq->flags & IONIC_QCQ_F_INTR) { + napi_enable(&qcq->napi); irq_set_affinity_hint(qcq->intr.vector, &qcq->intr.affinity_mask); ionic_intr_mask(idev->intr_ctrl, qcq->intr.index,
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Rao Shoaib Rao.Shoaib@oracle.com
[ Upstream commit a6736a0addd60fccc3a3508461d72314cc609772 ]
Read with MSG_PEEK flag loops if the first byte to read is an OOB byte. commit 22dd70eb2c3d ("af_unix: Don't peek OOB data without MSG_OOB.") addresses the loop issue but does not address the issue that no data beyond the OOB byte can be read.
  >>> from socket import *
  >>> c1, c2 = socketpair(AF_UNIX, SOCK_STREAM)
  >>> c1.send(b'a', MSG_OOB)
  1
  >>> c1.send(b'b')
  1
  >>> c2.recv(1, MSG_PEEK | MSG_DONTWAIT)
  b'b'

  >>> from socket import *
  >>> c1, c2 = socketpair(AF_UNIX, SOCK_STREAM)
  >>> c2.setsockopt(SOL_SOCKET, SO_OOBINLINE, 1)
  >>> c1.send(b'a', MSG_OOB)
  1
  >>> c1.send(b'b')
  1
  >>> c2.recv(1, MSG_PEEK | MSG_DONTWAIT)
  b'a'
  >>> c2.recv(1, MSG_PEEK | MSG_DONTWAIT)
  b'a'
  >>> c2.recv(1, MSG_DONTWAIT)
  b'a'
  >>> c2.recv(1, MSG_PEEK | MSG_DONTWAIT)
  b'b'
Fixes: 314001f0bf92 ("af_unix: Add OOB support") Signed-off-by: Rao Shoaib Rao.Shoaib@oracle.com Reviewed-by: Kuniyuki Iwashima kuniyu@amazon.com Link: https://lore.kernel.org/r/20240611084639.2248934-1-Rao.Shoaib@oracle.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/unix/af_unix.c | 18 +++++++++--------- 1 file changed, 9 insertions(+), 9 deletions(-)
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c index bb94a67229aa3..3905cdcaa5184 100644 --- a/net/unix/af_unix.c +++ b/net/unix/af_unix.c @@ -2682,18 +2682,18 @@ static struct sk_buff *manage_oob(struct sk_buff *skb, struct sock *sk, if (skb == u->oob_skb) { if (copied) { skb = NULL; - } else if (sock_flag(sk, SOCK_URGINLINE)) { - if (!(flags & MSG_PEEK)) { + } else if (!(flags & MSG_PEEK)) { + if (sock_flag(sk, SOCK_URGINLINE)) { WRITE_ONCE(u->oob_skb, NULL); consume_skb(skb); + } else { + __skb_unlink(skb, &sk->sk_receive_queue); + WRITE_ONCE(u->oob_skb, NULL); + unlinked_skb = skb; + skb = skb_peek(&sk->sk_receive_queue); } - } else if (flags & MSG_PEEK) { - skb = NULL; - } else { - __skb_unlink(skb, &sk->sk_receive_queue); - WRITE_ONCE(u->oob_skb, NULL); - unlinked_skb = skb; - skb = skb_peek(&sk->sk_receive_queue); + } else if (!sock_flag(sk, SOCK_URGINLINE)) { + skb = skb_peek_next(skb, &sk->sk_receive_queue); } }
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Aleksandr Mishin amishin@t-argos.ru
[ Upstream commit a9b9741854a9fe9df948af49ca5514e0ed0429df ]
If the token is released because token->state == BNXT_HWRM_DEFERRED, the released token (set to NULL) is then used in log messages. This issue is expected to be prevented by the HWRM_ERR_CODE_PF_UNAVAILABLE error code, but that code is only returned by recent firmware, so some firmware may not return it. This may lead to a NULL pointer dereference. Address the issue by adding a token pointer check.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Fixes: 8fa4219dba8e ("bnxt_en: add dynamic debug support for HWRM messages") Suggested-by: Michael Chan michael.chan@broadcom.com Signed-off-by: Aleksandr Mishin amishin@t-argos.ru Reviewed-by: Wojciech Drewek wojciech.drewek@intel.com Reviewed-by: Michael Chan michael.chan@broadcom.com Link: https://lore.kernel.org/r/20240611082547.12178-1-amishin@t-argos.ru Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/broadcom/bnxt/bnxt_hwrm.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_hwrm.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_hwrm.c index 132442f16fe67..7a4e08b5a8c1b 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt_hwrm.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_hwrm.c @@ -678,7 +678,7 @@ static int __hwrm_send(struct bnxt *bp, struct bnxt_hwrm_ctx *ctx) req_type); else if (rc && rc != HWRM_ERR_CODE_PF_UNAVAILABLE) hwrm_err(bp, ctx, "hwrm req_type 0x%x seq id 0x%x error 0x%x\n", - req_type, token->seq_id, rc); + req_type, le16_to_cpu(ctx->req->seq_id), rc); rc = __hwrm_to_stderr(rc); exit: if (token)
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yongzhi Liu hyperlyzcs@gmail.com
commit 086c6cbcc563c81d55257f9b27e14faf1d0963d3 upstream.
When auxiliary_device_add() returns an error and auxiliary_device_uninit() is then called, the callback function gp_auxiliary_device_release() calls ida_free() and kfree(aux_device_wrapper) to free memory. We shouldn't call them again in the error handling path.
Fix this by skipping the redundant cleanup functions.
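To illustrate the shape of the bug, here is a hedged, self-contained sketch with invented helpers (resource_init/resource_add/release_cb are not the pci1xxxx functions): once the uninit step's release callback has freed the object, the error path has to jump over the later free labels instead of falling through into them.

  #include <stdlib.h>

  struct wrapper { int id; };

  /* release callback invoked by resource_uninit(); frees the wrapper,
   * much like the release callback described above frees the ID and wrapper */
  static void release_cb(struct wrapper *w) { free(w); }

  static int resource_init(struct wrapper *w) { (void)w; return 0; }
  static int resource_add(struct wrapper *w)  { (void)w; return -1; } /* simulate failure */
  static void resource_uninit(struct wrapper *w) { release_cb(w); }

  static int probe(void)
  {
          struct wrapper *w;
          int ret;

          w = malloc(sizeof(*w));
          if (!w)
                  return -1;

          ret = resource_init(w);
          if (ret)
                  goto err_init;

          ret = resource_add(w);
          if (ret)
                  goto err_add;

          return 0;

  err_add:
          resource_uninit(w);     /* release_cb() already freed w ...          */
          goto out;               /* ... so skip the manual free label below   */
  err_init:
          free(w);
  out:
          return ret;
  }

  int main(void) { return probe() ? 1 : 0; }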
Fixes: 393fc2f5948f ("misc: microchip: pci1xxxx: load auxiliary bus driver for the PIO function in the multi-function endpoint of pci1xxxx device.") Signed-off-by: Yongzhi Liu hyperlyzcs@gmail.com Link: https://lore.kernel.org/r/20240523121434.21855-3-hyperlyzcs@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/misc/mchp_pci1xxxx/mchp_pci1xxxx_gp.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/drivers/misc/mchp_pci1xxxx/mchp_pci1xxxx_gp.c +++ b/drivers/misc/mchp_pci1xxxx/mchp_pci1xxxx_gp.c @@ -111,6 +111,7 @@ static int gp_aux_bus_probe(struct pci_d
err_aux_dev_add_1: auxiliary_device_uninit(&aux_bus->aux_device_wrapper[1]->aux_dev); + goto err_aux_dev_add_0;
err_aux_dev_init_1: ida_free(&gp_client_ida, aux_bus->aux_device_wrapper[1]->aux_dev.id); @@ -120,6 +121,7 @@ err_ida_alloc_1:
err_aux_dev_add_0: auxiliary_device_uninit(&aux_bus->aux_device_wrapper[0]->aux_dev); + goto err_ret;
err_aux_dev_init_0: ida_free(&gp_client_ida, aux_bus->aux_device_wrapper[0]->aux_dev.id); @@ -127,6 +129,7 @@ err_aux_dev_init_0: err_ida_alloc_0: kfree(aux_bus->aux_device_wrapper[0]);
+err_ret: return retval; }
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Benjamin Segall bsegall@google.com
commit b2747f108b8034271fd5289bd8f3a7003e0775a3 upstream.
This is a re-commit of
da05b143a308 ("x86/boot: Don't add the EFI stub to targets")
after the tagged patch incorrectly reverted it.
vmlinux-objs-y is added to targets, with an assumption that they are all relative to $(obj); adding a $(objtree)/drivers/... path causes the build to incorrectly create a useless arch/x86/boot/compressed/drivers/... directory tree.
Fix this just by using a different make variable for the EFI stub.
Fixes: cb8bda8ad443 ("x86/boot/compressed: Rename efi_thunk_64.S to efi-mixed.S") Signed-off-by: Ben Segall bsegall@google.com Signed-off-by: Borislav Petkov (AMD) bp@alien8.de Reviewed-by: Ard Biesheuvel ardb@kernel.org Cc: stable@vger.kernel.org # v6.1+ Link: https://lore.kernel.org/r/xm267ceukksz.fsf@bsegall.svl.corp.google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/boot/compressed/Makefile | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/arch/x86/boot/compressed/Makefile +++ b/arch/x86/boot/compressed/Makefile @@ -115,9 +115,9 @@ vmlinux-objs-$(CONFIG_INTEL_TDX_GUEST) +
vmlinux-objs-$(CONFIG_EFI) += $(obj)/efi.o vmlinux-objs-$(CONFIG_EFI_MIXED) += $(obj)/efi_mixed.o -vmlinux-objs-$(CONFIG_EFI_STUB) += $(objtree)/drivers/firmware/efi/libstub/lib.a +vmlinux-libs-$(CONFIG_EFI_STUB) += $(objtree)/drivers/firmware/efi/libstub/lib.a
-$(obj)/vmlinux: $(vmlinux-objs-y) FORCE +$(obj)/vmlinux: $(vmlinux-objs-y) $(vmlinux-libs-y) FORCE $(call if_changed,ld)
OBJCOPYFLAGS_vmlinux.bin := -R .comment -S
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: David Lechner dlechner@baylibre.com
commit 8a01ef749b0a632f0e1f4ead0f08b3310d99fcb1 upstream.
According to the IIO documentation, the sign in the scan type should be lower case. The ad9467 driver was incorrectly using upper case.
Fix by changing to lower case.
Fixes: 4606d0f4b05f ("iio: adc: ad9467: add support for AD9434 high-speed ADC") Fixes: ad6797120238 ("iio: adc: ad9467: add support AD9467 ADC") Signed-off-by: David Lechner dlechner@baylibre.com Link: https://lore.kernel.org/r/20240503-ad9467-fix-scan-type-sign-v1-1-c7a1a066eb... Cc: Stable@vger.kernel.org Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/iio/adc/ad9467.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/iio/adc/ad9467.c +++ b/drivers/iio/adc/ad9467.c @@ -223,11 +223,11 @@ static void __ad9467_get_scale(struct ad }
static const struct iio_chan_spec ad9434_channels[] = { - AD9467_CHAN(0, 0, 12, 'S'), + AD9467_CHAN(0, 0, 12, 's'), };
static const struct iio_chan_spec ad9467_channels[] = { - AD9467_CHAN(0, 0, 16, 'S'), + AD9467_CHAN(0, 0, 16, 's'), };
static const struct ad9467_chip_info ad9467_chip_tbl[] = {
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Marc Ferland marc.ferland@sonatest.com
commit 279428df888319bf68f2686934897301a250bb84 upstream.
The scale value for the temperature channel is (assuming Vref=2.5 and the datasheet):
376.7897513
When calculating both val and val2 for the temperature scale we use (3767897513/25) and multiply it by Vref (here I assume 2500mV) to obtain:
2500 * (3767897513/25) ==> 376789751300
Finally we divide with remainder by 10^9 to get:
val = 376 val2 = 789751300
However, we return IIO_VAL_INT_PLUS_MICRO (should have been NANO) as the scale type. So when converting the raw temperature value to the 'processed' temperature value we will get (assuming raw=810, offset=-753):
processed = (raw + offset) * scale_val = (810 + -753) * 376 = 21432
processed += div((raw + offset) * scale_val2, 10^6)
           += div((810 + -753) * 789751300, 10^6)
           += 45015
==> 66447 ==> 66.4 Celsius
instead of the expected 21.5 Celsius.
Fix this issue by changing IIO_VAL_INT_PLUS_MICRO to IIO_VAL_INT_PLUS_NANO.
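For reference, a small userspace sketch (not driver code) that redoes the arithmetic above under the same assumptions (Vref = 2500 mV, raw = 810, offset = -753) and shows how the two interpretations of the same val/val2 pair diverge:

  #include <stdio.h>
  #include <stdint.h>

  int main(void)
  {
          /* same integer math as the driver: tmp = vref_mv * (3767897513 / 25) */
          int64_t tmp = 2500LL * (3767897513LL / 25LL);
          int64_t val = tmp / 1000000000LL;       /* integer part      */
          int64_t val2 = tmp % 1000000000LL;      /* fractional digits */
          int raw = 810, offset = -753;

          double scale_nano = (double)val + (double)val2 / 1e9;  /* NANO: correct        */
          double scale_micro = (double)val + (double)val2 / 1e6; /* MICRO: what was used */

          /* the processed value is in millidegrees Celsius */
          printf("NANO : %.1f C\n", (raw + offset) * scale_nano / 1000.0);
          printf("MICRO: %.1f C\n", (raw + offset) * scale_micro / 1000.0);
          return 0;
  }

Compiled and run, this prints roughly 21.5 C for the NANO interpretation and 66.4 C for the MICRO one, matching the numbers above.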
Fixes: 56ca9db862bf ("iio: dac: Add support for the AD5592R/AD5593R ADCs/DACs") Signed-off-by: Marc Ferland marc.ferland@sonatest.com Link: https://lore.kernel.org/r/20240501150554.1871390-1-marc.ferland@sonatest.com Cc: Stable@vger.kernel.org Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/iio/dac/ad5592r-base.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/iio/dac/ad5592r-base.c +++ b/drivers/iio/dac/ad5592r-base.c @@ -410,7 +410,7 @@ static int ad5592r_read_raw(struct iio_d s64 tmp = *val * (3767897513LL / 25LL); *val = div_s64_rem(tmp, 1000000000LL, val2);
- return IIO_VAL_INT_PLUS_MICRO; + return IIO_VAL_INT_PLUS_NANO; }
mutex_lock(&st->lock);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jean-Baptiste Maneyrol jean-baptiste.maneyrol@tdk.com
commit 245f3b149e6cc3ac6ee612cdb7042263bfc9e73c upstream.
The watermark update will be done inside the hwfifo_set_watermark callback just after update_scan_mode, so it is useless to do it here.
Fixes: 7f85e42a6c54 ("iio: imu: inv_icm42600: add buffer support in iio devices") Cc: stable@vger.kernel.org Signed-off-by: Jean-Baptiste Maneyrol jean-baptiste.maneyrol@tdk.com Link: https://lore.kernel.org/r/20240527210008.612932-1-inv.git-commit@tdk.com Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/iio/imu/inv_icm42600/inv_icm42600_accel.c | 4 ---- drivers/iio/imu/inv_icm42600/inv_icm42600_gyro.c | 4 ---- 2 files changed, 8 deletions(-)
--- a/drivers/iio/imu/inv_icm42600/inv_icm42600_accel.c +++ b/drivers/iio/imu/inv_icm42600/inv_icm42600_accel.c @@ -128,10 +128,6 @@ static int inv_icm42600_accel_update_sca /* update data FIFO write */ inv_icm42600_timestamp_apply_odr(ts, 0, 0, 0); ret = inv_icm42600_buffer_set_fifo_en(st, fifo_en | st->fifo.en); - if (ret) - goto out_unlock; - - ret = inv_icm42600_buffer_update_watermark(st);
out_unlock: mutex_unlock(&st->lock); --- a/drivers/iio/imu/inv_icm42600/inv_icm42600_gyro.c +++ b/drivers/iio/imu/inv_icm42600/inv_icm42600_gyro.c @@ -128,10 +128,6 @@ static int inv_icm42600_gyro_update_scan /* update data FIFO write */ inv_icm42600_timestamp_apply_odr(ts, 0, 0, 0); ret = inv_icm42600_buffer_set_fifo_en(st, fifo_en | st->fifo.en); - if (ret) - goto out_unlock; - - ret = inv_icm42600_buffer_update_watermark(st);
out_unlock: mutex_unlock(&st->lock);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dirk Behme dirk.behme@de.bosch.com
commit c0a40097f0bc81deafc15f9195d1fb54595cd6d0 upstream.
Synchronize the dev->driver usage in really_probe() and dev_uevent(). These can run in different threads, which can result in the following race condition on dev->driver uninitialization:
Thread #1: ==========
really_probe()
{
  ...
probe_failed:
  ...
  device_unbind_cleanup(dev)
  {
    ...
    dev->driver = NULL;   // <= Failed probe sets dev->driver to NULL
    ...
  }
  ...
}
Thread #2: ==========
dev_uevent()
{
  ...
  if (dev->driver)
    // If dev->driver is NULLed from really_probe() from here on,
    // after above check, the system crashes
    add_uevent_var(env, "DRIVER=%s", dev->driver->name);
  ...
}
really_probe() already holds the lock, so nothing needs to be done there. dev_uevent() is often called with the lock held, too, but not always, which implies that we can't add any locking in dev_uevent() itself. So fix this race by adding the lock to the non-protected path. This is the path where the above race is observed:
 dev_uevent+0x235/0x380
 uevent_show+0x10c/0x1f0   <= Add lock here
 dev_attr_show+0x3a/0xa0
 sysfs_kf_seq_show+0x17c/0x250
 kernfs_seq_show+0x7c/0x90
 seq_read_iter+0x2d7/0x940
 kernfs_fop_read_iter+0xc6/0x310
 vfs_read+0x5bc/0x6b0
 ksys_read+0xeb/0x1b0
 __x64_sys_read+0x42/0x50
 x64_sys_call+0x27ad/0x2d30
 do_syscall_64+0xcd/0x1d0
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
Similar cases are reported by syzkaller in
https://syzkaller.appspot.com/bug?extid=ffa8143439596313a85a
But these are regarding the *initialization* of dev->driver
dev->driver = drv;
As this switches dev->driver to a non-NULL value, these reports can be considered false positives (though they should be "fixed" by this commit as well).
The same issue was reported and tried to be fixed back in 2015 in
https://lore.kernel.org/lkml/1421259054-2574-1-git-send-email-a.sangwan@sams...
already.
Fixes: 239378f16aa1 ("Driver core: add uevent vars for devices of a class") Cc: stable stable@kernel.org Cc: syzbot+ffa8143439596313a85a@syzkaller.appspotmail.com Cc: Ashish Sangwan a.sangwan@samsung.com Cc: Namjae Jeon namjae.jeon@samsung.com Signed-off-by: Dirk Behme dirk.behme@de.bosch.com Link: https://lore.kernel.org/r/20240513050634.3964461-1-dirk.behme@de.bosch.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/base/core.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/drivers/base/core.c +++ b/drivers/base/core.c @@ -2657,8 +2657,11 @@ static ssize_t uevent_show(struct device if (!env) return -ENOMEM;
+ /* Synchronize with really_probe() */ + device_lock(dev); /* let the kset specific function add its keys */ retval = kset->uevent_ops->uevent(&dev->kobj, env); + device_unlock(dev); if (retval) goto out;
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jani Nikula jani.nikula@intel.com
commit 38e3825631b1f314b21e3ade00b5a4d737eb054e upstream.
The duplicated EDID is never freed. Fix it.
Cc: stable@vger.kernel.org Signed-off-by: Jani Nikula jani.nikula@intel.com Signed-off-by: Inki Dae inki.dae@samsung.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/exynos/exynos_drm_vidi.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
--- a/drivers/gpu/drm/exynos/exynos_drm_vidi.c +++ b/drivers/gpu/drm/exynos/exynos_drm_vidi.c @@ -309,6 +309,7 @@ static int vidi_get_modes(struct drm_con struct vidi_context *ctx = ctx_from_connector(connector); struct edid *edid; int edid_len; + int count;
/* * the edid data comes from user side and it would be set @@ -328,7 +329,11 @@ static int vidi_get_modes(struct drm_con
drm_connector_update_edid_property(connector, edid);
- return drm_add_edid_modes(connector, edid); + count = drm_add_edid_modes(connector, edid); + + kfree(edid); + + return count; }
static const struct drm_connector_helper_funcs vidi_connector_helper_funcs = {
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Marek Szyprowski m.szyprowski@samsung.com
commit 799d4b392417ed6889030a5b2335ccb6dcf030ab upstream.
When reading the EDID fails and the driver reports no modes available, the DRM core adds an artificial 1024x768 mode to the connector. Unfortunately some variants of the Exynos HDMI (like the one in Exynos4 SoCs) are not able to drive such a mode, so report a safe 640x480 mode instead of nothing in case of an EDID reading failure.
This fixes the following issue observed on Trats2 board since commit 13d5b040363c ("drm/exynos: do not return negative values from .get_modes()"):
[drm] Exynos DRM: using 11c00000.fimd device for DMA mapping operations exynos-drm exynos-drm: bound 11c00000.fimd (ops fimd_component_ops) exynos-drm exynos-drm: bound 12c10000.mixer (ops mixer_component_ops) exynos-dsi 11c80000.dsi: [drm:samsung_dsim_host_attach] Attached s6e8aa0 device (lanes:4 bpp:24 mode-flags:0x10b) exynos-drm exynos-drm: bound 11c80000.dsi (ops exynos_dsi_component_ops) exynos-drm exynos-drm: bound 12d00000.hdmi (ops hdmi_component_ops) [drm] Initialized exynos 1.1.0 20180330 for exynos-drm on minor 1 exynos-hdmi 12d00000.hdmi: [drm:hdmiphy_enable.part.0] *ERROR* PLL could not reach steady state panel-samsung-s6e8aa0 11c80000.dsi.0: ID: 0xa2, 0x20, 0x8c exynos-mixer 12c10000.mixer: timeout waiting for VSYNC ------------[ cut here ]------------ WARNING: CPU: 1 PID: 11 at drivers/gpu/drm/drm_atomic_helper.c:1682 drm_atomic_helper_wait_for_vblanks.part.0+0x2b0/0x2b8 [CRTC:70:crtc-1] vblank wait timed out Modules linked in: CPU: 1 PID: 11 Comm: kworker/u16:0 Not tainted 6.9.0-rc5-next-20240424 #14913 Hardware name: Samsung Exynos (Flattened Device Tree) Workqueue: events_unbound deferred_probe_work_func Call trace: unwind_backtrace from show_stack+0x10/0x14 show_stack from dump_stack_lvl+0x68/0x88 dump_stack_lvl from __warn+0x7c/0x1c4 __warn from warn_slowpath_fmt+0x11c/0x1a8 warn_slowpath_fmt from drm_atomic_helper_wait_for_vblanks.part.0+0x2b0/0x2b8 drm_atomic_helper_wait_for_vblanks.part.0 from drm_atomic_helper_commit_tail_rpm+0x7c/0x8c drm_atomic_helper_commit_tail_rpm from commit_tail+0x9c/0x184 commit_tail from drm_atomic_helper_commit+0x168/0x190 drm_atomic_helper_commit from drm_atomic_commit+0xb4/0xe0 drm_atomic_commit from drm_client_modeset_commit_atomic+0x23c/0x27c drm_client_modeset_commit_atomic from drm_client_modeset_commit_locked+0x60/0x1cc drm_client_modeset_commit_locked from drm_client_modeset_commit+0x24/0x40 drm_client_modeset_commit from __drm_fb_helper_restore_fbdev_mode_unlocked+0x9c/0xc4 __drm_fb_helper_restore_fbdev_mode_unlocked from drm_fb_helper_set_par+0x2c/0x3c drm_fb_helper_set_par from fbcon_init+0x3d8/0x550 fbcon_init from visual_init+0xc0/0x108 visual_init from do_bind_con_driver+0x1b8/0x3a4 do_bind_con_driver from do_take_over_console+0x140/0x1ec do_take_over_console from do_fbcon_takeover+0x70/0xd0 do_fbcon_takeover from fbcon_fb_registered+0x19c/0x1ac fbcon_fb_registered from register_framebuffer+0x190/0x21c register_framebuffer from __drm_fb_helper_initial_config_and_unlock+0x350/0x574 __drm_fb_helper_initial_config_and_unlock from exynos_drm_fbdev_client_hotplug+0x6c/0xb0 exynos_drm_fbdev_client_hotplug from drm_client_register+0x58/0x94 drm_client_register from exynos_drm_bind+0x160/0x190 exynos_drm_bind from try_to_bring_up_aggregate_device+0x200/0x2d8 try_to_bring_up_aggregate_device from __component_add+0xb0/0x170 __component_add from mixer_probe+0x74/0xcc mixer_probe from platform_probe+0x5c/0xb8 platform_probe from really_probe+0xe0/0x3d8 really_probe from __driver_probe_device+0x9c/0x1e4 __driver_probe_device from driver_probe_device+0x30/0xc0 driver_probe_device from __device_attach_driver+0xa8/0x120 __device_attach_driver from bus_for_each_drv+0x80/0xcc bus_for_each_drv from __device_attach+0xac/0x1fc __device_attach from bus_probe_device+0x8c/0x90 bus_probe_device from deferred_probe_work_func+0x98/0xe0 deferred_probe_work_func from process_one_work+0x240/0x6d0 process_one_work from worker_thread+0x1a0/0x3f4 worker_thread from kthread+0x104/0x138 kthread from ret_from_fork+0x14/0x28 Exception stack(0xf0895fb0 to 
0xf0895ff8) ... irq event stamp: 82357 hardirqs last enabled at (82363): [<c01a96e8>] vprintk_emit+0x308/0x33c hardirqs last disabled at (82368): [<c01a969c>] vprintk_emit+0x2bc/0x33c softirqs last enabled at (81614): [<c0101644>] __do_softirq+0x320/0x500 softirqs last disabled at (81609): [<c012dfe0>] __irq_exit_rcu+0x130/0x184 ---[ end trace 0000000000000000 ]--- exynos-drm exynos-drm: [drm] *ERROR* flip_done timed out exynos-drm exynos-drm: [drm] *ERROR* [CRTC:70:crtc-1] commit wait timed out exynos-drm exynos-drm: [drm] *ERROR* flip_done timed out exynos-drm exynos-drm: [drm] *ERROR* [CONNECTOR:74:HDMI-A-1] commit wait timed out exynos-drm exynos-drm: [drm] *ERROR* flip_done timed out exynos-drm exynos-drm: [drm] *ERROR* [PLANE:56:plane-5] commit wait timed out exynos-mixer 12c10000.mixer: timeout waiting for VSYNC
Cc: stable@vger.kernel.org Fixes: 13d5b040363c ("drm/exynos: do not return negative values from .get_modes()") Signed-off-by: Marek Szyprowski m.szyprowski@samsung.com Signed-off-by: Inki Dae inki.dae@samsung.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/exynos/exynos_hdmi.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-)
--- a/drivers/gpu/drm/exynos/exynos_hdmi.c +++ b/drivers/gpu/drm/exynos/exynos_hdmi.c @@ -887,11 +887,11 @@ static int hdmi_get_modes(struct drm_con int ret;
if (!hdata->ddc_adpt) - return 0; + goto no_edid;
edid = drm_get_edid(connector, hdata->ddc_adpt); if (!edid) - return 0; + goto no_edid;
hdata->dvi_mode = !connector->display_info.is_hdmi; DRM_DEV_DEBUG_KMS(hdata->dev, "%s : width[%d] x height[%d]\n", @@ -906,6 +906,9 @@ static int hdmi_get_modes(struct drm_con kfree(edid);
return ret; + +no_edid: + return drm_add_modes_noedid(connector, 640, 480); }
static int hdmi_find_phy_conf(struct hdmi_context *hdata, u32 pixel_clock)
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Paolo Abeni pabeni@redhat.com
commit 8031b58c3a9b1db3ef68b3bd749fbee2e1e1aaa3 upstream.
This is strictly related to commit fb7a0d334894 ("mptcp: ensure snd_nxt is properly initialized on connect"). It turns out that syzkaller can trigger the retransmit after fallback and before processing any other incoming packet - so that snd_una is still left uninitialized.
Address the issue explicitly initializing snd_una together with snd_nxt and write_seq.
Suggested-by: Mat Martineau martineau@kernel.org Fixes: 8fd738049ac3 ("mptcp: fallback in case of simultaneous connect") Cc: stable@vger.kernel.org Reported-by: Christoph Paasch cpaasch@apple.com Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/485 Signed-off-by: Paolo Abeni pabeni@redhat.com Reviewed-by: Mat Martineau martineau@kernel.org Signed-off-by: Matthieu Baerts (NGI0) matttbe@kernel.org Link: https://lore.kernel.org/r/20240607-upstream-net-20240607-misc-fixes-v1-1-1ab... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/mptcp/protocol.c | 1 + 1 file changed, 1 insertion(+)
--- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -3757,6 +3757,7 @@ static int mptcp_connect(struct sock *sk
WRITE_ONCE(msk->write_seq, subflow->idsn); WRITE_ONCE(msk->snd_nxt, subflow->idsn); + WRITE_ONCE(msk->snd_una, subflow->idsn); if (likely(!__mptcp_check_fallback(msk))) MPTCP_INC_STATS(sock_net(sk), MPTCP_MIB_MPCAPABLEACTIVE);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: YonglongLi liyonglong@chinatelecom.cn
commit 6a09788c1a66e3d8b04b3b3e7618cc817bb60ae9 upstream.
The RmAddr MIB counter is supposed to be incremented once when a valid RM_ADDR has been received. Before this patch, it could have been incremented as many times as the number of subflows connected to the linked address ID, so it could have been 0, 1 or more than 1.
The "RmSubflow" is incremented after a local operation. In this case, it is normal to tied it with the number of subflows that have been actually removed.
The "remove invalid addresses" MP Join subtest has been modified to validate this case. A broadcast IP address is now used instead: the client will not be able to create a subflow to this address. The consequence is that when receiving the RM_ADDR with the ID attached to this broadcast IP address, no subflow linked to this ID will be found.
Fixes: 7a7e52e38a40 ("mptcp: add RM_ADDR related mibs") Cc: stable@vger.kernel.org Co-developed-by: Matthieu Baerts (NGI0) matttbe@kernel.org Signed-off-by: Matthieu Baerts (NGI0) matttbe@kernel.org Signed-off-by: YonglongLi liyonglong@chinatelecom.cn Signed-off-by: Matthieu Baerts (NGI0) matttbe@kernel.org Link: https://lore.kernel.org/r/20240607-upstream-net-20240607-misc-fixes-v1-2-1ab... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/mptcp/pm_netlink.c | 5 ++++- tools/testing/selftests/net/mptcp/mptcp_join.sh | 3 ++- 2 files changed, 6 insertions(+), 2 deletions(-)
--- a/net/mptcp/pm_netlink.c +++ b/net/mptcp/pm_netlink.c @@ -820,10 +820,13 @@ static void mptcp_pm_nl_rm_addr_or_subfl spin_lock_bh(&msk->pm.lock);
removed = true; - __MPTCP_INC_STATS(sock_net(sk), rm_type); + if (rm_type == MPTCP_MIB_RMSUBFLOW) + __MPTCP_INC_STATS(sock_net(sk), rm_type); } if (rm_type == MPTCP_MIB_RMSUBFLOW) __set_bit(rm_id ? rm_id : msk->mpc_endpoint_id, msk->pm.id_avail_bitmap); + else if (rm_type == MPTCP_MIB_RMADDR) + __MPTCP_INC_STATS(sock_net(sk), rm_type); if (!removed) continue;
--- a/tools/testing/selftests/net/mptcp/mptcp_join.sh +++ b/tools/testing/selftests/net/mptcp/mptcp_join.sh @@ -2344,7 +2344,8 @@ remove_tests() pm_nl_set_limits $ns1 3 3 pm_nl_add_endpoint $ns1 10.0.12.1 flags signal pm_nl_add_endpoint $ns1 10.0.3.1 flags signal - pm_nl_add_endpoint $ns1 10.0.14.1 flags signal + # broadcast IP: no packet for this address will be received on ns1 + pm_nl_add_endpoint $ns1 224.0.0.1 flags signal pm_nl_set_limits $ns2 3 3 run_tests $ns1 $ns2 10.0.1.1 0 -3 0 speed_10 chk_join_nr 1 1 1
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Hagar Hemdan hagarhem@amazon.com
commit b97e8a2f7130a4b30d1502003095833d16c028b3 upstream.
its_vlpi_prop_update() calls lpi_write_config(), which obtains the mapping information for a VLPI without the lock held, so it could race with its_vlpi_unmap().
Since all calls from its_irq_set_vcpu_affinity() require the same lock to be held, hoist the locking there instead of sprinkling the locking all over the place.
This bug was discovered using Coverity Static Analysis Security Testing (SAST) by Synopsys, Inc.
[ tglx: Use guard() instead of goto ]
Fixes: 015ec0386ab6 ("irqchip/gic-v3-its: Add VLPI configuration handling") Suggested-by: Marc Zyngier maz@kernel.org Signed-off-by: Hagar Hemdan hagarhem@amazon.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Cc: stable@vger.kernel.org Reviewed-by: Marc Zyngier maz@kernel.org Link: https://lore.kernel.org/r/20240531162144.28650-1-hagarhem@amazon.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/irqchip/irq-gic-v3-its.c | 44 ++++++++++----------------------------- 1 file changed, 12 insertions(+), 32 deletions(-)
--- a/drivers/irqchip/irq-gic-v3-its.c +++ b/drivers/irqchip/irq-gic-v3-its.c @@ -1837,28 +1837,22 @@ static int its_vlpi_map(struct irq_data { struct its_device *its_dev = irq_data_get_irq_chip_data(d); u32 event = its_get_event_id(d); - int ret = 0;
if (!info->map) return -EINVAL;
- raw_spin_lock(&its_dev->event_map.vlpi_lock); - if (!its_dev->event_map.vm) { struct its_vlpi_map *maps;
maps = kcalloc(its_dev->event_map.nr_lpis, sizeof(*maps), GFP_ATOMIC); - if (!maps) { - ret = -ENOMEM; - goto out; - } + if (!maps) + return -ENOMEM;
its_dev->event_map.vm = info->map->vm; its_dev->event_map.vlpi_maps = maps; } else if (its_dev->event_map.vm != info->map->vm) { - ret = -EINVAL; - goto out; + return -EINVAL; }
/* Get our private copy of the mapping information */ @@ -1890,46 +1884,32 @@ static int its_vlpi_map(struct irq_data its_dev->event_map.nr_vlpis++; }
-out: - raw_spin_unlock(&its_dev->event_map.vlpi_lock); - return ret; + return 0; }
static int its_vlpi_get(struct irq_data *d, struct its_cmd_info *info) { struct its_device *its_dev = irq_data_get_irq_chip_data(d); struct its_vlpi_map *map; - int ret = 0; - - raw_spin_lock(&its_dev->event_map.vlpi_lock);
map = get_vlpi_map(d);
- if (!its_dev->event_map.vm || !map) { - ret = -EINVAL; - goto out; - } + if (!its_dev->event_map.vm || !map) + return -EINVAL;
/* Copy our mapping information to the incoming request */ *info->map = *map;
-out: - raw_spin_unlock(&its_dev->event_map.vlpi_lock); - return ret; + return 0; }
static int its_vlpi_unmap(struct irq_data *d) { struct its_device *its_dev = irq_data_get_irq_chip_data(d); u32 event = its_get_event_id(d); - int ret = 0;
- raw_spin_lock(&its_dev->event_map.vlpi_lock); - - if (!its_dev->event_map.vm || !irqd_is_forwarded_to_vcpu(d)) { - ret = -EINVAL; - goto out; - } + if (!its_dev->event_map.vm || !irqd_is_forwarded_to_vcpu(d)) + return -EINVAL;
/* Drop the virtual mapping */ its_send_discard(its_dev, event); @@ -1953,9 +1933,7 @@ static int its_vlpi_unmap(struct irq_dat kfree(its_dev->event_map.vlpi_maps); }
-out: - raw_spin_unlock(&its_dev->event_map.vlpi_lock); - return ret; + return 0; }
static int its_vlpi_prop_update(struct irq_data *d, struct its_cmd_info *info) @@ -1983,6 +1961,8 @@ static int its_irq_set_vcpu_affinity(str if (!is_v4(its_dev->its)) return -EINVAL;
+ guard(raw_spinlock_irq)(&its_dev->event_map.vlpi_lock); + /* Unmap request? */ if (!info) return its_vlpi_unmap(d);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yazen Ghannam yazen.ghannam@amd.com
commit c625dabbf1c4a8e77e4734014f2fde7aa9071a1f upstream.
AMD Zen-based systems use a System Management Network (SMN) that provides access to implementation-specific registers.
SMN accesses are done indirectly through an index/data pair in PCI config space. The PCI config access may fail and return an error code. This would prevent the "read" value from being updated.
However, the PCI config access may succeed while the returned value is invalid. This is similar to bad PCI reads, which return all bits set.
Most systems will return 0 for SMN addresses that are not accessible. This is in line with AMD convention that unavailable registers are Read-as-Zero/Writes-Ignored.
However, some systems will return a "PCI Error Response" instead. This value, along with an error code of 0 from the PCI config access, will confuse callers of the amd_smn_read() function.
Check for this condition, clear the return value, and set a proper error code.
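As a minimal userspace illustration of the convention being checked for here (a sketch of the idea, not the kernel's PCI_POSSIBLE_ERROR() helper itself), an error response reads back as all bits set:

  #include <stdio.h>
  #include <stdint.h>

  /* all ones from config space may be an error response, not register data */
  static int smn_value_suspect(uint32_t val)
  {
          return val == 0xffffffffu;
  }

  int main(void)
  {
          printf("0xffffffff suspect: %d\n", smn_value_suspect(0xffffffffu));
          printf("0x00000000 suspect: %d\n", smn_value_suspect(0x00000000u));
          return 0;
  }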
Fixes: ddfe43cdc0da ("x86/amd_nb: Add SMN and Indirect Data Fabric access for AMD Fam17h") Signed-off-by: Yazen Ghannam yazen.ghannam@amd.com Signed-off-by: Borislav Petkov (AMD) bp@alien8.de Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230403164244.471141-1-yazen.ghannam@amd.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kernel/amd_nb.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-)
--- a/arch/x86/kernel/amd_nb.c +++ b/arch/x86/kernel/amd_nb.c @@ -195,7 +195,14 @@ out:
int amd_smn_read(u16 node, u32 address, u32 *value) { - return __amd_smn_rw(node, address, value, false); + int err = __amd_smn_rw(node, address, value, false); + + if (PCI_POSSIBLE_ERROR(*value)) { + err = -ENODEV; + *value = 0; + } + + return err; } EXPORT_SYMBOL_GPL(amd_smn_read);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Haifeng Xu haifeng.xu@shopee.com
commit 74751ef5c1912ebd3e65c3b65f45587e05ce5d36 upstream.
In our production environment, we found many hung tasks which are blocked for more than 18 hours. Their call traces are like this:
[346278.191038] __schedule+0x2d8/0x890
[346278.191046] schedule+0x4e/0xb0
[346278.191049] perf_event_free_task+0x220/0x270
[346278.191056] ? init_wait_var_entry+0x50/0x50
[346278.191060] copy_process+0x663/0x18d0
[346278.191068] kernel_clone+0x9d/0x3d0
[346278.191072] __do_sys_clone+0x5d/0x80
[346278.191076] __x64_sys_clone+0x25/0x30
[346278.191079] do_syscall_64+0x5c/0xc0
[346278.191083] ? syscall_exit_to_user_mode+0x27/0x50
[346278.191086] ? do_syscall_64+0x69/0xc0
[346278.191088] ? irqentry_exit_to_user_mode+0x9/0x20
[346278.191092] ? irqentry_exit+0x19/0x30
[346278.191095] ? exc_page_fault+0x89/0x160
[346278.191097] ? asm_exc_page_fault+0x8/0x30
[346278.191102] entry_SYSCALL_64_after_hwframe+0x44/0xae
The task was waiting for the refcount to become 1, but from the vmcore we found that the refcount was already 1. It seems that the task didn't get woken up by perf_event_release_kernel() and got stuck forever. The scenario below may cause the problem.
  Thread A                                  Thread B
  ...                                       ...
  perf_event_free_task                      perf_event_release_kernel
                                              ...
                                              acquire event->child_mutex
                                              ...
                                              get_ctx
                                              ...
                                              release event->child_mutex
    acquire ctx->mutex
    ...
    perf_free_event (acquire/release event->child_mutex)
    ...
    release ctx->mutex
    wait_var_event
                                              acquire ctx->mutex
                                              acquire event->child_mutex
                                              # move existing events to free_list
                                              release event->child_mutex
                                              release ctx->mutex
                                              put_ctx
  ...                                       ...
In this case, all events of the ctx have been freed, so we couldn't find the ctx in free_list and Thread A will miss the wakeup. It's thus necessary to add a wakeup after dropping the reference.
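A heavily simplified sketch of the wait/wake pairing involved (not the actual perf code paths, just the shape of the fix): the waiter only re-evaluates its condition when someone issues a wake-up on the same variable, so the path that drops the reference must also wake.

  /* waiter side, e.g. perf_event_free_task(): sleep until only one reference remains */
  wait_var_event(&ctx->refcount, refcount_read(&ctx->refcount) == 1);

  /* releaser side: after dropping a reference the waiter may be waiting on,
   * make sure it re-checks its condition */
  put_ctx(ctx);
  smp_mb();                    /* pairs with wait_var_event() */
  wake_up_var(&ctx->refcount);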
Fixes: 1cf8dfe8a661 ("perf/core: Fix race between close() and fork()") Signed-off-by: Haifeng Xu haifeng.xu@shopee.com Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Reviewed-by: Frederic Weisbecker frederic@kernel.org Acked-by: Mark Rutland mark.rutland@arm.com Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20240513103948.33570-1-haifeng.xu@shopee.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/events/core.c | 13 +++++++++++++ 1 file changed, 13 insertions(+)
--- a/kernel/events/core.c +++ b/kernel/events/core.c @@ -5182,6 +5182,7 @@ int perf_event_release_kernel(struct per again: mutex_lock(&event->child_mutex); list_for_each_entry(child, &event->child_list, child_list) { + void *var = NULL;
/* * Cannot change, child events are not migrated, see the @@ -5222,11 +5223,23 @@ again: * this can't be the last reference. */ put_event(event); + } else { + var = &ctx->refcount; }
mutex_unlock(&event->child_mutex); mutex_unlock(&ctx->mutex); put_ctx(ctx); + + if (var) { + /* + * If perf_event_free_task() has deleted all events from the + * ctx while the child_mutex got released above, make sure to + * notify about the preceding put_ctx(). + */ + smp_mb(); /* pairs with wait_var_event() */ + wake_up_var(var); + } goto again; } mutex_unlock(&event->child_mutex);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nam Cao namcao@linutronix.de
commit 994af1825a2aa286f4903ff64a1c7378b52defe6 upstream.
On riscv32, it is possible for the last page in virtual address space (0xfffff000) to be allocated. This page overlaps with PTR_ERR, so that shouldn't happen.
There is already some code to ensure memblock won't allocate the last page. However, the buddy allocator is left unchecked.
Fix this by reserving physical memory that would be mapped at virtual addresses greater than 0xfffff000.
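As an illustration (a userspace sketch assuming a 32-bit address space and the usual MAX_ERRNO of 4095, not the kernel's IS_ERR_VALUE() itself), most of that last page is indistinguishable from an encoded error pointer:

  #include <stdio.h>
  #include <stdint.h>

  #define MAX_ERRNO 4095u

  /* mirrors the IS_ERR_VALUE() idea: the top MAX_ERRNO addresses encode errors */
  static int is_err_value32(uint32_t addr)
  {
          return addr >= (uint32_t)-MAX_ERRNO;      /* i.e. >= 0xfffff001 */
  }

  int main(void)
  {
          uint32_t page = 0xfffff000;       /* last page of a 32-bit address space */

          printf("0x%08x -> looks like an error value? %d\n",
                 page, is_err_value32(page));
          printf("0x%08x -> looks like an error value? %d\n",
                 page + 0x800, is_err_value32(page + 0x800));
          return 0;
  }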
Reported-by: Björn Töpel bjorn@kernel.org Closes: https://lore.kernel.org/linux-riscv/878r1ibpdn.fsf@all.your.base.are.belong.... Fixes: 76d2a0493a17 ("RISC-V: Init and Halt Code") Signed-off-by: Nam Cao namcao@linutronix.de Cc: stable@vger.kernel.org Tested-by: Björn Töpel bjorn@rivosinc.com Reviewed-by: Björn Töpel bjorn@rivosinc.com Reviewed-by: Mike Rapoport (IBM) rppt@kernel.org Link: https://lore.kernel.org/r/20240425115201.3044202-1-namcao@linutronix.de Signed-off-by: Palmer Dabbelt palmer@rivosinc.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/riscv/mm/init.c | 21 +++++++++++---------- 1 file changed, 11 insertions(+), 10 deletions(-)
--- a/arch/riscv/mm/init.c +++ b/arch/riscv/mm/init.c @@ -213,18 +213,19 @@ static void __init setup_bootmem(void) if (!IS_ENABLED(CONFIG_XIP_KERNEL)) phys_ram_base = memblock_start_of_DRAM(); /* - * memblock allocator is not aware of the fact that last 4K bytes of - * the addressable memory can not be mapped because of IS_ERR_VALUE - * macro. Make sure that last 4k bytes are not usable by memblock - * if end of dram is equal to maximum addressable memory. For 64-bit - * kernel, this problem can't happen here as the end of the virtual - * address space is occupied by the kernel mapping then this check must - * be done as soon as the kernel mapping base address is determined. + * Reserve physical address space that would be mapped to virtual + * addresses greater than (void *)(-PAGE_SIZE) because: + * - This memory would overlap with ERR_PTR + * - This memory belongs to high memory, which is not supported + * + * This is not applicable to 64-bit kernel, because virtual addresses + * after (void *)(-PAGE_SIZE) are not linearly mapped: they are + * occupied by kernel mapping. Also it is unrealistic for high memory + * to exist on 64-bit platforms. */ if (!IS_ENABLED(CONFIG_64BIT)) { - max_mapped_addr = __pa(~(ulong)0); - if (max_mapped_addr == (phys_ram_end - 1)) - memblock_set_current_limit(max_mapped_addr - 4096); + max_mapped_addr = __va_to_pa_nodebug(-PAGE_SIZE); + memblock_reserve(max_mapped_addr, (phys_addr_t)-max_mapped_addr); }
min_low_pfn = PFN_UP(phys_ram_base);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Steven Rostedt (Google) rostedt@goodmis.org
commit 23a4b108accc29a6125ed14de4a044689ffeda78 upstream.
The kprobe_eventname.tc test checks if a function with .isra. can have a kprobe attached to it. It loops through the kallsyms file for all the functions that have the .isra. name, and checks if it exists in the available_filter_functions file, and if it does, it uses it to attach a kprobe to it.
The issue is that kprobes can not attach to functions that are listed more than once in available_filter_functions. With the latest kernel, the function that is found is: rapl_event_update.isra.0
  # grep rapl_event_update.isra.0 /sys/kernel/tracing/available_filter_functions
  rapl_event_update.isra.0
  rapl_event_update.isra.0
It is listed twice. This causes the kprobe attached to it to fail, which in turn fails the test. Instead of just picking the first function found in available_filter_functions, pick the first one that is listed only once in available_filter_functions.
Cc: stable@vger.kernel.org Fixes: 604e3548236d ("selftests/ftrace: Select an existing function in kprobe_eventname test") Signed-off-by: Steven Rostedt (Google) rostedt@goodmis.org Acked-by: Masami Hiramatsu (Google) mhiramat@kernel.org Signed-off-by: Shuah Khan skhan@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- tools/testing/selftests/ftrace/test.d/kprobe/kprobe_eventname.tc | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/tools/testing/selftests/ftrace/test.d/kprobe/kprobe_eventname.tc +++ b/tools/testing/selftests/ftrace/test.d/kprobe/kprobe_eventname.tc @@ -30,7 +30,8 @@ find_dot_func() { fi
grep " [tT] .*.isra..*" /proc/kallsyms | cut -f 3 -d " " | while read f; do - if grep -s $f available_filter_functions; then + cnt=`grep -s $f available_filter_functions | wc -l`; + if [ $cnt -eq 1 ]; then echo $f break fi
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Damien Le Moal dlemoal@kernel.org
commit 233e27b4d21c3e44eb863f03e566d3a22e81a7ae upstream.
When changing the maximum number of open zones, print that number instead of the total number of zones.
Fixes: dc4d137ee3b7 ("null_blk: add support for max open/active zone limit for zoned devices") Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal dlemoal@kernel.org Reviewed-by: Niklas Cassel cassel@kernel.org Link: https://lore.kernel.org/r/20240528062852.437599-1-dlemoal@kernel.org Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/block/null_blk/zoned.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/block/null_blk/zoned.c +++ b/drivers/block/null_blk/zoned.c @@ -112,7 +112,7 @@ int null_init_zoned_dev(struct nullb_dev if (dev->zone_max_active && dev->zone_max_open > dev->zone_max_active) { dev->zone_max_open = dev->zone_max_active; pr_info("changed the maximum number of open zones to %u\n", - dev->nr_zones); + dev->zone_max_open); } else if (dev->zone_max_open >= dev->nr_zones - dev->zone_nr_conv) { dev->zone_max_open = 0; pr_info("zone_max_open limit disabled, limit >= zone count\n");
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Thadeu Lima de Souza Cascardo cascardo@igalia.com
commit 4b4647add7d3c8530493f7247d11e257ee425bf0 upstream.
sk_psock_get will return NULL if the refcount of psock has gone to 0, which will happen when the last call of sk_psock_put is done. However, sk_psock_drop may not have finished yet, so the close callback will still point to sock_map_close despite psock being NULL.
This can be reproduced with a thread deleting an element from the sock map, while the second one creates a socket, adds it to the map and closes it.
That will trigger the WARN_ON_ONCE:
------------[ cut here ]------------ WARNING: CPU: 1 PID: 7220 at net/core/sock_map.c:1701 sock_map_close+0x2a2/0x2d0 net/core/sock_map.c:1701 Modules linked in: CPU: 1 PID: 7220 Comm: syz-executor380 Not tainted 6.9.0-syzkaller-07726-g3c999d1ae3c7 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 04/02/2024 RIP: 0010:sock_map_close+0x2a2/0x2d0 net/core/sock_map.c:1701 Code: df e8 92 29 88 f8 48 8b 1b 48 89 d8 48 c1 e8 03 42 80 3c 20 00 74 08 48 89 df e8 79 29 88 f8 4c 8b 23 eb 89 e8 4f 15 23 f8 90 <0f> 0b 90 48 83 c4 08 5b 41 5c 41 5d 41 5e 41 5f 5d e9 13 26 3d 02 RSP: 0018:ffffc9000441fda8 EFLAGS: 00010293 RAX: ffffffff89731ae1 RBX: ffffffff94b87540 RCX: ffff888029470000 RDX: 0000000000000000 RSI: ffffffff8bcab5c0 RDI: ffffffff8c1faba0 RBP: 0000000000000000 R08: ffffffff92f9b61f R09: 1ffffffff25f36c3 R10: dffffc0000000000 R11: fffffbfff25f36c4 R12: ffffffff89731840 R13: ffff88804b587000 R14: ffff88804b587000 R15: ffffffff89731870 FS: 000055555e080380(0000) GS:ffff8880b9500000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 00000000207d4000 CR4: 0000000000350ef0 Call Trace: <TASK> unix_release+0x87/0xc0 net/unix/af_unix.c:1048 __sock_release net/socket.c:659 [inline] sock_close+0xbe/0x240 net/socket.c:1421 __fput+0x42b/0x8a0 fs/file_table.c:422 __do_sys_close fs/open.c:1556 [inline] __se_sys_close fs/open.c:1541 [inline] __x64_sys_close+0x7f/0x110 fs/open.c:1541 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xf5/0x240 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7fb37d618070 Code: 00 00 48 c7 c2 b8 ff ff ff f7 d8 64 89 02 b8 ff ff ff ff eb d4 e8 10 2c 00 00 80 3d 31 f0 07 00 00 74 17 b8 03 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 48 c3 0f 1f 80 00 00 00 00 48 83 ec 18 89 7c RSP: 002b:00007ffcd4a525d8 EFLAGS: 00000202 ORIG_RAX: 0000000000000003 RAX: ffffffffffffffda RBX: 0000000000000005 RCX: 00007fb37d618070 RDX: 0000000000000010 RSI: 00000000200001c0 RDI: 0000000000000004 RBP: 0000000000000000 R08: 0000000100000000 R09: 0000000100000000 R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 </TASK>
Use sk_psock(), which will only check that the pointer has not been set to NULL yet, which should only happen after the callbacks are restored. If a reference can still be obtained after that, we may call sk_psock_stop and cancel psock->work.
As suggested by Paolo Abeni, reorder the condition so the control flow is less convoluted.
After that change, the reproducer does not trigger the WARN_ON_ONCE anymore.
Suggested-by: Paolo Abeni pabeni@redhat.com Reported-by: syzbot+07a2e4a1a57118ef7355@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=07a2e4a1a57118ef7355 Fixes: aadb2bb83ff7 ("sock_map: Fix a potential use-after-free in sock_map_close()") Fixes: 5b4a79ba65a1 ("bpf, sockmap: Don't let sock_map_{close,destroy,unhash} call itself") Cc: stable@vger.kernel.org Signed-off-by: Thadeu Lima de Souza Cascardo cascardo@igalia.com Acked-by: Jakub Sitnicki jakub@cloudflare.com Link: https://lore.kernel.org/r/20240524144702.1178377-1-cascardo@igalia.com Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/core/sock_map.c | 16 ++++++++++------ 1 file changed, 10 insertions(+), 6 deletions(-)
--- a/net/core/sock_map.c +++ b/net/core/sock_map.c @@ -1623,19 +1623,23 @@ void sock_map_close(struct sock *sk, lon
lock_sock(sk); rcu_read_lock(); - psock = sk_psock_get(sk); - if (unlikely(!psock)) { - rcu_read_unlock(); - release_sock(sk); - saved_close = READ_ONCE(sk->sk_prot)->close; - } else { + psock = sk_psock(sk); + if (likely(psock)) { saved_close = psock->saved_close; sock_map_remove_links(sk, psock); + psock = sk_psock_get(sk); + if (unlikely(!psock)) + goto no_psock; rcu_read_unlock(); sk_psock_stop(psock); release_sock(sk); cancel_delayed_work_sync(&psock->work); sk_psock_put(sk, psock); + } else { + saved_close = READ_ONCE(sk->sk_prot)->close; +no_psock: + rcu_read_unlock(); + release_sock(sk); }
/* Make sure we do not recurse. This is a bug.
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Hagar Gamal Halim Hemdan hagarhem@amazon.com
commit 8003f00d895310d409b2bf9ef907c56b42a4e0f4 upstream.
Coverity spotted that event_msg is controlled by user space: event_msg->event_data.event is passed to event_deliver() and used as an index without sanitization.
This change ensures that the event index is sanitized to mitigate any possibility of speculative information leaks.
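The pattern, sketched below with hypothetical names and a hypothetical bound (this is not the vmw_vmci code itself), is to bounds-check the untrusted index and then clamp it with array_index_nospec() so it cannot address out of range even under speculation:

  #include <linux/list.h>
  #include <linux/nospec.h>
  #include <linux/types.h>

  #define EVENT_MAX 32  /* hypothetical bound for this sketch */

  static struct list_head subscribers[EVENT_MAX];

  static struct list_head *lookup_subscribers(u32 event)
  {
          if (event >= EVENT_MAX)
                  return NULL;

          /* clamp so a mispredicted bounds check cannot index out of range */
          event = array_index_nospec(event, EVENT_MAX);
          return &subscribers[event];
  }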
This bug was discovered and resolved using Coverity Static Analysis Security Testing (SAST) by Synopsys, Inc.
Only compile tested, no access to HW.
Fixes: 1d990201f9bb ("VMCI: event handling implementation.") Cc: stable stable@kernel.org Signed-off-by: Hagar Gamal Halim Hemdan hagarhem@amazon.com Link: https://lore.kernel.org/stable/20231127193533.46174-1-hagarhem%40amazon.com Link: https://lore.kernel.org/r/20240430085916.4753-1-hagarhem@amazon.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/misc/vmw_vmci/vmci_event.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
--- a/drivers/misc/vmw_vmci/vmci_event.c +++ b/drivers/misc/vmw_vmci/vmci_event.c @@ -9,6 +9,7 @@ #include <linux/vmw_vmci_api.h> #include <linux/list.h> #include <linux/module.h> +#include <linux/nospec.h> #include <linux/sched.h> #include <linux/slab.h> #include <linux/rculist.h> @@ -86,9 +87,12 @@ static void event_deliver(struct vmci_ev { struct vmci_subscription *cur; struct list_head *subscriber_list; + u32 sanitized_event, max_vmci_event;
rcu_read_lock(); - subscriber_list = &subscriber_array[event_msg->event_data.event]; + max_vmci_event = ARRAY_SIZE(subscriber_array); + sanitized_event = array_index_nospec(event_msg->event_data.event, max_vmci_event); + subscriber_list = &subscriber_array[sanitized_event]; list_for_each_entry_rcu(cur, subscriber_list, node) { cur->callback(cur->id, &event_msg->event_data, cur->callback_data);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Vamshi Gajjela vamshigajjela@google.com
commit eda4923d78d634482227c0b189d9b7ca18824146 upstream.
The 'nr' member of struct spmi_controller serves as an identifier for the controller/bus. It is a dynamic ID assigned in spmi_controller_alloc, and overriding it from the driver results in an ida_free error "ida_free called for id=xx which is not allocated".
Signed-off-by: Vamshi Gajjela vamshigajjela@google.com Fixes: 70f59c90c819 ("staging: spmi: add Hikey 970 SPMI controller driver") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20240228185116.1269-1-vamshigajjela@google.com Signed-off-by: Stephen Boyd sboyd@kernel.org Link: https://lore.kernel.org/r/20240507210809.3479953-5-sboyd@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/spmi/hisi-spmi-controller.c | 1 - 1 file changed, 1 deletion(-)
--- a/drivers/spmi/hisi-spmi-controller.c +++ b/drivers/spmi/hisi-spmi-controller.c @@ -303,7 +303,6 @@ static int spmi_controller_probe(struct
spin_lock_init(&spmi_controller->lock);
- ctrl->nr = spmi_controller->channel; ctrl->dev.parent = pdev->dev.parent; ctrl->dev.of_node = of_node_get(pdev->dev.of_node);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Trond Myklebust trond.myklebust@hammerspace.com
commit e221c45da3770962418fb30c27d941bbc70d595a upstream.
The 'NFS error' NFSERR_OPNOTSUPP is not described by any of the official NFS related RFCs, but appears to have snuck into some older .x files for NFSv2. Either way, it is not in RFC1094, RFC1813 or any of the NFSv4 RFCs, so should not be returned by the knfsd server, and particularly not by the "LOOKUP" operation.
Instead, let's return NFSERR_STALE, which is more appropriate if the filesystem encodes the filehandle as FILEID_INVALID.
Cc: stable@vger.kernel.org Signed-off-by: Trond Myklebust trond.myklebust@hammerspace.com Signed-off-by: Chuck Lever chuck.lever@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/nfsd/nfsfh.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/fs/nfsd/nfsfh.c +++ b/fs/nfsd/nfsfh.c @@ -569,7 +569,7 @@ fh_compose(struct svc_fh *fhp, struct sv _fh_update(fhp, exp, dentry); if (fhp->fh_handle.fh_fileid_type == FILEID_INVALID) { fh_put(fhp); - return nfserr_opnotsupp; + return nfserr_stale; }
return 0; @@ -595,7 +595,7 @@ fh_update(struct svc_fh *fhp)
_fh_update(fhp, fhp->fh_export, dentry); if (fhp->fh_handle.fh_fileid_type == FILEID_INVALID) - return nfserr_opnotsupp; + return nfserr_stale; return 0; out_bad: printk(KERN_ERR "fh_update: fh not verified!\n");
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Rik van Riel riel@surriel.com
commit 5cbcb62dddf5346077feb82b7b0c9254222d3445 upstream.
While taking a kernel core dump with makedumpfile on a larger system, softlockup messages often appear.
While softlockup warnings can be harmless, they can also interfere with things like RCU freeing memory, which can be problematic when the kdump kexec image is configured with as little memory as possible.
Avoid the softlockup, and give things like work items and RCU a chance to do their thing during __read_vmcore by adding a cond_resched.
Link: https://lkml.kernel.org/r/20240507091858.36ff767f@imladris.surriel.com Signed-off-by: Rik van Riel riel@surriel.com Acked-by: Baoquan He bhe@redhat.com Cc: Dave Young dyoung@redhat.com Cc: Vivek Goyal vgoyal@redhat.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/proc/vmcore.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/fs/proc/vmcore.c +++ b/fs/proc/vmcore.c @@ -383,6 +383,8 @@ static ssize_t __read_vmcore(struct iov_ /* leave now if filled buffer already */ if (!iov_iter_count(iter)) return acc; + + cond_resched(); }
list_for_each_entry(m, &vmcore_list, list) {
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Su Yue glass.su@suse.com
commit b8cb324277ee16f3eca3055b96fce4735a5a41c6 upstream.
The default atime related mount option is '-o relatime', which means file atime should be updated if atime <= ctime or atime <= mtime. atime should be updated in the following scenario, but it is not:
==========================================================
$ rm /mnt/testfile;
$ echo test > /mnt/testfile
$ stat -c "%X %Y %Z" /mnt/testfile
1711881646 1711881646 1711881646
$ sleep 5
$ cat /mnt/testfile > /dev/null
$ stat -c "%X %Y %Z" /mnt/testfile
1711881646 1711881646 1711881646
==========================================================
And the reason the atime in the test is not updated is that ocfs2 calls ktime_get_real_ts64() in __ocfs2_mknod_locked() during file creation; later, inode_set_ctime_current() is called, and inode_set_ctime_current() uses ktime_get_coarse_real_ts64() to get the current time.
ktime_get_real_ts64() is more accurate than ktime_get_coarse_real_ts64(). On my test box, the ctime set via ktime_get_coarse_real_ts64() is less than the value returned earlier by ktime_get_real_ts64(), even though the ctime is set later. As a result, the ctime of the new inode is smaller than its atime.
The call trace is like:
ocfs2_create
  ocfs2_mknod
    __ocfs2_mknod_locked
      ....
      ktime_get_real_ts64              <------- set atime,ctime,mtime, more accurate
      ocfs2_populate_inode
      ...
      ocfs2_init_acl
        ocfs2_acl_set_mode
          inode_set_ctime_current
            current_time
              ktime_get_coarse_real_ts64   <------- less accurate

ocfs2_file_read_iter
  ocfs2_inode_lock_atime
    ocfs2_should_update_atime
      atime <= ctime ?                 <-------- false, ctime < atime due to accuracy
So call ktime_get_coarse_real_ts64() here to set the inode times with the coarser granularity when creating new files. This may lower the accuracy of the file times, but it is not a big deal since we already use the coarse time in other places, such as ocfs2_update_inode_atime() and inode_set_ctime_current().
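As a side note, the two clock helpers differ only in granularity; a minimal sketch (not ocfs2 code, function name hypothetical) of the pair being reconciled here:

#include <linux/timekeeping.h>

static void compare_clock_granularity(void)
{
	struct timespec64 fine, coarse;

	ktime_get_real_ts64(&fine);		/* clocksource-accurate wall time */
	ktime_get_coarse_real_ts64(&coarse);	/* tick-granularity wall time, the same
						 * source inode_set_ctime_current() ends
						 * up using */

	/*
	 * 'coarse' may lag 'fine' by up to a tick, which is exactly how a
	 * ctime written later with the coarse clock can end up smaller than
	 * an atime written earlier with the fine clock.
	 */
}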
Link: https://lkml.kernel.org/r/20240408082041.20925-5-glass.su@suse.com Fixes: c62c38f6b91b ("ocfs2: replace CURRENT_TIME macro") Signed-off-by: Su Yue glass.su@suse.com Reviewed-by: Joseph Qi joseph.qi@linux.alibaba.com Cc: Mark Fasheh mark@fasheh.com Cc: Joel Becker jlbec@evilplan.org Cc: Junxiao Bi junxiao.bi@oracle.com Cc: Changwei Ge gechangwei@live.cn Cc: Gang He ghe@suse.com Cc: Jun Piao piaojun@huawei.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ocfs2/namei.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/fs/ocfs2/namei.c +++ b/fs/ocfs2/namei.c @@ -566,7 +566,7 @@ static int __ocfs2_mknod_locked(struct i fe->i_last_eb_blk = 0; strcpy(fe->i_signature, OCFS2_INODE_SIGNATURE); fe->i_flags |= cpu_to_le32(OCFS2_VALID_FL); - ktime_get_real_ts64(&ts); + ktime_get_coarse_real_ts64(&ts); fe->i_atime = fe->i_ctime = fe->i_mtime = cpu_to_le64(ts.tv_sec); fe->i_mtime_nsec = fe->i_ctime_nsec = fe->i_atime_nsec =
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Su Yue glass.su@suse.com
commit 952b023f06a24b2ad6ba67304c4c84d45bea2f18 upstream.
After commit "ocfs2: return real error code in ocfs2_dio_wr_get_block", fstests/generic/300 become from always failed to sometimes failed:
========================================================================
[ 473.293420 ] run fstests generic/300
[ 475.296983 ] JBD2: Ignoring recovery information on journal
[ 475.302473 ] ocfs2: Mounting device (253,1) on (node local, slot 0) with ordered data mode.
[ 494.290998 ] OCFS2: ERROR (device dm-1): ocfs2_change_extent_flag: Owner 5668 has an extent at cpos 78723 which can no longer be found
[ 494.291609 ] On-disk corruption discovered. Please run fsck.ocfs2 once the filesystem is unmounted.
[ 494.292018 ] OCFS2: File system is now read-only.
[ 494.292224 ] (kworker/19:11,2628,19):ocfs2_mark_extent_written:5272 ERROR: status = -30
[ 494.292602 ] (kworker/19:11,2628,19):ocfs2_dio_end_io_write:2374 ERROR: status = -3
fio: io_u error on file /mnt/scratch/racer: Read-only file system: write offset=460849152, buflen=131072
=========================================================================
In __blockdev_direct_IO, ocfs2_dio_wr_get_block is called to add unwritten extents to a list. The extents are also inserted into the extent tree in ocfs2_write_begin_nolock. Then another thread calls fallocate to punch a hole at one of those unwritten extents, and the extent at that cpos is removed by ocfs2_remove_extent(). In the end io worker thread, ocfs2_search_extent_list then fails to find any extent at the cpos.
T1                            T2                            T3
inode lock
...
insert extents
...
inode unlock
                                                            ocfs2_fallocate
                                                            __ocfs2_change_file_space
                                                              inode lock
                                                              lock ip_alloc_sem
                                                              ocfs2_remove_inode_range inode
                                                                ocfs2_remove_btree_range
                                                                  ocfs2_remove_extent
                                                                  ^---remove the extent at cpos 78723
                                                              ...
                                                              unlock ip_alloc_sem
                                                              inode unlock
                              ocfs2_dio_end_io
                              ocfs2_dio_end_io_write
                                lock ip_alloc_sem
                                ocfs2_mark_extent_written
                                  ocfs2_change_extent_flag
                                    ocfs2_search_extent_list
                                    ^---failed to find extent
                                ...
                                unlock ip_alloc_sem
In most filesystems, fallocate is not expected to race with AIO+DIO, so fix it by waiting for all in-flight dio to complete before fallocate/punch_hole, as ext4 does.
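The shape of the fix, sketched outside of ocfs2 (the function name below is hypothetical; the actual change is in __ocfs2_change_file_space, see the hunk further down):

static long punch_hole_locked(struct inode *inode, loff_t offset, loff_t len)
{
	inode_lock(inode);

	/* Wait for all in-flight DIO; new submitters block on i_rwsem. */
	inode_dio_wait(inode);

	/* ... now safe to remove/convert extents in [offset, offset + len) ... */

	inode_unlock(inode);
	return 0;
}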
Link: https://lkml.kernel.org/r/20240408082041.20925-3-glass.su@suse.com Fixes: b25801038da5 ("ocfs2: Support xfs style space reservation ioctls") Signed-off-by: Su Yue glass.su@suse.com Reviewed-by: Joseph Qi joseph.qi@linux.alibaba.com Cc: Changwei Ge gechangwei@live.cn Cc: Gang He ghe@suse.com Cc: Joel Becker jlbec@evilplan.org Cc: Jun Piao piaojun@huawei.com Cc: Junxiao Bi junxiao.bi@oracle.com Cc: Mark Fasheh mark@fasheh.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ocfs2/file.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/fs/ocfs2/file.c +++ b/fs/ocfs2/file.c @@ -1937,6 +1937,8 @@ static int __ocfs2_change_file_space(str
inode_lock(inode);
+ /* Wait all existing dio workers, newcomers will block on i_rwsem */ + inode_dio_wait(inode); /* * This prevents concurrent writes on other nodes */
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Rick Wertenbroek rick.wertenbroek@gmail.com
commit 2dba285caba53f309d6060fca911b43d63f41697 upstream.
Remove wrong mask on subsys_vendor_id. Both the Vendor ID and Subsystem Vendor ID are u16 variables and are written to a u32 register of the controller. The Subsystem Vendor ID was always 0 because the u16 value was masked incorrectly with GENMASK(31,16) resulting in all lower 16 bits being set to 0 prior to the shift.
Remove both masks as they are unnecessary and set the register correctly i.e., the lower 16-bits are the Vendor ID and the upper 16-bits are the Subsystem Vendor ID.
This is documented in the RK3399 TRM, section 17.6.7.1.17.
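To spell out why the old expression lost the field, here is a small standalone sketch (values are made up for the example):

static void vendor_id_register_example(void)
{
	u16 vendorid = 0x1d87;			/* example values only */
	u16 subsys_vendor_id = 0x1d87;

	/* Old, broken: a u16 has no bits 16..31, so the mask clears the value. */
	u32 broken = (vendorid & GENMASK(15, 0)) |
		     (subsys_vendor_id & GENMASK(31, 16)) << 16;	/* high half is 0 */

	/* Fixed: Vendor ID in bits 15:0, Subsystem Vendor ID in bits 31:16. */
	u32 fixed = vendorid | (u32)subsys_vendor_id << 16;
}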
[kwilczynski: removed unnecesary newline] Fixes: cf590b078391 ("PCI: rockchip: Add EP driver for Rockchip PCIe controller") Link: https://lore.kernel.org/linux-pci/20240403144508.489835-1-rick.wertenbroek@g... Signed-off-by: Rick Wertenbroek rick.wertenbroek@gmail.com Signed-off-by: Krzysztof Wilczyński kwilczynski@kernel.org Signed-off-by: Bjorn Helgaas bhelgaas@google.com Reviewed-by: Damien Le Moal dlemoal@kernel.org Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pci/controller/pcie-rockchip-ep.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-)
--- a/drivers/pci/controller/pcie-rockchip-ep.c +++ b/drivers/pci/controller/pcie-rockchip-ep.c @@ -98,10 +98,8 @@ static int rockchip_pcie_ep_write_header
/* All functions share the same vendor ID with function 0 */ if (fn == 0) { - u32 vid_regs = (hdr->vendorid & GENMASK(15, 0)) | - (hdr->subsys_vendor_id & GENMASK(31, 16)) << 16; - - rockchip_pcie_write(rockchip, vid_regs, + rockchip_pcie_write(rockchip, + hdr->vendorid | hdr->subsys_vendor_id << 16, PCIE_CORE_CONFIG_VENDOR); }
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nuno Sa nuno.sa@analog.com
commit 1bc31444209c8efae98cb78818131950d9a6f4d6 upstream.
We need to first free the IRQ before calling of_dma_controller_free(). Otherwise we could get an interrupt and schedule a tasklet while removing the DMA controller.
Fixes: 0e3b67b348b8 ("dmaengine: Add support for the Analog Devices AXI-DMAC DMA controller") Cc: stable@kernel.org Signed-off-by: Nuno Sa nuno.sa@analog.com Link: https://lore.kernel.org/r/20240328-axi-dmac-devm-probe-v3-1-523c0176df70@ana... Signed-off-by: Vinod Koul vkoul@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/dma/dma-axi-dmac.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/dma/dma-axi-dmac.c +++ b/drivers/dma/dma-axi-dmac.c @@ -1036,8 +1036,8 @@ static int axi_dmac_remove(struct platfo { struct axi_dmac *dmac = platform_get_drvdata(pdev);
- of_dma_controller_free(pdev->dev.of_node); free_irq(dmac->irq, dmac); + of_dma_controller_free(pdev->dev.of_node); tasklet_kill(&dmac->chan.vchan.task); dma_async_device_unregister(&dmac->dma_dev); clk_disable_unprepare(dmac->clk);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Apurva Nandan a-nandan@ti.com
commit 61f6f68447aba08aeaa97593af3a7d85a114891f upstream.
PSC controller has a limitation that it can only power up the second core when the first core is in the ON state. The power state of core0 should be equal to or higher than that of core1, otherwise the kernel hangs during rproc loading.
Make the powering up of the cores sequential by waiting for the current core to power up before proceeding to the next core, with a timeout of 2 seconds per core. Add a wait queue event in the k3_r5_cluster_rproc_init() call that waits for the current core to be released from reset before proceeding with the next core.
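The handshake reduced to its essentials (a sketch of the mechanism, not the exact driver code; the error mapping on timeout is illustrative):

/* Core that has just been released from reset: */
core->released_from_reset = true;
wake_up_interruptible(&cluster->core_transition);

/* Before powering the next core, wait up to 2 seconds for the previous one: */
ret = wait_event_interruptible_timeout(cluster->core_transition,
				       core->released_from_reset,
				       msecs_to_jiffies(2000));
if (ret <= 0)		/* 0 means timeout, a negative value means interrupted */
	return ret ? ret : -ETIMEDOUT;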
Fixes: 6dedbd1d5443 ("remoteproc: k3-r5: Add a remoteproc driver for R5F subsystem") Signed-off-by: Apurva Nandan a-nandan@ti.com Signed-off-by: Beleswar Padhi b-padhi@ti.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20240430105307.1190615-2-b-padhi@ti.com Signed-off-by: Mathieu Poirier mathieu.poirier@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/remoteproc/ti_k3_r5_remoteproc.c | 33 +++++++++++++++++++++++++++++++ 1 file changed, 33 insertions(+)
--- a/drivers/remoteproc/ti_k3_r5_remoteproc.c +++ b/drivers/remoteproc/ti_k3_r5_remoteproc.c @@ -98,12 +98,14 @@ struct k3_r5_soc_data { * @dev: cached device pointer * @mode: Mode to configure the Cluster - Split or LockStep * @cores: list of R5 cores within the cluster + * @core_transition: wait queue to sync core state changes * @soc_data: SoC-specific feature data for a R5FSS */ struct k3_r5_cluster { struct device *dev; enum cluster_mode mode; struct list_head cores; + wait_queue_head_t core_transition; const struct k3_r5_soc_data *soc_data; };
@@ -123,6 +125,7 @@ struct k3_r5_cluster { * @atcm_enable: flag to control ATCM enablement * @btcm_enable: flag to control BTCM enablement * @loczrama: flag to dictate which TCM is at device address 0x0 + * @released_from_reset: flag to signal when core is out of reset */ struct k3_r5_core { struct list_head elem; @@ -139,6 +142,7 @@ struct k3_r5_core { u32 atcm_enable; u32 btcm_enable; u32 loczrama; + bool released_from_reset; };
/** @@ -455,6 +459,8 @@ static int k3_r5_rproc_prepare(struct rp ret); return ret; } + core->released_from_reset = true; + wake_up_interruptible(&cluster->core_transition);
/* * Newer IP revisions like on J7200 SoCs support h/w auto-initialization @@ -1137,6 +1143,12 @@ static int k3_r5_rproc_configure_mode(st return ret; }
+ /* + * Skip the waiting mechanism for sequential power-on of cores if the + * core has already been booted by another entity. + */ + core->released_from_reset = c_state; + ret = ti_sci_proc_get_status(core->tsp, &boot_vec, &cfg, &ctrl, &stat); if (ret < 0) { @@ -1273,6 +1285,26 @@ init_rmem: if (cluster->mode == CLUSTER_MODE_LOCKSTEP || cluster->mode == CLUSTER_MODE_SINGLECPU) break; + + /* + * R5 cores require to be powered on sequentially, core0 + * should be in higher power state than core1 in a cluster + * So, wait for current core to power up before proceeding + * to next core and put timeout of 2sec for each core. + * + * This waiting mechanism is necessary because + * rproc_auto_boot_callback() for core1 can be called before + * core0 due to thread execution order. + */ + ret = wait_event_interruptible_timeout(cluster->core_transition, + core->released_from_reset, + msecs_to_jiffies(2000)); + if (ret <= 0) { + dev_err(dev, + "Timed out waiting for %s core to power up!\n", + rproc->name); + return ret; + } }
return 0; @@ -1708,6 +1740,7 @@ static int k3_r5_probe(struct platform_d CLUSTER_MODE_SPLIT : CLUSTER_MODE_LOCKSTEP; cluster->soc_data = data; INIT_LIST_HEAD(&cluster->cores); + init_waitqueue_head(&cluster->core_transition);
ret = of_property_read_u32(np, "ti,cluster-mode", &cluster->mode); if (ret < 0 && ret != -EINVAL) {
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Beleswar Padhi b-padhi@ti.com
commit 3c8a9066d584f5010b6f4ba03bf6b19d28973d52 upstream.
PSC controller has a limitation that it can only power up the second core when the first core is in the ON state. The power state of core0 should be equal to or higher than that of core1.
Therefore, prevent core1 from powering up before core0 during the start process from sysfs. Similarly, prevent core0 from shutting down before core1 has been shut down from sysfs.
Fixes: 6dedbd1d5443 ("remoteproc: k3-r5: Add a remoteproc driver for R5F subsystem") Signed-off-by: Beleswar Padhi b-padhi@ti.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20240430105307.1190615-3-b-padhi@ti.com Signed-off-by: Mathieu Poirier mathieu.poirier@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/remoteproc/ti_k3_r5_remoteproc.c | 23 +++++++++++++++++++++-- 1 file changed, 21 insertions(+), 2 deletions(-)
--- a/drivers/remoteproc/ti_k3_r5_remoteproc.c +++ b/drivers/remoteproc/ti_k3_r5_remoteproc.c @@ -543,7 +543,7 @@ static int k3_r5_rproc_start(struct rpro struct k3_r5_rproc *kproc = rproc->priv; struct k3_r5_cluster *cluster = kproc->cluster; struct device *dev = kproc->dev; - struct k3_r5_core *core; + struct k3_r5_core *core0, *core; u32 boot_addr; int ret;
@@ -569,6 +569,15 @@ static int k3_r5_rproc_start(struct rpro goto unroll_core_run; } } else { + /* do not allow core 1 to start before core 0 */ + core0 = list_first_entry(&cluster->cores, struct k3_r5_core, + elem); + if (core != core0 && core0->rproc->state == RPROC_OFFLINE) { + dev_err(dev, "%s: can not start core 1 before core 0\n", + __func__); + return -EPERM; + } + ret = k3_r5_core_run(core); if (ret) goto put_mbox; @@ -614,7 +623,8 @@ static int k3_r5_rproc_stop(struct rproc { struct k3_r5_rproc *kproc = rproc->priv; struct k3_r5_cluster *cluster = kproc->cluster; - struct k3_r5_core *core = kproc->core; + struct device *dev = kproc->dev; + struct k3_r5_core *core1, *core = kproc->core; int ret;
/* halt all applicable cores */ @@ -627,6 +637,15 @@ static int k3_r5_rproc_stop(struct rproc } } } else { + /* do not allow core 0 to stop before core 1 */ + core1 = list_last_entry(&cluster->cores, struct k3_r5_core, + elem); + if (core != core1 && core1->rproc->state != RPROC_OFFLINE) { + dev_err(dev, "%s: can not stop core 0 before core 1\n", + __func__); + return -EPERM; + } + ret = k3_r5_core_halt(core); if (ret) goto out;
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nam Cao namcao@linutronix.de
commit fb1cf0878328fe75d47f0aed0a65b30126fcefc4 upstream.
__kernel_map_pages() is a debug function which clears the valid bit in page table entry for deallocated pages to detect illegal memory accesses to freed pages.
This function sets/clears the valid bit using __set_memory(). __set_memory() acquires init_mm's semaphore, and this operation may sleep. This is problematic, because __kernel_map_pages() can be called in atomic context, where it is illegal to sleep. An example warning that this causes:
BUG: sleeping function called from invalid context at kernel/locking/rwsem.c:1578
in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 2, name: kthreadd
preempt_count: 2, expected: 0
CPU: 0 PID: 2 Comm: kthreadd Not tainted 6.9.0-g1d4c6d784ef6 #37
Hardware name: riscv-virtio,qemu (DT)
Call Trace:
[<ffffffff800060dc>] dump_backtrace+0x1c/0x24
[<ffffffff8091ef6e>] show_stack+0x2c/0x38
[<ffffffff8092baf8>] dump_stack_lvl+0x5a/0x72
[<ffffffff8092bb24>] dump_stack+0x14/0x1c
[<ffffffff8003b7ac>] __might_resched+0x104/0x10e
[<ffffffff8003b7f4>] __might_sleep+0x3e/0x62
[<ffffffff8093276a>] down_write+0x20/0x72
[<ffffffff8000cf00>] __set_memory+0x82/0x2fa
[<ffffffff8000d324>] __kernel_map_pages+0x5a/0xd4
[<ffffffff80196cca>] __alloc_pages_bulk+0x3b2/0x43a
[<ffffffff8018ee82>] __vmalloc_node_range+0x196/0x6ba
[<ffffffff80011904>] copy_process+0x72c/0x17ec
[<ffffffff80012ab4>] kernel_clone+0x60/0x2fe
[<ffffffff80012f62>] kernel_thread+0x82/0xa0
[<ffffffff8003552c>] kthreadd+0x14a/0x1be
[<ffffffff809357de>] ret_from_fork+0xe/0x1c
Rewrite this function with apply_to_existing_page_range(). It is fine to not have any locking, because __kernel_map_pages() works with pages being allocated/deallocated and those pages are not changed by anyone else in the meantime.
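For reference, the helper the function is rewritten around has roughly this shape (as declared in include/linux/mm.h; the comments are a paraphrase, not the kernel's documentation):

/*
 * Callback type used by apply_to_existing_page_range(); it is invoked once
 * for each PTE that already exists in [address, address + size).
 */
typedef int (*pte_fn_t)(pte_t *pte, unsigned long addr, void *data);

int apply_to_existing_page_range(struct mm_struct *mm, unsigned long address,
				 unsigned long size, pte_fn_t fn, void *data);

The debug_pagealloc callback in the hunk below simply sets or clears _PAGE_PRESENT on each PTE, and the caller then flushes the TLB for the whole range.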
Fixes: 5fde3db5eb02 ("riscv: add ARCH_SUPPORTS_DEBUG_PAGEALLOC support") Signed-off-by: Nam Cao namcao@linutronix.de Cc: stable@vger.kernel.org Reviewed-by: Alexandre Ghiti alexghiti@rivosinc.com Link: https://lore.kernel.org/r/1289ecba9606a19917bc12b6c27da8aa23e1e5ae.171575093... Signed-off-by: Palmer Dabbelt palmer@rivosinc.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/riscv/mm/pageattr.c | 28 ++++++++++++++++++++++------ 1 file changed, 22 insertions(+), 6 deletions(-)
--- a/arch/riscv/mm/pageattr.c +++ b/arch/riscv/mm/pageattr.c @@ -386,17 +386,33 @@ int set_direct_map_default_noflush(struc }
#ifdef CONFIG_DEBUG_PAGEALLOC +static int debug_pagealloc_set_page(pte_t *pte, unsigned long addr, void *data) +{ + int enable = *(int *)data; + + unsigned long val = pte_val(ptep_get(pte)); + + if (enable) + val |= _PAGE_PRESENT; + else + val &= ~_PAGE_PRESENT; + + set_pte(pte, __pte(val)); + + return 0; +} + void __kernel_map_pages(struct page *page, int numpages, int enable) { if (!debug_pagealloc_enabled()) return;
- if (enable) - __set_memory((unsigned long)page_address(page), numpages, - __pgprot(_PAGE_PRESENT), __pgprot(0)); - else - __set_memory((unsigned long)page_address(page), numpages, - __pgprot(0), __pgprot(_PAGE_PRESENT)); + unsigned long start = (unsigned long)page_address(page); + unsigned long size = PAGE_SIZE * numpages; + + apply_to_existing_page_range(&init_mm, start, size, debug_pagealloc_set_page, &enable); + + flush_tlb_kernel_range(start, start + size); } #endif
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Chris Wilson chris@chris-wilson.co.uk
commit 70cb9188ffc75e643debf292fcddff36c9dbd4ae upstream.
The breadcrumbs use a GT wakeref for guarding the interrupt, but are disarmed during release of the engine wakeref. This leaves a hole where we may attach a breadcrumb just as the engine is parking (after it has parked its breadcrumbs), execute the irq worker with some signalers still attached, but never be woken again.
That issue manifests itself in CI with IGT runner timeouts while tests are waiting indefinitely for release of all GT wakerefs.
<6> [209.151778] i915: Running live_engine_pm_selftests/live_engine_busy_stats
<7> [209.231628] i915 0000:00:02.0: [drm:intel_power_well_disable [i915]] disabling PW_5
<7> [209.231816] i915 0000:00:02.0: [drm:intel_power_well_disable [i915]] disabling PW_4
<7> [209.231944] i915 0000:00:02.0: [drm:intel_power_well_disable [i915]] disabling PW_3
<7> [209.232056] i915 0000:00:02.0: [drm:intel_power_well_disable [i915]] disabling PW_2
<7> [209.232166] i915 0000:00:02.0: [drm:intel_power_well_disable [i915]] disabling DC_off
<7> [209.232270] i915 0000:00:02.0: [drm:skl_enable_dc6 [i915]] Enabling DC6
<7> [209.232368] i915 0000:00:02.0: [drm:gen9_set_dc_state.part.0 [i915]] Setting DC state from 00 to 02
<4> [299.356116] [IGT] Inactivity timeout exceeded. Killing the current test with SIGQUIT.
...
<6> [299.356526] sysrq: Show State
...
<6> [299.373964] task:i915_selftest state:D stack:11784 pid:5578 tgid:5578 ppid:873 flags:0x00004002
<6> [299.373967] Call Trace:
<6> [299.373968] <TASK>
<6> [299.373970] __schedule+0x3bb/0xda0
<6> [299.373974] schedule+0x41/0x110
<6> [299.373976] intel_wakeref_wait_for_idle+0x82/0x100 [i915]
<6> [299.374083] ? __pfx_var_wake_function+0x10/0x10
<6> [299.374087] live_engine_busy_stats+0x9b/0x500 [i915]
<6> [299.374173] __i915_subtests+0xbe/0x240 [i915]
<6> [299.374277] ? __pfx___intel_gt_live_setup+0x10/0x10 [i915]
<6> [299.374369] ? __pfx___intel_gt_live_teardown+0x10/0x10 [i915]
<6> [299.374456] intel_engine_live_selftests+0x1c/0x30 [i915]
<6> [299.374547] __run_selftests+0xbb/0x190 [i915]
<6> [299.374635] i915_live_selftests+0x4b/0x90 [i915]
<6> [299.374717] i915_pci_probe+0x10d/0x210 [i915]
At the end of the interrupt worker, if there are no more engines awake, disarm the breadcrumb and go to sleep.
Fixes: 9d5612ca165a ("drm/i915/gt: Defer enabling the breadcrumb interrupt to after submission") Closes: https://gitlab.freedesktop.org/drm/intel/issues/10026 Signed-off-by: Chris Wilson chris@chris-wilson.co.uk Cc: Andrzej Hajda andrzej.hajda@intel.com Cc: stable@vger.kernel.org # v5.12+ Signed-off-by: Janusz Krzysztofik janusz.krzysztofik@linux.intel.com Acked-by: Nirmoy Das nirmoy.das@intel.com Reviewed-by: Andrzej Hajda andrzej.hajda@intel.com Reviewed-by: Andi Shyti andi.shyti@linux.intel.com Signed-off-by: Andi Shyti andi.shyti@linux.intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20240423165505.465734-2-janusz... (cherry picked from commit fbad43eccae5cb14594195c20113369aabaa22b5) Signed-off-by: Jani Nikula jani.nikula@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/i915/gt/intel_breadcrumbs.c | 15 +++++++-------- 1 file changed, 7 insertions(+), 8 deletions(-)
--- a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c +++ b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c @@ -258,8 +258,13 @@ static void signal_irq_work(struct irq_w i915_request_put(rq); }
+ /* Lazy irq enabling after HW submission */ if (!READ_ONCE(b->irq_armed) && !list_empty(&b->signalers)) intel_breadcrumbs_arm_irq(b); + + /* And confirm that we still want irqs enabled before we yield */ + if (READ_ONCE(b->irq_armed) && !atomic_read(&b->active)) + intel_breadcrumbs_disarm_irq(b); }
struct intel_breadcrumbs * @@ -310,13 +315,7 @@ void __intel_breadcrumbs_park(struct int return;
/* Kick the work once more to drain the signalers, and disarm the irq */ - irq_work_sync(&b->irq_work); - while (READ_ONCE(b->irq_armed) && !atomic_read(&b->active)) { - local_irq_disable(); - signal_irq_work(&b->irq_work); - local_irq_enable(); - cond_resched(); - } + irq_work_queue(&b->irq_work); }
void intel_breadcrumbs_free(struct kref *kref) @@ -399,7 +398,7 @@ static void insert_breadcrumb(struct i91 * the request as it may have completed and raised the interrupt as * we were attaching it into the lists. */ - if (!b->irq_armed || __i915_request_is_complete(rq)) + if (!READ_ONCE(b->irq_armed) || __i915_request_is_complete(rq)) irq_work_queue(&b->irq_work); }
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Vidya Srinivas vidya.srinivas@intel.com
commit 43e2b37e2ab660c3565d4cff27922bc70e79c3f1 upstream.
In some scenarios, the DPT object gets shrunk but the actual framebuffer does not, so the framebuffer is still there on the DPT's vm->bound_list. It then tries to rewrite the PTEs via a stale CPU mapping, which causes a panic.
Cc: stable@vger.kernel.org Reported-by: Shawn Lee shawn.c.lee@intel.com Fixes: 0dc987b699ce ("drm/i915/display: Add smem fallback allocation for dpt") Signed-off-by: Vidya Srinivas vidya.srinivas@intel.com [vsyrjala: Add TODO comment] Signed-off-by: Ville Syrjälä ville.syrjala@linux.intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20240520165634.1162470-1-vidya... (cherry picked from commit 51064d471c53dcc8eddd2333c3f1c1d9131ba36c) Signed-off-by: Jani Nikula jani.nikula@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/i915/gem/i915_gem_object.h | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.h +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h @@ -295,7 +295,9 @@ bool i915_gem_object_has_iomem(const str static inline bool i915_gem_object_is_shrinkable(const struct drm_i915_gem_object *obj) { - return i915_gem_object_type_has(obj, I915_GEM_OBJECT_IS_SHRINKABLE); + /* TODO: make DPT shrinkable when it has no bound vmas */ + return i915_gem_object_type_has(obj, I915_GEM_OBJECT_IS_SHRINKABLE) && + !obj->is_dpt; }
static inline bool
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alexander Shishkin alexander.shishkin@linux.intel.com
commit e44937889bdf4ecd1f0c25762b7226406b9b7a69 upstream.
Add support for the Trace Hub in Granite Rapids.
Signed-off-by: Alexander Shishkin alexander.shishkin@linux.intel.com Reviewed-by: Andy Shevchenko andriy.shevchenko@linux.intel.com Cc: stable@kernel.org Link: https://lore.kernel.org/r/20240429130119.1518073-11-alexander.shishkin@linux... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/hwtracing/intel_th/pci.c | 5 +++++ 1 file changed, 5 insertions(+)
--- a/drivers/hwtracing/intel_th/pci.c +++ b/drivers/hwtracing/intel_th/pci.c @@ -305,6 +305,11 @@ static const struct pci_device_id intel_ .driver_data = (kernel_ulong_t)&intel_th_2x, }, { + /* Granite Rapids */ + PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0x0963), + .driver_data = (kernel_ulong_t)&intel_th_2x, + }, + { /* Alder Lake CPU */ PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0x466f), .driver_data = (kernel_ulong_t)&intel_th_2x,
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alexander Shishkin alexander.shishkin@linux.intel.com
commit 854afe461b009801a171b3a49c5f75ea43e4c04c upstream.
Add support for the Trace Hub in Granite Rapids SOC.
Signed-off-by: Alexander Shishkin alexander.shishkin@linux.intel.com Reviewed-by: Andy Shevchenko andriy.shevchenko@linux.intel.com Cc: stable@kernel.org Link: https://lore.kernel.org/r/20240429130119.1518073-12-alexander.shishkin@linux... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/hwtracing/intel_th/pci.c | 5 +++++ 1 file changed, 5 insertions(+)
--- a/drivers/hwtracing/intel_th/pci.c +++ b/drivers/hwtracing/intel_th/pci.c @@ -310,6 +310,11 @@ static const struct pci_device_id intel_ .driver_data = (kernel_ulong_t)&intel_th_2x, }, { + /* Granite Rapids SOC */ + PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0x3256), + .driver_data = (kernel_ulong_t)&intel_th_2x, + }, + { /* Alder Lake CPU */ PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0x466f), .driver_data = (kernel_ulong_t)&intel_th_2x,
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alexander Shishkin alexander.shishkin@linux.intel.com
commit 2e1da7efabe05cb0cf0b358883b2bc89080ed0eb upstream.
Add support for the Trace Hub in Sapphire Rapids SOC.
Signed-off-by: Alexander Shishkin alexander.shishkin@linux.intel.com Reviewed-by: Andy Shevchenko andriy.shevchenko@linux.intel.com Cc: stable@kernel.org Link: https://lore.kernel.org/r/20240429130119.1518073-13-alexander.shishkin@linux... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/hwtracing/intel_th/pci.c | 5 +++++ 1 file changed, 5 insertions(+)
--- a/drivers/hwtracing/intel_th/pci.c +++ b/drivers/hwtracing/intel_th/pci.c @@ -315,6 +315,11 @@ static const struct pci_device_id intel_ .driver_data = (kernel_ulong_t)&intel_th_2x, }, { + /* Sapphire Rapids SOC */ + PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0x3456), + .driver_data = (kernel_ulong_t)&intel_th_2x, + }, + { /* Alder Lake CPU */ PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0x466f), .driver_data = (kernel_ulong_t)&intel_th_2x,
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alexander Shishkin alexander.shishkin@linux.intel.com
commit c4a30def564d75e84718b059d1a62cc79b137cf9 upstream.
Add support for the Trace Hub in Meteor Lake-S.
Signed-off-by: Alexander Shishkin alexander.shishkin@linux.intel.com Reviewed-by: Andy Shevchenko andriy.shevchenko@linux.intel.com Cc: stable@kernel.org Link: https://lore.kernel.org/r/20240429130119.1518073-14-alexander.shishkin@linux... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/hwtracing/intel_th/pci.c | 5 +++++ 1 file changed, 5 insertions(+)
--- a/drivers/hwtracing/intel_th/pci.c +++ b/drivers/hwtracing/intel_th/pci.c @@ -295,6 +295,11 @@ static const struct pci_device_id intel_ .driver_data = (kernel_ulong_t)&intel_th_2x, }, { + /* Meteor Lake-S */ + PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0x7f26), + .driver_data = (kernel_ulong_t)&intel_th_2x, + }, + { /* Raptor Lake-S */ PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0x7a26), .driver_data = (kernel_ulong_t)&intel_th_2x,
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alexander Shishkin alexander.shishkin@linux.intel.com
commit f866b65322bfbc8fcca13c25f49e1a5c5a93ae4d upstream.
Add support for the Trace Hub in Lunar Lake.
Signed-off-by: Alexander Shishkin alexander.shishkin@linux.intel.com Reviewed-by: Andy Shevchenko andriy.shevchenko@linux.intel.com Cc: stable@kernel.org Link: https://lore.kernel.org/r/20240429130119.1518073-16-alexander.shishkin@linux... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/hwtracing/intel_th/pci.c | 5 +++++ 1 file changed, 5 insertions(+)
--- a/drivers/hwtracing/intel_th/pci.c +++ b/drivers/hwtracing/intel_th/pci.c @@ -325,6 +325,11 @@ static const struct pci_device_id intel_ .driver_data = (kernel_ulong_t)&intel_th_2x, }, { + /* Lunar Lake */ + PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0xa824), + .driver_data = (kernel_ulong_t)&intel_th_2x, + }, + { /* Alder Lake CPU */ PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0x466f), .driver_data = (kernel_ulong_t)&intel_th_2x,
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Christoph Hellwig hch@lst.de
commit 15c12fcc50a1b12a747f8b6ec05cdb18c537a4d1 upstream.
Add a new zone_info structure to hold per-zone information in btrfs_load_block_group_zone_info and prepare for breaking out helpers from it.
Reviewed-by: Johannes Thumshirn johannes.thumshirn@wdc.com Signed-off-by: Christoph Hellwig hch@lst.de Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/zoned.c | 84 ++++++++++++++++++++++++------------------------------- 1 file changed, 37 insertions(+), 47 deletions(-)
--- a/fs/btrfs/zoned.c +++ b/fs/btrfs/zoned.c @@ -1270,6 +1270,12 @@ out: return ret; }
+struct zone_info { + u64 physical; + u64 capacity; + u64 alloc_offset; +}; + int btrfs_load_block_group_zone_info(struct btrfs_block_group *cache, bool new) { struct btrfs_fs_info *fs_info = cache->fs_info; @@ -1279,12 +1285,10 @@ int btrfs_load_block_group_zone_info(str struct btrfs_device *device; u64 logical = cache->start; u64 length = cache->length; + struct zone_info *zone_info = NULL; int ret; int i; unsigned int nofs_flag; - u64 *alloc_offsets = NULL; - u64 *caps = NULL; - u64 *physical = NULL; unsigned long *active = NULL; u64 last_alloc = 0; u32 num_sequential = 0, num_conventional = 0; @@ -1316,20 +1320,8 @@ int btrfs_load_block_group_zone_info(str goto out; }
- alloc_offsets = kcalloc(map->num_stripes, sizeof(*alloc_offsets), GFP_NOFS); - if (!alloc_offsets) { - ret = -ENOMEM; - goto out; - } - - caps = kcalloc(map->num_stripes, sizeof(*caps), GFP_NOFS); - if (!caps) { - ret = -ENOMEM; - goto out; - } - - physical = kcalloc(map->num_stripes, sizeof(*physical), GFP_NOFS); - if (!physical) { + zone_info = kcalloc(map->num_stripes, sizeof(*zone_info), GFP_NOFS); + if (!zone_info) { ret = -ENOMEM; goto out; } @@ -1341,20 +1333,21 @@ int btrfs_load_block_group_zone_info(str }
for (i = 0; i < map->num_stripes; i++) { + struct zone_info *info = &zone_info[i]; bool is_sequential; struct blk_zone zone; struct btrfs_dev_replace *dev_replace = &fs_info->dev_replace; int dev_replace_is_ongoing = 0;
device = map->stripes[i].dev; - physical[i] = map->stripes[i].physical; + info->physical = map->stripes[i].physical;
if (device->bdev == NULL) { - alloc_offsets[i] = WP_MISSING_DEV; + info->alloc_offset = WP_MISSING_DEV; continue; }
- is_sequential = btrfs_dev_is_sequential(device, physical[i]); + is_sequential = btrfs_dev_is_sequential(device, info->physical); if (is_sequential) num_sequential++; else @@ -1368,7 +1361,7 @@ int btrfs_load_block_group_zone_info(str __set_bit(i, active);
if (!is_sequential) { - alloc_offsets[i] = WP_CONVENTIONAL; + info->alloc_offset = WP_CONVENTIONAL; continue; }
@@ -1376,25 +1369,25 @@ int btrfs_load_block_group_zone_info(str * This zone will be used for allocation, so mark this zone * non-empty. */ - btrfs_dev_clear_zone_empty(device, physical[i]); + btrfs_dev_clear_zone_empty(device, info->physical);
down_read(&dev_replace->rwsem); dev_replace_is_ongoing = btrfs_dev_replace_is_ongoing(dev_replace); if (dev_replace_is_ongoing && dev_replace->tgtdev != NULL) - btrfs_dev_clear_zone_empty(dev_replace->tgtdev, physical[i]); + btrfs_dev_clear_zone_empty(dev_replace->tgtdev, info->physical); up_read(&dev_replace->rwsem);
/* * The group is mapped to a sequential zone. Get the zone write * pointer to determine the allocation offset within the zone. */ - WARN_ON(!IS_ALIGNED(physical[i], fs_info->zone_size)); + WARN_ON(!IS_ALIGNED(info->physical, fs_info->zone_size)); nofs_flag = memalloc_nofs_save(); - ret = btrfs_get_dev_zone(device, physical[i], &zone); + ret = btrfs_get_dev_zone(device, info->physical, &zone); memalloc_nofs_restore(nofs_flag); if (ret == -EIO || ret == -EOPNOTSUPP) { ret = 0; - alloc_offsets[i] = WP_MISSING_DEV; + info->alloc_offset = WP_MISSING_DEV; continue; } else if (ret) { goto out; @@ -1409,27 +1402,26 @@ int btrfs_load_block_group_zone_info(str goto out; }
- caps[i] = (zone.capacity << SECTOR_SHIFT); + info->capacity = (zone.capacity << SECTOR_SHIFT);
switch (zone.cond) { case BLK_ZONE_COND_OFFLINE: case BLK_ZONE_COND_READONLY: btrfs_err(fs_info, "zoned: offline/readonly zone %llu on device %s (devid %llu)", - physical[i] >> device->zone_info->zone_size_shift, + info->physical >> device->zone_info->zone_size_shift, rcu_str_deref(device->name), device->devid); - alloc_offsets[i] = WP_MISSING_DEV; + info->alloc_offset = WP_MISSING_DEV; break; case BLK_ZONE_COND_EMPTY: - alloc_offsets[i] = 0; + info->alloc_offset = 0; break; case BLK_ZONE_COND_FULL: - alloc_offsets[i] = caps[i]; + info->alloc_offset = info->capacity; break; default: /* Partially used zone */ - alloc_offsets[i] = - ((zone.wp - zone.start) << SECTOR_SHIFT); + info->alloc_offset = ((zone.wp - zone.start) << SECTOR_SHIFT); __set_bit(i, active); break; } @@ -1456,15 +1448,15 @@ int btrfs_load_block_group_zone_info(str
switch (map->type & BTRFS_BLOCK_GROUP_PROFILE_MASK) { case 0: /* single */ - if (alloc_offsets[0] == WP_MISSING_DEV) { + if (zone_info[0].alloc_offset == WP_MISSING_DEV) { btrfs_err(fs_info, "zoned: cannot recover write pointer for zone %llu", - physical[0]); + zone_info[0].physical); ret = -EIO; goto out; } - cache->alloc_offset = alloc_offsets[0]; - cache->zone_capacity = caps[0]; + cache->alloc_offset = zone_info[0].alloc_offset; + cache->zone_capacity = zone_info[0].capacity; if (test_bit(0, active)) set_bit(BLOCK_GROUP_FLAG_ZONE_IS_ACTIVE, &cache->runtime_flags); break; @@ -1474,21 +1466,21 @@ int btrfs_load_block_group_zone_info(str ret = -EINVAL; goto out; } - if (alloc_offsets[0] == WP_MISSING_DEV) { + if (zone_info[0].alloc_offset == WP_MISSING_DEV) { btrfs_err(fs_info, "zoned: cannot recover write pointer for zone %llu", - physical[0]); + zone_info[0].physical); ret = -EIO; goto out; } - if (alloc_offsets[1] == WP_MISSING_DEV) { + if (zone_info[1].alloc_offset == WP_MISSING_DEV) { btrfs_err(fs_info, "zoned: cannot recover write pointer for zone %llu", - physical[1]); + zone_info[1].physical); ret = -EIO; goto out; } - if (alloc_offsets[0] != alloc_offsets[1]) { + if (zone_info[0].alloc_offset != zone_info[1].alloc_offset) { btrfs_err(fs_info, "zoned: write pointer offset mismatch of zones in DUP profile"); ret = -EIO; @@ -1504,8 +1496,8 @@ int btrfs_load_block_group_zone_info(str set_bit(BLOCK_GROUP_FLAG_ZONE_IS_ACTIVE, &cache->runtime_flags); } - cache->alloc_offset = alloc_offsets[0]; - cache->zone_capacity = min(caps[0], caps[1]); + cache->alloc_offset = zone_info[0].alloc_offset; + cache->zone_capacity = min(zone_info[0].capacity, zone_info[1].capacity); break; case BTRFS_BLOCK_GROUP_RAID1: case BTRFS_BLOCK_GROUP_RAID0: @@ -1558,9 +1550,7 @@ out: cache->physical_map = NULL; } bitmap_free(active); - kfree(physical); - kfree(caps); - kfree(alloc_offsets); + kfree(zone_info); free_extent_map(em);
return ret;
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Christoph Hellwig hch@lst.de
commit 09a46725cc84165af452d978a3532d6b97a28796 upstream.
Split out a helper for the body of the per-zone loop in btrfs_load_block_group_zone_info to make the function easier to read and modify.
Reviewed-by: Johannes Thumshirn johannes.thumshirn@wdc.com Signed-off-by: Christoph Hellwig hch@lst.de Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/zoned.c | 184 +++++++++++++++++++++++++++---------------------------- 1 file changed, 92 insertions(+), 92 deletions(-)
--- a/fs/btrfs/zoned.c +++ b/fs/btrfs/zoned.c @@ -1276,19 +1276,103 @@ struct zone_info { u64 alloc_offset; };
+static int btrfs_load_zone_info(struct btrfs_fs_info *fs_info, int zone_idx, + struct zone_info *info, unsigned long *active, + struct map_lookup *map) +{ + struct btrfs_dev_replace *dev_replace = &fs_info->dev_replace; + struct btrfs_device *device = map->stripes[zone_idx].dev; + int dev_replace_is_ongoing = 0; + unsigned int nofs_flag; + struct blk_zone zone; + int ret; + + info->physical = map->stripes[zone_idx].physical; + + if (!device->bdev) { + info->alloc_offset = WP_MISSING_DEV; + return 0; + } + + /* Consider a zone as active if we can allow any number of active zones. */ + if (!device->zone_info->max_active_zones) + __set_bit(zone_idx, active); + + if (!btrfs_dev_is_sequential(device, info->physical)) { + info->alloc_offset = WP_CONVENTIONAL; + return 0; + } + + /* This zone will be used for allocation, so mark this zone non-empty. */ + btrfs_dev_clear_zone_empty(device, info->physical); + + down_read(&dev_replace->rwsem); + dev_replace_is_ongoing = btrfs_dev_replace_is_ongoing(dev_replace); + if (dev_replace_is_ongoing && dev_replace->tgtdev != NULL) + btrfs_dev_clear_zone_empty(dev_replace->tgtdev, info->physical); + up_read(&dev_replace->rwsem); + + /* + * The group is mapped to a sequential zone. Get the zone write pointer + * to determine the allocation offset within the zone. + */ + WARN_ON(!IS_ALIGNED(info->physical, fs_info->zone_size)); + nofs_flag = memalloc_nofs_save(); + ret = btrfs_get_dev_zone(device, info->physical, &zone); + memalloc_nofs_restore(nofs_flag); + if (ret) { + if (ret != -EIO && ret != -EOPNOTSUPP) + return ret; + info->alloc_offset = WP_MISSING_DEV; + return 0; + } + + if (zone.type == BLK_ZONE_TYPE_CONVENTIONAL) { + btrfs_err_in_rcu(fs_info, + "zoned: unexpected conventional zone %llu on device %s (devid %llu)", + zone.start << SECTOR_SHIFT, rcu_str_deref(device->name), + device->devid); + return -EIO; + } + + info->capacity = (zone.capacity << SECTOR_SHIFT); + + switch (zone.cond) { + case BLK_ZONE_COND_OFFLINE: + case BLK_ZONE_COND_READONLY: + btrfs_err(fs_info, + "zoned: offline/readonly zone %llu on device %s (devid %llu)", + (info->physical >> device->zone_info->zone_size_shift), + rcu_str_deref(device->name), device->devid); + info->alloc_offset = WP_MISSING_DEV; + break; + case BLK_ZONE_COND_EMPTY: + info->alloc_offset = 0; + break; + case BLK_ZONE_COND_FULL: + info->alloc_offset = info->capacity; + break; + default: + /* Partially used zone. */ + info->alloc_offset = ((zone.wp - zone.start) << SECTOR_SHIFT); + __set_bit(zone_idx, active); + break; + } + + return 0; +} + int btrfs_load_block_group_zone_info(struct btrfs_block_group *cache, bool new) { struct btrfs_fs_info *fs_info = cache->fs_info; struct extent_map_tree *em_tree = &fs_info->mapping_tree; struct extent_map *em; struct map_lookup *map; - struct btrfs_device *device; u64 logical = cache->start; u64 length = cache->length; struct zone_info *zone_info = NULL; int ret; int i; - unsigned int nofs_flag; unsigned long *active = NULL; u64 last_alloc = 0; u32 num_sequential = 0, num_conventional = 0; @@ -1333,98 +1417,14 @@ int btrfs_load_block_group_zone_info(str }
for (i = 0; i < map->num_stripes; i++) { - struct zone_info *info = &zone_info[i]; - bool is_sequential; - struct blk_zone zone; - struct btrfs_dev_replace *dev_replace = &fs_info->dev_replace; - int dev_replace_is_ongoing = 0; - - device = map->stripes[i].dev; - info->physical = map->stripes[i].physical; - - if (device->bdev == NULL) { - info->alloc_offset = WP_MISSING_DEV; - continue; - } - - is_sequential = btrfs_dev_is_sequential(device, info->physical); - if (is_sequential) - num_sequential++; - else - num_conventional++; - - /* - * Consider a zone as active if we can allow any number of - * active zones. - */ - if (!device->zone_info->max_active_zones) - __set_bit(i, active); - - if (!is_sequential) { - info->alloc_offset = WP_CONVENTIONAL; - continue; - } - - /* - * This zone will be used for allocation, so mark this zone - * non-empty. - */ - btrfs_dev_clear_zone_empty(device, info->physical); - - down_read(&dev_replace->rwsem); - dev_replace_is_ongoing = btrfs_dev_replace_is_ongoing(dev_replace); - if (dev_replace_is_ongoing && dev_replace->tgtdev != NULL) - btrfs_dev_clear_zone_empty(dev_replace->tgtdev, info->physical); - up_read(&dev_replace->rwsem); - - /* - * The group is mapped to a sequential zone. Get the zone write - * pointer to determine the allocation offset within the zone. - */ - WARN_ON(!IS_ALIGNED(info->physical, fs_info->zone_size)); - nofs_flag = memalloc_nofs_save(); - ret = btrfs_get_dev_zone(device, info->physical, &zone); - memalloc_nofs_restore(nofs_flag); - if (ret == -EIO || ret == -EOPNOTSUPP) { - ret = 0; - info->alloc_offset = WP_MISSING_DEV; - continue; - } else if (ret) { + ret = btrfs_load_zone_info(fs_info, i, &zone_info[i], active, map); + if (ret) goto out; - } - - if (zone.type == BLK_ZONE_TYPE_CONVENTIONAL) { - btrfs_err_in_rcu(fs_info, - "zoned: unexpected conventional zone %llu on device %s (devid %llu)", - zone.start << SECTOR_SHIFT, - rcu_str_deref(device->name), device->devid); - ret = -EIO; - goto out; - } - - info->capacity = (zone.capacity << SECTOR_SHIFT);
- switch (zone.cond) { - case BLK_ZONE_COND_OFFLINE: - case BLK_ZONE_COND_READONLY: - btrfs_err(fs_info, - "zoned: offline/readonly zone %llu on device %s (devid %llu)", - info->physical >> device->zone_info->zone_size_shift, - rcu_str_deref(device->name), device->devid); - info->alloc_offset = WP_MISSING_DEV; - break; - case BLK_ZONE_COND_EMPTY: - info->alloc_offset = 0; - break; - case BLK_ZONE_COND_FULL: - info->alloc_offset = info->capacity; - break; - default: - /* Partially used zone */ - info->alloc_offset = ((zone.wp - zone.start) << SECTOR_SHIFT); - __set_bit(i, active); - break; - } + if (zone_info[i].alloc_offset == WP_CONVENTIONAL) + num_conventional++; + else + num_sequential++; }
if (num_sequential > 0)
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Christoph Hellwig hch@lst.de
commit 9e0e3e74dc6928a0956f4e27e24d473c65887e96 upstream.
Split the code handling a type single block group from btrfs_load_block_group_zone_info to make the code more readable.
Reviewed-by: Johannes Thumshirn johannes.thumshirn@wdc.com Signed-off-by: Christoph Hellwig hch@lst.de Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/zoned.c | 30 +++++++++++++++++++----------- 1 file changed, 19 insertions(+), 11 deletions(-)
--- a/fs/btrfs/zoned.c +++ b/fs/btrfs/zoned.c @@ -1362,6 +1362,24 @@ static int btrfs_load_zone_info(struct b return 0; }
+static int btrfs_load_block_group_single(struct btrfs_block_group *bg, + struct zone_info *info, + unsigned long *active) +{ + if (info->alloc_offset == WP_MISSING_DEV) { + btrfs_err(bg->fs_info, + "zoned: cannot recover write pointer for zone %llu", + info->physical); + return -EIO; + } + + bg->alloc_offset = info->alloc_offset; + bg->zone_capacity = info->capacity; + if (test_bit(0, active)) + set_bit(BLOCK_GROUP_FLAG_ZONE_IS_ACTIVE, &bg->runtime_flags); + return 0; +} + int btrfs_load_block_group_zone_info(struct btrfs_block_group *cache, bool new) { struct btrfs_fs_info *fs_info = cache->fs_info; @@ -1448,17 +1466,7 @@ int btrfs_load_block_group_zone_info(str
switch (map->type & BTRFS_BLOCK_GROUP_PROFILE_MASK) { case 0: /* single */ - if (zone_info[0].alloc_offset == WP_MISSING_DEV) { - btrfs_err(fs_info, - "zoned: cannot recover write pointer for zone %llu", - zone_info[0].physical); - ret = -EIO; - goto out; - } - cache->alloc_offset = zone_info[0].alloc_offset; - cache->zone_capacity = zone_info[0].capacity; - if (test_bit(0, active)) - set_bit(BLOCK_GROUP_FLAG_ZONE_IS_ACTIVE, &cache->runtime_flags); + ret = btrfs_load_block_group_single(cache, &zone_info[0], active); break; case BTRFS_BLOCK_GROUP_DUP: if (map->type & BTRFS_BLOCK_GROUP_DATA) {
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Christoph Hellwig hch@lst.de
commit 87463f7e0250d471fac41e7c9c45ae21d83b5f85 upstream.
Split the code handling a type DUP block group from btrfs_load_block_group_zone_info to make the code more readable.
Reviewed-by: Johannes Thumshirn johannes.thumshirn@wdc.com Signed-off-by: Christoph Hellwig hch@lst.de Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/zoned.c | 79 +++++++++++++++++++++++++++++-------------------------- 1 file changed, 42 insertions(+), 37 deletions(-)
--- a/fs/btrfs/zoned.c +++ b/fs/btrfs/zoned.c @@ -1380,6 +1380,47 @@ static int btrfs_load_block_group_single return 0; }
+static int btrfs_load_block_group_dup(struct btrfs_block_group *bg, + struct map_lookup *map, + struct zone_info *zone_info, + unsigned long *active) +{ + if (map->type & BTRFS_BLOCK_GROUP_DATA) { + btrfs_err(bg->fs_info, + "zoned: profile DUP not yet supported on data bg"); + return -EINVAL; + } + + if (zone_info[0].alloc_offset == WP_MISSING_DEV) { + btrfs_err(bg->fs_info, + "zoned: cannot recover write pointer for zone %llu", + zone_info[0].physical); + return -EIO; + } + if (zone_info[1].alloc_offset == WP_MISSING_DEV) { + btrfs_err(bg->fs_info, + "zoned: cannot recover write pointer for zone %llu", + zone_info[1].physical); + return -EIO; + } + if (zone_info[0].alloc_offset != zone_info[1].alloc_offset) { + btrfs_err(bg->fs_info, + "zoned: write pointer offset mismatch of zones in DUP profile"); + return -EIO; + } + + if (test_bit(0, active) != test_bit(1, active)) { + if (!btrfs_zone_activate(bg)) + return -EIO; + } else if (test_bit(0, active)) { + set_bit(BLOCK_GROUP_FLAG_ZONE_IS_ACTIVE, &bg->runtime_flags); + } + + bg->alloc_offset = zone_info[0].alloc_offset; + bg->zone_capacity = min(zone_info[0].capacity, zone_info[1].capacity); + return 0; +} + int btrfs_load_block_group_zone_info(struct btrfs_block_group *cache, bool new) { struct btrfs_fs_info *fs_info = cache->fs_info; @@ -1469,43 +1510,7 @@ int btrfs_load_block_group_zone_info(str ret = btrfs_load_block_group_single(cache, &zone_info[0], active); break; case BTRFS_BLOCK_GROUP_DUP: - if (map->type & BTRFS_BLOCK_GROUP_DATA) { - btrfs_err(fs_info, "zoned: profile DUP not yet supported on data bg"); - ret = -EINVAL; - goto out; - } - if (zone_info[0].alloc_offset == WP_MISSING_DEV) { - btrfs_err(fs_info, - "zoned: cannot recover write pointer for zone %llu", - zone_info[0].physical); - ret = -EIO; - goto out; - } - if (zone_info[1].alloc_offset == WP_MISSING_DEV) { - btrfs_err(fs_info, - "zoned: cannot recover write pointer for zone %llu", - zone_info[1].physical); - ret = -EIO; - goto out; - } - if (zone_info[0].alloc_offset != zone_info[1].alloc_offset) { - btrfs_err(fs_info, - "zoned: write pointer offset mismatch of zones in DUP profile"); - ret = -EIO; - goto out; - } - if (test_bit(0, active) != test_bit(1, active)) { - if (!btrfs_zone_activate(cache)) { - ret = -EIO; - goto out; - } - } else { - if (test_bit(0, active)) - set_bit(BLOCK_GROUP_FLAG_ZONE_IS_ACTIVE, - &cache->runtime_flags); - } - cache->alloc_offset = zone_info[0].alloc_offset; - cache->zone_capacity = min(zone_info[0].capacity, zone_info[1].capacity); + ret = btrfs_load_block_group_dup(cache, map, zone_info, active); break; case BTRFS_BLOCK_GROUP_RAID1: case BTRFS_BLOCK_GROUP_RAID0:
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Filipe Manana fdmanana@suse.com
commit 0090d6e1b210551e63cf43958dc7a1ec942cdde9 upstream.
While loading a zone's info during creation of a block group, we can race with a device replace operation and then trigger a use-after-free on the device that was just replaced (source device of the replace operation).
This happens because at btrfs_load_zone_info() we extract a device from the chunk map into a local variable and then use the device while not under the protection of the device replace rwsem. So if there's a device replace operation happening when we extract the device and that device is the source of the replace operation, we will trigger a use-after-free if before we finish using the device the replace operation finishes and frees the device.
Fix this by enlarging the critical section under the protection of the device replace rwsem so that all uses of the device are done inside the critical section.
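Schematically, the fix just moves the up_read() so that every dereference of the stripe device happens under the replace lock (a sketch of the shape of the change, not the exact code):

down_read(&dev_replace->rwsem);
device = map->stripes[zone_idx].dev;	/* may be freed once a replace finishes */

/*
 * ... btrfs_dev_is_sequential(), btrfs_get_dev_zone(), error reporting:
 * all uses of 'device' stay inside the read-side critical section ...
 */

up_read(&dev_replace->rwsem);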
CC: stable@vger.kernel.org # 6.1.x: 15c12fcc50a1: btrfs: zoned: introduce a zone_info struct in btrfs_load_block_group_zone_info CC: stable@vger.kernel.org # 6.1.x: 09a46725cc84: btrfs: zoned: factor out per-zone logic from btrfs_load_block_group_zone_info CC: stable@vger.kernel.org # 6.1.x: 9e0e3e74dc69: btrfs: zoned: factor out single bg handling from btrfs_load_block_group_zone_info CC: stable@vger.kernel.org # 6.1.x: 87463f7e0250: btrfs: zoned: factor out DUP bg handling from btrfs_load_block_group_zone_info CC: stable@vger.kernel.org # 6.1.x Reviewed-by: Johannes Thumshirn johannes.thumshirn@wdc.com Signed-off-by: Filipe Manana fdmanana@suse.com Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/zoned.c | 13 ++++++++++--- 1 file changed, 10 insertions(+), 3 deletions(-)
--- a/fs/btrfs/zoned.c +++ b/fs/btrfs/zoned.c @@ -1281,7 +1281,7 @@ static int btrfs_load_zone_info(struct b struct map_lookup *map) { struct btrfs_dev_replace *dev_replace = &fs_info->dev_replace; - struct btrfs_device *device = map->stripes[zone_idx].dev; + struct btrfs_device *device; int dev_replace_is_ongoing = 0; unsigned int nofs_flag; struct blk_zone zone; @@ -1289,7 +1289,11 @@ static int btrfs_load_zone_info(struct b
info->physical = map->stripes[zone_idx].physical;
+ down_read(&dev_replace->rwsem); + device = map->stripes[zone_idx].dev; + if (!device->bdev) { + up_read(&dev_replace->rwsem); info->alloc_offset = WP_MISSING_DEV; return 0; } @@ -1299,6 +1303,7 @@ static int btrfs_load_zone_info(struct b __set_bit(zone_idx, active);
if (!btrfs_dev_is_sequential(device, info->physical)) { + up_read(&dev_replace->rwsem); info->alloc_offset = WP_CONVENTIONAL; return 0; } @@ -1306,11 +1311,9 @@ static int btrfs_load_zone_info(struct b /* This zone will be used for allocation, so mark this zone non-empty. */ btrfs_dev_clear_zone_empty(device, info->physical);
- down_read(&dev_replace->rwsem); dev_replace_is_ongoing = btrfs_dev_replace_is_ongoing(dev_replace); if (dev_replace_is_ongoing && dev_replace->tgtdev != NULL) btrfs_dev_clear_zone_empty(dev_replace->tgtdev, info->physical); - up_read(&dev_replace->rwsem);
/* * The group is mapped to a sequential zone. Get the zone write pointer @@ -1321,6 +1324,7 @@ static int btrfs_load_zone_info(struct b ret = btrfs_get_dev_zone(device, info->physical, &zone); memalloc_nofs_restore(nofs_flag); if (ret) { + up_read(&dev_replace->rwsem); if (ret != -EIO && ret != -EOPNOTSUPP) return ret; info->alloc_offset = WP_MISSING_DEV; @@ -1332,6 +1336,7 @@ static int btrfs_load_zone_info(struct b "zoned: unexpected conventional zone %llu on device %s (devid %llu)", zone.start << SECTOR_SHIFT, rcu_str_deref(device->name), device->devid); + up_read(&dev_replace->rwsem); return -EIO; }
@@ -1359,6 +1364,8 @@ static int btrfs_load_zone_info(struct b break; }
+ up_read(&dev_replace->rwsem); + return 0; }
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ryusuke Konishi konishi.ryusuke@gmail.com
commit a4ca369ca221bb7e06c725792ac107f0e48e82e7 upstream.
Destructive writes to a block device on which nilfs2 is mounted can cause a kernel bug in the folio/page writeback start routine or writeback end routine (__folio_start_writeback in the log below):
kernel BUG at mm/page-writeback.c:3070!
Oops: invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
...
RIP: 0010:__folio_start_writeback+0xbaa/0x10e0
Code: 25 ff 0f 00 00 0f 84 18 01 00 00 e8 40 ca c6 ff e9 17 f6 ff ff e8 36 ca c6 ff 4c 89 f7 48 c7 c6 80 c0 12 84 e8 e7 b3 0f 00 90 <0f> 0b e8 1f ca c6 ff 4c 89 f7 48 c7 c6 a0 c6 12 84 e8 d0 b3 0f 00
...
Call Trace:
 <TASK>
 nilfs_segctor_do_construct+0x4654/0x69d0 [nilfs2]
 nilfs_segctor_construct+0x181/0x6b0 [nilfs2]
 nilfs_segctor_thread+0x548/0x11c0 [nilfs2]
 kthread+0x2f0/0x390
 ret_from_fork+0x4b/0x80
 ret_from_fork_asm+0x1a/0x30
 </TASK>
This is because when the log writer starts a writeback for segment summary blocks or a super root block that use the backing device's page cache, it does not wait for the ongoing folio/page writeback, resulting in an inconsistent writeback state.
Fix this issue by waiting for ongoing writebacks when putting folios/pages on the backing device into writeback state.
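The pattern being added at each writeback start point, shown in isolation (this mirrors the hunks below):

lock_page(bd_page);
wait_on_page_writeback(bd_page);	/* let any previous writeback finish first */
clear_page_dirty_for_io(bd_page);
set_page_writeback(bd_page);		/* would hit the BUG above if the page
					 * were still under writeback */
unlock_page(bd_page);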
Link: https://lkml.kernel.org/r/20240530141556.4411-1-konishi.ryusuke@gmail.com Fixes: 9ff05123e3bf ("nilfs2: segment constructor") Signed-off-by: Ryusuke Konishi konishi.ryusuke@gmail.com Tested-by: Ryusuke Konishi konishi.ryusuke@gmail.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/nilfs2/segment.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/fs/nilfs2/segment.c +++ b/fs/nilfs2/segment.c @@ -1692,6 +1692,7 @@ static void nilfs_segctor_prepare_write( if (bh->b_page != bd_page) { if (bd_page) { lock_page(bd_page); + wait_on_page_writeback(bd_page); clear_page_dirty_for_io(bd_page); set_page_writeback(bd_page); unlock_page(bd_page); @@ -1705,6 +1706,7 @@ static void nilfs_segctor_prepare_write( if (bh == segbuf->sb_super_root) { if (bh->b_page != bd_page) { lock_page(bd_page); + wait_on_page_writeback(bd_page); clear_page_dirty_for_io(bd_page); set_page_writeback(bd_page); unlock_page(bd_page); @@ -1721,6 +1723,7 @@ static void nilfs_segctor_prepare_write( } if (bd_page) { lock_page(bd_page); + wait_on_page_writeback(bd_page); clear_page_dirty_for_io(bd_page); set_page_writeback(bd_page); unlock_page(bd_page);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Oleg Nesterov oleg@redhat.com
commit 07c54cc5988f19c9642fd463c2dbdac7fc52f777 upstream.
After the recent commit 5097cbcb38e6 ("sched/isolation: Prevent boot crash when the boot CPU is nohz_full") the kernel no longer crashes, but there is another problem.
In this case tick_setup_device() calls tick_take_do_timer_from_boot() to update tick_do_timer_cpu and this triggers the WARN_ON_ONCE(irqs_disabled) in smp_call_function_single().
Kill tick_take_do_timer_from_boot() and just use WRITE_ONCE(), the new comment explains why this is safe (thanks Thomas!).
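In other words, the hand-over becomes a plain store/load pair instead of a cross-CPU call; roughly (a sketch, the reader side is only an approximation of the check in tick_periodic()):

/* First housekeeping secondary coming up: take over do_timer() duty. */
WRITE_ONCE(tick_do_timer_cpu, cpu);

/* tick_periodic()-style check on the other side of the race: */
if (READ_ONCE(tick_do_timer_cpu) == smp_processor_id())
	do_timer(1);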
Fixes: 08ae95f4fd3b ("nohz_full: Allow the boot CPU to be nohz_full") Signed-off-by: Oleg Nesterov oleg@redhat.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20240528122019.GA28794@redhat.com Link: https://lore.kernel.org/all/20240522151742.GA10400@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/time/tick-common.c | 42 ++++++++++++++---------------------------- 1 file changed, 14 insertions(+), 28 deletions(-)
--- a/kernel/time/tick-common.c +++ b/kernel/time/tick-common.c @@ -179,26 +179,6 @@ void tick_setup_periodic(struct clock_ev } }
-#ifdef CONFIG_NO_HZ_FULL -static void giveup_do_timer(void *info) -{ - int cpu = *(unsigned int *)info; - - WARN_ON(tick_do_timer_cpu != smp_processor_id()); - - tick_do_timer_cpu = cpu; -} - -static void tick_take_do_timer_from_boot(void) -{ - int cpu = smp_processor_id(); - int from = tick_do_timer_boot_cpu; - - if (from >= 0 && from != cpu) - smp_call_function_single(from, giveup_do_timer, &cpu, 1); -} -#endif - /* * Setup the tick device */ @@ -222,19 +202,25 @@ static void tick_setup_device(struct tic tick_next_period = ktime_get(); #ifdef CONFIG_NO_HZ_FULL /* - * The boot CPU may be nohz_full, in which case set - * tick_do_timer_boot_cpu so the first housekeeping - * secondary that comes up will take do_timer from - * us. + * The boot CPU may be nohz_full, in which case the + * first housekeeping secondary will take do_timer() + * from it. */ if (tick_nohz_full_cpu(cpu)) tick_do_timer_boot_cpu = cpu;
- } else if (tick_do_timer_boot_cpu != -1 && - !tick_nohz_full_cpu(cpu)) { - tick_take_do_timer_from_boot(); + } else if (tick_do_timer_boot_cpu != -1 && !tick_nohz_full_cpu(cpu)) { tick_do_timer_boot_cpu = -1; - WARN_ON(tick_do_timer_cpu != cpu); + /* + * The boot CPU will stay in periodic (NOHZ disabled) + * mode until clocksource_done_booting() called after + * smp_init() selects a high resolution clocksource and + * timekeeping_notify() kicks the NOHZ stuff alive. + * + * So this WRITE_ONCE can only race with the READ_ONCE + * check in tick_periodic() but this race is harmless. + */ + WRITE_ONCE(tick_do_timer_cpu, cpu); #endif }
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Miaohe Lin linmiaohe@huawei.com
commit fe6f86f4b40855a130a19aa589f9ba7f650423f4 upstream.
When I ran memory failure tests recently, the panic below occurred:
kernel BUG at include/linux/mm.h:1135! invalid opcode: 0000 [#1] PREEMPT SMP NOPTI CPU: 9 PID: 137 Comm: kswapd1 Not tainted 6.9.0-rc4-00491-gd5ce28f156fe-dirty #14 RIP: 0010:shrink_huge_zero_page_scan+0x168/0x1a0 RSP: 0018:ffff9933c6c57bd0 EFLAGS: 00000246 RAX: 000000000000003e RBX: 0000000000000000 RCX: ffff88f61fc5c9c8 RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff88f61fc5c9c0 RBP: ffffcd7c446b0000 R08: ffffffff9a9405f0 R09: 0000000000005492 R10: 00000000000030ea R11: ffffffff9a9405f0 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000000 R15: ffff88e703c4ac00 FS: 0000000000000000(0000) GS:ffff88f61fc40000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055f4da6e9878 CR3: 0000000c71048000 CR4: 00000000000006f0 Call Trace: <TASK> do_shrink_slab+0x14f/0x6a0 shrink_slab+0xca/0x8c0 shrink_node+0x2d0/0x7d0 balance_pgdat+0x33a/0x720 kswapd+0x1f3/0x410 kthread+0xd5/0x100 ret_from_fork+0x2f/0x50 ret_from_fork_asm+0x1a/0x30 </TASK> Modules linked in: mce_inject hwpoison_inject ---[ end trace 0000000000000000 ]--- RIP: 0010:shrink_huge_zero_page_scan+0x168/0x1a0 RSP: 0018:ffff9933c6c57bd0 EFLAGS: 00000246 RAX: 000000000000003e RBX: 0000000000000000 RCX: ffff88f61fc5c9c8 RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff88f61fc5c9c0 RBP: ffffcd7c446b0000 R08: ffffffff9a9405f0 R09: 0000000000005492 R10: 00000000000030ea R11: ffffffff9a9405f0 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000000 R15: ffff88e703c4ac00 FS: 0000000000000000(0000) GS:ffff88f61fc40000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055f4da6e9878 CR3: 0000000c71048000 CR4: 00000000000006f0
The root cause is that the HWPoison flag is set on huge_zero_folio without increasing the folio refcount. But then unpoison_memory() decreases the folio refcount unexpectedly, as the folio looks like a successfully hwpoisoned one, leading to VM_BUG_ON_PAGE(page_ref_count(page) == 0) when huge_zero_folio is released.
Skip unpoisoning huge_zero_folio in unpoison_memory() to fix this issue. We're not prepared to unpoison huge_zero_folio yet.
Link: https://lkml.kernel.org/r/20240516122608.22610-1-linmiaohe@huawei.com Fixes: 478d134e9506 ("mm/huge_memory: do not overkill when splitting huge_zero_page") Signed-off-by: Miaohe Lin linmiaohe@huawei.com Acked-by: David Hildenbrand david@redhat.com Reviewed-by: Yang Shi shy828301@gmail.com Reviewed-by: Oscar Salvador osalvador@suse.de Reviewed-by: Anshuman Khandual anshuman.khandual@arm.com Cc: Naoya Horiguchi nao.horiguchi@gmail.com Cc: Xu Yu xuyu@linux.alibaba.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Miaohe Lin linmiaohe@huawei.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/memory-failure.c | 7 +++++++ 1 file changed, 7 insertions(+)
--- a/mm/memory-failure.c +++ b/mm/memory-failure.c @@ -2346,6 +2346,13 @@ int unpoison_memory(unsigned long pfn) goto unlock_mutex; }
+ if (is_huge_zero_page(page)) { + unpoison_pr_info("Unpoison: huge zero page is not supported %#lx\n", + pfn, &unpoison_rs); + ret = -EOPNOTSUPP; + goto unlock_mutex; + } + if (!PageHWPoison(p)) { unpoison_pr_info("Unpoison: Page was already unpoisoned %#lx\n", pfn, &unpoison_rs);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Miaohe Lin linmiaohe@huawei.com
commit 8cf360b9d6a840700e06864236a01a883b34bbad upstream.
When I ran memory failure tests recently, the panic below occurred:
page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x8cee00 flags: 0x6fffe0000000000(node=1|zone=2|lastcpupid=0x7fff) raw: 06fffe0000000000 dead000000000100 dead000000000122 0000000000000000 raw: 0000000000000000 0000000000000009 00000000ffffffff 0000000000000000 page dumped because: VM_BUG_ON_PAGE(!PageBuddy(page)) ------------[ cut here ]------------ kernel BUG at include/linux/page-flags.h:1009! invalid opcode: 0000 [#1] PREEMPT SMP NOPTI RIP: 0010:__del_page_from_free_list+0x151/0x180 RSP: 0018:ffffa49c90437998 EFLAGS: 00000046 RAX: 0000000000000035 RBX: 0000000000000009 RCX: ffff8dd8dfd1c9c8 RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff8dd8dfd1c9c0 RBP: ffffd901233b8000 R08: ffffffffab5511f8 R09: 0000000000008c69 R10: 0000000000003c15 R11: ffffffffab5511f8 R12: ffff8dd8fffc0c80 R13: 0000000000000001 R14: ffff8dd8fffc0c80 R15: 0000000000000009 FS: 00007ff916304740(0000) GS:ffff8dd8dfd00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055eae50124c8 CR3: 00000008479e0000 CR4: 00000000000006f0 Call Trace: <TASK> __rmqueue_pcplist+0x23b/0x520 get_page_from_freelist+0x26b/0xe40 __alloc_pages_noprof+0x113/0x1120 __folio_alloc_noprof+0x11/0xb0 alloc_buddy_hugetlb_folio.isra.0+0x5a/0x130 __alloc_fresh_hugetlb_folio+0xe7/0x140 alloc_pool_huge_folio+0x68/0x100 set_max_huge_pages+0x13d/0x340 hugetlb_sysctl_handler_common+0xe8/0x110 proc_sys_call_handler+0x194/0x280 vfs_write+0x387/0x550 ksys_write+0x64/0xe0 do_syscall_64+0xc2/0x1d0 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7ff916114887 RSP: 002b:00007ffec8a2fd78 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 000055eae500e350 RCX: 00007ff916114887 RDX: 0000000000000004 RSI: 000055eae500e390 RDI: 0000000000000003 RBP: 000055eae50104c0 R08: 0000000000000000 R09: 000055eae50104c0 R10: 0000000000000077 R11: 0000000000000246 R12: 0000000000000004 R13: 0000000000000004 R14: 00007ff916216b80 R15: 00007ff916216a00 </TASK> Modules linked in: mce_inject hwpoison_inject ---[ end trace 0000000000000000 ]---
And before the panic, there was a warning about bad page state:
BUG: Bad page state in process page-types pfn:8cee00 page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x8cee00 flags: 0x6fffe0000000000(node=1|zone=2|lastcpupid=0x7fff) page_type: 0xffffff7f(buddy) raw: 06fffe0000000000 ffffd901241c0008 ffffd901240f8008 0000000000000000 raw: 0000000000000000 0000000000000009 00000000ffffff7f 0000000000000000 page dumped because: nonzero mapcount Modules linked in: mce_inject hwpoison_inject CPU: 8 PID: 154211 Comm: page-types Not tainted 6.9.0-rc4-00499-g5544ec3178e2-dirty #22 Call Trace: <TASK> dump_stack_lvl+0x83/0xa0 bad_page+0x63/0xf0 free_unref_page+0x36e/0x5c0 unpoison_memory+0x50b/0x630 simple_attr_write_xsigned.constprop.0.isra.0+0xb3/0x110 debugfs_attr_write+0x42/0x60 full_proxy_write+0x5b/0x80 vfs_write+0xcd/0x550 ksys_write+0x64/0xe0 do_syscall_64+0xc2/0x1d0 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7f189a514887 RSP: 002b:00007ffdcd899718 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f189a514887 RDX: 0000000000000009 RSI: 00007ffdcd899730 RDI: 0000000000000003 RBP: 00007ffdcd8997a0 R08: 0000000000000000 R09: 00007ffdcd8994b2 R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffdcda199a8 R13: 0000000000404af1 R14: 000000000040ad78 R15: 00007f189a7a5040 </TASK>
The root cause is likely the following race:
memory_failure
 try_memory_failure_hugetlb
  me_huge_page
   __page_handle_poison
    dissolve_free_hugetlb_folio
    drain_all_pages      -- Buddy page can be isolated e.g. for compaction.
    take_page_off_buddy  -- Failed as page is not in the buddy list.
                         -- Page can be putback into buddy after compaction.
   page_ref_inc          -- Leads to buddy page with refcnt = 1.
Then unpoison_memory() can unpoison the page and send the buddy page back into the buddy list again, leading to the bad page state warning above. bad_page() then calls page_mapcount_reset() to remove PageBuddy from the buddy page, leading to the later VM_BUG_ON_PAGE(!PageBuddy(page)) when this page is allocated.
Fix this issue by only treating __page_handle_poison() as successful when it returns 1.
Link: https://lkml.kernel.org/r/20240523071217.1696196-1-linmiaohe@huawei.com Fixes: ceaf8fbea79a ("mm, hwpoison: skip raw hwpoison page in freeing 1GB hugepage") Signed-off-by: Miaohe Lin linmiaohe@huawei.com Cc: Naoya Horiguchi nao.horiguchi@gmail.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Miaohe Lin linmiaohe@huawei.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/memory-failure.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/mm/memory-failure.c +++ b/mm/memory-failure.c @@ -1110,7 +1110,7 @@ static int me_huge_page(struct page_stat * subpages. */ put_page(hpage); - if (__page_handle_poison(p) >= 0) { + if (__page_handle_poison(p) > 0) { page_ref_inc(p); res = MF_RECOVERED; } else { @@ -1888,7 +1888,7 @@ retry: */ if (res == 0) { unlock_page(head); - if (__page_handle_poison(p) >= 0) { + if (__page_handle_poison(p) > 0) { page_ref_inc(p); res = MF_RECOVERED; } else {
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Doug Brown doug@schmorgal.com
commit 5208e7ced520a813b4f4774451fbac4e517e78b2 upstream.
The FIFO is 64 bytes, but the FCR is configured to fire the TX interrupt when the FIFO is half empty (bit 3 = 0). Thus, we should only write 32 bytes when a TX interrupt occurs.
This fixes a problem observed on the PXA168 that dropped a bunch of TX bytes during large transmissions.
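To make the arithmetic concrete, here is a minimal userspace C model (illustrative only; the function and macro names are invented, this is not driver code) of what happens when the TX interrupt fires at the half-empty trigger: up to 32 bytes may still sit in the 64-byte FIFO, so loading 64 bytes overflows it, while 32 bytes always fit.

#include <stdio.h>

#define FIFO_SIZE 64
/* With FCR bit 3 = 0 the TX interrupt fires at "half empty", so in the
 * worst case 32 bytes are still queued in the FIFO when loading starts.
 */
#define WORST_CASE_LEVEL (FIFO_SIZE / 2)

static unsigned int bytes_dropped(unsigned int tx_loadsz)
{
	unsigned int level = WORST_CASE_LEVEL;
	unsigned int dropped = 0, i;

	for (i = 0; i < tx_loadsz; i++) {
		if (level < FIFO_SIZE)
			level++;	/* byte accepted by the FIFO */
		else
			dropped++;	/* FIFO already full: byte lost */
	}
	return dropped;
}

int main(void)
{
	printf("tx_loadsz=64: %u bytes dropped\n", bytes_dropped(64));	/* 32 */
	printf("tx_loadsz=32: %u bytes dropped\n", bytes_dropped(32));	/*  0 */
	return 0;
}

The 8250 core uses tx_loadsz as the per-interrupt byte budget, so capping it at 32 matches the configured FIFO trigger level.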
Fixes: ab28f51c77cd ("serial: rewrite pxa2xx-uart to use 8250_core") Signed-off-by: Doug Brown doug@schmorgal.com Link: https://lore.kernel.org/r/20240519191929.122202-1-doug@schmorgal.com Cc: stable stable@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/tty/serial/8250/8250_pxa.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/tty/serial/8250/8250_pxa.c +++ b/drivers/tty/serial/8250/8250_pxa.c @@ -124,6 +124,7 @@ static int serial_pxa_probe(struct platf uart.port.regshift = 2; uart.port.irq = irq; uart.port.fifosize = 64; + uart.tx_loadsz = 32; uart.port.flags = UPF_IOREMAP | UPF_SKIP_TEST | UPF_FIXED_TYPE; uart.port.dev = &pdev->dev; uart.port.uartclk = clk_get_rate(data->clk);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: YonglongLi liyonglong@chinatelecom.cn
commit 40eec1795cc27b076d49236649a29507c7ed8c2d upstream.
The creation of new subflows can fail for different reasons. If no subflow has been created using the received ADD_ADDR, the related counters should not be updated; otherwise they will never be decremented for events related to this ID later on.
For the moment, the number of accepted ADD_ADDR is only decremented upon the reception of a related RM_ADDR, and only if the remote address ID is currently being used by at least one subflow. In other words, if no subflow can be created with the received address, the counter will not be decremented. In this case, it is then important not to increment pm.add_addr_accepted counter, and not to modify pm.accept_addr bit.
Note that this patch does not modify the behaviour in case of failures later on, e.g. if the MP Join is dropped or rejected.
The "remove invalid addresses" MP Join subtest has been modified to validate this case. The broadcast IP address is added before the "valid" address that will be used to successfully create a subflow, and the limit is decreased by one: without this patch, it was not possible to create the last subflow, because:
- the broadcast address would have been accepted even if it was not usable: the creation of a subflow to this address results in an error,
- the limit of 2 accepted ADD_ADDR would have then been reached.
Fixes: 01cacb00b35c ("mptcp: add netlink-based PM") Cc: stable@vger.kernel.org Co-developed-by: Matthieu Baerts (NGI0) matttbe@kernel.org Signed-off-by: Matthieu Baerts (NGI0) matttbe@kernel.org Signed-off-by: YonglongLi liyonglong@chinatelecom.cn Reviewed-by: Mat Martineau martineau@kernel.org Signed-off-by: Matthieu Baerts (NGI0) matttbe@kernel.org Link: https://lore.kernel.org/r/20240607-upstream-net-20240607-misc-fixes-v1-3-1ab... Signed-off-by: Jakub Kicinski kuba@kernel.org [ Conflicts in the selftests, in the same context, because the next line with 'run_tests' has been updated later by a few commits like commit e571fb09c893 ("selftests: mptcp: add speed env var"). We don't need to touch this line, nor to backport the long refactoring series. ] Signed-off-by: Matthieu Baerts (NGI0) matttbe@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/mptcp/pm_netlink.c | 16 ++++++++++------ tools/testing/selftests/net/mptcp/mptcp_join.sh | 4 ++-- 2 files changed, 12 insertions(+), 8 deletions(-)
--- a/net/mptcp/pm_netlink.c +++ b/net/mptcp/pm_netlink.c @@ -685,6 +685,7 @@ static void mptcp_pm_nl_add_addr_receive unsigned int add_addr_accept_max; struct mptcp_addr_info remote; unsigned int subflows_max; + bool sf_created = false; int i, nr;
add_addr_accept_max = mptcp_pm_get_add_addr_accept_max(msk); @@ -710,15 +711,18 @@ static void mptcp_pm_nl_add_addr_receive */ nr = fill_local_addresses_vec(msk, addrs);
- msk->pm.add_addr_accepted++; - if (msk->pm.add_addr_accepted >= add_addr_accept_max || - msk->pm.subflows >= subflows_max) - WRITE_ONCE(msk->pm.accept_addr, false); - spin_unlock_bh(&msk->pm.lock); for (i = 0; i < nr; i++) - __mptcp_subflow_connect(sk, &addrs[i], &remote); + if (__mptcp_subflow_connect(sk, &addrs[i], &remote) == 0) + sf_created = true; spin_lock_bh(&msk->pm.lock); + + if (sf_created) { + msk->pm.add_addr_accepted++; + if (msk->pm.add_addr_accepted >= add_addr_accept_max || + msk->pm.subflows >= subflows_max) + WRITE_ONCE(msk->pm.accept_addr, false); + } }
void mptcp_pm_nl_addr_send_ack(struct mptcp_sock *msk) --- a/tools/testing/selftests/net/mptcp/mptcp_join.sh +++ b/tools/testing/selftests/net/mptcp/mptcp_join.sh @@ -2343,10 +2343,10 @@ remove_tests() if reset "remove invalid addresses"; then pm_nl_set_limits $ns1 3 3 pm_nl_add_endpoint $ns1 10.0.12.1 flags signal - pm_nl_add_endpoint $ns1 10.0.3.1 flags signal # broadcast IP: no packet for this address will be received on ns1 pm_nl_add_endpoint $ns1 224.0.0.1 flags signal - pm_nl_set_limits $ns2 3 3 + pm_nl_add_endpoint $ns1 10.0.3.1 flags signal + pm_nl_set_limits $ns2 2 2 run_tests $ns1 $ns2 10.0.1.1 0 -3 0 speed_10 chk_join_nr 1 1 1 chk_add_nr 3 3
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sam James sam@gentoo.org
This reverts commit 0c42f7e039aba3de6d7dbf92da708e2b2ecba557 which is commit 35e351780fa9d8240dd6f7e4f245f9ea37e96c19 upstream.
The backport is incomplete and causes xfstests failures. The consequences of the incomplete backport seem worse than the original issue, so pick the lesser evil and revert until a full backport is ready.
Link: https://lore.kernel.org/stable/20240604004751.3883227-1-leah.rumancik@gmail.... Reported-by: Leah Rumancik leah.rumancik@gmail.com Signed-off-by: Sam James sam@gentoo.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/fork.c | 18 +++++++++--------- 1 file changed, 9 insertions(+), 9 deletions(-)
--- a/kernel/fork.c +++ b/kernel/fork.c @@ -662,15 +662,6 @@ static __latent_entropy int dup_mmap(str } else if (anon_vma_fork(tmp, mpnt)) goto fail_nomem_anon_vma_fork; tmp->vm_flags &= ~(VM_LOCKED | VM_LOCKONFAULT); - /* - * Copy/update hugetlb private vma information. - */ - if (is_vm_hugetlb_page(tmp)) - hugetlb_dup_vma_private(tmp); - - if (tmp->vm_ops && tmp->vm_ops->open) - tmp->vm_ops->open(tmp); - file = tmp->vm_file; if (file) { struct address_space *mapping = file->f_mapping; @@ -687,6 +678,12 @@ static __latent_entropy int dup_mmap(str i_mmap_unlock_write(mapping); }
+ /* + * Copy/update hugetlb private vma information. + */ + if (is_vm_hugetlb_page(tmp)) + hugetlb_dup_vma_private(tmp); + /* Link the vma into the MT */ mas.index = tmp->vm_start; mas.last = tmp->vm_end - 1; @@ -698,6 +695,9 @@ static __latent_entropy int dup_mmap(str if (!(tmp->vm_flags & VM_WIPEONFORK)) retval = copy_page_range(tmp, mpnt);
+ if (tmp->vm_ops && tmp->vm_ops->open) + tmp->vm_ops->open(tmp); + if (retval) goto loop_out; }
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Beleswar Padhi b-padhi@ti.com
commit 1dc7242f6ee0c99852cb90676d7fe201cf5de422 upstream.
In case of errors during the core start operation from sysfs, the driver directly returns with the -EPERM error code. Fix this by jumping to the 'put_mbox' error handling label so that mailbox channels are freed on error before returning. Similarly, jump to the 'out' error handling label to return the required -EPERM error code during the core stop operation from sysfs.
Fixes: 3c8a9066d584 ("remoteproc: k3-r5: Do not allow core1 to power up before core0 via sysfs") Signed-off-by: Beleswar Padhi b-padhi@ti.com Link: https://lore.kernel.org/r/20240506141849.1735679-1-b-padhi@ti.com Signed-off-by: Mathieu Poirier mathieu.poirier@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/remoteproc/ti_k3_r5_remoteproc.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
--- a/drivers/remoteproc/ti_k3_r5_remoteproc.c +++ b/drivers/remoteproc/ti_k3_r5_remoteproc.c @@ -575,7 +575,8 @@ static int k3_r5_rproc_start(struct rpro if (core != core0 && core0->rproc->state == RPROC_OFFLINE) { dev_err(dev, "%s: can not start core 1 before core 0\n", __func__); - return -EPERM; + ret = -EPERM; + goto put_mbox; }
ret = k3_r5_core_run(core); @@ -643,7 +644,8 @@ static int k3_r5_rproc_stop(struct rproc if (core != core1 && core1->rproc->state != RPROC_OFFLINE) { dev_err(dev, "%s: can not stop core 0 before core 1\n", __func__); - return -EPERM; + ret = -EPERM; + goto out; }
ret = k3_r5_core_halt(core);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: David Howells dhowells@redhat.com
commit c3d6569a43322f371e7ba0ad386112723757ac8f upstream.
cachefiles_ondemand_init_object() as called from cachefiles_open_file() and cachefiles_create_tmpfile() does not check whether object->ondemand is set before dereferencing it, leading to an oops like the following:
RIP: 0010:cachefiles_ondemand_init_object+0x9/0x41 ... Call Trace: <TASK> cachefiles_open_file+0xc9/0x187 cachefiles_lookup_cookie+0x122/0x2be fscache_cookie_state_machine+0xbe/0x32b fscache_cookie_worker+0x1f/0x2d process_one_work+0x136/0x208 process_scheduled_works+0x3a/0x41 worker_thread+0x1a2/0x1f6 kthread+0xca/0xd2 ret_from_fork+0x21/0x33
Fix this by making cachefiles_ondemand_init_object() return immediately if object->ondemand is NULL.
Fixes: 3c5ecfe16e76 ("cachefiles: extract ondemand info field from cachefiles_object") Reported-by: Marc Dionne marc.dionne@auristor.com Signed-off-by: David Howells dhowells@redhat.com cc: Gao Xiang xiang@kernel.org cc: Chao Yu chao@kernel.org cc: Yue Hu huyue2@coolpad.com cc: Jeffle Xu jefflexu@linux.alibaba.com cc: linux-erofs@lists.ozlabs.org cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/cachefiles/ondemand.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/fs/cachefiles/ondemand.c +++ b/fs/cachefiles/ondemand.c @@ -611,6 +611,9 @@ int cachefiles_ondemand_init_object(stru struct fscache_volume *volume = object->volume->vcookie; size_t volume_key_size, cookie_key_size, data_len;
+ if (!object->ondemand) + return 0; + /* * CacheFiles will firstly check the cache file under the root cache * directory. If the coherency check failed, it will fallback to
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Johan Hovold johan+linaro@kernel.org
commit 66c39332d02d65e311ec89b0051130bfcd00c9ac upstream.
Qualcomm Bluetooth controllers may not have been provisioned with a valid device address and instead end up using the default address 00:00:00:00:5a:ad.
This address is now used to determine if a controller has a valid address or if one needs to be provided through devicetree or by user space before the controller can be used.
It turns out that the WCN3991 controllers used in Chromium Trogdor machines use a different default address, 39:98:00:00:5a:ad, which also needs to be marked as invalid so that the correct address is fetched from the devicetree.
Qualcomm has unfortunately not yet provided any answers as to whether 39:98 encodes a hardware ID and whether there are other variants of the default address that need to be handled by the driver.
For now, add the Trogdor WCN3991 default address to the device address check to avoid having these controllers start with the default address instead of their assigned addresses.
Fixes: 32868e126c78 ("Bluetooth: qca: fix invalid device address check") Cc: stable@vger.kernel.org # 6.5 Cc: Doug Anderson dianders@chromium.org Cc: Janaki Ramaiah Thota quic_janathot@quicinc.com Signed-off-by: Johan Hovold johan+linaro@kernel.org Tested-by: Douglas Anderson dianders@chromium.org Signed-off-by: Luiz Augusto von Dentz luiz.von.dentz@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/bluetooth/btqca.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
--- a/drivers/bluetooth/btqca.c +++ b/drivers/bluetooth/btqca.c @@ -16,6 +16,7 @@ #define VERSION "0.1"
#define QCA_BDADDR_DEFAULT (&(bdaddr_t) {{ 0xad, 0x5a, 0x00, 0x00, 0x00, 0x00 }}) +#define QCA_BDADDR_WCN3991 (&(bdaddr_t) {{ 0xad, 0x5a, 0x00, 0x00, 0x98, 0x39 }})
int qca_read_soc_version(struct hci_dev *hdev, struct qca_btsoc_version *ver, enum qca_btsoc_type soc_type) @@ -708,8 +709,10 @@ static int qca_check_bdaddr(struct hci_d }
bda = (struct hci_rp_read_bd_addr *)skb->data; - if (!bacmp(&bda->bdaddr, QCA_BDADDR_DEFAULT)) + if (!bacmp(&bda->bdaddr, QCA_BDADDR_DEFAULT) || + !bacmp(&bda->bdaddr, QCA_BDADDR_WCN3991)) { set_bit(HCI_QUIRK_USE_BDADDR_PROPERTY, &hdev->quirks); + }
kfree_skb(skb);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Johan Hovold johan+linaro@kernel.org
commit dd336649ba89789c845618dcbc09867010aec673 upstream.
The default device address apparently comes from the NVM configuration file and can differ quite a bit between controllers.
Store the default address when parsing the configuration file and use it to determine whether the controller has been provisioned with an address.
This makes sure that devices without a unique address start as unconfigured unless a valid address has been provided in the devicetree.
Fixes: 32868e126c78 ("Bluetooth: qca: fix invalid device address check") Cc: stable@vger.kernel.org # 6.5 Cc: Doug Anderson dianders@chromium.org Cc: Janaki Ramaiah Thota quic_janathot@quicinc.com Signed-off-by: Johan Hovold johan+linaro@kernel.org Tested-by: Douglas Anderson dianders@chromium.org Signed-off-by: Luiz Augusto von Dentz luiz.von.dentz@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/bluetooth/btqca.c | 21 ++++++++++++--------- drivers/bluetooth/btqca.h | 2 ++ 2 files changed, 14 insertions(+), 9 deletions(-)
--- a/drivers/bluetooth/btqca.c +++ b/drivers/bluetooth/btqca.c @@ -15,9 +15,6 @@
#define VERSION "0.1"
-#define QCA_BDADDR_DEFAULT (&(bdaddr_t) {{ 0xad, 0x5a, 0x00, 0x00, 0x00, 0x00 }}) -#define QCA_BDADDR_WCN3991 (&(bdaddr_t) {{ 0xad, 0x5a, 0x00, 0x00, 0x98, 0x39 }}) - int qca_read_soc_version(struct hci_dev *hdev, struct qca_btsoc_version *ver, enum qca_btsoc_type soc_type) { @@ -411,6 +408,14 @@ static int qca_tlv_check_data(struct hci
/* Update NVM tags as needed */ switch (tag_id) { + case EDL_TAG_ID_BD_ADDR: + if (tag_len != sizeof(bdaddr_t)) + return -EINVAL; + + memcpy(&config->bdaddr, tlv_nvm->data, sizeof(bdaddr_t)); + + break; + case EDL_TAG_ID_HCI: if (tag_len < 3) return -EINVAL; @@ -685,7 +690,7 @@ int qca_set_bdaddr_rome(struct hci_dev * } EXPORT_SYMBOL_GPL(qca_set_bdaddr_rome);
-static int qca_check_bdaddr(struct hci_dev *hdev) +static int qca_check_bdaddr(struct hci_dev *hdev, const struct qca_fw_config *config) { struct hci_rp_read_bd_addr *bda; struct sk_buff *skb; @@ -709,10 +714,8 @@ static int qca_check_bdaddr(struct hci_d }
bda = (struct hci_rp_read_bd_addr *)skb->data; - if (!bacmp(&bda->bdaddr, QCA_BDADDR_DEFAULT) || - !bacmp(&bda->bdaddr, QCA_BDADDR_WCN3991)) { + if (!bacmp(&bda->bdaddr, &config->bdaddr)) set_bit(HCI_QUIRK_USE_BDADDR_PROPERTY, &hdev->quirks); - }
kfree_skb(skb);
@@ -740,7 +743,7 @@ int qca_uart_setup(struct hci_dev *hdev, enum qca_btsoc_type soc_type, struct qca_btsoc_version ver, const char *firmware_name) { - struct qca_fw_config config; + struct qca_fw_config config = {}; int err; u8 rom_ver = 0; u32 soc_ver; @@ -925,7 +928,7 @@ int qca_uart_setup(struct hci_dev *hdev, break; }
- err = qca_check_bdaddr(hdev); + err = qca_check_bdaddr(hdev, &config); if (err) return err;
--- a/drivers/bluetooth/btqca.h +++ b/drivers/bluetooth/btqca.h @@ -29,6 +29,7 @@ #define EDL_PATCH_CONFIG_RES_EVT (0x00) #define QCA_DISABLE_LOGGING_SUB_OP (0x14)
+#define EDL_TAG_ID_BD_ADDR 2 #define EDL_TAG_ID_HCI (17) #define EDL_TAG_ID_DEEP_SLEEP (27)
@@ -93,6 +94,7 @@ struct qca_fw_config { uint8_t user_baud_rate; enum qca_tlv_dnld_mode dnld_mode; enum qca_tlv_dnld_mode dnld_type; + bdaddr_t bdaddr; };
struct edl_event_hdr {
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sicong Huang congei42@163.com
commit 5c9c5d7f26acc2c669c1dcf57d1bb43ee99220ce upstream.
In gb_interface_create(), &intf->mode_switch_work is initialized with the gb_interface_mode_switch_work() handler. The work is then queued by gb_interface_request_mode_switch(). Here is the relevant code. if (!queue_work(system_long_wq, &intf->mode_switch_work)) { ... }
If gb_interface_release() is called for cleanup, there may still be unfinished work. This function calls kfree() to free the object "intf". However, if gb_interface_mode_switch_work() runs after the kfree(), it causes a use-after-free error, as gb_interface_mode_switch_work() uses the object "intf". A possible execution flow that leads to the issue is as follows:
CPU0                      CPU1
                        | gb_interface_create
                        | gb_interface_request_mode_switch
gb_interface_release    |
kfree(intf) (free)      |
                        | gb_interface_mode_switch_work
                        |   mutex_lock(&intf->mutex) (use)
Fix it by canceling the work before kfree.
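As a sketch of the pattern (illustrative only; "example_dev" and "example_release" are made-up names, not greybus code): cancel_work_sync() both removes a queued work item and waits for a handler that is already running, so freeing the object afterwards is safe.

#include <linux/workqueue.h>
#include <linux/slab.h>

struct example_dev {
	struct work_struct work;
	/* ... other fields ... */
};

static void example_release(struct example_dev *dev)
{
	/* Wait for any queued or in-flight work handler to finish... */
	cancel_work_sync(&dev->work);
	/* ...so nothing can dereference *dev after this point. */
	kfree(dev);
}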
Signed-off-by: Sicong Huang congei42@163.com Link: https://lore.kernel.org/r/20240416080313.92306-1-congei42@163.com Cc: Ronnie Sahlberg rsahlberg@ciq.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/greybus/interface.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/greybus/interface.c +++ b/drivers/greybus/interface.c @@ -694,6 +694,7 @@ static void gb_interface_release(struct
trace_gb_interface_release(intf);
+ cancel_work_sync(&intf->mode_switch_work); kfree(intf); }
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jisheng Zhang jszhang@kernel.org
[ Upstream commit 22130dae0533c474e4e0db930a88caa9b397d083 ]
When there's no IRQ (this can be due to various reasons, for example no IRQ support from the hardware, or we just want to use a polling solution, and so on), falling back to polling is still better than no support at all.
Signed-off-by: Jisheng Zhang jszhang@kernel.org Link: https://lore.kernel.org/r/20230806092056.2467-3-jszhang@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Stable-dep-of: 87d80bfbd577 ("serial: 8250_dw: Don't use struct dw8250_data outside of 8250_dw") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/tty/serial/8250/8250_dw.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/tty/serial/8250/8250_dw.c b/drivers/tty/serial/8250/8250_dw.c index 88035100b86c6..a1f2259cc9a98 100644 --- a/drivers/tty/serial/8250/8250_dw.c +++ b/drivers/tty/serial/8250/8250_dw.c @@ -523,7 +523,10 @@ static int dw8250_probe(struct platform_device *pdev) if (!regs) return dev_err_probe(dev, -EINVAL, "no registers defined\n");
- irq = platform_get_irq(pdev, 0); + irq = platform_get_irq_optional(pdev, 0); + /* no interrupt -> fall back to polling */ + if (irq == -ENXIO) + irq = 0; if (irq < 0) return irq;
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Andy Shevchenko andriy.shevchenko@linux.intel.com
[ Upstream commit 79d713baf63c8f23cc58b304c40be33d64a12aaf ]
In some APIs we would like to assign a special value to iotype and compare against it in other places. Introduce UPIO_UNKNOWN for this purpose.
Note, we can't use 0, because it's a valid value for IO port access.
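As a hypothetical usage sketch (the example_* helpers below are illustrative, not part of this patch): a port can be marked as not yet configured and checked later, which would not work with 0 because UPIO_PORT (SERIAL_IO_PORT) is 0.

#include <linux/types.h>
#include <linux/serial_core.h>

/* Illustrative helpers only -- not from the serial core. */
static void example_mark_iotype_unset(struct uart_port *port)
{
	port->iotype = UPIO_UNKNOWN;	/* explicitly "not decided yet" */
}

static bool example_iotype_is_set(const struct uart_port *port)
{
	/* 0 could not serve as the sentinel here: UPIO_PORT == 0 is a
	 * perfectly valid I/O-port access type.
	 */
	return port->iotype != UPIO_UNKNOWN;
}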
Signed-off-by: Andy Shevchenko andriy.shevchenko@linux.intel.com Link: https://lore.kernel.org/r/20240304123035.758700-3-andriy.shevchenko@linux.in... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Stable-dep-of: 87d80bfbd577 ("serial: 8250_dw: Don't use struct dw8250_data outside of 8250_dw") Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/serial_core.h | 1 + 1 file changed, 1 insertion(+)
diff --git a/include/linux/serial_core.h b/include/linux/serial_core.h index 13bf20242b61a..1c9b3f27f2d36 100644 --- a/include/linux/serial_core.h +++ b/include/linux/serial_core.h @@ -467,6 +467,7 @@ struct uart_port { unsigned char iotype; /* io access style */ unsigned char quirks; /* internal quirks */
+#define UPIO_UNKNOWN ((unsigned char)~0U) /* UCHAR_MAX */ #define UPIO_PORT (SERIAL_IO_PORT) /* 8b I/O port access */ #define UPIO_HUB6 (SERIAL_IO_HUB6) /* Hub6 ISA card */ #define UPIO_MEM (SERIAL_IO_MEM) /* driver-specific */
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Shichao Lai shichaorai@gmail.com
[ Upstream commit 16637fea001ab3c8df528a8995b3211906165a30 ]
The member "uzonesize" of struct alauda_info will remain 0 if alauda_init_media() fails, potentially causing divide errors in alauda_read_data() and alauda_write_lba(). - Add a member "media_initialized" to struct alauda_info. - Change a condition in alauda_check_media() to ensure the first initialization. - Add an error check for the return value of alauda_init_media().
Fixes: e80b0fade09e ("[PATCH] USB Storage: add alauda support") Reported-by: xingwei lee xrivendell7@gmail.com Reported-by: yue sun samsun1006219@gmail.com Reviewed-by: Alan Stern stern@rowland.harvard.edu Signed-off-by: Shichao Lai shichaorai@gmail.com Link: https://lore.kernel.org/r/20240526012745.2852061-1-shichaorai@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/usb/storage/alauda.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/drivers/usb/storage/alauda.c b/drivers/usb/storage/alauda.c index 115f05a6201a1..40d34cc28344a 100644 --- a/drivers/usb/storage/alauda.c +++ b/drivers/usb/storage/alauda.c @@ -105,6 +105,8 @@ struct alauda_info { unsigned char sense_key; unsigned long sense_asc; /* additional sense code */ unsigned long sense_ascq; /* additional sense code qualifier */ + + bool media_initialized; };
#define short_pack(lsb,msb) ( ((u16)(lsb)) | ( ((u16)(msb))<<8 ) ) @@ -476,11 +478,12 @@ static int alauda_check_media(struct us_data *us) }
/* Check for media change */ - if (status[0] & 0x08) { + if (status[0] & 0x08 || !info->media_initialized) { usb_stor_dbg(us, "Media change detected\n"); alauda_free_maps(&MEDIA_INFO(us)); - alauda_init_media(us); - + rc = alauda_init_media(us); + if (rc == USB_STOR_TRANSPORT_GOOD) + info->media_initialized = true; info->sense_key = UNIT_ATTENTION; info->sense_asc = 0x28; info->sense_ascq = 0x00;
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yongzhi Liu hyperlyzcs@gmail.com
[ Upstream commit 77427e3d5c353e3dd98c7c0af322f8d9e3131ace ]
There is a memory leak (previously allocated buffers are not freed) in a memory allocation failure path.
Fix it to jump to the correct error handling code.
Fixes: 393fc2f5948f ("misc: microchip: pci1xxxx: load auxiliary bus driver for the PIO function in the multi-function endpoint of pci1xxxx device.") Signed-off-by: Yongzhi Liu hyperlyzcs@gmail.com Reviewed-by: Kumaravel Thiagarajan kumaravel.thiagarajan@microchip.com Link: https://lore.kernel.org/r/20240523121434.21855-4-hyperlyzcs@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/misc/mchp_pci1xxxx/mchp_pci1xxxx_gp.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/misc/mchp_pci1xxxx/mchp_pci1xxxx_gp.c b/drivers/misc/mchp_pci1xxxx/mchp_pci1xxxx_gp.c index de75d89ef53e8..34c9be437432a 100644 --- a/drivers/misc/mchp_pci1xxxx/mchp_pci1xxxx_gp.c +++ b/drivers/misc/mchp_pci1xxxx/mchp_pci1xxxx_gp.c @@ -69,8 +69,10 @@ static int gp_aux_bus_probe(struct pci_dev *pdev, const struct pci_device_id *id
aux_bus->aux_device_wrapper[1] = kzalloc(sizeof(*aux_bus->aux_device_wrapper[1]), GFP_KERNEL); - if (!aux_bus->aux_device_wrapper[1]) - return -ENOMEM; + if (!aux_bus->aux_device_wrapper[1]) { + retval = -ENOMEM; + goto err_aux_dev_add_0; + }
retval = ida_alloc(&gp_client_ida, GFP_KERNEL); if (retval < 0)
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jean Delvare jdelvare@suse.de
[ Upstream commit d6d5645e5fc1233a7ba950de4a72981c394a2557 ]
When an I2C adapter acts only as a slave, it should not claim to support I2C master capabilities.
Fixes: 9d3ca54b550c ("i2c: at91: added slave mode support") Signed-off-by: Jean Delvare jdelvare@suse.de Cc: Juergen Fitschen me@jue.yt Cc: Ludovic Desroches ludovic.desroches@microchip.com Cc: Codrin Ciubotariu codrin.ciubotariu@microchip.com Cc: Andi Shyti andi.shyti@kernel.org Cc: Nicolas Ferre nicolas.ferre@microchip.com Cc: Alexandre Belloni alexandre.belloni@bootlin.com Cc: Claudiu Beznea claudiu.beznea@tuxon.dev Signed-off-by: Andi Shyti andi.shyti@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/i2c/busses/i2c-at91-slave.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/i2c/busses/i2c-at91-slave.c b/drivers/i2c/busses/i2c-at91-slave.c index d6eeea5166c04..131a67d9d4a68 100644 --- a/drivers/i2c/busses/i2c-at91-slave.c +++ b/drivers/i2c/busses/i2c-at91-slave.c @@ -106,8 +106,7 @@ static int at91_unreg_slave(struct i2c_client *slave)
static u32 at91_twi_func(struct i2c_adapter *adapter) { - return I2C_FUNC_SLAVE | I2C_FUNC_I2C | I2C_FUNC_SMBUS_EMUL - | I2C_FUNC_SMBUS_READ_BLOCK_DATA; + return I2C_FUNC_SLAVE; }
static const struct i2c_algorithm at91_twi_algorithm_slave = {
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jean Delvare jdelvare@suse.de
[ Upstream commit cbf3fb5b29e99e3689d63a88c3cddbffa1b8de99 ]
When an I2C adapter acts only as a slave, it should not claim to support I2C master capabilities.
Fixes: 5b6d721b266a ("i2c: designware: enable SLAVE in platform module") Signed-off-by: Jean Delvare jdelvare@suse.de Cc: Luis Oliveira lolivei@synopsys.com Cc: Jarkko Nikula jarkko.nikula@linux.intel.com Cc: Andy Shevchenko andriy.shevchenko@linux.intel.com Cc: Mika Westerberg mika.westerberg@linux.intel.com Cc: Jan Dabros jsd@semihalf.com Cc: Andi Shyti andi.shyti@kernel.org Reviewed-by: Andy Shevchenko andriy.shevchenko@linux.intel.com Acked-by: Jarkko Nikula jarkko.nikula@linux.intel.com Tested-by: Jarkko Nikula jarkko.nikula@linux.intel.com Signed-off-by: Andi Shyti andi.shyti@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/i2c/busses/i2c-designware-slave.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/i2c/busses/i2c-designware-slave.c b/drivers/i2c/busses/i2c-designware-slave.c index 0d15f4c1e9f7e..5b54a9b9ed1a3 100644 --- a/drivers/i2c/busses/i2c-designware-slave.c +++ b/drivers/i2c/busses/i2c-designware-slave.c @@ -232,7 +232,7 @@ static const struct i2c_algorithm i2c_dw_algo = {
void i2c_dw_configure_slave(struct dw_i2c_dev *dev) { - dev->functionality = I2C_FUNC_SLAVE | DW_IC_DEFAULT_FUNCTIONALITY; + dev->functionality = I2C_FUNC_SLAVE;
dev->slave_cfg = DW_IC_CON_RX_FIFO_FULL_HLD_CTRL | DW_IC_CON_RESTART_EN | DW_IC_CON_STOP_DET_IFADDRESSED;
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Oleg Nesterov oleg@redhat.com
[ Upstream commit 7fea700e04bd3f424c2d836e98425782f97b494e ]
kernel_wait4() doesn't sleep and returns -EINTR if there is no eligible child and signal_pending() is true.
That is why zap_pid_ns_processes() clears TIF_SIGPENDING, but this is not enough; it should also clear TIF_NOTIFY_SIGNAL to make signal_pending() return false and avoid a busy-wait loop.
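For context, signal_pending() reports a pending condition when either flag is set, roughly along these lines (a simplified paraphrase of the helpers in include/linux/sched/signal.h, not a verbatim copy):

#include <linux/sched/signal.h>

/* Simplified sketch: either TIF_NOTIFY_SIGNAL or TIF_SIGPENDING makes
 * signal_pending() true, so clearing only TIF_SIGPENDING still leaves
 * kernel_wait4() returning -EINTR in a tight loop.
 */
static inline int signal_pending_sketch(struct task_struct *p)
{
	if (unlikely(test_tsk_thread_flag(p, TIF_NOTIFY_SIGNAL)))
		return 1;
	return unlikely(test_tsk_thread_flag(p, TIF_SIGPENDING));
}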
Link: https://lkml.kernel.org/r/20240608120616.GB7947@redhat.com Fixes: 12db8b690010 ("entry: Add support for TIF_NOTIFY_SIGNAL") Signed-off-by: Oleg Nesterov oleg@redhat.com Reported-by: Rachel Menge rachelmenge@linux.microsoft.com Closes: https://lore.kernel.org/all/1386cd49-36d0-4a5c-85e9-bc42056a5a38@linux.micro... Reviewed-by: Boqun Feng boqun.feng@gmail.com Tested-by: Wei Fu fuweid89@gmail.com Reviewed-by: Jens Axboe axboe@kernel.dk Cc: Allen Pais apais@linux.microsoft.com Cc: Christian Brauner brauner@kernel.org Cc: Frederic Weisbecker frederic@kernel.org Cc: Joel Fernandes (Google) joel@joelfernandes.org Cc: Joel Granados j.granados@samsung.com Cc: Josh Triplett josh@joshtriplett.org Cc: Lai Jiangshan jiangshanlai@gmail.com Cc: Mateusz Guzik mjguzik@gmail.com Cc: Mathieu Desnoyers mathieu.desnoyers@efficios.com Cc: Mike Christie michael.christie@oracle.com Cc: Neeraj Upadhyay neeraj.upadhyay@kernel.org Cc: Paul E. McKenney paulmck@kernel.org Cc: Steven Rostedt (Google) rostedt@goodmis.org Cc: Zqiang qiang.zhang1211@gmail.com Cc: Thomas Gleixner tglx@linutronix.de Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/pid_namespace.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/kernel/pid_namespace.c b/kernel/pid_namespace.c index fc21c5d5fd5de..1daadbefcee3a 100644 --- a/kernel/pid_namespace.c +++ b/kernel/pid_namespace.c @@ -214,6 +214,7 @@ void zap_pid_ns_processes(struct pid_namespace *pid_ns) */ do { clear_thread_flag(TIF_SIGPENDING); + clear_thread_flag(TIF_NOTIFY_SIGNAL); rc = kernel_wait4(-1, NULL, __WALL, NULL); } while (rc != -ECHILD);
On 6/19/2024 1:54 PM, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 6.1.95 release. There are 217 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri, 21 Jun 2024 12:55:11 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.1.95-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.1.y and the diffstat can be found below.
thanks,
greg k-h
On ARCH_BRCMSTB using 32-bit and 64-bit ARM kernels, build tested on BMIPS_GENERIC:
Tested-by: Florian Fainelli florian.fainelli@broadcom.com
Hello,
On Wed, 19 Jun 2024 14:54:03 +0200 Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 6.1.95 release. There are 217 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri, 21 Jun 2024 12:55:11 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.1.95-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.1.y and the diffstat can be found below.
This rc kernel passes DAMON functionality test[1] on my test machine. Attaching the test results summary below. Please note that I retrieved the kernel from linux-stable-rc tree[2].
Tested-by: SeongJae Park sj@kernel.org
[1] https://github.com/awslabs/damon-tests/tree/next/corr [2] 7035314997c8 ("Linux 6.1.95-rc1")
Thanks, SJ
[...]
---
ok 1 selftests: damon: debugfs_attrs.sh
ok 2 selftests: damon: debugfs_schemes.sh
ok 3 selftests: damon: debugfs_target_ids.sh
ok 4 selftests: damon: debugfs_empty_targets.sh
ok 5 selftests: damon: debugfs_huge_count_read_write.sh
ok 6 selftests: damon: debugfs_duplicate_context_creation.sh
ok 7 selftests: damon: sysfs.sh
ok 1 selftests: damon-tests: kunit.sh
ok 2 selftests: damon-tests: huge_count_read_write.sh
ok 3 selftests: damon-tests: buffer_overflow.sh
ok 4 selftests: damon-tests: rm_contexts.sh
ok 5 selftests: damon-tests: record_null_deref.sh
ok 6 selftests: damon-tests: dbgfs_target_ids_read_before_terminate_race.sh
ok 7 selftests: damon-tests: dbgfs_target_ids_pid_leak.sh
ok 8 selftests: damon-tests: damo_tests.sh
ok 9 selftests: damon-tests: masim-record.sh
ok 10 selftests: damon-tests: build_i386.sh
ok 11 selftests: damon-tests: build_arm64.sh
ok 12 selftests: damon-tests: build_m68k.sh
ok 13 selftests: damon-tests: build_i386_idle_flag.sh
ok 14 selftests: damon-tests: build_i386_highpte.sh
ok 15 selftests: damon-tests: build_nomemcg.sh
PASS
On Wed, 19 Jun 2024 14:54:03 +0200, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 6.1.95 release. There are 217 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri, 21 Jun 2024 12:55:11 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.1.95-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.1.y and the diffstat can be found below.
thanks,
greg k-h
All tests passing for Tegra ...
Test results for stable-v6.1: 10 builds: 10 pass, 0 fail 26 boots: 26 pass, 0 fail 116 tests: 116 pass, 0 fail
Linux version: 6.1.95-rc1-g0891d95b9db3 Boards tested: tegra124-jetson-tk1, tegra186-p2771-0000, tegra194-p2972-0000, tegra194-p3509-0000+p3668-0000, tegra20-ventana, tegra210-p2371-2180, tegra210-p3450-0000, tegra30-cardhu-a04
Tested-by: Jon Hunter jonathanh@nvidia.com
Jon
Hi!
This is the start of the stable review cycle for the 6.1.95 release. There are 217 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
CIP testing did not find any problems here:
https://gitlab.com/cip-project/cip-testing/linux-stable-rc-ci/-/tree/linux-6...
Tested-by: Pavel Machek (CIP) pavel@denx.de
Best regards, Pavel
This is the start of the stable review cycle for the 6.1.95 release. There are 217 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri, 21 Jun 2024 12:55:11 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.1.95-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.1.y and the diffstat can be found below.
thanks,
greg k-h
Compiled and booted on my x86_64 and ARM64 test systems. No errors or regressions.
Tested-by: Allen Pais apais@linux.microsoft.com
Thanks.
On Wed, Jun 19, 2024 at 02:54:03PM +0200, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 6.1.95 release. There are 217 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri, 21 Jun 2024 12:55:11 +0000. Anything received after that time might be too late.
No regressions found on WSL (x86 and arm64).
Built, booted, and reviewed dmesg.
Thank you. :)
Tested-by: Kelsey Steele kelseysteele@linux.microsoft.com
On Wed, Jun 19, 2024 at 02:54:03PM +0200, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 6.1.95 release. There are 217 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Tested-by: Mark Brown broonie@kernel.org
On 6/19/24 5:54 AM, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 6.1.95 release. There are 217 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri, 21 Jun 2024 12:55:11 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.1.95-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.1.y and the diffstat can be found below.
thanks,
greg k-h
Built and booted successfully on RISC-V RV64 (HiFive Unmatched).
Tested-by: Ron Economos re@w6rz.net
On Wed, 19 Jun 2024 at 18:56, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 6.1.95 release. There are 217 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri, 21 Jun 2024 12:55:11 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.1.95-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.1.y and the diffstat can be found below.
thanks,
greg k-h
Results from Linaro’s test farm. No regressions on arm64, arm, x86_64, and i386.
Tested-by: Linux Kernel Functional Testing lkft@linaro.org
## Build * kernel: 6.1.95-rc1 * git: https://gitlab.com/Linaro/lkft/mirrors/stable/linux-stable-rc * git branch: linux-6.1.y * git commit: 0891d95b9db39ae51c0edef73f56d41521be9fbd * git describe: v6.1.94-218-g0891d95b9db3 * test details: https://qa-reports.linaro.org/lkft/linux-stable-rc-linux-6.1.y/build/v6.1.94...
## Test Regressions (compared to v6.1.94)
## Metric Regressions (compared to v6.1.94)
## Test Fixes (compared to v6.1.94)
## Metric Fixes (compared to v6.1.94)
## Test result summary total: 171975, pass: 145682, fail: 3002, skip: 23038, xfail: 253
## Build Summary * arc: 5 total, 5 passed, 0 failed * arm: 135 total, 135 passed, 0 failed * arm64: 38 total, 38 passed, 0 failed * i386: 29 total, 29 passed, 0 failed * mips: 24 total, 24 passed, 0 failed * parisc: 3 total, 3 passed, 0 failed * powerpc: 33 total, 33 passed, 0 failed * riscv: 9 total, 9 passed, 0 failed * s390: 12 total, 12 passed, 0 failed * sh: 10 total, 10 passed, 0 failed * sparc: 6 total, 6 passed, 0 failed * x86_64: 33 total, 33 passed, 0 failed
## Test suites summary * boot * kselftest-android * kselftest-arm64 * kselftest-breakpoints * kselftest-capabilities * kselftest-cgroup * kselftest-clone3 * kselftest-core * kselftest-cpu-hotplug * kselftest-cpufreq * kselftest-drivers-dma-buf * kselftest-efivarfs * kselftest-exec * kselftest-filesystems * kselftest-filesystems-binderfs * kselftest-filesystems-epoll * kselftest-firmware * kselftest-fpu * kselftest-ftrace * kselftest-futex * kselftest-gpio * kselftest-intel_pstate * kselftest-ipc * kselftest-ir * kselftest-kcmp * kselftest-kexec * kselftest-kvm * kselftest-lib * kselftest-livepatch * kselftest-membarrier * kselftest-memfd * kselftest-memory-hotplug * kselftest-mincore * kselftest-mm * kselftest-mount * kselftest-mqueue * kselftest-net * kselftest-net-forwarding * kselftest-net-mptcp * kselftest-netfilter * kselftest-nsfs * kselftest-openat2 * kselftest-pid_namespace * kselftest-pidfd * kselftest-proc * kselftest-pstore * kselftest-ptrace * kselftest-rseq * kselftest-rtc * kselftest-seccomp * kselftest-sigaltstack * kselftest-size * kselftest-splice * kselftest-static_keys * kselftest-sync * kselftest-sysctl * kselftest-tc-testing * kselftest-timens * kselftest-timers * kselftest-tmpfs * kselftest-tpm2 * kselftest-user * kselftest-user_events * kselftest-vDSO * kselftest-watchdog * kselftest-x86 * kselftest-zram * kunit * kvm-unit-tests * libgpiod * log-parser-boot * log-parser-test * ltp-commands * ltp-containers * ltp-controllers * ltp-cpuhotplug * ltp-crypto * ltp-cve * ltp-dio * ltp-fcntl-locktests * ltp-fs * ltp-fs_bind * ltp-fs_perms_simple * ltp-hugetlb * ltp-ipc * ltp-math * ltp-mm * ltp-nptl * ltp-pty * ltp-sched * ltp-smoke * ltp-smoketest * ltp-syscalls * ltp-tracing * perf * rcutorture
-- Linaro LKFT https://lkft.linaro.org
On 2024-06-19 14:54 +0200, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 6.1.95 release. There are 217 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri, 21 Jun 2024 12:55:11 +0000. Anything received after that time might be too late.
Works fine for me on x86_64.
Tested-by: Sven Joachim svenjoac@gmx.de
Cheers, Sven
Am 19.06.2024 um 14:54 schrieb Greg Kroah-Hartman:
This is the start of the stable review cycle for the 6.1.95 release. There are 217 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Built and booted successfully; I don't see any regressions nor any scary or suspicious dmesg output after running for an hour now. Built on a dual-socket Ivy Bridge Xeon E5-2697 v2.
Tested-by: Peter Schneider pschneider1968@googlemail.com
Still no goal in Spain vs Italy... 🙄
Best regards, Peter Schneider
On 6/19/24 06:54, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 6.1.95 release. There are 217 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri, 21 Jun 2024 12:55:11 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.1.95-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.1.y and the diffstat can be found below.
thanks,
greg k-h
Compiled and booted on my test system. No dmesg regressions.
Tested-by: Shuah Khan skhan@linuxfoundation.org
thanks, -- Shuah
[ Copying parisc maintainers - maybe they can test on real hardware ]
On 6/19/24 05:54, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 6.1.95 release. There are 217 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri, 21 Jun 2024 12:55:11 +0000. Anything received after that time might be too late.
...
Oleg Nesterov oleg@redhat.com zap_pid_ns_processes: clear TIF_NOTIFY_SIGNAL along with TIF_SIGPENDING
I can not explain it, but this patch causes all my parisc64 (C3700) boot tests to crash. There are lots of memory corruption BUGs such as
[ 0.000000] ============================================================================= [ 0.000000] BUG kmalloc-96 (Not tainted): Padding overwritten. 0x0000000043411dd0-0x0000000043411f5f @offset=3536
ultimately followed by
[ 0.462562] Unaligned handler failed, ret = -14 ... [ 0.469160] IAOQ[0]: idr_alloc_cyclic+0x48/0x118 [ 0.469372] IAOQ[1]: idr_alloc_cyclic+0x54/0x118 [ 0.469548] RP(r2): __kernfs_new_node.constprop.0+0x160/0x420 [ 0.469782] Backtrace: [ 0.469928] [<00000000404af108>] __kernfs_new_node.constprop.0+0x160/0x420 [ 0.470285] [<00000000404b0cac>] kernfs_new_node+0xbc/0x118 [ 0.470523] [<00000000404b158c>] kernfs_create_empty_dir+0x54/0xf0 [ 0.470756] [<00000000404b665c>] sysfs_create_mount_point+0x4c/0xb0 [ 0.470996] [<00000000401181cc>] cgroup_init+0x5b4/0x738 [ 0.471213] [<0000000040102220>] start_kernel+0x1238/0x1308 [ 0.471429] [<0000000040107c90>] start_parisc+0x188/0x1d0 ... [ 0.474956] Kernel panic - not syncing: Attempted to kill the idle task! SeaBIOS wants SYSTEM RESET.
This is with qemu v9.0.1.
Reverting this patch fixes the problem (I tried several times to be sure since I don't see the connection). I don't see the problem in any other branch. Bisect log is attached for reference.
Guenter
--- # bad: [a6398e37309000e35cedb5cc328a0f8d00d7d7b9] Linux 6.1.95 # good: [eb44d83053d66372327e69145e8d2fa7400a4991] Linux 6.1.94 git bisect start 'HEAD' 'v6.1.94' # good: [f17443d52d805c9a7fab5e67a4e8b973626fe1cd] cachefiles: resend an open request if the read request's object is closed git bisect good f17443d52d805c9a7fab5e67a4e8b973626fe1cd # good: [cc09e1d3519feab823685f4297853d468f44549d] iio: imu: inv_icm42600: delete unneeded update watermark call git bisect good cc09e1d3519feab823685f4297853d468f44549d # good: [b7b6bc60edb2132a569899bcd9ca099a0556c6ee] intel_th: pci: Add Granite Rapids SOC support git bisect good b7b6bc60edb2132a569899bcd9ca099a0556c6ee # good: [35e395373ecd14b64da7d54f565927a9368dcf20] mptcp: pm: update add_addr counters after connect git bisect good 35e395373ecd14b64da7d54f565927a9368dcf20 # good: [29d35f0b53d4bd82ebc37c500a8dd73da61318ff] serial: 8250_dw: fall back to poll if there's no interrupt git bisect good 29d35f0b53d4bd82ebc37c500a8dd73da61318ff # good: [ea25a4c0de5700928c7fd0aa789eee39a457ba95] misc: microchip: pci1xxxx: Fix a memory leak in the error handling of gp_aux_bus_probe() git bisect good ea25a4c0de5700928c7fd0aa789eee39a457ba95 # good: [e44999ec0b49dca9a9a2090c5432d893ea4f8d20] i2c: designware: Fix the functionality flags of the slave-only interface git bisect good e44999ec0b49dca9a9a2090c5432d893ea4f8d20 # bad: [edd2754a62bee8d97b4808a15de024f66a1ddccf] zap_pid_ns_processes: clear TIF_NOTIFY_SIGNAL along with TIF_SIGPENDING git bisect bad edd2754a62bee8d97b4808a15de024f66a1ddccf # first bad commit: [edd2754a62bee8d97b4808a15de024f66a1ddccf] zap_pid_ns_processes: clear TIF_NOTIFY_SIGNAL along with TIF_SIGPENDING
On 6/22/24 16:58, Guenter Roeck wrote:
[ Copying parisc maintainers - maybe they can test on real hardware ]
On 6/19/24 05:54, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 6.1.95 release. There are 217 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri, 21 Jun 2024 12:55:11 +0000. Anything received after that time might be too late.
...
Oleg Nesterov oleg@redhat.com zap_pid_ns_processes: clear TIF_NOTIFY_SIGNAL along with TIF_SIGPENDING
I can not explain it, but this patch causes all my parisc64 (C3700) boot tests to crash. There are lots of memory corruption BUGs such as
[ 0.000000] ============================================================================= [ 0.000000] BUG kmalloc-96 (Not tainted): Padding overwritten. 0x0000000043411dd0-0x0000000043411f5f @offset=3536
ultimately followed by
[ 0.462562] Unaligned handler failed, ret = -14 ... [ 0.469160] IAOQ[0]: idr_alloc_cyclic+0x48/0x118 [ 0.469372] IAOQ[1]: idr_alloc_cyclic+0x54/0x118 [ 0.469548] RP(r2): __kernfs_new_node.constprop.0+0x160/0x420 [ 0.469782] Backtrace: [ 0.469928] [<00000000404af108>] __kernfs_new_node.constprop.0+0x160/0x420 [ 0.470285] [<00000000404b0cac>] kernfs_new_node+0xbc/0x118 [ 0.470523] [<00000000404b158c>] kernfs_create_empty_dir+0x54/0xf0 [ 0.470756] [<00000000404b665c>] sysfs_create_mount_point+0x4c/0xb0 [ 0.470996] [<00000000401181cc>] cgroup_init+0x5b4/0x738 [ 0.471213] [<0000000040102220>] start_kernel+0x1238/0x1308 [ 0.471429] [<0000000040107c90>] start_parisc+0x188/0x1d0 ... [ 0.474956] Kernel panic - not syncing: Attempted to kill the idle task! SeaBIOS wants SYSTEM RESET.
This is with qemu v9.0.1.
Just to be sure, did you test the same kernel on physical hardware as well?
Please note that 64-bit hppa (C3700) support in qemu was only recently added and is still considered experimental. So maybe it's not a bug in the kernel source, but in qemu?
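For context, a boot test of this kind is typically launched with a command along these lines; this is only a rough sketch, and the exact machine options, kernel path, and kernel command line are assumptions rather than the invocation used for the reports above:

# Boot a freshly built 64-bit parisc kernel on the (still experimental)
# C3700 machine model; exit instead of rebooting when the kernel panics.
qemu-system-hppa -machine C3700 -m 1G \
    -kernel vmlinux \
    -append "console=ttyS0 panic=-1" \
    -nographic -monitor none -no-reboot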
Reverting this patch fixes the problem (I tried several times to be sure since I don't see the connection). I don't see the problem in any other branch. Bisect log is attached for reference.
Guenter
# bad: [a6398e37309000e35cedb5cc328a0f8d00d7d7b9] Linux 6.1.95
# good: [eb44d83053d66372327e69145e8d2fa7400a4991] Linux 6.1.94
git bisect start 'HEAD' 'v6.1.94'
# good: [f17443d52d805c9a7fab5e67a4e8b973626fe1cd] cachefiles: resend an open request if the read request's object is closed
git bisect good f17443d52d805c9a7fab5e67a4e8b973626fe1cd
# good: [cc09e1d3519feab823685f4297853d468f44549d] iio: imu: inv_icm42600: delete unneeded update watermark call
git bisect good cc09e1d3519feab823685f4297853d468f44549d
# good: [b7b6bc60edb2132a569899bcd9ca099a0556c6ee] intel_th: pci: Add Granite Rapids SOC support
git bisect good b7b6bc60edb2132a569899bcd9ca099a0556c6ee
# good: [35e395373ecd14b64da7d54f565927a9368dcf20] mptcp: pm: update add_addr counters after connect
git bisect good 35e395373ecd14b64da7d54f565927a9368dcf20
# good: [29d35f0b53d4bd82ebc37c500a8dd73da61318ff] serial: 8250_dw: fall back to poll if there's no interrupt
git bisect good 29d35f0b53d4bd82ebc37c500a8dd73da61318ff
# good: [ea25a4c0de5700928c7fd0aa789eee39a457ba95] misc: microchip: pci1xxxx: Fix a memory leak in the error handling of gp_aux_bus_probe()
git bisect good ea25a4c0de5700928c7fd0aa789eee39a457ba95
# good: [e44999ec0b49dca9a9a2090c5432d893ea4f8d20] i2c: designware: Fix the functionality flags of the slave-only interface
git bisect good e44999ec0b49dca9a9a2090c5432d893ea4f8d20
# bad: [edd2754a62bee8d97b4808a15de024f66a1ddccf] zap_pid_ns_processes: clear TIF_NOTIFY_SIGNAL along with TIF_SIGPENDING
git bisect bad edd2754a62bee8d97b4808a15de024f66a1ddccf
# first bad commit: [edd2754a62bee8d97b4808a15de024f66a1ddccf] zap_pid_ns_processes: clear TIF_NOTIFY_SIGNAL along with TIF_SIGPENDING
On 6/22/24 08:13, Helge Deller wrote:
On 6/22/24 16:58, Guenter Roeck wrote:
[ Copying parisc maintainers - maybe they can test on real hardware ]
On 6/19/24 05:54, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 6.1.95 release. There are 217 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri, 21 Jun 2024 12:55:11 +0000. Anything received after that time might be too late.
...
Oleg Nesterov oleg@redhat.com zap_pid_ns_processes: clear TIF_NOTIFY_SIGNAL along with TIF_SIGPENDING
I cannot explain it, but this patch causes all my parisc64 (C3700) boot tests to crash. There are lots of memory corruption BUGs such as
[ 0.000000] =============================================================================
[ 0.000000] BUG kmalloc-96 (Not tainted): Padding overwritten. 0x0000000043411dd0-0x0000000043411f5f @offset=3536
ultimately followed by
[ 0.462562] Unaligned handler failed, ret = -14
...
[ 0.469160] IAOQ[0]: idr_alloc_cyclic+0x48/0x118
[ 0.469372] IAOQ[1]: idr_alloc_cyclic+0x54/0x118
[ 0.469548] RP(r2): __kernfs_new_node.constprop.0+0x160/0x420
[ 0.469782] Backtrace:
[ 0.469928] [<00000000404af108>] __kernfs_new_node.constprop.0+0x160/0x420
[ 0.470285] [<00000000404b0cac>] kernfs_new_node+0xbc/0x118
[ 0.470523] [<00000000404b158c>] kernfs_create_empty_dir+0x54/0xf0
[ 0.470756] [<00000000404b665c>] sysfs_create_mount_point+0x4c/0xb0
[ 0.470996] [<00000000401181cc>] cgroup_init+0x5b4/0x738
[ 0.471213] [<0000000040102220>] start_kernel+0x1238/0x1308
[ 0.471429] [<0000000040107c90>] start_parisc+0x188/0x1d0
...
[ 0.474956] Kernel panic - not syncing: Attempted to kill the idle task!
SeaBIOS wants SYSTEM RESET.
This is with qemu v9.0.1.
Just to be sure, did you test the same kernel on physical hardware as well?
No, I don't have hardware. I only have qemu. That is why I copied you and the parisc mailing list. I would hope that someone can either confirm that this is a real problem or that it is qemu-related. If it is qemu-related, I'll just stop testing C3700 64-bit support with qemu on v6.1.y and other branches if/when the problem shows up there as well.
Please note that 64-bit hppa (C3700) support in qemu was only recently added and is still considered experimental. So maybe it's not a bug in the kernel source, but in qemu?
Sure, that is possible, though it is a bit unusual that it is only seen in 6.1.95 and not in any other branches or releases.
In summary, please see this report as "This is a problem seen in qemu. It may or may not be seen on real hardware". Maybe I should add this as a common disclaimer to all my reports to avoid misunderstandings.
Thanks, Guenter
On 6/22/24 17:34, Guenter Roeck wrote:
On 6/22/24 08:13, Helge Deller wrote:
On 6/22/24 16:58, Guenter Roeck wrote:
[ Copying parisc maintainers - maybe they can test on real hardware ]
On 6/19/24 05:54, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 6.1.95 release. There are 217 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri, 21 Jun 2024 12:55:11 +0000. Anything received after that time might be too late.
...
Oleg Nesterov oleg@redhat.com zap_pid_ns_processes: clear TIF_NOTIFY_SIGNAL along with TIF_SIGPENDING
I cannot explain it, but this patch causes all my parisc64 (C3700) boot tests to crash. There are lots of memory corruption BUGs such as
[ 0.000000] =============================================================================
[ 0.000000] BUG kmalloc-96 (Not tainted): Padding overwritten. 0x0000000043411dd0-0x0000000043411f5f @offset=3536
ultimately followed by
[ 0.462562] Unaligned handler failed, ret = -14
...
[ 0.469160] IAOQ[0]: idr_alloc_cyclic+0x48/0x118
[ 0.469372] IAOQ[1]: idr_alloc_cyclic+0x54/0x118
[ 0.469548] RP(r2): __kernfs_new_node.constprop.0+0x160/0x420
[ 0.469782] Backtrace:
[ 0.469928] [<00000000404af108>] __kernfs_new_node.constprop.0+0x160/0x420
[ 0.470285] [<00000000404b0cac>] kernfs_new_node+0xbc/0x118
[ 0.470523] [<00000000404b158c>] kernfs_create_empty_dir+0x54/0xf0
[ 0.470756] [<00000000404b665c>] sysfs_create_mount_point+0x4c/0xb0
[ 0.470996] [<00000000401181cc>] cgroup_init+0x5b4/0x738
[ 0.471213] [<0000000040102220>] start_kernel+0x1238/0x1308
[ 0.471429] [<0000000040107c90>] start_parisc+0x188/0x1d0
...
[ 0.474956] Kernel panic - not syncing: Attempted to kill the idle task!
SeaBIOS wants SYSTEM RESET.
This is with qemu v9.0.1.
Just to be sure, did you test the same kernel on physical hardware as well?
No, I don't have hardware. I only have qemu. That is why I copied you and the parisc mailing list.
Yes, sorry, I saw your top line in the mail after I already sent my reply....
I would hope that someone can either confirm that this is a real problem or that it is qemu-related. If it is qemu-related, I'll just stop testing C3700 64-bit support with qemu on v6.1.y and other branches if/when the problem shows up there as well.
I just booted 6.1.95 successfully in qemu and on my physical C3700 machine. I assume the problem can be reproduced with your .config? Please send it to me off-list, then I can try again.
I know there are still some issues with the 64-bit hppa emulation in qemu, which are quite hard for me to trigger and to track down. So maybe you have now found an easier-to-trigger reproducer? :-)
Helge
Please note that 64-bit hppa (C3700) support in qemu was only recently added and is still considered experimental. So maybe it's not a bug in the kernel source, but in qemu?
Sure, that is possible, though it is a bit unusual that it is only seen in 6.1.95 and not in any other branches or releases.
In summary, please see this report as "This is a problem seen in qemu. It may or may not be seen on real hardware". Maybe I should add this as a common disclaimer to all my reports to avoid misunderstandings.
Thanks, Guenter
On 6/22/24 08:49, Helge Deller wrote:
On 6/22/24 17:34, Guenter Roeck wrote:
On 6/22/24 08:13, Helge Deller wrote:
On 6/22/24 16:58, Guenter Roeck wrote:
[ Copying parisc maintainers - maybe they can test on real hardware ]
On 6/19/24 05:54, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 6.1.95 release. There are 217 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri, 21 Jun 2024 12:55:11 +0000. Anything received after that time might be too late.
...
Oleg Nesterov oleg@redhat.com zap_pid_ns_processes: clear TIF_NOTIFY_SIGNAL along with TIF_SIGPENDING
I cannot explain it, but this patch causes all my parisc64 (C3700) boot tests to crash. There are lots of memory corruption BUGs such as
[ 0.000000] =============================================================================
[ 0.000000] BUG kmalloc-96 (Not tainted): Padding overwritten. 0x0000000043411dd0-0x0000000043411f5f @offset=3536
ultimately followed by
[ 0.462562] Unaligned handler failed, ret = -14
...
[ 0.469160] IAOQ[0]: idr_alloc_cyclic+0x48/0x118
[ 0.469372] IAOQ[1]: idr_alloc_cyclic+0x54/0x118
[ 0.469548] RP(r2): __kernfs_new_node.constprop.0+0x160/0x420
[ 0.469782] Backtrace:
[ 0.469928] [<00000000404af108>] __kernfs_new_node.constprop.0+0x160/0x420
[ 0.470285] [<00000000404b0cac>] kernfs_new_node+0xbc/0x118
[ 0.470523] [<00000000404b158c>] kernfs_create_empty_dir+0x54/0xf0
[ 0.470756] [<00000000404b665c>] sysfs_create_mount_point+0x4c/0xb0
[ 0.470996] [<00000000401181cc>] cgroup_init+0x5b4/0x738
[ 0.471213] [<0000000040102220>] start_kernel+0x1238/0x1308
[ 0.471429] [<0000000040107c90>] start_parisc+0x188/0x1d0
...
[ 0.474956] Kernel panic - not syncing: Attempted to kill the idle task!
SeaBIOS wants SYSTEM RESET.
This is with qemu v9.0.1.
Just to be sure, did you test the same kernel on physical hardware as well?
No, I don't have hardware. I only have qemu. That is why I copied you and the parisc mailing list.
Yes, sorry, I saw your top line in the mail after I already sent my reply....
I would hope that someone can either confirm that this is a real problem or that it is qemu-related. If it is qemu-related, I'll just stop testing C3700 64-bit support with qemu on v6.1.y and other branches if/when the problem shows up there as well.
I just booted 6.1.95 successfully in qemu and on my physical C3700 machine. I assume the problem can be reproduced with your .config? Please send it to me off-list, then I can try again.
Done.
Thanks, Guenter
On 6/22/24 08:49, Helge Deller wrote: [ ... ]
I just booted 6.1.95 successfully in qemu and on my physical C3700 machine.
FWIW, I don't see the problem either if I just build and boot generic-64bit_defconfig. Sorry, I didn't really expect this, so I didn't mention that my configuration adds lots of debug and unit test options.
Guenter
On 6/22/24 08:13, Helge Deller wrote:
On 6/22/24 16:58, Guenter Roeck wrote:
[ Copying parisc maintainers - maybe they can test on real hardware ]
On 6/19/24 05:54, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 6.1.95 release. There are 217 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri, 21 Jun 2024 12:55:11 +0000. Anything received after that time might be too late.
...
Oleg Nesterov oleg@redhat.com zap_pid_ns_processes: clear TIF_NOTIFY_SIGNAL along with TIF_SIGPENDING
I cannot explain it, but this patch causes all my parisc64 (C3700) boot tests to crash. There are lots of memory corruption BUGs such as
[ 0.000000] =============================================================================
[ 0.000000] BUG kmalloc-96 (Not tainted): Padding overwritten. 0x0000000043411dd0-0x0000000043411f5f @offset=3536
ultimately followed by
[ 0.462562] Unaligned handler failed, ret = -14
...
[ 0.469160] IAOQ[0]: idr_alloc_cyclic+0x48/0x118
[ 0.469372] IAOQ[1]: idr_alloc_cyclic+0x54/0x118
[ 0.469548] RP(r2): __kernfs_new_node.constprop.0+0x160/0x420
[ 0.469782] Backtrace:
[ 0.469928] [<00000000404af108>] __kernfs_new_node.constprop.0+0x160/0x420
[ 0.470285] [<00000000404b0cac>] kernfs_new_node+0xbc/0x118
[ 0.470523] [<00000000404b158c>] kernfs_create_empty_dir+0x54/0xf0
[ 0.470756] [<00000000404b665c>] sysfs_create_mount_point+0x4c/0xb0
[ 0.470996] [<00000000401181cc>] cgroup_init+0x5b4/0x738
[ 0.471213] [<0000000040102220>] start_kernel+0x1238/0x1308
[ 0.471429] [<0000000040107c90>] start_parisc+0x188/0x1d0
...
[ 0.474956] Kernel panic - not syncing: Attempted to kill the idle task!
SeaBIOS wants SYSTEM RESET.
This is with qemu v9.0.1.
Just to be sure, did you test the same kernel on physical hardware as well?
Please note that 64-bit hppa (C3700) support in qemu was only recently added and is still considered experimental. So maybe it's not a bug in the kernel source, but in qemu?
Following up on this for everyone: Helge doesn't see the problem on real hardware. I can make the problem disappear by any of the following:
- Use gcc 13.3 instead of 12.3
- Disable CONFIG_KUNIT
- Enable CONFIG_PAGE_POISONING (without actually enabling it at runtime)
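As a minimal sketch, the last two toggles could be applied to an existing .config before rebuilding roughly as follows; this assumes a vanilla kernel source tree and is not a recipe taken from the report itself:

# Rebuild with KUnit disabled and page poisoning support compiled in.
# CONFIG_PAGE_POISONING only adds the support; it still needs page_poison=1
# on the kernel command line to actually take effect at runtime.
./scripts/config --disable CONFIG_KUNIT
./scripts/config --enable CONFIG_PAGE_POISONING
make olddefconfig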
Overall, that suggests some kind of heisenbug, most likely in qemu, unrelated to the commit above.
Thanks, and sorry for the noise.
Guenter