This is the start of the stable review cycle for the 6.0.12 release. There are 124 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 07 Dec 2022 19:07:46 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.0.12-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.0.y and the diffstat can be found below.
thanks,
greg k-h
------------- Pseudo-Shortlog of commits:
Greg Kroah-Hartman gregkh@linuxfoundation.org Linux 6.0.12-rc1
Christophe Leroy christophe.leroy@csgroup.eu powerpc/bpf/32: Fix Oops on tail call tests
Zhang Xiaoxu zhangxiaoxu5@huawei.com Input: raydium_ts_i2c - fix memory leak in raydium_i2c_send()
Jan Dabros jsd@semihalf.com char: tpm: Protect tpm_pm_suspend with locks
Conor Dooley conor.dooley@microchip.com Revert "clocksource/drivers/riscv: Events are stopped during CPU suspend"
Vishal Verma vishal.l.verma@intel.com ACPI: HMAT: Fix initiator registration for single-initiator systems
Vishal Verma vishal.l.verma@intel.com ACPI: HMAT: remove unnecessary variable initialization
Andrew Lunn andrew@lunn.ch i2c: imx: Only DMA messages with I2C_M_DMA_SAFE flag set
Wang Yufen wangyufen@huawei.com i2c: qcom-geni: fix error return code in geni_i2c_gpi_xfer
Yuan Can yuancan@huawei.com i2c: npcm7xx: Fix error handling in npcm_i2c_init()
Ricardo Ribalda ribalda@chromium.org i2c: Restore initial power state if probe fails
SeongJae Park sj@kernel.org mm/damon/sysfs: fix wrong empty schemes assumption under online tuning in damon_sysfs_set_schemes()
Yajun Deng yajun.deng@linux.dev mm/damon: introduce struct damos_access_pattern
Ido Schimmel idosch@nvidia.com ipv4: Fix route deletion when nexthop info is not specified
David Ahern dsahern@kernel.org ipv4: Handle attempt to delete multipath route when fib_info contains an nh reference
Xiongfeng Wang wangxiongfeng2@huawei.com iommu/vt-d: Fix PCI device refcount leak in dmar_dev_scope_init()
Xiongfeng Wang wangxiongfeng2@huawei.com iommu/vt-d: Fix PCI device refcount leak in has_external_pci()
Caleb Sander csander@purestorage.com nvme: fix SRCU protection of nvme_ns_head list
Guo Ren guoren@kernel.org riscv: kexec: Fixup crash_smp_send_stop without multi cores
Guo Ren guoren@kernel.org riscv: kexec: Fixup irq controller broken in kexec crash path
Jisheng Zhang jszhang@kernel.org riscv: fix race when vmap stack overflow
Alexandre Ghiti alexghiti@rivosinc.com riscv: Sync efi page table's kernel mappings before switching
Maxim Korotkov korotkov.maxim.s@gmail.com pinctrl: single: Fix potential division by zero
Hui Tang tanghui20@huawei.com ASoC: tlv320adc3xxx: Fix build error for implicit function declaration
Mark Brown broonie@kernel.org ASoC: ops: Fix bounds check for _sx controls
Steven Rostedt (Google) rostedt@goodmis.org tracing: Free buffers when a used dynamic event is removed
Steven Rostedt (Google) rostedt@goodmis.org tracing: Fix race where histograms can be called before the event
Daniel Bristot de Oliveira bristot@kernel.org tracing/osnoise: Fix duration type
Janusz Krzysztofik janusz.krzysztofik@linux.intel.com drm/i915: Never return 0 if not all requests retired
Janusz Krzysztofik janusz.krzysztofik@linux.intel.com drm/i915: Fix negative value passed as remaining time
Leo Liu leo.liu@amd.com drm/amdgpu: enable Vangogh VCN indirect sram mode
Lee Jones lee@kernel.org drm/amdgpu: temporarily disable broken Clang builds due to blown stack-frame
Lee Jones lee@kernel.org Kconfig.debug: provide a little extra FRAME_WARN leeway when KASAN is enabled
Adrian Hunter adrian.hunter@intel.com mmc: sdhci: Fix voltage switch delay
Wenchao Chen wenchao.chen@unisoc.com mmc: sdhci-sprd: Fix no reset data and command after voltage switch
Sebastian Falbesoner sebastian.falbesoner@gmail.com mmc: sdhci-esdhc-imx: correct CQHCI exit halt state check
Christian Löhle CLoehle@hyperstone.com mmc: core: Fix ambiguous TRIM and DISCARD arg
Gaosheng Cui cuigaosheng1@huawei.com mmc: mtk-sd: Fix missing clk_disable_unprepare in msdc_of_clock_parse()
Ye Bin yebin10@huawei.com mmc: mmc_test: Fix removal of debugfs file
Goh, Wei Sheng wei.sheng.goh@intel.com net: stmmac: Set MAC's flow control register to reflect current settings
Gavin Shan gshan@redhat.com mm: migrate: fix THP's mapcount on isolation
Linus Torvalds torvalds@linux-foundation.org v4l2: don't fall back to follow_pfn() if pin_user_pages_fast() fails
Andy Shevchenko andriy.shevchenko@linux.intel.com pinctrl: intel: Save and restore pins in "direct IRQ" mode
Pawan Gupta pawan.kumar.gupta@linux.intel.com x86/bugs: Make sure MSR_SPEC_CTRL is updated properly upon resume from S3
ZhangPeng zhangpeng362@huawei.com nilfs2: fix NULL pointer dereference in nilfs_palloc_commit_free_entry()
Tiezhu Yang yangtiezhu@loongson.cn tools/vm/slabinfo-gnuplot: use "grep -E" instead of "egrep"
Steven Rostedt (Google) rostedt@goodmis.org error-injection: Add prompt for function error injection
Ziyang Xuan william.xuanziyang@huawei.com can: can327: can327_feed_frame_to_netdev(): fix potential skb leak when netdev is down
Takashi Sakamoto o-takashi@sakamocchi.jp ALSA: dice: fix regression for Lexicon I-ONIX FW810S
Björn Töpel bjorn@rivosinc.com riscv: mm: Proper page permissions after initmem free
Jisheng Zhang jszhang@kernel.org riscv: vdso: fix section overlapping under some conditions
Yuan Can yuancan@huawei.com hwmon: (asus-ec-sensors) Add checks for devm_kcalloc
Yang Yingliang yangyingliang@huawei.com hwmon: (coretemp) fix pci device refcount leak in nv1a_ram_new()
Phil Auld pauld@redhat.com hwmon: (coretemp) Check for null before removing sysfs attrs
Marc Dionne marc.dionne@auristor.com afs: Fix server->active leak in afs_put_server
Yoshihiro Shimoda yoshihiro.shimoda.uh@renesas.com net: ethernet: renesas: ravb: Fix promiscuous mode after system resumed
Zhengchao Shao shaozhengchao@huawei.com sctp: fix memory leak in sctp_stream_outq_migrate()
Willem de Bruijn willemb@google.com packet: do not set TP_STATUS_CSUM_VALID on CHECKSUM_COMPLETE
Chris Mi cmi@nvidia.com net/mlx5: Lag, Fix for loop when checking lag
Shigeru Yoshida syoshida@redhat.com net: tun: Fix use-after-free in tun_detach()
David Howells dhowells@redhat.com afs: Fix fileserver probe RTT handling
Yang Yingliang yangyingliang@huawei.com net: mdiobus: fix unbalanced node reference count
YueHaibing yuehaibing@huawei.com net: hsr: Fix potential use-after-free
Xin Long lucien.xin@gmail.com tipc: re-fetch skb cb after tipc_msg_validate
Paolo Abeni pabeni@redhat.com mptcp: fix sleep in atomic at close time
Menglong Dong imagedong@tencent.com mptcp: don't orphan ssk in mptcp_close()
Jerry Ray jerry.ray@microchip.com dsa: lan9303: Correct stat name
M Chetan Kumar m.chetan.kumar@linux.intel.com net: wwan: iosm: fix incorrect skb length
M Chetan Kumar m.chetan.kumar@linux.intel.com net: wwan: iosm: fix crash in peek throughput test
M Chetan Kumar m.chetan.kumar@linux.intel.com net: wwan: iosm: fix dma_alloc_coherent incompatible pointer type
M Chetan Kumar m.chetan.kumar@linux.intel.com net: wwan: iosm: fix kernel test robot reported error
Yuri Karpov YKarpov@ispras.ru net: ethernet: nixge: fix NULL dereference
Wang Hai wanghai38@huawei.com net/9p: Fix a potential socket leak in p9_socket_open
Yuan Can yuancan@huawei.com net: net_netdev: Fix error handling in ntb_netdev_init_module()
Zhang Changzhong zhangchangzhong@huawei.com net: ethernet: ti: am65-cpsw: fix error handling in am65_cpsw_nuss_probe()
Yang Yingliang yangyingliang@huawei.com net: phy: fix null-ptr-deref while probe() failed
Lorenzo Bianconi lorenzo@kernel.org wifi: mac8021: fix possible oob access in ieee80211_get_rate_duration
Johannes Berg johannes.berg@intel.com wifi: cfg80211: don't allow multi-BSSID in S1G
Johannes Berg johannes.berg@intel.com wifi: cfg80211: fix buffer overflow in elem comparison
Izabela Bakollari ibakolla@redhat.com aquantia: Do not purge addresses when setting the number of rings
Duoming Zhou duoming@zju.edu.cn qlcnic: fix sleep-in-atomic-context bugs caused by msleep
Amir Goldstein amir73il@gmail.com vfs: fix copy_file_range() averts filesystem freeze protection
Jiasheng Jiang jiasheng@iscas.ac.cn can: m_can: Add check for devm_clk_get
Zhang Changzhong zhangchangzhong@huawei.com can: m_can: pci: add missing m_can_class_free_dev() in probe/remove methods
Zhang Changzhong zhangchangzhong@huawei.com can: etas_es58x: es58x_init_netdev(): free netdev when register_candev()
Zhang Changzhong zhangchangzhong@huawei.com can: cc770: cc770_isa_probe(): add missing free_cc770dev()
Zhang Changzhong zhangchangzhong@huawei.com can: sja1000_isa: sja1000_isa_probe(): add missing free_sja1000dev()
Roi Dayan roid@nvidia.com net/mlx5e: Fix use-after-free when reverting termination table
YueHaibing yuehaibing@huawei.com net/mlx5: Fix uninitialized variable bug in outlen_write()
Chris Mi cmi@nvidia.com net/mlx5: E-switch, Fix duplicate lag creation
Chris Mi cmi@nvidia.com net/mlx5: E-switch, Destroy legacy fdb table when needed
YueHaibing yuehaibing@huawei.com net/mlx5: DR, Fix uninitialized var warning
Wang Hai wanghai38@huawei.com e100: Fix possible use after free in e100_xmit_prepare
Yuan Can yuancan@huawei.com iavf: Fix error handling in iavf_init_module()
Yuan Can yuancan@huawei.com fm10k: Fix error handling in fm10k_init_module()
Shang XiaoJing shangxiaojing@huawei.com i40e: Fix error handling in i40e_init_module()
Shang XiaoJing shangxiaojing@huawei.com ixgbevf: Fix resource leak in ixgbevf_init_module()
Shazad Hussain quic_shazhuss@quicinc.com clk: qcom: gcc-sc8280xp: add cxo as parent for three ufs ref clks
Yang Yingliang yangyingliang@huawei.com of: property: decrement node refcount in of_fwnode_get_reference_args()
Wei Yongjun weiyongjun1@huawei.com nvmem: rmem: Fix return value check in rmem_read()
Xu Kuohai xukuohai@huawei.com bpf: Do not copy spin lock field from user in bpf_selem_alloc
Joe Korty joe.korty@concurrent-rt.com clocksource/drivers/arm_arch_timer: Fix XGene-1 TVAL register math error
Gaosheng Cui cuigaosheng1@huawei.com hwmon: (ibmpex) Fix possible UAF when ibmpex_register_bmc() fails
Yang Yingliang yangyingliang@huawei.com hwmon: (i5500_temp) fix missing pci_disable_device()
Ninad Malwade nmalwade@nvidia.com hwmon: (ina3221) Fix shunt sum critical calculation
Derek Nguyen derek.nguyen@collins.com hwmon: (ltc2947) fix temperature scaling
Hou Tao houtao1@huawei.com libbpf: Handle size overflow for ringbuf mmap
Michael Grzeschik m.grzeschik@pengutronix.de ARM: at91: rm9200: fix usb device clock id
Srikar Dronamraju srikar@linux.vnet.ibm.com scripts/faddr2line: Fix regression in name resolution on ppc64le
Hou Tao houtao1@huawei.com bpf, perf: Use subprog name when reporting subprog ksymbol
Jiri Olsa jolsa@kernel.org libbpf: Use correct return pointer in attach_raw_tp
Paul Gazzillo paul@pgazz.com iio: light: rpr0521: add missing Kconfig dependencies
Wei Yongjun weiyongjun1@huawei.com iio: health: afe4404: Fix oob read in afe4404_[read|write]_raw
Wei Yongjun weiyongjun1@huawei.com iio: health: afe4403: Fix oob read in afe4403_read_raw
Stephen Boyd swboyd@chromium.org clk: qcom: gdsc: Remove direct runtime PM calls
Johan Hovold johan+linaro@kernel.org clk: qcom: gdsc: add missing error handling
David Virag virag.david003@gmail.com clk: samsung: exynos7885: Correct "div4" clock parents
lyndonli Lyndon.Li@amd.com drm/amd/pm: update driver if header for smu_13_0_7
Kenneth Feng kenneth.feng@amd.com drm/amd/pm: update driver-if header for smu_v13_0_10
Yang Wang KevinYang.Wang@amd.com drm/amd/pm: add smu_v13_0_10 driver if version
Sam James sam@gentoo.org kbuild: fix -Wimplicit-function-declaration in license_is_gpl_compatible
Christian König christian.koenig@amd.com drm/amdgpu: fix userptr HMM range handling v2
Christian König christian.koenig@amd.com drm/amdgpu: cleanup error handling in amdgpu_cs_parser_bos
Christian König christian.koenig@amd.com drm/amdgpu: move setting the job resources
ChenXiaoSong chenxiaosong2@huawei.com btrfs: qgroup: fix sleep from invalid context bug in btrfs_qgroup_inherit()
-------------
Diffstat:
Makefile | 4 +- arch/arm/boot/dts/at91rm9200.dtsi | 2 +- arch/powerpc/net/bpf_jit_comp32.c | 52 ++++----- arch/riscv/include/asm/asm.h | 1 + arch/riscv/include/asm/efi.h | 6 +- arch/riscv/include/asm/pgalloc.h | 11 +- arch/riscv/include/asm/smp.h | 3 + arch/riscv/kernel/entry.S | 13 +++ arch/riscv/kernel/machine_kexec.c | 46 ++++++-- arch/riscv/kernel/setup.c | 9 +- arch/riscv/kernel/smp.c | 97 ++++++++++++++++- arch/riscv/kernel/traps.c | 18 ++++ arch/riscv/kernel/vdso/Makefile | 1 + arch/x86/include/asm/nospec-branch.h | 2 +- arch/x86/kernel/cpu/bugs.c | 21 ++-- arch/x86/kernel/process.c | 2 +- drivers/acpi/numa/hmat.c | 27 +++-- drivers/char/tpm/tpm-interface.c | 5 +- drivers/clk/at91/at91rm9200.c | 2 +- drivers/clk/qcom/gcc-sc8280xp.c | 6 ++ drivers/clk/qcom/gdsc.c | 70 +++--------- drivers/clk/qcom/gdsc.h | 2 - drivers/clk/samsung/clk-exynos7885.c | 4 +- drivers/clocksource/arm_arch_timer.c | 7 +- drivers/clocksource/timer-riscv.c | 2 +- drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 12 ++- drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c | 1 + drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.h | 3 + drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 60 ++++------- drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c | 6 +- drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 17 +++ drivers/gpu/drm/amd/amdgpu/amdgpu_job.h | 2 + drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 53 +++------- drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h | 14 ++- drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c | 3 + drivers/gpu/drm/amd/display/Kconfig | 7 ++ .../pm/swsmu/inc/pmfw_if/smu13_driver_if_v13_0_0.h | 111 +++++++++++++------ .../pm/swsmu/inc/pmfw_if/smu13_driver_if_v13_0_7.h | 117 ++++++++++++++------- drivers/gpu/drm/amd/pm/swsmu/inc/smu_v13_0.h | 5 +- drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c | 3 +- drivers/gpu/drm/i915/gt/intel_gt.c | 9 +- drivers/gpu/drm/i915/gt/intel_gt_requests.c | 2 +- drivers/hwmon/asus-ec-sensors.c | 2 + drivers/hwmon/coretemp.c | 9 +- drivers/hwmon/i5500_temp.c | 2 +- drivers/hwmon/ibmpex.c | 1 + drivers/hwmon/ina3221.c | 4 +- drivers/hwmon/ltc2947-core.c | 2 +- drivers/i2c/busses/i2c-imx.c | 6 +- drivers/i2c/busses/i2c-npcm7xx.c | 11 +- drivers/i2c/busses/i2c-qcom-geni.c | 1 - drivers/i2c/i2c-core-base.c | 9 +- drivers/iio/health/afe4403.c | 5 +- drivers/iio/health/afe4404.c | 12 ++- drivers/iio/light/Kconfig | 2 + drivers/input/touchscreen/raydium_i2c_ts.c | 4 +- drivers/iommu/intel/dmar.c | 1 + drivers/iommu/intel/iommu.c | 4 +- drivers/media/common/videobuf2/frame_vector.c | 68 +++--------- drivers/mmc/core/core.c | 9 +- drivers/mmc/core/mmc_test.c | 3 +- drivers/mmc/host/mtk-sd.c | 6 +- drivers/mmc/host/sdhci-esdhc-imx.c | 2 +- drivers/mmc/host/sdhci-sprd.c | 4 +- drivers/mmc/host/sdhci.c | 61 +++++++++-- drivers/mmc/host/sdhci.h | 2 + drivers/net/can/can327.c | 4 +- drivers/net/can/cc770/cc770_isa.c | 10 +- drivers/net/can/m_can/m_can.c | 2 +- drivers/net/can/m_can/m_can_pci.c | 9 +- drivers/net/can/sja1000/sja1000_isa.c | 10 +- drivers/net/can/usb/etas_es58x/es58x_core.c | 5 +- drivers/net/dsa/lan9303-core.c | 2 +- .../net/ethernet/aquantia/atlantic/aq_ethtool.c | 5 +- drivers/net/ethernet/aquantia/atlantic/aq_main.c | 4 +- drivers/net/ethernet/aquantia/atlantic/aq_main.h | 2 + drivers/net/ethernet/intel/e100.c | 5 +- drivers/net/ethernet/intel/fm10k/fm10k_main.c | 10 +- drivers/net/ethernet/intel/i40e/i40e_main.c | 11 +- drivers/net/ethernet/intel/iavf/iavf_main.c | 9 +- drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 10 +- drivers/net/ethernet/mellanox/mlx5/core/cmd.c | 4 +- drivers/net/ethernet/mellanox/mlx5/core/eswitch.c | 3 + drivers/net/ethernet/mellanox/mlx5/core/eswitch.h | 8 ++ .../ethernet/mellanox/mlx5/core/eswitch_offloads.c | 7 ++ .../mellanox/mlx5/core/eswitch_offloads_termtbl.c | 2 + drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c | 9 +- .../mellanox/mlx5/core/steering/dr_table.c | 5 +- drivers/net/ethernet/ni/nixge.c | 29 ++--- .../net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.c | 4 +- drivers/net/ethernet/renesas/ravb_main.c | 1 + drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c | 2 + drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 12 ++- drivers/net/ethernet/ti/am65-cpsw-nuss.c | 2 +- drivers/net/mdio/fwnode_mdio.c | 2 +- drivers/net/ntb_netdev.c | 9 +- drivers/net/phy/phy_device.c | 2 + drivers/net/tun.c | 4 +- drivers/net/wwan/iosm/iosm_ipc_mux_codec.c | 26 ++--- drivers/net/wwan/iosm/iosm_ipc_protocol.h | 2 +- drivers/nvme/host/core.c | 2 +- drivers/nvme/host/multipath.c | 3 + drivers/nvmem/rmem.c | 4 +- drivers/of/property.c | 4 +- drivers/pinctrl/intel/pinctrl-intel.c | 27 ++++- drivers/pinctrl/pinctrl-single.c | 2 +- fs/afs/fs_probe.c | 4 +- fs/afs/server.c | 2 +- fs/btrfs/qgroup.c | 9 +- fs/ksmbd/vfs.c | 6 +- fs/nfsd/vfs.c | 4 +- fs/nilfs2/dat.c | 7 ++ fs/read_write.c | 19 +++- include/linux/damon.h | 37 ++++--- include/linux/fs.h | 8 ++ include/linux/license.h | 2 + include/linux/mmc/mmc.h | 2 +- include/net/sctp/stream_sched.h | 2 + kernel/bpf/bpf_local_storage.c | 2 +- kernel/events/core.c | 2 +- kernel/trace/trace_dynevent.c | 2 + kernel/trace/trace_events.c | 11 +- kernel/trace/trace_events_hist.c | 3 + kernel/trace/trace_osnoise.c | 6 +- lib/Kconfig.debug | 9 +- mm/compaction.c | 22 ++-- mm/damon/core.c | 31 +++--- mm/damon/dbgfs.c | 27 +++-- mm/damon/lru_sort.c | 46 ++++---- mm/damon/reclaim.c | 23 ++-- mm/damon/sysfs.c | 63 +++++++++-- net/9p/trans_fd.c | 4 +- net/hsr/hsr_forward.c | 5 +- net/ipv4/fib_semantics.c | 10 +- net/mac80211/airtime.c | 3 + net/mptcp/protocol.c | 13 ++- net/mptcp/subflow.c | 6 +- net/packet/af_packet.c | 6 +- net/sctp/stream.c | 25 +++-- net/sctp/stream_sched.c | 5 + net/sctp/stream_sched_prio.c | 19 ++++ net/sctp/stream_sched_rr.c | 5 + net/tipc/crypto.c | 3 + net/wireless/scan.c | 10 +- scripts/faddr2line | 7 +- sound/firewire/dice/dice-stream.c | 12 ++- sound/soc/codecs/tlv320adc3xxx.c | 3 + sound/soc/soc-ops.c | 2 +- tools/lib/bpf/libbpf.c | 2 +- tools/lib/bpf/ringbuf.c | 12 ++- tools/testing/selftests/net/fib_nexthops.sh | 16 +++ tools/vm/slabinfo-gnuplot.sh | 4 +- 152 files changed, 1236 insertions(+), 626 deletions(-)
From: ChenXiaoSong chenxiaosong2@huawei.com
[ Upstream commit f7e942b5bb35d8e3af54053d19a6bf04143a3955 ]
Syzkaller reported BUG as follows:
BUG: sleeping function called from invalid context at include/linux/sched/mm.h:274 Call Trace: <TASK> dump_stack_lvl+0xcd/0x134 __might_resched.cold+0x222/0x26b kmem_cache_alloc+0x2e7/0x3c0 update_qgroup_limit_item+0xe1/0x390 btrfs_qgroup_inherit+0x147b/0x1ee0 create_subvol+0x4eb/0x1710 btrfs_mksubvol+0xfe5/0x13f0 __btrfs_ioctl_snap_create+0x2b0/0x430 btrfs_ioctl_snap_create_v2+0x25a/0x520 btrfs_ioctl+0x2a1c/0x5ce0 __x64_sys_ioctl+0x193/0x200 do_syscall_64+0x35/0x80
Fix this by calling qgroup_dirty() on @dstqgroup, and update limit item in btrfs_run_qgroups() later outside of the spinlock context.
CC: stable@vger.kernel.org # 4.9+ Reviewed-by: Qu Wenruo wqu@suse.com Signed-off-by: ChenXiaoSong chenxiaosong2@huawei.com Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/btrfs/qgroup.c | 9 +-------- 1 file changed, 1 insertion(+), 8 deletions(-)
diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c index ba323dcb0a0b..db56e0c0e9ac 100644 --- a/fs/btrfs/qgroup.c +++ b/fs/btrfs/qgroup.c @@ -2920,14 +2920,7 @@ int btrfs_qgroup_inherit(struct btrfs_trans_handle *trans, u64 srcid, dstgroup->rsv_rfer = inherit->lim.rsv_rfer; dstgroup->rsv_excl = inherit->lim.rsv_excl;
- ret = update_qgroup_limit_item(trans, dstgroup); - if (ret) { - fs_info->qgroup_flags |= BTRFS_QGROUP_STATUS_FLAG_INCONSISTENT; - btrfs_info(fs_info, - "unable to update quota limit for %llu", - dstgroup->qgroupid); - goto unlock; - } + qgroup_dirty(fs_info, dstgroup); }
if (srcid) {
From: Christian König christian.koenig@amd.com
[ Upstream commit 736ec9fadd7a1fde8480df7e5cfac465c07ff6f3 ]
Move setting the job resources into amdgpu_job.c
Signed-off-by: Christian König christian.koenig@amd.com Reviewed-by: Andrey Grodzovsky andrey.grodzovsky@amd.com Reviewed-by: Luben Tuikov luben.tuikov@amd.com Reviewed-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Stable-dep-of: 4458da0bb09d ("drm/amdgpu: fix userptr HMM range handling v2") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 21 ++------------------- drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 17 +++++++++++++++++ drivers/gpu/drm/amd/amdgpu/amdgpu_job.h | 2 ++ 3 files changed, 21 insertions(+), 19 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c index b7bae833c804..aa3ce01cd538 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c @@ -495,9 +495,6 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p, struct amdgpu_vm *vm = &fpriv->vm; struct amdgpu_bo_list_entry *e; struct list_head duplicates; - struct amdgpu_bo *gds; - struct amdgpu_bo *gws; - struct amdgpu_bo *oa; int r;
INIT_LIST_HEAD(&p->validated); @@ -614,22 +611,8 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p, amdgpu_cs_report_moved_bytes(p->adev, p->bytes_moved, p->bytes_moved_vis);
- gds = p->bo_list->gds_obj; - gws = p->bo_list->gws_obj; - oa = p->bo_list->oa_obj; - - if (gds) { - p->job->gds_base = amdgpu_bo_gpu_offset(gds) >> PAGE_SHIFT; - p->job->gds_size = amdgpu_bo_size(gds) >> PAGE_SHIFT; - } - if (gws) { - p->job->gws_base = amdgpu_bo_gpu_offset(gws) >> PAGE_SHIFT; - p->job->gws_size = amdgpu_bo_size(gws) >> PAGE_SHIFT; - } - if (oa) { - p->job->oa_base = amdgpu_bo_gpu_offset(oa) >> PAGE_SHIFT; - p->job->oa_size = amdgpu_bo_size(oa) >> PAGE_SHIFT; - } + amdgpu_job_set_resources(p->job, p->bo_list->gds_obj, + p->bo_list->gws_obj, p->bo_list->oa_obj);
if (!r && p->uf_entry.tv.bo) { struct amdgpu_bo *uf = ttm_to_amdgpu_bo(p->uf_entry.tv.bo); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c index c2fd6f3076a6..3b025aace283 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c @@ -129,6 +129,23 @@ int amdgpu_job_alloc_with_ib(struct amdgpu_device *adev, unsigned size, return r; }
+void amdgpu_job_set_resources(struct amdgpu_job *job, struct amdgpu_bo *gds, + struct amdgpu_bo *gws, struct amdgpu_bo *oa) +{ + if (gds) { + job->gds_base = amdgpu_bo_gpu_offset(gds) >> PAGE_SHIFT; + job->gds_size = amdgpu_bo_size(gds) >> PAGE_SHIFT; + } + if (gws) { + job->gws_base = amdgpu_bo_gpu_offset(gws) >> PAGE_SHIFT; + job->gws_size = amdgpu_bo_size(gws) >> PAGE_SHIFT; + } + if (oa) { + job->oa_base = amdgpu_bo_gpu_offset(oa) >> PAGE_SHIFT; + job->oa_size = amdgpu_bo_size(oa) >> PAGE_SHIFT; + } +} + void amdgpu_job_free_resources(struct amdgpu_job *job) { struct amdgpu_ring *ring = to_amdgpu_ring(job->base.sched); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.h index babc0af751c2..2a1961bf1194 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.h @@ -76,6 +76,8 @@ int amdgpu_job_alloc(struct amdgpu_device *adev, unsigned num_ibs, struct amdgpu_job **job, struct amdgpu_vm *vm); int amdgpu_job_alloc_with_ib(struct amdgpu_device *adev, unsigned size, enum amdgpu_ib_pool_type pool, struct amdgpu_job **job); +void amdgpu_job_set_resources(struct amdgpu_job *job, struct amdgpu_bo *gds, + struct amdgpu_bo *gws, struct amdgpu_bo *oa); void amdgpu_job_free_resources(struct amdgpu_job *job); void amdgpu_job_free(struct amdgpu_job *job); int amdgpu_job_submit(struct amdgpu_job *job, struct drm_sched_entity *entity,
From: Christian König christian.koenig@amd.com
[ Upstream commit 4953b6b22ab9d7f64706631a027b1ed1130ce4c8 ]
Return early on success and so remove all those "if (r)" in the error path.
Signed-off-by: Christian König christian.koenig@amd.com Reviewed-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Stable-dep-of: 4458da0bb09d ("drm/amdgpu: fix userptr HMM range handling v2") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 37 +++++++++++++------------- 1 file changed, 18 insertions(+), 19 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c index aa3ce01cd538..fee99a40804e 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c @@ -608,35 +608,34 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p, if (r) goto error_validate;
- amdgpu_cs_report_moved_bytes(p->adev, p->bytes_moved, - p->bytes_moved_vis); - - amdgpu_job_set_resources(p->job, p->bo_list->gds_obj, - p->bo_list->gws_obj, p->bo_list->oa_obj); - - if (!r && p->uf_entry.tv.bo) { + if (p->uf_entry.tv.bo) { struct amdgpu_bo *uf = ttm_to_amdgpu_bo(p->uf_entry.tv.bo);
r = amdgpu_ttm_alloc_gart(&uf->tbo); + if (r) + goto error_validate; + p->job->uf_addr += amdgpu_bo_gpu_offset(uf); }
+ amdgpu_cs_report_moved_bytes(p->adev, p->bytes_moved, + p->bytes_moved_vis); + amdgpu_job_set_resources(p->job, p->bo_list->gds_obj, + p->bo_list->gws_obj, p->bo_list->oa_obj); + return 0; + error_validate: - if (r) - ttm_eu_backoff_reservation(&p->ticket, &p->validated); + ttm_eu_backoff_reservation(&p->ticket, &p->validated);
out_free_user_pages: - if (r) { - amdgpu_bo_list_for_each_userptr_entry(e, p->bo_list) { - struct amdgpu_bo *bo = ttm_to_amdgpu_bo(e->tv.bo); + amdgpu_bo_list_for_each_userptr_entry(e, p->bo_list) { + struct amdgpu_bo *bo = ttm_to_amdgpu_bo(e->tv.bo);
- if (!e->user_pages) - continue; - amdgpu_ttm_tt_get_user_pages_done(bo->tbo.ttm); - kvfree(e->user_pages); - e->user_pages = NULL; - } - mutex_unlock(&p->bo_list->bo_list_mutex); + if (!e->user_pages) + continue; + amdgpu_ttm_tt_get_user_pages_done(bo->tbo.ttm); + kvfree(e->user_pages); + e->user_pages = NULL; } return r; }
From: Christian König christian.koenig@amd.com
[ Upstream commit 4458da0bb09d4435956b4377685e8836935e9b9d ]
The basic problem here is that it's not allowed to page fault while holding the reservation lock.
So it can happen that multiple processes try to validate an userptr at the same time.
Work around that by putting the HMM range object into the mutex protected bo list for now.
v2: make sure range is set to NULL in case of an error
Signed-off-by: Christian König christian.koenig@amd.com Reviewed-by: Alex Deucher alexander.deucher@amd.com Reviewed-by: Felix Kuehling Felix.Kuehling@amd.com CC: stable@vger.kernel.org Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 12 +++-- drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c | 1 + drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.h | 3 ++ drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 8 +-- drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c | 6 ++- drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 53 ++++++------------- drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h | 14 +++-- 7 files changed, 46 insertions(+), 51 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c index 7db4aef9c45c..5e184952ec98 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c @@ -985,6 +985,7 @@ static int init_user_pages(struct kgd_mem *mem, uint64_t user_addr, struct amdkfd_process_info *process_info = mem->process_info; struct amdgpu_bo *bo = mem->bo; struct ttm_operation_ctx ctx = { true, false }; + struct hmm_range *range; int ret = 0;
mutex_lock(&process_info->lock); @@ -1014,7 +1015,7 @@ static int init_user_pages(struct kgd_mem *mem, uint64_t user_addr, return 0; }
- ret = amdgpu_ttm_tt_get_user_pages(bo, bo->tbo.ttm->pages); + ret = amdgpu_ttm_tt_get_user_pages(bo, bo->tbo.ttm->pages, &range); if (ret) { pr_err("%s: Failed to get user pages: %d\n", __func__, ret); goto unregister_out; @@ -1032,7 +1033,7 @@ static int init_user_pages(struct kgd_mem *mem, uint64_t user_addr, amdgpu_bo_unreserve(bo);
release_out: - amdgpu_ttm_tt_get_user_pages_done(bo->tbo.ttm); + amdgpu_ttm_tt_get_user_pages_done(bo->tbo.ttm, range); unregister_out: if (ret) amdgpu_mn_unregister(bo); @@ -2367,6 +2368,8 @@ static int update_invalid_user_pages(struct amdkfd_process_info *process_info, /* Go through userptr_inval_list and update any invalid user_pages */ list_for_each_entry(mem, &process_info->userptr_inval_list, validate_list.head) { + struct hmm_range *range; + invalid = atomic_read(&mem->invalid); if (!invalid) /* BO hasn't been invalidated since the last @@ -2377,7 +2380,8 @@ static int update_invalid_user_pages(struct amdkfd_process_info *process_info, bo = mem->bo;
/* Get updated user pages */ - ret = amdgpu_ttm_tt_get_user_pages(bo, bo->tbo.ttm->pages); + ret = amdgpu_ttm_tt_get_user_pages(bo, bo->tbo.ttm->pages, + &range); if (ret) { pr_debug("Failed %d to get user pages\n", ret);
@@ -2396,7 +2400,7 @@ static int update_invalid_user_pages(struct amdkfd_process_info *process_info, * FIXME: Cannot ignore the return code, must hold * notifier_lock */ - amdgpu_ttm_tt_get_user_pages_done(bo->tbo.ttm); + amdgpu_ttm_tt_get_user_pages_done(bo->tbo.ttm, range); }
/* Mark the BO as valid unless it was invalidated diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c index 2168163aad2d..252a876b0725 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c @@ -209,6 +209,7 @@ void amdgpu_bo_list_get_list(struct amdgpu_bo_list *list, list_add_tail(&e->tv.head, &bucket[priority]);
e->user_pages = NULL; + e->range = NULL; }
/* Connect the sorted buckets in the output list. */ diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.h index 9caea1688fc3..e4d78491bcc7 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.h @@ -26,6 +26,8 @@ #include <drm/ttm/ttm_execbuf_util.h> #include <drm/amdgpu_drm.h>
+struct hmm_range; + struct amdgpu_device; struct amdgpu_bo; struct amdgpu_bo_va; @@ -36,6 +38,7 @@ struct amdgpu_bo_list_entry { struct amdgpu_bo_va *bo_va; uint32_t priority; struct page **user_pages; + struct hmm_range *range; bool user_invalidated; };
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c index fee99a40804e..7e350ea0368b 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c @@ -548,7 +548,7 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p, goto out_free_user_pages; }
- r = amdgpu_ttm_tt_get_user_pages(bo, e->user_pages); + r = amdgpu_ttm_tt_get_user_pages(bo, e->user_pages, &e->range); if (r) { kvfree(e->user_pages); e->user_pages = NULL; @@ -633,9 +633,10 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
if (!e->user_pages) continue; - amdgpu_ttm_tt_get_user_pages_done(bo->tbo.ttm); + amdgpu_ttm_tt_get_user_pages_done(bo->tbo.ttm, e->range); kvfree(e->user_pages); e->user_pages = NULL; + e->range = NULL; } return r; } @@ -1230,7 +1231,8 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p, amdgpu_bo_list_for_each_userptr_entry(e, p->bo_list) { struct amdgpu_bo *bo = ttm_to_amdgpu_bo(e->tv.bo);
- r |= !amdgpu_ttm_tt_get_user_pages_done(bo->tbo.ttm); + r |= !amdgpu_ttm_tt_get_user_pages_done(bo->tbo.ttm, e->range); + e->range = NULL; } if (r) { r = -EAGAIN; diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c index 111484ceb47d..91571b1324f2 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c @@ -378,6 +378,7 @@ int amdgpu_gem_userptr_ioctl(struct drm_device *dev, void *data, struct amdgpu_device *adev = drm_to_adev(dev); struct drm_amdgpu_gem_userptr *args = data; struct drm_gem_object *gobj; + struct hmm_range *range; struct amdgpu_bo *bo; uint32_t handle; int r; @@ -418,7 +419,8 @@ int amdgpu_gem_userptr_ioctl(struct drm_device *dev, void *data, goto release_object;
if (args->flags & AMDGPU_GEM_USERPTR_VALIDATE) { - r = amdgpu_ttm_tt_get_user_pages(bo, bo->tbo.ttm->pages); + r = amdgpu_ttm_tt_get_user_pages(bo, bo->tbo.ttm->pages, + &range); if (r) goto release_object;
@@ -441,7 +443,7 @@ int amdgpu_gem_userptr_ioctl(struct drm_device *dev, void *data,
user_pages_done: if (args->flags & AMDGPU_GEM_USERPTR_VALIDATE) - amdgpu_ttm_tt_get_user_pages_done(bo->tbo.ttm); + amdgpu_ttm_tt_get_user_pages_done(bo->tbo.ttm, range);
release_object: drm_gem_object_put(gobj); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c index 9e6c23266a1a..dfb8875e0f28 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c @@ -642,9 +642,6 @@ struct amdgpu_ttm_tt { struct task_struct *usertask; uint32_t userflags; bool bound; -#if IS_ENABLED(CONFIG_DRM_AMDGPU_USERPTR) - struct hmm_range *range; -#endif };
#define ttm_to_amdgpu_ttm_tt(ptr) container_of(ptr, struct amdgpu_ttm_tt, ttm) @@ -657,7 +654,8 @@ struct amdgpu_ttm_tt { * Calling function must call amdgpu_ttm_tt_userptr_range_done() once and only * once afterwards to stop HMM tracking */ -int amdgpu_ttm_tt_get_user_pages(struct amdgpu_bo *bo, struct page **pages) +int amdgpu_ttm_tt_get_user_pages(struct amdgpu_bo *bo, struct page **pages, + struct hmm_range **range) { struct ttm_tt *ttm = bo->tbo.ttm; struct amdgpu_ttm_tt *gtt = ttm_to_amdgpu_ttm_tt(ttm); @@ -667,16 +665,15 @@ int amdgpu_ttm_tt_get_user_pages(struct amdgpu_bo *bo, struct page **pages) bool readonly; int r = 0;
+ /* Make sure get_user_pages_done() can cleanup gracefully */ + *range = NULL; + mm = bo->notifier.mm; if (unlikely(!mm)) { DRM_DEBUG_DRIVER("BO is not registered?\n"); return -EFAULT; }
- /* Another get_user_pages is running at the same time?? */ - if (WARN_ON(gtt->range)) - return -EFAULT; - if (!mmget_not_zero(mm)) /* Happens during process shutdown */ return -ESRCH;
@@ -694,7 +691,7 @@ int amdgpu_ttm_tt_get_user_pages(struct amdgpu_bo *bo, struct page **pages)
readonly = amdgpu_ttm_tt_is_readonly(ttm); r = amdgpu_hmm_range_get_pages(&bo->notifier, mm, pages, start, - ttm->num_pages, >t->range, readonly, + ttm->num_pages, range, readonly, true, NULL); out_unlock: mmap_read_unlock(mm); @@ -712,30 +709,24 @@ int amdgpu_ttm_tt_get_user_pages(struct amdgpu_bo *bo, struct page **pages) * * Returns: true if pages are still valid */ -bool amdgpu_ttm_tt_get_user_pages_done(struct ttm_tt *ttm) +bool amdgpu_ttm_tt_get_user_pages_done(struct ttm_tt *ttm, + struct hmm_range *range) { struct amdgpu_ttm_tt *gtt = ttm_to_amdgpu_ttm_tt(ttm); - bool r = false;
- if (!gtt || !gtt->userptr) + if (!gtt || !gtt->userptr || !range) return false;
DRM_DEBUG_DRIVER("user_pages_done 0x%llx pages 0x%x\n", gtt->userptr, ttm->num_pages);
- WARN_ONCE(!gtt->range || !gtt->range->hmm_pfns, - "No user pages to check\n"); + WARN_ONCE(!range->hmm_pfns, "No user pages to check\n");
- if (gtt->range) { - /* - * FIXME: Must always hold notifier_lock for this, and must - * not ignore the return code. - */ - r = amdgpu_hmm_range_get_pages_done(gtt->range); - gtt->range = NULL; - } - - return !r; + /* + * FIXME: Must always hold notifier_lock for this, and must + * not ignore the return code. + */ + return !amdgpu_hmm_range_get_pages_done(range); } #endif
@@ -812,20 +803,6 @@ static void amdgpu_ttm_tt_unpin_userptr(struct ttm_device *bdev, /* unmap the pages mapped to the device */ dma_unmap_sgtable(adev->dev, ttm->sg, direction, 0); sg_free_table(ttm->sg); - -#if IS_ENABLED(CONFIG_DRM_AMDGPU_USERPTR) - if (gtt->range) { - unsigned long i; - - for (i = 0; i < ttm->num_pages; i++) { - if (ttm->pages[i] != - hmm_pfn_to_page(gtt->range->hmm_pfns[i])) - break; - } - - WARN((i == ttm->num_pages), "Missing get_user_page_done\n"); - } -#endif }
static void amdgpu_ttm_gart_bind(struct amdgpu_device *adev, diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h index 6a70818039dd..a37207011a69 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h @@ -39,6 +39,8 @@
#define AMDGPU_POISON 0xd0bed0be
+struct hmm_range; + struct amdgpu_gtt_mgr { struct ttm_resource_manager manager; struct drm_mm mm; @@ -149,15 +151,19 @@ void amdgpu_ttm_recover_gart(struct ttm_buffer_object *tbo); uint64_t amdgpu_ttm_domain_start(struct amdgpu_device *adev, uint32_t type);
#if IS_ENABLED(CONFIG_DRM_AMDGPU_USERPTR) -int amdgpu_ttm_tt_get_user_pages(struct amdgpu_bo *bo, struct page **pages); -bool amdgpu_ttm_tt_get_user_pages_done(struct ttm_tt *ttm); +int amdgpu_ttm_tt_get_user_pages(struct amdgpu_bo *bo, struct page **pages, + struct hmm_range **range); +bool amdgpu_ttm_tt_get_user_pages_done(struct ttm_tt *ttm, + struct hmm_range *range); #else static inline int amdgpu_ttm_tt_get_user_pages(struct amdgpu_bo *bo, - struct page **pages) + struct page **pages, + struct hmm_range **range) { return -EPERM; } -static inline bool amdgpu_ttm_tt_get_user_pages_done(struct ttm_tt *ttm) +static inline bool amdgpu_ttm_tt_get_user_pages_done(struct ttm_tt *ttm, + struct hmm_range *range) { return false; }
From: Sam James sam@gentoo.org
[ Upstream commit 50c697215a8cc22f0e58c88f06f2716c05a26e85 ]
Add missing <linux/string.h> include for strcmp.
Clang 16 makes -Wimplicit-function-declaration an error by default. Unfortunately, out of tree modules may use this in configure scripts, which means failure might cause silent miscompilation or misconfiguration.
For more information, see LWN.net [0] or LLVM's Discourse [1], gentoo-dev@ [2], or the (new) c-std-porting mailing list [3].
[0] https://lwn.net/Articles/913505/ [1] https://discourse.llvm.org/t/configure-script-breakage-with-the-new-werror-i... [2] https://archives.gentoo.org/gentoo-dev/message/dd9f2d3082b8b6f8dfbccb0639e6e... [3] hosted at lists.linux.dev.
[akpm@linux-foundation.org: remember "linux/"] Link: https://lkml.kernel.org/r/20221116182634.2823136-1-sam@gentoo.org Signed-off-by: Sam James sam@gentoo.org Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/license.h | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/include/linux/license.h b/include/linux/license.h index 7cce390f120b..ad937f57f2cb 100644 --- a/include/linux/license.h +++ b/include/linux/license.h @@ -2,6 +2,8 @@ #ifndef __LICENSE_H #define __LICENSE_H
+#include <linux/string.h> + static inline int license_is_gpl_compatible(const char *license) { return (strcmp(license, "GPL") == 0
From: Yang Wang KevinYang.Wang@amd.com
[ Upstream commit 8e039cd176c61a9770e1956038c93738efc800f7 ]
add smu_v13_0_10 driver if version
Signed-off-by: Yang Wang KevinYang.Wang@amd.com Reviewed-by: Hawking Zhang Hawking.Zhang@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Stable-dep-of: f2e1aa267f12 ("drm/amd/pm: update driver if header for smu_13_0_7") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/pm/swsmu/inc/smu_v13_0.h | 1 + drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c | 3 +++ 2 files changed, 4 insertions(+)
diff --git a/drivers/gpu/drm/amd/pm/swsmu/inc/smu_v13_0.h b/drivers/gpu/drm/amd/pm/swsmu/inc/smu_v13_0.h index 3e29fe4cc4ae..dd5867561068 100644 --- a/drivers/gpu/drm/amd/pm/swsmu/inc/smu_v13_0.h +++ b/drivers/gpu/drm/amd/pm/swsmu/inc/smu_v13_0.h @@ -32,6 +32,7 @@ #define SMU13_DRIVER_IF_VERSION_SMU_V13_0_5 0x04 #define SMU13_DRIVER_IF_VERSION_SMU_V13_0_0 0x30 #define SMU13_DRIVER_IF_VERSION_SMU_V13_0_7 0x2C +#define SMU13_DRIVER_IF_VERSION_SMU_V13_0_10 0x1D
#define SMU13_MODE1_RESET_WAIT_TIME_IN_MS 500 //500ms
diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c index 33710dcf1eb1..e7380aa4f6be 100644 --- a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c +++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c @@ -304,6 +304,9 @@ int smu_v13_0_check_fw_version(struct smu_context *smu) case IP_VERSION(13, 0, 5): smu->smc_driver_if_version = SMU13_DRIVER_IF_VERSION_SMU_V13_0_5; break; + case IP_VERSION(13, 0, 10): + smu->smc_driver_if_version = SMU13_DRIVER_IF_VERSION_SMU_V13_0_10; + break; default: dev_err(adev->dev, "smu unsupported IP version: 0x%x.\n", adev->ip_versions[MP1_HWIP][0]);
From: Kenneth Feng kenneth.feng@amd.com
[ Upstream commit 09aef0258a327409bb2279a5ba8f82ad2ca099ca ]
update driver-if header for smu_v13_0_10 and merge with smu_v13_0_0
Signed-off-by: Kenneth Feng kenneth.feng@amd.com Reviewed-by: Hawking Zhang Hawking.Zhang@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Stable-dep-of: f2e1aa267f12 ("drm/amd/pm: update driver if header for smu_13_0_7") Signed-off-by: Sasha Levin sashal@kernel.org --- .../inc/pmfw_if/smu13_driver_if_v13_0_0.h | 111 +++++++++++++----- drivers/gpu/drm/amd/pm/swsmu/inc/smu_v13_0.h | 2 +- .../gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c | 6 +- 3 files changed, 84 insertions(+), 35 deletions(-)
diff --git a/drivers/gpu/drm/amd/pm/swsmu/inc/pmfw_if/smu13_driver_if_v13_0_0.h b/drivers/gpu/drm/amd/pm/swsmu/inc/pmfw_if/smu13_driver_if_v13_0_0.h index 063f4a737605..b76f0f7e4299 100644 --- a/drivers/gpu/drm/amd/pm/swsmu/inc/pmfw_if/smu13_driver_if_v13_0_0.h +++ b/drivers/gpu/drm/amd/pm/swsmu/inc/pmfw_if/smu13_driver_if_v13_0_0.h @@ -25,7 +25,7 @@ #define SMU13_DRIVER_IF_V13_0_0_H
//Increment this version if SkuTable_t or BoardTable_t change -#define PPTABLE_VERSION 0x24 +#define PPTABLE_VERSION 0x26
#define NUM_GFXCLK_DPM_LEVELS 16 #define NUM_SOCCLK_DPM_LEVELS 8 @@ -109,6 +109,22 @@ #define FEATURE_SPARE_63_BIT 63 #define NUM_FEATURES 64
+#define ALLOWED_FEATURE_CTRL_DEFAULT 0xFFFFFFFFFFFFFFFFULL +#define ALLOWED_FEATURE_CTRL_SCPM ((1 << FEATURE_DPM_GFXCLK_BIT) | \ + (1 << FEATURE_DPM_GFX_POWER_OPTIMIZER_BIT) | \ + (1 << FEATURE_DPM_UCLK_BIT) | \ + (1 << FEATURE_DPM_FCLK_BIT) | \ + (1 << FEATURE_DPM_SOCCLK_BIT) | \ + (1 << FEATURE_DPM_MP0CLK_BIT) | \ + (1 << FEATURE_DPM_LINK_BIT) | \ + (1 << FEATURE_DPM_DCN_BIT) | \ + (1 << FEATURE_DS_GFXCLK_BIT) | \ + (1 << FEATURE_DS_SOCCLK_BIT) | \ + (1 << FEATURE_DS_FCLK_BIT) | \ + (1 << FEATURE_DS_LCLK_BIT) | \ + (1 << FEATURE_DS_DCFCLK_BIT) | \ + (1 << FEATURE_DS_UCLK_BIT)) + //For use with feature control messages typedef enum { FEATURE_PWR_ALL, @@ -133,6 +149,7 @@ typedef enum { #define DEBUG_OVERRIDE_DISABLE_DFLL 0x00000200 #define DEBUG_OVERRIDE_ENABLE_RLC_VF_BRINGUP_MODE 0x00000400 #define DEBUG_OVERRIDE_DFLL_MASTER_MODE 0x00000800 +#define DEBUG_OVERRIDE_ENABLE_PROFILING_MODE 0x00001000
// VR Mapping Bit Defines #define VR_MAPPING_VR_SELECT_MASK 0x01 @@ -262,15 +279,15 @@ typedef enum { } I2cControllerPort_e;
typedef enum { - I2C_CONTROLLER_NAME_VR_GFX = 0, - I2C_CONTROLLER_NAME_VR_SOC, - I2C_CONTROLLER_NAME_VR_VMEMP, - I2C_CONTROLLER_NAME_VR_VDDIO, - I2C_CONTROLLER_NAME_LIQUID0, - I2C_CONTROLLER_NAME_LIQUID1, - I2C_CONTROLLER_NAME_PLX, - I2C_CONTROLLER_NAME_OTHER, - I2C_CONTROLLER_NAME_COUNT, + I2C_CONTROLLER_NAME_VR_GFX = 0, + I2C_CONTROLLER_NAME_VR_SOC, + I2C_CONTROLLER_NAME_VR_VMEMP, + I2C_CONTROLLER_NAME_VR_VDDIO, + I2C_CONTROLLER_NAME_LIQUID0, + I2C_CONTROLLER_NAME_LIQUID1, + I2C_CONTROLLER_NAME_PLX, + I2C_CONTROLLER_NAME_FAN_INTAKE, + I2C_CONTROLLER_NAME_COUNT, } I2cControllerName_e;
typedef enum { @@ -282,16 +299,17 @@ typedef enum { I2C_CONTROLLER_THROTTLER_LIQUID0, I2C_CONTROLLER_THROTTLER_LIQUID1, I2C_CONTROLLER_THROTTLER_PLX, + I2C_CONTROLLER_THROTTLER_FAN_INTAKE, I2C_CONTROLLER_THROTTLER_INA3221, I2C_CONTROLLER_THROTTLER_COUNT, } I2cControllerThrottler_e;
typedef enum { - I2C_CONTROLLER_PROTOCOL_VR_XPDE132G5, - I2C_CONTROLLER_PROTOCOL_VR_IR35217, - I2C_CONTROLLER_PROTOCOL_TMP_TMP102A, - I2C_CONTROLLER_PROTOCOL_INA3221, - I2C_CONTROLLER_PROTOCOL_COUNT, + I2C_CONTROLLER_PROTOCOL_VR_XPDE132G5, + I2C_CONTROLLER_PROTOCOL_VR_IR35217, + I2C_CONTROLLER_PROTOCOL_TMP_MAX31875, + I2C_CONTROLLER_PROTOCOL_INA3221, + I2C_CONTROLLER_PROTOCOL_COUNT, } I2cControllerProtocol_e;
typedef struct { @@ -658,13 +676,20 @@ typedef struct {
#define PP_NUM_OD_VF_CURVE_POINTS PP_NUM_RTAVFS_PWL_ZONES + 1
+typedef enum { + FAN_MODE_AUTO = 0, + FAN_MODE_MANUAL_LINEAR, +} FanMode_e;
typedef struct { uint32_t FeatureCtrlMask;
//Voltage control int16_t VoltageOffsetPerZoneBoundary[PP_NUM_OD_VF_CURVE_POINTS]; - uint16_t reserved[2]; + uint16_t VddGfxVmax; // in mV + + uint8_t IdlePwrSavingFeaturesCtrl; + uint8_t RuntimePwrSavingFeaturesCtrl;
//Frequency changes int16_t GfxclkFmin; // MHz @@ -674,7 +699,7 @@ typedef struct {
//PPT int16_t Ppt; // % - int16_t reserved1; + int16_t Tdc;
//Fan control uint8_t FanLinearPwmPoints[NUM_OD_FAN_MAX_POINTS]; @@ -701,16 +726,19 @@ typedef struct { uint32_t FeatureCtrlMask;
int16_t VoltageOffsetPerZoneBoundary; - uint16_t reserved[2]; + uint16_t VddGfxVmax; // in mV + + uint8_t IdlePwrSavingFeaturesCtrl; + uint8_t RuntimePwrSavingFeaturesCtrl;
- uint16_t GfxclkFmin; // MHz - uint16_t GfxclkFmax; // MHz + int16_t GfxclkFmin; // MHz + int16_t GfxclkFmax; // MHz uint16_t UclkFmin; // MHz uint16_t UclkFmax; // MHz
//PPT int16_t Ppt; // % - int16_t reserved1; + int16_t Tdc;
uint8_t FanLinearPwmPoints; uint8_t FanLinearTempPoints; @@ -857,7 +885,8 @@ typedef struct { uint16_t FanStartTempMin; uint16_t FanStartTempMax;
- uint32_t Spare[12]; + uint16_t PowerMinPpt0[POWER_SOURCE_COUNT]; + uint32_t Spare[11];
} MsgLimits_t;
@@ -1041,7 +1070,17 @@ typedef struct { uint32_t GfxoffSpare[15];
// GFX GPO - uint32_t GfxGpoSpare[16]; + uint32_t DfllBtcMasterScalerM; + int32_t DfllBtcMasterScalerB; + uint32_t DfllBtcSlaveScalerM; + int32_t DfllBtcSlaveScalerB; + + uint32_t DfllPccAsWaitCtrl; //GDFLL_AS_WAIT_CTRL_PCC register value to be passed to RLC msg + uint32_t DfllPccAsStepCtrl; //GDFLL_AS_STEP_CTRL_PCC register value to be passed to RLC msg + + uint32_t DfllL2FrequencyBoostM; //Unitless (float) + uint32_t DfllL2FrequencyBoostB; //In MHz (integer) + uint32_t GfxGpoSpare[8];
// GFX DCS
@@ -1114,12 +1153,14 @@ typedef struct { uint16_t IntakeTempHighIntakeAcousticLimit; uint16_t IntakeTempAcouticLimitReleaseRate;
- uint16_t FanStalledTempLimitOffset; + int16_t FanAbnormalTempLimitOffset; uint16_t FanStalledTriggerRpm; - uint16_t FanAbnormalTriggerRpm; - uint16_t FanPadding; + uint16_t FanAbnormalTriggerRpmCoeff; + uint16_t FanAbnormalDetectionEnable;
- uint32_t FanSpare[14]; + uint8_t FanIntakeSensorSupport; + uint8_t FanIntakePadding[3]; + uint32_t FanSpare[13];
// SECTION: VDD_GFX AVFS
@@ -1198,8 +1239,13 @@ typedef struct { int16_t TotalBoardPowerM; int16_t TotalBoardPowerB;
+ //PMFW-11158 + QuadraticInt_t qFeffCoeffGameClock[POWER_SOURCE_COUNT]; + QuadraticInt_t qFeffCoeffBaseClock[POWER_SOURCE_COUNT]; + QuadraticInt_t qFeffCoeffBoostClock[POWER_SOURCE_COUNT]; + // SECTION: Sku Reserved - uint32_t Spare[61]; + uint32_t Spare[43];
// Padding for MMHUB - do not modify this uint32_t MmHubPadding[8]; @@ -1288,8 +1334,11 @@ typedef struct { uint32_t PostVoltageSetBacoDelay; // in microseconds. Amount of time FW will wait after power good is established or PSI0 command is issued uint32_t BacoEntryDelay; // in milliseconds. Amount of time FW will wait to trigger BACO entry after receiving entry notification from OS
+ uint8_t FuseWritePowerMuxPresent; + uint8_t FuseWritePadding[3]; + // SECTION: Board Reserved - uint32_t BoardSpare[64]; + uint32_t BoardSpare[63];
// SECTION: Structure Padding
@@ -1381,7 +1430,7 @@ typedef struct { uint16_t AverageTotalBoardPower;
uint16_t AvgTemperature[TEMP_COUNT]; - uint16_t TempPadding; + uint16_t AvgTemperatureFanIntake;
uint8_t PcieRate ; uint8_t PcieWidth ; @@ -1550,5 +1599,7 @@ typedef struct { #define IH_INTERRUPT_CONTEXT_ID_AUDIO_D0 0x5 #define IH_INTERRUPT_CONTEXT_ID_AUDIO_D3 0x6 #define IH_INTERRUPT_CONTEXT_ID_THERMAL_THROTTLING 0x7 +#define IH_INTERRUPT_CONTEXT_ID_FAN_ABNORMAL 0x8 +#define IH_INTERRUPT_CONTEXT_ID_FAN_RECOVERY 0x9
#endif diff --git a/drivers/gpu/drm/amd/pm/swsmu/inc/smu_v13_0.h b/drivers/gpu/drm/amd/pm/swsmu/inc/smu_v13_0.h index dd5867561068..b7f4569aff2a 100644 --- a/drivers/gpu/drm/amd/pm/swsmu/inc/smu_v13_0.h +++ b/drivers/gpu/drm/amd/pm/swsmu/inc/smu_v13_0.h @@ -30,7 +30,7 @@ #define SMU13_DRIVER_IF_VERSION_ALDE 0x08 #define SMU13_DRIVER_IF_VERSION_SMU_V13_0_4 0x07 #define SMU13_DRIVER_IF_VERSION_SMU_V13_0_5 0x04 -#define SMU13_DRIVER_IF_VERSION_SMU_V13_0_0 0x30 +#define SMU13_DRIVER_IF_VERSION_SMU_V13_0_0_10 0x32 #define SMU13_DRIVER_IF_VERSION_SMU_V13_0_7 0x2C #define SMU13_DRIVER_IF_VERSION_SMU_V13_0_10 0x1D
diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c index e7380aa4f6be..1983e0d29e9d 100644 --- a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c +++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c @@ -288,7 +288,8 @@ int smu_v13_0_check_fw_version(struct smu_context *smu) smu->smc_driver_if_version = SMU13_DRIVER_IF_VERSION_ALDE; break; case IP_VERSION(13, 0, 0): - smu->smc_driver_if_version = SMU13_DRIVER_IF_VERSION_SMU_V13_0_0; + case IP_VERSION(13, 0, 10): + smu->smc_driver_if_version = SMU13_DRIVER_IF_VERSION_SMU_V13_0_0_10; break; case IP_VERSION(13, 0, 7): smu->smc_driver_if_version = SMU13_DRIVER_IF_VERSION_SMU_V13_0_7; @@ -304,9 +305,6 @@ int smu_v13_0_check_fw_version(struct smu_context *smu) case IP_VERSION(13, 0, 5): smu->smc_driver_if_version = SMU13_DRIVER_IF_VERSION_SMU_V13_0_5; break; - case IP_VERSION(13, 0, 10): - smu->smc_driver_if_version = SMU13_DRIVER_IF_VERSION_SMU_V13_0_10; - break; default: dev_err(adev->dev, "smu unsupported IP version: 0x%x.\n", adev->ip_versions[MP1_HWIP][0]);
From: lyndonli Lyndon.Li@amd.com
[ Upstream commit f2e1aa267f12b82e03927d1e918d2844ddd3eea5 ]
update driver if header for smu_13_0_7
Signed-off-by: lyndonli Lyndon.Li@amd.com Reviewed-by: Hawking Zhang Hawking.Zhang@amd.com Reviewed-by: Kenneth Feng kenneth.feng@amd.com Reviewed-by: Evan Quan evan.quan@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Cc: stable@vger.kernel.org # 6.0.x Signed-off-by: Sasha Levin sashal@kernel.org --- .../inc/pmfw_if/smu13_driver_if_v13_0_7.h | 117 ++++++++++++------ drivers/gpu/drm/amd/pm/swsmu/inc/smu_v13_0.h | 2 +- 2 files changed, 80 insertions(+), 39 deletions(-)
diff --git a/drivers/gpu/drm/amd/pm/swsmu/inc/pmfw_if/smu13_driver_if_v13_0_7.h b/drivers/gpu/drm/amd/pm/swsmu/inc/pmfw_if/smu13_driver_if_v13_0_7.h index 25c08f963f49..d6b13933a98f 100644 --- a/drivers/gpu/drm/amd/pm/swsmu/inc/pmfw_if/smu13_driver_if_v13_0_7.h +++ b/drivers/gpu/drm/amd/pm/swsmu/inc/pmfw_if/smu13_driver_if_v13_0_7.h @@ -25,10 +25,10 @@
// *** IMPORTANT *** // PMFW TEAM: Always increment the interface version on any change to this file -#define SMU13_DRIVER_IF_VERSION 0x2C +#define SMU13_DRIVER_IF_VERSION 0x35
//Increment this version if SkuTable_t or BoardTable_t change -#define PPTABLE_VERSION 0x20 +#define PPTABLE_VERSION 0x27
#define NUM_GFXCLK_DPM_LEVELS 16 #define NUM_SOCCLK_DPM_LEVELS 8 @@ -96,7 +96,7 @@ #define FEATURE_MEM_TEMP_READ_BIT 47 #define FEATURE_ATHUB_MMHUB_PG_BIT 48 #define FEATURE_SOC_PCC_BIT 49 -#define FEATURE_SPARE_50_BIT 50 +#define FEATURE_EDC_PWRBRK_BIT 50 #define FEATURE_SPARE_51_BIT 51 #define FEATURE_SPARE_52_BIT 52 #define FEATURE_SPARE_53_BIT 53 @@ -282,15 +282,15 @@ typedef enum { } I2cControllerPort_e;
typedef enum { - I2C_CONTROLLER_NAME_VR_GFX = 0, - I2C_CONTROLLER_NAME_VR_SOC, - I2C_CONTROLLER_NAME_VR_VMEMP, - I2C_CONTROLLER_NAME_VR_VDDIO, - I2C_CONTROLLER_NAME_LIQUID0, - I2C_CONTROLLER_NAME_LIQUID1, - I2C_CONTROLLER_NAME_PLX, - I2C_CONTROLLER_NAME_OTHER, - I2C_CONTROLLER_NAME_COUNT, + I2C_CONTROLLER_NAME_VR_GFX = 0, + I2C_CONTROLLER_NAME_VR_SOC, + I2C_CONTROLLER_NAME_VR_VMEMP, + I2C_CONTROLLER_NAME_VR_VDDIO, + I2C_CONTROLLER_NAME_LIQUID0, + I2C_CONTROLLER_NAME_LIQUID1, + I2C_CONTROLLER_NAME_PLX, + I2C_CONTROLLER_NAME_FAN_INTAKE, + I2C_CONTROLLER_NAME_COUNT, } I2cControllerName_e;
typedef enum { @@ -302,6 +302,7 @@ typedef enum { I2C_CONTROLLER_THROTTLER_LIQUID0, I2C_CONTROLLER_THROTTLER_LIQUID1, I2C_CONTROLLER_THROTTLER_PLX, + I2C_CONTROLLER_THROTTLER_FAN_INTAKE, I2C_CONTROLLER_THROTTLER_INA3221, I2C_CONTROLLER_THROTTLER_COUNT, } I2cControllerThrottler_e; @@ -309,8 +310,9 @@ typedef enum { typedef enum { I2C_CONTROLLER_PROTOCOL_VR_XPDE132G5, I2C_CONTROLLER_PROTOCOL_VR_IR35217, - I2C_CONTROLLER_PROTOCOL_TMP_TMP102A, + I2C_CONTROLLER_PROTOCOL_TMP_MAX31875, I2C_CONTROLLER_PROTOCOL_INA3221, + I2C_CONTROLLER_PROTOCOL_TMP_MAX6604, I2C_CONTROLLER_PROTOCOL_COUNT, } I2cControllerProtocol_e;
@@ -690,6 +692,9 @@ typedef struct { #define PP_OD_FEATURE_UCLK_BIT 8 #define PP_OD_FEATURE_ZERO_FAN_BIT 9 #define PP_OD_FEATURE_TEMPERATURE_BIT 10 +#define PP_OD_FEATURE_POWER_FEATURE_CTRL_BIT 11 +#define PP_OD_FEATURE_ASIC_TDC_BIT 12 +#define PP_OD_FEATURE_COUNT 13
typedef enum { PP_OD_POWER_FEATURE_ALWAYS_ENABLED, @@ -697,6 +702,11 @@ typedef enum { PP_OD_POWER_FEATURE_ALWAYS_DISABLED, } PP_OD_POWER_FEATURE_e;
+typedef enum { + FAN_MODE_AUTO = 0, + FAN_MODE_MANUAL_LINEAR, +} FanMode_e; + typedef struct { uint32_t FeatureCtrlMask;
@@ -708,8 +718,8 @@ typedef struct { uint8_t RuntimePwrSavingFeaturesCtrl;
//Frequency changes - int16_t GfxclkFmin; // MHz - int16_t GfxclkFmax; // MHz + int16_t GfxclkFmin; // MHz + int16_t GfxclkFmax; // MHz uint16_t UclkFmin; // MHz uint16_t UclkFmax; // MHz
@@ -730,7 +740,12 @@ typedef struct { uint8_t MaxOpTemp; uint8_t Padding[4];
- uint32_t Spare[12]; + uint16_t GfxVoltageFullCtrlMode; + uint16_t GfxclkFullCtrlMode; + uint16_t UclkFullCtrlMode; + int16_t AsicTdc; + + uint32_t Spare[10]; uint32_t MmHubPadding[8]; // SMU internal use. Adding here instead of external as a workaround } OverDriveTable_t;
@@ -748,8 +763,8 @@ typedef struct { uint8_t IdlePwrSavingFeaturesCtrl; uint8_t RuntimePwrSavingFeaturesCtrl;
- uint16_t GfxclkFmin; // MHz - uint16_t GfxclkFmax; // MHz + int16_t GfxclkFmin; // MHz + int16_t GfxclkFmax; // MHz uint16_t UclkFmin; // MHz uint16_t UclkFmax; // MHz
@@ -769,7 +784,12 @@ typedef struct { uint8_t MaxOpTemp; uint8_t Padding[4];
- uint32_t Spare[12]; + uint16_t GfxVoltageFullCtrlMode; + uint16_t GfxclkFullCtrlMode; + uint16_t UclkFullCtrlMode; + int16_t AsicTdc; + + uint32_t Spare[10];
} OverDriveLimits_t;
@@ -903,7 +923,8 @@ typedef struct { uint16_t FanStartTempMin; uint16_t FanStartTempMax;
- uint32_t Spare[12]; + uint16_t PowerMinPpt0[POWER_SOURCE_COUNT]; + uint32_t Spare[11];
} MsgLimits_t;
@@ -1086,11 +1107,13 @@ typedef struct { uint32_t GfxoffSpare[15];
// GFX GPO - float DfllBtcMasterScalerM; + uint32_t DfllBtcMasterScalerM; int32_t DfllBtcMasterScalerB; - float DfllBtcSlaveScalerM; + uint32_t DfllBtcSlaveScalerM; int32_t DfllBtcSlaveScalerB; - uint32_t GfxGpoSpare[12]; + uint32_t DfllPccAsWaitCtrl; //GDFLL_AS_WAIT_CTRL_PCC register value to be passed to RLC msg + uint32_t DfllPccAsStepCtrl; //GDFLL_AS_STEP_CTRL_PCC register value to be passed to RLC msg + uint32_t GfxGpoSpare[10];
// GFX DCS
@@ -1106,7 +1129,10 @@ typedef struct { uint16_t DcsTimeout; //This is the amount of time SMU FW waits for RLC to put GFX into GFXOFF before reverting to the fallback mechanism of throttling GFXCLK to Fmin.
- uint32_t DcsSpare[16]; + uint32_t DcsSpare[14]; + + // UCLK section + uint16_t ShadowFreqTableUclk[NUM_UCLK_DPM_LEVELS]; // In MHz
// UCLK section uint8_t UseStrobeModeOptimizations; //Set to indicate that FW should use strobe mode optimizations @@ -1163,13 +1189,14 @@ typedef struct { uint16_t IntakeTempHighIntakeAcousticLimit; uint16_t IntakeTempAcouticLimitReleaseRate;
- uint16_t FanStalledTempLimitOffset; + int16_t FanAbnormalTempLimitOffset; uint16_t FanStalledTriggerRpm; - uint16_t FanAbnormalTriggerRpm; - uint16_t FanPadding; - - uint32_t FanSpare[14]; + uint16_t FanAbnormalTriggerRpmCoeff; + uint16_t FanAbnormalDetectionEnable;
+ uint8_t FanIntakeSensorSupport; + uint8_t FanIntakePadding[3]; + uint32_t FanSpare[13]; // SECTION: VDD_GFX AVFS
uint8_t OverrideGfxAvfsFuses; @@ -1193,7 +1220,6 @@ typedef struct { uint32_t dGbV_dT_vmin; uint32_t dGbV_dT_vmax;
- //Unused: PMFW-9370 uint32_t V2F_vmin_range_low; uint32_t V2F_vmin_range_high; uint32_t V2F_vmax_range_low; @@ -1238,8 +1264,21 @@ typedef struct { // SECTION: Advanced Options uint32_t DebugOverrides;
+ // Section: Total Board Power idle vs active coefficients + uint8_t TotalBoardPowerSupport; + uint8_t TotalBoardPowerPadding[3]; + + int16_t TotalIdleBoardPowerM; + int16_t TotalIdleBoardPowerB; + int16_t TotalBoardPowerM; + int16_t TotalBoardPowerB; + + QuadraticInt_t qFeffCoeffGameClock[POWER_SOURCE_COUNT]; + QuadraticInt_t qFeffCoeffBaseClock[POWER_SOURCE_COUNT]; + QuadraticInt_t qFeffCoeffBoostClock[POWER_SOURCE_COUNT]; + // SECTION: Sku Reserved - uint32_t Spare[64]; + uint32_t Spare[43];
// Padding for MMHUB - do not modify this uint32_t MmHubPadding[8]; @@ -1304,7 +1343,8 @@ typedef struct { // SECTION: Clock Spread Spectrum
// UCLK Spread Spectrum - uint16_t UclkSpreadPadding; + uint8_t UclkTrainingModeSpreadPercent; // Q4.4 + uint8_t UclkSpreadPadding; uint16_t UclkSpreadFreq; // kHz
// UCLK Spread Spectrum @@ -1317,11 +1357,7 @@ typedef struct {
// Section: Memory Config uint8_t DramWidth; // Width of interface to the channel for each DRAM module. See DRAM_BIT_WIDTH_TYPE_e - uint8_t PaddingMem1[3]; - - // Section: Total Board Power - uint16_t TotalBoardPower; //Only needed for TCP Estimated case, where TCP = TGP+Total Board Power - uint16_t BoardPowerPadding; + uint8_t PaddingMem1[7];
// SECTION: UMC feature flags uint8_t HsrEnabled; @@ -1423,8 +1459,11 @@ typedef struct { uint16_t Vcn1ActivityPercentage ;
uint32_t EnergyAccumulator; - uint16_t AverageSocketPower ; + uint16_t AverageSocketPower; + uint16_t AverageTotalBoardPower; + uint16_t AvgTemperature[TEMP_COUNT]; + uint16_t AvgTemperatureFanIntake;
uint8_t PcieRate ; uint8_t PcieWidth ; @@ -1592,5 +1631,7 @@ typedef struct { #define IH_INTERRUPT_CONTEXT_ID_AUDIO_D0 0x5 #define IH_INTERRUPT_CONTEXT_ID_AUDIO_D3 0x6 #define IH_INTERRUPT_CONTEXT_ID_THERMAL_THROTTLING 0x7 +#define IH_INTERRUPT_CONTEXT_ID_FAN_ABNORMAL 0x8 +#define IH_INTERRUPT_CONTEXT_ID_FAN_RECOVERY 0x9
#endif diff --git a/drivers/gpu/drm/amd/pm/swsmu/inc/smu_v13_0.h b/drivers/gpu/drm/amd/pm/swsmu/inc/smu_v13_0.h index b7f4569aff2a..865d6358918d 100644 --- a/drivers/gpu/drm/amd/pm/swsmu/inc/smu_v13_0.h +++ b/drivers/gpu/drm/amd/pm/swsmu/inc/smu_v13_0.h @@ -31,7 +31,7 @@ #define SMU13_DRIVER_IF_VERSION_SMU_V13_0_4 0x07 #define SMU13_DRIVER_IF_VERSION_SMU_V13_0_5 0x04 #define SMU13_DRIVER_IF_VERSION_SMU_V13_0_0_10 0x32 -#define SMU13_DRIVER_IF_VERSION_SMU_V13_0_7 0x2C +#define SMU13_DRIVER_IF_VERSION_SMU_V13_0_7 0x35 #define SMU13_DRIVER_IF_VERSION_SMU_V13_0_10 0x1D
#define SMU13_MODE1_RESET_WAIT_TIME_IN_MS 500 //500ms
From: David Virag virag.david003@gmail.com
[ Upstream commit ef80c95c29dc67c3034f32d93c41e2ede398e387 ]
"div4" DIVs which divide PLLs by 4 are actually dividing "div2" DIVs by 2 to achieve a by 4 division, thus their parents are the respective "div2" DIVs. These DIVs were mistakenly set to have the PLLs as parents. This leads to the kernel thinking "div4"s and everything under them run at 2x the clock speed. Fix this.
Fixes: 45bd8166a1d8 ("clk: samsung: Add initial Exynos7885 clock driver") Signed-off-by: David Virag virag.david003@gmail.com Acked-by: Chanwoo Choi cw00.choi@samsung.com Link: https://lore.kernel.org/r/20221013151341.151208-1-virag.david003@gmail.com Signed-off-by: Krzysztof Kozlowski krzysztof.kozlowski@linaro.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/clk/samsung/clk-exynos7885.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/clk/samsung/clk-exynos7885.c b/drivers/clk/samsung/clk-exynos7885.c index a7b106302706..368c50badd15 100644 --- a/drivers/clk/samsung/clk-exynos7885.c +++ b/drivers/clk/samsung/clk-exynos7885.c @@ -182,7 +182,7 @@ static const struct samsung_div_clock top_div_clks[] __initconst = { CLK_CON_DIV_PLL_SHARED0_DIV2, 0, 1), DIV(CLK_DOUT_SHARED0_DIV3, "dout_shared0_div3", "fout_shared0_pll", CLK_CON_DIV_PLL_SHARED0_DIV3, 0, 2), - DIV(CLK_DOUT_SHARED0_DIV4, "dout_shared0_div4", "fout_shared0_pll", + DIV(CLK_DOUT_SHARED0_DIV4, "dout_shared0_div4", "dout_shared0_div2", CLK_CON_DIV_PLL_SHARED0_DIV4, 0, 1), DIV(CLK_DOUT_SHARED0_DIV5, "dout_shared0_div5", "fout_shared0_pll", CLK_CON_DIV_PLL_SHARED0_DIV5, 0, 3), @@ -190,7 +190,7 @@ static const struct samsung_div_clock top_div_clks[] __initconst = { CLK_CON_DIV_PLL_SHARED1_DIV2, 0, 1), DIV(CLK_DOUT_SHARED1_DIV3, "dout_shared1_div3", "fout_shared1_pll", CLK_CON_DIV_PLL_SHARED1_DIV3, 0, 2), - DIV(CLK_DOUT_SHARED1_DIV4, "dout_shared1_div4", "fout_shared1_pll", + DIV(CLK_DOUT_SHARED1_DIV4, "dout_shared1_div4", "dout_shared1_div2", CLK_CON_DIV_PLL_SHARED1_DIV4, 0, 1),
/* CORE */
From: Johan Hovold johan+linaro@kernel.org
[ Upstream commit eab4c1ebdd657957bf7ae66ffb8849b462db78b3 ]
Since commit 7eb231c337e0 ("PM / Domains: Convert pm_genpd_init() to return an error code") pm_genpd_init() can return an error which the caller must handle.
The current error handling was also incomplete as the runtime PM and regulator use counts were not balanced in all error paths.
Add the missing error handling to the GDSC initialisation to avoid continuing as if nothing happened on errors.
Signed-off-by: Johan Hovold johan+linaro@kernel.org Signed-off-by: Bjorn Andersson andersson@kernel.org Link: https://lore.kernel.org/r/20220929155816.17425-1-johan+linaro@kernel.org Stable-dep-of: 4cc47e8add63 ("clk: qcom: gdsc: Remove direct runtime PM calls") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/clk/qcom/gdsc.c | 25 +++++++++++++++++-------- 1 file changed, 17 insertions(+), 8 deletions(-)
diff --git a/drivers/clk/qcom/gdsc.c b/drivers/clk/qcom/gdsc.c index d3244006c661..4b66ce0f1940 100644 --- a/drivers/clk/qcom/gdsc.c +++ b/drivers/clk/qcom/gdsc.c @@ -439,11 +439,8 @@ static int gdsc_init(struct gdsc *sc)
/* ...and the power-domain */ ret = gdsc_pm_runtime_get(sc); - if (ret) { - if (sc->rsupply) - regulator_disable(sc->rsupply); - return ret; - } + if (ret) + goto err_disable_supply;
/* * Votable GDSCs can be ON due to Vote from other masters. @@ -452,14 +449,14 @@ static int gdsc_init(struct gdsc *sc) if (sc->flags & VOTABLE) { ret = gdsc_update_collapse_bit(sc, false); if (ret) - return ret; + goto err_put_rpm; }
/* Turn on HW trigger mode if supported */ if (sc->flags & HW_CTRL) { ret = gdsc_hwctrl(sc, true); if (ret < 0) - return ret; + goto err_put_rpm; }
/* @@ -486,9 +483,21 @@ static int gdsc_init(struct gdsc *sc) sc->pd.power_off = gdsc_disable; if (!sc->pd.power_on) sc->pd.power_on = gdsc_enable; - pm_genpd_init(&sc->pd, NULL, !on); + + ret = pm_genpd_init(&sc->pd, NULL, !on); + if (ret) + goto err_put_rpm;
return 0; + +err_put_rpm: + if (on) + gdsc_pm_runtime_put(sc); +err_disable_supply: + if (on && sc->rsupply) + regulator_disable(sc->rsupply); + + return ret; }
int gdsc_register(struct gdsc_desc *desc,
From: Stephen Boyd swboyd@chromium.org
[ Upstream commit 4cc47e8add635408e063c98b52d56b7ceacf0b70 ]
We shouldn't be calling runtime PM APIs from within the genpd enable/disable path for a couple reasons.
First, this causes an AA lockdep splat[1] because genpd can call into genpd code again while holding the genpd lock.
WARNING: possible recursive locking detected 5.19.0-rc2-lockdep+ #7 Not tainted -------------------------------------------- kworker/2:1/49 is trying to acquire lock: ffffffeea0370788 (&genpd->mlock){+.+.}-{3:3}, at: genpd_lock_mtx+0x24/0x30
but task is already holding lock: ffffffeea03710a8 (&genpd->mlock){+.+.}-{3:3}, at: genpd_lock_mtx+0x24/0x30
other info that might help us debug this: Possible unsafe locking scenario:
CPU0 ---- lock(&genpd->mlock); lock(&genpd->mlock);
*** DEADLOCK ***
May be due to missing lock nesting notation
3 locks held by kworker/2:1/49: #0: 74ffff80811a5748 ((wq_completion)pm){+.+.}-{0:0}, at: process_one_work+0x320/0x5fc #1: ffffffc008537cf8 ((work_completion)(&genpd->power_off_work)){+.+.}-{0:0}, at: process_one_work+0x354/0x5fc #2: ffffffeea03710a8 (&genpd->mlock){+.+.}-{3:3}, at: genpd_lock_mtx+0x24/0x30
stack backtrace: CPU: 2 PID: 49 Comm: kworker/2:1 Not tainted 5.19.0-rc2-lockdep+ #7 Hardware name: Google Lazor (rev3 - 8) with KB Backlight (DT) Workqueue: pm genpd_power_off_work_fn Call trace: dump_backtrace+0x1a0/0x200 show_stack+0x24/0x30 dump_stack_lvl+0x7c/0xa0 dump_stack+0x18/0x44 __lock_acquire+0xb38/0x3634 lock_acquire+0x180/0x2d4 __mutex_lock_common+0x118/0xe30 mutex_lock_nested+0x70/0x7c genpd_lock_mtx+0x24/0x30 genpd_runtime_suspend+0x2f0/0x414 __rpm_callback+0xdc/0x1b8 rpm_callback+0x4c/0xcc rpm_suspend+0x21c/0x5f0 rpm_idle+0x17c/0x1e0 __pm_runtime_idle+0x78/0xcc gdsc_disable+0x24c/0x26c _genpd_power_off+0xd4/0x1c4 genpd_power_off+0x2d8/0x41c genpd_power_off_work_fn+0x60/0x94 process_one_work+0x398/0x5fc worker_thread+0x42c/0x6c4 kthread+0x194/0x1b4 ret_from_fork+0x10/0x20
Second, this confuses runtime PM on CoachZ for the camera devices by causing the camera clock controller's runtime PM usage_count to go negative after resuming from suspend. This is because runtime PM is being used on the clock controller while runtime PM is disabled for the device.
The reason for the negative count is because a GDSC is represented as a genpd and each genpd that is attached to a device is resumed during the noirq phase of system wide suspend/resume (see the noirq suspend ops assignment in pm_genpd_init() for more details). The camera GDSCs are attached to camera devices with the 'power-domains' property in DT. Every device has runtime PM disabled in the late system suspend phase via __device_suspend_late(). Runtime PM is not usable until runtime PM is enabled in device_resume_early(). The noirq phases run after the 'late' and before the 'early' phase of suspend/resume. When the genpds are resumed in genpd_resume_noirq(), we call down into gdsc_enable() that calls pm_runtime_resume_and_get() and that returns -EACCES to indicate failure to resume because runtime PM is disabled for all devices.
Upon closer inspection, calling runtime PM APIs like this in the GDSC driver doesn't make sense. It was intended to make sure the GDSC for the clock controller providing other GDSCs was enabled, specifically the MMCX GDSC for the display clk controller on SM8250 (sm8250-dispcc), so that GDSC register accesses succeeded. That will already happen because we make the 'dev->pm_domain' a parent domain of each GDSC we register in gdsc_register() via pm_genpd_add_subdomain(). When any of these GDSCs are accessed, we'll enable the parent domain (in this specific case MMCX).
We also remove any getting of runtime PM during registration, because when a genpd is registered it increments the count on the parent if the genpd itself is already enabled.
Cc: Dmitry Baryshkov dmitry.baryshkov@linaro.org Cc: Johan Hovold johan+linaro@kernel.org Cc: Ulf Hansson ulf.hansson@linaro.org Cc: Taniya Das quic_tdas@quicinc.com Cc: Satya Priya quic_c_skakit@quicinc.com Reviewed-by: Douglas Anderson dianders@chromium.org Tested-by: Douglas Anderson dianders@chromium.org Cc: Matthias Kaehlcke mka@chromium.org Reported-by: Stephen Boyd swboyd@chromium.org Link: https://lore.kernel.org/r/CAE-0n52xbZeJ66RaKwggeRB57fUAwjvxGxfFMKOKJMKVyFTe+... [1] Fixes: 1b771839de05 ("clk: qcom: gdsc: enable optional power domain support") Signed-off-by: Stephen Boyd swboyd@chromium.org Link: https://lore.kernel.org/r/20221103183030.3594899-1-swboyd@chromium.org Tested-by: Johan Hovold johan+linaro@kernel.org Reviewed-by: Johan Hovold johan+linaro@kernel.org Signed-off-by: Stephen Boyd sboyd@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/clk/qcom/gdsc.c | 61 ++++------------------------------------- drivers/clk/qcom/gdsc.h | 2 -- 2 files changed, 6 insertions(+), 57 deletions(-)
diff --git a/drivers/clk/qcom/gdsc.c b/drivers/clk/qcom/gdsc.c index 4b66ce0f1940..39b35058ad47 100644 --- a/drivers/clk/qcom/gdsc.c +++ b/drivers/clk/qcom/gdsc.c @@ -11,7 +11,6 @@ #include <linux/kernel.h> #include <linux/ktime.h> #include <linux/pm_domain.h> -#include <linux/pm_runtime.h> #include <linux/regmap.h> #include <linux/regulator/consumer.h> #include <linux/reset-controller.h> @@ -56,22 +55,6 @@ enum gdsc_status { GDSC_ON };
-static int gdsc_pm_runtime_get(struct gdsc *sc) -{ - if (!sc->dev) - return 0; - - return pm_runtime_resume_and_get(sc->dev); -} - -static int gdsc_pm_runtime_put(struct gdsc *sc) -{ - if (!sc->dev) - return 0; - - return pm_runtime_put_sync(sc->dev); -} - /* Returns 1 if GDSC status is status, 0 if not, and < 0 on error */ static int gdsc_check_status(struct gdsc *sc, enum gdsc_status status) { @@ -271,8 +254,9 @@ static void gdsc_retain_ff_on(struct gdsc *sc) regmap_update_bits(sc->regmap, sc->gdscr, mask, mask); }
-static int _gdsc_enable(struct gdsc *sc) +static int gdsc_enable(struct generic_pm_domain *domain) { + struct gdsc *sc = domain_to_gdsc(domain); int ret;
if (sc->pwrsts == PWRSTS_ON) @@ -328,22 +312,11 @@ static int _gdsc_enable(struct gdsc *sc) return 0; }
-static int gdsc_enable(struct generic_pm_domain *domain) +static int gdsc_disable(struct generic_pm_domain *domain) { struct gdsc *sc = domain_to_gdsc(domain); int ret;
- ret = gdsc_pm_runtime_get(sc); - if (ret) - return ret; - - return _gdsc_enable(sc); -} - -static int _gdsc_disable(struct gdsc *sc) -{ - int ret; - if (sc->pwrsts == PWRSTS_ON) return gdsc_assert_reset(sc);
@@ -378,18 +351,6 @@ static int _gdsc_disable(struct gdsc *sc) return 0; }
-static int gdsc_disable(struct generic_pm_domain *domain) -{ - struct gdsc *sc = domain_to_gdsc(domain); - int ret; - - ret = _gdsc_disable(sc); - - gdsc_pm_runtime_put(sc); - - return ret; -} - static int gdsc_init(struct gdsc *sc) { u32 mask, val; @@ -437,11 +398,6 @@ static int gdsc_init(struct gdsc *sc) return ret; }
- /* ...and the power-domain */ - ret = gdsc_pm_runtime_get(sc); - if (ret) - goto err_disable_supply; - /* * Votable GDSCs can be ON due to Vote from other masters. * If a Votable GDSC is ON, make sure we have a Vote. @@ -449,14 +405,14 @@ static int gdsc_init(struct gdsc *sc) if (sc->flags & VOTABLE) { ret = gdsc_update_collapse_bit(sc, false); if (ret) - goto err_put_rpm; + goto err_disable_supply; }
/* Turn on HW trigger mode if supported */ if (sc->flags & HW_CTRL) { ret = gdsc_hwctrl(sc, true); if (ret < 0) - goto err_put_rpm; + goto err_disable_supply; }
/* @@ -486,13 +442,10 @@ static int gdsc_init(struct gdsc *sc)
ret = pm_genpd_init(&sc->pd, NULL, !on); if (ret) - goto err_put_rpm; + goto err_disable_supply;
return 0;
-err_put_rpm: - if (on) - gdsc_pm_runtime_put(sc); err_disable_supply: if (on && sc->rsupply) regulator_disable(sc->rsupply); @@ -531,8 +484,6 @@ int gdsc_register(struct gdsc_desc *desc, for (i = 0; i < num; i++) { if (!scs[i]) continue; - if (pm_runtime_enabled(dev)) - scs[i]->dev = dev; scs[i]->regmap = regmap; scs[i]->rcdev = rcdev; ret = gdsc_init(scs[i]); diff --git a/drivers/clk/qcom/gdsc.h b/drivers/clk/qcom/gdsc.h index 5de48c9439b2..8d569232bbd6 100644 --- a/drivers/clk/qcom/gdsc.h +++ b/drivers/clk/qcom/gdsc.h @@ -30,7 +30,6 @@ struct reset_controller_dev; * @resets: ids of resets associated with this gdsc * @reset_count: number of @resets * @rcdev: reset controller - * @dev: the device holding the GDSC, used for pm_runtime calls */ struct gdsc { struct generic_pm_domain pd; @@ -69,7 +68,6 @@ struct gdsc {
const char *supply; struct regulator *rsupply; - struct device *dev; };
struct gdsc_desc {
From: Wei Yongjun weiyongjun1@huawei.com
[ Upstream commit 58143c1ed5882c138a3cd2251a336fc8755f23d9 ]
KASAN report out-of-bounds read as follows:
BUG: KASAN: global-out-of-bounds in afe4403_read_raw+0x42e/0x4c0 Read of size 4 at addr ffffffffc02ac638 by task cat/279
Call Trace: afe4403_read_raw iio_read_channel_info dev_attr_show
The buggy address belongs to the variable: afe4403_channel_leds+0x18/0xffffffffffffe9e0
This issue can be reproduced by singe command:
$ cat /sys/bus/spi/devices/spi0.0/iio:device0/in_intensity6_raw
The array size of afe4403_channel_leds is less than channels, so access with chan->address cause OOB read in afe4403_read_raw. Fix it by moving access before use it.
Fixes: b36e8257641a ("iio: health/afe440x: Use regmap fields") Signed-off-by: Wei Yongjun weiyongjun1@huawei.com Acked-by: Andrew Davis afd@ti.com Link: https://lore.kernel.org/r/20221107151946.89260-1-weiyongjun@huaweicloud.com Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/iio/health/afe4403.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/iio/health/afe4403.c b/drivers/iio/health/afe4403.c index 3bb4028c5d74..df3bc5c3d378 100644 --- a/drivers/iio/health/afe4403.c +++ b/drivers/iio/health/afe4403.c @@ -245,14 +245,14 @@ static int afe4403_read_raw(struct iio_dev *indio_dev, int *val, int *val2, long mask) { struct afe4403_data *afe = iio_priv(indio_dev); - unsigned int reg = afe4403_channel_values[chan->address]; - unsigned int field = afe4403_channel_leds[chan->address]; + unsigned int reg, field; int ret;
switch (chan->type) { case IIO_INTENSITY: switch (mask) { case IIO_CHAN_INFO_RAW: + reg = afe4403_channel_values[chan->address]; ret = afe4403_read(afe, reg, val); if (ret) return ret; @@ -262,6 +262,7 @@ static int afe4403_read_raw(struct iio_dev *indio_dev, case IIO_CURRENT: switch (mask) { case IIO_CHAN_INFO_RAW: + field = afe4403_channel_leds[chan->address]; ret = regmap_field_read(afe->fields[field], val); if (ret) return ret;
From: Wei Yongjun weiyongjun1@huawei.com
[ Upstream commit fc92d9e3de0b2d30a3ccc08048a5fad533e4672b ]
KASAN report out-of-bounds read as follows:
BUG: KASAN: global-out-of-bounds in afe4404_read_raw+0x2ce/0x380 Read of size 4 at addr ffffffffc00e4658 by task cat/278
Call Trace: afe4404_read_raw iio_read_channel_info dev_attr_show
The buggy address belongs to the variable: afe4404_channel_leds+0x18/0xffffffffffffe9c0
This issue can be reproduce by singe command:
$ cat /sys/bus/i2c/devices/0-0058/iio:device0/in_intensity6_raw
The array size of afe4404_channel_leds and afe4404_channel_offdacs are less than channels, so access with chan->address cause OOB read in afe4404_[read|write]_raw. Fix it by moving access before use them.
Fixes: b36e8257641a ("iio: health/afe440x: Use regmap fields") Signed-off-by: Wei Yongjun weiyongjun1@huawei.com Acked-by: Andrew Davis afd@ti.com Link: https://lore.kernel.org/r/20221107152010.95937-1-weiyongjun@huaweicloud.com Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/iio/health/afe4404.c | 12 +++++++----- 1 file changed, 7 insertions(+), 5 deletions(-)
diff --git a/drivers/iio/health/afe4404.c b/drivers/iio/health/afe4404.c index dd7800159051..f03c466c9385 100644 --- a/drivers/iio/health/afe4404.c +++ b/drivers/iio/health/afe4404.c @@ -250,20 +250,20 @@ static int afe4404_read_raw(struct iio_dev *indio_dev, int *val, int *val2, long mask) { struct afe4404_data *afe = iio_priv(indio_dev); - unsigned int value_reg = afe4404_channel_values[chan->address]; - unsigned int led_field = afe4404_channel_leds[chan->address]; - unsigned int offdac_field = afe4404_channel_offdacs[chan->address]; + unsigned int value_reg, led_field, offdac_field; int ret;
switch (chan->type) { case IIO_INTENSITY: switch (mask) { case IIO_CHAN_INFO_RAW: + value_reg = afe4404_channel_values[chan->address]; ret = regmap_read(afe->regmap, value_reg, val); if (ret) return ret; return IIO_VAL_INT; case IIO_CHAN_INFO_OFFSET: + offdac_field = afe4404_channel_offdacs[chan->address]; ret = regmap_field_read(afe->fields[offdac_field], val); if (ret) return ret; @@ -273,6 +273,7 @@ static int afe4404_read_raw(struct iio_dev *indio_dev, case IIO_CURRENT: switch (mask) { case IIO_CHAN_INFO_RAW: + led_field = afe4404_channel_leds[chan->address]; ret = regmap_field_read(afe->fields[led_field], val); if (ret) return ret; @@ -295,19 +296,20 @@ static int afe4404_write_raw(struct iio_dev *indio_dev, int val, int val2, long mask) { struct afe4404_data *afe = iio_priv(indio_dev); - unsigned int led_field = afe4404_channel_leds[chan->address]; - unsigned int offdac_field = afe4404_channel_offdacs[chan->address]; + unsigned int led_field, offdac_field;
switch (chan->type) { case IIO_INTENSITY: switch (mask) { case IIO_CHAN_INFO_OFFSET: + offdac_field = afe4404_channel_offdacs[chan->address]; return regmap_field_write(afe->fields[offdac_field], val); } break; case IIO_CURRENT: switch (mask) { case IIO_CHAN_INFO_RAW: + led_field = afe4404_channel_leds[chan->address]; return regmap_field_write(afe->fields[led_field], val); } break;
From: Paul Gazzillo paul@pgazz.com
[ Upstream commit 6ac12303572ef9ace5603c2c07f5f1b00a33f580 ]
Fix an implicit declaration of function error for rpr0521 under some configs
When CONFIG_RPR0521 is enabled without CONFIG_IIO_TRIGGERED_BUFFER, the build results in "implicit declaration of function" errors, e.g., drivers/iio/light/rpr0521.c:434:3: error: implicit declaration of function 'iio_trigger_poll_chained' [-Werror=implicit-function-declaration] 434 | iio_trigger_poll_chained(data->drdy_trigger0); | ^~~~~~~~~~~~~~~~~~~~~~~~
This fix adds select dependencies to RPR0521's configuration declaration.
Fixes: e12ffd241c00 ("iio: light: rpr0521 triggered buffer") Signed-off-by: Paul Gazzillo paul@pgazz.com Link: https://bugzilla.kernel.org/show_bug.cgi?id=216678 Reviewed-by: Andy Shevchenko andriy.shevchenko@linux.intel.com Link: https://lore.kernel.org/r/20221110214729.ls5ixav5kxpeftk7@device Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/iio/light/Kconfig | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/iio/light/Kconfig b/drivers/iio/light/Kconfig index 8537e88f02e3..c02393009a2c 100644 --- a/drivers/iio/light/Kconfig +++ b/drivers/iio/light/Kconfig @@ -293,6 +293,8 @@ config RPR0521 tristate "ROHM RPR0521 ALS and proximity sensor driver" depends on I2C select REGMAP_I2C + select IIO_BUFFER + select IIO_TRIGGERED_BUFFER help Say Y here if you want to build support for ROHM's RPR0521 ambient light and proximity sensor device.
From: Jiri Olsa jolsa@kernel.org
[ Upstream commit 5fd2a60aecf3a42b14fa371c55b3dbb18b229230 ]
We need to pass '*link' to final libbpf_get_error, because that one holds the return value, not 'link'.
Fixes: 4fa5bcfe07f7 ("libbpf: Allow BPF program auto-attach handlers to bail out") Signed-off-by: Jiri Olsa jolsa@kernel.org Signed-off-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20221114145257.882322-1-jolsa@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- tools/lib/bpf/libbpf.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index e36c44090720..79ea83be21ce 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -11143,7 +11143,7 @@ static int attach_raw_tp(const struct bpf_program *prog, long cookie, struct bpf }
*link = bpf_program__attach_raw_tracepoint(prog, tp_name); - return libbpf_get_error(link); + return libbpf_get_error(*link); }
/* Common logic for all BPF program types that attach to a btf_id */
From: Hou Tao houtao1@huawei.com
[ Upstream commit 47df8a2f78bc34ff170d147d05b121f84e252b85 ]
Since commit bfea9a8574f3 ("bpf: Add name to struct bpf_ksym"), when reporting subprog ksymbol to perf, prog name instead of subprog name is used. The backtrace of bpf program with subprogs will be incorrect as shown below:
ffffffffc02deace bpf_prog_e44a3057dcb151f8_overwrite+0x66 ffffffffc02de9f7 bpf_prog_e44a3057dcb151f8_overwrite+0x9f ffffffffa71d8d4e trace_call_bpf+0xce ffffffffa71c2938 perf_call_bpf_enter.isra.0+0x48
overwrite is the entry program and it invokes the overwrite_htab subprog through bpf_loop, but in above backtrace, overwrite program just jumps inside itself.
Fixing it by using subprog name when reporting subprog ksymbol. After the fix, the output of perf script will be correct as shown below:
ffffffffc031aad2 bpf_prog_37c0bec7d7c764a4_overwrite_htab+0x66 ffffffffc031a9e7 bpf_prog_c7eb827ef4f23e71_overwrite+0x9f ffffffffa3dd8d4e trace_call_bpf+0xce ffffffffa3dc2938 perf_call_bpf_enter.isra.0+0x48
Fixes: bfea9a8574f3 ("bpf: Add name to struct bpf_ksym") Signed-off-by: Hou Tao houtao1@huawei.com Signed-off-by: Daniel Borkmann daniel@iogearbox.net Acked-by: Jiri Olsa jolsa@kernel.org Link: https://lore.kernel.org/bpf/20221114095733.158588-1-houtao@huaweicloud.com Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/events/core.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/events/core.c b/kernel/events/core.c index bec18d81b116..8dcbefd90b7f 100644 --- a/kernel/events/core.c +++ b/kernel/events/core.c @@ -9006,7 +9006,7 @@ static void perf_event_bpf_emit_ksymbols(struct bpf_prog *prog, PERF_RECORD_KSYMBOL_TYPE_BPF, (u64)(unsigned long)subprog->bpf_func, subprog->jited_len, unregister, - prog->aux->ksym.name); + subprog->aux->ksym.name); } } }
From: Srikar Dronamraju srikar@linux.vnet.ibm.com
[ Upstream commit 2d77de1581bb5b470486edaf17a7d70151131afd ]
Commit 1d1a0e7c5100 ("scripts/faddr2line: Fix overlapping text section failures") can cause faddr2line to fail on ppc64le on some distributions, while it works fine on other distributions. The failure can be attributed to differences in the readelf output.
$ ./scripts/faddr2line vmlinux find_busiest_group+0x00 no match for find_busiest_group+0x00
On ppc64le, readelf adds the localentry tag before the symbol name on some distributions, and adds the localentry tag after the symbol name on other distributions. This problem has been discussed previously:
https://lore.kernel.org/bpf/20191211160133.GB4580@calabresa/
This problem can be overcome by filtering out the localentry tags in the readelf output. Similar fixes are already present in the kernel by way of the following commits:
1fd6cee127e2 ("libbpf: Fix VERSIONED_SYM_COUNT number parsing") aa915931ac3e ("libbpf: Fix readelf output parsing for Fedora")
[jpoimboe: rework commit log]
Fixes: 1d1a0e7c5100 ("scripts/faddr2line: Fix overlapping text section failures") Signed-off-by: Srikar Dronamraju srikar@linux.vnet.ibm.com Acked-by: Naveen N. Rao naveen.n.rao@linux.vnet.ibm.com Reviewed-by: Thadeu Lima de Souza Cascardo cascardo@canonical.com Link: https://lore.kernel.org/r/20220927075211.897152-1-srikar@linux.vnet.ibm.com Signed-off-by: Josh Poimboeuf jpoimboe@kernel.org Signed-off-by: Peter Zijlstra peterz@infradead.org Signed-off-by: Sasha Levin sashal@kernel.org --- scripts/faddr2line | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/scripts/faddr2line b/scripts/faddr2line index 5514c23f45c2..0e73aca4f908 100755 --- a/scripts/faddr2line +++ b/scripts/faddr2line @@ -74,7 +74,8 @@ command -v ${ADDR2LINE} >/dev/null 2>&1 || die "${ADDR2LINE} isn't installed" find_dir_prefix() { local objfile=$1
- local start_kernel_addr=$(${READELF} --symbols --wide $objfile | ${AWK} '$8 == "start_kernel" {printf "0x%s", $2}') + local start_kernel_addr=$(${READELF} --symbols --wide $objfile | sed 's/[.*]//' | + ${AWK} '$8 == "start_kernel" {printf "0x%s", $2}') [[ -z $start_kernel_addr ]] && return
local file_line=$(${ADDR2LINE} -e $objfile $start_kernel_addr) @@ -178,7 +179,7 @@ __faddr2line() { found=2 break fi - done < <(${READELF} --symbols --wide $objfile | ${AWK} -v sec=$sym_sec '$7 == sec' | sort --key=2) + done < <(${READELF} --symbols --wide $objfile | sed 's/[.*]//' | ${AWK} -v sec=$sym_sec '$7 == sec' | sort --key=2)
if [[ $found = 0 ]]; then warn "can't find symbol: sym_name: $sym_name sym_sec: $sym_sec sym_addr: $sym_addr sym_elf_size: $sym_elf_size" @@ -259,7 +260,7 @@ __faddr2line() {
DONE=1
- done < <(${READELF} --symbols --wide $objfile | ${AWK} -v fn=$sym_name '$4 == "FUNC" && $8 == fn') + done < <(${READELF} --symbols --wide $objfile | sed 's/[.*]//' | ${AWK} -v fn=$sym_name '$4 == "FUNC" && $8 == fn') }
[[ $# -lt 2 ]] && usage
From: Michael Grzeschik m.grzeschik@pengutronix.de
[ Upstream commit 57976762428675f259339385d3324d28ee53ec02 ]
Referring to the datasheet the index 2 is the MCKUDP. When enabled, it "Enables the automatic disable of the Master Clock of the USB Device Port when a suspend condition occurs". We fix the index to the real UDP id which "Enables the 48 MHz clock of the USB Device Port".
Cc: nicolas.ferre@microchip.com Cc: ludovic.desroches@microchip.com Cc: alexandre.belloni@bootlin.com Cc: mturquette@baylibre.com Cc: sboyd@kernel.org Cc: claudiu.beznea@microchip.com Cc: linux-clk@vger.kernel.org Cc: linux-arm-kernel@lists.infradead.org Cc: kernel@pengutronix.de Fixes: 02ff48e4d7f7 ("clk: at91: add at91rm9200 pmc driver") Fixes: 0e0e528d8260 ("ARM: dts: at91: rm9200: switch to new clock bindings") Reviewed-by: Claudiu Beznea claudiu.beznea@microchip.com Signed-off-by: Michael Grzeschik m.grzeschik@pengutronix.de Signed-off-by: Claudiu Beznea claudiu.beznea@microchip.com Link: https://lore.kernel.org/r/20221114185923.1023249-2-m.grzeschik@pengutronix.d... Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm/boot/dts/at91rm9200.dtsi | 2 +- drivers/clk/at91/at91rm9200.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/arm/boot/dts/at91rm9200.dtsi b/arch/arm/boot/dts/at91rm9200.dtsi index d1181ead18e5..21344fbc89e5 100644 --- a/arch/arm/boot/dts/at91rm9200.dtsi +++ b/arch/arm/boot/dts/at91rm9200.dtsi @@ -660,7 +660,7 @@ usb1: gadget@fffb0000 { compatible = "atmel,at91rm9200-udc"; reg = <0xfffb0000 0x4000>; interrupts = <11 IRQ_TYPE_LEVEL_HIGH 2>; - clocks = <&pmc PMC_TYPE_PERIPHERAL 11>, <&pmc PMC_TYPE_SYSTEM 2>; + clocks = <&pmc PMC_TYPE_PERIPHERAL 11>, <&pmc PMC_TYPE_SYSTEM 1>; clock-names = "pclk", "hclk"; status = "disabled"; }; diff --git a/drivers/clk/at91/at91rm9200.c b/drivers/clk/at91/at91rm9200.c index b174f727a8ef..16870943a13e 100644 --- a/drivers/clk/at91/at91rm9200.c +++ b/drivers/clk/at91/at91rm9200.c @@ -40,7 +40,7 @@ static const struct clk_pll_characteristics rm9200_pll_characteristics = { };
static const struct sck at91rm9200_systemck[] = { - { .n = "udpck", .p = "usbck", .id = 2 }, + { .n = "udpck", .p = "usbck", .id = 1 }, { .n = "uhpck", .p = "usbck", .id = 4 }, { .n = "pck0", .p = "prog0", .id = 8 }, { .n = "pck1", .p = "prog1", .id = 9 },
From: Hou Tao houtao1@huawei.com
[ Upstream commit 927cbb478adf917e0a142b94baa37f06279cc466 ]
The maximum size of ringbuf is 2GB on x86-64 host, so 2 * max_entries will overflow u32 when mapping producer page and data pages. Only casting max_entries to size_t is not enough, because for 32-bits application on 64-bits kernel the size of read-only mmap region also could overflow size_t.
So fixing it by casting the size of read-only mmap region into a __u64 and checking whether or not there will be overflow during mmap.
Fixes: bf99c936f947 ("libbpf: Add BPF ring buffer support") Signed-off-by: Hou Tao houtao1@huawei.com Signed-off-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20221116072351.1168938-3-houtao@huaweicloud.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/lib/bpf/ringbuf.c | 12 ++++++++---- 1 file changed, 8 insertions(+), 4 deletions(-)
diff --git a/tools/lib/bpf/ringbuf.c b/tools/lib/bpf/ringbuf.c index 8bc117bcc7bc..c42ba9358d8c 100644 --- a/tools/lib/bpf/ringbuf.c +++ b/tools/lib/bpf/ringbuf.c @@ -59,6 +59,7 @@ int ring_buffer__add(struct ring_buffer *rb, int map_fd, __u32 len = sizeof(info); struct epoll_event *e; struct ring *r; + __u64 mmap_sz; void *tmp; int err;
@@ -97,8 +98,7 @@ int ring_buffer__add(struct ring_buffer *rb, int map_fd, r->mask = info.max_entries - 1;
/* Map writable consumer page */ - tmp = mmap(NULL, rb->page_size, PROT_READ | PROT_WRITE, MAP_SHARED, - map_fd, 0); + tmp = mmap(NULL, rb->page_size, PROT_READ | PROT_WRITE, MAP_SHARED, map_fd, 0); if (tmp == MAP_FAILED) { err = -errno; pr_warn("ringbuf: failed to mmap consumer page for map fd=%d: %d\n", @@ -111,8 +111,12 @@ int ring_buffer__add(struct ring_buffer *rb, int map_fd, * data size to allow simple reading of samples that wrap around the * end of a ring buffer. See kernel implementation for details. * */ - tmp = mmap(NULL, rb->page_size + 2 * info.max_entries, PROT_READ, - MAP_SHARED, map_fd, rb->page_size); + mmap_sz = rb->page_size + 2 * (__u64)info.max_entries; + if (mmap_sz != (__u64)(size_t)mmap_sz) { + pr_warn("ringbuf: ring buffer size (%u) is too big\n", info.max_entries); + return libbpf_err(-E2BIG); + } + tmp = mmap(NULL, (size_t)mmap_sz, PROT_READ, MAP_SHARED, map_fd, rb->page_size); if (tmp == MAP_FAILED) { err = -errno; ringbuf_unmap_ring(rb, r);
From: Derek Nguyen derek.nguyen@collins.com
[ Upstream commit 07e06193ead86d4812f431b4d87bbd4161222e3f ]
The LTC2947 datasheet (Rev. B) calls out in the section "Register Description: Non-Accumulated Result Registers" (pg. 30) that "To calculate temperature, multiply the TEMP register value by 0.204°C and add 5.5°C". Fix to add 5.5C and not 0.55C.
Fixes: 9f90fd652bed ("hwmon: Add support for ltc2947") Signed-off-by: Derek Nguyen derek.nguyen@collins.com Signed-off-by: Brandon Maier brandon.maier@collins.com Link: https://lore.kernel.org/r/20221110192108.20624-1-brandon.maier@collins.com Signed-off-by: Guenter Roeck linux@roeck-us.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/hwmon/ltc2947-core.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/hwmon/ltc2947-core.c b/drivers/hwmon/ltc2947-core.c index 5423466de697..e918490f3ff7 100644 --- a/drivers/hwmon/ltc2947-core.c +++ b/drivers/hwmon/ltc2947-core.c @@ -396,7 +396,7 @@ static int ltc2947_read_temp(struct device *dev, const u32 attr, long *val, return ret;
/* in milidegrees celcius, temp is given by: */ - *val = (__val * 204) + 550; + *val = (__val * 204) + 5500;
return 0; }
From: Ninad Malwade nmalwade@nvidia.com
[ Upstream commit b8d27d2ce8dfc207e4b67b929a86f2be76fbc6ef ]
The shunt sum critical limit register value should be left shifted by one bit as its LSB-0 is a reserved bit.
Fixes: 2057bdfb7184 ("hwmon: (ina3221) Add summation feature support") Signed-off-by: Ninad Malwade nmalwade@nvidia.com Reviewed-by: Thierry Reding treding@nvidia.com Link: https://lore.kernel.org/r/20221108044508.23463-1-nmalwade@nvidia.com Signed-off-by: Guenter Roeck linux@roeck-us.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/hwmon/ina3221.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/hwmon/ina3221.c b/drivers/hwmon/ina3221.c index 58d3828e2ec0..14586b2fb17d 100644 --- a/drivers/hwmon/ina3221.c +++ b/drivers/hwmon/ina3221.c @@ -228,7 +228,7 @@ static int ina3221_read_value(struct ina3221_data *ina, unsigned int reg, * Shunt Voltage Sum register has 14-bit value with 1-bit shift * Other Shunt Voltage registers have 12 bits with 3-bit shift */ - if (reg == INA3221_SHUNT_SUM) + if (reg == INA3221_SHUNT_SUM || reg == INA3221_CRIT_SUM) *val = sign_extend32(regval >> 1, 14); else *val = sign_extend32(regval >> 3, 12); @@ -465,7 +465,7 @@ static int ina3221_write_curr(struct device *dev, u32 attr, * SHUNT_SUM: (1 / 40uV) << 1 = 1 / 20uV * SHUNT[1-3]: (1 / 40uV) << 3 = 1 / 5uV */ - if (reg == INA3221_SHUNT_SUM) + if (reg == INA3221_SHUNT_SUM || reg == INA3221_CRIT_SUM) regval = DIV_ROUND_CLOSEST(voltage_uv, 20) & 0xfffe; else regval = DIV_ROUND_CLOSEST(voltage_uv, 5) & 0xfff8;
From: Yang Yingliang yangyingliang@huawei.com
[ Upstream commit 3b7f98f237528c496ea0b689bace0e35eec3e060 ]
pci_disable_device() need be called while module exiting, switch to use pcim_enable(), pci_disable_device() will be called in pcim_release().
Fixes: ada072816be1 ("hwmon: (i5500_temp) New driver for the Intel 5500/5520/X58 chipsets") Signed-off-by: Yang Yingliang yangyingliang@huawei.com Link: https://lore.kernel.org/r/20221112125606.3751430-1-yangyingliang@huawei.com Signed-off-by: Guenter Roeck linux@roeck-us.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/hwmon/i5500_temp.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/hwmon/i5500_temp.c b/drivers/hwmon/i5500_temp.c index 05f68e9c9477..23b9f94fe0a9 100644 --- a/drivers/hwmon/i5500_temp.c +++ b/drivers/hwmon/i5500_temp.c @@ -117,7 +117,7 @@ static int i5500_temp_probe(struct pci_dev *pdev, u32 tstimer; s8 tsfsc;
- err = pci_enable_device(pdev); + err = pcim_enable_device(pdev); if (err) { dev_err(&pdev->dev, "Failed to enable device\n"); return err;
From: Gaosheng Cui cuigaosheng1@huawei.com
[ Upstream commit e2a87785aab0dac190ac89be6a9ba955e2c634f2 ]
Smatch report warning as follows:
drivers/hwmon/ibmpex.c:509 ibmpex_register_bmc() warn: '&data->list' not removed from list
If ibmpex_find_sensors() fails in ibmpex_register_bmc(), data will be freed, but data->list will not be removed from driver_data.bmc_data, then list traversal may cause UAF.
Fix by removeing it from driver_data.bmc_data before free().
Fixes: 57c7c3a0fdea ("hwmon: IBM power meter driver") Signed-off-by: Gaosheng Cui cuigaosheng1@huawei.com Link: https://lore.kernel.org/r/20221117034423.2935739-1-cuigaosheng1@huawei.com Signed-off-by: Guenter Roeck linux@roeck-us.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/hwmon/ibmpex.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/hwmon/ibmpex.c b/drivers/hwmon/ibmpex.c index f6ec165c0fa8..1837cccd993c 100644 --- a/drivers/hwmon/ibmpex.c +++ b/drivers/hwmon/ibmpex.c @@ -502,6 +502,7 @@ static void ibmpex_register_bmc(int iface, struct device *dev) return;
out_register: + list_del(&data->list); hwmon_device_unregister(data->hwmon_dev); out_user: ipmi_destroy_user(data->user);
From: Joe Korty joe.korty@concurrent-rt.com
[ Upstream commit 839a973988a94c15002cbd81536e4af6ced2bd30 ]
The TVAL register is 32 bit signed. Thus only the lower 31 bits are available to specify when an interrupt is to occur at some time in the near future. Attempting to specify a larger interval with TVAL results in a negative time delta which means the timer fires immediately upon being programmed, rather than firing at that expected future time.
The solution is for Linux to declare that TVAL is a 31 bit register rather than give its true size of 32 bits. This prevents Linux from programming TVAL with a too-large value. Note that, prior to 5.16, this little trick was the standard way to handle TVAL in Linux, so there is nothing new happening here on that front.
The softlockup detector hides the issue, because it keeps generating short timer deadlines that are within the scope of the broken timer.
Disabling it, it starts using NO_HZ with much longer timer deadlines, which turns into an interrupt flood:
11: 1124855130 949168462 758009394 76417474 104782230 30210281 310890 1734323687 GICv2 29 Level arch_timer
And "much longer" isn't that long: it takes less than 43s to underflow TVAL at 50MHz (the frequency of the counter on XGene-1).
Some comments on the v1 version of this patch by Marc Zyngier:
XGene implements CVAL (a 64bit comparator) in terms of TVAL (a countdown register) instead of the other way around. TVAL being a 32bit register, the width of the counter should equally be 32. However, TVAL is a *signed* value, and keeps counting down in the negative range once the timer fires.
It means that any TVAL value with bit 31 set will fire immediately, as it cannot be distinguished from an already expired timer. Reducing the timer range back to a paltry 31 bits papers over the issue.
Another problem cannot be fixed though, which is that the timer interrupt *must* be handled within the negative countdown period, or the interrupt will be lost (TVAL will rollover to a positive value, indicative of a new timer deadline).
Fixes: 012f18850452 ("clocksource/drivers/arm_arch_timer: Work around broken CVAL implementations") Signed-off-by: Joe Korty joe.korty@concurrent-rt.com Signed-off-by: Marc Zyngier maz@kernel.org Signed-off-by: Thomas Gleixner tglx@linutronix.de Reviewed-by: Marc Zyngier maz@kernel.org Link: https://lore.kernel.org/r/20221024165422.GA51107@zipoli.concurrent-rt.com Link: https://lore.kernel.org/r/20221121145343.896018-1-maz@kernel.org
[maz: revamped the commit message]
Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/clocksource/arm_arch_timer.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/drivers/clocksource/arm_arch_timer.c b/drivers/clocksource/arm_arch_timer.c index a7ff77550e17..933bb960490d 100644 --- a/drivers/clocksource/arm_arch_timer.c +++ b/drivers/clocksource/arm_arch_timer.c @@ -806,6 +806,9 @@ static u64 __arch_timer_check_delta(void) /* * XGene-1 implements CVAL in terms of TVAL, meaning * that the maximum timer range is 32bit. Shame on them. + * + * Note that TVAL is signed, thus has only 31 of its + * 32 bits to express magnitude. */ MIDR_ALL_VERSIONS(MIDR_CPU_MODEL(ARM_CPU_IMP_APM, APM_CPU_PART_POTENZA)), @@ -813,8 +816,8 @@ static u64 __arch_timer_check_delta(void) };
if (is_midr_in_range_list(read_cpuid_id(), broken_cval_midrs)) { - pr_warn_once("Broken CNTx_CVAL_EL1, limiting width to 32bits"); - return CLOCKSOURCE_MASK(32); + pr_warn_once("Broken CNTx_CVAL_EL1, using 31 bit TVAL instead.\n"); + return CLOCKSOURCE_MASK(31); } #endif return CLOCKSOURCE_MASK(arch_counter_get_width());
From: Xu Kuohai xukuohai@huawei.com
[ Upstream commit 836e49e103dfeeff670c934b7d563cbd982fce87 ]
bpf_selem_alloc function is used by inode_storage, sk_storage and task_storage maps to set map value, for these map types, there may be a spin lock in the map value, so if we use memcpy to copy the whole map value from user, the spin lock field may be initialized incorrectly.
Since the spin lock field is zeroed by kzalloc, call copy_map_value instead of memcpy to skip copying the spin lock field to fix it.
Fixes: 6ac99e8f23d4 ("bpf: Introduce bpf sk local storage") Signed-off-by: Xu Kuohai xukuohai@huawei.com Link: https://lore.kernel.org/r/20221114134720.1057939-2-xukuohai@huawei.com Signed-off-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/bpf/bpf_local_storage.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/bpf/bpf_local_storage.c b/kernel/bpf/bpf_local_storage.c index d13ffb00e981..cbe918ba9035 100644 --- a/kernel/bpf/bpf_local_storage.c +++ b/kernel/bpf/bpf_local_storage.c @@ -74,7 +74,7 @@ bpf_selem_alloc(struct bpf_local_storage_map *smap, void *owner, gfp_flags | __GFP_NOWARN); if (selem) { if (value) - memcpy(SDATA(selem)->data, value, smap->map.value_size); + copy_map_value(&smap->map, SDATA(selem)->data, value); return selem; }
From: Wei Yongjun weiyongjun1@huawei.com
[ Upstream commit 58e92c4a496b27156020a59a98c7f4a92c2b1533 ]
In case of error, the function memremap() returns NULL pointer not ERR_PTR(). The IS_ERR() test in the return value check should be replaced with NULL test.
Fixes: 5a3fa75a4d9c ("nvmem: Add driver to expose reserved memory as nvmem") Cc: Srinivas Kandagatla srinivas.kandagatla@linaro.org Cc: Nicolas Saenz Julienne nsaenzjulienne@suse.de Signed-off-by: Wei Yongjun weiyongjun1@huawei.com Acked-by: Nicolas Saenz Julienne nsaenzjulienne@suse.de Signed-off-by: Yang Yingliang yangyingliang@huawei.com Signed-off-by: Srinivas Kandagatla srinivas.kandagatla@linaro.org Link: https://lore.kernel.org/r/20221118063840.6357-3-srinivas.kandagatla@linaro.o... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/nvmem/rmem.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/nvmem/rmem.c b/drivers/nvmem/rmem.c index b11c3c974b3d..80cb187f1481 100644 --- a/drivers/nvmem/rmem.c +++ b/drivers/nvmem/rmem.c @@ -37,9 +37,9 @@ static int rmem_read(void *context, unsigned int offset, * but as of Dec 2020 this isn't possible on arm64. */ addr = memremap(priv->mem->base, available, MEMREMAP_WB); - if (IS_ERR(addr)) { + if (!addr) { dev_err(priv->dev, "Failed to remap memory region\n"); - return PTR_ERR(addr); + return -ENOMEM; }
count = memory_read_from_buffer(val, bytes, &off, addr, available);
From: Yang Yingliang yangyingliang@huawei.com
[ Upstream commit 60d865bd5a9b15a3961eb1c08bd4155682a3c81e ]
In of_fwnode_get_reference_args(), the refcount of of_args.np has been incremented in the case of successful return from of_parse_phandle_with_args() or of_parse_phandle_with_fixed_args().
Decrement the refcount if of_args is not returned to the caller of of_fwnode_get_reference_args().
Fixes: 3e3119d3088f ("device property: Introduce fwnode_property_get_reference_args") Signed-off-by: Yang Yingliang yangyingliang@huawei.com Reviewed-by: Sakari Ailus sakari.ailus@linux.intel.com Reviewed-by: Frank Rowand frowand.list@gmail.com Link: https://lore.kernel.org/r/20221121023209.3909759-1-yangyingliang@huawei.com Signed-off-by: Rob Herring robh@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/of/property.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/of/property.c b/drivers/of/property.c index 967f79b59016..134cfc980b70 100644 --- a/drivers/of/property.c +++ b/drivers/of/property.c @@ -993,8 +993,10 @@ of_fwnode_get_reference_args(const struct fwnode_handle *fwnode, nargs, index, &of_args); if (ret < 0) return ret; - if (!args) + if (!args) { + of_node_put(of_args.np); return 0; + }
args->nargs = of_args.args_count; args->fwnode = of_fwnode_handle(of_args.np);
From: Shazad Hussain quic_shazhuss@quicinc.com
[ Upstream commit f6abcc21d94393801937aed808b8f055ffec8579 ]
The three UFS reference clocks, gcc_ufs_ref_clkref_clk for external UFS devices, gcc_ufs_card_clkref_clk and gcc_ufs_1_card_clkref_clk for two PHYs are all sourced from CXO.
Added parent_data for all three reference clocks described above to reflect that all three clocks are sourced from CXO to have valid frequency for the ref clock needed by UFS controller driver.
Fixes: d65d005f9a6c ("clk: qcom: add sc8280xp GCC driver") Link: https://lore.kernel.org/lkml/Y2Tber39cHuOSR%2FW@hovoldconsulting.com/ Signed-off-by: Shazad Hussain quic_shazhuss@quicinc.com Tested-by: Johan Hovold johan+linaro@kernel.org Reviewed-by: Johan Hovold johan+linaro@kernel.org Tested-by: Andrew Halaney ahalaney@redhat.com Reviewed-by: Andrew Halaney ahalaney@redhat.com Reviewed-by: Brian Masney bmasney@redhat.com Link: https://lore.kernel.org/r/20221115152956.21677-1-quic_shazhuss@quicinc.com Reviewed-by: Bjorn Andersson andersson@kernel.org Signed-off-by: Stephen Boyd sboyd@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/clk/qcom/gcc-sc8280xp.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/drivers/clk/qcom/gcc-sc8280xp.c b/drivers/clk/qcom/gcc-sc8280xp.c index a2f3ffcc5849..fd332383527f 100644 --- a/drivers/clk/qcom/gcc-sc8280xp.c +++ b/drivers/clk/qcom/gcc-sc8280xp.c @@ -5364,6 +5364,8 @@ static struct clk_branch gcc_ufs_1_card_clkref_clk = { .enable_mask = BIT(0), .hw.init = &(const struct clk_init_data) { .name = "gcc_ufs_1_card_clkref_clk", + .parent_data = &gcc_parent_data_tcxo, + .num_parents = 1, .ops = &clk_branch2_ops, }, }, @@ -5432,6 +5434,8 @@ static struct clk_branch gcc_ufs_card_clkref_clk = { .enable_mask = BIT(0), .hw.init = &(const struct clk_init_data) { .name = "gcc_ufs_card_clkref_clk", + .parent_data = &gcc_parent_data_tcxo, + .num_parents = 1, .ops = &clk_branch2_ops, }, }, @@ -5848,6 +5852,8 @@ static struct clk_branch gcc_ufs_ref_clkref_clk = { .enable_mask = BIT(0), .hw.init = &(const struct clk_init_data) { .name = "gcc_ufs_ref_clkref_clk", + .parent_data = &gcc_parent_data_tcxo, + .num_parents = 1, .ops = &clk_branch2_ops, }, },
From: Shang XiaoJing shangxiaojing@huawei.com
[ Upstream commit 8cfa238a48f34038464b99d0b4825238c2687181 ]
ixgbevf_init_module() won't destroy the workqueue created by create_singlethread_workqueue() when pci_register_driver() failed. Add destroy_workqueue() in fail path to prevent the resource leak.
Similar to the handling of u132_hcd_init in commit f276e002793c ("usb: u132-hcd: fix resource leak")
Fixes: 40a13e2493c9 ("ixgbevf: Use a private workqueue to avoid certain possible hangs") Signed-off-by: Shang XiaoJing shangxiaojing@huawei.com Reviewed-by: Saeed Mahameed saeed@kernel.org Tested-by: Konrad Jankowski konrad0.jankowski@intel.com Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c index 2f12fbe229c1..624b8aa4508c 100644 --- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c +++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c @@ -4869,6 +4869,8 @@ static struct pci_driver ixgbevf_driver = { **/ static int __init ixgbevf_init_module(void) { + int err; + pr_info("%s\n", ixgbevf_driver_string); pr_info("%s\n", ixgbevf_copyright); ixgbevf_wq = create_singlethread_workqueue(ixgbevf_driver_name); @@ -4877,7 +4879,13 @@ static int __init ixgbevf_init_module(void) return -ENOMEM; }
- return pci_register_driver(&ixgbevf_driver); + err = pci_register_driver(&ixgbevf_driver); + if (err) { + destroy_workqueue(ixgbevf_wq); + return err; + } + + return 0; }
module_init(ixgbevf_init_module);
From: Shang XiaoJing shangxiaojing@huawei.com
[ Upstream commit 479dd06149425b9e00477f52200872587af76a48 ]
i40e_init_module() won't free the debugfs directory created by i40e_dbg_init() when pci_register_driver() failed. Add fail path to call i40e_dbg_exit() to remove the debugfs entries to prevent the bug.
i40e: Intel(R) Ethernet Connection XL710 Network Driver i40e: Copyright (c) 2013 - 2019 Intel Corporation. debugfs: Directory 'i40e' with parent '/' already present!
Fixes: 41c445ff0f48 ("i40e: main driver core") Signed-off-by: Shang XiaoJing shangxiaojing@huawei.com Reviewed-by: Leon Romanovsky leonro@nvidia.com Tested-by: Gurucharan G gurucharanx.g@intel.com (A Contingent worker at Intel) Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/i40e/i40e_main.c | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c index b3336d31f8a9..023685cca2c1 100644 --- a/drivers/net/ethernet/intel/i40e/i40e_main.c +++ b/drivers/net/ethernet/intel/i40e/i40e_main.c @@ -16652,6 +16652,8 @@ static struct pci_driver i40e_driver = { **/ static int __init i40e_init_module(void) { + int err; + pr_info("%s: %s\n", i40e_driver_name, i40e_driver_string); pr_info("%s: %s\n", i40e_driver_name, i40e_copyright);
@@ -16669,7 +16671,14 @@ static int __init i40e_init_module(void) }
i40e_dbg_init(); - return pci_register_driver(&i40e_driver); + err = pci_register_driver(&i40e_driver); + if (err) { + destroy_workqueue(i40e_wq); + i40e_dbg_exit(); + return err; + } + + return 0; } module_init(i40e_init_module);
From: Yuan Can yuancan@huawei.com
[ Upstream commit 771a794c0a3c3e7f0d86cc34be4f9537e8c0a20c ]
A problem about modprobe fm10k failed is triggered with the following log given:
Intel(R) Ethernet Switch Host Interface Driver Copyright(c) 2013 - 2019 Intel Corporation. debugfs: Directory 'fm10k' with parent '/' already present!
The reason is that fm10k_init_module() returns fm10k_register_pci_driver() directly without checking its return value, if fm10k_register_pci_driver() failed, it returns without removing debugfs and destroy workqueue, resulting the debugfs of fm10k can never be created later and leaks the workqueue.
fm10k_init_module() alloc_workqueue() fm10k_dbg_init() # create debugfs fm10k_register_pci_driver() pci_register_driver() driver_register() bus_add_driver() priv = kzalloc(...) # OOM happened # return without remove debugfs and destroy workqueue
Fix by remove debugfs and destroy workqueue when fm10k_register_pci_driver() returns error.
Fixes: 7461fd913afe ("fm10k: Add support for debugfs") Fixes: b382bb1b3e2d ("fm10k: use separate workqueue for fm10k driver") Signed-off-by: Yuan Can yuancan@huawei.com Reviewed-by: Jacob Keller jacob.e.keller@intel.com Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/fm10k/fm10k_main.c | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_main.c b/drivers/net/ethernet/intel/fm10k/fm10k_main.c index 3362f26d7f99..1b273446621c 100644 --- a/drivers/net/ethernet/intel/fm10k/fm10k_main.c +++ b/drivers/net/ethernet/intel/fm10k/fm10k_main.c @@ -32,6 +32,8 @@ struct workqueue_struct *fm10k_workqueue; **/ static int __init fm10k_init_module(void) { + int ret; + pr_info("%s\n", fm10k_driver_string); pr_info("%s\n", fm10k_copyright);
@@ -43,7 +45,13 @@ static int __init fm10k_init_module(void)
fm10k_dbg_init();
- return fm10k_register_pci_driver(); + ret = fm10k_register_pci_driver(); + if (ret) { + fm10k_dbg_exit(); + destroy_workqueue(fm10k_workqueue); + } + + return ret; } module_init(fm10k_init_module);
From: Yuan Can yuancan@huawei.com
[ Upstream commit 227d8d2f7f2278b8468c5531b0cd0f2a905b4486 ]
The iavf_init_module() won't destroy workqueue when pci_register_driver() failed. Call destroy_workqueue() when pci_register_driver() failed to prevent the resource leak.
Similar to the handling of u132_hcd_init in commit f276e002793c ("usb: u132-hcd: fix resource leak")
Fixes: 2803b16c10ea ("i40e/i40evf: Use private workqueue") Signed-off-by: Yuan Can yuancan@huawei.com Tested-by: Konrad Jankowski konrad0.jankowski@intel.com Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/iavf/iavf_main.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/intel/iavf/iavf_main.c b/drivers/net/ethernet/intel/iavf/iavf_main.c index cff03723f4f9..4e03712726f2 100644 --- a/drivers/net/ethernet/intel/iavf/iavf_main.c +++ b/drivers/net/ethernet/intel/iavf/iavf_main.c @@ -5196,6 +5196,8 @@ static struct pci_driver iavf_driver = { **/ static int __init iavf_init_module(void) { + int ret; + pr_info("iavf: %s\n", iavf_driver_string);
pr_info("%s\n", iavf_copyright); @@ -5206,7 +5208,12 @@ static int __init iavf_init_module(void) pr_err("%s: Failed to create workqueue\n", iavf_driver_name); return -ENOMEM; } - return pci_register_driver(&iavf_driver); + + ret = pci_register_driver(&iavf_driver); + if (ret) + destroy_workqueue(iavf_wq); + + return ret; }
module_init(iavf_init_module);
From: Wang Hai wanghai38@huawei.com
[ Upstream commit 45605c75c52c7ae7bfe902214343aabcfe5ba0ff ]
In e100_xmit_prepare(), if we can't map the skb, then return -ENOMEM, so e100_xmit_frame() will return NETDEV_TX_BUSY and the upper layer will resend the skb. But the skb is already freed, which will cause UAF bug when the upper layer resends the skb.
Remove the harmful free.
Fixes: 5e5d49422dfb ("e100: Release skb when DMA mapping is failed in e100_xmit_prepare") Signed-off-by: Wang Hai wanghai38@huawei.com Reviewed-by: Alexander Duyck alexanderduyck@fb.com Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/e100.c | 5 +---- 1 file changed, 1 insertion(+), 4 deletions(-)
diff --git a/drivers/net/ethernet/intel/e100.c b/drivers/net/ethernet/intel/e100.c index 11a884aa5082..90a2ba20e902 100644 --- a/drivers/net/ethernet/intel/e100.c +++ b/drivers/net/ethernet/intel/e100.c @@ -1741,11 +1741,8 @@ static int e100_xmit_prepare(struct nic *nic, struct cb *cb, dma_addr = dma_map_single(&nic->pdev->dev, skb->data, skb->len, DMA_TO_DEVICE); /* If we can't map the skb, have the upper layer try later */ - if (dma_mapping_error(&nic->pdev->dev, dma_addr)) { - dev_kfree_skb_any(skb); - skb = NULL; + if (dma_mapping_error(&nic->pdev->dev, dma_addr)) return -ENOMEM; - }
/* * Use the last 4 bytes of the SKB payload packet as the CRC, used for
From: YueHaibing yuehaibing@huawei.com
[ Upstream commit 52f7cf70eb8fac6111786c59ae9dfc5cf2bee710 ]
Smatch warns this:
drivers/net/ethernet/mellanox/mlx5/core/steering/dr_table.c:81 mlx5dr_table_set_miss_action() error: uninitialized symbol 'ret'.
Initializing ret with -EOPNOTSUPP and fix missing action case.
Fixes: 7838e1725394 ("net/mlx5: DR, Expose steering table functionality") Signed-off-by: YueHaibing yuehaibing@huawei.com Reviewed-by: Roi Dayan roid@nvidia.com Signed-off-by: Saeed Mahameed saeedm@nvidia.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/mellanox/mlx5/core/steering/dr_table.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_table.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_table.c index 31d443dd8386..f68461b13391 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_table.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_table.c @@ -46,7 +46,7 @@ static int dr_table_set_miss_action_nic(struct mlx5dr_domain *dmn, int mlx5dr_table_set_miss_action(struct mlx5dr_table *tbl, struct mlx5dr_action *action) { - int ret; + int ret = -EOPNOTSUPP;
if (action && action->action_type != DR_ACTION_TYP_FT) return -EOPNOTSUPP; @@ -67,6 +67,9 @@ int mlx5dr_table_set_miss_action(struct mlx5dr_table *tbl, goto out; }
+ if (ret) + goto out; + /* Release old action */ if (tbl->miss_action) refcount_dec(&tbl->miss_action->refcount);
From: Chris Mi cmi@nvidia.com
[ Upstream commit 2318b8bb94a3a21363cd0d49cad5934bd1e2d60e ]
The cited commit removes eswitch mode none. But when disabling sriov in legacy mode or changing from switchdev to legacy mode without sriov enabled, the legacy fdb table is not destroyed.
It is not the right behavior. Destroy legacy fdb table in above two caes.
Fixes: f019679ea5f2 ("net/mlx5: E-switch, Remove dependency between sriov and eswitch mode") Signed-off-by: Chris Mi cmi@nvidia.com Reviewed-by: Roi Dayan roid@nvidia.com Reviewed-by: Eli Cohen elic@nvidia.com Reviewed-by: Mark Bloch mbloch@nvidia.com Reviewed-by: Vlad Buslov vladbu@nvidia.com Signed-off-by: Saeed Mahameed saeedm@nvidia.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/mellanox/mlx5/core/eswitch.c | 3 +++ drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c | 7 +++++++ 2 files changed, 10 insertions(+)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c index 4d8b8f6143cc..59cffa49e4b5 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c @@ -1363,6 +1363,9 @@ void mlx5_eswitch_disable_sriov(struct mlx5_eswitch *esw, bool clear_vf) esw_offloads_del_send_to_vport_meta_rules(esw); devl_rate_nodes_destroy(devlink); } + /* Destroy legacy fdb when disabling sriov in legacy mode. */ + if (esw->mode == MLX5_ESWITCH_LEGACY) + mlx5_eswitch_disable_locked(esw);
esw->esw_funcs.num_vfs = 0;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c index 061ac8799354..11cb7d28e1f8 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c @@ -3270,6 +3270,13 @@ static int esw_offloads_stop(struct mlx5_eswitch *esw, int err;
esw->mode = MLX5_ESWITCH_LEGACY; + + /* If changing from switchdev to legacy mode without sriov enabled, + * no need to create legacy fdb. + */ + if (!mlx5_sriov_is_enabled(esw->dev)) + return 0; + err = mlx5_eswitch_enable_locked(esw, MLX5_ESWITCH_IGNORE_NUM_VFS); if (err) NL_SET_ERR_MSG_MOD(extack, "Failed setting eswitch to legacy");
From: Chris Mi cmi@nvidia.com
[ Upstream commit e87c6a832f889c093c055a30a7b8c6843e6573bf ]
If creating bond first and then enabling sriov in switchdev mode, will hit the following syndrome:
mlx5_core 0000:08:00.0: mlx5_cmd_out_err:778:(pid 25543): CREATE_LAG(0x840) op_mod(0x0) failed, status bad parameter(0x3), syndrome (0x7d49cb), err(-22)
The reason is because the offending patch removes eswitch mode none. In vf lag, the checking of eswitch mode none is replaced by checking if sriov is enabled. But when driver enables sriov, it triggers the bond workqueue task first and then setting sriov number in pci_enable_sriov(). So the check fails.
Fix it by checking if sriov is enabled using eswitch internal counter that is set before triggering the bond workqueue task.
Fixes: f019679ea5f2 ("net/mlx5: E-switch, Remove dependency between sriov and eswitch mode") Signed-off-by: Chris Mi cmi@nvidia.com Reviewed-by: Roi Dayan roid@nvidia.com Reviewed-by: Mark Bloch mbloch@nvidia.com Reviewed-by: Vlad Buslov vladbu@nvidia.com Signed-off-by: Saeed Mahameed saeedm@nvidia.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/mellanox/mlx5/core/eswitch.h | 8 ++++++++ drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c | 5 +++-- 2 files changed, 11 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h index 87ce5a208cb5..5ceed4e6c658 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h @@ -731,6 +731,14 @@ void mlx5_eswitch_offloads_destroy_single_fdb(struct mlx5_eswitch *master_esw, struct mlx5_eswitch *slave_esw); int mlx5_eswitch_reload_reps(struct mlx5_eswitch *esw);
+static inline int mlx5_eswitch_num_vfs(struct mlx5_eswitch *esw) +{ + if (mlx5_esw_allowed(esw)) + return esw->esw_funcs.num_vfs; + + return 0; +} + #else /* CONFIG_MLX5_ESWITCH */ /* eswitch API stubs */ static inline int mlx5_eswitch_init(struct mlx5_core_dev *dev) { return 0; } diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c b/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c index 065102278cb8..a879e0b0f702 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c @@ -649,8 +649,9 @@ static bool mlx5_lag_check_prereq(struct mlx5_lag *ldev)
#ifdef CONFIG_MLX5_ESWITCH dev = ldev->pf[MLX5_LAG_P1].dev; - if ((mlx5_sriov_is_enabled(dev)) && !is_mdev_switchdev_mode(dev)) - return false; + for (i = 0; i < ldev->ports; i++) + if (mlx5_eswitch_num_vfs(dev->priv.eswitch) && !is_mdev_switchdev_mode(dev)) + return false;
mode = mlx5_eswitch_mode(dev); for (i = 0; i < ldev->ports; i++)
From: YueHaibing yuehaibing@huawei.com
[ Upstream commit 3f5769a074c13d8f08455e40586600419e02a880 ]
If sscanf() return 0, outlen is uninitialized and used in kzalloc(), this is unexpected. We should return -EINVAL if the string is invalid.
Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters") Signed-off-by: YueHaibing yuehaibing@huawei.com Reviewed-by: Leon Romanovsky leonro@nvidia.com Signed-off-by: Saeed Mahameed saeedm@nvidia.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/mellanox/mlx5/core/cmd.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c index 74bd05e5dda2..e7a894ba5c3e 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c @@ -1497,8 +1497,8 @@ static ssize_t outlen_write(struct file *filp, const char __user *buf, return -EFAULT;
err = sscanf(outlen_str, "%d", &outlen); - if (err < 0) - return err; + if (err != 1) + return -EINVAL;
ptr = kzalloc(outlen, GFP_KERNEL); if (!ptr)
From: Roi Dayan roid@nvidia.com
[ Upstream commit 52c795af04441d76f565c4634f893e5b553df2ae ]
When having multiple dests with termination tables and second one or afterwards fails the driver reverts usage of term tables but doesn't reset the assignment in attr->dests[num_vport_dests].termtbl which case a use-after-free when releasing the rule. Fix by resetting the assignment of termtbl to null.
Fixes: 10caabdaad5a ("net/mlx5e: Use termination table for VLAN push actions") Signed-off-by: Roi Dayan roid@nvidia.com Reviewed-by: Maor Dickman maord@nvidia.com Signed-off-by: Saeed Mahameed saeedm@nvidia.com Signed-off-by: Sasha Levin sashal@kernel.org --- .../net/ethernet/mellanox/mlx5/core/eswitch_offloads_termtbl.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads_termtbl.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads_termtbl.c index 108a3503f413..edd910258314 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads_termtbl.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads_termtbl.c @@ -312,6 +312,8 @@ mlx5_eswitch_add_termtbl_rule(struct mlx5_eswitch *esw, for (curr_dest = 0; curr_dest < num_vport_dests; curr_dest++) { struct mlx5_termtbl_handle *tt = attr->dests[curr_dest].termtbl;
+ attr->dests[curr_dest].termtbl = NULL; + /* search for the destination associated with the * current term table */
From: Zhang Changzhong zhangchangzhong@huawei.com
[ Upstream commit 92dfd9310a71d28cefe6a2d5174d43fab240e631 ]
Add the missing free_sja1000dev() before return from sja1000_isa_probe() in the register_sja1000dev() error handling case.
In addition, remove blanks before goto labels.
Fixes: 2a6ba39ad6a2 ("can: sja1000: legacy SJA1000 ISA bus driver") Signed-off-by: Zhang Changzhong zhangchangzhong@huawei.com Link: https://lore.kernel.org/all/1668168521-5540-1-git-send-email-zhangchangzhong... Signed-off-by: Marc Kleine-Budde mkl@pengutronix.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/can/sja1000/sja1000_isa.c | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/drivers/net/can/sja1000/sja1000_isa.c b/drivers/net/can/sja1000/sja1000_isa.c index d513fac50718..db3e767d5320 100644 --- a/drivers/net/can/sja1000/sja1000_isa.c +++ b/drivers/net/can/sja1000/sja1000_isa.c @@ -202,22 +202,24 @@ static int sja1000_isa_probe(struct platform_device *pdev) if (err) { dev_err(&pdev->dev, "registering %s failed (err=%d)\n", DRV_NAME, err); - goto exit_unmap; + goto exit_free; }
dev_info(&pdev->dev, "%s device registered (reg_base=0x%p, irq=%d)\n", DRV_NAME, priv->reg_base, dev->irq); return 0;
- exit_unmap: +exit_free: + free_sja1000dev(dev); +exit_unmap: if (mem[idx]) iounmap(base); - exit_release: +exit_release: if (mem[idx]) release_mem_region(mem[idx], iosize); else release_region(port[idx], iosize); - exit: +exit: return err; }
From: Zhang Changzhong zhangchangzhong@huawei.com
[ Upstream commit 62ec89e74099a3d6995988ed9f2f996b368417ec ]
Add the missing free_cc770dev() before return from cc770_isa_probe() in the register_cc770dev() error handling case.
In addition, remove blanks before goto labels.
Fixes: 7e02e5433e00 ("can: cc770: legacy CC770 ISA bus driver") Signed-off-by: Zhang Changzhong zhangchangzhong@huawei.com Link: https://lore.kernel.org/all/1668168557-6024-1-git-send-email-zhangchangzhong... Signed-off-by: Marc Kleine-Budde mkl@pengutronix.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/can/cc770/cc770_isa.c | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/drivers/net/can/cc770/cc770_isa.c b/drivers/net/can/cc770/cc770_isa.c index 194c86e0f340..8f6dccd5a587 100644 --- a/drivers/net/can/cc770/cc770_isa.c +++ b/drivers/net/can/cc770/cc770_isa.c @@ -264,22 +264,24 @@ static int cc770_isa_probe(struct platform_device *pdev) if (err) { dev_err(&pdev->dev, "couldn't register device (err=%d)\n", err); - goto exit_unmap; + goto exit_free; }
dev_info(&pdev->dev, "device registered (reg_base=0x%p, irq=%d)\n", priv->reg_base, dev->irq); return 0;
- exit_unmap: +exit_free: + free_cc770dev(dev); +exit_unmap: if (mem[idx]) iounmap(base); - exit_release: +exit_release: if (mem[idx]) release_mem_region(mem[idx], iosize); else release_region(port[idx], iosize); - exit: +exit: return err; }
From: Zhang Changzhong zhangchangzhong@huawei.com
[ Upstream commit 709cb2f9ed2006eb1dc4b36b99d601cd24889ec4 ]
In case of register_candev() fails, clear es58x_dev->netdev[channel_idx] and add free_candev(). Otherwise es58x_free_netdevs() will unregister the netdev that has never been registered.
Fixes: 8537257874e9 ("can: etas_es58x: add core support for ETAS ES58X CAN USB interfaces") Signed-off-by: Zhang Changzhong zhangchangzhong@huawei.com Acked-by: Arunachalam Santhanam Arunachalam.Santhanam@in.bosch.com Acked-by: Vincent Mailhol mailhol.vincent@wanadoo.fr Link: https://lore.kernel.org/all/1668413685-23354-1-git-send-email-zhangchangzhon... Signed-off-by: Marc Kleine-Budde mkl@pengutronix.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/can/usb/etas_es58x/es58x_core.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/net/can/usb/etas_es58x/es58x_core.c b/drivers/net/can/usb/etas_es58x/es58x_core.c index 25f863b4f5f0..ddb7c5735c9a 100644 --- a/drivers/net/can/usb/etas_es58x/es58x_core.c +++ b/drivers/net/can/usb/etas_es58x/es58x_core.c @@ -2091,8 +2091,11 @@ static int es58x_init_netdev(struct es58x_device *es58x_dev, int channel_idx) netdev->dev_port = channel_idx;
ret = register_candev(netdev); - if (ret) + if (ret) { + es58x_dev->netdev[channel_idx] = NULL; + free_candev(netdev); return ret; + }
netdev_queue_set_dql_min_limit(netdev_get_tx_queue(netdev, 0), es58x_dev->param->dql_min_limit);
From: Zhang Changzhong zhangchangzhong@huawei.com
[ Upstream commit 1eca1d4cc21b6d0fc5f9a390339804c0afce9439 ]
In m_can_pci_remove() and error handling path of m_can_pci_probe(), m_can_class_free_dev() should be called to free resource allocated by m_can_class_allocate_dev(), otherwise there will be memleak.
Fixes: cab7ffc0324f ("can: m_can: add PCI glue driver for Intel Elkhart Lake") Signed-off-by: Zhang Changzhong zhangchangzhong@huawei.com Reviewed-by: Jarkko Nikula jarkko.nikula@linux.intel.com Link: https://lore.kernel.org/all/1668168684-6390-1-git-send-email-zhangchangzhong... Signed-off-by: Marc Kleine-Budde mkl@pengutronix.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/can/m_can/m_can_pci.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/drivers/net/can/m_can/m_can_pci.c b/drivers/net/can/m_can/m_can_pci.c index 8f184a852a0a..f2219aa2824b 100644 --- a/drivers/net/can/m_can/m_can_pci.c +++ b/drivers/net/can/m_can/m_can_pci.c @@ -120,7 +120,7 @@ static int m_can_pci_probe(struct pci_dev *pci, const struct pci_device_id *id)
ret = pci_alloc_irq_vectors(pci, 1, 1, PCI_IRQ_ALL_TYPES); if (ret < 0) - return ret; + goto err_free_dev;
mcan_class->dev = &pci->dev; mcan_class->net->irq = pci_irq_vector(pci, 0); @@ -132,7 +132,7 @@ static int m_can_pci_probe(struct pci_dev *pci, const struct pci_device_id *id)
ret = m_can_class_register(mcan_class); if (ret) - goto err; + goto err_free_irq;
/* Enable interrupt control at CAN wrapper IP */ writel(0x1, base + CTL_CSR_INT_CTL_OFFSET); @@ -144,8 +144,10 @@ static int m_can_pci_probe(struct pci_dev *pci, const struct pci_device_id *id)
return 0;
-err: +err_free_irq: pci_free_irq_vectors(pci); +err_free_dev: + m_can_class_free_dev(mcan_class->net); return ret; }
@@ -161,6 +163,7 @@ static void m_can_pci_remove(struct pci_dev *pci) writel(0x0, priv->base + CTL_CSR_INT_CTL_OFFSET);
m_can_class_unregister(mcan_class); + m_can_class_free_dev(mcan_class->net); pci_free_irq_vectors(pci); }
From: Jiasheng Jiang jiasheng@iscas.ac.cn
[ Upstream commit 68b4f9e0bdd0f920d7303d07bfe226cd0976961d ]
Since the devm_clk_get may return error, it should be better to add check for the cdev->hclk, as same as cdev->cclk.
Fixes: f524f829b75a ("can: m_can: Create a m_can platform framework") Signed-off-by: Jiasheng Jiang jiasheng@iscas.ac.cn Link: https://lore.kernel.org/all/20221123063651.26199-1-jiasheng@iscas.ac.cn Signed-off-by: Marc Kleine-Budde mkl@pengutronix.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/can/m_can/m_can.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/can/m_can/m_can.c b/drivers/net/can/m_can/m_can.c index 4dc67fdfcdb9..153d8fd08bd8 100644 --- a/drivers/net/can/m_can/m_can.c +++ b/drivers/net/can/m_can/m_can.c @@ -1910,7 +1910,7 @@ int m_can_class_get_clocks(struct m_can_classdev *cdev) cdev->hclk = devm_clk_get(cdev->dev, "hclk"); cdev->cclk = devm_clk_get(cdev->dev, "cclk");
- if (IS_ERR(cdev->cclk)) { + if (IS_ERR(cdev->hclk) || IS_ERR(cdev->cclk)) { dev_err(cdev->dev, "no clock found\n"); ret = -ENODEV; }
From: Amir Goldstein amir73il@gmail.com
[ Upstream commit 10bc8e4af65946b727728d7479c028742321b60a ]
Commit 868f9f2f8e00 ("vfs: fix copy_file_range() regression in cross-fs copies") removed fallback to generic_copy_file_range() for cross-fs cases inside vfs_copy_file_range().
To preserve behavior of nfsd and ksmbd server-side-copy, the fallback to generic_copy_file_range() was added in nfsd and ksmbd code, but that call is missing sb_start_write(), fsnotify hooks and more.
Ideally, nfsd and ksmbd would pass a flag to vfs_copy_file_range() that will take care of the fallback, but that code would be subtle and we got vfs_copy_file_range() logic wrong too many times already.
Instead, add a flag to explicitly request vfs_copy_file_range() to perform only generic_copy_file_range() and let nfsd and ksmbd use this flag only in the fallback path.
This choise keeps the logic changes to minimum in the non-nfsd/ksmbd code paths to reduce the risk of further regressions.
Fixes: 868f9f2f8e00 ("vfs: fix copy_file_range() regression in cross-fs copies") Tested-by: Namjae Jeon linkinjeon@kernel.org Tested-by: Luis Henriques lhenriques@suse.de Signed-off-by: Amir Goldstein amir73il@gmail.com Signed-off-by: Al Viro viro@zeniv.linux.org.uk Signed-off-by: Sasha Levin sashal@kernel.org --- fs/ksmbd/vfs.c | 6 +++--- fs/nfsd/vfs.c | 4 ++-- fs/read_write.c | 19 +++++++++++++++---- include/linux/fs.h | 8 ++++++++ 4 files changed, 28 insertions(+), 9 deletions(-)
diff --git a/fs/ksmbd/vfs.c b/fs/ksmbd/vfs.c index 78d01033604c..c5c801e38b63 100644 --- a/fs/ksmbd/vfs.c +++ b/fs/ksmbd/vfs.c @@ -1784,9 +1784,9 @@ int ksmbd_vfs_copy_file_ranges(struct ksmbd_work *work, ret = vfs_copy_file_range(src_fp->filp, src_off, dst_fp->filp, dst_off, len, 0); if (ret == -EOPNOTSUPP || ret == -EXDEV) - ret = generic_copy_file_range(src_fp->filp, src_off, - dst_fp->filp, dst_off, - len, 0); + ret = vfs_copy_file_range(src_fp->filp, src_off, + dst_fp->filp, dst_off, len, + COPY_FILE_SPLICE); if (ret < 0) return ret;
diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c index f3cd614e1f1e..dc24d67d0ca4 100644 --- a/fs/nfsd/vfs.c +++ b/fs/nfsd/vfs.c @@ -572,8 +572,8 @@ ssize_t nfsd_copy_file_range(struct file *src, u64 src_pos, struct file *dst, ret = vfs_copy_file_range(src, src_pos, dst, dst_pos, count, 0);
if (ret == -EOPNOTSUPP || ret == -EXDEV) - ret = generic_copy_file_range(src, src_pos, dst, dst_pos, - count, 0); + ret = vfs_copy_file_range(src, src_pos, dst, dst_pos, count, + COPY_FILE_SPLICE); return ret; }
diff --git a/fs/read_write.c b/fs/read_write.c index 328ce8cf9a85..24b9668d6377 100644 --- a/fs/read_write.c +++ b/fs/read_write.c @@ -1388,6 +1388,8 @@ ssize_t generic_copy_file_range(struct file *file_in, loff_t pos_in, struct file *file_out, loff_t pos_out, size_t len, unsigned int flags) { + lockdep_assert(sb_write_started(file_inode(file_out)->i_sb)); + return do_splice_direct(file_in, &pos_in, file_out, &pos_out, len > MAX_RW_COUNT ? MAX_RW_COUNT : len, 0); } @@ -1424,7 +1426,9 @@ static int generic_copy_file_checks(struct file *file_in, loff_t pos_in, * and several different sets of file_operations, but they all end up * using the same ->copy_file_range() function pointer. */ - if (file_out->f_op->copy_file_range) { + if (flags & COPY_FILE_SPLICE) { + /* cross sb splice is allowed */ + } else if (file_out->f_op->copy_file_range) { if (file_in->f_op->copy_file_range != file_out->f_op->copy_file_range) return -EXDEV; @@ -1474,8 +1478,9 @@ ssize_t vfs_copy_file_range(struct file *file_in, loff_t pos_in, size_t len, unsigned int flags) { ssize_t ret; + bool splice = flags & COPY_FILE_SPLICE;
- if (flags != 0) + if (flags & ~COPY_FILE_SPLICE) return -EINVAL;
ret = generic_copy_file_checks(file_in, pos_in, file_out, pos_out, &len, @@ -1501,14 +1506,14 @@ ssize_t vfs_copy_file_range(struct file *file_in, loff_t pos_in, * same sb using clone, but for filesystems where both clone and copy * are supported (e.g. nfs,cifs), we only call the copy method. */ - if (file_out->f_op->copy_file_range) { + if (!splice && file_out->f_op->copy_file_range) { ret = file_out->f_op->copy_file_range(file_in, pos_in, file_out, pos_out, len, flags); goto done; }
- if (file_in->f_op->remap_file_range && + if (!splice && file_in->f_op->remap_file_range && file_inode(file_in)->i_sb == file_inode(file_out)->i_sb) { ret = file_in->f_op->remap_file_range(file_in, pos_in, file_out, pos_out, @@ -1528,6 +1533,8 @@ ssize_t vfs_copy_file_range(struct file *file_in, loff_t pos_in, * consistent story about which filesystems support copy_file_range() * and which filesystems do not, that will allow userspace tools to * make consistent desicions w.r.t using copy_file_range(). + * + * We also get here if caller (e.g. nfsd) requested COPY_FILE_SPLICE. */ ret = generic_copy_file_range(file_in, pos_in, file_out, pos_out, len, flags); @@ -1582,6 +1589,10 @@ SYSCALL_DEFINE6(copy_file_range, int, fd_in, loff_t __user *, off_in, pos_out = f_out.file->f_pos; }
+ ret = -EINVAL; + if (flags != 0) + goto out; + ret = vfs_copy_file_range(f_in.file, pos_in, f_out.file, pos_out, len, flags); if (ret > 0) { diff --git a/include/linux/fs.h b/include/linux/fs.h index 7203f5582fd4..be074b6895b9 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -2087,6 +2087,14 @@ struct dir_context { */ #define REMAP_FILE_ADVISORY (REMAP_FILE_CAN_SHORTEN)
+/* + * These flags control the behavior of vfs_copy_file_range(). + * They are not available to the user via syscall. + * + * COPY_FILE_SPLICE: call splice direct instead of fs clone/copy ops + */ +#define COPY_FILE_SPLICE (1 << 0) + struct iov_iter; struct io_uring_cmd;
On Mon, Dec 5, 2022 at 9:24 PM Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
From: Amir Goldstein amir73il@gmail.com
[ Upstream commit 10bc8e4af65946b727728d7479c028742321b60a ]
Commit 868f9f2f8e00 ("vfs: fix copy_file_range() regression in cross-fs copies") removed fallback to generic_copy_file_range() for cross-fs cases inside vfs_copy_file_range().
Hi Greg,
The regressing commit is in v5.15.53. Please apply this fix to 5.15.y.
I will test and post backports of cross-fs copy_file_range() fixes to pre 5.15 kernels. See: https://bugzilla.kernel.org/show_bug.cgi?id=216800
Thanks, Amir.
To preserve behavior of nfsd and ksmbd server-side-copy, the fallback to generic_copy_file_range() was added in nfsd and ksmbd code, but that call is missing sb_start_write(), fsnotify hooks and more.
Ideally, nfsd and ksmbd would pass a flag to vfs_copy_file_range() that will take care of the fallback, but that code would be subtle and we got vfs_copy_file_range() logic wrong too many times already.
Instead, add a flag to explicitly request vfs_copy_file_range() to perform only generic_copy_file_range() and let nfsd and ksmbd use this flag only in the fallback path.
This choise keeps the logic changes to minimum in the non-nfsd/ksmbd code paths to reduce the risk of further regressions.
Fixes: 868f9f2f8e00 ("vfs: fix copy_file_range() regression in cross-fs copies") Tested-by: Namjae Jeon linkinjeon@kernel.org Tested-by: Luis Henriques lhenriques@suse.de Signed-off-by: Amir Goldstein amir73il@gmail.com Signed-off-by: Al Viro viro@zeniv.linux.org.uk Signed-off-by: Sasha Levin sashal@kernel.org
fs/ksmbd/vfs.c | 6 +++--- fs/nfsd/vfs.c | 4 ++-- fs/read_write.c | 19 +++++++++++++++---- include/linux/fs.h | 8 ++++++++ 4 files changed, 28 insertions(+), 9 deletions(-)
diff --git a/fs/ksmbd/vfs.c b/fs/ksmbd/vfs.c index 78d01033604c..c5c801e38b63 100644 --- a/fs/ksmbd/vfs.c +++ b/fs/ksmbd/vfs.c @@ -1784,9 +1784,9 @@ int ksmbd_vfs_copy_file_ranges(struct ksmbd_work *work, ret = vfs_copy_file_range(src_fp->filp, src_off, dst_fp->filp, dst_off, len, 0); if (ret == -EOPNOTSUPP || ret == -EXDEV)
ret = generic_copy_file_range(src_fp->filp, src_off,
dst_fp->filp, dst_off,
len, 0);
ret = vfs_copy_file_range(src_fp->filp, src_off,
dst_fp->filp, dst_off, len,
COPY_FILE_SPLICE); if (ret < 0) return ret;
diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c index f3cd614e1f1e..dc24d67d0ca4 100644 --- a/fs/nfsd/vfs.c +++ b/fs/nfsd/vfs.c @@ -572,8 +572,8 @@ ssize_t nfsd_copy_file_range(struct file *src, u64 src_pos, struct file *dst, ret = vfs_copy_file_range(src, src_pos, dst, dst_pos, count, 0);
if (ret == -EOPNOTSUPP || ret == -EXDEV)
ret = generic_copy_file_range(src, src_pos, dst, dst_pos,
count, 0);
ret = vfs_copy_file_range(src, src_pos, dst, dst_pos, count,
COPY_FILE_SPLICE); return ret;
}
diff --git a/fs/read_write.c b/fs/read_write.c index 328ce8cf9a85..24b9668d6377 100644 --- a/fs/read_write.c +++ b/fs/read_write.c @@ -1388,6 +1388,8 @@ ssize_t generic_copy_file_range(struct file *file_in, loff_t pos_in, struct file *file_out, loff_t pos_out, size_t len, unsigned int flags) {
lockdep_assert(sb_write_started(file_inode(file_out)->i_sb));
return do_splice_direct(file_in, &pos_in, file_out, &pos_out, len > MAX_RW_COUNT ? MAX_RW_COUNT : len, 0);
} @@ -1424,7 +1426,9 @@ static int generic_copy_file_checks(struct file *file_in, loff_t pos_in, * and several different sets of file_operations, but they all end up * using the same ->copy_file_range() function pointer. */
if (file_out->f_op->copy_file_range) {
if (flags & COPY_FILE_SPLICE) {
/* cross sb splice is allowed */
} else if (file_out->f_op->copy_file_range) { if (file_in->f_op->copy_file_range != file_out->f_op->copy_file_range) return -EXDEV;
@@ -1474,8 +1478,9 @@ ssize_t vfs_copy_file_range(struct file *file_in, loff_t pos_in, size_t len, unsigned int flags) { ssize_t ret;
bool splice = flags & COPY_FILE_SPLICE;
if (flags != 0)
if (flags & ~COPY_FILE_SPLICE) return -EINVAL; ret = generic_copy_file_checks(file_in, pos_in, file_out, pos_out, &len,
@@ -1501,14 +1506,14 @@ ssize_t vfs_copy_file_range(struct file *file_in, loff_t pos_in, * same sb using clone, but for filesystems where both clone and copy * are supported (e.g. nfs,cifs), we only call the copy method. */
if (file_out->f_op->copy_file_range) {
if (!splice && file_out->f_op->copy_file_range) { ret = file_out->f_op->copy_file_range(file_in, pos_in, file_out, pos_out, len, flags); goto done; }
if (file_in->f_op->remap_file_range &&
if (!splice && file_in->f_op->remap_file_range && file_inode(file_in)->i_sb == file_inode(file_out)->i_sb) { ret = file_in->f_op->remap_file_range(file_in, pos_in, file_out, pos_out,
@@ -1528,6 +1533,8 @@ ssize_t vfs_copy_file_range(struct file *file_in, loff_t pos_in, * consistent story about which filesystems support copy_file_range() * and which filesystems do not, that will allow userspace tools to * make consistent desicions w.r.t using copy_file_range().
*
* We also get here if caller (e.g. nfsd) requested COPY_FILE_SPLICE. */ ret = generic_copy_file_range(file_in, pos_in, file_out, pos_out, len, flags);
@@ -1582,6 +1589,10 @@ SYSCALL_DEFINE6(copy_file_range, int, fd_in, loff_t __user *, off_in, pos_out = f_out.file->f_pos; }
ret = -EINVAL;
if (flags != 0)
goto out;
ret = vfs_copy_file_range(f_in.file, pos_in, f_out.file, pos_out, len, flags); if (ret > 0) {
diff --git a/include/linux/fs.h b/include/linux/fs.h index 7203f5582fd4..be074b6895b9 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -2087,6 +2087,14 @@ struct dir_context { */ #define REMAP_FILE_ADVISORY (REMAP_FILE_CAN_SHORTEN)
+/*
- These flags control the behavior of vfs_copy_file_range().
- They are not available to the user via syscall.
- COPY_FILE_SPLICE: call splice direct instead of fs clone/copy ops
- */
+#define COPY_FILE_SPLICE (1 << 0)
struct iov_iter; struct io_uring_cmd;
-- 2.35.1
On Tue, Dec 13, 2022 at 10:03:02AM +0200, Amir Goldstein wrote:
On Mon, Dec 5, 2022 at 9:24 PM Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
From: Amir Goldstein amir73il@gmail.com
[ Upstream commit 10bc8e4af65946b727728d7479c028742321b60a ]
Commit 868f9f2f8e00 ("vfs: fix copy_file_range() regression in cross-fs copies") removed fallback to generic_copy_file_range() for cross-fs cases inside vfs_copy_file_range().
Hi Greg,
The regressing commit is in v5.15.53. Please apply this fix to 5.15.y.
This commit does not apply to 5.15.y as-is (breaks the build), can you provide a working backport?
thanks,
greg k-h
On Wed, Dec 14, 2022 at 5:58 PM Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
On Tue, Dec 13, 2022 at 10:03:02AM +0200, Amir Goldstein wrote:
On Mon, Dec 5, 2022 at 9:24 PM Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
From: Amir Goldstein amir73il@gmail.com
[ Upstream commit 10bc8e4af65946b727728d7479c028742321b60a ]
Commit 868f9f2f8e00 ("vfs: fix copy_file_range() regression in cross-fs copies") removed fallback to generic_copy_file_range() for cross-fs cases inside vfs_copy_file_range().
Hi Greg,
The regressing commit is in v5.15.53. Please apply this fix to 5.15.y.
This commit does not apply to 5.15.y as-is (breaks the build),
Sorry. compiled without lockdep.
can you provide a working backport?
Patch attached with lockdep assert removed.
Thanks, Amir.
On Wed, Dec 14, 2022 at 07:21:47PM +0200, Amir Goldstein wrote:
On Wed, Dec 14, 2022 at 5:58 PM Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
On Tue, Dec 13, 2022 at 10:03:02AM +0200, Amir Goldstein wrote:
On Mon, Dec 5, 2022 at 9:24 PM Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
From: Amir Goldstein amir73il@gmail.com
[ Upstream commit 10bc8e4af65946b727728d7479c028742321b60a ]
Commit 868f9f2f8e00 ("vfs: fix copy_file_range() regression in cross-fs copies") removed fallback to generic_copy_file_range() for cross-fs cases inside vfs_copy_file_range().
Hi Greg,
The regressing commit is in v5.15.53. Please apply this fix to 5.15.y.
This commit does not apply to 5.15.y as-is (breaks the build),
Sorry. compiled without lockdep.
can you provide a working backport?
Patch attached with lockdep assert removed.
thanks, that worked, now queued up.
greg k-h
From: Duoming Zhou duoming@zju.edu.cn
[ Upstream commit 8dbd6e4ce1b9c527921643d9e34f188a10d4e893 ]
The watchdog timer is used to monitor whether the process of transmitting data is timeout. If we use qlcnic driver, the dev_watchdog() that is the timer handler of watchdog timer will call qlcnic_tx_timeout() to process the timeout. But the qlcnic_tx_timeout() calls msleep(), as a result, the sleep-in-atomic-context bugs will happen. The processes are shown below:
(atomic context) dev_watchdog qlcnic_tx_timeout qlcnic_83xx_idc_request_reset qlcnic_83xx_lock_driver msleep
---------------------------
(atomic context) dev_watchdog qlcnic_tx_timeout qlcnic_83xx_idc_request_reset qlcnic_83xx_lock_driver qlcnic_83xx_recover_driver_lock msleep
Fix by changing msleep() to mdelay(), the mdelay() is busy-waiting and the bugs could be mitigated.
Fixes: 629263acaea3 ("qlcnic: 83xx CNA inter driver communication mechanism") Signed-off-by: Duoming Zhou duoming@zju.edu.cn Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.c b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.c index bd0607680329..2fd5c6fdb500 100644 --- a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.c +++ b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.c @@ -2991,7 +2991,7 @@ static void qlcnic_83xx_recover_driver_lock(struct qlcnic_adapter *adapter) QLCWRX(adapter->ahw, QLC_83XX_RECOVER_DRV_LOCK, val); dev_info(&adapter->pdev->dev, "%s: lock recovery initiated\n", __func__); - msleep(QLC_83XX_DRV_LOCK_RECOVERY_DELAY); + mdelay(QLC_83XX_DRV_LOCK_RECOVERY_DELAY); val = QLCRDX(adapter->ahw, QLC_83XX_RECOVER_DRV_LOCK); id = ((val >> 2) & 0xF); if (id == adapter->portnum) { @@ -3027,7 +3027,7 @@ int qlcnic_83xx_lock_driver(struct qlcnic_adapter *adapter) if (status) break;
- msleep(QLC_83XX_DRV_LOCK_WAIT_DELAY); + mdelay(QLC_83XX_DRV_LOCK_WAIT_DELAY); i++;
if (i == 1)
From: Izabela Bakollari ibakolla@redhat.com
[ Upstream commit 2a83891130512dafb321418a8e7c9c09268d8c59 ]
IPV6 addresses are purged when setting the number of rx/tx rings using ethtool -G. The function aq_set_ringparam calls dev_close, which removes the addresses. As a solution, call an internal function (aq_ndev_close).
Fixes: c1af5427954b ("net: aquantia: Ethtool based ring size configuration") Signed-off-by: Izabela Bakollari ibakolla@redhat.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/aquantia/atlantic/aq_ethtool.c | 5 +++-- drivers/net/ethernet/aquantia/atlantic/aq_main.c | 4 ++-- drivers/net/ethernet/aquantia/atlantic/aq_main.h | 2 ++ 3 files changed, 7 insertions(+), 4 deletions(-)
diff --git a/drivers/net/ethernet/aquantia/atlantic/aq_ethtool.c b/drivers/net/ethernet/aquantia/atlantic/aq_ethtool.c index 1daecd483b8d..9c1378c22a8e 100644 --- a/drivers/net/ethernet/aquantia/atlantic/aq_ethtool.c +++ b/drivers/net/ethernet/aquantia/atlantic/aq_ethtool.c @@ -13,6 +13,7 @@ #include "aq_ptp.h" #include "aq_filters.h" #include "aq_macsec.h" +#include "aq_main.h"
#include <linux/ptp_clock_kernel.h>
@@ -858,7 +859,7 @@ static int aq_set_ringparam(struct net_device *ndev,
if (netif_running(ndev)) { ndev_running = true; - dev_close(ndev); + aq_ndev_close(ndev); }
cfg->rxds = max(ring->rx_pending, hw_caps->rxds_min); @@ -874,7 +875,7 @@ static int aq_set_ringparam(struct net_device *ndev, goto err_exit;
if (ndev_running) - err = dev_open(ndev, NULL); + err = aq_ndev_open(ndev);
err_exit: return err; diff --git a/drivers/net/ethernet/aquantia/atlantic/aq_main.c b/drivers/net/ethernet/aquantia/atlantic/aq_main.c index 8a0af371e7dc..77609dc0a08d 100644 --- a/drivers/net/ethernet/aquantia/atlantic/aq_main.c +++ b/drivers/net/ethernet/aquantia/atlantic/aq_main.c @@ -58,7 +58,7 @@ struct net_device *aq_ndev_alloc(void) return ndev; }
-static int aq_ndev_open(struct net_device *ndev) +int aq_ndev_open(struct net_device *ndev) { struct aq_nic_s *aq_nic = netdev_priv(ndev); int err = 0; @@ -88,7 +88,7 @@ static int aq_ndev_open(struct net_device *ndev) return err; }
-static int aq_ndev_close(struct net_device *ndev) +int aq_ndev_close(struct net_device *ndev) { struct aq_nic_s *aq_nic = netdev_priv(ndev); int err = 0; diff --git a/drivers/net/ethernet/aquantia/atlantic/aq_main.h b/drivers/net/ethernet/aquantia/atlantic/aq_main.h index 99870865f66d..a78c1a168d8e 100644 --- a/drivers/net/ethernet/aquantia/atlantic/aq_main.h +++ b/drivers/net/ethernet/aquantia/atlantic/aq_main.h @@ -16,5 +16,7 @@ DECLARE_STATIC_KEY_FALSE(aq_xdp_locking_key);
void aq_ndev_schedule_work(struct work_struct *work); struct net_device *aq_ndev_alloc(void); +int aq_ndev_open(struct net_device *ndev); +int aq_ndev_close(struct net_device *ndev);
#endif /* AQ_MAIN_H */
From: Johannes Berg johannes.berg@intel.com
[ Upstream commit 9f16b5c82a025cd4c864737409234ddc44fb166a ]
For vendor elements, the code here assumes that 5 octets are present without checking. Since the element itself is already checked to fit, we only need to check the length.
Reported-and-tested-by: Sönke Huster shuster@seemoo.tu-darmstadt.de Fixes: 0b8fb8235be8 ("cfg80211: Parsing of Multiple BSSID information in scanning") Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/wireless/scan.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/net/wireless/scan.c b/net/wireless/scan.c index 9067e4b70855..56db0f12ca7c 100644 --- a/net/wireless/scan.c +++ b/net/wireless/scan.c @@ -330,7 +330,8 @@ static size_t cfg80211_gen_new_ie(const u8 *ie, size_t ielen, * determine if they are the same ie. */ if (tmp_old[0] == WLAN_EID_VENDOR_SPECIFIC) { - if (!memcmp(tmp_old + 2, tmp + 2, 5)) { + if (tmp_old[1] >= 5 && tmp[1] >= 5 && + !memcmp(tmp_old + 2, tmp + 2, 5)) { /* same vendor ie, copy from * subelement */
From: Johannes Berg johannes.berg@intel.com
[ Upstream commit acd3c92acc7aaec50a94d0a7faf7ccd74e952493 ]
In S1G beacon frames there shouldn't be multi-BSSID elements since that's not supported, remove that to avoid a potential integer underflow and/or misparsing the frames due to the different length of the fixed part of the frame.
While at it, initialize non_tx_data so we don't send garbage values to the user (even if it doesn't seem to matter now.)
Reported-and-tested-by: Sönke Huster shuster@seemoo.tu-darmstadt.de Fixes: 9eaffe5078ca ("cfg80211: convert S1G beacon to scan results") Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/wireless/scan.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/net/wireless/scan.c b/net/wireless/scan.c index 56db0f12ca7c..b4d788572992 100644 --- a/net/wireless/scan.c +++ b/net/wireless/scan.c @@ -2527,10 +2527,15 @@ cfg80211_inform_bss_frame_data(struct wiphy *wiphy, const struct cfg80211_bss_ies *ies1, *ies2; size_t ielen = len - offsetof(struct ieee80211_mgmt, u.probe_resp.variable); - struct cfg80211_non_tx_bss non_tx_data; + struct cfg80211_non_tx_bss non_tx_data = {};
res = cfg80211_inform_single_bss_frame_data(wiphy, data, mgmt, len, gfp); + + /* don't do any further MBSSID handling for S1G */ + if (ieee80211_is_s1g_beacon(mgmt->frame_control)) + return res; + if (!res || !wiphy->support_mbssid || !cfg80211_find_elem(WLAN_EID_MULTIPLE_BSSID, ie, ielen)) return res;
From: Lorenzo Bianconi lorenzo@kernel.org
[ Upstream commit 3e8f7abcc3473bc9603323803aeaed4ffcc3a2ab ]
Fix possible out-of-bound access in ieee80211_get_rate_duration routine as reported by the following UBSAN report:
UBSAN: array-index-out-of-bounds in net/mac80211/airtime.c:455:47 index 15 is out of range for type 'u16 [12]' CPU: 2 PID: 217 Comm: kworker/u32:10 Not tainted 6.1.0-060100rc3-generic Hardware name: Acer Aspire TC-281/Aspire TC-281, BIOS R01-A2 07/18/2017 Workqueue: mt76 mt76u_tx_status_data [mt76_usb] Call Trace: <TASK> show_stack+0x4e/0x61 dump_stack_lvl+0x4a/0x6f dump_stack+0x10/0x18 ubsan_epilogue+0x9/0x43 __ubsan_handle_out_of_bounds.cold+0x42/0x47 ieee80211_get_rate_duration.constprop.0+0x22f/0x2a0 [mac80211] ? ieee80211_tx_status_ext+0x32e/0x640 [mac80211] ieee80211_calc_rx_airtime+0xda/0x120 [mac80211] ieee80211_calc_tx_airtime+0xb4/0x100 [mac80211] mt76x02_send_tx_status+0x266/0x480 [mt76x02_lib] mt76x02_tx_status_data+0x52/0x80 [mt76x02_lib] mt76u_tx_status_data+0x67/0xd0 [mt76_usb] process_one_work+0x225/0x400 worker_thread+0x50/0x3e0 ? process_one_work+0x400/0x400 kthread+0xe9/0x110 ? kthread_complete_and_exit+0x20/0x20 ret_from_fork+0x22/0x30
Fixes: db3e1c40cf2f ("mac80211: Import airtime calculation code from mt76") Signed-off-by: Lorenzo Bianconi lorenzo@kernel.org Acked-by: Toke Høiland-Jørgensen toke@toke.dk Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/mac80211/airtime.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/net/mac80211/airtime.c b/net/mac80211/airtime.c index 2e66598fac79..e8ebd343e2bf 100644 --- a/net/mac80211/airtime.c +++ b/net/mac80211/airtime.c @@ -452,6 +452,9 @@ static u32 ieee80211_get_rate_duration(struct ieee80211_hw *hw, (status->encoding == RX_ENC_HE && streams > 8))) return 0;
+ if (idx >= MCS_GROUP_RATES) + return 0; + duration = airtime_mcs_groups[group].duration[idx]; duration <<= airtime_mcs_groups[group].shift; *overhead = 36 + (streams << 2);
From: Yang Yingliang yangyingliang@huawei.com
[ Upstream commit 369eb2c9f1f72adbe91e0ea8efb130f0a2ba11a6 ]
I got a null-ptr-deref report as following when doing fault injection test:
BUG: kernel NULL pointer dereference, address: 0000000000000058 Oops: 0000 [#1] PREEMPT SMP KASAN PTI CPU: 1 PID: 253 Comm: 507-spi-dm9051 Tainted: G B N 6.1.0-rc3+ Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014 RIP: 0010:klist_put+0x2d/0xd0 Call Trace: <TASK> klist_remove+0xf1/0x1c0 device_release_driver_internal+0x23e/0x2d0 bus_remove_device+0x1bd/0x240 device_del+0x357/0x770 phy_device_remove+0x11/0x30 mdiobus_unregister+0xa5/0x140 release_nodes+0x6a/0xa0 devres_release_all+0xf8/0x150 device_unbind_cleanup+0x19/0xd0
//probe path: phy_device_register() device_add()
phy_connect phy_attach_direct() //set device driver probe() //it's failed, driver is not bound device_bind_driver() // probe failed, it's not called
//remove path: phy_device_remove() device_del() device_release_driver_internal() __device_release_driver() //dev->drv is not NULL klist_remove() <- knode_driver is not added yet, cause null-ptr-deref
In phy_attach_direct(), after setting the 'dev->driver', probe() fails, device_bind_driver() is not called, so the knode_driver->n_klist is not set, then it causes null-ptr-deref in __device_release_driver() while deleting device. Fix this by setting dev->driver to NULL in the error path in phy_attach_direct().
Fixes: e13934563db0 ("[PATCH] PHY Layer fixup") Signed-off-by: Yang Yingliang yangyingliang@huawei.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/phy/phy_device.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c index 4df8c337221b..70c4d48f32c6 100644 --- a/drivers/net/phy/phy_device.c +++ b/drivers/net/phy/phy_device.c @@ -1518,6 +1518,7 @@ int phy_attach_direct(struct net_device *dev, struct phy_device *phydev,
error_module_put: module_put(d->driver->owner); + d->driver = NULL; error_put_device: put_device(d); if (ndev_owner != bus->owner)
From: Zhang Changzhong zhangchangzhong@huawei.com
[ Upstream commit 46fb6512538d201d9a5b2bd7138b6751c37fdf0b ]
The am65_cpsw_nuss_cleanup_ndev() function calls unregister_netdev() even if register_netdev() fails, which triggers WARN_ON(1) in unregister_netdevice_many(). To fix it, make sure that unregister_netdev() is called only on registered netdev.
Compile tested only.
Fixes: 84b4aa493249 ("net: ethernet: ti: am65-cpsw: add multi port support in mac-only mode") Signed-off-by: Zhang Changzhong zhangchangzhong@huawei.com Reviewed-by: Maciej Fijalkowski maciej.fijalkowski@intel.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/ti/am65-cpsw-nuss.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/ti/am65-cpsw-nuss.c b/drivers/net/ethernet/ti/am65-cpsw-nuss.c index 348201e10d49..95baacd6c761 100644 --- a/drivers/net/ethernet/ti/am65-cpsw-nuss.c +++ b/drivers/net/ethernet/ti/am65-cpsw-nuss.c @@ -2061,7 +2061,7 @@ static void am65_cpsw_nuss_cleanup_ndev(struct am65_cpsw_common *common)
for (i = 0; i < common->port_num; i++) { port = &common->ports[i]; - if (port->ndev) + if (port->ndev && port->ndev->reg_state == NETREG_REGISTERED) unregister_netdev(port->ndev); } }
From: Yuan Can yuancan@huawei.com
[ Upstream commit b8f79dccd38edf7db4911c353d9cd792ab13a327 ]
The ntb_netdev_init_module() returns the ntb_transport_register_client() directly without checking its return value, if ntb_transport_register_client() failed, the NTB client device is not unregistered.
Fix by unregister NTB client device when ntb_transport_register_client() failed.
Fixes: 548c237c0a99 ("net: Add support for NTB virtual ethernet device") Signed-off-by: Yuan Can yuancan@huawei.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ntb_netdev.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ntb_netdev.c b/drivers/net/ntb_netdev.c index 80bdc07f2cd3..dd7e273c90cb 100644 --- a/drivers/net/ntb_netdev.c +++ b/drivers/net/ntb_netdev.c @@ -484,7 +484,14 @@ static int __init ntb_netdev_init_module(void) rc = ntb_transport_register_client_dev(KBUILD_MODNAME); if (rc) return rc; - return ntb_transport_register_client(&ntb_netdev_client); + + rc = ntb_transport_register_client(&ntb_netdev_client); + if (rc) { + ntb_transport_unregister_client_dev(KBUILD_MODNAME); + return rc; + } + + return 0; } module_init(ntb_netdev_init_module);
From: Wang Hai wanghai38@huawei.com
[ Upstream commit dcc14cfd7debe11b825cb077e75d91d2575b4cb8 ]
Both p9_fd_create_tcp() and p9_fd_create_unix() will call p9_socket_open(). If the creation of p9_trans_fd fails, p9_fd_create_tcp() and p9_fd_create_unix() will return an error directly instead of releasing the cscoket, which will result in a socket leak.
This patch adds sock_release() to fix the leak issue.
Fixes: 6b18662e239a ("9p connect fixes") Signed-off-by: Wang Hai wanghai38@huawei.com ACKed-by: Al Viro viro@zeniv.linux.org.uk Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/9p/trans_fd.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/net/9p/trans_fd.c b/net/9p/trans_fd.c index 8487321c1fc7..3e056fb043bb 100644 --- a/net/9p/trans_fd.c +++ b/net/9p/trans_fd.c @@ -862,8 +862,10 @@ static int p9_socket_open(struct p9_client *client, struct socket *csocket) struct file *file;
p = kzalloc(sizeof(struct p9_trans_fd), GFP_KERNEL); - if (!p) + if (!p) { + sock_release(csocket); return -ENOMEM; + }
csocket->sk->sk_allocation = GFP_NOIO; file = sock_alloc_file(csocket, 0, NULL);
From: Yuri Karpov YKarpov@ispras.ru
[ Upstream commit 9256db4e45e8b497b0e993cc3ed4ad08eb2389b6 ]
In function nixge_hw_dma_bd_release() dereference of NULL pointer priv->rx_bd_v is possible for the case of its allocation failure in nixge_hw_dma_bd_init().
Move for() loop with priv->rx_bd_v dereference under the check for its validity.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Fixes: 492caffa8a1a ("net: ethernet: nixge: Add support for National Instruments XGE netdev") Signed-off-by: Yuri Karpov YKarpov@ispras.ru Reviewed-by: Maciej Fijalkowski maciej.fijalkowski@intel.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/ni/nixge.c | 29 +++++++++++++++-------------- 1 file changed, 15 insertions(+), 14 deletions(-)
diff --git a/drivers/net/ethernet/ni/nixge.c b/drivers/net/ethernet/ni/nixge.c index 4fc279a17562..bef3f0506487 100644 --- a/drivers/net/ethernet/ni/nixge.c +++ b/drivers/net/ethernet/ni/nixge.c @@ -249,25 +249,26 @@ static void nixge_hw_dma_bd_release(struct net_device *ndev) struct sk_buff *skb; int i;
- for (i = 0; i < RX_BD_NUM; i++) { - phys_addr = nixge_hw_dma_bd_get_addr(&priv->rx_bd_v[i], - phys); - - dma_unmap_single(ndev->dev.parent, phys_addr, - NIXGE_MAX_JUMBO_FRAME_SIZE, - DMA_FROM_DEVICE); - - skb = (struct sk_buff *)(uintptr_t) - nixge_hw_dma_bd_get_addr(&priv->rx_bd_v[i], - sw_id_offset); - dev_kfree_skb(skb); - } + if (priv->rx_bd_v) { + for (i = 0; i < RX_BD_NUM; i++) { + phys_addr = nixge_hw_dma_bd_get_addr(&priv->rx_bd_v[i], + phys); + + dma_unmap_single(ndev->dev.parent, phys_addr, + NIXGE_MAX_JUMBO_FRAME_SIZE, + DMA_FROM_DEVICE); + + skb = (struct sk_buff *)(uintptr_t) + nixge_hw_dma_bd_get_addr(&priv->rx_bd_v[i], + sw_id_offset); + dev_kfree_skb(skb); + }
- if (priv->rx_bd_v) dma_free_coherent(ndev->dev.parent, sizeof(*priv->rx_bd_v) * RX_BD_NUM, priv->rx_bd_v, priv->rx_bd_p); + }
if (priv->tx_skb) devm_kfree(ndev->dev.parent, priv->tx_skb);
From: M Chetan Kumar m.chetan.kumar@linux.intel.com
[ Upstream commit 985a02e75881b73a43c9433a718b49d272a9dd6b ]
sparse warnings - iosm_ipc_mux_codec.c:1474 using plain integer as NULL pointer.
Use skb_trim() to reset skb tail & len.
Fixes: 9413491e20e1 ("net: iosm: encode or decode datagram") Reported-by: kernel test robot lkp@intel.com Signed-off-by: M Chetan Kumar m.chetan.kumar@linux.intel.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wwan/iosm/iosm_ipc_mux_codec.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/net/wwan/iosm/iosm_ipc_mux_codec.c b/drivers/net/wwan/iosm/iosm_ipc_mux_codec.c index d41e373f9c0a..c16365123660 100644 --- a/drivers/net/wwan/iosm/iosm_ipc_mux_codec.c +++ b/drivers/net/wwan/iosm/iosm_ipc_mux_codec.c @@ -1471,8 +1471,7 @@ void ipc_mux_ul_encoded_process(struct iosm_mux *ipc_mux, struct sk_buff *skb) ipc_mux->ul_data_pend_bytes);
/* Reset the skb settings. */ - skb->tail = 0; - skb->len = 0; + skb_trim(skb, 0);
/* Add the consumed ADB to the free list. */ skb_queue_tail((&ipc_mux->ul_adb.free_list), skb);
From: M Chetan Kumar m.chetan.kumar@linux.intel.com
[ Upstream commit 4a99e3c8ed888577b947cbed97d88c9706896105 ]
Fix build error reported on armhf while preparing 6.1-rc5 for Debian.
iosm_ipc_protocol.c:244:36: error: passing argument 3 of 'dma_alloc_coherent' from incompatible pointer type.
Change phy_ap_shm type from phys_addr_t to dma_addr_t.
Fixes: faed4c6f6f48 ("net: iosm: shared memory protocol") Reported-by: Bonaccorso Salvatore carnil@debian.org Signed-off-by: M Chetan Kumar m.chetan.kumar@linux.intel.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wwan/iosm/iosm_ipc_protocol.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/wwan/iosm/iosm_ipc_protocol.h b/drivers/net/wwan/iosm/iosm_ipc_protocol.h index 9b3a6d86ece7..289397c4ea6c 100644 --- a/drivers/net/wwan/iosm/iosm_ipc_protocol.h +++ b/drivers/net/wwan/iosm/iosm_ipc_protocol.h @@ -122,7 +122,7 @@ struct iosm_protocol { struct iosm_imem *imem; struct ipc_rsp *rsp_ring[IPC_MEM_MSG_ENTRIES]; struct device *dev; - phys_addr_t phy_ap_shm; + dma_addr_t phy_ap_shm; u32 old_msg_tail; };
From: M Chetan Kumar m.chetan.kumar@linux.intel.com
[ Upstream commit 2290a1d46bf30f9e0bcf49ad20d5c30d0e099989 ]
Peek throughput UL test is resulting in crash. If the UL transfer block free list is exhaust, the peeked skb is freed. In the next transfer freed skb is referred from UL list which results in crash.
Don't free the skb if UL transfer blocks are unavailable. The pending skb will be picked for transfer on UL transfer block available.
Fixes: 1f52d7b62285 ("net: wwan: iosm: Enable M.2 7360 WWAN card support") Signed-off-by: M Chetan Kumar m.chetan.kumar@linux.intel.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wwan/iosm/iosm_ipc_mux_codec.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/drivers/net/wwan/iosm/iosm_ipc_mux_codec.c b/drivers/net/wwan/iosm/iosm_ipc_mux_codec.c index c16365123660..738420bd14af 100644 --- a/drivers/net/wwan/iosm/iosm_ipc_mux_codec.c +++ b/drivers/net/wwan/iosm/iosm_ipc_mux_codec.c @@ -1207,10 +1207,9 @@ static int mux_ul_dg_update_tbl_index(struct iosm_mux *ipc_mux, qlth_n_ql_size, ul_list); ipc_mux_ul_adb_finish(ipc_mux); if (ipc_mux_ul_adb_allocate(ipc_mux, adb, &ipc_mux->size_needed, - IOSM_AGGR_MUX_SIG_ADBH)) { - dev_kfree_skb(src_skb); + IOSM_AGGR_MUX_SIG_ADBH)) return -ENOMEM; - } + ipc_mux->size_needed = le32_to_cpu(adb->adbh->block_length);
ipc_mux->size_needed += offsetof(struct mux_adth, dg);
From: M Chetan Kumar m.chetan.kumar@linux.intel.com
[ Upstream commit c34ca4f32c24bf748493b49085e43cd714cf8357 ]
skb passed to network layer contains incorrect length.
In mux aggregation protocol, the datagram block received from device contains block signature, packet & datagram header. The right skb len to be calculated by subracting datagram pad len from datagram length.
Whereas in mux lite protocol, the skb contains single datagram so skb len is calculated by subtracting the packet offset from datagram header.
Fixes: 1f52d7b62285 ("net: wwan: iosm: Enable M.2 7360 WWAN card support") Signed-off-by: M Chetan Kumar m.chetan.kumar@linux.intel.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wwan/iosm/iosm_ipc_mux_codec.c | 18 +++++++++++------- 1 file changed, 11 insertions(+), 7 deletions(-)
diff --git a/drivers/net/wwan/iosm/iosm_ipc_mux_codec.c b/drivers/net/wwan/iosm/iosm_ipc_mux_codec.c index 738420bd14af..d6b166fc5c0e 100644 --- a/drivers/net/wwan/iosm/iosm_ipc_mux_codec.c +++ b/drivers/net/wwan/iosm/iosm_ipc_mux_codec.c @@ -365,7 +365,8 @@ static void ipc_mux_dl_cmd_decode(struct iosm_mux *ipc_mux, struct sk_buff *skb) /* Pass the DL packet to the netif layer. */ static int ipc_mux_net_receive(struct iosm_mux *ipc_mux, int if_id, struct iosm_wwan *wwan, u32 offset, - u8 service_class, struct sk_buff *skb) + u8 service_class, struct sk_buff *skb, + u32 pkt_len) { struct sk_buff *dest_skb = skb_clone(skb, GFP_ATOMIC);
@@ -373,7 +374,7 @@ static int ipc_mux_net_receive(struct iosm_mux *ipc_mux, int if_id, return -ENOMEM;
skb_pull(dest_skb, offset); - skb_set_tail_pointer(dest_skb, dest_skb->len); + skb_trim(dest_skb, pkt_len); /* Pass the packet to the netif layer. */ dest_skb->priority = service_class;
@@ -429,7 +430,7 @@ static void ipc_mux_dl_fcth_decode(struct iosm_mux *ipc_mux, static void ipc_mux_dl_adgh_decode(struct iosm_mux *ipc_mux, struct sk_buff *skb) { - u32 pad_len, packet_offset; + u32 pad_len, packet_offset, adgh_len; struct iosm_wwan *wwan; struct mux_adgh *adgh; u8 *block = skb->data; @@ -470,10 +471,12 @@ static void ipc_mux_dl_adgh_decode(struct iosm_mux *ipc_mux, packet_offset = sizeof(*adgh) + pad_len;
if_id += ipc_mux->wwan_q_offset; + adgh_len = le16_to_cpu(adgh->length);
/* Pass the packet to the netif layer */ rc = ipc_mux_net_receive(ipc_mux, if_id, wwan, packet_offset, - adgh->service_class, skb); + adgh->service_class, skb, + adgh_len - packet_offset); if (rc) { dev_err(ipc_mux->dev, "mux adgh decoding error"); return; @@ -547,7 +550,7 @@ static int mux_dl_process_dg(struct iosm_mux *ipc_mux, struct mux_adbh *adbh, int if_id, int nr_of_dg) { u32 dl_head_pad_len = ipc_mux->session[if_id].dl_head_pad_len; - u32 packet_offset, i, rc; + u32 packet_offset, i, rc, dg_len;
for (i = 0; i < nr_of_dg; i++, dg++) { if (le32_to_cpu(dg->datagram_index) @@ -562,11 +565,12 @@ static int mux_dl_process_dg(struct iosm_mux *ipc_mux, struct mux_adbh *adbh, packet_offset = le32_to_cpu(dg->datagram_index) + dl_head_pad_len; + dg_len = le16_to_cpu(dg->datagram_length); /* Pass the packet to the netif layer. */ rc = ipc_mux_net_receive(ipc_mux, if_id, ipc_mux->wwan, packet_offset, - dg->service_class, - skb); + dg->service_class, skb, + dg_len - dl_head_pad_len); if (rc) goto dg_error; }
From: Jerry Ray jerry.ray@microchip.com
[ Upstream commit 39f59bca275d2d819a8788c0f962e9e89843efc9 ]
This patch changes the reported ethtool statistics for the lan9303 family of parts covered by this driver.
The TxUnderRun statistic label is renamed to RxShort to accurately reflect what stat the device is reporting. I did not reorder the statistics as that might cause problems with existing user code that are expecting the stats at a certain offset.
Fixes: a1292595e006 ("net: dsa: add new DSA switch driver for the SMSC-LAN9303") Signed-off-by: Jerry Ray jerry.ray@microchip.com Reviewed-by: Florian Fainelli f.fainelli@gmail.com Link: https://lore.kernel.org/r/20221128193559.6572-1-jerry.ray@microchip.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/dsa/lan9303-core.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/dsa/lan9303-core.c b/drivers/net/dsa/lan9303-core.c index e03ff1f267bb..1de62604434d 100644 --- a/drivers/net/dsa/lan9303-core.c +++ b/drivers/net/dsa/lan9303-core.c @@ -959,7 +959,7 @@ static const struct lan9303_mib_desc lan9303_mib[] = { { .offset = LAN9303_MAC_TX_BRDCST_CNT_0, .name = "TxBroad", }, { .offset = LAN9303_MAC_TX_PAUSE_CNT_0, .name = "TxPause", }, { .offset = LAN9303_MAC_TX_MULCST_CNT_0, .name = "TxMulti", }, - { .offset = LAN9303_MAC_RX_UNDSZE_CNT_0, .name = "TxUnderRun", }, + { .offset = LAN9303_MAC_RX_UNDSZE_CNT_0, .name = "RxShort", }, { .offset = LAN9303_MAC_TX_64_CNT_0, .name = "Tx64Byte", }, { .offset = LAN9303_MAC_TX_127_CNT_0, .name = "Tx128Byte", }, { .offset = LAN9303_MAC_TX_255_CNT_0, .name = "Tx256Byte", },
From: Menglong Dong imagedong@tencent.com
[ Upstream commit fe94800184f22d4778628f1321dce5acb7513d84 ]
All of the subflows of a msk will be orphaned in mptcp_close(), which means the subflows are in DEAD state. After then, DATA_FIN will be sent, and the other side will response with a DATA_ACK for this DATA_FIN.
However, if the other side still has pending data, the data that received on these subflows will not be passed to the msk, as they are DEAD and subflow_data_ready() will not be called in tcp_data_ready(). Therefore, these data can't be acked, and they will be retransmitted again and again, until timeout.
Fix this by setting ssk->sk_socket and ssk->sk_wq to 'NULL', instead of orphaning the subflows in __mptcp_close(), as Paolo suggested.
Fixes: e16163b6e2b7 ("mptcp: refactor shutdown and close") Reviewed-by: Biao Jiang benbjiang@tencent.com Reviewed-by: Mengen Sun mengensun@tencent.com Signed-off-by: Menglong Dong imagedong@tencent.com Reviewed-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Matthieu Baerts matthieu.baerts@tessares.net Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/mptcp/protocol.c | 13 ++++++------- 1 file changed, 6 insertions(+), 7 deletions(-)
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c index b568f55998f3..42d5e0a7952a 100644 --- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -2297,12 +2297,7 @@ static void __mptcp_close_ssk(struct sock *sk, struct sock *ssk, goto out; }
- /* if we are invoked by the msk cleanup code, the subflow is - * already orphaned - */ - if (ssk->sk_socket) - sock_orphan(ssk); - + sock_orphan(ssk); subflow->disposable = 1;
/* if ssk hit tcp_done(), tcp_cleanup_ulp() cleared the related ops @@ -2833,7 +2828,11 @@ bool __mptcp_close(struct sock *sk, long timeout) if (ssk == msk->first) subflow->fail_tout = 0;
- sock_orphan(ssk); + /* detach from the parent socket, but allow data_ready to + * push incoming data into the mptcp stack, to properly ack it + */ + ssk->sk_socket = NULL; + ssk->sk_wq = NULL; unlock_sock_fast(ssk, slow); } sock_orphan(sk);
From: Paolo Abeni pabeni@redhat.com
[ Upstream commit b4f166651d03b5484fa179817ba8ad4899a5a6ac ]
Matt reported a splat at msk close time:
BUG: sleeping function called from invalid context at net/mptcp/protocol.c:2877 in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 155, name: packetdrill preempt_count: 201, expected: 0 RCU nest depth: 0, expected: 0 4 locks held by packetdrill/155: #0: ffff888001536990 (&sb->s_type->i_mutex_key#6){+.+.}-{3:3}, at: __sock_release (net/socket.c:650) #1: ffff88800b498130 (sk_lock-AF_INET){+.+.}-{0:0}, at: mptcp_close (net/mptcp/protocol.c:2973) #2: ffff88800b49a130 (sk_lock-AF_INET/1){+.+.}-{0:0}, at: __mptcp_close_ssk (net/mptcp/protocol.c:2363) #3: ffff88800b49a0b0 (slock-AF_INET){+...}-{2:2}, at: __lock_sock_fast (include/net/sock.h:1820) Preemption disabled at: 0x0 CPU: 1 PID: 155 Comm: packetdrill Not tainted 6.1.0-rc5 #365 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014 Call Trace: <TASK> dump_stack_lvl (lib/dump_stack.c:107 (discriminator 4)) __might_resched.cold (kernel/sched/core.c:9891) __mptcp_destroy_sock (include/linux/kernel.h:110) __mptcp_close (net/mptcp/protocol.c:2959) mptcp_subflow_queue_clean (include/net/sock.h:1777) __mptcp_close_ssk (net/mptcp/protocol.c:2363) mptcp_destroy_common (net/mptcp/protocol.c:3170) mptcp_destroy (include/net/sock.h:1495) __mptcp_destroy_sock (net/mptcp/protocol.c:2886) __mptcp_close (net/mptcp/protocol.c:2959) mptcp_close (net/mptcp/protocol.c:2974) inet_release (net/ipv4/af_inet.c:432) __sock_release (net/socket.c:651) sock_close (net/socket.c:1367) __fput (fs/file_table.c:320) task_work_run (kernel/task_work.c:181 (discriminator 1)) exit_to_user_mode_prepare (include/linux/resume_user_mode.h:49) syscall_exit_to_user_mode (kernel/entry/common.c:130) do_syscall_64 (arch/x86/entry/common.c:87) entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:120)
We can't call mptcp_close under the 'fast' socket lock variant, replace it with a sock_lock_nested() as the relevant code is already under the listening msk socket lock protection.
Reported-by: Matthieu Baerts matthieu.baerts@tessares.net Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/316 Fixes: 30e51b923e43 ("mptcp: fix unreleased socket in accept queue") Signed-off-by: Paolo Abeni pabeni@redhat.com Reviewed-by: Matthieu Baerts matthieu.baerts@tessares.net Signed-off-by: Matthieu Baerts matthieu.baerts@tessares.net Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/mptcp/subflow.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c index 02a54d59697b..2159b5f9988f 100644 --- a/net/mptcp/subflow.c +++ b/net/mptcp/subflow.c @@ -1745,16 +1745,16 @@ void mptcp_subflow_queue_clean(struct sock *listener_ssk)
for (msk = head; msk; msk = next) { struct sock *sk = (struct sock *)msk; - bool slow, do_cancel_work; + bool do_cancel_work;
sock_hold(sk); - slow = lock_sock_fast_nested(sk); + lock_sock_nested(sk, SINGLE_DEPTH_NESTING); next = msk->dl_next; msk->first = NULL; msk->dl_next = NULL;
do_cancel_work = __mptcp_close(sk, 0); - unlock_sock_fast(sk, slow); + release_sock(sk); if (do_cancel_work) mptcp_cancel_work(sk); sock_put(sk);
From: Xin Long lucien.xin@gmail.com
[ Upstream commit 3067bc61fcfe3081bf4807ce65560f499e895e77 ]
As the call trace shows, the original skb was freed in tipc_msg_validate(), and dereferencing the old skb cb would cause an use-after-free crash.
BUG: KASAN: use-after-free in tipc_crypto_rcv_complete+0x1835/0x2240 [tipc] Call Trace: <IRQ> tipc_crypto_rcv_complete+0x1835/0x2240 [tipc] tipc_crypto_rcv+0xd32/0x1ec0 [tipc] tipc_rcv+0x744/0x1150 [tipc] ... Allocated by task 47078: kmem_cache_alloc_node+0x158/0x4d0 __alloc_skb+0x1c1/0x270 tipc_buf_acquire+0x1e/0xe0 [tipc] tipc_msg_create+0x33/0x1c0 [tipc] tipc_link_build_proto_msg+0x38a/0x2100 [tipc] tipc_link_timeout+0x8b8/0xef0 [tipc] tipc_node_timeout+0x2a1/0x960 [tipc] call_timer_fn+0x2d/0x1c0 ... Freed by task 47078: tipc_msg_validate+0x7b/0x440 [tipc] tipc_crypto_rcv_complete+0x4b5/0x2240 [tipc] tipc_crypto_rcv+0xd32/0x1ec0 [tipc] tipc_rcv+0x744/0x1150 [tipc]
This patch fixes it by re-fetching the skb cb from the new allocated skb after calling tipc_msg_validate().
Fixes: fc1b6d6de220 ("tipc: introduce TIPC encryption & authentication") Reported-by: Shuang Li shuali@redhat.com Signed-off-by: Xin Long lucien.xin@gmail.com Link: https://lore.kernel.org/r/1b1cdba762915325bd8ef9a98d0276eb673df2a5.166939840... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/tipc/crypto.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/net/tipc/crypto.c b/net/tipc/crypto.c index f09316a9035f..d67440de011e 100644 --- a/net/tipc/crypto.c +++ b/net/tipc/crypto.c @@ -1971,6 +1971,9 @@ static void tipc_crypto_rcv_complete(struct net *net, struct tipc_aead *aead, /* Ok, everything's fine, try to synch own keys according to peers' */ tipc_crypto_key_synch(rx, *skb);
+ /* Re-fetch skb cb as skb might be changed in tipc_msg_validate */ + skb_cb = TIPC_SKB_CB(*skb); + /* Mark skb decrypted */ skb_cb->decrypted = 1;
From: YueHaibing yuehaibing@huawei.com
[ Upstream commit 7e177d32442b7ed08a9fa61b61724abc548cb248 ]
The skb is delivered to netif_rx() which may free it, after calling this, dereferencing skb may trigger use-after-free.
Fixes: f421436a591d ("net/hsr: Add support for the High-availability Seamless Redundancy protocol (HSRv0)") Signed-off-by: YueHaibing yuehaibing@huawei.com Link: https://lore.kernel.org/r/20221125075724.27912-1-yuehaibing@huawei.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/hsr/hsr_forward.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/net/hsr/hsr_forward.c b/net/hsr/hsr_forward.c index a50429a62f74..56bb27d67a2e 100644 --- a/net/hsr/hsr_forward.c +++ b/net/hsr/hsr_forward.c @@ -351,17 +351,18 @@ static void hsr_deliver_master(struct sk_buff *skb, struct net_device *dev, struct hsr_node *node_src) { bool was_multicast_frame; - int res; + int res, recv_len;
was_multicast_frame = (skb->pkt_type == PACKET_MULTICAST); hsr_addr_subst_source(node_src, skb); skb_pull(skb, ETH_HLEN); + recv_len = skb->len; res = netif_rx(skb); if (res == NET_RX_DROP) { dev->stats.rx_dropped++; } else { dev->stats.rx_packets++; - dev->stats.rx_bytes += skb->len; + dev->stats.rx_bytes += recv_len; if (was_multicast_frame) dev->stats.multicast++; }
From: Yang Yingliang yangyingliang@huawei.com
[ Upstream commit cdde1560118f82498fc9e9a7c1ef7f0ef7755891 ]
I got the following report while doing device(mscc-miim) load test with CONFIG_OF_UNITTEST and CONFIG_OF_DYNAMIC enabled:
OF: ERROR: memory leak, expected refcount 1 instead of 2, of_node_get()/of_node_put() unbalanced - destroy cset entry: attach overlay node /spi/soc@0/mdio@7107009c/ethernet-phy@0
If the 'fwnode' is not an acpi node, the refcount is get in fwnode_mdiobus_phy_device_register(), but it has never been put when the device is freed in the normal path. So call fwnode_handle_put() in phy_device_release() to avoid leak.
If it's an acpi node, it has never been get, but it's put in the error path, so call fwnode_handle_get() before phy_device_register() to keep get/put operation balanced.
Fixes: bc1bee3b87ee ("net: mdiobus: Introduce fwnode_mdiobus_register_phy()") Signed-off-by: Yang Yingliang yangyingliang@huawei.com Reviewed-by: Andrew Lunn andrew@lunn.ch Link: https://lore.kernel.org/r/20221124150130.609420-1-yangyingliang@huawei.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/mdio/fwnode_mdio.c | 2 +- drivers/net/phy/phy_device.c | 1 + 2 files changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/net/mdio/fwnode_mdio.c b/drivers/net/mdio/fwnode_mdio.c index 1c1584fca632..40e745a1d185 100644 --- a/drivers/net/mdio/fwnode_mdio.c +++ b/drivers/net/mdio/fwnode_mdio.c @@ -120,7 +120,7 @@ int fwnode_mdiobus_register_phy(struct mii_bus *bus, /* Associate the fwnode with the device structure so it * can be looked up later. */ - phy->mdio.dev.fwnode = child; + phy->mdio.dev.fwnode = fwnode_handle_get(child);
/* All data is now stored in the phy struct, so register it */ rc = phy_device_register(phy); diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c index 70c4d48f32c6..3607077cf86f 100644 --- a/drivers/net/phy/phy_device.c +++ b/drivers/net/phy/phy_device.c @@ -216,6 +216,7 @@ static void phy_mdio_device_free(struct mdio_device *mdiodev)
static void phy_device_release(struct device *dev) { + fwnode_handle_put(dev->fwnode); kfree(to_phy_device(dev)); }
From: David Howells dhowells@redhat.com
[ Upstream commit ca57f02295f188d6c65ec02202402979880fa6d8 ]
The fileserver probing code attempts to work out the best fileserver to use for a volume by retrieving the RTT calculated by AF_RXRPC for the probe call sent to each server and comparing them. Sometimes, however, no RTT estimate is available and rxrpc_kernel_get_srtt() returns false, leading good fileservers to be given an RTT of UINT_MAX and thus causing the rotation algorithm to ignore them.
Fix afs_select_fileserver() to ignore rxrpc_kernel_get_srtt()'s return value and just take the estimated RTT it provides - which will be capped at 1 second.
Fixes: 1d4adfaf6574 ("rxrpc: Make rxrpc_kernel_get_srtt() indicate validity") Signed-off-by: David Howells dhowells@redhat.com Reviewed-by: Marc Dionne marc.dionne@auristor.com Tested-by: Marc Dionne marc.dionne@auristor.com cc: linux-afs@lists.infradead.org Link: https://lore.kernel.org/r/166965503999.3392585.13954054113218099395.stgit@wa... Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/afs/fs_probe.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/fs/afs/fs_probe.c b/fs/afs/fs_probe.c index c0031a3ab42f..3ac5fcf98d0d 100644 --- a/fs/afs/fs_probe.c +++ b/fs/afs/fs_probe.c @@ -167,8 +167,8 @@ void afs_fileserver_probe_result(struct afs_call *call) clear_bit(AFS_SERVER_FL_HAS_FS64, &server->flags); }
- if (rxrpc_kernel_get_srtt(call->net->socket, call->rxcall, &rtt_us) && - rtt_us < server->probe.rtt) { + rxrpc_kernel_get_srtt(call->net->socket, call->rxcall, &rtt_us); + if (rtt_us < server->probe.rtt) { server->probe.rtt = rtt_us; server->rtt = rtt_us; alist->preferred = index;
From: Shigeru Yoshida syoshida@redhat.com
[ Upstream commit 5daadc86f27ea4d691e2131c04310d0418c6cd12 ]
syzbot reported use-after-free in tun_detach() [1]. This causes call trace like below:
================================================================== BUG: KASAN: use-after-free in notifier_call_chain+0x1ee/0x200 kernel/notifier.c:75 Read of size 8 at addr ffff88807324e2a8 by task syz-executor.0/3673
CPU: 0 PID: 3673 Comm: syz-executor.0 Not tainted 6.1.0-rc5-syzkaller-00044-gcc675d22e422 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/26/2022 Call Trace: <TASK> __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0xd1/0x138 lib/dump_stack.c:106 print_address_description mm/kasan/report.c:284 [inline] print_report+0x15e/0x461 mm/kasan/report.c:395 kasan_report+0xbf/0x1f0 mm/kasan/report.c:495 notifier_call_chain+0x1ee/0x200 kernel/notifier.c:75 call_netdevice_notifiers_info+0x86/0x130 net/core/dev.c:1942 call_netdevice_notifiers_extack net/core/dev.c:1983 [inline] call_netdevice_notifiers net/core/dev.c:1997 [inline] netdev_wait_allrefs_any net/core/dev.c:10237 [inline] netdev_run_todo+0xbc6/0x1100 net/core/dev.c:10351 tun_detach drivers/net/tun.c:704 [inline] tun_chr_close+0xe4/0x190 drivers/net/tun.c:3467 __fput+0x27c/0xa90 fs/file_table.c:320 task_work_run+0x16f/0x270 kernel/task_work.c:179 exit_task_work include/linux/task_work.h:38 [inline] do_exit+0xb3d/0x2a30 kernel/exit.c:820 do_group_exit+0xd4/0x2a0 kernel/exit.c:950 get_signal+0x21b1/0x2440 kernel/signal.c:2858 arch_do_signal_or_restart+0x86/0x2300 arch/x86/kernel/signal.c:869 exit_to_user_mode_loop kernel/entry/common.c:168 [inline] exit_to_user_mode_prepare+0x15f/0x250 kernel/entry/common.c:203 __syscall_exit_to_user_mode_work kernel/entry/common.c:285 [inline] syscall_exit_to_user_mode+0x1d/0x50 kernel/entry/common.c:296 do_syscall_64+0x46/0xb0 arch/x86/entry/common.c:86 entry_SYSCALL_64_after_hwframe+0x63/0xcd
The cause of the issue is that sock_put() from __tun_detach() drops last reference count for struct net, and then notifier_call_chain() from netdev_state_change() accesses that struct net.
This patch fixes the issue by calling sock_put() from tun_detach() after all necessary accesses for the struct net has done.
Fixes: 83c1f36f9880 ("tun: send netlink notification when the device is modified") Reported-by: syzbot+106f9b687cd64ee70cd1@syzkaller.appspotmail.com Link: https://syzkaller.appspot.com/bug?id=96eb7f1ce75ef933697f24eeab928c4a716edef... [1] Signed-off-by: Shigeru Yoshida syoshida@redhat.com Link: https://lore.kernel.org/r/20221124175134.1589053-1-syoshida@redhat.com Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/tun.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/net/tun.c b/drivers/net/tun.c index 3387074a2bdb..167e6a3784ca 100644 --- a/drivers/net/tun.c +++ b/drivers/net/tun.c @@ -686,7 +686,6 @@ static void __tun_detach(struct tun_file *tfile, bool clean) if (tun) xdp_rxq_info_unreg(&tfile->xdp_rxq); ptr_ring_cleanup(&tfile->tx_ring, tun_ptr_free); - sock_put(&tfile->sk); } }
@@ -702,6 +701,9 @@ static void tun_detach(struct tun_file *tfile, bool clean) if (dev) netdev_state_change(dev); rtnl_unlock(); + + if (clean) + sock_put(&tfile->sk); }
static void tun_detach_all(struct net_device *dev)
From: Chris Mi cmi@nvidia.com
[ Upstream commit 0e682f04b4b59eac0b0a030251513589c4607458 ]
The cited commit adds a for loop to check if each port supports lag or not. But dev is not initialized correctly. Fix it by initializing dev for each iteration.
Fixes: e87c6a832f88 ("net/mlx5: E-switch, Fix duplicate lag creation") Signed-off-by: Chris Mi cmi@nvidia.com Reported-by: Jacob Keller jacob.e.keller@intel.com Signed-off-by: Saeed Mahameed saeedm@nvidia.com Link: https://lore.kernel.org/r/20221129093006.378840-2-saeed@kernel.org Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c b/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c index a879e0b0f702..48f86e12f5c0 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c @@ -648,11 +648,13 @@ static bool mlx5_lag_check_prereq(struct mlx5_lag *ldev) return false;
#ifdef CONFIG_MLX5_ESWITCH - dev = ldev->pf[MLX5_LAG_P1].dev; - for (i = 0; i < ldev->ports; i++) + for (i = 0; i < ldev->ports; i++) { + dev = ldev->pf[i].dev; if (mlx5_eswitch_num_vfs(dev->priv.eswitch) && !is_mdev_switchdev_mode(dev)) return false; + }
+ dev = ldev->pf[MLX5_LAG_P1].dev; mode = mlx5_eswitch_mode(dev); for (i = 0; i < ldev->ports; i++) if (mlx5_eswitch_mode(ldev->pf[i].dev) != mode)
From: Willem de Bruijn willemb@google.com
[ Upstream commit b85f628aa158a653c006e9c1405a117baef8c868 ]
CHECKSUM_COMPLETE signals that skb->csum stores the sum over the entire packet. It does not imply that an embedded l4 checksum field has been validated.
Fixes: 682f048bd494 ("af_packet: pass checksum validation status to the user") Signed-off-by: Willem de Bruijn willemb@google.com Link: https://lore.kernel.org/r/20221128161812.640098-1-willemdebruijn.kernel@gmai... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/packet/af_packet.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-)
diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c index 5cbe07116e04..5727cb7ec174 100644 --- a/net/packet/af_packet.c +++ b/net/packet/af_packet.c @@ -2293,8 +2293,7 @@ static int tpacket_rcv(struct sk_buff *skb, struct net_device *dev, if (skb->ip_summed == CHECKSUM_PARTIAL) status |= TP_STATUS_CSUMNOTREADY; else if (skb->pkt_type != PACKET_OUTGOING && - (skb->ip_summed == CHECKSUM_COMPLETE || - skb_csum_unnecessary(skb))) + skb_csum_unnecessary(skb)) status |= TP_STATUS_CSUM_VALID;
if (snaplen > res) @@ -3520,8 +3519,7 @@ static int packet_recvmsg(struct socket *sock, struct msghdr *msg, size_t len, if (skb->ip_summed == CHECKSUM_PARTIAL) aux.tp_status |= TP_STATUS_CSUMNOTREADY; else if (skb->pkt_type != PACKET_OUTGOING && - (skb->ip_summed == CHECKSUM_COMPLETE || - skb_csum_unnecessary(skb))) + skb_csum_unnecessary(skb)) aux.tp_status |= TP_STATUS_CSUM_VALID;
aux.tp_len = origlen;
From: Zhengchao Shao shaozhengchao@huawei.com
[ Upstream commit 9ed7bfc79542119ac0a9e1ce8a2a5285e43433e9 ]
When sctp_stream_outq_migrate() is called to release stream out resources, the memory pointed to by prio_head in stream out is not released.
The memory leak information is as follows: unreferenced object 0xffff88801fe79f80 (size 64): comm "sctp_repo", pid 7957, jiffies 4294951704 (age 36.480s) hex dump (first 32 bytes): 80 9f e7 1f 80 88 ff ff 80 9f e7 1f 80 88 ff ff ................ 90 9f e7 1f 80 88 ff ff 90 9f e7 1f 80 88 ff ff ................ backtrace: [<ffffffff81b215c6>] kmalloc_trace+0x26/0x60 [<ffffffff88ae517c>] sctp_sched_prio_set+0x4cc/0x770 [<ffffffff88ad64f2>] sctp_stream_init_ext+0xd2/0x1b0 [<ffffffff88aa2604>] sctp_sendmsg_to_asoc+0x1614/0x1a30 [<ffffffff88ab7ff1>] sctp_sendmsg+0xda1/0x1ef0 [<ffffffff87f765ed>] inet_sendmsg+0x9d/0xe0 [<ffffffff8754b5b3>] sock_sendmsg+0xd3/0x120 [<ffffffff8755446a>] __sys_sendto+0x23a/0x340 [<ffffffff87554651>] __x64_sys_sendto+0xe1/0x1b0 [<ffffffff89978b49>] do_syscall_64+0x39/0xb0 [<ffffffff89a0008b>] entry_SYSCALL_64_after_hwframe+0x63/0xcd
Link: https://syzkaller.appspot.com/bug?exrid=29c402e56c4760763cc0 Fixes: 637784ade221 ("sctp: introduce priority based stream scheduler") Reported-by: syzbot+29c402e56c4760763cc0@syzkaller.appspotmail.com Signed-off-by: Zhengchao Shao shaozhengchao@huawei.com Reviewed-by: Xin Long lucien.xin@gmail.com Link: https://lore.kernel.org/r/20221126031720.378562-1-shaozhengchao@huawei.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- include/net/sctp/stream_sched.h | 2 ++ net/sctp/stream.c | 25 ++++++++++++++++++------- net/sctp/stream_sched.c | 5 +++++ net/sctp/stream_sched_prio.c | 19 +++++++++++++++++++ net/sctp/stream_sched_rr.c | 5 +++++ 5 files changed, 49 insertions(+), 7 deletions(-)
diff --git a/include/net/sctp/stream_sched.h b/include/net/sctp/stream_sched.h index 01a70b27e026..65058faea4db 100644 --- a/include/net/sctp/stream_sched.h +++ b/include/net/sctp/stream_sched.h @@ -26,6 +26,8 @@ struct sctp_sched_ops { int (*init)(struct sctp_stream *stream); /* Init a stream */ int (*init_sid)(struct sctp_stream *stream, __u16 sid, gfp_t gfp); + /* free a stream */ + void (*free_sid)(struct sctp_stream *stream, __u16 sid); /* Frees the entire thing */ void (*free)(struct sctp_stream *stream);
diff --git a/net/sctp/stream.c b/net/sctp/stream.c index ef9fceadef8d..ee6514af830f 100644 --- a/net/sctp/stream.c +++ b/net/sctp/stream.c @@ -52,6 +52,19 @@ static void sctp_stream_shrink_out(struct sctp_stream *stream, __u16 outcnt) } }
+static void sctp_stream_free_ext(struct sctp_stream *stream, __u16 sid) +{ + struct sctp_sched_ops *sched; + + if (!SCTP_SO(stream, sid)->ext) + return; + + sched = sctp_sched_ops_from_stream(stream); + sched->free_sid(stream, sid); + kfree(SCTP_SO(stream, sid)->ext); + SCTP_SO(stream, sid)->ext = NULL; +} + /* Migrates chunks from stream queues to new stream queues if needed, * but not across associations. Also, removes those chunks to streams * higher than the new max. @@ -70,16 +83,14 @@ static void sctp_stream_outq_migrate(struct sctp_stream *stream, * sctp_stream_update will swap ->out pointers. */ for (i = 0; i < outcnt; i++) { - kfree(SCTP_SO(new, i)->ext); + sctp_stream_free_ext(new, i); SCTP_SO(new, i)->ext = SCTP_SO(stream, i)->ext; SCTP_SO(stream, i)->ext = NULL; } }
- for (i = outcnt; i < stream->outcnt; i++) { - kfree(SCTP_SO(stream, i)->ext); - SCTP_SO(stream, i)->ext = NULL; - } + for (i = outcnt; i < stream->outcnt; i++) + sctp_stream_free_ext(stream, i); }
static int sctp_stream_alloc_out(struct sctp_stream *stream, __u16 outcnt, @@ -174,9 +185,9 @@ void sctp_stream_free(struct sctp_stream *stream) struct sctp_sched_ops *sched = sctp_sched_ops_from_stream(stream); int i;
- sched->free(stream); + sched->unsched_all(stream); for (i = 0; i < stream->outcnt; i++) - kfree(SCTP_SO(stream, i)->ext); + sctp_stream_free_ext(stream, i); genradix_free(&stream->out); genradix_free(&stream->in); } diff --git a/net/sctp/stream_sched.c b/net/sctp/stream_sched.c index 1ad565ed5627..7c8f9d89e16a 100644 --- a/net/sctp/stream_sched.c +++ b/net/sctp/stream_sched.c @@ -46,6 +46,10 @@ static int sctp_sched_fcfs_init_sid(struct sctp_stream *stream, __u16 sid, return 0; }
+static void sctp_sched_fcfs_free_sid(struct sctp_stream *stream, __u16 sid) +{ +} + static void sctp_sched_fcfs_free(struct sctp_stream *stream) { } @@ -96,6 +100,7 @@ static struct sctp_sched_ops sctp_sched_fcfs = { .get = sctp_sched_fcfs_get, .init = sctp_sched_fcfs_init, .init_sid = sctp_sched_fcfs_init_sid, + .free_sid = sctp_sched_fcfs_free_sid, .free = sctp_sched_fcfs_free, .enqueue = sctp_sched_fcfs_enqueue, .dequeue = sctp_sched_fcfs_dequeue, diff --git a/net/sctp/stream_sched_prio.c b/net/sctp/stream_sched_prio.c index 80b5a2c4cbc7..4fc9f2923ed1 100644 --- a/net/sctp/stream_sched_prio.c +++ b/net/sctp/stream_sched_prio.c @@ -204,6 +204,24 @@ static int sctp_sched_prio_init_sid(struct sctp_stream *stream, __u16 sid, return sctp_sched_prio_set(stream, sid, 0, gfp); }
+static void sctp_sched_prio_free_sid(struct sctp_stream *stream, __u16 sid) +{ + struct sctp_stream_priorities *prio = SCTP_SO(stream, sid)->ext->prio_head; + int i; + + if (!prio) + return; + + SCTP_SO(stream, sid)->ext->prio_head = NULL; + for (i = 0; i < stream->outcnt; i++) { + if (SCTP_SO(stream, i)->ext && + SCTP_SO(stream, i)->ext->prio_head == prio) + return; + } + + kfree(prio); +} + static void sctp_sched_prio_free(struct sctp_stream *stream) { struct sctp_stream_priorities *prio, *n; @@ -323,6 +341,7 @@ static struct sctp_sched_ops sctp_sched_prio = { .get = sctp_sched_prio_get, .init = sctp_sched_prio_init, .init_sid = sctp_sched_prio_init_sid, + .free_sid = sctp_sched_prio_free_sid, .free = sctp_sched_prio_free, .enqueue = sctp_sched_prio_enqueue, .dequeue = sctp_sched_prio_dequeue, diff --git a/net/sctp/stream_sched_rr.c b/net/sctp/stream_sched_rr.c index ff425aed62c7..cc444fe0d67c 100644 --- a/net/sctp/stream_sched_rr.c +++ b/net/sctp/stream_sched_rr.c @@ -90,6 +90,10 @@ static int sctp_sched_rr_init_sid(struct sctp_stream *stream, __u16 sid, return 0; }
+static void sctp_sched_rr_free_sid(struct sctp_stream *stream, __u16 sid) +{ +} + static void sctp_sched_rr_free(struct sctp_stream *stream) { sctp_sched_rr_unsched_all(stream); @@ -177,6 +181,7 @@ static struct sctp_sched_ops sctp_sched_rr = { .get = sctp_sched_rr_get, .init = sctp_sched_rr_init, .init_sid = sctp_sched_rr_init_sid, + .free_sid = sctp_sched_rr_free_sid, .free = sctp_sched_rr_free, .enqueue = sctp_sched_rr_enqueue, .dequeue = sctp_sched_rr_dequeue,
From: Yoshihiro Shimoda yoshihiro.shimoda.uh@renesas.com
[ Upstream commit d66233a312ec9013af3e37e4030b479a20811ec3 ]
After system resumed on some environment board, the promiscuous mode is disabled because the SoC turned off. So, call ravb_set_rx_mode() in the ravb_resume() to fix the issue.
Reported-by: Tho Vu tho.vu.wh@renesas.com Fixes: 0184165b2f42 ("ravb: add sleep PM suspend/resume support") Signed-off-by: Yoshihiro Shimoda yoshihiro.shimoda.uh@renesas.com Reviewed-by: Pavan Chebbi pavan.chebbi@broadcom.com Reviewed-by: Sergey Shtylyov s.shtylyov@omp.ru Link: https://lore.kernel.org/r/20221128065604.1864391-1-yoshihiro.shimoda.uh@rene... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/renesas/ravb_main.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/net/ethernet/renesas/ravb_main.c b/drivers/net/ethernet/renesas/ravb_main.c index 7e32b04eb0c7..44f9b31f8b99 100644 --- a/drivers/net/ethernet/renesas/ravb_main.c +++ b/drivers/net/ethernet/renesas/ravb_main.c @@ -3013,6 +3013,7 @@ static int __maybe_unused ravb_resume(struct device *dev) ret = ravb_open(ndev); if (ret < 0) return ret; + ravb_set_rx_mode(ndev); netif_device_attach(ndev); }
From: Marc Dionne marc.dionne@auristor.com
[ Upstream commit ef4d3ea40565a781c25847e9cb96c1bd9f462bc6 ]
The atomic_read was accidentally replaced with atomic_inc_return, which prevents the server from getting cleaned up and causes rmmod to hang with a warning:
Can't purge s=00000001
Fixes: 2757a4dc1849 ("afs: Fix access after dec in put functions") Signed-off-by: Marc Dionne marc.dionne@auristor.com Signed-off-by: David Howells dhowells@redhat.com Link: https://lore.kernel.org/r/20221130174053.2665818-1-marc.dionne@auristor.com/ Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/afs/server.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/afs/server.c b/fs/afs/server.c index 4981baf97835..b5237206eac3 100644 --- a/fs/afs/server.c +++ b/fs/afs/server.c @@ -406,7 +406,7 @@ void afs_put_server(struct afs_net *net, struct afs_server *server, if (!server) return;
- a = atomic_inc_return(&server->active); + a = atomic_read(&server->active); zero = __refcount_dec_and_test(&server->ref, &r); trace_afs_server(debug_id, r - 1, a, reason); if (unlikely(zero))
From: Phil Auld pauld@redhat.com
[ Upstream commit a89ff5f5cc64b9fe7a992cf56988fd36f56ca82a ]
If coretemp_add_core() gets an error then pdata->core_data[indx] is already NULL and has been kfreed. Don't pass that to sysfs_remove_group() as that will crash in sysfs_remove_group().
[Shortened for readability] [91854.020159] sysfs: cannot create duplicate filename '/devices/platform/coretemp.0/hwmon/hwmon2/temp20_label' <cpu offline> [91855.126115] BUG: kernel NULL pointer dereference, address: 0000000000000188 [91855.165103] #PF: supervisor read access in kernel mode [91855.194506] #PF: error_code(0x0000) - not-present page [91855.224445] PGD 0 P4D 0 [91855.238508] Oops: 0000 [#1] PREEMPT SMP PTI ... [91855.342716] RIP: 0010:sysfs_remove_group+0xc/0x80 ... [91855.796571] Call Trace: [91855.810524] coretemp_cpu_offline+0x12b/0x1dd [coretemp] [91855.841738] ? coretemp_cpu_online+0x180/0x180 [coretemp] [91855.871107] cpuhp_invoke_callback+0x105/0x4b0 [91855.893432] cpuhp_thread_fun+0x8e/0x150 ...
Fix this by checking for NULL first.
Signed-off-by: Phil Auld pauld@redhat.com Cc: linux-hwmon@vger.kernel.org Cc: Fenghua Yu fenghua.yu@intel.com Cc: Jean Delvare jdelvare@suse.com Cc: Guenter Roeck linux@roeck-us.net Link: https://lore.kernel.org/r/20221117162313.3164803-1-pauld@redhat.com Fixes: 199e0de7f5df3 ("hwmon: (coretemp) Merge pkgtemp with coretemp") Signed-off-by: Guenter Roeck linux@roeck-us.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/hwmon/coretemp.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/drivers/hwmon/coretemp.c b/drivers/hwmon/coretemp.c index 8bf32c6c85d9..30a19d711f89 100644 --- a/drivers/hwmon/coretemp.c +++ b/drivers/hwmon/coretemp.c @@ -533,6 +533,10 @@ static void coretemp_remove_core(struct platform_data *pdata, int indx) { struct temp_data *tdata = pdata->core_data[indx];
+ /* if we errored on add then this is already gone */ + if (!tdata) + return; + /* Remove the sysfs attributes */ sysfs_remove_group(&pdata->hwmon_dev->kobj, &tdata->attr_group);
From: Yang Yingliang yangyingliang@huawei.com
[ Upstream commit 7dec14537c5906b8bf40fd6fd6d9c3850f8df11d ]
As comment of pci_get_domain_bus_and_slot() says, it returns a pci device with refcount increment, when finish using it, the caller must decrement the reference count by calling pci_dev_put(). So call it after using to avoid refcount leak.
Fixes: 14513ee696a0 ("hwmon: (coretemp) Use PCI host bridge ID to identify CPU if necessary") Signed-off-by: Yang Yingliang yangyingliang@huawei.com Link: https://lore.kernel.org/r/20221118093303.214163-1-yangyingliang@huawei.com Signed-off-by: Guenter Roeck linux@roeck-us.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/hwmon/coretemp.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/hwmon/coretemp.c b/drivers/hwmon/coretemp.c index 30a19d711f89..9bee4d33fbdf 100644 --- a/drivers/hwmon/coretemp.c +++ b/drivers/hwmon/coretemp.c @@ -242,10 +242,13 @@ static int adjust_tjmax(struct cpuinfo_x86 *c, u32 id, struct device *dev) */ if (host_bridge && host_bridge->vendor == PCI_VENDOR_ID_INTEL) { for (i = 0; i < ARRAY_SIZE(tjmax_pci_table); i++) { - if (host_bridge->device == tjmax_pci_table[i].device) + if (host_bridge->device == tjmax_pci_table[i].device) { + pci_dev_put(host_bridge); return tjmax_pci_table[i].tjmax; + } } } + pci_dev_put(host_bridge);
for (i = 0; i < ARRAY_SIZE(tjmax_table); i++) { if (strstr(c->x86_model_id, tjmax_table[i].id))
From: Yuan Can yuancan@huawei.com
[ Upstream commit 9bdc112be727cf1ba65be79541147f960c3349d8 ]
As the devm_kcalloc may return NULL, the return value needs to be checked to avoid NULL poineter dereference.
Fixes: d0ddfd241e57 ("hwmon: (asus-ec-sensors) add driver for ASUS EC") Signed-off-by: Yuan Can yuancan@huawei.com Link: https://lore.kernel.org/r/20221125014329.121560-1-yuancan@huawei.com Signed-off-by: Guenter Roeck linux@roeck-us.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/hwmon/asus-ec-sensors.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/hwmon/asus-ec-sensors.c b/drivers/hwmon/asus-ec-sensors.c index 81e688975c6a..a901e4e33d81 100644 --- a/drivers/hwmon/asus-ec-sensors.c +++ b/drivers/hwmon/asus-ec-sensors.c @@ -938,6 +938,8 @@ static int asus_ec_probe(struct platform_device *pdev) ec_data->nr_sensors = hweight_long(ec_data->board_info->sensors); ec_data->sensors = devm_kcalloc(dev, ec_data->nr_sensors, sizeof(struct ec_sensor), GFP_KERNEL); + if (!ec_data->sensors) + return -ENOMEM;
status = setup_lock_data(dev); if (status) {
From: Jisheng Zhang jszhang@kernel.org
commit 74f6bb55c834da6d4bac24f44868202743189b2b upstream.
lkp reported a build error, I tried the config and can reproduce build error as below:
VDSOLD arch/riscv/kernel/vdso/vdso.so.dbg ld.lld: error: section .note file range overlaps with .text
.note range is [0x7C8, 0x803] .text range is [0x800, 0x1993]
ld.lld: error: section .text file range overlaps with .dynamic
.text range is [0x800, 0x1993] .dynamic range is [0x808, 0x937]
ld.lld: error: section .note virtual address range overlaps with .text
.note range is [0x7C8, 0x803] .text range is [0x800, 0x1993]
Fix it by setting DISABLE_BRANCH_PROFILING which will disable branch tracing for vdso, thus avoid useless _ftrace_annotated_branch section and _ftrace_branch section. Although we can also fix it by removing the hardcoded .text begin address, but I think that's another story and should be put into another patch.
Link: https://lore.kernel.org/lkml/202210122123.Cc4FPShJ-lkp@intel.com/#r Reported-by: kernel test robot lkp@intel.com Signed-off-by: Jisheng Zhang jszhang@kernel.org Link: https://lore.kernel.org/r/20221102170254.1925-1-jszhang@kernel.org Fixes: ad5d1122b82f ("riscv: use vDSO common flow to reduce the latency of the time-related functions") Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt palmer@rivosinc.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/riscv/kernel/vdso/Makefile | 1 + 1 file changed, 1 insertion(+)
--- a/arch/riscv/kernel/vdso/Makefile +++ b/arch/riscv/kernel/vdso/Makefile @@ -17,6 +17,7 @@ vdso-syms += flush_icache obj-vdso = $(patsubst %, %.o, $(vdso-syms)) note.o
ccflags-y := -fno-stack-protector +ccflags-y += -DDISABLE_BRANCH_PROFILING
ifneq ($(c-gettimeofday-y),) CFLAGS_vgettimeofday.o += -fPIC -include $(c-gettimeofday-y)
From: Björn Töpel bjorn@rivosinc.com
commit 6fdd5d2f8c2f54b7fad4ff4df2a19542aeaf6102 upstream.
64-bit RISC-V kernels have the kernel image mapped separately to alias the linear map. The linear map and the kernel image map are documented as "direct mapping" and "kernel" respectively in [1].
At image load time, the linear map corresponding to the kernel image is set to PAGE_READ permission, and the kernel image map is set to PAGE_READ|PAGE_EXEC.
When the initmem is freed, the pages in the linear map should be restored to PAGE_READ|PAGE_WRITE, whereas the corresponding pages in the kernel image map should be restored to PAGE_READ, by removing the PAGE_EXEC permission.
This is not the case. For 64-bit kernels, only the linear map is restored to its proper page permissions at initmem free, and not the kernel image map.
In practise this results in that the kernel can potentially jump to dead __init code, and start executing invalid instructions, without getting an exception.
Restore the freed initmem properly, by setting both the kernel image map to the correct permissions.
[1] Documentation/riscv/vm-layout.rst
Fixes: e5c35fa04019 ("riscv: Map the kernel with correct permissions the first time") Signed-off-by: Björn Töpel bjorn@rivosinc.com Reviewed-by: Alexandre Ghiti alex@ghiti.fr Tested-by: Alexandre Ghiti alex@ghiti.fr Link: https://lore.kernel.org/r/20221115090641.258476-1-bjorn@kernel.org Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt palmer@rivosinc.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/riscv/kernel/setup.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-)
--- a/arch/riscv/kernel/setup.c +++ b/arch/riscv/kernel/setup.c @@ -322,10 +322,11 @@ subsys_initcall(topology_init);
void free_initmem(void) { - if (IS_ENABLED(CONFIG_STRICT_KERNEL_RWX)) - set_kernel_memory(lm_alias(__init_begin), lm_alias(__init_end), - IS_ENABLED(CONFIG_64BIT) ? - set_memory_rw : set_memory_rw_nx); + if (IS_ENABLED(CONFIG_STRICT_KERNEL_RWX)) { + set_kernel_memory(lm_alias(__init_begin), lm_alias(__init_end), set_memory_rw_nx); + if (IS_ENABLED(CONFIG_64BIT)) + set_kernel_memory(__init_begin, __init_end, set_memory_nx); + }
free_initmem_default(POISON_FREE_INITMEM); }
From: Takashi Sakamoto o-takashi@sakamocchi.jp
commit 9b84f0f74d0d716e3fd18dc428ac111266ef5844 upstream.
For Lexicon I-ONIX FW810S, the call of ioctl(2) with SNDRV_PCM_IOCTL_HW_PARAMS can returns -ETIMEDOUT. This is a regression due to the commit 41319eb56e19 ("ALSA: dice: wait just for NOTIFY_CLOCK_ACCEPTED after GLOBAL_CLOCK_SELECT operation"). The device does not emit NOTIFY_CLOCK_ACCEPTED notification when accepting GLOBAL_CLOCK_SELECT operation with the same parameters as current ones.
This commit fixes the regression. When receiving no notification, return -ETIMEDOUT as long as operating for any change.
Fixes: 41319eb56e19 ("ALSA: dice: wait just for NOTIFY_CLOCK_ACCEPTED after GLOBAL_CLOCK_SELECT operation") Cc: stable@vger.kernel.org Signed-off-by: Takashi Sakamoto o-takashi@sakamocchi.jp Link: https://lore.kernel.org/r/20221130130604.29774-1-o-takashi@sakamocchi.jp Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/firewire/dice/dice-stream.c | 12 +++++++----- 1 file changed, 7 insertions(+), 5 deletions(-)
--- a/sound/firewire/dice/dice-stream.c +++ b/sound/firewire/dice/dice-stream.c @@ -59,7 +59,7 @@ int snd_dice_stream_get_rate_mode(struct
static int select_clock(struct snd_dice *dice, unsigned int rate) { - __be32 reg; + __be32 reg, new; u32 data; int i; int err; @@ -83,15 +83,17 @@ static int select_clock(struct snd_dice if (completion_done(&dice->clock_accepted)) reinit_completion(&dice->clock_accepted);
- reg = cpu_to_be32(data); + new = cpu_to_be32(data); err = snd_dice_transaction_write_global(dice, GLOBAL_CLOCK_SELECT, - ®, sizeof(reg)); + &new, sizeof(new)); if (err < 0) return err;
if (wait_for_completion_timeout(&dice->clock_accepted, - msecs_to_jiffies(NOTIFICATION_TIMEOUT_MS)) == 0) - return -ETIMEDOUT; + msecs_to_jiffies(NOTIFICATION_TIMEOUT_MS)) == 0) { + if (reg != new) + return -ETIMEDOUT; + }
return 0; }
From: Ziyang Xuan william.xuanziyang@huawei.com
commit 8fa452cfafed521aaf5a18c71003fe24b1ee6141 upstream.
In can327_feed_frame_to_netdev(), it did not free the skb when netdev is down, and all callers of can327_feed_frame_to_netdev() did not free allocated skb too. That would trigger skb leak.
Fix it by adding kfree_skb() in can327_feed_frame_to_netdev() when netdev is down. Not tested, just compiled.
Fixes: 43da2f07622f ("can: can327: CAN/ldisc driver for ELM327 based OBD-II adapters") Signed-off-by: Ziyang Xuan william.xuanziyang@huawei.com Link: https://lore.kernel.org/all/20221110061437.411525-1-william.xuanziyang@huawe... Reviewed-by: Max Staudt max@enpas.org Cc: stable@vger.kernel.org Signed-off-by: Marc Kleine-Budde mkl@pengutronix.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/can/can327.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/net/can/can327.c b/drivers/net/can/can327.c index 094197780776..ed3d0b8989a0 100644 --- a/drivers/net/can/can327.c +++ b/drivers/net/can/can327.c @@ -263,8 +263,10 @@ static void can327_feed_frame_to_netdev(struct can327 *elm, struct sk_buff *skb) { lockdep_assert_held(&elm->lock);
- if (!netif_running(elm->dev)) + if (!netif_running(elm->dev)) { + kfree_skb(skb); return; + }
/* Queue for NAPI pickup. * rx-offload will update stats and LEDs for us.
From: Steven Rostedt (Google) rostedt@goodmis.org
commit a4412fdd49dc011bcc2c0d81ac4cab7457092650 upstream.
The config to be able to inject error codes into any function annotated with ALLOW_ERROR_INJECTION() is enabled when FUNCTION_ERROR_INJECTION is enabled. But unfortunately, this is always enabled on x86 when KPROBES is enabled, and there's no way to turn it off.
As kprobes is useful for observability of the kernel, it is useful to have it enabled in production environments. But error injection should be avoided. Add a prompt to the config to allow it to be disabled even when kprobes is enabled, and get rid of the "def_bool y".
This is a kernel debug feature (it's in Kconfig.debug), and should have never been something enabled by default.
Cc: stable@vger.kernel.org Fixes: 540adea3809f6 ("error-injection: Separate error-injection from kprobe") Signed-off-by: Steven Rostedt (Google) rostedt@goodmis.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- lib/Kconfig.debug | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-)
--- a/lib/Kconfig.debug +++ b/lib/Kconfig.debug @@ -1862,8 +1862,14 @@ config NETDEV_NOTIFIER_ERROR_INJECT If unsure, say N.
config FUNCTION_ERROR_INJECTION - def_bool y + bool "Fault-injections of functions" depends on HAVE_FUNCTION_ERROR_INJECTION && KPROBES + help + Add fault injections into various functions that are annotated with + ALLOW_ERROR_INJECTION() in the kernel. BPF may also modify the return + value of theses functions. This is useful to test error paths of code. + + If unsure, say N
config FAULT_INJECTION bool "Fault-injection framework"
From: Tiezhu Yang yangtiezhu@loongson.cn
commit a435874bf626f55d7147026b059008c8de89fbb8 upstream.
The latest version of grep claims the egrep is now obsolete so the build now contains warnings that look like:
egrep: warning: egrep is obsolescent; using grep -E
fix this up by moving the related file to use "grep -E" instead.
sed -i "s/egrep/grep -E/g" `grep egrep -rwl tools/vm`
Here are the steps to install the latest grep:
wget http://ftp.gnu.org/gnu/grep/grep-3.8.tar.gz tar xf grep-3.8.tar.gz cd grep-3.8 && ./configure && make sudo make install export PATH=/usr/local/bin:$PATH
Link: https://lkml.kernel.org/r/1668825419-30584-1-git-send-email-yangtiezhu@loong... Signed-off-by: Tiezhu Yang yangtiezhu@loongson.cn Reviewed-by: Sergey Senozhatsky senozhatsky@chromium.org Cc: Vlastimil Babka vbabka@suse.cz Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- tools/vm/slabinfo-gnuplot.sh | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/tools/vm/slabinfo-gnuplot.sh +++ b/tools/vm/slabinfo-gnuplot.sh @@ -150,7 +150,7 @@ do_preprocess() let lines=3 out=`basename "$in"`"-slabs-by-loss" `cat "$in" | grep -A "$lines" 'Slabs sorted by loss' |\ - egrep -iv '--|Name|Slabs'\ + grep -E -iv '--|Name|Slabs'\ | awk '{print $1" "$4+$2*$3" "$4}' > "$out"` if [ $? -eq 0 ]; then do_slabs_plotting "$out" @@ -159,7 +159,7 @@ do_preprocess() let lines=3 out=`basename "$in"`"-slabs-by-size" `cat "$in" | grep -A "$lines" 'Slabs sorted by size' |\ - egrep -iv '--|Name|Slabs'\ + grep -E -iv '--|Name|Slabs'\ | awk '{print $1" "$4" "$4-$2*$3}' > "$out"` if [ $? -eq 0 ]; then do_slabs_plotting "$out"
From: ZhangPeng zhangpeng362@huawei.com
commit f0a0ccda18d6fd826d7c7e7ad48a6ed61c20f8b4 upstream.
Syzbot reported a null-ptr-deref bug:
NILFS (loop0): segctord starting. Construction interval = 5 seconds, CP frequency < 30 seconds general protection fault, probably for non-canonical address 0xdffffc0000000002: 0000 [#1] PREEMPT SMP KASAN KASAN: null-ptr-deref in range [0x0000000000000010-0x0000000000000017] CPU: 1 PID: 3603 Comm: segctord Not tainted 6.1.0-rc2-syzkaller-00105-gb229b6ca5abb #0 Hardware name: Google Compute Engine/Google Compute Engine, BIOS Google 10/11/2022 RIP: 0010:nilfs_palloc_commit_free_entry+0xe5/0x6b0 fs/nilfs2/alloc.c:608 Code: 00 00 00 00 fc ff df 80 3c 02 00 0f 85 cd 05 00 00 48 b8 00 00 00 00 00 fc ff df 4c 8b 73 08 49 8d 7e 10 48 89 fa 48 c1 ea 03 <80> 3c 02 00 0f 85 26 05 00 00 49 8b 46 10 be a6 00 00 00 48 c7 c7 RSP: 0018:ffffc90003dff830 EFLAGS: 00010212 RAX: dffffc0000000000 RBX: ffff88802594e218 RCX: 000000000000000d RDX: 0000000000000002 RSI: 0000000000002000 RDI: 0000000000000010 RBP: ffff888071880222 R08: 0000000000000005 R09: 000000000000003f R10: 000000000000000d R11: 0000000000000000 R12: ffff888071880158 R13: ffff88802594e220 R14: 0000000000000000 R15: 0000000000000004 FS: 0000000000000000(0000) GS:ffff8880b9b00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fb1c08316a8 CR3: 0000000018560000 CR4: 0000000000350ee0 Call Trace: <TASK> nilfs_dat_commit_free fs/nilfs2/dat.c:114 [inline] nilfs_dat_commit_end+0x464/0x5f0 fs/nilfs2/dat.c:193 nilfs_dat_commit_update+0x26/0x40 fs/nilfs2/dat.c:236 nilfs_btree_commit_update_v+0x87/0x4a0 fs/nilfs2/btree.c:1940 nilfs_btree_commit_propagate_v fs/nilfs2/btree.c:2016 [inline] nilfs_btree_propagate_v fs/nilfs2/btree.c:2046 [inline] nilfs_btree_propagate+0xa00/0xd60 fs/nilfs2/btree.c:2088 nilfs_bmap_propagate+0x73/0x170 fs/nilfs2/bmap.c:337 nilfs_collect_file_data+0x45/0xd0 fs/nilfs2/segment.c:568 nilfs_segctor_apply_buffers+0x14a/0x470 fs/nilfs2/segment.c:1018 nilfs_segctor_scan_file+0x3f4/0x6f0 fs/nilfs2/segment.c:1067 nilfs_segctor_collect_blocks fs/nilfs2/segment.c:1197 [inline] nilfs_segctor_collect fs/nilfs2/segment.c:1503 [inline] nilfs_segctor_do_construct+0x12fc/0x6af0 fs/nilfs2/segment.c:2045 nilfs_segctor_construct+0x8e3/0xb30 fs/nilfs2/segment.c:2379 nilfs_segctor_thread_construct fs/nilfs2/segment.c:2487 [inline] nilfs_segctor_thread+0x3c3/0xf30 fs/nilfs2/segment.c:2570 kthread+0x2e4/0x3a0 kernel/kthread.c:376 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:306 </TASK> ...
If DAT metadata file is corrupted on disk, there is a case where req->pr_desc_bh is NULL and blocknr is 0 at nilfs_dat_commit_end() during a b-tree operation that cascadingly updates ancestor nodes of the b-tree, because nilfs_dat_commit_alloc() for a lower level block can initialize the blocknr on the same DAT entry between nilfs_dat_prepare_end() and nilfs_dat_commit_end().
If this happens, nilfs_dat_commit_end() calls nilfs_dat_commit_free() without valid buffer heads in req->pr_desc_bh and req->pr_bitmap_bh, and causes the NULL pointer dereference above in nilfs_palloc_commit_free_entry() function, which leads to a crash.
Fix this by adding a NULL check on req->pr_desc_bh and req->pr_bitmap_bh before nilfs_palloc_commit_free_entry() in nilfs_dat_commit_free().
This also calls nilfs_error() in that case to notify that there is a fatal flaw in the filesystem metadata and prevent further operations.
Link: https://lkml.kernel.org/r/00000000000097c20205ebaea3d6@google.com Link: https://lkml.kernel.org/r/20221114040441.1649940-1-zhangpeng362@huawei.com Link: https://lkml.kernel.org/r/20221119120542.17204-1-konishi.ryusuke@gmail.com Signed-off-by: ZhangPeng zhangpeng362@huawei.com Signed-off-by: Ryusuke Konishi konishi.ryusuke@gmail.com Reported-by: syzbot+ebe05ee8e98f755f61d0@syzkaller.appspotmail.com Tested-by: Ryusuke Konishi konishi.ryusuke@gmail.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/nilfs2/dat.c | 7 +++++++ 1 file changed, 7 insertions(+)
--- a/fs/nilfs2/dat.c +++ b/fs/nilfs2/dat.c @@ -111,6 +111,13 @@ static void nilfs_dat_commit_free(struct kunmap_atomic(kaddr);
nilfs_dat_commit_entry(dat, req); + + if (unlikely(req->pr_desc_bh == NULL || req->pr_bitmap_bh == NULL)) { + nilfs_error(dat->i_sb, + "state inconsistency probably due to duplicate use of vblocknr = %llu", + (unsigned long long)req->pr_entry_nr); + return; + } nilfs_palloc_commit_free_entry(dat, req); }
From: Pawan Gupta pawan.kumar.gupta@linux.intel.com
commit 66065157420c5b9b3f078f43d313c153e1ff7f83 upstream.
The "force" argument to write_spec_ctrl_current() is currently ambiguous as it does not guarantee the MSR write. This is due to the optimization that writes to the MSR happen only when the new value differs from the cached value.
This is fine in most cases, but breaks for S3 resume when the cached MSR value gets out of sync with the hardware MSR value due to S3 resetting it.
When x86_spec_ctrl_current is same as x86_spec_ctrl_base, the MSR write is skipped. Which results in SPEC_CTRL mitigations not getting restored.
Move the MSR write from write_spec_ctrl_current() to a new function that unconditionally writes to the MSR. Update the callers accordingly and rename functions.
[ bp: Rework a bit. ]
Fixes: caa0ff24d5d0 ("x86/bugs: Keep a per-CPU IA32_SPEC_CTRL value") Suggested-by: Borislav Petkov bp@alien8.de Signed-off-by: Pawan Gupta pawan.kumar.gupta@linux.intel.com Signed-off-by: Borislav Petkov (AMD) bp@alien8.de Reviewed-by: Thomas Gleixner tglx@linutronix.de Cc: stable@kernel.org Link: https://lore.kernel.org/r/806d39b0bfec2fe8f50dc5446dff20f5bb24a959.166982157... Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/include/asm/nospec-branch.h | 2 +- arch/x86/kernel/cpu/bugs.c | 21 ++++++++++++++------- arch/x86/kernel/process.c | 2 +- 3 files changed, 16 insertions(+), 9 deletions(-)
--- a/arch/x86/include/asm/nospec-branch.h +++ b/arch/x86/include/asm/nospec-branch.h @@ -321,7 +321,7 @@ static inline void indirect_branch_predi /* The Intel SPEC CTRL MSR base value cache */ extern u64 x86_spec_ctrl_base; DECLARE_PER_CPU(u64, x86_spec_ctrl_current); -extern void write_spec_ctrl_current(u64 val, bool force); +extern void update_spec_ctrl_cond(u64 val); extern u64 spec_ctrl_current(void);
/* --- a/arch/x86/kernel/cpu/bugs.c +++ b/arch/x86/kernel/cpu/bugs.c @@ -60,11 +60,18 @@ EXPORT_SYMBOL_GPL(x86_spec_ctrl_current)
static DEFINE_MUTEX(spec_ctrl_mutex);
+/* Update SPEC_CTRL MSR and its cached copy unconditionally */ +static void update_spec_ctrl(u64 val) +{ + this_cpu_write(x86_spec_ctrl_current, val); + wrmsrl(MSR_IA32_SPEC_CTRL, val); +} + /* * Keep track of the SPEC_CTRL MSR value for the current task, which may differ * from x86_spec_ctrl_base due to STIBP/SSB in __speculation_ctrl_update(). */ -void write_spec_ctrl_current(u64 val, bool force) +void update_spec_ctrl_cond(u64 val) { if (this_cpu_read(x86_spec_ctrl_current) == val) return; @@ -75,7 +82,7 @@ void write_spec_ctrl_current(u64 val, bo * When KERNEL_IBRS this MSR is written on return-to-user, unless * forced the update can be delayed until that time. */ - if (force || !cpu_feature_enabled(X86_FEATURE_KERNEL_IBRS)) + if (!cpu_feature_enabled(X86_FEATURE_KERNEL_IBRS)) wrmsrl(MSR_IA32_SPEC_CTRL, val); }
@@ -1328,7 +1335,7 @@ static void __init spec_ctrl_disable_ker
if (ia32_cap & ARCH_CAP_RRSBA) { x86_spec_ctrl_base |= SPEC_CTRL_RRSBA_DIS_S; - write_spec_ctrl_current(x86_spec_ctrl_base, true); + update_spec_ctrl(x86_spec_ctrl_base); } }
@@ -1450,7 +1457,7 @@ static void __init spectre_v2_select_mit
if (spectre_v2_in_ibrs_mode(mode)) { x86_spec_ctrl_base |= SPEC_CTRL_IBRS; - write_spec_ctrl_current(x86_spec_ctrl_base, true); + update_spec_ctrl(x86_spec_ctrl_base); }
switch (mode) { @@ -1564,7 +1571,7 @@ static void __init spectre_v2_select_mit static void update_stibp_msr(void * __unused) { u64 val = spec_ctrl_current() | (x86_spec_ctrl_base & SPEC_CTRL_STIBP); - write_spec_ctrl_current(val, true); + update_spec_ctrl(val); }
/* Update x86_spec_ctrl_base in case SMT state changed. */ @@ -1797,7 +1804,7 @@ static enum ssb_mitigation __init __ssb_ x86_amd_ssb_disable(); } else { x86_spec_ctrl_base |= SPEC_CTRL_SSBD; - write_spec_ctrl_current(x86_spec_ctrl_base, true); + update_spec_ctrl(x86_spec_ctrl_base); } }
@@ -2048,7 +2055,7 @@ int arch_prctl_spec_ctrl_get(struct task void x86_spec_ctrl_setup_ap(void) { if (boot_cpu_has(X86_FEATURE_MSR_SPEC_CTRL)) - write_spec_ctrl_current(x86_spec_ctrl_base, true); + update_spec_ctrl(x86_spec_ctrl_base);
if (ssb_mode == SPEC_STORE_BYPASS_DISABLE) x86_amd_ssb_disable(); --- a/arch/x86/kernel/process.c +++ b/arch/x86/kernel/process.c @@ -600,7 +600,7 @@ static __always_inline void __speculatio }
if (updmsr) - write_spec_ctrl_current(msr, false); + update_spec_ctrl_cond(msr); }
static unsigned long speculation_ctrl_update_tif(struct task_struct *tsk)
From: Andy Shevchenko andriy.shevchenko@linux.intel.com
commit 6989ea4881c8944fbf04378418bb1af63d875ef8 upstream.
The firmware on some systems may configure GPIO pins to be an interrupt source in so called "direct IRQ" mode. In such cases the GPIO controller driver has no idea if those pins are being used or not. At the same time, there is a known bug in the firmwares that don't restore the pin settings correctly after suspend, i.e. by an unknown reason the Rx value becomes inverted.
Hence, let's save and restore the pins that are configured as GPIOs in the input mode with GPIROUTIOXAPIC bit set.
Cc: stable@vger.kernel.org Reported-and-tested-by: Dale Smith dalepsmith@gmail.com Reported-and-tested-by: John Harris jmharris@gmail.com BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=214749 Signed-off-by: Andy Shevchenko andriy.shevchenko@linux.intel.com Acked-by: Mika Westerberg mika.westerberg@linux.intel.com Link: https://lore.kernel.org/r/20221124222926.72326-1-andriy.shevchenko@linux.int... Signed-off-by: Linus Walleij linus.walleij@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pinctrl/intel/pinctrl-intel.c | 27 ++++++++++++++++++++++++++- 1 file changed, 26 insertions(+), 1 deletion(-)
--- a/drivers/pinctrl/intel/pinctrl-intel.c +++ b/drivers/pinctrl/intel/pinctrl-intel.c @@ -436,9 +436,14 @@ static void __intel_gpio_set_direction(v writel(value, padcfg0); }
+static int __intel_gpio_get_gpio_mode(u32 value) +{ + return (value & PADCFG0_PMODE_MASK) >> PADCFG0_PMODE_SHIFT; +} + static int intel_gpio_get_gpio_mode(void __iomem *padcfg0) { - return (readl(padcfg0) & PADCFG0_PMODE_MASK) >> PADCFG0_PMODE_SHIFT; + return __intel_gpio_get_gpio_mode(readl(padcfg0)); }
static void intel_gpio_set_gpio_mode(void __iomem *padcfg0) @@ -1674,6 +1679,7 @@ EXPORT_SYMBOL_GPL(intel_pinctrl_get_soc_ static bool intel_pinctrl_should_save(struct intel_pinctrl *pctrl, unsigned int pin) { const struct pin_desc *pd = pin_desc_get(pctrl->pctldev, pin); + u32 value;
if (!pd || !intel_pad_usable(pctrl, pin)) return false; @@ -1688,6 +1694,25 @@ static bool intel_pinctrl_should_save(st gpiochip_line_is_irq(&pctrl->chip, intel_pin_to_gpio(pctrl, pin))) return true;
+ /* + * The firmware on some systems may configure GPIO pins to be + * an interrupt source in so called "direct IRQ" mode. In such + * cases the GPIO controller driver has no idea if those pins + * are being used or not. At the same time, there is a known bug + * in the firmwares that don't restore the pin settings correctly + * after suspend, i.e. by an unknown reason the Rx value becomes + * inverted. + * + * Hence, let's save and restore the pins that are configured + * as GPIOs in the input mode with GPIROUTIOXAPIC bit set. + * + * See https://bugzilla.kernel.org/show_bug.cgi?id=214749. + */ + value = readl(intel_get_padcfg(pctrl, pin, PADCFG0)); + if ((value & PADCFG0_GPIROUTIOXAPIC) && (value & PADCFG0_GPIOTXDIS) && + (__intel_gpio_get_gpio_mode(value) == PADCFG0_PMODE_GPIO)) + return true; + return false; }
From: Linus Torvalds torvalds@linux-foundation.org
commit 6647e76ab623b2b3fb2efe03a86e9c9046c52c33 upstream.
The V4L2_MEMORY_USERPTR interface is long deprecated and shouldn't be used (and is discouraged for any modern v4l drivers). And Seth Jenkins points out that the fallback to VM_PFNMAP/VM_IO is fundamentally racy and dangerous.
Note that it's not even a case that should trigger, since any normal user pointer logic ends up just using the pin_user_pages_fast() call that does the proper page reference counting. That's not the problem case, only if you try to use special device mappings do you have any issues.
Normally I'd just remove this during the merge window, but since Seth pointed out the problem cases, we really want to know as soon as possible if there are actually any users of this odd special case of a legacy interface. Neither Hans nor Mauro seem to think that such mis-uses of the old legacy interface should exist. As Mauro says:
"See, V4L2 has actually 4 streaming APIs: - Kernel-allocated mmap (usually referred simply as just mmap); - USERPTR mmap; - read(); - dmabuf;
The USERPTR is one of the oldest way to use it, coming from V4L version 1 times, and by far the least used one"
And Hans chimed in on the USERPTR interface:
"To be honest, I wouldn't mind if it goes away completely, but that's a bit of a pipe dream right now"
but while removing this legacy interface entirely may be a pipe dream we can at least try to remove the unlikely (and actively broken) case of using special device mappings for USERPTR accesses.
This replaces it with a WARN_ONCE() that we can remove once we've hopefully confirmed that no actual users exist.
NOTE! Longer term, this means that a 'struct frame_vector' only ever contains proper page pointers, and all the games we have with converting them to pages can go away (grep for 'frame_vector_to_pages()' and the uses of 'vec->is_pfns'). But this is just the first step, to verify that this code really is all dead, and do so as quickly as possible.
Reported-by: Seth Jenkins sethjenkins@google.com Acked-by: Hans Verkuil hverkuil@xs4all.nl Acked-by: Mauro Carvalho Chehab mchehab@kernel.org Cc: David Hildenbrand david@redhat.com Cc: Jan Kara jack@suse.cz Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/media/common/videobuf2/frame_vector.c | 68 ++++---------------------- 1 file changed, 12 insertions(+), 56 deletions(-)
--- a/drivers/media/common/videobuf2/frame_vector.c +++ b/drivers/media/common/videobuf2/frame_vector.c @@ -35,11 +35,7 @@ int get_vaddr_frames(unsigned long start, unsigned int nr_frames, struct frame_vector *vec) { - struct mm_struct *mm = current->mm; - struct vm_area_struct *vma; - int ret_pin_user_pages_fast = 0; - int ret = 0; - int err; + int ret;
if (nr_frames == 0) return 0; @@ -52,57 +48,17 @@ int get_vaddr_frames(unsigned long start ret = pin_user_pages_fast(start, nr_frames, FOLL_FORCE | FOLL_WRITE | FOLL_LONGTERM, (struct page **)(vec->ptrs)); - if (ret > 0) { - vec->got_ref = true; - vec->is_pfns = false; - goto out_unlocked; - } - ret_pin_user_pages_fast = ret; - - mmap_read_lock(mm); - vec->got_ref = false; - vec->is_pfns = true; - ret = 0; - do { - unsigned long *nums = frame_vector_pfns(vec); - - vma = vma_lookup(mm, start); - if (!vma) - break; - - while (ret < nr_frames && start + PAGE_SIZE <= vma->vm_end) { - err = follow_pfn(vma, start, &nums[ret]); - if (err) { - if (ret) - goto out; - // If follow_pfn() returns -EINVAL, then this - // is not an IO mapping or a raw PFN mapping. - // In that case, return the original error from - // pin_user_pages_fast(). Otherwise this - // function would return -EINVAL when - // pin_user_pages_fast() returned -ENOMEM, - // which makes debugging hard. - if (err == -EINVAL && ret_pin_user_pages_fast) - ret = ret_pin_user_pages_fast; - else - ret = err; - goto out; - } - start += PAGE_SIZE; - ret++; - } - /* Bail out if VMA doesn't completely cover the tail page. */ - if (start < vma->vm_end) - break; - } while (ret < nr_frames); -out: - mmap_read_unlock(mm); -out_unlocked: - if (!ret) - ret = -EFAULT; - if (ret > 0) - vec->nr_frames = ret; - return ret; + vec->got_ref = true; + vec->is_pfns = false; + vec->nr_frames = ret; + + if (likely(ret > 0)) + return ret; + + /* This used to (racily) return non-refcounted pfns. Let people know */ + WARN_ONCE(1, "get_vaddr_frames() cannot follow VM_IO mapping"); + vec->nr_frames = 0; + return ret ? ret : -EFAULT; } EXPORT_SYMBOL(get_vaddr_frames);
From: Gavin Shan gshan@redhat.com
commit 829ae0f81ce093d674ff2256f66a714753e9ce32 upstream.
The issue is reported when removing memory through virtio_mem device. The transparent huge page, experienced copy-on-write fault, is wrongly regarded as pinned. The transparent huge page is escaped from being isolated in isolate_migratepages_block(). The transparent huge page can't be migrated and the corresponding memory block can't be put into offline state.
Fix it by replacing page_mapcount() with total_mapcount(). With this, the transparent huge page can be isolated and migrated, and the memory block can be put into offline state. Besides, The page's refcount is increased a bit earlier to avoid the page is released when the check is executed.
Link: https://lkml.kernel.org/r/20221124095523.31061-1-gshan@redhat.com Fixes: 1da2f328fa64 ("mm,thp,compaction,cma: allow THP migration for CMA allocations") Signed-off-by: Gavin Shan gshan@redhat.com Reported-by: Zhenyu Zhang zhenyzha@redhat.com Tested-by: Zhenyu Zhang zhenyzha@redhat.com Suggested-by: David Hildenbrand david@redhat.com Acked-by: David Hildenbrand david@redhat.com Cc: Alistair Popple apopple@nvidia.com Cc: Hugh Dickins hughd@google.com Cc: Kirill A. Shutemov kirill.shutemov@linux.intel.com Cc: Matthew Wilcox willy@infradead.org Cc: William Kucharski william.kucharski@oracle.com Cc: Zi Yan ziy@nvidia.com Cc: stable@vger.kernel.org [5.7+] Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/compaction.c | 22 +++++++++++----------- 1 file changed, 11 insertions(+), 11 deletions(-)
--- a/mm/compaction.c +++ b/mm/compaction.c @@ -987,28 +987,28 @@ isolate_migratepages_block(struct compac }
/* + * Be careful not to clear PageLRU until after we're + * sure the page is not being freed elsewhere -- the + * page release code relies on it. + */ + if (unlikely(!get_page_unless_zero(page))) + goto isolate_fail; + + /* * Migration will fail if an anonymous page is pinned in memory, * so avoid taking lru_lock and isolating it unnecessarily in an * admittedly racy check. */ mapping = page_mapping(page); - if (!mapping && page_count(page) > page_mapcount(page)) - goto isolate_fail; + if (!mapping && (page_count(page) - 1) > total_mapcount(page)) + goto isolate_fail_put;
/* * Only allow to migrate anonymous pages in GFP_NOFS context * because those do not depend on fs locks. */ if (!(cc->gfp_mask & __GFP_FS) && mapping) - goto isolate_fail; - - /* - * Be careful not to clear PageLRU until after we're - * sure the page is not being freed elsewhere -- the - * page release code relies on it. - */ - if (unlikely(!get_page_unless_zero(page))) - goto isolate_fail; + goto isolate_fail_put;
/* Only take pages on LRU: a check now makes later tests safe */ if (!PageLRU(page))
From: Goh, Wei Sheng wei.sheng.goh@intel.com
commit cc3d2b5fc0d6f8ad8a52da5ea679e5c2ec2adbd4 upstream.
Currently, pause frame register GMAC_RX_FLOW_CTRL_RFE is not updated correctly when 'ethtool -A <IFACE> autoneg off rx off tx off' command is issued. This fix ensures the flow control change is reflected directly in the GMAC_RX_FLOW_CTRL_RFE register.
Fixes: 46f69ded988d ("net: stmmac: Use resolved link config in mac_link_up()") Cc: stable@vger.kernel.org # 5.10.x Signed-off-by: Goh, Wei Sheng wei.sheng.goh@intel.com Signed-off-by: Noor Azura Ahmad Tarmizi noor.azura.ahmad.tarmizi@intel.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c | 2 ++ drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 12 ++++++++++-- 2 files changed, 12 insertions(+), 2 deletions(-)
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c +++ b/drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c @@ -749,6 +749,8 @@ static void dwmac4_flow_ctrl(struct mac_ if (fc & FLOW_RX) { pr_debug("\tReceive Flow-Control ON\n"); flow |= GMAC_RX_FLOW_CTRL_RFE; + } else { + pr_debug("\tReceive Flow-Control OFF\n"); } writel(flow, ioaddr + GMAC_RX_FLOW_CTRL);
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c @@ -1061,8 +1061,16 @@ static void stmmac_mac_link_up(struct ph ctrl |= priv->hw->link.duplex;
/* Flow Control operation */ - if (tx_pause && rx_pause) - stmmac_mac_flow_ctrl(priv, duplex); + if (rx_pause && tx_pause) + priv->flow_ctrl = FLOW_AUTO; + else if (rx_pause && !tx_pause) + priv->flow_ctrl = FLOW_RX; + else if (!rx_pause && tx_pause) + priv->flow_ctrl = FLOW_TX; + else + priv->flow_ctrl = FLOW_OFF; + + stmmac_mac_flow_ctrl(priv, duplex);
if (ctrl != old_ctrl) writel(ctrl, priv->ioaddr + MAC_CTRL_REG);
From: Ye Bin yebin10@huawei.com
commit f4307b4df1c28842bb1950ff0e1b97e17031b17f upstream.
In __mmc_test_register_dbgfs_file(), we need to assign 'file', as it's being used when removing the debugfs files when the mmc_test module is removed.
Fixes: a04c50aaa916 ("mmc: core: no need to check return value of debugfs_create functions") Signed-off-by: Ye Bin yebin10@huawei.com Acked-by: Adrian Hunter adrian.hunter@intel.com Cc: stable@vger.kernel.org [Ulf: Re-wrote the commit msg] Link: https://lore.kernel.org/r/20221123095506.1965691-1-yebin@huaweicloud.com Signed-off-by: Ulf Hansson ulf.hansson@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/mmc/core/mmc_test.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/drivers/mmc/core/mmc_test.c +++ b/drivers/mmc/core/mmc_test.c @@ -3179,7 +3179,8 @@ static int __mmc_test_register_dbgfs_fil struct mmc_test_dbgfs_file *df;
if (card->debugfs_root) - debugfs_create_file(name, mode, card->debugfs_root, card, fops); + file = debugfs_create_file(name, mode, card->debugfs_root, + card, fops);
df = kmalloc(sizeof(*df), GFP_KERNEL); if (!df) {
From: Gaosheng Cui cuigaosheng1@huawei.com
commit c61bfb1cb63ddab52b31cf5f1924688917e61fad upstream.
The clk_disable_unprepare() should be called in the error handling of devm_clk_bulk_get_optional, fix it by replacing devm_clk_get_optional and clk_prepare_enable by devm_clk_get_optional_enabled.
Fixes: f5eccd94b63f ("mmc: mediatek: Add subsys clock control for MT8192 msdc") Signed-off-by: Gaosheng Cui cuigaosheng1@huawei.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20221125090141.3626747-1-cuigaosheng1@huawei.com Signed-off-by: Ulf Hansson ulf.hansson@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/mmc/host/mtk-sd.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-)
--- a/drivers/mmc/host/mtk-sd.c +++ b/drivers/mmc/host/mtk-sd.c @@ -2573,13 +2573,11 @@ static int msdc_of_clock_parse(struct pl return PTR_ERR(host->src_clk_cg); }
- host->sys_clk_cg = devm_clk_get_optional(&pdev->dev, "sys_cg"); + /* If present, always enable for this clock gate */ + host->sys_clk_cg = devm_clk_get_optional_enabled(&pdev->dev, "sys_cg"); if (IS_ERR(host->sys_clk_cg)) host->sys_clk_cg = NULL;
- /* If present, always enable for this clock gate */ - clk_prepare_enable(host->sys_clk_cg); - host->bulk_clks[0].id = "pclk_cg"; host->bulk_clks[1].id = "axi_cg"; host->bulk_clks[2].id = "ahb_cg";
From: Christian Löhle CLoehle@hyperstone.com
commit 489d144563f23911262a652234b80c70c89c978b upstream.
Clean up the MMC_TRIM_ARGS define that became ambiguous with DISCARD introduction. While at it, let's fix one usage where MMC_TRIM_ARGS falsely included DISCARD too.
Fixes: b3bf915308ca ("mmc: core: new discard feature support at eMMC v4.5") Signed-off-by: Christian Loehle cloehle@hyperstone.com Acked-by: Adrian Hunter adrian.hunter@intel.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/11376b5714964345908f3990f17e0701@hyperstone.com Signed-off-by: Ulf Hansson ulf.hansson@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/mmc/core/core.c | 9 +++++++-- include/linux/mmc/mmc.h | 2 +- 2 files changed, 8 insertions(+), 3 deletions(-)
--- a/drivers/mmc/core/core.c +++ b/drivers/mmc/core/core.c @@ -1484,6 +1484,11 @@ void mmc_init_erase(struct mmc_card *car card->pref_erase = 0; }
+static bool is_trim_arg(unsigned int arg) +{ + return (arg & MMC_TRIM_OR_DISCARD_ARGS) && arg != MMC_DISCARD_ARG; +} + static unsigned int mmc_mmc_erase_timeout(struct mmc_card *card, unsigned int arg, unsigned int qty) { @@ -1766,7 +1771,7 @@ int mmc_erase(struct mmc_card *card, uns !(card->ext_csd.sec_feature_support & EXT_CSD_SEC_ER_EN)) return -EOPNOTSUPP;
- if (mmc_card_mmc(card) && (arg & MMC_TRIM_ARGS) && + if (mmc_card_mmc(card) && is_trim_arg(arg) && !(card->ext_csd.sec_feature_support & EXT_CSD_SEC_GB_CL_EN)) return -EOPNOTSUPP;
@@ -1796,7 +1801,7 @@ int mmc_erase(struct mmc_card *card, uns * identified by the card->eg_boundary flag. */ rem = card->erase_size - (from % card->erase_size); - if ((arg & MMC_TRIM_ARGS) && (card->eg_boundary) && (nr > rem)) { + if ((arg & MMC_TRIM_OR_DISCARD_ARGS) && card->eg_boundary && nr > rem) { err = mmc_do_erase(card, from, from + rem - 1, arg); from += rem; if ((err) || (to <= from)) --- a/include/linux/mmc/mmc.h +++ b/include/linux/mmc/mmc.h @@ -451,7 +451,7 @@ static inline bool mmc_ready_for_data(u3 #define MMC_SECURE_TRIM1_ARG 0x80000001 #define MMC_SECURE_TRIM2_ARG 0x80008000 #define MMC_SECURE_ARGS 0x80000000 -#define MMC_TRIM_ARGS 0x00008001 +#define MMC_TRIM_OR_DISCARD_ARGS 0x00008003
#define mmc_driver_type_mask(n) (1 << (n))
From: Sebastian Falbesoner sebastian.falbesoner@gmail.com
commit a3cab1d2132474969871b5d7f915c5c0167b48b0 upstream.
With the current logic the "failed to exit halt state" error would be shown even if any other bit than CQHCI_HALT was set in the CQHCI_CTL register, since the right hand side is always true. Fix this by using the correct operator (bit-wise instead of logical AND) to only check for the halt bit flag, which was obviously intended here.
Fixes: 85236d2be844 ("mmc: sdhci-esdhc-imx: clear the HALT bit when enable CQE") Signed-off-by: Sebastian Falbesoner sebastian.falbesoner@gmail.com Acked-by: Haibo Chen haibo.chen@nxp.com Acked-by: Adrian Hunter adrian.hunter@intel.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20221121105721.1903878-1-sebastian.falbesoner@gmai... Signed-off-by: Ulf Hansson ulf.hansson@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/mmc/host/sdhci-esdhc-imx.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/mmc/host/sdhci-esdhc-imx.c +++ b/drivers/mmc/host/sdhci-esdhc-imx.c @@ -1512,7 +1512,7 @@ static void esdhc_cqe_enable(struct mmc_ * system resume back. */ cqhci_writel(cq_host, 0, CQHCI_CTL); - if (cqhci_readl(cq_host, CQHCI_CTL) && CQHCI_HALT) + if (cqhci_readl(cq_host, CQHCI_CTL) & CQHCI_HALT) dev_err(mmc_dev(host->mmc), "failed to exit halt state when enable CQE\n");
From: Wenchao Chen wenchao.chen@unisoc.com
commit dd30dcfa7a74a06f8dcdab260d8d5adf32f17333 upstream.
After switching the voltage, no reset data and command will cause CMD2 timeout.
Fixes: 29ca763fc26f ("mmc: sdhci-sprd: Add pin control support for voltage switch") Signed-off-by: Wenchao Chen wenchao.chen@unisoc.com Acked-by: Adrian Hunter adrian.hunter@intel.com Reviewed-by: Baolin Wang baolin.wang@linux.alibaba.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20221130121328.25553-1-wenchao.chen@unisoc.com Signed-off-by: Ulf Hansson ulf.hansson@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/mmc/host/sdhci-sprd.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
--- a/drivers/mmc/host/sdhci-sprd.c +++ b/drivers/mmc/host/sdhci-sprd.c @@ -470,7 +470,7 @@ static int sdhci_sprd_voltage_switch(str }
if (IS_ERR(sprd_host->pinctrl)) - return 0; + goto reset;
switch (ios->signal_voltage) { case MMC_SIGNAL_VOLTAGE_180: @@ -498,6 +498,8 @@ static int sdhci_sprd_voltage_switch(str
/* Wait for 300 ~ 500 us for pin state stable */ usleep_range(300, 500); + +reset: sdhci_reset(host, SDHCI_RESET_CMD | SDHCI_RESET_DATA);
return 0;
From: Adrian Hunter adrian.hunter@intel.com
commit c981cdfb9925f64a364f13c2b4f98f877308a408 upstream.
Commit 20b92a30b561 ("mmc: sdhci: update signal voltage switch code") removed voltage switch delays from sdhci because mmc core had been enhanced to support them. However that assumed that sdhci_set_ios() did a single clock change, which it did not, and so the delays in mmc core, which should have come after the first clock change, were not effective.
Fix by avoiding re-configuring UHS and preset settings when the clock is turning on and the settings have not changed. That then also avoids the associated clock changes, so that then sdhci_set_ios() does a single clock change when voltage switching, and the mmc core delays become effective.
To do that has meant keeping track of driver strength (host->drv_type), and cases of reinitialization (host->reinit_uhs).
Note also, the 'turning_on_clk' restriction should not be necessary but is done to minimize the impact of the change on stable kernels.
Fixes: 20b92a30b561 ("mmc: sdhci: update signal voltage switch code") Cc: stable@vger.kernel.org Signed-off-by: Adrian Hunter adrian.hunter@intel.com Link: https://lore.kernel.org/r/20221128133259.38305-2-adrian.hunter@intel.com Signed-off-by: Ulf Hansson ulf.hansson@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/mmc/host/sdhci.c | 61 +++++++++++++++++++++++++++++++++++++++++------ drivers/mmc/host/sdhci.h | 2 + 2 files changed, 56 insertions(+), 7 deletions(-)
--- a/drivers/mmc/host/sdhci.c +++ b/drivers/mmc/host/sdhci.c @@ -339,6 +339,7 @@ static void sdhci_init(struct sdhci_host if (soft) { /* force clock reconfiguration */ host->clock = 0; + host->reinit_uhs = true; mmc->ops->set_ios(mmc, &mmc->ios); } } @@ -2258,11 +2259,46 @@ void sdhci_set_uhs_signaling(struct sdhc } EXPORT_SYMBOL_GPL(sdhci_set_uhs_signaling);
+static bool sdhci_timing_has_preset(unsigned char timing) +{ + switch (timing) { + case MMC_TIMING_UHS_SDR12: + case MMC_TIMING_UHS_SDR25: + case MMC_TIMING_UHS_SDR50: + case MMC_TIMING_UHS_SDR104: + case MMC_TIMING_UHS_DDR50: + case MMC_TIMING_MMC_DDR52: + return true; + }; + return false; +} + +static bool sdhci_preset_needed(struct sdhci_host *host, unsigned char timing) +{ + return !(host->quirks2 & SDHCI_QUIRK2_PRESET_VALUE_BROKEN) && + sdhci_timing_has_preset(timing); +} + +static bool sdhci_presetable_values_change(struct sdhci_host *host, struct mmc_ios *ios) +{ + /* + * Preset Values are: Driver Strength, Clock Generator and SDCLK/RCLK + * Frequency. Check if preset values need to be enabled, or the Driver + * Strength needs updating. Note, clock changes are handled separately. + */ + return !host->preset_enabled && + (sdhci_preset_needed(host, ios->timing) || host->drv_type != ios->drv_type); +} + void sdhci_set_ios(struct mmc_host *mmc, struct mmc_ios *ios) { struct sdhci_host *host = mmc_priv(mmc); + bool reinit_uhs = host->reinit_uhs; + bool turning_on_clk = false; u8 ctrl;
+ host->reinit_uhs = false; + if (ios->power_mode == MMC_POWER_UNDEFINED) return;
@@ -2288,6 +2324,8 @@ void sdhci_set_ios(struct mmc_host *mmc, sdhci_enable_preset_value(host, false);
if (!ios->clock || ios->clock != host->clock) { + turning_on_clk = ios->clock && !host->clock; + host->ops->set_clock(host, ios->clock); host->clock = ios->clock;
@@ -2314,6 +2352,17 @@ void sdhci_set_ios(struct mmc_host *mmc,
host->ops->set_bus_width(host, ios->bus_width);
+ /* + * Special case to avoid multiple clock changes during voltage + * switching. + */ + if (!reinit_uhs && + turning_on_clk && + host->timing == ios->timing && + host->version >= SDHCI_SPEC_300 && + !sdhci_presetable_values_change(host, ios)) + return; + ctrl = sdhci_readb(host, SDHCI_HOST_CONTROL);
if (!(host->quirks & SDHCI_QUIRK_NO_HISPD_BIT)) { @@ -2357,6 +2406,7 @@ void sdhci_set_ios(struct mmc_host *mmc, }
sdhci_writew(host, ctrl_2, SDHCI_HOST_CONTROL2); + host->drv_type = ios->drv_type; } else { /* * According to SDHC Spec v3.00, if the Preset Value @@ -2384,19 +2434,14 @@ void sdhci_set_ios(struct mmc_host *mmc, host->ops->set_uhs_signaling(host, ios->timing); host->timing = ios->timing;
- if (!(host->quirks2 & SDHCI_QUIRK2_PRESET_VALUE_BROKEN) && - ((ios->timing == MMC_TIMING_UHS_SDR12) || - (ios->timing == MMC_TIMING_UHS_SDR25) || - (ios->timing == MMC_TIMING_UHS_SDR50) || - (ios->timing == MMC_TIMING_UHS_SDR104) || - (ios->timing == MMC_TIMING_UHS_DDR50) || - (ios->timing == MMC_TIMING_MMC_DDR52))) { + if (sdhci_preset_needed(host, ios->timing)) { u16 preset;
sdhci_enable_preset_value(host, true); preset = sdhci_get_preset_value(host); ios->drv_type = FIELD_GET(SDHCI_PRESET_DRV_MASK, preset); + host->drv_type = ios->drv_type; }
/* Re-enable SD Clock */ @@ -3748,6 +3793,7 @@ int sdhci_resume_host(struct sdhci_host sdhci_init(host, 0); host->pwr = 0; host->clock = 0; + host->reinit_uhs = true; mmc->ops->set_ios(mmc, &mmc->ios); } else { sdhci_init(host, (mmc->pm_flags & MMC_PM_KEEP_POWER)); @@ -3810,6 +3856,7 @@ int sdhci_runtime_resume_host(struct sdh /* Force clock and power re-program */ host->pwr = 0; host->clock = 0; + host->reinit_uhs = true; mmc->ops->start_signal_voltage_switch(mmc, &mmc->ios); mmc->ops->set_ios(mmc, &mmc->ios);
--- a/drivers/mmc/host/sdhci.h +++ b/drivers/mmc/host/sdhci.h @@ -526,6 +526,8 @@ struct sdhci_host {
unsigned int clock; /* Current clock (MHz) */ u8 pwr; /* Current voltage */ + u8 drv_type; /* Current UHS-I driver type */ + bool reinit_uhs; /* Force UHS-related re-initialization */
bool runtime_suspended; /* Host is runtime suspended */ bool bus_on; /* Bus power prevents runtime suspend */
From: Lee Jones lee@kernel.org
commit 152fe65f300e1819d59b80477d3e0999b4d5d7d2 upstream.
When enabled, KASAN enlarges function's stack-frames. Pushing quite a few over the current threshold. This can mainly be seen on 32-bit architectures where the present limit (when !GCC) is a lowly 1024-Bytes.
Link: https://lkml.kernel.org/r/20221125120750.3537134-3-lee@kernel.org Signed-off-by: Lee Jones lee@kernel.org Acked-by: Arnd Bergmann arnd@arndb.de Cc: Alex Deucher alexander.deucher@amd.com Cc: "Christian König" christian.koenig@amd.com Cc: Daniel Vetter daniel@ffwll.ch Cc: David Airlie airlied@gmail.com Cc: Harry Wentland harry.wentland@amd.com Cc: Leo Li sunpeng.li@amd.com Cc: Maarten Lankhorst maarten.lankhorst@linux.intel.com Cc: Maxime Ripard mripard@kernel.org Cc: Nathan Chancellor nathan@kernel.org Cc: Nick Desaulniers ndesaulniers@google.com Cc: "Pan, Xinhui" Xinhui.Pan@amd.com Cc: Rodrigo Siqueira Rodrigo.Siqueira@amd.com Cc: Thomas Zimmermann tzimmermann@suse.de Cc: Tom Rix trix@redhat.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- lib/Kconfig.debug | 1 + 1 file changed, 1 insertion(+)
--- a/lib/Kconfig.debug +++ b/lib/Kconfig.debug @@ -398,6 +398,7 @@ config FRAME_WARN default 2048 if GCC_PLUGIN_LATENT_ENTROPY default 2048 if PARISC default 1536 if (!64BIT && XTENSA) + default 1280 if KASAN && !64BIT default 1024 if !64BIT default 2048 if 64BIT help
From: Lee Jones lee@kernel.org
commit 6f6cb1714365a07dbc66851879538df9f6969288 upstream.
Patch series "Fix a bunch of allmodconfig errors", v2.
Since b339ec9c229aa ("kbuild: Only default to -Werror if COMPILE_TEST") WERROR now defaults to COMPILE_TEST meaning that it's enabled for allmodconfig builds. This leads to some interesting build failures when using Clang, each resolved in this set.
With this set applied, I am able to obtain a successful allmodconfig Arm build.
This patch (of 2):
calculate_bandwidth() is presently broken on all !(X86_64 || SPARC64 || ARM64) architectures built with Clang (all released versions), whereby the stack frame gets blown up to well over 5k. This would cause an immediate kernel panic on most architectures. We'll revert this when the following bug report has been resolved: https://github.com/llvm/llvm-project/issues/41896.
Link: https://lkml.kernel.org/r/20221125120750.3537134-1-lee@kernel.org Link: https://lkml.kernel.org/r/20221125120750.3537134-2-lee@kernel.org Signed-off-by: Lee Jones lee@kernel.org Suggested-by: Arnd Bergmann arnd@arndb.de Acked-by: Arnd Bergmann arnd@arndb.de Cc: Alex Deucher alexander.deucher@amd.com Cc: "Christian König" christian.koenig@amd.com Cc: Daniel Vetter daniel@ffwll.ch Cc: David Airlie airlied@gmail.com Cc: Harry Wentland harry.wentland@amd.com Cc: Lee Jones lee@kernel.org Cc: Leo Li sunpeng.li@amd.com Cc: Maarten Lankhorst maarten.lankhorst@linux.intel.com Cc: Maxime Ripard mripard@kernel.org Cc: Nathan Chancellor nathan@kernel.org Cc: Nick Desaulniers ndesaulniers@google.com Cc: "Pan, Xinhui" Xinhui.Pan@amd.com Cc: Rodrigo Siqueira Rodrigo.Siqueira@amd.com Cc: Thomas Zimmermann tzimmermann@suse.de Cc: Tom Rix trix@redhat.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/display/Kconfig | 7 +++++++ 1 file changed, 7 insertions(+)
--- a/drivers/gpu/drm/amd/display/Kconfig +++ b/drivers/gpu/drm/amd/display/Kconfig @@ -5,6 +5,7 @@ menu "Display Engine Configuration" config DRM_AMD_DC bool "AMD DC - Enable new display engine" default y + depends on BROKEN || !CC_IS_CLANG || X86_64 || SPARC64 || ARM64 select SND_HDA_COMPONENT if SND_HDA_CORE select DRM_AMD_DC_DCN if (X86 || PPC_LONG_DOUBLE_128) help @@ -12,6 +13,12 @@ config DRM_AMD_DC support for AMDGPU. This adds required support for Vega and Raven ASICs.
+ calculate_bandwidth() is presently broken on all !(X86_64 || SPARC64 || ARM64) + architectures built with Clang (all released versions), whereby the stack + frame gets blown up to well over 5k. This would cause an immediate kernel + panic on most architectures. We'll revert this when the following bug report + has been resolved: https://github.com/llvm/llvm-project/issues/41896. + config DRM_AMD_DC_DCN def_bool n help
From: Leo Liu leo.liu@amd.com
commit 9a8cc8cabc1e351614fd7f9e774757a5143b6fe8 upstream.
So that uses PSP to initialize HW.
Fixes: 0c2c02b66c672e ("drm/amdgpu/vcn: add firmware support for dimgrey_cavefish") Signed-off-by: Leo Liu leo.liu@amd.com Reviewed-by: James Zhu James.Zhu@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c @@ -156,6 +156,9 @@ int amdgpu_vcn_sw_init(struct amdgpu_dev break; case IP_VERSION(3, 0, 2): fw_name = FIRMWARE_VANGOGH; + if ((adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) && + (adev->pg_flags & AMD_PG_SUPPORT_VCN_DPG)) + adev->vcn.indirect_sram = true; break; case IP_VERSION(3, 0, 16): fw_name = FIRMWARE_DIMGREY_CAVEFISH;
From: Janusz Krzysztofik janusz.krzysztofik@linux.intel.com
commit a8899b8728013c7b2456f0bfa20e5fea85ee0fd1 upstream.
Commit b97060a99b01 ("drm/i915/guc: Update intel_gt_wait_for_idle to work with GuC") extended the API of intel_gt_retire_requests_timeout() with an extra argument 'remaining_timeout', intended for passing back unconsumed portion of requested timeout when 0 (success) is returned. However, when request retirement happens to succeed despite an error returned by a call to dma_fence_wait_timeout(), that error code (a negative value) is passed back instead of remaining time. If we then pass that negative value forward as requested timeout to intel_uc_wait_for_idle(), an explicit BUG will be triggered.
If request retirement succeeds but an error code is passed back via remaininig_timeout, we may have no clue on how much of the initial timeout might have been left for spending it on waiting for GuC to become idle. OTOH, since all pending requests have been successfully retired, that error code has been already ignored by intel_gt_retire_requests_timeout(), then we shouldn't fail.
Assume no more time has been left on error and pass 0 timeout value to intel_uc_wait_for_idle() to give it a chance to return success if GuC is already idle.
v3: Don't fail on any error passed back via remaining_timeout.
v2: Fix the issue on the caller side, not the provider.
Fixes: b97060a99b01 ("drm/i915/guc: Update intel_gt_wait_for_idle to work with GuC") Signed-off-by: Janusz Krzysztofik janusz.krzysztofik@linux.intel.com Cc: stable@vger.kernel.org # v5.15+ Reviewed-by: Andrzej Hajda andrzej.hajda@intel.com Signed-off-by: Tvrtko Ursulin tvrtko.ursulin@intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20221121145655.75141-2-janusz.... (cherry picked from commit f235dbd5b768e238d365fd05d92de5a32abc1c1f) Signed-off-by: Tvrtko Ursulin tvrtko.ursulin@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/i915/gt/intel_gt.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-)
--- a/drivers/gpu/drm/i915/gt/intel_gt.c +++ b/drivers/gpu/drm/i915/gt/intel_gt.c @@ -616,8 +616,13 @@ int intel_gt_wait_for_idle(struct intel_ return -EINTR; }
- return timeout ? timeout : intel_uc_wait_for_idle(>->uc, - remaining_timeout); + if (timeout) + return timeout; + + if (remaining_timeout < 0) + remaining_timeout = 0; + + return intel_uc_wait_for_idle(>->uc, remaining_timeout); }
int intel_gt_init(struct intel_gt *gt)
From: Janusz Krzysztofik janusz.krzysztofik@linux.intel.com
commit 12b8b046e4c9de40fa59b6f067d6826f4e688f68 upstream.
Users of intel_gt_retire_requests_timeout() expect 0 return value on success. However, we have no protection from passing back 0 potentially returned by a call to dma_fence_wait_timeout() when it succedes right after its timeout has expired.
Replace 0 with -ETIME before potentially using the timeout value as return code, so -ETIME is returned if there are still some requests not retired after timeout, 0 otherwise.
v3: Use conditional expression, more compact but also better reflecting intention standing behind the change.
v2: Move the added lines down so flush_submission() is not affected.
Fixes: f33a8a51602c ("drm/i915: Merge wait_for_timelines with retire_request") Signed-off-by: Janusz Krzysztofik janusz.krzysztofik@linux.intel.com Reviewed-by: Andrzej Hajda andrzej.hajda@intel.com Cc: stable@vger.kernel.org # v5.5+ Signed-off-by: Tvrtko Ursulin tvrtko.ursulin@intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20221121145655.75141-3-janusz.... (cherry picked from commit f301a29f143760ce8d3d6b6a8436d45d3448cde6) Signed-off-by: Tvrtko Ursulin tvrtko.ursulin@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/i915/gt/intel_gt_requests.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/gpu/drm/i915/gt/intel_gt_requests.c +++ b/drivers/gpu/drm/i915/gt/intel_gt_requests.c @@ -199,7 +199,7 @@ out_active: spin_lock(&timelines->lock); if (remaining_timeout) *remaining_timeout = timeout;
- return active_count ? timeout : 0; + return active_count ? timeout ?: -ETIME : 0; }
static void retire_work_handler(struct work_struct *work)
From: Daniel Bristot de Oliveira bristot@kernel.org
commit 022632f6c43a86f2135642dccd5686de318e861d upstream.
The duration type is a 64 long value, not an int. This was causing some long noise to report wrong values.
Change the duration to a 64 bits value.
Link: https://lkml.kernel.org/r/a93d8a8378c7973e9c609de05826533c9e977939.166869209...
Cc: stable@vger.kernel.org Cc: Daniel Bristot de Oliveira bristot@kernel.org Cc: Steven Rostedt rostedt@goodmis.org Cc: Masami Hiramatsu mhiramat@kernel.org Cc: Jonathan Corbet corbet@lwn.net Fixes: bce29ac9ce0b ("trace: Add osnoise tracer") Signed-off-by: Daniel Bristot de Oliveira bristot@kernel.org Acked-by: Masami Hiramatsu (Google) mhiramat@kernel.org Signed-off-by: Steven Rostedt (Google) rostedt@goodmis.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/trace/trace_osnoise.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
--- a/kernel/trace/trace_osnoise.c +++ b/kernel/trace/trace_osnoise.c @@ -917,7 +917,7 @@ void osnoise_trace_irq_entry(int id) void osnoise_trace_irq_exit(int id, const char *desc) { struct osnoise_variables *osn_var = this_cpu_osn_var(); - int duration; + s64 duration;
if (!osn_var->sampling) return; @@ -1048,7 +1048,7 @@ static void trace_softirq_entry_callback static void trace_softirq_exit_callback(void *data, unsigned int vec_nr) { struct osnoise_variables *osn_var = this_cpu_osn_var(); - int duration; + s64 duration;
if (!osn_var->sampling) return; @@ -1144,7 +1144,7 @@ thread_entry(struct osnoise_variables *o static void thread_exit(struct osnoise_variables *osn_var, struct task_struct *t) { - int duration; + s64 duration;
if (!osn_var->sampling) return;
From: Steven Rostedt (Google) rostedt@goodmis.org
commit ef38c79a522b660f7f71d45dad2d6244bc741841 upstream.
commit 94eedf3dded5 ("tracing: Fix race where eprobes can be called before the event") fixed an issue where if an event is soft disabled, and the trigger is being added, there's a small window where the event sees that there's a trigger but does not see that it requires reading the event yet, and then calls the trigger with the record == NULL.
This could be solved with adding memory barriers in the hot path, or to make sure that all the triggers requiring a record check for NULL. The latter was chosen.
Commit 94eedf3dded5 set the eprobe trigger handle to check for NULL, but the same needs to be done with histograms.
Link: https://lore.kernel.org/linux-trace-kernel/20221118211809.701d40c0f8a757b0df... Link: https://lore.kernel.org/linux-trace-kernel/20221123164323.03450c3a@gandalf.l...
Cc: Tom Zanussi zanussi@kernel.org Cc: stable@vger.kernel.org Fixes: 7491e2c442781 ("tracing: Add a probe that attaches to trace events") Reported-by: Masami Hiramatsu (Google) mhiramat@kernel.org Acked-by: Masami Hiramatsu (Google) mhiramat@kernel.org Signed-off-by: Steven Rostedt (Google) rostedt@goodmis.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/trace/trace_events_hist.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/kernel/trace/trace_events_hist.c +++ b/kernel/trace/trace_events_hist.c @@ -5051,6 +5051,9 @@ static void event_hist_trigger(struct ev void *key = NULL; unsigned int i;
+ if (unlikely(!rbe)) + return; + memset(compound_key, 0, hist_data->key_size);
for_each_hist_key_field(i, hist_data) {
From: Steven Rostedt (Google) rostedt@goodmis.org
commit 4313e5a613049dfc1819a6dfb5f94cf2caff9452 upstream.
After 65536 dynamic events have been added and removed, the "type" field of the event then uses the first type number that is available (not currently used by other events). A type number is the identifier of the binary blobs in the tracing ring buffer (known as events) to map them to logic that can parse the binary blob.
The issue is that if a dynamic event (like a kprobe event) is traced and is in the ring buffer, and then that event is removed (because it is dynamic, which means it can be created and destroyed), if another dynamic event is created that has the same number that new event's logic on parsing the binary blob will be used.
To show how this can be an issue, the following can crash the kernel:
# cd /sys/kernel/tracing # for i in `seq 65536`; do echo 'p:kprobes/foo do_sys_openat2 $arg1:u32' > kprobe_events # done
For every iteration of the above, the writing to the kprobe_events will remove the old event and create a new one (with the same format) and increase the type number to the next available on until the type number reaches over 65535 which is the max number for the 16 bit type. After it reaches that number, the logic to allocate a new number simply looks for the next available number. When an dynamic event is removed, that number is then available to be reused by the next dynamic event created. That is, once the above reaches the max number, the number assigned to the event in that loop will remain the same.
Now that means deleting one dynamic event and created another will reuse the previous events type number. This is where bad things can happen. After the above loop finishes, the kprobes/foo event which reads the do_sys_openat2 function call's first parameter as an integer.
# echo 1 > kprobes/foo/enable # cat /etc/passwd > /dev/null # cat trace cat-2211 [005] .... 2007.849603: foo: (do_sys_openat2+0x0/0x130) arg1=4294967196 cat-2211 [005] .... 2007.849620: foo: (do_sys_openat2+0x0/0x130) arg1=4294967196 cat-2211 [005] .... 2007.849838: foo: (do_sys_openat2+0x0/0x130) arg1=4294967196 cat-2211 [005] .... 2007.849880: foo: (do_sys_openat2+0x0/0x130) arg1=4294967196 # echo 0 > kprobes/foo/enable
Now if we delete the kprobe and create a new one that reads a string:
# echo 'p:kprobes/foo do_sys_openat2 +0($arg2):string' > kprobe_events
And now we can the trace:
# cat trace sendmail-1942 [002] ..... 530.136320: foo: (do_sys_openat2+0x0/0x240) arg1= cat-2046 [004] ..... 530.930817: foo: (do_sys_openat2+0x0/0x240) arg1="������������������������������������������������������������������������������������������������" cat-2046 [004] ..... 530.930961: foo: (do_sys_openat2+0x0/0x240) arg1="������������������������������������������������������������������������������������������������" cat-2046 [004] ..... 530.934278: foo: (do_sys_openat2+0x0/0x240) arg1="������������������������������������������������������������������������������������������������" cat-2046 [004] ..... 530.934563: foo: (do_sys_openat2+0x0/0x240) arg1="������������������������������������������������������������������������������������������������" bash-1515 [007] ..... 534.299093: foo: (do_sys_openat2+0x0/0x240) arg1="kkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkk���������@��4Z����;Y�����U
And dmesg has:
================================================================== BUG: KASAN: use-after-free in string+0xd4/0x1c0 Read of size 1 at addr ffff88805fdbbfa0 by task cat/2049
CPU: 0 PID: 2049 Comm: cat Not tainted 6.1.0-rc6-test+ #641 Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v03.03 07/14/2016 Call Trace: <TASK> dump_stack_lvl+0x5b/0x77 print_report+0x17f/0x47b kasan_report+0xad/0x130 string+0xd4/0x1c0 vsnprintf+0x500/0x840 seq_buf_vprintf+0x62/0xc0 trace_seq_printf+0x10e/0x1e0 print_type_string+0x90/0xa0 print_kprobe_event+0x16b/0x290 print_trace_line+0x451/0x8e0 s_show+0x72/0x1f0 seq_read_iter+0x58e/0x750 seq_read+0x115/0x160 vfs_read+0x11d/0x460 ksys_read+0xa9/0x130 do_syscall_64+0x3a/0x90 entry_SYSCALL_64_after_hwframe+0x63/0xcd RIP: 0033:0x7fc2e972ade2 Code: c0 e9 b2 fe ff ff 50 48 8d 3d b2 3f 0a 00 e8 05 f0 01 00 0f 1f 44 00 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 0f 05 <48> 3d 00 f0 ff ff 77 56 c3 0f 1f 44 00 00 48 83 ec 28 48 89 54 24 RSP: 002b:00007ffc64e687c8 EFLAGS: 00000246 ORIG_RAX: 0000000000000000 RAX: ffffffffffffffda RBX: 0000000000020000 RCX: 00007fc2e972ade2 RDX: 0000000000020000 RSI: 00007fc2e980d000 RDI: 0000000000000003 RBP: 00007fc2e980d000 R08: 00007fc2e980c010 R09: 0000000000000000 R10: 0000000000000022 R11: 0000000000000246 R12: 0000000000020f00 R13: 0000000000000003 R14: 0000000000020000 R15: 0000000000020000 </TASK>
The buggy address belongs to the physical page: page:ffffea00017f6ec0 refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x5fdbb flags: 0xfffffc0000000(node=0|zone=1|lastcpupid=0x1fffff) raw: 000fffffc0000000 0000000000000000 ffffea00017f6ec8 0000000000000000 raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000 page dumped because: kasan: bad access detected
Memory state around the buggy address: ffff88805fdbbe80: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ffff88805fdbbf00: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
ffff88805fdbbf80: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
^ ffff88805fdbc000: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ffff88805fdbc080: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ==================================================================
This was found when Zheng Yejian sent a patch to convert the event type number assignment to use IDA, which gives the next available number, and this bug showed up in the fuzz testing by Yujie Liu and the kernel test robot. But after further analysis, I found that this behavior is the same as when the event type numbers go past the 16bit max (and the above shows that).
As modules have a similar issue, but is dealt with by setting a "WAS_ENABLED" flag when a module event is enabled, and when the module is freed, if any of its events were enabled, the ring buffer that holds that event is also cleared, to prevent reading stale events. The same can be done for dynamic events.
If any dynamic event that is being removed was enabled, then make sure the buffers they were enabled in are now cleared.
Link: https://lkml.kernel.org/r/20221123171434.545706e3@gandalf.local.home Link: https://lore.kernel.org/all/20221110020319.1259291-1-zhengyejian1@huawei.com...
Cc: stable@vger.kernel.org Cc: Andrew Morton akpm@linux-foundation.org Depends-on: e18eb8783ec49 ("tracing: Add tracing_reset_all_online_cpus_unlocked() function") Depends-on: 5448d44c38557 ("tracing: Add unified dynamic event framework") Depends-on: 6212dd29683ee ("tracing/kprobes: Use dyn_event framework for kprobe events") Depends-on: 065e63f951432 ("tracing: Only have rmmod clear buffers that its events were active in") Depends-on: 575380da8b469 ("tracing: Only clear trace buffer on module unload if event was traced") Fixes: 77b44d1b7c283 ("tracing/kprobes: Rename Kprobe-tracer to kprobe-event") Reported-by: Zheng Yejian zhengyejian1@huawei.com Reported-by: Yujie Liu yujie.liu@intel.com Reported-by: kernel test robot yujie.liu@intel.com Acked-by: Masami Hiramatsu (Google) mhiramat@kernel.org Signed-off-by: Steven Rostedt (Google) rostedt@goodmis.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/trace/trace_dynevent.c | 2 ++ kernel/trace/trace_events.c | 11 ++++++++++- 2 files changed, 12 insertions(+), 1 deletion(-)
--- a/kernel/trace/trace_dynevent.c +++ b/kernel/trace/trace_dynevent.c @@ -118,6 +118,7 @@ int dyn_event_release(const char *raw_co if (ret) break; } + tracing_reset_all_online_cpus(); mutex_unlock(&event_mutex); out: argv_free(argv); @@ -214,6 +215,7 @@ int dyn_events_release_all(struct dyn_ev break; } out: + tracing_reset_all_online_cpus(); mutex_unlock(&event_mutex);
return ret; --- a/kernel/trace/trace_events.c +++ b/kernel/trace/trace_events.c @@ -2880,7 +2880,10 @@ static int probe_remove_event_call(struc * TRACE_REG_UNREGISTER. */ if (file->flags & EVENT_FILE_FL_ENABLED) - return -EBUSY; + goto busy; + + if (file->flags & EVENT_FILE_FL_WAS_ENABLED) + tr->clear_trace = true; /* * The do_for_each_event_file_safe() is * a double loop. After finding the call for this @@ -2893,6 +2896,12 @@ static int probe_remove_event_call(struc __trace_remove_event_call(call);
return 0; + busy: + /* No need to clear the trace now */ + list_for_each_entry(tr, &ftrace_trace_arrays, list) { + tr->clear_trace = false; + } + return -EBUSY; }
/* Remove an event_call */
From: Mark Brown broonie@kernel.org
[ Upstream commit 698813ba8c580efb356ace8dbf55f61dac6063a8 ]
For _sx controls the semantics of the max field is not the usual one, max is the number of steps rather than the maximum value. This means that our check in snd_soc_put_volsw_sx() needs to just check against the maximum value.
Fixes: 4f1e50d6a9cf9c1b ("ASoC: ops: Reject out of bounds values in snd_soc_put_volsw_sx()") Signed-off-by: Mark Brown broonie@kernel.org Link: https://lore.kernel.org/r/20220511134137.169575-1-broonie@kernel.org Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/soc-ops.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/sound/soc/soc-ops.c b/sound/soc/soc-ops.c index bd88de056358..47691119306f 100644 --- a/sound/soc/soc-ops.c +++ b/sound/soc/soc-ops.c @@ -452,7 +452,7 @@ int snd_soc_put_volsw_sx(struct snd_kcontrol *kcontrol, val = ucontrol->value.integer.value[0]; if (mc->platform_max && val > mc->platform_max) return -EINVAL; - if (val > max - min) + if (val > max) return -EINVAL; val_mask = mask << shift; val = (val + min) & mask;
From: Hui Tang tanghui20@huawei.com
[ Upstream commit 19c5bda74dc45fee598a57600b550c9ea7662f10 ]
sound/soc/codecs/tlv320adc3xxx.c: In function ‘adc3xxx_i2c_probe’: sound/soc/codecs/tlv320adc3xxx.c:1359:21: error: implicit declaration of function ‘devm_gpiod_get’; did you mean ‘devm_gpio_free’? [-Werror=implicit-function-declaration] adc3xxx->rst_pin = devm_gpiod_get(dev, "reset", GPIOD_OUT_LOW); ^~~~~~~~~~~~~~ devm_gpio_free CC [M] drivers/gpu/drm/nouveau/nvkm/engine/disp/sorgt215.o LD [M] sound/soc/codecs/snd-soc-ak4671.o LD [M] sound/soc/codecs/snd-soc-arizona.o LD [M] sound/soc/codecs/snd-soc-cros-ec-codec.o LD [M] sound/soc/codecs/snd-soc-ak4641.o LD [M] sound/soc/codecs/snd-soc-alc5632.o sound/soc/codecs/tlv320adc3xxx.c:1359:50: error: ‘GPIOD_OUT_LOW’ undeclared (first use in this function); did you mean ‘GPIOF_INIT_LOW’? adc3xxx->rst_pin = devm_gpiod_get(dev, "reset", GPIOD_OUT_LOW); ^~~~~~~~~~~~~ GPIOF_INIT_LOW sound/soc/codecs/tlv320adc3xxx.c:1359:50: note: each undeclared identifier is reported only once for each function it appears in LD [M] sound/soc/codecs/snd-soc-cs35l32.o sound/soc/codecs/tlv320adc3xxx.c:1408:2: error: implicit declaration of function ‘gpiod_set_value_cansleep’; did you mean ‘gpio_set_value_cansleep’? [-Werror=implicit-function-declaration] gpiod_set_value_cansleep(adc3xxx->rst_pin, 1); ^~~~~~~~~~~~~~~~~~~~~~~~ gpio_set_value_cansleep LD [M] sound/soc/codecs/snd-soc-cs35l41-lib.o LD [M] sound/soc/codecs/snd-soc-cs35l36.o LD [M] sound/soc/codecs/snd-soc-cs35l34.o LD [M] sound/soc/codecs/snd-soc-cs35l41.o CC [M] drivers/gpu/drm/nouveau/nvkm/engine/disp/sormcp89.o cc1: all warnings being treated as errors
Fixes: e9a3b57efd28 ("ASoC: codec: tlv320adc3xxx: New codec driver") Signed-off-by: Hui Tang tanghui20@huawei.com Link: https://lore.kernel.org/r/20220512074640.75550-3-tanghui20@huawei.com Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/codecs/tlv320adc3xxx.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/sound/soc/codecs/tlv320adc3xxx.c b/sound/soc/codecs/tlv320adc3xxx.c index 8a0965cd3e66..297c458c4d8b 100644 --- a/sound/soc/codecs/tlv320adc3xxx.c +++ b/sound/soc/codecs/tlv320adc3xxx.c @@ -14,6 +14,7 @@
#include <dt-bindings/sound/tlv320adc3xxx.h> #include <linux/clk.h> +#include <linux/gpio/consumer.h> #include <linux/module.h> #include <linux/moduleparam.h> #include <linux/io.h> @@ -1025,7 +1026,9 @@ static const struct gpio_chip adc3xxx_gpio_chip = {
static void adc3xxx_free_gpio(struct adc3xxx *adc3xxx) { +#ifdef CONFIG_GPIOLIB gpiochip_remove(&adc3xxx->gpio_chip); +#endif }
static void adc3xxx_init_gpio(struct adc3xxx *adc3xxx)
From: Maxim Korotkov korotkov.maxim.s@gmail.com
[ Upstream commit 64c150339e7f6c5cbbe8c17a56ef2b3902612798 ]
There is a possibility of dividing by zero due to the pcs->bits_per_pin if pcs->fmask() also has a value of zero and called fls from asm-generic/bitops/builtin-fls.h or arch/x86/include/asm/bitops.h. The function pcs_probe() has the branch that assigned to fmask 0 before pcs_allocate_pin_table() was called
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Fixes: 4e7e8017a80e ("pinctrl: pinctrl-single: enhance to configure multiple pins of different modules") Signed-off-by: Maxim Korotkov korotkov.maxim.s@gmail.com Reviewed-by: Tony Lindgren tony@atomide.com Link: https://lore.kernel.org/r/20221117123034.27383-1-korotkov.maxim.s@gmail.com Signed-off-by: Linus Walleij linus.walleij@linaro.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pinctrl/pinctrl-single.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/pinctrl/pinctrl-single.c b/drivers/pinctrl/pinctrl-single.c index 67bec7ea0f8b..414ee6bb8ac9 100644 --- a/drivers/pinctrl/pinctrl-single.c +++ b/drivers/pinctrl/pinctrl-single.c @@ -727,7 +727,7 @@ static int pcs_allocate_pin_table(struct pcs_device *pcs)
mux_bytes = pcs->width / BITS_PER_BYTE;
- if (pcs->bits_per_mux) { + if (pcs->bits_per_mux && pcs->fmask) { pcs->bits_per_pin = fls(pcs->fmask); nr_pins = (pcs->size * BITS_PER_BYTE) / pcs->bits_per_pin; } else {
From: Alexandre Ghiti alexghiti@rivosinc.com
[ Upstream commit 3f105a742725a1b78766a55169f1d827732e62b8 ]
The EFI page table is initially created as a copy of the kernel page table. With VMAP_STACK enabled, kernel stacks are allocated in the vmalloc area: if the stack is allocated in a new PGD (one that was not present at the moment of the efi page table creation or not synced in a previous vmalloc fault), the kernel will take a trap when switching to the efi page table when the vmalloc kernel stack is accessed, resulting in a kernel panic.
Fix that by updating the efi kernel mappings before switching to the efi page table.
Signed-off-by: Alexandre Ghiti alexghiti@rivosinc.com Fixes: b91540d52a08 ("RISC-V: Add EFI runtime services") Tested-by: Emil Renner Berthing emil.renner.berthing@canonical.com Reviewed-by: Atish Patra atishp@rivosinc.com Link: https://lore.kernel.org/r/20221121133303.1782246-1-alexghiti@rivosinc.com Signed-off-by: Palmer Dabbelt palmer@rivosinc.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/riscv/include/asm/efi.h | 6 +++++- arch/riscv/include/asm/pgalloc.h | 11 ++++++++--- 2 files changed, 13 insertions(+), 4 deletions(-)
diff --git a/arch/riscv/include/asm/efi.h b/arch/riscv/include/asm/efi.h index f74879a8f1ea..e229d7be4b66 100644 --- a/arch/riscv/include/asm/efi.h +++ b/arch/riscv/include/asm/efi.h @@ -10,6 +10,7 @@ #include <asm/mmu_context.h> #include <asm/ptrace.h> #include <asm/tlbflush.h> +#include <asm/pgalloc.h>
#ifdef CONFIG_EFI extern void efi_init(void); @@ -20,7 +21,10 @@ extern void efi_init(void); int efi_create_mapping(struct mm_struct *mm, efi_memory_desc_t *md); int efi_set_mapping_permissions(struct mm_struct *mm, efi_memory_desc_t *md);
-#define arch_efi_call_virt_setup() efi_virtmap_load() +#define arch_efi_call_virt_setup() ({ \ + sync_kernel_mappings(efi_mm.pgd); \ + efi_virtmap_load(); \ + }) #define arch_efi_call_virt_teardown() efi_virtmap_unload()
#define ARCH_EFI_IRQ_FLAGS_MASK (SR_IE | SR_SPIE) diff --git a/arch/riscv/include/asm/pgalloc.h b/arch/riscv/include/asm/pgalloc.h index 947f23d7b6af..59dc12b5b7e8 100644 --- a/arch/riscv/include/asm/pgalloc.h +++ b/arch/riscv/include/asm/pgalloc.h @@ -127,6 +127,13 @@ static inline void p4d_free(struct mm_struct *mm, p4d_t *p4d) #define __p4d_free_tlb(tlb, p4d, addr) p4d_free((tlb)->mm, p4d) #endif /* __PAGETABLE_PMD_FOLDED */
+static inline void sync_kernel_mappings(pgd_t *pgd) +{ + memcpy(pgd + USER_PTRS_PER_PGD, + init_mm.pgd + USER_PTRS_PER_PGD, + (PTRS_PER_PGD - USER_PTRS_PER_PGD) * sizeof(pgd_t)); +} + static inline pgd_t *pgd_alloc(struct mm_struct *mm) { pgd_t *pgd; @@ -135,9 +142,7 @@ static inline pgd_t *pgd_alloc(struct mm_struct *mm) if (likely(pgd != NULL)) { memset(pgd, 0, USER_PTRS_PER_PGD * sizeof(pgd_t)); /* Copy kernel mappings */ - memcpy(pgd + USER_PTRS_PER_PGD, - init_mm.pgd + USER_PTRS_PER_PGD, - (PTRS_PER_PGD - USER_PTRS_PER_PGD) * sizeof(pgd_t)); + sync_kernel_mappings(pgd); } return pgd; }
From: Jisheng Zhang jszhang@kernel.org
[ Upstream commit 7e1864332fbc1b993659eab7974da9fe8bf8c128 ]
Currently, when detecting vmap stack overflow, riscv firstly switches to the so called shadow stack, then use this shadow stack to call the get_overflow_stack() to get the overflow stack. However, there's a race here if two or more harts use the same shadow stack at the same time.
To solve this race, we introduce spin_shadow_stack atomic var, which will be swap between its own address and 0 in atomic way, when the var is set, it means the shadow_stack is being used; when the var is cleared, it means the shadow_stack isn't being used.
Fixes: 31da94c25aea ("riscv: add VMAP_STACK overflow detection") Signed-off-by: Jisheng Zhang jszhang@kernel.org Suggested-by: Guo Ren guoren@kernel.org Reviewed-by: Guo Ren guoren@kernel.org Link: https://lore.kernel.org/r/20221030124517.2370-1-jszhang@kernel.org [Palmer: Add AQ to the swap, and also some comments.] Signed-off-by: Palmer Dabbelt palmer@rivosinc.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/riscv/include/asm/asm.h | 1 + arch/riscv/kernel/entry.S | 13 +++++++++++++ arch/riscv/kernel/traps.c | 18 ++++++++++++++++++ 3 files changed, 32 insertions(+)
diff --git a/arch/riscv/include/asm/asm.h b/arch/riscv/include/asm/asm.h index 1b471ff73178..816e753de636 100644 --- a/arch/riscv/include/asm/asm.h +++ b/arch/riscv/include/asm/asm.h @@ -23,6 +23,7 @@ #define REG_L __REG_SEL(ld, lw) #define REG_S __REG_SEL(sd, sw) #define REG_SC __REG_SEL(sc.d, sc.w) +#define REG_AMOSWAP_AQ __REG_SEL(amoswap.d.aq, amoswap.w.aq) #define REG_ASM __REG_SEL(.dword, .word) #define SZREG __REG_SEL(8, 4) #define LGREG __REG_SEL(3, 2) diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S index b9eda3fcbd6d..186abd146eaf 100644 --- a/arch/riscv/kernel/entry.S +++ b/arch/riscv/kernel/entry.S @@ -404,6 +404,19 @@ handle_syscall_trace_exit:
#ifdef CONFIG_VMAP_STACK handle_kernel_stack_overflow: + /* + * Takes the psuedo-spinlock for the shadow stack, in case multiple + * harts are concurrently overflowing their kernel stacks. We could + * store any value here, but since we're overflowing the kernel stack + * already we only have SP to use as a scratch register. So we just + * swap in the address of the spinlock, as that's definately non-zero. + * + * Pairs with a store_release in handle_bad_stack(). + */ +1: la sp, spin_shadow_stack + REG_AMOSWAP_AQ sp, sp, (sp) + bnez sp, 1b + la sp, shadow_stack addi sp, sp, SHADOW_OVERFLOW_STACK_SIZE
diff --git a/arch/riscv/kernel/traps.c b/arch/riscv/kernel/traps.c index 635e6ec26938..6e8822446069 100644 --- a/arch/riscv/kernel/traps.c +++ b/arch/riscv/kernel/traps.c @@ -218,11 +218,29 @@ asmlinkage unsigned long get_overflow_stack(void) OVERFLOW_STACK_SIZE; }
+/* + * A pseudo spinlock to protect the shadow stack from being used by multiple + * harts concurrently. This isn't a real spinlock because the lock side must + * be taken without a valid stack and only a single register, it's only taken + * while in the process of panicing anyway so the performance and error + * checking a proper spinlock gives us doesn't matter. + */ +unsigned long spin_shadow_stack; + asmlinkage void handle_bad_stack(struct pt_regs *regs) { unsigned long tsk_stk = (unsigned long)current->stack; unsigned long ovf_stk = (unsigned long)this_cpu_ptr(overflow_stack);
+ /* + * We're done with the shadow stack by this point, as we're on the + * overflow stack. Tell any other concurrent overflowing harts that + * they can proceed with panicing by releasing the pseudo-spinlock. + * + * This pairs with an amoswap.aq in handle_kernel_stack_overflow. + */ + smp_store_release(&spin_shadow_stack, 0); + console_verbose();
pr_emerg("Insufficient stack space to handle exception!\n");
From: Guo Ren guoren@linux.alibaba.com
[ Upstream commit b17d19a5314a37f7197afd1a0200affd21a7227d ]
If a crash happens on cpu3 and all interrupts are binding on cpu0, the bad irq routing will cause a crash kernel which can't receive any irq. Because crash kernel won't clean up all harts' PLIC enable bits in enable registers. This patch is similar to 9141a003a491 ("ARM: 7316/1: kexec: EOI active and mask all interrupts in kexec crash path") and 78fd584cdec0 ("arm64: kdump: implement machine_crash_shutdown()"), and PowerPC also has the same mechanism.
Fixes: fba8a8674f68 ("RISC-V: Add kexec support") Signed-off-by: Guo Ren guoren@linux.alibaba.com Signed-off-by: Guo Ren guoren@kernel.org Reviewed-by: Xianting Tian xianting.tian@linux.alibaba.com Cc: Nick Kossifidis mick@ics.forth.gr Cc: Palmer Dabbelt palmer@rivosinc.com Link: https://lore.kernel.org/r/20221020141603.2856206-2-guoren@kernel.org Signed-off-by: Palmer Dabbelt palmer@rivosinc.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/riscv/kernel/machine_kexec.c | 35 +++++++++++++++++++++++++++++++ 1 file changed, 35 insertions(+)
diff --git a/arch/riscv/kernel/machine_kexec.c b/arch/riscv/kernel/machine_kexec.c index ee79e6839b86..db41c676e5a2 100644 --- a/arch/riscv/kernel/machine_kexec.c +++ b/arch/riscv/kernel/machine_kexec.c @@ -15,6 +15,8 @@ #include <linux/compiler.h> /* For unreachable() */ #include <linux/cpu.h> /* For cpu_down() */ #include <linux/reboot.h> +#include <linux/interrupt.h> +#include <linux/irq.h>
/* * kexec_image_info - Print received image details @@ -154,6 +156,37 @@ void crash_smp_send_stop(void) cpus_stopped = 1; }
+static void machine_kexec_mask_interrupts(void) +{ + unsigned int i; + struct irq_desc *desc; + + for_each_irq_desc(i, desc) { + struct irq_chip *chip; + int ret; + + chip = irq_desc_get_chip(desc); + if (!chip) + continue; + + /* + * First try to remove the active state. If this + * fails, try to EOI the interrupt. + */ + ret = irq_set_irqchip_state(i, IRQCHIP_STATE_ACTIVE, false); + + if (ret && irqd_irq_inprogress(&desc->irq_data) && + chip->irq_eoi) + chip->irq_eoi(&desc->irq_data); + + if (chip->irq_mask) + chip->irq_mask(&desc->irq_data); + + if (chip->irq_disable && !irqd_irq_disabled(&desc->irq_data)) + chip->irq_disable(&desc->irq_data); + } +} + /* * machine_crash_shutdown - Prepare to kexec after a kernel crash * @@ -169,6 +202,8 @@ machine_crash_shutdown(struct pt_regs *regs) crash_smp_send_stop();
crash_save_cpu(regs, smp_processor_id()); + machine_kexec_mask_interrupts(); + pr_info("Starting crashdump kernel...\n"); }
From: Guo Ren guoren@linux.alibaba.com
[ Upstream commit 9b932aadfc47de5d70b53ea04b0d1b5f6c82945b ]
Current crash_smp_send_stop is the same as the generic one in kernel/panic and misses crash_save_cpu in percpu. This patch is inspired by 78fd584cdec0 ("arm64: kdump: implement machine_crash_shutdown()") and adds the same mechanism for riscv.
Before this patch, test result: crash> help -r CPU 0: [OFFLINE]
CPU 1: epc : ffffffff80009ff0 ra : ffffffff800b789a sp : ff2000001098bb40 gp : ffffffff815fca60 tp : ff60000004680000 t0 : 6666666666663c5b t1 : 0000000000000000 t2 : 666666666666663c s0 : ff2000001098bc90 s1 : ffffffff81600798 a0 : ff2000001098bb48 a1 : 0000000000000000 a2 : 0000000000000000 a3 : 0000000000000001 a4 : 0000000000000000 a5 : ff60000004690800 a6 : 0000000000000000 a7 : 0000000000000000 s2 : ff2000001098bb48 s3 : ffffffff81093ec8 s4 : ffffffff816004ac s5 : 0000000000000000 s6 : 0000000000000007 s7 : ffffffff80e7f720 s8 : 00fffffffffff3f0 s9 : 0000000000000007 s10: 00aaaaaaaab98700 s11: 0000000000000001 t3 : ffffffff819a8097 t4 : ffffffff819a8097 t5 : ffffffff819a8098 t6 : ff2000001098b9a8
CPU 2: [OFFLINE]
CPU 3: [OFFLINE]
After this patch, test result: crash> help -r CPU 0: epc : ffffffff80003f34 ra : ffffffff808caa7c sp : ffffffff81403eb0 gp : ffffffff815fcb48 tp : ffffffff81413400 t0 : 0000000000000000 t1 : 0000000000000000 t2 : 0000000000000000 s0 : ffffffff81403ec0 s1 : 0000000000000000 a0 : 0000000000000000 a1 : 0000000000000000 a2 : 0000000000000000 a3 : 0000000000000000 a4 : 0000000000000000 a5 : 0000000000000000 a6 : 0000000000000000 a7 : 0000000000000000 s2 : ffffffff816001c8 s3 : ffffffff81600370 s4 : ffffffff80c32e18 s5 : ffffffff819d3018 s6 : ffffffff810e2110 s7 : 0000000000000000 s8 : 0000000000000000 s9 : 0000000080039eac s10: 0000000000000000 s11: 0000000000000000 t3 : 0000000000000000 t4 : 0000000000000000 t5 : 0000000000000000 t6 : 0000000000000000
CPU 1: epc : ffffffff80003f34 ra : ffffffff808caa7c sp : ff2000000068bf30 gp : ffffffff815fcb48 tp : ff6000000240d400 t0 : 0000000000000000 t1 : 0000000000000000 t2 : 0000000000000000 s0 : ff2000000068bf40 s1 : 0000000000000001 a0 : 0000000000000000 a1 : 0000000000000000 a2 : 0000000000000000 a3 : 0000000000000000 a4 : 0000000000000000 a5 : 0000000000000000 a6 : 0000000000000000 a7 : 0000000000000000 s2 : ffffffff816001c8 s3 : ffffffff81600370 s4 : ffffffff80c32e18 s5 : ffffffff819d3018 s6 : ffffffff810e2110 s7 : 0000000000000000 s8 : 0000000000000000 s9 : 0000000080039ea8 s10: 0000000000000000 s11: 0000000000000000 t3 : 0000000000000000 t4 : 0000000000000000 t5 : 0000000000000000 t6 : 0000000000000000
CPU 2: epc : ffffffff80003f34 ra : ffffffff808caa7c sp : ff20000000693f30 gp : ffffffff815fcb48 tp : ff6000000240e900 t0 : 0000000000000000 t1 : 0000000000000000 t2 : 0000000000000000 s0 : ff20000000693f40 s1 : 0000000000000002 a0 : 0000000000000000 a1 : 0000000000000000 a2 : 0000000000000000 a3 : 0000000000000000 a4 : 0000000000000000 a5 : 0000000000000000 a6 : 0000000000000000 a7 : 0000000000000000 s2 : ffffffff816001c8 s3 : ffffffff81600370 s4 : ffffffff80c32e18 s5 : ffffffff819d3018 s6 : ffffffff810e2110 s7 : 0000000000000000 s8 : 0000000000000000 s9 : 0000000080039eb0 s10: 0000000000000000 s11: 0000000000000000 t3 : 0000000000000000 t4 : 0000000000000000 t5 : 0000000000000000 t6 : 0000000000000000
CPU 3: epc : ffffffff8000a1e4 ra : ffffffff800b7bba sp : ff200000109bbb40 gp : ffffffff815fcb48 tp : ff6000000373aa00 t0 : 6666666666663c5b t1 : 0000000000000000 t2 : 666666666666663c s0 : ff200000109bbc90 s1 : ffffffff816007a0 a0 : ff200000109bbb48 a1 : 0000000000000000 a2 : 0000000000000000 a3 : 0000000000000001 a4 : 0000000000000000 a5 : ff60000002c61c00 a6 : 0000000000000000 a7 : 0000000000000000 s2 : ff200000109bbb48 s3 : ffffffff810941a8 s4 : ffffffff816004b4 s5 : 0000000000000000 s6 : 0000000000000007 s7 : ffffffff80e7f7a0 s8 : 00fffffffffff3f0 s9 : 0000000000000007 s10: 00aaaaaaaab98700 s11: 0000000000000001 t3 : ffffffff819a8097 t4 : ffffffff819a8097 t5 : ffffffff819a8098 t6 : ff200000109bb9a8
Fixes: ad943893d5f1 ("RISC-V: Fixup schedule out issue in machine_crash_shutdown()") Reviewed-by: Xianting Tian xianting.tian@linux.alibaba.com Signed-off-by: Guo Ren guoren@linux.alibaba.com Signed-off-by: Guo Ren guoren@kernel.org Cc: Nick Kossifidis mick@ics.forth.gr Link: https://lore.kernel.org/r/20221020141603.2856206-3-guoren@kernel.org Signed-off-by: Palmer Dabbelt palmer@rivosinc.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/riscv/include/asm/smp.h | 3 + arch/riscv/kernel/machine_kexec.c | 21 ++----- arch/riscv/kernel/smp.c | 97 ++++++++++++++++++++++++++++++- 3 files changed, 103 insertions(+), 18 deletions(-)
diff --git a/arch/riscv/include/asm/smp.h b/arch/riscv/include/asm/smp.h index d3443be7eedc..3831b638ecab 100644 --- a/arch/riscv/include/asm/smp.h +++ b/arch/riscv/include/asm/smp.h @@ -50,6 +50,9 @@ void riscv_set_ipi_ops(const struct riscv_ipi_ops *ops); /* Clear IPI for current CPU */ void riscv_clear_ipi(void);
+/* Check other CPUs stop or not */ +bool smp_crash_stop_failed(void); + /* Secondary hart entry */ asmlinkage void smp_callin(void);
diff --git a/arch/riscv/kernel/machine_kexec.c b/arch/riscv/kernel/machine_kexec.c index db41c676e5a2..2d139b724bc8 100644 --- a/arch/riscv/kernel/machine_kexec.c +++ b/arch/riscv/kernel/machine_kexec.c @@ -140,22 +140,6 @@ void machine_shutdown(void) #endif }
-/* Override the weak function in kernel/panic.c */ -void crash_smp_send_stop(void) -{ - static int cpus_stopped; - - /* - * This function can be called twice in panic path, but obviously - * we execute this only once. - */ - if (cpus_stopped) - return; - - smp_send_stop(); - cpus_stopped = 1; -} - static void machine_kexec_mask_interrupts(void) { unsigned int i; @@ -230,6 +214,11 @@ machine_kexec(struct kimage *image) void *control_code_buffer = page_address(image->control_code_page); riscv_kexec_method kexec_method = NULL;
+#ifdef CONFIG_SMP + WARN(smp_crash_stop_failed(), + "Some CPUs may be stale, kdump will be unreliable.\n"); +#endif + if (image->type != KEXEC_TYPE_CRASH) kexec_method = control_code_buffer; else diff --git a/arch/riscv/kernel/smp.c b/arch/riscv/kernel/smp.c index 760a64518c58..8c3b59f1f9b8 100644 --- a/arch/riscv/kernel/smp.c +++ b/arch/riscv/kernel/smp.c @@ -12,6 +12,7 @@ #include <linux/clockchips.h> #include <linux/interrupt.h> #include <linux/module.h> +#include <linux/kexec.h> #include <linux/profile.h> #include <linux/smp.h> #include <linux/sched.h> @@ -22,11 +23,13 @@ #include <asm/sbi.h> #include <asm/tlbflush.h> #include <asm/cacheflush.h> +#include <asm/cpu_ops.h>
enum ipi_message_type { IPI_RESCHEDULE, IPI_CALL_FUNC, IPI_CPU_STOP, + IPI_CPU_CRASH_STOP, IPI_IRQ_WORK, IPI_TIMER, IPI_MAX @@ -71,6 +74,32 @@ static void ipi_stop(void) wait_for_interrupt(); }
+#ifdef CONFIG_KEXEC_CORE +static atomic_t waiting_for_crash_ipi = ATOMIC_INIT(0); + +static inline void ipi_cpu_crash_stop(unsigned int cpu, struct pt_regs *regs) +{ + crash_save_cpu(regs, cpu); + + atomic_dec(&waiting_for_crash_ipi); + + local_irq_disable(); + +#ifdef CONFIG_HOTPLUG_CPU + if (cpu_has_hotplug(cpu)) + cpu_ops[cpu]->cpu_stop(); +#endif + + for(;;) + wait_for_interrupt(); +} +#else +static inline void ipi_cpu_crash_stop(unsigned int cpu, struct pt_regs *regs) +{ + unreachable(); +} +#endif + static const struct riscv_ipi_ops *ipi_ops __ro_after_init;
void riscv_set_ipi_ops(const struct riscv_ipi_ops *ops) @@ -124,8 +153,9 @@ void arch_irq_work_raise(void)
void handle_IPI(struct pt_regs *regs) { - unsigned long *pending_ipis = &ipi_data[smp_processor_id()].bits; - unsigned long *stats = ipi_data[smp_processor_id()].stats; + unsigned int cpu = smp_processor_id(); + unsigned long *pending_ipis = &ipi_data[cpu].bits; + unsigned long *stats = ipi_data[cpu].stats;
riscv_clear_ipi();
@@ -154,6 +184,10 @@ void handle_IPI(struct pt_regs *regs) ipi_stop(); }
+ if (ops & (1 << IPI_CPU_CRASH_STOP)) { + ipi_cpu_crash_stop(cpu, get_irq_regs()); + } + if (ops & (1 << IPI_IRQ_WORK)) { stats[IPI_IRQ_WORK]++; irq_work_run(); @@ -176,6 +210,7 @@ static const char * const ipi_names[] = { [IPI_RESCHEDULE] = "Rescheduling interrupts", [IPI_CALL_FUNC] = "Function call interrupts", [IPI_CPU_STOP] = "CPU stop interrupts", + [IPI_CPU_CRASH_STOP] = "CPU stop (for crash dump) interrupts", [IPI_IRQ_WORK] = "IRQ work interrupts", [IPI_TIMER] = "Timer broadcast interrupts", }; @@ -235,6 +270,64 @@ void smp_send_stop(void) cpumask_pr_args(cpu_online_mask)); }
+#ifdef CONFIG_KEXEC_CORE +/* + * The number of CPUs online, not counting this CPU (which may not be + * fully online and so not counted in num_online_cpus()). + */ +static inline unsigned int num_other_online_cpus(void) +{ + unsigned int this_cpu_online = cpu_online(smp_processor_id()); + + return num_online_cpus() - this_cpu_online; +} + +void crash_smp_send_stop(void) +{ + static int cpus_stopped; + cpumask_t mask; + unsigned long timeout; + + /* + * This function can be called twice in panic path, but obviously + * we execute this only once. + */ + if (cpus_stopped) + return; + + cpus_stopped = 1; + + /* + * If this cpu is the only one alive at this point in time, online or + * not, there are no stop messages to be sent around, so just back out. + */ + if (num_other_online_cpus() == 0) + return; + + cpumask_copy(&mask, cpu_online_mask); + cpumask_clear_cpu(smp_processor_id(), &mask); + + atomic_set(&waiting_for_crash_ipi, num_other_online_cpus()); + + pr_crit("SMP: stopping secondary CPUs\n"); + send_ipi_mask(&mask, IPI_CPU_CRASH_STOP); + + /* Wait up to one second for other CPUs to stop */ + timeout = USEC_PER_SEC; + while ((atomic_read(&waiting_for_crash_ipi) > 0) && timeout--) + udelay(1); + + if (atomic_read(&waiting_for_crash_ipi) > 0) + pr_warn("SMP: failed to stop secondary CPUs %*pbl\n", + cpumask_pr_args(&mask)); +} + +bool smp_crash_stop_failed(void) +{ + return (atomic_read(&waiting_for_crash_ipi) > 0); +} +#endif + void smp_send_reschedule(int cpu) { send_ipi_single(cpu, IPI_RESCHEDULE);
From: Caleb Sander csander@purestorage.com
[ Upstream commit 899d2a05dc14733cfba6224083c6b0dd5a738590 ]
Walking the nvme_ns_head siblings list is protected by the head's srcu in nvme_ns_head_submit_bio() but not nvme_mpath_revalidate_paths(). Removing namespaces from the list also fails to synchronize the srcu. Concurrent scan work can therefore cause use-after-frees.
Hold the head's srcu lock in nvme_mpath_revalidate_paths() and synchronize with the srcu, not the global RCU, in nvme_ns_remove().
Observed the following panic when making NVMe/RDMA connections with native multipath on the Rocky Linux 8.6 kernel (it seems the upstream kernel has the same race condition). Disassembly shows the faulting instruction is cmp 0x50(%rdx),%rcx; computing capacity != get_capacity(ns->disk). Address 0x50 is dereferenced because ns->disk is NULL. The NULL disk appears to be the result of concurrent scan work freeing the namespace (note the log line in the middle of the panic).
[37314.206036] BUG: unable to handle kernel NULL pointer dereference at 0000000000000050 [37314.206036] nvme0n3: detected capacity change from 0 to 11811160064 [37314.299753] PGD 0 P4D 0 [37314.299756] Oops: 0000 [#1] SMP PTI [37314.299759] CPU: 29 PID: 322046 Comm: kworker/u98:3 Kdump: loaded Tainted: G W X --------- - - 4.18.0-372.32.1.el8test86.x86_64 #1 [37314.299762] Hardware name: Dell Inc. PowerEdge R720/0JP31P, BIOS 2.7.0 05/23/2018 [37314.299763] Workqueue: nvme-wq nvme_scan_work [nvme_core] [37314.299783] RIP: 0010:nvme_mpath_revalidate_paths+0x26/0xb0 [nvme_core] [37314.299790] Code: 1f 44 00 00 66 66 66 66 90 55 53 48 8b 5f 50 48 8b 83 c8 c9 00 00 48 8b 13 48 8b 48 50 48 39 d3 74 20 48 8d 42 d0 48 8b 50 20 <48> 3b 4a 50 74 05 f0 80 60 70 ef 48 8b 50 30 48 8d 42 d0 48 39 d3 [37315.058803] RSP: 0018:ffffabe28f913d10 EFLAGS: 00010202 [37315.121316] RAX: ffff927a077da800 RBX: ffff92991dd70000 RCX: 0000000001600000 [37315.206704] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff92991b719800 [37315.292106] RBP: ffff929a6b70c000 R08: 000000010234cd4a R09: c0000000ffff7fff [37315.377501] R10: 0000000000000001 R11: ffffabe28f913a30 R12: 0000000000000000 [37315.462889] R13: ffff92992716600c R14: ffff929964e6e030 R15: ffff92991dd70000 [37315.548286] FS: 0000000000000000(0000) GS:ffff92b87fb80000(0000) knlGS:0000000000000000 [37315.645111] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [37315.713871] CR2: 0000000000000050 CR3: 0000002208810006 CR4: 00000000000606e0 [37315.799267] Call Trace: [37315.828515] nvme_update_ns_info+0x1ac/0x250 [nvme_core] [37315.892075] nvme_validate_or_alloc_ns+0x2ff/0xa00 [nvme_core] [37315.961871] ? __blk_mq_free_request+0x6b/0x90 [37316.015021] nvme_scan_work+0x151/0x240 [nvme_core] [37316.073371] process_one_work+0x1a7/0x360 [37316.121318] ? create_worker+0x1a0/0x1a0 [37316.168227] worker_thread+0x30/0x390 [37316.212024] ? create_worker+0x1a0/0x1a0 [37316.258939] kthread+0x10a/0x120 [37316.297557] ? set_kthread_struct+0x50/0x50 [37316.347590] ret_from_fork+0x35/0x40 [37316.390360] Modules linked in: nvme_rdma nvme_tcp(X) nvme_fabrics nvme_core netconsole iscsi_tcp libiscsi_tcp dm_queue_length dm_service_time nf_conntrack_netlink br_netfilter bridge stp llc overlay nft_chain_nat ipt_MASQUERADE nf_nat xt_addrtype xt_CT nft_counter xt_state xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xt_comment xt_multiport nft_compat nf_tables libcrc32c nfnetlink dm_multipath tg3 rpcrdma sunrpc rdma_ucm ib_srpt ib_isert iscsi_target_mod target_core_mod ib_iser libiscsi scsi_transport_iscsi ib_umad rdma_cm ib_ipoib iw_cm ib_cm intel_rapl_msr iTCO_wdt iTCO_vendor_support dcdbas intel_rapl_common sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel ipmi_ssif kvm irqbypass crct10dif_pclmul crc32_pclmul mlx5_ib ghash_clmulni_intel ib_uverbs rapl intel_cstate intel_uncore ib_core ipmi_si joydev mei_me pcspkr ipmi_devintf mei lpc_ich wmi ipmi_msghandler acpi_power_meter ext4 mbcache jbd2 sd_mod t10_pi sg mgag200 mlx5_core drm_kms_helper syscopyarea [37316.390419] sysfillrect ahci sysimgblt fb_sys_fops libahci drm crc32c_intel libata mlxfw pci_hyperv_intf tls i2c_algo_bit psample dm_mirror dm_region_hash dm_log dm_mod fuse [last unloaded: nvme_core] [37317.645908] CR2: 0000000000000050
Fixes: e7d65803e2bb ("nvme-multipath: revalidate paths during rescan") Signed-off-by: Caleb Sander csander@purestorage.com Signed-off-by: Christoph Hellwig hch@lst.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/nvme/host/core.c | 2 +- drivers/nvme/host/multipath.c | 3 +++ 2 files changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c index 01c36284e542..f612a0ba64d0 100644 --- a/drivers/nvme/host/core.c +++ b/drivers/nvme/host/core.c @@ -4297,7 +4297,7 @@ static void nvme_ns_remove(struct nvme_ns *ns) mutex_unlock(&ns->ctrl->subsys->lock);
/* guarantee not available in head->list */ - synchronize_rcu(); + synchronize_srcu(&ns->head->srcu);
if (!nvme_ns_head_multipath(ns->head)) nvme_cdev_del(&ns->cdev, &ns->cdev_device); diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c index b9cf17cbbbd5..114e2b9359f8 100644 --- a/drivers/nvme/host/multipath.c +++ b/drivers/nvme/host/multipath.c @@ -174,11 +174,14 @@ void nvme_mpath_revalidate_paths(struct nvme_ns *ns) struct nvme_ns_head *head = ns->head; sector_t capacity = get_capacity(head->disk); int node; + int srcu_idx;
+ srcu_idx = srcu_read_lock(&head->srcu); list_for_each_entry_rcu(ns, &head->list, siblings) { if (capacity != get_capacity(ns->disk)) clear_bit(NVME_NS_READY, &ns->flags); } + srcu_read_unlock(&head->srcu, srcu_idx);
for_each_node(node) rcu_assign_pointer(head->current_path[node], NULL);
From: Xiongfeng Wang wangxiongfeng2@huawei.com
[ Upstream commit afca9e19cc720bfafc75dc5ce429c185ca93f31d ]
for_each_pci_dev() is implemented by pci_get_device(). The comment of pci_get_device() says that it will increase the reference count for the returned pci_dev and also decrease the reference count for the input pci_dev @from if it is not NULL.
If we break for_each_pci_dev() loop with pdev not NULL, we need to call pci_dev_put() to decrease the reference count. Add the missing pci_dev_put() before 'return true' to avoid reference count leak.
Fixes: 89a6079df791 ("iommu/vt-d: Force IOMMU on for platform opt in hint") Signed-off-by: Xiongfeng Wang wangxiongfeng2@huawei.com Link: https://lore.kernel.org/r/20221121113649.190393-2-wangxiongfeng2@huawei.com Signed-off-by: Lu Baolu baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel jroedel@suse.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/iommu/intel/iommu.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c index e47700674978..412b106d2a39 100644 --- a/drivers/iommu/intel/iommu.c +++ b/drivers/iommu/intel/iommu.c @@ -3844,8 +3844,10 @@ static inline bool has_external_pci(void) struct pci_dev *pdev = NULL;
for_each_pci_dev(pdev) - if (pdev->external_facing) + if (pdev->external_facing) { + pci_dev_put(pdev); return true; + }
return false; }
From: Xiongfeng Wang wangxiongfeng2@huawei.com
[ Upstream commit 4bedbbd782ebbe7287231fea862c158d4f08a9e3 ]
for_each_pci_dev() is implemented by pci_get_device(). The comment of pci_get_device() says that it will increase the reference count for the returned pci_dev and also decrease the reference count for the input pci_dev @from if it is not NULL.
If we break for_each_pci_dev() loop with pdev not NULL, we need to call pci_dev_put() to decrease the reference count. Add the missing pci_dev_put() for the error path to avoid reference count leak.
Fixes: 2e4552893038 ("iommu/vt-d: Unify the way to process DMAR device scope array") Signed-off-by: Xiongfeng Wang wangxiongfeng2@huawei.com Link: https://lore.kernel.org/r/20221121113649.190393-3-wangxiongfeng2@huawei.com Signed-off-by: Lu Baolu baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel jroedel@suse.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/iommu/intel/dmar.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/iommu/intel/dmar.c b/drivers/iommu/intel/dmar.c index 5a8f780e7ffd..bc94059a5b87 100644 --- a/drivers/iommu/intel/dmar.c +++ b/drivers/iommu/intel/dmar.c @@ -820,6 +820,7 @@ int __init dmar_dev_scope_init(void) info = dmar_alloc_pci_notify_info(dev, BUS_NOTIFY_ADD_DEVICE); if (!info) { + pci_dev_put(dev); return dmar_dev_scope_status; } else { dmar_pci_bus_add_dev(info);
From: David Ahern dsahern@kernel.org
[ Upstream commit 61b91eb33a69c3be11b259c5ea484505cd79f883 ]
Gwangun Jung reported a slab-out-of-bounds access in fib_nh_match: fib_nh_match+0xf98/0x1130 linux-6.0-rc7/net/ipv4/fib_semantics.c:961 fib_table_delete+0x5f3/0xa40 linux-6.0-rc7/net/ipv4/fib_trie.c:1753 inet_rtm_delroute+0x2b3/0x380 linux-6.0-rc7/net/ipv4/fib_frontend.c:874
Separate nexthop objects are mutually exclusive with the legacy multipath spec. Fix fib_nh_match to return if the config for the to be deleted route contains a multipath spec while the fib_info is using a nexthop object.
Fixes: 493ced1ac47c ("ipv4: Allow routes to use nexthop objects") Fixes: 6bf92d70e690 ("net: ipv4: fix route with nexthop object delete warning") Reported-by: Gwangun Jung exsociety@gmail.com Signed-off-by: David Ahern dsahern@kernel.org Reviewed-by: Ido Schimmel idosch@nvidia.com Tested-by: Ido Schimmel idosch@nvidia.com Signed-off-by: David S. Miller davem@davemloft.net Stable-dep-of: d5082d386eee ("ipv4: Fix route deletion when nexthop info is not specified") Signed-off-by: Sasha Levin sashal@kernel.org --- net/ipv4/fib_semantics.c | 8 ++++---- tools/testing/selftests/net/fib_nexthops.sh | 5 +++++ 2 files changed, 9 insertions(+), 4 deletions(-)
diff --git a/net/ipv4/fib_semantics.c b/net/ipv4/fib_semantics.c index 2dc97583d279..e9a7f70a54df 100644 --- a/net/ipv4/fib_semantics.c +++ b/net/ipv4/fib_semantics.c @@ -888,13 +888,13 @@ int fib_nh_match(struct net *net, struct fib_config *cfg, struct fib_info *fi, return 1; }
+ /* cannot match on nexthop object attributes */ + if (fi->nh) + return 1; + if (cfg->fc_oif || cfg->fc_gw_family) { struct fib_nh *nh;
- /* cannot match on nexthop object attributes */ - if (fi->nh) - return 1; - nh = fib_info_nh(fi, 0); if (cfg->fc_encap) { if (fib_encap_match(net, cfg->fc_encap_type, diff --git a/tools/testing/selftests/net/fib_nexthops.sh b/tools/testing/selftests/net/fib_nexthops.sh index d5a0dd548989..ee5e98204d3d 100755 --- a/tools/testing/selftests/net/fib_nexthops.sh +++ b/tools/testing/selftests/net/fib_nexthops.sh @@ -1223,6 +1223,11 @@ ipv4_fcnal() log_test $rc 0 "Delete nexthop route warning" run_cmd "$IP route delete 172.16.101.1/32 nhid 12" run_cmd "$IP nexthop del id 12" + + run_cmd "$IP nexthop add id 21 via 172.16.1.6 dev veth1" + run_cmd "$IP ro add 172.16.101.0/24 nhid 21" + run_cmd "$IP ro del 172.16.101.0/24 nexthop via 172.16.1.7 dev veth1 nexthop via 172.16.1.8 dev veth1" + log_test $? 2 "Delete multipath route with only nh id based entry" }
ipv4_grp_fcnal()
From: Ido Schimmel idosch@nvidia.com
[ Upstream commit d5082d386eee7e8ec46fa8581932c81a4961dcef ]
When the kernel receives a route deletion request from user space it tries to delete a route that matches the route attributes specified in the request.
If only prefix information is specified in the request, the kernel should delete the first matching FIB alias regardless of its associated FIB info. However, an error is currently returned when the FIB info is backed by a nexthop object:
# ip nexthop add id 1 via 192.0.2.2 dev dummy10 # ip route add 198.51.100.0/24 nhid 1 # ip route del 198.51.100.0/24 RTNETLINK answers: No such process
Fix by matching on such a FIB info when legacy nexthop attributes are not specified in the request. An earlier check already covers the case where a nexthop ID is specified in the request.
Add tests that cover these flows. Before the fix:
# ./fib_nexthops.sh -t ipv4_fcnal ... TEST: Delete route when not specifying nexthop attributes [FAIL]
Tests passed: 11 Tests failed: 1
After the fix:
# ./fib_nexthops.sh -t ipv4_fcnal ... TEST: Delete route when not specifying nexthop attributes [ OK ]
Tests passed: 12 Tests failed: 0
No regressions in other tests:
# ./fib_nexthops.sh ... Tests passed: 228 Tests failed: 0
# ./fib_tests.sh ... Tests passed: 186 Tests failed: 0
Cc: stable@vger.kernel.org Reported-by: Jonas Gorski jonas.gorski@gmail.com Tested-by: Jonas Gorski jonas.gorski@gmail.com Fixes: 493ced1ac47c ("ipv4: Allow routes to use nexthop objects") Fixes: 6bf92d70e690 ("net: ipv4: fix route with nexthop object delete warning") Fixes: 61b91eb33a69 ("ipv4: Handle attempt to delete multipath route when fib_info contains an nh reference") Signed-off-by: Ido Schimmel idosch@nvidia.com Reviewed-by: Nikolay Aleksandrov razor@blackwall.org Reviewed-by: David Ahern dsahern@kernel.org Link: https://lore.kernel.org/r/20221124210932.2470010-1-idosch@nvidia.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/ipv4/fib_semantics.c | 8 +++++--- tools/testing/selftests/net/fib_nexthops.sh | 11 +++++++++++ 2 files changed, 16 insertions(+), 3 deletions(-)
diff --git a/net/ipv4/fib_semantics.c b/net/ipv4/fib_semantics.c index e9a7f70a54df..cb24260692e1 100644 --- a/net/ipv4/fib_semantics.c +++ b/net/ipv4/fib_semantics.c @@ -888,9 +888,11 @@ int fib_nh_match(struct net *net, struct fib_config *cfg, struct fib_info *fi, return 1; }
- /* cannot match on nexthop object attributes */ - if (fi->nh) - return 1; + if (fi->nh) { + if (cfg->fc_oif || cfg->fc_gw_family || cfg->fc_mp) + return 1; + return 0; + }
if (cfg->fc_oif || cfg->fc_gw_family) { struct fib_nh *nh; diff --git a/tools/testing/selftests/net/fib_nexthops.sh b/tools/testing/selftests/net/fib_nexthops.sh index ee5e98204d3d..a47b26ab48f2 100755 --- a/tools/testing/selftests/net/fib_nexthops.sh +++ b/tools/testing/selftests/net/fib_nexthops.sh @@ -1228,6 +1228,17 @@ ipv4_fcnal() run_cmd "$IP ro add 172.16.101.0/24 nhid 21" run_cmd "$IP ro del 172.16.101.0/24 nexthop via 172.16.1.7 dev veth1 nexthop via 172.16.1.8 dev veth1" log_test $? 2 "Delete multipath route with only nh id based entry" + + run_cmd "$IP nexthop add id 22 via 172.16.1.6 dev veth1" + run_cmd "$IP ro add 172.16.102.0/24 nhid 22" + run_cmd "$IP ro del 172.16.102.0/24 dev veth1" + log_test $? 2 "Delete route when specifying only nexthop device" + + run_cmd "$IP ro del 172.16.102.0/24 via 172.16.1.6" + log_test $? 2 "Delete route when specifying only gateway" + + run_cmd "$IP ro del 172.16.102.0/24" + log_test $? 0 "Delete route when not specifying nexthop attributes" }
ipv4_grp_fcnal()
From: Yajun Deng yajun.deng@linux.dev
[ Upstream commit f5a79d7c0c87c8d88bb5e3f3c898258fdf1b3b05 ]
damon_new_scheme() has too many parameters, so introduce struct damos_access_pattern to simplify it.
In additon, we can't use a bpf trace kprobe that has more than 5 parameters.
Link: https://lkml.kernel.org/r/20220908191443.129534-1-sj@kernel.org Signed-off-by: Yajun Deng yajun.deng@linux.dev Signed-off-by: SeongJae Park sj@kernel.org Reviewed-by: SeongJae Park sj@kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Stable-dep-of: 95bc35f9bee5 ("mm/damon/sysfs: fix wrong empty schemes assumption under online tuning in damon_sysfs_set_schemes()") Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/damon.h | 37 ++++++++++++++++++---------------- mm/damon/core.c | 31 ++++++++++++++--------------- mm/damon/dbgfs.c | 27 +++++++++++++++---------- mm/damon/lru_sort.c | 46 ++++++++++++++++++++++++++----------------- mm/damon/reclaim.c | 23 +++++++++++++--------- mm/damon/sysfs.c | 17 +++++++++++----- 6 files changed, 106 insertions(+), 75 deletions(-)
diff --git a/include/linux/damon.h b/include/linux/damon.h index 7b1f4a488230..98e622c34d44 100644 --- a/include/linux/damon.h +++ b/include/linux/damon.h @@ -216,13 +216,26 @@ struct damos_stat { };
/** - * struct damos - Represents a Data Access Monitoring-based Operation Scheme. + * struct damos_access_pattern - Target access pattern of the given scheme. * @min_sz_region: Minimum size of target regions. * @max_sz_region: Maximum size of target regions. * @min_nr_accesses: Minimum ``->nr_accesses`` of target regions. * @max_nr_accesses: Maximum ``->nr_accesses`` of target regions. * @min_age_region: Minimum age of target regions. * @max_age_region: Maximum age of target regions. + */ +struct damos_access_pattern { + unsigned long min_sz_region; + unsigned long max_sz_region; + unsigned int min_nr_accesses; + unsigned int max_nr_accesses; + unsigned int min_age_region; + unsigned int max_age_region; +}; + +/** + * struct damos - Represents a Data Access Monitoring-based Operation Scheme. + * @pattern: Access pattern of target regions. * @action: &damo_action to be applied to the target regions. * @quota: Control the aggressiveness of this scheme. * @wmarks: Watermarks for automated (in)activation of this scheme. @@ -230,10 +243,8 @@ struct damos_stat { * @list: List head for siblings. * * For each aggregation interval, DAMON finds regions which fit in the - * condition (&min_sz_region, &max_sz_region, &min_nr_accesses, - * &max_nr_accesses, &min_age_region, &max_age_region) and applies &action to - * those. To avoid consuming too much CPU time or IO resources for the - * &action, "a is used. + * &pattern and applies &action to those. To avoid consuming too much + * CPU time or IO resources for the &action, "a is used. * * To do the work only when needed, schemes can be activated for specific * system situations using &wmarks. If all schemes that registered to the @@ -248,12 +259,7 @@ struct damos_stat { * &action is applied. */ struct damos { - unsigned long min_sz_region; - unsigned long max_sz_region; - unsigned int min_nr_accesses; - unsigned int max_nr_accesses; - unsigned int min_age_region; - unsigned int max_age_region; + struct damos_access_pattern pattern; enum damos_action action; struct damos_quota quota; struct damos_watermarks wmarks; @@ -501,12 +507,9 @@ void damon_destroy_region(struct damon_region *r, struct damon_target *t); int damon_set_regions(struct damon_target *t, struct damon_addr_range *ranges, unsigned int nr_ranges);
-struct damos *damon_new_scheme( - unsigned long min_sz_region, unsigned long max_sz_region, - unsigned int min_nr_accesses, unsigned int max_nr_accesses, - unsigned int min_age_region, unsigned int max_age_region, - enum damos_action action, struct damos_quota *quota, - struct damos_watermarks *wmarks); +struct damos *damon_new_scheme(struct damos_access_pattern *pattern, + enum damos_action action, struct damos_quota *quota, + struct damos_watermarks *wmarks); void damon_add_scheme(struct damon_ctx *ctx, struct damos *s); void damon_destroy_scheme(struct damos *s);
diff --git a/mm/damon/core.c b/mm/damon/core.c index 7d25dc582fe3..7d5a9ae6f4ac 100644 --- a/mm/damon/core.c +++ b/mm/damon/core.c @@ -230,24 +230,21 @@ int damon_set_regions(struct damon_target *t, struct damon_addr_range *ranges, return 0; }
-struct damos *damon_new_scheme( - unsigned long min_sz_region, unsigned long max_sz_region, - unsigned int min_nr_accesses, unsigned int max_nr_accesses, - unsigned int min_age_region, unsigned int max_age_region, - enum damos_action action, struct damos_quota *quota, - struct damos_watermarks *wmarks) +struct damos *damon_new_scheme(struct damos_access_pattern *pattern, + enum damos_action action, struct damos_quota *quota, + struct damos_watermarks *wmarks) { struct damos *scheme;
scheme = kmalloc(sizeof(*scheme), GFP_KERNEL); if (!scheme) return NULL; - scheme->min_sz_region = min_sz_region; - scheme->max_sz_region = max_sz_region; - scheme->min_nr_accesses = min_nr_accesses; - scheme->max_nr_accesses = max_nr_accesses; - scheme->min_age_region = min_age_region; - scheme->max_age_region = max_age_region; + scheme->pattern.min_sz_region = pattern->min_sz_region; + scheme->pattern.max_sz_region = pattern->max_sz_region; + scheme->pattern.min_nr_accesses = pattern->min_nr_accesses; + scheme->pattern.max_nr_accesses = pattern->max_nr_accesses; + scheme->pattern.min_age_region = pattern->min_age_region; + scheme->pattern.max_age_region = pattern->max_age_region; scheme->action = action; scheme->stat = (struct damos_stat){}; INIT_LIST_HEAD(&scheme->list); @@ -667,10 +664,12 @@ static bool __damos_valid_target(struct damon_region *r, struct damos *s) unsigned long sz;
sz = r->ar.end - r->ar.start; - return s->min_sz_region <= sz && sz <= s->max_sz_region && - s->min_nr_accesses <= r->nr_accesses && - r->nr_accesses <= s->max_nr_accesses && - s->min_age_region <= r->age && r->age <= s->max_age_region; + return s->pattern.min_sz_region <= sz && + sz <= s->pattern.max_sz_region && + s->pattern.min_nr_accesses <= r->nr_accesses && + r->nr_accesses <= s->pattern.max_nr_accesses && + s->pattern.min_age_region <= r->age && + r->age <= s->pattern.max_age_region; }
static bool damos_valid_target(struct damon_ctx *c, struct damon_target *t, diff --git a/mm/damon/dbgfs.c b/mm/damon/dbgfs.c index dafe7e71329b..61214cb9a5d3 100644 --- a/mm/damon/dbgfs.c +++ b/mm/damon/dbgfs.c @@ -131,9 +131,12 @@ static ssize_t sprint_schemes(struct damon_ctx *c, char *buf, ssize_t len) damon_for_each_scheme(s, c) { rc = scnprintf(&buf[written], len - written, "%lu %lu %u %u %u %u %d %lu %lu %lu %u %u %u %d %lu %lu %lu %lu %lu %lu %lu %lu %lu\n", - s->min_sz_region, s->max_sz_region, - s->min_nr_accesses, s->max_nr_accesses, - s->min_age_region, s->max_age_region, + s->pattern.min_sz_region, + s->pattern.max_sz_region, + s->pattern.min_nr_accesses, + s->pattern.max_nr_accesses, + s->pattern.min_age_region, + s->pattern.max_age_region, damos_action_to_dbgfs_scheme_action(s->action), s->quota.ms, s->quota.sz, s->quota.reset_interval, @@ -221,8 +224,6 @@ static struct damos **str_to_schemes(const char *str, ssize_t len, struct damos *scheme, **schemes; const int max_nr_schemes = 256; int pos = 0, parsed, ret; - unsigned long min_sz, max_sz; - unsigned int min_nr_a, max_nr_a, min_age, max_age; unsigned int action_input; enum damos_action action;
@@ -233,13 +234,18 @@ static struct damos **str_to_schemes(const char *str, ssize_t len,
*nr_schemes = 0; while (pos < len && *nr_schemes < max_nr_schemes) { + struct damos_access_pattern pattern = {}; struct damos_quota quota = {}; struct damos_watermarks wmarks;
ret = sscanf(&str[pos], "%lu %lu %u %u %u %u %u %lu %lu %lu %u %u %u %u %lu %lu %lu %lu%n", - &min_sz, &max_sz, &min_nr_a, &max_nr_a, - &min_age, &max_age, &action_input, "a.ms, + &pattern.min_sz_region, &pattern.max_sz_region, + &pattern.min_nr_accesses, + &pattern.max_nr_accesses, + &pattern.min_age_region, + &pattern.max_age_region, + &action_input, "a.ms, "a.sz, "a.reset_interval, "a.weight_sz, "a.weight_nr_accesses, "a.weight_age, &wmarks.metric, @@ -251,7 +257,9 @@ static struct damos **str_to_schemes(const char *str, ssize_t len, if ((int)action < 0) goto fail;
- if (min_sz > max_sz || min_nr_a > max_nr_a || min_age > max_age) + if (pattern.min_sz_region > pattern.max_sz_region || + pattern.min_nr_accesses > pattern.max_nr_accesses || + pattern.min_age_region > pattern.max_age_region) goto fail;
if (wmarks.high < wmarks.mid || wmarks.high < wmarks.low || @@ -259,8 +267,7 @@ static struct damos **str_to_schemes(const char *str, ssize_t len, goto fail;
pos += parsed; - scheme = damon_new_scheme(min_sz, max_sz, min_nr_a, max_nr_a, - min_age, max_age, action, "a, &wmarks); + scheme = damon_new_scheme(&pattern, action, "a, &wmarks); if (!scheme) goto fail;
diff --git a/mm/damon/lru_sort.c b/mm/damon/lru_sort.c index 9de6f00a71c5..0184ed4828b7 100644 --- a/mm/damon/lru_sort.c +++ b/mm/damon/lru_sort.c @@ -293,6 +293,17 @@ static bool get_monitoring_region(unsigned long *start, unsigned long *end) /* Create a DAMON-based operation scheme for hot memory regions */ static struct damos *damon_lru_sort_new_hot_scheme(unsigned int hot_thres) { + struct damos_access_pattern pattern = { + /* Find regions having PAGE_SIZE or larger size */ + .min_sz_region = PAGE_SIZE, + .max_sz_region = ULONG_MAX, + /* and accessed for more than the threshold */ + .min_nr_accesses = hot_thres, + .max_nr_accesses = UINT_MAX, + /* no matter its age */ + .min_age_region = 0, + .max_age_region = UINT_MAX, + }; struct damos_watermarks wmarks = { .metric = DAMOS_WMARK_FREE_MEM_RATE, .interval = wmarks_interval, @@ -313,26 +324,31 @@ static struct damos *damon_lru_sort_new_hot_scheme(unsigned int hot_thres) .weight_nr_accesses = 1, .weight_age = 0, }; - struct damos *scheme = damon_new_scheme( - /* Find regions having PAGE_SIZE or larger size */ - PAGE_SIZE, ULONG_MAX, - /* and accessed for more than the threshold */ - hot_thres, UINT_MAX, - /* no matter its age */ - 0, UINT_MAX, + + return damon_new_scheme( + &pattern, /* prioritize those on LRU lists, as soon as found */ DAMOS_LRU_PRIO, /* under the quota. */ "a, /* (De)activate this according to the watermarks. */ &wmarks); - - return scheme; }
/* Create a DAMON-based operation scheme for cold memory regions */ static struct damos *damon_lru_sort_new_cold_scheme(unsigned int cold_thres) { + struct damos_access_pattern pattern = { + /* Find regions having PAGE_SIZE or larger size */ + .min_sz_region = PAGE_SIZE, + .max_sz_region = ULONG_MAX, + /* and not accessed at all */ + .min_nr_accesses = 0, + .max_nr_accesses = 0, + /* for min_age or more micro-seconds */ + .min_age_region = cold_thres, + .max_age_region = UINT_MAX, + }; struct damos_watermarks wmarks = { .metric = DAMOS_WMARK_FREE_MEM_RATE, .interval = wmarks_interval, @@ -354,21 +370,15 @@ static struct damos *damon_lru_sort_new_cold_scheme(unsigned int cold_thres) .weight_nr_accesses = 0, .weight_age = 1, }; - struct damos *scheme = damon_new_scheme( - /* Find regions having PAGE_SIZE or larger size */ - PAGE_SIZE, ULONG_MAX, - /* and not accessed at all */ - 0, 0, - /* for cold_thres or more micro-seconds, and */ - cold_thres, UINT_MAX, + + return damon_new_scheme( + &pattern, /* mark those as not accessed, as soon as found */ DAMOS_LRU_DEPRIO, /* under the quota. */ "a, /* (De)activate this according to the watermarks. */ &wmarks); - - return scheme; }
static int damon_lru_sort_apply_parameters(void) diff --git a/mm/damon/reclaim.c b/mm/damon/reclaim.c index a7faf51b4bd4..5aeca0b9e88e 100644 --- a/mm/damon/reclaim.c +++ b/mm/damon/reclaim.c @@ -264,6 +264,17 @@ static bool get_monitoring_region(unsigned long *start, unsigned long *end)
static struct damos *damon_reclaim_new_scheme(void) { + struct damos_access_pattern pattern = { + /* Find regions having PAGE_SIZE or larger size */ + .min_sz_region = PAGE_SIZE, + .max_sz_region = ULONG_MAX, + /* and not accessed at all */ + .min_nr_accesses = 0, + .max_nr_accesses = 0, + /* for min_age or more micro-seconds */ + .min_age_region = min_age / aggr_interval, + .max_age_region = UINT_MAX, + }; struct damos_watermarks wmarks = { .metric = DAMOS_WMARK_FREE_MEM_RATE, .interval = wmarks_interval, @@ -284,21 +295,15 @@ static struct damos *damon_reclaim_new_scheme(void) .weight_nr_accesses = 0, .weight_age = 1 }; - struct damos *scheme = damon_new_scheme( - /* Find regions having PAGE_SIZE or larger size */ - PAGE_SIZE, ULONG_MAX, - /* and not accessed at all */ - 0, 0, - /* for min_age or more micro-seconds, and */ - min_age / aggr_interval, UINT_MAX, + + return damon_new_scheme( + &pattern, /* page out those, as soon as found */ DAMOS_PAGEOUT, /* under the quota. */ "a, /* (De)activate this according to the watermarks. */ &wmarks); - - return scheme; }
static int damon_reclaim_apply_parameters(void) diff --git a/mm/damon/sysfs.c b/mm/damon/sysfs.c index b4b9614eecbe..ec88644c51df 100644 --- a/mm/damon/sysfs.c +++ b/mm/damon/sysfs.c @@ -2259,11 +2259,20 @@ static int damon_sysfs_set_targets(struct damon_ctx *ctx, static struct damos *damon_sysfs_mk_scheme( struct damon_sysfs_scheme *sysfs_scheme) { - struct damon_sysfs_access_pattern *pattern = + struct damon_sysfs_access_pattern *access_pattern = sysfs_scheme->access_pattern; struct damon_sysfs_quotas *sysfs_quotas = sysfs_scheme->quotas; struct damon_sysfs_weights *sysfs_weights = sysfs_quotas->weights; struct damon_sysfs_watermarks *sysfs_wmarks = sysfs_scheme->watermarks; + + struct damos_access_pattern pattern = { + .min_sz_region = access_pattern->sz->min, + .max_sz_region = access_pattern->sz->max, + .min_nr_accesses = access_pattern->nr_accesses->min, + .max_nr_accesses = access_pattern->nr_accesses->max, + .min_age_region = access_pattern->age->min, + .max_age_region = access_pattern->age->max, + }; struct damos_quota quota = { .ms = sysfs_quotas->ms, .sz = sysfs_quotas->sz, @@ -2280,10 +2289,8 @@ static struct damos *damon_sysfs_mk_scheme( .low = sysfs_wmarks->low, };
- return damon_new_scheme(pattern->sz->min, pattern->sz->max, - pattern->nr_accesses->min, pattern->nr_accesses->max, - pattern->age->min, pattern->age->max, - sysfs_scheme->action, "a, &wmarks); + return damon_new_scheme(&pattern, sysfs_scheme->action, "a, + &wmarks); }
static int damon_sysfs_set_schemes(struct damon_ctx *ctx,
From: SeongJae Park sj@kernel.org
[ Upstream commit 95bc35f9bee5220dad4e8567654ab3288a181639 ]
Commit da87878010e5 ("mm/damon/sysfs: support online inputs update") made 'damon_sysfs_set_schemes()' to be called for running DAMON context, which could have schemes. In the case, DAMON sysfs interface is supposed to update, remove, or add schemes to reflect the sysfs files. However, the code is assuming the DAMON context wouldn't have schemes at all, and therefore creates and adds new schemes. As a result, the code doesn't work as intended for online schemes tuning and could have more than expected memory footprint. The schemes are all in the DAMON context, so it doesn't leak the memory, though.
Remove the wrong asssumption (the DAMON context wouldn't have schemes) in 'damon_sysfs_set_schemes()' to fix the bug.
Link: https://lkml.kernel.org/r/20221122194831.3472-1-sj@kernel.org Fixes: da87878010e5 ("mm/damon/sysfs: support online inputs update") Signed-off-by: SeongJae Park sj@kernel.org Cc: stable@vger.kernel.org [5.19+] Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- mm/damon/sysfs.c | 46 ++++++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 44 insertions(+), 2 deletions(-)
diff --git a/mm/damon/sysfs.c b/mm/damon/sysfs.c index ec88644c51df..1b782ca41396 100644 --- a/mm/damon/sysfs.c +++ b/mm/damon/sysfs.c @@ -2293,12 +2293,54 @@ static struct damos *damon_sysfs_mk_scheme( &wmarks); }
+static void damon_sysfs_update_scheme(struct damos *scheme, + struct damon_sysfs_scheme *sysfs_scheme) +{ + struct damon_sysfs_access_pattern *access_pattern = + sysfs_scheme->access_pattern; + struct damon_sysfs_quotas *sysfs_quotas = sysfs_scheme->quotas; + struct damon_sysfs_weights *sysfs_weights = sysfs_quotas->weights; + struct damon_sysfs_watermarks *sysfs_wmarks = sysfs_scheme->watermarks; + + scheme->pattern.min_sz_region = access_pattern->sz->min; + scheme->pattern.max_sz_region = access_pattern->sz->max; + scheme->pattern.min_nr_accesses = access_pattern->nr_accesses->min; + scheme->pattern.max_nr_accesses = access_pattern->nr_accesses->max; + scheme->pattern.min_age_region = access_pattern->age->min; + scheme->pattern.max_age_region = access_pattern->age->max; + + scheme->action = sysfs_scheme->action; + + scheme->quota.ms = sysfs_quotas->ms; + scheme->quota.sz = sysfs_quotas->sz; + scheme->quota.reset_interval = sysfs_quotas->reset_interval_ms; + scheme->quota.weight_sz = sysfs_weights->sz; + scheme->quota.weight_nr_accesses = sysfs_weights->nr_accesses; + scheme->quota.weight_age = sysfs_weights->age; + + scheme->wmarks.metric = sysfs_wmarks->metric; + scheme->wmarks.interval = sysfs_wmarks->interval_us; + scheme->wmarks.high = sysfs_wmarks->high; + scheme->wmarks.mid = sysfs_wmarks->mid; + scheme->wmarks.low = sysfs_wmarks->low; +} + static int damon_sysfs_set_schemes(struct damon_ctx *ctx, struct damon_sysfs_schemes *sysfs_schemes) { - int i; + struct damos *scheme, *next; + int i = 0; + + damon_for_each_scheme_safe(scheme, next, ctx) { + if (i < sysfs_schemes->nr) + damon_sysfs_update_scheme(scheme, + sysfs_schemes->schemes_arr[i]); + else + damon_destroy_scheme(scheme); + i++; + }
- for (i = 0; i < sysfs_schemes->nr; i++) { + for (; i < sysfs_schemes->nr; i++) { struct damos *scheme, *next;
scheme = damon_sysfs_mk_scheme(sysfs_schemes->schemes_arr[i]);
From: Ricardo Ribalda ribalda@chromium.org
commit 79ece9b292af6b0edcfb4d67a00711d25507640b upstream.
A driver that supports I2C_DRV_ACPI_WAIVE_D0_PROBE is not expected to power off a device that it has not powered on previously.
For devices operating in "full_power" mode, the first call to `i2c_acpi_waive_d0_probe` will return 0, which means that the device will be turned on with `dev_pm_domain_attach`.
If probe fails the second call to `i2c_acpi_waive_d0_probe` will return 1, which means that the device will not be turned off. This is, it will be left in a different power state. Lets fix it.
Reviewed-by: Hidenori Kobayashi hidenorik@chromium.org Reviewed-by: Sergey Senozhatsky senozhatsky@chromium.org Reviewed-by: Sakari Ailus sakari.ailus@linux.intel.com Cc: stable@vger.kernel.org Fixes: b18c1ad685d9 ("i2c: Allow an ACPI driver to manage the device's power state during probe") Signed-off-by: Ricardo Ribalda ribalda@chromium.org Reviewed-by: Mika Westerberg mika.westerberg@linux.intel.com Signed-off-by: Wolfram Sang wsa@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/i2c/i2c-core-base.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-)
--- a/drivers/i2c/i2c-core-base.c +++ b/drivers/i2c/i2c-core-base.c @@ -467,6 +467,7 @@ static int i2c_device_probe(struct devic { struct i2c_client *client = i2c_verify_client(dev); struct i2c_driver *driver; + bool do_power_on; int status;
if (!client) @@ -541,8 +542,8 @@ static int i2c_device_probe(struct devic if (status < 0) goto err_clear_wakeup_irq;
- status = dev_pm_domain_attach(&client->dev, - !i2c_acpi_waive_d0_probe(dev)); + do_power_on = !i2c_acpi_waive_d0_probe(dev); + status = dev_pm_domain_attach(&client->dev, do_power_on); if (status) goto err_clear_wakeup_irq;
@@ -581,7 +582,7 @@ static int i2c_device_probe(struct devic err_release_driver_resources: devres_release_group(&client->dev, client->devres_group_id); err_detach_pm_domain: - dev_pm_domain_detach(&client->dev, !i2c_acpi_waive_d0_probe(dev)); + dev_pm_domain_detach(&client->dev, do_power_on); err_clear_wakeup_irq: dev_pm_clear_wake_irq(&client->dev); device_init_wakeup(&client->dev, false); @@ -610,7 +611,7 @@ static void i2c_device_remove(struct dev
devres_release_group(&client->dev, client->devres_group_id);
- dev_pm_domain_detach(&client->dev, !i2c_acpi_waive_d0_probe(dev)); + dev_pm_domain_detach(&client->dev, true);
dev_pm_clear_wake_irq(&client->dev); device_init_wakeup(&client->dev, false);
From: Yuan Can yuancan@huawei.com
[ Upstream commit 145900cf91c4b32ac05dbc8675a0c7f4a278749d ]
A problem about i2c-npcm7xx create debugfs failed is triggered with the following log given:
[ 173.827310] debugfs: Directory 'npcm_i2c' with parent '/' already present!
The reason is that npcm_i2c_init() returns platform_driver_register() directly without checking its return value, if platform_driver_register() failed, it returns without destroy the newly created debugfs, resulting the debugfs of npcm_i2c can never be created later.
npcm_i2c_init() debugfs_create_dir() # create debugfs directory platform_driver_register() driver_register() bus_add_driver() priv = kzalloc(...) # OOM happened # return without destroy debugfs directory
Fix by removing debugfs when platform_driver_register() returns error.
Fixes: 56a1485b102e ("i2c: npcm7xx: Add Nuvoton NPCM I2C controller driver") Signed-off-by: Yuan Can yuancan@huawei.com Reviewed-by: Tali Perry tali.perry@nuvoton.com Signed-off-by: Wolfram Sang wsa@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/i2c/busses/i2c-npcm7xx.c | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-)
diff --git a/drivers/i2c/busses/i2c-npcm7xx.c b/drivers/i2c/busses/i2c-npcm7xx.c index 0c365b57d957..83457359ec45 100644 --- a/drivers/i2c/busses/i2c-npcm7xx.c +++ b/drivers/i2c/busses/i2c-npcm7xx.c @@ -2393,8 +2393,17 @@ static struct platform_driver npcm_i2c_bus_driver = {
static int __init npcm_i2c_init(void) { + int ret; + npcm_i2c_debugfs_dir = debugfs_create_dir("npcm_i2c", NULL); - return platform_driver_register(&npcm_i2c_bus_driver); + + ret = platform_driver_register(&npcm_i2c_bus_driver); + if (ret) { + debugfs_remove_recursive(npcm_i2c_debugfs_dir); + return ret; + } + + return 0; } module_init(npcm_i2c_init);
From: Wang Yufen wangyufen@huawei.com
[ Upstream commit 7d8ccf4f117d082156e842d959f634efcf203cef ]
Fix to return a negative error code from the gi2c->err instead of 0.
Fixes: d8703554f4de ("i2c: qcom-geni: Add support for GPI DMA") Signed-off-by: Wang Yufen wangyufen@huawei.com Reviewed-by: Tommaso Merciai tommaso.merciai@amarulasoluitons.com Signed-off-by: Wolfram Sang wsa@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/i2c/busses/i2c-qcom-geni.c | 1 - 1 file changed, 1 deletion(-)
diff --git a/drivers/i2c/busses/i2c-qcom-geni.c b/drivers/i2c/busses/i2c-qcom-geni.c index 84a77512614d..8fce98bb77ff 100644 --- a/drivers/i2c/busses/i2c-qcom-geni.c +++ b/drivers/i2c/busses/i2c-qcom-geni.c @@ -626,7 +626,6 @@ static int geni_i2c_gpi_xfer(struct geni_i2c_dev *gi2c, struct i2c_msg msgs[], i dev_err(gi2c->se.dev, "I2C timeout gpi flags:%d addr:0x%x\n", gi2c->cur->flags, gi2c->cur->addr); gi2c->err = -ETIMEDOUT; - goto err; }
if (gi2c->err) {
From: Andrew Lunn andrew@lunn.ch
[ Upstream commit d36678f7905cbd1dc55a8a96e066dafd749d4600 ]
Recent changes to the DMA code has resulting in the IMX driver failing I2C transfers when the buffer has been vmalloc. Only perform DMA transfers if the message has the I2C_M_DMA_SAFE flag set, indicating the client is providing a buffer which is DMA safe.
This is a minimal fix for stable. The I2C core provides helpers to allocate a bounce buffer. For a fuller fix the master should make use of these helpers.
Fixes: 4544b9f25e70 ("dma-mapping: Add vmap checks to dma_map_single()") Signed-off-by: Andrew Lunn andrew@lunn.ch Acked-by: Oleksij Rempel o.rempel@pengutronix.de Signed-off-by: Wolfram Sang wsa@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/i2c/busses/i2c-imx.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/i2c/busses/i2c-imx.c b/drivers/i2c/busses/i2c-imx.c index 3082183bd66a..fc70920c4dda 100644 --- a/drivers/i2c/busses/i2c-imx.c +++ b/drivers/i2c/busses/i2c-imx.c @@ -1132,7 +1132,8 @@ static int i2c_imx_read(struct imx_i2c_struct *i2c_imx, struct i2c_msg *msgs, int i, result; unsigned int temp; int block_data = msgs->flags & I2C_M_RECV_LEN; - int use_dma = i2c_imx->dma && msgs->len >= DMA_THRESHOLD && !block_data; + int use_dma = i2c_imx->dma && msgs->flags & I2C_M_DMA_SAFE && + msgs->len >= DMA_THRESHOLD && !block_data;
dev_dbg(&i2c_imx->adapter.dev, "<%s> write slave address: addr=0x%x\n", @@ -1298,7 +1299,8 @@ static int i2c_imx_xfer_common(struct i2c_adapter *adapter, result = i2c_imx_read(i2c_imx, &msgs[i], is_lastmsg, atomic); } else { if (!atomic && - i2c_imx->dma && msgs[i].len >= DMA_THRESHOLD) + i2c_imx->dma && msgs[i].len >= DMA_THRESHOLD && + msgs[i].flags & I2C_M_DMA_SAFE) result = i2c_imx_dma_write(i2c_imx, &msgs[i]); else result = i2c_imx_write(i2c_imx, &msgs[i], atomic);
From: Vishal Verma vishal.l.verma@intel.com
[ Upstream commit 14f16d47561ba9249efc6c2db9d47ed56841f070 ]
In hmat_register_target_initiators(), the variable 'best' gets initialized in the outer per-locality-type for loop. The initialization just before setting up 'Access 1' targets was unnecessary. Remove it.
Cc: Rafael J. Wysocki rafael@kernel.org Cc: Liu Shixin liushixin2@huawei.com Cc: Dan Williams dan.j.williams@intel.com Acked-by: Kirill A. Shutemov kirill.shutemov@linux.intel.com Acked-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Vishal Verma vishal.l.verma@intel.com Link: https://lore.kernel.org/r/20221116-acpi_hmat_fix-v2-1-3712569be691@intel.com Signed-off-by: Dan Williams dan.j.williams@intel.com Stable-dep-of: 48d4180939e1 ("ACPI: HMAT: Fix initiator registration for single-initiator systems") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/acpi/numa/hmat.c | 1 - 1 file changed, 1 deletion(-)
diff --git a/drivers/acpi/numa/hmat.c b/drivers/acpi/numa/hmat.c index c3d783aca196..fca69a726360 100644 --- a/drivers/acpi/numa/hmat.c +++ b/drivers/acpi/numa/hmat.c @@ -645,7 +645,6 @@ static void hmat_register_target_initiators(struct memory_target *target) /* Access 1 ignores Generic Initiators */ bitmap_zero(p_nodes, MAX_NUMNODES); list_sort(p_nodes, &initiators, initiator_cmp); - best = 0; for (i = WRITE_LATENCY; i <= READ_BANDWIDTH; i++) { loc = localities_types[i]; if (!loc)
From: Vishal Verma vishal.l.verma@intel.com
[ Upstream commit 48d4180939e12c4bd2846f984436d895bb9699ed ]
In a system with a single initiator node, and one or more memory-only 'target' nodes, the memory-only node(s) would fail to register their initiator node correctly. i.e. in sysfs:
# ls /sys/devices/system/node/node0/access0/targets/ node0
Where as the correct behavior should be:
# ls /sys/devices/system/node/node0/access0/targets/ node0 node1
This happened because hmat_register_target_initiators() uses list_sort() to sort the initiator list, but the sort comparision function (initiator_cmp()) is overloaded to also set the node mask's bits.
In a system with a single initiator, the list is singular, and list_sort elides the comparision helper call. Thus the node mask never gets set, and the subsequent search for the best initiator comes up empty.
Add a new helper to consume the sorted initiator list, and generate the nodemask, decoupling it from the overloaded initiator_cmp() comparision callback. This prevents the singular list corner case naturally, and makes the code easier to follow as well.
Cc: stable@vger.kernel.org Cc: Rafael J. Wysocki rafael@kernel.org Cc: Liu Shixin liushixin2@huawei.com Cc: Dan Williams dan.j.williams@intel.com Cc: Kirill A. Shutemov kirill.shutemov@linux.intel.com Reported-by: Chris Piper chris.d.piper@intel.com Signed-off-by: Vishal Verma vishal.l.verma@intel.com Acked-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Acked-by: Kirill A. Shutemov kirill.shutemov@linux.intel.com Link: https://lore.kernel.org/r/20221116-acpi_hmat_fix-v2-2-3712569be691@intel.com Signed-off-by: Dan Williams dan.j.williams@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/acpi/numa/hmat.c | 26 ++++++++++++++++++++------ 1 file changed, 20 insertions(+), 6 deletions(-)
diff --git a/drivers/acpi/numa/hmat.c b/drivers/acpi/numa/hmat.c index fca69a726360..b42653707fdc 100644 --- a/drivers/acpi/numa/hmat.c +++ b/drivers/acpi/numa/hmat.c @@ -563,17 +563,26 @@ static int initiator_cmp(void *priv, const struct list_head *a, { struct memory_initiator *ia; struct memory_initiator *ib; - unsigned long *p_nodes = priv;
ia = list_entry(a, struct memory_initiator, node); ib = list_entry(b, struct memory_initiator, node);
- set_bit(ia->processor_pxm, p_nodes); - set_bit(ib->processor_pxm, p_nodes); - return ia->processor_pxm - ib->processor_pxm; }
+static int initiators_to_nodemask(unsigned long *p_nodes) +{ + struct memory_initiator *initiator; + + if (list_empty(&initiators)) + return -ENXIO; + + list_for_each_entry(initiator, &initiators, node) + set_bit(initiator->processor_pxm, p_nodes); + + return 0; +} + static void hmat_register_target_initiators(struct memory_target *target) { static DECLARE_BITMAP(p_nodes, MAX_NUMNODES); @@ -610,7 +619,10 @@ static void hmat_register_target_initiators(struct memory_target *target) * initiators. */ bitmap_zero(p_nodes, MAX_NUMNODES); - list_sort(p_nodes, &initiators, initiator_cmp); + list_sort(NULL, &initiators, initiator_cmp); + if (initiators_to_nodemask(p_nodes) < 0) + return; + if (!access0done) { for (i = WRITE_LATENCY; i <= READ_BANDWIDTH; i++) { loc = localities_types[i]; @@ -644,7 +656,9 @@ static void hmat_register_target_initiators(struct memory_target *target)
/* Access 1 ignores Generic Initiators */ bitmap_zero(p_nodes, MAX_NUMNODES); - list_sort(p_nodes, &initiators, initiator_cmp); + if (initiators_to_nodemask(p_nodes) < 0) + return; + for (i = WRITE_LATENCY; i <= READ_BANDWIDTH; i++) { loc = localities_types[i]; if (!loc)
From: Conor Dooley conor.dooley@microchip.com
[ Upstream commit d9f15a9de44affe733e34f93bc184945ba277e6d ]
This reverts commit 232ccac1bd9b5bfe73895f527c08623e7fa0752d.
On the subject of suspend, the RISC-V SBI spec states:
This does not cover whether any given events actually reach the hart or not, just what the hart will do if it receives an event. On PolarFire SoC, and potentially other SiFive based implementations, events from the RISC-V timer do reach a hart during suspend. This is not the case for the implementation on the Allwinner D1 - there timer events are not received during suspend.
To fix this, the CLOCK_EVT_FEAT_C3STOP (mis)feature was enabled for the timer driver - but this has broken both RCU stall detection and timers generally on PolarFire SoC and potentially other SiFive based implementations.
If an AXI read to the PCIe controller on PolarFire SoC times out, the system will stall, however, with CLOCK_EVT_FEAT_C3STOP active, the system just locks up without RCU stalling:
io scheduler mq-deadline registered io scheduler kyber registered microchip-pcie 2000000000.pcie: host bridge /soc/pcie@2000000000 ranges: microchip-pcie 2000000000.pcie: MEM 0x2008000000..0x2087ffffff -> 0x0008000000 microchip-pcie 2000000000.pcie: sec error in pcie2axi buffer microchip-pcie 2000000000.pcie: ded error in pcie2axi buffer microchip-pcie 2000000000.pcie: axi read request error microchip-pcie 2000000000.pcie: axi read timeout microchip-pcie 2000000000.pcie: sec error in pcie2axi buffer microchip-pcie 2000000000.pcie: ded error in pcie2axi buffer microchip-pcie 2000000000.pcie: sec error in pcie2axi buffer microchip-pcie 2000000000.pcie: ded error in pcie2axi buffer microchip-pcie 2000000000.pcie: sec error in pcie2axi buffer microchip-pcie 2000000000.pcie: ded error in pcie2axi buffer Freeing initrd memory: 7332K
Similarly issues were reported with clock_nanosleep() - with a test app that sleeps each cpu for 6, 5, 4, 3 ms respectively, HZ=250 & the blamed commit in place, the sleep times are rounded up to the next jiffy:
== CPU: 1 == == CPU: 2 == == CPU: 3 == == CPU: 4 == Mean: 7.974992 Mean: 7.976534 Mean: 7.962591 Mean: 3.952179 Std Dev: 0.154374 Std Dev: 0.156082 Std Dev: 0.171018 Std Dev: 0.076193 Hi: 9.472000 Hi: 10.495000 Hi: 8.864000 Hi: 4.736000 Lo: 6.087000 Lo: 6.380000 Lo: 4.872000 Lo: 3.403000 Samples: 521 Samples: 521 Samples: 521 Samples: 521
Fortunately, the D1 has a second timer, which is "currently used in preference to the RISC-V/SBI timer driver" so a revert here does not hurt operation of D1 in its current form.
Ultimately, a DeviceTree property (or node) will be added to encode the behaviour of the timers, but until then revert the addition of CLOCK_EVT_FEAT_C3STOP.
Fixes: 232ccac1bd9b ("clocksource/drivers/riscv: Events are stopped during CPU suspend") Signed-off-by: Conor Dooley conor.dooley@microchip.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Reviewed-by: Palmer Dabbelt palmer@rivosinc.com Acked-by: Palmer Dabbelt palmer@rivosinc.com Acked-by: Samuel Holland samuel@sholland.org Link: https://lore.kernel.org/linux-riscv/YzYTNQRxLr7Q9JR0@spud/ Link: https://github.com/riscv-non-isa/riscv-sbi-doc/issues/98/ Link: https://lore.kernel.org/linux-riscv/bf6d3b1f-f703-4a25-833e-972a44a04114@sho... Link: https://lore.kernel.org/r/20221122121620.3522431-1-conor.dooley@microchip.co... Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/clocksource/timer-riscv.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/clocksource/timer-riscv.c b/drivers/clocksource/timer-riscv.c index 969a552da8d2..a0d66fabf073 100644 --- a/drivers/clocksource/timer-riscv.c +++ b/drivers/clocksource/timer-riscv.c @@ -51,7 +51,7 @@ static int riscv_clock_next_event(unsigned long delta, static unsigned int riscv_clock_event_irq; static DEFINE_PER_CPU(struct clock_event_device, riscv_clock_event) = { .name = "riscv_timer_clockevent", - .features = CLOCK_EVT_FEAT_ONESHOT | CLOCK_EVT_FEAT_C3STOP, + .features = CLOCK_EVT_FEAT_ONESHOT, .rating = 100, .set_next_event = riscv_clock_next_event, };
From: Jan Dabros jsd@semihalf.com
commit 23393c6461422df5bf8084a086ada9a7e17dc2ba upstream.
Currently tpm transactions are executed unconditionally in tpm_pm_suspend() function, which may lead to races with other tpm accessors in the system.
Specifically, the hw_random tpm driver makes use of tpm_get_random(), and this function is called in a loop from a kthread, which means it's not frozen alongside userspace, and so can race with the work done during system suspend:
tpm tpm0: tpm_transmit: tpm_recv: error -52 tpm tpm0: invalid TPM_STS.x 0xff, dumping stack for forensics CPU: 0 PID: 1 Comm: init Not tainted 6.1.0-rc5+ #135 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.0-20220807_005459-localhost 04/01/2014 Call Trace: tpm_tis_status.cold+0x19/0x20 tpm_transmit+0x13b/0x390 tpm_transmit_cmd+0x20/0x80 tpm1_pm_suspend+0xa6/0x110 tpm_pm_suspend+0x53/0x80 __pnp_bus_suspend+0x35/0xe0 __device_suspend+0x10f/0x350
Fix this by calling tpm_try_get_ops(), which itself is a wrapper around tpm_chip_start(), but takes the appropriate mutex.
Signed-off-by: Jan Dabros jsd@semihalf.com Reported-by: Vlastimil Babka vbabka@suse.cz Tested-by: Jason A. Donenfeld Jason@zx2c4.com Tested-by: Vlastimil Babka vbabka@suse.cz Link: https://lore.kernel.org/all/c5ba47ef-393f-1fba-30bd-1230d1b4b592@suse.cz/ Cc: stable@vger.kernel.org Fixes: e891db1a18bf ("tpm: turn on TPM on suspend for TPM 1.x") [Jason: reworked commit message, added metadata] Signed-off-by: Jason A. Donenfeld Jason@zx2c4.com Reviewed-by: Jarkko Sakkinen jarkko@kernel.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/char/tpm/tpm-interface.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
--- a/drivers/char/tpm/tpm-interface.c +++ b/drivers/char/tpm/tpm-interface.c @@ -401,13 +401,14 @@ int tpm_pm_suspend(struct device *dev) !pm_suspend_via_firmware()) goto suspended;
- if (!tpm_chip_start(chip)) { + rc = tpm_try_get_ops(chip); + if (!rc) { if (chip->flags & TPM_CHIP_FLAG_TPM2) tpm2_shutdown(chip, TPM2_SU_STATE); else rc = tpm1_pm_suspend(chip, tpm_suspend_pcr);
- tpm_chip_stop(chip); + tpm_put_ops(chip); }
suspended:
From: Zhang Xiaoxu zhangxiaoxu5@huawei.com
commit 8c9a59939deb4bfafdc451100c03d1e848b4169b upstream.
There is a kmemleak when test the raydium_i2c_ts with bpf mock device:
unreferenced object 0xffff88812d3675a0 (size 8): comm "python3", pid 349, jiffies 4294741067 (age 95.695s) hex dump (first 8 bytes): 11 0e 10 c0 01 00 04 00 ........ backtrace: [<0000000068427125>] __kmalloc+0x46/0x1b0 [<0000000090180f91>] raydium_i2c_send+0xd4/0x2bf [raydium_i2c_ts] [<000000006e631aee>] raydium_i2c_initialize.cold+0xbc/0x3e4 [raydium_i2c_ts] [<00000000dc6fcf38>] raydium_i2c_probe+0x3cd/0x6bc [raydium_i2c_ts] [<00000000a310de16>] i2c_device_probe+0x651/0x680 [<00000000f5a96bf3>] really_probe+0x17c/0x3f0 [<00000000096ba499>] __driver_probe_device+0xe3/0x170 [<00000000c5acb4d9>] driver_probe_device+0x49/0x120 [<00000000264fe082>] __device_attach_driver+0xf7/0x150 [<00000000f919423c>] bus_for_each_drv+0x114/0x180 [<00000000e067feca>] __device_attach+0x1e5/0x2d0 [<0000000054301fc2>] bus_probe_device+0x126/0x140 [<00000000aad93b22>] device_add+0x810/0x1130 [<00000000c086a53f>] i2c_new_client_device+0x352/0x4e0 [<000000003c2c248c>] of_i2c_register_device+0xf1/0x110 [<00000000ffec4177>] of_i2c_notify+0x100/0x160 unreferenced object 0xffff88812d3675c8 (size 8): comm "python3", pid 349, jiffies 4294741070 (age 95.692s) hex dump (first 8 bytes): 22 00 36 2d 81 88 ff ff ".6-.... backtrace: [<0000000068427125>] __kmalloc+0x46/0x1b0 [<0000000090180f91>] raydium_i2c_send+0xd4/0x2bf [raydium_i2c_ts] [<000000001d5c9620>] raydium_i2c_initialize.cold+0x223/0x3e4 [raydium_i2c_ts] [<00000000dc6fcf38>] raydium_i2c_probe+0x3cd/0x6bc [raydium_i2c_ts] [<00000000a310de16>] i2c_device_probe+0x651/0x680 [<00000000f5a96bf3>] really_probe+0x17c/0x3f0 [<00000000096ba499>] __driver_probe_device+0xe3/0x170 [<00000000c5acb4d9>] driver_probe_device+0x49/0x120 [<00000000264fe082>] __device_attach_driver+0xf7/0x150 [<00000000f919423c>] bus_for_each_drv+0x114/0x180 [<00000000e067feca>] __device_attach+0x1e5/0x2d0 [<0000000054301fc2>] bus_probe_device+0x126/0x140 [<00000000aad93b22>] device_add+0x810/0x1130 [<00000000c086a53f>] i2c_new_client_device+0x352/0x4e0 [<000000003c2c248c>] of_i2c_register_device+0xf1/0x110 [<00000000ffec4177>] of_i2c_notify+0x100/0x160
After BANK_SWITCH command from i2c BUS, no matter success or error happened, the tx_buf should be freed.
Fixes: 3b384bd6c3f2 ("Input: raydium_ts_i2c - do not split tx transactions") Signed-off-by: Zhang Xiaoxu zhangxiaoxu5@huawei.com Link: https://lore.kernel.org/r/20221202103412.2120169-1-zhangxiaoxu5@huawei.com Cc: stable@vger.kernel.org Signed-off-by: Dmitry Torokhov dmitry.torokhov@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/input/touchscreen/raydium_i2c_ts.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
--- a/drivers/input/touchscreen/raydium_i2c_ts.c +++ b/drivers/input/touchscreen/raydium_i2c_ts.c @@ -211,12 +211,14 @@ static int raydium_i2c_send(struct i2c_c
error = raydium_i2c_xfer(client, addr, xfer, ARRAY_SIZE(xfer)); if (likely(!error)) - return 0; + goto out;
msleep(RM_RETRY_DELAY_MS); } while (++tries < RM_MAX_RETRIES);
dev_err(&client->dev, "%s failed: %d\n", __func__, error); +out: + kfree(tx_buf); return error; }
From: Christophe Leroy christophe.leroy@csgroup.eu
commit 89d21e259a94f7d5582ec675aa445f5a79f347e4 upstream.
test_bpf tail call tests end up as:
test_bpf: #0 Tail call leaf jited:1 85 PASS test_bpf: #1 Tail call 2 jited:1 111 PASS test_bpf: #2 Tail call 3 jited:1 145 PASS test_bpf: #3 Tail call 4 jited:1 170 PASS test_bpf: #4 Tail call load/store leaf jited:1 190 PASS test_bpf: #5 Tail call load/store jited:1 BUG: Unable to handle kernel data access on write at 0xf1b4e000 Faulting instruction address: 0xbe86b710 Oops: Kernel access of bad area, sig: 11 [#1] BE PAGE_SIZE=4K MMU=Hash PowerMac Modules linked in: test_bpf(+) CPU: 0 PID: 97 Comm: insmod Not tainted 6.1.0-rc4+ #195 Hardware name: PowerMac3,1 750CL 0x87210 PowerMac NIP: be86b710 LR: be857e88 CTR: be86b704 REGS: f1b4df20 TRAP: 0300 Not tainted (6.1.0-rc4+) MSR: 00009032 <EE,ME,IR,DR,RI> CR: 28008242 XER: 00000000 DAR: f1b4e000 DSISR: 42000000 GPR00: 00000001 f1b4dfe0 c11d2280 00000000 00000000 00000000 00000002 00000000 GPR08: f1b4e000 be86b704 f1b4e000 00000000 00000000 100d816a f2440000 fe73baa8 GPR16: f2458000 00000000 c1941ae4 f1fe2248 00000045 c0de0000 f2458030 00000000 GPR24: 000003e8 0000000f f2458000 f1b4dc90 3e584b46 00000000 f24466a0 c1941a00 NIP [be86b710] 0xbe86b710 LR [be857e88] __run_one+0xec/0x264 [test_bpf] Call Trace: [f1b4dfe0] [00000002] 0x2 (unreliable) Instruction dump: XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX ---[ end trace 0000000000000000 ]---
This is a tentative to write above the stack. The problem is encoutered with tests added by commit 38608ee7b690 ("bpf, tests: Add load store test case for tail call")
This happens because tail call is done to a BPF prog with a different stack_depth. At the time being, the stack is kept as is when the caller tail calls its callee. But at exit, the callee restores the stack based on its own properties. Therefore here, at each run, r1 is erroneously increased by 32 - 16 = 16 bytes.
This was done that way in order to pass the tail call count from caller to callee through the stack. As powerpc32 doesn't have a red zone in the stack, it was necessary the maintain the stack as is for the tail call. But it was not anticipated that the BPF frame size could be different.
Let's take a new approach. Use register r4 to carry the tail call count during the tail call, and save it into the stack at function entry if required. This means the input parameter must be in r3, which is more correct as it is a 32 bits parameter, then tail call better match with normal BPF function entry, the down side being that we move that input parameter back and forth between r3 and r4. That can be optimised later.
Doing that also has the advantage of maximising the common parts between tail calls and a normal function exit.
With the fix, tail call tests are now successfull:
test_bpf: #0 Tail call leaf jited:1 53 PASS test_bpf: #1 Tail call 2 jited:1 115 PASS test_bpf: #2 Tail call 3 jited:1 154 PASS test_bpf: #3 Tail call 4 jited:1 165 PASS test_bpf: #4 Tail call load/store leaf jited:1 101 PASS test_bpf: #5 Tail call load/store jited:1 141 PASS test_bpf: #6 Tail call error path, max count reached jited:1 994 PASS test_bpf: #7 Tail call count preserved across function calls jited:1 140975 PASS test_bpf: #8 Tail call error path, NULL target jited:1 110 PASS test_bpf: #9 Tail call error path, index out of range jited:1 69 PASS test_bpf: test_tail_calls: Summary: 10 PASSED, 0 FAILED, [10/10 JIT'ed]
Suggested-by: Naveen N. Rao naveen.n.rao@linux.vnet.ibm.com Fixes: 51c66ad849a7 ("powerpc/bpf: Implement extended BPF on PPC32") Cc: stable@vger.kernel.org Signed-off-by: Christophe Leroy christophe.leroy@csgroup.eu Tested-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com Signed-off-by: Michael Ellerman mpe@ellerman.id.au Link: https://lore.kernel.org/r/757acccb7fbfc78efa42dcf3c974b46678198905.166927888... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/powerpc/net/bpf_jit_comp32.c | 52 +++++++++++++++----------------------- 1 file changed, 21 insertions(+), 31 deletions(-)
--- a/arch/powerpc/net/bpf_jit_comp32.c +++ b/arch/powerpc/net/bpf_jit_comp32.c @@ -113,23 +113,19 @@ void bpf_jit_build_prologue(u32 *image, { int i;
- /* First arg comes in as a 32 bits pointer. */ - EMIT(PPC_RAW_MR(bpf_to_ppc(BPF_REG_1), _R3)); - EMIT(PPC_RAW_LI(bpf_to_ppc(BPF_REG_1) - 1, 0)); + /* Initialize tail_call_cnt, to be skipped if we do tail calls. */ + EMIT(PPC_RAW_LI(_R4, 0)); + +#define BPF_TAILCALL_PROLOGUE_SIZE 4 + EMIT(PPC_RAW_STWU(_R1, _R1, -BPF_PPC_STACKFRAME(ctx)));
- /* - * Initialize tail_call_cnt in stack frame if we do tail calls. - * Otherwise, put in NOPs so that it can be skipped when we are - * invoked through a tail call. - */ if (ctx->seen & SEEN_TAILCALL) - EMIT(PPC_RAW_STW(bpf_to_ppc(BPF_REG_1) - 1, _R1, - bpf_jit_stack_offsetof(ctx, BPF_PPC_TC))); - else - EMIT(PPC_RAW_NOP()); + EMIT(PPC_RAW_STW(_R4, _R1, bpf_jit_stack_offsetof(ctx, BPF_PPC_TC)));
-#define BPF_TAILCALL_PROLOGUE_SIZE 16 + /* First arg comes in as a 32 bits pointer. */ + EMIT(PPC_RAW_MR(bpf_to_ppc(BPF_REG_1), _R3)); + EMIT(PPC_RAW_LI(bpf_to_ppc(BPF_REG_1) - 1, 0));
/* * We need a stack frame, but we don't necessarily need to @@ -170,24 +166,24 @@ static void bpf_jit_emit_common_epilogue for (i = BPF_PPC_NVR_MIN; i <= 31; i++) if (bpf_is_seen_register(ctx, i)) EMIT(PPC_RAW_LWZ(i, _R1, bpf_jit_stack_offsetof(ctx, i))); -} - -void bpf_jit_build_epilogue(u32 *image, struct codegen_context *ctx) -{ - EMIT(PPC_RAW_MR(_R3, bpf_to_ppc(BPF_REG_0))); - - bpf_jit_emit_common_epilogue(image, ctx); - - /* Tear down our stack frame */
if (ctx->seen & SEEN_FUNC) EMIT(PPC_RAW_LWZ(_R0, _R1, BPF_PPC_STACKFRAME(ctx) + PPC_LR_STKOFF));
+ /* Tear down our stack frame */ EMIT(PPC_RAW_ADDI(_R1, _R1, BPF_PPC_STACKFRAME(ctx)));
if (ctx->seen & SEEN_FUNC) EMIT(PPC_RAW_MTLR(_R0));
+} + +void bpf_jit_build_epilogue(u32 *image, struct codegen_context *ctx) +{ + EMIT(PPC_RAW_MR(_R3, bpf_to_ppc(BPF_REG_0))); + + bpf_jit_emit_common_epilogue(image, ctx); + EMIT(PPC_RAW_BLR()); }
@@ -244,7 +240,6 @@ static int bpf_jit_emit_tail_call(u32 *i EMIT(PPC_RAW_RLWINM(_R3, b2p_index, 2, 0, 29)); EMIT(PPC_RAW_ADD(_R3, _R3, b2p_bpf_array)); EMIT(PPC_RAW_LWZ(_R3, _R3, offsetof(struct bpf_array, ptrs))); - EMIT(PPC_RAW_STW(_R0, _R1, bpf_jit_stack_offsetof(ctx, BPF_PPC_TC)));
/* * if (prog == NULL) @@ -255,19 +250,14 @@ static int bpf_jit_emit_tail_call(u32 *i
/* goto *(prog->bpf_func + prologue_size); */ EMIT(PPC_RAW_LWZ(_R3, _R3, offsetof(struct bpf_prog, bpf_func))); - - if (ctx->seen & SEEN_FUNC) - EMIT(PPC_RAW_LWZ(_R0, _R1, BPF_PPC_STACKFRAME(ctx) + PPC_LR_STKOFF)); - EMIT(PPC_RAW_ADDIC(_R3, _R3, BPF_TAILCALL_PROLOGUE_SIZE)); - - if (ctx->seen & SEEN_FUNC) - EMIT(PPC_RAW_MTLR(_R0)); - EMIT(PPC_RAW_MTCTR(_R3));
EMIT(PPC_RAW_MR(_R3, bpf_to_ppc(BPF_REG_1)));
+ /* Put tail_call_cnt in r4 */ + EMIT(PPC_RAW_MR(_R4, _R0)); + /* tear restore NVRs, ... */ bpf_jit_emit_common_epilogue(image, ctx);
On 12/5/22 11:08, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 6.0.12 release. There are 124 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 07 Dec 2022 19:07:46 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.0.12-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.0.y and the diffstat can be found below.
thanks,
greg k-h
On ARCH_BRCMSTB using 32-bit and 64-bit ARM kernels, build tested on BMIPS_GENERIC:
Tested-by: Florian Fainelli f.fainelli@gmail.com
On 12/5/22 12:08, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 6.0.12 release. There are 124 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 07 Dec 2022 19:07:46 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.0.12-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.0.y and the diffstat can be found below.
thanks,
greg k-h
Compiled and booted on my test system. No dmesg regressions.
Tested-by: Shuah Khan skhan@linuxfoundation.org
thanks, -- Shuah
On 12/5/22 11:08 AM, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 6.0.12 release. There are 124 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 07 Dec 2022 19:07:46 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.0.12-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.0.y and the diffstat can be found below.
thanks,
greg k-h
Built and booted successfully on RISC-V RV64 (HiFive Unmatched).
Tested-by: Ron Economos re@w6rz.net
Hey Greg,
Ran tests and boot tested on my system, no regressions found
Tested-by: Fenil Jain fkjainco@gmail.com
On Mon, Dec 05, 2022 at 08:08:26PM +0100, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 6.0.12 release. There are 124 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Successfully cross-compiled for arm64 (bcm2711_defconfig, GCC 10.2.0) and powerpc (ps3_defconfig, GCC 12.2.0).
Tested-by: Bagas Sanjaya bagasdotme@gmail.com
On Tue, 6 Dec 2022 at 00:54, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 6.0.12 release. There are 124 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 07 Dec 2022 19:07:46 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.0.12-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.0.y and the diffstat can be found below.
thanks,
greg k-h
Results from Linaro’s test farm. No regressions on arm64, arm, x86_64, and i386.
Tested-by: Linux Kernel Functional Testing lkft@linaro.org
## Build * kernel: 6.0.12-rc1 * git: https://gitlab.com/Linaro/lkft/mirrors/stable/linux-stable-rc * git branch: linux-6.0.y * git commit: cdf2cb62aec478b66cd531a340a5e3b58a782252 * git describe: v6.0.11-125-gcdf2cb62aec4 * test details: https://qa-reports.linaro.org/lkft/linux-stable-rc-linux-6.0.y/build/v6.0.11...
## Test Regressions (compared to v6.0.11)
## Metric Regressions (compared to v6.0.11)
## Test Fixes (compared to v6.0.11)
## Metric Fixes (compared to v6.0.11)
## Test result summary total: 137036, pass: 120940, fail: 3015, skip: 12818, xfail: 263
## Build Summary * arc: 5 total, 5 passed, 0 failed * arm: 147 total, 144 passed, 3 failed * arm64: 45 total, 45 passed, 0 failed * i386: 35 total, 34 passed, 1 failed * mips: 26 total, 26 passed, 0 failed * parisc: 6 total, 6 passed, 0 failed * powerpc: 34 total, 30 passed, 4 failed * riscv: 12 total, 12 passed, 0 failed * s390: 12 total, 12 passed, 0 failed * sh: 12 total, 12 passed, 0 failed * sparc: 6 total, 6 passed, 0 failed * x86_64: 38 total, 38 passed, 0 failed
## Test suites summary * boot * fwts * igt-gpu-tools * kselftest-android * kselftest-breakpoints * kselftest-drivers-dma-buf * kselftest-efivarfs * kselftest-gpio * kselftest-intel_pstate * kselftest-kvm * kselftest-lib * kselftest-net * kselftest-net-forwarding * kselftest-net-mptcp * kselftest-netfilter * kselftest-openat2 * kselftest-seccomp * kselftest-timens * kunit * kvm-unit-tests * libgpiod * libhugetlbfs * log-parser-boot * log-parser-test * ltp-cap_bounds * ltp-commands * ltp-containers * ltp-controllers * ltp-cpuhotplug * ltp-crypto * ltp-cve * ltp-dio * ltp-fcntl-locktests * ltp-filecaps * ltp-fs * ltp-fs_bind * ltp-fs_perms_simple * ltp-fsx * ltp-hugetlb * ltp-io * ltp-ip * ltp-ipc * ltp-math * ltp-mm * ltp-nptl * ltp-open-posix-tests * ltp-pty * ltp-sched * ltp-securebits * ltp-smoke * ltp-syscalls * ltp-tracing * network-basic-tests * perf * perf/Zstd-perf.data-compression * rcutorture * v4l2-compliance * vdso
-- Linaro LKFT https://lkft.linaro.org
This is the start of the stable review cycle for the 6.0.12 release. There are 124 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 07 Dec 2022 19:07:46 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.0.12-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.0.y and the diffstat can be found below.
thanks,
greg k-h
Compiled and booted on my x86_64 and ARM64 test systems. No errors or regressions.
Tested-by: Allen Pais apais@linux.microsoft.com
Thanks.
linux-stable-mirror@lists.linaro.org