This is the start of the stable review cycle for the 5.10.176 release. There are 99 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 22 Mar 2023 14:54:22 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.10.176-rc... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.10.y and the diffstat can be found below.
thanks,
greg k-h
------------- Pseudo-Shortlog of commits:
Greg Kroah-Hartman gregkh@linuxfoundation.org Linux 5.10.176-rc1
Lee Jones lee@kernel.org HID: uhid: Over-ride the default maximum data buffer value with our own
Lee Jones lee@kernel.org HID: core: Provide new max_buffer_size attribute to over-ride the default
Gaosheng Cui cuigaosheng1@huawei.com xfs: remove xfs_setattr_time() declaration
Christian Brauner brauner@kernel.org fs: use consistent setgid checks in is_sxid()
Amir Goldstein amir73il@gmail.com attr: use consistent sgid stripping checks
Amir Goldstein amir73il@gmail.com attr: add setattr_should_drop_sgid()
Amir Goldstein amir73il@gmail.com fs: move should_remove_suid()
Amir Goldstein amir73il@gmail.com attr: add in_group_or_capable()
Yang Xu xuyang2018.jy@fujitsu.com fs: move S_ISGID stripping into the vfs_*() helpers
Yang Xu xuyang2018.jy@fujitsu.com fs: add mode_strip_sgid() helper
Darrick J. Wong djwong@kernel.org xfs: use setattr_copy to set vfs inode attributes
Dave Chinner dchinner@redhat.com xfs: set prealloc flag in xfs_alloc_file_space()
Dave Chinner dchinner@redhat.com xfs: fallocate() should call file_modified()
Dave Chinner dchinner@redhat.com xfs: remove XFS_PREALLOC_SYNC
Darrick J. Wong djwong@kernel.org xfs: don't leak btree cursor when insrec fails after a split
Darrick J. Wong djwong@kernel.org xfs: purge dquots after inode walk fails during quotacheck
Dave Chinner dchinner@redhat.com xfs: don't assert fail on perag references on teardown
Lukas Wunner lukas@wunner.de PCI/DPC: Await readiness of secondary bus after reset
Lukas Wunner lukas@wunner.de PCI: Unify delay handling for reset and resume
Sven Schnelle svens@linux.ibm.com s390/ipl: add missing intersection check to ipl_report handling
Fedor Pchelkin pchelkin@ispras.ru io_uring: avoid null-ptr-deref in io_arm_poll_handler
Janusz Krzysztofik janusz.krzysztofik@linux.intel.com drm/i915/active: Fix misuse of non-idle barriers as fence trackers
John Harrison John.C.Harrison@Intel.com drm/i915: Don't use stolen memory for ring buffers with LLC
Nikita Zhandarovich n.zhandarovich@fintech.ru x86/mm: Fix use of uninitialized buffer in sme_enable()
Yazen Ghannam yazen.ghannam@amd.com x86/mce: Make sure logged MCEs are processed after sysfs update
Shawn Guo shawn.guo@linaro.org cpuidle: psci: Iterate backwards over list in psci_pd_remove()
Helge Deller deller@gmx.de fbdev: stifb: Provide valid pixelclock and add fb_check_var() checks
Francesco Dolcini francesco.dolcini@toradex.com mmc: sdhci_am654: lower power-on failed message severity
David Hildenbrand david@redhat.com mm/userfaultfd: propagate uffd-wp bit when PTE-mapping the huge zeropage
Chen Zhongjin chenzhongjin@huawei.com ftrace: Fix invalid address access in lookup_rec() when index is 0
Matthieu Baerts matthieu.baerts@tessares.net mptcp: avoid setting TCP_CLOSE state twice
Dmitry Osipenko dmitry.osipenko@collabora.com drm/shmem-helper: Remove another errant put in error path
Hamidreza H. Fard nitocris@posteo.net ALSA: hda/realtek: Fix the speaker output on Samsung Galaxy Book2 Pro
Bard Liao yung-chuan.liao@linux.intel.com ALSA: hda: intel-dsp-config: add MTL PCI id
Paolo Bonzini pbonzini@redhat.com KVM: nVMX: add missing consistency checks for CR0 and CR4
Volker Lendecke vl@samba.org cifs: Fix smb2_set_path_size()
Steven Rostedt (Google) rostedt@goodmis.org tracing: Make tracepoint lockdep check actually test something
Steven Rostedt (Google) rostedt@goodmis.org tracing: Check field value in hist_field_name()
Sung-hun Kim sfoon.kim@samsung.com tracing: Make splice_read available again
Johan Hovold johan+linaro@kernel.org interconnect: fix mem leak when freeing nodes
Roman Gushchin roman.gushchin@linux.dev firmware: xilinx: don't make a sleepable memory allocation from an atomic context
Biju Das biju.das.jz@bp.renesas.com serial: 8250_em: Fix UART port type
Sherry Sun sherry.sun@nxp.com tty: serial: fsl_lpuart: skip waiting for transmission complete when UARTCTRL_SBK is asserted
Theodore Ts'o tytso@mit.edu ext4: fix possible double unlock when moving a directory
Alex Hung alex.hung@amd.com drm/amd/display: fix shift-out-of-bounds in CalculateVMAndRowBytes
Michael Karcher kernel@mkarcher.dialup.fu-berlin.de sh: intc: Avoid spurious sizeof-pointer-div warning
Qu Huang qu.huang@linux.dev drm/amdkfd: Fix an illegal memory access
Baokun Li libaokun1@huawei.com ext4: fix task hung in ext4_xattr_delete_inode
Baokun Li libaokun1@huawei.com ext4: fail ext4_iget if special inode unallocated
David Gow davidgow@google.com rust: arch/um: Disable FP/SIMD instruction to match x86
Yifei Liu yifeliu@cs.stonybrook.edu jffs2: correct logic when creating a hole in jffs2_write_begin
Tobias Schramm t.schramm@manjaro.org mmc: atmel-mci: fix race between stop command and start of next command
Linus Torvalds torvalds@linux-foundation.org media: m5mols: fix off-by-one loop termination error
Lars-Peter Clausen lars@metafoo.de hwmon: (adm1266) Set `can_sleep` flag for GPIO chip
Krzysztof Kozlowski krzysztof.kozlowski@linaro.org hwmon: tmp512: drop of_match_ptr for ID table
Lars-Peter Clausen lars@metafoo.de hwmon: (ucd90320) Add minimum delay between bus accesses
Marcus Folkesson marcus.folkesson@gmail.com hwmon: (ina3221) return prober error code
Zheng Wang zyytlz.wz@163.com hwmon: (xgene) Fix use after free bug in xgene_hwmon_remove due to race condition
Tony O'Brien tony.obrien@alliedtelesis.co.nz hwmon: (adt7475) Fix masking of hysteresis registers
Tony O'Brien tony.obrien@alliedtelesis.co.nz hwmon: (adt7475) Display smoothing attributes in correct order
Liang He windhl@126.com ethernet: sun: add check for the mdesc_grab()
Daniil Tatianin d-tatianin@yandex-team.ru qed/qed_mng_tlv: correctly zero out ->min instead of ->hour
Po-Hsu Lin po-hsu.lin@canonical.com selftests: net: devlink_port_split.py: skip test if no suitable device available
Alexandra Winter wintera@linux.ibm.com net/iucv: Fix size of interrupt data
Szymon Heidrich szymon.heidrich@gmail.com net: usb: smsc75xx: Move packet length check to prevent kernel panic in skb_pull
Ido Schimmel idosch@nvidia.com ipv4: Fix incorrect table ID in IOCTL path
Vladimir Oltean vladimir.oltean@nxp.com net: dsa: mv88e6xxx: fix max_mtu of 1492 on 6165, 6191, 6220, 6250, 6290
Maciej Fijalkowski maciej.fijalkowski@intel.com ice: xsk: disable txq irq before flushing hw
Liang He windhl@126.com block: sunvdc: add check for mdesc_grab() returning NULL
Damien Le Moal damien.lemoal@opensource.wdc.com nvmet: avoid potential UAF in nvmet_req_complete()
Ming Lei ming.lei@redhat.com nvme: fix handling single range discard request
Damien Le Moal damien.lemoal@opensource.wdc.com block: null_blk: Fix handling of fake timeout request
Damien Le Moal damien.lemoal@wdc.com null_blk: Move driver into its own directory
Liu Ying victor.liu@nxp.com drm/bridge: Fix returned array size name for atomic_get_input_bus_fmts kdoc
Szymon Heidrich szymon.heidrich@gmail.com net: usb: smsc75xx: Limit packet length to skb->len
Wenjia Zhang wenjia@linux.ibm.com net/smc: fix deadlock triggered by cancel_delayed_work_syn()
Zheng Wang zyytlz.wz@163.com nfc: st-nci: Fix use after free bug in ndlc_remove due to race condition
Heiner Kallweit hkallweit1@gmail.com net: phy: smsc: bail out in lan87xx_read_status if genphy_read_status fails
Eric Dumazet edumazet@google.com net: tunnels: annotate lockless accesses to dev->needed_headroom
Daniil Tatianin d-tatianin@yandex-team.ru qed/qed_dev: guard against a possible division by zero
D. Wythe alibuda@linux.alibaba.com net/smc: fix NULL sndbuf_desc in smc_cdc_tx_handler()
Ivan Vecera ivecera@redhat.com i40e: Fix kernel crash during reboot when adapter is in recovery mode
Jianguo Wu wujianguo@chinatelecom.cn ipvlan: Make skb->skb_iif track skb->dev for l3s mode
Fedor Pchelkin pchelkin@ispras.ru nfc: pn533: initialize struct pn533_out_arg properly
Breno Leitao leitao@debian.org tcp: tcp_make_synack() can be called from process context
Bart Van Assche bvanassche@acm.org scsi: core: Fix a procfs host directory removal regression
Xiang Chen chenxiang66@hisilicon.com scsi: core: Fix a comment in function scsi_host_dev_release()
Jeremy Sowden jeremy@azazel.net netfilter: nft_redir: correct value of inet type `.maxattrs`
Jeremy Sowden jeremy@azazel.net netfilter: nft_redir: correct length for loading protocol registers
Jeremy Sowden jeremy@azazel.net netfilter: nft_masq: correct length for loading protocol registers
Jeremy Sowden jeremy@azazel.net netfilter: nft_nat: correct length for loading protocol registers
Bjorn Helgaas bhelgaas@google.com ALSA: hda: Match only Intel devices with CONTROLLER_IN_GPU()
Wenchao Hao haowenchao2@huawei.com scsi: mpt3sas: Fix NULL pointer access in mpt3sas_transport_port_add()
Glenn Washburn development@efficientek.com docs: Correct missing "d_" prefix for dentry_operations member d_weak_revalidate
Randy Dunlap rdunlap@infradead.org clk: HI655X: select REGMAP instead of depending on it
Christian Hewitt christianshewitt@gmail.com drm/meson: fix 1px pink line on GXM when scaling video overlay
Zhang Xiaoxu zhangxiaoxu5@huawei.com cifs: Move the in_send statistic to __smb_send_rqst()
Dmitry Osipenko dmitry.osipenko@collabora.com drm/panfrost: Don't sync rpm suspension after mmu flushing
Herbert Xu herbert@gondor.apana.org.au xfrm: Allow transport-mode states with AF_UNSPEC selector
-------------
Diffstat:
Documentation/filesystems/vfs.rst | 2 +- Documentation/trace/ftrace.rst | 2 +- Makefile | 4 +- arch/s390/boot/ipl_report.c | 8 +++ arch/x86/Makefile.um | 6 ++ arch/x86/kernel/cpu/mce/core.c | 1 + arch/x86/kvm/vmx/nested.c | 10 ++- arch/x86/mm/mem_encrypt_identity.c | 3 +- drivers/block/Kconfig | 8 +-- drivers/block/Makefile | 7 +- drivers/block/null_blk/Kconfig | 12 ++++ drivers/block/null_blk/Makefile | 11 +++ drivers/block/{null_blk_main.c => null_blk/main.c} | 6 +- drivers/block/{ => null_blk}/null_blk.h | 0 .../block/{null_blk_trace.c => null_blk/trace.c} | 2 +- .../block/{null_blk_trace.h => null_blk/trace.h} | 2 +- .../block/{null_blk_zoned.c => null_blk/zoned.c} | 2 +- drivers/block/sunvdc.c | 2 + drivers/clk/Kconfig | 2 +- drivers/cpuidle/cpuidle-psci-domain.c | 3 +- drivers/firmware/xilinx/zynqmp.c | 2 +- drivers/gpu/drm/amd/amdkfd/kfd_events.c | 9 +-- .../amd/display/dc/dml/dcn30/display_mode_vba_30.c | 5 +- drivers/gpu/drm/drm_gem_shmem_helper.c | 9 ++- drivers/gpu/drm/i915/gt/intel_ring.c | 2 +- drivers/gpu/drm/i915/i915_active.c | 24 ++++--- drivers/gpu/drm/meson/meson_vpp.c | 2 + drivers/gpu/drm/panfrost/panfrost_mmu.c | 2 +- drivers/hid/hid-core.c | 18 +++-- drivers/hid/uhid.c | 1 + drivers/hwmon/adt7475.c | 8 +-- drivers/hwmon/ina3221.c | 2 +- drivers/hwmon/pmbus/adm1266.c | 1 + drivers/hwmon/pmbus/ucd9000.c | 75 ++++++++++++++++++++ drivers/hwmon/tmp513.c | 2 +- drivers/hwmon/xgene-hwmon.c | 1 + drivers/interconnect/core.c | 4 ++ drivers/media/i2c/m5mols/m5mols_core.c | 2 +- drivers/mmc/host/atmel-mci.c | 3 - drivers/mmc/host/sdhci_am654.c | 2 +- drivers/net/dsa/mv88e6xxx/chip.c | 16 +++-- drivers/net/ethernet/intel/i40e/i40e_main.c | 1 + drivers/net/ethernet/intel/ice/ice_xsk.c | 4 +- drivers/net/ethernet/qlogic/qed/qed_dev.c | 5 ++ drivers/net/ethernet/qlogic/qed/qed_mng_tlv.c | 2 +- drivers/net/ethernet/sun/ldmvsw.c | 3 + drivers/net/ethernet/sun/sunvnet.c | 3 + drivers/net/ipvlan/ipvlan_l3s.c | 1 + drivers/net/phy/smsc.c | 5 +- drivers/net/usb/smsc75xx.c | 7 ++ drivers/nfc/pn533/usb.c | 1 + drivers/nfc/st-nci/ndlc.c | 6 +- drivers/nvme/host/core.c | 28 +++++--- drivers/nvme/target/core.c | 4 +- drivers/pci/pci-driver.c | 4 +- drivers/pci/pci.c | 57 +++++++-------- drivers/pci/pci.h | 16 ++++- drivers/pci/pcie/dpc.c | 4 +- drivers/scsi/hosts.c | 5 +- drivers/scsi/mpt3sas/mpt3sas_transport.c | 14 +++- drivers/tty/serial/8250/8250_em.c | 4 +- drivers/tty/serial/fsl_lpuart.c | 12 +++- drivers/video/fbdev/stifb.c | 27 ++++++++ fs/attr.c | 70 +++++++++++++++++-- fs/cifs/smb2inode.c | 31 +++++++-- fs/cifs/transport.c | 21 +++--- fs/ext4/inode.c | 18 +++-- fs/ext4/namei.c | 4 +- fs/ext4/xattr.c | 11 +++ fs/inode.c | 80 +++++++++++++--------- fs/internal.h | 6 ++ fs/jffs2/file.c | 15 ++-- fs/namei.c | 80 ++++++++++++++++++---- fs/ocfs2/file.c | 4 +- fs/ocfs2/namei.c | 1 + fs/open.c | 6 +- fs/xfs/libxfs/xfs_btree.c | 8 ++- fs/xfs/xfs_bmap_util.c | 9 +-- fs/xfs/xfs_file.c | 24 +++---- fs/xfs/xfs_iops.c | 56 +-------------- fs/xfs/xfs_iops.h | 1 - fs/xfs/xfs_mount.c | 3 +- fs/xfs/xfs_pnfs.c | 9 ++- fs/xfs/xfs_qm.c | 9 ++- include/drm/drm_bridge.h | 4 +- include/linux/fs.h | 5 +- include/linux/hid.h | 3 + include/linux/netdevice.h | 6 +- include/linux/sh_intc.h | 5 +- include/linux/tracepoint.h | 15 ++-- io_uring/io_uring.c | 4 +- kernel/trace/ftrace.c | 3 +- kernel/trace/trace.c | 2 + kernel/trace/trace_events_hist.c | 3 + mm/huge_memory.c | 6 +- net/ipv4/fib_frontend.c | 3 + net/ipv4/ip_tunnel.c | 12 ++-- net/ipv4/tcp_output.c | 2 +- net/ipv6/ip6_tunnel.c | 4 +- net/iucv/iucv.c | 2 +- net/mptcp/subflow.c | 1 - net/netfilter/nft_masq.c | 2 +- net/netfilter/nft_nat.c | 2 +- net/netfilter/nft_redir.c | 4 +- net/smc/smc_cdc.c | 3 + net/smc/smc_core.c | 2 +- net/xfrm/xfrm_state.c | 3 - sound/hda/intel-dsp-config.c | 9 +++ sound/pci/hda/hda_intel.c | 5 +- sound/pci/hda/patch_realtek.c | 1 + tools/testing/selftests/net/devlink_port_split.py | 30 ++++++++ 111 files changed, 743 insertions(+), 360 deletions(-)
From: Herbert Xu herbert@gondor.apana.org.au
[ Upstream commit c276a706ea1f51cf9723ed8484feceaf961b8f89 ]
xfrm state selectors are matched against the inner-most flow which can be of any address family. Therefore middle states in nested configurations need to carry a wildcard selector in order to work at all.
However, this is currently forbidden for transport-mode states.
Fix this by removing the unnecessary check.
Fixes: 13996378e658 ("[IPSEC]: Rename mode to outer_mode and add inner_mode") Reported-by: David George David.George@sophos.com Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Steffen Klassert steffen.klassert@secunet.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/xfrm/xfrm_state.c | 3 --- 1 file changed, 3 deletions(-)
diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c index fdbd56ed4bd52..ba73014805a4f 100644 --- a/net/xfrm/xfrm_state.c +++ b/net/xfrm/xfrm_state.c @@ -2611,9 +2611,6 @@ int __xfrm_init_state(struct xfrm_state *x, bool init_replay, bool offload) if (inner_mode == NULL) goto error;
- if (!(inner_mode->flags & XFRM_MODE_FLAG_TUNNEL)) - goto error; - x->inner_mode = *inner_mode;
if (x->props.family == AF_INET)
From: Dmitry Osipenko dmitry.osipenko@collabora.com
[ Upstream commit ba3be66f11c3c49afaa9f49b99e21d88756229ef ]
Lockdep warns about potential circular locking dependency of devfreq with the fs_reclaim caused by immediate device suspension when mapping is released by shrinker. Fix it by doing the suspension asynchronously.
Reviewed-by: Steven Price steven.price@arm.com Fixes: ec7eba47da86 ("drm/panfrost: Rework page table flushing and runtime PM interaction") Signed-off-by: Dmitry Osipenko dmitry.osipenko@collabora.com Link: https://lore.kernel.org/all/20230108210445.3948344-3-dmitry.osipenko@collabo... Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/panfrost/panfrost_mmu.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/panfrost/panfrost_mmu.c b/drivers/gpu/drm/panfrost/panfrost_mmu.c index 13596961ae17f..5ff856ef7d88c 100644 --- a/drivers/gpu/drm/panfrost/panfrost_mmu.c +++ b/drivers/gpu/drm/panfrost/panfrost_mmu.c @@ -236,7 +236,7 @@ static void panfrost_mmu_flush_range(struct panfrost_device *pfdev, if (pm_runtime_active(pfdev->dev)) mmu_hw_do_operation(pfdev, mmu, iova, size, AS_COMMAND_FLUSH_PT);
- pm_runtime_put_sync_autosuspend(pfdev->dev); + pm_runtime_put_autosuspend(pfdev->dev); }
static int mmu_map_sg(struct panfrost_device *pfdev, struct panfrost_mmu *mmu,
From: Zhang Xiaoxu zhangxiaoxu5@huawei.com
[ Upstream commit d0dc41119905f740e8d5594adce277f7c0de8c92 ]
When send SMB_COM_NT_CANCEL and RFC1002_SESSION_REQUEST, the in_send statistic was lost.
Let's move the in_send statistic to the send function to avoid this scenario.
Fixes: 7ee1af765dfa ("[CIFS]") Signed-off-by: Zhang Xiaoxu zhangxiaoxu5@huawei.com Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/cifs/transport.c | 21 +++++++++------------ 1 file changed, 9 insertions(+), 12 deletions(-)
diff --git a/fs/cifs/transport.c b/fs/cifs/transport.c index b137006f0fd25..4409f56fc37e6 100644 --- a/fs/cifs/transport.c +++ b/fs/cifs/transport.c @@ -312,7 +312,7 @@ static int __smb_send_rqst(struct TCP_Server_Info *server, int num_rqst, struct smb_rqst *rqst) { - int rc = 0; + int rc; struct kvec *iov; int n_vec; unsigned int send_length = 0; @@ -323,6 +323,7 @@ __smb_send_rqst(struct TCP_Server_Info *server, int num_rqst, struct msghdr smb_msg = {}; __be32 rfc1002_marker;
+ cifs_in_send_inc(server); if (cifs_rdma_enabled(server)) { /* return -EAGAIN when connecting or reconnecting */ rc = -EAGAIN; @@ -331,14 +332,17 @@ __smb_send_rqst(struct TCP_Server_Info *server, int num_rqst, goto smbd_done; }
+ rc = -EAGAIN; if (ssocket == NULL) - return -EAGAIN; + goto out;
+ rc = -ERESTARTSYS; if (fatal_signal_pending(current)) { cifs_dbg(FYI, "signal pending before send request\n"); - return -ERESTARTSYS; + goto out; }
+ rc = 0; /* cork the socket */ tcp_sock_set_cork(ssocket->sk, true);
@@ -449,7 +453,8 @@ __smb_send_rqst(struct TCP_Server_Info *server, int num_rqst, rc); else if (rc > 0) rc = 0; - +out: + cifs_in_send_dec(server); return rc; }
@@ -826,9 +831,7 @@ cifs_call_async(struct TCP_Server_Info *server, struct smb_rqst *rqst, * I/O response may come back and free the mid entry on another thread. */ cifs_save_when_sent(mid); - cifs_in_send_inc(server); rc = smb_send_rqst(server, 1, rqst, flags); - cifs_in_send_dec(server);
if (rc < 0) { revert_current_mid(server, mid->credits); @@ -1117,9 +1120,7 @@ compound_send_recv(const unsigned int xid, struct cifs_ses *ses, else midQ[i]->callback = cifs_compound_last_callback; } - cifs_in_send_inc(server); rc = smb_send_rqst(server, num_rqst, rqst, flags); - cifs_in_send_dec(server);
for (i = 0; i < num_rqst; i++) cifs_save_when_sent(midQ[i]); @@ -1356,9 +1357,7 @@ SendReceive(const unsigned int xid, struct cifs_ses *ses,
midQ->mid_state = MID_REQUEST_SUBMITTED;
- cifs_in_send_inc(server); rc = smb_send(server, in_buf, len); - cifs_in_send_dec(server); cifs_save_when_sent(midQ);
if (rc < 0) @@ -1495,9 +1494,7 @@ SendReceiveBlockingLock(const unsigned int xid, struct cifs_tcon *tcon, }
midQ->mid_state = MID_REQUEST_SUBMITTED; - cifs_in_send_inc(server); rc = smb_send(server, in_buf, len); - cifs_in_send_dec(server); cifs_save_when_sent(midQ);
if (rc < 0)
From: Christian Hewitt christianshewitt@gmail.com
[ Upstream commit 5c8cf1664f288098a971a1d1e65716a2b6a279e1 ]
Playing media with a resolution smaller than the crtc size requires the video overlay to be scaled for output and GXM boards display a 1px pink line on the bottom of the scaled overlay. Comparing with the downstream vendor driver revealed VPP_DUMMY_DATA not being set [0].
Setting VPP_DUMMY_DATA prevents the 1px pink line from being seen.
[0] https://github.com/endlessm/linux-s905x/blob/master/drivers/amlogic/amports/...
Fixes: bbbe775ec5b5 ("drm: Add support for Amlogic Meson Graphic Controller") Suggested-by: Martin Blumenstingl martin.blumenstingl@googlemail.com Signed-off-by: Christian Hewitt christianshewitt@gmail.com Acked-by: Martin Blumenstingl martin.blumenstingl@googlemail.com Signed-off-by: Neil Armstrong neil.armstrong@linaro.org Link: https://patchwork.freedesktop.org/patch/msgid/20230303123312.155164-1-christ... Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/meson/meson_vpp.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/gpu/drm/meson/meson_vpp.c b/drivers/gpu/drm/meson/meson_vpp.c index 154837688ab0d..5df1957c8e41f 100644 --- a/drivers/gpu/drm/meson/meson_vpp.c +++ b/drivers/gpu/drm/meson/meson_vpp.c @@ -100,6 +100,8 @@ void meson_vpp_init(struct meson_drm *priv) priv->io_base + _REG(VPP_DOLBY_CTRL)); writel_relaxed(0x1020080, priv->io_base + _REG(VPP_DUMMY_DATA1)); + writel_relaxed(0x42020, + priv->io_base + _REG(VPP_DUMMY_DATA)); } else if (meson_vpu_is_compatible(priv, VPU_COMPATIBLE_G12A)) writel_relaxed(0xf, priv->io_base + _REG(DOLBY_PATH_CTRL));
From: Randy Dunlap rdunlap@infradead.org
[ Upstream commit 0ffad67784a097beccf34d297ddd1b0773b3b8a3 ]
REGMAP is a hidden (not user visible) symbol. Users cannot set it directly thru "make *config", so drivers should select it instead of depending on it if they need it.
Consistently using "select" or "depends on" can also help reduce Kconfig circular dependency issues.
Therefore, change the use of "depends on REGMAP" to "select REGMAP".
Fixes: 3a49afb84ca0 ("clk: enable hi655x common clk automatically") Signed-off-by: Randy Dunlap rdunlap@infradead.org Cc: Riku Voipio riku.voipio@linaro.org Cc: Stephen Boyd sboyd@kernel.org Cc: Michael Turquette mturquette@baylibre.com Cc: linux-clk@vger.kernel.org Link: https://lore.kernel.org/r/20230226053953.4681-3-rdunlap@infradead.org Signed-off-by: Stephen Boyd sboyd@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/clk/Kconfig | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/clk/Kconfig b/drivers/clk/Kconfig index c715d4681a0b8..4ae49eae45869 100644 --- a/drivers/clk/Kconfig +++ b/drivers/clk/Kconfig @@ -79,7 +79,7 @@ config COMMON_CLK_RK808 config COMMON_CLK_HI655X tristate "Clock driver for Hi655x" if EXPERT depends on (MFD_HI655X_PMIC || COMPILE_TEST) - depends on REGMAP + select REGMAP default MFD_HI655X_PMIC help This driver supports the hi655x PMIC clock. This
From: Glenn Washburn development@efficientek.com
[ Upstream commit 74596085796fae0cfce3e42ee46bf4f8acbdac55 ]
The details for struct dentry_operations member d_weak_revalidate is missing a "d_" prefix.
Fixes: af96c1e304f7 ("docs: filesystems: vfs: Convert vfs.txt to RST") Signed-off-by: Glenn Washburn development@efficientek.com Reviewed-by: Matthew Wilcox (Oracle) willy@infradead.org Link: https://lore.kernel.org/r/20230227184042.2375235-1-development@efficientek.c... Signed-off-by: Jonathan Corbet corbet@lwn.net Signed-off-by: Sasha Levin sashal@kernel.org --- Documentation/filesystems/vfs.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/Documentation/filesystems/vfs.rst b/Documentation/filesystems/vfs.rst index ca52c82e5bb54..f7b69a0e71e1c 100644 --- a/Documentation/filesystems/vfs.rst +++ b/Documentation/filesystems/vfs.rst @@ -1188,7 +1188,7 @@ defined: return -ECHILD and it will be called again in ref-walk mode.
-``_weak_revalidate`` +``d_weak_revalidate`` called when the VFS needs to revalidate a "jumped" dentry. This is called when a path-walk ends at dentry that was not acquired by doing a lookup in the parent directory. This includes "/",
From: Wenchao Hao haowenchao2@huawei.com
[ Upstream commit d3c57724f1569311e4b81e98fad0931028b9bdcd ]
Port is allocated by sas_port_alloc_num() and rphy is allocated by either sas_end_device_alloc() or sas_expander_alloc(), all of which may return NULL. So we need to check the rphy to avoid possible NULL pointer access.
If sas_rphy_add() returned with failure, rphy is set to NULL. We would access the rphy in the following lines which would also result NULL pointer access.
Fixes: 78316e9dfc24 ("scsi: mpt3sas: Fix possible resource leaks in mpt3sas_transport_port_add()") Signed-off-by: Wenchao Hao haowenchao2@huawei.com Link: https://lore.kernel.org/r/20230225100135.2109330-1-haowenchao2@huawei.com Acked-by: Sathya Prakash Veerichetty sathya.prakash@broadcom.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/mpt3sas/mpt3sas_transport.c | 14 ++++++++++++-- 1 file changed, 12 insertions(+), 2 deletions(-)
diff --git a/drivers/scsi/mpt3sas/mpt3sas_transport.c b/drivers/scsi/mpt3sas/mpt3sas_transport.c index b58f4d9c296a3..326265fd7f91a 100644 --- a/drivers/scsi/mpt3sas/mpt3sas_transport.c +++ b/drivers/scsi/mpt3sas/mpt3sas_transport.c @@ -670,7 +670,7 @@ mpt3sas_transport_port_add(struct MPT3SAS_ADAPTER *ioc, u16 handle, goto out_fail; } port = sas_port_alloc_num(sas_node->parent_dev); - if ((sas_port_add(port))) { + if (!port || (sas_port_add(port))) { ioc_err(ioc, "failure at %s:%d/%s()!\n", __FILE__, __LINE__, __func__); goto out_fail; @@ -695,6 +695,12 @@ mpt3sas_transport_port_add(struct MPT3SAS_ADAPTER *ioc, u16 handle, rphy = sas_expander_alloc(port, mpt3sas_port->remote_identify.device_type);
+ if (!rphy) { + ioc_err(ioc, "failure at %s:%d/%s()!\n", + __FILE__, __LINE__, __func__); + goto out_delete_port; + } + rphy->identify = mpt3sas_port->remote_identify;
if (mpt3sas_port->remote_identify.device_type == SAS_END_DEVICE) { @@ -714,6 +720,7 @@ mpt3sas_transport_port_add(struct MPT3SAS_ADAPTER *ioc, u16 handle, __FILE__, __LINE__, __func__); sas_rphy_free(rphy); rphy = NULL; + goto out_delete_port; }
if (mpt3sas_port->remote_identify.device_type == SAS_END_DEVICE) { @@ -740,7 +747,10 @@ mpt3sas_transport_port_add(struct MPT3SAS_ADAPTER *ioc, u16 handle, rphy_to_expander_device(rphy)); return mpt3sas_port;
- out_fail: +out_delete_port: + sas_port_delete(port); + +out_fail: list_for_each_entry_safe(mpt3sas_phy, next, &mpt3sas_port->phy_list, port_siblings) list_del(&mpt3sas_phy->port_siblings);
From: Bjorn Helgaas bhelgaas@google.com
[ Upstream commit ff447886e675979d66b2bc01810035d3baea1b3a ]
CONTROLLER_IN_GPU() is clearly intended to match only Intel devices, but previously it checked only the PCI Device ID, not the Vendor ID, so it could match devices from other vendors that happened to use the same Device ID.
Update CONTROLLER_IN_GPU() so it matches only Intel devices.
Fixes: 535115b5ff51 ("ALSA: hda - Abort the probe without i915 binding for HSW/B") Signed-off-by: Bjorn Helgaas bhelgaas@google.com Link: https://lore.kernel.org/r/20230307214054.886721-1-helgaas@kernel.org Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Sasha Levin sashal@kernel.org --- sound/pci/hda/hda_intel.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/sound/pci/hda/hda_intel.c b/sound/pci/hda/hda_intel.c index 494bfd2135a9e..de1fe604905f3 100644 --- a/sound/pci/hda/hda_intel.c +++ b/sound/pci/hda/hda_intel.c @@ -365,14 +365,15 @@ enum { #define needs_eld_notify_link(chip) false #endif
-#define CONTROLLER_IN_GPU(pci) (((pci)->device == 0x0a0c) || \ +#define CONTROLLER_IN_GPU(pci) (((pci)->vendor == 0x8086) && \ + (((pci)->device == 0x0a0c) || \ ((pci)->device == 0x0c0c) || \ ((pci)->device == 0x0d0c) || \ ((pci)->device == 0x160c) || \ ((pci)->device == 0x490d) || \ ((pci)->device == 0x4f90) || \ ((pci)->device == 0x4f91) || \ - ((pci)->device == 0x4f92)) + ((pci)->device == 0x4f92)))
#define IS_BXT(pci) ((pci)->vendor == 0x8086 && (pci)->device == 0x5a98)
From: Jeremy Sowden jeremy@azazel.net
[ Upstream commit 068d82e75d537b444303b8c449a11e51ea659565 ]
The values in the protocol registers are two bytes wide. However, when parsing the register loads, the code currently uses the larger 16-byte size of a `union nf_inet_addr`. Change it to use the (correct) size of a `union nf_conntrack_man_proto` instead.
Fixes: d07db9884a5f ("netfilter: nf_tables: introduce nft_validate_register_load()") Signed-off-by: Jeremy Sowden jeremy@azazel.net Reviewed-by: Florian Westphal fw@strlen.de Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/netfilter/nft_nat.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/netfilter/nft_nat.c b/net/netfilter/nft_nat.c index db8f9116eeb43..cd4eb4996aff3 100644 --- a/net/netfilter/nft_nat.c +++ b/net/netfilter/nft_nat.c @@ -226,7 +226,7 @@ static int nft_nat_init(const struct nft_ctx *ctx, const struct nft_expr *expr, priv->flags |= NF_NAT_RANGE_MAP_IPS; }
- plen = sizeof_field(struct nf_nat_range, min_addr.all); + plen = sizeof_field(struct nf_nat_range, min_proto.all); if (tb[NFTA_NAT_REG_PROTO_MIN]) { err = nft_parse_register_load(tb[NFTA_NAT_REG_PROTO_MIN], &priv->sreg_proto_min, plen);
From: Jeremy Sowden jeremy@azazel.net
[ Upstream commit ec2c5917eb858428b2083d1c74f445aabbe8316b ]
The values in the protocol registers are two bytes wide. However, when parsing the register loads, the code currently uses the larger 16-byte size of a `union nf_inet_addr`. Change it to use the (correct) size of a `union nf_conntrack_man_proto` instead.
Fixes: 8a6bf5da1aef ("netfilter: nft_masq: support port range") Signed-off-by: Jeremy Sowden jeremy@azazel.net Reviewed-by: Florian Westphal fw@strlen.de Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/netfilter/nft_masq.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/netfilter/nft_masq.c b/net/netfilter/nft_masq.c index 9953e80537536..1818dbf089cad 100644 --- a/net/netfilter/nft_masq.c +++ b/net/netfilter/nft_masq.c @@ -43,7 +43,7 @@ static int nft_masq_init(const struct nft_ctx *ctx, const struct nft_expr *expr, const struct nlattr * const tb[]) { - u32 plen = sizeof_field(struct nf_nat_range, min_addr.all); + u32 plen = sizeof_field(struct nf_nat_range, min_proto.all); struct nft_masq *priv = nft_expr_priv(expr); int err;
From: Jeremy Sowden jeremy@azazel.net
[ Upstream commit 1f617b6b4c7a3d5ea7a56abb83a4c27733b60c2f ]
The values in the protocol registers are two bytes wide. However, when parsing the register loads, the code currently uses the larger 16-byte size of a `union nf_inet_addr`. Change it to use the (correct) size of a `union nf_conntrack_man_proto` instead.
Fixes: d07db9884a5f ("netfilter: nf_tables: introduce nft_validate_register_load()") Signed-off-by: Jeremy Sowden jeremy@azazel.net Reviewed-by: Florian Westphal fw@strlen.de Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/netfilter/nft_redir.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/netfilter/nft_redir.c b/net/netfilter/nft_redir.c index ba09890dddb50..deb7e65c8d82b 100644 --- a/net/netfilter/nft_redir.c +++ b/net/netfilter/nft_redir.c @@ -48,7 +48,7 @@ static int nft_redir_init(const struct nft_ctx *ctx, unsigned int plen; int err;
- plen = sizeof_field(struct nf_nat_range, min_addr.all); + plen = sizeof_field(struct nf_nat_range, min_proto.all); if (tb[NFTA_REDIR_REG_PROTO_MIN]) { err = nft_parse_register_load(tb[NFTA_REDIR_REG_PROTO_MIN], &priv->sreg_proto_min, plen);
From: Jeremy Sowden jeremy@azazel.net
[ Upstream commit 493924519b1fe3faab13ee621a43b0d0939abab1 ]
`nft_redir_inet_type.maxattrs` was being set, presumably because of a cut-and-paste error, to `NFTA_MASQ_MAX`, instead of `NFTA_REDIR_MAX`.
Fixes: 63ce3940f3ab ("netfilter: nft_redir: add inet support") Signed-off-by: Jeremy Sowden jeremy@azazel.net Reviewed-by: Florian Westphal fw@strlen.de Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/netfilter/nft_redir.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/netfilter/nft_redir.c b/net/netfilter/nft_redir.c index deb7e65c8d82b..e64f531d66cfc 100644 --- a/net/netfilter/nft_redir.c +++ b/net/netfilter/nft_redir.c @@ -232,7 +232,7 @@ static struct nft_expr_type nft_redir_inet_type __read_mostly = { .name = "redir", .ops = &nft_redir_inet_ops, .policy = nft_redir_policy, - .maxattr = NFTA_MASQ_MAX, + .maxattr = NFTA_REDIR_MAX, .owner = THIS_MODULE, };
From: Xiang Chen chenxiang66@hisilicon.com
[ Upstream commit 2dde5c8d912efea43be94d6a83ac9cb74879fa12 ]
Commit 3be8828fc507 ("scsi: core: Avoid that ATA error handling can trigger a kernel hang or oops") moved rcu to scsi_cmnd instead of shost. Modify "shost->rcu" to "scmd->rcu" in a comment.
Link: https://lore.kernel.org/r/1620646526-193154-1-git-send-email-chenxiang66@his... Signed-off-by: Xiang Chen chenxiang66@hisilicon.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Stable-dep-of: be03df3d4bfe ("scsi: core: Fix a procfs host directory removal regression") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/hosts.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c index fae0323242103..0fd2487203ff5 100644 --- a/drivers/scsi/hosts.c +++ b/drivers/scsi/hosts.c @@ -325,7 +325,7 @@ static void scsi_host_dev_release(struct device *dev) /* In case scsi_remove_host() has not been called. */ scsi_proc_hostdir_rm(shost->hostt);
- /* Wait for functions invoked through call_rcu(&shost->rcu, ...) */ + /* Wait for functions invoked through call_rcu(&scmd->rcu, ...) */ rcu_barrier();
if (shost->tmf_work_q)
From: Bart Van Assche bvanassche@acm.org
[ Upstream commit be03df3d4bfe7e8866d4aa43d62e648ffe884f5f ]
scsi_proc_hostdir_rm() decreases a reference counter and hence must only be called once per host that is removed. This change does not require a scsi_add_host_with_dma() change since scsi_add_host_with_dma() will return 0 (success) if scsi_proc_host_add() is called.
Fixes: fc663711b944 ("scsi: core: Remove the /proc/scsi/${proc_name} directory earlier") Cc: John Garry john.g.garry@oracle.com Reported-by: John Garry john.g.garry@oracle.com Link: https://lore.kernel.org/all/ed6b8027-a9d9-1b45-be8e-df4e8c6c4605@oracle.com/ Reported-by: syzbot+645a4616b87a2f10e398@syzkaller.appspotmail.com Link: https://lore.kernel.org/linux-scsi/000000000000890fab05f65342b6@google.com/ Signed-off-by: Bart Van Assche bvanassche@acm.org Link: https://lore.kernel.org/r/20230307214428.3703498-1-bvanassche@acm.org Tested-by: John Garry john.g.garry@oracle.com Tested-by: Shin'ichiro Kawasaki shinichiro.kawasaki@wdc.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/hosts.c | 3 --- 1 file changed, 3 deletions(-)
diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c index 0fd2487203ff5..18321cf9db5d6 100644 --- a/drivers/scsi/hosts.c +++ b/drivers/scsi/hosts.c @@ -322,9 +322,6 @@ static void scsi_host_dev_release(struct device *dev) struct Scsi_Host *shost = dev_to_shost(dev); struct device *parent = dev->parent;
- /* In case scsi_remove_host() has not been called. */ - scsi_proc_hostdir_rm(shost->hostt); - /* Wait for functions invoked through call_rcu(&scmd->rcu, ...) */ rcu_barrier();
From: Breno Leitao leitao@debian.org
[ Upstream commit bced3f7db95ff2e6ca29dc4d1c9751ab5e736a09 ]
tcp_rtx_synack() now could be called in process context as explained in 0a375c822497 ("tcp: tcp_rtx_synack() can be called from process context").
tcp_rtx_synack() might call tcp_make_synack(), which will touch per-CPU variables with preemption enabled. This causes the following BUG:
BUG: using __this_cpu_add() in preemptible [00000000] code: ThriftIO1/5464 caller is tcp_make_synack+0x841/0xac0 Call Trace: <TASK> dump_stack_lvl+0x10d/0x1a0 check_preemption_disabled+0x104/0x110 tcp_make_synack+0x841/0xac0 tcp_v6_send_synack+0x5c/0x450 tcp_rtx_synack+0xeb/0x1f0 inet_rtx_syn_ack+0x34/0x60 tcp_check_req+0x3af/0x9e0 tcp_rcv_state_process+0x59b/0x2030 tcp_v6_do_rcv+0x5f5/0x700 release_sock+0x3a/0xf0 tcp_sendmsg+0x33/0x40 ____sys_sendmsg+0x2f2/0x490 __sys_sendmsg+0x184/0x230 do_syscall_64+0x3d/0x90
Avoid calling __TCP_INC_STATS() with will touch per-cpu variables. Use TCP_INC_STATS() which is safe to be called from context switch.
Fixes: 8336886f786f ("tcp: TCP Fast Open Server - support TFO listeners") Signed-off-by: Breno Leitao leitao@debian.org Reviewed-by: Eric Dumazet edumazet@google.com Link: https://lore.kernel.org/r/20230308190745.780221-1-leitao@debian.org Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/ipv4/tcp_output.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c index eefd032bc6dbd..e4ad274ec7a30 100644 --- a/net/ipv4/tcp_output.c +++ b/net/ipv4/tcp_output.c @@ -3609,7 +3609,7 @@ struct sk_buff *tcp_make_synack(const struct sock *sk, struct dst_entry *dst, th->window = htons(min(req->rsk_rcv_wnd, 65535U)); tcp_options_write((__be32 *)(th + 1), NULL, &opts); th->doff = (tcp_header_size >> 2); - __TCP_INC_STATS(sock_net(sk), TCP_MIB_OUTSEGS); + TCP_INC_STATS(sock_net(sk), TCP_MIB_OUTSEGS);
#ifdef CONFIG_TCP_MD5SIG /* Okay, we have all we need - do the md5 hash if needed */
From: Fedor Pchelkin pchelkin@ispras.ru
[ Upstream commit 484b7059796e3bc1cb527caa61dfc60da649b4f6 ]
struct pn533_out_arg used as a temporary context for out_urb is not initialized properly. Its uninitialized 'phy' field can be dereferenced in error cases inside pn533_out_complete() callback function. It causes the following failure:
general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] PREEMPT SMP KASAN KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 6.2.0-rc3-next-20230110-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/26/2022 RIP: 0010:pn533_out_complete.cold+0x15/0x44 drivers/nfc/pn533/usb.c:441 Call Trace: <IRQ> __usb_hcd_giveback_urb+0x2b6/0x5c0 drivers/usb/core/hcd.c:1671 usb_hcd_giveback_urb+0x384/0x430 drivers/usb/core/hcd.c:1754 dummy_timer+0x1203/0x32d0 drivers/usb/gadget/udc/dummy_hcd.c:1988 call_timer_fn+0x1da/0x800 kernel/time/timer.c:1700 expire_timers+0x234/0x330 kernel/time/timer.c:1751 __run_timers kernel/time/timer.c:2022 [inline] __run_timers kernel/time/timer.c:1995 [inline] run_timer_softirq+0x326/0x910 kernel/time/timer.c:2035 __do_softirq+0x1fb/0xaf6 kernel/softirq.c:571 invoke_softirq kernel/softirq.c:445 [inline] __irq_exit_rcu+0x123/0x180 kernel/softirq.c:650 irq_exit_rcu+0x9/0x20 kernel/softirq.c:662 sysvec_apic_timer_interrupt+0x97/0xc0 arch/x86/kernel/apic/apic.c:1107
Initialize the field with the pn533_usb_phy currently used.
Found by Linux Verification Center (linuxtesting.org) with Syzkaller.
Fixes: 9dab880d675b ("nfc: pn533: Wait for out_urb's completion in pn533_usb_send_frame()") Reported-by: syzbot+1e608ba4217c96d1952f@syzkaller.appspotmail.com Signed-off-by: Fedor Pchelkin pchelkin@ispras.ru Reviewed-by: Simon Horman simon.horman@corigine.com Link: https://lore.kernel.org/r/20230309165050.207390-1-pchelkin@ispras.ru Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/nfc/pn533/usb.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/nfc/pn533/usb.c b/drivers/nfc/pn533/usb.c index 57b07446bb768..68eb1253f888f 100644 --- a/drivers/nfc/pn533/usb.c +++ b/drivers/nfc/pn533/usb.c @@ -175,6 +175,7 @@ static int pn533_usb_send_frame(struct pn533 *dev, print_hex_dump_debug("PN533 TX: ", DUMP_PREFIX_NONE, 16, 1, out->data, out->len, false);
+ arg.phy = phy; init_completion(&arg.done); cntx = phy->out_urb->context; phy->out_urb->context = &arg;
From: Jianguo Wu wujianguo@chinatelecom.cn
[ Upstream commit 59a0b022aa249e3f5735d93de0849341722c4754 ]
For l3s mode, skb->dev is set to ipvlan interface in ipvlan_nf_input(): skb->dev = addr->master->dev but, skb->skb_iif remain unchanged, this will cause socket lookup failed if a target socket is bound to a interface, like the following example:
ip link add ipvlan0 link eth0 type ipvlan mode l3s ip addr add dev ipvlan0 192.168.124.111/24 ip link set ipvlan0 up
ping -c 1 -I ipvlan0 8.8.8.8 100% packet loss
This is because there is no match sk in __raw_v4_lookup() as sk->sk_bound_dev_if != dif(skb->skb_iif). Fix this by make skb->skb_iif track skb->dev in ipvlan_nf_input().
Fixes: c675e06a98a4 ("ipvlan: decouple l3s mode dependencies from other modes") Signed-off-by: Jianguo Wu wujianguo@chinatelecom.cn Reviewed-by: Jiri Pirko jiri@nvidia.com Link: https://lore.kernel.org/r/29865b1f-6db7-c07a-de89-949d3721ea30@163.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ipvlan/ipvlan_l3s.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/net/ipvlan/ipvlan_l3s.c b/drivers/net/ipvlan/ipvlan_l3s.c index 943d26cbf39f5..71712ea25403d 100644 --- a/drivers/net/ipvlan/ipvlan_l3s.c +++ b/drivers/net/ipvlan/ipvlan_l3s.c @@ -101,6 +101,7 @@ static unsigned int ipvlan_nf_input(void *priv, struct sk_buff *skb, goto out;
skb->dev = addr->master->dev; + skb->skb_iif = skb->dev->ifindex; len = skb->len + ETH_HLEN; ipvlan_count_rx(addr->master, len, true, false); out:
From: Ivan Vecera ivecera@redhat.com
[ Upstream commit 7e4f8a0c495413a50413e8c9f1032ce1bc633bae ]
If the driver detects during probe that firmware is in recovery mode then i40e_init_recovery_mode() is called and the rest of probe function is skipped including pci_set_drvdata(). Subsequent i40e_shutdown() called during shutdown/reboot dereferences NULL pointer as pci_get_drvdata() returns NULL.
To fix call pci_set_drvdata() also during entering to recovery mode.
Reproducer: 1) Lets have i40e NIC with firmware in recovery mode 2) Run reboot
Result: [ 139.084698] i40e: Intel(R) Ethernet Connection XL710 Network Driver [ 139.090959] i40e: Copyright (c) 2013 - 2019 Intel Corporation. [ 139.108438] i40e 0000:02:00.0: Firmware recovery mode detected. Limiting functionality. [ 139.116439] i40e 0000:02:00.0: Refer to the Intel(R) Ethernet Adapters and Devices User Guide for details on firmware recovery mode. [ 139.129499] i40e 0000:02:00.0: fw 8.3.64775 api 1.13 nvm 8.30 0x8000b78d 1.3106.0 [8086:1583] [15d9:084a] [ 139.215932] i40e 0000:02:00.0 enp2s0f0: renamed from eth0 [ 139.223292] i40e 0000:02:00.1: Firmware recovery mode detected. Limiting functionality. [ 139.231292] i40e 0000:02:00.1: Refer to the Intel(R) Ethernet Adapters and Devices User Guide for details on firmware recovery mode. [ 139.244406] i40e 0000:02:00.1: fw 8.3.64775 api 1.13 nvm 8.30 0x8000b78d 1.3106.0 [8086:1583] [15d9:084a] [ 139.329209] i40e 0000:02:00.1 enp2s0f1: renamed from eth0 ... [ 156.311376] BUG: kernel NULL pointer dereference, address: 00000000000006c2 [ 156.318330] #PF: supervisor write access in kernel mode [ 156.323546] #PF: error_code(0x0002) - not-present page [ 156.328679] PGD 0 P4D 0 [ 156.331210] Oops: 0002 [#1] PREEMPT SMP NOPTI [ 156.335567] CPU: 26 PID: 15119 Comm: reboot Tainted: G E 6.2.0+ #1 [ 156.343126] Hardware name: Abacus electric, s.r.o. - servis@abacus.cz Super Server/H12SSW-iN, BIOS 2.4 04/13/2022 [ 156.353369] RIP: 0010:i40e_shutdown+0x15/0x130 [i40e] [ 156.358430] Code: c1 fc ff ff 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 0f 1f 44 00 00 55 48 89 fd 53 48 8b 9f 48 01 00 00 <f0> 80 8b c2 06 00 00 04 f0 80 8b c0 06 00 00 08 48 8d bb 08 08 00 [ 156.377168] RSP: 0018:ffffb223c8447d90 EFLAGS: 00010282 [ 156.382384] RAX: ffffffffc073ee70 RBX: 0000000000000000 RCX: 0000000000000001 [ 156.389510] RDX: 0000000080000001 RSI: 0000000000000246 RDI: ffff95db49988000 [ 156.396634] RBP: ffff95db49988000 R08: ffffffffffffffff R09: ffffffff8bd17d40 [ 156.403759] R10: 0000000000000001 R11: ffffffff8a5e3d28 R12: ffff95db49988000 [ 156.410882] R13: ffffffff89a6fe17 R14: ffff95db49988150 R15: 0000000000000000 [ 156.418007] FS: 00007fe7c0cc3980(0000) GS:ffff95ea8ee80000(0000) knlGS:0000000000000000 [ 156.426083] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 156.431819] CR2: 00000000000006c2 CR3: 00000003092fc005 CR4: 0000000000770ee0 [ 156.438944] PKRU: 55555554 [ 156.441647] Call Trace: [ 156.444096] <TASK> [ 156.446199] pci_device_shutdown+0x38/0x60 [ 156.450297] device_shutdown+0x163/0x210 [ 156.454215] kernel_restart+0x12/0x70 [ 156.457872] __do_sys_reboot+0x1ab/0x230 [ 156.461789] ? vfs_writev+0xa6/0x1a0 [ 156.465362] ? __pfx_file_free_rcu+0x10/0x10 [ 156.469635] ? __call_rcu_common.constprop.85+0x109/0x5a0 [ 156.475034] do_syscall_64+0x3e/0x90 [ 156.478611] entry_SYSCALL_64_after_hwframe+0x72/0xdc [ 156.483658] RIP: 0033:0x7fe7bff37ab7
Fixes: 4ff0ee1af016 ("i40e: Introduce recovery mode support") Signed-off-by: Ivan Vecera ivecera@redhat.com Tested-by: Arpana Arland arpanax.arland@intel.com (A Contingent worker at Intel) Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Link: https://lore.kernel.org/r/20230309184509.984639-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/i40e/i40e_main.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c index 9e8a20a94862f..76481ff7074ba 100644 --- a/drivers/net/ethernet/intel/i40e/i40e_main.c +++ b/drivers/net/ethernet/intel/i40e/i40e_main.c @@ -14851,6 +14851,7 @@ static int i40e_init_recovery_mode(struct i40e_pf *pf, struct i40e_hw *hw) int err; int v_idx;
+ pci_set_drvdata(pf->pdev, pf); pci_save_state(pf->pdev);
/* set up periodic task facility */
From: D. Wythe alibuda@linux.alibaba.com
[ Upstream commit 22a825c541d775c1dbe7b2402786025acad6727b ]
When performing a stress test on SMC-R by rmmod mlx5_ib driver during the wrk/nginx test, we found that there is a probability of triggering a panic while terminating all link groups.
This issue dues to the race between smc_smcr_terminate_all() and smc_buf_create().
smc_smcr_terminate_all
smc_buf_create /* init */ conn->sndbuf_desc = NULL; ...
__smc_lgr_terminate smc_conn_kill smc_close_abort smc_cdc_get_slot_and_msg_send
__softirqentry_text_start smc_wr_tx_process_cqe smc_cdc_tx_handler READ(conn->sndbuf_desc->len); /* panic dues to NULL sndbuf_desc */
conn->sndbuf_desc = xxx;
This patch tries to fix the issue by always to check the sndbuf_desc before send any cdc msg, to make sure that no null pointer is seen during cqe processing.
Fixes: 0b29ec643613 ("net/smc: immediate termination for SMCR link groups") Signed-off-by: D. Wythe alibuda@linux.alibaba.com Reviewed-by: Tony Lu tonylu@linux.alibaba.com Reviewed-by: Wenjia Zhang wenjia@linux.ibm.com Link: https://lore.kernel.org/r/1678263432-17329-1-git-send-email-alibuda@linux.al... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/smc/smc_cdc.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/net/smc/smc_cdc.c b/net/smc/smc_cdc.c index 94503f36b9a61..9125d28d9ff5d 100644 --- a/net/smc/smc_cdc.c +++ b/net/smc/smc_cdc.c @@ -104,6 +104,9 @@ int smc_cdc_msg_send(struct smc_connection *conn, union smc_host_cursor cfed; int rc;
+ if (unlikely(!READ_ONCE(conn->sndbuf_desc))) + return -ENOBUFS; + smc_cdc_add_pending_send(conn, pend);
conn->tx_cdc_seq++;
From: Daniil Tatianin d-tatianin@yandex-team.ru
[ Upstream commit 1a9dc5610ef89d807acdcfbff93a558f341a44da ]
Previously we would divide total_left_rate by zero if num_vports happened to be 1 because non_requested_count is calculated as num_vports - req_count. Guard against this by validating num_vports at the beginning and returning an error otherwise.
Found by Linux Verification Center (linuxtesting.org) with the SVACE static analysis tool.
Fixes: bcd197c81f63 ("qed: Add vport WFQ configuration APIs") Signed-off-by: Daniil Tatianin d-tatianin@yandex-team.ru Reviewed-by: Simon Horman simon.horman@corigine.com Link: https://lore.kernel.org/r/20230309201556.191392-1-d-tatianin@yandex-team.ru Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/qlogic/qed/qed_dev.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/drivers/net/ethernet/qlogic/qed/qed_dev.c b/drivers/net/ethernet/qlogic/qed/qed_dev.c index d2f5855b2ea79..895b6f0a39841 100644 --- a/drivers/net/ethernet/qlogic/qed/qed_dev.c +++ b/drivers/net/ethernet/qlogic/qed/qed_dev.c @@ -4986,6 +4986,11 @@ static int qed_init_wfq_param(struct qed_hwfn *p_hwfn,
num_vports = p_hwfn->qm_info.num_vports;
+ if (num_vports < 2) { + DP_NOTICE(p_hwfn, "Unexpected num_vports: %d\n", num_vports); + return -EINVAL; + } + /* Accounting for the vports which are configured for WFQ explicitly */ for (i = 0; i < num_vports; i++) { u32 tmp_speed;
From: Eric Dumazet edumazet@google.com
[ Upstream commit 4b397c06cb987935b1b097336532aa6b4210e091 ]
IP tunnels can apparently update dev->needed_headroom in their xmit path.
This patch takes care of three tunnels xmit, and also the core LL_RESERVED_SPACE() and LL_RESERVED_SPACE_EXTRA() helpers.
More changes might be needed for completeness.
BUG: KCSAN: data-race in ip_tunnel_xmit / ip_tunnel_xmit
read to 0xffff88815b9da0ec of 2 bytes by task 888 on cpu 1: ip_tunnel_xmit+0x1270/0x1730 net/ipv4/ip_tunnel.c:803 __gre_xmit net/ipv4/ip_gre.c:469 [inline] ipgre_xmit+0x516/0x570 net/ipv4/ip_gre.c:661 __netdev_start_xmit include/linux/netdevice.h:4881 [inline] netdev_start_xmit include/linux/netdevice.h:4895 [inline] xmit_one net/core/dev.c:3580 [inline] dev_hard_start_xmit+0x127/0x400 net/core/dev.c:3596 __dev_queue_xmit+0x1007/0x1eb0 net/core/dev.c:4246 dev_queue_xmit include/linux/netdevice.h:3051 [inline] neigh_direct_output+0x17/0x20 net/core/neighbour.c:1623 neigh_output include/net/neighbour.h:546 [inline] ip_finish_output2+0x740/0x840 net/ipv4/ip_output.c:228 ip_finish_output+0xf4/0x240 net/ipv4/ip_output.c:316 NF_HOOK_COND include/linux/netfilter.h:291 [inline] ip_output+0xe5/0x1b0 net/ipv4/ip_output.c:430 dst_output include/net/dst.h:444 [inline] ip_local_out+0x64/0x80 net/ipv4/ip_output.c:126 iptunnel_xmit+0x34a/0x4b0 net/ipv4/ip_tunnel_core.c:82 ip_tunnel_xmit+0x1451/0x1730 net/ipv4/ip_tunnel.c:813 __gre_xmit net/ipv4/ip_gre.c:469 [inline] ipgre_xmit+0x516/0x570 net/ipv4/ip_gre.c:661 __netdev_start_xmit include/linux/netdevice.h:4881 [inline] netdev_start_xmit include/linux/netdevice.h:4895 [inline] xmit_one net/core/dev.c:3580 [inline] dev_hard_start_xmit+0x127/0x400 net/core/dev.c:3596 __dev_queue_xmit+0x1007/0x1eb0 net/core/dev.c:4246 dev_queue_xmit include/linux/netdevice.h:3051 [inline] neigh_direct_output+0x17/0x20 net/core/neighbour.c:1623 neigh_output include/net/neighbour.h:546 [inline] ip_finish_output2+0x740/0x840 net/ipv4/ip_output.c:228 ip_finish_output+0xf4/0x240 net/ipv4/ip_output.c:316 NF_HOOK_COND include/linux/netfilter.h:291 [inline] ip_output+0xe5/0x1b0 net/ipv4/ip_output.c:430 dst_output include/net/dst.h:444 [inline] ip_local_out+0x64/0x80 net/ipv4/ip_output.c:126 iptunnel_xmit+0x34a/0x4b0 net/ipv4/ip_tunnel_core.c:82 ip_tunnel_xmit+0x1451/0x1730 net/ipv4/ip_tunnel.c:813 __gre_xmit net/ipv4/ip_gre.c:469 [inline] ipgre_xmit+0x516/0x570 net/ipv4/ip_gre.c:661 __netdev_start_xmit include/linux/netdevice.h:4881 [inline] netdev_start_xmit include/linux/netdevice.h:4895 [inline] xmit_one net/core/dev.c:3580 [inline] dev_hard_start_xmit+0x127/0x400 net/core/dev.c:3596 __dev_queue_xmit+0x1007/0x1eb0 net/core/dev.c:4246 dev_queue_xmit include/linux/netdevice.h:3051 [inline] neigh_direct_output+0x17/0x20 net/core/neighbour.c:1623 neigh_output include/net/neighbour.h:546 [inline] ip_finish_output2+0x740/0x840 net/ipv4/ip_output.c:228 ip_finish_output+0xf4/0x240 net/ipv4/ip_output.c:316 NF_HOOK_COND include/linux/netfilter.h:291 [inline] ip_output+0xe5/0x1b0 net/ipv4/ip_output.c:430 dst_output include/net/dst.h:444 [inline] ip_local_out+0x64/0x80 net/ipv4/ip_output.c:126 iptunnel_xmit+0x34a/0x4b0 net/ipv4/ip_tunnel_core.c:82 ip_tunnel_xmit+0x1451/0x1730 net/ipv4/ip_tunnel.c:813 __gre_xmit net/ipv4/ip_gre.c:469 [inline] ipgre_xmit+0x516/0x570 net/ipv4/ip_gre.c:661 __netdev_start_xmit include/linux/netdevice.h:4881 [inline] netdev_start_xmit include/linux/netdevice.h:4895 [inline] xmit_one net/core/dev.c:3580 [inline] dev_hard_start_xmit+0x127/0x400 net/core/dev.c:3596 __dev_queue_xmit+0x1007/0x1eb0 net/core/dev.c:4246 dev_queue_xmit include/linux/netdevice.h:3051 [inline] neigh_direct_output+0x17/0x20 net/core/neighbour.c:1623 neigh_output include/net/neighbour.h:546 [inline] ip_finish_output2+0x740/0x840 net/ipv4/ip_output.c:228 ip_finish_output+0xf4/0x240 net/ipv4/ip_output.c:316 NF_HOOK_COND include/linux/netfilter.h:291 [inline] ip_output+0xe5/0x1b0 net/ipv4/ip_output.c:430 dst_output include/net/dst.h:444 [inline] ip_local_out+0x64/0x80 net/ipv4/ip_output.c:126 iptunnel_xmit+0x34a/0x4b0 net/ipv4/ip_tunnel_core.c:82 ip_tunnel_xmit+0x1451/0x1730 net/ipv4/ip_tunnel.c:813 __gre_xmit net/ipv4/ip_gre.c:469 [inline] ipgre_xmit+0x516/0x570 net/ipv4/ip_gre.c:661 __netdev_start_xmit include/linux/netdevice.h:4881 [inline] netdev_start_xmit include/linux/netdevice.h:4895 [inline] xmit_one net/core/dev.c:3580 [inline] dev_hard_start_xmit+0x127/0x400 net/core/dev.c:3596 __dev_queue_xmit+0x1007/0x1eb0 net/core/dev.c:4246 dev_queue_xmit include/linux/netdevice.h:3051 [inline] neigh_direct_output+0x17/0x20 net/core/neighbour.c:1623 neigh_output include/net/neighbour.h:546 [inline] ip_finish_output2+0x740/0x840 net/ipv4/ip_output.c:228 ip_finish_output+0xf4/0x240 net/ipv4/ip_output.c:316 NF_HOOK_COND include/linux/netfilter.h:291 [inline] ip_output+0xe5/0x1b0 net/ipv4/ip_output.c:430 dst_output include/net/dst.h:444 [inline] ip_local_out+0x64/0x80 net/ipv4/ip_output.c:126 iptunnel_xmit+0x34a/0x4b0 net/ipv4/ip_tunnel_core.c:82 ip_tunnel_xmit+0x1451/0x1730 net/ipv4/ip_tunnel.c:813 __gre_xmit net/ipv4/ip_gre.c:469 [inline] ipgre_xmit+0x516/0x570 net/ipv4/ip_gre.c:661 __netdev_start_xmit include/linux/netdevice.h:4881 [inline] netdev_start_xmit include/linux/netdevice.h:4895 [inline] xmit_one net/core/dev.c:3580 [inline] dev_hard_start_xmit+0x127/0x400 net/core/dev.c:3596 __dev_queue_xmit+0x1007/0x1eb0 net/core/dev.c:4246 dev_queue_xmit include/linux/netdevice.h:3051 [inline] neigh_direct_output+0x17/0x20 net/core/neighbour.c:1623 neigh_output include/net/neighbour.h:546 [inline] ip_finish_output2+0x740/0x840 net/ipv4/ip_output.c:228 ip_finish_output+0xf4/0x240 net/ipv4/ip_output.c:316 NF_HOOK_COND include/linux/netfilter.h:291 [inline] ip_output+0xe5/0x1b0 net/ipv4/ip_output.c:430 dst_output include/net/dst.h:444 [inline] ip_local_out+0x64/0x80 net/ipv4/ip_output.c:126 iptunnel_xmit+0x34a/0x4b0 net/ipv4/ip_tunnel_core.c:82 ip_tunnel_xmit+0x1451/0x1730 net/ipv4/ip_tunnel.c:813 __gre_xmit net/ipv4/ip_gre.c:469 [inline] ipgre_xmit+0x516/0x570 net/ipv4/ip_gre.c:661 __netdev_start_xmit include/linux/netdevice.h:4881 [inline] netdev_start_xmit include/linux/netdevice.h:4895 [inline] xmit_one net/core/dev.c:3580 [inline] dev_hard_start_xmit+0x127/0x400 net/core/dev.c:3596 __dev_queue_xmit+0x1007/0x1eb0 net/core/dev.c:4246
write to 0xffff88815b9da0ec of 2 bytes by task 2379 on cpu 0: ip_tunnel_xmit+0x1294/0x1730 net/ipv4/ip_tunnel.c:804 __gre_xmit net/ipv4/ip_gre.c:469 [inline] ipgre_xmit+0x516/0x570 net/ipv4/ip_gre.c:661 __netdev_start_xmit include/linux/netdevice.h:4881 [inline] netdev_start_xmit include/linux/netdevice.h:4895 [inline] xmit_one net/core/dev.c:3580 [inline] dev_hard_start_xmit+0x127/0x400 net/core/dev.c:3596 __dev_queue_xmit+0x1007/0x1eb0 net/core/dev.c:4246 dev_queue_xmit include/linux/netdevice.h:3051 [inline] neigh_direct_output+0x17/0x20 net/core/neighbour.c:1623 neigh_output include/net/neighbour.h:546 [inline] ip6_finish_output2+0x9bc/0xc50 net/ipv6/ip6_output.c:134 __ip6_finish_output net/ipv6/ip6_output.c:195 [inline] ip6_finish_output+0x39a/0x4e0 net/ipv6/ip6_output.c:206 NF_HOOK_COND include/linux/netfilter.h:291 [inline] ip6_output+0xeb/0x220 net/ipv6/ip6_output.c:227 dst_output include/net/dst.h:444 [inline] NF_HOOK include/linux/netfilter.h:302 [inline] mld_sendpack+0x438/0x6a0 net/ipv6/mcast.c:1820 mld_send_cr net/ipv6/mcast.c:2121 [inline] mld_ifc_work+0x519/0x7b0 net/ipv6/mcast.c:2653 process_one_work+0x3e6/0x750 kernel/workqueue.c:2390 worker_thread+0x5f2/0xa10 kernel/workqueue.c:2537 kthread+0x1ac/0x1e0 kernel/kthread.c:376 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:308
value changed: 0x0dd4 -> 0x0e14
Reported by Kernel Concurrency Sanitizer on: CPU: 0 PID: 2379 Comm: kworker/0:0 Not tainted 6.3.0-rc1-syzkaller-00002-g8ca09d5fa354-dirty #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/02/2023 Workqueue: mld mld_ifc_work
Fixes: 8eb30be0352d ("ipv6: Create ip6_tnl_xmit") Reported-by: syzbot syzkaller@googlegroups.com Signed-off-by: Eric Dumazet edumazet@google.com Link: https://lore.kernel.org/r/20230310191109.2384387-1-edumazet@google.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/netdevice.h | 6 ++++-- net/ipv4/ip_tunnel.c | 12 ++++++------ net/ipv6/ip6_tunnel.c | 4 ++-- 3 files changed, 12 insertions(+), 10 deletions(-)
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h index b478a16ef284d..9ef63bc14b002 100644 --- a/include/linux/netdevice.h +++ b/include/linux/netdevice.h @@ -270,9 +270,11 @@ struct hh_cache { * relationship HH alignment <= LL alignment. */ #define LL_RESERVED_SPACE(dev) \ - ((((dev)->hard_header_len+(dev)->needed_headroom)&~(HH_DATA_MOD - 1)) + HH_DATA_MOD) + ((((dev)->hard_header_len + READ_ONCE((dev)->needed_headroom)) \ + & ~(HH_DATA_MOD - 1)) + HH_DATA_MOD) #define LL_RESERVED_SPACE_EXTRA(dev,extra) \ - ((((dev)->hard_header_len+(dev)->needed_headroom+(extra))&~(HH_DATA_MOD - 1)) + HH_DATA_MOD) + ((((dev)->hard_header_len + READ_ONCE((dev)->needed_headroom) + (extra)) \ + & ~(HH_DATA_MOD - 1)) + HH_DATA_MOD)
struct header_ops { int (*create) (struct sk_buff *skb, struct net_device *dev, diff --git a/net/ipv4/ip_tunnel.c b/net/ipv4/ip_tunnel.c index be75b409445c2..99f70b990eb13 100644 --- a/net/ipv4/ip_tunnel.c +++ b/net/ipv4/ip_tunnel.c @@ -613,10 +613,10 @@ void ip_md_tunnel_xmit(struct sk_buff *skb, struct net_device *dev, }
headroom += LL_RESERVED_SPACE(rt->dst.dev) + rt->dst.header_len; - if (headroom > dev->needed_headroom) - dev->needed_headroom = headroom; + if (headroom > READ_ONCE(dev->needed_headroom)) + WRITE_ONCE(dev->needed_headroom, headroom);
- if (skb_cow_head(skb, dev->needed_headroom)) { + if (skb_cow_head(skb, READ_ONCE(dev->needed_headroom))) { ip_rt_put(rt); goto tx_dropped; } @@ -797,10 +797,10 @@ void ip_tunnel_xmit(struct sk_buff *skb, struct net_device *dev,
max_headroom = LL_RESERVED_SPACE(rt->dst.dev) + sizeof(struct iphdr) + rt->dst.header_len + ip_encap_hlen(&tunnel->encap); - if (max_headroom > dev->needed_headroom) - dev->needed_headroom = max_headroom; + if (max_headroom > READ_ONCE(dev->needed_headroom)) + WRITE_ONCE(dev->needed_headroom, max_headroom);
- if (skb_cow_head(skb, dev->needed_headroom)) { + if (skb_cow_head(skb, READ_ONCE(dev->needed_headroom))) { ip_rt_put(rt); dev->stats.tx_dropped++; kfree_skb(skb); diff --git a/net/ipv6/ip6_tunnel.c b/net/ipv6/ip6_tunnel.c index 0d4cab94c5dd2..a03a322e0cc1c 100644 --- a/net/ipv6/ip6_tunnel.c +++ b/net/ipv6/ip6_tunnel.c @@ -1267,8 +1267,8 @@ int ip6_tnl_xmit(struct sk_buff *skb, struct net_device *dev, __u8 dsfield, */ max_headroom = LL_RESERVED_SPACE(dst->dev) + sizeof(struct ipv6hdr) + dst->header_len + t->hlen; - if (max_headroom > dev->needed_headroom) - dev->needed_headroom = max_headroom; + if (max_headroom > READ_ONCE(dev->needed_headroom)) + WRITE_ONCE(dev->needed_headroom, max_headroom);
err = ip6_tnl_encap(skb, t, &proto, fl6); if (err)
From: Heiner Kallweit hkallweit1@gmail.com
[ Upstream commit c22c3bbf351e4ce905f082649cffa1ff893ea8c1 ]
If genphy_read_status fails then further access to the PHY may result in unpredictable behavior. To prevent this bail out immediately if genphy_read_status fails.
Fixes: 4223dbffed9f ("net: phy: smsc: Re-enable EDPD mode for LAN87xx") Signed-off-by: Heiner Kallweit hkallweit1@gmail.com Reviewed-by: Simon Horman simon.horman@corigine.com Link: https://lore.kernel.org/r/026aa4f2-36f5-1c10-ab9f-cdb17dda6ac4@gmail.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/phy/smsc.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/net/phy/smsc.c b/drivers/net/phy/smsc.c index caf7291ffaf83..b67de3f9ef186 100644 --- a/drivers/net/phy/smsc.c +++ b/drivers/net/phy/smsc.c @@ -181,8 +181,11 @@ static int lan95xx_config_aneg_ext(struct phy_device *phydev) static int lan87xx_read_status(struct phy_device *phydev) { struct smsc_phy_priv *priv = phydev->priv; + int err;
- int err = genphy_read_status(phydev); + err = genphy_read_status(phydev); + if (err) + return err;
if (!phydev->link && priv->energy_enable) { /* Disable EDPD to wake up PHY */
From: Zheng Wang zyytlz.wz@163.com
[ Upstream commit 5000fe6c27827a61d8250a7e4a1d26c3298ef4f6 ]
This bug influences both st_nci_i2c_remove and st_nci_spi_remove. Take st_nci_i2c_remove as an example.
In st_nci_i2c_probe, it called ndlc_probe and bound &ndlc->sm_work with llt_ndlc_sm_work.
When it calls ndlc_recv or timeout handler, it will finally call schedule_work to start the work.
When we call st_nci_i2c_remove to remove the driver, there may be a sequence as follows:
Fix it by finishing the work before cleanup in ndlc_remove
CPU0 CPU1
|llt_ndlc_sm_work st_nci_i2c_remove | ndlc_remove | st_nci_remove | nci_free_device| kfree(ndev) | //free ndlc->ndev | |llt_ndlc_rcv_queue |nci_recv_frame |//use ndlc->ndev
Fixes: 35630df68d60 ("NFC: st21nfcb: Add driver for STMicroelectronics ST21NFCB NFC chip") Signed-off-by: Zheng Wang zyytlz.wz@163.com Reviewed-by: Krzysztof Kozlowski krzysztof.kozlowski@linaro.org Link: https://lore.kernel.org/r/20230312160837.2040857-1-zyytlz.wz@163.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/nfc/st-nci/ndlc.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/nfc/st-nci/ndlc.c b/drivers/nfc/st-nci/ndlc.c index 5d74c674368a5..8ccf5a86ad1bb 100644 --- a/drivers/nfc/st-nci/ndlc.c +++ b/drivers/nfc/st-nci/ndlc.c @@ -286,13 +286,15 @@ EXPORT_SYMBOL(ndlc_probe);
void ndlc_remove(struct llt_ndlc *ndlc) { - st_nci_remove(ndlc->ndev); - /* cancel timers */ del_timer_sync(&ndlc->t1_timer); del_timer_sync(&ndlc->t2_timer); ndlc->t2_active = false; ndlc->t1_active = false; + /* cancel work */ + cancel_work_sync(&ndlc->sm_work); + + st_nci_remove(ndlc->ndev);
skb_queue_purge(&ndlc->rcv_q); skb_queue_purge(&ndlc->send_q);
From: Wenjia Zhang wenjia@linux.ibm.com
[ Upstream commit 13085e1b5cab8ad802904d72e6a6dae85ae0cd20 ]
The following LOCKDEP was detected: Workqueue: events smc_lgr_free_work [smc] WARNING: possible circular locking dependency detected 6.1.0-20221027.rc2.git8.56bc5b569087.300.fc36.s390x+debug #1 Not tainted ------------------------------------------------------ kworker/3:0/176251 is trying to acquire lock: 00000000f1467148 ((wq_completion)smc_tx_wq-00000000#2){+.+.}-{0:0}, at: __flush_workqueue+0x7a/0x4f0 but task is already holding lock: 0000037fffe97dc8 ((work_completion)(&(&lgr->free_work)->work)){+.+.}-{0:0}, at: process_one_work+0x232/0x730 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #4 ((work_completion)(&(&lgr->free_work)->work)){+.+.}-{0:0}: __lock_acquire+0x58e/0xbd8 lock_acquire.part.0+0xe2/0x248 lock_acquire+0xac/0x1c8 __flush_work+0x76/0xf0 __cancel_work_timer+0x170/0x220 __smc_lgr_terminate.part.0+0x34/0x1c0 [smc] smc_connect_rdma+0x15e/0x418 [smc] __smc_connect+0x234/0x480 [smc] smc_connect+0x1d6/0x230 [smc] __sys_connect+0x90/0xc0 __do_sys_socketcall+0x186/0x370 __do_syscall+0x1da/0x208 system_call+0x82/0xb0 -> #3 (smc_client_lgr_pending){+.+.}-{3:3}: __lock_acquire+0x58e/0xbd8 lock_acquire.part.0+0xe2/0x248 lock_acquire+0xac/0x1c8 __mutex_lock+0x96/0x8e8 mutex_lock_nested+0x32/0x40 smc_connect_rdma+0xa4/0x418 [smc] __smc_connect+0x234/0x480 [smc] smc_connect+0x1d6/0x230 [smc] __sys_connect+0x90/0xc0 __do_sys_socketcall+0x186/0x370 __do_syscall+0x1da/0x208 system_call+0x82/0xb0 -> #2 (sk_lock-AF_SMC){+.+.}-{0:0}: __lock_acquire+0x58e/0xbd8 lock_acquire.part.0+0xe2/0x248 lock_acquire+0xac/0x1c8 lock_sock_nested+0x46/0xa8 smc_tx_work+0x34/0x50 [smc] process_one_work+0x30c/0x730 worker_thread+0x62/0x420 kthread+0x138/0x150 __ret_from_fork+0x3c/0x58 ret_from_fork+0xa/0x40 -> #1 ((work_completion)(&(&smc->conn.tx_work)->work)){+.+.}-{0:0}: __lock_acquire+0x58e/0xbd8 lock_acquire.part.0+0xe2/0x248 lock_acquire+0xac/0x1c8 process_one_work+0x2bc/0x730 worker_thread+0x62/0x420 kthread+0x138/0x150 __ret_from_fork+0x3c/0x58 ret_from_fork+0xa/0x40 -> #0 ((wq_completion)smc_tx_wq-00000000#2){+.+.}-{0:0}: check_prev_add+0xd8/0xe88 validate_chain+0x70c/0xb20 __lock_acquire+0x58e/0xbd8 lock_acquire.part.0+0xe2/0x248 lock_acquire+0xac/0x1c8 __flush_workqueue+0xaa/0x4f0 drain_workqueue+0xaa/0x158 destroy_workqueue+0x44/0x2d8 smc_lgr_free+0x9e/0xf8 [smc] process_one_work+0x30c/0x730 worker_thread+0x62/0x420 kthread+0x138/0x150 __ret_from_fork+0x3c/0x58 ret_from_fork+0xa/0x40 other info that might help us debug this: Chain exists of: (wq_completion)smc_tx_wq-00000000#2 --> smc_client_lgr_pending --> (work_completion)(&(&lgr->free_work)->work) Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock((work_completion)(&(&lgr->free_work)->work)); lock(smc_client_lgr_pending); lock((work_completion) (&(&lgr->free_work)->work)); lock((wq_completion)smc_tx_wq-00000000#2); *** DEADLOCK *** 2 locks held by kworker/3:0/176251: #0: 0000000080183548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x232/0x730 #1: 0000037fffe97dc8 ((work_completion) (&(&lgr->free_work)->work)){+.+.}-{0:0}, at: process_one_work+0x232/0x730 stack backtrace: CPU: 3 PID: 176251 Comm: kworker/3:0 Not tainted Hardware name: IBM 8561 T01 701 (z/VM 7.2.0) Call Trace: [<000000002983c3e4>] dump_stack_lvl+0xac/0x100 [<0000000028b477ae>] check_noncircular+0x13e/0x160 [<0000000028b48808>] check_prev_add+0xd8/0xe88 [<0000000028b49cc4>] validate_chain+0x70c/0xb20 [<0000000028b4bd26>] __lock_acquire+0x58e/0xbd8 [<0000000028b4cf6a>] lock_acquire.part.0+0xe2/0x248 [<0000000028b4d17c>] lock_acquire+0xac/0x1c8 [<0000000028addaaa>] __flush_workqueue+0xaa/0x4f0 [<0000000028addf9a>] drain_workqueue+0xaa/0x158 [<0000000028ae303c>] destroy_workqueue+0x44/0x2d8 [<000003ff8029af26>] smc_lgr_free+0x9e/0xf8 [smc] [<0000000028adf3d4>] process_one_work+0x30c/0x730 [<0000000028adf85a>] worker_thread+0x62/0x420 [<0000000028aeac50>] kthread+0x138/0x150 [<0000000028a63914>] __ret_from_fork+0x3c/0x58 [<00000000298503da>] ret_from_fork+0xa/0x40 INFO: lockdep is turned off. ===================================================================
This deadlock occurs because cancel_delayed_work_sync() waits for the work(&lgr->free_work) to finish, while the &lgr->free_work waits for the work(lgr->tx_wq), which needs the sk_lock-AF_SMC, that is already used under the mutex_lock.
The solution is to use cancel_delayed_work() instead, which kills off a pending work.
Fixes: a52bcc919b14 ("net/smc: improve termination processing") Signed-off-by: Wenjia Zhang wenjia@linux.ibm.com Reviewed-by: Jan Karcher jaka@linux.ibm.com Reviewed-by: Karsten Graul kgraul@linux.ibm.com Reviewed-by: Tony Lu tonylu@linux.alibaba.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/smc/smc_core.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/smc/smc_core.c b/net/smc/smc_core.c index bf485a2017a4e..e84241ff4ac4f 100644 --- a/net/smc/smc_core.c +++ b/net/smc/smc_core.c @@ -912,7 +912,7 @@ static void __smc_lgr_terminate(struct smc_link_group *lgr, bool soft) if (lgr->terminating) return; /* lgr already terminating */ /* cancel free_work sync, will terminate when lgr->freeing is set */ - cancel_delayed_work_sync(&lgr->free_work); + cancel_delayed_work(&lgr->free_work); lgr->terminating = 1;
/* kill remaining link group connections */
From: Szymon Heidrich szymon.heidrich@gmail.com
[ Upstream commit d8b228318935044dafe3a5bc07ee71a1f1424b8d ]
Packet length retrieved from skb data may be larger than the actual socket buffer length (up to 9026 bytes). In such case the cloned skb passed up the network stack will leak kernel memory contents.
Fixes: d0cad871703b ("smsc75xx: SMSC LAN75xx USB gigabit ethernet adapter driver") Signed-off-by: Szymon Heidrich szymon.heidrich@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/usb/smsc75xx.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/net/usb/smsc75xx.c b/drivers/net/usb/smsc75xx.c index 378a12ae2d957..0b3d11e28faa7 100644 --- a/drivers/net/usb/smsc75xx.c +++ b/drivers/net/usb/smsc75xx.c @@ -2211,7 +2211,8 @@ static int smsc75xx_rx_fixup(struct usbnet *dev, struct sk_buff *skb) dev->net->stats.rx_frame_errors++; } else { /* MAX_SINGLE_PACKET_SIZE + 4(CRC) + 2(COE) + 4(Vlan) */ - if (unlikely(size > (MAX_SINGLE_PACKET_SIZE + ETH_HLEN + 12))) { + if (unlikely(size > (MAX_SINGLE_PACKET_SIZE + ETH_HLEN + 12) || + size > skb->len)) { netif_dbg(dev, rx_err, dev->net, "size err rx_cmd_a=0x%08x\n", rx_cmd_a);
From: Liu Ying victor.liu@nxp.com
[ Upstream commit 0d3c9333d976af41d7dbc6bf4d9d2e95fbdf9c89 ]
The returned array size for input formats is set through atomic_get_input_bus_fmts()'s 'num_input_fmts' argument, so use 'num_input_fmts' to represent the array size in the function's kdoc, not 'num_output_fmts'.
Fixes: 91ea83306bfa ("drm/bridge: Fix the bridge kernel doc") Fixes: f32df58acc68 ("drm/bridge: Add the necessary bits to support bus format negotiation") Signed-off-by: Liu Ying victor.liu@nxp.com Reviewed-by: Robert Foss rfoss@kernel.org Signed-off-by: Neil Armstrong neil.armstrong@linaro.org Link: https://patchwork.freedesktop.org/patch/msgid/20230314055035.3731179-1-victo... Signed-off-by: Sasha Levin sashal@kernel.org --- include/drm/drm_bridge.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/include/drm/drm_bridge.h b/include/drm/drm_bridge.h index 2195daa289d27..055486e35e68f 100644 --- a/include/drm/drm_bridge.h +++ b/include/drm/drm_bridge.h @@ -427,11 +427,11 @@ struct drm_bridge_funcs { * * The returned array must be allocated with kmalloc() and will be * freed by the caller. If the allocation fails, NULL should be - * returned. num_output_fmts must be set to the returned array size. + * returned. num_input_fmts must be set to the returned array size. * Formats listed in the returned array should be listed in decreasing * preference order (the core will try all formats until it finds one * that works). When the format is not supported NULL should be - * returned and num_output_fmts should be set to 0. + * returned and num_input_fmts should be set to 0. * * This method is called on all elements of the bridge chain as part of * the bus format negotiation process that happens in
From: Damien Le Moal damien.lemoal@wdc.com
[ Upstream commit eebf34a85c8c724676eba502d15202854f199b05 ]
Move null_blk driver code into the new sub-directory drivers/block/null_blk.
Suggested-by: Bart Van Assche bvanassche@acm.org Signed-off-by: Damien Le Moal damien.lemoal@wdc.com Reviewed-by: Johannes Thumshirn johannes.thumshirn@wdc.com Signed-off-by: Jens Axboe axboe@kernel.dk Stable-dep-of: 63f886597085 ("block: null_blk: Fix handling of fake timeout request") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/block/Kconfig | 8 +------- drivers/block/Makefile | 7 +------ drivers/block/null_blk/Kconfig | 12 ++++++++++++ drivers/block/null_blk/Makefile | 11 +++++++++++ drivers/block/{null_blk_main.c => null_blk/main.c} | 0 drivers/block/{ => null_blk}/null_blk.h | 0 drivers/block/{null_blk_trace.c => null_blk/trace.c} | 2 +- drivers/block/{null_blk_trace.h => null_blk/trace.h} | 2 +- drivers/block/{null_blk_zoned.c => null_blk/zoned.c} | 2 +- 9 files changed, 28 insertions(+), 16 deletions(-) create mode 100644 drivers/block/null_blk/Kconfig create mode 100644 drivers/block/null_blk/Makefile rename drivers/block/{null_blk_main.c => null_blk/main.c} (100%) rename drivers/block/{ => null_blk}/null_blk.h (100%) rename drivers/block/{null_blk_trace.c => null_blk/trace.c} (93%) rename drivers/block/{null_blk_trace.h => null_blk/trace.h} (97%) rename drivers/block/{null_blk_zoned.c => null_blk/zoned.c} (99%)
diff --git a/drivers/block/Kconfig b/drivers/block/Kconfig index 40c53632512b7..9617688b58b32 100644 --- a/drivers/block/Kconfig +++ b/drivers/block/Kconfig @@ -16,13 +16,7 @@ menuconfig BLK_DEV
if BLK_DEV
-config BLK_DEV_NULL_BLK - tristate "Null test block driver" - select CONFIGFS_FS - -config BLK_DEV_NULL_BLK_FAULT_INJECTION - bool "Support fault injection for Null test block driver" - depends on BLK_DEV_NULL_BLK && FAULT_INJECTION +source "drivers/block/null_blk/Kconfig"
config BLK_DEV_FD tristate "Normal floppy disk support" diff --git a/drivers/block/Makefile b/drivers/block/Makefile index e1f63117ee94f..a3170859e01d4 100644 --- a/drivers/block/Makefile +++ b/drivers/block/Makefile @@ -41,12 +41,7 @@ obj-$(CONFIG_BLK_DEV_RSXX) += rsxx/ obj-$(CONFIG_ZRAM) += zram/ obj-$(CONFIG_BLK_DEV_RNBD) += rnbd/
-obj-$(CONFIG_BLK_DEV_NULL_BLK) += null_blk.o -null_blk-objs := null_blk_main.o -ifeq ($(CONFIG_BLK_DEV_ZONED), y) -null_blk-$(CONFIG_TRACING) += null_blk_trace.o -endif -null_blk-$(CONFIG_BLK_DEV_ZONED) += null_blk_zoned.o +obj-$(CONFIG_BLK_DEV_NULL_BLK) += null_blk/
skd-y := skd_main.o swim_mod-y := swim.o swim_asm.o diff --git a/drivers/block/null_blk/Kconfig b/drivers/block/null_blk/Kconfig new file mode 100644 index 0000000000000..6bf1f8ca20a24 --- /dev/null +++ b/drivers/block/null_blk/Kconfig @@ -0,0 +1,12 @@ +# SPDX-License-Identifier: GPL-2.0 +# +# Null block device driver configuration +# + +config BLK_DEV_NULL_BLK + tristate "Null test block driver" + select CONFIGFS_FS + +config BLK_DEV_NULL_BLK_FAULT_INJECTION + bool "Support fault injection for Null test block driver" + depends on BLK_DEV_NULL_BLK && FAULT_INJECTION diff --git a/drivers/block/null_blk/Makefile b/drivers/block/null_blk/Makefile new file mode 100644 index 0000000000000..84c36e512ab89 --- /dev/null +++ b/drivers/block/null_blk/Makefile @@ -0,0 +1,11 @@ +# SPDX-License-Identifier: GPL-2.0 + +# needed for trace events +ccflags-y += -I$(src) + +obj-$(CONFIG_BLK_DEV_NULL_BLK) += null_blk.o +null_blk-objs := main.o +ifeq ($(CONFIG_BLK_DEV_ZONED), y) +null_blk-$(CONFIG_TRACING) += trace.o +endif +null_blk-$(CONFIG_BLK_DEV_ZONED) += zoned.o diff --git a/drivers/block/null_blk_main.c b/drivers/block/null_blk/main.c similarity index 100% rename from drivers/block/null_blk_main.c rename to drivers/block/null_blk/main.c diff --git a/drivers/block/null_blk.h b/drivers/block/null_blk/null_blk.h similarity index 100% rename from drivers/block/null_blk.h rename to drivers/block/null_blk/null_blk.h diff --git a/drivers/block/null_blk_trace.c b/drivers/block/null_blk/trace.c similarity index 93% rename from drivers/block/null_blk_trace.c rename to drivers/block/null_blk/trace.c index f246e7bff6982..3711cba160715 100644 --- a/drivers/block/null_blk_trace.c +++ b/drivers/block/null_blk/trace.c @@ -4,7 +4,7 @@ * * Copyright (C) 2020 Western Digital Corporation or its affiliates. */ -#include "null_blk_trace.h" +#include "trace.h"
/* * Helper to use for all null_blk traces to extract disk name. diff --git a/drivers/block/null_blk_trace.h b/drivers/block/null_blk/trace.h similarity index 97% rename from drivers/block/null_blk_trace.h rename to drivers/block/null_blk/trace.h index 4f83032eb5441..ce3b430e88c57 100644 --- a/drivers/block/null_blk_trace.h +++ b/drivers/block/null_blk/trace.h @@ -73,7 +73,7 @@ TRACE_EVENT(nullb_report_zones, #undef TRACE_INCLUDE_PATH #define TRACE_INCLUDE_PATH . #undef TRACE_INCLUDE_FILE -#define TRACE_INCLUDE_FILE null_blk_trace +#define TRACE_INCLUDE_FILE trace
/* This part must be outside protection */ #include <trace/define_trace.h> diff --git a/drivers/block/null_blk_zoned.c b/drivers/block/null_blk/zoned.c similarity index 99% rename from drivers/block/null_blk_zoned.c rename to drivers/block/null_blk/zoned.c index f5df82c26c16f..41220ce59659b 100644 --- a/drivers/block/null_blk_zoned.c +++ b/drivers/block/null_blk/zoned.c @@ -4,7 +4,7 @@ #include "null_blk.h"
#define CREATE_TRACE_POINTS -#include "null_blk_trace.h" +#include "trace.h"
#define MB_TO_SECTS(mb) (((sector_t)mb * SZ_1M) >> SECTOR_SHIFT)
From: Damien Le Moal damien.lemoal@opensource.wdc.com
[ Upstream commit 63f886597085f346276e3b3c8974de0100d65f32 ]
When injecting a fake timeout into the null_blk driver using fail_io_timeout, the request timeout handler does not execute blk_mq_complete_request(), so the complete callback is never executed for a timedout request.
The null_blk driver also has a driver-specific fake timeout mechanism which does not have this problem. Fix the problem with fail_io_timeout by using the same meachanism as null_blk internal timeout feature, using the fake_timeout field of null_blk commands.
Reported-by: Akinobu Mita akinobu.mita@gmail.com Fixes: de3510e52b0a ("null_blk: fix command timeout completion handling") Signed-off-by: Damien Le Moal damien.lemoal@opensource.wdc.com Reviewed-by: Johannes Thumshirn johannes.thumshirn@wdc.com Link: https://lore.kernel.org/r/20230314041106.19173-2-damien.lemoal@opensource.wd... Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/block/null_blk/main.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/block/null_blk/main.c b/drivers/block/null_blk/main.c index c6ba8f9f3f311..25db095e943b7 100644 --- a/drivers/block/null_blk/main.c +++ b/drivers/block/null_blk/main.c @@ -1309,8 +1309,7 @@ static inline void nullb_complete_cmd(struct nullb_cmd *cmd) case NULL_IRQ_SOFTIRQ: switch (cmd->nq->dev->queue_mode) { case NULL_Q_MQ: - if (likely(!blk_should_fake_timeout(cmd->rq->q))) - blk_mq_complete_request(cmd->rq); + blk_mq_complete_request(cmd->rq); break; case NULL_Q_BIO: /* @@ -1486,7 +1485,8 @@ static blk_status_t null_queue_rq(struct blk_mq_hw_ctx *hctx, cmd->rq = bd->rq; cmd->error = BLK_STS_OK; cmd->nq = nq; - cmd->fake_timeout = should_timeout_request(bd->rq); + cmd->fake_timeout = should_timeout_request(bd->rq) || + blk_should_fake_timeout(bd->rq->q);
blk_mq_start_request(bd->rq);
From: Ming Lei ming.lei@redhat.com
[ Upstream commit 37f0dc2ec78af0c3f35dd05578763de059f6fe77 ]
When investigating one customer report on warning in nvme_setup_discard, we observed the controller(nvme/tcp) actually exposes queue_max_discard_segments(req->q) == 1.
Obviously the current code can't handle this situation, since contiguity merge like normal RW request is taken.
Fix the issue by building range from request sector/nr_sectors directly.
Fixes: b35ba01ea697 ("nvme: support ranged discard requests") Signed-off-by: Ming Lei ming.lei@redhat.com Reviewed-by: Chaitanya Kulkarni kch@nvidia.com Signed-off-by: Christoph Hellwig hch@lst.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/nvme/host/core.c | 28 +++++++++++++++++++--------- 1 file changed, 19 insertions(+), 9 deletions(-)
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c index e162f1dfbafe9..a4b6aa932a8fe 100644 --- a/drivers/nvme/host/core.c +++ b/drivers/nvme/host/core.c @@ -723,16 +723,26 @@ static blk_status_t nvme_setup_discard(struct nvme_ns *ns, struct request *req, range = page_address(ns->ctrl->discard_page); }
- __rq_for_each_bio(bio, req) { - u64 slba = nvme_sect_to_lba(ns, bio->bi_iter.bi_sector); - u32 nlb = bio->bi_iter.bi_size >> ns->lba_shift; - - if (n < segments) { - range[n].cattr = cpu_to_le32(0); - range[n].nlb = cpu_to_le32(nlb); - range[n].slba = cpu_to_le64(slba); + if (queue_max_discard_segments(req->q) == 1) { + u64 slba = nvme_sect_to_lba(ns, blk_rq_pos(req)); + u32 nlb = blk_rq_sectors(req) >> (ns->lba_shift - 9); + + range[0].cattr = cpu_to_le32(0); + range[0].nlb = cpu_to_le32(nlb); + range[0].slba = cpu_to_le64(slba); + n = 1; + } else { + __rq_for_each_bio(bio, req) { + u64 slba = nvme_sect_to_lba(ns, bio->bi_iter.bi_sector); + u32 nlb = bio->bi_iter.bi_size >> ns->lba_shift; + + if (n < segments) { + range[n].cattr = cpu_to_le32(0); + range[n].nlb = cpu_to_le32(nlb); + range[n].slba = cpu_to_le64(slba); + } + n++; } - n++; }
if (WARN_ON_ONCE(n != segments)) {
From: Damien Le Moal damien.lemoal@opensource.wdc.com
[ Upstream commit 6173a77b7e9d3e202bdb9897b23f2a8afe7bf286 ]
An nvme target ->queue_response() operation implementation may free the request passed as argument. Such implementation potentially could result in a use after free of the request pointer when percpu_ref_put() is called in nvmet_req_complete().
Avoid such problem by using a local variable to save the sq pointer before calling __nvmet_req_complete(), thus avoiding dereferencing the req pointer after that function call.
Fixes: a07b4970f464 ("nvmet: add a generic NVMe target") Signed-off-by: Damien Le Moal damien.lemoal@opensource.wdc.com Reviewed-by: Chaitanya Kulkarni kch@nvidia.com Signed-off-by: Christoph Hellwig hch@lst.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/nvme/target/core.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/nvme/target/core.c b/drivers/nvme/target/core.c index bc88ff2912f56..a82a0796a6148 100644 --- a/drivers/nvme/target/core.c +++ b/drivers/nvme/target/core.c @@ -749,8 +749,10 @@ static void __nvmet_req_complete(struct nvmet_req *req, u16 status)
void nvmet_req_complete(struct nvmet_req *req, u16 status) { + struct nvmet_sq *sq = req->sq; + __nvmet_req_complete(req, status); - percpu_ref_put(&req->sq->ref); + percpu_ref_put(&sq->ref); } EXPORT_SYMBOL_GPL(nvmet_req_complete);
From: Liang He windhl@126.com
[ Upstream commit 6030363199e3a6341afb467ddddbed56640cbf6a ]
In vdc_port_probe(), we should check the return value of mdesc_grab() as it may return NULL, which can cause potential NPD bug.
Fixes: 43fdf27470b2 ("[SPARC64]: Abstract out mdesc accesses for better MD update handling.") Signed-off-by: Liang He windhl@126.com Link: https://lore.kernel.org/r/20230315062032.1741692-1-windhl@126.com [axboe: style cleanup] Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/block/sunvdc.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/block/sunvdc.c b/drivers/block/sunvdc.c index 39aeebc6837da..d9e41d3bbe717 100644 --- a/drivers/block/sunvdc.c +++ b/drivers/block/sunvdc.c @@ -984,6 +984,8 @@ static int vdc_port_probe(struct vio_dev *vdev, const struct vio_device_id *id) print_version();
hp = mdesc_grab(); + if (!hp) + return -ENODEV;
err = -ENODEV; if ((vdev->dev_no << PARTITION_SHIFT) & ~(u64)MINORMASK) {
From: Maciej Fijalkowski maciej.fijalkowski@intel.com
[ Upstream commit b830c9642386867863ac64295185f896ff2928ac ]
ice_qp_dis() intends to stop a given queue pair that is a target of xsk pool attach/detach. One of the steps is to disable interrupts on these queues. It currently is broken in a way that txq irq is turned off *after* HW flush which in turn takes no effect.
ice_qp_dis(): -> ice_qvec_dis_irq() --> disable rxq irq --> flush hw -> ice_vsi_stop_tx_ring() -->disable txq irq
Below splat can be triggered by following steps: - start xdpsock WITHOUT loading xdp prog - run xdp_rxq_info with XDP_TX action on this interface - start traffic - terminate xdpsock
[ 256.312485] BUG: kernel NULL pointer dereference, address: 0000000000000018 [ 256.319560] #PF: supervisor read access in kernel mode [ 256.324775] #PF: error_code(0x0000) - not-present page [ 256.329994] PGD 0 P4D 0 [ 256.332574] Oops: 0000 [#1] PREEMPT SMP NOPTI [ 256.337006] CPU: 3 PID: 32 Comm: ksoftirqd/3 Tainted: G OE 6.2.0-rc5+ #51 [ 256.345218] Hardware name: Intel Corporation S2600WFT/S2600WFT, BIOS SE5C620.86B.02.01.0008.031920191559 03/19/2019 [ 256.355807] RIP: 0010:ice_clean_rx_irq_zc+0x9c/0x7d0 [ice] [ 256.361423] Code: b7 8f 8a 00 00 00 66 39 ca 0f 84 f1 04 00 00 49 8b 47 40 4c 8b 24 d0 41 0f b7 45 04 66 25 ff 3f 66 89 04 24 0f 84 85 02 00 00 <49> 8b 44 24 18 0f b7 14 24 48 05 00 01 00 00 49 89 04 24 49 89 44 [ 256.380463] RSP: 0018:ffffc900088bfd20 EFLAGS: 00010206 [ 256.385765] RAX: 000000000000003c RBX: 0000000000000035 RCX: 000000000000067f [ 256.393012] RDX: 0000000000000775 RSI: 0000000000000000 RDI: ffff8881deb3ac80 [ 256.400256] RBP: 000000000000003c R08: ffff889847982710 R09: 0000000000010000 [ 256.407500] R10: ffffffff82c060c0 R11: 0000000000000004 R12: 0000000000000000 [ 256.414746] R13: ffff88811165eea0 R14: ffffc9000d255000 R15: ffff888119b37600 [ 256.421990] FS: 0000000000000000(0000) GS:ffff8897e0cc0000(0000) knlGS:0000000000000000 [ 256.430207] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 256.436036] CR2: 0000000000000018 CR3: 0000000005c0a006 CR4: 00000000007706e0 [ 256.443283] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 256.450527] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 256.457770] PKRU: 55555554 [ 256.460529] Call Trace: [ 256.463015] <TASK> [ 256.465157] ? ice_xmit_zc+0x6e/0x150 [ice] [ 256.469437] ice_napi_poll+0x46d/0x680 [ice] [ 256.473815] ? _raw_spin_unlock_irqrestore+0x1b/0x40 [ 256.478863] __napi_poll+0x29/0x160 [ 256.482409] net_rx_action+0x136/0x260 [ 256.486222] __do_softirq+0xe8/0x2e5 [ 256.489853] ? smpboot_thread_fn+0x2c/0x270 [ 256.494108] run_ksoftirqd+0x2a/0x50 [ 256.497747] smpboot_thread_fn+0x1c1/0x270 [ 256.501907] ? __pfx_smpboot_thread_fn+0x10/0x10 [ 256.506594] kthread+0xea/0x120 [ 256.509785] ? __pfx_kthread+0x10/0x10 [ 256.513597] ret_from_fork+0x29/0x50 [ 256.517238] </TASK>
In fact, irqs were not disabled and napi managed to be scheduled and run while xsk_pool pointer was still valid, but SW ring of xdp_buff pointers was already freed.
To fix this, call ice_qvec_dis_irq() after ice_vsi_stop_tx_ring(). Also while at it, remove redundant ice_clean_rx_ring() call - this is handled in ice_qp_clean_rings().
Fixes: 2d4238f55697 ("ice: Add support for AF_XDP") Signed-off-by: Maciej Fijalkowski maciej.fijalkowski@intel.com Reviewed-by: Larysa Zaremba larysa.zaremba@intel.com Tested-by: Chandan Kumar Rout chandanx.rout@intel.com (A Contingent Worker at Intel) Acked-by: John Fastabend john.fastabend@gmail.com Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Reviewed-by: Leon Romanovsky leonro@nvidia.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/ice/ice_xsk.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/intel/ice/ice_xsk.c b/drivers/net/ethernet/intel/ice/ice_xsk.c index 59963b901be0f..e0790df700e2c 100644 --- a/drivers/net/ethernet/intel/ice/ice_xsk.c +++ b/drivers/net/ethernet/intel/ice/ice_xsk.c @@ -169,8 +169,6 @@ static int ice_qp_dis(struct ice_vsi *vsi, u16 q_idx) } netif_tx_stop_queue(netdev_get_tx_queue(vsi->netdev, q_idx));
- ice_qvec_dis_irq(vsi, rx_ring, q_vector); - ice_fill_txq_meta(vsi, tx_ring, &txq_meta); err = ice_vsi_stop_tx_ring(vsi, ICE_NO_RESET, 0, tx_ring, &txq_meta); if (err) @@ -185,6 +183,8 @@ static int ice_qp_dis(struct ice_vsi *vsi, u16 q_idx) if (err) return err; } + ice_qvec_dis_irq(vsi, rx_ring, q_vector); + err = ice_vsi_ctrl_one_rx_ring(vsi, false, q_idx, true); if (err) return err;
From: Vladimir Oltean vladimir.oltean@nxp.com
[ Upstream commit 7e9517375a14f44ee830ca1c3278076dd65fcc8f ]
There are 3 classes of switch families that the driver is aware of, as far as mv88e6xxx_change_mtu() is concerned:
- MTU configuration is available per port. Here, the chip->info->ops->port_set_jumbo_size() method will be present.
- MTU configuration is global to the switch. Here, the chip->info->ops->set_max_frame_size() method will be present.
- We don't know how to change the MTU. Here, none of the above methods will be present.
Switch families MV88E6165, MV88E6191, MV88E6220, MV88E6250 and MV88E6290 fall in category 3.
The blamed commit has adjusted the MTU for all 3 categories by EDSA_HLEN (8 bytes), resulting in a new maximum MTU of 1492 being reported by the driver for these switches.
I don't have the hardware to test, but I do have a MV88E6390 switch on which I can simulate this by commenting out its .port_set_jumbo_size definition from mv88e6390_ops. The result is this set of messages at probe time:
mv88e6085 d0032004.mdio-mii:10: nonfatal error -34 setting MTU to 1500 on port 1 mv88e6085 d0032004.mdio-mii:10: nonfatal error -34 setting MTU to 1500 on port 2 mv88e6085 d0032004.mdio-mii:10: nonfatal error -34 setting MTU to 1500 on port 3 mv88e6085 d0032004.mdio-mii:10: nonfatal error -34 setting MTU to 1500 on port 4 mv88e6085 d0032004.mdio-mii:10: nonfatal error -34 setting MTU to 1500 on port 5 mv88e6085 d0032004.mdio-mii:10: nonfatal error -34 setting MTU to 1500 on port 6 mv88e6085 d0032004.mdio-mii:10: nonfatal error -34 setting MTU to 1500 on port 7 mv88e6085 d0032004.mdio-mii:10: nonfatal error -34 setting MTU to 1500 on port 8
It is highly implausible that there exist Ethernet switches which don't support the standard MTU of 1500 octets, and this is what the DSA framework says as well - the error comes from dsa_slave_create() -> dsa_slave_change_mtu(slave_dev, ETH_DATA_LEN).
But the error messages are alarming, and it would be good to suppress them.
As a consequence of this unlikeliness, we reimplement mv88e6xxx_get_max_mtu() and mv88e6xxx_change_mtu() on switches from the 3rd category as follows: the maximum supported MTU is 1500, and any request to set the MTU to a value larger than that fails in dev_validate_mtu().
Fixes: b9c587fed61c ("dsa: mv88e6xxx: Include tagger overhead when setting MTU for DSA and CPU ports") Signed-off-by: Vladimir Oltean vladimir.oltean@nxp.com Reviewed-by: Simon Horman simon.horman@corigine.com Reviewed-by: Florian Fainelli f.fainelli@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/dsa/mv88e6xxx/chip.c | 16 ++++++++++++---- 1 file changed, 12 insertions(+), 4 deletions(-)
diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c index 371b345635e62..a253476a52b01 100644 --- a/drivers/net/dsa/mv88e6xxx/chip.c +++ b/drivers/net/dsa/mv88e6xxx/chip.c @@ -2734,7 +2734,7 @@ static int mv88e6xxx_get_max_mtu(struct dsa_switch *ds, int port) return 10240 - VLAN_ETH_HLEN - EDSA_HLEN - ETH_FCS_LEN; else if (chip->info->ops->set_max_frame_size) return 1632 - VLAN_ETH_HLEN - EDSA_HLEN - ETH_FCS_LEN; - return 1522 - VLAN_ETH_HLEN - EDSA_HLEN - ETH_FCS_LEN; + return ETH_DATA_LEN; }
static int mv88e6xxx_change_mtu(struct dsa_switch *ds, int port, int new_mtu) @@ -2742,6 +2742,17 @@ static int mv88e6xxx_change_mtu(struct dsa_switch *ds, int port, int new_mtu) struct mv88e6xxx_chip *chip = ds->priv; int ret = 0;
+ /* For families where we don't know how to alter the MTU, + * just accept any value up to ETH_DATA_LEN + */ + if (!chip->info->ops->port_set_jumbo_size && + !chip->info->ops->set_max_frame_size) { + if (new_mtu > ETH_DATA_LEN) + return -EINVAL; + + return 0; + } + if (dsa_is_dsa_port(ds, port) || dsa_is_cpu_port(ds, port)) new_mtu += EDSA_HLEN;
@@ -2750,9 +2761,6 @@ static int mv88e6xxx_change_mtu(struct dsa_switch *ds, int port, int new_mtu) ret = chip->info->ops->port_set_jumbo_size(chip, port, new_mtu); else if (chip->info->ops->set_max_frame_size) ret = chip->info->ops->set_max_frame_size(chip, new_mtu); - else - if (new_mtu > 1522) - ret = -EINVAL; mv88e6xxx_reg_unlock(chip);
return ret;
From: Ido Schimmel idosch@nvidia.com
[ Upstream commit 8a2618e14f81604a9b6ad305d57e0c8da939cd65 ]
Commit f96a3d74554d ("ipv4: Fix incorrect route flushing when source address is deleted") started to take the table ID field in the FIB info structure into account when determining if two structures are identical or not. This field is initialized using the 'fc_table' field in the route configuration structure, which is not set when adding a route via IOCTL.
The above can result in user space being able to install two identical routes that only differ in the table ID field of their associated FIB info.
Fix by initializing the table ID field in the route configuration structure in the IOCTL path.
Before the fix:
# ip route add default via 192.0.2.2 # route add default gw 192.0.2.2 # ip -4 r show default # default via 192.0.2.2 dev dummy10 # default via 192.0.2.2 dev dummy10
After the fix:
# ip route add default via 192.0.2.2 # route add default gw 192.0.2.2 SIOCADDRT: File exists # ip -4 r show default default via 192.0.2.2 dev dummy10
Audited the code paths to ensure there are no other paths that do not properly initialize the route configuration structure when installing a route.
Fixes: 5a56a0b3a45d ("net: Don't delete routes in different VRFs") Fixes: f96a3d74554d ("ipv4: Fix incorrect route flushing when source address is deleted") Reported-by: gaoxingwang gaoxingwang1@huawei.com Link: https://lore.kernel.org/netdev/20230314144159.2354729-1-gaoxingwang1@huawei.... Tested-by: gaoxingwang gaoxingwang1@huawei.com Signed-off-by: Ido Schimmel idosch@nvidia.com Reviewed-by: David Ahern dsahern@kernel.org Link: https://lore.kernel.org/r/20230315124009.4015212-1-idosch@nvidia.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/ipv4/fib_frontend.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/net/ipv4/fib_frontend.c b/net/ipv4/fib_frontend.c index 5f786ef662ead..41f890bf9d4c4 100644 --- a/net/ipv4/fib_frontend.c +++ b/net/ipv4/fib_frontend.c @@ -573,6 +573,9 @@ static int rtentry_to_fib_config(struct net *net, int cmd, struct rtentry *rt, cfg->fc_scope = RT_SCOPE_UNIVERSE; }
+ if (!cfg->fc_table) + cfg->fc_table = RT_TABLE_MAIN; + if (cmd == SIOCDELRT) return 0;
From: Szymon Heidrich szymon.heidrich@gmail.com
[ Upstream commit 43ffe6caccc7a1bb9d7442fbab521efbf6c1378c ]
Packet length check needs to be located after size and align_count calculation to prevent kernel panic in skb_pull() in case rx_cmd_a & RX_CMD_A_RED evaluates to true.
Fixes: d8b228318935 ("net: usb: smsc75xx: Limit packet length to skb->len") Signed-off-by: Szymon Heidrich szymon.heidrich@gmail.com Link: https://lore.kernel.org/r/20230316110540.77531-1-szymon.heidrich@gmail.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/usb/smsc75xx.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/drivers/net/usb/smsc75xx.c b/drivers/net/usb/smsc75xx.c index 0b3d11e28faa7..fb1389bd09392 100644 --- a/drivers/net/usb/smsc75xx.c +++ b/drivers/net/usb/smsc75xx.c @@ -2199,6 +2199,13 @@ static int smsc75xx_rx_fixup(struct usbnet *dev, struct sk_buff *skb) size = (rx_cmd_a & RX_CMD_A_LEN) - RXW_PADDING; align_count = (4 - ((size + RXW_PADDING) % 4)) % 4;
+ if (unlikely(size > skb->len)) { + netif_dbg(dev, rx_err, dev->net, + "size err rx_cmd_a=0x%08x\n", + rx_cmd_a); + return 0; + } + if (unlikely(rx_cmd_a & RX_CMD_A_RED)) { netif_dbg(dev, rx_err, dev->net, "Error rx_cmd_a=0x%08x\n", rx_cmd_a); @@ -2211,8 +2218,7 @@ static int smsc75xx_rx_fixup(struct usbnet *dev, struct sk_buff *skb) dev->net->stats.rx_frame_errors++; } else { /* MAX_SINGLE_PACKET_SIZE + 4(CRC) + 2(COE) + 4(Vlan) */ - if (unlikely(size > (MAX_SINGLE_PACKET_SIZE + ETH_HLEN + 12) || - size > skb->len)) { + if (unlikely(size > (MAX_SINGLE_PACKET_SIZE + ETH_HLEN + 12))) { netif_dbg(dev, rx_err, dev->net, "size err rx_cmd_a=0x%08x\n", rx_cmd_a);
From: Alexandra Winter wintera@linux.ibm.com
[ Upstream commit 3d87debb8ed2649608ff432699e7c961c0c6f03b ]
iucv_irq_data needs to be 4 bytes larger. These bytes are not used by the iucv module, but written by the z/VM hypervisor in case a CPU is deconfigured.
Reported as: BUG dma-kmalloc-64 (Not tainted): kmalloc Redzone overwritten ----------------------------------------------------------------------------- 0x0000000000400564-0x0000000000400567 @offset=1380. First byte 0x80 instead of 0xcc Allocated in iucv_cpu_prepare+0x44/0xd0 age=167839 cpu=2 pid=1 __kmem_cache_alloc_node+0x166/0x450 kmalloc_node_trace+0x3a/0x70 iucv_cpu_prepare+0x44/0xd0 cpuhp_invoke_callback+0x156/0x2f0 cpuhp_issue_call+0xf0/0x298 __cpuhp_setup_state_cpuslocked+0x136/0x338 __cpuhp_setup_state+0xf4/0x288 iucv_init+0xf4/0x280 do_one_initcall+0x78/0x390 do_initcalls+0x11a/0x140 kernel_init_freeable+0x25e/0x2a0 kernel_init+0x2e/0x170 __ret_from_fork+0x3c/0x58 ret_from_fork+0xa/0x40 Freed in iucv_init+0x92/0x280 age=167839 cpu=2 pid=1 __kmem_cache_free+0x308/0x358 iucv_init+0x92/0x280 do_one_initcall+0x78/0x390 do_initcalls+0x11a/0x140 kernel_init_freeable+0x25e/0x2a0 kernel_init+0x2e/0x170 __ret_from_fork+0x3c/0x58 ret_from_fork+0xa/0x40 Slab 0x0000037200010000 objects=32 used=30 fp=0x0000000000400640 flags=0x1ffff00000010200(slab|head|node=0|zone=0| Object 0x0000000000400540 @offset=1344 fp=0x0000000000000000 Redzone 0000000000400500: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc ................ Redzone 0000000000400510: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc ................ Redzone 0000000000400520: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc ................ Redzone 0000000000400530: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc ................ Object 0000000000400540: 00 01 00 03 00 00 00 00 00 00 00 00 00 00 00 00 ................ Object 0000000000400550: f3 86 81 f2 f4 82 f8 82 f0 f0 f0 f0 f0 f0 f0 f2 ................ Object 0000000000400560: 00 00 00 00 80 00 00 00 cc cc cc cc cc cc cc cc ................ Object 0000000000400570: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc ................ Redzone 0000000000400580: cc cc cc cc cc cc cc cc ........ Padding 00000000004005d4: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ Padding 00000000004005e4: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ Padding 00000000004005f4: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZ CPU: 6 PID: 121030 Comm: 116-pai-crypto. Not tainted 6.3.0-20230221.rc0.git4.99b8246b2d71.300.fc37.s390x+debug #1 Hardware name: IBM 3931 A01 704 (z/VM 7.3.0) Call Trace: [<000000032aa034ec>] dump_stack_lvl+0xac/0x100 [<0000000329f5a6cc>] check_bytes_and_report+0x104/0x140 [<0000000329f5aa78>] check_object+0x370/0x3c0 [<0000000329f5ede6>] free_debug_processing+0x15e/0x348 [<0000000329f5f06a>] free_to_partial_list+0x9a/0x2f0 [<0000000329f5f4a4>] __slab_free+0x1e4/0x3a8 [<0000000329f61768>] __kmem_cache_free+0x308/0x358 [<000000032a91465c>] iucv_cpu_dead+0x6c/0x88 [<0000000329c2fc66>] cpuhp_invoke_callback+0x156/0x2f0 [<000000032aa062da>] _cpu_down.constprop.0+0x22a/0x5e0 [<0000000329c3243e>] cpu_device_down+0x4e/0x78 [<000000032a61dee0>] device_offline+0xc8/0x118 [<000000032a61e048>] online_store+0x60/0xe0 [<000000032a08b6b0>] kernfs_fop_write_iter+0x150/0x1e8 [<0000000329fab65c>] vfs_write+0x174/0x360 [<0000000329fab9fc>] ksys_write+0x74/0x100 [<000000032aa03a5a>] __do_syscall+0x1da/0x208 [<000000032aa177b2>] system_call+0x82/0xb0 INFO: lockdep is turned off. FIX dma-kmalloc-64: Restoring kmalloc Redzone 0x0000000000400564-0x0000000000400567=0xcc FIX dma-kmalloc-64: Object at 0x0000000000400540 not freed
Fixes: 2356f4cb1911 ("[S390]: Rewrite of the IUCV base code, part 2") Signed-off-by: Alexandra Winter wintera@linux.ibm.com Link: https://lore.kernel.org/r/20230315131435.4113889-1-wintera@linux.ibm.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/iucv/iucv.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/iucv/iucv.c b/net/iucv/iucv.c index 349c6ac3313f7..6f84978a77265 100644 --- a/net/iucv/iucv.c +++ b/net/iucv/iucv.c @@ -83,7 +83,7 @@ struct iucv_irq_data { u16 ippathid; u8 ipflags1; u8 iptype; - u32 res2[8]; + u32 res2[9]; };
struct iucv_irq_list {
From: Po-Hsu Lin po-hsu.lin@canonical.com
[ Upstream commit 24994513ad13ff2c47ba91d2b5df82c3d496c370 ]
The `devlink -j port show` command output may not contain the "flavour" key, an example from Ubuntu 22.10 s390x LPAR(5.19.0-37-generic), with mlx4 driver and iproute2-5.15.0: {"port":{"pci/0001:00:00.0/1":{"type":"eth","netdev":"ens301"}, "pci/0001:00:00.0/2":{"type":"eth","netdev":"ens301d1"}, "pci/0002:00:00.0/1":{"type":"eth","netdev":"ens317"}, "pci/0002:00:00.0/2":{"type":"eth","netdev":"ens317d1"}}}
This will cause a KeyError exception.
Create a validate_devlink_output() to check for this "flavour" from devlink command output to avoid this KeyError exception. Also let it handle the check for `devlink -j dev show` output in main().
Apart from this, if the test was not started because the max lanes of the designated device is 0. The script will still return 0 and thus causing a false-negative test result.
Use a found_max_lanes flag to determine if these tests were skipped due to this reason and return KSFT_SKIP to make it more clear.
Link: https://bugs.launchpad.net/bugs/1937133 Fixes: f3348a82e727 ("selftests: net: Add port split test") Signed-off-by: Po-Hsu Lin po-hsu.lin@canonical.com Link: https://lore.kernel.org/r/20230315165353.229590-1-po-hsu.lin@canonical.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- .../selftests/net/devlink_port_split.py | 30 +++++++++++++++++++ 1 file changed, 30 insertions(+)
diff --git a/tools/testing/selftests/net/devlink_port_split.py b/tools/testing/selftests/net/devlink_port_split.py index 834066d465fc1..f0fbd7367f4f6 100755 --- a/tools/testing/selftests/net/devlink_port_split.py +++ b/tools/testing/selftests/net/devlink_port_split.py @@ -57,6 +57,8 @@ class devlink_ports(object): assert stderr == "" ports = json.loads(stdout)['port']
+ validate_devlink_output(ports, 'flavour') + for port in ports: if dev in port: if ports[port]['flavour'] == 'physical': @@ -218,6 +220,27 @@ def split_splittable_port(port, k, lanes, dev): unsplit(port.bus_info)
+def validate_devlink_output(devlink_data, target_property=None): + """ + Determine if test should be skipped by checking: + 1. devlink_data contains values + 2. The target_property exist in devlink_data + """ + skip_reason = None + if any(devlink_data.values()): + if target_property: + skip_reason = "{} not found in devlink output, test skipped".format(target_property) + for key in devlink_data: + if target_property in devlink_data[key]: + skip_reason = None + else: + skip_reason = 'devlink output is empty, test skipped' + + if skip_reason: + print(skip_reason) + sys.exit(KSFT_SKIP) + + def make_parser(): parser = argparse.ArgumentParser(description='A test for port splitting.') parser.add_argument('--dev', @@ -238,6 +261,7 @@ def main(cmdline=None): stdout, stderr = run_command(cmd) assert stderr == ""
+ validate_devlink_output(json.loads(stdout)) devs = json.loads(stdout)['dev'] dev = list(devs.keys())[0]
@@ -249,6 +273,7 @@ def main(cmdline=None):
ports = devlink_ports(dev)
+ found_max_lanes = False for port in ports.if_names: max_lanes = get_max_lanes(port.name)
@@ -271,6 +296,11 @@ def main(cmdline=None): split_splittable_port(port, lane, max_lanes, dev)
lane //= 2 + found_max_lanes = True + + if not found_max_lanes: + print(f"Test not started, no port of device {dev} reports max_lanes") + sys.exit(KSFT_SKIP)
if __name__ == "__main__":
From: Daniil Tatianin d-tatianin@yandex-team.ru
[ Upstream commit 470efd68a4653d9819d391489886432cd31bcd0b ]
This fixes an issue where ->hour would erroneously get zeroed out instead of ->min because of a bad copy paste.
Found by Linux Verification Center (linuxtesting.org) with the SVACE static analysis tool.
Fixes: f240b6882211 ("qed: Add support for processing fcoe tlv request.") Signed-off-by: Daniil Tatianin d-tatianin@yandex-team.ru Link: https://lore.kernel.org/r/20230315194618.579286-1-d-tatianin@yandex-team.ru Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/qlogic/qed/qed_mng_tlv.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/qlogic/qed/qed_mng_tlv.c b/drivers/net/ethernet/qlogic/qed/qed_mng_tlv.c index 3e3192a3ad9b7..fdbd5f07a1857 100644 --- a/drivers/net/ethernet/qlogic/qed/qed_mng_tlv.c +++ b/drivers/net/ethernet/qlogic/qed/qed_mng_tlv.c @@ -422,7 +422,7 @@ qed_mfw_get_tlv_time_value(struct qed_mfw_tlv_time *p_time, if (p_time->hour > 23) p_time->hour = 0; if (p_time->min > 59) - p_time->hour = 0; + p_time->min = 0; if (p_time->msec > 999) p_time->msec = 0; if (p_time->usec > 999)
From: Liang He windhl@126.com
[ Upstream commit 90de546d9a0b3c771667af18bb3f80567eabb89b ]
In vnet_port_probe() and vsw_port_probe(), we should check the return value of mdesc_grab() as it may return NULL which can caused NPD bugs.
Fixes: 5d01fa0c6bd8 ("ldmvsw: Add ldmvsw.c driver code") Fixes: 43fdf27470b2 ("[SPARC64]: Abstract out mdesc accesses for better MD update handling.") Signed-off-by: Liang He windhl@126.com Reviewed-by: Piotr Raczynski piotr.raczynski@intel.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/sun/ldmvsw.c | 3 +++ drivers/net/ethernet/sun/sunvnet.c | 3 +++ 2 files changed, 6 insertions(+)
diff --git a/drivers/net/ethernet/sun/ldmvsw.c b/drivers/net/ethernet/sun/ldmvsw.c index 01ea0d6f88193..934a4b54784b8 100644 --- a/drivers/net/ethernet/sun/ldmvsw.c +++ b/drivers/net/ethernet/sun/ldmvsw.c @@ -290,6 +290,9 @@ static int vsw_port_probe(struct vio_dev *vdev, const struct vio_device_id *id)
hp = mdesc_grab();
+ if (!hp) + return -ENODEV; + rmac = mdesc_get_property(hp, vdev->mp, remote_macaddr_prop, &len); err = -ENODEV; if (!rmac) { diff --git a/drivers/net/ethernet/sun/sunvnet.c b/drivers/net/ethernet/sun/sunvnet.c index 96b883f965f63..b6c03adf1e762 100644 --- a/drivers/net/ethernet/sun/sunvnet.c +++ b/drivers/net/ethernet/sun/sunvnet.c @@ -431,6 +431,9 @@ static int vnet_port_probe(struct vio_dev *vdev, const struct vio_device_id *id)
hp = mdesc_grab();
+ if (!hp) + return -ENODEV; + vp = vnet_find_parent(hp, vdev->mp, vdev); if (IS_ERR(vp)) { pr_err("Cannot find port parent vnet\n");
From: Tony O'Brien tony.obrien@alliedtelesis.co.nz
[ Upstream commit 5f8d1e3b6f9b5971f9c06d5846ce00c49e3a8d94 ]
Throughout the ADT7475 driver, attributes relating to the temperature sensors are displayed in the order Remote 1, Local, Remote 2. Make temp_st_show() conform to this expectation so that values set by temp_st_store() can be displayed using the correct attribute.
Fixes: 8f05bcc33e74 ("hwmon: (adt7475) temperature smoothing") Signed-off-by: Tony O'Brien tony.obrien@alliedtelesis.co.nz Link: https://lore.kernel.org/r/20230222005228.158661-2-tony.obrien@alliedtelesis.... Signed-off-by: Guenter Roeck linux@roeck-us.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/hwmon/adt7475.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/hwmon/adt7475.c b/drivers/hwmon/adt7475.c index 9d5b019651f2d..c7f19a1ca0a59 100644 --- a/drivers/hwmon/adt7475.c +++ b/drivers/hwmon/adt7475.c @@ -554,11 +554,11 @@ static ssize_t temp_st_show(struct device *dev, struct device_attribute *attr, val = data->enh_acoustics[0] & 0xf; break; case 1: - val = (data->enh_acoustics[1] >> 4) & 0xf; + val = data->enh_acoustics[1] & 0xf; break; case 2: default: - val = data->enh_acoustics[1] & 0xf; + val = (data->enh_acoustics[1] >> 4) & 0xf; break; }
From: Tony O'Brien tony.obrien@alliedtelesis.co.nz
[ Upstream commit 48e8186870d9d0902e712d601ccb7098cb220688 ]
The wrong bits are masked in the hysteresis register; indices 0 and 2 should zero bits [7:4] and preserve bits [3:0], and index 1 should zero bits [3:0] and preserve bits [7:4].
Fixes: 1c301fc5394f ("hwmon: Add a driver for the ADT7475 hardware monitoring chip") Signed-off-by: Tony O'Brien tony.obrien@alliedtelesis.co.nz Link: https://lore.kernel.org/r/20230222005228.158661-3-tony.obrien@alliedtelesis.... Signed-off-by: Guenter Roeck linux@roeck-us.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/hwmon/adt7475.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/hwmon/adt7475.c b/drivers/hwmon/adt7475.c index c7f19a1ca0a59..6b84822e7d93b 100644 --- a/drivers/hwmon/adt7475.c +++ b/drivers/hwmon/adt7475.c @@ -486,10 +486,10 @@ static ssize_t temp_store(struct device *dev, struct device_attribute *attr, val = (temp - val) / 1000;
if (sattr->index != 1) { - data->temp[HYSTERSIS][sattr->index] &= 0xF0; + data->temp[HYSTERSIS][sattr->index] &= 0x0F; data->temp[HYSTERSIS][sattr->index] |= (val & 0xF) << 4; } else { - data->temp[HYSTERSIS][sattr->index] &= 0x0F; + data->temp[HYSTERSIS][sattr->index] &= 0xF0; data->temp[HYSTERSIS][sattr->index] |= (val & 0xF); }
From: Zheng Wang zyytlz.wz@163.com
[ Upstream commit cb090e64cf25602b9adaf32d5dfc9c8bec493cd1 ]
In xgene_hwmon_probe, &ctx->workq is bound with xgene_hwmon_evt_work. Then it will be started.
If we remove the driver which will call xgene_hwmon_remove to clean up, there may be unfinished work.
The possible sequence is as follows:
Fix it by finishing the work before cleanup in xgene_hwmon_remove.
CPU0 CPU1
|xgene_hwmon_evt_work xgene_hwmon_remove | kfifo_free(&ctx->async_msg_fifo);| | |kfifo_out_spinlocked |//use &ctx->async_msg_fifo Fixes: 2ca492e22cb7 ("hwmon: (xgene) Fix crash when alarm occurs before driver probe") Signed-off-by: Zheng Wang zyytlz.wz@163.com Link: https://lore.kernel.org/r/20230310084007.1403388-1-zyytlz.wz@163.com Signed-off-by: Guenter Roeck linux@roeck-us.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/hwmon/xgene-hwmon.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/hwmon/xgene-hwmon.c b/drivers/hwmon/xgene-hwmon.c index f2a5af239c956..f5d3cf86753f7 100644 --- a/drivers/hwmon/xgene-hwmon.c +++ b/drivers/hwmon/xgene-hwmon.c @@ -768,6 +768,7 @@ static int xgene_hwmon_remove(struct platform_device *pdev) { struct xgene_hwmon_dev *ctx = platform_get_drvdata(pdev);
+ cancel_work_sync(&ctx->workq); hwmon_device_unregister(ctx->hwmon_dev); kfifo_free(&ctx->async_msg_fifo); if (acpi_disabled)
From: Marcus Folkesson marcus.folkesson@gmail.com
[ Upstream commit c93f5e2ab53243b17febabb9422a697017d3d49a ]
ret is set to 0 which do not indicate an error. Return -EINVAL instead.
Fixes: a9e9dd9c6de5 ("hwmon: (ina3221) Read channel input source info from DT") Signed-off-by: Marcus Folkesson marcus.folkesson@gmail.com Link: https://lore.kernel.org/r/20230310075035.246083-1-marcus.folkesson@gmail.com Signed-off-by: Guenter Roeck linux@roeck-us.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/hwmon/ina3221.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/hwmon/ina3221.c b/drivers/hwmon/ina3221.c index d3c98115042b5..836e7579e166a 100644 --- a/drivers/hwmon/ina3221.c +++ b/drivers/hwmon/ina3221.c @@ -772,7 +772,7 @@ static int ina3221_probe_child_from_dt(struct device *dev, return ret; } else if (val > INA3221_CHANNEL3) { dev_err(dev, "invalid reg %d of %pOFn\n", val, child); - return ret; + return -EINVAL; }
input = &ina->inputs[val];
From: Lars-Peter Clausen lars@metafoo.de
[ Upstream commit 8d655e65237643c48ada2c131b83679bf1105373 ]
When probing the ucd90320 access to some of the registers randomly fails. Sometimes it NACKs a transfer, sometimes it returns just random data and the PEC check fails.
Experimentation shows that this seems to be triggered by a register access directly back to back with a previous register write. Experimentation also shows that inserting a small delay after register writes makes the issue go away.
Use a similar solution to what the max15301 driver does to solve the same problem. Create a custom set of bus read and write functions that make sure that the delay is added.
Fixes: a470f11c5ba2 ("hwmon: (pmbus/ucd9000) Add support for UCD90320 Power Sequencer") Signed-off-by: Lars-Peter Clausen lars@metafoo.de Link: https://lore.kernel.org/r/20230312160312.2227405-1-lars@metafoo.de Signed-off-by: Guenter Roeck linux@roeck-us.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/hwmon/pmbus/ucd9000.c | 75 +++++++++++++++++++++++++++++++++++ 1 file changed, 75 insertions(+)
diff --git a/drivers/hwmon/pmbus/ucd9000.c b/drivers/hwmon/pmbus/ucd9000.c index f8017993e2b4d..9e26cc084a176 100644 --- a/drivers/hwmon/pmbus/ucd9000.c +++ b/drivers/hwmon/pmbus/ucd9000.c @@ -7,6 +7,7 @@ */
#include <linux/debugfs.h> +#include <linux/delay.h> #include <linux/kernel.h> #include <linux/module.h> #include <linux/of_device.h> @@ -16,6 +17,7 @@ #include <linux/i2c.h> #include <linux/pmbus.h> #include <linux/gpio/driver.h> +#include <linux/timekeeping.h> #include "pmbus.h"
enum chips { ucd9000, ucd90120, ucd90124, ucd90160, ucd90320, ucd9090, @@ -65,6 +67,7 @@ struct ucd9000_data { struct gpio_chip gpio; #endif struct dentry *debugfs; + ktime_t write_time; }; #define to_ucd9000_data(_info) container_of(_info, struct ucd9000_data, info)
@@ -73,6 +76,73 @@ struct ucd9000_debugfs_entry { u8 index; };
+/* + * It has been observed that the UCD90320 randomly fails register access when + * doing another access right on the back of a register write. To mitigate this + * make sure that there is a minimum delay between a write access and the + * following access. The 250us is based on experimental data. At a delay of + * 200us the issue seems to go away. Add a bit of extra margin to allow for + * system to system differences. + */ +#define UCD90320_WAIT_DELAY_US 250 + +static inline void ucd90320_wait(const struct ucd9000_data *data) +{ + s64 delta = ktime_us_delta(ktime_get(), data->write_time); + + if (delta < UCD90320_WAIT_DELAY_US) + udelay(UCD90320_WAIT_DELAY_US - delta); +} + +static int ucd90320_read_word_data(struct i2c_client *client, int page, + int phase, int reg) +{ + const struct pmbus_driver_info *info = pmbus_get_driver_info(client); + struct ucd9000_data *data = to_ucd9000_data(info); + + if (reg >= PMBUS_VIRT_BASE) + return -ENXIO; + + ucd90320_wait(data); + return pmbus_read_word_data(client, page, phase, reg); +} + +static int ucd90320_read_byte_data(struct i2c_client *client, int page, int reg) +{ + const struct pmbus_driver_info *info = pmbus_get_driver_info(client); + struct ucd9000_data *data = to_ucd9000_data(info); + + ucd90320_wait(data); + return pmbus_read_byte_data(client, page, reg); +} + +static int ucd90320_write_word_data(struct i2c_client *client, int page, + int reg, u16 word) +{ + const struct pmbus_driver_info *info = pmbus_get_driver_info(client); + struct ucd9000_data *data = to_ucd9000_data(info); + int ret; + + ucd90320_wait(data); + ret = pmbus_write_word_data(client, page, reg, word); + data->write_time = ktime_get(); + + return ret; +} + +static int ucd90320_write_byte(struct i2c_client *client, int page, u8 value) +{ + const struct pmbus_driver_info *info = pmbus_get_driver_info(client); + struct ucd9000_data *data = to_ucd9000_data(info); + int ret; + + ucd90320_wait(data); + ret = pmbus_write_byte(client, page, value); + data->write_time = ktime_get(); + + return ret; +} + static int ucd9000_get_fan_config(struct i2c_client *client, int fan) { int fan_config = 0; @@ -598,6 +668,11 @@ static int ucd9000_probe(struct i2c_client *client) info->read_byte_data = ucd9000_read_byte_data; info->func[0] |= PMBUS_HAVE_FAN12 | PMBUS_HAVE_STATUS_FAN12 | PMBUS_HAVE_FAN34 | PMBUS_HAVE_STATUS_FAN34; + } else if (mid->driver_data == ucd90320) { + info->read_byte_data = ucd90320_read_byte_data; + info->read_word_data = ucd90320_read_word_data; + info->write_byte = ucd90320_write_byte; + info->write_word_data = ucd90320_write_word_data; }
ucd9000_probe_gpio(client, mid, data);
From: Krzysztof Kozlowski krzysztof.kozlowski@linaro.org
[ Upstream commit 00d85e81796b17a29a0e096c5a4735daa47adef8 ]
The driver will match mostly by DT table (even thought there is regular ID table) so there is little benefit in of_match_ptr (this also allows ACPI matching via PRP0001, even though it might not be relevant here). This also fixes !CONFIG_OF error:
drivers/hwmon/tmp513.c:610:34: error: ‘tmp51x_of_match’ defined but not used [-Werror=unused-const-variable=]
Fixes: 59dfa75e5d82 ("hwmon: Add driver for Texas Instruments TMP512/513 sensor chips.") Signed-off-by: Krzysztof Kozlowski krzysztof.kozlowski@linaro.org Link: https://lore.kernel.org/r/20230312193723.478032-2-krzysztof.kozlowski@linaro... Signed-off-by: Guenter Roeck linux@roeck-us.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/hwmon/tmp513.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/hwmon/tmp513.c b/drivers/hwmon/tmp513.c index 47bbe47e062fd..7d5f7441aceb1 100644 --- a/drivers/hwmon/tmp513.c +++ b/drivers/hwmon/tmp513.c @@ -758,7 +758,7 @@ static int tmp51x_probe(struct i2c_client *client) static struct i2c_driver tmp51x_driver = { .driver = { .name = "tmp51x", - .of_match_table = of_match_ptr(tmp51x_of_match), + .of_match_table = tmp51x_of_match, }, .probe_new = tmp51x_probe, .id_table = tmp51x_id,
From: Lars-Peter Clausen lars@metafoo.de
[ Upstream commit a5bb73b3f5db1a4e91402ad132b59b13d2651ed9 ]
The adm1266 driver uses I2C bus access in its GPIO chip `set` and `get` implementation. This means these functions can sleep and the GPIO chip should set the `can_sleep` property to true.
This will ensure that a warning is printed when trying to set or get the GPIO value from a context that potentially can't sleep.
Fixes: d98dfad35c38 ("hwmon: (pmbus/adm1266) Add support for GPIOs") Signed-off-by: Lars-Peter Clausen lars@metafoo.de Link: https://lore.kernel.org/r/20230314093146.2443845-1-lars@metafoo.de Signed-off-by: Guenter Roeck linux@roeck-us.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/hwmon/pmbus/adm1266.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/hwmon/pmbus/adm1266.c b/drivers/hwmon/pmbus/adm1266.c index c7b373ba92f21..d1b2e936546fd 100644 --- a/drivers/hwmon/pmbus/adm1266.c +++ b/drivers/hwmon/pmbus/adm1266.c @@ -301,6 +301,7 @@ static int adm1266_config_gpio(struct adm1266_data *data) data->gc.label = name; data->gc.parent = &data->client->dev; data->gc.owner = THIS_MODULE; + data->gc.can_sleep = true; data->gc.base = -1; data->gc.names = data->gpio_names; data->gc.ngpio = ARRAY_SIZE(data->gpio_names);
From: Linus Torvalds torvalds@linux-foundation.org
[ Upstream commit efbcbb12ee99f750c9f25c873b55ad774871de2a ]
The __find_restype() function loops over the m5mols_default_ffmt[] array, and the termination condition ends up being wrong: instead of stopping when the iterator becomes the size of the array it traverses, it stops after it has already overshot the array.
Now, in practice this doesn't likely matter, because the code will always find the entry it looks for, and will thus return early and never hit that last extra iteration.
But it turns out that clang will unroll the loop fully, because it has only two iterations (well, three due to the off-by-one bug), and then clang will end up just giving up in the middle of the loop unrolling when it notices that the code walks past the end of the array.
And that made 'objtool' very unhappy indeed, because the generated code just falls off the edge of the universe, and ends up falling through to the next function, causing this warning:
drivers/media/i2c/m5mols/m5mols.o: warning: objtool: m5mols_set_fmt() falls through to next function m5mols_get_frame_desc()
Fix the loop ending condition.
Reported-by: Jens Axboe axboe@kernel.dk Analyzed-by: Miguel Ojeda miguel.ojeda.sandonis@gmail.com Analyzed-by: Nick Desaulniers ndesaulniers@google.com Link: https://lore.kernel.org/linux-block/CAHk-=wgTSdKYbmB1JYM5vmHMcD9J9UZr0mn7BOY... Fixes: bc125106f8af ("[media] Add support for M-5MOLS 8 Mega Pixel camera ISP") Cc: HeungJun, Kim riverful.kim@samsung.com Cc: Sylwester Nawrocki s.nawrocki@samsung.com Cc: Kyungmin Park kyungmin.park@samsung.com Cc: Mauro Carvalho Chehab mchehab@redhat.com Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/media/i2c/m5mols/m5mols_core.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/media/i2c/m5mols/m5mols_core.c b/drivers/media/i2c/m5mols/m5mols_core.c index 21666d705e372..dcf9e4d4ee6b8 100644 --- a/drivers/media/i2c/m5mols/m5mols_core.c +++ b/drivers/media/i2c/m5mols/m5mols_core.c @@ -488,7 +488,7 @@ static enum m5mols_restype __find_restype(u32 code) do { if (code == m5mols_default_ffmt[type].code) return type; - } while (type++ != SIZE_DEFAULT_FFMT); + } while (++type != SIZE_DEFAULT_FFMT);
return 0; }
From: Tobias Schramm t.schramm@manjaro.org
[ Upstream commit eca5bd666b0aa7dc0bca63292e4778968241134e ]
This commit fixes a race between completion of stop command and start of a new command. Previously the command ready interrupt was enabled before stop command was written to the command register. This caused the command ready interrupt to fire immediately since the CMDRDY flag is asserted constantly while there is no command in progress. Consequently the command state machine will immediately advance to the next state when the tasklet function is executed again, no matter actual completion state of the stop command. Thus a new command can then be dispatched immediately, interrupting and corrupting the stop command on the CMD line. Fix that by dropping the command ready interrupt enable before calling atmci_send_stop_cmd. atmci_send_stop_cmd does already enable the command ready interrupt, no further writes to ATMCI_IER are necessary.
Signed-off-by: Tobias Schramm t.schramm@manjaro.org Acked-by: Ludovic Desroches ludovic.desroches@microchip.com Link: https://lore.kernel.org/r/20221230194315.809903-2-t.schramm@manjaro.org Signed-off-by: Ulf Hansson ulf.hansson@linaro.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/mmc/host/atmel-mci.c | 3 --- 1 file changed, 3 deletions(-)
diff --git a/drivers/mmc/host/atmel-mci.c b/drivers/mmc/host/atmel-mci.c index af85b32c6c1c8..c468f9a02ef6b 100644 --- a/drivers/mmc/host/atmel-mci.c +++ b/drivers/mmc/host/atmel-mci.c @@ -1818,7 +1818,6 @@ static void atmci_tasklet_func(unsigned long priv) atmci_writel(host, ATMCI_IER, ATMCI_NOTBUSY); state = STATE_WAITING_NOTBUSY; } else if (host->mrq->stop) { - atmci_writel(host, ATMCI_IER, ATMCI_CMDRDY); atmci_send_stop_cmd(host, data); state = STATE_SENDING_STOP; } else { @@ -1851,8 +1850,6 @@ static void atmci_tasklet_func(unsigned long priv) * command to send. */ if (host->mrq->stop) { - atmci_writel(host, ATMCI_IER, - ATMCI_CMDRDY); atmci_send_stop_cmd(host, data); state = STATE_SENDING_STOP; } else {
From: Yifei Liu yifeliu@cs.stonybrook.edu
[ Upstream commit 23892d383bee15b64f5463bd7195615734bb2415 ]
Bug description and fix:
1. Write data to a file, say all 1s from offset 0 to 16.
2. Truncate the file to a smaller size, say 8 bytes.
3. Write new bytes (say 2s) from an offset past the original size of the file, say at offset 20, for 4 bytes. This is supposed to create a "hole" in the file, meaning that the bytes from offset 8 (where it was truncated above) up to the new write at offset 20, should all be 0s (zeros).
4. Flush all caches using "echo 3 > /proc/sys/vm/drop_caches" (or unmount and remount) the f/s.
5. Check the content of the file. It is wrong. The 1s that used to be between bytes 9 and 16, before the truncation, have REAPPEARED (they should be 0s).
We wrote a script and helper C program to reproduce the bug (reproduce_jffs2_write_begin_issue.sh, write_file.c, and Makefile). We can make them available to anyone.
The above example is shown when writing a small file within the same first page. But the bug happens for larger files, as long as steps 1, 2, and 3 above all happen within the same page.
The problem was traced to the jffs2_write_begin code, where it goes into an 'if' statement intended to handle writes past the current EOF (i.e., writes that may create a hole). The code computes a 'pageofs' that is the floor of the write position (pos), aligned to the page size boundary. In other words, 'pageofs' will never be larger than 'pos'. The code then sets the internal jffs2_raw_inode->isize to the size of max(current inode size, pageofs) but that is wrong: the new file size should be the 'pos', which is larger than both the current inode size and pageofs.
Similarly, the code incorrectly sets the internal jffs2_raw_inode->dsize to the difference between the pageofs minus current inode size; instead it should be the current pos minus the current inode size. Finally, inode->i_size was also set incorrectly.
The patch below fixes this bug. The bug was discovered using a new tool for finding f/s bugs using model checking, called MCFS (Model Checking File Systems).
Signed-off-by: Yifei Liu yifeliu@cs.stonybrook.edu Signed-off-by: Erez Zadok ezk@cs.stonybrook.edu Signed-off-by: Manish Adkar madkar@cs.stonybrook.edu Signed-off-by: Richard Weinberger richard@nod.at Signed-off-by: Sasha Levin sashal@kernel.org --- fs/jffs2/file.c | 15 +++++++-------- 1 file changed, 7 insertions(+), 8 deletions(-)
diff --git a/fs/jffs2/file.c b/fs/jffs2/file.c index bd7d58d27bfc6..97a3c09fd96b6 100644 --- a/fs/jffs2/file.c +++ b/fs/jffs2/file.c @@ -138,19 +138,18 @@ static int jffs2_write_begin(struct file *filp, struct address_space *mapping, struct jffs2_inode_info *f = JFFS2_INODE_INFO(inode); struct jffs2_sb_info *c = JFFS2_SB_INFO(inode->i_sb); pgoff_t index = pos >> PAGE_SHIFT; - uint32_t pageofs = index << PAGE_SHIFT; int ret = 0;
jffs2_dbg(1, "%s()\n", __func__);
- if (pageofs > inode->i_size) { - /* Make new hole frag from old EOF to new page */ + if (pos > inode->i_size) { + /* Make new hole frag from old EOF to new position */ struct jffs2_raw_inode ri; struct jffs2_full_dnode *fn; uint32_t alloc_len;
- jffs2_dbg(1, "Writing new hole frag 0x%x-0x%x between current EOF and new page\n", - (unsigned int)inode->i_size, pageofs); + jffs2_dbg(1, "Writing new hole frag 0x%x-0x%x between current EOF and new position\n", + (unsigned int)inode->i_size, (uint32_t)pos);
ret = jffs2_reserve_space(c, sizeof(ri), &alloc_len, ALLOC_NORMAL, JFFS2_SUMMARY_INODE_SIZE); @@ -170,10 +169,10 @@ static int jffs2_write_begin(struct file *filp, struct address_space *mapping, ri.mode = cpu_to_jemode(inode->i_mode); ri.uid = cpu_to_je16(i_uid_read(inode)); ri.gid = cpu_to_je16(i_gid_read(inode)); - ri.isize = cpu_to_je32(max((uint32_t)inode->i_size, pageofs)); + ri.isize = cpu_to_je32((uint32_t)pos); ri.atime = ri.ctime = ri.mtime = cpu_to_je32(JFFS2_NOW()); ri.offset = cpu_to_je32(inode->i_size); - ri.dsize = cpu_to_je32(pageofs - inode->i_size); + ri.dsize = cpu_to_je32((uint32_t)pos - inode->i_size); ri.csize = cpu_to_je32(0); ri.compr = JFFS2_COMPR_ZERO; ri.node_crc = cpu_to_je32(crc32(0, &ri, sizeof(ri)-8)); @@ -203,7 +202,7 @@ static int jffs2_write_begin(struct file *filp, struct address_space *mapping, goto out_err; } jffs2_complete_reservation(c); - inode->i_size = pageofs; + inode->i_size = pos; mutex_unlock(&f->sem); }
From: David Gow davidgow@google.com
[ Upstream commit 8849818679478933dd1d9718741f4daa3f4e8b86 ]
The kernel disables all SSE and similar FP/SIMD instructions on x86-based architectures (partly because we shouldn't be using floats in the kernel, and partly to avoid the need for stack alignment, see: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53383 )
UML does not do the same thing, which isn't in itself a problem, but does add to the list of differences between UML and "normal" x86 builds.
In addition, there was a crash bug with LLVM < 15 / rustc < 1.65 when building with SSE, so disabling it fixes rust builds with earlier compiler versions, see: https://github.com/Rust-for-Linux/linux/pull/881
Signed-off-by: David Gow davidgow@google.com Reviewed-by: Sergio González Collado sergio.collado@gmail.com Signed-off-by: Richard Weinberger richard@nod.at Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/Makefile.um | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/arch/x86/Makefile.um b/arch/x86/Makefile.um index 1db7913795f51..e51c53d10efe7 100644 --- a/arch/x86/Makefile.um +++ b/arch/x86/Makefile.um @@ -1,6 +1,12 @@ # SPDX-License-Identifier: GPL-2.0 core-y += arch/x86/crypto/
+# +# Disable SSE and other FP/SIMD instructions to match normal x86 +# +KBUILD_CFLAGS += -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -mno-avx +KBUILD_RUSTFLAGS += -Ctarget-feature=-sse,-sse2,-sse3,-ssse3,-sse4.1,-sse4.2,-avx,-avx2 + ifeq ($(CONFIG_X86_32),y) START := 0x8048000
From: Baokun Li libaokun1@huawei.com
[ Upstream commit 5cd740287ae5e3f9d1c46f5bfe8778972fd6d3fe ]
In ext4_fill_super(), EXT4_ORPHAN_FS flag is cleared after ext4_orphan_cleanup() is executed. Therefore, when __ext4_iget() is called to get an inode whose i_nlink is 0 when the flag exists, no error is returned. If the inode is a special inode, a null pointer dereference may occur. If the value of i_nlink is 0 for any inodes (except boot loader inodes) got by using the EXT4_IGET_SPECIAL flag, the current file system is corrupted. Therefore, make the ext4_iget() function return an error if it gets such an abnormal special inode.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=199179 Link: https://bugzilla.kernel.org/show_bug.cgi?id=216541 Link: https://bugzilla.kernel.org/show_bug.cgi?id=216539 Reported-by: Luís Henriques lhenriques@suse.de Suggested-by: Theodore Ts'o tytso@mit.edu Signed-off-by: Baokun Li libaokun1@huawei.com Reviewed-by: Jan Kara jack@suse.cz Link: https://lore.kernel.org/r/20230107032126.4165860-2-libaokun1@huawei.com Signed-off-by: Theodore Ts'o tytso@mit.edu Signed-off-by: Sasha Levin sashal@kernel.org --- fs/ext4/inode.c | 18 ++++++++---------- 1 file changed, 8 insertions(+), 10 deletions(-)
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c index 1a654a1f3f46b..6ba185b46ba39 100644 --- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -4721,13 +4721,6 @@ struct inode *__ext4_iget(struct super_block *sb, unsigned long ino, goto bad_inode; raw_inode = ext4_raw_inode(&iloc);
- if ((ino == EXT4_ROOT_INO) && (raw_inode->i_links_count == 0)) { - ext4_error_inode(inode, function, line, 0, - "iget: root inode unallocated"); - ret = -EFSCORRUPTED; - goto bad_inode; - } - if ((flags & EXT4_IGET_HANDLE) && (raw_inode->i_links_count == 0) && (raw_inode->i_mode == 0)) { ret = -ESTALE; @@ -4800,11 +4793,16 @@ struct inode *__ext4_iget(struct super_block *sb, unsigned long ino, * NeilBrown 1999oct15 */ if (inode->i_nlink == 0) { - if ((inode->i_mode == 0 || + if ((inode->i_mode == 0 || flags & EXT4_IGET_SPECIAL || !(EXT4_SB(inode->i_sb)->s_mount_state & EXT4_ORPHAN_FS)) && ino != EXT4_BOOT_LOADER_INO) { - /* this inode is deleted */ - ret = -ESTALE; + /* this inode is deleted or unallocated */ + if (flags & EXT4_IGET_SPECIAL) { + ext4_error_inode(inode, function, line, 0, + "iget: special inode unallocated"); + ret = -EFSCORRUPTED; + } else + ret = -ESTALE; goto bad_inode; } /* The only unlinked inodes we let through here have
From: Baokun Li libaokun1@huawei.com
[ Upstream commit 0f7bfd6f8164be32dbbdf36aa1e5d00485c53cd7 ]
Syzbot reported a hung task problem: ================================================================== INFO: task syz-executor232:5073 blocked for more than 143 seconds. Not tainted 6.2.0-rc2-syzkaller-00024-g512dee0c00ad #0 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. task:syz-exec232 state:D stack:21024 pid:5073 ppid:5072 flags:0x00004004 Call Trace: <TASK> context_switch kernel/sched/core.c:5244 [inline] __schedule+0x995/0xe20 kernel/sched/core.c:6555 schedule+0xcb/0x190 kernel/sched/core.c:6631 __wait_on_freeing_inode fs/inode.c:2196 [inline] find_inode_fast+0x35a/0x4c0 fs/inode.c:950 iget_locked+0xb1/0x830 fs/inode.c:1273 __ext4_iget+0x22e/0x3ed0 fs/ext4/inode.c:4861 ext4_xattr_inode_iget+0x68/0x4e0 fs/ext4/xattr.c:389 ext4_xattr_inode_dec_ref_all+0x1a7/0xe50 fs/ext4/xattr.c:1148 ext4_xattr_delete_inode+0xb04/0xcd0 fs/ext4/xattr.c:2880 ext4_evict_inode+0xd7c/0x10b0 fs/ext4/inode.c:296 evict+0x2a4/0x620 fs/inode.c:664 ext4_orphan_cleanup+0xb60/0x1340 fs/ext4/orphan.c:474 __ext4_fill_super fs/ext4/super.c:5516 [inline] ext4_fill_super+0x81cd/0x8700 fs/ext4/super.c:5644 get_tree_bdev+0x400/0x620 fs/super.c:1282 vfs_get_tree+0x88/0x270 fs/super.c:1489 do_new_mount+0x289/0xad0 fs/namespace.c:3145 do_mount fs/namespace.c:3488 [inline] __do_sys_mount fs/namespace.c:3697 [inline] __se_sys_mount+0x2d3/0x3c0 fs/namespace.c:3674 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd RIP: 0033:0x7fa5406fd5ea RSP: 002b:00007ffc7232f968 EFLAGS: 00000202 ORIG_RAX: 00000000000000a5 RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007fa5406fd5ea RDX: 0000000020000440 RSI: 0000000020000000 RDI: 00007ffc7232f970 RBP: 00007ffc7232f970 R08: 00007ffc7232f9b0 R09: 0000000000000432 R10: 0000000000804a03 R11: 0000000000000202 R12: 0000000000000004 R13: 0000555556a7a2c0 R14: 00007ffc7232f9b0 R15: 0000000000000000 </TASK> ==================================================================
The problem is that the inode contains an xattr entry with ea_inum of 15 when cleaning up an orphan inode <15>. When evict inode <15>, the reference counting of the corresponding EA inode is decreased. When EA inode <15> is found by find_inode_fast() in __ext4_iget(), it is found that the EA inode holds the I_FREEING flag and waits for the EA inode to complete deletion. As a result, when inode <15> is being deleted, we wait for inode <15> to complete the deletion, resulting in an infinite loop and triggering Hung Task. To solve this problem, we only need to check whether the ino of EA inode and parent is the same before getting EA inode.
Link: https://syzkaller.appspot.com/bug?extid=77d6fcc37bbb92f26048 Reported-by: syzbot+77d6fcc37bbb92f26048@syzkaller.appspotmail.com Signed-off-by: Baokun Li libaokun1@huawei.com Reviewed-by: Jan Kara jack@suse.cz Link: https://lore.kernel.org/r/20230110133436.996350-1-libaokun1@huawei.com Signed-off-by: Theodore Ts'o tytso@mit.edu Signed-off-by: Sasha Levin sashal@kernel.org --- fs/ext4/xattr.c | 11 +++++++++++ 1 file changed, 11 insertions(+)
diff --git a/fs/ext4/xattr.c b/fs/ext4/xattr.c index 60e122761352c..f3da1f2d4cb93 100644 --- a/fs/ext4/xattr.c +++ b/fs/ext4/xattr.c @@ -386,6 +386,17 @@ static int ext4_xattr_inode_iget(struct inode *parent, unsigned long ea_ino, struct inode *inode; int err;
+ /* + * We have to check for this corruption early as otherwise + * iget_locked() could wait indefinitely for the state of our + * parent inode. + */ + if (parent->i_ino == ea_ino) { + ext4_error(parent->i_sb, + "Parent and EA inode have the same ino %lu", ea_ino); + return -EFSCORRUPTED; + } + inode = ext4_iget(parent->i_sb, ea_ino, EXT4_IGET_NORMAL); if (IS_ERR(inode)) { err = PTR_ERR(inode);
From: Qu Huang qu.huang@linux.dev
[ Upstream commit 4fc8fff378b2f2039f2a666d9f8c570f4e58352c ]
In the kfd_wait_on_events() function, the kfd_event_waiter structure is allocated by alloc_event_waiters(), but the event field of the waiter structure is not initialized; When copy_from_user() fails in the kfd_wait_on_events() function, it will enter exception handling to release the previously allocated memory of the waiter structure; Due to the event field of the waiters structure being accessed in the free_waiters() function, this results in illegal memory access and system crash, here is the crash log:
localhost kernel: RIP: 0010:native_queued_spin_lock_slowpath+0x185/0x1e0 localhost kernel: RSP: 0018:ffffaa53c362bd60 EFLAGS: 00010082 localhost kernel: RAX: ff3d3d6bff4007cb RBX: 0000000000000282 RCX: 00000000002c0000 localhost kernel: RDX: ffff9e855eeacb80 RSI: 000000000000279c RDI: ffffe7088f6a21d0 localhost kernel: RBP: ffffe7088f6a21d0 R08: 00000000002c0000 R09: ffffaa53c362be64 localhost kernel: R10: ffffaa53c362bbd8 R11: 0000000000000001 R12: 0000000000000002 localhost kernel: R13: ffff9e7ead15d600 R14: 0000000000000000 R15: ffff9e7ead15d698 localhost kernel: FS: 0000152a3d111700(0000) GS:ffff9e855ee80000(0000) knlGS:0000000000000000 localhost kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 localhost kernel: CR2: 0000152938000010 CR3: 000000044d7a4000 CR4: 00000000003506e0 localhost kernel: Call Trace: localhost kernel: _raw_spin_lock_irqsave+0x30/0x40 localhost kernel: remove_wait_queue+0x12/0x50 localhost kernel: kfd_wait_on_events+0x1b6/0x490 [hydcu] localhost kernel: ? ftrace_graph_caller+0xa0/0xa0 localhost kernel: kfd_ioctl+0x38c/0x4a0 [hydcu] localhost kernel: ? kfd_ioctl_set_trap_handler+0x70/0x70 [hydcu] localhost kernel: ? kfd_ioctl_create_queue+0x5a0/0x5a0 [hydcu] localhost kernel: ? ftrace_graph_caller+0xa0/0xa0 localhost kernel: __x64_sys_ioctl+0x8e/0xd0 localhost kernel: ? syscall_trace_enter.isra.18+0x143/0x1b0 localhost kernel: do_syscall_64+0x33/0x80 localhost kernel: entry_SYSCALL_64_after_hwframe+0x44/0xa9 localhost kernel: RIP: 0033:0x152a4dff68d7
Allocate the structure with kcalloc, and remove redundant 0-initialization and a redundant loop condition check.
Signed-off-by: Qu Huang qu.huang@linux.dev Signed-off-by: Felix Kuehling Felix.Kuehling@amd.com Reviewed-by: Felix Kuehling Felix.Kuehling@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/amdkfd/kfd_events.c | 9 +++------ 1 file changed, 3 insertions(+), 6 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_events.c b/drivers/gpu/drm/amd/amdkfd/kfd_events.c index 159be13ef20bb..2c19b3775179b 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_events.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_events.c @@ -528,16 +528,13 @@ static struct kfd_event_waiter *alloc_event_waiters(uint32_t num_events) struct kfd_event_waiter *event_waiters; uint32_t i;
- event_waiters = kmalloc_array(num_events, - sizeof(struct kfd_event_waiter), - GFP_KERNEL); + event_waiters = kcalloc(num_events, sizeof(struct kfd_event_waiter), + GFP_KERNEL); if (!event_waiters) return NULL;
- for (i = 0; (event_waiters) && (i < num_events) ; i++) { + for (i = 0; i < num_events; i++) init_wait(&event_waiters[i].wait); - event_waiters[i].activated = false; - }
return event_waiters; }
From: Michael Karcher kernel@mkarcher.dialup.fu-berlin.de
[ Upstream commit 250870824c1cf199b032b1ef889c8e8d69d9123a ]
GCC warns about the pattern sizeof(void*)/sizeof(void), as it looks like the abuse of a pattern to calculate the array size. This pattern appears in the unevaluated part of the ternary operator in _INTC_ARRAY if the parameter is NULL.
The replacement uses an alternate approach to return 0 in case of NULL which does not generate the pattern sizeof(void*)/sizeof(void), but still emits the warning if _INTC_ARRAY is called with a nonarray parameter.
This patch is required for successful compilation with -Werror enabled.
The idea to use _Generic for type distinction is taken from Comment #7 in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108483 by Jakub Jelinek
Signed-off-by: Michael Karcher kernel@mkarcher.dialup.fu-berlin.de Acked-by: Randy Dunlap rdunlap@infradead.org # build-tested Link: https://lore.kernel.org/r/619fa552-c988-35e5-b1d7-fe256c46a272@mkarcher.dial... Signed-off-by: John Paul Adrian Glaubitz glaubitz@physik.fu-berlin.de Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/sh_intc.h | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/include/linux/sh_intc.h b/include/linux/sh_intc.h index c255273b02810..37ad81058d6ae 100644 --- a/include/linux/sh_intc.h +++ b/include/linux/sh_intc.h @@ -97,7 +97,10 @@ struct intc_hw_desc { unsigned int nr_subgroups; };
-#define _INTC_ARRAY(a) a, __same_type(a, NULL) ? 0 : sizeof(a)/sizeof(*a) +#define _INTC_SIZEOF_OR_ZERO(a) (_Generic(a, \ + typeof(NULL): 0, \ + default: sizeof(a))) +#define _INTC_ARRAY(a) a, _INTC_SIZEOF_OR_ZERO(a)/sizeof(*a)
#define INTC_HW_DESC(vectors, groups, mask_regs, \ prio_regs, sense_regs, ack_regs) \
From: Alex Hung alex.hung@amd.com
[ Upstream commit 031f196d1b1b6d5dfcb0533b431e3ab1750e6189 ]
[WHY] When PTEBufferSizeInRequests is zero, UBSAN reports the following warning because dml_log2 returns an unexpected negative value:
shift exponent 4294966273 is too large for 32-bit type 'int'
[HOW]
In the case PTEBufferSizeInRequests is zero, skip the dml_log2() and assign the result directly.
Reviewed-by: Jun Lei Jun.Lei@amd.com Acked-by: Qingqing Zhuo qingqing.zhuo@amd.com Signed-off-by: Alex Hung alex.hung@amd.com Tested-by: Daniel Wheeler daniel.wheeler@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- .../gpu/drm/amd/display/dc/dml/dcn30/display_mode_vba_30.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn30/display_mode_vba_30.c b/drivers/gpu/drm/amd/display/dc/dml/dcn30/display_mode_vba_30.c index e427f4ffa0807..e5b1002d7f3f0 100644 --- a/drivers/gpu/drm/amd/display/dc/dml/dcn30/display_mode_vba_30.c +++ b/drivers/gpu/drm/amd/display/dc/dml/dcn30/display_mode_vba_30.c @@ -1868,7 +1868,10 @@ static unsigned int CalculateVMAndRowBytes( }
if (SurfaceTiling == dm_sw_linear) { - *dpte_row_height = dml_min(128, 1 << (unsigned int) dml_floor(dml_log2(PTEBufferSizeInRequests * *PixelPTEReqWidth / Pitch), 1)); + if (PTEBufferSizeInRequests == 0) + *dpte_row_height = 1; + else + *dpte_row_height = dml_min(128, 1 << (unsigned int) dml_floor(dml_log2(PTEBufferSizeInRequests * *PixelPTEReqWidth / Pitch), 1)); *dpte_row_width_ub = (dml_ceil(((double) SwathWidth - 1) / *PixelPTEReqWidth, 1) + 1) * *PixelPTEReqWidth; *PixelPTEBytesPerRow = *dpte_row_width_ub / *PixelPTEReqWidth * *PTERequestSize; } else if (ScanDirection != dm_vert) {
From: Theodore Ts'o tytso@mit.edu
commit 70e42feab2e20618ddd0cbfc4ab4b08628236ecd upstream.
Fixes: 0813299c586b ("ext4: Fix possible corruption when moving a directory") Link: https://lore.kernel.org/r/5efbe1b9-ad8b-4a4f-b422-24824d2b775c@kili.mountain Reported-by: Dan Carpenter error27@gmail.com Reported-by: syzbot+0c73d1d8b952c5f3d714@syzkaller.appspotmail.com Signed-off-by: Theodore Ts'o tytso@mit.edu Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ext4/namei.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-)
--- a/fs/ext4/namei.c +++ b/fs/ext4/namei.c @@ -3934,10 +3934,8 @@ static int ext4_rename(struct inode *old goto end_rename; } retval = ext4_rename_dir_prepare(handle, &old); - if (retval) { - inode_unlock(old.inode); + if (retval) goto end_rename; - } } /* * If we're renaming a file within an inline_data dir and adding or
From: Sherry Sun sherry.sun@nxp.com
commit 2411fd94ceaa6e11326e95d6ebf876cbfed28d23 upstream.
According to LPUART RM, Transmission Complete Flag becomes 0 if queuing a break character by writing 1 to CTRL[SBK], so here need to skip waiting for transmission complete when UARTCTRL_SBK is asserted, otherwise the kernel may stuck here. And actually set_termios() adds transmission completion waiting to avoid data loss or data breakage when changing the baud rate, but we don't need to worry about this when queuing break characters.
Signed-off-by: Sherry Sun sherry.sun@nxp.com Cc: stable stable@kernel.org Link: https://lore.kernel.org/r/20230223093941.31790-1-sherry.sun@nxp.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/tty/serial/fsl_lpuart.c | 12 +++++++++--- 1 file changed, 9 insertions(+), 3 deletions(-)
--- a/drivers/tty/serial/fsl_lpuart.c +++ b/drivers/tty/serial/fsl_lpuart.c @@ -2159,9 +2159,15 @@ lpuart32_set_termios(struct uart_port *p /* update the per-port timeout */ uart_update_timeout(port, termios->c_cflag, baud);
- /* wait transmit engin complete */ - lpuart32_write(&sport->port, 0, UARTMODIR); - lpuart32_wait_bit_set(&sport->port, UARTSTAT, UARTSTAT_TC); + /* + * LPUART Transmission Complete Flag may never be set while queuing a break + * character, so skip waiting for transmission complete when UARTCTRL_SBK is + * asserted. + */ + if (!(old_ctrl & UARTCTRL_SBK)) { + lpuart32_write(&sport->port, 0, UARTMODIR); + lpuart32_wait_bit_set(&sport->port, UARTSTAT, UARTSTAT_TC); + }
/* disable transmit and receive */ lpuart32_write(&sport->port, old_ctrl & ~(UARTCTRL_TE | UARTCTRL_RE),
From: Biju Das biju.das.jz@bp.renesas.com
commit 32e293be736b853f168cd065d9cbc1b0c69f545d upstream.
As per HW manual for EMEV2 "R19UH0040EJ0400 Rev.4.00", the UART IP found on EMMA mobile SoC is Register-compatible with the general-purpose 16750 UART chip. Fix UART port type as 16750 and enable 64-bytes fifo support.
Fixes: 22886ee96895 ("serial8250-em: Emma Mobile UART driver V2") Cc: stable@vger.kernel.org Signed-off-by: Biju Das biju.das.jz@bp.renesas.com Link: https://lore.kernel.org/r/20230227114152.22265-2-biju.das.jz@bp.renesas.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/tty/serial/8250/8250_em.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/tty/serial/8250/8250_em.c +++ b/drivers/tty/serial/8250/8250_em.c @@ -106,8 +106,8 @@ static int serial8250_em_probe(struct pl memset(&up, 0, sizeof(up)); up.port.mapbase = regs->start; up.port.irq = irq; - up.port.type = PORT_UNKNOWN; - up.port.flags = UPF_BOOT_AUTOCONF | UPF_FIXED_PORT | UPF_IOREMAP; + up.port.type = PORT_16750; + up.port.flags = UPF_FIXED_PORT | UPF_IOREMAP | UPF_FIXED_TYPE; up.port.dev = &pdev->dev; up.port.private_data = priv;
From: Roman Gushchin roman.gushchin@linux.dev
commit 38ed310c22e7a0fc978b1f8292136a4a4a8b3051 upstream.
The following issue was discovered using lockdep: [ 6.691371] BUG: sleeping function called from invalid context at include/linux/sched/mm.h:209 [ 6.694602] in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 1, name: swapper/0 [ 6.702431] 2 locks held by swapper/0/1: [ 6.706300] #0: ffffff8800f6f188 (&dev->mutex){....}-{3:3}, at: __device_driver_lock+0x4c/0x90 [ 6.714900] #1: ffffffc009a2abb8 (enable_lock){....}-{2:2}, at: clk_enable_lock+0x4c/0x140 [ 6.723156] irq event stamp: 304030 [ 6.726596] hardirqs last enabled at (304029): [<ffffffc008d17ee0>] _raw_spin_unlock_irqrestore+0xc0/0xd0 [ 6.736142] hardirqs last disabled at (304030): [<ffffffc00876bc5c>] clk_enable_lock+0xfc/0x140 [ 6.744742] softirqs last enabled at (303958): [<ffffffc0080904f0>] _stext+0x4f0/0x894 [ 6.752655] softirqs last disabled at (303951): [<ffffffc0080e53b8>] irq_exit+0x238/0x280 [ 6.760744] CPU: 1 PID: 1 Comm: swapper/0 Tainted: G U 5.15.36 #2 [ 6.768048] Hardware name: xlnx,zynqmp (DT) [ 6.772179] Call trace: [ 6.774584] dump_backtrace+0x0/0x300 [ 6.778197] show_stack+0x18/0x30 [ 6.781465] dump_stack_lvl+0xb8/0xec [ 6.785077] dump_stack+0x1c/0x38 [ 6.788345] ___might_sleep+0x1a8/0x2a0 [ 6.792129] __might_sleep+0x6c/0xd0 [ 6.795655] kmem_cache_alloc_trace+0x270/0x3d0 [ 6.800127] do_feature_check_call+0x100/0x220 [ 6.804513] zynqmp_pm_invoke_fn+0x8c/0xb0 [ 6.808555] zynqmp_pm_clock_getstate+0x90/0xe0 [ 6.813027] zynqmp_pll_is_enabled+0x8c/0x120 [ 6.817327] zynqmp_pll_enable+0x38/0xc0 [ 6.821197] clk_core_enable+0x144/0x400 [ 6.825067] clk_core_enable+0xd4/0x400 [ 6.828851] clk_core_enable+0xd4/0x400 [ 6.832635] clk_core_enable+0xd4/0x400 [ 6.836419] clk_core_enable+0xd4/0x400 [ 6.840203] clk_core_enable+0xd4/0x400 [ 6.843987] clk_core_enable+0xd4/0x400 [ 6.847771] clk_core_enable+0xd4/0x400 [ 6.851555] clk_core_enable_lock+0x24/0x50 [ 6.855683] clk_enable+0x24/0x40 [ 6.858952] fclk_probe+0x84/0xf0 [ 6.862220] platform_probe+0x8c/0x110 [ 6.865918] really_probe+0x110/0x5f0 [ 6.869530] __driver_probe_device+0xcc/0x210 [ 6.873830] driver_probe_device+0x64/0x140 [ 6.877958] __driver_attach+0x114/0x1f0 [ 6.881828] bus_for_each_dev+0xe8/0x160 [ 6.885698] driver_attach+0x34/0x50 [ 6.889224] bus_add_driver+0x228/0x300 [ 6.893008] driver_register+0xc0/0x1e0 [ 6.896792] __platform_driver_register+0x44/0x60 [ 6.901436] fclk_driver_init+0x1c/0x28 [ 6.905220] do_one_initcall+0x104/0x590 [ 6.909091] kernel_init_freeable+0x254/0x2bc [ 6.913390] kernel_init+0x24/0x130 [ 6.916831] ret_from_fork+0x10/0x20
Fix it by passing the GFP_ATOMIC gfp flag for the corresponding memory allocation.
Fixes: acfdd18591ea ("firmware: xilinx: Use hash-table for api feature check") Cc: stable stable@kernel.org Signed-off-by: Roman Gushchin roman.gushchin@linux.dev Cc: Amit Sunil Dhamne amit.sunil.dhamne@xilinx.com Cc: Michal Simek michal.simek@xilinx.com Cc: linux-arm-kernel@lists.infradead.org Cc: linux-kernel@vger.kernel.org Link: https://lore.kernel.org/r/20230308222602.123866-1-roman.gushchin@linux.dev Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/firmware/xilinx/zynqmp.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/firmware/xilinx/zynqmp.c +++ b/drivers/firmware/xilinx/zynqmp.c @@ -171,7 +171,7 @@ static int zynqmp_pm_feature(u32 api_id) }
/* Add new entry if not present */ - feature_data = kmalloc(sizeof(*feature_data), GFP_KERNEL); + feature_data = kmalloc(sizeof(*feature_data), GFP_ATOMIC); if (!feature_data) return -ENOMEM;
From: Johan Hovold johan+linaro@kernel.org
commit a5904f415e1af72fa8fe6665aa4f554dc2099a95 upstream.
The node link array is allocated when adding links to a node but is not deallocated when nodes are destroyed.
Fixes: 11f1ceca7031 ("interconnect: Add generic on-chip interconnect API") Cc: stable@vger.kernel.org # 5.1 Reviewed-by: Konrad Dybcio konrad.dybcio@linaro.org Signed-off-by: Johan Hovold johan+linaro@kernel.org Tested-by: Luca Ceresoli luca.ceresoli@bootlin.com # i.MX8MP MSC SM2-MB-EP1 Board Link: https://lore.kernel.org/r/20230306075651.2449-2-johan+linaro@kernel.org Signed-off-by: Georgi Djakov djakov@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/interconnect/core.c | 4 ++++ 1 file changed, 4 insertions(+)
--- a/drivers/interconnect/core.c +++ b/drivers/interconnect/core.c @@ -850,6 +850,10 @@ void icc_node_destroy(int id)
mutex_unlock(&icc_lock);
+ if (!node) + return; + + kfree(node->links); kfree(node); } EXPORT_SYMBOL_GPL(icc_node_destroy);
From: Sung-hun Kim sfoon.kim@samsung.com
commit e400be674a1a40e9dcb2e95f84d6c1fd2d88f31d upstream.
Since the commit 36e2c7421f02 ("fs: don't allow splice read/write without explicit ops") is applied to the kernel, splice() and sendfile() calls on the trace file (/sys/kernel/debug/tracing /trace) return EINVAL.
This patch restores these system calls by initializing splice_read in file_operations of the trace file. This patch only enables such functionalities for the read case.
Link: https://lore.kernel.org/linux-trace-kernel/20230314013707.28814-1-sfoon.kim@...
Cc: stable@vger.kernel.org Fixes: 36e2c7421f02 ("fs: don't allow splice read/write without explicit ops") Signed-off-by: Sung-hun Kim sfoon.kim@samsung.com Signed-off-by: Steven Rostedt (Google) rostedt@goodmis.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/trace/trace.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c @@ -4705,6 +4705,8 @@ loff_t tracing_lseek(struct file *file, static const struct file_operations tracing_fops = { .open = tracing_open, .read = seq_read, + .read_iter = seq_read_iter, + .splice_read = generic_file_splice_read, .write = tracing_write_stub, .llseek = tracing_lseek, .release = tracing_release,
From: Steven Rostedt (Google) rostedt@goodmis.org
commit 9f116f76fa8c04c81aef33ad870dbf9a158e5b70 upstream.
The function hist_field_name() cannot handle being passed a NULL field parameter. It should never be NULL, but due to a previous bug, NULL was passed to the function and the kernel crashed due to a NULL dereference. Mark Rutland reported this to me on IRC.
The bug was fixed, but to prevent future bugs from crashing the kernel, check the field and add a WARN_ON() if it is NULL.
Link: https://lkml.kernel.org/r/20230302020810.762384440@goodmis.org
Cc: stable@vger.kernel.org Cc: Masami Hiramatsu mhiramat@kernel.org Cc: Andrew Morton akpm@linux-foundation.org Reported-by: Mark Rutland mark.rutland@arm.com Fixes: c6afad49d127f ("tracing: Add hist trigger 'sym' and 'sym-offset' modifiers") Tested-by: Mark Rutland mark.rutland@arm.com Signed-off-by: Steven Rostedt (Google) rostedt@goodmis.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/trace/trace_events_hist.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/kernel/trace/trace_events_hist.c +++ b/kernel/trace/trace_events_hist.c @@ -1087,6 +1087,9 @@ static const char *hist_field_name(struc { const char *field_name = "";
+ if (WARN_ON_ONCE(!field)) + return field_name; + if (level > 1) return field_name;
From: Steven Rostedt (Google) rostedt@goodmis.org
commit c2679254b9c9980d9045f0f722cf093a2b1f7590 upstream.
A while ago where the trace events had the following:
rcu_read_lock_sched_notrace(); rcu_dereference_sched(...); rcu_read_unlock_sched_notrace();
If the tracepoint is enabled, it could trigger RCU issues if called in the wrong place. And this warning was only triggered if lockdep was enabled. If the tracepoint was never enabled with lockdep, the bug would not be caught. To handle this, the above sequence was done when lockdep was enabled regardless if the tracepoint was enabled or not (although the always enabled code really didn't do anything, it would still trigger a warning).
But a lot has changed since that lockdep code was added. One is, that sequence no longer triggers any warning. Another is, the tracepoint when enabled doesn't even do that sequence anymore.
The main check we care about today is whether RCU is "watching" or not. So if lockdep is enabled, always check if rcu_is_watching() which will trigger a warning if it is not (tracepoints require RCU to be watching).
Note, that old sequence did add a bit of overhead when lockdep was enabled, and with the latest kernel updates, would cause the system to slow down enough to trigger kernel "stalled" warnings.
Link: http://lore.kernel.org/lkml/20140806181801.GA4605@redhat.com Link: http://lore.kernel.org/lkml/20140807175204.C257CAC5@viggo.jf.intel.com Link: https://lore.kernel.org/lkml/20230307184645.521db5c9@gandalf.local.home/ Link: https://lore.kernel.org/linux-trace-kernel/20230310172856.77406446@gandalf.l...
Cc: stable@vger.kernel.org Cc: Masami Hiramatsu mhiramat@kernel.org Cc: Dave Hansen dave.hansen@linux.intel.com Cc: "Paul E. McKenney" paulmck@kernel.org Cc: Mathieu Desnoyers mathieu.desnoyers@efficios.com Cc: Joel Fernandes joel@joelfernandes.org Acked-by: Peter Zijlstra (Intel) peterz@infradead.org Acked-by: Paul E. McKenney paulmck@kernel.org Fixes: e6753f23d961 ("tracepoint: Make rcuidle tracepoint callers use SRCU") Signed-off-by: Steven Rostedt (Google) rostedt@goodmis.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/linux/tracepoint.h | 15 ++++++--------- 1 file changed, 6 insertions(+), 9 deletions(-)
--- a/include/linux/tracepoint.h +++ b/include/linux/tracepoint.h @@ -234,12 +234,11 @@ static inline struct tracepoint *tracepo * not add unwanted padding between the beginning of the section and the * structure. Force alignment to the same alignment as the section start. * - * When lockdep is enabled, we make sure to always do the RCU portions of - * the tracepoint code, regardless of whether tracing is on. However, - * don't check if the condition is false, due to interaction with idle - * instrumentation. This lets us find RCU issues triggered with tracepoints - * even when this tracepoint is off. This code has no purpose other than - * poking RCU a bit. + * When lockdep is enabled, we make sure to always test if RCU is + * "watching" regardless if the tracepoint is enabled or not. Tracepoints + * require RCU to be active, and it should always warn at the tracepoint + * site if it is not watching, as it will need to be active when the + * tracepoint is enabled. */ #define __DECLARE_TRACE(name, proto, args, cond, data_proto, data_args) \ extern int __traceiter_##name(data_proto); \ @@ -253,9 +252,7 @@ static inline struct tracepoint *tracepo TP_ARGS(data_args), \ TP_CONDITION(cond), 0); \ if (IS_ENABLED(CONFIG_LOCKDEP) && (cond)) { \ - rcu_read_lock_sched_notrace(); \ - rcu_dereference_sched(__tracepoint_##name.funcs);\ - rcu_read_unlock_sched_notrace(); \ + WARN_ON_ONCE(!rcu_is_watching()); \ } \ } \ __DECLARE_TRACE_RCU(name, PARAMS(proto), PARAMS(args), \
From: Volker Lendecke vl@samba.org
commit 211baef0eabf4169ce4f73ebd917749d1a7edd74 upstream.
If cifs_get_writable_path() finds a writable file, smb2_compound_op() must use that file's FID and not the COMPOUND_FID.
Cc: stable@vger.kernel.org Signed-off-by: Volker Lendecke vl@samba.org Reviewed-by: Paulo Alcantara (SUSE) pc@manguebit.com Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/cifs/smb2inode.c | 31 ++++++++++++++++++++++++------- 1 file changed, 24 insertions(+), 7 deletions(-)
--- a/fs/cifs/smb2inode.c +++ b/fs/cifs/smb2inode.c @@ -236,15 +236,32 @@ smb2_compound_op(const unsigned int xid, size[0] = 8; /* sizeof __le64 */ data[0] = ptr;
- rc = SMB2_set_info_init(tcon, server, - &rqst[num_rqst], COMPOUND_FID, - COMPOUND_FID, current->tgid, - FILE_END_OF_FILE_INFORMATION, - SMB2_O_INFO_FILE, 0, data, size); + if (cfile) { + rc = SMB2_set_info_init(tcon, server, + &rqst[num_rqst], + cfile->fid.persistent_fid, + cfile->fid.volatile_fid, + current->tgid, + FILE_END_OF_FILE_INFORMATION, + SMB2_O_INFO_FILE, 0, + data, size); + } else { + rc = SMB2_set_info_init(tcon, server, + &rqst[num_rqst], + COMPOUND_FID, + COMPOUND_FID, + current->tgid, + FILE_END_OF_FILE_INFORMATION, + SMB2_O_INFO_FILE, 0, + data, size); + if (!rc) { + smb2_set_next_command(tcon, &rqst[num_rqst]); + smb2_set_related(&rqst[num_rqst]); + } + } if (rc) goto finished; - smb2_set_next_command(tcon, &rqst[num_rqst]); - smb2_set_related(&rqst[num_rqst++]); + num_rqst++; trace_smb3_set_eof_enter(xid, ses->Suid, tcon->tid, full_path); break; case SMB2_OP_SET_INFO:
From: Paolo Bonzini pbonzini@redhat.com
commit 112e66017bff7f2837030f34c2bc19501e9212d5 upstream.
The effective values of the guest CR0 and CR4 registers may differ from those included in the VMCS12. In particular, disabling EPT forces CR4.PAE=1 and disabling unrestricted guest mode forces CR0.PG=CR0.PE=1.
Therefore, checks on these bits cannot be delegated to the processor and must be performed by KVM.
Reported-by: Reima ISHII ishiir@g.ecc.u-tokyo.ac.jp Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kvm/vmx/nested.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-)
--- a/arch/x86/kvm/vmx/nested.c +++ b/arch/x86/kvm/vmx/nested.c @@ -2998,7 +2998,7 @@ static int nested_vmx_check_guest_state( struct vmcs12 *vmcs12, enum vm_entry_failure_code *entry_failure_code) { - bool ia32e; + bool ia32e = !!(vmcs12->vm_entry_controls & VM_ENTRY_IA32E_MODE);
*entry_failure_code = ENTRY_FAIL_DEFAULT;
@@ -3024,6 +3024,13 @@ static int nested_vmx_check_guest_state( vmcs12->guest_ia32_perf_global_ctrl))) return -EINVAL;
+ if (CC((vmcs12->guest_cr0 & (X86_CR0_PG | X86_CR0_PE)) == X86_CR0_PG)) + return -EINVAL; + + if (CC(ia32e && !(vmcs12->guest_cr4 & X86_CR4_PAE)) || + CC(ia32e && !(vmcs12->guest_cr0 & X86_CR0_PG))) + return -EINVAL; + /* * If the load IA32_EFER VM-entry control is 1, the following checks * are performed on the field for the IA32_EFER MSR: @@ -3035,7 +3042,6 @@ static int nested_vmx_check_guest_state( */ if (to_vmx(vcpu)->nested.nested_run_pending && (vmcs12->vm_entry_controls & VM_ENTRY_LOAD_IA32_EFER)) { - ia32e = (vmcs12->vm_entry_controls & VM_ENTRY_IA32E_MODE) != 0; if (CC(!kvm_valid_efer(vcpu, vmcs12->guest_ia32_efer)) || CC(ia32e != !!(vmcs12->guest_ia32_efer & EFER_LMA)) || CC(((vmcs12->guest_cr0 & X86_CR0_PG) &&
From: Bard Liao yung-chuan.liao@linux.intel.com
commit bbdf904b13a62bb8b1272d92a7dde082dff86fbb upstream.
Use SOF as default audio driver.
Signed-off-by: Bard Liao yung-chuan.liao@linux.intel.com Reviewed-by: Gongjun Song gongjun.song@intel.com Reviewed-by: Kai Vehmanen kai.vehmanen@linux.intel.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230306074101.3906707-1-yung-chuan.liao@linux.int... Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/hda/intel-dsp-config.c | 9 +++++++++ 1 file changed, 9 insertions(+)
--- a/sound/hda/intel-dsp-config.c +++ b/sound/hda/intel-dsp-config.c @@ -359,6 +359,15 @@ static const struct config_entry config_ }, #endif
+/* Meteor Lake */ +#if IS_ENABLED(CONFIG_SND_SOC_SOF_METEORLAKE) + /* Meteorlake-P */ + { + .flags = FLAG_SOF | FLAG_SOF_ONLY_IF_DMIC_OR_SOUNDWIRE, + .device = 0x7e28, + }, +#endif + };
static const struct config_entry *snd_intel_dsp_find_config
From: Hamidreza H. Fard nitocris@posteo.net
commit a86e79e3015f5dd8e1b01ccfa49bd5c6e41047a1 upstream.
Samsung Galaxy Book2 Pro (13" 2022 NP930XED-KA1DE) with codec SSID 144d:c868 requires the same workaround for enabling the speaker amp like other Samsung models with ALC298 code.
Signed-off-by: Hamidreza H. Fard nitocris@posteo.net Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230307163741.3878-1-nitocris@posteo.net Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/pci/hda/patch_realtek.c | 1 + 1 file changed, 1 insertion(+)
--- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -9091,6 +9091,7 @@ static const struct snd_pci_quirk alc269 SND_PCI_QUIRK(0x144d, 0xc830, "Samsung Galaxy Book Ion (NT950XCJ-X716A)", ALC298_FIXUP_SAMSUNG_AMP), SND_PCI_QUIRK(0x144d, 0xc832, "Samsung Galaxy Book Flex Alpha (NP730QCJ)", ALC256_FIXUP_SAMSUNG_HEADPHONE_VERY_QUIET), SND_PCI_QUIRK(0x144d, 0xca03, "Samsung Galaxy Book2 Pro 360 (NP930QED)", ALC298_FIXUP_SAMSUNG_AMP), + SND_PCI_QUIRK(0x144d, 0xc868, "Samsung Galaxy Book2 Pro (NP930XED)", ALC298_FIXUP_SAMSUNG_AMP), SND_PCI_QUIRK(0x1458, 0xfa53, "Gigabyte BXBT-2807", ALC283_FIXUP_HEADSET_MIC), SND_PCI_QUIRK(0x1462, 0xb120, "MSI Cubi MS-B120", ALC283_FIXUP_HEADSET_MIC), SND_PCI_QUIRK(0x1462, 0xb171, "Cubi N 8GL (MS-B171)", ALC283_FIXUP_HEADSET_MIC),
From: Dmitry Osipenko dmitry.osipenko@collabora.com
commit ee9adb7a45516cfa536ca92253d7ae59d56db9e4 upstream.
drm_gem_shmem_mmap() doesn't own reference in error code path, resulting in the dma-buf shmem GEM object getting prematurely freed leading to a later use-after-free.
Fixes: f49a51bfdc8e ("drm/shme-helpers: Fix dma_buf_mmap forwarding bug") Cc: stable@vger.kernel.org Signed-off-by: Dmitry Osipenko dmitry.osipenko@collabora.com Reviewed-by: Rob Clark robdclark@gmail.com Link: https://patchwork.freedesktop.org/patch/msgid/20230108211311.3950107-1-dmitr... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/drm_gem_shmem_helper.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-)
--- a/drivers/gpu/drm/drm_gem_shmem_helper.c +++ b/drivers/gpu/drm/drm_gem_shmem_helper.c @@ -614,11 +614,14 @@ int drm_gem_shmem_mmap(struct drm_gem_ob int ret;
if (obj->import_attach) { - /* Drop the reference drm_gem_mmap_obj() acquired.*/ - drm_gem_object_put(obj); vma->vm_private_data = NULL; + ret = dma_buf_mmap(obj->dma_buf, vma, 0); + + /* Drop the reference drm_gem_mmap_obj() acquired.*/ + if (!ret) + drm_gem_object_put(obj);
- return dma_buf_mmap(obj->dma_buf, vma, 0); + return ret; }
shmem = to_drm_gem_shmem_obj(obj);
From: Matthieu Baerts matthieu.baerts@tessares.net
commit 3ba14528684f528566fb7d956bfbfb958b591d86 upstream.
tcp_set_state() is called from tcp_done() already.
There is then no need to first set the state to TCP_CLOSE, then call tcp_done().
Fixes: d582484726c4 ("mptcp: fix fallback for MP_JOIN subflows") Cc: stable@vger.kernel.org Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/362 Acked-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Matthieu Baerts matthieu.baerts@tessares.net Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/mptcp/subflow.c | 1 - 1 file changed, 1 deletion(-)
--- a/net/mptcp/subflow.c +++ b/net/mptcp/subflow.c @@ -275,7 +275,6 @@ void mptcp_subflow_reset(struct sock *ss struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(ssk); struct sock *sk = subflow->conn;
- tcp_set_state(ssk, TCP_CLOSE); tcp_send_active_reset(ssk, GFP_ATOMIC); tcp_done(ssk); if (!test_and_set_bit(MPTCP_WORK_CLOSE_SUBFLOW, &mptcp_sk(sk)->flags) &&
From: Chen Zhongjin chenzhongjin@huawei.com
commit ee92fa443358f4fc0017c1d0d325c27b37802504 upstream.
KASAN reported follow problem:
BUG: KASAN: use-after-free in lookup_rec Read of size 8 at addr ffff000199270ff0 by task modprobe CPU: 2 Comm: modprobe Call trace: kasan_report __asan_load8 lookup_rec ftrace_location arch_check_ftrace_location check_kprobe_address_safe register_kprobe
When checking pg->records[pg->index - 1].ip in lookup_rec(), it can get a pg which is newly added to ftrace_pages_start in ftrace_process_locs(). Before the first pg->index++, index is 0 and accessing pg->records[-1].ip will cause this problem.
Don't check the ip when pg->index is 0.
Link: https://lore.kernel.org/linux-trace-kernel/20230309080230.36064-1-chenzhongj...
Cc: stable@vger.kernel.org Fixes: 9644302e3315 ("ftrace: Speed up search by skipping pages by address") Suggested-by: Steven Rostedt (Google) rostedt@goodmis.org Signed-off-by: Chen Zhongjin chenzhongjin@huawei.com Signed-off-by: Steven Rostedt (Google) rostedt@goodmis.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/trace/ftrace.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/kernel/trace/ftrace.c +++ b/kernel/trace/ftrace.c @@ -1538,7 +1538,8 @@ static struct dyn_ftrace *lookup_rec(uns key.flags = end; /* overload flags, as it is unsigned long */
for (pg = ftrace_pages_start; pg; pg = pg->next) { - if (end < pg->records[0].ip || + if (pg->index == 0 || + end < pg->records[0].ip || start >= (pg->records[pg->index - 1].ip + MCOUNT_INSN_SIZE)) continue; rec = bsearch(&key, pg->records, pg->index,
From: David Hildenbrand david@redhat.com
commit 42b2af2c9b7eede8ef21d0943f84d135e21a32a3 upstream.
Currently, we'd lose the userfaultfd-wp marker when PTE-mapping a huge zeropage, resulting in the next write faults in the PMD range not triggering uffd-wp events.
Various actions (partial MADV_DONTNEED, partial mremap, partial munmap, partial mprotect) could trigger this. However, most importantly, un-protecting a single sub-page from the userfaultfd-wp handler when processing a uffd-wp event will PTE-map the shared huge zeropage and lose the uffd-wp bit for the remainder of the PMD.
Let's properly propagate the uffd-wp bit to the PMDs.
#define _GNU_SOURCE #include <stdio.h> #include <stdlib.h> #include <stdint.h> #include <stdbool.h> #include <inttypes.h> #include <fcntl.h> #include <unistd.h> #include <errno.h> #include <poll.h> #include <pthread.h> #include <sys/mman.h> #include <sys/syscall.h> #include <sys/ioctl.h> #include <linux/userfaultfd.h>
static size_t pagesize; static int uffd; static volatile bool uffd_triggered;
#define barrier() __asm__ __volatile__("": : :"memory")
static void uffd_wp_range(char *start, size_t size, bool wp) { struct uffdio_writeprotect uffd_writeprotect;
uffd_writeprotect.range.start = (unsigned long) start; uffd_writeprotect.range.len = size; if (wp) { uffd_writeprotect.mode = UFFDIO_WRITEPROTECT_MODE_WP; } else { uffd_writeprotect.mode = 0; } if (ioctl(uffd, UFFDIO_WRITEPROTECT, &uffd_writeprotect)) { fprintf(stderr, "UFFDIO_WRITEPROTECT failed: %d\n", errno); exit(1); } }
static void *uffd_thread_fn(void *arg) { static struct uffd_msg msg; ssize_t nread;
while (1) { struct pollfd pollfd; int nready;
pollfd.fd = uffd; pollfd.events = POLLIN; nready = poll(&pollfd, 1, -1); if (nready == -1) { fprintf(stderr, "poll() failed: %d\n", errno); exit(1); }
nread = read(uffd, &msg, sizeof(msg)); if (nread <= 0) continue;
if (msg.event != UFFD_EVENT_PAGEFAULT || !(msg.arg.pagefault.flags & UFFD_PAGEFAULT_FLAG_WP)) { printf("FAIL: wrong uffd-wp event fired\n"); exit(1); }
/* un-protect the single page. */ uffd_triggered = true; uffd_wp_range((char *)(uintptr_t)msg.arg.pagefault.address, pagesize, false); } return arg; }
static int setup_uffd(char *map, size_t size) { struct uffdio_api uffdio_api; struct uffdio_register uffdio_register; pthread_t thread;
uffd = syscall(__NR_userfaultfd, O_CLOEXEC | O_NONBLOCK | UFFD_USER_MODE_ONLY); if (uffd < 0) { fprintf(stderr, "syscall() failed: %d\n", errno); return -errno; }
uffdio_api.api = UFFD_API; uffdio_api.features = UFFD_FEATURE_PAGEFAULT_FLAG_WP; if (ioctl(uffd, UFFDIO_API, &uffdio_api) < 0) { fprintf(stderr, "UFFDIO_API failed: %d\n", errno); return -errno; }
if (!(uffdio_api.features & UFFD_FEATURE_PAGEFAULT_FLAG_WP)) { fprintf(stderr, "UFFD_FEATURE_WRITEPROTECT missing\n"); return -ENOSYS; }
uffdio_register.range.start = (unsigned long) map; uffdio_register.range.len = size; uffdio_register.mode = UFFDIO_REGISTER_MODE_WP; if (ioctl(uffd, UFFDIO_REGISTER, &uffdio_register) < 0) { fprintf(stderr, "UFFDIO_REGISTER failed: %d\n", errno); return -errno; }
pthread_create(&thread, NULL, uffd_thread_fn, NULL);
return 0; }
int main(void) { const size_t size = 4 * 1024 * 1024ull; char *map, *cur;
pagesize = getpagesize();
map = mmap(NULL, size, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANON, -1, 0); if (map == MAP_FAILED) { fprintf(stderr, "mmap() failed\n"); return -errno; }
if (madvise(map, size, MADV_HUGEPAGE)) { fprintf(stderr, "MADV_HUGEPAGE failed\n"); return -errno; }
if (setup_uffd(map, size)) return 1;
/* Read the whole range, populating zeropages. */ madvise(map, size, MADV_POPULATE_READ);
/* Write-protect the whole range. */ uffd_wp_range(map, size, true);
/* Make sure uffd-wp triggers on each page. */ for (cur = map; cur < map + size; cur += pagesize) { uffd_triggered = false;
barrier(); /* Trigger a write fault. */ *cur = 1; barrier();
if (!uffd_triggered) { printf("FAIL: uffd-wp did not trigger\n"); return 1; } }
printf("PASS: uffd-wp triggered\n"); return 0; }
Link: https://lkml.kernel.org/r/20230302175423.589164-1-david@redhat.com Fixes: e06f1e1dd499 ("userfaultfd: wp: enabled write protection in userfaultfd API") Signed-off-by: David Hildenbrand david@redhat.com Acked-by: Peter Xu peterx@redhat.com Cc: Mike Rapoport rppt@linux.vnet.ibm.com Cc: Andrea Arcangeli aarcange@redhat.com Cc: Jerome Glisse jglisse@redhat.com Cc: Shaohua Li shli@fb.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/huge_memory.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
--- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -1994,7 +1994,7 @@ static void __split_huge_zero_page_pmd(s { struct mm_struct *mm = vma->vm_mm; pgtable_t pgtable; - pmd_t _pmd; + pmd_t _pmd, old_pmd; int i;
/* @@ -2005,7 +2005,7 @@ static void __split_huge_zero_page_pmd(s * * See Documentation/vm/mmu_notifier.rst */ - pmdp_huge_clear_flush(vma, haddr, pmd); + old_pmd = pmdp_huge_clear_flush(vma, haddr, pmd);
pgtable = pgtable_trans_huge_withdraw(mm, pmd); pmd_populate(mm, &_pmd, pgtable); @@ -2014,6 +2014,8 @@ static void __split_huge_zero_page_pmd(s pte_t *pte, entry; entry = pfn_pte(my_zero_pfn(haddr), vma->vm_page_prot); entry = pte_mkspecial(entry); + if (pmd_uffd_wp(old_pmd)) + entry = pte_mkuffd_wp(entry); pte = pte_offset_map(&_pmd, haddr); VM_BUG_ON(!pte_none(*pte)); set_pte_at(mm, haddr, pte, entry);
From: Francesco Dolcini francesco.dolcini@toradex.com
commit 11440da77d6020831ee6f9ce4551b545dea789ee upstream.
Lower the power-on failed message severity from warn to info when the controller does not power-up. It's normal to have this situation when the SD card slot is empty, therefore we should not warn the user about it.
Fixes: 7ca0f166f5b2 ("mmc: sdhci_am654: Add workaround for card detect debounce timer") Signed-off-by: Francesco Dolcini francesco.dolcini@toradex.com Acked-by: Adrian Hunter adrian.hunter@intel.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230306162751.163369-1-francesco@dolcini.it Signed-off-by: Ulf Hansson ulf.hansson@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/mmc/host/sdhci_am654.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/mmc/host/sdhci_am654.c +++ b/drivers/mmc/host/sdhci_am654.c @@ -369,7 +369,7 @@ static void sdhci_am654_write_b(struct s MAX_POWER_ON_TIMEOUT, false, host, val, reg); if (ret) - dev_warn(mmc_dev(host->mmc), "Power on failed\n"); + dev_info(mmc_dev(host->mmc), "Power on failed\n"); } }
From: Helge Deller deller@gmx.de
commit 203873a535d627c668f293be0cb73e26c30f9cc7 upstream.
Find a valid modeline depending on the machine graphic card configuration and add the fb_check_var() function to validate Xorg provided graphics settings.
Signed-off-by: Helge Deller deller@gmx.de Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/video/fbdev/stifb.c | 27 +++++++++++++++++++++++++++ 1 file changed, 27 insertions(+)
--- a/drivers/video/fbdev/stifb.c +++ b/drivers/video/fbdev/stifb.c @@ -922,6 +922,28 @@ SETUP_HCRX(struct stifb_info *fb) /* ------------------- driver specific functions --------------------------- */
static int +stifb_check_var(struct fb_var_screeninfo *var, struct fb_info *info) +{ + struct stifb_info *fb = container_of(info, struct stifb_info, info); + + if (var->xres != fb->info.var.xres || + var->yres != fb->info.var.yres || + var->bits_per_pixel != fb->info.var.bits_per_pixel) + return -EINVAL; + + var->xres_virtual = var->xres; + var->yres_virtual = var->yres; + var->xoffset = 0; + var->yoffset = 0; + var->grayscale = fb->info.var.grayscale; + var->red.length = fb->info.var.red.length; + var->green.length = fb->info.var.green.length; + var->blue.length = fb->info.var.blue.length; + + return 0; +} + +static int stifb_setcolreg(u_int regno, u_int red, u_int green, u_int blue, u_int transp, struct fb_info *info) { @@ -1145,6 +1167,7 @@ stifb_init_display(struct stifb_info *fb
static const struct fb_ops stifb_ops = { .owner = THIS_MODULE, + .fb_check_var = stifb_check_var, .fb_setcolreg = stifb_setcolreg, .fb_blank = stifb_blank, .fb_fillrect = stifb_fillrect, @@ -1164,6 +1187,7 @@ static int __init stifb_init_fb(struct s struct stifb_info *fb; struct fb_info *info; unsigned long sti_rom_address; + char modestr[32]; char *dev_name; int bpp, xres, yres;
@@ -1342,6 +1366,9 @@ static int __init stifb_init_fb(struct s info->flags = FBINFO_HWACCEL_COPYAREA | FBINFO_HWACCEL_FILLRECT; info->pseudo_palette = &fb->pseudo_palette;
+ scnprintf(modestr, sizeof(modestr), "%dx%d-%d", xres, yres, bpp); + fb_find_mode(&info->var, info, modestr, NULL, 0, NULL, bpp); + /* This has to be done !!! */ if (fb_alloc_cmap(&info->cmap, NR_PALETTE, 0)) goto out_err1;
From: Shawn Guo shawn.guo@linaro.org
commit 6b0313c2fa3d2cf991c9ffef6fae6e7ef592ce6d upstream.
In case that psci_pd_init_topology() fails for some reason, psci_pd_remove() will be responsible for deleting provider and removing genpd from psci_pd_providers list. There will be a failure when removing the cluster PD, because the cpu (child) PDs haven't been removed.
[ 0.050232] CPUidle PSCI: init PM domain cpu0 [ 0.050278] CPUidle PSCI: init PM domain cpu1 [ 0.050329] CPUidle PSCI: init PM domain cpu2 [ 0.050370] CPUidle PSCI: init PM domain cpu3 [ 0.050422] CPUidle PSCI: init PM domain cpu-cluster0 [ 0.050475] PM: genpd_remove: unable to remove cpu-cluster0 [ 0.051412] PM: genpd_remove: removed cpu3 [ 0.051449] PM: genpd_remove: removed cpu2 [ 0.051499] PM: genpd_remove: removed cpu1 [ 0.051546] PM: genpd_remove: removed cpu0
Fix the problem by iterating the provider list reversely, so that parent PD gets removed after child's PDs like below.
[ 0.029052] CPUidle PSCI: init PM domain cpu0 [ 0.029076] CPUidle PSCI: init PM domain cpu1 [ 0.029103] CPUidle PSCI: init PM domain cpu2 [ 0.029124] CPUidle PSCI: init PM domain cpu3 [ 0.029151] CPUidle PSCI: init PM domain cpu-cluster0 [ 0.029647] PM: genpd_remove: removed cpu0 [ 0.029666] PM: genpd_remove: removed cpu1 [ 0.029690] PM: genpd_remove: removed cpu2 [ 0.029714] PM: genpd_remove: removed cpu3 [ 0.029738] PM: genpd_remove: removed cpu-cluster0
Fixes: a65a397f2451 ("cpuidle: psci: Add support for PM domains by using genpd") Reviewed-by: Sudeep Holla sudeep.holla@arm.com Reviewed-by: Ulf Hansson ulf.hansson@linaro.org Signed-off-by: Shawn Guo shawn.guo@linaro.org Cc: 5.10+ stable@vger.kernel.org # 5.10+ Signed-off-by: Rafael J. Wysocki rjw@rjwysocki.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/cpuidle/cpuidle-psci-domain.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/drivers/cpuidle/cpuidle-psci-domain.c +++ b/drivers/cpuidle/cpuidle-psci-domain.c @@ -182,7 +182,8 @@ static void psci_pd_remove(void) struct psci_pd_provider *pd_provider, *it; struct generic_pm_domain *genpd;
- list_for_each_entry_safe(pd_provider, it, &psci_pd_providers, link) { + list_for_each_entry_safe_reverse(pd_provider, it, + &psci_pd_providers, link) { of_genpd_del_provider(pd_provider->node);
genpd = of_genpd_remove_last(pd_provider->node);
From: Yazen Ghannam yazen.ghannam@amd.com
commit 4783b9cb374af02d49740e00e2da19fd4ed6dec4 upstream.
A recent change introduced a flag to queue up errors found during boot-time polling. These errors will be processed during late init once the MCE subsystem is fully set up.
A number of sysfs updates call mce_restart() which goes through a subset of the CPU init flow. This includes polling MCA banks and logging any errors found. Since the same function is used as boot-time polling, errors will be queued. However, the system is now past late init, so the errors will remain queued until another error is found and the workqueue is triggered.
Call mce_schedule_work() at the end of mce_restart() so that queued errors are processed.
Fixes: 3bff147b187d ("x86/mce: Defer processing of early errors") Signed-off-by: Yazen Ghannam yazen.ghannam@amd.com Signed-off-by: Borislav Petkov (AMD) bp@alien8.de Reviewed-by: Tony Luck tony.luck@intel.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230301221420.2203184-1-yazen.ghannam@amd.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kernel/cpu/mce/core.c | 1 + 1 file changed, 1 insertion(+)
--- a/arch/x86/kernel/cpu/mce/core.c +++ b/arch/x86/kernel/cpu/mce/core.c @@ -2309,6 +2309,7 @@ static void mce_restart(void) { mce_timer_delete_all(); on_each_cpu(mce_cpu_restart, NULL, 1); + mce_schedule_work(); }
/* Toggle features for corrected errors */
From: Nikita Zhandarovich n.zhandarovich@fintech.ru
commit cbebd68f59f03633469f3ecf9bea99cd6cce3854 upstream.
cmdline_find_option() may fail before doing any initialization of the buffer array. This may lead to unpredictable results when the same buffer is used later in calls to strncmp() function. Fix the issue by returning early if cmdline_find_option() returns an error.
Found by Linux Verification Center (linuxtesting.org) with static analysis tool SVACE.
Fixes: aca20d546214 ("x86/mm: Add support to make use of Secure Memory Encryption") Signed-off-by: Nikita Zhandarovich n.zhandarovich@fintech.ru Signed-off-by: Borislav Petkov (AMD) bp@alien8.de Acked-by: Tom Lendacky thomas.lendacky@amd.com Cc: stable@kernel.org Link: https://lore.kernel.org/r/20230306160656.14844-1-n.zhandarovich@fintech.ru Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/mm/mem_encrypt_identity.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/arch/x86/mm/mem_encrypt_identity.c +++ b/arch/x86/mm/mem_encrypt_identity.c @@ -586,7 +586,8 @@ void __init sme_enable(struct boot_param cmdline_ptr = (const char *)((u64)bp->hdr.cmd_line_ptr | ((u64)bp->ext_cmd_line_ptr << 32));
- cmdline_find_option(cmdline_ptr, cmdline_arg, buffer, sizeof(buffer)); + if (cmdline_find_option(cmdline_ptr, cmdline_arg, buffer, sizeof(buffer)) < 0) + return;
if (!strncmp(buffer, cmdline_on, sizeof(buffer))) sme_me_mask = me_mask;
From: John Harrison John.C.Harrison@Intel.com
commit 690e0ec8e63da9a29b39fedc6ed5da09c7c82651 upstream.
Direction from hardware is that stolen memory should never be used for ring buffer allocations on platforms with LLC. There are too many caching pitfalls due to the way stolen memory accesses are routed. So it is safest to just not use it.
Signed-off-by: John Harrison John.C.Harrison@Intel.com Fixes: c58b735fc762 ("drm/i915: Allocate rings from stolen") Cc: Chris Wilson chris@chris-wilson.co.uk Cc: Joonas Lahtinen joonas.lahtinen@linux.intel.com Cc: Jani Nikula jani.nikula@linux.intel.com Cc: Rodrigo Vivi rodrigo.vivi@intel.com Cc: Tvrtko Ursulin tvrtko.ursulin@linux.intel.com Cc: intel-gfx@lists.freedesktop.org Cc: stable@vger.kernel.org # v4.9+ Tested-by: Jouni Högander jouni.hogander@intel.com Reviewed-by: Daniele Ceraolo Spurio daniele.ceraolospurio@intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20230216011101.1909009-2-John.... (cherry picked from commit f54c1f6c697c4297f7ed94283c184acc338a5cf8) Signed-off-by: Jani Nikula jani.nikula@intel.com Signed-off-by: John Harrison John.C.Harrison@Intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/i915/gt/intel_ring.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/gpu/drm/i915/gt/intel_ring.c +++ b/drivers/gpu/drm/i915/gt/intel_ring.c @@ -108,7 +108,7 @@ static struct i915_vma *create_ring_vma( struct i915_vma *vma;
obj = ERR_PTR(-ENODEV); - if (i915_ggtt_has_aperture(ggtt)) + if (i915_ggtt_has_aperture(ggtt) && !HAS_LLC(i915)) obj = i915_gem_object_create_stolen(i915, size); if (IS_ERR(obj)) obj = i915_gem_object_create_internal(i915, size);
From: Janusz Krzysztofik janusz.krzysztofik@linux.intel.com
commit e0e6b416b25ee14716f3549e0cbec1011b193809 upstream.
Users reported oopses on list corruptions when using i915 perf with a number of concurrently running graphics applications. Root cause analysis pointed at an issue in barrier processing code -- a race among perf open / close replacing active barriers with perf requests on kernel context and concurrent barrier preallocate / acquire operations performed during user context first pin / last unpin.
When adding a request to a composite tracker, we try to reuse an existing fence tracker, already allocated and registered with that composite. The tracker we obtain may already track another fence, may be an idle barrier, or an active barrier.
If the tracker we get occurs a non-idle barrier then we try to delete that barrier from a list of barrier tasks it belongs to. However, while doing that we don't respect return value from a function that performs the barrier deletion. Should the deletion ever fail, we would end up reusing the tracker still registered as a barrier task. Since the same structure field is reused with both fence callback lists and barrier tasks list, list corruptions would likely occur.
Barriers are now deleted from a barrier tasks list by temporarily removing the list content, traversing that content with skip over the node to be deleted, then populating the list back with the modified content. Should that intentionally racy concurrent deletion attempts be not serialized, one or more of those may fail because of the list being temporary empty.
Related code that ignores the results of barrier deletion was initially introduced in v5.4 by commit d8af05ff38ae ("drm/i915: Allow sharing the idle-barrier from other kernel requests"). However, all users of the barrier deletion routine were apparently serialized at that time, then the issue didn't exhibit itself. Results of git bisect with help of a newly developed igt@gem_barrier_race@remote-request IGT test indicate that list corruptions might start to appear after commit 311770173fac ("drm/i915/gt: Schedule request retirement when timeline idles"), introduced in v5.5.
Respect results of barrier deletion attempts -- mark the barrier as idle only if successfully deleted from the list. Then, before proceeding with setting our fence as the one currently tracked, make sure that the tracker we've got is not a non-idle barrier. If that check fails then don't use that tracker but go back and try to acquire a new, usable one.
v3: use unlikely() to document what outcome we expect (Andi), - fix bad grammar in commit description. v2: no code changes, - blame commit 311770173fac ("drm/i915/gt: Schedule request retirement when timeline idles"), v5.5, not commit d8af05ff38ae ("drm/i915: Allow sharing the idle-barrier from other kernel requests"), v5.4, - reword commit description.
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/6333 Fixes: 311770173fac ("drm/i915/gt: Schedule request retirement when timeline idles") Cc: Chris Wilson chris@chris-wilson.co.uk Cc: stable@vger.kernel.org # v5.5 Cc: Andi Shyti andi.shyti@linux.intel.com Signed-off-by: Janusz Krzysztofik janusz.krzysztofik@linux.intel.com Reviewed-by: Andi Shyti andi.shyti@linux.intel.com Signed-off-by: Andi Shyti andi.shyti@linux.intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20230302120820.48740-1-janusz.... (cherry picked from commit 506006055769b10d1b2b4e22f636f3b45e0e9fc7) Signed-off-by: Jani Nikula jani.nikula@intel.com Signed-off-by: Janusz Krzysztofik janusz.krzysztofik@linux.intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/i915/i915_active.c | 26 ++++++++++++++------------ 1 file changed, 14 insertions(+), 12 deletions(-)
--- a/drivers/gpu/drm/i915/i915_active.c +++ b/drivers/gpu/drm/i915/i915_active.c @@ -432,8 +432,7 @@ replace_barrier(struct i915_active *ref, * we can use it to substitute for the pending idle-barrer * request that we want to emit on the kernel_context. */ - __active_del_barrier(ref, node_from_active(active)); - return true; + return __active_del_barrier(ref, node_from_active(active)); }
int i915_active_ref(struct i915_active *ref, u64 idx, struct dma_fence *fence) @@ -446,16 +445,19 @@ int i915_active_ref(struct i915_active * if (err) return err;
- active = active_instance(ref, idx); - if (!active) { - err = -ENOMEM; - goto out; - } - - if (replace_barrier(ref, active)) { - RCU_INIT_POINTER(active->fence, NULL); - atomic_dec(&ref->count); - } + do { + active = active_instance(ref, idx); + if (!active) { + err = -ENOMEM; + goto out; + } + + if (replace_barrier(ref, active)) { + RCU_INIT_POINTER(active->fence, NULL); + atomic_dec(&ref->count); + } + } while (unlikely(is_barrier(active))); + if (!__i915_active_fence_set(active, fence)) __i915_active_acquire(ref);
From: Fedor Pchelkin pchelkin@ispras.ru
No upstream commit exists for this commit.
The issue was introduced with backporting upstream commit c16bda37594f ("io_uring/poll: allow some retries for poll triggering spuriously").
Memory allocation can possibly fail causing invalid pointer be dereferenced just before comparing it to NULL value.
Move the pointer check in proper place (upstream has the similar location of the check). In case the request has REQ_F_POLLED flag up, apoll can't be NULL so no need to check there.
Found by Linux Verification Center (linuxtesting.org) with Syzkaller.
Signed-off-by: Fedor Pchelkin pchelkin@ispras.ru Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- io_uring/io_uring.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/io_uring/io_uring.c +++ b/io_uring/io_uring.c @@ -5792,10 +5792,10 @@ static int io_arm_poll_handler(struct io } } else { apoll = kmalloc(sizeof(*apoll), GFP_ATOMIC); + if (unlikely(!apoll)) + return IO_APOLL_ABORTED; apoll->poll.retries = APOLL_MAX_RETRY; } - if (unlikely(!apoll)) - return IO_APOLL_ABORTED; apoll->double_poll = NULL; req->apoll = apoll; req->flags |= REQ_F_POLLED;
From: Sven Schnelle svens@linux.ibm.com
commit a52e5cdbe8016d4e3e6322fd93d71afddb9a5af9 upstream.
The code which handles the ipl report is searching for a free location in memory where it could copy the component and certificate entries to. It checks for intersection between the sections required for the kernel and the component/certificate data area, but fails to check whether the data structures linking these data areas together intersect.
This might cause the iplreport copy code to overwrite the iplreport itself. Fix this by adding two addtional intersection checks.
Cc: stable@vger.kernel.org Fixes: 9641b8cc733f ("s390/ipl: read IPL report at early boot") Signed-off-by: Sven Schnelle svens@linux.ibm.com Reviewed-by: Vasily Gorbik gor@linux.ibm.com Signed-off-by: Vasily Gorbik gor@linux.ibm.com Signed-off-by: Sven Schnelle svens@linux.ibm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/s390/boot/ipl_report.c | 8 ++++++++ 1 file changed, 8 insertions(+)
--- a/arch/s390/boot/ipl_report.c +++ b/arch/s390/boot/ipl_report.c @@ -57,11 +57,19 @@ repeat: if (IS_ENABLED(CONFIG_BLK_DEV_INITRD) && INITRD_START && INITRD_SIZE && intersects(INITRD_START, INITRD_SIZE, safe_addr, size)) safe_addr = INITRD_START + INITRD_SIZE; + if (intersects(safe_addr, size, (unsigned long)comps, comps->len)) { + safe_addr = (unsigned long)comps + comps->len; + goto repeat; + } for_each_rb_entry(comp, comps) if (intersects(safe_addr, size, comp->addr, comp->len)) { safe_addr = comp->addr + comp->len; goto repeat; } + if (intersects(safe_addr, size, (unsigned long)certs, certs->len)) { + safe_addr = (unsigned long)certs + certs->len; + goto repeat; + } for_each_rb_entry(cert, certs) if (intersects(safe_addr, size, cert->addr, cert->len)) { safe_addr = cert->addr + cert->len;
From: Lukas Wunner lukas@wunner.de
commit ac91e6980563ed53afadd925fa6585ffd2bc4a2c upstream.
Sheng Bi reports that pci_bridge_secondary_bus_reset() may fail to wait for devices on the secondary bus to become accessible after reset:
Although it does call pci_dev_wait(), it erroneously passes the bridge's pci_dev rather than that of a child. The bridge of course is always accessible while its secondary bus is reset, so pci_dev_wait() returns immediately.
Sheng Bi proposes introducing a new pci_bridge_secondary_bus_wait() function which is called from pci_bridge_secondary_bus_reset():
https://lore.kernel.org/linux-pci/20220523171517.32407-1-windy.bi.enflame@gm...
However we already have pci_bridge_wait_for_secondary_bus() which does almost exactly what we need. So far it's only called on resume from D3cold (which implies a Fundamental Reset per PCIe r6.0 sec 5.8). Re-using it for Secondary Bus Resets is a leaner and more rational approach than introducing a new function.
That only requires a few minor tweaks:
- Amend pci_bridge_wait_for_secondary_bus() to await accessibility of the first device on the secondary bus by calling pci_dev_wait() after performing the prescribed delays. pci_dev_wait() needs two parameters, a reset reason and a timeout, which callers must now pass to pci_bridge_wait_for_secondary_bus(). The timeout is 1 sec for resume (PCIe r6.0 sec 6.6.1) and 60 sec for reset (commit 821cdad5c46c ("PCI: Wait up to 60 seconds for device to become ready after FLR")). Introduce a PCI_RESET_WAIT macro for the 1 sec timeout.
- Amend pci_bridge_wait_for_secondary_bus() to return 0 on success or -ENOTTY on error for consumption by pci_bridge_secondary_bus_reset().
- Drop an unnecessary 1 sec delay from pci_reset_secondary_bus() which is now performed by pci_bridge_wait_for_secondary_bus(). A static delay this long is only necessary for Conventional PCI, so modern PCIe systems benefit from shorter reset times as a side effect.
Fixes: 6b2f1351af56 ("PCI: Wait for device to become ready after secondary bus reset") Link: https://lore.kernel.org/r/da77c92796b99ec568bd070cbe4725074a117038.167376951... Reported-by: Sheng Bi windy.bi.enflame@gmail.com Tested-by: Ravi Kishore Koppuravuri ravi.kishore.koppuravuri@intel.com Signed-off-by: Lukas Wunner lukas@wunner.de Signed-off-by: Bjorn Helgaas bhelgaas@google.com Reviewed-by: Mika Westerberg mika.westerberg@linux.intel.com Reviewed-by: Kuppuswamy Sathyanarayanan sathyanarayanan.kuppuswamy@linux.intel.com Cc: stable@vger.kernel.org # v4.17+ Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pci/pci-driver.c | 4 +-- drivers/pci/pci.c | 54 ++++++++++++++++++++--------------------------- drivers/pci/pci.h | 10 +++++++- 3 files changed, 35 insertions(+), 33 deletions(-)
--- a/drivers/pci/pci-driver.c +++ b/drivers/pci/pci-driver.c @@ -911,7 +911,7 @@ static int pci_pm_resume_noirq(struct de pcie_pme_root_status_cleanup(pci_dev);
if (!skip_bus_pm && prev_state == PCI_D3cold) - pci_bridge_wait_for_secondary_bus(pci_dev); + pci_bridge_wait_for_secondary_bus(pci_dev, "resume", PCI_RESET_WAIT);
if (pci_has_legacy_pm_support(pci_dev)) return 0; @@ -1298,7 +1298,7 @@ static int pci_pm_runtime_resume(struct pci_pm_default_resume(pci_dev);
if (prev_state == PCI_D3cold) - pci_bridge_wait_for_secondary_bus(pci_dev); + pci_bridge_wait_for_secondary_bus(pci_dev, "resume", PCI_RESET_WAIT);
if (pm && pm->runtime_resume) error = pm->runtime_resume(dev); --- a/drivers/pci/pci.c +++ b/drivers/pci/pci.c @@ -1221,7 +1221,7 @@ static int pci_dev_wait(struct pci_dev * return -ENOTTY; }
- if (delay > 1000) + if (delay > PCI_RESET_WAIT) pci_info(dev, "not ready %dms after %s; waiting\n", delay - 1, reset_type);
@@ -1230,7 +1230,7 @@ static int pci_dev_wait(struct pci_dev * pci_read_config_dword(dev, PCI_COMMAND, &id); }
- if (delay > 1000) + if (delay > PCI_RESET_WAIT) pci_info(dev, "ready %dms after %s\n", delay - 1, reset_type);
@@ -4792,24 +4792,31 @@ static int pci_bus_max_d3cold_delay(cons /** * pci_bridge_wait_for_secondary_bus - Wait for secondary bus to be accessible * @dev: PCI bridge + * @reset_type: reset type in human-readable form + * @timeout: maximum time to wait for devices on secondary bus (milliseconds) * * Handle necessary delays before access to the devices on the secondary - * side of the bridge are permitted after D3cold to D0 transition. + * side of the bridge are permitted after D3cold to D0 transition + * or Conventional Reset. * * For PCIe this means the delays in PCIe 5.0 section 6.6.1. For * conventional PCI it means Tpvrh + Trhfa specified in PCI 3.0 section * 4.3.2. + * + * Return 0 on success or -ENOTTY if the first device on the secondary bus + * failed to become accessible. */ -void pci_bridge_wait_for_secondary_bus(struct pci_dev *dev) +int pci_bridge_wait_for_secondary_bus(struct pci_dev *dev, char *reset_type, + int timeout) { struct pci_dev *child; int delay;
if (pci_dev_is_disconnected(dev)) - return; + return 0;
if (!pci_is_bridge(dev)) - return; + return 0;
down_read(&pci_bus_sem);
@@ -4821,14 +4828,14 @@ void pci_bridge_wait_for_secondary_bus(s */ if (!dev->subordinate || list_empty(&dev->subordinate->devices)) { up_read(&pci_bus_sem); - return; + return 0; }
/* Take d3cold_delay requirements into account */ delay = pci_bus_max_d3cold_delay(dev->subordinate); if (!delay) { up_read(&pci_bus_sem); - return; + return 0; }
child = list_first_entry(&dev->subordinate->devices, struct pci_dev, @@ -4837,14 +4844,12 @@ void pci_bridge_wait_for_secondary_bus(s
/* * Conventional PCI and PCI-X we need to wait Tpvrh + Trhfa before - * accessing the device after reset (that is 1000 ms + 100 ms). In - * practice this should not be needed because we don't do power - * management for them (see pci_bridge_d3_possible()). + * accessing the device after reset (that is 1000 ms + 100 ms). */ if (!pci_is_pcie(dev)) { pci_dbg(dev, "waiting %d ms for secondary bus\n", 1000 + delay); msleep(1000 + delay); - return; + return 0; }
/* @@ -4861,11 +4866,11 @@ void pci_bridge_wait_for_secondary_bus(s * configuration requests if we only wait for 100 ms (see * https://bugzilla.kernel.org/show_bug.cgi?id=203885). * - * Therefore we wait for 100 ms and check for the device presence. - * If it is still not present give it an additional 100 ms. + * Therefore we wait for 100 ms and check for the device presence + * until the timeout expires. */ if (!pcie_downstream_port(dev)) - return; + return 0;
if (pcie_get_speed_cap(dev) <= PCIE_SPEED_5_0GT) { pci_dbg(dev, "waiting %d ms for downstream link\n", delay); @@ -4876,14 +4881,11 @@ void pci_bridge_wait_for_secondary_bus(s if (!pcie_wait_for_link_delay(dev, true, delay)) { /* Did not train, no need to wait any further */ pci_info(dev, "Data Link Layer Link Active not set in 1000 msec\n"); - return; + return -ENOTTY; } }
- if (!pci_device_is_present(child)) { - pci_dbg(child, "waiting additional %d ms to become accessible\n", delay); - msleep(delay); - } + return pci_dev_wait(child, reset_type, timeout - delay); }
void pci_reset_secondary_bus(struct pci_dev *dev) @@ -4902,15 +4904,6 @@ void pci_reset_secondary_bus(struct pci_
ctrl &= ~PCI_BRIDGE_CTL_BUS_RESET; pci_write_config_word(dev, PCI_BRIDGE_CONTROL, ctrl); - - /* - * Trhfa for conventional PCI is 2^25 clock cycles. - * Assuming a minimum 33MHz clock this results in a 1s - * delay before we can consider subordinate devices to - * be re-initialized. PCIe has some ways to shorten this, - * but we don't make use of them yet. - */ - ssleep(1); }
void __weak pcibios_reset_secondary_bus(struct pci_dev *dev) @@ -4929,7 +4922,8 @@ int pci_bridge_secondary_bus_reset(struc { pcibios_reset_secondary_bus(dev);
- return pci_dev_wait(dev, "bus reset", PCIE_RESET_READY_POLL_MS); + return pci_bridge_wait_for_secondary_bus(dev, "bus reset", + PCIE_RESET_READY_POLL_MS); } EXPORT_SYMBOL_GPL(pci_bridge_secondary_bus_reset);
--- a/drivers/pci/pci.h +++ b/drivers/pci/pci.h @@ -47,6 +47,13 @@ int pci_bus_error_reset(struct pci_dev * #define PCI_PM_D3HOT_WAIT 10 /* msec */ #define PCI_PM_D3COLD_WAIT 100 /* msec */
+/* + * Following exit from Conventional Reset, devices must be ready within 1 sec + * (PCIe r6.0 sec 6.6.1). A D3cold to D0 transition implies a Conventional + * Reset (PCIe r6.0 sec 5.8). + */ +#define PCI_RESET_WAIT 1000 /* msec */ + /** * struct pci_platform_pm_ops - Firmware PM callbacks * @@ -108,7 +115,8 @@ void pci_allocate_cap_save_buffers(struc void pci_free_cap_save_buffers(struct pci_dev *dev); bool pci_bridge_d3_possible(struct pci_dev *dev); void pci_bridge_d3_update(struct pci_dev *dev); -void pci_bridge_wait_for_secondary_bus(struct pci_dev *dev); +int pci_bridge_wait_for_secondary_bus(struct pci_dev *dev, char *reset_type, + int timeout);
static inline void pci_wakeup_event(struct pci_dev *dev) {
From: Lukas Wunner lukas@wunner.de
commit 53b54ad074de1896f8b021615f65b27f557ce874 upstream.
pci_bridge_wait_for_secondary_bus() is called after a Secondary Bus Reset, but not after a DPC-induced Hot Reset.
As a result, the delays prescribed by PCIe r6.0 sec 6.6.1 are not observed and devices on the secondary bus may be accessed before they're ready.
One affected device is Intel's Ponte Vecchio HPC GPU. It comprises a PCIe switch whose upstream port is not immediately ready after reset. Because its config space is restored too early, it remains in D0uninitialized, its subordinate devices remain inaccessible and DPC recovery fails with messages such as:
i915 0000:8c:00.0: can't change power state from D3cold to D0 (config space inaccessible) intel_vsec 0000:8e:00.1: can't change power state from D3cold to D0 (config space inaccessible) pcieport 0000:89:02.0: AER: device recovery failed
Fix it.
Link: https://lore.kernel.org/r/9f5ff00e1593d8d9a4b452398b98aa14d23fca11.167376951... Tested-by: Ravi Kishore Koppuravuri ravi.kishore.koppuravuri@intel.com Signed-off-by: Lukas Wunner lukas@wunner.de Signed-off-by: Bjorn Helgaas bhelgaas@google.com Reviewed-by: Mika Westerberg mika.westerberg@linux.intel.com Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pci/pci.c | 3 --- drivers/pci/pci.h | 6 ++++++ drivers/pci/pcie/dpc.c | 4 ++-- 3 files changed, 8 insertions(+), 5 deletions(-)
--- a/drivers/pci/pci.c +++ b/drivers/pci/pci.c @@ -157,9 +157,6 @@ static int __init pcie_port_pm_setup(cha } __setup("pcie_port_pm=", pcie_port_pm_setup);
-/* Time to wait after a reset for device to become responsive */ -#define PCIE_RESET_READY_POLL_MS 60000 - /** * pci_bus_max_busnr - returns maximum PCI bus number of given bus' children * @bus: pointer to PCI bus structure to search --- a/drivers/pci/pci.h +++ b/drivers/pci/pci.h @@ -53,6 +53,12 @@ int pci_bus_error_reset(struct pci_dev * * Reset (PCIe r6.0 sec 5.8). */ #define PCI_RESET_WAIT 1000 /* msec */ +/* + * Devices may extend the 1 sec period through Request Retry Status completions + * (PCIe r6.0 sec 2.3.1). The spec does not provide an upper limit, but 60 sec + * ought to be enough for any device to become responsive. + */ +#define PCIE_RESET_READY_POLL_MS 60000 /* msec */
/** * struct pci_platform_pm_ops - Firmware PM callbacks --- a/drivers/pci/pcie/dpc.c +++ b/drivers/pci/pcie/dpc.c @@ -170,8 +170,8 @@ pci_ers_result_t dpc_reset_link(struct p pci_write_config_word(pdev, cap + PCI_EXP_DPC_STATUS, PCI_EXP_DPC_STATUS_TRIGGER);
- if (!pcie_wait_for_link(pdev, true)) { - pci_info(pdev, "Data Link Layer Link Active not set in 1000 msec\n"); + if (pci_bridge_wait_for_secondary_bus(pdev, "DPC", + PCIE_RESET_READY_POLL_MS)) { clear_bit(PCI_DPC_RECOVERED, &pdev->priv_flags); ret = PCI_ERS_RESULT_DISCONNECT; } else {
From: Dave Chinner dchinner@redhat.com
commit 5b55cbc2d72632e874e50d2e36bce608e55aaaea upstream.
[backport for 5.10.y, prior to perag refactoring in v5.14]
Not fatal, the assert is there to catch developer attention. I'm seeing this occasionally during recoveryloop testing after a shutdown, and I don't want this to stop an overnight recoveryloop run as it is currently doing.
Convert the ASSERT to a XFS_IS_CORRUPT() check so it will dump a corruption report into the log and cause a test failure that way, but it won't stop the machine dead.
Signed-off-by: Dave Chinner dchinner@redhat.com Reviewed-by: Darrick J. Wong djwong@kernel.org Reviewed-by: Christoph Hellwig hch@lst.de Signed-off-by: Dave Chinner david@fromorbit.com Signed-off-by: Amir Goldstein amir73il@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/xfs/xfs_mount.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
--- a/fs/xfs/xfs_mount.c +++ b/fs/xfs/xfs_mount.c @@ -126,7 +126,6 @@ __xfs_free_perag( { struct xfs_perag *pag = container_of(head, struct xfs_perag, rcu_head);
- ASSERT(atomic_read(&pag->pag_ref) == 0); kmem_free(pag); }
@@ -145,7 +144,7 @@ xfs_free_perag( pag = radix_tree_delete(&mp->m_perag_tree, agno); spin_unlock(&mp->m_perag_lock); ASSERT(pag); - ASSERT(atomic_read(&pag->pag_ref) == 0); + XFS_IS_CORRUPT(pag->pag_mount, atomic_read(&pag->pag_ref) != 0); xfs_iunlink_destroy(pag); xfs_buf_hash_destroy(pag); call_rcu(&pag->rcu_head, __xfs_free_perag);
From: "Darrick J. Wong" djwong@kernel.org
commit 86d40f1e49e9a909d25c35ba01bea80dbcd758cb upstream.
[add XFS_QMOPT_QUOTALL flag to xfs_qm_dqpurge_all() for 5.10.y backport]
xfs/434 and xfs/436 have been reporting occasional memory leaks of xfs_dquot objects. These tests themselves were the messenger, not the culprit, since they unload the xfs module, which trips the slub debugging code while tearing down all the xfs slab caches:
============================================================================= BUG xfs_dquot (Tainted: G W ): Objects remaining in xfs_dquot on __kmem_cache_shutdown() -----------------------------------------------------------------------------
Slab 0xffffea000606de00 objects=30 used=5 fp=0xffff888181b78a78 flags=0x17ff80000010200(slab|head|node=0|zone=2|lastcpupid=0xfff) CPU: 0 PID: 3953166 Comm: modprobe Tainted: G W 5.18.0-rc6-djwx #rc6 d5824be9e46a2393677bda868f9b154d917ca6a7 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20171121_152543-x86-ol7-builder-01.us.oracle.com-4.el7.1 04/01/2014
Since we don't generally rmmod the xfs module between fstests, this means that xfs/434 is really just the canary in the coal mine -- something leaked a dquot, but we don't know who. After days of pounding on fstests with kmemleak enabled, I finally got it to spit this out:
unreferenced object 0xffff8880465654c0 (size 536): comm "u10:4", pid 88, jiffies 4294935810 (age 29.512s) hex dump (first 32 bytes): 60 4a 56 46 80 88 ff ff 58 ea e4 5c 80 88 ff ff `JVF....X...... 00 e0 52 49 80 88 ff ff 01 00 01 00 00 00 00 00 ..RI............ backtrace: [<ffffffffa0740f6c>] xfs_dquot_alloc+0x2c/0x530 [xfs] [<ffffffffa07443df>] xfs_qm_dqread+0x6f/0x330 [xfs] [<ffffffffa07462a2>] xfs_qm_dqget+0x132/0x4e0 [xfs] [<ffffffffa0756bb0>] xfs_qm_quotacheck_dqadjust+0xa0/0x3e0 [xfs] [<ffffffffa075724d>] xfs_qm_dqusage_adjust+0x35d/0x4f0 [xfs] [<ffffffffa06c9068>] xfs_iwalk_ag_recs+0x348/0x5d0 [xfs] [<ffffffffa06c95d3>] xfs_iwalk_run_callbacks+0x273/0x540 [xfs] [<ffffffffa06c9e8d>] xfs_iwalk_ag+0x5ed/0x890 [xfs] [<ffffffffa06ca22f>] xfs_iwalk_ag_work+0xff/0x170 [xfs] [<ffffffffa06d22c9>] xfs_pwork_work+0x79/0x130 [xfs] [<ffffffff81170bb2>] process_one_work+0x672/0x1040 [<ffffffff81171b1b>] worker_thread+0x59b/0xec0 [<ffffffff8118711e>] kthread+0x29e/0x340 [<ffffffff810032bf>] ret_from_fork+0x1f/0x30
Now we know that quotacheck is at fault, but even this report was canaryish -- it was triggered by xfs/494, which doesn't actually mount any filesystems. (kmemleak can be a little slow to notice leaks, even with fstests repeatedly whacking it to look for them.) Looking at the *previous* fstest, however, showed that the test run before xfs/494 was xfs/117. The tipoff to the problem is in this excerpt from dmesg:
XFS (sda4): Quotacheck needed: Please wait. XFS (sda4): Metadata corruption detected at xfs_dinode_verify.part.0+0xdb/0x7b0 [xfs], inode 0x119 dinode XFS (sda4): Unmount and run xfs_repair XFS (sda4): First 128 bytes of corrupted metadata buffer: 00000000: 49 4e 81 a4 03 02 00 00 00 00 00 00 00 00 00 00 IN.............. 00000010: 00 00 00 01 00 00 00 00 00 90 57 54 54 1a 4c 68 ..........WTT.Lh 00000020: 81 f9 7d e1 6d ee 16 00 34 bd 7d e1 6d ee 16 00 ..}.m...4.}.m... 00000030: 34 bd 7d e1 6d ee 16 00 00 00 00 00 00 00 00 00 4.}.m........... 00000040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00000050: 00 00 00 02 00 00 00 00 00 00 00 00 96 80 f3 ab ................ 00000060: ff ff ff ff da 57 7b 11 00 00 00 00 00 00 00 03 .....W{......... 00000070: 00 00 00 01 00 00 00 10 00 00 00 00 00 00 00 08 ................ XFS (sda4): Quotacheck: Unsuccessful (Error -117): Disabling quotas.
The dinode verifier decided that the inode was corrupt, which causes iget to return with EFSCORRUPTED. Since this happened during quotacheck, it is obvious that the kernel aborted the inode walk on account of the corruption error and disabled quotas. Unfortunately, we neglect to purge the dquot cache before doing that, which is how the dquots leaked.
The problems started 10 years ago in commit b84a3a, when the dquot lists were converted to a radix tree, but the error handling behavior was not correctly preserved -- in that commit, if the bulkstat failed and usrquota was enabled, the bulkstat failure code would be overwritten by the result of flushing all the dquots to disk. As long as that succeeds, we'd continue the quota mount as if everything were ok, but instead we're now operating with a corrupt inode and incorrect quota usage counts. I didn't notice this bug in 2019 when I wrote commit ebd126a, which changed quotacheck to skip the dqflush when the scan doesn't complete due to inode walk failures.
Introduced-by: b84a3a96751f ("xfs: remove the per-filesystem list of dquots") Fixes: ebd126a651f8 ("xfs: convert quotacheck to use the new iwalk functions") Signed-off-by: Darrick J. Wong djwong@kernel.org Reviewed-by: Christoph Hellwig hch@lst.de Reviewed-by: Dave Chinner dchinner@redhat.com Signed-off-by: Dave Chinner david@fromorbit.com Signed-off-by: Amir Goldstein amir73il@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/xfs/xfs_qm.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-)
--- a/fs/xfs/xfs_qm.c +++ b/fs/xfs/xfs_qm.c @@ -1318,8 +1318,15 @@ xfs_qm_quotacheck(
error = xfs_iwalk_threaded(mp, 0, 0, xfs_qm_dqusage_adjust, 0, true, NULL); - if (error) + if (error) { + /* + * The inode walk may have partially populated the dquot + * caches. We must purge them before disabling quota and + * tearing down the quotainfo, or else the dquots will leak. + */ + xfs_qm_dqpurge_all(mp, XFS_QMOPT_QUOTALL); goto error_return; + }
/* * We've made all the changes that we need to make incore. Flush them
From: "Darrick J. Wong" djwong@kernel.org
commit a54f78def73d847cb060b18c4e4a3d1d26c9ca6d upstream.
The recent patch to improve btree cycle checking caused a regression when I rebased the in-memory btree branch atop the 5.19 for-next branch, because in-memory short-pointer btrees do not have AG numbers. This produced the following complaint from kmemleak:
unreferenced object 0xffff88803d47dde8 (size 264): comm "xfs_io", pid 4889, jiffies 4294906764 (age 24.072s) hex dump (first 32 bytes): 90 4d 0b 0f 80 88 ff ff 00 a0 bd 05 80 88 ff ff .M.............. e0 44 3a a0 ff ff ff ff 00 df 08 06 80 88 ff ff .D:............. backtrace: [<ffffffffa0388059>] xfbtree_dup_cursor+0x49/0xc0 [xfs] [<ffffffffa029887b>] xfs_btree_dup_cursor+0x3b/0x200 [xfs] [<ffffffffa029af5d>] __xfs_btree_split+0x6ad/0x820 [xfs] [<ffffffffa029b130>] xfs_btree_split+0x60/0x110 [xfs] [<ffffffffa029f6da>] xfs_btree_make_block_unfull+0x19a/0x1f0 [xfs] [<ffffffffa029fada>] xfs_btree_insrec+0x3aa/0x810 [xfs] [<ffffffffa029fff3>] xfs_btree_insert+0xb3/0x240 [xfs] [<ffffffffa02cb729>] xfs_rmap_insert+0x99/0x200 [xfs] [<ffffffffa02cf142>] xfs_rmap_map_shared+0x192/0x5f0 [xfs] [<ffffffffa02cf60b>] xfs_rmap_map_raw+0x6b/0x90 [xfs] [<ffffffffa0384a85>] xrep_rmap_stash+0xd5/0x1d0 [xfs] [<ffffffffa0384dc0>] xrep_rmap_visit_bmbt+0xa0/0xf0 [xfs] [<ffffffffa0384fb6>] xrep_rmap_scan_iext+0x56/0xa0 [xfs] [<ffffffffa03850d8>] xrep_rmap_scan_ifork+0xd8/0x160 [xfs] [<ffffffffa0385195>] xrep_rmap_scan_inode+0x35/0x80 [xfs] [<ffffffffa03852ee>] xrep_rmap_find_rmaps+0x10e/0x270 [xfs]
I noticed that xfs_btree_insrec has a bunch of debug code that return out of the function immediately, without freeing the "new" btree cursor that can be returned when _make_block_unfull calls xfs_btree_split. Fix the error return in this function to free the btree cursor.
Signed-off-by: Darrick J. Wong djwong@kernel.org Reviewed-by: Christoph Hellwig hch@lst.de Reviewed-by: Dave Chinner dchinner@redhat.com Signed-off-by: Dave Chinner david@fromorbit.com Signed-off-by: Amir Goldstein amir73il@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/xfs/libxfs/xfs_btree.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-)
--- a/fs/xfs/libxfs/xfs_btree.c +++ b/fs/xfs/libxfs/xfs_btree.c @@ -3190,7 +3190,7 @@ xfs_btree_insrec( struct xfs_btree_block *block; /* btree block */ struct xfs_buf *bp; /* buffer for block */ union xfs_btree_ptr nptr; /* new block ptr */ - struct xfs_btree_cur *ncur; /* new btree cursor */ + struct xfs_btree_cur *ncur = NULL; /* new btree cursor */ union xfs_btree_key nkey; /* new block key */ union xfs_btree_key *lkey; int optr; /* old key/record index */ @@ -3270,7 +3270,7 @@ xfs_btree_insrec( #ifdef DEBUG error = xfs_btree_check_block(cur, block, level, bp); if (error) - return error; + goto error0; #endif
/* @@ -3290,7 +3290,7 @@ xfs_btree_insrec( for (i = numrecs - ptr; i >= 0; i--) { error = xfs_btree_debug_check_ptr(cur, pp, i, level); if (error) - return error; + goto error0; }
xfs_btree_shift_keys(cur, kp, 1, numrecs - ptr + 1); @@ -3375,6 +3375,8 @@ xfs_btree_insrec( return 0;
error0: + if (ncur) + xfs_btree_del_cursor(ncur, error); return error; }
From: Dave Chinner dchinner@redhat.com
commit 472c6e46f589c26057596dcba160712a5b3e02c5 upstream.
[partial backport for dependency - xfs_ioc_space() still uses XFS_PREALLOC_SYNC]
Callers can acheive the same thing by calling xfs_log_force_inode() after making their modifications. There is no need for xfs_update_prealloc_flags() to do this.
Signed-off-by: Dave Chinner dchinner@redhat.com Reviewed-by: Darrick J. Wong djwong@kernel.org Signed-off-by: Darrick J. Wong djwong@kernel.org Signed-off-by: Amir Goldstein amir73il@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/xfs/xfs_file.c | 13 +++++++------ fs/xfs/xfs_pnfs.c | 6 ++++-- 2 files changed, 11 insertions(+), 8 deletions(-)
--- a/fs/xfs/xfs_file.c +++ b/fs/xfs/xfs_file.c @@ -94,8 +94,6 @@ xfs_update_prealloc_flags( ip->i_d.di_flags &= ~XFS_DIFLAG_PREALLOC;
xfs_trans_log_inode(tp, ip, XFS_ILOG_CORE); - if (flags & XFS_PREALLOC_SYNC) - xfs_trans_set_sync(tp); return xfs_trans_commit(tp); }
@@ -1000,9 +998,6 @@ xfs_file_fallocate( } }
- if (file->f_flags & O_DSYNC) - flags |= XFS_PREALLOC_SYNC; - error = xfs_update_prealloc_flags(ip, flags); if (error) goto out_unlock; @@ -1024,8 +1019,14 @@ xfs_file_fallocate( * leave shifted extents past EOF and hence losing access to * the data that is contained within them. */ - if (do_file_insert) + if (do_file_insert) { error = xfs_insert_file_space(ip, offset, len); + if (error) + goto out_unlock; + } + + if (file->f_flags & O_DSYNC) + error = xfs_log_force_inode(ip);
out_unlock: xfs_iunlock(ip, iolock); --- a/fs/xfs/xfs_pnfs.c +++ b/fs/xfs/xfs_pnfs.c @@ -164,10 +164,12 @@ xfs_fs_map_blocks( * that the blocks allocated and handed out to the client are * guaranteed to be present even after a server crash. */ - error = xfs_update_prealloc_flags(ip, - XFS_PREALLOC_SET | XFS_PREALLOC_SYNC); + error = xfs_update_prealloc_flags(ip, XFS_PREALLOC_SET); + if (!error) + error = xfs_log_force_inode(ip); if (error) goto out_unlock; + } else { xfs_iunlock(ip, lock_flags); }
From: Dave Chinner dchinner@redhat.com
commit fbe7e520036583a783b13ff9744e35c2a329d9a4 upstream.
In XFS, we always update the inode change and modification time when any fallocate() operation succeeds. Furthermore, as various fallocate modes can change the file contents (extending EOF, punching holes, zeroing things, shifting extents), we should drop file privileges like suid just like we do for a regular write(). There's already a VFS helper that figures all this out for us, so use that.
The net effect of this is that we no longer drop suid/sgid if the caller is root, but we also now drop file capabilities.
We also move the xfs_update_prealloc_flags() function so that it now is only called by the scope that needs to set the the prealloc flag.
Based on a patch from Darrick Wong.
Signed-off-by: Dave Chinner dchinner@redhat.com Reviewed-by: Darrick J. Wong djwong@kernel.org Signed-off-by: Darrick J. Wong djwong@kernel.org Signed-off-by: Amir Goldstein amir73il@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/xfs/xfs_file.c | 13 +++++++++---- 1 file changed, 9 insertions(+), 4 deletions(-)
--- a/fs/xfs/xfs_file.c +++ b/fs/xfs/xfs_file.c @@ -895,6 +895,10 @@ xfs_file_fallocate( goto out_unlock; }
+ error = file_modified(file); + if (error) + goto out_unlock; + if (mode & FALLOC_FL_PUNCH_HOLE) { error = xfs_free_file_space(ip, offset, len); if (error) @@ -996,11 +1000,12 @@ xfs_file_fallocate( if (error) goto out_unlock; } - }
- error = xfs_update_prealloc_flags(ip, flags); - if (error) - goto out_unlock; + error = xfs_update_prealloc_flags(ip, XFS_PREALLOC_SET); + if (error) + goto out_unlock; + + }
/* Change file size if needed */ if (new_size) {
From: Dave Chinner dchinner@redhat.com
commit 0b02c8c0d75a738c98c35f02efb36217c170d78c upstream.
[backport for 5.10.y]
Now that we only call xfs_update_prealloc_flags() from xfs_file_fallocate() in the case where we need to set the preallocation flag, do this in xfs_alloc_file_space() where we already have the inode joined into a transaction and get rid of the call to xfs_update_prealloc_flags() from the fallocate code.
This also means that we now correctly avoid setting the XFS_DIFLAG_PREALLOC flag when xfs_is_always_cow_inode() is true, as these inodes will never have preallocated extents.
Signed-off-by: Dave Chinner dchinner@redhat.com Reviewed-by: Darrick J. Wong djwong@kernel.org Signed-off-by: Darrick J. Wong djwong@kernel.org Signed-off-by: Amir Goldstein amir73il@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/xfs/xfs_bmap_util.c | 9 +++------ fs/xfs/xfs_file.c | 8 -------- 2 files changed, 3 insertions(+), 14 deletions(-)
--- a/fs/xfs/xfs_bmap_util.c +++ b/fs/xfs/xfs_bmap_util.c @@ -800,9 +800,6 @@ xfs_alloc_file_space( quota_flag = XFS_QMOPT_RES_REGBLKS; }
- /* - * Allocate and setup the transaction. - */ error = xfs_trans_alloc(mp, &M_RES(mp)->tr_write, resblks, resrtextents, 0, &tp);
@@ -830,9 +827,9 @@ xfs_alloc_file_space( if (error) goto error0;
- /* - * Complete the transaction - */ + ip->i_d.di_flags |= XFS_DIFLAG_PREALLOC; + xfs_trans_log_inode(tp, ip, XFS_ILOG_CORE); + error = xfs_trans_commit(tp); xfs_iunlock(ip, XFS_ILOCK_EXCL); if (error) --- a/fs/xfs/xfs_file.c +++ b/fs/xfs/xfs_file.c @@ -850,7 +850,6 @@ xfs_file_fallocate( struct inode *inode = file_inode(file); struct xfs_inode *ip = XFS_I(inode); long error; - enum xfs_prealloc_flags flags = 0; uint iolock = XFS_IOLOCK_EXCL | XFS_MMAPLOCK_EXCL; loff_t new_size = 0; bool do_file_insert = false; @@ -948,8 +947,6 @@ xfs_file_fallocate( } do_file_insert = true; } else { - flags |= XFS_PREALLOC_SET; - if (!(mode & FALLOC_FL_KEEP_SIZE) && offset + len > i_size_read(inode)) { new_size = offset + len; @@ -1000,11 +997,6 @@ xfs_file_fallocate( if (error) goto out_unlock; } - - error = xfs_update_prealloc_flags(ip, XFS_PREALLOC_SET); - if (error) - goto out_unlock; - }
/* Change file size if needed */
From: "Darrick J. Wong" djwong@kernel.org
commit e014f37db1a2d109afa750042ac4d69cf3e3d88e upstream.
[remove userns argument of setattr_copy() for 5.10.y backport]
Filipe Manana pointed out that XFS' behavior w.r.t. setuid/setgid revocation isn't consistent with btrfs[1] or ext4. Those two filesystems use the VFS function setattr_copy to convey certain attributes from struct iattr into the VFS inode structure.
Andrey Zhadchenko reported[2] that XFS uses the wrong user namespace to decide if it should clear setgid and setuid on a file attribute update. This is a second symptom of the problem that Filipe noticed.
XFS, on the other hand, open-codes setattr_copy in xfs_setattr_mode, xfs_setattr_nonsize, and xfs_setattr_time. Regrettably, setattr_copy is /not/ a simple copy function; it contains additional logic to clear the setgid bit when setting the mode, and XFS' version no longer matches.
The VFS implements its own setuid/setgid stripping logic, which establishes consistent behavior. It's a tad unfortunate that it's scattered across notify_change, should_remove_suid, and setattr_copy but XFS should really follow the Linux VFS. Adapt XFS to use the VFS functions and get rid of the old functions.
[1] https://lore.kernel.org/fstests/CAL3q7H47iNQ=Wmk83WcGB-KBJVOEtR9+qGczzCeXJ9Y... [2] https://lore.kernel.org/linux-xfs/20220221182218.748084-1-andrey.zhadchenko@...
Fixes: 7fa294c8991c ("userns: Allow chown and setgid preservation") Signed-off-by: Darrick J. Wong djwong@kernel.org Reviewed-by: Dave Chinner dchinner@redhat.com Reviewed-by: Christoph Hellwig hch@lst.de Reviewed-by: Christian Brauner brauner@kernel.org Signed-off-by: Amir Goldstein amir73il@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/xfs/xfs_iops.c | 56 ++---------------------------------------------------- fs/xfs/xfs_pnfs.c | 3 +- 2 files changed, 5 insertions(+), 54 deletions(-)
--- a/fs/xfs/xfs_iops.c +++ b/fs/xfs/xfs_iops.c @@ -595,37 +595,6 @@ xfs_vn_getattr( return 0; }
-static void -xfs_setattr_mode( - struct xfs_inode *ip, - struct iattr *iattr) -{ - struct inode *inode = VFS_I(ip); - umode_t mode = iattr->ia_mode; - - ASSERT(xfs_isilocked(ip, XFS_ILOCK_EXCL)); - - inode->i_mode &= S_IFMT; - inode->i_mode |= mode & ~S_IFMT; -} - -void -xfs_setattr_time( - struct xfs_inode *ip, - struct iattr *iattr) -{ - struct inode *inode = VFS_I(ip); - - ASSERT(xfs_isilocked(ip, XFS_ILOCK_EXCL)); - - if (iattr->ia_valid & ATTR_ATIME) - inode->i_atime = iattr->ia_atime; - if (iattr->ia_valid & ATTR_CTIME) - inode->i_ctime = iattr->ia_ctime; - if (iattr->ia_valid & ATTR_MTIME) - inode->i_mtime = iattr->ia_mtime; -} - static int xfs_vn_change_ok( struct dentry *dentry, @@ -741,16 +710,6 @@ xfs_setattr_nonsize( }
/* - * CAP_FSETID overrides the following restrictions: - * - * The set-user-ID and set-group-ID bits of a file will be - * cleared upon successful return from chown() - */ - if ((inode->i_mode & (S_ISUID|S_ISGID)) && - !capable(CAP_FSETID)) - inode->i_mode &= ~(S_ISUID|S_ISGID); - - /* * Change the ownerships and register quota modifications * in the transaction. */ @@ -761,7 +720,6 @@ xfs_setattr_nonsize( olddquot1 = xfs_qm_vop_chown(tp, ip, &ip->i_udquot, udqp); } - inode->i_uid = uid; } if (!gid_eq(igid, gid)) { if (XFS_IS_QUOTA_RUNNING(mp) && XFS_IS_GQUOTA_ON(mp)) { @@ -772,15 +730,10 @@ xfs_setattr_nonsize( olddquot2 = xfs_qm_vop_chown(tp, ip, &ip->i_gdquot, gdqp); } - inode->i_gid = gid; } }
- if (mask & ATTR_MODE) - xfs_setattr_mode(ip, iattr); - if (mask & (ATTR_ATIME|ATTR_CTIME|ATTR_MTIME)) - xfs_setattr_time(ip, iattr); - + setattr_copy(inode, iattr); xfs_trans_log_inode(tp, ip, XFS_ILOG_CORE);
XFS_STATS_INC(mp, xs_ig_attrchg); @@ -1025,11 +978,8 @@ xfs_setattr_size( xfs_inode_clear_eofblocks_tag(ip); }
- if (iattr->ia_valid & ATTR_MODE) - xfs_setattr_mode(ip, iattr); - if (iattr->ia_valid & (ATTR_ATIME|ATTR_CTIME|ATTR_MTIME)) - xfs_setattr_time(ip, iattr); - + ASSERT(!(iattr->ia_valid & (ATTR_UID | ATTR_GID))); + setattr_copy(inode, iattr); xfs_trans_log_inode(tp, ip, XFS_ILOG_CORE);
XFS_STATS_INC(mp, xs_ig_attrchg); --- a/fs/xfs/xfs_pnfs.c +++ b/fs/xfs/xfs_pnfs.c @@ -285,7 +285,8 @@ xfs_fs_commit_blocks( xfs_trans_ijoin(tp, ip, XFS_ILOCK_EXCL); xfs_trans_log_inode(tp, ip, XFS_ILOG_CORE);
- xfs_setattr_time(ip, iattr); + ASSERT(!(iattr->ia_valid & (ATTR_UID | ATTR_GID))); + setattr_copy(inode, iattr); if (update_isize) { i_size_write(inode, iattr->ia_size); ip->i_d.di_size = iattr->ia_size;
From: Yang Xu xuyang2018.jy@fujitsu.com
commit 2b3416ceff5e6bd4922f6d1c61fb68113dd82302 upstream.
[remove userns argument of helper for 5.10.y backport]
Add a dedicated helper to handle the setgid bit when creating a new file in a setgid directory. This is a preparatory patch for moving setgid stripping into the vfs. The patch contains no functional changes.
Currently the setgid stripping logic is open-coded directly in inode_init_owner() and the individual filesystems are responsible for handling setgid inheritance. Since this has proven to be brittle as evidenced by old issues we uncovered over the last months (see [1] to [3] below) we will try to move this logic into the vfs.
Link: e014f37db1a2 ("xfs: use setattr_copy to set vfs inode attributes") [1] Link: 01ea173e103e ("xfs: fix up non-directory creation in SGID directories") [2] Link: fd84bfdddd16 ("ceph: fix up non-directory creation in SGID directories") [3] Link: https://lore.kernel.org/r/1657779088-2242-1-git-send-email-xuyang2018.jy@fuj... Reviewed-by: Darrick J. Wong djwong@kernel.org Reviewed-by: Christian Brauner (Microsoft) brauner@kernel.org Reviewed-and-Tested-by: Jeff Layton jlayton@kernel.org Signed-off-by: Yang Xu xuyang2018.jy@fujitsu.com Signed-off-by: Christian Brauner (Microsoft) brauner@kernel.org Signed-off-by: Amir Goldstein amir73il@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/inode.c | 34 ++++++++++++++++++++++++++++++---- include/linux/fs.h | 1 + 2 files changed, 31 insertions(+), 4 deletions(-)
--- a/fs/inode.c +++ b/fs/inode.c @@ -2147,10 +2147,8 @@ void inode_init_owner(struct inode *inod /* Directories are special, and always inherit S_ISGID */ if (S_ISDIR(mode)) mode |= S_ISGID; - else if ((mode & (S_ISGID | S_IXGRP)) == (S_ISGID | S_IXGRP) && - !in_group_p(inode->i_gid) && - !capable_wrt_inode_uidgid(dir, CAP_FSETID)) - mode &= ~S_ISGID; + else + mode = mode_strip_sgid(dir, mode); } else inode->i_gid = current_fsgid(); inode->i_mode = mode; @@ -2382,3 +2380,31 @@ int vfs_ioc_fssetxattr_check(struct inod return 0; } EXPORT_SYMBOL(vfs_ioc_fssetxattr_check); + +/** + * mode_strip_sgid - handle the sgid bit for non-directories + * @dir: parent directory inode + * @mode: mode of the file to be created in @dir + * + * If the @mode of the new file has both the S_ISGID and S_IXGRP bit + * raised and @dir has the S_ISGID bit raised ensure that the caller is + * either in the group of the parent directory or they have CAP_FSETID + * in their user namespace and are privileged over the parent directory. + * In all other cases, strip the S_ISGID bit from @mode. + * + * Return: the new mode to use for the file + */ +umode_t mode_strip_sgid(const struct inode *dir, umode_t mode) +{ + if ((mode & (S_ISGID | S_IXGRP)) != (S_ISGID | S_IXGRP)) + return mode; + if (S_ISDIR(mode) || !dir || !(dir->i_mode & S_ISGID)) + return mode; + if (in_group_p(dir->i_gid)) + return mode; + if (capable_wrt_inode_uidgid(dir, CAP_FSETID)) + return mode; + + return mode & ~S_ISGID; +} +EXPORT_SYMBOL(mode_strip_sgid); --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -1768,6 +1768,7 @@ extern long compat_ptr_ioctl(struct file extern void inode_init_owner(struct inode *inode, const struct inode *dir, umode_t mode); extern bool may_open_dev(const struct path *path); +umode_t mode_strip_sgid(const struct inode *dir, umode_t mode);
/* * This is the "filldir" function type, used by readdir() to let
From: Yang Xu xuyang2018.jy@fujitsu.com
commit 1639a49ccdce58ea248841ed9b23babcce6dbb0b upstream.
[remove userns argument of helpers for 5.10.y backport]
Move setgid handling out of individual filesystems and into the VFS itself to stop the proliferation of setgid inheritance bugs.
Creating files that have both the S_IXGRP and S_ISGID bit raised in directories that themselves have the S_ISGID bit set requires additional privileges to avoid security issues.
When a filesystem creates a new inode it needs to take care that the caller is either in the group of the newly created inode or they have CAP_FSETID in their current user namespace and are privileged over the parent directory of the new inode. If any of these two conditions is true then the S_ISGID bit can be raised for an S_IXGRP file and if not it needs to be stripped.
However, there are several key issues with the current implementation:
* S_ISGID stripping logic is entangled with umask stripping.
If a filesystem doesn't support or enable POSIX ACLs then umask stripping is done directly in the vfs before calling into the filesystem. If the filesystem does support POSIX ACLs then unmask stripping may be done in the filesystem itself when calling posix_acl_create().
Since umask stripping has an effect on S_ISGID inheritance, e.g., by stripping the S_IXGRP bit from the file to be created and all relevant filesystems have to call posix_acl_create() before inode_init_owner() where we currently take care of S_ISGID handling S_ISGID handling is order dependent. IOW, whether or not you get a setgid bit depends on POSIX ACLs and umask and in what order they are called.
Note that technically filesystems are free to impose their own ordering between posix_acl_create() and inode_init_owner() meaning that there's additional ordering issues that influence S_SIGID inheritance.
* Filesystems that don't rely on inode_init_owner() don't get S_ISGID stripping logic.
While that may be intentional (e.g. network filesystems might just defer setgid stripping to a server) it is often just a security issue.
This is not just ugly it's unsustainably messy especially since we do still have bugs in this area years after the initial round of setgid bugfixes.
So the current state is quite messy and while we won't be able to make it completely clean as posix_acl_create() is still a filesystem specific call we can improve the S_SIGD stripping situation quite a bit by hoisting it out of inode_init_owner() and into the vfs creation operations. This means we alleviate the burden for filesystems to handle S_ISGID stripping correctly and can standardize the ordering between S_ISGID and umask stripping in the vfs.
We add a new helper vfs_prepare_mode() so S_ISGID handling is now done in the VFS before umask handling. This has S_ISGID handling is unaffected unaffected by whether umask stripping is done by the VFS itself (if no POSIX ACLs are supported or enabled) or in the filesystem in posix_acl_create() (if POSIX ACLs are supported).
The vfs_prepare_mode() helper is called directly in vfs_*() helpers that create new filesystem objects. We need to move them into there to make sure that filesystems like overlayfs hat have callchains like:
sys_mknod() -> do_mknodat(mode) -> .mknod = ovl_mknod(mode) -> ovl_create(mode) -> vfs_mknod(mode)
get S_ISGID stripping done when calling into lower filesystems via vfs_*() creation helpers. Moving vfs_prepare_mode() into e.g. vfs_mknod() takes care of that. This is in any case semantically cleaner because S_ISGID stripping is VFS security requirement.
Security hooks so far have seen the mode with the umask applied but without S_ISGID handling done. The relevant hooks are called outside of vfs_*() creation helpers so by calling vfs_prepare_mode() from vfs_*() helpers the security hooks would now see the mode without umask stripping applied. For now we fix this by passing the mode with umask settings applied to not risk any regressions for LSM hooks. IOW, nothing changes for LSM hooks. It is worth pointing out that security hooks never saw the mode that is seen by the filesystem when actually creating the file. They have always been completely misplaced for that to work.
The following filesystems use inode_init_owner() and thus relied on S_ISGID stripping: spufs, 9p, bfs, btrfs, ext2, ext4, f2fs, hfsplus, hugetlbfs, jfs, minix, nilfs2, ntfs3, ocfs2, omfs, overlayfs, ramfs, reiserfs, sysv, ubifs, udf, ufs, xfs, zonefs, bpf, tmpfs.
All of the above filesystems end up calling inode_init_owner() when new filesystem objects are created through the ->mkdir(), ->mknod(), ->create(), ->tmpfile(), ->rename() inode operations.
Since directories always inherit the S_ISGID bit with the exception of xfs when irix_sgid_inherit mode is turned on S_ISGID stripping doesn't apply. The ->symlink() and ->link() inode operations trivially inherit the mode from the target and the ->rename() inode operation inherits the mode from the source inode. All other creation inode operations will get S_ISGID handling via vfs_prepare_mode() when called from their relevant vfs_*() helpers.
In addition to this there are filesystems which allow the creation of filesystem objects through ioctl()s or - in the case of spufs - circumventing the vfs in other ways. If filesystem objects are created through ioctl()s the vfs doesn't know about it and can't apply regular permission checking including S_ISGID logic. Therfore, a filesystem relying on S_ISGID stripping in inode_init_owner() in their ioctl() callpath will be affected by moving this logic into the vfs. We audited those filesystems:
* btrfs allows the creation of filesystem objects through various ioctls(). Snapshot creation literally takes a snapshot and so the mode is fully preserved and S_ISGID stripping doesn't apply.
Creating a new subvolum relies on inode_init_owner() in btrfs_new_subvol_inode() but only creates directories and doesn't raise S_ISGID.
* ocfs2 has a peculiar implementation of reflinks. In contrast to e.g. xfs and btrfs FICLONE/FICLONERANGE ioctl() that is only concerned with the actual extents ocfs2 uses a separate ioctl() that also creates the target file.
Iow, ocfs2 circumvents the vfs entirely here and did indeed rely on inode_init_owner() to strip the S_ISGID bit. This is the only place where a filesystem needs to call mode_strip_sgid() directly but this is self-inflicted pain.
* spufs doesn't go through the vfs at all and doesn't use ioctl()s either. Instead it has a dedicated system call spufs_create() which allows the creation of filesystem objects. But spufs only creates directories and doesn't allo S_SIGID bits, i.e. it specifically only allows 0777 bits.
* bpf uses vfs_mkobj() but also doesn't allow S_ISGID bits to be created.
The patch will have an effect on ext2 when the EXT2_MOUNT_GRPID mount option is used, on ext4 when the EXT4_MOUNT_GRPID mount option is used, and on xfs when the XFS_FEAT_GRPID mount option is used. When any of these filesystems are mounted with their respective GRPID option then newly created files inherit the parent directories group unconditionally. In these cases non of the filesystems call inode_init_owner() and thus did never strip the S_ISGID bit for newly created files. Moving this logic into the VFS means that they now get the S_ISGID bit stripped. This is a user visible change. If this leads to regressions we will either need to figure out a better way or we need to revert. However, given the various setgid bugs that we found just in the last two years this is a regression risk we should take.
Associated with this change is a new set of fstests to enforce the semantics for all new filesystems.
Link: https://lore.kernel.org/ceph-devel/20220427092201.wvsdjbnc7b4dttaw@wittgenst... [1] Link: e014f37db1a2 ("xfs: use setattr_copy to set vfs inode attributes") [2] Link: 01ea173e103e ("xfs: fix up non-directory creation in SGID directories") [3] Link: fd84bfdddd16 ("ceph: fix up non-directory creation in SGID directories") [4] Link: https://lore.kernel.org/r/1657779088-2242-3-git-send-email-xuyang2018.jy@fuj... Suggested-by: Dave Chinner david@fromorbit.com Suggested-by: Christian Brauner (Microsoft) brauner@kernel.org Reviewed-by: Darrick J. Wong djwong@kernel.org Reviewed-and-Tested-by: Jeff Layton jlayton@kernel.org Signed-off-by: Yang Xu xuyang2018.jy@fujitsu.com [brauner@kernel.org: rewrote commit message] Signed-off-by: Christian Brauner (Microsoft) brauner@kernel.org Signed-off-by: Amir Goldstein amir73il@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/inode.c | 2 - fs/namei.c | 80 ++++++++++++++++++++++++++++++++++++++++++++++--------- fs/ocfs2/namei.c | 1 3 files changed, 68 insertions(+), 15 deletions(-)
--- a/fs/inode.c +++ b/fs/inode.c @@ -2147,8 +2147,6 @@ void inode_init_owner(struct inode *inod /* Directories are special, and always inherit S_ISGID */ if (S_ISDIR(mode)) mode |= S_ISGID; - else - mode = mode_strip_sgid(dir, mode); } else inode->i_gid = current_fsgid(); inode->i_mode = mode; --- a/fs/namei.c +++ b/fs/namei.c @@ -2798,6 +2798,63 @@ void unlock_rename(struct dentry *p1, st } EXPORT_SYMBOL(unlock_rename);
+/** + * mode_strip_umask - handle vfs umask stripping + * @dir: parent directory of the new inode + * @mode: mode of the new inode to be created in @dir + * + * Umask stripping depends on whether or not the filesystem supports POSIX + * ACLs. If the filesystem doesn't support it umask stripping is done directly + * in here. If the filesystem does support POSIX ACLs umask stripping is + * deferred until the filesystem calls posix_acl_create(). + * + * Returns: mode + */ +static inline umode_t mode_strip_umask(const struct inode *dir, umode_t mode) +{ + if (!IS_POSIXACL(dir)) + mode &= ~current_umask(); + return mode; +} + +/** + * vfs_prepare_mode - prepare the mode to be used for a new inode + * @dir: parent directory of the new inode + * @mode: mode of the new inode + * @mask_perms: allowed permission by the vfs + * @type: type of file to be created + * + * This helper consolidates and enforces vfs restrictions on the @mode of a new + * object to be created. + * + * Umask stripping depends on whether the filesystem supports POSIX ACLs (see + * the kernel documentation for mode_strip_umask()). Moving umask stripping + * after setgid stripping allows the same ordering for both non-POSIX ACL and + * POSIX ACL supporting filesystems. + * + * Note that it's currently valid for @type to be 0 if a directory is created. + * Filesystems raise that flag individually and we need to check whether each + * filesystem can deal with receiving S_IFDIR from the vfs before we enforce a + * non-zero type. + * + * Returns: mode to be passed to the filesystem + */ +static inline umode_t vfs_prepare_mode(const struct inode *dir, umode_t mode, + umode_t mask_perms, umode_t type) +{ + mode = mode_strip_sgid(dir, mode); + mode = mode_strip_umask(dir, mode); + + /* + * Apply the vfs mandated allowed permission mask and set the type of + * file to be created before we call into the filesystem. + */ + mode &= (mask_perms & ~S_IFMT); + mode |= (type & S_IFMT); + + return mode; +} + int vfs_create(struct inode *dir, struct dentry *dentry, umode_t mode, bool want_excl) { @@ -2807,8 +2864,8 @@ int vfs_create(struct inode *dir, struct
if (!dir->i_op->create) return -EACCES; /* shouldn't it be ENOSYS? */ - mode &= S_IALLUGO; - mode |= S_IFREG; + + mode = vfs_prepare_mode(dir, mode, S_IALLUGO, S_IFREG); error = security_inode_create(dir, dentry, mode); if (error) return error; @@ -3072,8 +3129,7 @@ static struct dentry *lookup_open(struct if (open_flag & O_CREAT) { if (open_flag & O_EXCL) open_flag &= ~O_TRUNC; - if (!IS_POSIXACL(dir->d_inode)) - mode &= ~current_umask(); + mode = vfs_prepare_mode(dir->d_inode, mode, mode, mode); if (likely(got_write)) create_error = may_o_create(&nd->path, dentry, mode); else @@ -3286,8 +3342,7 @@ struct dentry *vfs_tmpfile(struct dentry child = d_alloc(dentry, &slash_name); if (unlikely(!child)) goto out_err; - if (!IS_POSIXACL(dir)) - mode &= ~current_umask(); + mode = vfs_prepare_mode(dir, mode, mode, mode); error = dir->i_op->tmpfile(dir, child, mode); if (error) goto out_err; @@ -3548,6 +3603,7 @@ int vfs_mknod(struct inode *dir, struct if (!dir->i_op->mknod) return -EPERM;
+ mode = vfs_prepare_mode(dir, mode, mode, mode); error = devcgroup_inode_mknod(mode, dev); if (error) return error; @@ -3596,9 +3652,8 @@ retry: if (IS_ERR(dentry)) return PTR_ERR(dentry);
- if (!IS_POSIXACL(path.dentry->d_inode)) - mode &= ~current_umask(); - error = security_path_mknod(&path, dentry, mode, dev); + error = security_path_mknod(&path, dentry, + mode_strip_umask(path.dentry->d_inode, mode), dev); if (error) goto out; switch (mode & S_IFMT) { @@ -3646,7 +3701,7 @@ int vfs_mkdir(struct inode *dir, struct if (!dir->i_op->mkdir) return -EPERM;
- mode &= (S_IRWXUGO|S_ISVTX); + mode = vfs_prepare_mode(dir, mode, S_IRWXUGO | S_ISVTX, 0); error = security_inode_mkdir(dir, dentry, mode); if (error) return error; @@ -3673,9 +3728,8 @@ retry: if (IS_ERR(dentry)) return PTR_ERR(dentry);
- if (!IS_POSIXACL(path.dentry->d_inode)) - mode &= ~current_umask(); - error = security_path_mkdir(&path, dentry, mode); + error = security_path_mkdir(&path, dentry, + mode_strip_umask(path.dentry->d_inode, mode)); if (!error) error = vfs_mkdir(path.dentry->d_inode, dentry, mode); done_path_create(&path, dentry); --- a/fs/ocfs2/namei.c +++ b/fs/ocfs2/namei.c @@ -198,6 +198,7 @@ static struct inode *ocfs2_get_init_inod * callers. */ if (S_ISDIR(mode)) set_nlink(inode, 2); + mode = mode_strip_sgid(dir, mode); inode_init_owner(inode, dir, mode); status = dquot_initialize(inode); if (status)
From: Amir Goldstein amir73il@gmail.com
commit 11c2a8700cdcabf9b639b7204a1e38e2a0b6798e upstream.
[backported to 5.10.y, prior to idmapped mounts]
In setattr_{copy,prepare}() we need to perform the same permission checks to determine whether we need to drop the setgid bit or not. Instead of open-coding it twice add a simple helper the encapsulates the logic. We will reuse this helpers to make dropping the setgid bit during write operations more consistent in a follow up patch.
Reviewed-by: Amir Goldstein amir73il@gmail.com Signed-off-by: Christian Brauner (Microsoft) brauner@kernel.org Signed-off-by: Amir Goldstein amir73il@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/attr.c | 11 +++++------ fs/inode.c | 25 +++++++++++++++++++++---- fs/internal.h | 1 + 3 files changed, 27 insertions(+), 10 deletions(-)
--- a/fs/attr.c +++ b/fs/attr.c @@ -18,6 +18,8 @@ #include <linux/evm.h> #include <linux/ima.h>
+#include "internal.h" + static bool chown_ok(const struct inode *inode, kuid_t uid) { if (uid_eq(current_fsuid(), inode->i_uid) && @@ -90,9 +92,8 @@ int setattr_prepare(struct dentry *dentr if (!inode_owner_or_capable(inode)) return -EPERM; /* Also check the setgid bit! */ - if (!in_group_p((ia_valid & ATTR_GID) ? attr->ia_gid : - inode->i_gid) && - !capable_wrt_inode_uidgid(inode, CAP_FSETID)) + if (!in_group_or_capable(inode, (ia_valid & ATTR_GID) ? + attr->ia_gid : inode->i_gid)) attr->ia_mode &= ~S_ISGID; }
@@ -193,9 +194,7 @@ void setattr_copy(struct inode *inode, c inode->i_ctime = attr->ia_ctime; if (ia_valid & ATTR_MODE) { umode_t mode = attr->ia_mode; - - if (!in_group_p(inode->i_gid) && - !capable_wrt_inode_uidgid(inode, CAP_FSETID)) + if (!in_group_or_capable(inode, inode->i_gid)) mode &= ~S_ISGID; inode->i_mode = mode; } --- a/fs/inode.c +++ b/fs/inode.c @@ -2380,6 +2380,26 @@ int vfs_ioc_fssetxattr_check(struct inod EXPORT_SYMBOL(vfs_ioc_fssetxattr_check);
/** + * in_group_or_capable - check whether caller is CAP_FSETID privileged + * @inode: inode to check + * @gid: the new/current gid of @inode + * + * Check wether @gid is in the caller's group list or if the caller is + * privileged with CAP_FSETID over @inode. This can be used to determine + * whether the setgid bit can be kept or must be dropped. + * + * Return: true if the caller is sufficiently privileged, false if not. + */ +bool in_group_or_capable(const struct inode *inode, kgid_t gid) +{ + if (in_group_p(gid)) + return true; + if (capable_wrt_inode_uidgid(inode, CAP_FSETID)) + return true; + return false; +} + +/** * mode_strip_sgid - handle the sgid bit for non-directories * @dir: parent directory inode * @mode: mode of the file to be created in @dir @@ -2398,11 +2418,8 @@ umode_t mode_strip_sgid(const struct ino return mode; if (S_ISDIR(mode) || !dir || !(dir->i_mode & S_ISGID)) return mode; - if (in_group_p(dir->i_gid)) + if (in_group_or_capable(dir, dir->i_gid)) return mode; - if (capable_wrt_inode_uidgid(dir, CAP_FSETID)) - return mode; - return mode & ~S_ISGID; } EXPORT_SYMBOL(mode_strip_sgid); --- a/fs/internal.h +++ b/fs/internal.h @@ -149,6 +149,7 @@ extern int vfs_open(const struct path *, extern long prune_icache_sb(struct super_block *sb, struct shrink_control *sc); extern void inode_add_lru(struct inode *inode); extern int dentry_needs_remove_privs(struct dentry *dentry); +bool in_group_or_capable(const struct inode *inode, kgid_t gid);
/* * fs-writeback.c
From: Amir Goldstein amir73il@gmail.com
commit e243e3f94c804ecca9a8241b5babe28f35258ef4 upstream.
Move the helper from inode.c to attr.c. This keeps the the core of the set{g,u}id stripping logic in one place when we add follow-up changes. It is the better place anyway, since should_remove_suid() returns ATTR_KILL_S{G,U}ID flags.
Reviewed-by: Amir Goldstein amir73il@gmail.com Signed-off-by: Christian Brauner (Microsoft) brauner@kernel.org Signed-off-by: Amir Goldstein amir73il@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/attr.c | 29 +++++++++++++++++++++++++++++ fs/inode.c | 29 ----------------------------- 2 files changed, 29 insertions(+), 29 deletions(-)
--- a/fs/attr.c +++ b/fs/attr.c @@ -20,6 +20,35 @@
#include "internal.h"
+/* + * The logic we want is + * + * if suid or (sgid and xgrp) + * remove privs + */ +int should_remove_suid(struct dentry *dentry) +{ + umode_t mode = d_inode(dentry)->i_mode; + int kill = 0; + + /* suid always must be killed */ + if (unlikely(mode & S_ISUID)) + kill = ATTR_KILL_SUID; + + /* + * sgid without any exec bits is just a mandatory locking mark; leave + * it alone. If some exec bits are set, it's a real sgid; kill it. + */ + if (unlikely((mode & S_ISGID) && (mode & S_IXGRP))) + kill |= ATTR_KILL_SGID; + + if (unlikely(kill && !capable(CAP_FSETID) && S_ISREG(mode))) + return kill; + + return 0; +} +EXPORT_SYMBOL(should_remove_suid); + static bool chown_ok(const struct inode *inode, kuid_t uid) { if (uid_eq(current_fsuid(), inode->i_uid) && --- a/fs/inode.c +++ b/fs/inode.c @@ -1855,35 +1855,6 @@ skip_update: EXPORT_SYMBOL(touch_atime);
/* - * The logic we want is - * - * if suid or (sgid and xgrp) - * remove privs - */ -int should_remove_suid(struct dentry *dentry) -{ - umode_t mode = d_inode(dentry)->i_mode; - int kill = 0; - - /* suid always must be killed */ - if (unlikely(mode & S_ISUID)) - kill = ATTR_KILL_SUID; - - /* - * sgid without any exec bits is just a mandatory locking mark; leave - * it alone. If some exec bits are set, it's a real sgid; kill it. - */ - if (unlikely((mode & S_ISGID) && (mode & S_IXGRP))) - kill |= ATTR_KILL_SGID; - - if (unlikely(kill && !capable(CAP_FSETID) && S_ISREG(mode))) - return kill; - - return 0; -} -EXPORT_SYMBOL(should_remove_suid); - -/* * Return mask of changes for notify_change() that need to be done as a * response to write or truncate. Return 0 if nothing has to be changed. * Negative value on error (change should be denied).
From: Amir Goldstein amir73il@gmail.com
commit 72ae017c5451860443a16fb2a8c243bff3e396b8 upstream.
[backported to 5.10.y, prior to idmapped mounts]
The current setgid stripping logic during write and ownership change operations is inconsistent and strewn over multiple places. In order to consolidate it and make more consistent we'll add a new helper setattr_should_drop_sgid(). The function retains the old behavior where we remove the S_ISGID bit unconditionally when S_IXGRP is set but also when it isn't set and the caller is neither in the group of the inode nor privileged over the inode.
We will use this helper both in write operation permission removal such as file_remove_privs() as well as in ownership change operations.
Reviewed-by: Amir Goldstein amir73il@gmail.com Signed-off-by: Christian Brauner (Microsoft) brauner@kernel.org Signed-off-by: Amir Goldstein amir73il@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/attr.c | 25 +++++++++++++++++++++++++ fs/internal.h | 5 +++++ 2 files changed, 30 insertions(+)
--- a/fs/attr.c +++ b/fs/attr.c @@ -20,6 +20,31 @@
#include "internal.h"
+/** + * setattr_should_drop_sgid - determine whether the setgid bit needs to be + * removed + * @inode: inode to check + * + * This function determines whether the setgid bit needs to be removed. + * We retain backwards compatibility and require setgid bit to be removed + * unconditionally if S_IXGRP is set. Otherwise we have the exact same + * requirements as setattr_prepare() and setattr_copy(). + * + * Return: ATTR_KILL_SGID if setgid bit needs to be removed, 0 otherwise. + */ +int setattr_should_drop_sgid(const struct inode *inode) +{ + umode_t mode = inode->i_mode; + + if (!(mode & S_ISGID)) + return 0; + if (mode & S_IXGRP) + return ATTR_KILL_SGID; + if (!in_group_or_capable(inode, inode->i_gid)) + return ATTR_KILL_SGID; + return 0; +} + /* * The logic we want is * --- a/fs/internal.h +++ b/fs/internal.h @@ -197,3 +197,8 @@ int sb_init_dio_done_wq(struct super_blo */ int do_statx(int dfd, const char __user *filename, unsigned flags, unsigned int mask, struct statx __user *buffer); + +/* + * fs/attr.c + */ +int setattr_should_drop_sgid(const struct inode *inode);
From: Amir Goldstein amir73il@gmail.com
commit ed5a7047d2011cb6b2bf84ceb6680124cc6a7d95 upstream.
[backported to 5.10.y, prior to idmapped mounts]
Currently setgid stripping in file_remove_privs()'s should_remove_suid() helper is inconsistent with other parts of the vfs. Specifically, it only raises ATTR_KILL_SGID if the inode is S_ISGID and S_IXGRP but not if the inode isn't in the caller's groups and the caller isn't privileged over the inode although we require this already in setattr_prepare() and setattr_copy() and so all filesystem implement this requirement implicitly because they have to use setattr_{prepare,copy}() anyway.
But the inconsistency shows up in setgid stripping bugs for overlayfs in xfstests (e.g., generic/673, generic/683, generic/685, generic/686, generic/687). For example, we test whether suid and setgid stripping works correctly when performing various write-like operations as an unprivileged user (fallocate, reflink, write, etc.):
echo "Test 1 - qa_user, non-exec file $verb" setup_testfile chmod a+rws $junk_file commit_and_check "$qa_user" "$verb" 64k 64k
The test basically creates a file with 6666 permissions. While the file has the S_ISUID and S_ISGID bits set it does not have the S_IXGRP set. On a regular filesystem like xfs what will happen is:
sys_fallocate() -> vfs_fallocate() -> xfs_file_fallocate() -> file_modified() -> __file_remove_privs() -> dentry_needs_remove_privs() -> should_remove_suid() -> __remove_privs() newattrs.ia_valid = ATTR_FORCE | kill; -> notify_change() -> setattr_copy()
In should_remove_suid() we can see that ATTR_KILL_SUID is raised unconditionally because the file in the test has S_ISUID set.
But we also see that ATTR_KILL_SGID won't be set because while the file is S_ISGID it is not S_IXGRP (see above) which is a condition for ATTR_KILL_SGID being raised.
So by the time we call notify_change() we have attr->ia_valid set to ATTR_KILL_SUID | ATTR_FORCE. Now notify_change() sees that ATTR_KILL_SUID is set and does:
ia_valid = attr->ia_valid |= ATTR_MODE attr->ia_mode = (inode->i_mode & ~S_ISUID);
which means that when we call setattr_copy() later we will definitely update inode->i_mode. Note that attr->ia_mode still contains S_ISGID.
Now we call into the filesystem's ->setattr() inode operation which will end up calling setattr_copy(). Since ATTR_MODE is set we will hit:
if (ia_valid & ATTR_MODE) { umode_t mode = attr->ia_mode; vfsgid_t vfsgid = i_gid_into_vfsgid(mnt_userns, inode); if (!vfsgid_in_group_p(vfsgid) && !capable_wrt_inode_uidgid(mnt_userns, inode, CAP_FSETID)) mode &= ~S_ISGID; inode->i_mode = mode; }
and since the caller in the test is neither capable nor in the group of the inode the S_ISGID bit is stripped.
But assume the file isn't suid then ATTR_KILL_SUID won't be raised which has the consequence that neither the setgid nor the suid bits are stripped even though it should be stripped because the inode isn't in the caller's groups and the caller isn't privileged over the inode.
If overlayfs is in the mix things become a bit more complicated and the bug shows up more clearly. When e.g., ovl_setattr() is hit from ovl_fallocate()'s call to file_remove_privs() then ATTR_KILL_SUID and ATTR_KILL_SGID might be raised but because the check in notify_change() is questioning the ATTR_KILL_SGID flag again by requiring S_IXGRP for it to be stripped the S_ISGID bit isn't removed even though it should be stripped:
sys_fallocate() -> vfs_fallocate() -> ovl_fallocate() -> file_remove_privs() -> dentry_needs_remove_privs() -> should_remove_suid() -> __remove_privs() newattrs.ia_valid = ATTR_FORCE | kill; -> notify_change() -> ovl_setattr() // TAKE ON MOUNTER'S CREDS -> ovl_do_notify_change() -> notify_change() // GIVE UP MOUNTER'S CREDS // TAKE ON MOUNTER'S CREDS -> vfs_fallocate() -> xfs_file_fallocate() -> file_modified() -> __file_remove_privs() -> dentry_needs_remove_privs() -> should_remove_suid() -> __remove_privs() newattrs.ia_valid = attr_force | kill; -> notify_change()
The fix for all of this is to make file_remove_privs()'s should_remove_suid() helper to perform the same checks as we already require in setattr_prepare() and setattr_copy() and have notify_change() not pointlessly requiring S_IXGRP again. It doesn't make any sense in the first place because the caller must calculate the flags via should_remove_suid() anyway which would raise ATTR_KILL_SGID.
While we're at it we move should_remove_suid() from inode.c to attr.c where it belongs with the rest of the iattr helpers. Especially since it returns ATTR_KILL_S{G,U}ID flags. We also rename it to setattr_should_drop_suidgid() to better reflect that it indicates both setuid and setgid bit removal and also that it returns attr flags.
Running xfstests with this doesn't report any regressions. We should really try and use consistent checks.
Reviewed-by: Amir Goldstein amir73il@gmail.com Signed-off-by: Christian Brauner (Microsoft) brauner@kernel.org Signed-off-by: Amir Goldstein amir73il@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- Documentation/trace/ftrace.rst | 2 +- fs/attr.c | 31 +++++++++++++++++-------------- fs/inode.c | 2 +- fs/ocfs2/file.c | 4 ++-- fs/open.c | 6 +++--- include/linux/fs.h | 2 +- 6 files changed, 25 insertions(+), 22 deletions(-)
--- a/Documentation/trace/ftrace.rst +++ b/Documentation/trace/ftrace.rst @@ -2923,7 +2923,7 @@ Produces:: bash-1994 [000] .... 4342.324898: ima_get_action <-process_measurement bash-1994 [000] .... 4342.324898: ima_match_policy <-ima_get_action bash-1994 [000] .... 4342.324899: do_truncate <-do_last - bash-1994 [000] .... 4342.324899: should_remove_suid <-do_truncate + bash-1994 [000] .... 4342.324899: setattr_should_drop_suidgid <-do_truncate bash-1994 [000] .... 4342.324899: notify_change <-do_truncate bash-1994 [000] .... 4342.324900: current_fs_time <-notify_change bash-1994 [000] .... 4342.324900: current_kernel_time <-current_fs_time --- a/fs/attr.c +++ b/fs/attr.c @@ -45,34 +45,37 @@ int setattr_should_drop_sgid(const struc return 0; }
-/* - * The logic we want is +/** + * setattr_should_drop_suidgid - determine whether the set{g,u}id bit needs to + * be dropped + * @inode: inode to check + * + * This function determines whether the set{g,u}id bits need to be removed. + * If the setuid bit needs to be removed ATTR_KILL_SUID is returned. If the + * setgid bit needs to be removed ATTR_KILL_SGID is returned. If both + * set{g,u}id bits need to be removed the corresponding mask of both flags is + * returned. * - * if suid or (sgid and xgrp) - * remove privs + * Return: A mask of ATTR_KILL_S{G,U}ID indicating which - if any - setid bits + * to remove, 0 otherwise. */ -int should_remove_suid(struct dentry *dentry) +int setattr_should_drop_suidgid(struct inode *inode) { - umode_t mode = d_inode(dentry)->i_mode; + umode_t mode = inode->i_mode; int kill = 0;
/* suid always must be killed */ if (unlikely(mode & S_ISUID)) kill = ATTR_KILL_SUID;
- /* - * sgid without any exec bits is just a mandatory locking mark; leave - * it alone. If some exec bits are set, it's a real sgid; kill it. - */ - if (unlikely((mode & S_ISGID) && (mode & S_IXGRP))) - kill |= ATTR_KILL_SGID; + kill |= setattr_should_drop_sgid(inode);
if (unlikely(kill && !capable(CAP_FSETID) && S_ISREG(mode))) return kill;
return 0; } -EXPORT_SYMBOL(should_remove_suid); +EXPORT_SYMBOL(setattr_should_drop_suidgid);
static bool chown_ok(const struct inode *inode, kuid_t uid) { @@ -350,7 +353,7 @@ int notify_change(struct dentry * dentry } } if (ia_valid & ATTR_KILL_SGID) { - if ((mode & (S_ISGID | S_IXGRP)) == (S_ISGID | S_IXGRP)) { + if (mode & S_ISGID) { if (!(ia_valid & ATTR_MODE)) { ia_valid = attr->ia_valid |= ATTR_MODE; attr->ia_mode = inode->i_mode; --- a/fs/inode.c +++ b/fs/inode.c @@ -1868,7 +1868,7 @@ int dentry_needs_remove_privs(struct den if (IS_NOSEC(inode)) return 0;
- mask = should_remove_suid(dentry); + mask = setattr_should_drop_suidgid(inode); ret = security_inode_need_killpriv(dentry); if (ret < 0) return ret; --- a/fs/ocfs2/file.c +++ b/fs/ocfs2/file.c @@ -1994,7 +1994,7 @@ static int __ocfs2_change_file_space(str } }
- if (file && should_remove_suid(file->f_path.dentry)) { + if (file && setattr_should_drop_suidgid(file_inode(file))) { ret = __ocfs2_write_remove_suid(inode, di_bh); if (ret) { mlog_errno(ret); @@ -2282,7 +2282,7 @@ static int ocfs2_prepare_inode_for_write * inode. There's also the dinode i_size state which * can be lost via setattr during extending writes (we * set inode->i_size at the end of a write. */ - if (should_remove_suid(dentry)) { + if (setattr_should_drop_suidgid(inode)) { if (meta_level == 0) { ocfs2_inode_unlock_for_extent_tree(inode, &di_bh, --- a/fs/open.c +++ b/fs/open.c @@ -665,10 +665,10 @@ retry_deleg: newattrs.ia_valid |= ATTR_GID; newattrs.ia_gid = gid; } - if (!S_ISDIR(inode->i_mode)) - newattrs.ia_valid |= - ATTR_KILL_SUID | ATTR_KILL_SGID | ATTR_KILL_PRIV; inode_lock(inode); + if (!S_ISDIR(inode->i_mode)) + newattrs.ia_valid |= ATTR_KILL_SUID | ATTR_KILL_PRIV | + setattr_should_drop_sgid(inode); error = security_path_chown(path, uid, gid); if (!error) error = notify_change(path->dentry, &newattrs, &delegated_inode); --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -2960,7 +2960,7 @@ extern void __destroy_inode(struct inode extern struct inode *new_inode_pseudo(struct super_block *sb); extern struct inode *new_inode(struct super_block *sb); extern void free_inode_nonrcu(struct inode *inode); -extern int should_remove_suid(struct dentry *); +extern int setattr_should_drop_suidgid(struct inode *); extern int file_remove_privs(struct file *);
extern void __insert_inode_hash(struct inode *, unsigned long hashval);
From: Christian Brauner brauner@kernel.org
commit 8d84e39d76bd83474b26cb44f4b338635676e7e8 upstream.
Now that we made the VFS setgid checking consistent an inode can't be marked security irrelevant even if the setgid bit is still set. Make this function consistent with all other helpers.
Note that enforcing consistent setgid stripping checks for file modification and mode- and ownership changes will cause the setgid bit to be lost in more cases than useed to be the case. If an unprivileged user wrote to a non-executable setgid file that they don't have privilege over the setgid bit will be dropped. This will lead to temporary failures in some xfstests until they have been updated.
Reported-by: Miklos Szeredi miklos@szeredi.hu Signed-off-by: Christian Brauner (Microsoft) brauner@kernel.org Signed-off-by: Amir Goldstein amir73il@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/linux/fs.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -3408,7 +3408,7 @@ int __init get_filesystem_list(char *buf
static inline bool is_sxid(umode_t mode) { - return (mode & S_ISUID) || ((mode & S_ISGID) && (mode & S_IXGRP)); + return mode & (S_ISUID | S_ISGID); }
static inline int check_sticky(struct inode *dir, struct inode *inode)
From: Gaosheng Cui cuigaosheng1@huawei.com
commit b0463b9dd7030a766133ad2f1571f97f204d7bdf upstream.
xfs_setattr_time() has been removed since commit e014f37db1a2 ("xfs: use setattr_copy to set vfs inode attributes"), so remove it.
Signed-off-by: Gaosheng Cui cuigaosheng1@huawei.com Reviewed-by: Carlos Maiolino cmaiolino@redhat.com Signed-off-by: Dave Chinner david@fromorbit.com Signed-off-by: Amir Goldstein amir73il@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/xfs/xfs_iops.h | 1 - 1 file changed, 1 deletion(-)
--- a/fs/xfs/xfs_iops.h +++ b/fs/xfs/xfs_iops.h @@ -18,7 +18,6 @@ extern ssize_t xfs_vn_listxattr(struct d */ #define XFS_ATTR_NOACL 0x01 /* Don't call posix_acl_chmod */
-extern void xfs_setattr_time(struct xfs_inode *ip, struct iattr *iattr); extern int xfs_setattr_nonsize(struct xfs_inode *ip, struct iattr *vap, int flags); extern int xfs_vn_setattr_nonsize(struct dentry *dentry, struct iattr *vap);
From: Lee Jones lee@kernel.org
commit b1a37ed00d7908a991c1d0f18a8cba3c2aa99bdc upstream.
Presently, when a report is processed, its proposed size, provided by the user of the API (as Report Size * Report Count) is compared against the subsystem default HID_MAX_BUFFER_SIZE (16k). However, some low-level HID drivers allocate a reduced amount of memory to their buffers (e.g. UHID only allocates UHID_DATA_MAX (4k) buffers), rending this check inadequate in some cases.
In these circumstances, if the received report ends up being smaller than the proposed report size, the remainder of the buffer is zeroed. That is, the space between sizeof(csize) (size of the current report) and the rsize (size proposed i.e. Report Size * Report Count), which can be handled up to HID_MAX_BUFFER_SIZE (16k). Meaning that memset() shoots straight past the end of the buffer boundary and starts zeroing out in-use values, often resulting in calamity.
This patch introduces a new variable into 'struct hid_ll_driver' where individual low-level drivers can over-ride the default maximum value of HID_MAX_BUFFER_SIZE (16k) with something more sympathetic to the interface.
Signed-off-by: Lee Jones lee@kernel.org Signed-off-by: Jiri Kosina jkosina@suse.cz [Lee: Backported to v5.10.y] Signed-off-by: Lee Jones lee@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/hid/hid-core.c | 18 +++++++++++++----- include/linux/hid.h | 3 +++ 2 files changed, 16 insertions(+), 5 deletions(-)
--- a/drivers/hid/hid-core.c +++ b/drivers/hid/hid-core.c @@ -258,6 +258,7 @@ static int hid_add_field(struct hid_pars { struct hid_report *report; struct hid_field *field; + unsigned int max_buffer_size = HID_MAX_BUFFER_SIZE; unsigned int usages; unsigned int offset; unsigned int i; @@ -288,8 +289,11 @@ static int hid_add_field(struct hid_pars offset = report->size; report->size += parser->global.report_size * parser->global.report_count;
+ if (parser->device->ll_driver->max_buffer_size) + max_buffer_size = parser->device->ll_driver->max_buffer_size; + /* Total size check: Allow for possible report index byte */ - if (report->size > (HID_MAX_BUFFER_SIZE - 1) << 3) { + if (report->size > (max_buffer_size - 1) << 3) { hid_err(parser->device, "report is too long\n"); return -1; } @@ -1752,6 +1756,7 @@ int hid_report_raw_event(struct hid_devi struct hid_report_enum *report_enum = hid->report_enum + type; struct hid_report *report; struct hid_driver *hdrv; + int max_buffer_size = HID_MAX_BUFFER_SIZE; unsigned int a; u32 rsize, csize = size; u8 *cdata = data; @@ -1768,10 +1773,13 @@ int hid_report_raw_event(struct hid_devi
rsize = hid_compute_report_size(report);
- if (report_enum->numbered && rsize >= HID_MAX_BUFFER_SIZE) - rsize = HID_MAX_BUFFER_SIZE - 1; - else if (rsize > HID_MAX_BUFFER_SIZE) - rsize = HID_MAX_BUFFER_SIZE; + if (hid->ll_driver->max_buffer_size) + max_buffer_size = hid->ll_driver->max_buffer_size; + + if (report_enum->numbered && rsize >= max_buffer_size) + rsize = max_buffer_size - 1; + else if (rsize > max_buffer_size) + rsize = max_buffer_size;
if (csize < rsize) { dbg_hid("report %d is too short, (%d < %d)\n", report->id, --- a/include/linux/hid.h +++ b/include/linux/hid.h @@ -798,6 +798,7 @@ struct hid_driver { * @raw_request: send raw report request to device (e.g. feature report) * @output_report: send output report to device * @idle: send idle request to device + * @max_buffer_size: over-ride maximum data buffer size (default: HID_MAX_BUFFER_SIZE) */ struct hid_ll_driver { int (*start)(struct hid_device *hdev); @@ -822,6 +823,8 @@ struct hid_ll_driver { int (*output_report) (struct hid_device *hdev, __u8 *buf, size_t len);
int (*idle)(struct hid_device *hdev, int report, int idle, int reqtype); + + unsigned int max_buffer_size; };
extern struct hid_ll_driver i2c_hid_ll_driver;
From: Lee Jones lee@kernel.org
commit 1c5d4221240a233df2440fe75c881465cdf8da07 upstream.
The default maximum data buffer size for this interface is UHID_DATA_MAX (4k). When data buffers are being processed, ensure this value is used when ensuring the sanity, rather than a value between the user provided value and HID_MAX_BUFFER_SIZE (16k).
Signed-off-by: Lee Jones lee@kernel.org Signed-off-by: Jiri Kosina jkosina@suse.cz Signed-off-by: Lee Jones lee@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/hid/uhid.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/hid/uhid.c +++ b/drivers/hid/uhid.c @@ -395,6 +395,7 @@ struct hid_ll_driver uhid_hid_driver = { .parse = uhid_hid_parse, .raw_request = uhid_hid_raw_request, .output_report = uhid_hid_output_report, + .max_buffer_size = UHID_DATA_MAX, }; EXPORT_SYMBOL_GPL(uhid_hid_driver);
Hello Greg,
From: Greg Kroah-Hartman gregkh@linuxfoundation.org Sent: 20 March 2023 14:54
This is the start of the stable review cycle for the 5.10.176 release. There are 99 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 22 Mar 2023 14:54:22 +0000. Anything received after that time might be too late.
CIP configurations built and booted with Linux 5.10.176-rc1 (1686e1df6521): https://gitlab.com/cip-project/cip-testing/linux-stable-rc-ci/-/pipelines/81... https://gitlab.com/cip-project/cip-testing/linux-stable-rc-ci/-/commits/linu...
Tested-by: Chris Paterson (CIP) chris.paterson2@renesas.com
Kind regards, Chris
On 3/20/23 07:53, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.10.176 release. There are 99 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 22 Mar 2023 14:54:22 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.10.176-rc... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.10.y and the diffstat can be found below.
thanks,
greg k-h
On ARCH_BRCMSTB using 32-bit and 64-bit ARM kernels, build tested on BMIPS_GENERIC:
Tested-by: Florian Fainelli f.fainelli@gmail.com
Hi!
This is the start of the stable review cycle for the 5.10.176 release. There are 99 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
David Gow davidgow@google.com rust: arch/um: Disable FP/SIMD instruction to match x86
Why is this patch here? It does not make sense for stable, it was only in AUTOSEL for less than a week, and I already explained why it is bad.
"git grep KBUILD_RUSTFLAGS" if in doubt.
Best regards, Pavel
On Tue, 21 Mar 2023 at 03:41, Pavel Machek pavel@denx.de wrote:
Hi!
This is the start of the stable review cycle for the 5.10.176 release. There are 99 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
David Gow davidgow@google.com rust: arch/um: Disable FP/SIMD instruction to match x86
Why is this patch here? It does not make sense for stable, it was only in AUTOSEL for less than a week, and I already explained why it is bad.
"git grep KBUILD_RUSTFLAGS" if in doubt.
I agree: let's exclude this patch from -stable.
While the CFLAGS part of it is still _technically_ valid without Rust, it turns out it triggers a bug in gcc <11, so is probably best avoided. See: - https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99652 - https://lore.kernel.org/linux-um/20230318041555.4192172-1-davidgow@google.co...
Cheers, -- David
Best regards, Pavel -- DENX Software Engineering GmbH, Managing Director: Wolfgang Denk HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
On Tue, Mar 21, 2023 at 12:02:30PM +0800, David Gow wrote:
On Tue, 21 Mar 2023 at 03:41, Pavel Machek pavel@denx.de wrote:
Hi!
This is the start of the stable review cycle for the 5.10.176 release. There are 99 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
David Gow davidgow@google.com rust: arch/um: Disable FP/SIMD instruction to match x86
Why is this patch here? It does not make sense for stable, it was only in AUTOSEL for less than a week, and I already explained why it is bad.
"git grep KBUILD_RUSTFLAGS" if in doubt.
I agree: let's exclude this patch from -stable.
While the CFLAGS part of it is still _technically_ valid without Rust, it turns out it triggers a bug in gcc <11, so is probably best avoided. See:
Thanks, I've now dropped this from kernels 5.15 and older.
greg k-h
On Mon, 20 Mar 2023 at 20:26, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 5.10.176 release. There are 99 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 22 Mar 2023 14:54:22 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.10.176-rc... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.10.y and the diffstat can be found below.
thanks,
greg k-h
Results from Linaro’s test farm. No regressions on arm64, arm, x86_64, and i386.
Tested-by: Linux Kernel Functional Testing lkft@linaro.org
## Build * kernel: 5.10.176-rc1 * git: https://gitlab.com/Linaro/lkft/mirrors/stable/linux-stable-rc * git branch: linux-5.10.y * git commit: 1686e1df652191033a9fc46dc7cf43cd169baa1a * git describe: v5.10.175-100-g1686e1df6521 * test details: https://qa-reports.linaro.org/lkft/linux-stable-rc-linux-5.10.y/build/v5.10....
## Test Regressions (compared to v5.10.175)
## Metric Regressions (compared to v5.10.175)
## Test Fixes (compared to v5.10.175)
## Metric Fixes (compared to v5.10.175)
## Test result summary total: 92521, pass: 75087, fail: 2264, skip: 14957, xfail: 213
## Build Summary * arc: 5 total, 5 passed, 0 failed * arm: 115 total, 114 passed, 1 failed * arm64: 42 total, 39 passed, 3 failed * i386: 33 total, 31 passed, 2 failed * mips: 27 total, 26 passed, 1 failed * parisc: 8 total, 8 passed, 0 failed * powerpc: 26 total, 20 passed, 6 failed * riscv: 12 total, 11 passed, 1 failed * s390: 12 total, 12 passed, 0 failed * sh: 14 total, 12 passed, 2 failed * sparc: 8 total, 8 passed, 0 failed * x86_64: 36 total, 34 passed, 2 failed
## Test suites summary * boot * kselftest-android * kselftest-arm64 * kselftest-breakpoints * kselftest-capabilities * kselftest-cgroup * kselftest-clone3 * kselftest-core * kselftest-cpu-hotplug * kselftest-cpufreq * kselftest-drivers-dma-buf * kselftest-efivarfs * kselftest-filesystems * kselftest-filesystems-binderfs * kselftest-firmware * kselftest-fpu * kselftest-ftrace * kselftest-futex * kselftest-gpio * kselftest-intel_pstate * kselftest-ipc * kselftest-ir * kselftest-kcmp * kselftest-kexec * kselftest-lib * kselftest-livepatch * kselftest-membarrier * kselftest-memfd * kselftest-memory-hotplug * kselftest-mincore * kselftest-mount * kselftest-mqueue * kselftest-net * kselftest-net-forwarding * kselftest-net-mptcp * kselftest-netfilter * kselftest-nsfs * kselftest-openat2 * kselftest-pid_namespace * kselftest-pidfd * kselftest-proc * kselftest-pstore * kselftest-ptrace * kselftest-rseq * kselftest-rtc * kselftest-tc-testing * kselftest-timens * kselftest-timers * kselftest-tmpfs * kselftest-tpm2 * kselftest-user * kselftest-vm * kselftest-x86 * kselftest-zram * kunit * libgpiod * libhugetlbfs * log-parser-boot * log-parser-test * ltp-cap_bounds * ltp-commands * ltp-containers * ltp-controllers * ltp-cpuhotplug * ltp-crypto * ltp-cve * ltp-dio * ltp-fcntl-locktests * ltp-filecaps * ltp-fs * ltp-fs_bind * ltp-fs_perms_simple * ltp-fsx * ltp-hugetlb * ltp-io * ltp-ipc * ltp-math * ltp-mm * ltp-nptl * ltp-pty * ltp-sched * ltp-securebits * ltp-smoke * ltp-syscalls * ltp-tracing * packetdrill * perf * rcutorture * v4l2-compliance
-- Linaro LKFT https://lkft.linaro.org
On 3/20/23 08:53, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.10.176 release. There are 99 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 22 Mar 2023 14:54:22 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.10.176-rc... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.10.y and the diffstat can be found below.
thanks,
greg k-h
Compiled and booted on my test system. No dmesg regressions.
Tested-by: Shuah Khan skhan@linuxfoundation.org
thanks, -- Shuah
On 20/03/2023 14:53, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.10.176 release. There are 99 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 22 Mar 2023 14:54:22 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.10.176-rc... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.10.y and the diffstat can be found below.
thanks,
greg k-h
All tests passing for Tegra.
Tested-by: Jon Hunter jonathanh@nvidia.com
Due to infrastructure issues, no test report available, but all tests are passing.
Jon
On Mon, Mar 20, 2023 at 03:53:38PM +0100, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.10.176 release. There are 99 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 22 Mar 2023 14:54:22 +0000. Anything received after that time might be too late.
Build results: total: 162 pass: 162 fail: 0 Qemu test results: total: 485 pass: 485 fail: 0
Tested-by: Guenter Roeck linux@roeck-us.net
Guenter
linux-stable-mirror@lists.linaro.org